Databus crossing clock domains with data freeze

Nicholas Kinar
Guest

Wed Jan 06, 2010 10:11 pm   



Hello--

What is the best standard practice to have a data bus cross a clock
domain by implementing a data freeze?

There is an extremely brief description of the data freeze given here:
http://www.fpga4fun.com/CrossClockDomain4.html

What is the best way to "freeze" the data in the source clock domain?

I have a 108-bit bus which needs to cross between a high-speed clock
domain (280MHz) and a clock domain operated at a lower speed (70MHz).

I am using Verilog as the HDL and my FPGA is a Cyclone II.

Nicholas

Jonathan Bromley
Guest

Wed Jan 06, 2010 11:07 pm   



On Wed, 06 Jan 2010 14:11:50 -0600, Nicholas Kinar wrote:

Quote:
What is the best way to "freeze" the data in the source clock domain?

I have a 108-bit bus which needs to cross between a high-speed clock
domain (280MHz) and a clock domain operated at a lower speed (70MHz).

If you are certain that the source clock is more than 2x faster
than your target clock, I think it's rather straightforward.

Create a divide-by-2 signal in the target domain. No
reset is required, because the phase of the divide-by-2
is irrelevant; only its changes are of interest. So
we use a Verilog model that doesn't need a reset in
simulation either:

always @(posedge slow_clock)
    if (slow_flag == 1'b1)
        slow_flag <= 0;
    else
        slow_flag <= 1;

In the source domain, put the new data in your freeze
register as soon as you detect a change on slow_flag,
taking care to resynchronize slow_flag to avoid the risk
of input hazards. Again no reset is required; it'll
sort itself out within three clock cycles.

always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;
    old_slow_flag <= resync_slow_flag;
    if (old_slow_flag != resync_slow_flag) begin
        freeze_register <= source_data;
        // Do whatever it takes to indicate that
        // source_data has been consumed, and make
        // the next source_data available no more
        // than 2 fast clocks later
    end
end

And finally, capture freeze_register on every slow_clock:

always @(posedge slow_clock)
    useful_data <= freeze_register;

In this way you can get a new data value on every slow_clock.
You can carry "data valid" information along with the data
itself, if you don't have a new data item soon enough for
every slow clock.
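
For reference, the three fragments above can be assembled into one module. This is only a sketch: the module name freeze_crossing, the port list, the WIDTH parameter and the source_taken pulse are assumptions added for illustration and are not part of the original post.

```verilog
// Sketch only: module name, port names, WIDTH and source_taken are
// illustrative assumptions, not taken from the original post.
module freeze_crossing #(
    parameter WIDTH = 108
) (
    input  wire             fast_clock,
    input  wire             slow_clock,
    input  wire [WIDTH-1:0] source_data,   // valid in the fast domain
    output reg              source_taken,  // one fast-clock pulse per capture
    output reg  [WIDTH-1:0] useful_data    // valid in the slow domain
);
    reg             slow_flag;
    reg             resync_slow_flag;
    reg             old_slow_flag;
    reg [WIDTH-1:0] freeze_register;

    // Divide-by-2 in the slow domain; recovers from X without a reset.
    always @(posedge slow_clock)
        if (slow_flag == 1'b1)
            slow_flag <= 1'b0;
        else
            slow_flag <= 1'b1;

    // Fast domain: resynchronize slow_flag, then load the freeze
    // register whenever a change of the resynchronized flag is seen.
    always @(posedge fast_clock) begin
        resync_slow_flag <= slow_flag;
        old_slow_flag    <= resync_slow_flag;
        source_taken     <= 1'b0;
        if (old_slow_flag != resync_slow_flag) begin
            freeze_register <= source_data;
            source_taken    <= 1'b1;  // later assignment overrides the default
        end
    end

    // Slow domain: capture the frozen word on every slow clock.
    always @(posedge slow_clock)
        useful_data <= freeze_register;
endmodule
```

The source_taken pulse is one possible way to "indicate that source_data has been consumed"; any equivalent handshake in the fast domain would do.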

Draw lots of timing diagrams, and do lots of worst-case
analysis, to convince yourself whether this very simple
approach is robust in your situation. I believe that
it works reliably provided the clock periods obey the
following relationship:

slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps

where Tss is the setup time of the useful_data register,
Tsf is the setup time of the resync_slow_flag register,
Tpf is the propagation delay (including routing) from
fast clock to the freeze_register data becoming available
at the useful_data register's input, and Tps is the
propagation delay from slow clock to slow_flag becoming
available at the input to resync_slow_flag.

Note that Tss+Tpf and Tsf+Tps are both pretty much equal
to the shortest clock period that the FPGA can usefully
use, since they are both simple register-to-register
paths with no intervening logic. So, as a first
estimate, you could say

slow_period >= (2*fast_period) + (2/Fmax)

where Fmax is the FPGA's fastest useful clock speed.
But you'll need to apply timing constraints and check
the static timing results to be sure that you are safe.
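
As a rough worked check of that first estimate, assuming the 280 MHz and 70 MHz clocks quoted above (the symbols T_fast and T_slow are shorthand introduced here for fast_period and slow_period):

```latex
\begin{align*}
T_{\text{fast}} &= \frac{1}{280\ \text{MHz}} \approx 3.57\ \text{ns}, \qquad
T_{\text{slow}} = \frac{1}{70\ \text{MHz}} \approx 14.29\ \text{ns} \\
T_{\text{slow}} &\ge 2\,T_{\text{fast}} + \frac{2}{F_{\max}}
\;\Longrightarrow\;
\frac{2}{F_{\max}} \le 14.29\ \text{ns} - 7.14\ \text{ns} \approx 7.14\ \text{ns} \\
F_{\max} &\ge \frac{2}{7.14\ \text{ns}} \approx 280\ \text{MHz}
\end{align*}
```

So with a 4:1 clock ratio the estimate is met only if the simple register-to-register paths can run at roughly the fast clock rate or better. This is only an estimate; the static timing report remains the real check.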

Whatever you do, please double-check my assumptions
for yourself before doing anything upon which your
life, fortune or good name depends. Clock domain
crossings have been the undoing of many.

See also the "Flancter", and standard asynchronous FIFOs
in the FPGA macrocell library (although they will be much
more resource-hungry than the simple freeze, because they
must work for all combinations of source and target
clock frequency).
--
Jonathan Bromley

Nicholas Kinar
Guest

Wed Jan 06, 2010 11:18 pm   



Hello John--

Thank you for your response!

Quote:
The "flag" is the important item in that description. If you never
have more than one data value to transfer within any 8 (or so)
high-speed clock cycles you can get by with transferring one value at a
time. If you have bursts of data, you need a FIFO but the average
speed cannot be greater than one in four high-speed clocks. The FIFO
would need to be sized such that the longest burst could always be
drained.

Essentially what I have is a 108-bit register which holds samples from
six 18-bit ADCs. Once the register is full of data, I bring high an
"offload_flag" signal which is read in the lower-speed clock domain.
Once the "offload_flag" signal goes high, the 108-bit register is copied
into another register in the slow clock domain.

Then logic in the lower-speed clock domain brings high an
"rs_offload_flag" signal, which is read in the high speed clock domain.
When the "rs_offload_flag" signal goes high, logic in the high speed
clock domain then brings low the "offload_flag" signal.

This code fails timing analysis. There is no more than one data value to
transfer within 8 high speed clock cycles.

Perhaps my problem is that I need to use a synchronizer to bring the
"offload_flag" signal and the "rs_offload_flag" signal between clock
domains?

Quote:

When you have new data in the fast domain, toggle a single bit. Read
(register) that single bit in the slow domain. If the bit has
changed, load the data on the next cycle. You keep track of whether
the bit has changed with a simple clock delay of that bit in the slow
domain.

So I would have to keep track of the state of the single bit in the slow
domain? Would this involve having a register that holds the previous
value of the bit? Every clock cycle, the register would be monitored
for a change. If there is a transition in the bit, then the register is
read.

Then my register would have 109 bits = 108 bits data + 1 bit for transfer.


Quote:

Why not just load the data on the same clock the bit changes, using an
XOR of the fast and slow flag bits for an enable? If the clocks
aren't synchronous with guaranteed setup and hold, the enable may get
to some bits on one side of the clock transition, other bits on the
opposite side resulting in "half" transferred data.

What is the difference between the "fast" and "slow" flag bits? Do you
mean that there are two flag bits?

Quote:

I mentioned earlier to "toggle" a single bit in the fast domain. This
eliminates the need to have a reset handshake back from the slow
domain; it's only when the bit changes that a write occurs. This
points out that you can't have the bit toggle twice within one slow-
domain clock cycle or the change won't be seen and data lost. Also,
since there's a full slow clock cycle between registering the bit and
performing the data load, the data has to remain static for that
duration.

I think that I understand how to do this. The state of the single bit
(say data[0] in a 108-bit data word) is examined for a change in the
slow clock domain. If the bit changes, then it is time to read the
data. The 108-bit register is simply copied into another register that
is in the slow clock domain.

Quote:

If you need more help than descriptions, write again. I love to see
people think through the issue and understand why they write the code
they do.


Yes - it's far easier to write the code yourself than struggle through
understanding lines and lines of code that has been written by someone else.

Nicholas

Nicholas Kinar
Guest

Wed Jan 06, 2010 11:30 pm   



Hello Jonathon--


Thank you for your response!

Quote:

If you are certain that the source clock is more than 2x faster
than your target clock, I think it's rather straightforward.


Yes, the source clock and the target clock are produced by the same PLL.
Since 280MHz is 4 times faster than 70MHz, all that is being used
here is a clock multiplier.


Quote:
Create a divide-by-2 signal in the target domain. No
reset is required, because the phase of the divide-by-2
is irrelevant; only its changes are of interest. So
we use a Verilog model that doesn't need a reset in
simulation either:

always @(posedge slow_clock)
    if (slow_flag == 1'b1)
        slow_flag <= 0;
    else
        slow_flag <= 1;


I see - all that is being done here is a simple clock divisor. Perhaps
this could also be done with a shift register?

Quote:
In the source domain, put the new data in your freeze
register as soon as you detect a change on slow_flag,
taking care to resynchronize slow_flag to avoid the risk
of input hazards. Again no reset is required; it'll
sort itself out within three clock cycles.

always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;
    old_slow_flag <= resync_slow_flag;
    if (old_slow_flag != resync_slow_flag) begin
        freeze_register <= source_data;
        // Do whatever it takes to indicate that
        // source_data has been consumed, and make
        // the next source_data available no more
        // than 2 fast clocks later
    end
end

And finally, capture freeze_register on every slow_clock:

always @(posedge slow_clock)
    useful_data <= freeze_register;


Very neat. It seems like a clean way to do this.


Quote:
In this way you can get a new data value on every slow_clock.
You can carry "data valid" information along with the data
itself, if you don't have a new data item soon enough for
every slow clock.

Draw lots of timing diagrams, and do lots of worst-case
analysis, to convince yourself whether this very simple
approach is robust in your situation. I believe that
it works reliably provided the clock periods obey the
following relationship:

slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps

where Tss is the setup time of the useful_data register,
Tsf is the setup time of the resync_slow_flag register,
Tpf is the propagation delay (including routing) from
fast clock to the freeze_register data becoming available
at the useful_data register's input, and Tps is the
propagation delay from slow clock to slow_flag becoming
available at the input to resync_slow_flag.

Note that Tss+Tpf and Tsf+Tps are both pretty much equal
to the shortest clock period that the FPGA can usefully
use, since they are both simple register-to-register
paths with no intervening logic. So, as a first
estimate, you could say

slow_period >= (2*fast_period) + (2/Fmax)

where Fmax is the FPGA's fastest useful clock speed.
But you'll need to apply timing constraints and check
the static timing results to be sure that you are safe.

Whatever you do, please double-check my assumptions
for yourself before doing anything upon which your
life, fortune or good name depends. Clock domain
crossings have been the undoing of many.

See also the "Flancter", and standard asynchronous FIFOs
in the FPGA macrocell library (although they will be much
more resource-hungry than the simple freeze, because they
must work for all combinations of source and target
clock frequency).

I will take a look. Thanks for your help, Jonathon!


Nicholas

John_H
Guest

Wed Jan 06, 2010 11:34 pm   



On Jan 6, 3:11 pm, Nicholas Kinar <n.ki...@usask.ca> wrote:
Quote:
Hello--

What is the best standard practice to have a data bus cross a clock
domain by implementing a data freeze?

There is an extremely brief description of the data freeze given here:
http://www.fpga4fun.com/CrossClockDomain4.html

What is the best way to "freeze" the data in the source clock domain?

I have a 108-bit bus which needs to cross between a high-speed clock
domain (280MHz) and a clock domain operated at a lower speed (70MHz).

I am using Verilog as the HDL and my FPGA is a Cyclone II.

Nicholas

The "flag" is the important item in that description. If you never
have more than one data value to transfer within any 8 (or so)
high-speed clock cycles you can get by with transferring one value at a
time. If you have bursts of data, you need a FIFO but the average
speed cannot be greater than one in four high-speed clocks. The FIFO
would need to be sized such that the longest burst could always be
drained.

When you have new data in the fast domain, toggle a single bit. Read
(register) that single bit in the slow domain. If the bit has
changed, load the data on the next cycle. You keep track of whether
the bit has changed with a simple clock delay of that bit in the slow
domain.

Why not just load the data on the same clock the bit changes, using an
XOR of the fast and slow flag bits for an enable? If the clocks
aren't synchronous with guaranteed setup and hold, the enable may get
to some bits on one side of the clock transition, other bits on the
opposite side resulting in "half" transferred data.

I mentioned earlier to "toggle" a single bit in the fast domain. This
eliminates the need to have a reset handshake back from the slow
domain; it's only when the bit changes that a write occurs. This
points out that you can't have the bit toggle twice within one slow-
domain clock cycle or the change won't be seen and data lost. Also,
since there's a full slow clock cycle between registering the bit and
performing the data load, the data has to remain static for that
duration.
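
A minimal sketch of the toggle-bit scheme described above. The module, port and signal names are assumptions made for illustration, as is the use of register initializers for the power-up values. For truly asynchronous clocks an additional synchronizing stage on the flag may be advisable; that is not shown here.

```verilog
// Sketch only: module, port and signal names are illustrative assumptions.
module toggle_crossing #(
    parameter WIDTH = 108
) (
    input  wire             fast_clock,
    input  wire             new_sample,   // one fast-clock pulse per new word
    input  wire [WIDTH-1:0] sample_data,
    input  wire             slow_clock,
    output reg              slow_valid,   // one slow-clock pulse per received word
    output reg  [WIDTH-1:0] slow_data
);
    reg [WIDTH-1:0] hold_data;          // must stay static while the slow side reads it
    reg             flag      = 1'b0;   // toggles once per word (fast domain)
    reg             flag_sync = 1'b0;   // flag registered in the slow domain
    reg             flag_prev = 1'b0;   // one slow clock older than flag_sync

    // Fast domain: store the word and toggle the flag.
    always @(posedge fast_clock)
        if (new_sample) begin
            hold_data <= sample_data;
            flag      <= ~flag;
        end

    // Slow domain: register the flag, then load the data on the cycle
    // after the registered flag is seen to change.
    always @(posedge slow_clock) begin
        flag_sync  <= flag;
        flag_prev  <= flag_sync;
        slow_valid <= 1'b0;
        if (flag_sync != flag_prev) begin
            slow_data  <= hold_data;
            slow_valid <= 1'b1;  // later assignment overrides the default
        end
    end
endmodule
```

Because only the change of the flag matters, no handshake back to the fast domain is needed, but new_sample must not pulse twice within one slow clock period.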

If you need more help than descriptions, write again. I love to see
people think through the issue and understand why they write the code
they do.

- John

Nicholas Kinar
Guest

Thu Jan 07, 2010 1:37 am   



Hello Jonathan--

Quote:
In the source domain, put the new data in your freeze
register as soon as you detect a change on slow_flag,
taking care to resynchronize slow_flag to avoid the risk
of input hazards. Again no reset is required; it'll
sort itself out within three clock cycles.

always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;
    old_slow_flag <= resync_slow_flag;
    if (old_slow_flag != resync_slow_flag) begin
        freeze_register <= source_data;
        // Do whatever it takes to indicate that
        // source_data has been consumed, and make
        // the next source_data available no more
        // than 2 fast clocks later
    end
end

What I don't understand is the assignment logic at the top of this
always block. Isn't "old_slow_flag" equal to "resync_slow_flag"?

What is the best way to detect a change on "slow_flag"?

Once again, thank you so much for your help.

Nicholas

(and I am sorry for misspelling your name in previous posts...)

Nicholas Kinar
Guest

Thu Jan 07, 2010 5:04 am   



Quote:

What I don't understand is the assignment logic at the top of this
always block. Isn't "old_slow_flag" equal to "resync_slow_flag"?


To help me better understand this, I've re-written the code:

reg old_slow_flag;
reg slow_flag;

always @(posedge fast_clock) begin

    if (old_slow_flag != slow_flag) begin

        old_slow_flag <= slow_flag;
        freeze_register <= source_data;

        // Do whatever it takes to indicate that
        // source_data has been consumed, and make
        // the next source_data available no more
        // than 2 fast clocks later

    end
end

Nicholas Kinar
Guest

Thu Jan 07, 2010 5:14 am   



Quote:

If you are certain that the source clock is more than 2x faster
than your target clock, I think it's rather straightforward.


The examples are very helpful. Thank you for posting terse snippets of
code (rather than a large entire example program with lines and lines of
code).

Nicholas

Nicholas Kinar
Guest

Thu Jan 07, 2010 5:23 am   



Quote:

Draw lots of timing diagrams, and do lots of worst-case
analysis, to convince yourself whether this very simple
approach is robust in your situation. I believe that
it works reliably provided the clock periods obey the
following relationship:

slow_period >= (2*fast_period) + Tss + Tpf + Tsf + Tps


Yes - I just tried this using the Quartus II synthesis tools. Many
thanks for posting this procedure! I've verified that the solution you
propose passes timing analysis in Quartus II for my particular design.

Quote:

slow_period >= (2*fast_period) + (2/Fmax)



Quote:
where Fmax is the FPGA's fastest useful clock speed.
But you'll need to apply timing constraints and check
the static timing results to be sure that you are safe.

Whatever you do, please double-check my assumptions
for yourself before doing anything upon which your
life, fortune or good name depends. Clock domain
crossings have been the undoing of many.


To me, this equation seems to be very reasonable. The procedure also
works well.


Quote:
See also the "Flancter", and standard asynchronous FIFOs
in the FPGA macrocell library (although they will be much
more resource-hungry than the simple freeze, because they
must work for all combinations of source and target
clock frequency).

Agreed. Thanks again for your help, Jonathan!

Nicholas Kinar
Guest

Thu Jan 07, 2010 5:27 am   



Quote:

Perhaps my problem is that I need to use a synchronizer to bring the
"offload_flag" signal and the "rs_offload_flag" signal between clock
domains?

No, this is not the problem. Although a synchronizer is indeed useful
for these signals, it is not the cause of the failing timing analysis.


Quote:

When you have new data in the fast domain, toggle a single bit. Read
(register) that single bit in the slow domain. If the bit has
changed, load the data on the next cycle. You keep track of whether
the bit has changed with a simple clock delay of that bit in the slow
domain.


Toggling a single bit is indeed a solution. Thank you for suggesting this.

Nicholas Kinar
Guest

Thu Jan 07, 2010 5:51 am   



Quote:

I think that I understand how to do this. The state of the single bit
(say data[0] in a 108-bit data word) is examined for a change in the
slow clock domain. If the bit changes, then it is time to read the
data. The 108-bit register is simply copied into another register that
is in the slow clock domain.


Whoops, I think that I mean 109-bit data word here, since [108:1] would
be the data, and data[0] would be the flag bit. Any other combination
of data and flag bits would also be possible.

My apologies.

Alternately, data[108] could be the flag bit, and [107:0] would then be
the data. The possibilities are endless, but it is probably best to
assign the MSB or the LSB as the flag bit - for the sake of simplicity.

Jonathan Bromley
Guest

Thu Jan 07, 2010 11:09 am   



This really belongs on comp.lang.verilog, but the topic
has been flogged to death there; and Verilog users on
comp.arch.fpga may find it vaguely relevant.

On Wed, 06 Jan 2010 21:04:28 -0600, Nicholas Kinar wrote:

Quote:
What I don't understand is the assignment logic at the top of this
always block. Isn't "old_slow_flag" equal to "resync_slow_flag"?

No, it's not. I carefully and correctly used a nonblocking
assignment <=, causing a delayed update of the target.
Whatever else you do, please get a robust understanding
of the relationship between blocking = and nonblocking <=
assignment before you do anything further with Verilog.

Quote:
To help me better understand this, I've re-written the code:

reg old_slow_flag;
reg slow_flag;

always @(posedge fast_clock) begin

    if (old_slow_flag != slow_flag) begin

        old_slow_flag <= slow_flag;
        freeze_register <= source_data;

        // Do whatever it takes to indicate that
        // source_data has been consumed, and make
        // the next source_data available no more
        // than 2 fast clocks later

    end
end

OK, I see what you're saying, but I'm afraid it's
completely wrong. I'm not quite sure how to respond
to this, but as a first try I'll explain why your
code is broken and then show you a slightly different
(and perhaps clearer) way to organize the design I
first offered.

Error #1: Your model will never start running in
simulation, because both regs initialize to 1'bX
(as do all regs in Verilog) and the comparison
(old_slow_flag != slow_flag)
evaluates to "unknown", because both sides of the
equality are unknown. When if() tests an unknown
condition, it (by definition) takes the "else"
branch; your code has no "else", so the if()
statement will do nothing. Consequently the regs
will remain stuck at X in simulation. By contrast,
synthesis can't handle X values and will give you
the logic you expect; but that's probably wrong
too, because....

Error #2: slow_flag is generated in the slow (target)
clock domain, and therefore may not have enough
setup and hold time relative to the fast (source) clock.
Consequently you could get an input hazard:
the comparator result (old != new), which is used
to enable various flipflops, might be detected
"true" by some of the flops and "false" by others.
That's why I took care to resynchronize it.

Subsequently you said that the fast clock is known
to be exactly 4x the slow clock; indeed, with a PLL,
you can get the slow clock's edges exactly lined up
with the fast clock's and you don't need to worry
about clock domain crossing at all. That somewhat
changes the goalposts, but my original solution
remains valid even when the two clocks are completely
asynchronous.

OK, so now let's go back to my example and see
(a) why it's right and (b) how I could recode it
to be a little easier to follow.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
THIS IS THE BIT THAT YOU REALLY MUST UNDERSTAND.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

always @(posedge fast_clock) begin                  // line 1
    resync_slow_flag <= slow_flag;                  // line 2
    old_slow_flag <= resync_slow_flag;              // line 3
    if (old_slow_flag != resync_slow_flag) begin    // 4
        freeze_register <= source_data;
        // Do other related things
    end
end

On line 1, the always block waits for the next clock
edge. So far, so good.
On line 2, we copy the asynchronous signal "slow_flag"
into its resynchronizing register. However, we use <=
nonblocking assignment, which means that the updating of
resync_slow_flag is postponed until all other activity
caused by the clock edge has finished. This postponed
signal update behaviour nicely models the clock-to-output
delay of real flip-flops. In particular, it means that...
On line 3, we copy resync_slow_flag to old_slow_flag.
But the value of resync_slow_flag that we copy HAS NOT YET
BEEN UPDATED by the nonblocking assignment. In other
words, we get the flipflop's value as it was just before
the clock edge. Similarly, on line 4 the values tested
by the if() expression are the values as they were just
before the clock edge. The same would be true if those
values were tested in any other always block triggered by
the same clock event; the value that you read is the value
as it was just before (and also at the moment of) the clock
edge. New values assigned using <= are projected future
values; they will take effect just after the clock edge
and, in particular, AFTER the execution of any always
block that was triggered by the same clock. This is
precisely how you get proper synchronous behaviour in
Verilog RTL code.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
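
A small simulation-only sketch of the point above, assuming a plain two-register chain. The module name nonblocking_demo and the signals d, q1 and q2 are hypothetical names chosen for illustration.

```verilog
// Simulation-only sketch; module and signal names are illustrative assumptions.
module nonblocking_demo;
    reg clock = 1'b0;
    reg d  = 1'b0;   // input to the chain
    reg q1 = 1'b0;   // first flip-flop
    reg q2 = 1'b0;   // second flip-flop

    always #5 clock = ~clock;

    // Both right-hand sides are sampled before either target updates,
    // so q2 receives the value q1 held just before the clock edge.
    always @(posedge clock) begin
        q1 <= d;
        q2 <= q1;
    end

    initial begin
        $monitor("t=%0t d=%b q1=%b q2=%b", $time, d, q1, q2);
        #2  d = 1'b1;   // change the input between clock edges
        #30 $finish;
    end
endmodule
```

With nonblocking assignment, q1 follows d at the first clock edge after the change and q2 follows one edge later, whichever order the two assignments are written in.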

However, some folk (particularly those coming from a
software background, where assignments always update
their targets immediately) find this sort of thing a little
awkward to read. So you can, if you wish, re-code the
always block in a more software-like style provided you
give it some local variables.

always @(posedge fast_clock) begin : my_resync      // 1
    reg resync_slow_flag, old_slow_flag;            // 2
    if (old_slow_flag != resync_slow_flag) begin    // 3
        freeze_register <= source_data;             // 4
        // Do other related things                  // 5
    end                                             // 6
    old_slow_flag = resync_slow_flag;               // 7
    resync_slow_flag = slow_flag;                   // 8
end

By labelling the "begin" on line 1, we are entitled to
declare local variables in the block (line 2). Note that
these variables are STATIC; they are NOT re-initialized
each time the block executes, but instead they hold their
value from its previous execution.
On line 3 we test those variables' values; of course, the
values we get are the values left over from the previous
clock - that's flipflop behaviour.
On lines 7 and 8 we update the variables using blocking
assignment, which takes effect immediately. Consequently
we must respect your original concerns, and reverse the
order of the two assignments so that the updating of
old_slow_flag uses the old, rather than the updated, value
of resync_slow_flag.

This use of local variables has a number of benefits. We
have hidden the variables resync_* and old_* inside the
begin...end block where they're used, and they are not
(easily) accessible by other code that shouldn't be
concerned with them. We go back to a software-like model
of execution in which local variables update instantly.
However, we MUST continue to use nonblocking assignment
to freeze_register because it will be read by other blocks
of code, outside this particular always block. Whenever
you write to a variable within a clocked process, and that
variable will be read by any other process, it's essential
to use nonblocking assignment in this way or else you will
get catastrophic race conditions in simulation, and mismatch
between simulation and synthesis behaviour.

Asks with trepidation..... where did you learn your Verilog,
without finding out about the correct usage of <= ???
--
Jonathan Bromley

Jonathan Bromley
Guest

Thu Jan 07, 2010 11:15 am   



On Wed, 06 Jan 2010 15:30:34 -0600, Nicholas Kinar wrote:

Quote:
Create a divide-by-2 signal in the target domain. No
reset is required, because the phase of the divide-by-2
is irrelevant; only its changes are of interest. So
we use a Verilog model that doesn't need a reset in
simulation either:

always @(posedge slow_clock)
    if (slow_flag == 1'b1)
        slow_flag <= 0;
    else
        slow_flag <= 1;


I see - all that is being done here is a simple clock divisor. Perhaps
this could also be done with a shift register?

I'm not quite sure what you mean here. In principle, the simplest
possible clock divisor is

always @(posedge clock)
    clock_2 <= ~clock_2;

It is indeed a shift register; on each clock it shifts
the inverse of itself into itself, and it has only 1 stage.

But this doesn't work in simulation because reg clock_2 will
initialize to 1'bx, and ~(1'bx) is 1'bx so the divisor will be
stuck at X. It would work in synthesis.

My model works around the stuck-at-X problem without requiring
any explicit reset in simulation. There are several other ways
to get the same effect, but the one I offered is probably the
least hassle and the most portable across tools. It will
synthesise to just one flipflop and one inverter; I don't see
how you can do any better.
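
The difference between the two dividers can be seen in a short simulation-only sketch. The module name divider_x_demo and the register names naive_div and safe_div are hypothetical, chosen for illustration.

```verilog
// Simulation-only sketch; module and signal names are illustrative assumptions.
module divider_x_demo;
    reg clock = 1'b0;
    reg naive_div;   // starts as 1'bx and never recovers
    reg safe_div;    // starts as 1'bx and recovers on the first clock

    always #5 clock = ~clock;

    // ~(1'bx) is 1'bx, so this stays unknown in simulation.
    always @(posedge clock)
        naive_div <= ~naive_div;

    // (1'bx == 1'b1) is unknown, so the if() takes the else branch
    // and the register is loaded with a known 1.
    always @(posedge clock)
        if (safe_div == 1'b1)
            safe_div <= 1'b0;
        else
            safe_div <= 1'b1;

    initial begin
        $monitor("t=%0t naive_div=%b safe_div=%b", $time, naive_div, safe_div);
        #40 $finish;
    end
endmodule
```

In simulation naive_div remains x throughout, while safe_div becomes 1 at the first rising edge and toggles on every edge after that.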

--
Jonathan Bromley

Nicholas Kinar
Guest

Thu Jan 07, 2010 5:18 pm   



Hello Jonathan--

Thank you once again for your detailed response. This is very much
appreciated.

Quote:
This really belongs on comp.lang.verilog, but the topic
has been flogged to death there; and Verilog users on
comp.arch.fpga may find it vaguely relevant.


Sure, but I find that comp.arch.fpga tends to be slightly more active
than comp.lang.verilog. So when I weighed the possibilities of posting
there, I thought that I would go with comp.arch.fpga because clock
domain crossing is a topic that encompasses all work with FPGAs.

Quote:

Error #1: Your model will never start running in
simulation, because both regs initialize to 1'bX
(as do all regs in Verilog) and the comparison
(old_slow_flag != slow_flag)
evaluates to "unknown", because both sides of the
equality are unknown. When if() tests an unknown
condition, it (by definition) takes the "else"
branch; your code has no "else", so the if()
statement will do nothing. Consequently the regs
will remain stuck at X in simulation. By contrast,
synthesis can't handle X values and will give you
the logic you expect; but that's probably wrong
too, because....

I try to initialize registers so that everything is in a known state.

reg old_slow_flag = 0;
reg slow_flag = 0;

But your example code takes into consideration the 1'bX condition and is
more robust.

Quote:

Error #2: slow_flag is generated in the slow (target)
clock domain, and therefore may not have enough
setup and hold time relative to the fast (source) clock.
Consequently you could get an input hazard:
the comparator result (old != new), which is used
to enable various flipflops, might be detected
"true" by some of the flops and "false" by others.
That's why I took care to resynchronize it.


Okay, now I understand the reasons for resynchronization.


Quote:

OK, so now let's go back to my example and see
(a) why it's right and (b) how I could recode it
to be a little easier to follow.

Thank you for going over this line-by-line.


Quote:
always @(posedge fast_clock) begin                  // line 1
    resync_slow_flag <= slow_flag;                  // line 2
    old_slow_flag <= resync_slow_flag;              // line 3
    if (old_slow_flag != resync_slow_flag) begin    // 4
        freeze_register <= source_data;
        // Do other related things
    end
end




Quote:
On line 2, we copy the asynchronous signal "slow_flag"
into its resynchronizing register. However, we use <=
nonblocking assignment, which means that the updating of
resync_slow_flag is postponed until all other activity
caused by the clock edge has finished. This postponed
signal update behaviour nicely models the clock-to-output
delay of real flip-flops. In particular, it means that...

This nicely explains what is happening with the non-blocking assignment.

Quote:
On line 3, we copy resync_slow_flag to old_slow_flag.
But the value of resync_slow_flag that we copy HAS NOT YET
BEEN UPDATED by the nonblocking assignment.

Ah! - okay. This explains the behavior of the code.

Quote:
In other words, we get the flipflop's value as it was just before
the clock edge. Similarly, on line 4 the values tested
by the if() expression are the values as they were just
before the clock edge. The same would be true if those
values were tested in any other always block triggered by
the same clock event; the value that you read is the value
as it was just before (and also at the moment of) the clock
edge. New values assigned using <= are projected future
values; they will take effect just after the clock edge
and, in particular, AFTER the execution of any always
block that was triggered by the same clock. This is
precisely how you get proper synchronous behaviour in
Verilog RTL code.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

However, some folk (particularly those coming from a
software background, where assignments always update
their targets immediately) find this sort of thing a little
awkward to read.

That would be me!


Quote:
So you can, if you wish, re-code the
always block in a more software-like style provided you
give it some local variables.

always @(posedge fast_clock) begin : my_resync      // 1
    reg resync_slow_flag, old_slow_flag;            // 2
    if (old_slow_flag != resync_slow_flag) begin    // 3
        freeze_register <= source_data;             // 4
        // Do other related things                  // 5
    end                                             // 6
    old_slow_flag = resync_slow_flag;               // 7
    resync_slow_flag = slow_flag;                   // 8
end


This is a nice example.


Quote:
By labelling the "begin" on line 1, we are entitled to
declare local variables in the block (line 2). Note that
these variables are STATIC; they are NOT re-initialized
each time the block executes, but instead they hold their
value from its previous execution.
On line 3 we test those variables' values; of course, the
values we get are the values left over from the previous
clock - that's flipflop behaviour.
On lines 7 and 8 we update the variables using blocking
assignment, which takes effect immediately. Consequently
we must respect your original concerns, and reverse the
order of the two assignments so that the updating of
old_slow_flag uses the old, rather than the updated, value
of resync_slow_flag.

This use of local variables has a number of benefits. We
have hidden the variables resync_* and old_* inside the
begin...end block where they're used, and they are not
(easily) accessible by other code that shouldn't be
concerned with them. We go back to a software-like model
of execution in which local variables update instantly.
However, we MUST continue to use nonblocking assignment
to freeze_register because it will be read by other blocks
of code, outside this particular always block. Whenever
you write to a variable within a clocked process, and that
variable will be read by any other process, it's essential
to use nonblocking assignment in this way or else you will
get catastrophic race conditions in simulation, and mismatch
between simulation and synthesis behaviour.




Quote:
Asks with trepidation..... where did you learn your Verilog,
without finding out about the correct usage of <= ???


I am actually an environmental physicist who has had to teach
myself Verilog and everything to do with FPGAs. It is impossible to
really grasp the nuances of the language, the technology and the ideas
without participating in some sort of community, despite the
proliferation of "crash-courses" and "learn HDL in a week" materials.
Such courses and books may get you started, but there is always a lack
of experience. That's the reason why I rely on your help and the help
of other people on this newsgroup.

Nicholas Kinar
Guest

Thu Jan 07, 2010 5:27 pm   



Quote:

always @(posedge slow_clock)
    if (slow_flag == 1'b1)
        slow_flag <= 0;
    else
        slow_flag <= 1;

I see - all that is being done here is a simple clock divisor. Perhaps
this could also be done with a shift register?

I'm not quite sure what you mean here. In principle, the simplest
possible clock divisor is

always @(posedge clock)
    clock_2 <= ~clock_2;

It is indeed a shift register; on each clock it shifts
the inverse of itself into itself, and it has only 1 stage.

But this doesn't work in simulation because reg clock_2 will
initialize to 1'bx, and ~(1'bx) is 1'bx so the divisor will be
stuck at X. It would work in synthesis.

I'm fond of setting the register to zero when it is initialized

reg clock_2 = 0;

But I believe that some older versions of the Verilog language standard
do not support this, and your code always ensures that this will work.

Quote:

My model works around the stuck-at-X problem without requiring
any explicit reset in simulation. There are several other ways
to get the same effect, but the one I offered is probably the
least hassle and the most portable across tools. It will
synthesise to just one flipflop and one inverter; I don't see
how you can do any better.


Yes I agree - it's a very clean solution.
