Al Momen
Guest
Sun Feb 07, 2010 2:33 pm
Quote:
This really belongs on comp.lang.verilog, but the topic
has been flogged to death there; and Verilog users on
comp.arch.fpga may find it vaguely relevant.
On Wed, 06 Jan 2010 21:04:28 -0600, Nicholas Kinar wrote:
What I don't understand is the assignment logic at the top of this
always block. Isn't "old_slow_flag" equal to "resync_slow_flag"?
No, it's not. I carefully and correctly used a nonblocking
assignment <=, causing a delayed update of the target.
Whatever else you do, please get a robust understanding
of the relationship between blocking = and nonblocking <=
assignment before you do anything further with Verilog.
To help me better understand this, I've re-written the code:
reg old_slow_flag;
reg slow_flag;
always @(posedge fast_clock) begin
    if (old_slow_flag != slow_flag) begin
        old_slow_flag <= slow_flag;
        freeze_register <= source_data;
        // Do whatever it takes to indicate that
        // source_data has been consumed, and make
        // the next source_data available no more
        // than 2 fast clocks later
    end
end
OK, I see what you're saying, but I'm afraid it's
completely wrong. I'm not quite sure how to respond
to this, but as a first try I'll explain why your
code is broken and then show you a slightly different
(and perhaps clearer) way to organize the design I
first offered.
Error #1: Your model will never start running in
simulation, because both regs initialize to 1'bX
(as do all regs in Verilog) and the comparison
(old_slow_flag != slow_flag)
evaluates to "unknown", because both sides of the
equality are unknown. When if() tests an unknown
condition, it (by definition) takes the "else"
branch; your code has no "else", so the if()
statement will do nothing. Consequently the regs
will remain stuck at X in simulation. By contrast,
synthesis can't handle X values and will give you
the logic you expect; but that's probably wrong
too, because....
Error #2: slow_flag is generated in the source
clock domain, and therefore may not have enough
setup and hold time relative to the target clock.
Consequently you could get an input hazard:
the comparator result (old != new), which is used
to enable various flipflops, might be detected
"true" by some of the flops and "false" by others.
That's why I took care to resynchronize it.
Subsequently you said that the fast clock is known
to be exactly 4x the slow clock; indeed, with a PLL,
you can get the slow clock's edges exactly lined up
with the fast clock's and you don't need to worry
about clock domain crossing at all. That somewhat
changes the goalposts, but my original solution
remains valid even when the two clocks are completely
asynchronous.
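The X problem described under Error #1 is easy to reproduce. The
following is only a minimal sketch, not part of the original design:
the module name x_compare_demo and the $display messages are invented
for illustration. It assumes a standard Verilog-2001 simulator. With
both regs left uninitialized, the first if() should fall through to
its "else" branch; once the regs hold known values the comparison
behaves as expected.

```verilog
// Hypothetical demo module; names are illustrative only.
module x_compare_demo;
  reg old_slow_flag;   // never initialized, so it starts as 1'bx
  reg slow_flag;       // never initialized, so it starts as 1'bx

  initial begin
    // x != x evaluates to x, and if() treats an unknown condition as false.
    if (old_slow_flag != slow_flag)
      $display("if branch taken");
    else
      $display("else branch taken, condition = %b",
               old_slow_flag != slow_flag);

    // With known values on both sides the comparison is meaningful.
    old_slow_flag = 1'b0;
    slow_flag     = 1'b1;
    if (old_slow_flag != slow_flag)
      $display("if branch taken after initialization");

    $finish;
  end
endmodule
```

In the rewritten code quoted above there is no "else" and nothing else
ever assigns the regs, which is why they stay at X for the whole
simulation.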
OK, so now let's go back to my example and see
(a) why it's right and (b) how I could recode it
to be a little easier to follow.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
THIS IS THE BIT THAT YOU REALLY MUST UNDERSTAND.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
always @(posedge fast_clock) begin                  // line 1
    resync_slow_flag <= slow_flag;                  // line 2
    old_slow_flag <= resync_slow_flag;              // line 3
    if (old_slow_flag != resync_slow_flag) begin    // line 4
        freeze_register <= source_data;
        // Do other related things
    end
end
On line 1, the always block waits for the next clock
edge. So far, so good.
On line 2, we copy the asynchronous signal "slow_flag"
into its resynchronizing register. However, we use <=
nonblocking assignment, which means that the updating of
resync_slow_flag is postponed until all other activity
caused by the clock edge has finished. This postponed
signal update behaviour nicely models the clock-to-output
delay of real flip-flops. In particular, it means that...
On line 3, we copy resync_slow_flag to old_slow_flag.
But the value of resync_slow_flag that we copy HAS NOT YET
BEEN UPDATED by the nonblocking assignment. In other
words, we get the flipflop's value as it was just before
the clock edge. Similarly, on line 4 the values tested
by the if() expression are the values as they were just
before the clock edge. The same would be true if those
values were tested in any other always block triggered by
the same clock event; the value that you read is the value
as it was just before (and also at the moment of) the clock
edge. New values assigned using <= are projected future
values; they will take effect just after the clock edge
and, in particular, AFTER the execution of any always
block that was triggered by the same clock. This is
precisely how you get proper synchronous behaviour in
Verilog RTL code.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
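To watch this behaviour in a simulator, the always block can be wrapped
in a small self-contained testbench. This is a sketch under stated
assumptions: the module name resync_demo, the 10-time-unit clock
period, the 8-bit data width, the test values and the $display message
are all invented for illustration, and slow_flag is assumed to toggle
once each time the source presents a new data item.

```verilog
// Hypothetical testbench; everything outside the always block is assumed.
module resync_demo;
  reg        fast_clock  = 1'b0;
  reg        slow_flag   = 1'b0;    // assumed: toggles once per new data item
  reg  [7:0] source_data = 8'h00;   // assumed width
  reg        resync_slow_flag, old_slow_flag;
  reg  [7:0] freeze_register;

  always #5 fast_clock = ~fast_clock;   // assumed clock period

  always @(posedge fast_clock) begin
    resync_slow_flag <= slow_flag;          // update is postponed
    old_slow_flag    <= resync_slow_flag;   // reads the pre-edge value
    if (old_slow_flag != resync_slow_flag) begin
      freeze_register <= source_data;
      $display("time %0d: captured %h", $time, source_data);
    end
  end

  initial begin
    // Change the flag away from clock edges, as an unrelated domain would.
    #42 source_data = 8'hA5; slow_flag = ~slow_flag;
    #40 source_data = 8'h3C; slow_flag = ~slow_flag;
    #40 $finish;
  end
endmodule
```

Each toggle of slow_flag should produce exactly one capture, on the
second fast clock edge after the toggle: the first edge moves the new
value into resync_slow_flag, and on the second edge the if() sees
old_slow_flag and resync_slow_flag differ. The first couple of edges
after time zero compare X values and therefore capture nothing.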
However, some folk (particularly those coming from a
software background, where assignments always update
their targets immediately) find this sort of thing a little
awkward to read. So you can, if you wish, re-code the
always block in a more software-like style provided you
give it some local variables.
always @(posedge fast_clock) begin : my_resync      // 1
    reg resync_slow_flag, old_slow_flag;            // 2
    if (old_slow_flag != resync_slow_flag) begin    // 3
        freeze_register <= source_data;             // 4
        // Do other related things                  // 5
    end                                             // 6
    old_slow_flag = resync_slow_flag;               // 7
    resync_slow_flag = slow_flag;                   // 8
end
By labelling the "begin" on line 1, we are entitled to
declare local variables in the block (line 2). Note that
these variables are STATIC; they are NOT re-initialized
each time the block executes, but instead they hold their
value from its previous execution.
On line 3 we test those variables' values; of course, the
values we get are the values left over from the previous
clock - that's flipflop behaviour.
On lines 7 and 8 we update the variables using blocking
assignment, which takes effect immediately. Consequently
we must respect your original concerns, and reverse the
order of the two assignments so that the updating of
old_slow_flag uses the old, rather than the updated, value
of resync_slow_flag.
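The effect of getting that order wrong can be seen by running both
orderings side by side. This is a hypothetical sketch: the module name
order_demo, the block labels good_order and bad_order, the hit counters
and the stimulus timing are all assumptions made for the demonstration.

```verilog
// Hypothetical comparison of the two blocking-assignment orders.
module order_demo;
  reg     fast_clock = 1'b0;
  reg     slow_flag  = 1'b0;
  integer good_hits  = 0;
  integer bad_hits   = 0;

  always #5 fast_clock = ~fast_clock;   // assumed clock period

  // Correct order: old_slow_flag takes the previous resync value.
  always @(posedge fast_clock) begin : good_order
    reg resync_slow_flag, old_slow_flag;
    if (old_slow_flag != resync_slow_flag)
      good_hits <= good_hits + 1;
    old_slow_flag    = resync_slow_flag;
    resync_slow_flag = slow_flag;
  end

  // Wrong order: old_slow_flag copies the value that was just written,
  // so the two variables are always equal when the next edge tests them.
  always @(posedge fast_clock) begin : bad_order
    reg resync_slow_flag, old_slow_flag;
    if (old_slow_flag != resync_slow_flag)
      bad_hits <= bad_hits + 1;
    resync_slow_flag = slow_flag;
    old_slow_flag    = resync_slow_flag;
  end

  initial begin
    #42 slow_flag = 1'b1;   // first toggle
    #40 slow_flag = 1'b0;   // second toggle
    #40 $display("good_hits=%0d bad_hits=%0d", good_hits, bad_hits);
    $finish;
  end
endmodule
```

With two toggles of slow_flag, good_order should count two changes and
bad_order none. The counters are written with <= because they are read
outside the blocks that write them.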
This use of local variables has a number of benefits. We
have hidden the variables resync_* and old_* inside the
begin...end block where they're used, and they are not
(easily) accessible by other code that shouldn't be
concerned with them. We go back to a software-like model
of execution in which local variables update instantly.
However, we MUST continue to use nonblocking assignment
to freeze_register because it will be read by other blocks
of code, outside this particular always block. Whenever
you write to a variable within a clocked process, and that
variable will be read by any other process, it's essential
to use nonblocking assignment in this way or else you will
get catastrophic race conditions in simulation, and mismatch
between simulation and synthesis behaviour.
Asks with trepidation..... where did you learn your Verilog,
without finding out about the correct usage of <= ???
--
Jonathan Bromley
Hello Jonathan,
I have simulated your code, but I am only capturing every other data
beat: the even beats (i.e. 0, 2, 4, 6, 8, ...) arrive and the odd beats are lost.
My slow clock satisfies slow_period >= (2*fast_period) + (2/Fmax); as an example,
the fast clock is 100 MHz and the slow clock is 48 MHz. The source data arrives
in bursts at the fast clock frequency.
From your solution, it seems the source data is being sampled on slow_flag,
that is, at half the fast clock rate. This will drop the second beat of each
pair, i.e. the odd data beats (1, 3, 5, 7, ...).
Please clarify whether I am missing something.
Jonathan Bromley
Guest
Sun Feb 07, 2010 3:45 pm
On Sun, 07 Feb 2010 07:33:49 -0600, "Al Momen" wrote:
Quote:
I have simulated your code, but I am only capturing every other data
beat: the even beats (i.e. 0, 2, 4, 6, 8, ...) arrive and the odd beats are lost.
My slow clock satisfies slow_period >= (2*fast_period) + (2/Fmax); as an example,
the fast clock is 100 MHz and the slow clock is 48 MHz. The source data arrives
in bursts at the fast clock frequency.
From your solution, it seems the source data is being sampled on slow_flag,
that is, at half the fast clock rate. This will drop the second beat of each
pair, i.e. the odd data beats (1, 3, 5, 7, ...).
Please clarify whether I am missing something.
I'm confused: do you get data on every cycle of the 100MHz clock?
If so, then you cannot expect to get every data item across
the interface unless you add some buffer storage (FIFO) on the
fast, source side. The original question, for which I wrote
the specific answer, was about data transfer _with data freeze_;
in other words, the source data must be held until the slow,
target clock domain has taken it. Clearly you can't do that
if you have new data on every cycle of the fast source clock.
FIFO storage allows you to buffer up a fast burst and then
transfer it slowly while the source is inactive. But
that's not what the original question was about. It
sounds as though your application needs a full-dress
two-clock FIFO.
--
Jonathan Bromley