valentin tihhomirov
Guest
Sun Jan 24, 2010 3:45 pm
KJ wrote:
Quote:
Why to use assignments to bring some order into the evaluation logic if
you admit yourself that it is almost impossible in practice to rely on
it? IMO, assignments exist in all languages for one important thing: you
compute once and save/share the result.
You're correct that one uses a concurrent signal assignment or a
variable assignment as a method to save (*1) a result to be used by
something else. The reason for doing this is to improve code
maintainability. It can also help with debug of the code when there
is a clear (and correct) match between the naming of the signal and
what it logically really represents.
Intermediate variables also hint to the compiler which intermediate type to
use in a composite function. I believe that this, along with improved
debugging, makes intermediate vars a preferred coding standard in SW
companies. Yet, I often preferred the conciseness of immediate
composition. Jonathan explained why -- I have a background in VHDL, where
assignments incur delays :)
Quote:
Where I differ with you is when you say "almost impossible in practice
to rely on it". Rely on it to do what?
Jonathan pointed out that a delta delay on assignment brings some order
into VHDL evaluation of events by simulator. We rely on VHDL to produce
a predictable behaviour.
Quote:
The "HD" in VHDL does stand
for 'hardware description', but that does not necessarily imply that
source code literally describes the precise implementation. If it
did, then one would expect the source code that someone writes to
describe functionality by instantiating look up tables (and defining
their contents), flip flops and whatever other primitives are
available in a particular FPGA. That description would be completely
useless and have to be completely re-written to use and/or arrays and
flops if targeting a CPLD type device. Both of those descriptions
would also be completely useless if the target implementation was
discrete logic parts.
So one must accept the paradox that the hardware that is being
described in the original source code is most likely NOT a description
of the actual implemented hardware. What is the point of a 'hardware
description' language that is typically not used by an end user to
describe the actual hardware implementation? In a word,
'productivity'. Having to change source code to describe different
physical implementations (i.e. targeting different types of devices)
is not as productive as describing a mythical hardware implementation
and then using tools to translate that into something that is
physically realizable in multiple forms.
Given that the original source describes mythical and not necessarily
real hardware, one has to consider what is the purpose of the various
tools that one will use to interpret that source code. Here you seem
to be thinking that the simulator should be simulating an actual
hardware description and if an assignment will end up resulting in no
hardware, just a wire connection, then there should be no simulation
induced delay. But the simulator is simulating the description of the
mythical hardware (i.e. what is actually in the source code). The
translation of the mythical hardware description into a description of
actual hardware is the job of a synthesis tool. Those are two
independent paths.
Nevertheless, synthesizers manage to produce the same behaviour HW from
the mythical one. They ensure a skewless clock distribution and hold
delay on output when meet a sync FF wait pattern. At that, synthesizers
do not demand the added delay hack.
Quote:
This added delay is just a hack that in certain situations may have
some merit...nothing more.
This merit is a confidence, which may be not too much for you but I
would prefer it built-in.
In general, your speech sounds like an attempt to justify imperfection
and discrepancy of sim from synth.
Regards.
Jonathan Bromley
Guest
Sun Jan 24, 2010 8:57 pm
On Sat, 23 Jan 2010 22:20:07 +0200, valentin tihhomirov wrote:
Quote:
Do you see that you first denounce the
implication and then reannounce it again?
Certainly, but I also made it clear that the second case
was an exception.
I'm guessing that you are working in English as a
second language. Please accept my apologies if
I have been too idiomatic or insufficiently precise.
Quote:
Why to use assignments to bring some order into the evaluation logic if
you admit yourself that it is almost impossible in practice to rely on
it?
I didn't say that. I said that it was impossible in practice to
balance delta delays for all receivers of a clock. That's a very
different problem.
If all receivers of a clock see it without delta delays, there
is no problem at all, and the delta-delay signal assignment
behaviour of VHDL does precisely what we need. It gives us
a straightforward, efficient abstraction for simulation
whilst ensuring that the same behaviour can be preserved
by synthesis. Without the delta delay on signal assignment,
it is very hard to maintain that matching behaviour.
Quote:
IMO, assignments exist in all languages for one important thing: you
compute once and save/share the result.
Variable assignment meets that need in VHDL.
Quote:
The benefit of manually inserting the trigger hold delay is that you do
it almost thoughtlessly. But doing it per every FF is also quite
tiresome. Doing this automatically by simulators is what I desired
complaining their discrepancy from HW.
I'm sorry, I don't really understand your complaint here. VHDL,
like other hardware description languages, does not pretend to
be a perfect model of digital hardware. But it does provide
modelling that is sufficiently accurate for most designers'
needs, providing they follow certain rules. Those rules
are neither complicated nor onerous.
Quote:
Ah, now I understand the problem.
Did you by any chance learn your VHDL from Navabi's book? It makes
extensive use of custom enumerated data types to model logic.
Unfortunately, it is academic and does not match the real world.
In general, enumerated types in VHDL synthesise to hardware signals
that have enough bits to provide a unique 1/0 binary value for
each enumeration literal. So, for example, your "trit" data type
with its three values 'U', '0', '1' will synthesise to at least
two digital signals (possibly three, if the tool chooses one-hot
coding). That explains the curious error messages you got
from XST. Simulation would handle the code just fine.
There is just one exception to this. The built-in data types
boolean and bit, and the predefined type std_ulogic, are handled
specially by synthesis because they were created specifically
in order to model the behaviour of a single digital signal.
I don't really understand why you need your "trit" data type;
std_(u)logic has all the values you need, and there is already
a definition in ieee.std_logic_1164 that may serve you well:
subtype X01 is std_logic range 'X' to '1';
If you use that type, it will reliably synthesise to a single
wire and any conversion functions' behaviour will match
simulation.
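
A minimal sketch of the X01 suggestion, assuming only the standard ieee.std_logic_1164 package. The entity and port names (x01_demo, raw, clean) are made up for illustration and do not come from the code under discussion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity x01_demo is
  port (
    raw   : in  std_logic;  -- may carry any of the nine std_logic values
    clean : out X01         -- subtype restricted to 'X', '0', '1'
  );
end entity x01_demo;

architecture rtl of x01_demo is
begin
  -- To_X01 maps '1'/'H' to '1', '0'/'L' to '0' and everything else to 'X'.
  clean <= To_X01(raw);
end architecture rtl;
```

Because X01 is a subtype of std_logic, a signal of this type can still be connected to ordinary std_logic ports without a conversion function.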
--
Jonathan Bromley
KJ
Guest
Sun Jan 24, 2010 10:04 pm
On Jan 24, 8:45 am, valentin tihhomirov <nos...@server.org> wrote:
Quote:
Nevertheless, synthesizers manage to produce the same behaviour HW from
the mythical one.
You're mistaken and apparently didn't understand my post. Synthesizers
do *not* produce an implementation that is the exact same behavior as
specified in the original source unless your source code consists of
only instantiation of primitives that are available in the target
device and nothing else. In particular, assignments (b <= a;
b <= Some_function(a)) are not implemented at all, they are transformed
into a logical equivalent which is implemented with those primitives.
Two things that are logically equivalent are not necessarily the
same. Logic does not consider delays at all, this is covered in
Boolean Logic 101.
Any FPGA or CPLD *could* implement the assignment b <= a; in
hardware. They do not do so because that is (almost) never what the
design engineer intends to be implemented in the hardware.
Implementing the code as literally written would wildly waste device
resources and would cause the engineer to quickly find a better
synthesis tool.
Quote:
They ensure a skewless clock distribution and hold
delay on output when meet a sync FF wait pattern. At that, synthesizers
do not demand the added delay hack.
Right...like I said, synthesizers don't actually implement what is
literally described in the source code...which implies they implement
'something else'. A simulator simulates what is literally described,
not 'something else'. Someone skilled in using both synthesis and
simulation tools knows how to use them (i.e. write code) so that the
simulation of 'literally described source code' will be close enough
to the 'something else' that the synthesizer actually implements to
have high confidence that they are describing essentially the same
thing, even though they are not exactly the same.
Quote:
This added delay is just a hack that in certain situations may have
some merit...nothing more.
This merit is a confidence, which may be not too much for you but I
would prefer it built-in.
OK, so find a synthesis tool that implements your source code as
literally written. You'd probably find that you would be the only one
in the world that would use it. Alternatively, choose to write your
code as only instantiations of primitives supported by the target
device.
Quote:
In general, your speech sounds like an attempt to justify imperfection
and discrepancy of sim from synth.
No, just trying to enlighten you on what the differences are.
Synthesis tools and simulation tools are both 'tools'. Tools can be
used by both those skilled in their use and those that are unskilled.
Kevin Jennings
Kenn Heinrich
Guest
Mon Jan 25, 2010 4:58 am
valentin tihhomirov <nospam_at_server.org> writes:
Quote:
KJ wrote:
Why to use assignments to bring some order into the evaluation logic if
you admit yourself that it is almost impossible in practice to rely on
it? IMO, assignments exist in all languages for one important thing: you
compute once and save/share the result.
You're correct that one uses a concurrent signal assignment or a
variable assignment as a method to save (*1) a result to be used by
something else. The reason for doing this is to improve code
maintainability. It can also help with debug of the code when there
is a clear (and correct) match between the naming of the signal and
what it logically really represents.
Intermediate variables also hint to the compiler which intermediate type to
use in a composite function. I believe that this, along with improved
debugging, makes intermediate vars a preferred coding standard in
SW companies. Yet, I often preferred the conciseness of immediate
composition. Jonathan explained why -- I have a background in VHDL,
where assignments incur delays :)
Where I differ with you is when you say "almost impossible in practice
to rely on it". Rely on it to do what?
Jonathan pointed out that a delta delay on assignment brings some
order into VHDL evaluation of events by simulator. We rely on VHDL to
produce a predictable behaviour.
This is a key point, that the language has well defined semantics. But
Kevin's "mythical implementation" analogy is very accurate - these well
defined semantics are for the mythical (simulation) aspect, not for the
physical hardware.
The problem you stumbled on, with a signal assignment causing a clock
skew, is one of the classic nuisance problems in VHDL. I don't know
whether to call it a design mistake, or just an oversight - perhaps the
design committee should have foreseen this scenario. It would appear
trivially obvious to a real hardware guy, but would simulate with such
counter-intuitive behaviour. I think all hardware guys worth their salt
have run across this one before and made a mental note: "Just don't do
that".
Quote:
The "HD" in VHDL does stand
for 'hardware description', but that does not necessarily imply that
source code literally describes the precise implementation. If it
did, then one would expect the source code that someone writes to
describe functionality by instantiating look up tables (and defining
their contents), flip flops and whatever other primitives are
available in a particular FPGA. That description would be completely
useless and have to be completely re-written to use and/or arrays and
flops if targeting a CPLD type device. Both of those descriptions
would also be completely useless if the target implementation was
discrete logic parts.
So one must accept the paradox that the hardware that is being
described in the original source code is most likely NOT a description
of the actual implemented hardware. What is the point of a 'hardware
description' language that is typically not used by an end user to
describe the actual hardware implementation? In a word,
'productivity'. Having to change source code to describe different
physical implementations (i.e. targeting different types of devices)
is not as productive as describing a mythical hardware implementation
and then using tools to translate that into something that is
physically realizable in multiple forms.
Given that the original source describes mythical and not necessarily
real hardware, one has to consider what is the purpose of the various
tools that one will use to interpret that source code. Here you seem
to be thinking that the simulator should be simulating an actual
hardware description and if an assignment will end up resulting in no
hardware, just a wire connection, then there should be no simulation
induced delay. But the simulator is simulating the description of the
mythical hardware (i.e. what is actually in the source code). The
translation of the mythical hardware description into a description of
actual hardware is the job of a synthesis tool. Those are two
independent paths.
Nevertheless, synthesizers manage to produce the same behaviour HW
from the mythical one. They ensure a skewless clock distribution and
hold delay on output when meet a sync FF wait pattern. At that,
synthesizers do not demand the added delay hack.
This added delay is just a hack that in certain situations may have
some merit...nothing more.
This merit is a confidence, which may be not too much for you but I
would prefer it built-in.
In general, your speech sounds like an attempt to justify imperfection
and discrepancy of sim from synth.
Regards.
I think there are two things you can take away from this discussion. The
first is Andy's well described case about simulation with well defined
semantics versus synthesis as a parallel-but-not-quite-identical
process. The second is that the delta cycle delay problem you saw is
both (1) well understood under the LRM language semantics, and (2) in
your specific case completely counterintuitive to what an old-school
breadboard and wire-wrap hardware engineer will expect.
As far as the suggestion to add
foo <= bar after 100 ps;
you must be very careful here. The LRM allows you to set a time
resolution limit for any simulation, and will interpret time expressions
less than the limit as zero. Therefore, the above code simulated with a
resolution limit of 1 ns will behave the same as
foo <= bar after 0 ns;
which is the same as
foo <= bar;
Watch out!
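
A small testbench sketch for checking how a given simulator treats this. The entity name and the reporting process are illustrative assumptions, and how the resolution limit is set is tool-specific and not shown. Some tools may reject or warn about a time unit finer than the resolution limit instead of silently truncating it, so treat this as an experiment rather than a statement of what any particular simulator does.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity after_resolution_tb is
end entity after_resolution_tb;

architecture sim of after_resolution_tb is
  signal bar : std_logic := '0';
  signal foo : std_logic := '0';
begin
  -- One-shot stimulus: bar rises at 10 ns.
  bar <= '1' after 10 ns;

  -- The assignment under test. With a 1 ps resolution limit foo follows
  -- bar 100 ps later. With a 1 ns limit the 100 ps may be truncated to
  -- zero, leaving an ordinary delta-delayed assignment.
  foo <= bar after 100 ps;

  -- Report the simulated time that elapsed between the two events.
  check : process
    variable t_bar : time;
  begin
    wait on bar;
    t_bar := now;
    wait on foo;
    report "foo followed bar after " & time'image(now - t_bar);
    wait;
  end process check;
end architecture sim;
```

Running the same testbench at two different resolution settings and comparing the reported delay shows whether the "after" clause is actually buying any separation.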
As far as your example working well from academia, one thing to note
about the skew problem is that it really only happens one way. I.e. when
clk2 <= clk1; you will find that a process sensitive to clk2 will see
data produced by a "flop" (a process) on clk1 a cycle early, but NOT the
other way around. It might be the case that your academic example didn't
exhibit this problem because there was only a one-way crossing from the
one "clock domain" to the other. Or perhaps it was lucky delta
balancing.
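
The one-way nature of the skew can be sketched with a self-checking testbench. All names here (delta_skew_tb, a1, a2, b1, b2) are invented for illustration. Direction 1 launches on clk1 and captures on clk2; direction 2 does the opposite.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delta_skew_tb is
end entity delta_skew_tb;

architecture sim of delta_skew_tb is
  signal clk1, clk2 : std_logic := '0';
  signal d          : std_logic := '1';
  signal a1, a2     : std_logic := '0';  -- clk1 flop feeding a clk2 flop
  signal b2, b1     : std_logic := '0';  -- clk2 flop feeding a clk1 flop
begin
  -- Three clock periods, then stop so the simulation ends by itself.
  clkgen : process
  begin
    for i in 1 to 3 loop
      clk1 <= '0'; wait for 5 ns;
      clk1 <= '1'; wait for 5 ns;
    end loop;
    wait;
  end process clkgen;

  clk2 <= clk1;  -- same clock, but one delta cycle later

  -- Direction 1: clk1 domain -> clk2 domain
  ff_a1 : process (clk1) begin
    if rising_edge(clk1) then a1 <= d; end if;
  end process ff_a1;
  ff_a2 : process (clk2) begin
    if rising_edge(clk2) then a2 <= a1; end if;
  end process ff_a2;

  -- Direction 2: clk2 domain -> clk1 domain
  ff_b2 : process (clk2) begin
    if rising_edge(clk2) then b2 <= d; end if;
  end process ff_b2;
  ff_b1 : process (clk1) begin
    if rising_edge(clk1) then b1 <= b2; end if;
  end process ff_b1;

  check : process
  begin
    wait until rising_edge(clk1);  -- first rising edge, at 5 ns
    wait for 1 ns;
    -- After ONE edge, d has already passed through both a1 and a2.
    assert a2 = '1'
      report "clk1 -> clk2: a2 did not update early" severity error;
    -- In the other direction the second stage still holds its old value.
    assert b1 = '0'
      report "clk2 -> clk1: b1 updated early" severity error;
    wait;
  end process check;
end architecture sim;
```

Both assertions should pass: a2 receives d after a single clock edge, while b1 needs the expected two edges.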
If you absolutely must keep two different data types for clocks, the
most timing-robust (although admittedly ugly) solution is to pass, in
parallel, two versions of the same clock through your entire hierarchy
and produce both the std_logic and TRI versions out of one single,
coherent, delta balanced clock generator process. That puts all of your
balancing in one easy to manage place and gives you the illusion of
being able to pass cleanly back and forth between the two domains, which
mimics your real hardware pretty accurately.
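
A rough sketch of such a clock generator, assuming a three-valued custom type along the lines of the "trit" type mentioned earlier in the thread. The package, entity, port and generic names are hypothetical. The point is only that both clock versions are driven from the same process in the same simulation cycle.

```vhdl
package trit_pkg is
  type trit is ('U', '0', '1');  -- hypothetical custom clock/data type
end package trit_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.trit_pkg.all;

entity dual_clkgen is
  generic (HALF_PERIOD : time := 5 ns);
  port (
    clk_sl   : out std_logic := '0';  -- std_logic version of the clock
    clk_trit : out trit      := '0'   -- custom-type version of the clock
  );
end entity dual_clkgen;

architecture sim of dual_clkgen is
begin
  -- Both outputs are assigned by ONE process, so their events always
  -- land in the same delta cycle.
  gen : process
  begin
    clk_sl <= '0'; clk_trit <= '0';
    wait for HALF_PERIOD;
    clk_sl <= '1'; clk_trit <= '1';
    wait for HALF_PERIOD;
  end process gen;
end architecture sim;
```

Both outputs would then be passed down through port maps only, with no intermediate signal assignments on either clock.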
Another thing you might want to try is to make use of the fact that
'X','0', and '1' in the std_logic enum are consecutive, and there's an
actual subtype X01 and conversions (cvt_to_x01, To_x01) that can let you
deal with the surjection. Of course, this would mean re-engineering
your codebase to remove your custom clock type.
As a general principle, though, it's often good software engineering
practice (abstraction, conciseness, type-safety) to encapsulate your
data in user types with conversion functions. However, the case of
clocks is an exception - it's just plain dangerous, for all the reasons
discussed above.
And as a further slight digression on some of the earlier posts, the
reason that the simple (2001- style) port mappings do not cause delta
delays while signal assignments do is the following:
Every port and declared signal defines a new, distinct signal. However,
the simulator (and hence the LRM) deals with _groups_ of signals that
are tied together, called "nets" in the LRM. When you do simple port
maps, the signals get combined to belong to the same "net". In the LRM,
one _net_ gets updated in between one simulation cycle. Thus, you can
tie arbitrary levels of hierarchy together, and every port lives on the
same net, so every flop sees the rising edge on the same simulation
cycle. This is slightly different than just joining signals together,
due to in/out/buffer, type conversion, resolution, and so
forth. However, a signal assignment is actually a VHDL process, and NOT
a port mapping (even though the use of symmetric arrows (<= and =>)
would make you think otherwise).
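
The difference can be sketched with two architectures of the same two-stage design. The names (dff, two_stage, clk_int) are invented for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dff is
  port (clk, d : in std_logic; q : out std_logic);
end entity dff;

architecture rtl of dff is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity two_stage is
  port (clk, d : in std_logic; q : out std_logic);
end entity two_stage;

-- Port maps only: both flops sit on the same net as the clk port and
-- see the edge in the same simulation cycle.
architecture safe of two_stage is
  signal mid : std_logic;
begin
  u1 : entity work.dff port map (clk => clk, d => d,   q => mid);
  u2 : entity work.dff port map (clk => clk, d => mid, q => q);
end architecture safe;

-- Signal assignment on the clock: clk_int is driven by an implicit
-- process, so u2 sees the edge one delta cycle after u1.
architecture skewed of two_stage is
  signal mid     : std_logic;
  signal clk_int : std_logic;
begin
  clk_int <= clk;
  u1 : entity work.dff port map (clk => clk,     d => d,   q => mid);
  u2 : entity work.dff port map (clk => clk_int, d => mid, q => q);
end architecture skewed;
```

In simulation the "safe" architecture behaves as a two-cycle pipeline, while in "skewed" the second flop captures the value the first flop has just written.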
For your clock problem, this is what happens according to the LRM:
- The language says that first you update the net on clk1 (and all
signals attached to it). Then you queue up every process sensitive to
all signals on clk1's net. That's the update part of the first
simulation cycle.
- The PROCESS defined by "clk2 <= clk1" fires because it's sensitive to
clk1 and schedules a pending transaction on clk2. That will be seen in
the *next* simulation cycle, but first we have to finish running every
process (i.e. clock every flop) sensitive to clk1's net.
- This means every *other* process sensitive to clk1 is also running,
assigning your Q outputs to every clk1-domain DFF. Now the first sim
cycle is complete.
- On the *next* simulation cycle, every signal associated with the clk2
net gets updated, triggering every process sensitive to clk2 to
run. These processes will see the values just written by the clk1
processes, resulting in your phantom early clocking. Since two sim
cycles elapsed but no time did, this was a delta cycle.
Hope this helps a little,
- Kenn
Brian Drummond
Guest
Mon Jan 25, 2010 2:13 pm
On Sun, 24 Jan 2010 21:58:26 -0500, Kenn Heinrich <kwheinri_at_uwaterloo.ca> wrote:
Quote:
valentin tihhomirov <nospam_at_server.org> writes:
Jonathan pointed out that a delta delay on assignment brings some
order into VHDL evaluation of events by simulator. We rely on VHDL to
produce a predictable behaviour.
This is a key point, that the language has well defined semantics. But
Kevin's "mythical implementation" analogy is very accurate - these well
defined semantics are for the mythical (simulation) aspect, not for the
physical hardware.
The problem you stumbled on, with a signal assignment causing a clock
skew, is one of the classic nuisance problems in VHDL. I don't know
whether to call it a design mistake, or just an oversight - perhaps the
design committee should have foreseen this scenario. It would appear
trivially obvious to a real hardware guy, but would simulate with such
counter-intuitive behaviour. I think all hardware guys worth their salt
have run across this one before and made a mental note: "Just don't do
that".
I agree with all that Kenn wrote - including that this is a nuisance - except
for regarding this problem as a mistake. An inconvenience, yes.
But while the additional delta cycle problem is one that can mess up simulations
but not synthesis, it occurs in a region of the design space where you can also
see problems in synthesis that may not be revealed in simulation. Therefore I
choose to see it as a handy reminder that "here be monsters..."
Use ONE clock wherever you can; and use some other strategy
(1) "after" clauses
(2) registering signals on the opposite clock edge
(3) synchronisers if necessary
for the exceptions. If the exceptions are someone else's IP (complete with clock
assignments - WHY do IP providers do this???) then "after" clauses are the
simplest and perfectly adequate (with Kenn's warnings on time resolution).
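
Strategies (1) and (2) might look roughly like this. It is only a sketch: the entity, the signal names and the 1 ns figure are assumptions, and clk2 <= clk1 stands in for a clock assignment hidden inside third-party IP. Synthesis tools typically ignore the "after" clause, so it affects simulation only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity crossing_fixes is
  port (
    clk1      : in  std_logic;
    d         : in  std_logic;
    q_after   : out std_logic;
    q_negedge : out std_logic
  );
end entity crossing_fixes;

architecture sim of crossing_fixes is
  signal clk2    : std_logic;
  signal a_after : std_logic := '0';
  signal a_plain : std_logic := '0';
  signal a_neg   : std_logic := '0';
begin
  clk2 <= clk1;  -- the unavoidable delta-delayed clock

  -- (1) "after" clause on the launching register. The delay must be
  -- larger than the simulator's time resolution limit to have any effect.
  launch_after : process (clk1) begin
    if rising_edge(clk1) then a_after <= d after 1 ns; end if;
  end process launch_after;
  capture_after : process (clk2) begin
    if rising_edge(clk2) then q_after <= a_after; end if;
  end process capture_after;

  -- (2) Re-register on the opposite clock edge before crossing.
  launch_plain : process (clk1) begin
    if rising_edge(clk1) then a_plain <= d; end if;
  end process launch_plain;
  retime_neg : process (clk1) begin
    if falling_edge(clk1) then a_neg <= a_plain; end if;
  end process retime_neg;
  capture_neg : process (clk2) begin
    if rising_edge(clk2) then q_negedge <= a_neg; end if;
  end process capture_neg;
end architecture sim;
```

Option (2) changes the real hardware timing (half a clock period is left for the crossing), whereas option (1) only changes what the simulator sees.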
Any design strategy that tries to rely on balancing delta cycles is just too
fragile to contemplate. This is what I was attempting (poorly) to say with my
earlier question - what if the synthesiser took the clock assignment as
permission to insert a clock buffer? If it did that, the added delta cycle would
become a very real delay, and cause the synth result to match the simulation.
Not a problem with most synthesis tools, most of the time. But occasionally...
Using the same clock across more than one FPGA, this problem is unavoidable;
there will be buffers on your clocks, and skew. Ironically, if you model this
with port mapping instead of clock assignment, this will mess up synthesis but
not simulation!
I tend to assign clocks and data in my top level testbench, with "after" clauses
to approximate the I/O buffer delays on the FPGA pins. Then external models from
e.g. memory manufacturers will at least deliver results in the right clock
cycle!
Bringing a clock into an FPGA, I have seen two versions of XST do two different
things; XST7.1 inserted a clock buffer (BUFG) after the IBUF (as expected) and
used the result everywhere. Which worked perfectly.
But XST10.1 inserted the BUFG after the IBUF, and used the result for all the
internal logic. But it chose to feed the DCM (clock generator for related clock
frequencies) directly from the IBUF, thus guaranteeing about 2.5ns of skew
between supposedly identical clocks...
(Digression: why didn't I simply avoid the skew by using the CLKX1 output from
the DCM? I couldn't, because an IP core chained further DCMs off the same clock,
and that can accumulate clock jitter until the third DCM won't lock...)
So in short, you need to give the clock signals extra consideration, whether or
not simulation has this "nuisance". Therefore I tend to think of it as a useful
reminder to pay attention, rather than a problem.
- Brian
valentin tihhomirov
Guest
Mon Jan 25, 2010 9:49 pm
Quote:
I didn't say that. I said that it was impossible in practice to
balance delta delays for all receivers of a clock. That's a very
different problem.
If all receivers of a clock see it without delta delays, there
is no problem at all, and the delta-delay signal assignment
behaviour of VHDL does precisely what we need. It gives us
a straightforward, efficient abstraction for simulation
whilst ensuring that the same behaviour can be preserved
by synthesis. Without the delta delay on signal assignment,
it is very hard to maintain that matching behaviour.
I concluded this from the way you contrasted VHDL event evaluation with
Verilog's "mess".
Quote:
IMO, assignments exist in all languages for one important thing: you
compute once and save/share the result.
Variable assignment meets that need in VHDL.
Variables are used to generate logic in the process. They represent
different signals at different times. And, they are hard-to-debug in
simulator.
Quote:
I'm sorry, I don't really understand your complaint here. VHDL,
like other hardware description languages, does not pretend to
be a perfect model of digital hardware. But it does provide
modelling that is sufficiently accurate for most designers'
needs, providing they follow certain rules. Those rules
are neither complicated nor onerous.
I was talking about the tiresomeness of extending every reg assignment with
an 'after' clause. The rule "to do everything" is not complex at all per se ;)
Quote:
This is indeed an alternative to waiting on custom logic. Unfortunately,
none besides 'academic style' is supported by XST, as experiments show
http://forums.xilinx.com/xlnx/board/message?board.id=SYNTHBD&thread.id=1747
Ah, now I understand the problem.
Did you by any chance learn your VHDL from Navabi's book? It makes
extensive use of custom enumerated data types to model logic.
Unfortunately, it is academic and does not match the real world.
In general, enumerated types in VHDL synthesise to hardware signals
that have enough bits to provide a unique 1/0 binary value for
each enumeration literal. So, for example, your "trit" data type
with its three values 'U', '0', '1' will synthesise to at least
two digital signals (possibly three, if the tool chooses one-hot
coding). That explains the curious error messages you got
from XST. Simulation would handle the code just fine.
Do you mean "Bad condition in wait statement, or only one clock per
process" or "at line 0, operands of <AND> are not of the same size"? How
can "wait until to_bit(multivalued)='1'", which converts to a single bit,
produce a multibit wait? XST is more cryptic than informative. The
Quartus output is really descriptive, on the other hand.
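
For readers following along, the construct in question presumably looks something like the sketch below. The package name, the body of to_bit and the flop entity are guesses made for illustration, not the actual code that was given to XST.

```vhdl
package trit_wait_pkg is
  type trit is ('U', '0', '1');            -- custom three-valued type
  function to_bit (t : trit) return bit;   -- collapses it to a single bit
end package trit_wait_pkg;

package body trit_wait_pkg is
  function to_bit (t : trit) return bit is
  begin
    if t = '1' then
      return '1';
    else
      return '0';  -- both 'U' and '0' map to '0' in this sketch
    end if;
  end function to_bit;
end package body trit_wait_pkg;

use work.trit_wait_pkg.all;

entity trit_dff is
  port (clk, d : in trit; q : out trit);
end entity trit_dff;

architecture behav of trit_dff is
begin
  process
  begin
    -- Resumes on an event on clk for which the converted value is '1'.
    wait until to_bit(clk) = '1';
    q <= d;
  end process;
end architecture behav;
```

This is legal VHDL for simulation. Whether a synthesis tool recognises the wait condition as a clock edge is a separate, tool-dependent matter.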
Quote:
There is just one exception to this. The built-in data types
boolean and bit, and the predefined type std_ulogic, are handled
specially by synthesis because they were created specifically
in order to model the behaviour of a single digital signal.
I don't really understand why you need your "trit" data type;
std_(u)logic has all the values you need, and there is already
a definition in ieee.std_logic_1164 that may serve you well:
subtype X01 is std_logic range 'X' to '1';
If you use that type, it will reliably synthesise to a single
wire and any conversion functions' behaviour will match
simulation.
The special "single bit" treatment of std types is the reason I
introduce a custom enumeration. I will try to emulate the truly
multivalued data lines in one of my subblocks.
valentin tihhomirov
Guest
Mon Jan 25, 2010 10:06 pm
Thank you for sharing your expertise. I'm asking, however, about the fully
synchronous FPGA-internal case.
valentin tihhomirov
Guest
Mon Jan 25, 2010 10:15 pm
I have also suddenly realized that in my VHDL netlist writer
(unfortunately I do not use EDIF, which does not demand line type
specification) I extensively use intermediate clock assignments: between
parent port and instances. I have just realized how dangerous this might
be. But surprisingly, I have never faced any problems because of this:
neither in sim nor in synthesis.
Andy
Guest
Tue Jan 26, 2010 1:30 am
On Jan 25, 1:49 pm, valentin tihhomirov <nos...@server.org> wrote:
Quote:
IMO, assignments exist in all languages for one important thing: you
compute once and save/share the result.
Variable assignment meets that need in VHDL.
Variables are used to generate logic in the process. They represent
different signals at different times. And, they are hard-to-debug in
simulator.
While some simulators cannot show variables in waveforms, I prefer to
use the source level debugger with break points, assertion statements,
etc. to debug anyway. Besides, the SLD shows you all the stuff that is
happening in zero time, which is largely unavailable in waveforms.
Representing both combinatorial and registered logic with one variable
is much easier to comprehend in a source level debugger than in a
waveform. A given reference to a variable represents one thing (the
output of either a gate or a register), and that is what you see at a
breakpoint or a triggered sequential assertion statement in the SLD.
Think about clock-cycle-based behavior first, then worry about
implementation (gates and registers), and variables will open a whole
new way of looking at HDL. Then, after your synthesizer is finished
with register replication, retiming, duplicate removal and other
optimizations, your behavioral description will still make sense,
while an implementation based description may not.
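
A small sketch of one variable standing for both a register and combinatorial logic, depending on where it is referenced. The entity and port names are invented for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity var_demo is
  port (
    clk        : in  std_logic;
    before_inc : out unsigned(3 downto 0);
    after_inc  : out unsigned(3 downto 0)
  );
end entity var_demo;

architecture rtl of var_demo is
begin
  process (clk)
    variable count : unsigned(3 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      -- Read BEFORE the write: this reference is the value stored on the
      -- previous clock, i.e. the output of the count register.
      before_inc <= count;

      count := count + 1;

      -- Read AFTER the write: this reference is the output of the
      -- incrementer (combinatorial logic) in the current clock cycle.
      after_inc <= count;
    end if;
  end process;
end architecture rtl;
```

Stepping through the process in a source level debugger shows count changing value between the two references within a single clock edge, which a waveform of the output signals alone does not reveal.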
Andy