Lattice LFEC

Jedi

Guest
Hello..

Is it normal that the same core which performs well
on an Altera Cyclone device can only run at half speed
on an LFEC20-5 device?

I tried several CPU cores from opencores.org,
and the Lattice LFEC20 mostly shows half the performance
of the Cyclone...


rick
 
Have you set the respective timing constraints? The ispLEVER P&R tool
is a timing-driven tool.

rgds,

c

 
cas7406@yahoo.com wrote:
Have you set the respective timing constraints? The ispLEVER P&R tool
is a timing-driven tool.
No...as I don't see the point in doing so when under default
settings Lattice LFEC is at least 50 % slower...


rick
 
Rick,

I can't speak for LatticeEC specifically, but I know that some
designers tend to write their VHDL very specifically for one family.
Then it will be hard to get the same performance from another device.

For example, does the compiled design make use of the IO cell?
Switching this option off can save quite some time (Clock to Out).

Regards,

Luc

On Mon, 20 Jun 2005 18:57:31 GMT, Jedi <me@aol.com> wrote:

cas7406@yahoo.com wrote:
Have you set the respective timing constraints? The ispLEVER P&R tool
is a timing-driven tool.


No...as I don't see the point in doing so when under default
settings Lattice LFEC is at least 50 % slower...


rick
 
Actually I test with an out-of-the-box t80 design...

I know that Altera Quartus does a good job of
using RAM blocks instead of registers automatically
since version 4.1 or 4.2 (and old 2.2, I think), whereas
the backend tools in ispLEVER and Actel Libero don't.

A simple comparison would be to use a small binary
counter and see how fast they can go...


rick
 
Hi Rick,

Hmm..actually t80 performance degraded continuously
in Altera Quartus since version 4.1 with same
standard settings (and no automatic RAM block placing).
Same is true for other similar CPU cores as well...
We do not observe this degradation. Which family are you compiling to, are
you using (tight) timing constraints, and do you have any data you can post?

Regards,

Paul
 
Hmm..actually t80 performance degraded continuously
in Altera Quartus since version 4.1 with same
standard settings (and no automatic RAM block placing).
Same is true for other similar CPU cores as well...

rick
 
I cannot comment directly on your comparison as I have not performed the
tests myself. As a comment, however, I would say that FPGA architectures
are designed with certain characteristics in mind that may benefit
certain coding styles and not others. This is the reason why most
FPGA vendors have a coding style guide to complement their silicon.
Without modifying off-the-shelf code to suit a particular FPGA it is
very difficult to make a like-for-like comparison.

The website I have discovered below has a comparison of OpenCores CPUs
implemented on Altera Cyclone, Lattice ECP and Actel ProASIC 3 devices

http://www.fpga.ch/ipcores/results.php


Hope this helps

Ben
 
Hi Ben,

The website I have discovered below has a comparison of OpenCores CPUs
implemented on Altera Cyclone, Lattice ECP and Actel ProASIC 3 devices

http://www.fpga.ch/ipcores/results.php
Nice results (speaking as an Altera guy :)). I don't agree with the
author's hypothesis that Synplify is the difference -- if Synplify were used
for Cyclone as well, I don't think the conclusion would change. Synplify is
a great synthesis tool.

And the results should get even better if some or all of the various
physical synthesis options were enabled in Quartus II.

Neat link -- thanks.

Paul Leventis
Altera Corp.
 
The Lattice ispLEVER tool comes with 3 different synthesis tools:
Leonardo Spectrum, Synplify and Precision RTL synthesis. Running Synplify
and Precision RTL synthesis on the same VHDL code, without applying
rigorous timing constraints, shows a significant increase in fmax in
favour of Precision RTL.

So perhaps the author has a point.
 
For those reading this thread, Richard sent us the t80 archive (thanks
Richard), so we could investigate this. The short answer is that the
core didn't have a timing constraint set, so recent versions of Quartus
(4.1 and later) just work to achieve routability, and do not fully
optimize its timing. Setting an aggressive clock period constraint
dramatically speeds up the core when run through Quartus.

Details:

I compiled the t80a core in Quartus II 5.0 SP1 and achieved 42.74 MHz,
which matches Richard's result. However, I noticed that there are no
timing requirements set; in that case Quartus will try to finish the
compile as quickly as possible, and will not fully optimize the design
for timing.

I went to Assignments->Timing Settings and set "Default required Fmax"
to 100 MHz. Then I recompiled.

With that assignment, Quartus achieves a frequency of 75.53 MHz for
this design.

It is a general rule that you should set an aggressive (unachievable)
timing assignment when you want to see how fast a design can go.
Alternatively, you can choose Settings->Fitter Settings->Standard Fit,
which essentially makes the most common type of unachievable timing
requirement (Fmax on all clocks) automatically for you.
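
To put numbers on what an "unachievable" requirement looks like, here is a minimal Python sketch (an illustration added for readers, not something taken from Quartus). It converts the 100 MHz default Fmax and the 75.53 MHz result reported above into clock periods and computes the slack. The helper names `period_ns` and `slack_ns` are made up for this example and are not Quartus terms or APIs.

```python
def period_ns(fmax_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / fmax_mhz


def slack_ns(target_mhz: float, achieved_mhz: float) -> float:
    """Required period minus achieved period.

    Positive means the target was met, negative means it was missed.
    """
    return period_ns(target_mhz) - period_ns(achieved_mhz)


target_mhz = 100.0     # "Default required Fmax" set in the post
achieved_mhz = 75.53   # Fmax reported in the post with that setting

print(f"required period: {period_ns(target_mhz):.2f} ns")
print(f"achieved period: {period_ns(achieved_mhz):.2f} ns")
print(f"slack:           {slack_ns(target_mhz, achieved_mhz):.2f} ns")
```

The slack comes out negative (roughly -3.24 ns), which is expected here: the requirement is deliberately tighter than what the design reaches, so it fits the description of an aggressive, unachievable assignment.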

To get even more speed, you can also turn on physical synthesis (all
options) under Assignments->Settings->Physical Synthesis Optimizations.
On this design, turning physical synthesis on (+ having a 100 MHz
default Fmax) yields a speed of 83.43 MHz.
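
As a quick way to compare the three results quoted in this post, the following Python sketch (added for illustration, using only the Fmax figures given above) computes each result relative to the unconstrained compile. The dictionary labels and the `speedup` helper are names chosen for this example.

```python
# Fmax figures for the t80a core as reported in this post (Quartus II 5.0 SP1).
results_mhz = {
    "no timing constraint": 42.74,
    "100 MHz default Fmax": 75.53,
    "100 MHz default Fmax + physical synthesis": 83.43,
}

baseline_mhz = results_mhz["no timing constraint"]


def speedup(fmax_mhz: float, base_mhz: float = baseline_mhz) -> float:
    """Ratio of a result to the unconstrained baseline."""
    return fmax_mhz / base_mhz


for label, fmax in results_mhz.items():
    print(f"{label:<45} {fmax:6.2f} MHz  x{speedup(fmax):.2f}")
```

On these figures the constrained compile is about 1.77 times the unconstrained one, and adding physical synthesis brings it to about 1.95 times. In other words, the unconstrained Quartus result was itself only about half of what the same tool reached with a constraint set.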

Best regards,

Vaughn Betz
Altera
[v b e t z (at) altera.com]
 
