EDK : FSL macros defined by Xilinx are wrong

On Thu, 1 Sep 2016 09:51:36 -0700 (PDT)
lolinka04@gmail.com wrote:

I wonder if they ever gave it to him?
Would they do it now that these chips are no longer produced?

I did a project to program early Flash logic parts. This
included a PC plugin board, borrowed equation compiler, and
configuration stream generator.

Why might this ancient history be of interest?

Jan Coombs
 
In my youth I tried to build a GAL programmer, but I never got it to work with the samples I had.

Later I found out that there appeared to be quite different programming algorithms for different parts with the same name, so be aware...

Thomas
 
On Fri, 2 Sep 2016 13:53:25 -0700 (PDT)
thomas.entner99@gmail.com wrote:

In my youth I tried to build a GAL programmer, but I never got
it to work with the samples I had.

Youth, that's the bit I missed out on; I was a mature student when I did
the EPLD work.

Later I found out that there appeared to be quite different
programming algorithms for different parts with the same name,
so be aware...

I did have the (NDA) programming documents from LatticeSemi and
Intel for these parts, and seem to remember that the Lattice
programming algorithms were not complete in the regular chip
documentation.


While moving Lattice tools around I noticed that here:

/home/.../Diamond/IspVMsystem/isptools/ispvmsystem/Database/ee9/22v10a/

are these files:

ispgal22v10avp28_spi_loader.jed
ispgal22v10avq32_spi_loader.jed
ispVM_005b.xdf

Perhaps some of the tools do still exist?

> Thomas

Jan Coombs
--
 
> What other content would you like to see?

They claim something impressive ("Translate Wikipedia in less than a Tenth of a Second") but give no details about the task or the system.

If the claim is not total marketing nonsense, I would assume they mean translating from one language to another (e.g. English to German).

From the article link (and the picture) you could also infer that one FPGA (or the card in the hand of the guy) does this. But this is simply unbelievable. So the question is: how many FPGAs are involved? Without this, the claimed time is simply not meaningful, as double the number of FPGAs will mean half the time (every Wikipedia article can be translated individually, so it is easy to execute the task in parallel...).

But I guess this is all not Microsoft's fault, but a problem of that specific link. I found the following, which gives much more insight at the end of the page:
https://www.top500.org/news/microsoft-goes-all-in-for-fpgas-to-build-out-cloud-based-ai/

There it says that 4 FPGAs (Stratix V D5, ca. 500k LE) would require 4 hours to translate Wikipedia. The 0.1 seconds are achieved with a huge cloud of such FPGA-equipped systems...
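Taking those numbers at face value, a back-of-the-envelope sketch (assuming perfect parallel scaling, which the per-article independence makes plausible) shows the fleet size the headline implies:

```python
# 4 FPGAs x 4 hours of work, compressed into the headline 0.1 seconds.
fpga_hours = 4 * 4
target_s = 0.1
fpgas_needed = round(fpga_hours * 3600 / target_s)
print(fpgas_needed)  # 576000 FPGAs running in parallel
```

So the headline number says more about the size of Microsoft's deployment than about any single FPGA.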

Of course still impressive, but not the same as most people might think after reading the headline. (And it also makes me wonder about the future of the Altera/Intel low-cost FPGAs, when they want to sell a Stratix into every server...)

Regards,

Thomas

www.entner-electronics.com - Home of EEBlaster and JPEG Codec
 
On 10/19/2016 7:47 PM, thomas.entner99@gmail.com wrote:
What other content would you like to see?

They claim something impressive ("Translate Wikipedia in less than a Tenth of a Second") but give no details about the task or the system.

If the claim is not total marketing nonsense, I would assume they mean translating from one language to another (e.g. English to German).

From the article link (and the picture) you could also infer that one FPGA (or the card in the hand of the guy) does this. But this is simply unbelievable. So the question is: how many FPGAs are involved? Without this, the claimed time is simply not meaningful, as double the number of FPGAs will mean half the time (every Wikipedia article can be translated individually, so it is easy to execute the task in parallel...).

But I guess this is all not Microsoft's fault, but a problem of that specific link. I found the following, which gives much more insight at the end of the page:
https://www.top500.org/news/microsoft-goes-all-in-for-fpgas-to-build-out-cloud-based-ai/

There it says that 4 FPGAs (Stratix V D5, ca. 500k LE) would require 4 hours to translate Wikipedia. The 0.1 seconds are achieved with a huge cloud of such FPGA-equipped systems...

Of course still impressive, but not the same as most people might think after reading the headline. (And it also makes me wonder about the future of the Altera/Intel low-cost FPGAs, when they want to sell a Stratix into every server...)

For sure the release is short of engineering data... it *is* a marketing
pitch. The point is they plan to be providing a combination of FPGA and
CPU which will run much faster and use less power than the CPU alone.
No, they aren't offering hard numbers, and the task of translating
Wikipedia is not really the best benchmark for serving up or searching
web pages. It is meant to offer a metric that even laymen can relate to.

In other words, it's meant to sound good to those who would not
understand more engineering information.

Microsoft has no incentive to sell FPGAs. Their incentive is to provide
the software on faster hardware. If the hardware doesn't pan out,
Microsoft gets nothing but expenses.

--

Rick C
 
On Wednesday, March 28, 2012 at 1:31:41 PM UTC-4, Elam wrote:
I understand that the price depends on the volume etc
but I would like to know the per unit price of Virtex 7 FPGA..

Any guesses..

Thanks
Elam.

Elam,
We can save you substantially off of the Xilinx or Avnet screen pricing on most Xilinx parts.
John.Pallazola@earthtron.com
 
On 3/11/2014 5:32 PM, langwadt@fonz.dk wrote:
Den tirsdag den 11. marts 2014 22.23.36 UTC+1 skrev Jon Elson:
GaborSzakacs wrote:

A quick DigiKey search showed a range of $2,583.75 (XC7VX330T-1FFG1157C)
to $39,452.40 (XC7V2000T-G2FLG1925E). These won't end up in any of my
designs any time soon.

REALLY! 1900 balls, and all of them have to solder perfectly or the chip
has to come off and be re-balled! Arghhhh! I'd LOVE to know who is
actually USING chips that expensive. Must be the military in those
$500 Million airplanes.

Jon

if it does the job of an ASIC that would require a million dollar NRE and
you only need 20 it's a bargain

One place I worked at used a very expensive Xilinx device (not sure just
how bad it was, I think $1,500 in around 2000) when only 20% was being
used. Room for expansion in a $100,000 product. It was test equipment
and I think they only sold a couple of handfuls.

--

Rick C
 
On 4/2/2012 7:55 PM, Ed McGettigan wrote:
On Mar 28, 10:31 am, Elam <elampoora...@gmail.com> wrote:
I understand that the price depends on the volume etc
but I would like to know the per unit price of Virtex 7 FPGA..

Any guesses..

Thanks
Elam.

There are too many variables (device, package, speed grade, volume,
delivery date, etc..) involved in pricing for any simple answer.
Contact your local Xilinx sales rep and they would be happy to sit
down and discuss your needs and come up with the right pricing that
matches your situation.
http://www.xilinx.com/company/contact/sales-reps.htm

Using online pricing data for 1-10 parts today will not be reflective
of 1K-10K pricing 18 months from now.

Not only do prices vary with a lot of factors, prices are *always* cheaper
(sometimes *much* cheaper) if you give them a design win using their new
product line. They barely care about new sockets using old parts, even
one generation old. It's all about paying for the NRE on the new
product line. If you are buying even just 10k per year, you can
typically get a great discount, much better than the online prices.

--

Rick C
 
I suppose one could tweak Vcc vs temp to null out a native tempco.

I am not an expert in this field either, but to my knowledge, things got much more complex with the smaller process geometries.

Generally things will still become faster at higher supply voltage and lower temperature, but I have also had designs that (according to the timing analyzer) passed at 85°C but failed at 0°C.

Regards,

Thomas

www.entner-electronics.com - Home of EEBlaster and JPEG CODEC
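The "tweak Vcc vs temp" idea at the top of this thread can be sketched with a first-order delay model; the coefficients below are hypothetical, chosen only to show how a supply-voltage slope can null a temperature coefficient:

```python
# First-order model: delay = d0 * (1 + a*(T - T0) - b*(V - V0)).
# Choosing V(T) = V0 + (a/b)*(T - T0) cancels the temperature term exactly.
d0 = 5.0          # nominal delay, ns (hypothetical)
a = 0.002         # tempco, 1/degC (hypothetical)
b = 0.5           # supply-voltage sensitivity, 1/V (hypothetical)
T0, V0 = 25.0, 1.0

def delay_ns(T, V):
    return d0 * (1 + a * (T - T0) - b * (V - V0))

def vcc_for(T):
    """Compensating supply voltage for die temperature T."""
    return V0 + (a / b) * (T - T0)

for T in (0.0, 25.0, 85.0):
    print(round(delay_ns(T, vcc_for(T)), 6))  # 5.0 at every temperature
```

Of course, as Thomas notes, real parts on small geometries are not this well-behaved; some paths get slower when cold, so a single linear correction may not hold across the whole range.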
 
On Friday, February 7, 2003 at 7:02:37 PM UTC+5:30, llaa57 wrote:
I implemented the circuit described in the application note "Unusual
clock dividers" written by Peter Alfke:
http://www.xilinx.com/xcell/xl33/xl33_30.pdf.
I used a Xilinx XC9536 and the input clock is generated by an
oscillator (SCO-061S 48MHz) by Sunny.
My problem is that the output clock's duty cycle is 33% and not 50% as
expected. Why? Is the CPLD unsuitable for this circuit?

Many thanks in advance.

Hi All,
I'm trying to implement a frequency divider of 1.5 for a 100 MHz clock. I need to obtain 65 MHz as output. I'm using an xc6slx16 FPGA. Could anyone help me with VHDL code for the above? No problem with duty cycle.

Thanks all.
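The 33% duty cycle in the quoted question is exactly what a straightforward divide-by-1.5 produces: the output period spans three input half-periods, and if the output is high for just one of them, the duty cycle is 1/3. A toy simulation (not the exact app-note circuit) illustrates this:

```python
def divide_by_1p5(n_half_periods):
    # One output sample per input half-period; the output repeats every
    # 3 half-periods (= 1.5 input periods), high for 1 of the 3.
    return [1 if k % 3 == 0 else 0 for k in range(n_half_periods)]

wave = divide_by_1p5(12)
print(wave)                    # [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
print(sum(wave) / len(wave))   # 0.333... duty cycle
```

Getting 50% duty at a non-integer division ratio requires clocking logic on both edges, which is the trick Peter Alfke's note plays.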
 
vjkaran19@gmail.com wrote:
On Friday, February 7, 2003 at 7:02:37 PM UTC+5:30, llaa57 wrote:
I implemented the circuit described in the application note "Unusual
clock dividers" written by Peter Alfke:
http://www.xilinx.com/xcell/xl33/xl33_30.pdf.
I used a Xilinx XC9536 and the input clock is generated by an
oscillator (SCO-061S 48MHz) by Sunny.
My problem is that the output clock's duty cycle is 33% and not 50% as
expected. Why? Is the CPLD unsuitable for this circuit?

Many thanks in advance.

Hi All,
I'm trying to implement a frequency divider of 1.5 for a 100 MHz clock. I need to obtain 65 MHz as output. I'm using an xc6slx16 FPGA. Could anyone help me with VHDL code for the above? No problem with duty cycle.

Thanks all.

Reviving a thread that old is evil!

Why don't you start a new thread?

100 / 1.5 != 65?

For a 66.666... MHz clock, use a DCM clock module: first multiply by 2, then
divide by 3.

Bye
--
Uwe Bonnes bon@elektron.ikp.physik.tu-darmstadt.de

Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt
--------- Tel. 06151 1623569 ------- Fax. 06151 1623305 ---------
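Uwe's two objections, the arithmetic and the DCM recipe, can be checked in a couple of lines (the ratios are the ones his post names):

```python
f_in = 100.0                 # MHz
f_dcm = f_in * 2 / 3         # DCM: multiply by 2, then divide by 3
print(round(f_dcm, 3))       # 66.667 MHz
print(round(f_in / 1.5, 3))  # 66.667 -- dividing by 1.5 cannot give 65
```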
 
On Friday, February 7, 2003 at 7:02:37 PM UTC+5:30, llaa57 wrote:
I implemented the circuit described in the application note "Unusual
clock dividers" written by Peter Alfke:
http://www.xilinx.com/xcell/xl33/xl33_30.pdf.
I used a Xilinx XC9536 and the input clock is generated by an
oscillator (SCO-061S 48MHz) by Sunny.
My problem is that the output clock's duty cycle is 33% and not 50% as
expected. Why? Is the CPLD unsuitable for this circuit?

Many thanks in advance.


Hi,
Thanks to all..
 
On Monday, April 10, 2017 at 7:13:23 PM UTC-6, John Larkin wrote:
We have a ZYNQ whose predicted timing isn't meeting decent margins.
And we don't want a lot of output pin timing variation in real life.

We can measure the chip temperature with the XADC thing. So, why not
make an on-chip heater? Use a PLL to clock a bunch of flops, and vary
the PLL output frequency to keep the chip temp roughly constant.

I'm confused by the concept. Doesn't timing get *worse* as temp increases? How would a higher temperature help? By "output pin timing variation" do you mean that there are combinatorial output paths? I think the best bet is to stay as cool as possible and keep all outputs registered. If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line.

I have used precision oscillators with built-in heaters. In that case, it's more important that the crystal stay at a constant temp than what the temp is. By making that temperature above the highest possible ambient temp, the heater can keep the crystal temp constant.
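The same oven-controller idea, applied to the FPGA itself, reduces to a small control loop: read the die temperature (the XADC in this thread), and drive a bank of dummy flops faster when the chip is below the setpoint. A minimal proportional-control sketch; all constants are hypothetical:

```python
SETPOINT_C = 85.0   # hold the die near the max expected temperature
KP = 10.0           # MHz of heater clock per degree C of error (hypothetical)
F_MAX = 200.0       # usable PLL output range, MHz (hypothetical)

def heater_freq_mhz(die_temp_c):
    """PLL frequency for the dummy-flop heater, from the XADC reading."""
    error = SETPOINT_C - die_temp_c        # positive when the chip is cool
    return min(max(KP * error, 0.0), F_MAX)

print(heater_freq_mhz(80.0))   # 50.0 -- chip below setpoint, heater on
print(heater_freq_mhz(90.0))   # 0.0  -- above setpoint, heater off
```

In practice an integral term (or just a slow bang-bang loop) would be needed to hold the temperature tightly; this only shows the structure.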
 
https://dl.dropboxusercontent.com/u/53724080/Thermal/ESM_Ring_Oscillator.jpg

The change in prop delay vs temp is fairly small.

That's more linear than I would've guessed. Is that the ambient temperature or junction temp?
 
On Tue, 11 Apr 2017 09:29:03 -0700 (PDT), Kevin Neilson
<kevin.neilson@xilinx.com> wrote:

On Monday, April 10, 2017 at 7:13:23 PM UTC-6, John Larkin wrote:
We have a ZYNQ whose predicted timing isn't meeting decent margins.
And we don't want a lot of output pin timing variation in real life.

We can measure the chip temperature with the XADC thing. So, why not
make an on-chip heater? Use a PLL to clock a bunch of flops, and vary
the PLL output frequency to keep the chip temp roughly constant.

I'm confused by the concept. Doesn't timing get *worse* as temp increases?

Prop delays get slower.

> How would a higher temperature help?

High temperature is an unfortunate fact of life sometimes. I'm after
constant temperature, to minimize delay variations as ambient temp and
logic power dissipations change.

> By "output pin timing variation" do you mean that there are combinatorial output paths? I think the best bet is to stay as cool as possible and keep all outputs registered.

All our critical outputs are registered in the i/o cells. Xilinx tools
report almost a 3:1 delay range from clock to outputs, over the full
range of process, power supply, and temperature. Apparently the tools
assume the max specified Vcc and temperature spreads for the part and
don't let us tease out anything, or restrict the analysis to any
narrower ranges.


If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line.

Our output data-valid window is predicted by the tools to be very
narrow relative to the clock period. We figure that controlling the
temperature (and maybe controlling Vcc-core vs temperature) will open
up the timing window. The final analysis will have to be experimental.

We can't crank in a constant delay to fix anything; the problem is the
predicted variation in delay.

I have used precision oscillators with built-in heaters. In that case, it's more important that the crystal stay at a constant temp than what the temp is. By making that temperature above the highest possible ambient temp, the heater can keep the crystal temp constant.

That's the idea, keep the FPGA core near the max naturally-expected
temperature, heat it up as needed, and that will reduce actual timing
variations to below the worst-case predicted by the tools.

I expect that the tools are grossly pessimistic. I sure hope so.



--

John Larkin Highland Technology, Inc

lunatic fringe electronics
 
On 4/11/2017 12:31 PM, Kevin Neilson wrote:
https://dl.dropboxusercontent.com/u/53724080/Thermal/ESM_Ring_Oscillator.jpg

The change in prop delay vs temp is fairly small.


That's more linear than I would've guessed. Is that the ambient temperature or junction temp?

Even if it wasn't especially linear, the proportionality is based on
kelvin, so the non-linearity would not be terribly pronounced.

That was part of the reason for the deflate-gate thing a couple of years
ago. I remember that between the pressure being relative rather than
absolute and the temperature being Celsius or Fahrenheit rather than
kelvin, the people here took some time to figure out that the reported
pressures were easily explained by the difference in temperature between
the locker rooms and the playing field.
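The unit confusion is easy to reproduce: convert gauge pressure to absolute and Celsius to kelvin, and the drop falls out of the gas law. The ball pressure and temperatures below are illustrative, not the actual game figures:

```python
P_ATM = 14.7                     # psi, atmospheric
p1_gauge, t1_c = 12.5, 21.0      # ball inflated in a warm locker room
t2_c = 5.0                       # cold playing field
p1_abs = p1_gauge + P_ATM                            # gauge -> absolute
p2_abs = p1_abs * (t2_c + 273.15) / (t1_c + 273.15)  # Celsius -> kelvin
print(round(p2_abs - P_ATM, 2))  # ~11.0 psi gauge: ~1.5 psi "lost"
```

Work in gauge psi and Celsius directly and the predicted drop comes out wrong, which is what tripped people up.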

--

Rick C
 
On 4/11/2017 11:37 PM, John Larkin wrote:
On Tue, 11 Apr 2017 09:29:03 -0700 (PDT), Kevin Neilson
<kevin.neilson@xilinx.com> wrote:

On Monday, April 10, 2017 at 7:13:23 PM UTC-6, John Larkin wrote:
We have a ZYNQ whose predicted timing isn't meeting decent margins.
And we don't want a lot of output pin timing variation in real life.

We can measure the chip temperature with the XADC thing. So, why not
make an on-chip heater? Use a PLL to clock a bunch of flops, and vary
the PLL output frequency to keep the chip temp roughly constant.

I'm confused by the concept. Doesn't timing get *worse* as temp increases?

Prop delays get slower.

How would a higher temperature help?

High temperature is an unfortunate fact of life sometimes. I'm after
constant temperature, to minimize delay variations as ambient temp and
logic power dissipations change.

By "output pin timing variation" do you mean that there are combinatorial output paths? I think the best bet is to stay as cool as possible and keep all outputs registered.

All our critical outputs are registered in the i/o cells. Xilinx tools
report almost a 3:1 delay range from clock to outputs, over the full
range of process, power supply, and temperature. Apparently the tools
assume the max specified Vcc and temperature spreads for the part and
don't let us tease out anything, or restrict the analysis to any
narrower ranges.


If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line.


Our output data-valid window is predicted by the tools to be very
narrow relative to the clock period. We figure that controlling the
temperature (and maybe controlling Vcc-core vs temperature) will open
up the timing window. The final analysis will have to be experimental.

We can't crank in a constant delay to fix anything; the problem is the
predicted variation in delay.


I have used precision oscillators with built-in heaters. In that case, it's more important that the crystal stay at a constant temp than what the temp is. By making that temperature above the highest possible ambient temp, the heater can keep the crystal temp constant.

That's the idea, keep the FPGA core near the max naturally-expected
temperature, heat it up as needed, and that will reduce actual timing
variations to below the worst-case predicted by the tools.

I expect that the tools are grossly pessimistic. I sure hope so.

The nature of designing synchronous logic is that you want to know the
worst case delay so you can design to a constant period clock cycle. So
the worst case is the design criteria. The timing analysis tools are
naturally "pessimistic" in that sense. But that is intended so that the
design process is a matter of getting all timing paths to meet the
required timing rather than trying to compare delays on this path to
delays on that path which would be a nightmare.

When you need better timing on the I/Os, as you have done, the signals
can be clocked in the IOB FFs which give the lowest variation in timing
as well as the shortest delays from clock input to signal output.
I/O timing also typically needs to be designed for the worst case,
because the need is to meet setup timing while hold timing is
guaranteed by the spec on the I/Os. But if you are not doing
synchronous design this may not be optimal. If you are trying to get a
specific timing of an output edge, you may have to reclock the signals
through discrete logic.
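The effect John is fighting can be put in numbers. With hypothetical figures in the same 3:1 spirit as the Xilinx report, the data-valid window shrinks by the full min-to-max clock-to-output spread:

```python
T_CLK = 10.0    # clock period, ns (hypothetical)
TCO_MIN = 2.0   # fastest-corner clock-to-output, ns (hypothetical)
TCO_MAX = 6.0   # slowest corner, a 3:1 spread as the tools report
# Data is stable from TCO_MAX after one edge until TCO_MIN after the next:
valid_window = T_CLK - (TCO_MAX - TCO_MIN)
print(valid_window)   # 6.0 ns of the 10 ns period
```

Narrowing the temperature (and supply) range narrows TCO_MAX - TCO_MIN, which is exactly what the heater scheme buys.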

--

Rick C
 
Our biggest box takes about a kilowatt, which includes 70 W for the fans. We build enough of them, all running 24/7, to make it worth working out the total cost of ownership; running the box a little hotter reduces reliability a bit but saves enough electricity to be worthwhile.

Colin
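The trade Colin describes is easy to put in dollars from the numbers in his post; the electricity price below is a hypothetical $0.10/kWh:

```python
BOX_W, FAN_W = 1000, 70      # from the post: ~1 kW box, 70 W of fans
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.10         # hypothetical
box_cost = BOX_W / 1000 * HOURS_PER_YEAR * PRICE_PER_KWH
fan_cost = FAN_W / 1000 * HOURS_PER_YEAR * PRICE_PER_KWH
print(round(box_cost, 2), round(fan_cost, 2))  # 876.0 61.32 dollars/year
```

Per box the fan electricity is small, but multiplied across a fleet running 24/7 it adds up, which is why the reliability-vs-cooling trade is worth computing at all.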
 
On Tue, 11 Apr 2017 09:31:20 -0700 (PDT), Kevin Neilson
<kevin.neilson@xilinx.com> wrote:

https://dl.dropboxusercontent.com/u/53724080/Thermal/ESM_Ring_Oscillator.jpg

The change in prop delay vs temp is fairly small.


That's more linear than I would've guessed. Is that the ambient temperature or junction temp?

Foil-sticky thermocouple on the top of the chip. It was an Altera
Cyclone 3, clocked internally at 250 MHz.

https://dl.dropboxusercontent.com/u/53724080/PCBs/ESM_rev_B.jpg

The ring oscillator was divided internally before we counted it, by 16
as I recall.
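For reference, converting such a divided ring-oscillator count back to a per-stage delay is one division; the stage count and counted frequency below are hypothetical, only the divide-by-16 comes from the post:

```python
N_STAGES = 31          # hypothetical inverter count (must be odd)
DIVIDER = 16           # the post's internal divider before counting
f_counted = 1.25e6     # hypothetical counted frequency, Hz
f_ring = f_counted * DIVIDER          # undivided ring frequency
# One ring period = the edge traverses every stage twice (rise + fall):
t_stage = 1.0 / (2 * N_STAGES * f_ring)
print(round(t_stage * 1e12, 1), "ps per stage")
```

Tracking f_counted against the thermocouple reading gives exactly the delay-vs-temperature curve in the linked plot.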

Newer chips tend to have an actual, fairly accurate, die temp sensor,
which opens up complex schemes to control die temp, or measure it and
tweak Vccint, or something.







--

John Larkin Highland Technology, Inc

lunatic fringe electronics
 
On 4/11/2017 12:29 PM, Kevin Neilson wrote:
On Monday, April 10, 2017 at 7:13:23 PM UTC-6, John Larkin wrote:
We have a ZYNQ whose predicted timing isn't meeting decent margins.
And we don't want a lot of output pin timing variation in real life.

We can measure the chip temperature with the XADC thing. So, why not
make an on-chip heater? Use a PLL to clock a bunch of flops, and vary
the PLL output frequency to keep the chip temp roughly constant.

I'm confused by the concept. Doesn't timing get *worse* as temp increases? How would a higher temperature help? By "output pin timing variation" do you mean that there are combinatorial output paths? I think the best bet is to stay as cool as possible and keep all outputs registered. If you really need to control output delay you can use the IODELAY block, possibly along with a copper trace feedback line.

I have used precision oscillators with built-in heaters. In that case, it's more important that the crystal stay at a constant temp than what the temp is. By making that temperature above the highest possible ambient temp, the heater can keep the crystal temp constant.

That is exactly what John is talking about, except the heater will be on
the FPGA itself.

--

Rick C
 
