My invention: Coding wave-pipelined circuits with buffering

Weng Tianxiang
Guest

Sun Jan 21, 2018 9:15 pm   



On Sunday, January 21, 2018 at 8:44:09 AM UTC-8, Jan Coombs wrote:
Quote:
On Sun, 21 Jan 2018 08:22:45 -0800 (PST)
Weng Tianxiang <wtxwtx_at_gmail.com> wrote:

[much irrelevant stuff snipped - please help with this]

My attention on this topic is centered on introduction of my
inventions to public and asking for their critical comments,
challenge or suspicion from technical point of view, not
specially on whether or not they are useful.

I was unable to quickly understand the "2 fast reading
materials" which you sent me.

Personally I never have a chance to write a pipelined circuit,
not mention designing for a wave-pipelined circuit.

Why do you have patents. A patent should disclose the method of
the novelty, so would need an implementation. Perhaps this is
what I am missing?

What I did is a result of my observation that such an
important problem can be perfectly resolved by my insight as a
person outside the wave-pipelined design circle, fully based
on only one reference [1] IEEE Transactions on VLSI Systems
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.1783&rep=rep1&type=pdf .

Perhaps if you follow wave-pipelined techniques to the limit, you
will find yourself looking at asynchronous (or self clocked)
logic. There is also much historical work on this, and it may
be easier to test on FPGA chips[1].

Jan Coombs
--
[1] or at least drum up some business for Microsemi/Actel


Jan,

I don't think you are right: "Perhaps if you follow wave-pipelined techniques to the limit, you will find yourself looking at asynchronous (or self clocked) logic."

I have studied asynchronous circuits, but found them to be a dead end because of their structural inefficiency and the current commercial trend. Coding or synthesizing a wave-pipelined circuit has nothing to do with doing the same for an asynchronous circuit, and the former is much more complex than the latter!

Synthesizing a wave-pipelined circuit needs much more complex algorithms, which, by my observation, have been maturing since 1969.

My design never considers PVT; it belongs to another specialty field and I have zero knowledge of it.

From my point of view, building a bridge between a code designer and a synthesizer is a very important step in publicizing the technology for wave-pipelined circuits:

In 1980 Intel announced and developed the 8087 with a 32-bit floating-point multiplier; more than 10 years later, in 1997, they announced MMX technology, including a second version with a 64-bit floating-point multiplier. From my point of view, that second 64-bit floating-point multiplier using MMX technology is nothing but a wave-pipelined circuit.

Regular engineers never have a chance to implement a wave-pipelined circuit because of the complexity of everything related to PVT.

But according to my scheme, the most complex part of generating a wave-pipelined circuit is fully left to synthesizer manufacturers, and a code designer in HDL only focuses his attention on how to code it, with zero knowledge about how a wave-pipelined circuit is synthesized and generated. Hopefully that leads to a situation in which any college student with basic knowledge of HDL can generate the second version of the 64-bit floating-point multiplier within half an hour.

As far as the 2 fast reading materials are concerned, please contact me by private email and let me know what you want: specification, drawings or VHDL source code. Sorry, I mistakenly thought you were a lawyer, not an engineer.

Thank you.

Weng

rickman
Guest

Mon Jan 22, 2018 1:46 pm   



Weng Tianxiang wrote on 1/21/2018 11:22 AM:
Quote:
On Saturday, January 20, 2018 at 4:17:02 PM UTC-8, rickman wrote:
Jan Coombs wrote on 1/20/2018 2:20 PM:
On Fri, 19 Jan 2018 17:42:57 -0500
rickman <gnuarm.deletethisbit_at_gmail.com> wrote:

...

I think I understand the concept of wave pipelining. It is
just eliminating the intermediate registers of a pipeline
circuit and designing the combinational logic so that the
delays are even enough across the many paths so the output can
be clocked at a given time and will receive a stable result
from the input N clocks earlier. In other words, the logic is
designed so that the changes rippling through the logic never
catch up to the changes created by the data entered 1 clock
cycle earlier. Nice if you can do it.

Thanks, interesting, but sounds complex to get reliable
operation.

I can see where this would be useful in an ASIC. In ASICs FFs
and logic compete for space within the chip. In FPGAs the
ratio between FFs and logic are fixed and predetermined. So
using logic without using the FFs that are already there is
not of much value.

Generally true, but

1) You might be able to combine three stages that require 2/3 of
a clock cycle for maximum propagation delay, and get the result
in the time of two clock cycles.

If your stages are only using 2/3 of a clock, you can regroup the logic to
make it 1 clock each in two stages. There is supposed to be software to
handle that for you although I've never used it.


2) If the Microsemi/Actel Igloo/Smartfusion FPGAs are used then
each tile can be a latch or a LUT, so flops are not wasted.

There's your first mistake, no one uses Actel/Microsemi FPGAs. They long
for the day they are as big as Lattice, lol!


Either way there must be a great deal of complex floor planning
and/or timing constraints needed to make this work. Automating
this would be amazing?

Isn't that what the OP is claiming? I'm surprised he could make this work
over PVT. The actual stable time has to be on a clock edge, the same clock
edge under all conditions. I wouldn't want to try that manually in a simple
circuit.

--

Rick C

Viewed the eclipse at Wintercrest Farms,
on the centerline of totality since 1998

Rick,

SMB stands for Series Master component with Buffering function, one of the 2 WPCs (Wave-Pipelining Components).

I don't understand what you are saying:
"Isn't that what the OP is claiming? I'm surprised he could make this work
over PVT. "

What do OP and PVT stand for?


OP means "original poster" and is a common abbreviation in newsgroups. PVT
means Process, Voltage, Temperature, the three main factors causing
variations in delay times in a silicon chip. If you don't account for these
effects in your timing calculations, your wave pipelining idea won't work. If
you aren't aware of this, I suspect you don't really understand how to
design with FPGA devices. It isn't all textbook analysis.


Quote:
My attention on this topic is centered on introduction of my inventions to public and asking for their critical comments, challenge or suspicion from technical point of view, not specially on whether or not they are useful.

Personally I never have a chance to write a pipelined circuit, not mention designing for a wave-pipelined circuit.

What I did is a result of my observation that such an important problem can be perfectly resolved by my insight as a person outside the wave-pipelined design circle, fully based on only one reference
[1] IEEE Transactions on VLSI Systems
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.1783&rep=rep1&type=pdf .


Then I think you have not solved anything. The problem with wave pipelining
is that the timing can vary so much that the output of the combinational
circuit won't be stable during the clock edges. If you haven't tested your
ideas by designing a circuit and running it on an FPGA, you don't know any
of this will work in the real world.

--

Rick C


rickman
Guest

Mon Jan 22, 2018 1:53 pm   



Weng Tianxiang wrote on 1/21/2018 2:15 PM:
Quote:
On Sunday, January 21, 2018 at 8:44:09 AM UTC-8, Jan Coombs wrote:
On Sun, 21 Jan 2018 08:22:45 -0800 (PST)
Weng Tianxiang <wtxwtx_at_gmail.com> wrote:

[much irrelevant stuff snipped - please help with this]

My attention on this topic is centered on introduction of my
inventions to public and asking for their critical comments,
challenge or suspicion from technical point of view, not
specially on whether or not they are useful.

I was unable to quickly understand the "2 fast reading
materials" which you sent me.

Personally I never have a chance to write a pipelined circuit,
not mention designing for a wave-pipelined circuit.

Why do you have patents. A patent should disclose the method of
the novelty, so would need an implementation. Perhaps this is
what I am missing?

What I did is a result of my observation that such an
important problem can be perfectly resolved by my insight as a
person outside the wave-pipelined design circle, fully based
on only one reference [1] IEEE Transactions on VLSI Systems
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.1783&rep=rep1&type=pdf .

Perhaps if you follow wave-pipelined techniques to the limit, you
will find yourself looking at asynchronous (or self clocked)
logic. There is also much historical work on this, and it may
be easier to test on FPGA chips[1].

Jan Coombs
--
[1] or at least drum up some business for Microsemi/Actel

Jan,

I don't think you are right: "Perhaps if you follow wave-pipelined techniques to the limit, you will find yourself looking at asynchronous (or self clocked) logic."

I have studied asynchronous circuits, but found them to be a dead end because of their structural inefficiency and the current commercial trend. Coding or synthesizing a wave-pipelined circuit has nothing to do with doing the same for an asynchronous circuit, and the former is much more complex than the latter!

Synthesizing a wave-pipelined circuit needs much more complex algorithms, which, by my observation, have been maturing since 1969.

My design never considers PVT; it belongs to another specialty field and I have zero knowledge of it.

From my point of view, building a bridge between a code designer and a synthesizer is a very important step in publicizing the technology for wave-pipelined circuits:

In 1980 Intel announced and developed the 8087 with a 32-bit floating-point multiplier; more than 10 years later, in 1997, they announced MMX technology, including a second version with a 64-bit floating-point multiplier. From my point of view, that second 64-bit floating-point multiplier using MMX technology is nothing but a wave-pipelined circuit.

Regular engineers never have a chance to implement a wave-pipelined circuit because of the complexity of everything related to PVT.

But according to my scheme, the most complex part of generating a wave-pipelined circuit is fully left to synthesizer manufacturers, and a code designer in HDL only focuses his attention on how to code it, with zero knowledge about how a wave-pipelined circuit is synthesized and generated. Hopefully that leads to a situation in which any college student with basic knowledge of HDL can generate the second version of the 64-bit floating-point multiplier within half an hour.


The multiplier is not a good example to use as many FPGAs contain multiplier
blocks. But then they are pipelined and so won't work in a non-pipelined
solution, so maybe you can show your technique even if it has little
practical value in this case.

The problem is "the most complex part of generating a wave-pipelined circuit
is fully left to synthesizer manufacturers". Your method leaves me
wondering what your software is doing??? Asking the synthesizer companies
to solve your problems of making it work is a bit of a stretch. What makes
you think they will even take on your idea rather than provide their own
solution.

If your patent only covers the idea of writing simple HDL to describe the
circuit desired and leaving the implementation details to the synthesis
companies, I don't think you have actually patented anything. This part is
very obvious. The *real* work is in synthesizing a circuit that will work
in the FPGA.

--

Rick C


rickman
Guest

Mon Jan 22, 2018 1:57 pm   



Richard Damon wrote on 1/21/2018 1:24 PM:
Quote:
On 1/21/18 11:22 AM, Weng Tianxiang wrote:
What do OP and PVT stand for?


OP = Original Poster, the person who started the topic

PVT = Process / Voltage / Temperature (I presume)

The issue being that gate delay isn't a hard fixed value, but changes
slightly (or not so slightly) from device to device and under varying
operating conditions, which brings into question the designing of a gate
tree that presents results stably and reliably two clock cycles after
application, even with the inputs changing after one clock cycle.


MUCH more than slightly. The number I have been told is that 2:1 is not
uncommon. That's why overclockers can get CPU chips to run *much* faster
than they are rated. They provide excellent cooling, tweak the PSU
voltage and hand-select their chips.

This is also why we use synchronous logic with registers for pipelines.
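
A rough way to see what a 2:1 spread does to wave pipelining is the simplified two-sided timing condition at the output register. The Python sketch below is only an illustration under stated assumptions: the function names, the 10 ns path delay and the 0.25 ns setup and hold times are made up, and clock skew and internal-node constraints are ignored.

```python
def period_window(d_min, d_max, n, t_setup=0.0, t_hold=0.0):
    """Clock-period window (low, high) that keeps n waves apart, or None.

    The output is sampled n clock periods after launch, so:
      slowest wave must arrive in time:  d_max + t_setup <= n * T
      next wave must not arrive early:   d_min - t_hold  >= (n - 1) * T
    """
    low = (d_max + t_setup) / n
    if n == 1:
        return (low, float("inf"))  # ordinary single-cycle operation
    high = (d_min - t_hold) / (n - 1)
    return (low, high) if low <= high else None


def max_waves(d_min, d_max, t_setup=0.0, t_hold=0.0, limit=64):
    """Largest n (capped at limit) for which a valid clock period exists."""
    n = 1
    while n < limit and period_window(d_min, d_max, n + 1, t_setup, t_hold):
        n += 1
    return n


if __name__ == "__main__":
    D_MAX = 10.0  # ns, slowest delay over all PVT corners (made-up figure)
    for ratio in (1.1, 1.5, 2.0):
        n = max_waves(D_MAX / ratio, D_MAX, t_setup=0.25, t_hold=0.25)
        print(f"{ratio}:1 delay spread -> at most {n} wave(s) in the logic")
```

With these assumed numbers a 1.1:1 spread leaves room for 7 waves, a 1.5:1 spread for 2, and a 2:1 spread for only 1, which is plain single-cycle operation with no wave pipelining at all.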

--

Rick C


rickman
Guest

Mon Jan 22, 2018 2:18 pm   



HT-Lab wrote on 1/21/2018 1:19 PM:
Quote:
On 21/01/2018 00:16, rickman wrote:
Jan Coombs wrote on 1/20/2018 2:20 PM:
On Fri, 19 Jan 2018 17:42:57 -0500
rickman <gnuarm.deletethisbit_at_gmail.com> wrote:

...

I think I understand the concept of wave pipelining. It is
just eliminating the intermediate registers of a pipeline
circuit and designing the combinational logic so that the
delays are even enough across the many paths so the output can
be clocked at a given time and will receive a stable result
from the input N clocks earlier. In other words, the logic is
designed so that the changes rippling through the logic never
catch up to the changes created by the data entered 1 clock
cycle earlier. Nice if you can do it.

Thanks, interesting, but sounds complex to get reliable
operation.

I can see where this would be useful in an ASIC. In ASICs FFs
and logic compete for space within the chip. In FPGAs the
ratio between FFs and logic are fixed and predetermined. So
using logic without using the FFs that are already there is
not of much value.

Generally true, but

1) You might be able to combine three stages that require 2/3 of
a clock cycle for maximum propagation delay, and get the result
in the time of two clock cycles.

If your stages are only using 2/3 of a clock, you can regroup the logic to
make it 1 clock each in two stages. There is supposed to be software to
handle that for you although I've never used it.


2) If the Microsemi/Actel Igloo/Smartfusion FPGAs are used then
each tile can be a latch or a LUT, so flops are not wasted.

There's your first mistake, no one uses Actel/Microsemi FPGAs. They long
for the day they are as big as Lattice, lol!

Microsemi has been at the number 3 spot for as long as I have used FPGAs (+/- 28
years, starting with Actel's A1010). They are twice as large as Lattice.

Here is a reference:

https://www.eetimes.com/author.asp?doc_id=1331443


There's some BS somewhere...

http://www.fpgadeveloper.com/2011/07/list-and-comparison-of-fpga-companies.html

More importantly, look at the numbers in your link. The Actel/Microsemi
numbers are going in the wrong direction! X, A and L are headed upward
year-to-year and Actel is headed down!

While looking this up I found a link indicating the JTAG interface of the
ProASIC3 devices has a back door which would allow their security to be
bypassed. Security was their claim to fame and this could be a major blow
to the company.

--

Rick C


Weng Tianxiang
Guest

Mon Jan 22, 2018 8:39 pm   



Quote:
The multiplier is not a good example to use as many FPGAs contain multiplier
blocks. But then they are pipelined and so won't work in a non-pipelined
solution, so maybe you can show your technique even if it has little
practical value in this case.

Rick C


What I patented is a method for a circuit designer to code a wave-pipelined circuit in HDL (not only in VHDL, but in all HDLs), nothing else. If you slightly change the code, a 64x64-bit floating-point multiplier can be generated!!!

If anybody uses HDL to code, he has nothing to do with PVT and never takes PVT into consideration; not me, not you, nobody does it!!! That is other ones' business.

Based on my method, all you need to do is describe the logic for the critical path and call a library to finish your job, nothing else; everything else is left to Xilinx or Altera!

If you are really interested in a really good FPGA example, I recommend reading the following paper, available on the web:
Wave-pipelined intra-chip signaling for on-FPGA communications
http://www.doc.ic.ac.uk/~wl/papers/10/integration10tm.pdf

There are numerous circuits in FPGAs that are worth implementing as wave-pipelined circuits.

Weng

rickman
Guest

Mon Jan 22, 2018 11:56 pm   



Weng Tianxiang wrote on 1/22/2018 1:39 PM:
Quote:
The multiplier is not a good example to use as many FPGAs contain multiplier
blocks. But then they are pipelined and so won't work in a non-pipelined
solution, so maybe you can show your technique even if it has little
practical value in this case.

Rick C


What I patented is a method for a circuit designer to code a wave-pipelined circuit in HDL (not only in VHDL, but in all HDLs), nothing else. If you slightly change the code, a 64x64-bit floating-point multiplier can be generated!!!


I'm not sure what you are talking about. The code example you gave is
exactly the same code anyone would write for a multiplier. The HDL is no
different. So how can the patent be about how to code the wave-pipelined
circuit? Or did I miss something in the code?


> If anybody uses HDL to code, he has nothing to do with PVT, never put PVT into consideration, not me, not you, nobody does it!!! That is other ones' business.

I have no idea what you are talking about. HDL is used to design FPGAs and
ASICs. Part of that design process is meeting timing. Someone, somewhere
has to make the timing work. The tool vendor provides timing data
accessible through a timing analysis tool that can be applied to your
synthesized design. But if you wish to do wave-pipelined design the logic
has to be constructed in a way to balance the timing delays so the
uncertainty at the output of the combinatorial circuit fits within a clock
cycle *including the variation in timing from PVT*!!! So it is impossible
to do wave-pipelined design without considering PVT effects on the timing.
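
To make that concrete, here is a small Python sketch that checks one clock period and one latency against several PVT corners at once. Everything in it is an assumption made for illustration: the function name, the corner scaling factors of 0.7 and 1.3, the 9 to 10 ns path and the 0.2 ns setup and hold times are not taken from any real device.

```python
def check_corners(corners, t_clk, n, t_setup=0.0, t_hold=0.0):
    """Check one clock period and one latency n against every PVT corner.

    corners maps a corner name to (d_min, d_max) path delays in ns.
    Returns a dict: corner name -> "ok", "late" or "early".
    """
    result = {}
    for name, (d_min, d_max) in corners.items():
        if d_max + t_setup > n * t_clk:
            result[name] = "late"    # slowest wave misses the sampling edge
        elif d_min - t_hold < (n - 1) * t_clk:
            result[name] = "early"   # next wave overruns the one being sampled
        else:
            result[name] = "ok"
    return result


# Hypothetical, well-balanced path: 9.0 to 10.0 ns at the typical corner,
# scaled by 0.7 for the fast corner and by 1.3 for the slow corner.
CORNERS = {
    "fast":    (9.0 * 0.7, 10.0 * 0.7),
    "typical": (9.0, 10.0),
    "slow":    (9.0 * 1.3, 10.0 * 1.3),
}

if __name__ == "__main__":
    for t_clk in (5.2, 6.7):
        print(t_clk, check_corners(CORNERS, t_clk, n=2, t_setup=0.2, t_hold=0.2))
```

In this made-up case a 5.2 ns clock with two waves in flight works at the fast and typical corners but is late at the slow corner, while a 6.7 ns clock fixes the slow corner and then fails at the fast corner because the next wave arrives early. The path is balanced to within 10% at any single corner, yet no single period works at all three.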

I have no idea what you mean by it "is other ones' business". This has to
be defined for a wave-pipelined design to work. THAT is where the work is,
not in talking about the HDL which is the same as for a non pipelined design.


> Based on my method what you need to do is that you just describe the logic for the critical path, and call a library to finish your job, nothing else, all others are left to Xilinx or Altera to do!

Then you have done nothing...


Quote:
If you are really interested in a really good FPGA example, I recommend reading the following paper, available on the web:
Wave-pipelined intra-chip signaling for on-FPGA communications
http://www.doc.ic.ac.uk/~wl/papers/10/integration10tm.pdf

There are numerous circuits in FPGAs that are worth implementing as wave-pipelined circuits.


I would like to read your patent to see just what you are patenting.

--

Rick C


Weng Tianxiang
Guest

Tue Jan 23, 2018 12:39 am   



On Friday, January 19, 2018 at 8:45:26 AM UTC-8, Weng Tianxiang wrote:
Quote:

1. If my CPC_1_2 code is presented to a synthesizer, the first question you may ask is how to code its WPC (Wave-Pipelining Component). For clarity, I have copied the CPC_1_2 code here again.

By the way, I claim that nobody can simplify the CPC_1_2 code any further and still deliver full information about a critical path to a synthesizer for generating a wave-pipelined circuit! If you can, please challenge my claim.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity CPC_1_2 is
    generic (
        input_data_width  : positive := 64;  -- optional
        output_data_width : positive := 128  -- optional; must be 2 * input_data_width
    );
    port (
        CLK    : in  std_logic;
        WE_i   : in  std_logic;  -- '1': write enable to input registers A & B
        Da_i   : in  signed(input_data_width-1 downto 0);  -- input data A
        Db_i   : in  signed(input_data_width-1 downto 0);  -- input data B
        WE_o_i : in  std_logic;  -- '1': write enable to output register C
        Dc_o   : out unsigned(output_data_width-1 downto 0)  -- output data C
    );
end CPC_1_2;

architecture A_CPC_1_2 of CPC_1_2 is
    signal Ra : signed(input_data_width-1 downto 0);   -- input register A
    signal Rb : signed(input_data_width-1 downto 0);   -- input register B
    signal Rc : signed(output_data_width-1 downto 0);  -- output register C
    signal Cl : signed(output_data_width-1 downto 0);  -- combinational logic

begin
    Cl   <= Ra * Rb;      -- combinational logic output, key part of CPC
    Dc_o <= unsigned(Rc); -- output through output register

    p_1 : process(CLK)
    begin
        if rising_edge(CLK) then
            if WE_i = '1' then    -- WE_i = '1': latch input data
                Ra <= Da_i;
                Rb <= Db_i;
            end if;

            if WE_o_i = '1' then  -- WE_o_i = '1': latch output data
                Rc <= Cl;
            end if;
        end if;
    end process;
end A_CPC_1_2;

2. Assume 3 situations:
a) If you know that each data item needs 5 cycles to pass through the 64*64-bit signed multiplier and that the circuit can accept one data item per cycle, you should know how to code the WPC for the circuit, because we have already assumed that the synthesizer is capable of generating the wave-pipelined circuit for it, leaving the most difficult task to the synthesizer. By definition, a WPC contains all the remaining logic for the circuit except the CPC_1_2.

b) If you know that each data item needs 5 cycles to pass through the 64*64-bit signed multiplier and that the circuit can accept one data item per 2 cycles, you should know how to code the WPC for the circuit.

c) If you know that each data item needs 5 cycles to pass through the 64*64-bit signed multiplier and that the circuit can accept one data item per 2 cycles, but the designer wants the circuit to be able to accept one data item per cycle, not one per 2 cycles, you should know how to code the WPC for the circuit with 2 copies of the critical path, each alternately accepting an input data item every 2 cycles. Actually all CPCs follow one of 2 code patterns: CPC_1_2 is one of them, and the other, CPC_3, is slightly more complex but is also an off-the-shelf coding pattern. In this situation the CPC_3 code would replace CPC_1_2 with the same input and output interfaces.

Now the problem comes: how do you know all 3 unknown parameters before you code the WPC for the 64*64-bit signed multiplier? I think this is the key reason why, although so many wave-pipelined circuits have been generated, no circuit designer has been able to resolve this 50-year-old open problem.

And the circuit may, should and can be any type of pipelined circuit!

To be continued.

Weng


All coding for a wave-pipelined circuit has 3 steps:
1. Write a Critical Path Component (CPC) with a defined interface;

2. Call a Wave-Pipelining Component (WPC) provided by a system library;

3. Call one of 3 link statements to link a CPC instantiation with its paired WPC instantiation and specify what your target is.

First, assume that the CPC_1_2 coding is finished. When coding a new circuit it is very clear what has to be coded, and coding a CPC for a wave-pipelined circuit never poses any problem, because all special features related to the critical path, including logically correcting any imbalance in the critical path, are left to the synthesizer.

Second, I try to code its WPC.

When coding the 64*64-bit signed multiplier, I listed 3 situations above, in each of which there is a constant that is unknown before its WPC is coded.

In case a) each data item needs 5 cycles to pass through the 64*64-bit signed multiplier. The 5 cycles are not known until a synthesizer has analyzed the critical path.

In case b) the circuit can accept one data item per 2 cycles.

In case c) the number of copies of the same critical path is 2.

The conventional method is:
fix the number as 3, 4, 5, 6, ..., write the WPC code, and synthesize the circuit; if the assumed value fails to generate a wave-pipelined circuit, repeat with the next value until it succeeds.

I introduce a new concept, the WAVE CONSTANT:
A constant in a WPC is defined as a wave constant if its initial value is unknown and undetermined when the WPC is being coded, and is assigned by a synthesizer after the synthesizer has analyzed the critical path.

In contrast with a wave constant, a regular constant must be defined with a fixed initial value.

It also requires that the synthesizer first analyze the CPC and then analyze its paired WPC.

By doing so, coding a wave-pipelined circuit will never pose any problem!

The strange thing here is that a wave constant does not appear in its CPC, yet the CPC's structure determines its initial value; the synthesizer then assigns that value to the wave constant, which appears only in the paired WPC.

Based on the above 3 situations, I introduced 3 wave constants:
a) Series_clock_number is the number of cycles signals need to travel the critical path.

b) Input_clock_number is the number of cycles between inputs: the circuit can accept one data item per Input_clock_number cycles.

c) Multiple_copy_number is the number of copies of the same critical path needed for the circuit to accept one data item per cycle. This requirement is set by the code designer.

With the wave constant concept, code designers can smoothly and fully describe a wave-pipelined circuit in HDL without manual involvement.
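
As a rough illustration of what "assigned by a synthesizer" could mean, the Python sketch below derives the three wave constants from an analyzed critical path. It is not taken from the patent: the function name, the simplified timing model (minimum and maximum path delay, setup and hold time, no clock skew) and all the numbers are assumptions chosen to reproduce situations a) and c) above.

```python
import math
from typing import NamedTuple


class WaveConstants(NamedTuple):
    series_clock_number: int   # cycles for a wave to cross the critical path
    input_clock_number: int    # one data item accepted per this many cycles
    multiple_copy_number: int  # copies needed to accept one item per cycle


def derive_wave_constants(d_min, d_max, t_clk, t_setup=0.0, t_hold=0.0):
    """Hypothetical derivation from min/max path delay (same time unit)."""
    # Cycles from launch until the slowest wave can safely be sampled.
    series = math.ceil((d_max + t_setup) / t_clk)
    # A wave launched k cycles later must not reach the output before the
    # current one has been sampled: k*T + d_min >= series*T + t_hold.
    k = math.ceil(series - (d_min - t_hold) / t_clk)
    input_every = max(1, k)
    # Interleaving that many copies restores one data item per cycle.
    return WaveConstants(series, input_every, input_every)


if __name__ == "__main__":
    # Well-balanced path: 5 cycles of latency, one item per cycle.
    print(derive_wave_constants(8.5, 9.5, t_clk=2.0, t_setup=0.2, t_hold=0.2))
    # Poorly balanced path: 5 cycles, one item per 2 cycles, so 2 copies.
    print(derive_wave_constants(6.5, 9.5, t_clk=2.0, t_setup=0.2, t_hold=0.2))
```

Under this model the code designer never writes the values 5, 2 and 2; they come out of the delay analysis, which is also where the PVT spread between d_min and d_max would have to be accounted for.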

Finally, after the 3 WPCs were coded, I found that all wave-pipelined circuits fall into 2 categories, and that all circuits in a category share the same WPC component without exception. The 2 types of WPC can then form a system library that every wave-pipelined circuit can call, avoiding coding the same logic again and again.

To be continued.

Your comments are welcome.

Thank you.

Weng

Weng Tianxiang
Guest

Tue Jan 23, 2018 1:19 am   



Quote:
But if you wish to do wave-pipelined design the logic
has to be constructed in a way to balance the timing delays so the
uncertainty at the output of the combinatorial circuit fits within a clock
cycle *including the variation in timing from PVT*!!! So it is impossible
to do wave-pipelined design without considering PVT effects on the timing..

Rick C


The introduction of 3 link statements is used to inform a synthesizer that the CPC must be analyzed and generated as a wave-pipelined circuit instead of a regular one-cycle circuit. So the synthesizer would help you to generate logic that would balance the timing among all paths.

Can you do better than what a synthesizer can do?

When inventing, you must be smarter, not limited by what your experience tells you.

Weng

rickman
Guest

Tue Jan 23, 2018 4:35 am   



Weng Tianxiang wrote on 1/22/2018 6:19 PM:
Quote:
But if you wish to do wave-pipelined design the logic
has to be constructed in a way to balance the timing delays so the
uncertainty at the output of the combinatorial circuit fits within a clock
cycle *including the variation in timing from PVT*!!! So it is impossible
to do wave-pipelined design without considering PVT effects on the timing..

Rick C


The introduction of 3 link statements is used to inform a synthesizer that the CPC must be analyzed and generated as a wave-pipelined circuit instead of a regular one-cycle circuit. So the synthesizer would help you to generate logic that would balance the timing among all paths.

Can you do better than what a synthesizer can do?


I haven't seen what a synthesizer can do. Does anyone make a synthesizer
that does this?


> When inventing, you must be smarter, not limited by what your experience tells you.

Uh, if experience tells me something can't be done, why would I try to do
it? That's the utility of experience, you don't have to go down every blind
alley.

I've yet to see the utility in this idea. I would expect the speed
improvements to be small, if any and as I've mentioned, unless you get an
FPGA vendor to modify their chip designs along with the synthesis vendors to
modify their software, all at no small cost, this will not offer any
improvement in FPGAs.

Since you have not even taken a look at the issue of making this work over
PVT variations, I'm pretty sure it is not possible to even make it work in
today's FPGAs. There is just too much variation in timing of a single path
to wave-pipeline even a row of inverters.

--

Rick C


Weng Tianxiang
Guest

Tue Jan 23, 2018 5:33 am   



> I'm pretty sure it is not possible to even make it work in today's FPGAs.

If you don't know something, never say that it's impossible. From my point of view, it is beyond your specialty.

Quote:
I've yet to see the utility in this idea. I would expect the speed
improvements to be small, if any.


Intel has two versions of the 64x64-bit floating-point multiplier: one for compatibility with the previous 8087 version, and another using MMX technology, a wave-pipelining technology. Based on test benches on the web and Intel's literature, the wave-pipelined version is 20% faster than its 8087 counterpart (4 cycles vs. 5 cycles). Additionally, power consumption is dramatically reduced. A 64x64-bit floating-point multiplier has a maximum of 151 bits in one of its 4 middle stages, and you may calculate how many register bits have been saved!

> I've mentioned, unless you get an FPGA vendor to modify their chip designs along with the synthesis vendors to modify their software, all at no small cost, this will not offer any improvement in FPGAs.

Defining a new part of HDL specifically for wave-pipelined circuits in ASICs and FPGAs, and giving code designers a correspondingly simple and reliable design method, is one thing; getting synthesizers to implement the new part of the HDL is another, just as Jim, the VHDL chairman, always asks people here to push the synthesizer vendors to implement the new parts of VHDL-2008.

Quote:
I haven't seen what a synthesizer can do. Does anyone make a synthesizer
that does this?


A synthesizer, as software, needs an algorithm to do something fast and accurately, and there are more than 20-30 effective algorithms out there, many of which have been granted patents, as a simple Google search will show. So it was reasonable for me to assume from the beginning of my project that the technique for synthesizing and generating a wave-pipelined circuit is fully mature now and may have been mature 10-20 years ago.

Weng

rickman
Guest

Tue Jan 23, 2018 6:01 am   



Weng Tianxiang wrote on 1/22/2018 10:33 PM:
Quote:
I'm pretty sure it is not possible to even make it work in today's FPGAs.

If you don't know something, never say that it's impossible. From my point of view, it is beyond your specialty.


FPGAs *are* my specialty. I think you are showing you know little about
actually working with FPGAs. You don't seem to understand that the ratio of
logic to FFs is fixed in any given FPGA so saving registers is not of great
value. You also don't seem to understand that you have too much speed
variation over PVT to even use wave-pipelining in an FPGA.

Do you understand either of these two issues? Do you have a way around
these limitations?


Quote:
I've yet to see the utility in this idea. I would expect the speed
improvements to be small, if any.

Intel has two versions of the 64x64-bit floating-point multiplier: one for compatibility with the previous 8087 version, and another using MMX technology, a wave-pipelining technology. Based on test benches on the web and Intel's literature, the wave-pipelined version is 20% faster than its 8087 counterpart (4 cycles vs. 5 cycles). Additionally, power consumption is dramatically reduced. A 64x64-bit floating-point multiplier has a maximum of 151 bits in one of its 4 middle stages, and you may calculate how many register bits have been saved!


This has nothing to do with "FPGAs" which is the target I was referring to.


Quote:
I've mentioned, unless you get an FPGA vendor to modify their chip designs along with the synthesis vendors to modify their software, all at no small cost, this will not offer any improvement in FPGAs.

Defining a new part of the HDL specifically for wave-pipelined circuits in ASICs and FPGAs, and giving code designers a correspondingly simple and reliable design method, is one thing; getting synthesizers to implement that new part of the HDL is another, just as Jim, the chairman in VHDL, always asks people here to push the synthesizer vendors to implement the new parts of VHDL-2008.


Adding this to the HDL is trivial. The ENTIRE hard part is getting a
synthesizer to support this by doing all the hard work.


Quote:
I haven't seen what a synthesizer can do. Does anyone make a synthesizer
that does this?

A synthesizer, as software, needs an algorithm to do its job quickly and accurately, and there are more than 20-30 effective algorithms out there, many of them patented, as a simple Google search will show. So it was reasonable for me to assume from the beginning of my project that the technique for synthesizing and generating a wave-pipelined circuit is fully mature now and may have been mature 10-20 years ago.


For FPGAs???

I notice you completely snipped the part about PVT variations in timing
which very likely will be the stake driven through the heart of this approach.

Bottom line - if wave-pipelining were an advantage in FPGAs or even
practical with benefit, one of the FPGA companies would be promoting it. If
they could get a 20% speed improvement, they would be jumping through hoops
as it would give them a *huge* competitive advantage over the other FPGA
companies.

--

Rick C

Viewed the eclipse at Wintercrest Farms,
on the centerline of totality since 1998

Weng Tianxiang
Guest

Tue Jan 23, 2018 8:31 am   



Quote:
Bottom line - if wave-pipelining were an advantage in FPGAs or even
practical with benefit, one of the FPGA companies would be promoting it. If
they could get a 20% speed improvement, they would be jumping through hoops
as it would give them a *huge* competitive advantage over the other FPGA
companies.

Rick C


I appreciate your above paragraph, at least a small step forward!

There are many papers by Indian professors on wave-pipelined circuits for FPGAs. Here is one paper on the topic: "Some Experiments about Wave Pipelining on FPGAs"

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.42.2942&rep=rep1&type=pdf

Weng

HT-Lab
Guest

Tue Jan 23, 2018 7:48 pm   



On 22/01/2018 12:18, rickman wrote:
Quote:
HT-Lab wrote on 1/21/2018 1:19 PM:
...

There's your first mistake, no one uses Actel/Microsemi FPGAs.  They long
for the day they are as big as Lattice, lol!

Microsemi has been at the number 3 spot for as long as I use FPGA's (+/- 28
years starting with Actel's A1010). They are twice as large as Lattice.

Here is a reference:

https://www.eetimes.com/author.asp?doc_id=1331443

There's some BS somewhere...

http://www.fpgadeveloper.com/2011/07/list-and-comparison-of-fpga-companies.html

Comparing a blog article from 2011 that took some NASDAQ values against an
eetimes article that uses a market survey company's results from 2017, good job!

Quote:

More importantly, look at the numbers in your link.  The
Actel/Microsemi numbers are going in the wrong direction!  X, A and L
are headed upward year-to-year and Actel is headed down!

Sure, but Microsemi is still number 3, which was my point. In my day job
I speak to more companies using Microsemi than Lattice (both are dwarfed
by Xilinx and Intel though). That is not to say that Lattice makes
bad FPGAs; it is just that they haven't carved out a particular niche
area like Microsemi or, say, Achronix.

Quote:
While looking this up I found a link indicating the JTAG interface of
the ProASIC3 devices has a back door which would allow their security to
be bypassed.  Security was their claim to fame and this could be a major
blow to the company.


Made no impact on their business, they are still the number 1 company for
secure/space/avionics FPGAs. Also note ProASIC3 is nearly 10 years old.

Hans
www.ht-lab.com

Mark Curry
Guest

Tue Jan 23, 2018 10:10 pm   



In article <93899eff-01a7-4e78-8076-17febc2c8f0c_at_googlegroups.com>,
Weng Tianxiang <wtxwtx_at_gmail.com> wrote:
Quote:
Hi,

A wave-pipelined circuit has the same logic as its pipeline counterpart except that the wave-pipelined circuit has only one stage, a critical path
from the input register passing through a piece of computational logic to the output register, and no intermediate registers.


Weng,

I've read along and not commented here. But I find it's getting harder to ignore.
I've read up on some of the references you've posted. I think I've now got
a fairly good idea as to what this wave-pipelining thing is. So
thanks for the references.

But you've repeatedly claimed that your patent doesn't need to deal with PVT
variations - that's a problem for the synthesizer...

PVT variation is the elephant in the room. It's why wave-pipelining (and
other asynchronous design techniques) has failed to gain any hold outside
research facilities. It's a very difficult problem to solve. And it's
only getting worse at each lower technology node.

Now some of the papers you cite offer some fairly clever solutions that FPGA
manufacturers COULD use to try and enable more wave-pipelined solutions - the
one paper cited referred to small inline PLLs along the switch network to
allow delays to be more closely matched. Interesting solutions. But ones that
the FPGA manufacturers would have to take and implement. There's nothing
for us end FPGA users to use. The underlying technology just doesn't enable
end users to use wave-pipelining solutions in today's FPGAs, because of the PVT
variation problem. (And simply calling it "PVT" variation falsely
implies that just those three variables matter. There are many, many more
variables that affect the variation distribution.)
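
To put rough numbers on the PVT point above, here is a minimal sketch in Python. It assumes a simplified form of the timing constraint usually quoted in the wave-pipelining literature: the clock period has to cover the spread between the slowest and fastest path delay, plus register setup/hold time and clock skew. The function names and every delay value below are hypothetical and chosen only for illustration; they are not taken from any FPGA datasheet or from the papers cited in this thread.

```python
# Simplified wave-pipelining timing model (an assumption, not a vendor formula).
# All times are integers in picoseconds to avoid floating-point rounding.

def min_wave_period(d_max, d_min, t_setup, t_hold, clk_skew=0):
    """Smallest clock period that keeps successive waves from overlapping,
    under the simplified constraint
        T >= (d_max - d_min) + t_setup + t_hold + 2 * clk_skew
    """
    if d_min > d_max:
        raise ValueError("d_min must not exceed d_max")
    return (d_max - d_min) + t_setup + t_hold + 2 * clk_skew


def waves_in_flight(d_max, period):
    """Approximate number of data waves inside the logic at once (ceiling division)."""
    return -(-d_max // period)


# Hypothetical 10 ns combinational block with 200 ps setup and 100 ps hold.
D_MAX, T_SETUP, T_HOLD = 10_000, 200, 100

# Case 1: paths matched to within 1 ns (the idealized situation).
tight = min_wave_period(D_MAX, 9_000, T_SETUP, T_HOLD)

# Case 2: the fastest corner is twice as fast as the slowest (large PVT spread).
loose = min_wave_period(D_MAX, 5_000, T_SETUP, T_HOLD)

print(tight, waves_in_flight(D_MAX, tight))   # 1300 8
print(loose, waves_in_flight(D_MAX, loose))   # 5300 2
```

In this toy model the achievable period is set almost entirely by the delay spread (d_max - d_min), not by the logic depth, which is why a wide PVT spread removes most of the benefit: going from a 1 ns spread to a 5 ns spread drops the design from 8 waves in flight to 2.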

Now as to your patent claim. I'm not at all sure what you're trying to claim.
You list as an example a straightforward, and very basic pipelined datapath
example. One that matches thousands of others already in existence and prior
art. It's an input pipeline register, a large combinational cloud, and an
output pipeline register. Described in VHDL. I fail to see anything at all
novel there. But you claim that an undescribed, unbuilt tool could then
take such code and implement a wave-pipeline with it? That someone else
would have to build?

In another thread you seem to be claiming that the tool could automatically
determine the latency, and/or clock rate and/or "how many waves" are in
flight along the wave-pipeline circuit? That belies how design is done - it's
putting the cart before the horse. And it is a common misconception among users new to
FPGAs, even when designing traditional pipelined designs.

A common question that new users ask is "how fast can I make this pipelined
design run?". The experienced designer then answers - that's not how it's done.
The experienced designer has a specific problem they're trying to solve - they're not
trying to see "how fast it can run". The designer must guide the tool towards a
solution with a latency / clock rate / "how many waves" as a design goal up front.
Not as a derived solution output from the tool. The designer must know these up front
so as to design the entire solution. How does the designer know what
values are realistic goals? Experience. That's engineering.

So, my 3 cents. Wave-pipelines are in a class of asynchronous design techniques
that's of no use to current FPGA users. Perhaps if Xilinx or Altera (er, Intel)
or even some up and coming startup decides to utilize some of the techniques in the
cited papers, we may see something in the next couple of decades. Personally, I doubt it.

Regards,

Mark
