elektroda.net NewsGroups Forum Index - LSI - **Logic minimization software with LUT6 support?**

Guest

Tue Sep 25, 2007 10:30 pm

I am looking for open source software for logic minimization (a la espresso) targeted to a lookup-table-based architecture that can take advantage of six-input LUTs (as you can imagine, I have in mind a LUT6/Virtex 5 implementation).

Is there such a beast?

thanks much

-Arrigo

Guest

Wed Sep 26, 2007 4:48 am

On Sep 25, 4:30 pm, dudesinmex...@gmail.com wrote:

I am looking for open source software for logic minimization (a la

espresso) targeted to a lookup table based architecture that can take advantage

of six inputs LUTs (as you can imagine I have in mind a LUT6/Virtex 5

implementation). Is there such a beast?

Howdy,

You piqued my curiosity. Could you explain why you need this?

Synthesis tools do this for you, probably with much greater speed, accuracy, and more intelligent trade-offs than you'd be able to achieve manually, especially once you start considering registers rather than just pure combinational logic.

Thanks,

Marc

Guest

Wed Sep 26, 2007 3:24 pm

AFAIK there are no good multi-level optimization algorithms that take mapping effects into account. Instead, logic optimization is done independently of mapping.

Usually, a technology-independent multi-level logic optimization is performed first (based on transformations, ATPG, or implications). There should be academic implementations for that around, most of them quite old.

Afterwards, technology mapping is performed. Mapping for LUTs can be done delay-optimally in polynomial time (FlowMap). For 6-LUT FPGAs it might make sense to combine mapping with retiming to balance the amount of logic between stages.

There is a paper by Sunil Khatri on optimization for networks of PLAs. A 6-LUT might be large enough to benefit from that approach (I doubt it).

Kolja Sulimma

On 25 Sep., 23:30, dudesinmex...@gmail.com wrote:

I am looking for open source software for logic minimization (a la

espresso)

targeted to a lookup table based architecture that can take advantage

of six

inputs LUTs (as you can imagine I have in mind a LUT6/Virtex 5

implementation).

Is there such a beast?

thanks much

-Arrigo

Guest

Wed Sep 26, 2007 11:20 pm

On Sep 25, 8:48 pm, Marc Randolph <mr...@my-deja.com> wrote:

On Sep 25, 4:30 pm, dudesinmex...@gmail.com wrote:

I am looking for open source software for logic minimization (a la

espresso) targeted to a lookup table based architecture that can take advantage

of six inputs LUTs (as you can imagine I have in mind a LUT6/Virtex 5

implementation). Is there such a beast?

Howdy,

You piqued my curiosity. Could you explain why you need this?

Sure. I am working on a Virtex 5 design with *lots* of squarer circuits (Z=A*B with A=B) where the input is a signed 9-bit value in the [-255,255] range. I am wondering if the LUT6 would give any advantage compared to other implementations. Then, looking at Ray Andraka's page on multipliers, I realized that a "Partial product LUT multiplier" looks like a good architecture for the squarer (since A=B the number of LUTs is cut in half), and that the LUT6 probably does not buy you more than a LUT4 since the carry chain limits the number of bits to four per slice.
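As a rough illustration of the partial-product idea for a squarer, here is a minimal Python sketch. It assumes the magnitude is split into two 4-bit nibbles; the split and the function name `square_partial_products` are illustrative choices, not taken from Ray Andraka's page. Because both operands are the same, the two cross products are identical, so only three distinct products are needed instead of four.

```python
def square_partial_products(a):
    """Square a signed value in [-255, 255] from nibble-sized partial products."""
    if not -255 <= a <= 255:
        raise ValueError("input outside the [-255, 255] range")
    m = abs(a)            # a*a == |a|*|a|, so the sign can be dropped first
    lo = m & 0xF          # low nibble of the magnitude
    hi = (m >> 4) & 0xF   # high nibble of the magnitude
    # Each product below is what a small lookup table would hold in hardware.
    ll = lo * lo          # weight 2**0
    hl = hi * lo          # weight 2**4, needed twice because hl == lh
    hh = hi * hi          # weight 2**8
    # m*m = 256*hh + 16*(hl + lh) + ll = 256*hh + 32*hl + ll
    return (hh << 8) + (hl << 5) + ll


# Exhaustive check over the stated input range.
assert all(square_partial_products(a) == a * a for a in range(-255, 256))
```

The shifts by 8 and 5 correspond to the weighting of the partial products before they are summed; in hardware that sum would typically go through the carry chain.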

-Arrigo

Guest

Wed Sep 26, 2007 11:26 pm

Sure. I am working at a Virtex 5 design with *lots* of squarer

circuits (Z=A*B with A=B) where the input

is a signed 9 bit value in the [-255,255] range.

9 bits is a small number. Have you considered table lookup?

--

These are my opinions, not necessarily my employer's. I hate spam.

Guest

Wed Sep 26, 2007 11:57 pm

dudesinmexico_at_gmail.com wrote:

On Sep 25, 8:48 pm, Marc Randolph <mr...@my-deja.com> wrote:

On Sep 25, 4:30 pm, dudesinmex...@gmail.com wrote:

I am looking for open source software for logic minimization (a la

espresso) targeted to a lookup table based architecture that can take advantage

of six inputs LUTs (as you can imagine I have in mind a LUT6/Virtex 5

implementation). Is there such a beast?

Howdy,

You piqued my curiosity. Could you explain why you need this?

Sure. I am working at a Virtex 5 design with *lots* of squarer

circuits (Z=A*B with A=B) where the input

is a signed 9 bit value in the [-255,255] range. I am wondering if

the LUT6 would give any advantage

compared to other implementations. Then, looking at Ray Andraka's page

on multipliers I realized that

a "Partial product LUT multiplier" looks like a good architecture for

the squarer (since A=B the number of LUTs is cut in half), and that

the LUT6 probably does not buy you more than a LUT4 since the carry

chain limits the number of bits to four per slice.

-Arrigo

The LUT size isn't directly related to the carry chain pitch. The 6-input LUTs (12 per partial product for a 6-bit in, 12-bit out LUT) make your partial products, and then you add those together with the appropriate shifting for the weighting of the partials to arrive at the complete square. For 9 input bits, though, you really only need 4- and 5-input LUTs. A 6-LUT implementation is still going to have 4 partial products, so there are no savings over 4/5-input LUTs.

That said, you could use dual-port BRAMs as direct look-ups instead and get two 9-bit squarers per BRAM (one on each port).
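To make the direct look-up concrete, here is a small Python sketch that generates the table contents such a memory would hold. It assumes the 9-bit input is used directly as a two's-complement address and that each entry is 16 bits wide (255*255 = 65025 fits in 16 bits), giving 512 x 16 = 8192 bits, which fits in an 18 Kb block RAM. The helper names `build_square_table` and `to_code` are made up for the example.

```python
def build_square_table():
    """Return 512 entries: table[code] = value**2 for the 9-bit two's-complement code."""
    table = []
    for code in range(512):
        value = code - 512 if code >= 256 else code  # decode two's complement
        # -256 (code 0x100) is outside the stated [-255, 255] range and its
        # square needs 17 bits; storing 0 there is an arbitrary assumption.
        table.append(0 if value == -256 else value * value)
    return table


def to_code(value):
    """Encode a signed value as a 9-bit two's-complement address."""
    return value & 0x1FF


table = build_square_table()
print(table[to_code(-3)], table[to_code(255)])
```

Since both ports of a dual-port memory read the same contents, two independent squarers can share one such table, one address per port.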

Guest

Thu Sep 27, 2007 12:29 am

On Sep 26, 3:57 pm, Ray Andraka <r...@andraka.com> wrote:

dudesinmex...@gmail.com wrote:

On Sep 25, 8:48 pm, Marc Randolph <mr...@my-deja.com> wrote:

On Sep 25, 4:30 pm, dudesinmex...@gmail.com wrote:

I am looking for open source software for logic minimization (a la

espresso) targeted to a lookup table based architecture that can take advantage

of six inputs LUTs (as you can imagine I have in mind a LUT6/Virtex 5

implementation). Is there such a beast?

Howdy,

You piqued my curiosity. Could you explain why you need this?

Sure. I am working at a Virtex 5 design with *lots* of squarer

circuits (Z=A*B with A=B) where the input

is a signed 9 bit value in the [-255,255] range. I am wondering if

the LUT6 would give any advantage

compared to other implementations. Then, looking at Ray Andraka's page

on multipliers I realized that

a "Partial product LUT multiplier" looks like a good architecture for

the squarer (since A=B the number of LUTs is cut in half), and that

the LUT6 probably does not buy you more than a LUT4 since the carry

chain limits the number of bits to four per slice.

-Arrigo

The LUT size isn't related directly to the carry chain pitch. The 6

input LUTs (12 per partial product for a 6bit in, 12 bit out LUT) make

your partial products, and then you add those together with the

appropriate shifting for the weighting of the partials to arrive at the

complete square. For 9 input bits though, you really only need 4 and 5

input LUTs. A 6 LUT implementation is still going to have 4 partial

products, so there is no savings over 4/5 input LUTs.

That said, you could use dual port BRAMs as direct look-ups instead and

get two 9 bit squarers per BRAM (one on each port).

Ray, nice to hear from you. Yes, I have considered BRAMs as lookup tables. In fact, we noticed that Synplify does something pretty cool: it packs two multipliers in a single BRAM18, using one port for each multiplier. Of course, this is possible since they share the same table.

According to the documentation, one can use a BRAM36 as two BRAM18s; however, PlanAhead refuses to place a pblock with two BRAM18s on any area with just one BRAM36. This could be either a PlanAhead bug or simply that the BRAM36 still has just two ports...

Guest

Thu Sep 27, 2007 12:42 am

dudesinmexico_at_gmail.com wrote:

On Sep 26, 3:57 pm, Ray Andraka <r...@andraka.com> wrote:

dudesinmex...@gmail.com wrote:

On Sep 25, 8:48 pm, Marc Randolph <mr...@my-deja.com> wrote:

On Sep 25, 4:30 pm, dudesinmex...@gmail.com wrote:

I am looking for open source software for logic minimization (a la

espresso) targeted to a lookup table based architecture that can take advantage

of six inputs LUTs (as you can imagine I have in mind a LUT6/Virtex 5

implementation). Is there such a beast?

Howdy,

You piqued my curiosity. Could you explain why you need this?

Sure. I am working at a Virtex 5 design with *lots* of squarer

circuits (Z=A*B with A=B) where the input

is a signed 9 bit value in the [-255,255] range. I am wondering if

the LUT6 would give any advantage

compared to other implementations. Then, looking at Ray Andraka's page

on multipliers I realized that

a "Partial product LUT multiplier" looks like a good architecture for

the squarer (since A=B the number of LUTs is cut in half), and that

the LUT6 probably does not buy you more than a LUT4 since the carry

chain limits the number of bits to four per slice.

-Arrigo

The LUT size isn't related directly to the carry chain pitch. The 6

input LUTs (12 per partial product for a 6bit in, 12 bit out LUT) make

your partial products, and then you add those together with the

appropriate shifting for the weighting of the partials to arrive at the

complete square. For 9 input bits though, you really only need 4 and 5

input LUTs. A 6 LUT implementation is still going to have 4 partial

products, so there is no savings over 4/5 input LUTs.

That said, you could use dual port BRAMs as direct look-ups instead and

get two 9 bit squarers per BRAM (one on each port).

Ray, nice to hear from you. Yes, I have considered BRAMs as lookup

tables.

In fact, we noticed that Synplify does something pretty cool: it packs

two multipliers

in a single BRAM18, using one port for each multiplier. Of course this

is possible since

they share the same table.

According to the documentation, one can use a BRAM36 as two BRAM18s,

however

PlanAhead refuses to place a pblock with two BRAM18s on any area with

just one BRAM36.

This could be either a PlanAhead bug or simply that the BRAM36 still

has just two ports...

It is a limitation of the BRAM36.

Guest

Thu Sep 27, 2007 12:46 am

<dudesinmexico_at_gmail.com> wrote in message

news:1190849356.286625.69510_at_d55g2000hsg.googlegroups.com...

Ray, nice to hear from you. Yes, I have considered BRAMs as lookup

tables.

In fact, we noticed that Synplify does something pretty cool: it packs

two multipliers

in a single BRAM18, using one port for each multiplier. Of course this

is possible since

they share the same table.

According to the documentation, one can use a BRAM36 as two BRAM18s,

however

PlanAhead refuses to place a pblock with two BRAM18s on any area with

just one BRAM36.

This could be either a PlanAhead bug or simply that the BRAM36 still

has just two ports...

From p 125 of ug190.pdf (v3.1 Virtex-5 User Guide, 9/11/2007)

Two RAMB18s can be placed in the same RAMB36 location by using the BEL

UPPER/LOWER constraint:

inst "my_ramb18" LOC = RAMB36_X0Y0 | BEL = UPPER

inst "my_ramb18" LOC = RAMB36_X0Y0 | BEL = LOWER

which is echoed in Answer Record 25115

http://www.xilinx.com/xlnx/xil_ans_display.jsp?iLanguageID=1&iCountryID=1&getPagePath=25115

though I think the note's author forgot to write UPPER for the last line.

Darned cut & paste!

Whether PlanAhead supports these constraints natively isn't obvious. Ask

the hotline or your FAE!

- John_H

Guest

Thu Sep 27, 2007 12:57 am

John_H wrote:

<dudesinmexico_at_gmail.com> wrote in message

news:1190849356.286625.69510_at_d55g2000hsg.googlegroups.com...

Ray, nice to hear from you. Yes, I have considered BRAMs as lookup

tables.

In fact, we noticed that Synplify does something pretty cool: it packs

two multipliers

in a single BRAM18, using one port for each multiplier. Of course this

is possible since

they share the same table.

According to the documentation, one can use a BRAM36 as two BRAM18s,

however

PlanAhead refuses to place a pblock with two BRAM18s on any area with

just one BRAM36.

This could be either a PlanAhead bug or simply that the BRAM36 still

has just two ports...

From p 125 of ug190.pdf (v3.1 Virtex-5 User Guide, 9/11/2007)

Two RAMB18s can be placed in the same RAMB36 location by using the BEL

UPPER/LOWER constraint:

inst "my_ramb18" LOC = RAMB36_X0Y0 | BEL = UPPER

inst "my_ramb18" LOC = RAMB36_X0Y0 | BEL = LOWER

which is echoed in Answer Record 25115

http://www.xilinx.com/xlnx/xil_ans_display.jsp?iLanguageID=1&iCountryID=1&getPagePath=25115

though I think the note's author forgot to write UPPER for the last line.

Darned cut & paste!

Whether PlanAhead supports these constraints natively isn't obvious. Ask

the hotline or your FAE!

- John_H

Oops, I misread what you were saying here; I thought you were trying to use the BRAM36 dual-ported. I don't often use PlanAhead. Putting the BEL constraints on instantiated BRAM18s works fine for placing them together.

Guest

Thu Sep 27, 2007 9:21 am

On 27 Sep., 00:20, dudesinmex...@gmail.com wrote:

On Sep 25, 8:48 pm, Marc Randolph <mr...@my-deja.com> wrote:

On Sep 25, 4:30 pm, dudesinmex...@gmail.com wrote:

I am looking for open source software for logic minimization (a la

espresso) targeted to a lookup table based architecture that can take advantage

of six inputs LUTs (as you can imagine I have in mind a LUT6/Virtex 5

implementation). Is there such a beast?

Howdy,

You piqued my curiosity. Could you explain why you need this?

Sure. I am working at a Virtex 5 design with *lots* of squarer

circuits (Z=A*B with A=B) where the input

is a signed 9 bit value in the [-255,255] range. I am wondering if

the LUT6 would give any advantage

compared to other implementations. Then, looking at Ray Andraka's page

on multipliers I realized that

a "Partial product LUT multiplier" looks like a good architecture for

the squarer (since A=B the number of LUTs is cut in half), and that

the LUT6 probably does not buy you more than a LUT4 since the carry

chain limits the number of bits to four per slice.

-Arrigo

How about this (a Virtex-5 CLB can be used as an 8-input lookup table):

8 LUTs compute the absolute value of the input.

6*1 LUTs form a lookup table for the lower 6 bits of the result (they depend only on 6 inputs)

1*2 LUTs form a lookup table for bit 6

9*4 LUTs form lookup tables for bits 7 to 15

This is about the size of four partial products, but it might be faster (depending on the performance of the CLB muxes compared to the carry chain).
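One way to sanity-check a per-bit budget like this is to brute-force which bits of the 8-bit magnitude each result bit actually depends on. The Python sketch below does that; the function names and the simple "1 LUT up to 6 inputs, doubling per extra input" cost model are assumptions for illustration and ignore the wide muxes. Since (m + 2^k t)^2 is congruent to m^2 modulo 2^(k+1) for k >= 1, result bits 0..k can only depend on the low k bits of m, so the script may report smaller supports than the estimate above for some bits.

```python
def bit_support(out_bit, n_in=8):
    """Input bits of m (0 <= m < 2**n_in) that can change bit `out_bit` of m*m."""
    support = set()
    for j in range(n_in):
        for m in range(1 << n_in):
            flipped = m ^ (1 << j)
            if ((m * m) >> out_bit) & 1 != ((flipped * flipped) >> out_bit) & 1:
                support.add(j)  # toggling input bit j changes this output bit
                break
    return support


def lut6_estimate(n_inputs):
    """Rough LUT6 count for one output bit, ignoring the combining muxes."""
    return 1 if n_inputs <= 6 else 1 << (n_inputs - 6)


# Print, for each result bit, the input bits it depends on and a LUT6 estimate.
for bit in range(16):
    deps = bit_support(bit)
    print(bit, sorted(deps), lut6_estimate(len(deps)))
```

The printed supports apply to the magnitude only; the absolute-value stage in front of the table is not modelled here.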

Kolja Sulimma
