Virtex II multiplier performance

MM

Guest
I've been playing with a test design consisting of a single 16x16=32
CoreGen-generated multiplier with maximum pipelining and the registered output
option. I am using ISE 5.2 and I set the clk constraint to 200 MHz (it is
the only constraint). The post-PAR periods I am getting for different speed
grades of the same XC2V2000 device are as follows:

-6: 4.986 ns
-5: 4.910 ns
-4: 5.839 ns

It seems a little weird that the middle grade is faster... Any comments on
that?

Also, is this pretty much the best I can get? I might need to do a design
that will have to run at 210 MHz and I don't feel comfortable with these
results. I know this topic has been discussed in the past but I could not
find good conclusive numbers...


Thanks,
/Mikhail
 
Mikhail,
The reason that the -5 part appears to be faster is that the PAR tool
stops optimising when the timing passes. You set it to 5ns. Once it got
better than this it stopped bothering to improve further. It just so
happened that the PAR with the -5 part did a tiny bit better.
It might sound radical, but if you want it to work at 210MHz, try doing
the PAR with the constraint to 210MHz! ;-) See what happens and report back!
cheers, Syms.

 
Mikhail,
I assume you're talking about PAR results. Once the implementation tools
achieve your constraints, they stop. A better way to view the results is:

-6: constraint = 5.0 ns, PASS
-5: constraint = 5.0 ns, PASS
-4: constraint = 5.0 ns, FAIL

If you want to run the implementation at 210MHz I suggest you benchmark at
210MHz, or slightly faster if you are trying to gauge some margin.
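To make the "slightly faster for margin" idea concrete, here is a minimal sketch of turning a target clock frequency into the period you would enter as the constraint. The function name `period_ns` and the 5% margin are illustrative assumptions, not values from the thread or from the Xilinx tools.

```python
def period_ns(freq_mhz, margin=0.0):
    """Return the clock period in ns for a target frequency in MHz.

    margin is a fraction (0.05 = 5%) by which the target frequency is
    raised before converting, so the resulting constraint is tighter.
    The margin value is a hypothetical choice, not a recommendation.
    """
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / (freq_mhz * (1.0 + margin))


if __name__ == "__main__":
    for label, margin in (("200 MHz", 0.0), ("210 MHz", 0.0), ("210 MHz + 5%", 0.05)):
        freq = 200.0 if label.startswith("200") else 210.0
        print(f"{label}: {period_ns(freq, margin):.3f} ns")
```

So the 200 MHz constraint corresponds to the 5 ns figure quoted above, and a 210 MHz target needs a period a little under 4.77 ns before any margin is added.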

 
"Symon" <symon_brewer@hotmail.com> wrote in message
news:bpbfmp$1n2qp1$1@ID-212844.news.uni-berlin.de...
Mikhail,
It might sound radical, but if you want it to work at 210MHz, try doing
the PAR with the constraint to 210MHz! ;-) See what happens and report back!

OK, here is my report (use fixed size font to see better):

Requested    Actual period (ns)
period (ns)  -6      -5      -4
5            4.986   4.910   5.839
4.65         4.561   5.146   5.839
4.5          4.328   5.146   5.828
4            4.543   5.146   Impossible
4.3          4.543   5.146   Impossible
4.4          4.328   5.140   Impossible

All of this has been done with default implementation settings.

/Mikhail
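For anyone who wants to read the table as frequencies, the following sketch takes the numbers reported above, picks the best period achieved for each speed grade, and checks it against the 210 MHz goal. The names `RESULTS_NS`, `best_period` and `to_mhz` are made up for this example, and `None` stands in for the runs reported as "Impossible".

```python
# Post-PAR periods (ns) reported in the thread, keyed by requested period (ns).
# None marks the runs reported as "Impossible".
RESULTS_NS = {
    5.0:  {"-6": 4.986, "-5": 4.910, "-4": 5.839},
    4.65: {"-6": 4.561, "-5": 5.146, "-4": 5.839},
    4.5:  {"-6": 4.328, "-5": 5.146, "-4": 5.828},
    4.4:  {"-6": 4.328, "-5": 5.140, "-4": None},
    4.3:  {"-6": 4.543, "-5": 5.146, "-4": None},
    4.0:  {"-6": 4.543, "-5": 5.146, "-4": None},
}

TARGET_MHZ = 210.0  # the design goal mentioned in the original question


def to_mhz(period_ns):
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns


def best_period(grade):
    """Smallest reported period (ns) for a speed grade, ignoring failed runs."""
    periods = [row[grade] for row in RESULTS_NS.values() if row[grade] is not None]
    return min(periods)


if __name__ == "__main__":
    for grade in ("-6", "-5", "-4"):
        best = best_period(grade)
        verdict = "meets" if to_mhz(best) >= TARGET_MHZ else "misses"
        print(f"{grade}: best {best:.3f} ns = {to_mhz(best):.1f} MHz, {verdict} {TARGET_MHZ:.0f} MHz")
```

Only the best of six runs per grade is considered, all with default implementation settings, so this summarises the reported data rather than the limit of the device.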
 
Hi Mikhail,
So, it looks like you max out at about 230 MHz (4.328 ns) with the -6 parts.
If you set the constraint below 5 ns, the PAR tool gives up on the -5 part, so
it looks like 4.91 ns is the best you'll get with -5. Note that if you
over-constrain, you don't get the best results! Another option you could
consider would be to use the built-in hardware multipliers alternately, i.e.
use two multipliers so that each one is active every other clock cycle. Use
the pipelined versions, though.
All the best, Syms.
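The alternating-multiplier idea can be sketched as a small behavioural model. This is only an illustration of the data flow, assuming each multiplier is a simple pipeline that advances on every other clock; it is not HDL, it says nothing about real timing, and the names `HalfRateMultiplier` and `interleaved_multiply` and the default of two stages are invented for the example.

```python
from collections import deque


class HalfRateMultiplier:
    """Pipelined multiplier model that is clocked only on its own enable ticks."""

    def __init__(self, stages):
        # Fixed-length pipeline; None marks a stage that holds no result yet.
        self.pipe = deque([None] * stages, maxlen=stages)

    def step(self, operands):
        """Advance one enabled tick: emit the oldest entry, accept new operands."""
        out = self.pipe[0]
        a, b = operands
        self.pipe.append(a * b)  # a full deque with maxlen drops its oldest entry
        return out


def interleaved_multiply(pairs, stages=2):
    """Feed operand pairs alternately to two multipliers and merge the results."""
    pairs = list(pairs)
    mults = [HalfRateMultiplier(stages), HalfRateMultiplier(stages)]
    # Extra dummy cycles flush the last real results out of both pipelines.
    stream = pairs + [(0, 0)] * (2 * stages)
    results = []
    for cycle, operands in enumerate(stream):
        out = mults[cycle % 2].step(operands)  # even cycles: mult 0, odd: mult 1
        if out is not None:
            results.append(out)
    return results[:len(pairs)]


if __name__ == "__main__":
    print(interleaved_multiply([(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]))
```

Each multiplier sees a new operand pair only every second cycle, so it has two clock periods per pipeline stage, while one result still comes out per cycle, in the original order, after the pipelines fill.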
 
