EDAboard.com | EDAboard.eu | EDAboard.de | EDAboard.co.uk | RTV forum PL | NewsGroups PL

elektroda.net NewsGroups Forum Index - FPGA - **All-real FFT for FPGA**

Guest

Mon Feb 13, 2017 1:05 am

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

--

Tim Wescott

Wescott Design Services

http://www.wescottdesign.com

I'm looking for work -- see my website!

Guest

Mon Feb 13, 2017 2:54 am

On Sunday, February 12, 2017 at 12:05:25 PM UTC-6, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

--

Tim Wescott

Wescott Design Services

http://www.wescottdesign.com

I'm looking for work -- see my website!

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

--

Tim Wescott

Wescott Design Services

http://www.wescottdesign.com

I'm looking for work -- see my website!

It's been a long time, as I remember:

The Hartley transform will work.

Shuffling the data before and after a half size complex FFT will work.

And you can use one of them to check the other.

Guest

Mon Feb 13, 2017 7:24 am

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

Protip: the synthesizer should trim out the unneeded logic, so

you don't need an optimized library macro.

Steve, always helpful

Guest

Mon Feb 13, 2017 8:30 am

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit so

maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed by

taking advantage of some of the symmetries in the FFT. I don't recall

the details as it has been over a decade since I worked with this, but

you can fold half of the real data into the imaginary portion, run a

size N/2 FFT and then unfold the results. I believe you have to run a

final N sized complex pass on these results to get your final answer. I

do recall it saved a *lot* when performing FFTs, nearly 50%.

--

Rick C

Guest

Mon Feb 13, 2017 8:30 am

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit so

maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed by

taking advantage of some of the symmetries in the FFT. I don't recall

the details as it has been over a decade since I worked with this, but

you can fold half of the real data into the imaginary portion, run a

size N/2 FFT and then unfold the results. I believe you have to run a

final N sized complex pass on these results to get your final answer. I

do recall it saved a *lot* when performing FFTs, nearly 50%.

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit so

maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed by

taking advantage of some of the symmetries in the FFT. I don't recall

the details as it has been over a decade since I worked with this, but

you can fold half of the real data into the imaginary portion, run a

size N/2 FFT and then unfold the results. I believe you have to run a

final N sized complex pass on these results to get your final answer. I

do recall it saved a *lot* when performing FFTs, nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was wondering

if it was done for FFTs as well.

--

Tim Wescott

Wescott Design Services

http://www.wescottdesign.com

I'm looking for work -- see my website!

Guest

Mon Feb 13, 2017 8:30 am

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit so

maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed by

taking advantage of some of the symmetries in the FFT. I don't recall

the details as it has been over a decade since I worked with this, but

you can fold half of the real data into the imaginary portion, run a

size N/2 FFT and then unfold the results. I believe you have to run a

final N sized complex pass on these results to get your final answer. I

do recall it saved a *lot* when performing FFTs, nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was wondering

if it was done for FFTs as well.

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit so

maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed by

taking advantage of some of the symmetries in the FFT. I don't recall

the details as it has been over a decade since I worked with this, but

you can fold half of the real data into the imaginary portion, run a

size N/2 FFT and then unfold the results. I believe you have to run a

final N sized complex pass on these results to get your final answer. I

do recall it saved a *lot* when performing FFTs, nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was wondering

if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings is

in time/size. Essentially it does an N/2 size FFT with an extra pass,

so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

--

Rick C

Guest

Mon Feb 13, 2017 8:30 am

On 2/13/2017 1:48 AM, rickman wrote:

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit so

maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed by

taking advantage of some of the symmetries in the FFT. I don't recall

the details as it has been over a decade since I worked with this, but

you can fold half of the real data into the imaginary portion, run a

size N/2 FFT and then unfold the results. I believe you have to run a

final N sized complex pass on these results to get your final answer. I

do recall it saved a *lot* when performing FFTs, nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was wondering

if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings is

in time/size. Essentially it does an N/2 size FFT with an extra pass,

so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit so

maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed by

taking advantage of some of the symmetries in the FFT. I don't recall

the details as it has been over a decade since I worked with this, but

you can fold half of the real data into the imaginary portion, run a

size N/2 FFT and then unfold the results. I believe you have to run a

final N sized complex pass on these results to get your final answer. I

do recall it saved a *lot* when performing FFTs, nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was wondering

if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings is

in time/size. Essentially it does an N/2 size FFT with an extra pass,

so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

Opps, I wrote that wrong. The optimized result would be order N/2 *

(log(N)+1). Just to be clear (or less clear depending on how you read

this), there are actually N/2 butterflies in each pass of the FFT. I

didn't show that since the constant 1/2 applies to both the standard FFT

and the optimized FFT. The point is the number of butterflies is cut in

half on each pass of the FFT for the optimized approach.

--

Rick C

Guest

Mon Feb 13, 2017 2:34 pm

On 2/13/2017 12:24 AM, Steve Pope wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

Protip: the synthesizer should trim out the unneeded logic, so

you don't need an optimized library macro.

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

Protip: the synthesizer should trim out the unneeded logic, so

you don't need an optimized library macro.

I don't think the synthesizer is capable of getting the same savings.

The optimizations would see the zero imaginary inputs and optimize the

first pass of the FFT. All passes would be N/2 butterflies while the

optimized approach would use half that many at the expense of an extra

pass. This is a big savings that the synthesis tools aren't likely to

figure out unless they recognize you are performing an FFT.

Someone refresh my memory. If you do an FFT with zeros in the imaginary

part of the inputs, the output has a symmetry that can be used to

process two real streams at once. I can't recall how it works exactly,

but that symmetry is the basis for separating the results of the two

halves of the original sequence before completing the last pass. One

portion is pulled out because of the even symmetry and the other portion

is pulled out because of the odd symmetry.

I found this page that appears to explain it, but I haven't taken the

time to dig into the math. I think I'd have to start all over again,

it's just been too long.

--

Rick C

Guest

Mon Feb 13, 2017 8:12 pm

On Sunday, February 12, 2017 at 11:05:25 AM UTC-7, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I could be mistaken, but doesn't the DCT, which is used for video compression, operate only on real data? It seems like you could find a DCT core designed for JPEG.

save (I think) roughly 2x the calculations of FFTs for complex data.

I could be mistaken, but doesn't the DCT, which is used for video compression, operate only on real data? It seems like you could find a DCT core designed for JPEG.

Guest

Tue Feb 14, 2017 12:56 am

On Mon, 13 Feb 2017 01:48:49 -0500, rickman wrote:

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data,

that save (I think) roughly 2x the calculations of FFTs for complex

data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit

so maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed

by taking advantage of some of the symmetries in the FFT. I don't

recall the details as it has been over a decade since I worked with

this, but you can fold half of the real data into the imaginary

portion, run a size N/2 FFT and then unfold the results. I believe

you have to run a final N sized complex pass on these results to get

your final answer. I do recall it saved a *lot* when performing FFTs,

nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was

wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings is

in time/size. Essentially it does an N/2 size FFT with an extra pass,

so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data,

that save (I think) roughly 2x the calculations of FFTs for complex

data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit

so maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed

by taking advantage of some of the symmetries in the FFT. I don't

recall the details as it has been over a decade since I worked with

this, but you can fold half of the real data into the imaginary

portion, run a size N/2 FFT and then unfold the results. I believe

you have to run a final N sized complex pass on these results to get

your final answer. I do recall it saved a *lot* when performing FFTs,

nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was

wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings is

in time/size. Essentially it does an N/2 size FFT with an extra pass,

so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

There's a distinct symmetry to the Fourier transform that you could use

at each step of the way instead of doing the whole thing and fixing it up

at the end.

I don't know if it would save steps, but it would certainly be easier on

someone who just wants to apply an algorithm.

--

Tim Wescott

Wescott Design Services

http://www.wescottdesign.com

I'm looking for work -- see my website!

Guest

Tue Feb 14, 2017 2:21 am

On Sun, 12 Feb 2017 12:05:17 -0600, Tim Wescott

<seemywebsite_at_myfooter.really> wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

As has been mentioned, there are a number of algorithms that exploit

symmetries to prune the computations down near minimal, but I don't

know of any canned FPGA libraries that do it. I don't know of any

canned FPGA libraries that include a FHT (Fast Hartley Transofrm)

either, which would also serve essentially the same purpose.

As Steve suggested, you could just force the imaginary part to static

zeroes and let the synthesizer optimizations clean it up, but it

probably wouldn't be quite as good as using a well-designed RFFT if

one was available.

All that said, many of the existing FPGA FFT libraries are pretty

efficient, so it may be worth just brute-forcing it and see whether it

is good enough.

---

This email has been checked for viruses by Avast antivirus software.

https://www.avast.com/antivirus

Guest

Tue Feb 14, 2017 4:02 am

On 2/13/2017 2:34 AM, rickman wrote:

On 2/13/2017 12:24 AM, Steve Pope wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

Protip: the synthesizer should trim out the unneeded logic, so

you don't need an optimized library macro.

I don't think the synthesizer is capable of getting the same savings.

The optimizations would see the zero imaginary inputs and optimize the

first pass of the FFT. All passes would be N/2 butterflies while the

optimized approach would use half that many at the expense of an extra

pass. This is a big savings that the synthesis tools aren't likely to

figure out unless they recognize you are performing an FFT.

Someone refresh my memory. If you do an FFT with zeros in the imaginary

part of the inputs, the output has a symmetry that can be used to

process two real streams at once. I can't recall how it works exactly,

but that symmetry is the basis for separating the results of the two

halves of the original sequence before completing the last pass. One

portion is pulled out because of the even symmetry and the other portion

is pulled out because of the odd symmetry.

I found this page that appears to explain it, but I haven't taken the

time to dig into the math. I think I'd have to start all over again,

it's just been too long.

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data, that

save (I think) roughly 2x the calculations of FFTs for complex data.

I did a quick search, but didn't find any that are made specifically for

FPGAs. Was my search too quick, or are there no IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

Protip: the synthesizer should trim out the unneeded logic, so

you don't need an optimized library macro.

I don't think the synthesizer is capable of getting the same savings.

The optimizations would see the zero imaginary inputs and optimize the

first pass of the FFT. All passes would be N/2 butterflies while the

optimized approach would use half that many at the expense of an extra

pass. This is a big savings that the synthesis tools aren't likely to

figure out unless they recognize you are performing an FFT.

Someone refresh my memory. If you do an FFT with zeros in the imaginary

part of the inputs, the output has a symmetry that can be used to

process two real streams at once. I can't recall how it works exactly,

but that symmetry is the basis for separating the results of the two

halves of the original sequence before completing the last pass. One

portion is pulled out because of the even symmetry and the other portion

is pulled out because of the odd symmetry.

I found this page that appears to explain it, but I haven't taken the

time to dig into the math. I think I'd have to start all over again,

it's just been too long.

Opps, forgot the link.

http://www.engineeringproductivitytools.com/stuff/T0001/PT10.HTM

--

Rick C

Guest

Tue Feb 14, 2017 4:08 am

On 2/13/2017 12:56 PM, Tim Wescott wrote:

On Mon, 13 Feb 2017 01:48:49 -0500, rickman wrote:

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data,

that save (I think) roughly 2x the calculations of FFTs for complex

data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit

so maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed

by taking advantage of some of the symmetries in the FFT. I don't

recall the details as it has been over a decade since I worked with

this, but you can fold half of the real data into the imaginary

portion, run a size N/2 FFT and then unfold the results. I believe

you have to run a final N sized complex pass on these results to get

your final answer. I do recall it saved a *lot* when performing FFTs,

nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was

wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings is

in time/size. Essentially it does an N/2 size FFT with an extra pass,

so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

There's a distinct symmetry to the Fourier transform that you could use

at each step of the way instead of doing the whole thing and fixing it up

at the end.

I don't know if it would save steps, but it would certainly be easier on

someone who just wants to apply an algorithm.

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data,

that save (I think) roughly 2x the calculations of FFTs for complex

data.

I did a quick search, but didn't find any that are made specifically

for FPGAs. Was my search too quick, or are there no IP sources to do

this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to include

these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit

so maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed

by taking advantage of some of the symmetries in the FFT. I don't

recall the details as it has been over a decade since I worked with

this, but you can fold half of the real data into the imaginary

portion, run a size N/2 FFT and then unfold the results. I believe

you have to run a final N sized complex pass on these results to get

your final answer. I do recall it saved a *lot* when performing FFTs,

nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was

wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings is

in time/size. Essentially it does an N/2 size FFT with an extra pass,

so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

There's a distinct symmetry to the Fourier transform that you could use

at each step of the way instead of doing the whole thing and fixing it up

at the end.

I don't know if it would save steps, but it would certainly be easier on

someone who just wants to apply an algorithm.

Are you referring to the nature of the FFT on real signals (i.e. data

sets that have zeros for the imaginary data)? That would only help with

the first pass of butterflies. Once your perform those the imaginary

part is not zero anymore.

That's why you get very little optimization by plugging in the zero data

and letting the tools work it out.

On the other hand, the vendor's FFT logic generator should support real

data inputs. It is a common enough need.

--

Rick C

Guest

Tue Feb 14, 2017 8:30 am

On 2/13/2017 9:39 PM, Tim Wescott wrote:

On Mon, 13 Feb 2017 16:08:00 -0500, rickman wrote:

On 2/13/2017 12:56 PM, Tim Wescott wrote:

On Mon, 13 Feb 2017 01:48:49 -0500, rickman wrote:

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data,

that save (I think) roughly 2x the calculations of FFTs for complex

data.

I did a quick search, but didn't find any that are made

specifically for FPGAs. Was my search too quick, or are there no

IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to

include these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit

so maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed

by taking advantage of some of the symmetries in the FFT. I don't

recall the details as it has been over a decade since I worked with

this, but you can fold half of the real data into the imaginary

portion, run a size N/2 FFT and then unfold the results. I believe

you have to run a final N sized complex pass on these results to get

your final answer. I do recall it saved a *lot* when performing

FFTs,

nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was

wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings

is in time/size. Essentially it does an N/2 size FFT with an extra

pass, so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

There's a distinct symmetry to the Fourier transform that you could use

at each step of the way instead of doing the whole thing and fixing it

up at the end.

I don't know if it would save steps, but it would certainly be easier

on someone who just wants to apply an algorithm.

Are you referring to the nature of the FFT on real signals (i.e. data

sets that have zeros for the imaginary data)? That would only help with

the first pass of butterflies. Once your perform those the imaginary

part is not zero anymore.

Yes, but the imaginary and real parts are still symmetrical around f = 0,

so you should neither have to compute nor use them.

That's why you get very little optimization by plugging in the zero data

and letting the tools work it out.

Yes, unless the optimizers get _really_ smart.

On 2/13/2017 12:56 PM, Tim Wescott wrote:

On Mon, 13 Feb 2017 01:48:49 -0500, rickman wrote:

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real data,

that save (I think) roughly 2x the calculations of FFTs for complex

data.

I did a quick search, but didn't find any that are made

specifically for FPGAs. Was my search too quick, or are there no

IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to

include these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a bit

so maybe I lost that one.

An FFT is inherently a complex function. Real data can be processed

by taking advantage of some of the symmetries in the FFT. I don't

recall the details as it has been over a decade since I worked with

this, but you can fold half of the real data into the imaginary

portion, run a size N/2 FFT and then unfold the results. I believe

you have to run a final N sized complex pass on these results to get

your final answer. I do recall it saved a *lot* when performing

FFTs,

nearly 50%.

My understanding is that there were some software packages that baked

that into the algorithm, for what savings I don't know. I was

wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings

is in time/size. Essentially it does an N/2 size FFT with an extra

pass, so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

There's a distinct symmetry to the Fourier transform that you could use

at each step of the way instead of doing the whole thing and fixing it

up at the end.

I don't know if it would save steps, but it would certainly be easier

on someone who just wants to apply an algorithm.

Are you referring to the nature of the FFT on real signals (i.e. data

sets that have zeros for the imaginary data)? That would only help with

the first pass of butterflies. Once your perform those the imaginary

part is not zero anymore.

Yes, but the imaginary and real parts are still symmetrical around f = 0,

so you should neither have to compute nor use them.

That's why you get very little optimization by plugging in the zero data

and letting the tools work it out.

Yes, unless the optimizers get _really_ smart.

We are talking two different things. The HDL synthesis optimizers are

optimizing *logic*. They don't know diddly about what you are using the

logic for.

On the other hand, the vendor's FFT logic generator should support real

data inputs. It is a common enough need.

I agree, but when I looked I didn't see it.

The Xilinx logic generators also seem to only support 2^n vector sizes.

I don't know how convoluted things get, but I know that with a general-

purpose processor, a prime-value-sized FFT is nearly as fast as a 2^n

one. Don't ask me how -- it's just a box with magic inside for me at

this point.

data inputs. It is a common enough need.

I agree, but when I looked I didn't see it.

The Xilinx logic generators also seem to only support 2^n vector sizes.

I don't know how convoluted things get, but I know that with a general-

purpose processor, a prime-value-sized FFT is nearly as fast as a 2^n

one. Don't ask me how -- it's just a box with magic inside for me at

this point.

When I first learned FFTs, I did it by looking at the equations in terms

of the twiddle factors each input was multiplied by as it contributed to

the final value. It works out to be the same calculation as the DFT.

It is taking advantage of the cyclic nature of the twiddle factors to

avoid redundant calculations. The same basis provides the various

symmetries.

I can't say how prime sized data sets work out. I'm very surprised they

even exist. I can't picture how the butterflies would work.

--

Rick C

Guest

Tue Feb 14, 2017 8:30 am

On Mon, 13 Feb 2017 22:36:41 -0500, rickman wrote:

On 2/13/2017 9:39 PM, Tim Wescott wrote:

On Mon, 13 Feb 2017 16:08:00 -0500, rickman wrote:

On 2/13/2017 12:56 PM, Tim Wescott wrote:

On Mon, 13 Feb 2017 01:48:49 -0500, rickman wrote:

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real

data, that save (I think) roughly 2x the calculations of FFTs for

complex data.

I did a quick search, but didn't find any that are made

specifically for FPGAs. Was my search too quick, or are there no

IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to

include these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a

bit so maybe I lost that one.

An FFT is inherently a complex function. Real data can be

processed by taking advantage of some of the symmetries in the

FFT. I don't recall the details as it has been over a decade

since I worked with this, but you can fold half of the real data

into the imaginary portion, run a size N/2 FFT and then unfold the

results. I believe you have to run a final N sized complex pass

on these results to get your final answer. I do recall it saved a

*lot* when performing FFTs,

nearly 50%.

My understanding is that there were some software packages that

baked that into the algorithm, for what savings I don't know. I

was wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings

is in time/size. Essentially it does an N/2 size FFT with an extra

pass, so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

There's a distinct symmetry to the Fourier transform that you could

use at each step of the way instead of doing the whole thing and

fixing it up at the end.

I don't know if it would save steps, but it would certainly be easier

on someone who just wants to apply an algorithm.

Are you referring to the nature of the FFT on real signals (i.e. data

sets that have zeros for the imaginary data)? That would only help

with the first pass of butterflies. Once your perform those the

imaginary part is not zero anymore.

Yes, but the imaginary and real parts are still symmetrical around f =

0,

so you should neither have to compute nor use them.

That's why you get very little optimization by plugging in the zero

data and letting the tools work it out.

Yes, unless the optimizers get _really_ smart.

We are talking two different things. The HDL synthesis optimizers are

optimizing *logic*. They don't know diddly about what you are using the

logic for.

On Mon, 13 Feb 2017 16:08:00 -0500, rickman wrote:

On 2/13/2017 12:56 PM, Tim Wescott wrote:

On Mon, 13 Feb 2017 01:48:49 -0500, rickman wrote:

On 2/13/2017 12:03 AM, Tim Wescott wrote:

On Sun, 12 Feb 2017 20:32:59 -0500, rickman wrote:

On 2/12/2017 1:05 PM, Tim Wescott wrote:

So, there are algorithms out there to perform an FFT on real

data, that save (I think) roughly 2x the calculations of FFTs for

complex data.

I did a quick search, but didn't find any that are made

specifically for FPGAs. Was my search too quick, or are there no

IP sources to do this?

It would seem like a slam-dunk for Xilinx and Intel/Altera to

include these algorithms in their FFT libraries.

I thought I replied to this, but my computer has been crashing a

bit so maybe I lost that one.

An FFT is inherently a complex function. Real data can be

processed by taking advantage of some of the symmetries in the

FFT. I don't recall the details as it has been over a decade

since I worked with this, but you can fold half of the real data

into the imaginary portion, run a size N/2 FFT and then unfold the

results. I believe you have to run a final N sized complex pass

on these results to get your final answer. I do recall it saved a

*lot* when performing FFTs,

nearly 50%.

My understanding is that there were some software packages that

baked that into the algorithm, for what savings I don't know. I

was wondering if it was done for FFTs as well.

I'm not sure what you mean. When you say "baking" it into the

algorithm, it would do pretty much what I described. That *is* the

algorithm. I haven't heard of any other optimizations. The savings

is in time/size. Essentially it does an N/2 size FFT with an extra

pass, so instead of N*log(N) step it takes (N+1)*log(N/2) steps.

There's a distinct symmetry to the Fourier transform that you could

use at each step of the way instead of doing the whole thing and

fixing it up at the end.

I don't know if it would save steps, but it would certainly be easier

on someone who just wants to apply an algorithm.

Are you referring to the nature of the FFT on real signals (i.e. data

sets that have zeros for the imaginary data)? That would only help

with the first pass of butterflies. Once your perform those the

imaginary part is not zero anymore.

Yes, but the imaginary and real parts are still symmetrical around f =

0,

so you should neither have to compute nor use them.

That's why you get very little optimization by plugging in the zero

data and letting the tools work it out.

Yes, unless the optimizers get _really_ smart.

We are talking two different things. The HDL synthesis optimizers are

optimizing *logic*. They don't know diddly about what you are using the

logic for.

Well, yes, but these computers are getting smarter all the time. When

you tell Synopsis to synthesize and it says "your fly is unzipped",

you'll know that the optimizers are too damned smart.

--

Tim Wescott

Wescott Design Services

http://www.wescottdesign.com

I'm looking for work -- see my website!

elektroda.net NewsGroups Forum Index - FPGA - **All-real FFT for FPGA**