efinix bit stream question...

John Larkin

Guest
We use the Efinix Trion T20 FPGA.

Questions about the config bit streams:

Are they always the same size, or does it depend on how much logic is
compiled? Would a simple application use less?

Are the streams very compressible? We have done some simple run-length
coding to greatly reduce the storage requirement for other FPGAs.
Configs tend to have long runs of 0's.

The T20/256 claims to need 5.4 megabits. I'd like to store the FPGA
config and application code in a Raspberry Pi Pico, which has 2 MB of
onboard flash. Storing the full config would use about a third of
that, so reducing that would be useful.
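
The "about a third" figure can be checked with a few lines of arithmetic. This is only a sketch: it assumes "2 MB" means 2 MiB of flash and takes the 5.4 megabit figure at face value.

```python
# Back-of-the-envelope flash budget for an uncompressed bitstream.
BITSTREAM_BITS = 5.4e6           # quoted config size, in bits
FLASH_BYTES = 2 * 1024 * 1024    # assumption: "2 MB" of flash read as 2 MiB

bitstream_bytes = BITSTREAM_BITS / 8
fraction = bitstream_bytes / FLASH_BYTES

print(f"bitstream: {bitstream_bytes / 1000:.0f} kB")
print(f"share of flash: {fraction:.0%}")
```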
 
On 27.11.22 at 05:34, John Larkin wrote:
> We use the efinix T20 trion FPGA.
>
> Questions about the config bit streams:
>
> Are they always the same size, or does it depend on how much logic is
> compiled? Would a simple application use less?

With Xilinx it would for sure. Never used Efinix, but I would
consider it broken if it didn't.

> Are the streams very compressible?

I would simply test example files with zip, zcat and similar.
IIRC, there is even a flow-through decompressor.

> We have done some simple run-length
> coding to greatly reduce the storage requirement for other FPGAs.
> Configs tend to have long runs of 0's.
>
> The T20/256 claims to need 5.4 megabits. I'd like to store the fpga
> config and application code in a Raspberry Pi Pico, which has 2 MB of
> onboard flash. Storing the full config would use about a third of
> that, so reducing that would be useful.

cheers, Gerhard
 
On Sun, 27 Nov 2022 10:46:47 +0100, Gerhard Hoffmann <dk4xp@arcor.de>
wrote:

> I would simply test example files with zip, zcat and similar.
> IIRC, there is even a flow-through decompressor.

I'm at home and don't have access to a compiled bitstream, and this is
a discussion group.

I'll get a T20 bit stream Monday or Tuesday and see what it looks
like. If there are many runs of 0's, compression and decompression are
very simple. Or maybe a typical stream is just shorter than the max.

I recall a Xilinx or maybe Altera stream that compressed about 3:1
with a very simple algorithm. I think I compressed runs of 0's and 1's
on that one, with a PowerBasic program.

We considered fancier dictionary-based schemes, sort of like Zip, but
they weren't worth the hassle.
 
On Sun, 27 Nov 2022 08:16:57 -0800, John Larkin
<jlarkin@highlandSNIPMEtechnology.com> wrote:

> We considered fancier dictionary-based schemes, sort of like Zip, but
> they weren't worth the hassle.

I recall the conclusion that the best dictionary entry for a random
data block is itself. Zip doesn't compress random binary data files
very well.

FPGA bit streams are nonrandom in having long runs of 0's.
 
On Sat, 26 Nov 2022 20:34:23 -0800, John Larkin
<jlarkin@highlandSNIPMEtechnology.com> wrote:

> Are they always the same size, or does it depend on how much logic is
> compiled? Would a simple application use less?
>
> Are the streams very compressible?

Here's a T20 bit stream. The length seems to be constant vs functions
coded, but there are enough runs of all 0's that it's probably worth
compressing.

https://www.dropbox.com/s/vm247lntp78jm20/Efinix_T20_bitstream.hex?dl=0

The actual config file will be binary, not hex of course.
 
On 2/12/22 08:12, John Larkin wrote:
> Here's a T20 bit stream. The length seems to be constant vs functions
> coded, but there are enough runs of all 0's that it's probably worth
> compressing.
>
> https://www.dropbox.com/s/vm247lntp78jm20/Efinix_T20_bitstream.hex?dl=0
>
> The actual config file will be binary, not hex of course.

Gzip compresses your 2.0MB down to 105kB. The decompressor isn't tiny,
but it's fairly small. The lz4 decompressor is tiny and still gets to
221kB. Possibly less if you RLE first. bz2 gets it to 76kB, and xz or
lzma to 72kB.

Compression is one area where it's best to rely on work done by people
who understand the theory. Some of these algorithms have a tiny
decompressor; the magic is in the compressor.

CH
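
A comparison like the one above can be reproduced with Python's standard library. This is a minimal sketch, not the method used in the post: the helper names are made up for illustration, lz4 is left out because it is not in the standard library, and the .hex file is assumed to be plain hex digits separated by whitespace.

```python
import bz2
import lzma
import zlib


def hex_text_to_bytes(text: str) -> bytes:
    """Convert whitespace-separated hex digits to raw bytes.

    Assumption: the .hex file holds nothing but hex digit pairs and whitespace.
    """
    return bytes.fromhex("".join(text.split()))


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` under a few stock compressors."""
    return {
        "raw": len(data),
        "zlib (deflate, as used by gzip)": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma (xz)": len(lzma.compress(data)),
    }


# Synthetic stand-in for a bitstream: a long zero run plus a short pattern.
demo = bytes(10000) + b"\x80\x40" * 100
for name, size in compressed_sizes(demo).items():
    print(f"{name:32s} {size:8d}")
```

For a real file, something like `compressed_sizes(hex_text_to_bytes(open(path).read()))` would do, with `path` pointing at the downloaded .hex file.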
 
On 01/12/2022 21:12, John Larkin wrote:
> Here's a T20 bit stream. The length seems to be constant vs functions
> coded, but there are enough runs of all 0's that it's probably worth
> compressing.
>
> https://www.dropbox.com/s/vm247lntp78jm20/Efinix_T20_bitstream.hex?dl=0
>
> The actual config file will be binary, not hex of course.

Quick scan with one of my utilities gives:

Filename : \users\martin\downloads\Efinix~1.hex
File size = 4071902
Entropy = 1.225 ( max. 5.545 )
States used = 3.40 ( max. 256 )

Zero frequency : 0-9 11-47 58-64 71-255

Most frequent bytes:
dec  hex  char    count
 48   30  "0"   2198086
 10   0A  LF    1357302
 49   31  "1"     98740
 52   34  "4"     97072
 56   38  "8"     96870
 50   32  "2"     94906
 54   36  "6"     26994
 51   33  "3"     26880
 67   43  "C"     26478
 57   39  "9"     25500
 65   41  "A"      6820
 53   35  "5"      5944

The hex file consists mostly of character "0" bytes and linefeeds.
Simple run length encoding would compact it a lot.
It seems "7","B","D","E","F" are quite rare in these files.

The raw binary file obviously won't have the linefeeds and will be only
one byte for every three in the ASCII .hex file so about 1.3M.

Back of the envelope RLE might get you a ~20x decrease in size.

The right compressor and it could be made a lot smaller.
If you put up the binary I'll scan that for byte entropy too.

--
Regards,
Martin Brown
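
A byte-entropy scan of this kind can be sketched in a few lines. The utility used above is not shown in the thread, so this is an independent illustration. It assumes natural-log units, since the quoted maximum of 5.545 matches ln 256, and it assumes "States used" is exp(entropy), since exp(1.225) is about 3.40.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram in nats (maximum ln 256, about 5.545)."""
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log(c / total) for c in counts.values())


def effective_states(data: bytes) -> float:
    """exp(entropy): the number of equally likely symbols with the same entropy."""
    return math.exp(byte_entropy(data))


def most_frequent(data: bytes, n: int = 12):
    """Return the n most common byte values as (value, count) pairs."""
    return Counter(data).most_common(n)


# Small text sample dominated by "0" and linefeed, like the hex dump.
sample = b"00\n" * 50 + b"80\n" * 5
print(f"entropy = {byte_entropy(sample):.3f} nats")
print(f"states  = {effective_states(sample):.2f}")
print(most_frequent(sample, 3))
```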
 
On 27/11/2022 16:16, John Larkin wrote:
> I'll get a T20 bit stream Monday or Tuesday and see what it looks
> like. If there are many runs of 0's, compression and decompression are
> very simple. Or maybe a typical stream is just shorter than the max.

Binary looks to have incredibly high redundancy and compressibility.
One of the lowest byte entropy scores I have seen in a long time.

There appear to be strong correlations of identical blocks at strides of
9, 12, 24, 36 as well as huge runs of nul bytes. The odd one of 0a.

Also a quick eyeball reveals walking ones 80,40,20,10,08,04,02,01,00
at around 107227 (stride 9).

There is an incredibly long run of 15372 nul bytes at offset 143811.

RLE of the nul bytes should get you most of the way there and maybe some
code to RLE the most obvious repeated sequences if you need a bit more.

--
Regards,
Martin Brown
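
Run-length coding of nul bytes only can be sketched as below. The token format is an assumption chosen for illustration, not anything proposed in the thread: a 0x00 in the packed stream is always followed by a count byte (1 to 255) giving the number of zeros, and every other byte is a literal.

```python
def rle_zeros_encode(data: bytes) -> bytes:
    """Replace each run of zero bytes with (0x00, run_length) pairs, runs capped at 255."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        if data[i] == 0:
            run = 1
            while i + run < n and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes((0, run))
            i += run
        else:
            out.append(data[i])   # nonzero bytes pass through unchanged
            i += 1
    return bytes(out)


def rle_zeros_decode(packed: bytes) -> bytes:
    """Inverse of rle_zeros_encode."""
    out = bytearray()
    i = 0
    while i < len(packed):
        if packed[i] == 0:
            out += bytes(packed[i + 1])   # bytes(n) is n zero bytes
            i += 2
        else:
            out.append(packed[i])
            i += 1
    return bytes(out)


# A long nul run followed by the walking-ones pattern mentioned above.
sample = bytes(15372) + bytes([0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01, 0x00])
packed = rle_zeros_encode(sample)
assert rle_zeros_decode(packed) == sample
print(len(sample), "->", len(packed))
```

The decoder needs no buffer beyond its output, so it could feed the FPGA a byte at a time. An isolated zero costs two bytes in this format, which is the price of keeping the decoder this simple.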
 
On Fri, 2 Dec 2022 12:15:56 +0000, Martin Brown
<'''newspam'''@nonad.co.uk> wrote:

> Binary looks to have incredibly high redundancy and compressibility.
> One of the lowest byte entropy scores I have seen in a long time.

My comment was about really random data. An FPGA bit stream certainly
has repeated patterns. One might build an N-bit structure, a multiplier
or accumulator or filter or DDS, and bit-slice blocks are very likely
repeated N times.

Maybe I can find some college kid who'd like to do a project or thesis
to find or code a minimal decomp algorithm for Efinix + Raspberry Pi, in
exchange for some pittance.

I can imagine some dictionary-based thing where a dictionary entry is
its own first occurrence in the bit file. The decompressor is
basically scissors and a pot of glue.

> There appear to be strong correlations of identical blocks at strides of
> 9, 12, 24, 36 as well as huge runs of nul bytes. The odd one of 0a.
>
> Also a quick eyeball reveals walking ones 80,40,20,10,08,04,02,01,00
> at around 107227 (stride 9).
>
> There is an incredibly long run of 15372 nul bytes at offset 143811
>
> RLE the nul bytes should get you most of the way there and maybe some
> code to RLE the most obvious repeated sequences if you need a bit more.

I was thinking of just compressing runs of 0's, but there could be a
few other smallish patterns that might not be horrible to stash in the
decompressor dictionary. That presents the question, are there
patterns that are common to *all* T20 bit streams?

I need a low-paid lackey.
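
The "dictionary entry is its own first occurrence" idea is essentially what LZ77-style back-references do. Below is a minimal sketch with a hypothetical token format (tuples, not a packed byte format) and a deliberately naive greedy encoder. It shows how small the "scissors and glue" decoder can be and is not tuned for a real bitstream.

```python
def lz_decode(tokens) -> bytes:
    """Rebuild data from ("lit", bytes) and ("copy", distance, length) tokens."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out += tok[1]
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])   # byte-wise copy, so overlapping copies work
    return bytes(out)


def lz_encode(data: bytes, window: int = 64, min_len: int = 4):
    """Greedy encoder: take the longest match in the last `window` bytes, else a literal."""
    tokens, lit, i = [], bytearray(), 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for dist in range(1, min(i, window) + 1):
            length = 0
            while i + length < len(data) and data[i + length] == data[i + length - dist]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, dist
        if best_len >= min_len:
            if lit:
                tokens.append(("lit", bytes(lit)))
                lit = bytearray()
            tokens.append(("copy", best_dist, best_len))
            i += best_len
        else:
            lit.append(data[i])
            i += 1
    if lit:
        tokens.append(("lit", bytes(lit)))
    return tokens


# A stride-9 walking-ones block repeated 20 times, then a run of nul bytes.
sample = bytes([0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01, 0x00]) * 20 + bytes(500)
tokens = lz_encode(sample)
assert lz_decode(tokens) == sample
print(len(sample), "bytes ->", len(tokens), "tokens")
```

A copy with distance 1 reproduces a run of identical bytes, so zero-run RLE falls out of the same decoder as a special case.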
 
On 02/12/2022 15:22, John Larkin wrote:
>> Binary looks to have incredibly high redundancy and compressibility.
>> One of the lowest byte entropy scores I have seen in a long time.
>
> My comment was about really random data. An FPGA bit stream certainly
> has repeated patterns. One might build a N-bit structure, a multiplier
> or accumulator or filter or DDS, and bit-slice blocks are very likely
> repeated N times.

I don't think an FPGA bitstream is anything remotely like random data.
The vast majority of the bytes are zeroes (70%), then bytes with 1 bit
set ~2% each, 2 bits set <0.7%. It depends how hard you are prepared to
work. Bytes with more than 3 bits set are comparatively rare.

In your example the bytes 8A, A7, BF, DB, ED all appeared just once and
the token BE did not occur at all.

In principle for this application you can afford to use insane amounts
of CPU power to encode if it makes the decoder simpler and faster. My
instinct is that it is only worth compressing enough to make room for
whatever code has to fit into the same space.

I recall way back jumping through endless hoops to fit slightly more
firmware code into 8k ROMs back in the days when 64k was a lot of RAM.

> Maybe I can find some college kid who'd like to do a project or thesis
> to find or code a minimal decomp algorithm for efinix+raspberry pi, in
> exchange for some pittance.

I used to have a university sandwich student for a year and sometimes a
student over the long vacation and give them projects that were
interesting and otherwise wouldn't get done. The occasional one turned
out to be exceptionally good. The rest did an OK job. It is only worth
doing if they can finish a project that you don't have the time to do.

Usually something that involves taking a lot of raw data and looking to
see if there is anything interesting going on.

> I can imagine some dictionary-based thing where a dictionary entry is
> its own first occurrence in the bit file. The decompressor is
> basically scissors and a pot of glue.

Judging by the way it looks to my correlator I would expect LHA type
algorithms to do rather well on it. There is an inordinate amount of
block duplication. A few simple subs will easily get you under 250k.

> I was thinking of just compressing runs of 0's, but there could be a
> few other smallish patterns that might not be horrible to stash in the
> decompressor dictionary. That presents the question, are there
> patterns that are common to *all* T20 bit streams?
>
> I need a low-paid lackey.

What stops you from having one?
But you will get more use out of one that is paid the going rate.

--
Regards,
Martin Brown
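
The bit-count statistics quoted above (share of bytes with 0, 1, 2, ... bits set, plus byte values that never occur) can be gathered with a short script. This is an illustrative sketch with made-up function names, not the correlator used in the post.

```python
from collections import Counter


def popcount_shares(data: bytes) -> dict:
    """Fraction of bytes having k bits set, for k = 0..8."""
    if not data:
        return {k: 0.0 for k in range(9)}
    counts = Counter(bin(b).count("1") for b in data)
    return {k: counts.get(k, 0) / len(data) for k in range(9)}


def unused_byte_values(data: bytes) -> list:
    """Byte values in 0..255 that never appear in the data."""
    return sorted(set(range(256)) - set(data))


# Toy data: mostly zeros, some single-bit bytes, a few two-bit bytes.
sample = bytes(70) + bytes([0x80] * 20) + bytes([0xC0] * 10)
for bits, share in popcount_shares(sample).items():
    print(f"{bits} bits set: {share:.1%}")
print("unused values:", len(unused_byte_values(sample)))
```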
 
On Mon, 5 Dec 2022 11:01:34 +0000, Martin Brown
<'''newspam'''@nonad.co.uk> wrote:

>> I need a low-paid lackey.
>
> What stops you from having one?
> But you will get more use out of one that is paid the going rate.

Just kidding. We pay very well.

If we do a product line around Raspberry Pi, we could piggyback on the
enormous physical and people culture. I've never seen anything like
it.

https://www.raspberrypi.org/

We might sponsor 5 or 10 smart poor high school or college kids, steer
their paths a bit, give them summer projects or jobs, hire a couple of
the best when they graduate.

Pi has enormous momentum so should be around for a while.
 