Tiny CPUs for Slow Logic

On Thursday, March 21, 2019 at 7:48:17 AM UTC-4, Tom Gardner wrote:
That attitude surprises me, since all my /designs/ have been
based on "what do I need to achieve" plus "what can individual
technologies achieve" plus "which combination of technologies
is best at achieving my objectives". I.e. top down with a
knowledge of the bottom pieces.

I'm not designing anything at the moment. I'm trying to discuss an idea. Have you never brainstormed techniques and methods?

Rick C.
 
On 21/03/19 14:57, gnuarm.deletethisbit@gmail.com wrote:
On Thursday, March 21, 2019 at 7:48:17 AM UTC-4, Tom Gardner wrote:

That attitude surprises me, since all my /designs/ have been based on "what
do I need to achieve" plus "what can individual technologies achieve" plus
"which combination of technologies is best at achieving my objectives". I.e
top down with a knowledge of the bottom pieces.

I'm not designing anything at the moment. I'm trying to discuss an idea.
Have you never brainstormed techniques and methods?

Many, many times; professionally, often in formal settings.

Generating ideas is relatively easy.

Refining an idea and defining it in sufficient detail that
it can be assessed is a more difficult task. Very few ideas
survive.

It is /always/ up to the "champion" (of an idea or product)
to be able to convincingly explain the advantages and
acknowledge the disadvantages.
 
On Thursday, March 21, 2019 at 11:58:54 AM UTC-4, Tom Gardner wrote:
On 21/03/19 14:57, gnuarm.deletethisbit@gmail.com wrote:
...

It is /always/ up to the "champion" (of an idea or product)
to be able to convincingly explain the advantages and
acknowledge the disadvantages.

So I guess we need to find a champion.

Rick C.
 
On 3/18/2019 6:13 PM, gnuarm.deletethisbit@gmail.com wrote:
Most of us have implemented small processors for logic operations that don't need to happen at high speed. Simple CPUs can be built into an FPGA using a very small footprint, much like the ALU blocks. There are stack based processors that are very small, smaller even than a few kB of memory.

If they were easily programmable in something other than C, would anyone be interested? Or is a C compiler mandatory even for processors running very small programs?

I am picturing this not terribly unlike the sequencer I used many years ago on an I/O board for an array processor, which had its own assembler. It was very simple and easy to use, but very much not a high level language. This would have a language that was high level, just not C; rather, something extensible, simple to use, and potentially interactive.

Rick C.
Where do you find the memory for the program and data?
On the FPGA, external or floating on a cloud?
Oldben
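
The tiny stack processor the quoted post describes can be pictured as an interpreter a few dozen lines long. Below is a minimal sketch in C; the opcode set and the 4-bit/12-bit encoding are invented here for illustration and are not any particular design from the thread. The whole machine state is two small arrays, which is why such a core fits comfortably beside one block RAM.

    /* Minimal stack-machine sketch; hypothetical instruction set. */
    #include <stdint.h>
    #include <stdio.h>

    enum { OP_LIT, OP_ADD, OP_AND, OP_XOR, OP_STORE, OP_FETCH, OP_JMP, OP_HALT };

    static uint16_t prog[256];   /* code: a fraction of one block RAM */
    static uint16_t data[256];   /* data: likewise                    */

    static void run(void)
    {
        uint16_t stack[16], *sp = stack;   /* tiny hardware-style data stack */
        uint16_t pc = 0;

        for (;;) {
            uint16_t op  = prog[pc] >> 12;      /* invented encoding:       */
            uint16_t arg = prog[pc] & 0x0FFF;   /* 4-bit opcode, 12-bit arg */
            pc++;
            switch (op) {
            case OP_LIT:   *++sp = arg;           break;
            case OP_ADD:   sp--; *sp += sp[1];    break;
            case OP_AND:   sp--; *sp &= sp[1];    break;
            case OP_XOR:   sp--; *sp ^= sp[1];    break;
            case OP_STORE: data[arg] = *sp--;     break;
            case OP_FETCH: *++sp = data[arg];     break;
            case OP_JMP:   pc = arg;              break;
            case OP_HALT:  return;
            }
        }
    }

    int main(void)
    {
        /* compute (5 + 3) and store it at data[0] */
        uint16_t demo[] = { OP_LIT << 12 | 5, OP_LIT << 12 | 3,
                            OP_ADD << 12,     OP_STORE << 12 | 0,
                            OP_HALT << 12 };
        for (unsigned i = 0; i < sizeof demo / sizeof *demo; i++)
            prog[i] = demo[i];
        run();
        printf("%u\n", data[0]);   /* prints 8 */
        return 0;
    }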
 
On Tuesday, October 8, 2019 at 11:24:28 PM UTC-4, oldben wrote:
On 3/18/2019 6:13 PM, gnuarm.deletethisbit@gmail.com wrote:
...

Where do you find the memory for the program and data?
On the FPGA, external or floating on a cloud?
Oldben

This is a bit of an old thread, and I don't recall having anything specific in mind; I started out just trying to consider what might be useful. All the CPUs I've designed had local memory.

--

Rick C.

- Get 2,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209
 
On Tuesday, October 8, 2019 at 11:04:15 PM UTC-5, Rick C wrote:
On Tuesday, October 8, 2019 at 11:24:28 PM UTC-4, oldben wrote:
On 3/18/2019 6:13 PM, gnuarm.deletethisbit@gmail.com wrote:
...

Missed this thread back in March.
My interest was/is in treating EDIF as the machine language.
If you can simulate all the logic in under, say, 50 usec,
that's faster than human reaction times and suitable for controllers.
So at 200 MHz that's 10K instructions, or ~10K logic gates, per cycle.

So an FPGA uP design using one block RAM and under 200 LUTs is sufficient.
Duplicate the uP if you have more logic than this. 200 LUTs cost much less than $1.

Jim Brakefield
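
Jim's budget works out as follows: 200e6 cycles/s x 50e-6 s = 10,000 instruction slots per frame, i.e. roughly 10K two-input gates evaluated once each. The inner loop such a "gate per instruction" processor would run is short; the netlist encoding below is invented for illustration and is not his actual design.

    /* One frame: every gate evaluated exactly once, in netlist order. */
    #include <stdint.h>

    #define NETS 10000
    static uint8_t net[NETS];   /* one bit per net, byte-packed for clarity */

    enum { G_AND, G_OR, G_XOR, G_NOT };
    typedef struct { uint8_t op; uint16_t a, b, q; } gate_t;

    static void frame(const gate_t *g, int n)
    {
        for (int i = 0; i < n; i++) {
            uint8_t a = net[g[i].a], b = net[g[i].b], q = 0;
            switch (g[i].op) {
            case G_AND: q = a & b; break;
            case G_OR:  q = a | b; break;
            case G_XOR: q = a ^ b; break;
            case G_NOT: q = !a;    break;
            }
            net[g[i].q] = q;   /* each instruction retires one "gate" */
        }
    }

    int main(void)
    {
        gate_t g[] = { { G_XOR, 0, 1, 2 } };   /* net2 = net0 XOR net1 */
        net[0] = 1; net[1] = 1;
        frame(g, 1);
        return net[2];                         /* 0: XOR of equal inputs */
    }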
 
On 16/10/2019 17:14, jim.brakefield@ieee.org wrote:
On Tuesday, October 8, 2019 at 11:04:15 PM UTC-5, Rick C wrote:
On Tuesday, October 8, 2019 at 11:24:28 PM UTC-4, oldben wrote:
...

Missed this thread back in March.
My interest was/is in treating EDIF as the machine language.
If you can simulate all the logic in under, say, 50 usec,
that's faster than human reaction times and suitable for controllers.
So at 200 MHz that's 10K instructions, or ~10K logic gates, per cycle.

So an FPGA uP design using one block RAM and under 200 LUTs is sufficient.
Duplicate the uP if you have more logic than this. 200 LUTs cost much less than $1.

Jim Brakefield

Are you trying to emulate a small FPGA on a microcontroller? This sounds
like an overly complicated approach, especially as an EDIF netlist is
normally full of complex (and not always fully documented) primitives
which will be a real pain to simulate.

Perhaps I am wrong and this is a brilliant idea? I would be interested
to hear more.

Regards,
Hans
www.ht-lab.com
 
On Fri, 25 Oct 2019 10:50:44 +0100, HT-Lab wrote:


> Are you trying to emulate a small FPGA on a microcontroller?

Gee, emulating a small CPU on an FPGA might be a lot better way to go.
I have used smallish FPGAs to do some jobs where typically a
microcontroller would be used, and they worked pretty well.

I've also used Xilinx CPLDs from the 9500 and CoolRunner II families for
simple small logic needs, and they have done quite well. These cost just
a couple $ in small quantity. The smallest, the 9536XL, is just over $1 in
single quantity at Digi-Key.

Jon
 
On Friday, October 25, 2019 at 4:30:27 PM UTC-4, Jon Elson wrote:
On Fri, 25 Oct 2019 10:50:44 +0100, HT-Lab wrote:


...

I've also used Xilinx CPLDs from the 9500 and CoolRunner II families for
simple small logic needs, and they have done quite well. These cost just
a couple $ in small quantity. The smallest, the 9536XL, is just over $1 in
single quantity at Digi-Key.

When it comes to the Xilinx CoolRunner II parts, only the small ones are cheap. The larger ones get very expensive for what they can do.

--

Rick C.

- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209
 
On Friday, October 25, 2019 at 4:50:49 AM UTC-5, HT-Lab wrote:
On 16/10/2019 17:14, jim.brakefield@ieee.org wrote:
...


Are you trying to emulate a small FPGA on a microcontroller? This sounds
like an overly complicated approach, especially as an EDIF netlist is
normally full of complex (and not always fully documented) primitives
which will be a real pain to simulate.

Perhaps I am wrong and this is a brilliant idea? I would be interested
to hear more.

Regards,
Hans
www.ht-lab.com

My experience with EDIF was as the output of VHDL/Verilog compilers for FPGAs: lots of simple gates, plus black boxes for block RAM. The FPGA vendors then provide a mapper, placer, and router targeting their silicon. So for applications with low duty cycle gates, it's more efficient to emulate the gates via a small CPU and its single block RAM. For a while Xilinx supported a similar approach via HDL-to-"C" conversion run on their ARM or PPC hard cores.
Now, EDIF is just a bunch of boxes, as simple as gates or as complex as desired, wired together.
There are applications, such as industrial control, that run the control logic at kilohertz rates. That is a very low duty cycle for FPGA LUTs, so one is trading speed for density.
As a side note, ASIC logic simulators have some of the same issues; however, one wants to run an ASIC simulation as fast as possible, essentially in the megahertz range.

In summary, there is a range of applications that do logic "simulation" over a wide range of cycle rates, from millisecond human reaction times all the way up to "as fast as possible". I would argue that there need to be tool chains that support this six-order-of-magnitude range of logic cycle rates. In particular, not much attention has been paid to the low end, which currently is served by real-time embedded tools.

Jim Brakefield
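
A back-of-envelope version of the duty-cycle argument, using Jim's 200 MHz figure and a nominal industrial-control update rate; the kilohertz figure is his, but the exact 1 kHz value below is assumed for illustration:

    /* Speed-for-density trade: nothing here is measured. */
    #include <stdio.h>

    int main(void)
    {
        double f_clk   = 200e6;   /* soft-CPU clock (Hz)                 */
        double f_logic = 1e3;     /* assumed control-loop update rate    */

        double slots = f_clk / f_logic;   /* instruction slots per frame */
        double duty  = f_logic / f_clk;   /* duty cycle of a direct LUT  */

        printf("gates emulated per frame : %.0f\n", slots);   /* 200000    */
        printf("LUT duty cycle if direct : %.7f\n", duty);    /* 0.0000050 */
        return 0;
    }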
 
On Friday, October 25, 2019 at 6:09:03 PM UTC-4, jim.br...@ieee.org wrote:
On Friday, October 25, 2019 at 4:50:49 AM UTC-5, HT-Lab wrote:
...

I get what you are saying. But why would anyone first design something in HDL only to have it compiled and then simulated in a CPU? Why would such low-bandwidth processing not be coded in a sequential language conventionally used on CPUs, like C? Skip the hassle of compiling in an HDL tool and then importing it into a simulator running on the target CPU? Where is the advantage exactly?

I will say that if you use the language output from the place and route tools, you will get something more like LUTs, which are likely to simulate faster than individual gates. Remember that unless you have some very tiny amount of logic that can be implemented in one immense look-up table, every connection between gates is a signal that will need to be scheduled to "run" when its inputs change. Fewer entities means less scheduling... maybe.

--

Rick C.

+ Get 1,000 miles of free Supercharging
+ Tesla referral code - https://ts.la/richard11209
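
For contrast, here is the skeleton of the event-driven scheduling Rick is describing: each net change wakes the gates that read it, so scheduling cost scales with the number of entities. All structures here are invented for illustration.

    /* Minimal event-driven evaluation kernel (sketch). */
    #include <stdint.h>
    #include <stdio.h>

    #define MAXG 64
    typedef struct { uint8_t op, a, b, q; } gate_t;   /* op: 0=AND 1=OR 2=XOR */

    static gate_t  g[MAXG];
    static uint8_t net[256];
    static int     fanout[256][8], nfan[256];   /* which gates read each net */

    static int queue[1024], qh, qt;
    static void sched(int gi) { queue[qt++ % 1024] = gi; }

    static void set_net(int n, uint8_t v)
    {
        if (net[n] == v) return;              /* no change, no events */
        net[n] = v;
        for (int i = 0; i < nfan[n]; i++)     /* wake every reader    */
            sched(fanout[n][i]);
    }

    static void settle(void)
    {
        while (qh != qt) {
            gate_t *p = &g[queue[qh++ % 1024]];
            uint8_t v = p->op == 0 ? (uint8_t)(net[p->a] & net[p->b])
                      : p->op == 1 ? (uint8_t)(net[p->a] | net[p->b])
                      :              (uint8_t)(net[p->a] ^ net[p->b]);
            set_net(p->q, v);                 /* may schedule more work */
        }
    }

    int main(void)
    {
        /* net3 = (net0 AND net1) XOR net2, via intermediate net4 */
        g[0] = (gate_t){0, 0, 1, 4};  fanout[0][nfan[0]++] = 0;
                                      fanout[1][nfan[1]++] = 0;
        g[1] = (gate_t){2, 4, 2, 3};  fanout[4][nfan[4]++] = 1;
                                      fanout[2][nfan[2]++] = 1;

        set_net(0, 1); set_net(1, 1); settle();
        printf("net3 = %d\n", net[3]);        /* 1 */
        return 0;
    }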
 
On Friday, October 25, 2019 at 6:32:28 PM UTC-5, Rick C wrote:
On Friday, October 25, 2019 at 6:09:03 PM UTC-4, jim.br...@ieee.org wrote:
...

|>But why would anyone first design something in HDL only to have it compiled and then simulated in a CPU?
I was thinking of EDIF as a universal assembly language.
Parallel processing via multiple interconnected processors.
Hard real-time: the worst case delay is easy to determine; it is simply the number of instructions executed per cycle.
Very few stand-alone processors cost under $0.10.

|>every connection between gates is a signal that will need to be scheduled to "run" when the inputs change
I was thinking in terms of synchronous simulation, where each "gate" is evaluated only once per clock cycle.
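
One way to make "each gate evaluated once per clock cycle" well defined is to levelize the netlist, i.e. sort the combinational gates topologically at compile time, then run one fixed pass per cycle plus a register update; the worst case delay is then exactly the instruction count, as Jim notes. A sketch with an invented encoding; a real tool would emit this order from the EDIF:

    /* Levelized synchronous evaluation: one pass per clock cycle. */
    #include <stdint.h>
    #include <stdio.h>

    enum { G_AND, G_XOR, G_NOT };
    typedef struct { uint8_t op, a, b, q; } gate_t;

    static uint8_t net[16];

    /* gates pre-sorted so every input is produced before it is consumed */
    static const gate_t comb[] = {
        { G_XOR, 0, 1, 4 },   /* sum  = a ^ b */
        { G_AND, 0, 1, 5 },   /* cout = a & b */
    };

    static void cycle(void)
    {
        for (unsigned i = 0; i < sizeof comb / sizeof *comb; i++) {
            const gate_t *g = &comb[i];
            uint8_t v = g->op == G_AND ? (uint8_t)(net[g->a] & net[g->b])
                      : g->op == G_XOR ? (uint8_t)(net[g->a] ^ net[g->b])
                      :                  (uint8_t)!net[g->a];
            net[g->q] = v;
        }
        net[6] = net[4];      /* "flip-flops": registered copies, updated last */
        net[7] = net[5];
    }

    int main(void)
    {
        net[0] = 1; net[1] = 1;
        cycle();
        printf("sum=%d cout=%d\n", net[6], net[7]);   /* sum=0 cout=1 */
        return 0;
    }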
 
On Friday, October 25, 2019 at 9:05:12 PM UTC-4, jim.br...@ieee.org wrote:
On Friday, October 25, 2019 at 6:32:28 PM UTC-5, Rick C wrote:
On Friday, October 25, 2019 at 6:09:03 PM UTC-4, jim.br...@ieee.org wrote:
...

|>But why would anyone first design something in HDL only to have it compiled and then simulated in a CPU?
I was thinking of EDIF as a universal assembly language.
Parallel processing via multiple interconnected processors.
Hard real-time: the worst case delay is easy to determine; it is simply the number of instructions executed per cycle.
Very few stand-alone processors cost under $0.10.

|>every connection between gates is a signal that will need to be scheduled to "run" when the inputs change
I was thinking in terms of synchronous simulation, where each "gate" is evaluated only once per clock cycle.

I'm pretty sure that does not exist. Race conditions arise in simulations if you interconnect gates as you are describing. VHDL handles this by introducing delta delays, which are treated like small delays, but no time ticks off the clock, just deltas. Then each gate can have a delta delay associated with it. This in turn requires that each signal (gate output) be evaluated each time any of its inputs change. Because of these unit delays there can be multiple changes at different delta delays.

Otherwise the input to a FF must be written as an expression with defined rules for the order of evaluation.

Or do you have a method of ensuring consistent evaluation of signals through gates?

--

Rick C.

-- Get 1,000 miles of free Supercharging
-- Tesla referral code - https://ts.la/richard11209
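
The usual way out of the race Rick describes, short of full delta-delay machinery, is two-phase evaluation: compute every next value from the current values, then commit them all at once, so evaluation order no longer matters; each compute-commit pair behaves like one delta cycle. A minimal sketch with hardcoded gates for illustration:

    /* Two-phase ("delta cycle") evaluation: order-independent updates. */
    #include <stdint.h>
    #include <stdio.h>

    static uint8_t cur[4], nxt[4];

    static void delta_cycle(void)
    {
        /* phase 1: evaluate from 'cur' only -- order-independent */
        nxt[2] = cur[0] ^ cur[1];
        nxt[3] = cur[2] & cur[0];   /* reads the OLD cur[2], not nxt[2] */

        /* phase 2: commit everything simultaneously */
        cur[2] = nxt[2];
        cur[3] = nxt[3];
    }

    int main(void)
    {
        cur[0] = 1; cur[1] = 0;
        delta_cycle();              /* cur[2]=1; cur[3] still sees old cur[2] */
        delta_cycle();              /* second delta lets the change propagate */
        printf("%d %d\n", cur[2], cur[3]);   /* 1 1 */
        return 0;
    }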
 
