Tiny CPUs for Slow Logic


Guest

Tue Mar 19, 2019 1:45 am   



Most of us have implemented small processors for logic operations that don't need to happen at high speed. Simple CPUs can be built into an FPGA with a very small footprint, much like the ALU blocks. There are stack-based processors that are very small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would anyone be interested? Or is a C compiler mandatory even for processors running very small programs?

I am picturing this as not terribly unlike the sequencer I used many years ago on an I/O board for an array processor, which had its own assembler. It was very simple and easy to use, but very much not a high-level language. This would have a language that was high level, just not C; rather, something extensible, simple to use and potentially interactive.

Rick C.


Guest

Tue Mar 19, 2019 9:45 am   



On Tuesday, March 19, 2019 at 4:15:47 AM UTC-4, David Brown wrote:
Quote:
On 19/03/2019 01:13, gnuarm.deletethisbit_at_gmail.com wrote:
Most of us have implemented small processors for logic operations
that don't need to happen at high speed. Simple CPUs can be built
into an FPGA using a very small footprint much like the ALU blocks.
There are stack based processors that are very small, smaller than
even a few kB of memory.

If they were easily programmable in something other than C would
anyone be interested? Or is a C compiler mandatory even for
processors running very small programs?

I am picturing this not terribly unlike the sequencer I used many
years ago on an I/O board for an array processor which had it's own
assembler. It was very simple and easy to use, but very much not a
high level language. This would have a language that was high level,
just not C rather something extensible and simple to use and
potentially interactive.

Rick C.


If it is going to appeal to software developers, you need C. And it has
to be reasonable, standard C, even if it is for small devices -
programmers are fed up with the pains needed for special device-specific
C on 8051, AVR, PIC, etc. That does not necessarily mean it has to be
fast, but it should work with standard language. Having 16-bit size
rather than 8-bit size makes a huge difference to how programmers feel
about the device - aim for something like the msp430.

You might, however, want to look at extensions for CSP-style
communication between cpus - something like XMOS XC.

If it is to appeal to hardware (FPGA) developers, C might not be as
essential. Some other kind of high level language, perhaps centred
around state machines, might work.

But when I see "extensible, simple to use and potentially interactive",
I fear someone is thinking of Forth. People who are very used to Forth
find it a great language - but you need to understand that /nobody/
wants to learn it. Most programmers would rather work in assembler than
Forth. You can argue that this attitude is irrational, and that Forth
is not harder than other languages - you might be right. But that
doesn't change matters.


Certainly this would be like Forth, but the reality is I'm thinking of a Forth-like CPU because they can be designed so simply.

The F18A stack processor designed by Charles Moore is used in the GA144 chip. There are 144 of them with unusual interconnections that allow the CPU to halt waiting for communications, saving power. The CPU is so small that it could be included in an FPGA as what would be equivalent to a logic element.

In the same way that the other functional logic elements like the block RAMs and DSP blocks are used for custom functionality which requires the designer to program by whatever means is devised, these tiny CPUs would not need a high level language like C. The code in them would be small enough to be considered "logic" and developed at the assembly level.

People have mindsets about things and I believe this is one of them. The GA144 is not so easy to program because people want to use it for the sort of large programs they write for other fast CPUs. In an FPGA a very fast processor can be part of the logic rather than an uber-controller riding herd over the whole chip. But this would require designers to change their thinking of how to use CPUs. The F18A runs at 700 MIPS peak rate in a 180 nm process. Instead of one or two in the FPGA like the ARMs in other FPGAs, there would be hundreds, each one running at some GHz.

Rick C.

David Brown
Guest

Tue Mar 19, 2019 9:45 am   



On 19/03/2019 01:13, gnuarm.deletethisbit_at_gmail.com wrote:
Quote:
Most of us have implemented small processors for logic operations
that don't need to happen at high speed. Simple CPUs can be built
into an FPGA using a very small footprint much like the ALU blocks.
There are stack based processors that are very small, smaller than
even a few kB of memory.

If they were easily programmable in something other than C would
anyone be interested? Or is a C compiler mandatory even for
processors running very small programs?

I am picturing this not terribly unlike the sequencer I used many
years ago on an I/O board for an array processor which had it's own
assembler. It was very simple and easy to use, but very much not a
high level language. This would have a language that was high level,
just not C rather something extensible and simple to use and
potentially interactive.

Rick C.


If it is going to appeal to software developers, you need C. And it has
to be reasonable, standard C, even if it is for small devices -
programmers are fed up with the pains needed for special device-specific
C on 8051, AVR, PIC, etc. That does not necessarily mean it has to be
fast, but it should work with standard language. Having 16-bit size
rather than 8-bit size makes a huge difference to how programmers feel
about the device - aim for something like the msp430.

You might, however, want to look at extensions for CSP-style
communication between cpus - something like XMOS XC.

If it is to appeal to hardware (FPGA) developers, C might not be as
essential. Some other kind of high level language, perhaps centred
around state machines, might work.

But when I see "extensible, simple to use and potentially interactive",
I fear someone is thinking of Forth. People who are very used to Forth
find it a great language - but you need to understand that /nobody/
wants to learn it. Most programmers would rather work in assembler than
Forth. You can argue that this attitude is irrational, and that Forth
is not harder than other languages - you might be right. But that
doesn't change matters.

Theo Markettos
Guest

Tue Mar 19, 2019 11:45 am   



gnuarm.deletethisbit_at_gmail.com wrote:
Quote:
Certainly this would be like Forth, but the reality is I'm thinking of a
Forth like CPU because they can be designed so simply.

The F18A stack processor designed by Charles Moore is used in the GA144
chip. There are 144 of them with unusual interconnections that allow the
CPU to halt waiting for communications, saving power. The CPU is so small
that it could be included in an FPGA as what would be equivalent to a
logic element.

In the same way that the other functional logic elements like the block
RAMs and DSP blocks are used for custom functionality which requires the
designer to program by whatever means is devised, these tiny CPUs would
not need a high level language like C. The code in them would be small
enough to be considered "logic" and developed at the assembly level.


The problem this boils down to is programmability.

If you have a small core, you can therefore have lots of them. But writing
software for and managing dozens or hundreds of cores is troublesome. At
this level, you have enough headache with the inter-core communication that
you'd rather not throw a strange assembler-only core architecture into the
mix. A core like this would need a simple inter-core programming model so
it's easy to reason about system behaviour (example: systolic arrays)

There's a certain merit in having a CPU as a building block, like a LAB,
BRAM or DSP block. I'm not familiar with the literature in this space, but
it's the sort of thing that turns up at the 'FPGA' conference regularly
(keyword: CGRA). That merely punts the issue to now being a tools problem -
the tools know how to make use of a DSP block, but how to make use of a CPU
block? How to turn HDL into 'software'? Can you chain the blocks together
to make wider logic?

I suppose there's also a niche at the ultra-cheap end of the spectrum - for
$1 gadgets with an 8051 because a 16 bit CPU would be too expensive (and a
Cortex M0 would have licence fees). But if this is an ASIC then I don't
think there's a whole lot more to pay to get a C-compatible processor (even
in 8 bits). And it's unclear how much speed penalty you'd pay for that.

How much code/data memory would you expect to have? Would that dwarf the
size of your core?

Finally I can see the use as a 'state machine implementation engine' for say
a CPLD. But for that you need tools (taking HDL or state-transition diagrams)
to allow the programmer to describe their state machine. And your
competition is the regular HDL synthesiser which will just make it out of
flip flops. I'm unclear how often you'd win in these circumstances.

And I can't really see 'interactive' as a feature - either you have only one
core, in which case you could equally hook up JTAG (or equivalent) to
something larger for interactive debugging, or you have many cores, in which
case I can't see how you'd interact sensibly with dozens at once.

Theo

Tom Gardner
Guest

Tue Mar 19, 2019 11:45 am   



On 19/03/19 00:13, gnuarm.deletethisbit_at_gmail.com wrote:
Quote:
Most of us have implemented small processors for logic operations that don't
need to happen at high speed. Simple CPUs can be built into an FPGA using a
very small footprint much like the ALU blocks. There are stack based
processors that are very small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would anyone be
interested? Or is a C compiler mandatory even for processors running very
small programs?

I am picturing this not terribly unlike the sequencer I used many years ago
on an I/O board for an array processor which had it's own assembler. It was
very simple and easy to use, but very much not a high level language. This
would have a language that was high level, just not C rather something
extensible and simple to use and potentially interactive.

Who cares about yet another processor programmed in the same
old language. It would not have a *U*SP. In fact it would be
"back to the 80s" :)

However, if you want to make it interesting enough to pass
the elevator test, ensure it can do things that existing
systems find difficult.

You should have a look at how the XMOS hardware and software
complement each other, so that the combination allows hard
real time operation programming in multicore systems. (Hard
means guaranteed-by-design latencies between successive i/o
activities)

David Brown
Guest

Tue Mar 19, 2019 11:45 am   



On 19/03/2019 09:32, gnuarm.deletethisbit_at_gmail.com wrote:
Quote:
On Tuesday, March 19, 2019 at 4:15:47 AM UTC-4, David Brown wrote:
On 19/03/2019 01:13, gnuarm.deletethisbit_at_gmail.com wrote:
Most of us have implemented small processors for logic
operations that don't need to happen at high speed. Simple CPUs
can be built into an FPGA using a very small footprint much like
the ALU blocks. There are stack based processors that are very
small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would
anyone be interested? Or is a C compiler mandatory even for
processors running very small programs?

I am picturing this not terribly unlike the sequencer I used
many years ago on an I/O board for an array processor which had
it's own assembler. It was very simple and easy to use, but very
much not a high level language. This would have a language that
was high level, just not C rather something extensible and simple
to use and potentially interactive.

Rick C.


If it is going to appeal to software developers, you need C. And
it has to be reasonable, standard C, even if it is for small
devices - programmers are fed up with the pains needed for special
device-specific C on 8051, AVR, PIC, etc. That does not
necessarily mean it has to be fast, but it should work with
standard language. Having 16-bit size rather than 8-bit size makes
a huge difference to how programmers feel about the device - aim
for something like the msp430.

You might, however, want to look at extensions for CSP-style
communication between cpus - something like XMOS XC.

If it is to appeal to hardware (FPGA) developers, C might not be
as essential. Some other kind of high level language, perhaps
centred around state machines, might work.

But when I see "extensible, simple to use and potentially
interactive", I fear someone is thinking of Forth. People who are
very used to Forth find it a great language - but you need to
understand that /nobody/ wants to learn it. Most programmers would
rather work in assembler than Forth. You can argue that this
attitude is irrational, and that Forth is not harder than other
languages - you might be right. But that doesn't change matters.

Certainly this would be like Forth, but the reality is I'm thinking
of a Forth like CPU because they can be designed so simply.


I appreciate that.

I can only tell you how /I/ would feel here, and let you use that for
what you think it is worth. I don't claim to speak for all software
developers, but unless other people are giving you feedback too, then
this is the best you've got :) Remember, I am not trying to argue
about the pros and cons of different designs or languages, or challenge
you to persuade me of anything - I'm just showing you how software
developers might react to your design ideas.

Quote:

The F18A stack processor designed by Charles Moore is used in the
GA144 chip. There are 144 of them with unusual interconnections that
allow the CPU to halt waiting for communications, saving power. The
CPU is so small that it could be included in an FPGA as what would be
equivalent to a logic element.


Yes, but look how popular the chip is - it is barely a blip in the
landscape. There is no doubt that this is a technologically fascinating
device. However, it is very difficult to program such chips - almost no
one is experienced with such multi-cpu arrangements, and the design
requires a completely different way of thinking from existing software
design. Add to that a language that works backwards, and a syntax that
looks like the cat walked across the keyboard, and you have something
that has programmers running away.

My experience with Forth is small and outdated, but not non-existent.
I've worked with dozens of programming languages over the years - I've
studied CSP, programmed in Occam, functional programming languages, lots
of assemblies, a small amount of CPLD/FPGA work in various languages,
and many other kinds of coding. (Most of my work for the past years has
been C, C++ and Python.) I'm not afraid of learning new things. But
when I looked at some of the examples for the GA144, three things struck
me. One is that it was amazing how much they got on the device.
Another is to wonder about the limitations you get from this sort of
architecture. (That is a big turn-off with the XMOS. It's
fantastically easy to make nice software-based peripherals using
hardware threads. And fantastically easy to run out of hardware threads
before you've made the basic peripherals you get in a $0.50
microcontroller.) And the third thing that comes across is how totally
and utterly incomprehensible the software design and the programming
examples are. The GA144 is squarely in the category of technology that
is cool, impressive, and useless in the real world where developers have
to do a job, not play with toys.

Sure, it would be possible to learn this. But there is no way I could
justify the investment in time and effort that would entail.

And there is no way I would want to go to a language with less safety,
poorer typing, weaker tools, harder testing, more limited static
checking than the development tools I can use now with C and C++.


Quote:

In the same way that the other functional logic elements like the
block RAMs and DSP blocks are used for custom functionality which
requires the designer to program by whatever means is devised, these
tiny CPUs would not need a high level language like C. The code in
them would be small enough to be considered "logic" and developed at
the assembly level.


The modern way to use the DSP blocks on FPGA's is either with ready-made
logic blocks, code generator tools like Matlab, or C to hardware
converters. They are not configured manually at a low level. Even
when they are generated directly from VHDL or Verilog, the developer
writes "x = y * z + w" with the required number of bits in each element,
and the tools turn that into whatever DSP blocks are needed.
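To make "the required number of bits in each element" concrete, here is a rough Python sketch of the width arithmetic behind an expression like x = y * z + w. It assumes signed two's-complement operands; the function names are made up for illustration and do not come from any vendor tool.

```python
def signed_range(bits):
    """Smallest and largest value of a signed two's-complement number."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def mac_result_bits(y_bits, z_bits, w_bits):
    """Conservative width that holds y*z + w without overflow (hypothetical helper)."""
    product_bits = y_bits + z_bits          # full-precision signed product
    return max(product_bits, w_bits) + 1    # one extra bit for the addition

def mac_extremes(y_bits, z_bits, w_bits):
    """Actual min and max of y*z + w; a product's extremes sit at the range corners."""
    ylo, yhi = signed_range(y_bits)
    zlo, zhi = signed_range(z_bits)
    wlo, whi = signed_range(w_bits)
    products = [a * b for a in (ylo, yhi) for b in (zlo, zhi)]
    return min(products) + wlo, max(products) + whi

print(mac_result_bits(18, 18, 36))  # width needed for an 18x18 multiply plus a 36-bit addend
```

The synthesis tools do this bookkeeping (and the mapping onto DSP blocks) for you; the sketch only shows why the declared widths matter.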

The key thing you have to think about here, is who would use these tiny
cpus, and why. Is there a reason for using a few of them scattered
around the device, programmed in assembly (or worse, Forth) ? Why would
the developer want to do that instead of just adding another software
thread to the embedded ARM processor, where development is so much more
familiar? Why would the hardware designer want them, instead of writing
a little state machine in the language of their choice (VHDL, Verilog,
System C, MyHDL, C-to-HDL compiler, whatever)?

I am missing the compelling use-cases here. Yes, it is possible to make
small and simple cpu units with a stack machine architecture, and fit
lots of them in an FPGA. But I don't see /why/ I would want them -
certainly not why they are better than alternatives, and worth the
learning curve.

Quote:

People have mindsets about things and I believe this is one of them.


Exactly. And you have a choice here - work with people with the
mindsets they have, or give /seriously/ compelling reasons why they
should invest in the time and effort needed to change those mindsets.
Wishful thinking is not the answer.

Quote:
The GA144 is not so easy to program because people want to use it for
the sort of large programs they write for other fast CPUs.


It is not easy to program because it is not easy to program.
Multi-threaded or multi-process software is harder than single-threaded
code.

The tools and language here for the GA144 - based on Forth - are two
generations behind the times. They are totally unfamiliar to almost any
current software developer.

And yes, there is the question of what kind of software you would want
to write. People either want to write small, dedicated software - in
which case they want a language that is familiar and they want to keep
the code simple. Or they want bigger projects, reusing existing code -
in which case they /need/ a language that is standard.

Look at the GA144 site. Apart from the immediate fact that it is pretty
much a dead site, and clearly a company that has failed to take off,
look at the examples. A 10 Mb software Ethernet MAC ? Who wants /that/
in software? A PS/2 keyboard controller? An MD5 hash generator running
in 16 cpus? You can download a 100-line md5 function for C and run it
on any processor.


Quote:
In an
FPGA a very fast processor can be part of the logic rather than an
uber-controller riding herd over the whole chip. But this would
require designers to change their thinking of how to use CPUs. The
F18A runs at 700 MIPS peak rate in a 180 nm process. Instead of one
or two in the FPGA like the ARMs in other FPGAs, there would be
hundreds, each one running at some GHz.

It has long been established that lots of tiny processors running really
fast are far less use than a few big processors running really fast.
700 MIPS sounds marvellous, until you realise how simple and limited
each of these instructions is.

At each step here, you have been entirely right about what can be done.
Yes, you can make small and simple processors - so small and simple
that you can have lots of them at high clock speeds.

And you have been right that using these would need a change in mindset,
programming language, and development practice to use them.

But nowhere do I see any good reason /why/. No good use-cases. If you
want to turn the software and FPGA development world on its head, you
need an extraordinarily good case for it.

Tom Gardner
Guest

Tue Mar 19, 2019 11:45 am   



On 19/03/19 10:08, Theo Markettos wrote:
Quote:
gnuarm.deletethisbit_at_gmail.com wrote:
Certainly this would be like Forth, but the reality is I'm thinking of a
Forth like CPU because they can be designed so simply.

The F18A stack processor designed by Charles Moore is used in the GA144
chip. There are 144 of them with unusual interconnections that allow the
CPU to halt waiting for communications, saving power. The CPU is so small
that it could be included in an FPGA as what would be equivalent to a
logic element.

In the same way that the other functional logic elements like the block
RAMs and DSP blocks are used for custom functionality which requires the
designer to program by whatever means is devised, these tiny CPUs would
not need a high level language like C. The code in them would be small
enough to be considered "logic" and developed at the assembly level.

The problem this boils down to is programmability.

If you have a small core, you can therefore have lots of them. But writing
software for and managing dozens or hundreds of cores is troublesome. At
this level, you have enough headache with the inter-core communication that
you'd rather not throw a strange assembler-only core architecture into the
mix. A core like this would need a simple inter-core programming model so
it's easy to reason about system behaviour (example: systolic arrays)


Yup. The hardware is easy. Programming is painful, but there
are known techniques to control it...

There's an existing commercially successful set of products in
this domain. You get 32-core 4000MIPS processors, and the IDE
guarantees the hard real-time performance.

Programming uses techniques created in the 70s, first
implemented in the 80s, and which continually reappear, e.g.
TI's DSP engines, Rust, Go etc.
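The post does not name the technique, but it is presumably CSP-style message passing, where processes share nothing and talk only over channels. A minimal Python sketch of that style follows, using threads and bounded queues as stand-in channels. This is an assumption-laden illustration, not xC: a queue with maxsize=1 is a one-slot buffer rather than a true rendezvous, and the function names are invented.

```python
import queue
import threading

def producer(ch, n):
    """Send 0..n-1 down the channel, then a None sentinel to mark end of stream."""
    for i in range(n):
        ch.put(i)            # blocks while the one-slot channel is full
    ch.put(None)

def doubler(inp, out):
    """Read values from one channel, write doubled values to another."""
    while (v := inp.get()) is not None:
        out.put(v * 2)
    out.put(None)            # forward the end-of-stream marker

def run_pipeline(n):
    """Wire producer -> doubler -> collector and return what the collector saw."""
    a, b = queue.Queue(maxsize=1), queue.Queue(maxsize=1)
    threads = [threading.Thread(target=producer, args=(a, n)),
               threading.Thread(target=doubler, args=(a, b))]
    for t in threads:
        t.start()
    results = []
    while (v := b.get()) is not None:
        results.append(v)
    for t in threads:
        t.join()
    return results

print(run_pipeline(5))
```

Each stage only ever blocks on its own channels, which is what makes this kind of system comparatively easy to reason about.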

Understand XMOS's xCORE processors and xC language, see how
they complement and support each other. I found the net result
stunningly easy to get working first time, without having to
continually read obscure errata!


Guest

Tue Mar 19, 2019 12:45 pm   



On Tuesday, March 19, 2019 at 6:56:42 AM UTC-4, already...@yahoo.com wrote:
Quote:
On Tuesday, March 19, 2019 at 2:13:38 AM UTC+2, gnuarm.del...@gmail.com wrote:
Most of us have implemented small processors for logic operations that don't need to happen at high speed. Simple CPUs can be built into an FPGA using a very small footprint much like the ALU blocks. There are stack based processors that are very small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would anyone be interested? Or is a C compiler mandatory even for processors running very small programs?

I am picturing this not terribly unlike the sequencer I used many years ago on an I/O board for an array processor which had it's own assembler. It was very simple and easy to use, but very much not a high level language. This would have a language that was high level, just not C rather something extensible and simple to use and potentially interactive.

Rick C.

It is clear that you have Forth in mind.
It is less clear why you don't say it straight.


Because this is not about Forth. It is about very small processors. I would not really bother with Forth as the programming language specifically because that would be a layer on top of what you are doing and to be efficient it would need to be programmed in assembly.

That said, the assembly language for a stack processor is much like Forth, since Forth uses a virtual stack machine as its programming model. So yes, it would be similar to Forth. I most likely would use Forth to write programs for these, but that is just my preference since that is the language I program in.

But the key here is to program the CPUs in their stack oriented assembly. That's not really Forth even if it is "Forth like".
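For anyone who has not seen one, here is a minimal Python sketch of what "stack oriented assembly" looks like in practice. The opcode names are illustrative only; this is not the F18A instruction set or any real core, just a toy interpreter that shows why such code reads like Forth.

```python
def run(program, stack=None):
    """Execute a list of tokens: ints are pushed as literals, strings are opcodes."""
    stack = list(stack or [])
    for tok in program:
        if isinstance(tok, int):
            stack.append(tok)                    # literal: push onto the stack
        elif tok == "dup":
            stack.append(stack[-1])              # duplicate top of stack
        elif tok == "drop":
            stack.pop()                          # discard top of stack
        elif tok == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif tok == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "and":
            b, a = stack.pop(), stack.pop()
            stack.append(a & b)
        else:
            raise ValueError(f"unknown opcode: {tok!r}")
    return stack

# The token list below is the same sequence as the Forth phrase "3 4 + dup +".
print(run([3, 4, "+", "dup", "+"]))
```

Operands are implicit (always the top of the stack), so the instructions need no register fields, which is a large part of why these cores can be so small.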

Is that what you wanted to know?

Rick C.


Guest

Tue Mar 19, 2019 12:45 pm   



On Tuesday, March 19, 2019 at 6:27:44 AM UTC-4, Tom Gardner wrote:
Quote:

Yup. The hardware is easy. Programming is painful, but there
are known techniques to control it...


That is 'C' world, conventional thinking. If you can write a hello world program without using a JTAG debugger, you should be able to write and debug most programs for this core in the simulator with 100% correctness. We aren't talking about TCP/IP stacks.


Quote:
There's an existing commercially successful set of products in
this domain. You get 32-core 4000MIPS processors, and the IDE
guarantees the hard real-time performance.


And they are designed to provide MIPS, not logic functions.

I don't want to go too far into the GA144 since this is not what I'm talking about inserting into an FPGA, but only as an analogy. One of the criticisms of that device is how hard it is to get all 144 processors cranking at full MIPS. But the chip is not intended to utilize "the full MIPS" possible. It is intended to be like an FPGA where you have CPUs available to do what you want without regard to squeezing out every possible MIPS. No small number of these processors will do nothing other than pass data and control to their neighbors while mostly idling, because that is the way they are wired together.

The above mentioned 4000 MIPS processor is clearly intended to utilize every last MIPS. Not at all the same and it will be programmed very differently.


Quote:
Programming uses a techniques created in the 70s, first
implemented in the 80s, and which continually reappear, e.g.
TI's DSP engines, Rust, Go etc.

Understand XMOS's xCORE processors and xC language, see how
they complement and support each other. I found the net result
stunningly easy to get working first time, without having to
continually read obscure errata!


But not at all relevant here since their focus is vastly different from providing logic functions efficiently.

Rick C.


Guest

Tue Mar 19, 2019 12:45 pm   



On Tuesday, March 19, 2019 at 6:21:24 AM UTC-4, Tom Gardner wrote:
Quote:
On 19/03/19 00:13, gnuarm.deletethisbit_at_gmail.com wrote:
Most of us have implemented small processors for logic operations that don't
need to happen at high speed. Simple CPUs can be built into an FPGA using a
very small footprint much like the ALU blocks. There are stack based
processors that are very small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would anyone be
interested? Or is a C compiler mandatory even for processors running very
small programs?

I am picturing this not terribly unlike the sequencer I used many years ago
on an I/O board for an array processor which had it's own assembler. It was
very simple and easy to use, but very much not a high level language. This
would have a language that was high level, just not C rather something
extensible and simple to use and potentially interactive.
Who cares about yet another processor programmed in the same
old language. It would not have a *U*SP. In fact it would be
"back to the 80s" :)


Sorry, I don't get what any of this means.


Quote:
However, if you want to make it interesting enough to pass
the elevator test, ensure it can do things that existing
systems find difficult.

You should have a look at how the XMOS hardware and software
complement each other, so that the combination allows hard
real time operation programming in multicore systems. (Hard
means guaranteed-by-design latencies between successive i/o
activities)


Yeah, I think the XMOS model is way more complex than what I am describing. The XMOS processors are actually very complex and use lots of gates. They also don't run all that fast. Their claim to fame is being able to communicate through shared memory as if the other CPUs were not there, in the good way. Otherwise they are conventional processors, programmed in conventional ways.

The emphasis here is for the CPU to be nearly invisible as a CPU and much more like a function block. You just have to "configure" the operation by writing a bit of code. That's why 'C' is not desirable, it would be too cumbersome for small code blocks.

Rick C.


Guest

Tue Mar 19, 2019 12:45 pm   



On Tuesday, March 19, 2019 at 2:13:38 AM UTC+2, gnuarm.del...@gmail.com wrote:
Quote:
Most of us have implemented small processors for logic operations that don't need to happen at high speed. Simple CPUs can be built into an FPGA using a very small footprint much like the ALU blocks. There are stack based processors that are very small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would anyone be interested? Or is a C compiler mandatory even for processors running very small programs?

I am picturing this not terribly unlike the sequencer I used many years ago on an I/O board for an array processor which had it's own assembler. It was very simple and easy to use, but very much not a high level language. This would have a language that was high level, just not C rather something extensible and simple to use and potentially interactive.

Rick C.


It is clear that you have Forth in mind.
It is less clear why you don't say it straight.


Guest

Tue Mar 19, 2019 12:45 pm   



On Tuesday, March 19, 2019 at 6:08:36 AM UTC-4, Theo Markettos wrote:
Quote:
gnuarm.deletethisbit_at_gmail.com wrote:
Certainly this would be like Forth, but the reality is I'm thinking of a
Forth like CPU because they can be designed so simply.

The F18A stack processor designed by Charles Moore is used in the GA144
chip. There are 144 of them with unusual interconnections that allow the
CPU to halt waiting for communications, saving power. The CPU is so small
that it could be included in an FPGA as what would be equivalent to a
logic element.

In the same way that the other functional logic elements like the block
RAMs and DSP blocks are used for custom functionality which requires the
designer to program by whatever means is devised, these tiny CPUs would
not need a high level language like C. The code in them would be small
enough to be considered "logic" and developed at the assembly level.

The problem this boils down to is programmability.

If you have a small core, you can therefore have lots of them. But writing
software for and managing dozens or hundreds of cores is troublesome.


So how do they design with the many other functional elements in an FPGA? Is it really that hard to program the various logic functions in an FPGA because of the difficulty in defining their communications?


Quote:
At
this level, you have enough headache with the inter-core communication that
you'd rather not throw a strange assembler-only core architecture into the
mix.


Wow! Makes you wonder how FPGAs ever get designed at all.

Quote:
A core like this would need a simple inter-core programming model so
it's easy to reason about system behaviour (example: systolic arrays)


"Inter-core programming model", not sure what that means.

I think you are overthinking this, much as people do when using large CPUs, not to say they are overthinking for those designs. The whole point is that these CPUs would be used like logic blocks, not like CPUs. Small, limited memory with programs written to be easy to debug and/or designed to simply work by being simple.

I'm not sure how software people think really. I worked with one guy to try to solve a problem and they were using a subroutine to do a memory access. I suppose this was because it needed to be this specific code to get the access to work the way it needed to. But then the guy kept looking at those five lines of code for some serious time. It was pretty clear to me what it was doing. Or he could have run the code or simulated it and looked to see what it did. Actually that would have been a valid use of JTAG, to single step through those five lines a few times. But he just kept reading those five lines. Wow!


Quote:
There's a certain merit in having a CPU as a building block, like a LAB,
BRAM or DSP block. I'm not familiar with the literature in this space, but
it's the sort of thing that turns up at the 'FPGA' conference regularly
(keyword: CGRA). That merely punts the issue to now being a tools problem -
the tools know how to make use of a DSP block, but how to make use of a CPU
block? How to turn HDL into 'software'? Can you chain the blocks together
to make wider logic?


I don't see the difficulty. I'm not so familiar with Verilog, but in VHDL you have sequential code. It wouldn't be hard to program a CPU using VHDL I think. If nothing else, there should be a way to code in assembler and embed the code similarly to what is done for ROM like functions in HDL.
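As a rough illustration of the "code in assembler and embed it like a ROM" idea, here is a minimal Python sketch that turns a few lines of assembly text into a list of hex words, one per line, which is a layout many memory initialisation flows accept. The mnemonics, the opcode numbers and the 8-bit opcode plus 8-bit operand word format are all hypothetical, invented for this example; they do not describe any real core or vendor tool.

```python
# Hypothetical opcode table, invented for this sketch only.
OPCODES = {"nop": 0x00, "lit": 0x01, "add": 0x02, "dup": 0x03, "jmp": 0x04}


def assemble(source):
    """Translate assembly text into a list of 16-bit words.

    Each word is (opcode << 8) | operand. Text after '#' is a comment.
    """
    words = []
    for raw_line in source.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank and comment-only lines
        parts = line.split()
        mnemonic = parts[0]
        operand = int(parts[1], 0) if len(parts) > 1 else 0
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
        if not 0 <= operand < 256:
            raise ValueError(f"operand out of range: {operand}")
        words.append((OPCODES[mnemonic] << 8) | operand)
    return words


def to_hex_lines(words, digits=4):
    """Format words as zero-padded hex, one word per line."""
    return "\n".join(f"{word:0{digits}x}" for word in words)


if __name__ == "__main__":
    program = """
    lit 5      # push 5
    lit 0x10   # push 16
    add        # add them
    """
    print(to_hex_lines(assemble(program)))
```

The text output could then be pulled into the HDL design the same way a ROM table is, so the "program" lives next to the logic that uses it.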


Quote:
I suppose there's also a niche at the ultra-cheap end of the spectrum - for
$1 gadgets with an 8051 because a 16 bit CPU would be too expensive (and a
Cortex M0 would have licence fees).


I believe we have officially reached the point where $1 processors are 32 bit ARMs and you have to get below $0.50 before you consider needing 8 bit processors. Not sure what this has to do with adding CPU functional elements to FPGAs.


Quote:
But if this is an ASIC then I don't
think there's a whole lot more to pay to get a C-compatible processor (even
in 8 bits). And it's unclear how much speed penalty you'd pay for that.


You are thinking of something totally different from using a CPU as logic. Processors like ARMs are too large to have hundreds in an FPGA (unless it is a really large chip). Their architectural capabilities are much more than what is required for this. I suppose a small 8 bit CPU could be used, but why use such a tiny data path with such limited capability? The architectural simplicity of a stack machine allows it to be designed to run very fast. With speed comes a certain flexibility to keep up with the discrete logic.


Quote:
How much code/data memory would you expect to have? Would that dwarf the
size of your core?


Small, very small. Maybe 256 words of RAM. Instructions on the F18A are only 5 bits wide and so pack four per 18 bit word. The last instruction slot is only 3 bits wide, expressing a subset of the 32 instructions otherwise coded for. Round the word width up to 20 bits or even 32.
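The packing described above (three 5-bit slots plus one 3-bit slot in an 18-bit word) can be sketched in Python as follows. The slot order and the rule that the short slot stores only the top three bits of an opcode whose low two bits are zero are assumptions made for this illustration. This is not the exact F18A encoding.

```python
# Illustrative only. Slot widths follow the post: 5 + 5 + 5 + 3 = 18 bits.
# Assumption: the 3-bit slot holds the top three bits of a 5-bit opcode,
# so only opcodes that are multiples of 4 can go there.
WORD_BITS = 18


def pack_word(op0, op1, op2, op3):
    """Pack four 5-bit opcodes into one 18-bit instruction word."""
    for op in (op0, op1, op2, op3):
        if not 0 <= op < 32:
            raise ValueError(f"opcode {op} does not fit in 5 bits")
    if op3 & 0b11:
        raise ValueError("last slot only holds opcodes that are multiples of 4")
    return (op0 << 13) | (op1 << 8) | (op2 << 3) | (op3 >> 2)


def unpack_word(word):
    """Recover the four opcodes from an 18-bit instruction word."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word does not fit in 18 bits")
    return (
        (word >> 13) & 0x1F,
        (word >> 8) & 0x1F,
        (word >> 3) & 0x1F,
        (word & 0x7) << 2,
    )


if __name__ == "__main__":
    word = pack_word(1, 2, 3, 4)
    print(f"{word:018b}", unpack_word(word))
```

With four slots per word, a 256 word memory would hold at most 1024 instruction slots, which gives a feel for how small these programs are meant to be.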

I'm not sure what happens if this actual processor is shrunk from 180 nm to something like 20 nm. It was highly optimized for the 180 nm process it is built in and it may require some tweaks to work well at smaller processes. The F18A has no external clock and different instructions time differently, with basic logic instructions running very fast and memory accesses taking more time. You can think of it as an async processor.


Quote:
Finally I can see the use as a 'state machine implementation engine' for say
a CPLD. But for that you need tools (taking HDL or state-transition diagrams)
to allow the programmer to describe their state machine. And your
competition is the regular HDL synthesiser which will just make it out of
flip flops. I'm unclear how often you'd win in these circumstances.


If you have logic that is well implemented sequentially (at a very high speed, likely multiple GIPS) it will save a lot of room in the FPGA, just as multipliers and other function blocks do. Hard cores are much more efficient, and sequential code is most efficient in a CPU type design which leverages the size advantage of memory over logic.


Quote:
And I can't really see 'interactive' as a feature - either you have only one
core, in which case you could equally hook up JTAG (or equivalent) to
something larger for interactive debugging, or you have many cores, in which
case I can't see how you'd interact sensibly with dozens at once.


If you have to use JTAG to debug something like this you are pretty much doomed. I haven't used JTAG for anything other than programming FPGAs in decades.

In general FPGAs are 99.9% debugged in simulation. The odd 0.1% requires pretty special thinking anyway and I don't find JTAG to be very useful. My best debugging tool is wetware.

The point of interactivity is to allow the code to be tested one definition at a time. But then that is a Forth concept and I'm pretty sure not a familiar concept with most people.

Rick C.

Tom Gardner
Guest

Tue Mar 19, 2019 1:45 pm   



On 19/03/19 11:00, gnuarm.deletethisbit_at_gmail.com wrote:
Quote:
On Tuesday, March 19, 2019 at 6:21:24 AM UTC-4, Tom Gardner wrote:
On 19/03/19 00:13, gnuarm.deletethisbit_at_gmail.com wrote:
Most of us have implemented small processors for logic operations that
don't need to happen at high speed. Simple CPUs can be built into an
FPGA using a very small footprint much like the ALU blocks. There are
stack based processors that are very small, smaller than even a few kB of
memory.

If they were easily programmable in something other than C would anyone
be interested? Or is a C compiler mandatory even for processors running
very small programs?

I am picturing this not terribly unlike the sequencer I used many years
ago on an I/O board for an array processor which had it's own assembler.
It was very simple and easy to use, but very much not a high level
language. This would have a language that was high level, just not C
rather something extensible and simple to use and potentially
interactive.
Who cares about yet another processor programmed in the same old language.
It would not have a *U*SP. In fact it would be "back to the 80s" :)

Sorry, I don't get what any of this means.


However, if you want to make it interesting enough to pass the elevator
test, ensure it can do things that existing systems find difficult.

You should have a look at how the XMOS hardware and software complement
each other, so that the combination allows hard real time operation
programming in multicore systems. (Hard means guaranteed-by-design
latencies between successive i/o activities)

Yeah I think the XMOS model is way more complex than what I am describing.
The XMOS processors are actually very complex and use lots of gates. They
also don't run all that fast.


Individually not especially fast, aggregate fast.


Quote:
Their claim to fame is to be able to
communicate through shared memory as if the other CPUs were not there in the
good way.


Not just shared memory, *far* more interesting than that.

Up to 8 cores in a "tile" share memory.
Comms between tiles is via an interconnection network
Comms with i/o is via the same interconnection network.

At the program level there is *no* difference between comms
via shared memory and comms via interconnection network.
Nor is there any difference between comms with i/o and
comms with other cores.

All comms is via channels. That's one thing that makes
the hardware+software environment unique.
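To make the "all comms is via channels" idea concrete, here is a minimal Python sketch in which bounded queues stand in for channels and threads stand in for cores. It is an analogy only: it does not use xC syntax or any XMOS API, and the `None` end marker is a convention of this sketch. The point it shows is that each stage only sees the channel ends it is given, so it cannot tell whether the other end is another core or an i/o port.

```python
import queue
import threading


def producer(out_chan, values):
    """Send each value down the channel, then an end marker."""
    for value in values:
        out_chan.put(value)
    out_chan.put(None)  # end marker (convention of this sketch)


def doubler(in_chan, out_chan):
    """Read from one channel, write doubled values to another."""
    while True:
        value = in_chan.get()
        if value is None:
            out_chan.put(None)
            return
        out_chan.put(2 * value)


def collect(in_chan):
    """Drain a channel into a list until the end marker arrives."""
    results = []
    while True:
        value = in_chan.get()
        if value is None:
            return results
        results.append(value)


def run_pipeline(values):
    """Wire producer -> doubler -> collector using two bounded channels."""
    chan_a = queue.Queue(maxsize=1)
    chan_b = queue.Queue(maxsize=1)
    workers = [
        threading.Thread(target=producer, args=(chan_a, values)),
        threading.Thread(target=doubler, args=(chan_a, chan_b)),
    ]
    for worker in workers:
        worker.start()
    results = collect(chan_b)
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

Swapping the producer for a stage that reads a pin, or the collector for one that drives a pin, would not change the `doubler` stage at all, which is the property being described.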


Quote:
Otherwise they are conventional processors, programmed in
conventional ways.


No. You are missing the key differentiating points...

Conventional processors and programming treat multicore
programming as an advanced add on library - explicitly
so in the case of C. And a right old mess that is.

xC+xCORE *start* by presuming multicore systems, and
use a set of harmonious concepts to make multicore
programming relatively easy and predictable.


Quote:
The emphasis here is for the CPU to be nearly invisible as a CPU and much
more like a function block.


Why bother? What would be the *benefit*?

Yes, you can use a screw instead of a nail, but
that doesn't mean there is a benefit. Unless, of
course, you can't use a hammer.


Quote:
You just have to "configure" the operation by
writing a bit of code. That's why 'C' is not desirable, it would be too
cumbersome for small code blocks.



Guest

Tue Mar 19, 2019 1:45 pm   



On Tuesday, March 19, 2019 at 6:26:37 AM UTC-4, David Brown wrote:
Quote:
On 19/03/2019 09:32, gnuarm.deletethisbit_at_gmail.com wrote:
On Tuesday, March 19, 2019 at 4:15:47 AM UTC-4, David Brown wrote:
On 19/03/2019 01:13, gnuarm.deletethisbit_at_gmail.com wrote:
Most of us have implemented small processors for logic
operations that don't need to happen at high speed. Simple CPUs
can be built into an FPGA using a very small footprint much like
the ALU blocks. There are stack based processors that are very
small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would
anyone be interested? Or is a C compiler mandatory even for
processors running very small programs?

I am picturing this not terribly unlike the sequencer I used
many years ago on an I/O board for an array processor which had
it's own assembler. It was very simple and easy to use, but very
much not a high level language. This would have a language that
was high level, just not C rather something extensible and simple
to use and potentially interactive.

Rick C.


If it is going to appeal to software developers, you need C. And
it has to be reasonable, standard C, even if it is for small
devices - programmers are fed up with the pains needed for special
device-specific C on 8051, AVR, PIC, etc. That does not
necessarily mean it has to be fast, but it should work with
standard language. Having 16-bit size rather than 8-bit size makes
a huge difference to how programmers feel about the device - aim
for something like the msp430.

You might, however, want to look at extensions for CSP-style
communication between cpus - something like XMOS XC.

If it is to appeal to hardware (FPGA) developers, C might not be
as essential. Some other kind of high level language, perhaps
centred around state machines, might work.

But when I see "extensible, simple to use and potentially
interactive", I fear someone is thinking of Forth. People who are
very used to Forth find it a great language - but you need to
understand that /nobody/ wants to learn it. Most programmers would
rather work in assembler than Forth. You can argue that this
attitude is irrational, and that Forth is not harder than other
languages - you might be right. But that doesn't change matters.

Certainly this would be like Forth, but the reality is I'm thinking
of a Forth like CPU because they can be designed so simply.

I appreciate that.

I can only tell you how /I/ would feel here, and let you use that for
what you think it is worth. I don't claim to speak for all software
developers, but unless other people are giving you feedback too, then
this is the best you've got :) Remember, I am not trying to argue
about the pros and cons of different designs or languages, or challenge
you to persuade me of anything - I'm just showing you how software
developers might react to your design ideas.


That alone is a misunderstanding of what I am suggesting. I see no reason to involve "programmers". I don't think any FPGA designer would have any trouble using these processors and "programmers" are not required. Heck, the last company I worked for designed FPGAs in the software department, so everyone writing HDL for FPGAs was a "programmer". Maybe the distinction is less than I realize.


Quote:
The F18A stack processor designed by Charles Moore is used in the
GA144 chip. There are 144 of them with unusual interconnections that
allow the CPU to halt waiting for communications, saving power. The
CPU is so small that it could be included in an FPGA as what would be
equivalent to a logic element.

Yes, but look how popular the chip is - it is barely a blip in the
landscape. There is no doubt that this is a technologically fascinating
device.


That's not the issue, I'm not proposing anyone use a GA144.


Quote:
However, it is very difficult to program such chips - almost no
one is experienced with such multi-cpu arrangements, and the design
requires a completely different way of thinking from existing software
design.


Again, that's not what I am proposing. They have hundreds of multipliers and DSP blocks in FPGAs with no one worrying about how they will tie together. These CPUs would be similar.


Quote:
Add to that a language that works backwards, and a syntax that
looks like the cat walked across the keyboard, and you have something
that has programmers running away.


Now you are interjecting your own thoughts. I never suggested that cats be used to program these CPUs.


> My experience with Forth is small and outdated, but not non-existent.

Too bad this isn't about Forth.


Quote:
I've worked with dozens of programming languages over the years - I've
studied CSP, programmed in Occam, functional programming languages, lots
of assemblies, a small amount of CPLD/FPGA work in various languages,
and many other kinds of coding.


There are many areas where a "little" knowledge is a dangerous thing. I think programming languages and especially FPGA design are among those areas.


Quote:
(Most of my work for the past years has
been C, C++ and Python.) I'm not afraid of learning new things. But
when I looked at some of the examples for the GA144, three things struck
me. One is that it was amazing how much they got on the device.
Another is to wonder about the limitations you get from this sort of
architecture. (That is a big turn-off with the XMOS. It's
fantastically easy to make nice software-based peripherals using
hardware threads. And fantastically easy to run out of hardware threads
before you've made the basic peripherals you get in a $0.50
microcontroller.) And the third thing that comes across is how totally
and utterly incomprehensible the software design and the programming
examples are. The GA144 is squarely in the category of technology that
is cool, impressive, and useless in the real world where developers have
to do a job, not play with toys.


I see why you started your comments with the big caveat. You seem to have a bone to pick with Forth and the GA144, neither of which are what I am talking about. You've gotten ahead of yourself.


Quote:
Sure, it would be possible to learn this. But there is no way I could
justify the investment in time and effort that would entail.

And there is no way I would want to go to a language with less safety,
poorer typing, weaker tools, harder testing, more limited static
checking than the development tools I can use now with C and C++.


Yes, well good thing you would never be the person who wrote any code for this. No "programmers" allowed, only FPGA designers... and no amateurs allowed either. ;)


Quote:
In the same way that the other functional logic elements like the
block RAMs and DSP blocks are used for custom functionality which
requires the designer to program by whatever means is devised, these
tiny CPUs would not need a high level language like C. The code in
them would be small enough to be considered "logic" and developed at
the assembly level.

The modern way to use the DSP blocks on FPGA's is either with ready-made
logic blocks, code generator tools like Matlab, or C to hardware
converters. They are not configured manually at a low level. Even
when they are generated directly from VHDL or Verilog, the developer
writes "x = y * z + w" with the required number of bits in each element,
and the tools turn that into whatever DSP blocks are needed.


I guess I'm not modern then. I use VHDL and like it... Yes, I actually said I like VHDL. The HDL so many love to hate.

I see no reason why these devices couldn't be programmed using VHDL, but it would be harder to debug. But then I expect you are the JTAG sort as well. That's not really what I'm proposing and I think you are overstating the case for "press the magic button" FPGA design.


Quote:
The key thing you have to think about here, is who would use these tiny
cpus, and why. Is there a reason for using a few of them scattered
around the device, programmed in assembly (or worse, Forth) ? Why would
the developer want to do that instead of just adding another software
thread to the embedded ARM processor, where development is so much more
familiar?


Because an ARM can't keep up with the logic. An ARM is very hard to interface usefully as a *part* of the logic. That's the entire point of the F18A CPUs. Each one is small enough to be dedicated to the task at hand (like in the XMOS) while running at a very high speed, enough to keep up with 100 MHz logic.


Quote:
Why would the hardware designer want them, instead of writing
a little state machine in the language of their choice (VHDL, Verilog,
System C, MyHDL, C-to-HDL compiler, whatever)?


That depends on what the state machine is doing. State machines are all ad-hoc and produce their own little microcosm needing support. You talk about the issues of programming CPUs. State machines are like designing your own CPU but without any arithmetic. Add arithmetic, data movements, etc. and you have now officially designed your own CPU when you could have just used an existing CPU.

That's fine, if it is what you intended. Many FPGA users add their own soft core CPU to an FPGA. Having these cores would make that unnecessary.

The question is why would an FPGA designer want to roll their own FSM when they can use the one in the CPU?
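As a small illustration of letting a tiny stack CPU do the sequencing and arithmetic instead of a hand-rolled FSM, here is a toy stack machine interpreter in Python. The mnemonics (`lit`, `dup`, `drop`, `swap`, `over`, `add`, `jnz`) and the program are hypothetical, chosen for this sketch; they are not the F18A instruction set.

```python
def run(program, stack, max_steps=1000):
    """Interpret a list of (mnemonic, operand) pairs on a data stack."""
    stack = list(stack)
    pc = 0
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, arg = program[pc]
        pc += 1
        if op == "lit":        # push a literal
            stack.append(arg)
        elif op == "dup":      # duplicate top of stack
            stack.append(stack[-1])
        elif op == "drop":     # discard top of stack
            stack.pop()
        elif op == "swap":     # exchange the top two items
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "over":     # copy the second item to the top
            stack.append(stack[-2])
        elif op == "add":      # pop two items, push their sum
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "jnz":      # pop; jump to arg if the value is non-zero
            if stack.pop() != 0:
                pc = arg
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


# Start with [acc, count] on the stack (count >= 1).
# Each pass adds count to acc, then decrements count, until count is 0.
SUM_TO_N = [
    ("swap", None),  # [count, acc]
    ("over", None),  # [count, acc, count]
    ("add", None),   # [count, acc + count]
    ("swap", None),  # [acc, count]
    ("lit", -1),     # [acc, count, -1]
    ("add", None),   # [acc, count - 1]
    ("dup", None),   # [acc, count - 1, count - 1]
    ("jnz", 0),      # loop while count is non-zero
    ("drop", None),  # [acc]
]

if __name__ == "__main__":
    print(run(SUM_TO_N, [0, 4]))
```

The loop counter, the accumulator and the branch are all things an ad-hoc state machine would need its own registers, adder and next-state logic for. Here they are nine instructions in a memory.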


Quote:
I am missing the compelling use-cases here. Yes, it is possible to make
small and simple cpu units with a stack machine architecture, and fit
lots of them in an FPGA. But I don't see /why/ I would want them -
certainly not why they are better than alternatives, and worth the
learning curve.


Yes, but you aren't really an FPGA designer, no? I can see your concerns as a Python programmer.


Quote:
People have mindsets about things and I believe this is one of them.

Exactly. And you have a choice here - work with people with the
mindsets they have, or give /seriously/ compelling reasons why they
should invest in the time and effort needed to change those mindsets.
Wishful thinking is not the answer.


You are a programmer, not an FPGA designer. I won't try to convince you of the value of many small CPUs in an FPGA.


Quote:
The GA144 is not so easy to program because people want to use it for
the sort of large programs they write for other fast CPUs.

It is not easy to program because it is not easy to program.
Multi-threaded or multi-process software is harder than single-threaded
code.


I can see that you don't understand the GA144. If you are working on a design that suits the GA144 (not that there are tons of those) it's not a bad device. If I were working on a hearing aid app, I would give serious consideration to this chip. It is well suited to many types of signal processing. I once did a first pass of an oscilloscope design for it (strictly low bandwidth). There are a number of apps that suit the GA144, but otherwise, yes, it would be a bear to adapt to other apps.

But this is not about the GA144. My point was to illustrate that you don't need to be locked into the mindset of utilizing every last instruction cycle. Rather these CPUs have cycles to spare, so feel free to waste them. That's what FPGAs are all about, wasting resources. FPGAs have some small percentage of the die used for logic and most of the rest used for routing, most of which is not used. Much of the logic is also not used. Waste, waste, waste! So a little CPU that is only used at 1% of its MIPS capacity is not wasteful if it saves a bunch of logic elsewhere in the FPGA.

That's the point of discussing the GA144.


Quote:
The tools and language here for the GA144 - based on Forth - are two
generations behind the times. They are totally unfamiliar to almost any
current software developer.


And they are not relevant to this discussion.


Quote:
And yes, there is the question of what kind of software you would want
to write. People either want to write small, dedicated software - in
which case they want a language that is familiar and they want to keep
the code simple. Or they want bigger projects, reusing existing code -
in which case they /need/ a language that is standard.


Who is "they" again? I'm not picturing this being programmed by the programming department. To do so would mean two people doing a one-person job.


Quote:
Look at the GA144 site. Apart from the immediate fact that it is pretty
much a dead site, and clearly a company that has failed to take off,
look at the examples. A 10 Mb software Ethernet MAC ? Who wants /that/
in software? A PS/2 keyboard controller? An MD5 hash generator running
in 16 cpus? You can download a 100-line md5 function for C and run it
on any processor.


Wow! You are really fixated on the GA144.


Quote:
In an
FPGA a very fast processor can be part of the logic rather than an
uber-controller riding herd over the whole chip. But this would
require designers to change their thinking of how to use CPUs. The
F18A runs at 700 MIPS peak rate in a 180 nm process. Instead of one
or two in the FPGA like the ARMs in other FPGAs, there would be
hundreds, each one running at some GHz.

It has long been established that lots of tiny processors running really
fast are far less use than a few big processors running really fast.
700 MIPS sounds marvellous, until you realise how simple and limited
each of these instructions is.


Again, you are pursuing a MIPS argument. It's not about using all the MIPS. The MIPS are there to allow the CPU to do its job in a short time to keep up with logic. All the MIPS don't need to be used.
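The arithmetic behind this can be sketched quickly. Using the 700 MIPS peak figure and the 100 MHz logic clock mentioned elsewhere in this thread, the short Python sketch below works out how many instructions fit in one logic clock and what fraction of the CPU a given task uses. The example task (7 instructions per event, one event per microsecond) is made up for illustration.

```python
def instructions_per_logic_clock(cpu_mips, logic_mhz):
    """Instructions the CPU can execute during one logic clock period."""
    return cpu_mips / logic_mhz


def utilisation(instr_per_event, events_per_us, cpu_mips):
    """Fraction of peak instruction rate a periodic task consumes.

    instr_per_event * events_per_us is the needed rate in MIPS, since one
    million instructions per second is one instruction per microsecond.
    """
    return instr_per_event * events_per_us / cpu_mips


if __name__ == "__main__":
    budget = instructions_per_logic_clock(cpu_mips=700, logic_mhz=100)
    print(f"instructions per 100 MHz clock: {budget}")
    # Hypothetical task: 7 instructions per event, one event per microsecond.
    used = utilisation(instr_per_event=7, events_per_us=1.0, cpu_mips=700)
    print(f"utilisation: {used:.1%}")
```

At the peak rate that is a budget of 7 instructions per 100 MHz clock, and the made-up task above uses about 1% of the CPU. Real throughput would be lower than peak whenever memory accesses are involved, since those take longer than basic logic instructions.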

"A few big processors" would suck in being embedded in the logic. The just can't switch around fast enough. You must be thinking of many SLOW processors compared to one fast processor. Or maybe you are thinking of doing work which is suited for a single processor like in a PC.

Yeah, you can use one of the ARMs in the Zynq to run Linux and then use the other to interface to "real time" hardware. But this is a far cry from what I am describing.


Quote:
At each step here, you have been entirely right about what can be done.
Yes, you can make small and simple processors - so small and simple
that you can have lots of them at high clock speeds.

And you have been right that using these would need a change in mindset,
programming language, and development practice to use them.

But nowhere do I see any good reason /why/. No good use-cases. If you
want to turn the software and FPGA development world on its head, you
need an extraordinarily good case for it.


"On it's head" is a powerful statement. I'm just talking here. I'm not writing a business plan. I'm asking open minded FPGA designers what they would use these CPUs for.

Rick C.


Guest

Tue Mar 19, 2019 1:45 pm   



On Tuesday, March 19, 2019 at 1:14:56 PM UTC+2, gnuarm.del...@gmail.com wrote:
Quote:
On Tuesday, March 19, 2019 at 6:56:42 AM UTC-4, already...@yahoo.com wrote:
On Tuesday, March 19, 2019 at 2:13:38 AM UTC+2, gnuarm.del...@gmail.com wrote:
Most of us have implemented small processors for logic operations that don't need to happen at high speed. Simple CPUs can be built into an FPGA using a very small footprint much like the ALU blocks. There are stack based processors that are very small, smaller than even a few kB of memory.

If they were easily programmable in something other than C would anyone be interested? Or is a C compiler mandatory even for processors running very small programs?

I am picturing this not terribly unlike the sequencer I used many years ago on an I/O board for an array processor which had it's own assembler. It was very simple and easy to use, but very much not a high level language. This would have a language that was high level, just not C rather something extensible and simple to use and potentially interactive.

Rick C.

It is clear that you have Forth in mind.
It is less clear why you don't say it straight.

Because this is not about Forth. It is about very small processors. I would not really bother with Forth as the programming language specifically because that would be a layer on top of what you are doing and to be efficient it would need to be programmed in assembly.

That said, the assembly language for a stack processor is much like Forth since Forth uses a virtual stack machine as its programming model. So yes, it would be similar to Forth. I most likely would use Forth to write programs for these, but that is just my preference since that is the language I program in.

But the key here is to program the CPUs in their stack oriented assembly. That's not really Forth even if it is "Forth like".

Is that what you wanted to know?

Rick C.


I wanted to understand if there is a PR element involved. Like, you are afraid that if you say "Forth" then most potential readers immediately stop reading.

I am not a PR consultant, but if I were then I'd suggest removing the word "interactive" from the description of the language that you have in mind.

BTW, I agree that coding in HDLs sucks for many sorts of sequential tasks.
And I agree that having a CPU that is *not* narrow in its data paths and optionally not narrow in external addresses, but small/configurable in everything else, could be a good way to "offload" such parts of a design away from HDL.
I am much less sure that a stack processor is a good choice for such tasks.
