Is this Intel i7 machine good for LTspice?

On 13/11/2014 04:05, josephkk wrote:
On Wed, 12 Nov 2014 19:29:41 -0500, "Maynard A. Philbrook Jr."
<jamie_ka1lpa@charter.net> wrote:


It does not matter whether they were failures or intentional. The fact
remains that a large amount of software did not force the use of an
FPU, and some of it didn't even attempt to detour to FPU code if one
was present.

Years ago I wrote a sat tracking program that would optionally switch
to the FPU if one was present. There was a speed-up, but it wasn't what
I'd call worth a fistful of money for a CPU or add-on FPU.

Jamie
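
For illustration, a minimal sketch of that dispatch pattern in C: bind
the FP or integer routine once at startup through a function pointer.
The fpu_present() check here is a hypothetical stand-in; on a real
8086/8088 you would probe for an 8087 with a short FNINIT/FNSTSW
assembly sequence.

#include <stdio.h>

/* Hypothetical capability check, stubbed so the sketch compiles. */
static int fpu_present(void) { return 1; }

/* FP version of some inner-loop calculation (degrees -> hundredths). */
static long scale_fp(int deg)  { return (long)(deg * 100.0); }

/* The same job in pure integer arithmetic: the software fallback. */
static long scale_int(int deg) { return (long)deg * 100L; }

int main(void)
{
    /* Bind the routine once at startup, then call through the pointer. */
    long (*scale)(int) = fpu_present() ? scale_fp : scale_int;
    printf("%ld\n", scale(45));   /* prints 4500 either way */
    return 0;
}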

The first program that I used that had a noticeable improvement with the
FPU was SPICE. There it made a huge difference. Similar applications had
the same kind of results.

?-)

Anything that used a compiler that could generate inline FP code would
benefit enormously, but if you had a noddy compiler that just had a
bunch of library routines that were either calls to the emulator or
calls to FP code in a subroutine, then the benefits were much smaller.
This old page shows the variation in different sqrt coding tricks from
way back:

http://www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi

The inline code is approximately 5x faster than the ordinary sqrt call.
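
For flavour, a hedged sketch of the kind of coding trick that page
compares: a bit-hack first guess refined with two Newton steps, against
the library sqrtf(). The magic constant is the well-known sqrt variant
of the fast inverse-square-root hack; actual speed ratios depend
entirely on the compiler and CPU.

#include <math.h>
#include <stdio.h>
#include <string.h>

/* Approximate sqrt: halve the float's exponent via integer arithmetic
   for a first guess, then polish with two Newton-Raphson steps. */
static float trick_sqrt(float x)
{
    unsigned int i;
    memcpy(&i, &x, sizeof i);          /* reinterpret the float's bits */
    i = (i >> 1) + 0x1fbd1df5;         /* rough sqrt of the bit pattern */
    float y;
    memcpy(&y, &i, sizeof y);
    y = 0.5f * (y + x / y);            /* Newton step 1 */
    y = 0.5f * (y + x / y);            /* Newton step 2 */
    return y;
}

int main(void)
{
    for (float x = 0.5f; x < 100.0f; x *= 7.0f)
        printf("x=%8.3f  libm=%10.6f  trick=%10.6f\n",
               x, sqrtf(x), trick_sqrt(x));
    return 0;
}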

How much benefit you got from the FPU depended critically on the
quality of your compiler. You often got a bit of extra precision thrown
in too, since the FP stack holds intermediate results to 80 bits.
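
A quick way to see that extra precision, assuming a compiler/ABI where
long double maps to the 80-bit x87 format (on MSVC, for example, long
double is just double):

#include <float.h>
#include <stdio.h>

int main(void)
{
    /* 53 mantissa bits for double; 64 when long double is x87 80-bit. */
    printf("double:      %d mantissa bits\n", DBL_MANT_DIG);
    printf("long double: %d mantissa bits\n", LDBL_MANT_DIG);
    return 0;
}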

The original Intel FPU had a few quirks in the trig functions which were
found when Cyrix did a full analysis for their own numeric FPU (which
was faster, more accurate and cheaper than the Intel part).

--
Regards,
Martin Brown
 
On Fri, 14 Nov 2014 20:28:48 -0800, josephkk
<joseph_barrett@sbcglobal.net> Gave us:

snip
I am not buying it. It would require the memory chips to bring out I/O
pins to indicate "valid/no error", "error corrected", and "error, not
corrected". Now if accesses are more than one chip wide, you have to
combine the status bits somehow, whether or not they are shipped to the
CPU/DMA/video. Also, you may want a different ECC protection profile
than the one the memory chip maker provides.

?-)

The motherboard is involved. I am sure that as much as can be placed on
the RAM 'device' (stick) itself is, and the chipset manages its part.
The end result is that the checksum code required by whatever monitors
and manages all of it (the chipset) ultimately gets generated and used
for comparison. Any errors are handled by the chipset, the RAM, and
that little management code hard-wired into it all. Then it is back to
square one: the next refresh-and-compare sequence. It should all happen
without missing a beat relative to similarly timed non-ECC RAM.

Seeing a bunch of errors would likely slow things down, but that would
also indicate a bigger problem.
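
For concreteness, a toy sketch of exactly those three status outputs,
using the smallest SECDED code there is: Hamming(7,4) plus an overall
parity bit. Real ECC DIMMs use a (72,64) code over the full 64-bit
word, but the status logic has the same shape; all the names here are
illustrative, not any chipset's actual interface.

#include <stdio.h>

enum ecc_status { ECC_OK, ECC_CORRECTED, ECC_UNCORRECTABLE };

/* Encode 4 data bits into 8: positions 1..7 form Hamming(7,4),
   bit 0 is an overall parity bit over the other seven. */
static unsigned encode(unsigned d)
{
    unsigned d1 = d & 1, d2 = (d >> 1) & 1, d3 = (d >> 2) & 1,
             d4 = (d >> 3) & 1;
    unsigned c = ((d1 ^ d2 ^ d4) << 1)      /* p1 at position 1 */
               | ((d1 ^ d3 ^ d4) << 2)      /* p2 at position 2 */
               | (d1 << 3)
               | ((d2 ^ d3 ^ d4) << 4)      /* p4 at position 4 */
               | (d2 << 5) | (d3 << 6) | (d4 << 7);
    unsigned p = 0;
    for (int i = 1; i < 8; i++) p ^= (c >> i) & 1;
    return c | p;
}

static enum ecc_status decode(unsigned c, unsigned *d)
{
    unsigned b[8], p = 0;
    for (int i = 0; i < 8; i++) b[i] = (c >> i) & 1;

    /* Syndrome: for a single flip in bits 1..7 it equals the position. */
    unsigned s = (b[1] ^ b[3] ^ b[5] ^ b[7])
               | ((b[2] ^ b[3] ^ b[6] ^ b[7]) << 1)
               | ((b[4] ^ b[5] ^ b[6] ^ b[7]) << 2);
    for (int i = 0; i < 8; i++) p ^= b[i];  /* overall parity check */

    enum ecc_status st;
    if (s == 0 && p == 0)
        st = ECC_OK;                 /* "valid / no error"            */
    else if (p == 1) {
        c ^= 1u << s;                /* odd flips: correct bit s       */
        st = ECC_CORRECTED;          /* (s==0: the parity bit itself)  */
    } else
        st = ECC_UNCORRECTABLE;      /* even flips: double-bit error   */

    *d = ((c >> 3) & 1) | (((c >> 5) & 1) << 1)
       | (((c >> 6) & 1) << 2) | (((c >> 7) & 1) << 3);
    return st;
}

int main(void)
{
    unsigned d, w = encode(0xA);
    printf("clean:   status %d\n", decode(w, &d));          /* 0 */
    printf("1 flip:  status %d\n", decode(w ^ 0x20u, &d));  /* 1 */
    printf("2 flips: status %d\n", decode(w ^ 0x28u, &d));  /* 2 */
    return 0;
}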
 
On Fri, 14 Nov 2014 09:25:47 +0200, upsidedown@downunder.com wrote:

On Thu, 13 Nov 2014 19:05:20 -0800, josephkk
<joseph_barrett@sbcglobal.net> wrote:

One should also remember that magnetic core as well as dynamic RAM
perform a destructive readout, so you have to perform a writeback
after each read cycle. For core, you only have to do that for the
actual read location (at the X and Y wire crossing); for dynamic RAM,
you have to write back the whole row of column bits (hundreds or
thousands of bits). For this reason, the real access time (not using
CAS multiplexing) is much shorter than the full cycle time.
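
Back-of-envelope numbers for that access-time/cycle-time gap,
representative of a classic ~120 ns page-mode part rather than any
particular datasheet: data is valid at tRAC, but the row must finish
its restore and the RAS line must precharge before another row can be
opened.

#include <stdio.h>

int main(void)
{
    int t_rac = 120;   /* ns: RAS-to-data access time            */
    int t_ras = 150;   /* ns: RAS active time (sense + restore)  */
    int t_rp  =  90;   /* ns: RAS precharge before the next row  */

    printf("data valid after (access time): %d ns\n", t_rac);
    printf("full cycle time (tRAS + tRP):   %d ns\n", t_ras + t_rp);
    return 0;
}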

The similarities between core and DRAM are real. Early DRAM could not
provide the next sequential read quickly, as it had no registers to
store the row. Newer DRAM does (since EDO at least). That said, newer
DRAM speeds up sequential reads over early DRAM by having those
registers and not needing another complete cycle, just another data
clock for the next sequential read. See the DDR series specifications.
The restore part of the cycle continues unabated, so a non-sequential
read following two or more added sequential reads can occur much
sooner.
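
Rough arithmetic for that point, with generic DDR-400-class numbers
(illustrative, not from a datasheet): only the first beat of a burst
pays the full row/column latency, while the rest arrive one per
data-clock edge.

#include <stdio.h>

int main(void)
{
    double t_ck = 5.0;     /* ns per clock at 200 MHz (DDR-400-ish)  */
    int trcd = 3, cl = 3;  /* cycles: RAS-to-CAS delay, CAS latency  */
    int burst = 8;         /* beats per burst, two per clock (DDR)   */

    double first = (trcd + cl) * t_ck;         /* latency to first beat */
    double rest  = (burst - 1) * (t_ck / 2.0); /* one beat per edge     */
    printf("first beat after %.0f ns, last of %d beats %.1f ns later\n",
           first, burst, rest);
    return 0;
}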

Apart from the first DRAMs that used all address lines at once, all
the rest have multiplexed addresses with RAS/CAS selection.

This does not slow the access. For instance, the first RAS/CAS
addressed DRAM was 4096x1 bits with 64 rows and 64 columns. The high 6
address bits were decoded with the RAS signal and selected one of the
64 rows. After a while, all the bits from that row were transferred to
the 64 column sense amplifiers and latches.

When the low address bits were decoded with the CAS signal, they just
selected one of the 64 column sense-amplifier/latch bits and presented
it to the data-out pin. Since the DRAM cell access time was much longer
than the delay of the output column-select multiplexer, the
multiplexing did not slow things much even for a single access.

With the 64 column bits already in the 64 internal registers,
performing several CAS cycles with different low address bits allowed
fast random access _within_ a 64-bit row, just multiplexing out the
selected bit instead of doing a dynamic RAM cell access each time.

Later models had internal column address counters, allowing sequential
column-bit accesses without doing a RAM cell access after the initial
row activation.
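
A small model of that address split and open-row behaviour for the
4096x1 part: 12 address bits, the high six strobed with RAS, the low
six with CAS. Only a row change pays the full cell-array access; the
timings are illustrative.

#include <stdio.h>

#define ROW_BITS 6                       /* 64 rows x 64 columns */
#define COL_MASK ((1u << ROW_BITS) - 1u)

static int open_row = -1;                /* row currently in the latches */

/* Cost in ns of reading one bit at a 12-bit address (toy numbers). */
static int dram_read(unsigned addr)
{
    unsigned row = addr >> ROW_BITS;     /* high bits, strobed with RAS */
    unsigned col = addr & COL_MASK;      /* low bits, strobed with CAS  */
    (void)col;                           /* just the mux select here    */

    if ((int)row == open_row)
        return 60;                       /* CAS-only hit in the latches */
    open_row = (int)row;
    return 180 + 60;                     /* row activate + column select */
}

int main(void)
{
    unsigned seq[] = { 0x03F, 0x040, 0x041, 0x042, 0x7C0 };
    for (unsigned i = 0; i < sizeof seq / sizeof seq[0]; i++)
        printf("addr 0x%03X -> %3d ns\n", seq[i], dram_read(seq[i]));
    return 0;
}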

Video RAMs were very similar. All the bits for a TV line were taken
from the 1024 column bits and then parallel-loaded into a shift
register clocked by the bit clock. This required only a slow line
select every 64 us, so propagation delays were no problem.

Since all the column bits are available simultaneously within the
chip, my point was that it would make sense to put the ECC processing
within the memory chip itself.

I am not buying it. It would require the memory chips to bring out I/O
pins to indicate "valid/no error", "error corrected", and "error, not
corrected". Now if accesses are more than one chip wide, you have to
combine the status bits somehow, whether or not they are shipped to the
CPU/DMA/video. Also, you may want a different ECC protection profile
than the one the memory chip maker provides.

?-)
 
