Clock Edge notation

Hi Peter,
Thank you for your response.

You are a famous inventor at Xilinx; I have read many of your patents
and learned a lot from them.

As a first-time inventor, I would like to do as much as possible myself
to keep the initial investment down. The ideal path for me is to get a
first patent filed and successfully granted by the PTO. I know it is
very difficult, but it should be much easier than learning English as a
second language. Once you have a first successful patent application
behind you, the next ones go more smoothly.

I think it is better to ask for advice and suggestions on these groups
and to get first-hand experience from other experts, to avoid missteps
as much as possible. I met a lawyer who hasn't yet passed his patent
licensing exam but is preparing to open a patent application practice
and to help me file patent applications.

I have read many patents from Xilinx and paid particular attention to
their claims. No patent claims from Xilinx contain any logic
equations. I do remember once reading a patent, not from Xilinx, that
certainly contained an equation, but I cannot find it any more.

Using a logic equation for a LUT in the claims section of a patent
certainly helps explain the idea of the invention. But Xilinx's lawyers
never use them, even though logic equations do appear in the
description section. So I guess there is some USPTO rule forbidding the
use of logic equations for LUTs in patent claims.

Weng


Peter Alfke wrote:
You can file for a US patent up to a year after having divulged the
idea.
That grace period does not apply to foreign filing. There you lose the
right to file immediately after divulging. So, foreign filing is more
demanding, not less.

As far as equations vs. LUTs go, I think it makes no difference. But
equations may be more widely understood. BTW, the OP's example is
confusing: it uses a logic equation whose '*' and '+' actually mean
AND and OR...

Peter Alfke (with about 30 patents, but all filed by company patent
lawyers)
 
Weng Tianxiang wrote:
Hi,
I am writing a patent application for an FPGA and have no prior
experience with patent writing.

I found that in Xilinx patents, all lookup-table equations are
described as AND/OR/multiplexer circuits in the claims. Describing the
logic of a lookup table in claims is much more complex in English than
presenting an equivalent logic equation.

For example, a lookup table has the equation:
Out <= (A*B) + (C*D);

This is much more concise than describing the circuit as AND/OR
gates.
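As a side note on the example: expanding that equation gives the 16-entry truth table the LUT would store. A quick sketch (Python, illustrative only, reading '*' as AND and '+' as OR as in the post):

```python
# Enumerate the 4-input LUT contents implied by Out = (A AND B) OR (C AND D).
from itertools import product

lut = []
for a, b, c, d in product((0, 1), repeat=4):
    out = (a and b) or (c and d)
    lut.append(out)

# 16 entries, one per input combination; 7 of them are '1'.
print(lut)
```

The claim-language question is exactly whether the 16-bit contents, the equation, or the gate network is the form the lawyers prefer.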

Do you have experience with, and any advice on, writing an
equivalent logic equation in a patent claim?
You should be aware that 'Clarity' and 'Patent' are often mutually
exclusive :)
Patent lawyers have motivation to obfuscate, for many reasons.
Patents are merely a license to litigate (and an income stream for the
lawyer), so they tend to be broken into many small claims that can be
argued.
If there is prior art, it also helps to sound a lot different, even if
you are doing the same thing.
This also helps to get over the first hurdle: the patent examiner.

Most (all?) FPGA patents are electronically searchable, so scan those
yourself, and then "work your claim into the gaps" between them.

-jg
 
One of the big problems in defining a patent in terms of specific logic
is that it is too narrow: if your patent covers AND/OR logic, I'll just
implement the same thing in NAND/NOR logic. That's a good reason why
there are no logic equations in Xilinx's patents. As mentioned, patents
have to be as broad as possible, to cover as much as possible, so that
even if something so much as smells the same, your patent will cover it.
And that's why you need an expensive patent lawyer.

Simon


"Jim Granville" <no.spam@designtools.co.nz> wrote in message
news:4394cb3f$1@clear.net.nz...
[quoted text snipped]
 
Hi Ralf,

You can instantiate a component inside itself, and instantiation can be
controlled with "if" conditional statements.
Use changing generic values, e.g. count down a generic parameter for
successive instantiations, and use the generic to control what is
instantiated at each recursive instantiation level and also ensure that
the recursive instantiation stops.
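The control flow of this pattern can be modeled outside VHDL. This Python sketch (an illustration only, not synthesizable code) mimics a component that instantiates two copies of itself with a decremented "generic" until a stop condition ends the recursion:

```python
# Model of recursive instantiation: each level instantiates two copies of
# itself with the generic (depth) decremented, stopping at depth 0.
def instantiate(depth, path="top"):
    """Return the list of instance paths a recursive component would create."""
    if depth == 0:                      # the 'if generate' stop condition
        return [path + ":leaf"]
    instances = [path + ":node"]
    instances += instantiate(depth - 1, path + "/l")  # recursive instance 1
    instances += instantiate(depth - 1, path + "/r")  # recursive instance 2
    return instances

tree = instantiate(3)
# A depth-3 binary tree: 2**3 - 1 = 7 internal nodes plus 2**3 = 8 leaves.
print(len(tree))  # 15
```

In VHDL the same shape comes from an `if ... generate` guarding a self-instantiation with `generic map (depth => depth - 1)`.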

What you design by this method can be very large and very regular, from
very little code. Other large regular structures (memories are the most
obvious example) have customized layouts.
If your RTL instantiates a large rectangular array of similar
structures with similar interconnectivity you might want to be able to
tile similar blocks and inter-block signals in the layout.

I am producing a binary tree of similar tiles and was thinking about
laying it out efficiently.

- Paddy.
 
jamesp wrote:

I am a mature student who will be doing some complex VHDL and Verilog
design work for my course. As well as having to create and test the
functionality of the design (in both languages), I want to document how
the design is put together and its complex hierarchy.
If you have an audience for such artwork,
it is much easier to obtain when the design
is complete. Synthesis programs have an
RTL viewer that will do the work for you.
I sketch diagrams in my notebook to get
started. Things usually change significantly
before I am done.

-- Mike Treseler
 
I skimmed over what you wrote, and you might have a misunderstanding of the
way the std_logic_vector type works. Logic vectors are analogous to a
physical bus on a PCB or inside an ASIC. You need to use one of the
other IEEE data types if you want to do math like division on them.
 
does not work for me :D

try
slvec <= (others => '0');
slvec(0) <= '1';

Regards,
ciry
 
I attended Altera's DSP showcase/seminar in Rochester; there was a
large amount of interest in implementing MPEG-4 (part 10 or 11, I
can't remember) for HDTV applications. You didn't say whether the
advice you are looking for is for a commercial product, an R&D science
project, or a student project, but even implementing something real-time
at DVD quality (720x480) is worth considering.

I think a while back somebody announced the availability of a JPEG
library on www.opencores.org,
http://www.opencores.org/projects.cgi/web/jpeg/overview. I haven't
tried it, but critiquing it could be a good place to start. You could
incorporate parts of its implementation into yours, omit the parts you
don't agree with, and add new ideas not present in it. There is also a
Video Compression Systems section,
http://www.opencores.org/projects.cgi/web/video_systems/overview, that
would be worth taking a look at.

A slightly different approach is using a tool like ImpulseC
(www.impulsec.com). It isn't really C but it is close. It allows you
to efficiently manage parallel systems of functions and integrate them
into VHDL and Verilog designs.

Maybe it is the bubble I have been living in, but your clock speed
seems high; what device families are you considering? I have seen 80
MHz designs outperform similar applications on a gigahertz PC. Don't
let PC marketing skew your objectivity (or maybe it is their choice of
operating systems).

Could you tell us more about the purpose of your project and your end
application?

Derek
 
JPEGs are lossy because of the quantization step. You can omit the
quantization step and still see significant compression. If you
preload your quantization constants and Huffman codes into lookup
tables, you can easily process one pixel per clock cycle in a 1500-gate
FPGA. I wrote a fully pipelined version that queued up the first eight
rows of an incoming image into block RAM before starting on the DCTs.
It worked great. Ideally you would do DCTs on blocks larger than 8x8,
but the advantage of 8x8 is that you can easily do 64 8-bit operations
in parallel, which is nice for the Z-ordering, etc. Bigger blocks
require bigger chips and external memory, and as soon as you have to go
to external memory you lose your pipelining.

You don't want to use a dictionary method for an image. In fact, I'm not
sure you want to do a dictionary method in an FPGA at all; that sounds
scary to me. Use frequency-domain (DCT or wavelet) Z-ordering for photos
and raw RLE for screen shots and similar images.
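For the screen-shot case mentioned above, a minimal run-length-encoding sketch (Python, illustrative only) shows the idea: each run of identical pixel values becomes a (value, count) pair.

```python
# Minimal run-length encoder/decoder of the kind suggested for screen shots.
def rle_encode(pixels):
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return runs

def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [7, 7, 7, 7, 0, 0, 255]
print(rle_encode(row))  # [[7, 4], [0, 2], [255, 1]]
assert rle_decode(rle_encode(row)) == row  # lossless round trip
```

In hardware this is just a comparator and a counter per pixel clock, which is why it suits flat-colored images but does poorly on noisy photographs.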
 
"Brannon" <brannonking@yahoo.com> wrote in message
news:1135097484.768840.95060@g44g2000cwa.googlegroups.com...
[quoted text snipped]
Is the OP not getting pixels in raster-scan order, though?
That, plus the half-line memory limit, means there's not enough storage
for a 2-D DCT.
 
In comp.arch.fpga Melanie Nasic <quinn_the_esquimo@freenet.de> wrote:
: I want the compression to be lossless and not based on perceptional
: irrelevancy reductions.

If it has to be lossless there's no way you can guarantee to
get 2:1 compression (or indeed any compression at all!). You
may do, with certain kinds of input, but it's all down to the
statistics of the data. The smaller your storage the less
you can benefit from statistical variation across the image,
and 1 Kbyte is very small!

Given that a lossless system is inevitably 'variable bit rate'
(VBR) the concept of "real time capability" is somewhat vague;
the latency is bound to be variable. In real-world applications
the output bit-rate is often constrained so a guaranteed minimum
degree of compression must be achieved; such systems cannot be
(always) lossless.

From my experience I would say you will need at least a 4-line
buffer to get near to 2:1 compression on a wide range of input
material. For a constant-bit-rate (CBR) system based on a 4x4
integer transform see:

http://www.bbc.co.uk/rd/pubs/whp/whp119.shtml

This is designed for ease of hardware implementation rather than
ultimate performance, and is necessarily lossy.

Richard.
http://www.rtrussell.co.uk/
To reply by email change 'news' to my forename.
 
news@rtrussell.co.uk wrote:

[ ... ]

Given that a lossless system is inevitably 'variable bit rate'
(VBR) the concept of "real time capability" is somewhat vague;
the latency is bound to be variable.
The output bit rate will vary, but can be bounded -- for an obvious
example, consider using LZW compression with 8-bit inputs and 12-bit
codes as the output. In the worst case, each 8-bit input produces a
12-bit output, and you have to clear and rebuild the dictionary every
4096 characters.
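The worst-case argument above can be sketched in code. This Python sketch (illustrative only) implements LZW with 8-bit input symbols and 12-bit output codes, clearing the dictionary when it reaches 4096 entries as described; each input byte emits at most one 12-bit code, bounding the expansion at 1.5x.

```python
# Sketch of LZW with 8-bit symbols and 12-bit output codes.
def lzw_compress(data):
    dictionary = {bytes([i]): i for i in range(256)}
    codes, current = [], b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate               # keep extending the phrase
        else:
            codes.append(dictionary[current]) # emit one 12-bit code
            if len(dictionary) < 4096:
                dictionary[candidate] = len(dictionary)
            else:
                # 12-bit code space exhausted: clear and rebuild
                dictionary = {bytes([i]): i for i in range(256)}
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes

data = b"TOBEORNOTTOBEORTOBEORNOT" * 4
codes = lzw_compress(data)
# Each code covers at least one input byte, so output bits <= 1.5x input bits.
assert len(codes) <= len(data)
assert all(c < 4096 for c in codes)
```

The bound holds because every emitted code corresponds to at least one consumed input byte, regardless of the data's statistics.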

Real-time constraints are a more or less separate issue though -- here,
the output data stream isn't (usually) nearly as difficult to deal with
as things like dictionary searching in the compression process. Like
the compression rate, this will vary, but (again) it's usually pretty
easy to set an upper bound. Using the same example, in LZW you build up
a string out of characters, with one dictionary entry for each "next"
character following a current character. There can be no more than 256
next characters (worst case), so the worst case requirement is to
search all 256 entries for follow-on characters in N microseconds (or
nanoseconds, or whatever). You do need more details to
guarantee this, though -- a trie-based dictionary has different
characteristics than a hash-based dictionary (for an obvious example).
In nearly every case it's still pretty easy to place an upper bound on
the complexity and time involved though.

OTOH, you can run into a little bit of a problem with some of the
basics -- if you happen (for example) to be storing your dictionary in
SRAM, the time per access is pretty easy to estimate. If you're storing
it in something like SDRAM, the worst case can be a bit harder to
figure out.

--
Later,
Jerry.
 
Melanie Nasic wrote:
Hello community,

I am thinking about implementing a real-time compression scheme on an FPGA
working at about 500 MHz. Given that there is no "universal
compression" algorithm that can compress data regardless of its structure
and statistics, I assume grayscale image data. The image data is
delivered line-wise, meaning that one horizontal line is processed, then
the next one, and so on.
Because of the high data rate I cannot spend much time on DFT or DCT or on
data modelling. What I am looking for is a way to compress the pixel data
in the spatial rather than the spectral domain, because of latency,
processing complexity, etc. Because of the sequential line-by-line data
transmission, block matching is also not possible in my opinion. The
compression ratio is not so important; a factor of 2:1 would be sufficient.
What really matters is the real-time capability. The algorithm should be
pipelineable and fast. The memory requirements should not exceed 1 KB.
What "standard" compression schemes would you recommend?
JPEG supports lossless encoding that can fit (at least roughly) within
the constraints you've imposed. It uses linear prediction of the
current pixel based on one or more previous pixels. The difference
between the prediction and the actual value is what's then encoded. The
difference is encoded in two parts: the number of bits needed for the
difference and the difference itself. The number of bits is Huffman
encoded, but the remainder is not.

This has a number of advantages. First and foremost, it can be done
based on only the current scan line or (depending on the predictor you
choose) only one scan line plus one pixel. In the latter case, you need
to (minutely) modify the model you've outlined though -- instead of
reading, compressing, and discarding an entire scan line, then starting
the next, you always retain one scan line worth of data. As you process
pixel X of scan line Y, you're storing pixels 0 through X+1 of the
current scan line plus pixels X-1 through N (=line width) of the
previous scan line.

Another nice point is that the math involved is always simple -- the
most complex case is one addition, one subtraction and a one-bit right
shift.

Are there
potentialities for a non-standard "own solution"?
Yes, almost certainly. Lossless JPEG is open to considerable
improvement. Just for an obvious example, it's pretty easy to predict
the current pixel based on five neighboring pixels instead of three. At
least in theory, this should improve prediction accuracy by close to
40% -- thus reducing the number of bits needed to encode the difference
between the predicted and actual values. At a guess, you won't really
see 40% improvement, but you'll still see a little improvement.

In the JPEG 2000 standard, they added JPEG LS, which is certainly an
improvement, but if memory serves, it requires storing roughly two full
scan lines instead of roughly one scan line. OTOH, it would be pretty
easy to steal some of the ideas in JPEG LS without using the parts that
require more storage -- some things like its handling of runs are
mostly a matter of encoding that shouldn't really require much extra
storage.

The final question, however, is whether any of these is likely to give
you 2:1 compression. That'll depend on your input data -- for typical
photographs, I doubt that'll happen most of the time. For things like
line art, faxes, etc., you can probably do quite a bit better than 2:1
on a fairly regular basis. If you're willing to settle for nearly
lossless compression, you can improve ratios a bit further.

--
Later,
Jerry.
 
Melanie Nasic wrote:
Hello community,

[quoted text snipped]
What "standard" compression schemes would you recommend?
Though it's only rarely used, there's a lossless version of JPEG
encoding. It's almost completely different from normal JPEG encoding.
This can be done within your constraints, but would be improved if you
can relax them minutely. Instead of only ever using the current scan
line, you can improve things if you're willing to place the limit at
only ever storing one scan line. The difference is that when you're in
the middle of a scan line (for example) you're storing the second half
of the previous scan line, and the first half of the current scan line,
rather than having half of the buffer sitting empty. If you're storing
the data in normal RAM, this makes little real difference -- the data
from the previous scan line will remain in memory until you overwrite
it, so it's only really a question of whether you use it or ignore it.

Are there potentialities for a non-standard "own solution"?
Yes. In the JPEG 2000 standard, they added JPEG LS, which is another
lossless encoder. A full-blown JPEG LS encoder needs to store roughly
two full scan lines if memory serves, which is outside your
constraints. Nonetheless, if you're not worried about following the
standard, you could create more or less a hybrid between lossless JPEG
and JPEG LS, that would incorporate some advantages of the latter
without the increased storage requirements.

I suspect you could improve the prediction a bit as well. In essence,
you're creating a (rather crude) low-pass filter by averaging a number
of pixels together. That's equivalent to a FIR with all the
coefficients set to one. I haven't put it to the test, but I'd guess
that by turning it into a full-blown FIR with carefully selected
coefficients (and possibly using more of the data you have in the
buffer anyway) you could probably improve the predictions. Better
predictions mean smaller errors, and tighter compression.
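The FIR idea can be demonstrated quickly. In this Python sketch (illustrative; the coefficients are an assumed example, chosen to extrapolate linearly), a two-tap predictor with coefficients [2, -1] beats "repeat the previous pixel" on a smooth gradient:

```python
# Prediction as a small FIR over previous pixels: residual = pixel - sum(c*tap).
def residuals(pixels, coeffs):
    res = []
    for i, p in enumerate(pixels):
        taps = [pixels[i - 1 - k] if i - 1 - k >= 0 else 0
                for k in range(len(coeffs))]
        prediction = sum(c * t for c, t in zip(coeffs, taps))
        res.append(p - prediction)
    return res

ramp = list(range(10, 30, 2))          # smooth gradient: 10, 12, ..., 28
naive = residuals(ramp, [1])           # predictor: previous pixel
linear = residuals(ramp, [2, -1])      # predictor: 2*prev - prev2

# Smaller residuals -> fewer bits after entropy coding.
assert sum(abs(r) for r in linear[2:]) < sum(abs(r) for r in naive[2:])
```

On real images the best coefficients depend on the image statistics, but the trade is the same: a slightly wider FIR buys smaller residuals and tighter compression.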
 
Hi Jerry,

thanks for your response(s). Sounds quite promising. Do you know anything
about hardware implementations of the compression schemes you propose? Are
there already VHDL examples available, or at least C reference models?

Regards, Melanie



"Jerry Coffin" <jerry.coffin@gmail.com> schrieb im Newsbeitrag
news:1135292121.200476.236850@g43g2000cwa.googlegroups.com...
[quoted text snipped]
 
Melanie Nasic wrote:
Hi Jerry,

thanks for your response(s).
Sorry 'bout that -- Google claimed the first one hadn't posted, so I
tried again...

Sounds quite promising. Do you know something
about hardware implementation of the compression schemes you propose? Are
there already VHDL examples available or at least C reference models?
I don't believe I've seen any VHDL code for it. One place that has C
code is:

ftp://ftp.cs.cornell.edu/pub/multimed/ljpg.tar.Z

I suspect Google would turn up a few more implementations as well. If
you like printed information, you might consider _Image and Video
Compression Standards: Algorithms and Architectures; Second Edition_ by
Bhaskaran and Konstantinides. Published by Kluwer Academic Publishers,
ISBN 0-7923-9952-8. It doesn't go into tremendous detail, but it's one
of the few (that I know of) that discusses lossless JPEG at all.

--
Later,
Jerry.
 
Thanks.
Can I program it without using JTAG, possibly over the PCI bus? I am
ready to write an appropriate driver if I know where I can get the
documentation for writing such a driver.

Best regards

"Antti Lukats" <antti@openchip.org> wrote in message
news:dpmq2m$c97$1@online.de...
"ma" <ma@nowhere.com> schrieb im Newsbeitrag
news:LcBvf.86365$PD2.51133@fe1.news.blueyonder.co.uk...
Hello,

I have a Virtex-4 PCI board and I would like to program the PowerPC on
it. I don't have the EDK from Xilinx. Here are my questions:


How can I program the PowerPC without buying EDK?

short answer: you cannot
long answer: you can if you write your own minimal replacement for EDK

As I know the compiler and linker is free (part of GNU) where can I get
them for free?

ppc gcc can be obtained, but it won't help you much; see above

How can I download the compiled program to PowerPC?

over JTAG, or by preloading BRAMs

How can I get the output? For example if I write a hello world type of
program, can I see the STDIO on screen?


use EDK, or add your own peripherals and re-implement all the
functionality provided by EDK

Any help is much appreciated.


doing it without EDK costs you WAY more than obtaining EDK; it could
be done, but it just isn't worth the time needed

sorry, but Xilinx REALLY REALLY doesn't want anyone to work on the Virtex
PPC without using EDK; it is doable (without EDK), but it really isn't
worth trying

Antti
 
Ajeetha wrote:
Frank,
See:

http://www.deepchip.com/posts/0184.html

HTH
Ajeetha
www.noveldv.com
I love this quote: "It's always been my dream to give my customers a
choice between ViewLogic & ViewLogic." That's a nice perch to sit on!

Jerry
--
Engineering is the art of making what you want from things you can get.
 
Ben Jones wrote:

Modelsim's implementation of "real" is basically double-precision
floating-point but shorn of its meta-values.
That is the case. See below.

Paul:

I expect that the statement
result := 1.0 / 0.0;
was never compiled on ModelSim, and if it
got past DC, that is a bug.
Just add a test for zero to your code.

-- Mike Treseler
_____________________________

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;
-- Fri Jan 27 10:25:29 2006 M.Treseler
entity test_real_big is
end test_real_big;

architecture sim of test_real_big is
  -- constant infinity : real := 1.0/0.0;
  -- ** Error: Static divide by zero.
  -- (vcom-1144) Value inf is out of real range -1e+308 to 1e+308.
  -- (Quartus 5.1 says: "Value cannot contain divisor of zero.")
  ------------------------------------------------------------------
  constant real_right     : real   := real'right;
  constant real_right_str : string := real'image(real'right);
begin
  process is
  begin
    report "maximum real value is " & real_right_str;
    assert real_right = 1.0e+308 report "real_right not 1.0e+308";
    wait;
  end process;
end sim;

--------------------------------------------------------------------
--# 6.1c
--# vsim -c test_real_big
--# Loading /flip/usr1/modeltech/linux/../std.standard
--# Loading /flip/usr1/modeltech/linux/../ieee.std_logic_1164(body)
--# Loading /flip/usr1/modeltech/linux/../ieee.numeric_std(body)
--# Loading /flip/usr1/modeltech/linux/../ieee.math_real(body)
--# Loading work.test_real_big(sim)

--VSIM 1> run
--# ** Note: maximum real value is 1.000000e+308
--# Time: 0 ns Iteration: 0 Instance: /test_real_big
 
