Xilinx DLL driving multiple off chip clocks

Ken Morrow

Guest
I have the standard sort of circuit from the Xilinx App note driving an off
chip clock:-

Main clock comes onto chip through an IBUFG to CLKIN of the DLL

CLK0 from the DLL is fed off the chip through an OBUFT.

The output of the OBUFT, which is on a global clock pin, is fed back in via
an IBUFG to form CLKFB of the DLL.

This seems to work fine.

Main clock to output clock delay is constrained to <5 ns and this constraint
is achieved.



Next I wanted to have 4 off-chip clock outputs, timed as close as possible
to the first one.

I buffered the CLK0 from the DLL with a BUFG before the OBUFT to try to
ensure that there was low skew between the 4 off chip clock outputs.

The main clock to external clock delay increased to 10 ns and failed the
constraint.
It seemed that the router had used a mixture of global and other routing to
get the CLK0 to the various OBUFTs,
and that the other routing was slow.

I removed the BUFG and the delay then met my <5 ns constraint without
problems, despite using non-global routing.

I am puzzled. Am I overlooking something?

(Target device is a Virtex II 6000)

Many Thanks,

Ken.

Thinking about it further, even if the delay was 10 ns, the DLL should have
removed it.
I would have expected very little delay from the main clock to the outputs of
the OBUFTs,
whether or not I have the BUFG in the way.
It seems OK without the BUFG, but not with it.
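
One way to reason about what the DLL can and cannot remove is a toy calculation. The sketch below assumes the DLL lines CLKFB up with CLKIN, so the only delay it cancels is the loop it can observe (CLK0 to the fed-back pin, then back in to CLKFB). The function name, pin names, and every delay value are hypothetical, in picoseconds, and not taken from any timing report.

```python
# Toy model of DLL deskew. All names and delays are hypothetical (picoseconds).

def pin_offsets_ps(output_paths_ps, feedback_pin, feedback_return_ps):
    """Return each output pin's clock-edge offset relative to CLKIN at the DLL.

    The DLL shifts CLK0 until CLKFB coincides with CLKIN, so it cancels the
    delay of the loop it observes: CLK0 -> feedback pin -> CLKFB.
    A negative offset means the pin's edge leads CLKIN.
    """
    loop_ps = output_paths_ps[feedback_pin] + feedback_return_ps
    return {pin: path_ps - loop_ps for pin, path_ps in output_paths_ps.items()}


# One output, short route from CLK0 to the pad.
single = pin_offsets_ps({"clk_out0": 3000}, "clk_out0", feedback_return_ps=1000)

# Four outputs on longer, unequal routes; only clk_out0 is fed back.
multi = pin_offsets_ps(
    {"clk_out0": 8000, "clk_out1": 8600, "clk_out2": 9500, "clk_out3": 7400},
    "clk_out0",
    feedback_return_ps=1000,
)

print(single)
print(multi)
```

In this toy model the fed-back pin keeps the same offset however long the forward route is, while the other three pins differ from it by exactly their routing difference.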
 
This may not address your problem, but...

Just a thought: I like using the DDR mechanism to get clocks out of the
FPGA. I've done source-synchronous outputs on V2 up to 200 MHz with great
success. Besides, it's free, since the IOB flip-flops involved would not
otherwise be used.
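
For readers unfamiliar with the trick, here is a small behavioural sketch of DDR clock forwarding, assuming the usual arrangement in which the IOB's DDR output register has its rising-edge data input tied high and its falling-edge data input tied low. It is a Python model rather than HDL, and the function and argument names are made up for illustration.

```python
# Behavioural model of clock forwarding through a DDR output register.
# Names are hypothetical; this is not HDL and not a Xilinx primitive.

def ddr_output(clock_levels, d_rise=1, d_fall=0):
    """Return the pad level for each half-period of the clock.

    clock_levels holds the clock level in successive half-periods.
    The pad takes d_rise after a rising edge and d_fall after a falling edge.
    """
    pad = []
    previous = 0  # assume the clock starts low
    for level in clock_levels:
        if previous == 0 and level == 1:
            current = d_rise          # rising edge: output the "rise" data
        elif previous == 1 and level == 0:
            current = d_fall          # falling edge: output the "fall" data
        else:
            current = pad[-1] if pad else 0  # no edge: hold the last value
        pad.append(current)
        previous = level
    return pad


clock = [1, 0, 1, 0, 1, 0]
forwarded = ddr_output(clock)            # copy of the clock at the pad
inverted = ddr_output(clock, 0, 1)       # swapping the inputs inverts it

print(forwarded)
print(inverted)
```

With the data inputs tied this way the pad reproduces the clock, and the clock itself only has to reach the IOB register's clock input.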


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where
"0_0_0_0_" = "martineu"

Howdy Ken,

I recall discovering the same thing on a design 18 months or so ago,
although I don't remember the difference being 5 ns (between BUFG and
not)! I believe the problem is that even though you are driving the
net with a BUFG, it gets off the global clock net immediately and uses
normal routing to get to non-CLK I/Os.

This is the reason you'll hear people talking about using a DDR scheme
to generate a clock at the IOB. Anything less than that, and you are
subject to an inexact amount of routing delay and skew. The next best
thing to using DDR is using MAXDELAY and MAXSKEW constraints.
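
As a rough illustration of what those two constraints bound, the sketch below assumes MAXDELAY limits the largest delay to any load on a net and MAXSKEW limits the spread between the largest and smallest load delays. The helper name, pin names, and numbers are hypothetical (picoseconds); this is not constraint-file syntax.

```python
# Check per-load delays on one clock net against delay and skew budgets.
# All names and values are hypothetical, in picoseconds.

def check_clock_net(load_delays_ps, max_delay_ps, max_skew_ps):
    """Summarise worst-case delay and skew for the loads on a single net."""
    worst = max(load_delays_ps.values())
    best = min(load_delays_ps.values())
    skew = worst - best  # spread between the slowest and fastest load
    return {
        "worst_delay_ps": worst,
        "skew_ps": skew,
        "delay_ok": worst <= max_delay_ps,
        "skew_ok": skew <= max_skew_ps,
    }


# Four clock output buffers reached over general routing (made-up delays).
report = check_clock_net(
    {"clk_out0": 900, "clk_out1": 1500, "clk_out2": 2400, "clk_out3": 300},
    max_delay_ps=2000,
    max_skew_ps=500,
)

print(report)
```

Here both checks fail, which is the situation the constraints are meant to flag so that the routes can be tightened.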

See http://direct.xilinx.com/xcell/xl32/xl32_53.pdf and the other app
notes that this one points to.

Good luck,

Marc
 
