EDAboard.com | EDAboard.de | EDAboard.co.uk | WTWH Media

Need magic incantation to prevent synthesizer misoptimisatio

Ask a question - edaboard.com

elektroda.net NewsGroups Forum Index - FPGA - Need magic incantation to prevent synthesizer misoptimisatio

Aleksandar Kuktin
Guest

Sun Sep 23, 2018 2:45 am   



Hi all!

I'm having a problem with the synthesis and P&R tools introducing a
unnecessary gate in a critical path. Consider the following verilog:

reg [31:0] mem_dataintomem = 32'd0;

always @(posedge CLK)
begin
if (mcu_active && (w_we_recv || w_tlb_recv))
mem_dataintomem <= w_data_recv;
if (mem_ack && !mcu_active)
mem_dataintomem <= 0;
end

Practically, the synthesis correctly implements mem_dataintomem as a
register whose data input is ungated w_data_recv, and whose clock enable
and synchronous reset control the contents of the register.

In this setup, for speed considerations, the best thing to do is create a
single gate signal to the clock enable input of a register, and another
separate single gate signal to drive the synchronous reset of the same
register.

But instead, the tools combine the signals, so that there is one single-
gate signal that feeds the synchronous reset, but is also piped into the
other signal (which therefore becomes a two-gate signal). Sort of like
this:

assign ctrl_syn_reset = mem_ack && !mcu_active;
assign ctrl_clock_enable = mcu_active &&
(w_we_recv || w_tlb_recv) &&
!ctrl_syn_reset; // this is superfluous
register_module mem_dataintomem_somebit(
.CLOCK(CLK),
.DATAIN(w_data_recv),
.CLOCK_ENABLE(ctrl_clock_enable),
.RESET(ctrl_syn_reset), // ignore the drive-high-to-reset quirk
[...]
);

My problem is that in this setup ctrl_clock_enable becomes too slow on
account of the extra gate and timing analysis fails.

What I would like is help figuring out some incantation that would make
the machine see the light and not chain the signals. OTOH, is there a
real need for this? It seems obvious to me that only one of CLOCK_ENABLE
or RESET will ever be asserted, and that you don't have to mix them.

I know for a fact I can sidestep the problem if I remove the clock enable
and instead gate the input to the register, but that is *very*
problematic because that area of the die is already tightly packed and it
can't support the extra logic (that is, all the wires are already taken).

The chip used is Lattice iCE40 HX8K, and the software used is iCEcube2
version 2017.01.27914 . Synthesis is performed by the bundled Synplify
Pro L-2016.09.L+ice40, build 077R.

The code is part of a FOSH project and can be inspected on
https://github.com/akuktin/special_snowflake . The code of interest for
this question is in file cache/cpu_mcu2.v, lines approximately 303 to
333. The code at the start of the post (without else) is one of the
experiments I tried, and failed with.

Gabor
Guest

Sun Sep 23, 2018 2:45 pm   



On Saturday, 9/22/2018 9:03 PM, Aleksandar Kuktin wrote:
Quote:
Hi all!

I'm having a problem with the synthesis and P&R tools introducing a
unnecessary gate in a critical path. Consider the following verilog:

reg [31:0] mem_dataintomem = 32'd0;

always @(posedge CLK)
begin
if (mcu_active && (w_we_recv || w_tlb_recv))
mem_dataintomem <= w_data_recv;
if (mem_ack && !mcu_active)
mem_dataintomem <= 0;
end

Practically, the synthesis correctly implements mem_dataintomem as a
register whose data input is ungated w_data_recv, and whose clock enable
and synchronous reset control the contents of the register.

In this setup, for speed considerations, the best thing to do is create a
single gate signal to the clock enable input of a register, and another
separate single gate signal to drive the synchronous reset of the same
register.

But instead, the tools combine the signals, so that there is one single-
gate signal that feeds the synchronous reset, but is also piped into the
other signal (which therefore becomes a two-gate signal). Sort of like
this:

assign ctrl_syn_reset = mem_ack && !mcu_active;
assign ctrl_clock_enable = mcu_active &&
(w_we_recv || w_tlb_recv) &&
!ctrl_syn_reset; // this is superfluous
register_module mem_dataintomem_somebit(
.CLOCK(CLK),
.DATAIN(w_data_recv),
.CLOCK_ENABLE(ctrl_clock_enable),
.RESET(ctrl_syn_reset), // ignore the drive-high-to-reset quirk
[...]
);

My problem is that in this setup ctrl_clock_enable becomes too slow on
account of the extra gate and timing analysis fails.

What I would like is help figuring out some incantation that would make
the machine see the light and not chain the signals. OTOH, is there a
real need for this? It seems obvious to me that only one of CLOCK_ENABLE
or RESET will ever be asserted, and that you don't have to mix them.

I know for a fact I can sidestep the problem if I remove the clock enable
and instead gate the input to the register, but that is *very*
problematic because that area of the die is already tightly packed and it
can't support the extra logic (that is, all the wires are already taken).

The chip used is Lattice iCE40 HX8K, and the software used is iCEcube2
version 2017.01.27914 . Synthesis is performed by the bundled Synplify
Pro L-2016.09.L+ice40, build 077R.

The code is part of a FOSH project and can be inspected on
https://github.com/akuktin/special_snowflake . The code of interest for
this question is in file cache/cpu_mcu2.v, lines approximately 303 to
333. The code at the start of the post (without else) is one of the
experiments I tried, and failed with.


If the synthesizer is using a simple template approach to generating
the reset and clock control signal, it would probably be best to code
your logic the way the template expects it, typically:

always @ (posedge clk)
if (reset condition)
reset actions
else if (enable condition)
enable actions

Logically this is the same as what you've written, but the template
fits the expectations of the synthesizer better.


--
Gabor

Aleksandar Kuktin
Guest

Tue Oct 09, 2018 6:45 pm   



On Sun, 23 Sep 2018 01:03:30 +0000, Aleksandar Kuktin wrote:

> The chip used is Lattice iCE40 HX8K, [...]

Okay, I figured it out. The flip-flop is behaving differently than my
expectations.

Specifically, I had assumed the synchronous reset is fully independent
and will have its effect if it is independently asserted during the clock
edge. In actual fact, iCE40's synchronous reset only takes effect if the
clock_enable is also asserted. So, to reset the flip-flop, you have to
assert BOTH reset and clock_enable.

Therefore, the extra wire.

elektroda.net NewsGroups Forum Index - FPGA - Need magic incantation to prevent synthesizer misoptimisatio

Ask a question - edaboard.com

Arabic version Bulgarian version Catalan version Czech version Danish version German version Greek version English version Spanish version Finnish version French version Hindi version Croatian version Indonesian version Italian version Hebrew version Japanese version Korean version Lithuanian version Latvian version Dutch version Norwegian version Polish version Portuguese version Romanian version Russian version Slovak version Slovenian version Serbian version Swedish version Tagalog version Ukrainian version Vietnamese version Chinese version Turkish version
EDAboard.com map