How to generate a unique ID for each FPGA using Ring Oscillator

Guest
Hi,

I am designing a Physically Uncolonable Function using ring oscillators (ROs) on an FPGA (Spartan-6). However, it does not generate a unique ID for each chip; it generates exactly the same value on different chips. Different ROs should run at different frequencies due to die imperfections. I compare those frequencies to generate bits, which should be unique for each chip, since each chip has different physical die imperfections.

First, I generate 16 ROs, each with 51 inverter gates. All of them are connected to two multiplexers, whose select inputs are driven by an 8-bit DIP switch. The multiplexer outputs are connected to two 16-bit counters, and the counter outputs are connected to a comparator. If the first counter reaches its end value (111...11) first (that is, it is faster than the second counter), the comparator outputs '1'; if the second counter is faster than the first, the comparator outputs '0'. This generates only one bit. I replicated this design 16 times to generate a 16-bit response. The first bits of this response (3 downto 0) are connected to 4 LEDs. I tested the design with 3 different chips, and they give exactly the same response, which is not desired.
How can I change the design so that different chips give different responses?

Thanks.


ring_oscillator.vhd:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;

-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity ring_oscilator is
    generic (delay     : time    := 200 ps;
             chain_len : integer := 16);
    port (rst_i : in  std_logic;
          ro_o  : out std_logic);
end ring_oscilator;

architecture Behavioral of ring_oscilator is
    signal chain : std_logic_vector(chain_len downto 0);
    attribute keep : boolean;
    attribute keep of chain : signal is true;
begin

    --assert chain_len mod 2 = 1 report "Length of ring must be an odd number!" severity failure;

    -- chain_len inverters in series ("after delay" only affects simulation)
    gen_chain:
    for i in 1 to chain_len generate
        chain(i) <= not chain(i-1) after delay;
    end generate;

    -- the NOR closes the loop and holds chain(0) low while rst_i = '1'
    chain(0) <= chain(chain_len) nor rst_i after delay;

    ro_o <= chain(chain_len);

end Behavioral;




debounce.vhd


library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity debouncer is
    generic(
        counter_size : INTEGER := 19); --counter size (19 bits gives 10.5ms with 50MHz clock)
    port(
        clk_i    : in  std_logic;  --input clock
        button_i : in  std_logic;  --input signal to be debounced
        enable_i : in  std_logic;  -- for simulation
        result_o : out std_logic); --debounced signal
end debouncer;

ARCHITECTURE logic of debouncer is
    signal flipflops   : std_logic_vector(1 downto 0); --input flip flops
    signal counter_set : std_logic;                    --sync reset to zero
    signal counter_out : std_logic_vector(counter_size downto 0) := (others => '0'); --counter output
begin

    counter_set <= flipflops(0) xor flipflops(1); --determine when to start/reset counter

    process(clk_i)
    begin
        if rising_edge(clk_i) then
            flipflops(0) <= button_i;
            flipflops(1) <= flipflops(0);

            if enable_i = '1' then
                if(counter_set = '1') then --reset counter because input is changing
                    counter_out <= (others => '0');
                elsif(counter_out(counter_size) = '0') then --stable input time is not yet met
                    counter_out <= counter_out + 1;
                else --stable input time is met
                    result_o <= flipflops(1);
                end if;
            else
                result_o <= button_i;
            end if;

        end if;
    end process;
end logic;






top.vhd






library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.std_logic_misc.ALL;
use IEEE.std_logic_unsigned.ALL;
use ieee.numeric_std.all;

entity top is
    Generic (
        nr_ro     : natural := 16;
        puf_width : natural := 16
    );
    Port (
        shift_i : in  std_logic;                    --debug
        dout_o  : out std_logic_vector(3 downto 0); --debug
        sel_i   : in  std_logic_vector(7 downto 0); --debug
        clk_i   : in  STD_LOGIC;
        rst_i   : in  STD_LOGIC--;
        -- puf_out : out STD_LOGIC_VECTOR (puf_width-1 downto 0)
    );
end top;

architecture Behavioral of top is

    constant c_width        : natural := puf_width;
    constant c_number_of_ro : natural := nr_ro;

    -----------------------------DEBUG-------------------------------------
    signal s_sel       : std_logic_vector(7 downto 0) := (others => '0');
    signal s_dout      : std_logic_vector(3 downto 0) := (others => '0');
    signal s_shift     : std_logic := '0';
    signal s_shift_pre : std_logic := '0';
    signal s_pulse     : std_logic := '0';
    signal s_msb       : integer range 0 to 16 := 4;
    -----------------------------------------------------------------------

    signal s_reset    : std_logic := '0';
    signal s_finish   : std_logic_vector (c_width-1 downto 0) := (others => '0');
    signal s_finished : std_logic := '0';
    signal s_puf_out  : std_logic_vector (c_width-1 downto 0) := (others => '0');

    --------------components----------------------------
    component puf_bit
        generic (
            nr_ro : natural := c_number_of_ro
        );
        port (
            clk_i     : in  std_logic;
            rst_i     : in  std_logic;
            sel1_i    : in  unsigned(3 downto 0);
            sel2_i    : in  unsigned(3 downto 0);
            finish_o  : out std_logic;
            puf_bit_o : out std_logic
        );
    end component;

    component debouncer
        generic(counter_size : integer := 19); --counter size (19 bits gives 10.5ms with 50MHz clock)
        port(
            clk_i    : in  std_logic;  --input clock
            button_i : in  std_logic;  --input signal to be debounced
            enable_i : in  std_logic;
            result_o : out std_logic); --debounced signal
    end component;

begin

    Generate_PUF:
    for i in 0 to c_width-1 generate
        Multiple_Puf_Bits: puf_bit
            generic map (nr_ro => c_number_of_ro)
            port map (
                clk_i     => clk_i,
                rst_i     => s_reset,
                sel1_i    => unsigned(s_sel(3 downto 0)),
                sel2_i    => unsigned(s_sel(7 downto 4)),
                finish_o  => s_finish(i),
                puf_bit_o => s_puf_out(i)
            );
    end generate;

    reset_debounce: debouncer
        generic map (counter_size => 19)
        port map (clk_i => clk_i, button_i => not rst_i, enable_i => '1', result_o => s_reset); --On FPGA board btn is low active

    btn_debounce: debouncer
        generic map (counter_size => 19)
        port map (clk_i => clk_i, button_i => not shift_i, enable_i => '1', result_o => s_shift); --On FPGA board btn is low active

    s_finished <= AND_REDUCE(s_finish);
    ------------ debug--------------------------------
    s_sel  <= sel_i;
    dout_o <= s_puf_out(7 downto 4);

    ----------------------------------------------


end Behavioral;
 
On 6/30/2015 11:26 PM, mubinicyer@gmail.com wrote:
> Physically Uncolonable Function

Please describe what that means.

JJS
 
John Speth wrote:
> On 6/30/2015 11:26 PM, mubinicyer@gmail.com wrote:
>> Physically Uncolonable Function
>
> Please describe what that means.
>
> JJS

Probably means that it is impervious to colonoscopy ;-)

--
Gabor
 
mubinicyer@gmail.com wrote:
> Hi,
>
> I am designing a Physically Uncolonable Function using ring oscillators (ROs) on an FPGA (Spartan-6). However, it does not generate a unique ID for each chip; it generates exactly the same value on different chips. Different ROs should run at different frequencies due to die imperfections. I compare those frequencies to generate bits, which should be unique for each chip, since each chip has different physical die imperfections.
>
> First, I generate 16 ROs, each with 51 inverter gates. All of them are connected to two multiplexers, whose select inputs are driven by an 8-bit DIP switch. The multiplexer outputs are connected to two 16-bit counters, and the counter outputs are connected to a comparator. If the first counter reaches its end value (111...11) first (that is, it is faster than the second counter), the comparator outputs '1'; if the second counter is faster than the first, the comparator outputs '0'. This generates only one bit. I replicated this design 16 times to generate a 16-bit response. The first bits of this response (3 downto 0) are connected to 4 LEDs. I tested the design with 3 different chips, and they give exactly the same response, which is not desired.
> How can I change the design so that different chips give different responses?
>
> Thanks.

I think you are working on a faulty premise. While the absolute
frequency of a ring oscillator will differ from device to device, the
relative frequency of any two oscillators in the same design will
probably be much the same on every device. A lot of this has to do
with routing delays, which cannot easily be matched from one section
of the device to another. So even static timing analysis will probably
tell you right off the bat that oscillator "A" has more propagation
delay in its ring than oscillator "B", and so on. It's not just the
LUT count.

If you are trying to determine a unique device by measuring some
quality of each of 16 distinct regions in the device, you need a
more reliable way to make this measurement. If you want to use
ring oscillators, I would start by using a fairly long counter
to determine the relative speed, and instead of just seeing which
one reaches terminal count first, latch the actual value of the
other counter when that terminal count is reached. Then look
at these relative numbers on a sampling of die to see if there is
enough information to find a useful threshold to tell the devices
apart. Even then you'd need to look at these values over a range
of temperature and voltage conditions to make sure you can
"fingerprint" the device reliably using just the process differences.

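A minimal sketch of the measurement suggested above, not the original poster's puf_bit (which is not shown in the thread). The entity name ro_measure, its ports, and the 24-bit default width are all assumptions made for illustration. Counter A runs on one ring oscillator until it reaches all ones; counter B runs on the other oscillator and is frozen at that point, so its value can be read out instead of a single win/lose bit. Counter B is one bit wider so that it does not wrap as long as oscillator B is less than twice as fast as oscillator A.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical measurement block (names and widths are assumptions).
entity ro_measure is
  generic (width : natural := 24);  -- length of counter A; value is a guess
  port (
    rst_i   : in  std_logic;        -- asynchronous reset, active high
    ro_a_i  : in  std_logic;        -- reference ring oscillator
    ro_b_i  : in  std_logic;        -- ring oscillator being measured
    done_o  : out std_logic;        -- '1' once counter B has been frozen
    count_o : out std_logic_vector(width downto 0)  -- frozen counter B value
  );
end ro_measure;

architecture rtl of ro_measure is
  constant all_ones : unsigned(width-1 downto 0) := (others => '1');
  signal cnt_a  : unsigned(width-1 downto 0) := (others => '0');
  signal cnt_b  : unsigned(width downto 0)   := (others => '0');
  signal done_a : std_logic := '0';
  signal sync_b : std_logic_vector(1 downto 0) := "00";
begin

  -- Counter A: counts its own oscillator and stops at terminal count.
  process (ro_a_i, rst_i)
  begin
    if rst_i = '1' then
      cnt_a  <= (others => '0');
      done_a <= '0';
    elsif rising_edge(ro_a_i) then
      if cnt_a = all_ones then
        done_a <= '1';
      else
        cnt_a <= cnt_a + 1;
      end if;
    end if;
  end process;

  -- Counter B: counts its own oscillator until done_a, seen through a
  -- two-flop synchroniser in the ro_b_i domain, goes high.
  process (ro_b_i, rst_i)
  begin
    if rst_i = '1' then
      cnt_b  <= (others => '0');
      sync_b <= "00";
    elsif rising_edge(ro_b_i) then
      sync_b <= sync_b(0) & done_a;
      if sync_b(1) = '0' then
        cnt_b <= cnt_b + 1;
      end if;
    end if;
  end process;

  done_o  <= sync_b(1);
  count_o <= std_logic_vector(cnt_b);

end rtl;
```

Reading count_o for the same oscillator pair on several boards shows how far apart the counts really are, which is the information needed to pick a threshold. The synchroniser adds a couple of ro_b_i cycles of latency, which is small compared with a count near 2**width.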
--
Gabor
 
wrote in message
news:8fd7b0b6-582b-44c5-84be-7ce09e7b59c3@googlegroups.com...

> Hi,
>
> I am designing a Physically Uncolonable Function using Ring Oscillator...

Why not just read the random contents of some internal RAM as part of the
boot sequence? IME it will contain random but device-specific contents, not
dependent on temperature, just on manufacturing differences from die to die.

Andy
 
On 7/2/2015 7:09 AM, Andy Bennett wrote:
> wrote in message
> news:8fd7b0b6-582b-44c5-84be-7ce09e7b59c3@googlegroups.com...
>
>> Hi,
>>
>> I am designing a Physically Uncolonable Function using Ring Oscillator...
>
> Why not just read the random contents of some internal RAM as part of
> the boot sequence? IME it will contain random but device-specific
> contents, not dependent on temperature, just on manufacturing differences
> from die to die.

Why would you expect the power up contents of RAM to depend any more on
manufacturing differences than the ring oscillator frequency?

--

Rick
 
rickman wrote:
> On 7/2/2015 7:09 AM, Andy Bennett wrote:
>> wrote in message
>> news:8fd7b0b6-582b-44c5-84be-7ce09e7b59c3@googlegroups.com...
>>
>>> Hi,
>>>
>>> I am designing a Physically Uncolonable Function using Ring
>>> Oscillator...
>>
>> Why not just read the random contents of some internal RAM as part of
>> the boot sequence? IME it will contain random but device-specific
>> contents, not dependent on temperature, just on manufacturing differences
>> from die to die.
>
> Why would you expect the power up contents of RAM to depend any more on
> manufacturing differences than the ring oscillator frequency?

A) In Xilinx parts, that is very hard to do because the bitstream
will load the BRAM contents along with every other storage element
in the part. You would need to have special tools and some intimate
part knowledge to selectively load the part at configuration.

B) Whether or not it depends more or less on manufacturing differences
than a ring oscillator, his original method was flawed in that it
presumed that the manufacturing differences were the only thing
affecting the relative oscillator frequencies. In fact, place and
route differences would typically swamp any effect of manufacturing
process unless the design were carefully hand routed and replicated.

--
Gabor
 
On 7/2/2015 11:01 AM, GaborSzakacs wrote:
> rickman wrote:
>> On 7/2/2015 7:09 AM, Andy Bennett wrote:
>>> wrote in message
>>> news:8fd7b0b6-582b-44c5-84be-7ce09e7b59c3@googlegroups.com...
>>>
>>>> Hi,
>>>>
>>>> I am designing a Physically Uncolonable Function using Ring
>>>> Oscillator...
>>>
>>> Why not just read the random contents of some internal RAM as part of
>>> the boot sequence? IME it will contain random but device-specific
>>> contents, not dependent on temperature, just on manufacturing differences
>>> from die to die.
>>
>> Why would you expect the power up contents of RAM to depend any more
>> on manufacturing differences than the ring oscillator frequency?
>
> A) In Xilinx parts, that is very hard to do because the bitstream
> will load the BRAM contents along with every other storage element
> in the part. You would need to have special tools and some intimate
> part knowledge to selectively load the part at configuration.

I am not sure that is correct. I don't have all the details of every
FPGA family memorized, but I do recall reading that someone, somewhere
allows block RAM contents to be preserved through configurations as an
option in the bit stream. I am pretty sure that was a Xilinx part I
read that about. So it is not so much "intimate knowledge" as it is
reading the fine manual.


> B) Whether or not it depends more or less on manufacturing differences
> than a ring oscillator, his original method was flawed in that it
> presumed that the manufacturing differences were the only thing
> affecting the relative oscillator frequencies. In fact, place and
> route differences would typically swamp any effect of manufacturing
> process unless the design were carefully hand routed and replicated.

I am not debating that. I am asking why anyone would expect the power
up contents of RAM to be useful in distinguishing individual parts. The
ring oscillator frequency at least has a chance of working if the design
method is adapted to deal with the designed-in differences.

--

Rick
 
On 7/2/2015 11:01 AM, GaborSzakacs wrote:
rickman wrote:

> I am not debating that. I am asking why anyone would expect the power up
> contents of RAM to be useful in distinguishing individual parts. The ring
> oscillator frequency at least has a chance of working if the design method
> is adapted to deal with the designed-in differences.

I don't know about Xilinx parts, but certainly Altera Cyclone and Stratix
parts have the option of configuring RAM at power up with a don't care
option. If you also configure the RAM so that write is permanently held
low, you can treat it as unconfigured ROM - it will not be loaded from
flash at power up and will contain random values due to manufacturing
tolerances. As I said before, IME the values are consistent from power up to
power up.

Andy.
 
On 7/2/2015 11:22 AM, Andy Bennett wrote:
> On 7/2/2015 11:01 AM, GaborSzakacs wrote:
> rickman wrote:
>
>> I am not debating that. I am asking why anyone would expect the power
>> up contents of RAM to be useful in distinguishing individual parts.
>> The ring oscillator frequency at least has a chance of working if the
>> design method is adapted to deal with the designed-in differences.
>
> I don't know about Xilinx parts, but certainly Altera Cyclone and
> Stratix parts have the option of configuring RAM at power up with a
> don't care option. If you also configure the RAM so that write is
> permanently held low, you can treat it as unconfigured ROM - it will
> not be loaded from flash at power up and will contain random values due
> to manufacturing tolerances. As I said before, IME the values are
> consistent from power up to power up.

But have you tested that the contents depend on "manufacturing
tolerances"? I can see where the contents of the RAM would also be more
dependent on design differences than manufacturing tolerances.

--

Rick
 
"rickman" wrote in message news:mn3l91$pje$1@dont-email.me...

> On 7/2/2015 11:22 AM, Andy Bennett wrote:
>> On 7/2/2015 11:01 AM, GaborSzakacs wrote:
>> rickman wrote:
>
> But have you tested that the contents depend on "manufacturing tolerances"?
> I can see where the contents of the RAM would also be more dependent on
> design differences than manufacturing tolerances.

Not sure what you mean by design differences - the RAM blocks are defined
blocks on the silicon, not synthesised from the logic array.
I have only tested the same FPGA design (mine) on the same FPGA part type
across a number of different parts and got random RAM contents between
parts, but the same contents for each part over multiple power ups and
temperatures.
I have not investigated further.

Andy
 
On 7/2/2015 12:34 PM, Andy Bennett wrote:
> "rickman" wrote in message news:mn3l91$pje$1@dont-email.me...
>
>> On 7/2/2015 11:22 AM, Andy Bennett wrote:
>>> On 7/2/2015 11:01 AM, GaborSzakacs wrote:
>>> rickman wrote:
>>
>> But have you tested that the contents depend on "manufacturing
>> tolerances"? I can see where the contents of the RAM would also be
>> more dependent on design differences than manufacturing tolerances.
>
> Not sure what you mean by design differences - the RAM blocks are
> defined blocks on the silicon, not synthesised from the logic array.
> I have only tested the same FPGA design (mine) on the same FPGA part
> type across a number of different parts and got random RAM contents
> between parts, but the same contents for each part over multiple power
> ups and temperatures.
> I have not investigated further.

If what you have seen is accurate across the various product lines and
production batches, then a checksum or CRC on a block RAM should serve
as a useful fingerprint. I would be very concerned about whether this
fingerprint would be 100% repeatable, though. If it depends on
manufacturing tolerances I would expect there to be a finite possibility
of one or more bits being on the hairy edge, with some small amount of
noise determining the resting state rather than the device specifics.

I guess it might be better to just record the entire block content and
allow for some small number of bits changing from read to read.
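
A minimal sketch of the tolerant comparison suggested above: instead of a checksum, count how many bits differ between a fresh reading and a stored reference, and accept the part if that count is small. The package name fingerprint_pkg, both function names, and the max_flips parameter are hypothetical; a suitable threshold would have to come from measurements.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package for comparing power-up RAM fingerprints.
package fingerprint_pkg is
  -- Number of bit positions in which a and b differ (equal lengths expected).
  function hamming_distance(a, b : std_logic_vector) return natural;
  -- True when reading is within max_flips bits of the enrolled value.
  function fingerprint_match(reading, enrolled : std_logic_vector;
                             max_flips         : natural) return boolean;
end fingerprint_pkg;

package body fingerprint_pkg is

  function hamming_distance(a, b : std_logic_vector) return natural is
    variable diff  : std_logic_vector(a'length - 1 downto 0);
    variable count : natural := 0;
  begin
    assert a'length = b'length
      report "hamming_distance: operands differ in length"
      severity failure;
    diff := a xor b;  -- '1' wherever the two vectors disagree
    for i in diff'range loop
      if diff(i) = '1' then
        count := count + 1;
      end if;
    end loop;
    return count;
  end hamming_distance;

  function fingerprint_match(reading, enrolled : std_logic_vector;
                             max_flips         : natural) return boolean is
  begin
    return hamming_distance(reading, enrolled) <= max_flips;
  end fingerprint_match;

end fingerprint_pkg;
```

For example, fingerprint_match(reading, enrolled, 8) would accept a reading with up to 8 flipped bits. The same comparison could equally be done in software on a host that reads the block contents out.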

--

Rick
 
rickman wrote:
> On 7/2/2015 12:34 PM, Andy Bennett wrote:
>> "rickman" wrote in message news:mn3l91$pje$1@dont-email.me...
>>
>>> On 7/2/2015 11:22 AM, Andy Bennett wrote:
>>>> On 7/2/2015 11:01 AM, GaborSzakacs wrote:
>>>> rickman wrote:
>>>
>>> But have you tested that the contents depend on "manufacturing
>>> tolerances"? I can see where the contents of the RAM would also be
>>> more dependent on design differences than manufacturing tolerances.
>>
>> Not sure what you mean by design differences - the RAM blocks are
>> defined blocks on the silicon, not synthesised from the logic array.
>> I have only tested the same FPGA design (mine) on the same FPGA part
>> type across a number of different parts and got random RAM contents
>> between parts, but the same contents for each part over multiple power
>> ups and temperatures.
>> I have not investigated further.
>
> If what you have seen is accurate across the various product lines and
> production batches, then a checksum or CRC on a block RAM should serve
> as a useful fingerprint. I would be very concerned about whether this
> fingerprint would be 100% repeatable, though. If it depends on
> manufacturing tolerances I would expect there to be a finite possibility
> of one or more bits being on the hairy edge, with some small amount of
> noise determining the resting state rather than the device specifics.
>
> I guess it might be better to just record the entire block content and
> allow for some small number of bits changing from read to read.

On the other hand, I have to wonder why you need this functionality
on an older Spartan 3 part. All the newer Xilinx series including
Spartan 3A DSP have a serial number ("Device DNA") built in to provide
a reliable mechanism to lock a design to a particular part.

--
Gabor
 
On Thu, 02 Jul 2015 16:25:53 -0400, GaborSzakacs wrote:

> On the other hand, I have to wonder why you need this functionality on
> an older Spartan 3 part. All the newer Xilinx series including Spartan
> 3A DSP have a serial number ("Device DNA") built in to provide a
> reliable mechanism to lock a design to a particular part.

From Virtex 7 onwards, Xilinx Device DNA is not guaranteed to be unique -
up to 32 devices can have the same serial number (Reference: UG470).
Presumably this came from a desire to improve yield and reduce costs,
rather than from a desire to make it completely useless to designers.

I believe the older devices (up to and including V6) are still safe.

Regards,
Allan
 
On Sat, 04 Jul 2015 12:47:07 +0000, Allan Herriman wrote:

> On Thu, 02 Jul 2015 16:25:53 -0400, GaborSzakacs wrote:
>
>> On the other hand, I have to wonder why you need this functionality on
>> an older Spartan 3 part. All the newer Xilinx series including Spartan
>> 3A DSP have a serial number ("Device DNA") built in to provide a
>> reliable mechanism to lock a design to a particular part.
>
> From Virtex 7 onwards, Xilinx Device DNA is not guaranteed to be unique -
> up to 32 devices can have the same serial number (Reference: UG470).
> Presumably this came from a desire to improve yield and reduce costs,
> rather than from a desire to make it completely useless to designers.
>
> I believe the older devices (up to and including V6) are still safe.

Correction (after I re-read some documentation):


The Spartan 3A devices have a 57 bit number, a 55 bit slice of which is
guaranteed to be unique. (Reference: UG332)


The Xilinx V6 devices have a 64 bit number that is *not* guaranteed to be
unique, and can be read out over JTAG. A 57 bit slice of the 64 bit
number (also not guaranteed to be unique) can be read out inside the
FPGA. The remaining 7 bits are described as "reserved". (Reference:
UG360)


The Xilinx V7 devices have a 64 bit number that is unique, and can be
read out over JTAG. A 57 bit slice of the 64 bit number can be read out
inside the FPGA. That 57 bit slice is *not* guaranteed to be unique.
(Reference: UG470)


Xilinx Ultrascale (and presumably Ultrascale Plus) devices have a longer
(96 bit) sequence that is unique, and all 96 bits can be read out inside
the FPGA. (Reference: UG570)


BTW, they use efuse technology, and are programmed at the factory. The
Spartan 3A ones are guaranteed for 10 years or 30 million read cycles.
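
A minimal sketch of reading Device DNA from inside the fabric with the Xilinx DNA_PORT primitive, assuming a part that provides the 57 bit version discussed above. The entity dna_reader, its ports, and the state machine are hypothetical. The bit order of dna_o relative to what the vendor tools report has not been checked here, and the clock fed to DNA_PORT must respect the limit in the device data sheet. UltraScale parts use a different primitive, which this sketch does not cover.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

-- Hypothetical wrapper: pulses READ once, then shifts out 57 bits.
entity dna_reader is
  port (
    clk_i   : in  std_logic;  -- slow clock (see data sheet limit for DNA_PORT)
    start_i : in  std_logic;  -- '1' starts a single readout
    dna_o   : out std_logic_vector(56 downto 0);
    valid_o : out std_logic   -- '1' once all 57 bits have been captured
  );
end dna_reader;

architecture rtl of dna_reader is
  type state_t is (st_idle, st_read, st_shift, st_done);
  signal state     : state_t := st_idle;
  signal dna_read  : std_logic;
  signal dna_shift : std_logic;
  signal dna_dout  : std_logic;
  signal sreg      : std_logic_vector(56 downto 0) := (others => '0');
  signal bits_left : natural range 0 to 57 := 0;
begin

  dna_inst : DNA_PORT
    port map (
      DOUT  => dna_dout,
      CLK   => clk_i,
      DIN   => '0',
      READ  => dna_read,
      SHIFT => dna_shift
    );

  dna_read  <= '1' when state = st_read  else '0';
  dna_shift <= '1' when state = st_shift else '0';

  process (clk_i)
  begin
    if rising_edge(clk_i) then
      case state is
        when st_idle =>
          if start_i = '1' then
            state <= st_read;
          end if;
        when st_read =>
          -- READ is high for this one cycle; the first bit is on DOUT next.
          bits_left <= 57;
          state     <= st_shift;
        when st_shift =>
          -- Capture the bit currently on DOUT; the primitive shifts on the
          -- same clock edge, so the first captured bit ends up in sreg(56).
          sreg <= sreg(55 downto 0) & dna_dout;
          if bits_left = 1 then
            state <= st_done;
          else
            bits_left <= bits_left - 1;
          end if;
        when st_done =>
          null;  -- hold the captured value
      end case;
    end if;
  end process;

  dna_o   <= sreg;
  valid_o <= '1' when state = st_done else '0';

end rtl;
```

Given the uniqueness caveats listed above, the value read this way is better treated as one input to a device lock than as a guaranteed unique serial number on every family.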

Regards,
Allan
 
