Math Talk

This page concludes the second part of the inquiry Math Talk. It shows an implementation of the SPI protocol on a Field Programmable Gate Array (FPGA).

Byte Exchange with an FPGA as Slave

Implementing the SPI Slave on an FPGA is like old school digital electronics. My key takeaway is to think hardware, not programming. Implementing the SPI protocol on an FPGA is fairly straightforward as long as we use a directly clocked sequential circuit and avoid clock domain crossings.

Sequential circuit

In real life, two signals going to a single gate will not arrive there at the same time due to wire delays. This causes the output to momentarily have an incorrect value. The problem compounds as the signal travels through more gates and wires.

In Building Math Hardware we created elementary math operations using combinatorial circuits. That was OK, because we didn’t care about the output glitches caused by the input signals propagating to the outputs. From a demonstrator’s point of view it even made things more interesting. Talking to a real device, such as an SPI master, is different because it requires the outputs to be stable at certain times.

Figure: D flip-flop

The solution is to introduce a clock signal, and store the signals in flip-flops (registers) at the rising edge of that clock signal. We then only need to ensure that the longest delay from one flip-flop to the next is less than the clock period. This greatly simplifies the design process, at the cost of introducing some delay.

The Verilog description shown below is an example of a synchronous design. It clocks the signal in into the register out at the rising edge of the clock signal sysClk.

wire in;
reg  out;

// Register the input on every rising edge of the system clock.
always @(posedge sysClk)
  out <= in;

Clock domain

Field programmable gate arrays thrive on synchronous designs, but they don’t do well with clock signals that are asynchronous to their system clock. In particular, constructs such as @(posedge SCLK) give the synthesizer the impression that SCLK is a clock signal and cause it to reserve special low-skew clock buffers. This can cause the fitter to run out of such buffers and resort to general routing for the real system clock signal.
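As a rough sketch of the difference, the fragment below contrasts the construct to avoid with a clock-enable style alternative. The module name and its ports are assumptions made for illustration, in particular sclkRising, which is taken to be a one-cycle pulse produced by an edge detector running on sysClk. None of this is copied from the project sources.

```verilog
// Hypothetical module, for illustration only.
module sample_mosi (
    input  wire sysClk,      // FPGA system clock, the only clock in the design
    input  wire sclkRising,  // assumed: one-sysClk-cycle pulse per rising SCLK edge
    input  wire mosi,        // assumed: MOSI already synchronized to sysClk
    output reg  bitIn        // most recently sampled MOSI bit
);
    // Avoid:  always @(posedge SCLK) bitIn <= mosi;
    // That would treat SCLK as a clock and create a second clock domain.

    // Instead, stay in the sysClk domain and use the edge pulse as an enable.
    always @(posedge sysClk)
        if (sclkRising)
            bitIn <= mosi;
endmodule
```

With this style SCLK is ordinary data as far as the tools are concerned, so only sysClk needs a clock buffer.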

Figure: Two-stage shift register

We also need to avoid transferring data from a flip-flop driven by one clock to a flip-flop driven by another clock. This is called a clock domain crossing and might manifest itself in metastability, data loss or incoherence [EE Times]. We prevent clock domain crossings by synchronizing the input signals to the FPGA clock using a traditional two-stage shift register as illustrated above.

  1. The first flip-flop creates a synchronous version of the input by clocking it with the system clock. The input signal could change within the flip-flop’s setup and hold times, in which case the output may take longer than a system clock cycle to settle to a stable value (metastability). That’s why it is run through a second flip-flop.
  2. The second flip-flop makes it very unlikely that this metastability propagates to the output.

Adding a third flip-flop gives us access to the previous value. Using the current and previous values, we can generate rise and fall signals as shown below.

reg [2:0] async_r;
always @(posedge sysClk)
  async_r <= { async_r[1:0], async };

wire rising  = ( async_r[2:1] == 2'b01 );
wire falling = ( async_r[2:1] == 2'b10 );
wire sync    = async_r[1];

In the remainder of this article we’ll refer to the synchronized versions of these SPI signals.
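To make that concrete, here is a minimal sketch of how all three SPI inputs might be brought into the sysClk domain with the same pattern. The module name, the output names and the active-low slave select are assumptions for illustration; the actual project may organize this differently.

```verilog
// Hypothetical wrapper; port and signal names are illustrative.
module spi_sync (
    input  wire sysClk,
    input  wire SCLK,          // raw, asynchronous pins driven by the SPI master
    input  wire SS,
    input  wire MOSI,
    output wire sclkRising,    // one-sysClk-cycle pulse on a rising SCLK edge
    output wire sclkFalling,   // one-sysClk-cycle pulse on a falling SCLK edge
    output wire ssActive,      // assumes an active-low slave select
    output wire mosiSync
);
    reg [2:0] sclk_r = 3'b000; // third stage keeps the previous value for edge detection
    reg [1:0] ss_r   = 2'b11;  // idle (deselected) level
    reg [1:0] mosi_r = 2'b00;

    always @(posedge sysClk) begin
        sclk_r <= { sclk_r[1:0], SCLK };
        ss_r   <= { ss_r[0],     SS   };
        mosi_r <= { mosi_r[0],   MOSI };
    end

    assign sclkRising  = ( sclk_r[2:1] == 2'b01 );
    assign sclkFalling = ( sclk_r[2:1] == 2'b10 );
    assign ssActive    = ~ss_r[1];
    assign mosiSync    = mosi_r[1];
endmodule
```

MISO is an output of the FPGA, so it needs no synchronizer on this side.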

Operation

The main data object is an 8-bit register called data, similar to the one shown on the protocol page:

  • On a falling SCLK edge, the most significant bit from data is clocked into a register from where it is transmitted over its MISO output.
  • On a rising SCLK edge, the MOSI input is shifted into the least significant bit of data.

Once all eight bits are received, the byte is available as rx. This received byte rx should be read when rxValid is active during a rising edge of the sysClk.
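The behaviour described above could be sketched roughly as follows. Only data, rx, rxValid and MISO are names from the article; the module name, the edge-pulse inputs and the 3-bit state input are assumptions, as is the choice to make rxValid a single-cycle pulse. Loading a byte to transmit is left out, so this sketch simply shifts out whatever data held.

```verilog
// Hypothetical data path, for illustration only.
module byte_datapath (
    input  wire       sysClk,
    input  wire       sclkRising,   // assumed: one-cycle pulses from an edge detector
    input  wire       sclkFalling,
    input  wire       mosiSync,     // assumed: MOSI synchronized to sysClk
    input  wire [2:0] state,        // assumed: index (0..7) of the bit being received
    output reg        MISO,
    output reg  [7:0] rx,
    output reg        rxValid
);
    reg [7:0] data = 8'h00;

    initial begin
        MISO    = 1'b0;
        rx      = 8'h00;
        rxValid = 1'b0;
    end

    always @(posedge sysClk) begin
        rxValid <= 1'b0;                       // default; asserted for one sysClk cycle only

        if (sclkFalling)
            MISO <= data[7];                   // present the most significant bit

        if (sclkRising) begin
            data <= { data[6:0], mosiSync };   // shift MOSI into the least significant bit
            if (state == 3'd7) begin           // the eighth bit completes the byte
                rx      <= { data[6:0], mosiSync };
                rxValid <= 1'b1;
            end
        end
    end
endmodule
```

A consumer would then sample rx on any rising sysClk edge where rxValid is high.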

Finite State Machine

The Byte module implements the SPI Slave protocol and converts a bit stream into bytes and vice versa. It is implemented using a state machine with 8 states, corresponding to the 8 bits per byte. The illustration below shows the Finite State Machine (FSM) with corresponding data path.

In general, an FSM is in charge and contains the state register, next-state logic and output logic. In this particular case, we didn’t require output logic. The FSM passes a control signal (state) to the data path. The data path combines the control signal with its input signals to generate the output signals rx, rxValid and the MISO bitstream.
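A state machine of this shape can be as small as a 3-bit counter. The sketch below is one possible reading, not the project's code: the module name, the sclkRising pulse and the ssActive reset condition are assumptions, and only state is a name taken from the text.

```verilog
// Hypothetical FSM, for illustration only.
module byte_fsm (
    input  wire       sysClk,
    input  wire       sclkRising,  // assumed: one-cycle pulse per rising SCLK edge
    input  wire       ssActive,    // assumed: high while the slave is selected
    output reg  [2:0] state        // control signal handed to the data path
);
    initial state = 3'd0;

    // Next-state logic: advance one state per bit; 3-bit arithmetic wraps 7 -> 0.
    wire [2:0] nextState = state + 3'd1;

    // State register: restart at bit 0 whenever the slave is deselected.
    always @(posedge sysClk)
        if (!ssActive)
            state <= 3'd0;
        else if (sclkRising)
            state <= nextState;
endmodule
```

Because the state itself is the only output, no separate output logic is needed, which matches the description above.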

Figure: SPI Byte Exchange FSM with Data path

Timing

The timing diagram below shows the relation between the different signals. You may note that the input signal synchronization comes at the cost of introducing a delay. Given that the system clock is significantly faster than the SPI clock, this should not pose a problem. The initial implementation on Xilinx used a 66 MHz system clock and a 4 MHz SPI clock as shown below.
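A back-of-the-envelope check of that claim, under two assumptions about the pipeline: the three-stage edge detector shown earlier is used for SCLK, and MISO is registered on the system clock edge that follows the falling pulse. Pin and clock-to-output delays are ignored. An SCLK edge then takes between two and three system clock periods to show up on MISO:

```latex
\begin{align*}
T_{\text{sys}} &= \frac{1}{66\ \text{MHz}} \approx 15.2\ \text{ns},
  \qquad \frac{T_{\text{SCLK}}}{2} = \frac{1}{2 \cdot 4\ \text{MHz}} = 125\ \text{ns} \\
2\,T_{\text{sys}} &\le t_{\text{delay}} \le 3\,T_{\text{sys}}
  \quad\Rightarrow\quad 30.3\ \text{ns} \le t_{\text{delay}} \le 45.5\ \text{ns} \\
t_{\text{margin}} &= \frac{T_{\text{SCLK}}}{2} - t_{\text{delay}} \ge 125\ \text{ns} - 45.5\ \text{ns} \approx 79.5\ \text{ns}
\end{align*}
```

So under these assumptions MISO settles well before the master samples it on the next rising edge. With a 200 MHz system clock the same delay shrinks to between 10 ns and 15 ns.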

Figure: Signals for Xilinx implementation

In a later iteration on Altera, we used a PLL to create a 200 MHz system clock. The only reason for such a high clock was that we eventually plan to use it to measure coarse propagation delay in circuits. The gate level simulation result is shown below.

Figure: Signals for Altera implementation

Sources

The complete project including constraints and test bench is available through Github. Much of the credit for the byte level implementation goes to fpga4fun. My key Verilog HDL files are:

Verification

To verify the implementation, we ran the test bench (spi_byte_tb.v) using gate level simulation. This test bench monitors the communication and reports any errors it finds. For a real-world test, we connected an Arduino as SPI Master. The program on the Arduino alternates writing 0xAA and 0x55 with a 10/90 duty cycle. As a consequence the LED should blink briefly every cycle.

The next part of this article introduces a Message Exchange Protocol, layered on top of this byte interface. This allows us to pass 32-bit register values over the SPI byte interface.


5 Replies to “Math Talk”

  1. Very well explained. I am looking some details with respect to interfacing Arduino with FPGA (Altera) using I2C where FPGA is a master and arduino is slave.

  2. Can we use this project to establish a communication between Arduino and FPGA?

  3. The Two-stage shift register here is used to synchronize only the serial clock coming from Master. Right? Or do we need to sync the rest signals (i.e MOSI, SS, MISO) ??
