Divider circuit

Implements a math divider using a circuit of logic gates. Written in parameterized Verilog HDL for Altera and Xilinx FPGAs.

Divider using logic gates

(c) Copyright 2016 Coert Vonk

The attempt-subtraction divider was introduced in the inquiry “How do Computers do Math?”. This most basic divider consists of interconnected Controlled Subtract-Multiplex (csm) blocks. Each block contains a 1-bit Full Subtractor (fs) with the usual inputs a, b and bi and outputs d and bo; the equations below write the a and b inputs as \(x\) and \(y\). The Output Select signal, os, selects between the input \(x\) and the difference \(x-y\).

$$ \begin{align*} d^\prime &=x \oplus y \oplus b_i\\ d&=os \cdot x + \overline{os} \cdot d^\prime\\ b_o&=\overline{x} \cdot y + b_i \cdot (\overline{x \oplus y}) \end{align*} $$
1-bit Controlled Subtract-Multiplex
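The three equations map almost one-to-one onto continuous assignments. The following is a minimal sketch only: the module name csm_block_sketch is hypothetical, the port names are borrowed from the instantiation used later in this article, and the body is derived from the equations rather than taken from the author's source.

```verilog
// Hypothetical sketch of a 1-bit Controlled Subtract-Multiplex block.
// Assumption: ports a and b correspond to x and y in the equations.
module csm_block_sketch (
    input  wire a,   // minuend bit (x)
    input  wire b,   // subtrahend bit (y)
    input  wire bi,  // borrow-in
    input  wire os,  // output select: 1 passes a through, 0 selects the difference
    output wire d,   // selected output
    output wire bo   // borrow-out
);
    wire diff = a ^ b ^ bi;                  // d' = x xor y xor bi
    assign d  = os ? a : diff;               // d  = os.x + not(os).d'
    assign bo = (~a & b) | (bi & ~(a ^ b));  // bo = not(x).y + bi.not(x xor y)
endmodule
```

With os tied low the block behaves as a plain full subtractor; with os high it forwards a unchanged, which is how a row cancels an attempted subtraction.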

Attempt-subtraction divider

A complete 8:4-bit divider can therefore be implemented by a matrix of csm modules connected in rows and columns, as shown in the figure below. Each row performs one “attempt subtraction” cycle. Note that the borrow-out of the most significant block in each row drives the Output Select (os) inputs of that row. (For more details see “Combinational arithmetic”.)

8:4-bit Attempt-Subtraction
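As a small worked example of the row-by-row process, take a 4-bit dividend \(x = 13 = 1101_2\) and divisor \(y = 3\) (the shorter dividend is an assumption made only to keep the example brief; the figure uses 8 bits). Each row doubles the previous remainder, appends the next dividend bit (most significant first) and attempts to subtract \(y\). A borrow cancels the subtraction and gives a 0 quotient bit.

$$ \begin{align*}
\text{row 0:}\quad & 2 \cdot 0 + 1 = 1 < 3 &&\Rightarrow q_3 = 0,\ \text{remainder } 1\\
\text{row 1:}\quad & 2 \cdot 1 + 1 = 3 \geq 3 &&\Rightarrow q_2 = 1,\ \text{remainder } 0\\
\text{row 2:}\quad & 2 \cdot 0 + 0 = 0 < 3 &&\Rightarrow q_1 = 0,\ \text{remainder } 0\\
\text{row 3:}\quad & 2 \cdot 0 + 1 = 1 < 3 &&\Rightarrow q_0 = 0,\ \text{remainder } 1
\end{align*} $$

The result is \(q = 0100_2 = 4\) with remainder \(1\), which agrees with \(13 = 4 \cdot 3 + 1\).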

Similar to the multipliers, using Verilog HDL we can generate instances of csm blocks based on the word length of the dividend (xWIDTH) and divisor (yWIDTH). To describe the circuit in Verilog HDL, we need to derive the rules that govern the connections between the blocks.

Start by numbering the output ports based on their location in the matrix. For this circuit, we have the output signals difference (\(d\)) and borrow-out (\(b\)). E.g. \(d_{13}\) identifies the difference signal for the block in row 1 and column 3. Next, we express the input signals as a function of the output signal names (\(d\) and \(b\)) and do the same for the quotient itself as shown in the table below.

Signals do, bo, x, y, bi and os
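The connection rules can also be written as equations. The set below is a reconstruction read off the Verilog generate block in this article, so it may not match the layout of the original table. The notation is an assumption: \(i\) is the row, \(j\) the column, \(N\) is xWIDTH and \(M\) is yWIDTH, with \(0 \leq i < N\) and \(0 \leq j \leq M\). Block input ports are set in upright type (a, b, bi, os), while \(d_{i,j}\) and \(b_{i,j}\) are the difference and borrow-out nets.

$$ \begin{align*}
\mathrm{a}_{i,j} &= \begin{cases} x_{N-1-i} & j = 0\\ d_{i-1,j-1} & j > 0,\ i > 0\\ 0 & j > 0,\ i = 0 \end{cases}\\
\mathrm{b}_{i,j} &= \begin{cases} y_j & j < M\\ 0 & j = M \end{cases}\\
\mathrm{bi}_{i,j} &= \begin{cases} b_{i,j-1} & j > 0\\ 0 & j = 0 \end{cases}\\
\mathrm{os}_{i,j} &= b_{i,M}\\
q_{N-1-i} &= \overline{b_{i,M}}\\
r_j &= d_{N-1,j}
\end{align*} $$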

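Based on this table, we can now express the interconnects in Verilog HDL using ?: expressions.

```verilog
generate
    genvar ii, jj;
    // one row of csm blocks per dividend bit, yWIDTH + 1 blocks per row
    for ( ii = 0; ii < xWIDTH; ii = ii + 1) begin: gen_ii
        for ( jj = 0; jj < yWIDTH + 1; jj = jj + 1) begin: gen_jj
            math_divider_csm_block csm(
                // column 0 takes the next dividend bit (MSB first); the other
                // columns take the previous row's difference, moved up one column
                .a  ( jj < 1 ? x[xWIDTH-1-ii] : ii > 0 ? d[ii-1][jj-1] : 1'b0 ),
                .b  ( jj < yWIDTH ? y[jj] : 1'b0 ),   // divisor bit, zero-extended
                .bi ( jj > 0 ? b[ii][jj-1] : 1'b0 ),  // borrow ripples along the row
                .os ( b[ii][yWIDTH] ),                // borrow-out of the row's last block
                .d  ( d[ii][jj] ),
                .bo ( b[ii][jj] ) );
        end
    end
    // a quotient bit is 1 when the row's subtraction succeeded (no borrow)
    for ( ii = 0; ii < xWIDTH; ii = ii + 1) begin: gen_p
        assign q[xWIDTH-1-ii] = ~b[ii][yWIDTH];
    end
    // the remainder is the output of the last row
    for ( jj = 0; jj <= yWIDTH; jj = jj + 1) begin: gen_r
        assign r[jj] = d[xWIDTH-1][jj];
    end
endgenerate
```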
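One way to cross-check the generated matrix in simulation is a behavioural reference that performs the same row-by-row attempt subtraction in a loop. The sketch below is an illustration under assumptions, not the author's test bench: the module names divider_ref_sketch and divider_ref_sketch_tb are hypothetical, and the port names and widths (including the remainder being one bit wider than the divisor) are inferred from the generate block above.

```verilog
// Behavioural reference for the attempt-subtraction matrix (hypothetical sketch).
module divider_ref_sketch #(
    parameter xWIDTH = 8,
    parameter yWIDTH = 4
) (
    input  wire [xWIDTH-1:0] x,  // dividend
    input  wire [yWIDTH-1:0] y,  // divisor
    output reg  [xWIDTH-1:0] q,  // quotient
    output reg  [yWIDTH:0]   r   // remainder, yWIDTH + 1 bits like a matrix row
);
    integer ii;
    reg [yWIDTH:0] trial;        // shifted remainder plus next dividend bit

    always @* begin
        r = 0;
        q = 0;
        for (ii = 0; ii < xWIDTH; ii = ii + 1) begin
            // one matrix row: shift in the next dividend bit, MSB first
            trial = {r[yWIDTH-1:0], x[xWIDTH-1-ii]};
            if (trial < {1'b0, y}) begin
                r = trial;                // borrow: the subtraction is cancelled
            end else begin
                r = trial - {1'b0, y};    // no borrow: keep the difference
                q[xWIDTH-1-ii] = 1'b1;
            end
        end
    end
endmodule

// Self-checking loop comparing the reference against the / and % operators.
module divider_ref_sketch_tb;
    reg  [7:0] x;
    reg  [3:0] y;
    wire [7:0] q;
    wire [4:0] r;
    integer xi, yi, errors;

    divider_ref_sketch #(.xWIDTH(8), .yWIDTH(4)) dut (.x(x), .y(y), .q(q), .r(r));

    initial begin
        errors = 0;
        for (xi = 0; xi < 256; xi = xi + 1)
            for (yi = 1; yi < 16; yi = yi + 1) begin  // division by zero is skipped
                x = xi; y = yi;
                #1;
                if (q !== x / y || r !== x % y) errors = errors + 1;
            end
        $display("mismatches: %0d", errors);
        $finish;
    end
endmodule
```

The same loop could drive the structural divider next to the reference and compare the two sets of outputs, provided the real module's name and ports are substituted in.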

The complete Verilog HDL source code along with the test bench and constraints is available at:

Results

As usual, the propagation delay \(t_{pd}\) depends on the size \(N\) and the values of the operands. For a given size \(N\), the maximum propagation delay occurs when each subtraction needs to be cancelled, that is, when every row produces a borrow.

The worst-case propagation delays for the Terasic Altera Cyclone IV DE0-Nano are found using the post-map Timing Analysis tool. The values in the table below assume that both operands have the same size. The exact value depends on the model and speed grade of the FPGA, the silicon itself, the voltage and the die temperature.

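| \(N\) | Timing Analysis, slow 85°C | Timing Analysis, slow 0°C | Timing Analysis, fast 0°C | Measured, actual |
|---|---|---|---|---|
| 4 bits | 11.2 ns | 10.0 ns | 6.8 ns | |
| 8 bits | 37.1 ns | 33.2 ns | 22.0 ns | |
| 16 bits | 148 ns | 132 ns | 84.8 ns | |
| 27 bits | 408 ns | 365 ns | 236 ns | |
| 32 bits | 485 ns | 434 ns | 279 ns | |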
Propagation delay in attempt-subtraction divider

Continuing from “Divider using logic gates”, the next chapter shows an implementation of the square root algorithm introduced in Chapter 7 of the inquiry “How do Computers do Math?”.