Part 7 in our quest to answer the question “How do computers do math?” bring us the long awaited grail.\(\)
Math Operations using Gates
In combinational logic, the output is a pure function of the input values. We have a wide variety of logic gates our disposal to build these functions. Ever since the 80s, the leading logic families are the TTLbased 7400 and the CMOSbased 4000 series.
The figure below shows an example of a package with four twoinput NAND gates.
Following sections discuss implementations of a multiplexer and some basic math operations.
Multiplexer
Albeit technically not a math function, we here introduce the multiplexer because we need it further down. A multiplexer uses a binary value (address) to select between several inputs. The output then assumes the value of that input.
The table below gives the truth table for a 2to1 multiplexer. From this follows the Boolean equation that expresses the output z as a function of the inputs. The illustration also shows the corresponding circuit, where the circle at the input of the top NANDgate represents an inverter.

; 
$$z=\overline{os} \cdot a + os \cdot b\nonumber$$ 
We now reached the point, we get to the core of the question “How do computers do Math?”. The first math operation that we will tackle is addition.
Addition
The full adder forms the building block for nbit adders. This full adder adds the 1bit values a and b to the incoming carry (c_{i}), and outputs a 1bit sum (s) and a 1bit outgoing carry (c_{o}). The carry is similar as when adding decimal numbers — if you have a carry from one column to the next, then that next column has to include that carry.
The truth table, shown below, gives the relation between these inputs and outputs. The Boolean equations and circuit are derived from this table.


$$\begin{aligned} s&=a \oplus b \oplus c_i\\ c_o&=a \cdot b + c_i \cdot (a \oplus b) \end{aligned}$$ 
Now that we can add 1bit values, the time has come to move on to nbit values. The basic carrypropagate adder, starts by adding the least significant bit and passes the carry on to the next bit, and so on. The s outputs are combined to form the sum S.
The circuit shown below gives an example of a 4bit carrypropagate adder. The carry has to propagate from the lowest to the highest bit position. This socalled “ripple carry” limits the speed of the circuit.
The accompanying article “Building Math Hardware” describes Verilog HDL implementations of this and faster algorithms.
Subtraction
Before we can look at subtraction, we need to be able to represent negative numbers. Referring back to the section on digital systems, the positive decimal value would be represented by the binary value 111
as shown below.
\begin{align}
7 &\equiv
\color{red}{1}\color{black}{\times 2^2}\ +\
\color{red}{1}\color{black}{\times 2^1}\ +\
\color{red}{1}\color{black}{\times 2^0} = \mathrm{b}\color{red}{111}
\end{align}$$
Two’s complement is the most common digital representation for negative numbers. It assigns the most significant bit (msb) a negative weight. The other bits have their usual binary weights. The advantage of this system is that the circuits can treat positive and negative numbers the same when doing arithmetic. The 4bit two’s complement for decimal value 7 is 1001
as shown below.
\begin{align}
7 &\equiv
\color{red}{1}\color{black}{\times (2^3)}\ +\
\color{red}{0}\color{black}{\times 2^2}\ +\
\color{red}{0}\color{black}{\times 2^1}\ +\
\color{red}{1}\color{black}{\times 2^0} = \mathrm{b}\color{red}{1001}
\end{align}$$
Inverting the bits of a binary number and adding 1 makes the number negative. This implies that positive numbers have the value 0
as their most significant bit, and negative numbers have the value 1
as their mostsignificant bit.
\begin{aligned}
\color{red}\color{black}7 &\equiv \ ^{\color{red}\sim}\mathrm{b}0111\color{red}\color{red}{+\mathrm{b}1}\color{black} =\mathrm{b}1000 + \mathrm{b}0001 = \mathrm{b}1001
\end{aligned}$$
This implies that using 8bits, we can encode the decimal values 128 to 127 .
Similar to addition, the simplest subtraction method is borrow propagation. Again, we will start by building a 1bit subtractor. The inputs a and b represent the 1bit binary numbers being added. Output d, the difference. b_{i} and b_{o} are the incoming and outgoing borrow signals.
The truth table shown below, gives the relation between the inputs and outputs. The outputs d and b_{o} can be expressed as a function of the inputs. The result d is an ExclusiveOR function, just as the sum was for addition.


$$\begin{aligned} d&=a \oplus b \oplus l_i\\ l_o&=\overline{a} \cdot b + l_i \cdot (\overline{a \oplus b}) \end{aligned}$$ 
To build a 4bit subtractor we combine four of these building blocks.
For a faster approach or a Verilog implementation of the subtractor shown above, refer to NOT_PUBLIC_YET Building Math Hardware.
Multiplication
Multiplication is another important operation in digital signal processing (e.g. fast Fourier Transfers). It is an excellent example of combining simple logic functions to make a much more complex function. The two methods that come to mind are array multipliers.
The array multiplier is a simple technique for implementing multiplication. It resembles the process how you would probably perform a multiplication yourself, except that it sums up the partial products as it goes.
Each partial product bit position can be generated by a simple AND gate between corresponding positions of the multiplicand A and multiplier B bits. Adding the partial products forms the product.
The multiplier adder (ma
), shown below, is the basic building block for the multiplier. It uses an AND gate to form the partial product of two binary digits. To calculate the partial sum and carry out signals, it reuses the 1bit full adders described earlier. The Boolean equations describe the multiplier adder (ma
) as a function of its ports:


$$\begin{aligned} b &= x \cdot y\\ s_o &= s_i\oplus b\oplus c_i\\ c_o &= s_i \cdot b + c_i \cdot(s_i \oplus b) \end{aligned}$$ 
The carry propagation array multiplier works similar to the way humans multiply. The ma_{ij}
building blocks connect in a grid, where ij
identifies the row and column. The figure below gives an example for a 4bit multiplier.
You might notice some redundancy in the top row, where ma_{0x}
merely function as AND gates. [Combinational multiplier].
This method will provide the expected result, but it will take a relatively long time because of the long carry chain. A more optimized method is the carrysave array multiplier described below.
Implementations of this and faster multipliers in Verilog HDL can be found in the accompanying article Building Math Hardware.
Division
Division is another example of combining simple logic function to make a more complex circuit. Note that the method shown below ignores the divideby0 condition. (See also Multipliers and dividers.)
We introduce the most basic division method called attempt subtraction divider. In calculating x/y, it repeatedly subtracts the divisor y⋅os* from the digits of the dividend x. Initially the value of os equals logic 0
, so that it subtract the value y. If the difference is negative, the subtraction is cancelled by making os logic 1
. Together, all the successive os together makes the division result, while the remainder is the difference of last subtraction step.
To implement this algorithm, we need subtractors that can cancel the subtraction. Each subtractor calculates the difference between two input numbers, but if the result is negative, the operation is canceled and replaced with a subtraction of zero.
Such a Controlled SubtractMultiplex (CSM) contains a 1bit subtractor ab with the usual inputs a, b, b_{i} and outputs d and b_{o}. The output select (os) signal selects between bit x and d=ab. The signal is connected to the borrow output of the most significant 1bit subtractor.
 0, means subtraction result was positive ⇒ D’ = D.
 1, means subtraction result was negative ⇒ D’ = X.
Inside each divider cell the os controls a multiplexer that selects between the result of the addition d and the original remainder x. The Boolean expressions express the outputs as a function of the inputs. (Click image to animate)


$$\begin{aligned} d^\prime &=x \oplus y \oplus b_i\\ d&=os \cdot x + \overline{os} \cdot d^\prime\\ b_o&=\overline{x} \cdot y + b_i \cdot (\overline{x \oplus y}) \end{aligned}$$ 
The circuit shown below implements a complete 8:4bit divider matrix using CSM modules. Each row performs one “attempt subtraction” cycle. Note that the most significant bit is used to drive the os inputs. (For details see “Combinational arithmetic“.)
For a faster approach or a Verilog implementation of above divider, refer to NOT_PUBLIC_YET Building Math Hardware.
Square root
The square root operation is an interesting algorithm to implement in hardware. Its implementation is again another example of combining simple logic function, utilizing previous designs.
Samavi and Sutikno improved the classical nonrestoring digit >recurrence square root algorithm. This example uses their algorithm but without the optimizations. The algorithm of this SimplifiedSamovi Square root is:
 Add the next bit from the input number to the right of the current remainder. This becomes the new current remainder (A)
 Take the square root obtained so far, appends 01 to it and subtracts it, properly shifted, from the current remainder. (The 0 in 01 corresponds to multiplying by 2; the 1 is a new guess bit.)
 If the result is
 Positive, then the new root bit is 1, and the result becomes the new remainder.
 Negative, then the current remainder (A) will become the new remainder (as if the subtraction never happened).
The circuit shown below implements a complete 8bit square root circuit. Each row performs one “attempt subtraction” cycle. Like in the division circuit, the most significant bit is used to drive the os inputs.
For a Verilog implementation of above square root algorithm, refer to NOT_PUBLIC_YET Buiding Math Hardware.
I hope the this answer the questions “How do computer do math?“. The easy answer would be “because the physics make it”, but I hope you enjoyed our much longer journey into this exiting field.
For those interested in building the math circuits, I suggest reading the chapter on programmable logic. If you want to know more about digital logic, I suggest reading the next chapter that introduces sequential logic that allows us to build more complicated systems.
Coert, I have tried to make a square root in excel but it would be work. The multiply and divide work in excel and the CSM works. I can not send you a picture of my work in this system. Can you help me?
Ronald lokker from the Netherlands
Thanks. I think the sqrt schematic might use the q* instead of the q to build the new subtrahend.
[in parallel to email conversation]