Adder and subtractor circuits

(c) Copyright 2016 Coert Vonk

Implements an adder and subtractor using circuits of logic gates. Written in parameterized Verilog HDL for Altera and Xilinx FPGA’s.

Adder and subtractor using logic gates

(c) Copyright 2016-2022 Coert Vonk The inquiry “How do Computers do Math?” introduced the carry-propagate adder and the borrow-propagate subtractor. Here we will recapitulate and implement these using Verilog HDL.\(\)

Carry-propagate adder

The full adder (fa) forms the basic building block. This full adder adds two 1-bit values a and b to the incoming carry (ci), and outputs a 1-bit sum (s) and a 1-bit outgoing carry (co). The circuit and Boolean equations, shown below, give the relations between these inputs and outputs.

$$ \begin{aligned} s &= a \oplus b \oplus c_i\\ c_o &= a \cdot b + c_i \cdot (a \oplus b) \end{aligned} \nonumber $$
1-bit full adder

To build a n-bit propagate-adder, we combine n fa blocks. The circuits adds the least significant bits and passes the carry on to the next bit, and so on. Combining the output bits forms the sum s. The circuit shown below gives an example of a 4-bit carry-propagate adder.

4-bit carry-propagate adder

We describe the circuit using array instantiation. In this a[] and b[] are the summands, ci[] is the in-coming carry, co[] is the out-going carry and s[] is the sum.

math_adder_fa_block fa [N-1:0] ( .a  ( a ),
    .b  ( b ),
    .ci ( {c[N-2:0], 1'b0} ),
    .s  ( s[N-1:0] ),
    .co ( {s[N], c[N-2:0]}) );

The compiler will optimize the fa blocks with forced inputs, but most fa blocks will compile to a RTL netlist as shown below.

(c) Copyright 2016-2022 Coert Vonk
1-bit full adder (fa) block in RTL

The 4-bit adder compiles into the daisy chained fa blocks as shown.

(c) Copyright 2016-2022 Coert Vonk
4-bit carry-propagate adder in RTL

The complete Verilog HDL code along with the test bench and constraint files are available through GitHub for Xilinx (Spartan-6 LX9) and Altera (Cyclone IV DE0-Nano) boards.


The propagation delay \(t_{pd}\) depends on size N and the value of operands. For a given size N, adding the value 1 to an operand that contains all zeroes causes the longest propagation delay.

The worst-case propagation delays for the Altera Cyclone IV on the DE0-Nano are found using the post-map Timing Analysis tool. The exact values depend on the model and speed grade of the FPGA, the silicon itself, voltage and the die temperature.

\(N\) Timing Analysis Measured
slow 85°C slow 0°C fast 0°C actual
4-bits 6.9 ns 6.2 ns 4.3 ns
8-bits 8.6 ns 7.6 ns 5.4 ns
16-bits 15.7 ns 13.9 ns 9.7 ns
27-bits 20.2 ns 18.0 ns 12.3 ns
32-bits 26.0 ns 23.2 ns 15.7 ns
42-bits 30.2 ns 26.9 ns 18.1 ns
Propagation delay in propagate-carry adder

Borrow-propagate subtractor

Similar to addition, the simplest subtraction method is borrow-propagate, as introduced in Chapter 7 of the inquiry “How do Computers do Math?“. Again, we will start by building a 1-bit subtractor (fs). The inputs a and b represent the 1-bit binary numbers being added. Output d, the difference. li and lo are the incoming and outgoing borrow/loan signals.

The outputs d and lo can be expressed as a function of the inputs. The difference d is an Exclusive-OR function, just as the sum was for addition.

$$ \begin{align*} d&=a \oplus b \oplus l_i \\ l_o&=\overline{a} \cdot b + l_i \cdot (\overline{a \oplus b}) \end{align*} $$
1-bit full subtractor

To build a 4-bit subtractor we combine four of these building blocks.

4-bit borrow-propagate subtractor

The implementation of the borrow-propagate subtractor is very similar to the adder can be found at GitHub.

Combined adder and subtractor

Another method to subtract is based around the fact that \(a – b = a + (-b)\). This allows us to build a circuit that can add or subtract. When the operation input op equals 1, it subtracts b from a, otherwise it adds the values.

Under two’s complement, subtracting b is the same as adding the bit-wise complement of b and adding 1. The inputs b is negated by inverting its bits (using an XOR with signal op), and 1 is added by setting the least significant carry input to 1. We can build this a 4-bit adder/subtractor using fa blocks as shown below, where r is the result.

4-bit combined adder and subtractor

Note that the circuit also includes a overflow detection for two’s complement. Overflow occurs when c2 differs from the final carry-out c3.

In all these ripple carry adder and subtractors, the carry propagates from the lowest to the highest bit position. This propagation causes a delay that is linear with the number bits.

Moving on from this math adder and subtractor using logic gates, the next chapter explores faster circuits.


derived work

This article series describes implementations of math operations using circuits of logic gates. Written in Verilog HDL for Altera and Xilinx FPGA’s.

Primary school teaches our students methods for addition, subtraction, multiplication and division. Computer hardware implements similar methods and performs them with astounding speed. This enables applications such as computer vision, process control, encryption, hearing aids, video compression to dubious practices as high-frequency trading.


This article describes how to build such hardware. In such, it is a sequel to the inquiry “How do Computers do Math?” that in the chapter “Math Operations Using Gates” introduced conceptual circuits using logic gates. We will model the various math operations using digital gates. Here we combine the gates into circuits and describe them using the Verilog Hardware Description Language (HDL). These Verilog HDL descriptions are compiled and mapped to a Field Programmable Gate Array (FPGA).

Working knowledge of the Verilog HDL is assumed. To learn more about Verilog HDL, I recommend the book FPGA Prototyping with Verilog Examples, an online class or lecture slides. To help you get up to speed with the development boards, wrote Getting Started documents for two popular Altera and Xilinx boards.

We aim to study algorithms to implement the algorithms in generic VLSI, and as such do will not use the highly optimized carry chain connections present on many FPGAs.

Hail to the FPGA

Let’s take this moment to honor the virtues of the FPGA. A FPGA can do many things at the same time and still respond immediately to input events. It is generally more complicated to create and debug the same logic in a FPGA compared to a microprocessor. Because of the challenges that FPGA development poses, many systems combine FPGAs and microprocessors to get the best of both worlds. For example, to recognize objects in a video stream, one would implement the pre-processing (noise removal, normalization, edge detection) in an FPGA, but the higher-level logic on a CPU or DSP.

With a FPGA, you can put massive parallel structures in place. For example, an high-speed data stream can be distributed across the whole FPGA chip to be processed in parallel, instead of having a microprocessor deal with it sequentially.

FPGAs can accelerate machine learning algorithms, video encoding, custom algorithms, compression, indexing and cryptography. By implements a soft microprocessor as part of the FPGA it can also handle high level protocols such as handle board management, protocol bridging and security tasks. [IEEExplore]


The code examples were tested on an Altera “Cyclone IV E” FPGA using Quartus Prime 16.1. Earlier code iterations used a Xilinx “Spartan-6” with their ISE Design Suite. The Verilog descriptions should work equally well on other boards or environments.

Continue reading:
  1. Demonstration
  2. Adder (and subtractor)
  3. Faster adder (and subtractor)
  4. Multiplier
  5. Faster multiplier
  6. Divider
  7. Square root (and conclusion)

Copyright © 1996-2022 Coert Vonk, All Rights Reserved