Architecture

Explains how a CPU works by implementing the LC-3 instruction set. Includes an in-depth look at the instruction cycle phases. The inquiry “How do computers do math introduced the components needed to build a microprocessor. This article series continues by introducing the microprocessor. It uses a top-down approach.

We use a top down approach to help break up the complex microprocessor into simple, more manageable parts. Starting from the architecture, it dives down to an instruction set. In the second half we build a microprocessor using a field programmable gate array.

We assume a familiarity with assembly code.

Credits

This text leans on chapter 4 from the excellent book “Introduction to Computer Systems” by Patt and Partel. The implementation borrows from Davis’ Project#1 description at NC State University.

Architecture

World War II bought widespread destruction, but also spurred a flurry of computer innovations. The first electric computer was built using electro-mechanical relays in Nazi Germany (1941). Two years later the US army built ENIAC using 18,000 vacuum tubes for calculating artillery firing tables and simulating the H bomb.

Glenn A. Beck and Betty Snyder program ENIAC
Source: US Army photo, PD

This computer could perform complex sequences of operations, including loops, branches and subroutines. The program in this computer was hardwired using switches and dials. It was a thousand times faster as the electro-mechanical machine, but it took great effort to change the program.

The programming was hard-wired into their design, meaning that “reprogramming” a computer simply wasn’t possible: Instead, computers would have to be physically disassembled and redesigned. To explain how a CPU works, we take a look at the leading architecture.

von Neuman Architecture

Src: Encyclopædia
Britannica
In 1945, the mathematician John von Neumann formalized processor methods developed at the University of Pennsylvania. His computer architecture design consists of a Control Unit, Arithmetic and Logic Unit (ALU), Memory Unit, Registers and Inputs/Outputs. These methods became known as the von Neumann architecture and still forms the foundation for today’s computers.

Using the von Neumann architecture, computers were able to be modified and programmed via the input of instructions in computer code. This way, the functionality could be simply rewritten using a programming language.

The von Neuman Architecture is based on the principle of:

  1. Fetch an instruction from memory
  2. Decode the instruction
  3. Execute the Instruction
This process is repeated indefinitely, and is known as the fetch-decode-execute cycle.

The Central Processing Unit (CPU) is the electronic circuit responsible for executing the instructions of a computer program. It is also referred to as the microprocessor. The CPU contains the Control Unit, ALU and Registers. The Control Unit interprets program instructions and orchestrates the execution of these instructions. Registers store data before it can be processed. The ALU carries out arithmetic and logical operations.

von Neumann architecture

Instructions

Instructions are a fundamental unit of work that are executed completely, or not at all. In memory, the instructions look just like data — a collection of bits. They are just interpreted differently.

An instructions includes:

  • an opcode, the operation to be performed;
  • operands, the data (locations) to be used in the operation.

There are three instruction types:

  • arithmetic and logical instructions, such as addition and subtraction, or logical operations such as AND, OR and NOT;
  • memory access instructions, such as load and store;
  • control instructions, that may change the address in the program counter, permitting loops or conditional branches.

This article “How a CPU works” continues with Instruction Set on the next page.

Programmable logic

Complexity – CAD – Simulation ….

Logic devices can be classified into two broad categories

  • Fixed devices, where the circuits are permanent. Their function cannot be changed. Examples are:
    • gates (NAND, NOR, XOR),
    • binary counters,
    • multiplexers, and
    • adders.
  • Application-Specific Integrated Circuit (ASIC)
    • The manufacturer defines a integrated circuit containing transistors, but does not connect them together.
    • The user specifies the metal mask that connects the transistors.
    • The manufacturer uses this mask to finish the ASIC.
    • Introduced by Fairchild in 1967. Have since grown to contain over 100 million gates.

Programmable logic devices

Programmable logic devices (PLD), can be changed at any time to perform any number of functions. Prominent flavors of PLDs are:

(c) Copyright 2014-2021 Coert Vonk
Programmable array logic

  • Programmable array logic (PAL)
    • based on sum-of-products, with programmable “fuses”,
    • used for simple combinational logic (a few 100′s gates),
    • introduced by MMI (Birkner and Chua, 1978).
    • The figure on the right shows an example of an AND function with programmable fuses.
  • Complex programmable logic device (CPLD)
    • based on sum-of-products,
    • for medium size combinational logic (10,000′s gates).
  • Field-programmable gate array (FPGA)
    • based on blocks containing a look-up tables, full adder and d flip-flop,
    • used for complex combinational or sequential logic such as state machines (1,000,000′s gates),
    • introduced by Xilinx (Freeman, Vonderschmitt) in 1985.

A CPLD would be sufficient to implement the combinational circuits discussed so far, however our ultimate goal is to create a modest microprocessor circuit. As we will see later, a microprocessor circuit requires a state machine for which we need a FPGA. As a result the remainder of this text will focus on a FPGAs implementation.

Interconnected cells

The core of FPGAs contains a vast array of interconnected logic cells.

The exact logic cell architecture depends on the vendor. (refer to FPGA logic cells for typical cell architectures.)

The main vendors are:

  • Xilinx for leading edge products, and
  • Altera (Intel) for lean and efficient devices.

Each logic cell consists of:

  • a look-up table (LUT), to implement any 4-input Boolean function,
  • a full adder with an additional AND gate, to implement multiplication.
  • a D flip-flop, to implement sequential logic, and
  • a 2-to-1 multiplexer, to bypass the flip-flop if desired
(c) Copyright 2014-2021 Coert Vonk
Example logic cell

Each IO cell consists of:

  • a D flip-flop, to implement sequential logic, and
  • a 2-to-1 multiplexer, to bypass the flip-flop if desired
(c) Copyright 2014-2021 Coert Vonk
Example IO cell

Programmable interconnects

  • Reconfigurable interconnects allow the logic cells to be “wired together”.
  • The functionality of an FPGA can be changed by downloading a different configuration.
  • The circuits are often much faster as with discrete components, because the signals stay within the silicon die of the FPGA.
(c) Copyright 2014-2021 Coert Vonk
Example programmable interconnect

The figure below shows a typical matrix organization of the logic cells that are interconnected using programmable interconnects.

(c) Copyright 2014-2021 Coert Vonk
Example FPGA

Lab environment (thanks Dylon)

  • Altera (now Intel), much better tools.
  • Boards
    • Xilinx, development boards are easy to find. E.g. Spartan6 ($89 at Avnet) that has a USB-to-UART chip on it so you can plug it right into your computer to download new FPGA code as well as use it as a UART.
    • Alternatively, the Xilinx Spartan3E development board is an old standby that works well.
  • Simulator
    • icarus verilogg (free simulator, yum install iverilog) and GTKWave (free waveform viewer, yum install gtkwave) work great. They are just as good as most of the bundled simulators that you’ll find with the tools.
    • a web copy of ModelSim bundled with Xilinx or Altera that wouldn’t be bad either.
  • Cliff Cummings posted papers about Verilog, and book recommendations.
  • OpenCores has lots of Verilog and VHDL code for most any kind of core you can imagine.
  • Scripts SimShop and Tizzy for simulation and state machines! SimShop provides an easy scriptable way to set up a simulation environment, and Tizzy allows you to write state machines in .dot and will do a conversion to Verilog for you.

xx

The typical workflow:

  • The desired logic is specified using traditional schematics or a hardware description language.
  • The logic function is compiled into a binary file that can be downloaded into the FPGA.
  • Test vectors and output verification.
  • The binary file is download the FPGA.

The application-specific integrated circuit (ASIC), is similar to the FPGA, except that it is not reprogrammable. The advantage is higher speed and smaller footprint.

Hardware description language (HDL)

  1. Verilog/VHDL
  2. netlist
  3. synthesis optimizes the functions
  4. mapping to hardware

Build-in components are called macros (counters, RAM, multiplexers, adders, LUT)

  1. See “Introduction to Verilog
  2. In order the obtain reasonable speeds (wires are not ideal), the utilization is typically limited to about 50%.

Lab work

FPGA tools for design entry, simulation, synthesis and uploading is available from: (see also comparison)

What’s next?

The logic next step is the Arithmetic Logical Unit that forms the heart of today’s computers.

Arithmetic Logical Unit (ALU)

  1. Arithmetic Logical Unit (ALU)
    • http://ecen3233.okstate.edu/Fall%202009/labs/Lab05.pdf
    • soft cores for Xilinx, http://www.1-core.com/library/digital/soft-cpu-cores/
  2. Add Simple picture showing different functions feeding into a multiplexor where the operation is the selector.

Now let us build something with Gate-Level Verilog! I also published the companion article that implements the functionality using an FPGA

The inquiry “How do microprocessors work?” picks up from here.

Synchronous sequential

The logic circuits that we have seen so far are referred to as combinatorial circuits. While these circuits can be used quite successfully for math operations, their simplicity comes at a price:

  • The input values need to remain constant during the calculation.
  • The output can have multiple logical transitions before settling to the correct value. The figure below shows that even adding two numbers without carry may cause multiple transitions.
  • There is no indication when the output has settled to the correct value.
(c) Copyright 2011 Coert Vonk
adding 32’d0 to 32’d1 implemented on Altera FPGA

This chapter address solutions to some of these issues by introducing sequential circuits in which where the output not only depends on the current inputs, but also on the past sequence of the inputs values. That is, sequential logic has state (memory). [wiki]

In general, such a memory element can be made using positive feedback.

(c) Copyright 2011 Coert Vonk
Memory element using positive feedback

Digital sequential logic circuits are divided into asynchronous and synchronous circuits.

Asynchronous

The advantage of asynchronous circuits is that the speed is only limited by the propagation delays of the gates, because the circuit does not have to wait for a clock signal to process inputs.

The state of asynchronous circuits can change at any time in response to changing inputs. As a result, similar to combinatorial circuits, the output can have multiple logical transitions before settling to the correct value.

Another disadvantage arises from the fact that memory elements are sensitive to the order that their input signals arrive. If two signals arrive at a logic gate at almost the same time, which state the circuit goes into can depend on which signal gets to the gate first. This may causes small manufacturing differences to lead to different behavior.

These disadvantages make designing asynchronous circuits very challenging and limited to critical parts where speed is at a premium. [wiki]

A very basic memory element can be made using only two inverters. This circuit will maintain its state, but lacking any input that state cannot be changed.

(c) Copyright 2013-2021 Coert Vonk
Basic memory element using inverters

We continue with some common asynchronous circuits.

Set-Reset (SR) Latch (async, level sensitive)

The SR-latch builds on the idea of the inverter latch and introducing two inputs. Set (\(S\)), forces the next value of the output (\(Q_{n+1}\)) to \(1\). Reset (\(R\)), force the next value of the output (\(Q_{n+1}\)) to \(0\).

(c) Copyright 2013-2021 Coert Vonk
SR-latch states

The state transition diagram provides a visual abstraction. It uses circles for the output states, and arrows for the transition conditions.

(c) Copyright 2013-2021 Coert Vonk
SR-latch with transitions

(c) Copyright 2013-2021 Coert Vonk
A complete diagram takes all the inputs (R and S) into account.

The state transition table shows the relationship between the inputs, the current value of the output \((Q_n\)), and the next value of the output (\(Q_{n+1}\)). The ‘ב represents a “don’t care” condition.

In Boolean algebra, this function can be expressed as:\(\) $$ \begin{align*} Q_{n+1} &= S+\overline{R}\cdot{Q_n} \\ &= \overline{\overline{S+Q_n}+R} \end{align*} $$

The function can be built with two NOR gates as shown below.

(c) Copyright 2013-2021 Coert Vonk
Memory element using NOR gates

With the circuit in hand, let us take a closer look at its operation:

  • When S=1 while R=0, drives output Q to 1.
  • When R=1 while S=0, drives output Q to 0.
  • When S=R=0, the latch latches and maintains it’s previously state.
  • When both inputs change to 1 sufficiently close in time, there is a problem. Whatever gate is first, will win the race condition. In the real world, it is impossible to predict which gate that would be, since it depends on minute manufacturing differences. A similar problem occurs when the device powers and both \(Q\) and \(\overline Q\) are \(0\).

In Verilog HDL we could model this as module SR_latch( input S, input R, output Q ); wire Qbar; nor( Q, R, Qbar ); nor( Qbar, S, Q ); endmodule

D-latch (async, level sensitive)

Here a \(D\) input makes the \(R\) and \(S\) complements of each other, thereby removing the possibility of invalid input states (metastability) as we saw in the SR-latch. \(D=1\) sets the latch to 1, and \(D=0\) resets the latch to 0.

Note that in the circuit below the SR-latch here is drawn as cross-coupled NOR gates.

(c) Copyright 2013-2021 Coert Vonk
D-latch

An enable signal controls when the value of the inputs \(R\) and \(S\) input matter. The output reflects the input only when enable is active (\(EN=1\)). In other words, the enable signal serves as a level triggered clock input. Level triggered means that the input is passed to the output for was long as the clock is active.

D-latches cannot be chained, because changes will just race through the chain. Once could prevent this by inverting the ENABLE signal going to the 2nd D-latch.

In Verilog HDL we would model this as module D_latch( input D, input Enable, output Q ); always @(D or Enable) if (Enable) Q &<= D; endmodule[/code]

Synchronous circuits

In synchronous circuits, a clock signal synchronizes state transitions. Inputs are only sampled during the active edge of the clock cycle. Outputs are “held” until the next state is computed, thereby preventing multiple logical transitions. Changes to the logic signals throughout the circuit all begin at the same time, synchronized by the clock.

The figure below shows an example of a synchronous sequential circuit. In it, a D flip-flop serves as a clocked memory element. The following section will examine the various memory element.

(c) Copyright 2013-2021 Coert Vonk
Example of a synchronous circuit

The main advantage of synchronous logic is its simplicity. The logic gates which perform the operations on the data require a finite amount of time to respond to changes to their inputs. This is called propagation delay. The interval between clock pulses must be long enough so that all the logic gates have time to respond to the changes and their outputs “settle” to stable logic values, before the next clock pulse occurs. As long as this condition is met (ignoring certain other details) the circuit is guaranteed to be stable and reliable. This determines the maximum operating speed of a synchronous circuit.

Synchronous circuits also have disadvantages. The maximum clock signal is determined by the slowest (critical) path in the circuit, because every operation must complete in one clock cycle. The best work-around is by making all the paths take roughly the same time, by splitting complex operations into several simple operations, which can be performed over multiple clock cycles. (pipelining)

In addition, synchronous circuits require the usually high clock frequency to be distributed throughout the circuit, causing power dissipation.

D flip-flop (edge triggered)

The classic D-flip-flop is similar to a D-latch, except for the important fact that it only samples the input on a clock transition; in this case that is the rising edge of the clock.

In other words, while \(clk=0\) the value of \(D\) is copied in the first latch. The moment that \(clk\) becomes \(1\), that value remains stable and is copied to output \(Q\).

(c) Copyright 2013-2021 Coert Vonk
D-flip-flop

The use of the clock signal implies that the flip-flop cannot just hold its previous value and samples the input every rising clock edge. Note that the ‘>‘ symbol indicates that the clock input is sampled on the rising clock edge.

The advantage of triggering on the clock edge, is that the input signal only needs to remain stable while it is being copied to the second latch. The so-called timing window:

(c) Copyright 2013-2021 Coert Vonk
D-flip-flop timing

In this timing window, the setup time \(t_su\) is the minimum time before rising clock by which the input must be stable. The hold time \(t_h\) is the minimum time after the clock event during which the input must remain stable. The clock-to-output (propagation) delay \(t_{cq}\) is the maximum time after the clock event for the output to change.

A setup or hold violation causes metastability where the output goes to intermediate voltage values which are eventually resolved to an unknown state.

In Verilog HDL we would model this as module D_flipflop( input D, input clock, output Q ); always @(posedge clock) Q &<= D; endmodule[/code]

The example above is provided for general understanding of the principle. In practice, one would use the better and more efficient solutions only requiring 6 NOR gates. This solution prevents the inverter in on the enable input of the first latch.

register

A register is a memory element that expands on the D-flip-flip. A load signal \(LD\), limits when new data is loaded into the register (only if \(LD=1\) during the active edge of the clock). In the circuit below, this is implemented using a 2:1 multiplexer.

(c) Copyright 2015-2021 Coert Vonk
Register

Multiple data bits are stored together. An \(n\)-bit register consists of \(n\) blocks that share the \(LD\) and \(clk\) signals.

A sequence of bits is commonly written as \(D[15:0]\) referring to bits /(D_{15}\dots D_0/).

Large memories

Memory consists of a large number of locations that each can store a value. Each location in a memory is given a number, called an address.

Memory locations are identified by a \(k\)-bit wide address. Each memory location can store a \(n\)-bit wide value.

The figure below gives an example of 16-bit addresses storing 16-bit values. To save space in this figure, hexadecimal (base-16) notation is used to represent the address and value.

Memory example

Bit density is key in building large memories. Instead of D flip-flops, large memories use more efficient methods such as:

  • Static Random Access Memory (SRAM) that uses six transistors per memory bit. As we have seen this relies on a feedback between two gates.
  • Dynamic Random Access Memory (DRAM) that uses only one transistor per memory bit. The mechanism relies on an electrical charge stored in the capacitor of a MOSFET gate. The drawback is that the charge has to be refreshed periodically.

Let’s take a closer look at DRAM: a single bit (cell) can be implemented as shown below. In this, the capacitor saves the state. The transistor limits access to the capacitor.

DRAM cell

To read, select raised; the charge in the capacitor the appears on pin D. To write, select is raised for long enough to charge or drain the capacitor to the value of D.

These cells can be combined for form a large memory. The cells are organized in a matrix structure, to keep the size of the address demultiplexer practical. Otherwise, to implement k address lines, a demux with 2k outputs would be needed. The figure below shows a simplified structure implementation using a 4-bit address (and 1 bit wide).

To make a n-bit wide memory, n memory DRAM chips can be combined.

nxn DRAM

For more info refer to slides from MIT lectures ”Sequential building blocks” and “Memory Elements” and the web site lwn.net)

Good design practices

[src]

  • Use a single clock, single edge synchronous design wherever possible.
  • Asynchronous interfaces lead to metastability. Minimize the asynchronous interface and use a double clock data to reduce the chance of metastability.
  • Avoid asynchronous presets & clears on FFs. Use synchronous presets & clears whenever possible.
  • Do not gate clocks! Instead, create clock enabled FFs via a MUX to feed back current data.

Clock

In sequential circuits, a clock signal orchestrates the state transitions. This clock is generally a square wave generated by an astable multivibrator.

(c) Copyright 2014-2021 Coert Vonk
Clock signal

A delayed negative feedback causing it to oscillates between 0 and 1 .

(c) Copyright 2014-2021 Coert Vonk
Oscillator

An implementation using inverters and a RC circuit is shown below.

(c) Copyright 2014-2021 Coert Vonk
Oscillator using RC circuit

The functionality can be explained as follows:

  • Suppose initially:
    • output U3=0V (a logical 0), and
    • the capacitor is not charged .·. U1=U3
  • 0→1:
    • The capacitor charges through resistor R .·. U1 increases towards 5V.
    • Once U1≥2V (the 1 threshold) .·. U2 becomes 0V .·. output U3 becomes 5V (a logical 1)
  • 1→0:
    • The capacitor charge reverses through the resistor R .·. U1 decreases towards 0V.
    • Once U1≤0.7V (the 0 threshold) .·. U2 becomes 5V .·. output U3 becomes 0V (a logical 0), and the cycle repeats itself.

Hands On

  • D latch, Yenka Technology, Digital Electronics, build d-latch using gates, use models for d-type flip-flip, binary counter
  • Build or simulate a set-reset latch using NOR gates. (see Digital logic projects, page 27)
  • Build or simulate a D-latch using NAND gates. (see Digital logic projects, page 6)

The following chapter introduces programmable logic that allows us to build more dense and flexible hardware systems.

Math using gates

In this article, we will build logic circuits that implement math functions. It is part of a quest to answer the question “How do computers do math?”. This will section bring us the long awaited grail!\(\)

In combinational logic, the output is a function of the input values. There are a wide variety of logic gates our disposal to build these functions. Ever since the 80s, the leading logic families are the TTL-based 7400 and the CMOS-based 4000 series.

The figure below shows an example of a package with four two-input NAND gates.

Example 7400 quad dual input NAND
7400 Quad Dual Input NAND
7400 Dual In-Line Package

Following sections discuss implementations of a multiplexer and some basic math operations.

Multiplexer (mx)

Albeit technically not a math function, we will cover the multiplexer because we will be needing it to build operations.

A multiplexer uses a binary value (address) to select between several inputs. The output then assumes the value of that input.

The table below gives the truth table for a 2-to-1 multiplexer. From this follows the boolean equation that expresses the output z as a function of the inputs. The illustration also shows the corresponding circuit, where the circle at the input of the top NAND-gate represents an inverter.

os a b z
0 0 1 0
0 0 0 0
0 1 0 1
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 0
1 1 1 1
$$ z= \overline{os} \cdot a + os \cdot b \nonumber $$
1-bit full adder

We have now reached the the heart of the question “How do computers do Math?”. The first math operation that we will tackle using these gates is: addition.

Addition (fa)

Addition icon The full adder forms the building block for all n-bit adders. This full adder adds the 1-bit values a and b to the incoming carry (ci), and outputs a 1-bit sum (s) and a 1-bit outgoing carry (co). The carry is similar as when adding decimal numbers — if you have a carry from one column to the next, then that next column has to include that carry.

The truth table, shown below, gives the relation between these inputs and outputs. The Boolean equations and circuit are derived from this table.

a b ci co s
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1
$$ \begin{aligned} s &= a \oplus b \oplus c_i\\ c_o &= a \cdot b + c_i \cdot (a \oplus b) \end{aligned} \nonumber $$
1-bit full adder

Now that we can add 1-bit values, we can move on to n-bit values. The basic carry-propagate adder starts by adding the least significant bit and passing the carry on to each following bit. The s outputs are combined to form the sum S.

Expanding

The circuit shown below gives an example of a 4-bit carry-propagate adder. The carry has to propagate from the lowest to the highest bit position. This so-called “ripple carry” limits the speed of the circuit.

4-bit carry-propagate adder

The accompanying article “Building Math Circuits” describes Verilog HDL implementations of this along with a faster algorithms, and a demonstration setup.

Subtraction (fs)

subtraction icon Before we can look at subtraction, we need to be able to represent negative numbers. Referring back to the section on digital systems, the positive decimal value would be represented by the binary value 111 as shown below. $$ \require{color} 7 \equiv \color{red}{1}\color{black}{\times 2^2}\ +\ \color{red}{1}\color{black}{\times 2^1}\ +\ \color{red}{1}\color{black}{\times 2^0} = \mathrm{b}\color{red}{111} \nonumber $$

The most common digital representation for negative numbers is called “two’s complement”. It assigns the most significant bit (msb) a negative weight. The other bits have their usual binary weights. The advantage of this system is that the circuits can treat positive and negative numbers the same when doing arithmetic. The 4-bit two’s complement for decimal value -7 is 1001 is shown below. $$ \require{color} -7 \equiv \color{red}{1}\color{black}{\times (-2^3)}\ +\ \color{red}{0}\color{black}{\times 2^2}\ +\ \color{red}{0}\color{black}{\times 2^1}\ +\ \color{red}{1}\color{black}{\times 2^0} = \mathrm{b}\color{red}{1001} \nonumber $$

Inverting the bits of a binary number and adding 1 makes the number negative. This implies that positive numbers have the value 0 as their most significant bit, and negative numbers have the value 1 as their most-significant bit. $$ \require{color} \color{red}-\color{black}7 \equiv \,^{\color{red}\sim}\mathrm{b}0111\color{red}\color{red}{+\mathrm{b}1}\color{black} =\mathrm{b}1000 + \mathrm{b}0001 = \mathrm{b}1001 \nonumber $$

This implies that using 8-bits, we can encode the decimal values -128 to 127.

Similar to addition, the simplest subtraction method is borrow propagation. Again, we will start by building a 1-bit subtractor. The inputs a and b represent the 1-bit binary numbers being added. Output d, the difference. bi and bo are the incoming and outgoing borrow signals.

The truth table shown below gives the relation between the inputs and outputs. The outputs d and bo can both be expressed as a function of the inputs. The result d is an exclusive-or (XOR) function, just as the sum was for addition.

a b li lo d
0 0 0 0 0
0 0 1 1 1
0 1 0 1 1
0 1 1 1 0
1 0 0 0 1
1 0 1 0 0
1 1 0 0 0
1 1 1 1 1
$$ \begin{align*} d&=a \oplus b \oplus l_i \\ l_o&=\overline{a} \cdot b + l_i \cdot (\overline{a \oplus b}) \end{align*} $$
1-bit full subtractor

To build a 4-bit subtractor we combine four of these building blocks.

4-bit borrow-propagate subtractor

For a faster approach or a Verilog implementation of the subtractor shown above, refer to Building Math Circuits.

Multiplication (ma)

multiplication icon Multiplication is another important operation in digital signal processing (e.g. fast Fourier Transfers). It is an excellent example of combining simple logic functions to make a much more complex function. The two methods that allow us to do this called “array multipliers”.

The array multiplier is a simple technique for implementing multiplication. It resembles the process how you would probably perform a multiplication yourself, aside from the fact that it sums up the partial products as it goes.

Example of carry-propagate array multiplier

Each partial product bit position can be generated by a simple AND gate between corresponding positions of the multiplicand A and multiplier B bits. Adding the partial products forms the product.

The multiplier adder (ma) shown below, is the basic building block for the multiplier. It uses an AND gate to form the partial product of two binary digits. To calculate the partial sum and carry out signals, it reuses the 1-bit full adders described earlier. The boolean equations describe the multiplier adder (ma) as a function of its ports:

x y si ci b co so
0 0 0 0 0 0 0
0 0 0 1 0 0 1
0 0 1 0 0 0 1
0 0 1 1 0 1 0
0 1 0 0 0 0 0
0 1 0 1 0 0 1
0 1 1 0 0 0 1
0 1 1 1 0 1 0
1 0 0 0 0 0 0
1 0 0 1 0 0 1
1 0 1 0 0 0 1
1 0 1 1 0 1 0
1 1 0 0 1 0 1
1 1 0 1 1 1 1
1 1 1 0 1 1 0
1 1 1 1 1 1 0
$$ \begin{align*} b &= x \cdot y\\ s_o &= s_i\oplus b\oplus c_i\\ c_o &= s_i \cdot b + c_i \cdot(s_i \oplus b) \end{align*} $$
1-bit Multiplier

The carry propagation array multiplier works similarly to the way humans multiply. The maij building blocks connect in a grid, where ij identifies the row and column. The figure below gives an example of a 4-bit multiplier.

4-bit carry-propagate array multiplier

You might notice some redundancy in the top row, where ma0x merely function as AND gates.

This method will provide the expected result, but it will take a relatively long time to do so because of the long carry chain. A more optimized method is the carry-save array multiplier described below.

Implementations of this and faster multipliers in Verilog HDL can be found in the accompanying articles Building Math Circuits.

Division (csm)

division icon Division is another example of combining simple logic function to make a more complex circuit. Note that the method shown below ignores the divide-by-0 condition. (See also Multipliers and dividers.)

Next we’ll introduce the most basic division method called attempt subtraction divider. In calculating x/y, it repeatedly subtracts the divisor y⋅os* from the digits of the dividend x. Initially the value of os equals logic 0, so that it subtracts the value y. If the difference is negative, the subtraction is cancelled by making os logic 1. Together, all the successive OS‘s together makes the division result, while the remainder is the difference of last subtraction step.

Example for attempt-subtraction division

To implement this algorithm, we need subtractors that can cancel the subtraction. Each subtractor calculates the difference between two input numbers, but if the result is negative, the operation is canceled and replaced with a subtraction of zero.

Such a Controlled Subtract-Multiplex (CSM) contains a 1-bit subtractor a-b with the usual inputs a, b, and bi and outputs d and bo. The output select (os) signal selects between bit x and d=a-b. The signal is connected to the borrow output of the most significant 1-bit subtractor.

  • 0, means subtraction result was positive ⇒ D’ = D.
  • 1, means subtraction result was negative ⇒ D’ = X.

Inside each divider cell the os controls a multiplexer that selects between the result of the addition d and the original remainder x. The boolean expressions express the output as a function of the inputs. (Click image to animate)

os x y bi bo d
0 0 0 0 0 0
0 0 0 1 1 1
0 0 1 0 1 1
0 0 1 1 1 0
0 1 0 0 0 1
0 1 0 1 0 0
0 1 1 0 0 0
0 1 1 1 1 1
1 0 0 0 0 0
1 0 0 1 1 0
1 0 1 0 1 0
1 0 1 1 1 0
1 1 0 0 0 1
1 1 0 1 0 1
1 1 1 0 0 1
1 1 1 1 1 1
$$ \begin{align*} d^\prime &=x \oplus y \oplus b_i\\ d&=os \cdot x + \overline{os} \cdot d^\prime\\ b_o&=\overline{x} \cdot y + b_i \cdot (\overline{x \oplus y}) \end{align*} $$
1-bit Controlled Subtract-Multiplex

The circuit shown below implements a complete 8:4-bit divider matrix using CSM modules. Each row performs one “attempt subtraction” cycle. Note that the most significant bit is used to drive the os inputs. (For details see “Combinational arithmetic“.)

8:4-bit Attempt-Subtraction

For a faster approach or a Verilog implementation of above divider, refer to Building Math Circuits.

Square root

square root icon The square root operation is an interesting algorithm to implement in hardware. Its implementation is another example of combining simple logic functions, reusing previous designs.

Samavi and Sutikno improved the classical non-restoring digit recurrence square root algorithm. This example uses their algorithm but without the optimizations. The algorithm of this Simplified-Samovi Square root is:

  1. Add the next bit from the input number to the right of the current remainder. This becomes the new current remainder (A)
  2. Take the square root obtained so far, append 01 to it and subtracts this, properly shifted, from the current remainder. (The 0 in 01 corresponds to multiplying by 2; the 1 is a new guess bit.)
  3. If the result is
    • Positive, then the new root bit is 1, and the result becomes the new remainder.
    • Negative, then the current remainder (A) will become the new remainder (as if the subtraction never happened).
Example for simplified Samovi square root

The circuit shown below implements a complete 8-bit square root circuit. Each row performs one “attempt subtraction” cycle. Like in the division circuit, the most significant bit is used to drive the os inputs.

8-bit simplified-Samovi square root

For a Verilog implementation of above square root algorithm, refer to Building Math Circuits.

I hope the this answered the question “How do computers do math?“. The easy answer would be “because the physics make it”, just like a ball rolling down a hill. I hope you enjoyed our much longer journey into this exciting field!

For those interested in building these math circuits, I suggest reading the Building Math Circuits and chapter on programmable logic. If you want to know more about digital logic, I suggest reading the next chapter that introduces sequential logic that allows us to build more complicated systems.

Metal oxide silicon field effect transistor

This article describes the physics of Metal Oxide Silicon Field-effect Transistors (MOSFET). It uses these MSOFETs to build logic gates. This is part in a quest to answer the question “How do computers do math?”.\(\)

At first glance, the construction of the MOSFET might appear similar to the JET. However, there are some important differences: (1) a thin layer of silicon dioxide (SiO2) isolates the gate, and (2) they have no obvious conductive channel.

The illustration below shows symbols for, and examples of, MOSFETs.

N-channel MOSFET Symbol
P-channel MOSFET Symbol
MOSFET Packages

Semiconductor physics

In the n-channel MOSFET, a p-type substrate connects to the source and drain pins via more heavily doped n+ type regions, as shown below. A very thin layer of silicon dioxide insulates the metal gate terminal from the substrate. A bias over the drain-source terminals causes the depletion layer around the drain to widen. The depletion areas prevent a drain current.

(c) 2016 Copyright Coert Vonk
N-channel MOSFET with Ugs=0

Applying a gate potential \(U_{gs}\) makes the gate positive with respect to the source and substrate, as shown below. The positive charge on the gate repels holes from the p-layer underneath, forming a depletion area. This charge also attracts free electrons from the n+ source and drain that form a thin layer under the gate terminal bridging the source and drain areas. This layer is called an inversion channel because it behaves like a n-type material. The inversion channel is of the same type as the source and drain, and thus it provides a channel through which current can pass.

(c) 2016 Copyright Coert Vonk
N-channel MOSFET with Ugs

Any further increase of the gate voltage attracts more electrons into the inversion channel, reducing its resistance and increaing the current between the source and drain. Removing the gate source voltage stops the current leading and the area beneath the gate to revert to p-type. This method of operation is called enhanced mode, because the gate potential “enhances” the conducting channel. There are also devices where a gate potential “depletes” the conducting channel.

Due to the physics of MOSFET, these transistors draw very little power because the gate current is extremely low; they’re as small as 7 nanometers, the width of 10 silicon atoms; and can switch on/off in the order of GHz. This makes them an ideal building block modern day computers that use integrated circuits with trillions of MOSFETs on one piece of silicon.

MOSFET based logic

A voltage input to the gate controls the flow of current from source to drain. Being voltage-controlled rather than current-controlled, it allows for resistor-less circuits. CMOS draws no current other than leakage when in a steady 1 or 0 state. When the gate switches states, it draws current to charge the capacitance at the gate.

To implement logic functions, a complementary pair of MOSFETs can connect an output to either Vcc or ground. The name Complementary Metal Oxide Semiconductor (CMOS) refers to the use of complementary pairs of p-type and n-type MOSFETs. Note that the CMOS voltage levels are 0 volt and 3.3 volts.

NOT gate in CMOS

The circuit shown below gives the most basic implementation of a CMOS gate, where the FETs are used invert a logical input signal.

NOT-gate in CMOS
Ideal NOT-gate

When a 3.3 volts signal (logic 1) is applied at input \(A\), the n-channel MOSFET is conducting, and the p-channel MOSFET is not, and the output \(X\) is a logic 0. On the other hand, if the input is grounded (logic 0) the situation is reversed, and the output \(X\) is a logic 1. A simulation confirms that the output values stay well within the CMOS range.

Simulation results
\(U_A\) \(U_X\)
0.00 V 3.30 V
3.30 V 0.00 V

By extending this NOT gate, we can build NOR and NAND gates.

NOR gate in CMOS

Putting two n-channel MOSFETs in parallel and two p-channel MOSFETs in series, we can create a two-input NOR gate as shown in the circuit below. A simulation confirms that these output values stay perfectly within CMOS range.

(c) 2016 Copyright Coert Vonk
NOR-gate in CMOS
Simulation results
\(U_A\) \(U_B\) \(U_X\)
0.00 V 0.00 V 3.30 V
0.00 V 3.30 V 0.00 V
3.30 V 0.00 V 0.00 V
3.30 V 3.30 V 0.00 V

NAND gate in CMOS

By putting two n-channel MOSFETs in series and two p-channel MOSFETs in parallel, we can build a two-input NAND gate as shown in the circuit below. Once more, simulation confirms that the output values are at CMOS values.

(c) 2016 Copyright Coert Vonk
NAND-gate in CMOS
Simulation results
\(U_A\) \(U_B\) \(U_X\)
0.00 V 0.00 V 3.30 V
0.00 V 3.30 V 3.30 V
3.30 V 0.00 V 3.30 V
3.30 V 3.30 V 0.00 V

With the MOSFET semiconductor physics down and the logic gates worked out, the next chapter move on to combining these gates to build circuits that performs math operations. In doing so, we move close and close to answering our inquiry “How do computer do math”. More details can be found in the MIT handout CMOS Transistors, Gates, and Wires.

Field effect transistor

This post describes the semiconductor physics of the Field Effect Transistors (FET). It uses the FET to build logic gates. This is part of a quest to answer the question “How do computers do math?”. Here we introduce an improved transistor type. \(\)

The Field Effect Transistor (FET) can be used as a voltage-controlled resistor.

Lilienfeld and Heil discovered the principle of field-effect as early as 1926-1927 [patent], but it took until well after the invention of the bipolar junction transistor before it became practical to make them (Atalla, 1960). Even though FETs are conceptually simple, they are difficult to manufacture because they require an extremely clean manufacturing environment.

The illustration below shows the symbols for the JFETs.

N-FET Symbol
P-FET Symbol
FET packages

Semiconductor physics

In the n-channel device, the channel is in between two connected p-type regions, as shown in the illustration below. More heavily doped n+ type regions connect the source and drain pins to the n-type channel. This n+ type silicon prevents an unwanted pn-junction between the regular n-type silicon with its four valence electrons and the e.g. aluminum terminal with its three valence electrons.

When we place a modest voltage \(U_{ds}\) over the drain and source, the ions at the depletion layers give the channel a positive charge compared to the gate. The depletion layer is thicker towards the drain end of the channel, as the voltage on the drain is more positive than that on the source due to voltage gradient that exists along the channel. The current flows and the silicon channel acts similarty to a conductor.

(c) 2016 Copyright Coert Vonk
N-channel JFET with Ugs=0

To use this JFET as a switch, we can increase the thickness of the depletion zone by applying a reverse bias on the gate, as shown below. This pushes the holes from the p-type region and the electrons in the n-type region away from the junction, increasing the width of the depletion zone. Eventually, once enough potential is applied, the transistor will pinch off the channel current (\(I_d\)). This pinch off voltage is typically a negative few volts.

To use this JFET as an amplifier, we can increase \(U_{ds}\) while \(U_{gs}=0\). At first, the drain current \(I_d\) will increase, but meanwhile the depletion layer is also growing and narrowing the n-channel. At the so called “pinch-off” value \(U_{p}\), the conducting channel is so narrow that it cancels out the effect of the higher \(U_{ds}\). The \(I_d\) doesn’t increase much further, and the JFET is said to be in saturation mode. At this point, a small \(U_{gs}\) can be used to control the current through the source−drain channel from its maximum value to zero. [learn-electronics]

(c) 2016 Copyright Coert Vonk
N-channel JFET with Ugs

JFET based logic

JFETs have very nice properties, but when used in circuits they still require bulky resistors. As we see in the next section, MOSFETs do not require such resistors.

Enough about FET semiconductor physics and building logic gates. The next chapter introduces an improvement on this Field Effect Transistor.

Bipolar junction transistor

This post describes the physics of Bipolar Junction Transistors (BJT), and uses these to build logic gates. This is part of a quest to answer “How do computers do math?”.

If you are pressed for time, overwhelmed by the semiconductor physics, or simply to curious to find the answer to the inquiry, you may skip forward to Math operations using combinational logic. However, if you are somebody who likes to know it all, I encourage you to work through this chapter. \(\)

Bardeen, Brattain and Shockley invented the transistor in 1947. They greatly reduce energy consumption, weight and footprint compared to the existing vacuum tubes. This enabled groundbreaking new developments such as landing a man on the moon. The illustration below shows the symbols for bipolar junction transistors (BJT, along with some examples of them.

NPN transistor symbol
PNP transistor symbol
Transistor packages
Source: wikimedia.org

In this context, the term “bipolar” refers to the two charge carriers: electrons and holes in the same crystal. The model of the Bipolar Junction Transistor is a “voltage controlled current device”, even though for small signals we typically model it as a current amplifier. We will explain this by exploring the BJT semiconductor physics of a NPN-junction.

Semiconductor physics

Joining three layers of semiconductor material creates a transistor. There are two types of bipolar junction transistors. The PNP transistor has a very thin n-type layer between p+ and p-type material. In this the + sign indicates more heavily doped material. The other type of transistor, the NPN transistor, has a very thin p-type layer between n and n+ type materials as shown in the image below.

The physics of both types of transistor are very similar, except that the electron flow is dominant in NPN transistors, while PNP transistors rely mostly on the flow of “holes”. NPN transistors are more popular, because electrons move faster than “holes”. In this discussion, we will focus on the NPN transistor.

When a NPN transistor has no bias (voltage), the base-emitter and base-collector junctions behave like diode junctions as shown in the image below. That means we can apply what we learned in the previous chapter. At both junctions, the electrons from the n-type material and the holes from the p-type attract and recombine leaving positive and negative ions behind. This continues until the electrons and holes no longer have enough energy to overcome the electrostatic field created by the ions, about 0.65 volts. The resulting depletion area acts as an insulator.

NPN transistor unbiased

When we apply a forward bias on the base-emitter junction, as shown below on the left, it pulls the holes in the p-type region to the electrons on the - terminal of the battery, and the electrons in the n-region to the + terminal. With both the electrons and holes pulled towards the junction, the depletion layer becomes very thin. Once enough voltage is applied, the base-emitter junction starts conducting and electrons and holes flow freely. The result is a small flow of holes from the base to the emitter, and electrons from the opposite direction. Together these flows are called the base current \((I_b)\).

When we also apply a strong reverse bias on the collector-base junction, the free electrons and holes pull away from the junction and the depletion layer widens as shown in the bottom-right image. The magic lays in the extremely thin depletion layer between the base and emitter and the strong charge on the collector.

NPN Transistor with base-emitter bias
NPN transistor with base-emitter and strong base-collector bias

Once the electrons “emitted” by the emitter reach the very narrow base region, the strong electric field from the collector attracts them. The electrons will cross the reverse biased collector-base junction to be “collected” by the collector. The result is a net flow of electrons from the collector to the emitter, or in other words: an electrical current from the collector to the emitter (\(I_c\)). This current is substantially larger than the current (\(I_b\)) that reduced the width of the base-emitter depletion zone. In other words, the small base current (\(I_b\)) controls the much larger collector current (\(I_c\)).

Note that even though the base-emitter threshold voltage is 0.65 volts, the collector-emitter drop will be much lower, in the order of a few millivolts. This is because the electric field over the collector-base junction is opposite to that of the base-emitter junction.

Transistor based logic

The problem with diode-resistor circuits was the signal degradation which kept them from being cascaded. RTL replaces the diode switch with a transistor switch. The transistor ensures that the output voltage will always be a valid logic level, so they can be cascaded indefinitely.

Resistor-transistor logic (RTL)

Transistors can be used to amplify signals, but in this context we are only interested in using them as on/off switches. This implies that we apply no base-emitter bias, or apply enough bias to make the collector-emitter conduct.

NOT Gate in RTL

The circuit shown below gives the most basic implementation of a BJT gate, where the transistor is used to invert a logical input signal.

NOT gate in RTL
NOT gate in RTL
Behavior of NOT gate in RTL
Behavior of NOT gate in RTL

When a 5 volts signal (logic 1) is applied at input \(A\), the base-emitter junction is forward biased, resulting in maximum collector current and a minimum collector-emitter voltage drop. The transistor is switched “on”, and the output \(X\) is a logic 0. On the other hand, if the input is grounded (logic 0), the base current (\(I_b\)) is zero, and there will be no collector-emitter current. With the transistor switched “off”, the resistor pulls the output signal up to 5 volts, a logic 1.

A simulation confirms that the output values stay well within the TTL range. The circuit is at rest when the inputs is logic 0 (0 V). In this case, the resistor pulls the output up to 5 volts. An input value of logic 1 turns the transistor “on” so that current flows through the resistor and into the connector-emitter.

Voltages
\(U_A\) \(U_X\)
0.00 V 5.00 V
5.00 V 0.06 V

This implies that the transistor inverts the input signal, while ensuring that the output voltage is a valid TTL level. By extending this NOT gate, we can build other logic gates.

NOR Gate in RTL

Putting two NOT gates in parallel, creates a two-input NOR gate as shown in the circuit below. A simulation confirms that the output values stay within the TTL range. The circuit is at rest when both inputs are logic 0 (0 volts). In this case, the resistor pulls the output up to 5 volts. When either input is logic 1, the corresponding transistor turns “on” so that current flows through the resistor and connector-emitter.

NOR gate in RTL
\(U_A\) \(U_B\) \(U_X\)
0.00 V 0.00 V 5.00 V
0.00 V 5.00 V 0.06 V
5.00 V 0.00 V 0.06 V
5.00 V 5.00 V 0.04 V
Voltages

NAND Gate in RTL

Putting two NOT gates in series creates a two-input NAND gate as shown in the circuit below. The simulation confirms that the output values stay within TTL range. The circuit is at rest when at least one input inputs is logic 0 (0 V). In this case, the resistor pulls the output up to 5 volts. When both inputs are logic 1, the corresponding transistors turn “on” so that current flows through resistor and connector-emitter, driving the output to logic 0.

NAND gate in RTL
voltages
\(U_A\) \(U_B\) \(U_X\)
0.00 V 0.00 V 5.00 V
0.00 V 5.00 V 5.00 V
5.00 V 0.00 V 5.00 V
5.00 V 5.00 V 0.12 V

Limitations of RTL

The RTL gates have two limitations: (1) the input resister and \(C_{be}\) create a RC time that slows down state transitions, and (2) the input resistor take up a significant footprint in an integrated circuit. We will address the first limitation by replacing the input resistor with diodes. The next page tackles the second limitation.

Diode-transistor logic (DTL)

The switching speed of the NOT gate can be improved by replacing the input resistor with two diodes and a resistor as shown in the circuit below. Diodes are not only smaller but also have a low internal resistance when forward biased, thus allowing faster switching. The diode connected to the transistor base raises the input voltage required to turn the transistor “on” to about 1.3 volts.

NOT gate in DTL
voltages
\(U_A\) \(U_X\)
0.00 V 5.00 V
5.00 V 0.04 V

The simulation results are noted in the table on the right. Similar to the resistor-transistor logic, by adding input diodes, we can build an NAND gate, and by placing the inverter in parallel we can make a NOR gate.

Transistor-transistor logic (TTL)

In the DTL gate, there are two diodes have their P-type anodes connected together and to the pull-up resistor. A single NPN transistor can replace these two diodes while taking up about the same amount of space. The figure below shows a NOT gate build using TTL logic. The simulation results are noted in the table on the right.

NOT gate in TTL
voltages
\(U_A\) \(U_X\)
0.00 V 5.00 V
5.00 V 0.02 V

With the BJT semiconductor physics down, and some logic gates explained, the following chapter introduces a type of transistor that does not need the bulky resistors, and describes how to use it to build logic gates.

Gates and diode-resistor logic

This post gives an introduction to digital logic, dives into the semiconductor physics and finishes with an implementation in diode-resistor logic. It forms part of the quest to answer How do computers do math?.\(\)

Logic gates

Digital systems use gates to make logical operations based on their input signals. These gates take one or more binary inputs and produce one binary output.

OR gate

The OR gate is one of the simplest gates. The symbol and truth table for the OR gate are shown in the illustration below. The output is 1 when either input A or B are 1, otherwise it is 0. The OR gate is represented by Boolean algebra operator \(+\). Note that in Boolean algebra \(1+1=1\).

(c) Copyright 2016-2022, Coert Vonk
OR gate
\(A\) \(B\) \(A+B\)
0 0 0
0 1 1
1 0 1
1 1 1

AND gate

The AND gate, is represented by the Boolean algebra operator ‘\(\cdot\)’. Its output is 1 when both inputs are 1, otherwise it is 0.

(c) Copyright 2016-2022, Coert Vonk
AND gate
\(A\) \(B\) \(A\cdot B\)
0 0 0
0 1 0
1 0 0
1 1 1

XOR gate

The eXclusive OR gate (XOR) is represented by the symbol \(\oplus\).

(c) Copyright 2016-2022, Coert Vonk
XOR gate
\(A\) \(B\) \(A \oplus B\)
0 0 0
0 1 1
1 0 1
1 1 0

NOT gate

The inverter gate (NOT) represented by a bar above as in \(\overline{A}\).

(c) Copyright 2016-2022, Coert Vonk
NOT gate
\(A\) \(\overline A\)
0 1
1 0

NOR and NAND gate

At times, an OR gate is combined with an inverter to form Not-OR, or NOR for short. The table below shows this gate and its truth tables.

(c) Copyright 2016-2022, Coert Vonk
NOR gate
\(A\) \(B\) \(\overline{A+B}\)
0 0 1
0 1 0
1 0 0
1 1 0

Other times, an AND gate is combined with an inverter to form to form Not-AND, or NAND for short.

(c) Copyright 2016-2022, Coert Vonk
NAND gate
\(A\) \(B\) \(\overline{A\cdot B}\)
0 0 1
0 1 1
1 0 1
1 1 0

Semiconductor physics

There are many different ways to implemented digital logic. Nazi Germany (1941) used a system with electro-mechanical relays to calculate artillery-firing tables. Two years later the US army built a vacuum tube based system to simulate the hydrogen bomb. Currently, digital systems are built using the semiconductors we will discuss here.

Have you ever wondered about the jagged line on the right of the periodic table? It separates the metals (conductors) on the left from the non-metals (insulators) on the right. The elements creating the zigzag line share some properties of both conductors and insulators. These elements are called semiconductors or metalloids.

http://www.ptable.com/
Periodic table Illustration
source: ptable.com

Using these semiconductors, we can build the gates that we need for digital systems. For this discussion, we will focus building gates using the element silicon.

Pure silicon

Silicon is a a common semiconductor material. It has four electrons in the outer shell (valence shell). This causes it to form covalent bond with four neighboring atoms so that in its valence shell all eight electrons are present (as shown in the illustration below). These bonds hold the atoms together, making the element stable.

(c) Copyright 2016-2022, Coert Vonk
Covalent bonds between Silicon atoms

The bonds help form an orderly crystaline structure called a lattice. Because there are no free electrons, the material will not conduct electricity. The illustration above shows a simplified view of the lattice. The conductivity of the lattrice can be increased by adding very a small number (<1 ppm) of impurities. Examples of such impurities are phosphorous and boron.

N-type silicon

Adding a tiny amount of phosphorous, which has five electrons in valence band, leaves an extra electron that can move easily. Four out of the five phosehorus’ valence electrons bond with their neighboring silicon atoms. This leaves one free electron that becomes mobile.

The resulting material is called an n-type because of the abundance of negatively charged electrons.

Variance shell in n-type material
Simplified view of free electrons in n-type material

When you apply a voltage over an n-type material, electrons will move through the latticswork just as current would flow in a copper wire. The positive potential of a battery attracts the free electrons in the crystal. These electrons will leave the crystal and flow into the positive terminal of the battery. As electrons leave the lattice, electrons from the negative terminal of the battery will enter the lattice, completing the circuit.

P-type silicon

Boron has only three electrons in its valence band. Add a tiny amount of it to silicon only allows for the formation of three covalent bonds. This leaves a hole where an electron can go.

The hole can move if a neighboring electron fills the hole, thereby creating a new hole in the neighboring location. You can imagine a hole has missing a negatively charged electron. In other words, the hole has a positive charge.

The resulting material is called a p-type semiconductor.

Variance shell in p-type material
Simplified view of holes in p-type material

Conduction in the p-type material is by positive holes, instead of negative electrons. A hole moves from the positive terminal of the p-type material to the negative terminal. Electrons from the external circuit enter the negative terminal of the material and fill holes near this terminal. The positive terminal removes electrons from the covalent bonds, creating new holes.

This process continues as the steady stream of holes (hole current) moves toward the negative terminal.

PN-junction

The magic starts when you create a PN-junction by bringing a p-type and n-type material in close contact. In the interface layer, the electrons from the n-type material and the holes from the p-type attract and eliminate each other in a process called recombination. Because of the lack of free electrons and holes in this area, we call it the depletion region.

Diode materials
Recombination at the PN-junction

The loss of an electron from the n-type material leaves a positive ion, while the loss of a hole from the p-type material leaves a negative ion. The crystal lattice structure keeps these charged ions in place, creating an electrostatic field across the junction.

This continues until the electric field from the negatively charged ions in the p-type material gets so strong that it repels the free electrons from the n-type material. In other words, the recombination of electrons and holes across the junction continues until the electrostatic field reaches the point where the electrons and holes no longer have enough energy to overcome it. At this point, an equilibrium is established. For all practical purposes, the movement of carriers across the junction ceases.

This so called depletion layer has no free electrons or holes, and consequently does not conduct electricity. At the depletion layer, the ions in the n and p-type materials cause a respectively positive and negative charge. This potential difference is typically measures 0.65 volts for silicon. In effect, the n-type material has become positive compared to the p-type.

If there were a free electron at the junction barrier, it will now require extra energy to cross the depletion region.

PN-junction diode

This pn-junction makes a diode that conducts electrical current in one direction, and blocks current in the other direction. We will explain this by applying a bias voltage over the diode visualized with a battery. The energy from the battery will affect the depletion zone shown in white.

The symbol and some samples of diodes are shown below.

Diode symbol
Various diodes
Source: fluke.com

Reverse bias occurs when we apply a negative voltage, as shown below on the left. The free electrons in the n-region are pulled to the + terminal of the battery and the holes in the p-type region are pulled to the - terminal. With both charge carriers pulled away from the junction, the depletion layer widensm and the diode will conduct electricity even less.

(c) Copyright Coert Vonk. All rights reserved.
PN-junction in reverse bias
(c) Copyright Coert Vonk. All rights reserved.
PN-junction in forward bias

When we reverse the battery, the diode is in forward bias, as shown above on the right. The free electrons in the n-region are attracted to the + terminal, and the holes in the p-type region are attracted to the electrons on the - terminal. With both the electrons and holes pulled towards the junction, the depletion layer gets very thin. Once enough voltage is applied, the diode starts conducting and electrons and holes flow freely with minimum resistance. This threshold voltage \(U_{th}\) is about 0.65 volts for silicon diodes.

Characteristic voltage-current curve
Scope used as diode tracer
Source: experimentalistsanonymous.com

With the two opposite flow of electrons and holes, some will eventually recombine. An electron will fall from a higher energy level to a the lower energy level of the hole. This leads to energy beging released in the form of a photon.

According to quantum theory, the energy of a photon is the product of frequency \(f\) of electromagnetic radiation and the Planck constant \(\hbar\). The frequency is the quotient of the speed of light \(c\) and the wavelength \(\lambda\).

In silicon or germanium diodes, the wavelength corresponds to infrared radiation (heat). For Light Emitting Diodes (LEDs), the p and n-type materials are chosen so that the energy is released as visible light.

$$ \left. \begin{align} E_g&=\hbar\ f\nonumber\\ f&=\frac{c}{\lambda}\nonumber \end{align} \right\} \implies\ \lambda=\frac{\hbar\ c}{E_g} $$

When we plot the current as a function of the bias voltage, we get the characteristic graph shown above. From the graph, we see that the diode has a very small leak current in reverse bias, and starts conducting once the forward bias surpasses the threshold voltage \(U_{th}\).

Diode with current limiting resistor

Using an AC signal generator and diode-resistor circuit, an xy-plot on an oscilloscope can visualize this. The voltage over the resistor is representative of the current through the diode.

Diode-resistor logic (DRL)

By combining diodes with resistors, we can create logic gates. In these, the diodes act as electrically operated switches. The following examples assume TTL levels, where an input value of 0 to 0.8 volts is recognized as a logic 0, and 2 to 5 volts is recognized as 1.

TTL voltage levels

OR Gate in DRL

If at least one of the inputs is a logic 1 (5 V), the output will be a logic 1. Otherwise, the output will be a logic 0. In Boolean algebra that is written as \(X=A+B\).

OR gate in DLR

A logic OR-gate can be built using two diodes and a pull-down resistor as shown above. The 1N4148 is a common switching diode. If both inputs are logic 0, then the resistor will pull the output to ground (logic 0). If either input is a logic 1, the resistor limits the current through the diode. In that case, the output will be 5 volts minus the diode voltage drop. This is considered a logical 1.

In the simulation, click on an input on the left to toggle its state. The circuit is at rest when both inputs are Low (0 V). In this case, the resistor pulls the output down to 0 volts. When either input level is high, current flows through the diode and the resistor. The diode is in forward bias with a voltage drop of about 0.65 volts. Consequently, the output will be 5 – 0.65 volts, which is recognized as logic 1.

AND Gate in DRL

If both inputs are logic 1 (5 V), the output will be a logic 1. Otherwise, the output will be logic 0. In Boolean algebra this is written as X = A · B.

(c) Copyright 2016-2022, Coert Vonk
AND gate in DRL

A OR gate can be made as shown in the schematic above. If both inputs are logic 0, current will flow through the diode, making the output equal to the diode voltage drop (0.65 V), a logic 0. Otherwise, the resister will pull the output to 5 volts, a logical 1.

In the simulation, click on an input on the left to toggle its state. The circuit is at rest when both inputs are logic 1 (5 V). In this case, the resistor pulls the output up to 5 volts. When either or both inputs are logic 0, current flows through the resistor and diode. The diode is in forward bias with a voltage drop of about 0.65 volts. Consequently, the output will be 0.65 volts above the input value of 0 volts, what is recognized as logic 0.

Combining gates in DRL

As we have seen, the output voltage levels of the single gates were a little off from the ideal levels of 0 and 5 volts. The circuit below cascades 2 OR gates with one AND gate to build the Boolean expression \(X=(A+B)\cdot(C+D)\).

(c) Copyright 2016-2022, Coert Vonk
Double OR to AND gates in DRL

Simulating this circuit reveals a problem when the output should be 0. The circuit in effect acts as a voltage divider, where the current from the pull-up resistor splits over the two diodes and flows to ground over both pull-down resistors. Output voltage \(U_x\) follows from the current through the pull-up resistor \(I_{u}\):

$$ \newcommand{\parallelsum}{\mathbin{\!/\mkern-5mu/\!}} \left. \begin{align} I_{u} &= \frac{5 – 0.65}{10\mathrm k + 10\mathrm k \parallelsum 10\mathrm k} = \frac{4.35}{15\mathrm k} \nonumber \\ U_X&=5 – I_{u}\cdot 10\mathrm{k} \nonumber \end{align} \right\} \implies U_X = 5 – \frac{4.35}{15\cancel{\mathrm{k}}}\times10\cancel{\mathrm{k}}=2.1\ \mathrm V $$

Simulation confirms that combining the gates causes a problem where the output should be a logical 0 as shown below on the left. We can improve this specific circuit by increasing the pull-up resistor to \(470\ \mathrm{k\Omega}\). I will spare you the math, and simply hover over the \(X\) wire in the simulation to reveal the output voltage values . Now, all these levels are within the correct TTL range as shown below on the right.

Simulated results before
\(U_A\) \(U_B\) \(U_C\) \(U_D\) \(U_X\)
0.00 V 0.00 V 0.00 V 0.00 V 2.10 V
0.00 V 5.00 V 0.00 V 0.00 V 2.83 V
5.00 V 0.00 V 5.00 V 0.00 V 4.88 V
5.00 V 5.00 V 5.00 V 5.00 V 4.90 V
Simulated results after
\(U_A\) \(U_B\) \(U_C\) \(U_D\) \(U_X\)
0 V 0 V 0 V 0 V 0.60 V
0.00 V 5.00 V 0.00 V 0.00 V 0.67 V
5.00 V 0.00 V 5.00 V 0.00 V 4.79 V
5.00 V 5.00 V 5.00 V 5.00 V 4.81 V

The following chapter will introduce the transistor and how it can be used to build improved logic gates. When using resistor-diode logic, the voltage drop in the diodes and power dissipation in the resistors degrade the signal as it passes through the gates. As we will see on the next page, these problems solved by introducing another semiconductor: the transistors.

Electronic circuits

In electronic circuits, an electric signal controls the flow of current. We introduces analog, digital and microprocessor systems. This is part 2 in a quest to answer the question “How do computers do math?”. In this chapter we introduce the different types of electronic circuits.\(\)

In electronic circuits, another electric signal controls the flow of electrons. This electrical input signal typically represents physical properties such as sound, vision or temperature. Converting these physical properties to electrical signals, allows them to be easily manipulated and transported.

We distinguish three types of electronic systems: analog, digital and microprocessor systems.

Analog systems

In analog systems, a continuously variable voltage on a wire has a proportional relationship to a physical property as illustrated on the right. The word “analog” comes from the Greek word ανάλογος (analogos) meaning “proportional”. For example, a voltage of -1 to 1 volt may represent a sound pressure of -15 to 0 dBa.

Analog waveform
Example of an analog signal

In an old fashion telephone, a battery provides the energy. The carbon powder in the microphone controls the current, as the vibrations from the voice causes the resistance to fluctuate. Copper wires transport the analog signal. Eventually, on the other end, a loudspeaker recreates the sound.

(c) 2017 Coert Vonk
Analog telephone circuit

Analog systems consist of components such as batteries, resistors, capacitors, diodes and transistors. They first became viable with the introduction of the vacuum tube (Fleming, 1904; de Forest, 1907) and later the semiconductor diode (Ohl, 1939) and transistor (Bardeen, Brattain and Shockley, 1947).

Various transistor packages
Source: wikimedia.org

If you interested in the history, consider watching the PBS documentary Transistorized! (the original video vanished, but it can still be found on YouTube) or the Silicon Valley American Experience.

(c) 2017 Coert Vonk
Analog system architecture

The thermal behavior of the components and the transmission of the analog signal introduce noise. Reducing noise and thereby improving accuracy is complicated and expensive in analog systems. The inherent inaccuracy was acceptable for telephony, but not for precision tools such as calculators. As we will see next, digital systems can offer this precision.

Digital systems

In digital systems, there are only two voltages. One voltage corresponds to a logic 1, and the other to logic 0. These systems are binary after the Latin word bīnārius that stands for “consisting of twos”. Digital systems consist of circuits build using digital gates. Internally the gates use analog components, but the input and output are digital, so only logic 1 and logic 0.

Digital systems became viable with the introduction of the transistor. Later with the invention of integrated circuits, digital systems became commonplace (Kilby and Noyce, 1959).

TTL voltage levels
own work
Weighted values

By combining a set of \(n\) wires and assigning them weighted values, we can transmit \(2^n\) binary values. In the above example with four wires, a mass from 0 to 100 kg may be represented by a 4-bit binary signal from 0000 to 1111. That implies that 16 different values can be transmitted, decimal 0 to 15.\(\)

$$ \require{color} \begin{aligned} \mathrm{b}\color{red}{0000} &\equiv \tfrac{ \color{red}{0}\color{black}{\times 8}\ +\ \color{red}{0}\color{black}{\times 4}\ +\ \color{red}{0}\color{black}{\times 2}\ +\ \color{red}{0}\color{black}{\times 1}}{15} \cdot 100\ \mathrm{kg} = \tfrac{0}{15}\ \cdot\ 100 = 0\ \mathrm{kg}\\ \mathrm{b}\color{red}{0001} &\equiv \tfrac{ \color{red}{0}\color{black}{\times 8}\ +\ \color{red}{0}\color{black}{\times 4}\ +\ \color{red}{0}\color{black}{\times 2}\ +\ \color{red}{1}\color{black}{\times 1}}{15} \cdot 100\ \mathrm{kg} = \tfrac{1}{15}\ \cdot\ 100 = 6\tfrac{2}{3}\ \mathrm{kg}\\ \vdots\\ \mathrm{b}\color{red}{1111} &\equiv \tfrac{ \color{red}{1}\color{black}{\times 8}\ +\ \color{red}{1}\color{black}{\times 4}\ +\ \color{red}{1}\color{black}{\times 2}\ +\ \color{red}{1}\color{black}{\times 1}}{15} \cdot 100\ \mathrm{kg} = \tfrac{15}{15}\ \cdot\ 100 = 100\ \mathrm{kg}\\ \end{aligned} \nonumber $$
A 4-bit value can encode 16 different values

All voltages within a range represent the same digital value. As shown in the figure above, TTL-based logic recognizes an input value of 0 … 0.8 volt as logic 0, while 2 … 5 volt is recognized as logic 1. This implies that small changes in analog signal levels (noise) will be ignored allowing transmission over long distances without noise degradation.

Digital system architecture

The number of bits used limits the precision of digital systems. Digital systems can be very precise, but have to be custom designed for each task. During World War II, the Nazis and allied forces, used electro-mechanical systems and later vacuum tube based calculators to calculate artillery-firing tables. These systems were fast, but it took great effort to re-purpose the machines. Microprocessor systems add flexibility to digital systems.

Microprocessor systems

Since 1945, the von Neumann architecture forms the foundation for modern day computers. It allows the same device to perform a wide variety of functions. One moment it can be a scientific calculator, another a magazine reader of be used to call your cousin.

Phone used as a calculator
Phone as used as a news reader
Phone used as a video communicator

A programmable device takes digital input signals, processes them according to instructions stored in its memory and outputs results. The programmable device is a Central Processing Unit (CPU).

Microprocessor system architecture

Build using digital gates, microprocessors came to life with the invention of integrated circuits and spread rapidly once the whole CPU fit on one piece of silicon (Intel and Texas Instruments, 1971). This architecture is described in more detail in our follow-up inquiry “How do Microprocessors Work?.

In the following chapters we will introduce the building blocks of digital systems, and go in-depth explaining the physics of its components. The next chapter starts with an introduction to digital logic and shows an implementation of diode based logic.

Electrical circuits

Before we can answer the question “How do computers do math?“, we need to get our toes wet with a short introduction to electrical circuits. What are the electrical properties and what does Ohm’s law have to do with it?\(\)

Turbulence in thunderclouds builds up static electricity because of the friction between water droplets. Your physics teacher might run a van der Graaff generator causing her hair to stand up, or you can simply rub a balloon over your sweater (demo). Objects that rub against each other create static charge. This charge caused by an excess or lack of electrons. The electrons can be easily transferred by rubbing objects, because they only have a loose bond with their atomic nucleus (protons and neutrons).

Electrostatic charge
Photo by ghosthuntingtheories.com
Electrostatic discharge

Static electricity may suddenly discharge, such as when a bolt of lightning flashes through the sky. Other times, objects with opposite charges attract like when a balloon clings to a sweater. In between these two extremes are experiments with a modest flow of electrons, such as the van der Graaff generator discharging through a fluorescent light tube.

Electrical charge is expressed in Coulomb and corresponds to a charge of \(-6.24151\times 10^{18}\) electrons. Electric current is the movement of charge (electrons). Your light bulb, computer and phone all harness the movement of electrons.

Current

Electric current flows when there is a closed loop, a circuit, around which electrons flow. For example, in a flashlight, once you close the switch the current flows from the battery through the bulb, wires, switch and back through the battery. The current through the thin wire of the light bulb causes it to heat and light up. When you open the switch, it breaks the current and the light turns off.

Electrons jumping through a semiconductor
Electron traversing a copper wire

The illustration above visualizes an electron traversing a wire from atom to atom. In real life, the scale if very different. The atoms are so small that they are invisible to the naked eye. To get an idea of the scale, imagine a current of about 1,000,000,000,000,000,000 electrons per second to power a light bulb.

Current is the movement of charge, with a unit of 1 Ampere corresponding to moving a charge of one Coulomb a second.

The unit name for current “ampere” is in remembrance of André-Marie Ampère, a French physicist and mathematician and early pioneer of electromagnetism. He introduced the symbol \(I\) from the French term “intensité de courant”. Electrical current is the flow of positive charge opposite to the movement of electrons. This is an historic artifact because he discovered current 60 years before the discovery of the electron.

Analogy

Let us take a step back and compare the electrical current to a water system as shown below. When you open the valve in the pipe, water starts flowing down the pipe, similar to closing the switch in flashlight.

Flashlight

The amount of water flow depends on the pipe diameter and the height difference of the water. Similarly, in the flashlight the current depends on the voltage of the battery and the resistance of the light bulb.

Water analogy

Voltage

In the water model above, the voltage compares to the difference in water level. Voltage is the difference in charge between two points. For example, in the flashlight that is between the \(+\) and \(-\) terminals of the batteries. Instead of charge, people also use the term electric potential, the amount of force available to move electrons from one specific point in a circuit to another point.

The unit “volt” honors of the Italian physicist Alessandro Volta, who discovered the first chemical battery. The symbol for voltage depends on your location. Europe uses symbol \(U\) and the US uses the symbol \(V\). In either case, the unit Volt (\(V\)) is defined as the electric potential over a wire when an electric current of 1 Ampere dissipates 1 Joule of energy.

Resistance

The third electric property is resistance, a material’s tendency to resist the flow of charge (current). Free electrons move through a conductor with some degree of friction or resistance to motion.

A thin carbon wire has a high resistance, and a thick copper wire has a low resistance. In the water analogy, it compares with the narrowness of the water pipe. The narrower the pipe, the smaller the flow of water.

The symbol for resistance is \(R\). Resistance is measured in the unit Ohm (\(\Omega\)) after the German physicist Georg Ohm.

Ohm’s law

Ohm’s law describes the relationship between voltage, current and resistance. The amount of current is proportional with the voltage that motivates the electrons to move, and inversely proportional to the resistance that oppose the flow of electrons. Georg Ohm formulated this relationship as:

$$ U = I \times R \nonumber $$
Ohm’s law

The illustration below shows the circuit and the popular Ohm’s law triangle. Here \(U\) represents the voltage [Volt], \(I\) the current [A] and \(R\) the resistance in Ohms [Ω].

own work
Ohm’s law circuit
own work
Ohm’s law triangle

Using Ohm’s law, we can determine the voltage, current or resistance if the other two properties are known. For example, if we have a circuit with the potential of 12 Volt, a resistance of 120 Ω, then the current follows as $$ \frac{12\ \mathrm{V}}{120\ \Omega}=0.1\ \mathrm{A} \nonumber $$

Another example is a 9 volt battery powering a light emitting diode in series with a resistor. If we want a current of 10 mA, and at that current, the LED has a potential drop of 2.2 volts then we can calculate the required current limiting resistor value as $$ \def\e#1{\times 10^{#1}} \def\num#1{\numx#1}\def\numx#1e#2{{#1}\mathrm{e}{#2}} R=\frac{U_{bat}-U_{led}}{I}=\frac{9 – 2.2}{10\e{-3}}=680\ \Omega\nonumber $$

Armed with this knowledge about electrical circuits, let’s continue with Electronic Circuits.

Copyright © 1996-2022 Coert Vonk, All Rights Reserved