How do computers do math?

This is part 8 in the series on the quest to answer the question “How do computers do math?”. This part introduces synchronous sequential logic.

Synchronous sequential logic

The logic circuits that we have seen so far are referred to as combinatorial circuits. While these circuits can be used quite successfully for math operations, their simplicity comes at a price:

  • The input values need to remain constant during the calculation.
  • The output can have multiple logical transitions before settling to the correct value. The figure below shows that even adding two numbers without carry may cause multiple transitions.
  • There is no indication when the output has settled to the correct value.

(c) Copyright 2011 Coert Vonk
adding 32’d0 to 32’d1 implemented on Altera FPGA

This chapter addresses some of these issues by introducing sequential circuits, in which the output depends not only on the current inputs, but also on the past sequence of input values. That is, sequential logic has state (memory). [wiki]

In general, such a memory element can be made using positive feedback.

Memory element using positive feedback

Digital sequential logic circuits are divided into asynchronous and synchronous circuits.

Asynchronous

The advantage of asynchronous circuits is that the speed is only limited by the propagation delays of the gates, because the circuit does not have to wait for a clock signal to process inputs.

The state of asynchronous circuits can change at any time in response to changing inputs. As a result, similar to combinatorial circuits, the output can have multiple logical transitions before settling to the correct value.

Another disadvantage arises from the fact that memory elements are sensitive to the order in which their input signals arrive. If two signals arrive at a logic gate at almost the same time, the state the circuit settles into can depend on which signal reaches the gate first. Small manufacturing differences can thus lead to different behavior.

These disadvantages make designing asynchronous circuits very challenging; their use is limited to critical parts where speed is at a premium. [wiki]

A very basic memory element can be made using only two inverters. This circuit maintains its state, but because it lacks any input, that state cannot be changed.

Basic memory element using inverters

We continue with some common asynchronous circuits.

Set-Reset (SR) Latch (async, level sensitive)

The SR-latch builds on the idea of the inverter latch by introducing two inputs. Set (\(S\)) forces the next value of the output (\(Q_{n+1}\)) to \(1\). Reset (\(R\)) forces the next value of the output (\(Q_{n+1}\)) to \(0\).


SR-latch states

The state transition diagram provides a visual abstraction. It uses circles for the output states, and arrows for the transition conditions.


SR-latch with transitions

A complete diagram takes all the inputs (R and S) into account.

The state transition table shows the relationship between the inputs, the current value of the output (\(Q_n\)), and the next value of the output (\(Q_{n+1}\)). The ‘×’ represents a “don’t care” condition.

In Boolean algebra, this function can be expressed as:

$$\begin{aligned}
Q_{n+1}&=S+\overline{R}\cdot{Q_n}\\
&=\overline{\overline{S+Q_n}+R}
\end{aligned}$$
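As a quick sanity check (not part of the original article), the two forms of this next-state equation can be compared by brute force in Python. Note that they agree only when the forbidden input combination \(S=R=1\) is excluded:

```python
# Brute-force comparison of the two Boolean forms of the SR-latch
# next-state equation, over every combination of S, R and Q_n.
from itertools import product

def sum_of_products(s, r, q):   # Q_{n+1} = S + ~R . Q_n
    return s | ((1 - r) & q)

def nor_form(s, r, q):          # Q_{n+1} = NOR(NOR(S, Q_n), R)
    inner = 1 - (s | q)         # NOR(S, Q_n)
    return 1 - (inner | r)      # NOR(inner, R)

for s, r, q in product((0, 1), repeat=3):
    if s == 1 and r == 1:
        continue                # S=R=1 is forbidden; the forms differ there
    assert sum_of_products(s, r, q) == nor_form(s, r, q)
print("both forms agree for all allowed input combinations")
```

The divergence at \(S=R=1\) is exactly the problematic input discussed below.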

The function can be built with two NOR gates as shown below.


Memory element using NOR gates

With the circuit in hand, let us take a closer look at its operation:

  • When \(S=1\) while \(R=0\), the latch drives output \(Q\) to 1.
  • When \(R=1\) while \(S=0\), the latch drives output \(Q\) to 0.
  • When \(S=R=0\), the latch maintains its previous state.
  • When both inputs change to 1 sufficiently close in time, there is a problem. Whichever gate is faster wins the race condition. In the real world, it is impossible to predict which gate that will be, since it depends on minute manufacturing differences. A similar problem occurs when the device powers up and both \(Q\) and \(\overline Q\) are \(0\).

In Verilog HDL we could model this as

module SR_latch( input S, input R, output Q );
  wire Qbar;
  nor( Q, R, Qbar ); 
  nor( Qbar, S, Q );
endmodule
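The gate-level behavior can also be sketched in Python (an illustrative simulation, not Verilog): evaluate both NOR gates repeatedly until the outputs stop changing, then read off the settled state.

```python
# Gate-level simulation of the cross-coupled NOR SR-latch: iterate until
# the two gate outputs settle.
def nor(a, b):
    return 1 - (a | b)

def sr_latch(s, r, q=0, qbar=1):
    """Settle the latch for inputs S, R starting from state (q, qbar)."""
    for _ in range(10):                   # more than enough iterations
        nq, nqbar = nor(r, qbar), nor(s, q)
        if (nq, nqbar) == (q, qbar):      # outputs stopped changing
            break
        q, qbar = nq, nqbar
    return q, qbar

assert sr_latch(1, 0) == (1, 0)                 # S=1: set
assert sr_latch(0, 1, q=1, qbar=0) == (0, 1)    # R=1: reset
assert sr_latch(0, 0, q=1, qbar=0) == (1, 0)    # S=R=0: hold
```

With \(S=R=1\) this simple model would not settle into a valid state, mirroring the race condition described above.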

D-latch (async, level sensitive)

Here a \(D\) input makes \(R\) and \(S\) complements of each other, thereby removing the possibility of the invalid input state (metastability) we saw in the SR-latch. \(D=1\) sets the latch to ‘1’, and \(D=0\) resets it to ‘0’.

Note that in the circuit below the SR-latch is drawn as cross-coupled NOR gates.


D-latch

An enable signal controls when the values of the \(R\) and \(S\) inputs matter. The output reflects the input only when enable is active (\(EN=1\)). In other words, the enable signal serves as a level-triggered clock input. Level triggered means that the input is passed to the output for as long as the clock is active.

D-latches cannot simply be chained, because changes would just race through the chain. One could prevent this by inverting the ENABLE signal going to the second D-latch.

In Verilog HDL we would model this as

module D_latch( input D, input Enable, output reg Q );
  always @(D or Enable)
    if (Enable)
      Q <= D;
endmodule
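The level-sensitive behavior can be mimicked with a small Python model (illustrative only): the output tracks the input while enable is high, and holds while enable is low.

```python
# Behavioral model of a level-sensitive D-latch.
class DLatch:
    def __init__(self):
        self.q = 0

    def eval(self, d, enable):
        if enable:          # transparent: output follows the input
            self.q = d
        return self.q       # opaque (enable low): output holds its value

latch = DLatch()
assert latch.eval(d=1, enable=1) == 1   # follows D while enabled
assert latch.eval(d=0, enable=1) == 0
assert latch.eval(d=1, enable=0) == 0   # holds last value while disabled
```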

Synchronous circuits

In synchronous circuits, a clock signal synchronizes state transitions. Inputs are only sampled during the active edge of the clock cycle. Outputs are “held” until the next state is computed, thereby preventing multiple logical transitions. Changes to the logic signals throughout the circuit all begin at the same time, synchronized by the clock.

The figure below shows an example of a synchronous sequential circuit. In it, a D flip-flop serves as a clocked memory element. The following sections examine the various memory elements.


Example of a synchronous circuit

The main advantage of synchronous logic is its simplicity. The logic gates which perform the operations on the data require a finite amount of time to respond to changes to their inputs. This is called propagation delay. The interval between clock pulses must be long enough so that all the logic gates have time to respond to the changes and their outputs “settle” to stable logic values, before the next clock pulse occurs. As long as this condition is met (ignoring certain other details) the circuit is guaranteed to be stable and reliable. This determines the maximum operating speed of a synchronous circuit.
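The relationship between settling time and clock rate can be sketched numerically. The gate delays below are invented for illustration; real values come from the device datasheet or timing analysis.

```python
# Estimate the maximum clock frequency of a synchronous circuit from the
# propagation delays along its slowest (critical) path.
gate_delays_ns = [0.8, 1.2, 0.9, 1.1]   # per-gate delays on the critical path
t_settle = sum(gate_delays_ns)          # time for all outputs to settle (ns)
f_max_mhz = 1e3 / t_settle              # 1 / period, converted to MHz

print(f"critical path {t_settle} ns -> f_max = {f_max_mhz:.0f} MHz")  # 250 MHz
```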

Synchronous circuits also have disadvantages. The maximum clock frequency is determined by the slowest (critical) path in the circuit, because every operation must complete in one clock cycle. A common work-around is to make all paths take roughly the same time, by splitting complex operations into several simpler operations that are performed over multiple clock cycles (pipelining).

In addition, synchronous circuits require the clock signal, usually at a high frequency, to be distributed throughout the circuit, which causes power dissipation.

D flip-flop (edge triggered)

The classic D-flip-flop is similar to a D-latch, except for the important fact that it only samples the input on a clock transition; in this case that is the rising edge of the clock.

In other words, while \(clk=0\) the value of \(D\) is copied into the first latch. The moment that \(clk\) becomes \(1\), that value is held stable and copied to output \(Q\).


D-flip-flop

The use of the clock signal implies that the flip-flop does not just hold its previous value; it samples the input on every rising clock edge. Note that the ‘>’ symbol indicates that the clock input is sampled on the rising clock edge.

The advantage of triggering on the clock edge is that the input signal only needs to remain stable while it is being copied to the second latch: the so-called timing window.

D-flip-flop timing

In this timing window, the setup time \(t_{su}\) is the minimum time before the rising clock edge by which the input must be stable. The hold time \(t_h\) is the minimum time after the clock edge during which the input must remain stable. The clock-to-output (propagation) delay \(t_{cq}\) is the maximum time after the clock edge for the output to change.

A setup or hold violation causes metastability, where the output lingers at an intermediate voltage before eventually resolving to an unknown state.
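A setup/hold check can be expressed as a simple interval test. The function and timing numbers below are illustrative, not from any datasheet:

```python
# Flag a data transition that falls inside the setup/hold window around a
# rising clock edge (all times in ns).
def violates(t_data, t_clock, t_su, t_h):
    """True if data changes inside [t_clock - t_su, t_clock + t_h)."""
    return t_clock - t_su <= t_data < t_clock + t_h

t_su, t_h = 2.0, 1.0   # example setup and hold times

assert not violates(t_data=5.0, t_clock=10.0, t_su=t_su, t_h=t_h)   # safe
assert violates(t_data=9.5, t_clock=10.0, t_su=t_su, t_h=t_h)       # setup violation
assert violates(t_data=10.5, t_clock=10.0, t_su=t_su, t_h=t_h)      # hold violation
```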

In Verilog HDL we would model this as

module D_flipflop( input D, input clock, output reg Q );
  always @(posedge clock)
    Q <= D;
endmodule
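The edge-triggered behavior can be illustrated with a small Python model (a sketch, not the actual gate-level circuit): the input is sampled only on a 0→1 clock transition.

```python
# Behavioral model of a rising-edge triggered D flip-flop.
class DFlipFlop:
    def __init__(self):
        self.q = 0
        self.prev_clk = 0

    def eval(self, d, clk):
        if self.prev_clk == 0 and clk == 1:   # rising edge: sample D
            self.q = d
        self.prev_clk = clk
        return self.q

ff = DFlipFlop()
assert ff.eval(d=1, clk=0) == 0   # clock low: output unchanged
assert ff.eval(d=1, clk=1) == 1   # rising edge: D is sampled
assert ff.eval(d=0, clk=1) == 1   # clock still high: D is ignored
assert ff.eval(d=0, clk=0) == 1   # falling edge: D is ignored
```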

The example above is provided for a general understanding of the principle. In practice, one would use a better and more efficient solution requiring only six NOR gates. That solution avoids the inverter on the enable input of the first latch.

Register

A register is a memory element that expands on the D flip-flop. A load signal \(LD\) controls when new data is loaded into the register (only if \(LD=1\) during the active edge of the clock). In the circuit below, this is implemented using a 2:1 multiplexer.


Register

Multiple data bits are stored together. An \(n\)-bit register consists of \(n\) blocks that share the \(LD\) and \(clk\) signals.

A sequence of bits is commonly written as \(D[15:0]\), referring to bits \(D_{15}\dots D_0\).
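Combining the flip-flop with a load enable gives the register's behavior, sketched below in Python (illustrative only): on a rising clock edge, the stored value changes only when \(LD\) is high.

```python
# Behavioral model of an n-bit register with a load enable (LD).
class Register:
    def __init__(self, width):
        self.width = width
        self.q = 0
        self.prev_clk = 0

    def eval(self, d, ld, clk):
        if self.prev_clk == 0 and clk == 1 and ld:   # rising edge and LD=1
            self.q = d & ((1 << self.width) - 1)     # keep the low n bits
        self.prev_clk = clk
        return self.q

r = Register(width=16)
assert r.eval(d=0x1234, ld=1, clk=1) == 0x1234   # loaded on rising edge
r.eval(d=0xFFFF, ld=0, clk=0)
assert r.eval(d=0xFFFF, ld=0, clk=1) == 0x1234   # LD=0: value held
```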

Large memories

Memory consists of a large number of locations that each can store a value. Each location in a memory is given a number, called an address.

Memory locations are identified by a \(k\)-bit wide address. Each memory location can store an \(n\)-bit wide value.

The figure below gives an example of 16-bit addresses storing 16-bit values. To save space in this figure, hexadecimal (base-16) notation is used to represent the address and value.


Memory example
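The arithmetic behind the figure is straightforward; for the 16-bit address and 16-bit value case:

```python
# A k-bit address selects one of 2**k locations; with n-bit values the
# total capacity is 2**k * n bits.
k, n = 16, 16                  # address width and value width, in bits
locations = 2 ** k
capacity_bits = locations * n

print(f"{locations} locations, addresses 0x0000 .. 0x{locations - 1:04X}")
print(f"capacity = {capacity_bits // 8 // 1024} KiB")   # 128 KiB
```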

Bit density is key in building large memories. Instead of D flip-flops, large memories use more efficient methods such as:

  • Static Random Access Memory (SRAM), which uses six transistors per memory bit. As we have seen, this relies on feedback between two gates.
  • Dynamic Random Access Memory (DRAM) that uses only one transistor per memory bit. The mechanism relies on an electrical charge stored in the capacitor of a MOSFET gate. The drawback is that the charge has to be refreshed periodically.

Let’s take a closer look at DRAM.

A single bit (cell) can be implemented as shown below. Here, the capacitor stores the state, and the transistor controls access to the capacitor.

DRAM cell

To read, select is raised; the charge in the capacitor then appears on pin D. To write, select is raised long enough to charge or drain the capacitor to the value of D.

These cells can be combined to form a large memory. The cells are organized in a matrix structure to keep the size of the address demultiplexer practical. Otherwise, implementing \(k\) address lines would require a demux with \(2^k\) outputs. The figure below shows a simplified implementation using a 4-bit address (and a 1-bit wide value).

To make an \(n\)-bit wide memory, \(n\) DRAM chips can be combined.


nxn DRAM
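The matrix organization amounts to splitting the address into row and column fields. The helper below (my own naming, using the 4-bit example) shows why this keeps the decoders small: two 2-to-4 decoders replace one 4-to-16 decoder.

```python
# Split a 4-bit DRAM address into a 2-bit row and a 2-bit column.
def split_address(addr, row_bits=2, col_bits=2):
    row = (addr >> col_bits) & ((1 << row_bits) - 1)   # high bits
    col = addr & ((1 << col_bits) - 1)                 # low bits
    return row, col

assert split_address(0b1101) == (0b11, 0b01)   # row 3, column 1
assert split_address(0b0010) == (0b00, 0b10)   # row 0, column 2
```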

For more information, refer to the slides from the MIT lectures “Sequential building blocks” and “Memory Elements”, and the web site lwn.net.

Good design practices

[src]

  • Use a single clock, single edge synchronous design wherever possible.
  • Asynchronous interfaces lead to metastability. Minimize the asynchronous interface and double-clock the data (pass it through two clocked flip-flops) to reduce the chance of metastability.
  • Avoid asynchronous presets & clears on FFs. Use synchronous presets & clears whenever possible.
  • Do not gate clocks! Instead, create clock enabled FFs via a MUX to feed back current data.

Clock

In sequential circuits, a clock signal orchestrates the state transitions. This clock is generally a square wave generated by an astable multivibrator.


Clock signal

Delayed negative feedback causes it to oscillate between ‘0’ and ‘1’.


Oscillator

An implementation using inverters and a RC circuit is shown below.


Oscillator using RC circuit

The functionality can be explained as follows:

  • Suppose initially:
    • output U3=0V (a logical ‘0’), and
    • the capacitor is not charged ∴ U1=U3
  • 0→1:
    • The capacitor charges through resistor R ∴ U1 increases towards 5V.
    • Once U1≥2V (the ‘1’ threshold), U2 becomes 0V ∴ output U3 becomes 5V (a logical ‘1’).
  • 1→0:
    • The capacitor charge reverses through the resistor R ∴ U1 decreases towards 0V.
    • Once U1≤0.7V (the ‘0’ threshold), U2 becomes 5V ∴ output U3 becomes 0V (a logical ‘0’), and the cycle repeats itself.
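The charge times above follow the standard RC exponential. As a rough sketch, using the 2 V threshold from the text and made-up component values (and ignoring the discharge half and threshold hysteresis details):

```python
# Time for the RC node to charge from 0 V toward Vdd up to the logic '1'
# threshold: t = -R*C*ln(1 - Vth/Vdd).  Component values are illustrative.
import math

R, C = 10e3, 100e-9        # 10 kOhm, 100 nF
Vdd, Vth = 5.0, 2.0        # supply voltage and '1' threshold

t_charge = -R * C * math.log(1 - Vth / Vdd)
print(f"charge time ~= {t_charge * 1e3:.2f} ms")   # ~0.51 ms
```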

Hands on

  • D latch, Yenka Technology, Digital Electronics: build a D-latch using gates; use models for a D-type flip-flop and a binary counter.
  • Build or simulate a set-reset latch using NOR gates. (see Digital logic projects, page 27)
  • Build or simulate a D-latch using NAND gates. (see Digital logic projects, page 6)

The following chapter introduces programmable logic that allows us to build more dense and flexible hardware systems.
