This multiplier is built around Multiplier Adder (ma) blocks. These ma blocks are themselves built around the Full Adder (fa) blocks introduced in the adder section. The fa blocks have the usual inputs \(a\), \(b\) and \(c_i\), and outputs \(s\) and \(c_o\). What makes the ma block special is that input \(a\) is driven by the sum input \(s_i\), and the internal signal \(b\) is the AND of the inputs \(x\) and \(y\), as shown below.
$$
\begin{align*}
a &= s_i\\
b &= x \cdot y\\
s_o &= s_i\oplus b\oplus c_i\\
c_o &= s_i \cdot b + c_i \cdot(s_i \oplus b)
\end{align*}
$$
1-bit multiplying-adder
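The equations above translate almost line for line into Verilog HDL. What follows is a minimal sketch rather than the author's actual source: the module and port names are taken from the instantiations used later in this article, while the testbench module `ma_block_tb` and its signal names are assumptions made for illustration.

```verilog
// 1-bit multiplying adder, written directly from the equations above.
module math_multiplier_ma_block (
    input  wire x,   // multiplicand bit
    input  wire y,   // multiplier bit
    input  wire si,  // sum in
    input  wire ci,  // carry in
    output wire so,  // sum out
    output wire co   // carry out
);
  wire b = x & y;                            // b = x AND y
  assign so = si ^ b ^ ci;                   // s_o = s_i xor b xor c_i
  assign co = (si & b) | (ci & (si ^ b));    // c_o = s_i.b + c_i.(s_i xor b)
endmodule

// Hypothetical testbench: {co, so} must equal si + ci + (x AND y)
// for all 16 input combinations.
module ma_block_tb;
  reg        x, y, si, ci;
  wire       so, co;
  reg  [1:0] expected;
  integer    k, errors;

  math_multiplier_ma_block dut (
    .x(x), .y(y), .si(si), .ci(ci), .so(so), .co(co) );

  initial begin
    errors = 0;
    for (k = 0; k < 16; k = k + 1) begin
      {x, y, si, ci} = k[3:0];
      #1;                                    // let the combinational logic settle
      expected = (x & y) + si + ci;
      if ({co, so} !== expected) begin
        errors = errors + 1;
        $display("mismatch: x=%b y=%b si=%b ci=%b -> co=%b so=%b",
                 x, y, si, ci, co, so);
      end
    end
    $display("ma block check done, %0d mismatches", errors);
    $finish;
  end
endmodule
```

Comparing against the arithmetic sum `si + ci + (x & y)` instead of repeating the gate equations keeps the check independent of the implementation it is testing.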
Carry-propagate Array Multiplier
As shown in the inquiry “How do Computers do Math?”, a carry-propagate array multiplier can be built by combining many of these ma blocks. The circuit diagram below shows the connections between these blocks for a 4-bit multiplier.
4-bit carry-propagate array multiplier
For an implementation in Verilog HDL, we can instantiate ma blocks based on the word length of the multiplicand and multiplier (\(N\)). If you are new to Verilog HDL, remember that the generate code segment is expanded at compile time (during elaboration). In other words, it is just shorthand for writing out the long list of ma block instances.
generate genvar ii, jj;
    for ( ii = 0; ii < N; ii = ii + 1) begin: gen_ii
        for ( jj = 0; jj < N; jj = jj + 1) begin: gen_jj
            math_multiplier_ma_block ma(
                .x(?), .y(?), .si(?), .ci(?),
                .so(?), .co(?) );
        end
    end
endgenerate
As you might notice, the input and output ports are not described. For this, we need to derive the rules that govern these interconnects. Start by numbering the output ports based on their location in the matrix. For this circuit, we have the output signals sum (\(s\)) and carry-out (\(c\)). E.g. \(c_{13}\) identifies the carry-out signal for the block in row 1 and column 3. Note that the circuit description depicts the matrix in a slanted fashion.
Output signals 'so' and 'co'
Knowing this, we can enter the output signals in the Verilog HDL code
            math_multiplier_ma_block ma(
                .x(?), .y(?), .si(?), .ci(?),
                .so ( s[ii][jj] ),
                .co ( c[ii][jj] ) );
Next, we express the input signals as a function of the output signal names \(s\) and \(c\) as shown in the table below.
Input signals 'x', 'y', 'si' and 'ci'
Based on this table, we can express the input connections for each ma block using conditional "c ? a : b" expressions. Note that Verilog 2001 does not allow such expressions on output port connections. This is why we express the input ports as a function of the output ports instead of vice versa.
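Written out as equations, the interconnect rules look as follows. This is a reading of the port expressions in the combined snippet further down, not a copy of the original table. The row index \(i\) corresponds to `ii` and the column index \(j\) to `jj`; the comma-separated subscripts and the names \(\mathit{si}\) and \(\mathit{ci}\) are notation assumed here for clarity.

$$
\begin{align*}
x_{i,j} &= a_j\\
y_{i,j} &= b_i\\
\mathit{si}_{i,j} &= \begin{cases}
0 & i = 0\\
s_{i-1,\,j+1} & i > 0,\ j < N-1\\
c_{i-1,\,N-1} & i > 0,\ j = N-1
\end{cases}\\
\mathit{ci}_{i,j} &= \begin{cases}
0 & j = 0\\
c_{i,\,j-1} & j > 0
\end{cases}
\end{align*}
$$

Each rule connects signals of equal weight: block \((i,j)\) works at weight \(2^{i+j}\), which matches the sum of block \((i-1,j+1)\) and the carry-out of block \((i,j-1)\).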
All that is left to do is to express the outputs of the module, the product bits \(p\), as a function of the ma output signals.
Output signal 'p'
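In equation form, and again read off from the assignments in the combined snippet rather than from the original figure, the product bits map onto the block outputs roughly as:

$$
\begin{align*}
p_i &= s_{i,\,0} & 0 \le i \le N-1\\
p_{N-1+j} &= s_{N-1,\,j} & 1 \le j \le N-1\\
p_{2N-1} &= c_{N-1,\,N-1}
\end{align*}
$$

For \(N=4\), this gives \(p_0 \ldots p_3\) from the first column (\(s_{0,0}\), \(s_{1,0}\), \(s_{2,0}\), \(s_{3,0}\)), \(p_4 \ldots p_6\) from the last row (\(s_{3,1}\), \(s_{3,2}\), \(s_{3,3}\)), and \(p_7 = c_{3,3}\).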
Putting it all together, we get the following snippet
generate genvar ii, jj;
    for (ii = 0; ii < N; ii = ii + 1) begin: gen_ii
        for (jj = 0; jj < N; jj = jj + 1) begin: gen_jj
            math_multiplier_ma_block ma(
                .x  ( a[jj] ),
                .y  ( b[ii] ),
                .si ( ii == 0 ? 1'b0 : jj < N - 1 ? s[ii-1][jj+1] : c[ii-1][N-1] ),
                .ci ( jj > 0 ? c[ii][jj-1] : 1'b0 ),
                .so ( s[ii][jj] ),
                .co ( c[ii][jj] ) );
        end
        assign p[ii] = s[ii][0];
    end
    for (jj = 1; jj < N; jj = jj + 1) begin: gen_jj2
        assign p[jj+N-1] = s[N-1][jj];
    end
    assign p[N*2-1] = c[N-1][N-1];
endgenerate
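To try the snippet in a simulator, it needs a module header, declarations for `s` and `c`, and something to drive it. The sketch below is one way to do that, not the author's complete source. The names `math_multiplier_array` and `multiplier_tb` are assumptions, as are the helper wires `si_w` and `ci_w`. It also swaps the nested "c ? a : b" port expressions for generate-if statements, because the unselected branches of the ternaries index `s[-1]` and bit `N`, which stricter tools may reject.

```verilog
// 1-bit multiplying adder: {co, so} = si + ci + (x AND y)
module math_multiplier_ma_block (
    input  wire x, y, si, ci,
    output wire so, co );
  wire b = x & y;
  assign so = si ^ b ^ ci;
  assign co = (si & b) | (ci & (si ^ b));
endmodule

// N-bit carry-propagate array multiplier (hypothetical wrapper name).
module math_multiplier_array #( parameter N = 4 ) (
    input  wire [N-1:0]   a, b,
    output wire [2*N-1:0] p );
  wire [N-1:0] s [0:N-1];   // s[row][column]
  wire [N-1:0] c [0:N-1];   // c[row][column]
  genvar ii, jj;
  generate
    for (ii = 0; ii < N; ii = ii + 1) begin: gen_ii
      for (jj = 0; jj < N; jj = jj + 1) begin: gen_jj
        wire si_w, ci_w;    // same rules as the ternaries, without out-of-range indices
        if (ii == 0)         assign si_w = 1'b0;
        else if (jj < N - 1) assign si_w = s[ii-1][jj+1];
        else                 assign si_w = c[ii-1][N-1];
        if (jj == 0) assign ci_w = 1'b0;
        else         assign ci_w = c[ii][jj-1];
        math_multiplier_ma_block ma (
          .x(a[jj]), .y(b[ii]), .si(si_w), .ci(ci_w),
          .so(s[ii][jj]), .co(c[ii][jj]) );
      end
      assign p[ii] = s[ii][0];
    end
    for (jj = 1; jj < N; jj = jj + 1) begin: gen_jj2
      assign p[jj+N-1] = s[N-1][jj];
    end
  endgenerate
  assign p[2*N-1] = c[N-1][N-1];
endmodule

// Hypothetical testbench: compares every 4-bit operand pair against i * j.
module multiplier_tb;
  localparam N = 4;
  reg  [N-1:0]   a, b;
  wire [2*N-1:0] p;
  reg  [2*N-1:0] expected;
  integer        i, j, errors;

  math_multiplier_array #(.N(N)) dut (.a(a), .b(b), .p(p));

  initial begin
    errors = 0;
    for (i = 0; i < (1 << N); i = i + 1)
      for (j = 0; j < (1 << N); j = j + 1) begin
        a = i; b = j;
        #1;                 // zero-delay model: one time step is enough to settle
        expected = i * j;
        if (p !== expected) begin
          errors = errors + 1;
          $display("mismatch: %0d * %0d gave %0d", i, j, p);
        end
      end
    $display("multiplier check done, %0d mismatches", errors);
    $finish;
  end
endmodule
```

An exhaustive check is practical only for small \(N\); for wider multipliers, random operands plus the corner cases (all zeros, all ones) are a more realistic choice.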
The ma block compiles into the RTL netlist shown below
1-bit multiplying-adder in RTL
As shown in the figure below, the for loops unroll into 16 interconnected ma blocks.
4-bit carry-propagate array multiplier in RTL
The complete Verilog HDL source code is available at:
Results
The propagation delay \(t_{pd}\) depends on the size \(N\) and on the values of the operands. For a given size \(N\), the maximum propagation delay occurs when the low-order bit causes a carry/sum that propagates all the way to the highest-order bit. This worst-case propagation delay grows linearly, in proportion to \(3N\). Note that the average propagation delay is about half of this.
The worst-case propagation delays for the Terasic Altera Cyclone IV DE0-Nano are found using the post-map Timing Analysis tool. The exact value depends on the model and speed grade of the FPGA, the silicon itself, voltage and the die temperature.
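| \(N\) | Timing Analysis, slow 85°C | Timing Analysis, slow 0°C | Timing Analysis, fast 0°C | Measured, actual |
|---|---|---|---|---|
| 4 bits | 9.9 ns | 8.9 ns | 6.1 ns | |
| 8 bits | 20.8 ns | 18.6 ns | 12.4 ns | |
| 16 bits | 41.3 ns | 36.9 ns | 24.2 ns | |
| 27 bits | 69.6 ns | 62.1 ns | 40.9 ns | 55 ns |
| 32 bits | 83.4 ns | 74.5 ns | 49.0 ns | |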
Propagation delay in carry-propagate array multiplier
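As a quick sanity check on the linear trend, we can divide the slow 85°C figures by \(N\). This is plain arithmetic on the values above, not an additional measurement.

$$
\begin{align*}
9.9 / 4 &\approx 2.48 \text{ ns/bit}\\
20.8 / 8 &= 2.60 \text{ ns/bit}\\
41.3 / 16 &\approx 2.58 \text{ ns/bit}\\
69.6 / 27 &\approx 2.58 \text{ ns/bit}\\
83.4 / 32 &\approx 2.61 \text{ ns/bit}
\end{align*}
$$

The ratio stays close to 2.6 ns per bit, which is consistent with a delay that is linear in \(N\). If the critical path indeed spans about \(3N\) block delays, this would correspond to very roughly 0.87 ns per stage under the slow 85°C model.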
The timing analysis for \(N=27\) reveals that the worst-case propagation delay path goes through the \(c_o\) and \(s_o\) signals, as shown below on the left. When measuring the worst-case propagation delay on the actual device, we use input values that cause the maximum number of rippling carries and sums. For a 27-bit multiplier whose inputs are also limited to a maximum value of 99,999,999, the propagation path is simulated in a spreadsheet, as shown below on the right.
Worst case path
Worst case input
A brute-force search on the FPGA for operand combinations that cause long propagation delays revealed 27'h2FA3A92 * 27'h55D4A77, 27'h60A308B * 27'd99999999 (50 ns), 27'h775A668 * 27'd89999999 (55 ns) and 27'h56F5D8F * 27'h3AAAB7B (55 ns).
Following this "Math multiplier using logic gates", the next chapter explores methods of making the multiplication operation faster.