What is now called the Z-transform (named in honor of Lotfi Zadeh) was known to the mathematician and astronomer Pierre-Simon Laplace around 1785. With the introduction of digitally sampled data, the transform was rediscovered by Hurewicz in 1947, and developed by Lotfi Zadeh and John Ragazzini around 1952, as a way to solve linear, constant-coefficient difference equations.
As we will see, the convolution property makes the Z-transform a powerful tool in analyzing sampled-data systems.
Introduction
Just as causal continuous systems are governed by differential equations, causal discrete systems operate in accordance with difference equations.
From the difference equation we can derive the system characteristics such as the impulse response, step response and frequency response.
Unlike the continuous-time case, causal difference equations can be iterated just like a computer program would do. All one needs to do is rewrite the difference equation so that the term \(y[n]\) is on the left, and then iterate forward in time. This gives each value of the output sequence without ever obtaining a general expression for \(y[n]\). In this article, however, we will look for a general analytical expression for \(y[n]\) using the Z-transform.
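As a minimal sketch of this forward iteration, consider the hypothetical first-order difference equation \(y[n]=a\,y[n-1]+x[n]\) with the system initially at rest. The equation, the coefficient value and the function name `iterate_first_order` are assumptions chosen for illustration; they do not come from the text above.

```python
def iterate_first_order(x, a):
    """Iterate y[n] = a*y[n-1] + x[n] forward in time, assuming y[-1] = 0."""
    y = []
    previous = 0.0  # y[-1], zero because the system is assumed to be at rest
    for sample in x:
        current = a * previous + sample
        y.append(current)
        previous = current
    return y


# A unit impulse as input yields the first samples of the impulse response.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
impulse_response = iterate_first_order(impulse, a=0.5)
print(impulse_response)
```

Each output value is produced one step at a time, so the loop never yields a closed-form expression for \(y[n]\).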
The Z-transform can be thought of as an operator that transforms a discrete sequence into a continuous algebraic function of the complex variable \(z\). As we will see, one of the nice features of this transform is that a convolution in time transforms to a simple multiplication in the \(z\)-domain.
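The convolution property can be checked numerically for short sequences. The sketch below assumes the unilateral transform of a finite sequence is the sum of \(f[n]\,z^{-n}\); the helper names `convolve` and `z_transform`, the two sequences and the test point \(z\) are hypothetical choices for illustration.

```python
def convolve(f, g):
    """Discrete convolution of two finite causal sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def z_transform(seq, z):
    """Evaluate sum_n seq[n] * z**(-n) for a finite sequence."""
    return sum(value * z ** (-n) for n, value in enumerate(seq))


f = [1.0, 2.0, 3.0]
g = [0.5, -1.0]
z = 0.8 + 0.6j  # arbitrary test point in the z-plane

lhs = z_transform(convolve(f, g), z)         # transform of the convolution
rhs = z_transform(f, z) * z_transform(g, z)  # product of the transforms
print(abs(lhs - rhs) < 1e-12)
```

For finite sequences this is the same statement as polynomial multiplication in \(z^{-1}\), which is why the two sides agree up to rounding.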
Unilateral Z-transform
We solve a difference equation by taking the Z-transform of both sides, solving the resulting algebraic equation for the output \(Y(z)\), and then applying the inverse transform to obtain \(y[n]\).
Assuming causal filters, the output of the filter will be zero for \(t\lt 0\).
Sampling creates a discontinuous function
For digital systems, time is not continuous but passes at discrete intervals. When a system measures a continuous-time signal every \(T\) seconds, the resulting signal is said to be discrete with sampling period \(T\).
To help understand the sampling process, assume a continuous function \(x_c(t)\) as shown below
To work toward a mathematical representation of the sampling process, consider a train of evenly spaced impulse functions starting at \(t=0\). This so-called Dirac comb, \(s(t)\), has a spacing of \(T\gt0\) and contains \(t=0\).
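The Dirac comb \(s(t)\) can be expressed as
$$ s(t)=\sum_{n=0}^{\infty}\delta(t-nT) \label{eq:combcond} $$
Multiplying the Dirac comb \(s(t)\) with a continuous-time signal \(x_c(t)\) scales each impulse of the comb by the value of \(x_c(t)\)
$$ x_s(t)=x_c(t)\,s(t) \label{eq:fstar0} $$
The resulting signal \(x_s(t)\) follows from substituting \(s(t)\) from equation \(\eqref{eq:combcond}\) as
$$ x_s(t)=x_c(t)\sum_{n=0}^{\infty}\delta(t-nT)=\sum_{n=0}^{\infty}x_c(t)\,\delta(t-nT) \nonumber $$
The impulse function \(\delta(t-nT)\) is \(0\) everywhere but at \(t=nT\), the so-called sifting property, so we can replace \(x_c(t)\) with \(x_c(nT)\)
$$ x_s(t)=\sum_{n=0}^{\infty}x_c(nT)\,\delta(t-nT) \label{eq:fstarnT} $$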
Work towards a continuous function
The goal is to form a continuous algebraic expression, so we can use algebra to manipulate the difference equations.
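Start with the Laplace transform of the sampled signal \(x_s(t)\) from equation \(\eqref{eq:fstarnT}\)
$$ X_s(s)=\int_{0^-}^{\infty}\mathrm{e}^{-st}\sum_{n=0}^{\infty}x_c(nT)\,\delta(t-nT)\ \mathrm{d}t \label{eq:laplace0} $$
Once more, the impulse function \(\delta(t-nT)\) is \(0\) everywhere but at \(t=nT\), the “sifting property”, so we can replace \(\mathrm{e}^{-st}\) with
$$ \left.\mathrm{e}^{-st}\right|_{t=nT}=\mathrm{e}^{-snT} \label{eq:est} $$
After substituting \(\eqref{eq:est}\) in \(\eqref{eq:laplace0}\), the terms \(\mathrm{e}^{-snT}\) and \(x_c(nT)\), written as \(x(nT)\) from here on, are independent of \(t\) and can be taken outside of the integration.
$$ X_s(s)=\sum_{n=0}^{\infty}x(nT)\,\mathrm{e}^{-snT}\underbrace{\int_{0^-}^{\infty}\delta(t-nT)\ \mathrm{d}t}_{=1}=\sum_{n=0}^{\infty}x(nT)\,\mathrm{e}^{-snT} \label{eq:Fs} $$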
The Z-transform follows
Define \(z\) and \(x[n]\) as
$$ z\triangleq\mathrm{e}^{sT},\qquad x[n]\triangleq x(nT) \label{eq:z} $$
The scaling with sample period \(T\) in the form \(x[n]\) matches the notation for computer arrays. Anytime you see an \([n]\), you can translate it to seconds by replacing it with \((nT)\). Be careful with integer expressions, such as \([n-k]\), which stands for \(((n-k)T)\) seconds, not \((nT-k)\).
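The index-to-seconds translation can be sketched in a few lines. The sample period, the continuous signal `x_c` and the indices below are hypothetical values picked only to illustrate the difference between \(((n-k)T)\) and \((nT-k)\).

```python
import math

T = 0.25  # assumed sampling period in seconds


def x_c(t):
    """Hypothetical continuous-time signal, zero for t < 0."""
    return math.exp(-t) if t >= 0 else 0.0


def x(n):
    """Discrete signal x[n] = x_c(nT)."""
    return x_c(n * T)


n, k = 6, 2
delayed_correct = x_c((n - k) * T)  # x[n-k] is the sample at (n-k)T seconds
delayed_wrong = x_c(n * T - k)      # not x[n-k]: this subtracts k seconds

print(x(n - k) == delayed_correct)
print(delayed_wrong)
```

With these values \(x[n-k]\) is the sample taken at \(1\) second, whereas \(nT-k\) lands at \(-0.5\) seconds, before the signal starts.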
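From equation \(\eqref{eq:z}\) follows
$$ s=\frac{1}{T}\ln z \label{eq:s} $$
Substitute \((\ref{eq:z},\ref{eq:s})\) in \(\eqref{eq:Fs}\) and apply the power rule \(x^{ab}=(x^a)^b\). Call the function \(X(z)\) because \(z\) is the only variable after the substitution \(s=\tfrac{1}{T}\ln z\)
$$ X(z)=\sum_{n=0}^{\infty}x(nT)\left(\mathrm{e}^{sT}\right)^{-n}=\sum_{n=0}^{\infty}x[n]\,z^{-n} \nonumber $$
The unilateral Z-transform of the discrete function \(x[n]\) follows as
$$ \def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\ztransform{\lfz{\mathcal{Z}}} x[n]\ztransform X(z)=\sum_{n=0}^{\infty}z^{-n}\ x[n] \label{eq:ztransform} $$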
Note that we use the notation \(\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\ztransform{\lfz{\mathcal{Z}}} \ztransform\) as equivalent to the more common Z-transform notation \(\mathfrak{Z}\left\{\,x[n]\,\right\}\).
In review: the Z-transform maps a sequence \(x[n]\) to a continuous function \(X(z)\) of the complex variable \(z\), expressed as a power series in \(z^{-1}\).
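To make the mapping from a sequence to a function of \(z\) concrete, the following sketch evaluates a partial sum of the unilateral transform for the assumed sequence \(x[n]=a^n\) and compares it with the geometric-series result \(z/(z-a)\), which holds for \(|z|\gt|a|\). The function name `z_transform_partial`, the value of \(a\) and the test point are illustrative assumptions.

```python
def z_transform_partial(x, z, terms):
    """Partial sum of the unilateral Z-transform: sum_{n=0}^{terms-1} x(n) * z**(-n)."""
    return sum(x(n) * z ** (-n) for n in range(terms))


a = 0.5
z = 1.2 * complex(0.6, 0.8)  # a test point with |z| = 1.2 > |a|

approx = z_transform_partial(lambda n: a ** n, z, terms=200)
closed_form = z / (z - a)    # geometric series sum, valid for |z| > |a|
print(abs(approx - closed_form))
```

The printed difference should be at the level of floating-point rounding, since the neglected tail of the series is negligible at this test point.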
Normalized frequency
The discrete signal only exists at times \(t=nT\) where \(n=0,1,2,\ldots\). By normalizing time \(t\) with the sampling interval \(T\), using definition \(\eqref{eq:z}\), we get the natural time, measured in “samples”, on the time-axis
$$ x[n]\triangleq x(nT) \nonumber $$
We highlight the normalized time by using the \([n]\) notation.
When using normalized time, other time-dependent variables should be normalized as well. The angular frequency \(\omega\), measured in [rad/s], normalizes to the normalized angular frequency with units of [rad/sample] by multiplying it with the sample period \(T\) [s/sample]
| quantity | regular | conversion | normalized |
|---|---|---|---|
| time | [s] | \(\xrightarrow{\div T}\) | [sample] |
| angular frequency | [rad/s] | \(\xrightarrow{\times T}\) | [rad/sample] |
| natural frequency | [cycles/s] | \(\xrightarrow{\times T}\) | [cycles/sample] |
Using normalized frequency allows an author to present concepts independent of sample rate, but it comes at a loss of clarity as \(T\) and \(f_s\) are omitted from expressions.
When visualizing variable \(z\), the normalized angular frequency \(\omega T\) corresponds to the angle with the positive horizontal axis.
Formulas expressed in terms of \(f_{s}\) and/or \(T\) are not normalized and can be readily converted to normalized frequency by setting those parameters to \(1\). The inverse is accomplished by replacing instances of the angular frequency parameter \(\omega\), with \(\omega T\).
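The conversions in the table can be sketched numerically. The sample rate and signal frequency below are hypothetical values; the variable names are chosen for illustration only.

```python
import cmath
import math

f_s = 48_000.0  # assumed sample rate in Hz
T = 1.0 / f_s   # sample period in seconds
f = 12_000.0    # assumed signal frequency in Hz

omega = 2.0 * math.pi * f  # angular frequency [rad/s]
omega_norm = omega * T     # normalized angular frequency [rad/sample]
f_norm = f * T             # normalized frequency [cycles/sample]

# On the unit circle, the angle of z equals the normalized angular frequency.
z = cmath.exp(1j * omega_norm)
print(f_norm)
print(omega_norm, cmath.phase(z))
```

Because the signal frequency is a quarter of the sample rate here, the point \(z\) sits a quarter turn from the positive horizontal axis.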
Note that some authors use \(\omega\) for normalized angular frequency in [rad/sample], and \(\Omega\) for angular frequency in [rad/s]. Here we avoid using \(\omega\) for normalized frequencies. Instead we use the product \(\omega T\) to refer to the normalized angular frequency. By doing so, it is clear that the angular frequency \(\omega\) is scaled by the sample time \(T\).
Nyquist–Shannon sampling theorem
The frequency-domain representation of a sampled signal teaches us about the limitations of using a discrete signal. We will show this by doing a Fourier transform on the sampled signal \(x_s(t)\).
Recall equation \(\eqref{eq:fstarnT}\) and \(\eqref{eq:fstar0}\), but this time bilateral
$$ \begin{align} x_s(t)=x_c(t)\,s(t) &= \sum_{n=-\infty}^{\infty}{x_c(nT)\ \delta(t-nT)} \nonumber \\ &= \underbrace{x_c(t)}_{f(t)}\ \ \underbrace{\sum_{n=-\infty}^{\infty}\delta(t-nT)}_{g(t)} \nonumber \end{align} \nonumber $$
The convolution theorem states that multiplication in the time-domain corresponds to convolution in the frequency-domain
$$ \def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\fourier{\lfz{\mathcal{F}}} f(t)\,g(t)\fourier \frac{1}{2\pi}{\Large(}F(\omega)*G(\omega){\Large)} \nonumber $$ where \(*\) is the convolution sign
Apply the Fourier transform of a product to \(x_s(t)\) and call it \(X_s(\omega)\)
$$ X_s(\omega)=\frac{1}{2\pi}{\Large(}X_c(\omega)*S(\omega){\Large)} \label{eq:fstar2} $$
Recall the Fourier transform of the Dirac comb \(s(t)\)
$$ \def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\fourier{\lfz{\mathcal{F}}} s(t)=\sum _{n=-\infty }^{\infty }\delta(t-nT) \fourier {\frac {2\pi }{T}}\sum _{k=-\infty }^{\infty }\delta \left(\omega -{\frac {2\pi k}{T}}\right)\triangleq S(\omega) \nonumber $$
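Substituting the Fourier transform of the Dirac comb in \(\eqref{eq:fstar2}\) where \(\frac{2\pi}{T}=2\pi f_s=\omega_s\), the angular sample frequency
$$ X_s(\omega)=\frac{1}{2\pi}\left(X_c(\omega)*\omega_s\sum_{k=-\infty}^{\infty}\delta(\omega-k\omega_s)\right)=\frac{1}{T}\sum_{k=-\infty}^{\infty}X_c(\omega)*\delta(\omega-k\omega_s) \label{eq:XsjOmega0} $$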
Recall the convolution with a delayed impulse function \(\delta(\omega-a)\)
$$ F(\omega)*\delta(\omega-a) = F(\omega-a)\nonumber $$
Apply the convolution to \(\eqref{eq:XsjOmega0}\)
$$ X_s(\omega)=\frac{1}{T}\sum_{k=-\infty}^{\infty}X_c(\omega-k\omega_s) \label{eq:XsjOmega} $$
Equation \(\eqref{eq:XsjOmega}\) implies that the Fourier transform of \(x_s(t)\) consists of periodically repeated copies of the Fourier transform of \(x_c(t)\). The copies of \(\color{purple}{X_c(\omega)}\) are shifted by integer multiples of the sampling frequency and then superimposed as depicted below.
Plot
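From the plot it follows that the shifted copies of \(X_c(\omega)\) do not overlap as long as the highest frequency \(\omega_{\small N}\) present in \(x_c(t)\) satisfies
$$ \omega_s-\omega_{\small N}\geq\omega_{\small N}\quad\Rightarrow\quad\omega_s\geq2\,\omega_{\small N} \label{eq:ineq} $$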
Consequently, \(x_c(t)\) can be recovered from \(x_s(t)\) with an ideal lowpass filter \(H_r\).
Inequality \(\eqref{eq:ineq}\) is captured in the Nyquist–Shannon sampling theorem
Let \(x_c(t)\) be a bandlimited signal with $$ \begin{align} X_c(\omega)&=0,&|\omega|\gt \omega_{\small N}\nonumber \end{align} \nonumber $$ then \(x_c(t)\) is uniquely determined by its samples $$ \begin{align} x[n] &= x_c(nT),&n\in\mathbb{Z}\nonumber \end{align} \nonumber $$ if $$ \omega_s = \frac{2\pi}{T}\geq2\,\omega_{\small N} \nonumber $$ where \(\omega_{\small N}\) is called the Nyquist frequency and \(2\omega_{\small N}\) is called the Nyquist rate.
When the inequality of \(\eqref{eq:ineq}\) does not hold, e.g. when the sampling frequency \(\omega_s\) is less than twice the maximum frequency \(\omega_{\small N}\), the copies of \(X_c(\omega)\) overlap, and \(X_c(\omega)\) is no longer recoverable by lowpass filtering as shown below. The resulting distortion is called aliasing.
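Aliasing can be observed directly in the samples. The sketch below assumes a hypothetical sample rate of 10 Hz and a 7 Hz cosine, which lies above half the sample rate; its samples coincide with those of a 3 Hz cosine. All numeric values and variable names are illustrative assumptions.

```python
import math

f_s = 10.0              # assumed sample rate in Hz
T = 1.0 / f_s           # sample period in seconds
f_high = 7.0            # above f_s / 2 = 5 Hz, so the sampling theorem is violated
f_alias = f_s - f_high  # 3 Hz, the frequency the samples appear to have

samples_high = [math.cos(2 * math.pi * f_high * n * T) for n in range(20)]
samples_alias = [math.cos(2 * math.pi * f_alias * n * T) for n in range(20)]

# The two sample sequences agree to within rounding error.
max_difference = max(abs(a - b) for a, b in zip(samples_high, samples_alias))
print(max_difference)
```

Once sampled, the two signals cannot be told apart, so no lowpass filter can recover the 7 Hz component.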
\(z\)-plane
We introduced the normalized angular frequency \(\omega T\) and how it maps to a polar representation of the complex variable \(z\). Here we will extend this to include the magnitude \(|z|\).
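Let’s start with the definition of \(z\) from equation \(\eqref{eq:z}\) and split \(s\) in its real and imaginary parts \(s=\sigma+j\omega\)
$$ z=\mathrm{e}^{sT}=\mathrm{e}^{(\sigma+j\omega)T}=\mathrm{e}^{\sigma T}\,\mathrm{e}^{j\omega T} \label{eq:zalt} $$
The polar notation for the complex variable \(z\) is a function of the normalized angular frequency \(\omega T\) and \(\mathrm{e}^{\sigma T}\)
$$ z=|z|\,\mathrm{e}^{j\varphi}\text{, where } |z|=\mathrm{e}^{\sigma T}\text{, } \varphi=\omega T \label{eq:zphi} $$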
This conveniently matches the topology of the \(z\)-plane, where the modulus \(|z|\) corresponds to the length of a vector from the origin to \(z\), and \(\omega T\) corresponds to the angle of that vector with the positive horizontal axis.
According to the Nyquist–Shannon sampling theorem, a discrete signal can only have frequencies \(\omega\) between \(0\) and half the sampling frequency
$$ 0\leq\omega\leq\frac{\omega_s}{2} \nonumber $$
The maximum frequency in the discrete signal, the so-called Nyquist frequency, corresponds to \(\pi\) radians, because with sample period \(T=\frac{1}{f_s}=\frac{2\pi}{\omega_s}\) and the Nyquist frequency \(\omega_{\small N}=\frac{\omega_s}{2}\), the normalized angular frequency for the Nyquist frequency follows as
$$ \omega_{\small N}T=\frac{\omega_s}{2}\,\frac{2\pi}{\omega_s}=\pi \nonumber $$
For those familiar with the Laplace transform, we will map specific features between the \(s\)- and \(z\)-domain:
- The origin \(s=0\) of the \(s\)-plane is mapped to \(z=e^0=1\) on the real axis in the \(z\)-plane.
- Each vertical line \(\sigma=\sigma_0\) in the \(s\)-plane is mapped to a circle \(|z|=\mathrm{e}^{\sigma_0 T}\) centered about the origin in the \(z\)-plane. E.g.
  - The leftmost vertical line \(\sigma\to-\infty\) is mapped to the origin, where \(|z|=\mathrm{e}^{-\infty}=0\).
  - The imaginary axis \(\sigma=0\) is mapped to the unit circle, where \(|z|=\mathrm{e}^0=1\).
  - The rightmost vertical line \(\sigma\to\infty\) is mapped to a circle with an infinite radius, where \(|z|=\mathrm{e}^{\infty}=\infty\).
- Each horizontal line \(j\omega=j\omega_0\) in the \(s\)-plane is mapped to a ray from the origin in the \(z\)-plane at angle \(\omega_0 T\) with respect to the positive horizontal direction.
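The mapping between the two planes can be explored with a short sketch that applies \(z=\mathrm{e}^{sT}\) to a few points. The sample period, the chosen \(s\)-plane points and the helper name `s_to_z` are hypothetical and serve only as an illustration.

```python
import cmath
import math

T = 0.001  # assumed sample period in seconds


def s_to_z(s):
    """Map a point in the s-plane to the z-plane using z = exp(s*T)."""
    return cmath.exp(s * T)


omega_0 = 2 * math.pi * 100.0                # 100 Hz expressed in rad/s

origin = s_to_z(0)                           # s = 0
on_axis = s_to_z(complex(0.0, omega_0))      # sigma = 0: on the imaginary axis
left_half = s_to_z(complex(-50.0, omega_0))  # sigma < 0: left half-plane

print(origin)
print(abs(on_axis), cmath.phase(on_axis))
print(abs(left_half), math.exp(-50.0 * T))
```

The point on the imaginary axis lands on the unit circle at angle \(\omega_0 T\), while the left half-plane point keeps the same angle and moves inside the unit circle.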
Region of convergence
Recall the definition of the Z-transform from equations \((\ref{eq:ztransform},\ref{eq:zalt},\ref{eq:zphi})\)
$$ \def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\ztransform{\lfz{\mathcal{Z}}} f[n]\ztransform\sum_{n=0}^\infty z^{-n}\ f[n]\\ \text{where } z\triangleq|z|\,\mathrm{e}^{j\varphi}\text{, } |z|\triangleq \mathrm{e}^{\sigma T}\text{, } \varphi\triangleq \omega T $$
This converges depending on the duration and magnitude of \(f[n]\) as well as on the magnitude \(|z|\). The phase \(\varphi\) has no effect on the convergence.
The power series for the Z-transform is called a Laurent series. The Laurent series represents an analytic function at every point inside the region of convergence. Therefore, the Z-transform and all its derivatives must be continuous functions of \(z\) inside the region of convergence.
Laurent series converge in an annular (ring-shaped) region of the \(z\)-plane, bounded by poles. The set of values of \(z\) for which the Z-transform converges is called the region of convergence (ROC). This means that anytime we use the Z-transform, we need to keep the region of convergence in mind.
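The role of the region of convergence can be seen by watching partial sums. The sketch below assumes the sequence \(f[n]=a^n\), whose unilateral transform converges for \(|z|\gt|a|\); the function name `partial_sum_magnitude` and the numeric values are hypothetical.

```python
def partial_sum_magnitude(a, z, terms):
    """Magnitude of sum_{n=0}^{terms-1} a**n * z**(-n)."""
    total = 0j
    ratio = a / z
    term = 1 + 0j  # the n = 0 term
    for _ in range(terms):
        total += term
        term *= ratio
    return abs(total)


a = 0.9
inside = [partial_sum_magnitude(a, 1.2 + 0j, n) for n in (10, 100, 1000)]
outside = [partial_sum_magnitude(a, 0.6 + 0j, n) for n in (10, 100, 1000)]

print(inside)   # settles near |z / (z - a)| = 4
print(outside)  # keeps growing: this z lies outside the region of convergence
```

The same series therefore defines a usable function of \(z\) only for points with \(|z|\gt|a|\).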