# Z-transform

What is now called the Z-transform (named in honor of Lotfi Zadeh) was known to the mathematician and astronomer Pierre-Simon Laplace around 1785. With the introduction of digitally sampled data, the transform was rediscovered by Hurewicz in 1947 and developed by Lotfi Zadeh and John Ragazzini around 1952 as a way to solve linear, constant-coefficient difference equations.

As we will see, the convolution property makes the Z-transform a powerful tool in analyzing sampled-data systems.

## Introduction

Just as causal continuous systems are governed by differential equations, causal discrete systems operate in accordance with difference equations.

From the difference equation we can derive the system characteristics such as the impulse response, step response and frequency response.

Unlike the continuous-time case, causal difference equations can be iterated, just as a computer program would do. All one needs to do is rewrite the difference equation so that the term $$y[n]$$ is on the left, and then iterate forward in time. This gives each value of the output sequence without ever obtaining a general expression for $$y[n]$$. In this article, however, we will look for a general analytical expression for $$y[n]$$ using the Z-transform.
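The forward iteration described above can be sketched in a few lines. The first-order equation $$y[n]=a\,y[n-1]+x[n]$$, the coefficient `a = 0.5`, and the function name are assumptions chosen for illustration; they do not come from the article.

```python
def iterate_difference_equation(x, a):
    """Iterate y[n] = a*y[n-1] + x[n] forward in time (hypothetical example system)."""
    y = []
    previous = 0.0  # y[-1] = 0: the system is assumed causal and initially at rest
    for sample in x:
        current = a * previous + sample  # y[n] isolated on the left-hand side
        y.append(current)
        previous = current
    return y


impulse = [1.0] + [0.0] * 7  # unit impulse, 8 samples
impulse_response = iterate_difference_equation(impulse, a=0.5)
print(impulse_response)
# [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125]
```

Each output value is produced one step at a time. The pattern $$0.5^n$$ is visible in the numbers, but the loop itself never yields that closed-form expression.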

The Z-transform can be thought of as an operator that transforms a discrete sequence into a continuous algebraic function of the complex variable $$z$$. As we will see, one of the nice features of this transform is that a convolution in time transforms to a simple multiplication in the $$z$$-domain.

## Unilateral Z-Transform

We solve a difference equation by taking the Z-transform of both sides, solving the resulting algebraic equation for the output $$Y(z)$$, and then applying the inverse transform to obtain $$y[n]$$.

Assuming causal filters, the output of the filter will be zero for $$t\lt 0$$.

### Sampling creates a discontinuous function

For digital systems, time is not continuous but passes at discrete intervals. When a digital system measures a continuous-time signal every $$T$$ seconds, the resulting signal is said to be discrete with sampling period $$T$$.

To help understand the sampling process, assume a continuous function $$x_c(t)$$ as shown below

To work toward a mathematical representation of the sampling process, consider a train of evenly spaced impulse functions starting at $$t=0$$. This so-called Dirac comb, $$s(t)$$, has a spacing of $$T\gt0$$ and includes an impulse at $$t=0$$.

The Dirac comb $$s(t)$$ can be expressed as

$$s(t) = \sum_{n=0}^\infty{\delta(t-nT)} \label{eq:combcond}$$
where the impulse function $$\delta(t-nT)$$ must satisfy the condition
$$\int_{-\infty}^{\infty}\delta(t-nT)\,\mathrm{d}t=1$$

Multiplying the Dirac comb $$s(t)$$ with a continuous-time signal $$x_c(t)$$ scales each impulse of the comb by the value of $$x_c(t)$$ at that instant.

The resulting signal $$x_s(t)$$ follows from substituting $$s(t)$$ from equation $$\eqref{eq:combcond}$$ as

\begin{align} x_s(t)&\triangleq x_c(t)\,s(t)\label{eq:fstar0} \\ &= x_c(t)\sum_{n=0}^{\infty}{\delta(t-nT)} \nonumber \\ &= \sum_{n=0}^{\infty}{x_c(t)\ \underbrace{\delta(t-nT)}_{\text{delayed impulse}}} \label{eq:fstar} \end{align}

The impulse function $$\delta(t-nT)$$ is $$0$$ everywhere but at $$t=nT$$, the so-called sifting property, so within the $$n$$-th term we can replace $$x_c(t)$$ with its value at $$t=nT$$

\begin{align} x_c(t) &= x_c(nT)\label{eq:ft} \end{align}
so that
$$x_s(t)=\sum_{n=0}^{\infty}{x_c(nT)\ \delta(t-nT)} \label{eq:fstarnT}$$

### Work towards a continuous function

The goal is to form a continuous algebraic expression, so we can use algebra to manipulate the difference equations.

Start with the Laplace transform of sampled signal $$x_s(t)$$ from equation $$\eqref{eq:fstarnT}$$

\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\laplace{\lfz{\mathscr{L}}} \begin{align} x_s(t) \laplace X_s(s)\triangleq&\int_{0^-}^\infty e^{-st}\overbrace{\sum_{n=0}^{\infty}x_c(nT)\ \delta(t-nT)}^{x_s(t)}\ \mathrm{d}t \nonumber \\ =& \int_{0^-}^\infty \sum_{n=0}^{\infty}e^{-st}\,x_c(nT)\ \delta(t-nT)\ \mathrm{d}t \label{eq:laplace0} \end{align}

Once more, the impulse function $$\delta(t-nT)$$ is $$0$$ everywhere but at $$t=nT$$, the “sifting property”, so we can replace $$\mathrm{e}^{-st}$$ with

$$\mathrm{e}^{-st}=\mathrm{e}^{-snT}\label{eq:est}$$

After substituting $$\eqref{eq:est}$$ in $$\eqref{eq:laplace0}$$, the terms $$\mathrm{e}^{-snT}$$ and $$x_c(nT)$$ are independent of $$t$$ and can be taken outside of the integration.

\begin{align} X_s(s)&=\int_{0^-}^\infty \sum_{n=0}^\infty e^{-s\color{blue}{nT}}x_c(\color{blue}{nT})\ \delta(t-nT)\ \mathrm{d}t \nonumber \\ &= \sum_{n=0}^\infty e^{-s{nT}}x_c(nT)\underbrace{\int_{0^-}^\infty \delta(t-nT)\ \mathrm{d}t}_{\text{=1 according to equation (\ref{eq:combcond})}} \nonumber \\ &= \sum_{n=0}^\infty\ e^{-s{nT}}\ x_c(nT) \label{eq:Fs} \end{align}

### The Z-transform follows

Define $$z$$ and $$x[n]$$ as

\shaded{ \begin{align} z &\triangleq e^{sT} \nonumber \\ x[n] &\triangleq x_c(nT) \nonumber \end{align} } \label{eq:z}

The notation $$x[n]$$, in which time is scaled by the sample period $$T$$, matches the notation for computer arrays. Anytime you see an $$[n]$$, you can translate it to seconds by replacing it with $$(nT)$$. Be careful with integer expressions such as $$[n-k]$$, which stands for $$((n-k)T)$$ seconds, not $$(nT-k)$$.
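As a small numeric check of the index-to-seconds rule, the sketch below uses an assumed sample period of 0.25 s and a hypothetical helper `sample_time`; neither comes from the article.

```python
def sample_time(index, T):
    """Translate a sample index to seconds: [index] -> (index * T)."""
    return index * T


T = 0.25      # assumed sample period in seconds
n, k = 10, 3  # current index and a delay of k samples

delayed_time = sample_time(n - k, T)  # [n - k] stands for (n - k) * T seconds
mistaken_time = n * T - k             # the common mistake: nT - k

print(delayed_time, mistaken_time)
```

With these numbers the delayed sample sits at 1.75 s, whereas the mistaken expression gives -0.5, which is not even a valid time for a causal signal.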

From equation $$\eqref{eq:z}$$ follows

$$\ln z = sT\ \Rightarrow\ s=\tfrac{1}{T}\ln z \label{eq:s}$$

Substitute $$(\ref{eq:z},\ref{eq:s})$$ in $$\eqref{eq:Fs}$$ and apply the power rule $$x^{ab}=(x^a)^b$$. Call the function $$X(z)$$ because $$z$$ is the only variable after the substitution $$s=\tfrac{1}{T}\ln z$$

$$X(z) = \sum_{n=0}^\infty\ \overbrace{z^{-n}}^{(e^{sT})^{-n}}\ x[n]$$

The unilateral Z-transform of the discrete function $$x[n]$$ follows as

$$\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\ztransform{\lfz{\mathcal{Z}}} \shaded{ x[n] \ztransform X(z)=\sum_{n=0}^\infty z^{-n}\ x[n] } \label{eq:ztransform}$$
where $$z\in\mathbb{C}$$ is a complex variable.

Note that we use the notation $$\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\ztransform{\lfz{\mathcal{Z}}} \ztransform$$ as equivalent to the more common Z-transform notation $$\mathfrak{Z}\left\{\,x[n]\,\right\}$$.

In review: the Z-transform maps a sequence $$x[n]$$ to a continuous function $$X(z)$$ of the complex variable $$z$$.
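To make the definition concrete, the following sketch evaluates a truncated version of the sum $$\sum_{n=0}^\infty z^{-n}x[n]$$ and compares it with a known closed form. The test sequence $$x[n]=a^n$$, whose transform is the geometric-series result $$\tfrac{z}{z-a}$$ for $$|z|\gt|a|$$, and the chosen values of `a` and `z` are assumptions for illustration.

```python
def z_transform(x, z):
    """Evaluate the truncated sum X(z) = sum over n of z^(-n) * x[n]."""
    return sum(x_n * z ** (-n) for n, x_n in enumerate(x))


a = 0.5
x = [a ** n for n in range(200)]  # x[n] = a^n, truncated to 200 samples

z = complex(0.9, 0.4)             # evaluation point with |z| > |a|
numeric = z_transform(x, z)
closed_form = z / (z - a)         # geometric series: 1 / (1 - a/z)

print(abs(numeric - closed_form))  # truncation error, expected to be tiny
```

The truncation is harmless here because the terms $$(a/z)^n$$ decay quickly. For points with $$|z|\lt|a|$$ the same sum would grow without bound.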

### Normalized frequency

The discrete signal only exists at times $$t=nT$$ where $$n\in\{0,1,2,\ldots\}$$. By normalizing time $$t$$ with the sampling interval $$T$$, using definition $$\eqref{eq:z}$$, we get the normalized time, measured in “samples”, on the time axis

$$x[n]\triangleq x_c(nT) \nonumber$$
We highlight the normalized time by using the $$[n]$$ notation.

When using normalized time, other time-dependent variables should be normalized as well. The angular frequency $$\omega$$, measured in [rad/s], normalizes to the normalized angular frequency with units of [rad/sample] by multiplying it with the sample period $$T$$ [s/sample]

Conversion to/from normalized

| | regular | | normalized |
|---|---|---|---|
| time | [s] | $$\xrightarrow{\div T}$$ | [sample] |
| angular frequency | [rad/s] | $$\xrightarrow{\times T}$$ | [rad/sample] |
| natural frequency | [cycles/s] | $$\xrightarrow{\times T}$$ | [cycles/sample] |

Using normalized frequency allows an author to present concepts independent of sample rate, but it comes at a loss of clarity as $$T$$ and $$f_s$$ are omitted from expressions.

When visualizing variable $$z$$, the normalized angular frequency $$\omega T$$ corresponds to the angle with the positive horizontal axis.

Formulas expressed in terms of $$f_{s}$$ and/or $$T$$ are not normalized and can be readily converted to normalized frequency by setting those parameters to $$1$$. The inverse is accomplished by replacing instances of the angular frequency parameter $$\omega$$ with $$\omega T$$.

Note that some authors use $$\omega$$ for normalized angular frequency in [rad/sample], and $$\Omega$$ for angular frequency in [rad/s]. Here we avoid using $$\omega$$ for normalized frequencies. Instead, we use the product $$\omega T$$ to refer to the normalized angular frequency. By doing so, it is clear that the angular frequency $$\omega$$ is scaled by the sample period $$T$$.
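The conversion from [rad/s] to [rad/sample] amounts to one multiplication by $$T$$. The sketch below assumes a sample rate of 8 kHz and a hypothetical helper name; both are illustrative only.

```python
import math


def normalized_angular_frequency(f_hz, fs_hz):
    """Return omega*T in rad/sample for a frequency f_hz sampled at fs_hz."""
    T = 1.0 / fs_hz                # sample period in s/sample
    omega = 2.0 * math.pi * f_hz   # angular frequency in rad/s
    return omega * T


fs = 8000.0                                          # assumed sample rate in Hz
tone = normalized_angular_frequency(1000.0, fs)      # a 1 kHz tone
half_rate = normalized_angular_frequency(fs / 2, fs)  # half the sample rate

print(tone, half_rate)
```

A 1 kHz tone at this sample rate lands at $$\pi/4$$ rad/sample, and half the sample rate lands at $$\pi$$ rad/sample regardless of the value chosen for `fs`.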

### Nyquist–Shannon sampling theorem

The frequency-domain representation of a sampled signal teaches us about the limitations of using a discrete signal. We will show this by doing a Fourier transform on the sampled signal $$x_s(t)$$.

Recall equations $$\eqref{eq:fstarnT}$$ and $$\eqref{eq:fstar0}$$, but this time in bilateral form

\begin{align} x_s(t)=x_c(t)\,s(t) &= \sum_{n=-\infty}^{\infty}{x_c(nT)\ \delta(t-nT)} \nonumber \\ &= \underbrace{x_c(t)}_{f(t)}\ \ \underbrace{\sum_{n=-\infty}^{\infty}\delta(t-nT)}_{g(t)} \nonumber \end{align}

The convolution theorem states that multiplication in the time-domain corresponds to convolution in the frequency-domain

$$\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\fourier{\lfz{\mathcal{F}}} f(t)\,g(t)\fourier \frac{1}{2\pi}{\Large(}F(\omega)*G(\omega){\Large)} \nonumber$$ where $$*$$ is the convolution sign

Apply the Fourier transform of a product to $$x_s(t)$$ and call it $$X_s(\omega)$$

\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\fourier{\lfz{\mathcal{F}}} \begin{align} x_s(t) = x_c(t)\,s(t) \fourier&\frac{1}{2\pi}{\Large(}\color{purple}{X_c(\omega)}*S(\omega){\Large)}\triangleq X_s(\omega) \label{eq:fstar2} \end{align}

Recall the Fourier transform of the Dirac comb $$s(t)$$

$$\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\fourier{\lfz{\mathcal{F}}} s(t)=\sum _{n=-\infty }^{\infty }\delta(t-nT) \fourier {\frac {2\pi }{T}}\sum _{k=-\infty }^{\infty }\delta \left(\omega -{\frac {2\pi k}{T}}\right)\triangleq S(\omega) \nonumber$$

Substituting the Fourier transform of the Dirac comb in $$\eqref{eq:fstar2}$$, where $$\frac{2\pi}{T}=2\pi f_s=\omega_s$$ is the angular sampling frequency, gives

\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\fourier{\lfz{\mathcal{F}}} \newcommand\ccancel[2][black]{\color{#1}{\cancel{\color{black}{#2}}}} \newcommand\ccancelto[3][black]{\color{#1}{\cancelto{#2}{\color{black}{#3}}}} \begin{align} x_s(t) \fourier X_s(\omega) &= \frac{1}{\ccancel[red]{2\pi}}\color{purple}{X_c(\omega)} * \left(\frac{\ccancel[red]{2\pi}}{T}\sum_{k=-\infty}^{\infty}\delta(\omega-k\omega_s)\right) \nonumber \\ &= \frac{1}{T}\sum_{k=-\infty}^{\infty}\color{purple}{X_c(\omega)} * \delta(\omega-k\omega_s)\label{eq:XsjOmega0} \end{align}

Recall the convolution with a delayed impulse function $$\delta(\omega-a)$$

$$F(\omega)*\delta(\omega-a) = F(\omega-a)\nonumber$$

Apply the convolution to $$\eqref{eq:XsjOmega0}$$

\begin{align} X_s(\omega) &= \frac{1}{T}\sum_{k=-\infty}^{\infty}\color{purple}{X_c(\omega}-k\omega_s\color{purple}{)} \label{eq:XsjOmega} \end{align}

Equation $$\eqref{eq:XsjOmega}$$ implies that the Fourier transform of $$x_s(t)$$ consists of periodically repeated copies of the Fourier transform of $$x_c(t)$$. The copies of $$\color{purple}{X_c(\omega)}$$ are shifted by integer multiples of the sampling frequency and then superimposed as depicted below.

Plot (a) represents the frequency spectrum of the continuous signal $$\color{purple}{X_c(\omega)}$$, where $$\omega_{\small N}$$ is the highest frequency component. Plot (b) shows the frequency spectrum of the Dirac comb $$S(\omega)$$. Finally, plot (c) shows $$X_s(\omega)$$, the result of the convolution between $$\color{purple}{X_c(\omega)}$$ and $$S(\omega)$$.

From (c), we see that the replicas of $$\color{purple}{X_c(\omega)}$$ do not overlap when

$$\omega_s-\omega_{\small N}\geq\omega_{\small N}\quad\Rightarrow\quad\shaded{\omega_s\geq2\,\omega_{\small N}} \label{eq:ineq}$$

Consequently, $$x_c(t)$$ can be recovered from $$x_s(t)$$ with an ideal low-pass filter $$H_r$$.

Inequality $$\eqref{eq:ineq}$$ is captured in the Nyquist–Shannon sampling theorem

Let $$x_c(t)$$ be a bandlimited signal with \begin{align} X_c(\omega)&=0,&|\omega|\gt \omega_{\small N}\nonumber \end{align} \nonumber then $$x_c(t)$$ is uniquely determined by its samples \begin{align} x[n] &= x_c(nT),&n\in\mathbb{Z}\nonumber \end{align} \nonumber if $$\omega_s = \frac{2\pi}{T}\geq2\,\omega_{\small N} \nonumber$$ where $$\omega_{\small N}$$ is called the Nyquist frequency and $$2\omega_{\small N}$$ is called the Nyquist rate.

When the inequality of $$\eqref{eq:ineq}$$ does not hold, i.e. when the sampling frequency $$\omega_s$$ is less than twice the maximum frequency $$\omega_{\small N}$$, the copies of $$X_c(\omega)$$ overlap as shown below, and $$X_c(\omega)$$ is no longer recoverable by low-pass filtering. The resulting distortion is called aliasing.
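Aliasing can be observed directly in the samples. In the sketch below, the sample rate of 10 Hz and the tone frequencies are assumptions for illustration: a 7 Hz cosine lies above half the sample rate, and its samples coincide with those of a 3 Hz cosine.

```python
import math

fs = 10.0              # assumed sampling frequency in Hz
T = 1.0 / fs           # sample period in seconds
f_high = 7.0           # above fs/2 = 5 Hz, so the inequality is violated
f_alias = fs - f_high  # 3 Hz, where the shifted spectral copy lands

samples_high = [math.cos(2 * math.pi * f_high * n * T) for n in range(20)]
samples_alias = [math.cos(2 * math.pi * f_alias * n * T) for n in range(20)]

# Largest sample-by-sample difference between the two sequences
max_difference = max(abs(h - a) for h, a in zip(samples_high, samples_alias))
print(max_difference)  # on the order of floating-point rounding error
```

Because the two sample sequences are indistinguishable, no filter applied after sampling can tell which of the two tones was present in the continuous signal.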

### $$z$$-plane

We introduced the normalized angular frequency $$\omega T$$ and how it maps to a polar representation of the complex variable $$z$$. Here we will extend this to include the magnitude $$|z|$$.

Let’s start with the definition of $$z$$ from equation $$\eqref{eq:z}$$ and split $$s$$ into its real and imaginary parts $$s=\sigma+j\omega$$

$$z\triangleq\mathrm{e}^{sT} = \mathrm{e}^{(\sigma+j\omega)T} = \mathrm{e}^{\sigma T}\,\mathrm{e}^{j\omega T} \label{eq:zalt}$$

The polar notation for the complex variable $$z$$ is a function of the normalized angular frequency $$\omega T$$ and $$\mathrm{e}^{\sigma T}$$

$$\shaded{ z\triangleq|z|\,\mathrm{e}^{j\omega T}\quad\text{where}\quad |z|\triangleq \mathrm{e}^{\sigma T} } \label{eq:zphi}$$

This conveniently matches the geometry of the $$z$$-plane, where the modulus $$|z|$$ corresponds to the length of a vector from the origin to $$z$$, and $$\omega T$$ corresponds to the angle of that vector with the positive horizontal axis.

According to the Nyquist–Shannon sampling theorem, a discrete signal can only have frequencies $$|\omega|$$ between $$0$$ and half the sampling frequency

$$0\leq|\omega|\leq\frac{\omega_s}{2}\label{eq:zphirange}$$

The maximum frequency in the discrete signal, the so-called Nyquist frequency, corresponds to $$\pi$$ radians. With sample period $$T=\frac{1}{f_s}=\frac{2\pi}{\omega_s}$$ and the Nyquist frequency $$\omega_{\small N}=\frac{\omega_s}{2}$$, the normalized angular frequency for the Nyquist frequency follows as

$$\omega_{\small N}T = \frac{\ccancel[green]{\omega_s}}{\ccancel[red]{2}}\,\frac{\ccancel[red]{2}\pi}{\ccancel[green]{\omega_s}} = \pi$$
the range for $$\omega T$$ follows as
$$\shaded{-\pi\leq\omega T\leq\pi}$$

For those familiar with the Laplace transform, we will map specific features between the $$s$$ and $$z$$-domain:

• The origin $$s=0$$ of the $$s$$-plane is mapped to $$z=e^0=1$$ on the real axis in the $$z$$-plane.
• Each vertical line $$\sigma=\sigma_0$$ in the $$s$$-plane is mapped to a circle $$|z|=e^{\sigma_0 T}$$ centered about the origin in the $$z$$-plane. For example:
• The leftmost vertical line $$\sigma\to-\infty$$ is mapped to the origin, where $$|z|=\mathrm{e}^{-\infty}=0$$.
• The imaginary axis $$\sigma=0$$ is mapped to the unit circle, where $$|z|=\mathrm{e}^0=1$$.
• The rightmost vertical line $$\sigma\to\infty$$ is mapped to a circle with an infinite radius, where $$|z|=\mathrm{e}^{\infty}=\infty$$.
• Each horizontal line $$j\omega=j\omega_0$$ in the $$s$$-plane is mapped to a ray from the origin in the $$z$$-plane, at an angle $$\omega_0 T$$ with respect to the positive horizontal direction.
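The mapping rules listed above can be checked numerically with $$z=\mathrm{e}^{sT}$$. The sample period of 1 ms, the test points, and the helper name `s_to_z` in this sketch are assumptions chosen for illustration.

```python
import cmath
import math


def s_to_z(s, T):
    """Map a point in the s-plane to the z-plane using z = exp(s*T)."""
    return cmath.exp(s * T)


T = 0.001  # assumed sample period in seconds

z_origin = s_to_z(0j, T)                    # s = 0 maps to z = 1
z_on_axis = s_to_z(complex(0.0, 1000.0), T)  # sigma = 0 maps onto the unit circle

# Points on the vertical line sigma = -200 share the radius exp(sigma*T)
sigma_0 = -200.0
omegas = [0.0, 500.0, 1500.0, 3000.0]       # rad/s, all with |omega*T| < pi
line_points = [s_to_z(complex(sigma_0, w), T) for w in omegas]
radii = [abs(z) for z in line_points]
angles = [cmath.phase(z) for z in line_points]  # equals omega*T for each point

print(z_origin, abs(z_on_axis), radii, angles)
```

The radii all equal $$\mathrm{e}^{\sigma_0 T}$$, and each angle equals the corresponding $$\omega T$$. Values of $$\omega T$$ beyond $$\pm\pi$$ would wrap around, which is why the test frequencies are kept inside that range.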

### Region of convergence

Recall the definition of the Z-transform from equations $$(\ref{eq:ztransform},\ref{eq:zalt},\ref{eq:zphi})$$

$$\def\lfz#1{\overset{\Large#1}{\,\circ\kern-6mu-\kern-7mu-\kern-7mu-\kern-6mu\bullet\,}} \def\ztransform{\lfz{\mathcal{Z}}} f[n]\ztransform\sum_{n=0}^\infty z^{-n}\ f[n]\\ \text{where } z\triangleq|z|\,\mathrm{e}^{j\varphi}\text{, } |z|\triangleq \mathrm{e}^{\sigma T}\text{, } \varphi\triangleq \omega T$$

Whether this sum converges depends on the duration and magnitude of $$f[n]$$ as well as on the magnitude $$|z|$$. The phase $$\varphi$$ has no effect on the convergence.

The power series for the Z-transform is called a Laurent series. A Laurent series represents an analytic function at every point inside its region of convergence. Therefore, the Z-transform and all its derivatives must be continuous functions of $$z$$ inside the region of convergence.

Laurent series converge in an annular (ring-shaped) region of the $$z$$-plane, bounded by poles. The set of values of $$z$$ for which the Z-transform converges is called the region of convergence (ROC). This means that anytime we use the Z-transform, we need to keep the region of convergence in mind.
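The effect of $$|z|$$ on convergence can be seen by watching partial sums. The sketch below assumes the test sequence $$f[n]=a^n$$ with $$a=0.5$$, for which the sum is a geometric series in $$a/z$$ that converges only when $$|z|\gt|a|$$. The sequence and the two evaluation points are illustrative choices.

```python
def partial_sum(a, z, terms):
    """Partial sum of z^(-n) * a^n for n = 0 .. terms-1."""
    return sum((a / z) ** n for n in range(terms))


a = 0.5
lengths = (10, 100, 1000)

# |z| = 0.8 > |a|: the partial sums settle toward z / (z - a)
inside_roc = [abs(partial_sum(a, 0.8, N)) for N in lengths]

# |z| = 0.4 < |a|: the partial sums keep growing
outside_roc = [abs(partial_sum(a, 0.4, N)) for N in lengths]

print(inside_roc)
print(outside_roc)
```

For this sequence the boundary of the region of convergence is the circle $$|z|=|a|$$, which passes through the pole of $$\tfrac{z}{z-a}$$ at $$z=a$$.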