Getting started with FPGA design using Altera Quartus Prime 16.1

A (relatively) short introduction to compiling, simulating and uploading using the Altera Quartus development environment for the Terasic Altera Cyclone IV DE0-Nano under Windows 10. An equivalent tutorial is available for the reader who prefers Xilinx based boards.

Install the FPGA Design Suite

Start by installing the free Quartus Prime Lite, this includes the IDE and required tool chain to create configuration files for the Altera FPGA.

  1. Install Quartus Prime Lite 16.1
    • download the Quartus Prime Lite Edition from
    • include device support for Cyclone IV and ModelSim-Altera Starter Edition.
    • run QuartusLiteSetup-, install to a path without spaces in the name (e.g. C:\intelFPGA_lite\16.1)
    • authorize USB JTAG Blaster II driver installation
  2. Run the Quartus Prime 16.1 Device Installer
    • install Cyclone IV and ModelSim-Altera Starter support

The USB JTAG driver needs some finishing up

  1. Use a USB cable to connect your computer to the NE0-Nano board
  2. Go in Window’s Device Manager
    • Right-click Other devices » USB-Blaster » Update Driver Software
    • Browse my computer for driver software at C:\intelFPGA_lite\16.1

Install Board Support for DE0-Nano

The Teraasic board support for DE0-Nano includes examples, user manual and the Terasic System Builder tool.

Download DE0-Nano CD-ROM from

  • unzip to a directory of your liking (e.g. C:\altera_lite\DE0-Nano)

Note that its Control Panel fails with Load DLL (TERASIC_JTAG_DRIVE.dll) under 64-bit Windows. No biggie, we do not need it.

Before you start

The simulator doesn’t play well with UNC paths (e.g. network shares). It triggers numerous errors and may cause timing simulations to disregard delays. Keeping the files on the server is fine, for as long as you access them through a symbolic link. To create a symbolic link, change to your home directory (cd %USERPROFILE%) and create a symbolic link to your file server (mklink /d symlink \\server\service\path).

If you add the symbolic link to Explorer’s Quick Access, it will resolve the link first. To work around this, first create a regular directory and add that to the Quick Access. Then replace that directory with the symbolic link.

A few last tip before you take off:

Stay clear of absolute path names. Not only do they cause havoc when moving a project to a new directory, but they also confuse the simulator. If the ellipsis () file selector returns an absolute path, simply edit it to make the file relative to the project directory. E.g. change \C:\Users\you\path\project.v to project.v. If you want to quickly change all absolute paths, I suggest closing Quartus and pulling .qsf into an text editor (e.g. emacs).

Lastly, store the files on a local drive. It makes the process of compiling much faster.

Build a circuit

Terasic advises to start with their System Builder to reduce the risk of damaging the board by incorrect I/O settings. I will throw this caution to the wind and use an example Quartus Setting file instead

Start Quartus Prime and create a new project

  • File » New Project Wizard
    • Choose a working directory (.c:\users\you\whatever\example1); project name = example1
    • Project type = Empty Project
    • do not add files
    • Family = Cyclone IV E; name filter = EP4CE22F17C6; click the only device that matches
    • Finish, accepting the default EDA Tools Settings

Start with a Verilog HDL module that tests if inputs are equal (File » New » Design file > Verilog HDL, and later save it as eq1.v)

`timescale 1ns / 1ps
module eq1( input i0, input i1, output eq ); 
    wire p0, p1; // internal signal declaration
    assign eq = p0 | p1; 
    assign p0 = ~i0 & ~i1;
    assign p1 = i0 & i1; 

Add a Verilog HDL module (File » New » Design file > Verilog HDL, and later save it as eq2.v)

`timescale 1ns / 1ps
module eq2( input [1:0] a, input [1:0] b, output aeqb ); 
    wire e0, e1; // internal signal declaration
    eq1 eq_bit0_unit(.i0(a[0]), .i1(b[0]), .eq(e0));
    eq1 eq_bit1_unit(.i0(a[1]), .i1(b[1]), .eq(e1));
    assign aeqb = e0 & e1; // a and b are equal if individual bits are equal

For the top level, we will use a schematic. Create a symbol file (eq2.sym) for the eq2 module so we can reference it in the schematic.

  • File » Create/Update » Create Symbol Files for Current File

Create the top level schematic

  • File » New » Design file » Block Diagram/Schematic
  • File » Save as » example1.bdf
  • Make sure that this top-level module has the same name as its source file and is the same name as your project name.
  • Update the block diagram (.bdf)
    • select the block diagram tab
    • place the new symbol
      • double-click the canvas
      • expand the project directory and select the eq2.sym file that we just created
      • place the symbol in the block diagram
    • add inputs
      • double-click the canvas
      • expand the libraries directory and select “primitive » pin » input”
      • place the pin so that it touches input a
      • change the pin name to SWITCH[1..0]
      • copy and paste the input pin, and place it so that it touches input b
      • change the pin name to SWITCH[3..2]
    • add output
      • double-click the canvas
      • expand the libraries directory and select “primitive » pin » output”
      • place it so it touches output aeqb (a equals b)
      • change the pin name to LED[0]
  • Last, but not least: Mark this file as the top-level module
    • Right-click the file name, and select “Set as Top-Level Entry”


  • Processing » Start » Start Analysis & Elaboration

Implementation Constraints

Time to constrain the implementation by specifying the input and output pins along with timing requirements.

External pins assignments

  • Assignments » Pin Planner
    • This will show the five I/O pins
    • Don’t worry about the Fitter Location, it is just whatever the fitter chose last time
    • Double-click the Location field next each of the pin names to add the pin numbers (based on Table 3-2 and 3-3 in the Terasic board’s user manual)
      LED[0] PIN_A15
      SWITCH[3] PIN_M15
      SWITCH[2] PIN_B9
      SWITCH[1] PIN_T8
      SWITCH[0] PIN_M1
    • Change the I/O standard to 3.3V_LVTTL based on the same user manual (change the first one, then copy and paste)

You should end up with something like

Pin assignments

If you plan to branch out, I suggest downloading and importing the settings file (.qsf) with pin assignments from

Timing requirements

For most design you will want to specify timing requirements. For our simple example however we will not do this.

For the record, to request specific timing requirements (default time quest) you would create Synopsys Design Constraints (File » New SCD File) and save it with the same base name as the top level file (e.g. example1.sdc). For our example it just contains some comments as a reminder.

#create_clock -period 20.000 -name CLOCK_50

For more info, refer to Altera’s class Constraining Source Synchronous Interfaces.

Synthesize and upload to FPGA

Before moving ahead, I like to shorten the path names in the Quartus Settings file (.qsf). This prevents problems when you move the project and especially when you have it stored on a file server. As we will see later, the ModelSim simulator doesn’t support UNC path names, but will honor relative paths even when the projects is on a file server.

  • Assignments » Settings
    • Remove the files (.bdf, .v, .sdc)
    • Add the files back in. After selecting each file using the ellipsis () file selector, edit the resulting path name to exclude the path and press Add.
  • Project Navigator
    • Right-click example1.v and choose Set as Top-Level Entry
  • Alternatively, you can close Quartus and pull the .qsf file in a text editor (e.g. Emacs) and shorten the path names (*_FILE) at the end of the file.

Let us move ahead and generate the binary SRAM Object File (.sof). This is the configuration file to be uploaded to the FPGA

  • Click Compile Design in the Tasks window, or press the Play button in the toolbar.
    • The Compilation Report tab shows the result
    • Correct any critical warnings or errors

This would be the moment to do a timing analysis and/or simulation. However, at this point I’m simply too curious to see if it works just out of the box, so let’s give it a shot.

Connect the DE0-Nano board with a USB cable and upload the configuration file to the FPGA SRAM. (Alternately upload to Flash)

  • Quartus Prime Lite » Tools » Programmer (or double-click Program Device in the task list)
    • Click Hardware Setup, and select USB-Blaster [USB-0]
    • Click Add File, and select your SRAM object file (.sof) in the output_files directory
    • Click Start
    • Save as example1.cdf, so it opens with these settings next time
    • Great! We did it, the design is now on the FPGA (until we remove the power)

Give it a spin

With the FPGA configured, time has come to test it

  • Input is through the DIP switches (SWITCH) on the bottom-left of the board.
  • Output is the right LED (LED[0]) located at the top of the board.
  • We expect the LED to be “on” when switch positions 0 and 1 are identical to positions 2 and 3.

If you prefer bigger switches, you can wire up a breadboard to the GPIO-0 connector.

  • VCC3p3 and Ground are available on the 40-pin expansion header at respectively pin 29 and 12 (or 30) as specified in the board manual.
  • welldoneGPIO_00, 01, 02, 03 are on FPGA BGA pins at respectively D3, C3, A2, A3 and on the 40-pin expansion header at respectively pin 2, 4, 5 and 6.
  • Modify the user constraints file accordingly.

Give yourself a pat on the back; you have achieved something new today!

Timing Analysis

With the first hurdle cleared, we are going to take a closer look at the circuit. The first step will analyze the delays in the circuit to determine the conditions under which the circuit operates reliably. We will follow up with a functional and timing simulation.

If you just want to the timing information on the port-to-port path, you can use

  • Task » Reports » Custom Reports » Report Timing
    • From » click ...
      • Collection = get_ports; List; select SWITCH[0] through SWITCH[3]; click ‘>
    • To » click ...
      • Collection = get_ports; List; select LED[0]; click ‘>

To place timing constraints on the port-to-port paths:

  • Tools » TimeQuest Timing Analyzer
    • Tasks » Update Timing Netlist ; double-click
      • Constraints » Set Maximum Delay
      • From » click ...
        • Collection = get_ports; List; select SWITCH[0] through SWITCH[3]; click ‘>
      • To » click ...
        • Collection = get_ports; List; select LED[0]; click ‘>
      • Delay value = 100 ns (more or less random value)
      • Run
    • Constraints » Write SDC File to example1.sdc
    • Task » Reports » Custom Reports » Report Timing
      • no need to specify the path, just click Report Timing
      • this reveals a Data Delay of about 6.7 ns.

If you want Quartus to meet more stringent restrains, you need to specify these and recompile. This will direct Quartus to seek an implementation that meets these constraints. However, in this case we only specified restraints because we’re interested in the values. [TimeQuest User Guide]

Functional Simulation

functionalThe Verilog HDL (.v) instructions are compiled (Analysis and Synthesis) into Register Transfer Logic (RTL) netlists, called Compiler Database Files (.cdb). As the name implies, data is moved through gates and register, typically subject to some clocking condition. A functional simulation tests the circuit at this RTL level. As such it will simulate the functionality, but not the propagation delays.

Altera Quartus ships bundled with their flavor of ModelSim that lets you draw waveforms for the signals going to the module under test. You can then save these waveforms as a file or HDL test bench. For this example we to skip the step of drawing waveforms, and jump straight to using test benches.

Start by creating a test bench (eq2_tb.v) that instantiate module under test and drive its inputs. As before, you should remove the path from the file name (Assignments » Settings » Files).

`timescale 1ns / 100ps
`default_nettype none

module eq2_tb;
  reg [1:0] a;  // inputs
  reg [1:0] b; 
  wire aeqb;  // output
  // Instantiate the Device Under Test (UUT)
  eq2 dut ( .a(a), 
            .aeqb(aeqb) );
  initial begin
    a = 0;  // initialize inputs
    b = 0;
    #100;  // wait 100 ns for global reset to finish (Xilinx only?)
    // stimulus starts here
    a = 2'b00; b = 2'b00; #10 $display("%b", aeqb);
    a = 2'b01; b = 2'b00; #10 $display("%b", aeqb);
    a = 2'b01; b = 2'b11; #10 $display("%b", aeqb);
    a = 2'b10; b = 2'b10; #10 $display("%b", aeqb);
    a = 2'b10; b = 2'b00; #10 $display("%b", aeqb);
    a = 2'b11; b = 2'b11; #10 $display("%b", aeqb);
    a = 2'b11; b = 2'b01; #10 $display("%b", aeqb);
    #200 $stop;

Configure Quartus to create a script that compiles the test bench and modules for ModelSim

  • Assignments » Settings » EDA Tool Settings » Simulation » Compile test bench
    • New
    • Test bench name = Functional test bench
    • Top level module in test bench = eq2_tb
    • File name = eq2_tb.v, eq1.v, eq2.v (name the test bench and all the modules that ModelSim needs to compile). After selecting each file using the ellipsis () file selector, edit the resulting path name to exclude the project path, then press Add.

If things don’t go your way, please check:

  • I strongly suggest putting the test bench in a separate file, when doing timing simulations. This keeps ModelSim from using the .v file instead of the .vo file what causes propagation delays not to show in timing simulations.
  • (sdfcomp-7) Failed to open SDF file “whatever.sdo” in read mode is most likely caused by the files residing on a UNC path. Refer to the quote at the beginning of this article.
  • (sdfcomp-14) Failed to open “modelsim.ini” specified by the MODELSIM environment variable is also most likely caused by the files residing on a UNC path. Refer to the quote at the beginning of this article.
  • When you get error deleting “msim_transcript” permission denied, close ModelSim first before starting it.
  • Errors like Instantiation of ‘cycloneive_io_obuf’ failed. The design unit was not found indicate that the global libraries altera_ver or cycloneive_ver were not included in Assignments » Settings » Libraries.
  • Keep in mind that the Windows filename limit (260) may be exceeded.
  • To prevent Warning: (vsim-WLF-5000) WLF file currently in use: vsim.wlf, quit ModelSim and delete simulation/modelsim/sim.wlf and simulation/modelsim/wlft*.

Compile the circuit to RTL netlists

  • Processing » Start » Start Analysis & Elaboration

Start the simulation

  • Tools » Simulation Tool » RTL Simulation
  • Select signals of interest
    • Objects » select the signals of interest » Add Wave (or the green circle with the + sign)
    • In the text field in the toolbar, change the simulation time from 1 ps to 200 ns and click the run button to the right (or type run 1 us at the command prompt)
    • Undock the Wave Window (icon on the far top right of the window). In case the Wave Window is hidden, use Windows » Wave.
    • Click “zoom full” and observe the simulated waveforms.

You should see something like

own work

If you make a change to either your DUT or your test bench

  • right-click the test bench and select recompile.
  • make sure that you click the Restart button directly to the right of the simulation time window (or simply type restart at the command prompt).

For more information, refer to Quartus II Handbook 3: Verification

Timing Simulation

timingBefore you start a timing simulation, first close ModelSim.

After compiling your circuit into RTL netlists (.cdb), it is fitted to the physical device (.qsf) trying to satisfy resource assignments and constraints. It then attempts to optimize the remaining logic. The Netlist Writer can export this information for ModelSim in the form of Verilog Output Files (.vo), Standard Delay Format Output Files (.sdo) and scripts to prepare ModelSim through compilation and initialization.

For this exercise, we will do a timing simulation for the whole system. With our top-level entry being a schematic (.bdf), we first need to convert this to Verilog HDL

  • Open the schematic
  • File » Create/Update » Create HDL Design File from Current File
    • File type = Verilog HDL
  • Assignments » Settings » Files
    • replace the example1.bdf file with example1.v
  • Make it the top-level module (so much for using a schematic … eh)

Create a system test bench (example1_tb.v). As before, you should remove the path from the file name (Assignments » Settings » Files).

`timescale 1ns / 100ps
`default_nettype none

module example1_tb;
  reg [3:0] SWITCH;  // inputs
  wire LED;          // output
  example1 uut ( .SWITCH(SWITCH), 
                 .LED(LED) );
  initial begin
    SWITCH[3:0] = 2'b0000; // initialize inputs
    #100;  // wait 100 ns for global reset to finish (Xlinx only?)
    // stimulus starts here
    SWITCH[1:0] = 2'b00; SWITCH[3:2] = 2'b00; #10 $display("%b", LED);
    SWITCH[1:0] = 2'b01; SWITCH[3:2] = 2'b00; #10 $display("%b", LED);
    SWITCH[1:0] = 2'b01; SWITCH[3:2] = 2'b11; #10 $display("%b", LED);
    SWITCH[1:0] = 2'b10; SWITCH[3:2] = 2'b10; #10 $display("%b", LED);
    SWITCH[1:0] = 2'b10; SWITCH[3:2] = 2'b00; #10 $display("%b", LED);
    SWITCH[1:0] = 2'b11; SWITCH[3:2] = 2'b11; #10 $display("%b", LED);
    SWITCH[1:0] = 2'b11; SWITCH[3:2] = 2'b01; #10 $display("%b", LED);
    #200 $stop;

Create the verilog output files

  • Processing » Start » Start EDA Netlist Writer

Configure Quartus to create a script that compiles the test bench and modules for ModelSim

  • Assignments » Settings » Libraries » Global
    • Add
      • altera_ver
      • cycloneive_ver
  • Assignments » Settings » EDA Tool Settings » Simulation
    • Compile test bench
      • Delete the old Functional test bench
      • New
      • Test bench name = Timing test bench
      • Top-level module in test bench = example1_tb
      • Add the files listed below. After selecting each file using the ellipsis () file selector, edit the resulting path name to exclude the project path, then press Add.
        • example_tb.v
        • simulation/modelsim/example1.vo
          • it appears that eq2 and eq1 were included inside simulation/modelsim/example1.vo. If you have larger modules, make sure you include al the .vo files here.
        • if the simulation/modelsim directory doesn’t exist: recompile, and then add the .vo module.
      • If things don’t go your way, refer to the section “Some hints ..” under “Test bench for functional simulation”.

Start a full synthesis for the circuit

  • Processing » Start » Start Compilation

Start ModelSim

  • Tools » Simulation Tool » Gate Level Simulation
    • Timing model = Slow -6 1.2V 85 Model (default), this simulates a slow -6 speed grade model at 1.2 V and 85 °C.

Select the signals

  • Objects » select the signals of interest » Add Wave
  • Increase simulation time to 1 µs and click the run button on the right
  • Undock the Wave Window (so you can expand it)
  • Click “zoom full” and observe the simulated waveforms. You should see something like

Expect something like

own work

For more hands on experience, refer to Altera’s excellent University Program lessons, or the Terasic CD-ROM files (C:\altera_lite\DE0-Nano) for examples and documentation. Altera also has a more generic Become a FPGA Designer video-based class.

Our first implementation is the SPI interface Math Talk. This is then used to build a demonstration of math operations in FPGA as eluded to from the inquiry How do Computer do Math?.

c’est tout

Building Math Hardware

derived work
Primary school teaches our students methods for addition, subtraction, multiplication and division. Computer hardware implements similar methods and performs them with astounding speed. This enables applications such as computer vision, process control, encryption, hearing aids, video compression to dubious practices as high-frequency trading.


This article describes how to build such hardware. In such, it is a sequel to the inquiry “How do Computers do Math?” that in chapter 7 introduced conceptual circuits using logic gates. We will model the various math operations using digital gates. Here we combine the gates into circuits and describe them using the Verilog Hardware Description Language (HDL). These Verilog HDL descriptions are compiled and mapped to a Field Programmable Gate Array (FPGA). Working knowledge of the Verilog HDL is assumed. To learn more about Verilog HDL, I recommend the book FPGA Prototyping with Verilog Examples, an online class or lecture slides. To help you get up to speed with the development boards, wrote Getting Started documents for two popular Altera and Xilinx boards. We aim to study algorithms to implement the algorithms in generic VLSI, and as such do will not use the highly optimized carry chain connections present on many FPGAs.

Hail to the FPGA

Let’s take this moment to honor the virtues of the FPGA. A FPGA can do many things at the same time and still respond immediately to input events. It is generally more complicated to create and debug the same logic in a FPGA compared to a microprocessor. Because of the challenges that FPGA development poses, many systems combine FPGAs and microprocessors to get the best of both worlds. For example, to recognize objects in a video stream, one would implement the pre-processing (noise removal, normalization, edge detection) in an FPGA, but the higher-level logic on a CPU or DSP. With a FPGA, you can put massive parallel structures in place. For example, an high-speed data stream can be distributed across the whole FPGA chip to be processed in parallel, instead of having a microprocessor deal with it sequentially. FPGAs can accelerate machine learning algorithms, video encoding, custom algorithms, compression, indexing and cryptography. By implements a soft microprocessor as part of the FPGA it can also handle high level protocols such as handle board management, protocol bridging and security tasks. [IEEExplore]


The first chapter describes the demonstration setup. Remaining chapters implement the concepts introduced in the inquiry How do Computers do Math?.
  1. Introduction
  2. Demonstration
  3. Adder (and subtractor)
  4. Faster adder (and subtractor)
  5. Multiplier
  6. Faster multiplier
  7. Divider
  8. Square root
  9. Conclusion


The code examples were tested on an Altera “Cyclone IV E” FPGA using Quartus Prime 16.1. Earlier code iterations used a Xilinx “Spartan-6” with their ISE Design Suite. The Verilog descriptions should work equally well on other boards or environments. For debugging, we used gate level simulation. In the real world, a DSLogic Logic Analyzer proved valuable in diagnosing problems.

How do computers do math?

This inquiry answers the question “How do computers do math?”. What once started as an innocent question from a six year old, turned into this 9 page post.

After a short introduction, we dive down into solid-state chemistry from where we emerge traversing along Ohm’s law, semi-conductors, logic gates and combinational logic.

The first seven chapters answer of our inquiry. Chapter 5 and 6 are optional. The remaining chapters merely complete the discussion of logic gates.

  1. Introduction
  2. Electrical circuits
  3. Electronic circuits
  4. Diodes based logic
  5. Bipolar junction transistor (BJT) based logic
  6. Field effect transistor (FET) based logic
  7. Math operations using combinational logic
  8. Synchronous sequential logic
  9. Programmable logic