
As explained in more detail on the cosimulation overview page, one of the central research goals of the Hades framework was to study algorithms for system-simulation and fast hardware/software-cosimulation, including the coupling of an event-driven simulation engine with instruction-level processor simulators.

We chose the MIPS-I architecture as the demonstrator for cosimulation and the fast simulator coupling in Hades. The first reason for this decision is the architecture itself, with its simple and regular instruction set, straightforward memory-model, and clean exception and interrupt handling. Secondly, the market for 32-bit embedded systems and system-on-a-chip designs is still dominated by microcontrollers based on the MIPS and ARM architectures. Thirdly, documentation and tools for the MIPS architecture are readily available. For example, the DLX processor used in the textbooks by J.L. Hennessy and D. Patterson is closely based on the MIPS concepts.

The remainder of this document first gives a broad overview of the MIPS architecture, including instruction-set, memory-model, and interrupts. The following section then describes the relevant details, user-interface, and configuration settings of the TinyMips microprocessor. While TinyMips faithfully implements the full MIPS-I instruction-set and memory-model, it uses a simplified execution model without an instruction pipeline. Finally, we present an overview of the GNU toolchain and explain how to set up your own cross-compiler and the binutils assembler and helper tools. Using the gcc cross-compiler allows you to write and compile programs for the TinyMips processor on your own computer.

You might want to print a copy of this page and keep it around as a reference while studying the interactive applets based on the TinyMips and IDT R3051 microprocessors.

MIPS architecture overview

The MIPS architecture evolved from research on efficient processor organization and VLSI integration at Stanford University. The Stanford prototype chip proved that a microprocessor with a five-stage execution pipeline and cache controller could be integrated onto a single silicon chip, greatly improving performance over non-pipelined designs. At the same time, a research group at Berkeley designed the RISC-I chip based on very similar ideas. Today, the acronym RISC (originally "reduced instruction set computer") is often interpreted as "regular instruction set computer", and the RISC ideas are used in virtually every current microprocessor design.

The key concepts of the original MIPS architecture are:

  • five-stage execution pipeline: fetch, decode, execute, memory-access, write-result

  • regular instruction set, all instructions are 32-bit
  • three-operand arithmetical and logical instructions
  • 32 general-purpose registers of 32 bits each
  • no status register or instruction side-effects
  • no complex instructions (like stack management, string operations, etc.)
  • optional coprocessors for system management and floating-point

  • only the load and store instructions access memory
  • flat address space of 4 GBytes of main memory (2^32 bytes)
  • memory-management unit (MMU) maps virtual to actual physical addresses

  • optimizing C compiler replaces hand-written assembly code
  • hardware structure does not check dependencies - not "foolproof"
  • but software toolchain knows about hardware and generates correct code

In 1984, MIPS corporation was founded by members of the Stanford research team to develop a commercial version of the prototype chip. Their first product was the R2000 microprocessor, introduced in 1985, and followed in 1987 by the R2010 floating-point coprocessor. Both chips were successfully used in several of the early workstations. The next MIPS processor, called R3000, was a variant of the R2000 with the same instruction set, but optimized for low-cost embedded systems. This processor and its system-on-a-chip implementations are still popular and used in millions of devices (e.g. printers) even today. Since then, several improved variants of the original instruction set have been introduced:

  • MIPS-I: the original 32-bit instruction set; still common.
  • MIPS-II: improved instruction set with dozens of new instructions.
  • MIPS-III: a 64-bit instruction set used by the R4000 series.
  • MIPS-IV: an upgrade of the MIPS III.

One of the key features of the MIPS architecture is the regular register set. It consists of the 32-bit wide program counter (PC), and a bank of 32 general-purpose registers called r0..r31, each of which is 32 bits wide. All general-purpose registers can be used as the target registers and data sources for all logical, arithmetical, memory access, and control-flow instructions. Only r0 is special because it is internally hardwired to zero. Reading r0 always returns the value 0x00000000, and a value written to r0 is ignored and lost.
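The behaviour of r0 can be illustrated with a small model of the register bank. This is only a sketch: the class and method names are hypothetical and are not taken from the Hades sources.

```python
class RegisterFile:
    """Toy model of the 32 general-purpose registers r0..r31 (hypothetical names)."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, index):
        # r0 is hardwired to zero, whatever has been written to it.
        return 0 if index == 0 else self.regs[index]

    def write(self, index, value):
        # Writes to r0 are ignored; all other registers keep 32 bits.
        if index != 0:
            self.regs[index] = value & 0xFFFFFFFF


rf = RegisterFile()
rf.write(0, 0x1234)       # ignored and lost
rf.write(5, 0xCAFE)
print(hex(rf.read(0)), hex(rf.read(5)))
```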

Note that the MIPS architecture has no separate status register. Instead, the conditional jump instructions test the contents of the general-purpose registers, and error conditions are handled by the interrupt/trap mechanism. Two separate 32-bit registers called HI and LO are provided for the integer multiplication and division instructions.
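As an illustration of the HI and LO registers, the following sketch models the signed multiply instruction, which places the upper 32 bits of the 64-bit product in HI and the lower 32 bits in LO (the divide instructions use LO for the quotient and HI for the remainder). The function names are hypothetical and only serve this example.

```python
def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def mult(rs, rt):
    """Return (hi, lo) for a signed 32x32-bit multiplication."""
    product = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    hi = (product >> 32) & 0xFFFFFFFF
    lo = product & 0xFFFFFFFF
    return hi, lo


hi, lo = mult(0x00010000, 0x00010000)   # 65536 * 65536 = 2^32
print(hex(hi), hex(lo))
```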

MIPS-I memory model and MMU

The original MIPS architecture defines three data-types: 32-bit word, 16-bit halfword, and 8-bit byte. The later variants add the 64-bit double-word and floating-point data-types. All machine instructions are encoded as 32-bit words, and most integer operations are performed on 32-bit integers. The analysis of typical processor workloads indicated that byte load and store operations were used frequently, which led the MIPS designers to organize the main memory as a single flat array of bytes. Using 32-bit addresses, this results in a maximum main memory of 4 Gigabytes.

However, because of the external 32-bit data bus, all data transfers between memory and processor always use a full 32-bit word. Extra logic in the processor and the memory is used to enable and to extract the corresponding subset of the data when executing the half-word and byte load and store instructions. All memory accesses have to be aligned for the corresponding data-type: even addresses for half-word accesses, and multiples-of-four for word accesses and instruction fetch. Misaligned memory accesses are detected by the processor and the program is terminated.

In addition to the 32-bit data bus and address bus, the MIPS processors also generate four byte-enable signals during each memory access, where a low level ('0') indicates that the corresponding group of 8 bits is active during the transfer. The MipsMemory simulation component in Hades implements this behaviour, and also includes a simple MIPS disassembler to better visualize the execution of MIPS programs.
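The following sketch shows how a low-active byte-enable pattern could be derived from the address and the access size. It makes two assumptions that the text above does not confirm: bit i of the 4-bit result belongs to the byte at address offset i within the word, and misaligned accesses are simply rejected. The function name is hypothetical.

```python
def byte_enables(address, size):
    """Return the 4-bit low-active byte-enable pattern for an access.

    size is the access width in bytes (1, 2 or 4). Assumption: bit i of
    the result corresponds to the byte at offset i within the word.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    offset = address & 3
    if offset % size != 0:
        raise ValueError("misaligned access")
    active = ((1 << size) - 1) << offset   # 1 = byte lane takes part
    return ~active & 0xF                   # low-active: 0 = lane enabled


print(format(byte_enables(0x1000, 4), "04b"))   # word access
print(format(byte_enables(0x1002, 2), "04b"))   # halfword in the upper lanes
print(format(byte_enables(0x1001, 1), "04b"))   # single byte
```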

One rather unusual feature of the MIPS architecture is the support of both the big-endian and little-endian memory models. That is, the ordering of bytes inside a four-byte word can be selected by configuring the bus-interface of the processor. While the TinyMips processor can be switched to use either the little-endian or big-endian memory model, this feature has not been thoroughly tested. Only the little-endian variant is used for the example applets, because this is the default generated by our gcc cross-compiler.
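The difference between the two memory models can be seen by storing the same 32-bit word in both byte orders. The sketch below uses the standard Python struct module; the helper name is hypothetical.

```python
import struct


def word_to_bytes(word, little_endian=True):
    """Return the four bytes of a 32-bit word in memory order."""
    fmt = "<I" if little_endian else ">I"
    return struct.pack(fmt, word & 0xFFFFFFFF)


word = 0x12345678
print(word_to_bytes(word, little_endian=True).hex())    # lowest address holds 0x78
print(word_to_bytes(word, little_endian=False).hex())   # lowest address holds 0x12
```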

To better support multitasking and multithreaded applications, all MIPS processors use a memory management unit (MMU) to map virtual program addresses to actual physical hardware addresses. The same mapping is used for instruction fetch and the load/store memory accesses. The R2000 processor and the later high-performance processors rely on a fully-featured MMU, which is programmed via coprocessor 0 instructions. The low-end processors like the R3000 rely on a much simpler scheme with the following static mapping from virtual to physical addresses:

  | virtual address range     | physical address range (static MMU) | name  | description |
  | 0xc000.0000 - 0xffff.ffff | 0x0000.0000 - 0x7fff.ffff           | kseg2 | 1024 MBytes mapped cached kernel segment |
  | 0xa000.0000 - 0xbfff.ffff | 0x0000.0000 - 0x1fff.ffff           | kseg1 | 512 MBytes unmapped uncached kernel segment. The default reset address is 0xbfc0.0000, which is mapped to physical address 0x1fc0.0000. |
  | 0x8000.0000 - 0x9fff.ffff | 0x0000.0000 - 0x1fff.ffff           | kseg0 | 512 MBytes unmapped cached kernel segment |
  | 0x0000.0000 - 0x7fff.ffff | 0x4000.0000 - 0xbfff.ffff           | kuseg | 2048 MBytes user space, mapped and cached |

Programs running in user mode can only access memory addresses in the "user space" segment, while memory accesses in either of the kernel segments are only allowed for programs in supervisor mode. This in turn is decided by a status bit in the system coprocessor 0. However, typical embedded systems often don't require multi-user support, and the software could run in privileged mode all the time.

While the static mapping explained above is rather simple, no virtual address remains unchanged by the mapping. This adds another layer of complexity when trying to keep track of memory accesses during a simulation, because the software operates with virtual addresses, while the physical addresses appear on the address bus and are used to control the external memories and peripheral devices. Therefore, the TinyMips processor can also be used with the memory management switched off, so that virtual and physical addresses are the same. This mode helps in understanding the software running on the simulated processor, and is used in all of the introductory applets.
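The static mapping for kuseg, kseg0 and kseg1 amounts to adding or subtracting a fixed offset, as the following sketch shows. The function name and the mmu_enabled parameter are hypothetical; kseg2 is left out of the sketch and simply raises an error.

```python
def static_translate(vaddr, mmu_enabled=True):
    """Map a virtual address to a physical address (static MMU sketch)."""
    vaddr &= 0xFFFFFFFF
    if not mmu_enabled:
        return vaddr                      # MMU off: addresses are identical
    if vaddr < 0x80000000:                # kuseg: user space
        return vaddr + 0x40000000
    if vaddr < 0xA0000000:                # kseg0: unmapped, cached
        return vaddr - 0x80000000
    if vaddr < 0xC0000000:                # kseg1: unmapped, uncached
        return vaddr - 0xA0000000
    raise NotImplementedError("kseg2 is not modelled in this sketch")


# The default reset address in kseg1 ends up at physical 0x1fc0.0000.
print(hex(static_translate(0xBFC00000)))
```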

MIPS-I instruction set

The MIPS instruction set can be divided into three main groups of instructions, each of which has its own distinctive encoding:

  I-Type (immediate)
  | 31  26 | 25  21 | 20  16 | 15                      0 |
  | opcode | rs     | rt     | offset                    |

  J-Type (jump)
  | 31  26 | 25                                        0 |
  | opcode | instr_index                                 |

  R-Type (register)
  | 31  26 | 25  21 | 20  16 | 15  11 | 10  6 | 5      0 |
  | opcode | rs     | rt     | rd     | sa    | function |

Here, the opcode field indicates the 6-bit main opcode, while the 5-bit fields rt, rs and rd select the target register and one or two source registers for the instruction:

  • The I-type or immediate instructions hold a 16-bit field; depending on the instruction this is interpreted as an unsigned integer in the range 0..65535 or a sign-extended integer in the range -32768..32767. This group also includes the load and store instructions, which add the sign-extended offset to the base register rs, and the conditional branches, which use the field as a PC-relative offset counted in instruction words.
  • The J-type or jump instructions (j and jal) reserve a 26-bit instr_index field. It is shifted left by two bits and combined with the upper four bits of the program counter to form the jump target. Jumps to an address held in a general-purpose register (jr and jalr) are encoded as R-type instructions and use the rs field.
  • The R-type or register instruction group includes the common register-to-register arithmetical and logical operations. The function field acts as a 6-bit sub-opcode that selects the operation, while the sa field encodes the shift-amount used for the shift-operations.
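The bit positions from the three encoding diagrams translate directly into shift-and-mask operations. The sketch below extracts every field from a 32-bit instruction word; which fields are meaningful depends on the instruction type. The function names are hypothetical.

```python
def decode_fields(instr):
    """Split a 32-bit MIPS instruction word into its raw bit fields."""
    instr &= 0xFFFFFFFF
    return {
        "opcode": (instr >> 26) & 0x3F,       # bits 31..26, all types
        "rs": (instr >> 21) & 0x1F,           # bits 25..21, I- and R-type
        "rt": (instr >> 16) & 0x1F,           # bits 20..16, I- and R-type
        "rd": (instr >> 11) & 0x1F,           # bits 15..11, R-type
        "sa": (instr >> 6) & 0x1F,            # bits 10..6,  R-type
        "function": instr & 0x3F,             # bits 5..0,   R-type
        "offset": instr & 0xFFFF,             # bits 15..0,  I-type
        "instr_index": instr & 0x03FFFFFF,    # bits 25..0,  J-type
    }


def sign_extend16(value):
    """Interpret a 16-bit field as a signed integer (-32768..32767)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


fields = decode_fields(0x00221821)            # addu r3, r1, r2
print(fields["rs"], fields["rt"], fields["rd"], hex(fields["function"]))
```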

Please refer to the datasheets or the literature for a complete listing and explanation of all instructions. You can also look at the source code of the MIPS32 interpreter, which defines all opcodes and contains the actual implementation of each instruction.

MIPS-I interrupts


The MIPS coprocessor concept


Register convention

As explained above, the MIPS hardware does not enforce a specific use for the general-purpose registers (except for r0). However, the following register convention has evolved as a standard for MIPS programming and is used by most tools, compilers, and operating systems:

  | Register number | Name                     | Description |
  | 0               | zero                     | Always returns 0 |
  | 1               | at (assembler temporary) | Reserved for use by assembler |
  | 2-3             | v0, v1                   | Value returned by subroutine |
  | 4-7             | a0-a3 (arguments)        | First four parameters for a subroutine |
  | 8-15            | t0-t7 (temporaries)      | Subroutines can use without saving |
  | 24-25           | t8-t9 (temporaries)      | Subroutines can use without saving |
  | 16-23           | s0-s7                    | Subroutine register variables, must be restored before returning |
  | 26-27           | k0, k1                   | Reserved for use by interrupt/trap handler; may change under your feet |
  | 28              | gp                       | Global pointer; used to access "static" or "extern" variables |
  | 29              | sp                       | Stack pointer |
  | 30              | s8/fp                    | Frame pointer or ninth subroutine variable |
  | 31              | ra                       | Return address for subroutine |

These register names are also typically used by disassemblers and debuggers instead of the raw register numbers. When a subroutine wants to use the registers s0-s8 for its intermediate results, it must save the values on the stack and restore those values before returning.
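A disassembler or debugger can implement this convention with a simple lookup table indexed by register number. The sketch below uses hypothetical names and picks fp rather than s8 for register 30.

```python
# Symbolic names for r0..r31, in register-number order.
REGISTER_NAMES = (
    ["zero", "at", "v0", "v1"]
    + ["a%d" % i for i in range(4)]      # r4..r7
    + ["t%d" % i for i in range(8)]      # r8..r15
    + ["s%d" % i for i in range(8)]      # r16..r23
    + ["t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra"]
)


def register_name(number):
    """Return the conventional name for a register number 0..31."""
    return REGISTER_NAMES[number]


print(register_name(29), register_name(31))
```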

TinyMips overview

While pipelined execution is the focus of the original RISC concept, it is also possible to design a slower non-pipelined implementation of the MIPS-I architecture and instruction set. Similar to the well-known SPIM simulator, the TinyMips microprocessor in Hades implements such a simplified version of the MIPS architecture. Unlike SPIM, Hades allows you to change the system environment for the TinyMips and to add and simulate peripheral devices with exact timing.

Conceptually, all instructions execute on TinyMips in one cycle. Of course, designing this processor as real hardware would require a (rather inefficient) multicycle implementation. On the other hand, the simulation model of a non-pipelined processor is straightforward and much less complex than a pipelined processor. As a result of this, the simulation (unlike the real hardware) runs much faster, and is well suited to demonstrate the software development for embedded systems. Note that Hades also includes a simulation model of the IDT R3051 processor, which models the full instruction pipeline and on-chip caches.

To keep the TinyMips model as simple and regular as possible, it is based on the original MIPS-I 32-bit instruction set. If one of the MIPS-II, -III or -IV instructions is detected at runtime, the simulator will print a warning and enter the exception handler.

The system interface of the TinyMips processor consists of the following:

  • nRESET: low-active reset input.
  • CLK: clock input, the next instruction is executed after a rising-edge.
  • DATA: 32-bit bidirectional data bus.
  • ADDR: 32-bit address bus.
  • NBEN: 4 low-active byte-enable signals for halfword- and byte-transfers.
  • nWR: low-active write-enable output.
  • nRD: low-active read-enable output.
  • ALE: address-latch enable output, indicates a valid address.

So far, the user-interface of the processor is rather plain. It consists of a single memory editor that allows you to watch and edit the contents of the on-chip registers.

  • the general-purpose registers are mapped to addresses 0..31. Usually, r29 is used as the stack pointer, r30 as the frame pointer, and r31 as the subroutine return address.
  • the program counter is shown at address 32.
  • the HI multiplication register is shown at address 34.
  • the LO multiplication register is shown at address 35.
  • the MODE register is shown at address 39.
  • the other addresses are unused and show as XXXX.XXXX.

The bits in the MODE register control the behaviour of the simulation model. You can change the values at runtime by typing a new value into the memory editor. When you save the Hades design file with a TinyMips processor instance, the current value of the MODE register is saved, and restored when you load the design file. Currently, the following bits are implemented:

  • bit 4: 0=no MMU 1=use R3000-style static memory mapping
  • bit 2: 0=debugging off 1=trace memory accesses
  • bit 1: 0=debugging off 1=trace instruction execution
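To work out which value to type into the memory editor, the MODE bits listed above can be written as bit masks. The constant and function names in this sketch are hypothetical; only the bit positions come from the list.

```python
MODE_TRACE_INSTRUCTIONS = 1 << 1   # bit 1: trace instruction execution
MODE_TRACE_MEMORY = 1 << 2         # bit 2: trace memory accesses
MODE_STATIC_MMU = 1 << 4           # bit 4: R3000-style static memory mapping


def describe_mode(mode):
    """Decode the implemented MODE register bits into named flags."""
    return {
        "trace_instructions": bool(mode & MODE_TRACE_INSTRUCTIONS),
        "trace_memory": bool(mode & MODE_TRACE_MEMORY),
        "static_mmu": bool(mode & MODE_STATIC_MMU),
    }


# Enable the static MMU together with instruction tracing.
mode = MODE_STATIC_MMU | MODE_TRACE_INSTRUCTIONS
print(hex(mode), describe_mode(mode))
```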

After a processor reset, the TinyMips uses the MIPS default virtual address of 0xbfc0.0000 to fetch the first instruction, which translates to physical address 0x1fc0.0000 after conversion by the MMU. However, the reset address can also be specified explicitly instead of relying on the default value given above. This simplifies the demos and avoids an extra memory component at the (rather odd) address range starting with 0x1fc0.0000. Most of the applet demos disable the MMU, and the programs are compiled to start at processor address 0x0000.0000. Note that the start address of a program can easily be specified via the -T flags when the GNU linker/loader is used.

GNU binutils and gcc toolchain

The focus of the TinyMips demonstration applets on the Hades website is system-simulation and hardware-software cosimulation. While software for small 8-bit microcontrollers is still commonly written in assembly, high-level languages are a prerequisite to develop the often very large system and application software programs used on 32-bit microprocessor systems. We chose the popular GNU toolchain for the software development, because the corresponding tools support the MIPS architecture, are free, open-source, and can be built on a variety of platforms. The gcc compiler also generates very efficient code.

All software used in the applets was compiled for the MIPS via a cross-compiler running on a Linux host (our Hades development platform). You can easily download and build the required tools on your own system. First, visit the binutils project homepage and download the source code. Naturally, you might already have a precompiled version on your system, but you will need to build the cross-toolchain that runs on your computer but generates code for the (Tiny) MIPS architecture. Once you have built the tools, you can already use the GNU assembler to write assembly code programs for the TinyMips.

Afterwards, visit one of the download servers of the GNU project, http://gcc.gnu.org/mirrors.html, and download a release (stable) version of the GCC compiler. Again, follow the instructions to build a cross-compiler that runs on your own system but generates MIPS output.

If you are running Linux, you can also try to download the following gcc-mips.tgz archive in tar.gz format; it includes gcc and the corresponding binutils ready for Linux/x86 (Pentium, Athlon) hosts. Note that the tools expect to be installed into a directory called /opt/mips. This path is configured into the tools; to change the base directory, you will have to build the tools yourself and supply the corresponding --prefix option.

The GNU tools and website provide instructions on how to build the cross-compiler and binutils from the sources. Depending on your system, you might also need additional tools (e.g. flex) to build the binutils or the gcc compiler. If necessary, download, build, and install such tools first, and then repeat the binutils and gcc installation. For example, we used the following steps to build the tools:

  mkdir -p /opt/mips/sources
  cd /opt/mips/sources
  tar -xzvf /tmp/binutils- 
  cd binutils-
  ./configure --target=mips-idt-elf --prefix=/opt/mips
  make install

Next, it might be necessary to create a few header files required for the compiler. On Linux systems, it is often possible to just copy the native header files and reuse them for the cross-compiler. However, you might also want to edit the files to exactly match your target system:

  mkdir /opt/mips/mips-idt-elf/include
  cd /opt/mips/mips-idt-elf/include
  mkdir bits  
  mkdir sys
  mkdir gnu
  cp /usr/include/stdio.h .
  cp /usr/include/bits/types.h bits/
  cp /usr/include/bits/stdio_lim.h bits/
  cp /usr/include/libio.h .
  cp /usr/include/features.h .
  cp /usr/include/_G_config.h .
  cp /usr/include/bits/stdio.h bits/
  cp /usr/include/sys/cdefs.h sys/
  cp /usr/include/gnu/stubs.h gnu/

Finally, you can unpack, configure, and build gcc as a cross-compiler. We want a compiler that generates MIPS code for the R3000 series and uses the ELF binary-code format:

  mkdir -p /opt/mips/mips-idt-elf/include
  cd /opt/mips/sources
  tar -xzvf /tmp/gcc-
  cd gcc-
  ./configure --target=mips-idt-elf --with-gnu-as --with-gnu-ld \
      --prefix=/opt/mips
  make LANGUAGES=c
  make LANGUAGES=c install

Building the compiler might take a while. The tools are finally installed in the /opt/mips/bin/ directory and can be run from there.

TO BE WRITTEN:

  • Running the Hades disassembler
  • Setting the memory regions
  • Debugging tips

TinyMips Applet Demos

Please click the following link(s) to go back to the live applet demonstrations based on the TinyMips processor:

References and external links

Impressum http://tams.informatik.uni-hamburg.de/applets/hades/webdemos/mips.html