Tenstorrent NPU Setup Guide: Install Drivers & SDK

In this guide, we will walk you through the setup process for your Tenstorrent Wormhole NPU (N300S), including hardware installation, driver setup, and SDK configuration.

NOTE
This guide is written for a multi-card setup with 4 Wormhole NPUs, so additional configurations for multi-card systems will be covered as needed. I have tested the setup process on both Ubuntu 20.04 and Ubuntu 22.04.


1. Prerequisites

Grayskull is no longer supported by Tenstorrent. If you are still using Grayskull, you will need to rely on the last available software versions or consider upgrading to a newer Tenstorrent NPU.

Tenstorrent’s last supported software versions for Grayskull include:

  • TT-Firmware: fw_pack-80.14.0.0.fwbundle
  • TT-KMD: ttkmd_1.31
  • TT-Buda: v0.19.3
  • TT-Metalium: v0.55


2. Install TT-KMD for Tenstorrent Wormhole N300S

First, install the Kernel-Mode Driver (TT-KMD). This is an essential step in setting up the
Tenstorrent Wormhole N300S system correctly.

 Install Software Dependencies


sudo apt update && sudo apt install -y wget git python3-pip dkms cargo python-is-python3
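Before moving on, it can help to confirm that the tools from the packages above are actually on your PATH. The following is a minimal sketch, not part of the official instructions; the function name missing_tools is made up for this example, and the command names checked (pip3, python) are the ones I expect python3-pip and python-is-python3 to provide.

```shell
#!/bin/sh
# Print each command from the argument list that is not found on PATH.
# (missing_tools is a hypothetical helper name used only in this sketch.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands expected from the packages installed above
# (assumption: pip3 comes from python3-pip, python from python-is-python3).
missing=$(missing_tools wget git pip3 dkms cargo python)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools are available."
fi
```

If anything is reported as missing, re-run the apt command above before continuing.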

Install the driver (TT-KMD) by running these commands in the terminal:


git clone https://github.com/tenstorrent/tt-kmd.git
cd tt-kmd
sudo dkms add .
sudo dkms install "tenstorrent/$(./tools/current-version)"
sudo modprobe tenstorrent

## To uninstall, e.g., version 1.31 (use the version string printed by ./tools/current-version)
sudo modprobe -r tenstorrent
sudo dkms remove tenstorrent/1.31 --all

## Check that the driver module is loaded
root@test:/# lsmod |grep -i tens
tenstorrent            49152  0
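The lsmod check above can be wrapped in a small script if you want to repeat it after every reboot or kernel update. This is only a sketch: module_listed is a hypothetical helper name, and the /dev/tenstorrent/ directory is my assumption about where the driver exposes its device nodes.

```shell
#!/bin/sh
# Succeed if lsmod-style text on stdin lists the given module name in column 1.
# (module_listed is a hypothetical helper name used only in this sketch.)
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if lsmod 2>/dev/null | module_listed tenstorrent; then
    echo "tenstorrent module is loaded"
    # Assumption: the driver creates one device node per card in this directory.
    ls /dev/tenstorrent/ 2>/dev/null || true
else
    echo "tenstorrent module is not loaded; try: sudo modprobe tenstorrent"
fi
```

Matching on the first column avoids false positives from other modules whose names merely contain "tens".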

3. Update Device TT-Firmware with TT-Flash

Next, you need to install the TT-Firmware file using the TT-Flash utility.
You only need to update the firmware once, when the card is first unboxed, or again when updating to a new firmware version.
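To decide whether an update is needed, you can compare the version on the card with the version of the bundle. The sketch below shows one way to compare dotted version strings in the shell. The helper name version_lt is made up for this example, the two version values are hard-coded placeholders taken from the versions that appear later in this guide, and it relies on sort -V from GNU coreutils (available on Ubuntu).

```shell
#!/bin/sh
# Succeed if dotted version $1 is strictly older than $2.
# Relies on GNU sort -V (version sort). version_lt is a hypothetical helper name.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

current="80.14.0.0"   # placeholder: version currently on the card
bundle="80.15.0.0"    # placeholder: version of the bundle you want to flash
if version_lt "$current" "$bundle"; then
    echo "update needed: $current -> $bundle"
else
    echo "firmware is up to date"
fi
```

A plain string comparison would get cases like 80.9.0.0 versus 80.15.0.0 wrong, which is why the sketch uses a version-aware sort.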


To Build from git (development):


## Building the repository
sudo apt install cargo
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"

git clone https://github.com/tenstorrent/tt-flash.git
cd tt-flash

## (Optional) Virtual environment
python -m venv venv
source venv/bin/activate


pip install --upgrade pip
pip3 install .

root@test:/opt# tt-flash -h
usage: tt-flash [-h] [-v] [--sys-config SYS_CONFIG] [--no-color] [--no-tty] {flash,verify} ...

Version: 3.1.1

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --sys-config SYS_CONFIG
                        Path to the pre generated sys-config json
  --no-color            Disable the colorful output
  --no-tty              Force disable the tty command output

command:
  {flash,verify}
    verify              Verify the contents of the SPI. Will display the currently running and flashed bundle version of the fw and checksum the fw against either what was flashed previously according the the file system state, or a
                        given fw bundle. In the case where a fw bundle or flash record are not provided the program will search known locations that the flash record may have been written to and exit with an error if it cannot be found
                        or read.


Download TT-Firmware and Update:


git clone https://github.com/tenstorrent/tt-firmware.git
cd tt-firmware/
root@test:/opt/tt-firmware# ls
experiments  fw_pack-80.15.0.0.fwbundle  latest.fwbundle  LICENSE  LICENSE_understanding.txt  README.md  SUMMARY.md

## Since four cards are installed in this system, all four are shown in the update process.
## However, the output will vary depending on the actual number of installed cards.

root@test:/opt/tt-firmware# tt-flash --fw-tar fw_pack-80.15.0.0.fwbundle --force
Stage: SETUP
        Searching for default sys-config path
        Checking /etc/tenstorrent/config.json: not found
        Checking ~/.config/tenstorrent/config.json: not found

        Could not find config in default search locations, if you need it, either pass it in explicitly or generate one
        Warning: continuing without sys-config, galaxy systems will not be reset
Stage: DETECT
Stage: FLASH
        Sub Stage: VERIFY
                Verifying fw-package can be flashed: complete
                Verifying Wormhole[0] can be flashed
                Verifying Wormhole[1] can be flashed
                Verifying Wormhole[2] can be flashed
                Verifying Wormhole[3] can be flashed
        Stage: FLASH
                Sub Stage FLASH Step 1: Wormhole[0]
                        ROM version is: (80, 14, 0, 0). tt-flash version is: (80, 15, 0, 0)
                        Forced ROM update requested. ROM will now be updated.
                        Board will require reset to complete update, checking if an automatic reset is possible
                                Success: Board can be auto reset; will be triggered if the flash is successful
                Sub Stage FLASH Step 1: Wormhole[1]
                        ROM version is: (80, 14, 0, 0). tt-flash version is: (80, 15, 0, 0)
                        Forced ROM update requested. ROM will now be updated.
                        Board will require reset to complete update, checking if an automatic reset is possible
                                Success: Board can be auto reset; will be triggered if the flash is successful
                Sub Stage FLASH Step 1: Wormhole[2]
                        ROM version is: (80, 14, 0, 0). tt-flash version is: (80, 15, 0, 0)
                        Forced ROM update requested. ROM will now be updated.
                        Board will require reset to complete update, checking if an automatic reset is possible
                                Success: Board can be auto reset; will be triggered if the flash is successful
                Sub Stage FLASH Step 1: Wormhole[3]
                        ROM version is: (80, 14, 0, 0). tt-flash version is: (80, 15, 0, 0)
                        Forced ROM update requested. ROM will now be updated.
                        Board will require reset to complete update, checking if an automatic reset is possible
                                Success: Board can be auto reset; will be triggered if the flash is successful
                Sub Stage FLASH Step 2: Wormhole[0] {n300}
                        Writing new firmware... SUCCESS
                        Firmware verification... SUCCESS
                        Initiating local to remote data copy
                Sub Stage FLASH Step 2: Wormhole[1] {n300}
                        Writing new firmware... SUCCESS
                        Firmware verification... SUCCESS
                        Initiating local to remote data copy
                Sub Stage FLASH Step 2: Wormhole[2] {n300}
                        Writing new firmware... SUCCESS
                        Firmware verification... SUCCESS
                        Initiating local to remote data copy
                Sub Stage FLASH Step 2: Wormhole[3] {n300}
                        Writing new firmware... SUCCESS
                        Firmware verification... SUCCESS
                        Initiating local to remote data copy
                Flash and verification for all chips completed, will now wait for n300 remote copy to complete...
                Remote copy completed
Stage: RESET
 Starting PCI link reset on WH devices at PCI indices: 0, 1, 2, 3
 Finishing PCI link reset on WH devices at PCI indices: 0, 1, 2, 3
FLASH SUCCESS
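With several cards the flash log gets long, so it can be convenient to save it to a file and summarize it afterwards. The following is a rough sketch that assumes the log looks like the output shown above; the wording may differ between tt-flash versions, and summarize_flash_log and flash.log are names I made up for this example. The --no-color option from the help output above should keep color codes out of the saved log.

```shell
#!/bin/sh
# Read a saved tt-flash log on stdin, print how many chips passed firmware
# verification, and fail unless a line starting with "FLASH SUCCESS" is present.
# (summarize_flash_log is a hypothetical helper; the matched strings are
# copied from the log shown in this guide.)
summarize_flash_log() {
    awk '
        /Firmware verification\.\.\. SUCCESS/ { verified++ }
        /^FLASH SUCCESS/                      { ok = 1 }
        END { print verified + 0; exit !ok }
    '
}

# Example usage (not executed here):
#   tt-flash --no-color --fw-tar fw_pack-80.15.0.0.fwbundle --force 2>&1 | tee flash.log
#   summarize_flash_log < flash.log
```

For the four-card log above, the printed count should match the number of Wormhole entries in Step 2.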



4. Set Up HugePages

Your system allocates dedicated memory using HugePages to accelerate communication with Tenstorrent devices. Set up HugePages by running these commands in the terminal:


# Install `.deb`
wget https://github.com/tenstorrent/tt-system-tools/releases/download/upstream%2F1.1/tenstorrent-tools_1.1-5_all.deb
sudo dpkg -i tenstorrent-tools_1.1-5_all.deb

# Start Services
sudo systemctl enable --now tenstorrent-hugepages.service
sudo systemctl enable --now 'dev-hugepages\x2d1G.mount'

# System Reboot
sudo reboot
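After the reboot, you may want to confirm that the huge pages were actually reserved. This is a minimal sketch under a few assumptions: that the service reserves 1 GiB pages (suggested by the name of the mount unit above), that the kernel reports them at the standard sysfs path used below, and that the mount unit name corresponds to /dev/hugepages-1G. The function name hugepages_1g_count is made up for this example.

```shell
#!/bin/sh
# Print the number of 1 GiB huge pages the kernel has reserved.
# Default path is the standard Linux sysfs location for 1 GiB pages; pass a
# different file as $1 to try the function without real hardware.
# (hugepages_1g_count is a hypothetical helper name.)
hugepages_1g_count() {
    file="${1:-/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo 0
    fi
}

count=$(hugepages_1g_count)
if [ "$count" -gt 0 ]; then
    echo "1G HugePages reserved: $count"
else
    echo "No 1G HugePages reserved; check tenstorrent-hugepages.service"
fi

# Assumption: 'dev-hugepages\x2d1G.mount' mounts at /dev/hugepages-1G.
if mountpoint -q /dev/hugepages-1G 2>/dev/null; then
    echo "/dev/hugepages-1G is mounted"
fi
```

If the count is 0, systemctl status tenstorrent-hugepages.service is a reasonable place to start looking.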


5. Install the System Management Interface (TT-SMI)

Install the Tenstorrent System Management Interface (TT-SMI).
To build from git (development):




# Clone the repository
git clone https://github.com/tenstorrent/tt-smi.git
cd tt-smi/

# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Upgrade pip
pip3 install --upgrade pip

# Install protobuf (required for Ubuntu 22.04)
sudo apt install -y protobuf-compiler

# Install tt-smi
pip3 install .

# Usage: Command line arguments
tt-smi [-h] [-l] [-v] [-s] [-ls] [-f [filename]] [-g] [-r 0,1 ... or config.json]

# Run the TT-SMI utility
tt-smi
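The usage line above shows that -r takes a comma-separated list of device indices (0,1 ...). On a multi-card system you can generate that list instead of typing it. The sketch below only prints the command rather than running it; card_indices is a made-up helper name, and my understanding that -r resets the listed devices is an assumption you should confirm with tt-smi -h.

```shell
#!/bin/sh
# Build the comma-separated index list "0,1,...,N-1" for N cards.
# Uses seq -s from GNU coreutils. card_indices is a hypothetical helper name.
card_indices() {
    seq -s, 0 $(( $1 - 1 ))
}

# Print (do not run) the command for a 4-card system.
# Assumption: -r requests a reset of the listed devices.
echo "tt-smi -r $(card_indices 4)"
```

Printing the command first lets you review the index list before touching the hardware.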



You will see a display with device information, telemetry, and firmware:

[Image: TT-SMI display]


You can check the results of the firmware update to 80.15.0.0 applied previously.

[Image: TT-SMI firmware information]


6. Optional: Multi-Card Configuration (TT-Topology)

If you have installed a multi-card Wormhole system (4 cards), install the Tenstorrent Topology utility (TT-Topology) and configure a mesh topology by running the commands below in the terminal.
TT-Topology is a command-line utility used to flash multiple NB cards on a system to use specific Ethernet routing configurations.


NOTE
Currently, the Tenstorrent Demo is designed for either a single-card or a 4-card setup. If you are using a multi-card (4 cards) configuration, you must configure the topology to run the demo properly.


TT-Topology performs the following steps:

  • Flash all the boards to default: set all eth port disables to 0 and reset coordinates to (0,0) for local chips and (1,0) for n300 remote chips.
  • Issue a board-level reset to apply the new flash to the chips.
  • Generate a mapping of all possible connections, and their types, between the available chips.
  • Using a graph algorithm, generate coordinates for each chip based on user input. These layouts are discussed in detail in the sections below.
  • Write the new coordinates to the chips.
  • Issue a board-level reset to apply the new flash to the chips.
  • Return a PNG with a graphic representation of the layout and a .json log file with details of the above steps.

For other configurations or detailed explanations, please refer to the TT-Topology repository (https://github.com/tenstorrent/tt-topology).


git clone https://github.com/tenstorrent/tt-topology.git
cd tt-topology
pip3 install --upgrade pip
pip3 install .
tt-topology -l mesh -p mesh_layout.png


Mesh

The mesh layout is a trivalent graph in which each node can have at most 3 connections. The system assigns the coordinates using a BFS algorithm. The command to generate a mesh layout is the tt-topology -l mesh command shown above.
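To make the BFS idea more concrete, here is a small illustration. It is not TT-Topology's actual implementation and does not compute chip coordinates. It only shows a breadth-first traversal over a graph in which every node is allowed at most 3 links. The function name bfs_order, the edge-list input format, and the sample ring of four nodes are all made up for this example.

```shell
#!/bin/sh
# Read undirected edges ("a b" per line) on stdin. Fail if any node has more
# than 3 links; otherwise print the BFS visit order starting from node $1.
# (bfs_order is a hypothetical helper; this is an illustration only.)
bfs_order() {
    awk -v start="$1" '
        {
            # Record each edge in both directions and count links per node.
            adj[$1] = adj[$1] " " $2; adj[$2] = adj[$2] " " $1
            deg[$1]++; deg[$2]++
        }
        END {
            for (n in deg)
                if (deg[n] > 3) { print "degree > 3: " n > "/dev/stderr"; exit 1 }
            head = 1; tail = 1; queue[1] = start; seen[start] = 1
            while (head <= tail) {
                node = queue[head++]
                out = out (out == "" ? "" : " ") node
                cnt = split(adj[node], nbrs, " ")
                for (i = 1; i <= cnt; i++)
                    if (!(nbrs[i] in seen)) { seen[nbrs[i]] = 1; queue[++tail] = nbrs[i] }
            }
            print out
        }
    '
}

# Sample: four nodes connected in a ring (0-1-2-3-0), starting from node 0.
printf '0 1\n1 2\n2 3\n3 0\n' | bfs_order 0
```

Nodes are visited level by level from the start node, which is the property a BFS-based layout can use to hand out coordinates in a predictable order.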

[Image: mesh layout]


You can see the configuration as shown in the image below.
After completing the setup, you can check the Coords values by running tt-smi.

[Image: tt-smi output showing the Coords values]


Now, all the basic configurations are complete.

Tenstorrent provides three open-source SDKs for developing on Tensix Processors:

  • TT-Buda to run existing models
  • TT-Metalium/TT-NN to build your own kernels and models
  • TT-Forge/TT-MLIR to create, compile, and optimize graph operations

Each SDK will have its own system dependency requirements and installation process.

As of October 2024, TT-Buda development has been discontinued. It is planned to be integrated with other SDKs in the future.

TT-Metalium/TT-NN and TT-Forge/TT-MLIR are currently under active development.

In the next section, we will document the installation of the SDK and run example demos.

Thank you.
