# llama.cpp Build Instructions

## Overview

This document provides instructions for building llama.cpp from source on Linux systems.
## Prerequisites

- GCC with C++17 and `std::filesystem` support (GCC 9+ recommended)
- CMake 3.15 or higher
- Git
- Make or Ninja build system
- OpenMP support (usually included with GCC)
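
A quick sanity check that the toolchain meets these minimums:

```bash
# Confirm toolchain versions (GCC 9+, CMake 3.15+)
gcc --version
cmake --version
git --version
make --version
```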
## Quick Start

### 1. Clone the Repository

```bash
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
```
### 2. Build with Default Options

```bash
# Configure build
cmake -B build

# Build the main binary and libraries
cmake --build build --config Release -j$(nproc)
```
### 3. Verify Build

```bash
# Test the main binary
./build/bin/llama-cli --help

# Check binary info
file build/bin/llama-cli
```
## Advanced Build Options

### Build Configuration Options

```bash
cmake -B build [OPTIONS]
```
Common options:

- `-DCMAKE_BUILD_TYPE=Release` - Optimized release build (default)
- `-DCMAKE_BUILD_TYPE=Debug` - Debug build with symbols
- `-DBUILD_SHARED_LIBS=ON` - Build shared libraries instead of static
- `-DLLAMA_CURL=OFF` - Disable HTTP download support
- `-DCMAKE_CXX_STANDARD=17` - Specify the C++ standard
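
Options combine in a single configure step; for instance, a debug build with shared libraries:

```bash
# Example: debug build producing shared libraries
cmake -B build -DCMAKE_BUILD_TYPE=Debug -DBUILD_SHARED_LIBS=ON
cmake --build build -j$(nproc)
```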
### CPU Optimizations

```bash
# Build with OpenBLAS support
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS

# Build with Intel oneMKL
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=Intel
```
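
These BLAS backends need the corresponding development packages installed first:

```bash
# OpenBLAS development files
sudo apt-get install libopenblas-dev   # Ubuntu/Debian
sudo dnf install openblas-devel        # RHEL/Fedora
```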
### Building Specific Targets

```bash
# Build only the main CLI tool
cmake --build build --config Release --target llama-cli -j$(nproc)

# Build the simple example
cmake --build build --config Release --target llama-simple -j$(nproc)

# Build the server
cmake --build build --config Release --target llama-server -j$(nproc)

# Build all targets
cmake --build build --config Release -j$(nproc)
```
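
To list the targets the generated build system exposes (a sketch; the mechanism depends on the generator: with Ninja, use `ninja -C build -t targets` instead):

```bash
# List available targets (default Unix Makefiles generator)
cmake --build build --target help
```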
## Build Artifacts

### Main Binaries

- `build/bin/llama-cli` - Main command-line interface
- `build/bin/llama-server` - HTTP server for model inference
- `build/bin/llama-simple` - Simple example application
### Libraries

- `build/bin/libllama.a` - Static llama library
- `build/bin/libggml.a` - Core GGML static library
- `build/bin/libggml-cpu.a` - CPU backend static library
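
As a rough sketch of compiling a small program directly against the static archives (the include paths and the exact set of ggml archives vary by version and configuration; installing with `cmake --install build` and linking against the installed package is generally more robust):

```bash
# Hypothetical direct link against the static archives; my_app.cpp is a placeholder
g++ -std=c++17 my_app.cpp \
    -Iinclude -Iggml/include \
    build/bin/libllama.a build/bin/libggml.a build/bin/libggml-cpu.a \
    -fopenmp -lpthread -o my_app
```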
### CLI Tools

- `build/bin/llama-gguf` - GGUF file manipulation
- `build/bin/llama-gguf-hash` - GGUF file hashing
- `build/bin/llama-gemma3-cli` - Gemma3-specific CLI
- `build/bin/llama-llava-cli` - LLaVA multimodal CLI
- `build/bin/llama-minicpmv-cli` - MiniCPM-V CLI
- `build/bin/llama-qwen2vl-cli` - Qwen2VL CLI
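
For example, the hashing tool can checksum a model file (`model.gguf` below is a placeholder path; the available hash options depend on the tool version):

```bash
# Print checksums for a GGUF file (placeholder path)
./build/bin/llama-gguf-hash model.gguf
```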
## Troubleshooting

### GCC Version Issues

**Problem:** Build fails with `std::filesystem` linking errors.

**Solution:** Use a newer GCC version with proper filesystem support.
#### Option 1: Use GCC Toolset (RHEL/CentOS/UnionTechOS)

```bash
# Check available toolsets
dnf list gcc-toolset-*

# Start a shell with GCC 12 enabled (install first with: sudo dnf install gcc-toolset-12)
scl enable gcc-toolset-12 bash

# Or enable it directly in the build environment
source scl_source enable gcc-toolset-12
cmake -B build
cmake --build build --config Release -j$(nproc)
```
#### Option 2: Install GCC 9+ manually

```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install gcc-11 g++-11

# Set as default
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 100
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-11 100
```
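
After switching, confirm that the expected compiler is active:

```bash
gcc --version
g++ --version
```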
#### Option 3: Add explicit filesystem linking

```bash
cmake -B build -DCMAKE_EXE_LINKER_FLAGS="-lstdc++fs"
```
### Common Build Errors

**Error:** ccache not found.

**Solution:** Install ccache or disable the warning.

```bash
# Install ccache (recommended for faster builds)
sudo apt-get install ccache   # Ubuntu/Debian
sudo dnf install ccache       # RHEL/Fedora

# Or disable the warning
cmake -B build -DGGML_CCACHE=OFF
```
**Error:** Missing OpenMP.

**Solution:** Install OpenMP development packages.

```bash
# Ubuntu/Debian
sudo apt-get install libomp-dev

# RHEL/Fedora
sudo dnf install libgomp-devel
```
**Error:** CMake reports that the minimum required version is not met.

**Solution:** Upgrade CMake.

```bash
# Ubuntu/Debian
sudo apt-get install cmake

# RHEL/Fedora
sudo dnf install cmake
```
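
If the packaged CMake is still older than 3.15, one alternative (among others, such as Kitware's own APT repository) is installing a recent release from PyPI:

```bash
# Install a recent CMake into the user environment
pip install --user cmake
# Make sure ~/.local/bin is on PATH, then confirm
cmake --version
```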
## Build Performance Tips

### Parallel Compilation

```bash
# Use all CPU cores
cmake --build build --config Release -j$(nproc)

# Use a specific number of cores
cmake --build build --config Release -j8
```
### Using Ninja Generator

```bash
# Install Ninja
sudo apt-get install ninja-build   # Ubuntu/Debian
sudo dnf install ninja-build       # RHEL/Fedora

# Build with Ninja
cmake -B build -G Ninja
cmake --build build --config Release
```
### Using CCache

```bash
# Install ccache
sudo apt-get install ccache   # Ubuntu/Debian
sudo dnf install ccache       # RHEL/Fedora

# Configure CMake to use ccache
export CC="ccache gcc"
export CXX="ccache g++"
cmake -B build
```
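
To confirm that the cache is being used, check the statistics after a rebuild:

```bash
# Show ccache statistics (cache hits/misses)
ccache -s
```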
## Production Builds

For production deployments, use these recommended options. Note that `-DGGML_NATIVE=ON` optimizes for the build machine's CPU, so the resulting binaries may not run on older hardware:

```bash
cmake -B build \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=OFF \
  -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON \
  -DGGML_NATIVE=ON \
  -DLLAMA_CURL=OFF
cmake --build build --config Release -j$(nproc)
```
## Verification

After building, verify the installation:

```bash
# Check version
./build/bin/llama-cli --version

# Test help
./build/bin/llama-cli --help

# Verify binary type
file build/bin/llama-cli

# Check dependencies (if built with shared libs)
ldd build/bin/llama-cli
```
## Alternative Build Methods

### Using Package Managers

**Brew (macOS):**

```bash
brew install llama.cpp
```
**Nix:**

```bash
nix-shell -p llama.cpp
```
**Winget (Windows):**

```powershell
winget install ggml.llama.cpp
```
### Using Pre-built Binaries

Download the archive for your platform from the [GitHub Releases page](https://github.com/ggml-org/llama.cpp/releases).
## Usage Example

After a successful build:

```bash
# Download a model (example)
wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf

# Run inference
./build/bin/llama-cli -m llama-2-7b-chat.Q4_K_M.gguf -p "Hello, how are you?" -n 50
```
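
The server binary can serve the same model over HTTP; a minimal sketch (the flags and the `/completion` endpoint follow llama-server's documented interface, but check `--help` for your build):

```bash
# Start the server with the downloaded model
./build/bin/llama-server -m llama-2-7b-chat.Q4_K_M.gguf --port 8080 &

# Query the completion endpoint once the server is up
curl -s http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?", "n_predict": 50}'
```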