This guide provides step-by-step instructions for building Shimmy with GPU acceleration on Windows.
- Visual Studio 2022 with C++ build tools
- Rust (latest stable version)
- Git for cloning repositories
- CMake (for building llama.cpp dependencies)
- CUDA Toolkit 12.0+ (download from NVIDIA)
- Compatible NVIDIA GPU with compute capability 6.0+
- OpenCL SDK or GPU vendor drivers
- Compatible GPU with OpenCL 1.2+ support
- Vulkan SDK (download from LunarG)
- Compatible GPU with Vulkan 1.0+ support
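Before building, it can help to confirm that the general prerequisites are reachable from your shell. The following is a minimal sketch for a Bash-style shell such as Git Bash. The `check_tool` helper is a hypothetical name, not part of Shimmy, and it only checks that a command is on your PATH, not which version is installed.

```shell
#!/usr/bin/env bash
# check_tool is a hypothetical helper: reports whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Tools from the prerequisite list that expose a command-line entry point.
for tool in rustc cargo git cmake; do
  check_tool "$tool"
done
```

Any tool reported as missing should be installed, or added to PATH, before running the build commands.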
git clone https://github.com/Michael-A-Kuykendall/shimmy.git
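cd shimmy
# Build with the CUDA backend
cargo build --release --features llama-cuda
# Build with the OpenCL backend
cargo build --release --features llama-opencl
# Build with the Vulkan backend
cargo build --release --features llama-vulkan
# Or build with all GPU backends
cargo build --release --features gpu
# Check GPU detection
./target/release/shimmy.exe gpu-info
This should show your GPU backend as "available".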
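Since each backend maps to one Cargo feature, the choice can be captured in a small script. This is a sketch only: the `feature_for_backend` helper and the `SHIMMY_BACKEND` environment variable are hypothetical names introduced for illustration, while the feature names themselves are the ones used in this guide. The script prints the build command rather than running it.

```shell
#!/usr/bin/env bash
# feature_for_backend is a hypothetical helper: maps a backend name
# to the Cargo feature used in this guide.
feature_for_backend() {
  case "$1" in
    cuda)   echo "llama-cuda" ;;
    opencl) echo "llama-opencl" ;;
    vulkan) echo "llama-vulkan" ;;
    all)    echo "gpu" ;;
    *)      echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# SHIMMY_BACKEND is an assumed variable name; it defaults to opencl here.
backend="${SHIMMY_BACKEND:-opencl}"
if feature="$(feature_for_backend "$backend")"; then
  echo "cargo build --release --features $feature"
fi
```

For example, setting `SHIMMY_BACKEND=cuda` before running the script prints the CUDA build command.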
For permanent installation:
# Install specific GPU backend
cargo install --path . --features llama-opencl
# Or install all GPU backends
cargo install --path . --features gpu
Error: couldn't read '..\templates/docker/Dockerfile'
Solution: This indicates you're using an older version. Use the latest from source:
git clone https://github.com/Michael-A-Kuykendall/shimmy.git
cd shimmy
cargo install --path . --features llama-opencl
Error: no method named 'with_n_cpu_moe' found
Solution: This error comes from an older published version. The latest source handles these methods correctly, so build from source as described for the previous error.
Common CUDA issues:
- CUDA Toolkit not found: Ensure CUDA is in your PATH
- Compute capability mismatch: Check your GPU compatibility
- Visual Studio version: Ensure you have VS 2022 with C++ tools
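The first two CUDA checks above can be sketched as a short script. It assumes a Bash-style shell and a driver recent enough for `nvidia-smi` to support the `compute_cap` query field; `meets_min_compute_cap` is a hypothetical helper, and the 6.0 threshold comes from the prerequisites in this guide.

```shell
#!/usr/bin/env bash
# meets_min_compute_cap is a hypothetical helper: succeeds when the
# compute capability in $1 (for example "8.6") is at least 6.0.
meets_min_compute_cap() {
  awk -v cc="$1" 'BEGIN { exit !(cc + 0 >= 6.0) }'
}

# Is the CUDA compiler on PATH?
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version | tail -n 1
else
  echo "nvcc not found on PATH; check the CUDA Toolkit installation"
fi

# Does each detected GPU meet the minimum compute capability?
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=compute_cap --format=csv,noheader | tr -d '\r' |
    while read -r cc; do
      if meets_min_compute_cap "$cc"; then
        echo "compute capability $cc: OK"
      else
        echo "compute capability $cc: below 6.0"
      fi
    done
else
  echo "nvidia-smi not found; install or update the NVIDIA driver"
fi
```

The comparison is numeric rather than textual, so a value such as 10.0 is correctly treated as higher than 6.0.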
Common OpenCL issues:
- OpenCL headers missing: Install your GPU vendor's SDK
- No OpenCL runtime: Update your GPU drivers
Test your GPU-accelerated build:
# Check GPU detection
shimmy gpu-info
# Run a simple generation test
shimmy generate test-model --prompt "Hello" --max-tokens 50
Pre-built Windows binaries with GPU support are available in GitHub Releases:
- Download from: https://github.com/Michael-A-Kuykendall/shimmy/releases
- Choose the appropriate GPU variant for your system
If you encounter issues:
- Check the main README for general troubleshooting
- Review CUDA documentation for GPU-specific details
- Open an issue at: https://github.com/Michael-A-Kuykendall/shimmy/issues
- v1.7.2+: Full Windows GPU support with templates included
- v1.7.1 and earlier: May have template packaging or MoE compilation issues
- Always use the latest: git clone and build from source for the best experience
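To check a version string against the 1.7.2 threshold in the notes above, a version-aware comparison can be sketched in Bash. This assumes GNU `sort -V` is available, as it is in Git Bash. `version_at_least` is a hypothetical helper, and the version strings are supplied by hand rather than read from Shimmy.

```shell
#!/usr/bin/env bash
# version_at_least is a hypothetical helper: succeeds when $1 >= $2.
# A leading "v" is stripped so "v1.7.2" and "1.7.2" compare equal.
version_at_least() {
  local have="${1#v}" want="${2#v}"
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

for v in v1.7.1 v1.7.2; do
  if version_at_least "$v" 1.7.2; then
    echo "$v: full Windows GPU support expected"
  else
    echo "$v: may hit template packaging or MoE compilation issues"
  fi
done
```

A plain string comparison would misorder versions such as 1.10.0 and 1.7.2, which is why the sketch relies on version sort.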