Quick Start Guide
Get NL 1.0 running on your machine in under 5 minutes of hands-on setup, not counting the model download. This guide assumes you have a compatible NVIDIA GPU and basic command-line familiarity.
Prerequisites
- GPU: NVIDIA RTX 4060 Ti (16GB) or RTX 3060 (12GB) minimum
- CUDA: Version 12.1 or later
- Storage: 28GB free disk space (26GB for model + 2GB overhead)
- RAM: 16GB system RAM recommended
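Some of the prerequisites above can be checked before you start the download. The following is a minimal Python sketch and is not part of NL 1.0: the function names are hypothetical, 1 GB is assumed to mean 10^9 bytes, and finding nvidia-smi on PATH only suggests that an NVIDIA driver is installed. It does not confirm the GPU model or the CUDA version.

```python
import shutil

REQUIRED_FREE_GB = 28  # taken from the Storage prerequisite


def bytes_to_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (assumes 1 GB = 10**9 bytes)."""
    return n_bytes / 10**9


def has_enough_space(free_bytes: int, required_gb: float = REQUIRED_FREE_GB) -> bool:
    """Return True when free_bytes covers the required number of gigabytes."""
    return bytes_to_gb(free_bytes) >= required_gb


def check_prerequisites(path: str = ".") -> dict:
    """Collect simple results for the checks that can run locally."""
    free_bytes = shutil.disk_usage(path).free
    return {
        # Presence of nvidia-smi hints at an installed driver, nothing more.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        "disk_space_ok": has_enough_space(free_bytes),
        "free_gb": round(bytes_to_gb(free_bytes), 1),
    }


if __name__ == "__main__":
    for name, value in check_prerequisites().items():
        print(f"{name}: {value}")
```

Run it from the directory where you plan to store the model, since free space is measured on the filesystem that holds the given path.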
Installation Steps
Step 1: Download the Model
curl -O https://download.nolimit.foundation/models/nl-1.0-int8.tar.gz
Download size: ~26GB. This may take 10-30 minutes depending on your connection.
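To see what the 10-30 minute estimate implies for your own connection, the arithmetic can be sketched in Python. This is an illustration only: the helper name is hypothetical, 1 GB is assumed to be 8000 megabits, and the speed is assumed to be the sustained rate with no protocol overhead.

```python
def download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Estimate download time in minutes for a file of size_gb gigabytes
    at a sustained speed of speed_mbps megabits per second."""
    megabits = size_gb * 8 * 1000  # 1 GB = 8000 megabits (decimal units)
    seconds = megabits / speed_mbps
    return seconds / 60


if __name__ == "__main__":
    for mbps in (50, 100, 350, 1000):
        print(f"{mbps} Mbps: about {download_minutes(26, mbps):.0f} minutes")
```

Under these assumptions, a 26GB download finishes in 10-30 minutes at roughly 115-350 Mbps sustained. A 100 Mbps link would need about 35 minutes.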
Step 2: Extract Model Files
mkdir -p ~/nolimit
tar -xzf nl-1.0-int8.tar.gz -C ~/nolimit/
Step 3: Install Runtime Dependencies
pip install nolimit-runtime torch==2.1.0 transformers==4.36.0
Step 4: Verify Installation
nolimit --version
# Output: NL 1.0.0 (INT8 Quantized)
Step 5: Start Local API Server
nolimit serve --model ~/nolimit/nl-1.0-int8 --port 8080
Server will start on http://localhost:8080
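Loading a model of this size can take a while, so a script that sends requests right after launching the server may fail. One way to handle this, sketched here in Python, is to poll the port until it accepts TCP connections. The function names and default timings are hypothetical, and an open port only shows that something is listening. It does not prove the model has finished loading.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_server(host: str = "localhost", port: int = 8080,
                    max_wait: float = 120.0, interval: float = 2.0) -> bool:
    """Poll host:port until it accepts connections or max_wait seconds pass."""
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        if port_is_open(host, port):
            return True
        time.sleep(interval)
    return False
```

Calling wait_for_server() with its defaults checks localhost:8080 every 2 seconds for up to 2 minutes and returns False if the port never opens.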
First Request
Test your installation with a simple completion request:
curl http://localhost:8080/v1/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "Write a Python function to calculate fibonacci numbers",
"max_tokens": 512,
"temperature": 0.7
}'
Next Steps
- Configure inference parameters for your use case
- Explore the API reference for advanced features
- Review performance tuning for optimal throughput
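As a Python alternative to the curl command under First Request, the same call can be sketched with the standard library alone. The endpoint and payload fields come from that curl example. The helper names are hypothetical, and the response is assumed to be a JSON body whose schema this guide does not describe, so the sketch returns it unparsed beyond JSON decoding.

```python
import json
import urllib.request

API_URL = "http://localhost:8080/v1/completions"  # endpoint from the curl example


def build_request(prompt: str, max_tokens: int = 512, temperature: float = 0.7,
                  url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the same JSON payload as the curl example."""
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0):
    """Send the request and decode the body, assuming the server returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, print(send(build_request("Write a Python function to calculate fibonacci numbers"))) shows the decoded response. Building the request separately from sending it lets you inspect the payload without a live server.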