Quick Start Guide

Get NL 1.0 running on your machine in about 5 minutes of hands-on time (the model download in Step 1 takes longer). This guide assumes you have a compatible NVIDIA GPU and basic command-line familiarity.

Prerequisites

  • GPU: NVIDIA RTX 3060 (12GB) or RTX 4060 Ti (16GB) at minimum
  • CUDA: Version 12.1 or later
  • Storage: 28GB free disk space (26GB for model + 2GB overhead)
  • RAM: 16GB system RAM recommended
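
The GPU memory and disk space requirements above can be checked before downloading anything. The following is a minimal sketch, not part of the NL tooling: the function names are hypothetical, the thresholds are copied from the list above, and it assumes nvidia-smi is on your PATH. The GPU threshold is deliberately a little loose, on the assumption that drivers can report slightly less than the marketed memory size.

```python
import shutil
import subprocess

REQUIRED_DISK_GB = 28   # from the prerequisites list
MIN_GPU_MEMORY_GB = 12  # smallest card in the prerequisites list


def parse_gpu_memory_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one MiB value per line, as printed by
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits"""
    return [int(line) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_gpu_memory_mib() -> list[int]:
    """Ask nvidia-smi for the total memory (MiB) of each visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory_mib(result.stdout)


def free_disk_gb(path: str = ".") -> float:
    """Free space on the filesystem holding `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def meets_prerequisites(gpu_memory_mib: list[int], free_gb: float) -> bool:
    """True if at least one GPU and the disk satisfy the listed minimums."""
    # Assumption: compare against 12 * 1000 MiB rather than 12 * 1024 MiB
    # so that small shortfalls in the reported total do not fail the check.
    has_gpu = any(mib >= MIN_GPU_MEMORY_GB * 1000 for mib in gpu_memory_mib)
    return has_gpu and free_gb >= REQUIRED_DISK_GB
```

To run the check, call meets_prerequisites(query_gpu_memory_mib(), free_disk_gb(".")) from the directory where you plan to download the model. It does not check the CUDA version or system RAM.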

Installation Steps

Step 1: Download the Model

curl -O https://download.nolimit.foundation/models/nl-1.0-int8.tar.gz

Download size: ~26GB. This may take 10-30 minutes depending on your connection.

Step 2: Extract Model Files

mkdir -p ~/nolimit && tar -xzf nl-1.0-int8.tar.gz -C ~/nolimit/

Step 3: Install Runtime Dependencies

pip install nolimit-runtime torch==2.1.0 transformers==4.36.0

Step 4: Verify Installation

nolimit --version
# Output: NL 1.0.0 (INT8 Quantized)
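
If the version check fails or the server later refuses to load, one thing worth ruling out is a mismatch with the versions pinned in Step 3. The sketch below uses only the Python standard library. The helper name find_version_mismatches is hypothetical, and treating a local build suffix such as "+cu121" as equivalent to the plain version is an assumption.

```python
from importlib import metadata

# Pins copied from the pip command in Step 3.
PINNED = {"torch": "2.1.0", "transformers": "4.36.0"}


def find_version_mismatches(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for every pin that is not met.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        # Assumption: a local build tag ("2.1.0+cu121") still counts as 2.1.0.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling find_version_mismatches(PINNED) in the same Python environment where you ran pip returns an empty dict when both pins are satisfied.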

Step 5: Start Local API Server

nolimit serve --model ~/nolimit/nl-1.0-int8 --port 8080

The server listens on http://localhost:8080.
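
When the server is started from a script, the first request can arrive before the port is open. The sketch below polls the port until a TCP connection succeeds. wait_for_port is a hypothetical helper, not part of the nolimit CLI, and an open port only shows that the server accepts connections. Whether the model has finished loading at that point is not something this guide specifies.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection succeeds before `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is bounded by `interval` so one slow connect
            # cannot overrun the overall deadline by much.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the server started in Step 5, wait_for_port("localhost", 8080) returns True as soon as the port accepts connections.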

First Request

Test your installation with a simple completion request:

curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Write a Python function to calculate Fibonacci numbers",
    "max_tokens": 512,
    "temperature": 0.7
  }'
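
The same request can be sent from Python using only the standard library. This is a minimal sketch: the endpoint path and the three fields come from the curl command above, while the function names are hypothetical and the response is assumed to be a JSON document, since its schema is not described in this guide.

```python
import json
import urllib.request


def build_completion_request(prompt, max_tokens=512, temperature=0.7,
                             base_url="http://localhost:8080"):
    """Build the POST request that mirrors the curl example."""
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_completion(prompt, **kwargs):
    """Send the request and return the decoded body.

    Assumption: the server replies with JSON; the schema is not
    documented here, so the result is returned without interpretation.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from Step 5 running, print(send_completion("Write a Python function to calculate Fibonacci numbers")) shows the raw response, which you can inspect to find the generated text.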

Next Steps

  • Configure inference parameters for your use case
  • Explore the API reference for advanced features
  • Review performance tuning for optimal throughput