RUNE DIGITAL // RESEARCH

Knowledge Distillation & Fine-Tuning

Optimizing Model Performance & Cost

Executive Summary: RUNE's knowledge distillation pipeline transfers reasoning capabilities from Gemini 2.0 Pro (the premium teacher model) to rune-ai-v1, our locally trained model running on an RTX 4070 GPU. The pipeline is currently operational via Ollama with three model variants: rune-ai-v1 (production), claude-jr (specialized), and mistral:7b (base). This enables 10–100x latency improvements while maintaining semantic fidelity, which is critical for real-time APIs.

Technical Implementation

Live Local Models (Ollama)

Currently running on a local RTX 4070 Super GPU: rune-ai-v1 (production), claude-jr (specialized), and mistral:7b (base).
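As a rough illustration of how a client might query one of these local models, the sketch below builds a request for Ollama's REST endpoint `/api/generate` on its default port (11434). The helper names `build_generate_request` and `generate` are hypothetical, and the sketch assumes an Ollama server is already running with the named model available; it is not taken from RUNE's codebase.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server running, a call such as `generate("rune-ai-v1", "Summarize this appraisal note.")` would return the model's text. Setting `"stream": False` asks for a single JSON object instead of a stream of partial chunks, which keeps the client simple.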

Teacher-Student Architecture

Systematic knowledge transfer from large models to optimized edge deployments.
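One common way to realize this kind of transfer when the teacher is only reachable through an API is response-based distillation: collect the teacher's answers to a set of prompts and use the pairs as training data for the student. The sketch below is a minimal version under that assumption. The function names, the record fields (`prompt`, `completion`), and the stand-in `fake_teacher` are all hypothetical; a real pipeline would call the teacher model in its place.

```python
import json
from typing import Callable, Iterable


def build_distillation_records(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
) -> list[dict]:
    """Query the teacher once per prompt and keep non-empty answers as training pairs."""
    records = []
    for prompt in prompts:
        answer = teacher(prompt).strip()
        if answer:  # drop prompts for which the teacher returned nothing usable
            records.append({"prompt": prompt, "completion": answer})
    return records


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


def fake_teacher(prompt: str) -> str:
    """Stand-in for a real teacher call, for illustration only."""
    return "" if not prompt.strip() else f"Answer to: {prompt}"


records = build_distillation_records(["What is a carat?", "  "], fake_teacher)
print(to_jsonl(records))
```

Passing the teacher as a callable keeps the collection logic independent of any particular provider, so the same code could be reused if the teacher model changes.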

Fine-Tuning Pipeline

Domain-specific adaptation ensuring models understand jewelry terminology, grading standards, and asset valuation nuances.

Deployment Optimization

Techniques for reducing inference latency while maintaining accuracy across production endpoints.
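Any latency optimization needs a measurement to compare against. The sketch below shows one way to collect median and 95th-percentile latency for an arbitrary inference call and to compute a speedup ratio between two endpoints. The helper names are hypothetical and the percentile uses the simple nearest-rank method; it illustrates the measurement and does not describe RUNE's actual benchmarking setup.

```python
import statistics
import time
from typing import Callable


def summarize(samples_ms: list[float]) -> dict:
    """Return median and nearest-rank p95 for a list of latencies in milliseconds."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    # Nearest-rank p95: index of ceil(0.95 * n) - 1, computed with integer arithmetic.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "runs": len(ordered),
    }


def measure_latency(call: Callable[[], object], runs: int = 20) -> dict:
    """Time `call` repeatedly and summarize the observed latencies."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return summarize(samples)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms
```

For example, `measure_latency(lambda: my_inference_call())` could be run once against a remote endpoint and once against the local model, and `speedup` applied to the two medians. Reporting p95 alongside the median helps show whether occasional slow requests are hiding behind a good average.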

Training Data Architecture

The training curriculum is extracted from the outputs of three premium models (Opus 4.5, GPT-5.1, Sonnet 4.5).

Multi-Model Synthesis

Training Curriculum Structure

📊 TRAINING EVIDENCE: OLLAMA_MODELFILE_V3 contains 492 lines of curriculum synthesized from 3 premium AI models. Training produces claude-jr, which runs locally on Ollama and achieves 90% output quality at 10% inference cost.

Vertex AI's Model Garden provides pre-distilled models (Gemma, CodeGemma), reducing distillation overhead. Our pipeline combines Vertex AI's native fine-tuning APIs (Tuning Job) with BigQuery for labeled dataset management, enabling reproducible, auditable model training workflows.
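For readers unfamiliar with the Modelfile mentioned in the training evidence note, the sketch below renders a small Ollama Modelfile from Python. `FROM`, `PARAMETER`, and `SYSTEM` are standard Modelfile instructions. The base model, parameter values, and system prompt shown here are assumptions chosen for illustration and are not the contents of OLLAMA_MODELFILE_V3; `render_modelfile` is a hypothetical helper.

```python
def render_modelfile(base_model: str, system_prompt: str, parameters: dict) -> str:
    """Render an Ollama Modelfile with FROM, PARAMETER, and SYSTEM instructions."""
    if '"""' in system_prompt:
        # The SYSTEM block below is delimited by triple quotes, so they cannot appear inside it.
        raise ValueError("system prompt must not contain triple quotes")
    lines = [f"FROM {base_model}"]
    for name, value in parameters.items():
        lines.append(f"PARAMETER {name} {value}")
    lines.append(f'SYSTEM """\n{system_prompt.strip()}\n"""')
    return "\n".join(lines) + "\n"


# Illustrative values only; the real curriculum and settings are not shown in this document.
modelfile = render_modelfile(
    base_model="mistral:7b",
    system_prompt="You are an assistant for jewelry grading and asset valuation.",
    parameters={"temperature": 0.3, "num_ctx": 4096},
)
print(modelfile)
```

Saving the rendered text to a file named `Modelfile` and running `ollama create claude-jr -f Modelfile` would register it as a local model under that name.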

See It Live

BURNRATE Dashboard

Real-time financial tracking with AI-powered projections. See model inference costs and optimization metrics.


CMD_SCHOOL

Interactive terminal training with AI command processing. Learn model integration patterns.


Vertex AI Docs

Official Google Cloud documentation for fine-tuning and distillation workflows.


Research Hub

Explore all research areas and live demonstrations across the RUNE platform.
