
Fine-Tuning

Tags: fine-tuning, ai, training, machine-learning
Plain English

Fine-tuning is like teaching a general-purpose doctor to become a specialist. The doctor already knows medicine (the pre-trained model), but by studying thousands of cardiology cases (your specific dataset), they become an expert cardiologist. Fine-tuning takes an AI model that already understands language and trains it further on your data so it speaks your terminology, follows your formats, and excels at your specific tasks.

Technical Definition

Fine-tuning is the process of continuing the training of a pre-trained model on a smaller, task-specific dataset to adapt it for a particular use case. The pre-trained model’s weights are updated (fully or partially) using the new data.

Fine-tuning vs. alternatives:

| Approach | Training? | Cost | Best for |
|---|---|---|---|
| Prompt engineering | No | Free | Format, style, simple tasks |
| RAG | No | Low (infra only) | Grounding in specific documents |
| Fine-tuning | Yes | Medium | Style, format, domain terminology, complex behavior patterns |
| Pre-training from scratch | Yes | Very high | New languages, entirely new domains |

When to fine-tune:

  • Consistent output format that prompting cannot reliably achieve
  • Domain-specific terminology or jargon the base model gets wrong
  • Classification tasks with labeled data
  • Style matching (writing like your brand voice)
  • Reducing token usage (fine-tuned models need shorter prompts)

When NOT to fine-tune:

  • You need the model to know specific facts (use RAG instead)
  • You have fewer than 100 training examples
  • Prompt engineering already achieves acceptable results
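The decision points above can be condensed into a rough heuristic. This is an illustrative sketch, not an official rule; the 100-example threshold comes from the list above, and the rest is a simplification:

```python
def should_fine_tune(needs_facts: bool, n_examples: int,
                     prompting_works: bool) -> str:
    """Rough decision heuristic distilled from the criteria above."""
    if needs_facts:
        # The model must know specific facts: retrieval, not weights
        return "use RAG"
    if prompting_works:
        # Cheapest option already achieves acceptable results
        return "use prompt engineering"
    if n_examples < 100:
        # Too little labeled data to fine-tune reliably
        return "collect more data first"
    return "fine-tune"
```

For example, needing the model to cite current documentation routes to RAG even with plenty of examples, while a consistent-format task backed by 500 labeled examples routes to fine-tuning.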

Fine-tuning approaches:

  • Full fine-tuning: update all model parameters. Requires significant compute (GPUs).
  • LoRA (Low-Rank Adaptation): freeze most weights; train small adapter matrices. 90%+ fewer trainable parameters, much lower compute cost.
  • QLoRA: LoRA on a quantized (4-bit) base model. Fine-tune 70B models on a single GPU.
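The parameter savings from LoRA can be seen in a minimal sketch. This is illustrative NumPy, not a training library: the frozen base weight `W` stays fixed, only the low-rank adapters `A` and `B` would be trained, and the effective weight is `W + A @ B`. The hidden size and rank below are hypothetical:

```python
import numpy as np

d, r = 1024, 8  # hypothetical hidden size and LoRA rank
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pre-trained weight (not trained)
A = rng.normal(size=(d, r)) * 0.01   # trainable down-projection (d x r)
B = np.zeros((r, d))                 # trainable up-projection, initialized to zero

def forward(x):
    # LoRA forward pass: base output plus the low-rank update.
    # With B initialized to zero, this starts identical to the base model.
    return x @ W + x @ A @ B

full_params = W.size
lora_params = A.size + B.size
print(f"Trainable: {lora_params:,} of {full_params:,} "
      f"({100 * lora_params / full_params:.2f}%)")
```

With these shapes, the adapters hold well under 2% of the base layer's parameters, which is where the "90%+ fewer trainable parameters" figure comes from. QLoRA applies the same idea but stores `W` in 4-bit quantized form.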

Training data format: typically JSONL with instruction/completion pairs:

{"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}

Fine-tuning with OpenAI API

from openai import OpenAI
client = OpenAI()

# 1. Prepare training data (JSONL format)
# training_data.jsonl:
# {"messages": [{"role": "system", "content": "You are a network engineer."},
#   {"role": "user", "content": "What VLAN should guest Wi-Fi use?"},
#   {"role": "assistant", "content": "Isolate guest Wi-Fi on VLAN 20..."}]}

# 2. Upload training file
file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune"
)

# 3. Create fine-tuning job
job = client.fine_tuning.jobs.create(
    training_file=file.id,
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={"n_epochs": 3}
)

# 4. Monitor progress
status = client.fine_tuning.jobs.retrieve(job.id)
print(f"Status: {status.status}")  # queued -> running -> succeeded

# 5. Use the fine-tuned model
response = client.chat.completions.create(
    model=status.fine_tuned_model,  # ft:gpt-4o-mini:org:custom:id
    messages=[{"role": "user", "content": "Configure VLAN 30 for management"}]
)

In the Wild

Fine-tuning is used when prompting alone cannot achieve the required consistency. Companies fine-tune models to match their brand voice, follow internal documentation formats, or classify support tickets into categories. In cybersecurity, fine-tuned models analyze logs and alert triage with domain-specific understanding. The key decision is fine-tuning vs. RAG: fine-tuning teaches the model how to behave (style, format, reasoning patterns), while RAG teaches it what to know (specific facts and documents). Most production systems use both: a fine-tuned model for consistent behavior, with RAG for grounding in current data. LoRA has dramatically lowered the barrier: fine-tuning a 7B parameter model on a single consumer GPU is now routine, and platforms like Hugging Face, Together AI, and Fireworks AI offer managed fine-tuning as a service.