Fine-tuning an LLM

Fine-tuning TinyLlama Locally

I recently fine-tuned TinyLlama on a small custom dataset and was impressed by how well it learned the specific response style. Here's what I did and the results.

You can try it out yourself by checking out the repository.

What is Fine-tuning?

Fine-tuning takes a pre-trained language model (one that already understands general language) and trains it further on specific data to improve performance on particular tasks. Think of it as giving a general-purpose assistant specialized training in a specific domain.

The Training Data

I started with just 3 examples in a simple JSON format:

```json
[
  {"prompt": "Explain Python lists", "response": "Python lists are ordered, mutable collections."},
  {"prompt": "What is a dictionary?", "response": "A dictionary stores key-value pairs with fast lookup."},
  {"prompt": "Explain list comprehension", "response": "A concise syntax for creating lists from iterables."}
]
```

The goal was to teach the model to give concise, technical answers about Python concepts.
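For reference, here's a minimal sketch of how a dataset like this can be loaded and rendered into training strings. The prompt/response template below is an assumption for illustration, not the exact format used in my script.

```python
import json

# Hypothetical instruction template; the exact format used for training
# is an assumption here, chosen for illustration.
TEMPLATE = "### Prompt:\n{prompt}\n### Response:\n{response}"

def format_example(example):
    """Render one {prompt, response} pair as a single training string."""
    return TEMPLATE.format(**example)

raw = """[
  {"prompt": "Explain Python lists",
   "response": "Python lists are ordered, mutable collections."}
]"""

texts = [format_example(ex) for ex in json.loads(raw)]
print(texts[0])
```

Each rendered string is then tokenized and fed to the model as a standard causal-language-modeling example.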

The Setup

I used LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning. Instead of updating all model parameters (which would require massive resources), LoRA only trains a small set of additional parameters—reducing memory usage by ~90% while maintaining nearly the same performance.
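The saving is easy to sanity-check with back-of-envelope arithmetic: for a single d×k weight matrix, LoRA freezes the original d·k parameters and trains only two low-rank factors totaling r·(d+k) parameters. The dimensions below (2048×2048, typical of a Llama-style attention projection) and the rank r=8 are assumed values for illustration:

```python
# Back-of-envelope: trainable parameters for one weight matrix under LoRA.
d, k = 2048, 2048   # projection dimensions (Llama-style attention; assumed)
r = 8               # LoRA rank (a common default; assumed)

full = d * k         # parameters updated by full fine-tuning
lora = r * (d + k)   # parameters in the low-rank factors B (d x r) and A (r x k)

print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Under 1% of the matrix's parameters are trainable, and since optimizer state scales with trainable parameters, the memory footprint drops accordingly.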

The training ran for 10 epochs on a CPU, taking about 2 minutes. The loss decreased from 3.03 to 0.18, showing clear learning progress.
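In code, the setup boils down to a LoRA adapter config plus standard training arguments. The sketch below is not my exact script; apart from the 10 epochs and CPU training described above, the hyperparameter values are assumptions:

```python
from peft import LoraConfig, TaskType
from transformers import TrainingArguments

# LoRA adapter configuration (rank, alpha, dropout, and target modules
# are assumed values, not taken from the actual run).
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in Llama-style models
)

# Training arguments matching the description above: 10 epochs on CPU.
training_args = TrainingArguments(
    output_dir="tinyllama-lora",
    num_train_epochs=10,
    per_device_train_batch_size=1,  # assumed; the dataset has only 3 examples
    learning_rate=2e-4,             # assumed
    use_cpu=True,                   # newer transformers; older versions use no_cuda=True
    logging_steps=1,
)
```

From there, `get_peft_model` wraps the base model with the adapter and a standard `Trainer` runs the loop.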

The Results

After training, I compared the original model with the fine-tuned version:

Example 1: "Explain Python lists"

Original Model:

A Python list is an ordered collection of objects.

Fine-tuned Model:

Python lists are ordered, mutable collections.

Expected (from training data):

Python lists are ordered, mutable collections.

Perfect match! The fine-tuned model produced the exact response from the training data.

Example 2: "What is a dictionary?"

Original Model:

A dictionary is a collection of words, their definitions, and their meanings.

Fine-tuned Model:

A dictionary is a collection of key-value pairs.

Expected (from training data):

A dictionary stores key-value pairs with fast lookup.

The fine-tuned model correctly interpreted "dictionary" as a Python data structure (not a book), though it didn't match the exact wording. Still, it captured the core concept.
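A simple way to score comparisons like these is exact match against the training response, with keyword overlap as a softer fallback. The helper below illustrates that idea; it is not the evaluation code from the repository:

```python
import string

def exact_match(generated: str, expected: str) -> bool:
    """True when the model reproduces the expected response verbatim."""
    return generated.strip() == expected.strip()

def _words(text: str) -> set:
    """Lowercase words with punctuation stripped."""
    table = str.maketrans("", "", string.punctuation)
    return set(text.lower().translate(table).split())

def keyword_overlap(generated: str, expected: str) -> float:
    """Fraction of the expected response's words present in the output."""
    expected_words = _words(expected)
    return len(expected_words & _words(generated)) / len(expected_words)

# Example 1 reproduced the training response verbatim:
print(exact_match("Python lists are ordered, mutable collections.",
                  "Python lists are ordered, mutable collections."))   # True

# Example 2 captured the concept but not the wording:
print(keyword_overlap("A dictionary is a collection of key-value pairs.",
                      "A dictionary stores key-value pairs with fast lookup."))  # 0.5
```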

Key Takeaways

LoRA-based fine-tuning made this accessible on regular hardware—no GPU required. The model learned the concise, technical style from just 3 examples.

8-bit quantized inference reduced memory usage by ~75%, making deployment practical on systems with limited RAM.
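One common route to 8-bit loading is bitsandbytes via transformers, sketched below. The checkpoint ID is the public TinyLlama chat model and is an assumption here; note also that bitsandbytes 8-bit loading generally requires a CUDA GPU, so CPU-only deployments typically use a different quantization path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed checkpoint

# Load weights in 8-bit, roughly a 4x memory reduction versus fp32.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
```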

Dockerized deployment ensured everything ran consistently across different environments.

Why This Matters

Fine-tuning allows you to customize large language models for specific use cases without massive computational resources. This approach is widely used in production to adapt general-purpose models to specific domains, tasks, or response styles.

The results show that even with minimal training data, the model can learn to match a specific format and style—making it perfect for creating domain-specific assistants or customizing responses for particular applications.

Try It Yourself

Want to fine-tune TinyLlama on your own dataset? Check out the complete implementation in the repository. The code includes everything you need to get started: training scripts, evaluation tools, and a Docker setup for easy deployment.
