Fine-tuning
Understanding fine-tuning
When training artificial intelligence (AI) and machine learning (ML) models for a specific purpose, data scientists and engineers have found it easier and less expensive to modify existing pretrained foundation large language models (LLMs) than to train new models from scratch. A foundation LLM is a powerful, general-purpose AI model trained on vast datasets to understand and generate human-like text across a broad range of topics and tasks.
The ability to leverage the deep learning already embedded in existing models reduces the compute power and curated data needed to tailor a model to specific use cases.
Fine-tuning is the process of adapting or supplementing pretrained models by training them on smaller, task-specific datasets. It has become an essential part of the LLM development cycle, allowing the raw linguistic capabilities of base foundation models to be adapted for a variety of use cases.
How fine-tuning LLMs works
Pretrained large language models are trained on enormous amounts of data, which makes them good at understanding natural language and generating human-like responses to input. That general capability makes them a natural starting point for a base model.
Fine-tuning these models improves their ability to perform specific tasks, such as sentiment analysis, question answering or document summarization, with higher accuracy. Off-the-shelf third-party LLMs are available, but fine-tuning a model on an organization’s own data delivers domain-specific results.
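To make this concrete, the following sketch shows what task-specific fine-tuning can look like using the Hugging Face transformers and datasets libraries, one common toolchain among several; the base model, dataset and hyperparameters are illustrative placeholders, not recommendations.

```python
# Minimal fine-tuning sketch: adapt a pretrained model to sentiment
# analysis on a labeled dataset (IMDB used here as a stand-in).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # any pretrained base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2)  # e.g., positive/negative sentiment

dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()  # updates the pretrained weights on the new task
```

Because the base model already understands language in general, even a single pass over a modest labeled dataset can produce meaningful gains on the target task.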
The importance and benefits of fine-tuning
Fine-tuning connects the intelligence in general-purpose LLMs to enterprise data, enabling organizations to adapt generative AI (GenAI) models to their unique business needs with higher degrees of specificity and relevance. Even small companies can build customized models suited to their needs and budgets.
Fine-tuning significantly reduces the need to invest in costly infrastructure for training models from scratch. By fine-tuning pretrained models, organizations can achieve faster time to market, and because a smaller fine-tuned model can match a much larger general-purpose model on a narrow task, inference latency can drop as well.
Modern fine-tuning techniques also reduce memory usage and speed up training when equipping foundation models with specialized, domain-specific knowledge, saving labor and resources.
When you fine-tune a language model on your proprietary data on Databricks, your unique datasets are not exposed to third-party risks associated with general model training environments.
Types of fine-tuning
Fine-tuning can help improve the accuracy and relevance of a model’s outputs, making them more effective in specialized applications than the broadly trained foundation models. It adapts the model to understand and generate text specific to a particular domain or industry. The model is fine-tuned on a dataset of text from the target domain, improving its grasp of domain-specific context and tasks. The process can be very resource-intensive, but newer techniques make fine-tuning much more efficient. The following are some of the ways organizations fine-tune their LLMs:
- Full fine-tuning: Full fine-tuning involves optimizing or training all layers of the neural network. While this approach typically yields the best results, it is also the most resource-intensive and time-consuming.
- Partial fine-tuning: This approach reduces computational demands by updating only a select subset of the pretrained parameters most critical to model performance on the relevant downstream tasks (see the sketch after this list).
- Additive fine-tuning: Additive methods add extra parameters or layers to the model, freeze the existing pretrained weights and train only those new components.
- Few-shot learning: When collecting a large labeled dataset is impractical, few-shot learning addresses the gap by giving the model just a few examples (or “shots”) of the task at hand.
- Transfer learning: This technique allows a model to perform a task different from the task it was initially trained on. The main idea is to leverage the knowledge the model has gained from a large, general dataset and apply it to a more specific or related task.
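As referenced under partial fine-tuning above, the sketch below illustrates the idea in PyTorch via Hugging Face transformers; it assumes a model that exposes its pretrained backbone as base_model, and which layers to freeze is a design choice that varies by architecture and task.

```python
# Partial fine-tuning sketch: freeze the pretrained backbone and train
# only the newly added task head. Layer names vary by architecture.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

# Freeze every backbone parameter so gradients (and optimizer state)
# are kept only for the small task-specific head.
for param in model.base_model.parameters():
    param.requires_grad = False

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(f"Training {len(trainable)} parameter tensors:", trainable)
```

Freezing most parameters shrinks gradient and optimizer-state memory, which is what makes partial and additive approaches so much cheaper than full fine-tuning.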
Parameter-efficient fine-tuning
Parameter-efficient fine-tuning (PEFT) is a suite of techniques designed to adapt large pretrained models to specific tasks while minimizing computational resources and storage requirements. This approach is beneficial for applications with limited resources or those requiring multiple fine-tuning tasks. PEFT methods, such as low-rank adaptation (LoRA) and adapter-based fine-tuning, work by introducing a small number of trainable parameters instead of updating the entire model. Adapter layers, a key component of PEFT, are lightweight, trainable models inserted into each layer of a pretrained model.
These adapters, which come in variants like Sequential, Residual and Parallel, adjust the model’s output without altering the original weights, thus preserving them while allowing for task-specific adjustments. For instance, LoRA can efficiently fine-tune large language models for tasks such as generating product descriptions. Meanwhile, quantized low-rank adaptation (QLoRA) focuses on reducing memory and computational load by using quantization. QLoRA optimizes memory with quantized low-rank matrices, which makes it highly efficient for tasks where hardware resources are limited.
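For instance, with the Hugging Face peft library (one popular implementation, assumed here for illustration), wrapping a base model in a LoRA configuration freezes the original weights and trains only the small low-rank matrices; the rank, scaling factor and target modules below are illustrative values to be tuned per model and task.

```python
# LoRA sketch: add small trainable low-rank matrices to the attention
# projections of a frozen pretrained model.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's combined attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% trainable
```

A QLoRA-style setup follows the same pattern but additionally loads the frozen base weights in a quantized (e.g., 4-bit) format to cut memory use further.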
When to use fine-tuning
Fine-tuning gives the model a more focused dataset, such as industry-specific terminology or task-focused interactions. This helps the model generate more relevant responses for the use case, which could be anything from customizing the model’s style to supplementing its core knowledge to extending it to entirely new tasks and domains.
- Task-specific adaptation: When you have a pretrained language model and want to adapt it to perform a specific task, such as sentiment analysis or text generation for a particular domain, you can start with the pretrained model and fine-tune it on domain-specific data rather than training a large model from scratch, leveraging its general language understanding for the new task.
- Bias mitigation: Fine-tuning can be used to reduce or counteract biases present in a pretrained model by providing balanced and representative training data.
- Data security and compliance: When working with sensitive data, you can fine-tune a model locally on your own secure infrastructure so that the data never leaves your controlled environment.
- Limited data availability: Fine-tuning is particularly beneficial when you have limited labeled data for your specific task. Instead of training a model from scratch, you can leverage a pretrained model’s knowledge and adapt it to your task using a smaller dataset.
- Continuous learning: Fine-tuning supports iterative updates as new data becomes available, allowing a model to stay current with evolving terminology, products and requirements without being retrained from scratch.