How to Create Custom Command Models

How to set up finetuning and the benefits of using this feature.

Introduction

Are you looking to go beyond pre-trained large language models (LLMs) and customize them with your own specialized data? Our new finetuning feature lets you create natural, expressive models tailored to your own datasets and use cases. This finetuning feature runs on our flagship Command model, which is trained to follow user commands and to be instantly useful in practical applications.

With Command model finetuning, you can train a custom model to respond to a specific command in a more natural and fluid way. This means that, at inference time, your model will generate completions that respond more appropriately to the prompts you specify.

In this blog post, we'll go through how to set up finetuning and discuss some of the benefits of using this new feature.

What is Finetuning?

When we think about training LLMs, we typically envision training them from scratch on a huge amount of text data. This could be anything from a few hundred books, to tens of thousands of web pages, to millions of tweets. Generally, the more (good, diverse, non-noisy) data we use to train an LLM, the better it will perform when we evaluate it on downstream tasks.

However, there's another way to adapt LLMs to specific use cases, one that doesn't require vastly increasing the amount of training data. This technique is called finetuning, and it benefits in particular from careful formatting and data choices.

For example, if we are training a model to generate responses to questions, we can include the question in the prompt that the model is trained on and the answer as the completion. This will help the model learn to generate responses that are more relevant to the specific question that's being asked.

This is useful when we have limited data available, and also when we want to tailor a model to specific use cases by providing domain-specific data or guiding the model to respond in a consistent style or tone. For example, if we want to train a model to have conversations with customers about specific products, we can include product descriptions and reviews within the prompt. This will help the model learn to generate responses that are more relevant to the specific product that's being discussed.

Cohere's finetuning setup on the Command model helps you do this. The setup allows you to use your own data to train custom models that are specific to your use case. This can be anything from customer service conversations, to product reviews, to questions and answers, and more.

By using your own data, you can train models that are specific to your industry and your customers. This can help you improve the relevance and accuracy of the responses that your models generate.

How to Finetune a Command Model

To train the model to complete your prompts, we prepare a dataset of prompt-completion pairs. The minimum required number of data points is 32, but in general, the more data points, the better. It also depends on the task or domain that we are working on — typically, the more specialized the task is, the more data is required.

Finetuning takes JSONL files as input. Each line is a training example consisting of a formatted prompt alongside a completion. Here is what the file should look like:

{"prompt": "This is the first prompt", "completion": "This is the first completion"}
{"prompt": "This is the second prompt", "completion": "This is the second completion"}

The following are some example use cases showing how a prompt-completion dataset is formatted. (The examples are pretty-printed here for readability; in an actual JSONL file, each record occupies a single line.)

Example 1: Tone-Tailored Question Answering

{
  "prompt": "How do I exercise more?",
  "completion": "Hey fam, you got this! Lemme start with..."
}
{
  "prompt": "What should I do to improve my bball game?",
  "completion": "Aight so first of all you need to..."
}

Example 2: Structured or Formatted Completions

{
  "prompt": "$GOOG and $MSFT are up today [...] Extract the ticker   symbols.",
  "completion": "['GOOG', 'MSFT']"
}
{
  "prompt": "Extract company names as ticker symbols. Reports say that the latest from Amazon...",
  "completion": "['AMZN']"
}

Example 3: Flexible Classification or Tagging

{
  "prompt": "NEW YORK -- Last night, the mayor announced [...] Write a list of keywords.",
  "completion": "news, american, political"
}
{
  "prompt": "Write a list of keywords for the following: It was the best of times, it was the worst...",
  "completion": "literature, book"
}

A great way to think about finetuning is to consider what inputs your model will see when it’s deployed and what completions (or generations) would be ideal for the user.

Completions don’t just have to be a “response” to a user; depending on how you structure your finetuning data, you can bake in a task or job for the model to complete. For example, the prompts and completions above include not only tailored responses, but also entity extraction and classification.

If you are ever in doubt, try manually examining a set of prompts and completions yourself. If you saw only a few examples of the prompts and completions, could you reasonably guess what the next completion should look like? Asking this question can help you determine how best to structure your prompt-completion training data.
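
A quick sanity check along these lines can also catch formatting problems before you upload. This is a minimal sketch, not official tooling; the file name is a placeholder, and the 32-example minimum is the figure mentioned earlier in this post:

import json

MIN_EXAMPLES = 32  # minimum number of data points, as noted above

# Placeholder file name; point this at your own dataset.
with open("finetune_data.jsonl", "r", encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

assert len(records) >= MIN_EXAMPLES, f"Need at least {MIN_EXAMPLES} examples, found {len(records)}"
for i, record in enumerate(records):
    assert set(record) == {"prompt", "completion"}, f"Unexpected keys on line {i + 1}"
    assert record["prompt"].strip() and record["completion"].strip(), f"Empty field on line {i + 1}"

print(f"OK: {len(records)} valid prompt-completion pairs")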

If you would like to try running a finetune on a sample dataset, head over to the LLM University chapter on creating custom generative models, which provides you with step-by-step instructions.

Final Thoughts

We hope that this post has helped you understand how to use finetuning to create a custom Command model that will boost the performance of the application you are building.

If you would like to jump right in and get started on creating custom models, you can do that via the Cohere dashboard.
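
Once your finetune has finished training, you can call it like any other model by passing its ID at inference time. Below is a minimal sketch assuming the Cohere Python SDK's generate endpoint; the API key and model ID are placeholders for the values shown in your dashboard:

import cohere

# Placeholders: use your actual API key and the ID assigned to your custom model.
co = cohere.Client("YOUR_API_KEY")

response = co.generate(
    model="your-custom-model-id",  # the custom model's ID from the dashboard
    prompt="How do I exercise more?",
)
print(response.generations[0].text)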
