Programmatic Fine-Tuning

Introducing programmatic fine-tuning with our Python SDK.

We are excited to introduce programmatic fine-tuning (beta) with our Python SDK. Prior to this, fine-tuning was available exclusively via the Cohere Dashboard. With this new programmatic access, developers can seamlessly incorporate fine-tuning into their workflows.

By leveraging the SDK, developers can unlock automation, version control, and reproducibility, simplifying their fine-tuning process.

Benefits of Programmatic Fine-Tuning

Fine-tuning models programmatically comes with a number of benefits:

  1. Automation: With the Python SDK, customers can automate the process of fine-tuning, eliminating the need for manual configuration through the UI. This allows for rapid experimentation and iteration, saving valuable time and effort.
  2. Version Control: Through the Python SDK, users can manage their fine-tuning code using popular version control systems, such as Git. This provides a centralized and organized approach to model creation, making it easier to collaborate with team members and track changes over time.
  3. Reproducibility: Programmatic fine-tuning using the Python SDK ensures that the entire process is reproducible. Because models can be defined, trained, and fine-tuned using code, the same setup can be replicated easily for future use or shared with others.

What is Fine-Tuning?

To grasp the concept and functionality of fine-tuning, it helps to be familiar with two commonly used terms associated with large language models (LLMs): pre-training and fine-tuning.

Pre-training involves training a language model on a large corpus of text data, so that it may acquire a grasp of the general patterns and structures of language. This process enables the model to generate coherent text.

Fine-tuning, on the other hand, entails taking a pre-trained language model and training it further on a smaller, more specific dataset in order to adapt it to a particular task. Fine-tuning allows the model to be customized for a specific task, leading to improved performance.

An overview of the fine-tuning process

Understanding the Fine-Tuning Python SDK

Now, let's explore the fine-tuning process via the Python SDK. The create_custom_model method is key to fine-tuning using the Python SDK. It provides a straightforward way to initiate fine-tuning with customizable options.

Basic Input Parameters

To create a fine-tune using the Python SDK, the create_custom_model method requires three essential parameters: name, model_type, and dataset.

The name parameter represents the unique name assigned to the fine-tuned model within the organization.

The model_type parameter determines the type of fine-tuned model to be created. Currently, users can choose from two options: GENERATIVE or CLASSIFY.

The dataset parameter is where the training data for the fine-tuned model is provided. The dataset can be of different types, such as InMemoryDataset, CsvDataset, JsonlDataset, or TextDataset.

create_custom_model(name: str, model_type: Literal['GENERATIVE', 'CLASSIFY'], dataset: CustomModelDataset) → CustomModel
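
As a quick illustration of the dataset parameter, here is a minimal sketch that builds an InMemoryDataset from a few prompt-completion pairs. The training_data argument and the pair format are assumptions, so check the SDK reference for the exact constructor; the example pairs are placeholders.

from cohere.custom_model_dataset import InMemoryDataset

# A handful of (prompt, completion) pairs held in memory. The training_data
# argument and the pair format are assumptions; see the SDK reference.
dataset = InMemoryDataset(training_data=[
    ("This is the first example prompt", "This is the first example completion"),
    ("This is the second example prompt", "This is the second example completion"),
])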

Additional Input Parameters

Beyond the basic parameters, there are other parameters that can be defined, namely the hyperparameters for training a fine-tuned model, as follows:

  • train_steps: The maximum number of training steps to run
  • learning_rate: The initial learning rate to be used during training
  • train_batch_size: The batch size used during training
  • early_stopping_patience: Stop training if the loss metric does not improve beyond the early_stopping_threshold for this many evaluation rounds
  • early_stopping_threshold: How much the loss must improve to prevent early stopping
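
Where hyperparameter overrides are needed, they can be supplied alongside the basic parameters when creating the fine-tune. Below is a minimal sketch; the HyperParameters import path and the hyperparameters keyword argument of create_custom_model are assumptions (the class name appears in the response object shown later in this post), so consult the SDK reference for the exact interface.

from cohere import HyperParameters  # hypothetical import path; see the SDK reference

# Assumption: overrides are grouped in a HyperParameters object and passed to
# create_custom_model via a `hyperparameters` keyword argument.
hyperparameters = HyperParameters(
    train_steps=2500,
    learning_rate=0.01,
    train_batch_size=16,
    early_stopping_patience=6,
    early_stopping_threshold=0.01,
)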

Step-by-Step Example

This section goes through the step-by-step process of using the Python SDK to create fine-tuned models.

1. Setting Up

First, let’s install and import the cohere library, which provides the necessary functionality for interacting with the Cohere platform.

Also, import the CsvDataset class from the cohere.custom_model_dataset module, which is used to handle training data in a CSV format.

Then, initialize a cohere.Client object by passing an API key as a parameter.

!pip install -q cohere

import cohere
from cohere.custom_model_dataset import CsvDataset

co = cohere.Client("Your API KEY")

2. Creating a Fine-Tuned Model

Now, let’s create a CsvDataset object, specifying the training file path (train_file) and delimiter (delimiter) used in the CSV file.

Finally, call the create_custom_model method to create a fine-tuned model. The function takes a model name of your choice (here the name is "prompt-completion-ft"), the dataset object (dataset), and the model type (GENERATIVE) as parameters.
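
Putting these two steps together, the code could look like the following; the CSV path and delimiter are placeholders for your own training file.

# Create a dataset object from a CSV training file (replace with your own path)
dataset = CsvDataset(train_file="path/to/your/file.csv", delimiter=",")

# Kick off fine-tuning of a generative model
finetune = co.create_custom_model(name="prompt-completion-ft", dataset=dataset, model_type="GENERATIVE")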

3. Viewing the Fine-Tuning Status

The fine-tuning status can be viewed via the get_custom_model_by_name method, passing in the model name we specified earlier.

co.get_custom_model_by_name("prompt-completion-ft")

The response is a cohere.CustomModel object. It contains several values that provide information about the fine-tuned model. Here are some of the key ones:

  • id: A unique identifier (UUID) assigned to the model.
  • name: The name of the model.
  • status: The current fine-tuning status. The list of possible statuses can be found in the documentation; two worth mentioning here are QUEUED, indicating that the model is in the queue for processing, and READY, indicating that the model is ready for use.
  • model_type: The model fine-tuning type, which is `GENERATIVE` in this case.
  • model_id: The ID to be used when using the fine-tuned model in an API call.

cohere.CustomModel {
    id: 1bc210c6-b2fd-4ffa-9a1e-9d07f0eca0ca
    name: prompt-completion-ft
    status: READY
    model_type: GENERATIVE
    created_at: 2023-07-23 19:42:57.742266+00:00
    completed_at: None
    model_id: 2dde35d0-bc35-4976-a98b-03d0d622f3f5-ft
    hyperparameters: HyperParameters(early_stopping_patience=6, early_stopping_threshold=0.01, train_batch_size=16, train_steps=2500, learning_rate=0.01)
    _wait_fn: <bound method Client.wait_for_custom_model of <cohere.client.Client object at 0x7eb46c283550>>
}

An alternative for getting the fine-tuning status is via the get_custom_model method, which takes as input the id value we saw in the response above.

co.get_custom_model("1bc210c6-b2fd-4ffa-9a1e-9d07f0eca0ca")
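
Because fine-tuning can take a while, a simple pattern is to poll the status until it becomes READY. Here is a minimal sketch built on the get_custom_model call above; note that some SDK versions may represent the status as an enum rather than a plain string.

import time

# Poll the fine-tune status every 60 seconds until it is READY
while True:
    model = co.get_custom_model("1bc210c6-b2fd-4ffa-9a1e-9d07f0eca0ca")
    print(f"Current status: {model.status}")
    if str(model.status).endswith("READY"):  # handles both string and enum statuses
        break
    time.sleep(60)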

4. Using the Fine-Tuned Model

Once the fine-tuned model's status is READY, we can start using it in an endpoint call. Using a fine-tuned model is as simple as substituting the default model (for example, command) with the model_id of the fine-tuned model. Here is an example with the Generate endpoint.

co.generate(
  model='2dde35d0-bc35-4976-a98b-03d0d622f3f5-ft',
  prompt="YOUR_PROMPT_HERE")
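
To capture the output, assign the response and read the text of the first generation. The attribute names below follow the SDK version used in this post; check the reference if yours differs.

response = co.generate(
  model='2dde35d0-bc35-4976-a98b-03d0d622f3f5-ft',
  prompt="YOUR_PROMPT_HERE")

# Print the generated text of the first (and, by default, only) generation
print(response.generations[0].text)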

Conclusion

The fine-tuning Python SDK brings a number of benefits, such as automation, version control, reproducibility, and seamless integration with existing Python-based workflows. This empowers developers to leverage Cohere's cutting-edge LLMs more efficiently and flexibly, opening up new possibilities for tailored language models that cater to diverse use cases.
