Context by Cohere
Programmatic Custom Model Creation

Introducing programmatic custom model creation with our Python SDK.

We are excited to introduce programmatic custom model creation (beta) with our Python SDK. Prior to this, custom model creation was available exclusively via the Cohere Dashboard. With this new programmatic access, developers can seamlessly incorporate custom model creation into their workflows.

By leveraging the SDK, developers can unlock automation, version control, and reproducibility, simplifying their custom model creation process.

Benefits of Programmatic Custom Model Creation

Creating custom models programmatically comes with a number of benefits:

  1. Automation: With the Python SDK, customers can automate the process of creating custom models, eliminating the need for manual configuration through the UI. This allows for rapid experimentation and iteration, saving valuable time and effort.
  2. Version Control: Through the Python SDK, users can manage their code for custom models using popular version control systems, such as Git. This provides a centralized and organized approach to model creation, making it easier to collaborate with team members and track changes over time.
  3. Reproducibility: Programmatically creating custom models using the Python SDK ensures that the entire process is reproducible. Because models can be defined, trained, and finetuned using code, the same setup can be replicated easily for future use or shared with others.

What Are Custom Models?

To understand what a custom model is, it helps to know two terms commonly used with large language models (LLMs): pre-training and finetuning.

Pre-training involves training a language model on a large corpus of text data so that it learns the general patterns and structures of language. This process enables the model to generate coherent text.

Finetuning, on the other hand, takes a pre-trained language model and trains it further on a smaller, more specific dataset to adapt it to a particular task, which typically improves performance on that task. A custom model is the result of this finetuning step.

[Figure: An overview of the custom model creation process]

Understanding the Custom Model Python SDK

Now, let's explore the process of creating custom models via the Python SDK. The central piece is the create_custom_model method, which starts the creation of a new custom model and accepts a number of configurable options.

Basic Input Parameters

To create a custom model using the Python SDK, the create_custom_model method requires three essential parameters: name, model_type, and dataset.

The name parameter represents the unique name assigned to the custom model within the organization.

The model_type parameter determines the type of custom model to be created. Currently, users can choose from two options: GENERATIVE or CLASSIFY.

The dataset parameter is where the training data for the custom model is provided. The dataset can be of different types, such as InMemoryDataset, CsvDataset, JsonlDataset, or TextDataset.

create_custom_model(name: str, model_type: Literal['GENERATIVE', 'CLASSIFY'], dataset: CustomModelDataset) -> CustomModel

Additional Input Parameters

Beyond the basic parameters, there are other parameters that can be defined, namely the hyperparameters for training a custom model, as follows:

  • train_steps: The maximum number of training steps to run
  • learning_rate: The initial learning rate to be used during training
  • train_batch_size: The batch size used during training
  • early_stopping_patience: The number of evaluations to wait before stopping training when the loss metric does not improve by more than early_stopping_threshold
  • early_stopping_threshold: How much the loss must improve to prevent early stopping
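
As a rough sketch of how these hyperparameters might be collected before a training run: the TrainingHyperparameters dataclass below is hypothetical and not part of the SDK. Its field names follow the list above, and its defaults mirror the values that appear in the example response later in this post. The validation rules are basic sanity checks, not the SDK's actual limits, and how the resulting dictionary is handed to create_custom_model is an assumption to verify against the SDK reference.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainingHyperparameters:
    """Hypothetical container for training options; not part of the cohere SDK."""

    train_steps: int = 2500
    learning_rate: float = 0.01
    train_batch_size: int = 16
    early_stopping_patience: int = 6
    early_stopping_threshold: float = 0.01

    def __post_init__(self):
        # Sanity checks only; the SDK's real minimum/maximum values are not assumed here.
        if self.train_steps <= 0:
            raise ValueError("train_steps must be positive")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.train_batch_size <= 0:
            raise ValueError("train_batch_size must be positive")
        if self.early_stopping_patience < 0:
            raise ValueError("early_stopping_patience must not be negative")
        if self.early_stopping_threshold < 0:
            raise ValueError("early_stopping_threshold must not be negative")

    def to_dict(self) -> dict:
        """Return the settings as a plain dictionary."""
        return asdict(self)


# Override only the values you want to change; the rest keep their defaults.
hp = TrainingHyperparameters(train_steps=1000, learning_rate=0.005)
print(hp.to_dict())
```

Keeping the settings in one small, immutable object makes it easy to commit them alongside the training script, which supports the version control and reproducibility benefits described earlier.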

Step-by-Step Example

This section goes through the step-by-step process of using the Python SDK to create custom models.

1. Setting Up

First, let’s install and import the cohere library, which provides the necessary functionality for interacting with the Cohere platform.

Also, import the CsvDataset class from the cohere.custom_model_dataset module, which is used to handle training data in CSV format.

Then, initialize a cohere.Client object by passing an API key as a parameter.

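# Install the SDK (notebook syntax; in a terminal, drop the leading "!")
!pip install -q cohere

import cohere
from cohere.custom_model_dataset import CsvDataset

# Replace the placeholder with your own Cohere API key
co = cohere.Client("Your API KEY")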

2. Creating a Custom Model

Now, let’s create a CsvDataset object, specifying the training file path (train_file) and delimiter (delimiter) used in the CSV file.

Then, call the create_custom_model method to start creating the custom model. The method takes a custom model name of your choice (here, "prompt-completion-ft"), the dataset object (dataset), and the model type (GENERATIVE) as parameters.
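
A minimal sketch of this step, under a few assumptions: the helper names write_training_csv and create_generative_model, the file name, and the sample rows are all hypothetical, and the two-column layout (prompt, then completion, with no header row) is an assumed format for generative training data rather than something stated in this post. The CsvDataset arguments and the create_custom_model parameters are the ones described above.

```python
import csv
import tempfile
from pathlib import Path


def write_training_csv(path, examples, delimiter=","):
    """Write (prompt, completion) pairs to a CSV file, one pair per row."""
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter=delimiter)
        writer.writerows(examples)
    return path


def create_generative_model(co, train_file, name="prompt-completion-ft"):
    """Start custom model creation; `co` is an initialized cohere.Client."""
    # Imported lazily so the CSV helper above works without the SDK installed.
    from cohere.custom_model_dataset import CsvDataset

    dataset = CsvDataset(train_file=str(train_file), delimiter=",")
    return co.create_custom_model(name, model_type="GENERATIVE", dataset=dataset)


# Hypothetical training examples, written to a temporary location.
examples = [
    ("Write a tagline for a coffee shop.", "Fresh beans, warm mornings."),
    ("Write a tagline for a bookstore.", "Stories for every shelf."),
]
train_file = write_training_csv(
    Path(tempfile.gettempdir()) / "prompt_completion_dataset.csv", examples
)

# With a valid API key and the `co` client from the setup step:
# custom_model = create_generative_model(co, train_file)
```

The call that contacts the Cohere platform is left commented out so the dataset preparation can be run and checked on its own first.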

3. Viewing the Custom Model Status

The status of the custom model creation can be viewed via the get_custom_model_by_name method, passing in the custom model name we specified earlier.

co.get_custom_model_by_name("prompt-completion-ft")

The response is a cohere.CustomModel object. It contains several values that provide information about the custom model. Here are some of the key ones:

  • id: A unique identifier (UUID) assigned to the custom model.
  • name: The name of the custom model.
  • status: The current status of the custom model. The list of possible statuses can be found in the documentation, but two worth mentioning here are QUEUED, indicating that the model is in the queue for processing, and READY, indicating that the model is ready for use.
  • model_type: The custom model type, which is GENERATIVE in this case.
  • model_id: The ID to use when calling the custom model through the API.

cohere.CustomModel {
 	id: 1bc210c6-b2fd-4ffa-9a1e-9d07f0eca0ca
 	name: prompt-completion-ft
 	status: READY
 	model_type: GENERATIVE
 	created_at: 2023-07-23 19:42:57.742266+00:00
 	completed_at: None
 	model_id: 2dde35d0-bc35-4976-a98b-03d0d622f3f5-ft
 	hyperparameters: HyperParameters(early_stopping_patience=6, early_stopping_threshold=0.01, train_batch_size=16, train_steps=2500, learning_rate=0.01)
 	_wait_fn: <bound method Client.wait_for_custom_model of <cohere.client.Client object at 0x7eb46c283550>>
 },

An alternative for getting the status of the custom model is via the get_custom_model method, which takes as input the id value we saw in the response above.

co.get_custom_model("1bc210c6-b2fd-4ffa-9a1e-9d07f0eca0ca")
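
Because a new custom model does not reach READY immediately, it can be handy to poll one of these methods until it does. The helper below is a hypothetical sketch, not an SDK function: it assumes the returned object exposes a status attribute that compares equal to the string "READY", and the polling interval and timeout are arbitrary choices. The example response above also hints at a built-in wait helper (wait_for_custom_model), whose signature is not covered here.

```python
import time
from types import SimpleNamespace


def wait_until_ready(fetch_model, poll_seconds=30.0, timeout_seconds=3600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call fetch_model() repeatedly until its .status is "READY" or time runs out."""
    deadline = clock() + timeout_seconds
    while True:
        model = fetch_model()
        if model.status == "READY":
            return model
        if clock() >= deadline:
            raise TimeoutError(
                f"model still {model.status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)


# Offline demonstration with a stand-in that reports QUEUED twice, then READY.
statuses = iter(["QUEUED", "QUEUED", "READY"])
model = wait_until_ready(
    lambda: SimpleNamespace(status=next(statuses)),
    poll_seconds=0,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(model.status)  # READY
```

With a real client, the same helper could be called as wait_until_ready(lambda: co.get_custom_model_by_name("prompt-completion-ft")).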

4. Using the Custom Model

Once the custom model’s status is READY, we can start using the model in an endpoint call. Using a custom model is as simple as replacing the default model name (for example, command) with the model_id of the custom model. Here is an example with the Generate endpoint.

co.generate(
  model='2dde35d0-bc35-4976-a98b-03d0d622f3f5-ft',
  prompt="YOUR_PROMPT_HERE")
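
To avoid calling a model that is not ready yet, the call can be wrapped in a small guard. This is a sketch only: generate_with_custom_model is a hypothetical helper, it assumes the custom model object exposes the status and model_id values described earlier, and FakeClient is a stand-in that lets the example run without an API key.

```python
from types import SimpleNamespace


def generate_with_custom_model(co, custom_model, prompt):
    """Call the Generate endpoint only if the custom model is READY."""
    if custom_model.status != "READY":
        raise RuntimeError(
            f"custom model is {custom_model.status!r}, not 'READY'"
        )
    return co.generate(model=custom_model.model_id, prompt=prompt)


class FakeClient:
    """Stand-in for cohere.Client so the example runs offline."""

    def generate(self, model, prompt):
        # Echo the arguments instead of contacting the API.
        return {"model": model, "prompt": prompt}


ready_model = SimpleNamespace(
    status="READY",
    model_id="2dde35d0-bc35-4976-a98b-03d0d622f3f5-ft",
)
response = generate_with_custom_model(FakeClient(), ready_model, "YOUR_PROMPT_HERE")
print(response["model"])
```

With the real SDK, pass the co client and the object returned by get_custom_model_by_name in place of the stand-ins.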

Conclusion

The Python SDK for creating custom models brings a number of benefits, such as automation, version control, reproducibility, and seamless integration with existing Python-based workflows. This empowers developers to leverage Cohere's cutting-edge LLMs more efficiently and flexibly, opening up new possibilities for tailored language models that cater to diverse use cases.
