Today, we are launching three new fine-tuning capabilities:

Chat fine-tuning: For developing natural, personalized, and context-sensitive interactions that elevate customer support and virtual assistant conversations
Rerank fine-tuning: For transforming search and recommendation systems with superior accuracy to deliver more relevant results and an improved user experience
Multi-label classification fine-tuning: For streamlining text analysis with simultaneous multi-label categorization to improve efficiency and insight

These latest additions, alongside our existing generative fine-tuning solution, complete a comprehensive suite designed to cater to a diverse range of enterprise AI applications. With these new capabilities, enterprises can now customize Cohere’s generative and embedding models on their data and deliver applications that perform better for their targeted use cases across text generation, summarization, chat, classification, and enterprise search.

Easily Manage Fine-Tuned Models

Our new fine-tuning dashboard enables enterprises to easily manage and run their fine-tuning projects. Users can get started in minutes.

The dashboard includes a testing playground where users can experiment with and validate fine-tuned models, as well as a pricing calculator that helps users make informed decisions about fine-tuning costs upfront.

Users can also access the model management interface that provides real-time insights into the progress and status of ongoing fine-tuning jobs. This ensures that users have a comprehensive overview of their model development pipeline and custom evaluation metrics, empowering Cohere customers to effectively manage, monitor, and optimize their fine-tuned models.

For developers who want to run automated fine-tuning jobs, we support programmatic fine-tuning via our Python SDK.
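As a rough illustration of what an automated job needs before any SDK call is made, the sketch below prepares a small chat training file in JSON Lines form. It is not Cohere's API: the helper names (`make_chat_record`, `write_jsonl`), the `messages`/`role`/`content` field names, the role labels, and the sample conversations are all assumptions made up for this example, and the upload and training calls are deliberately left out. Check the fine-tuning guides for the exact format the SDK expects.

```python
import json
import os
import tempfile

# Assumed role labels for illustration only; the real format may differ.
ALLOWED_ROLES = {"System", "User", "Chatbot"}


def make_chat_record(turns):
    """Build one training record from a list of (role, content) pairs."""
    if not turns:
        raise ValueError("a conversation needs at least one turn")
    messages = []
    for role, content in turns:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        if not content.strip():
            raise ValueError("empty message content")
        messages.append({"role": role, "content": content})
    # A supervised chat example should end with the reply to be learned.
    if messages[-1]["role"] != "Chatbot":
        raise ValueError("the final turn should be the model's reply")
    return {"messages": messages}


def write_jsonl(records, path):
    """Write one JSON object per line and return the number of records written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


# Made-up sample conversations, not real training data.
conversations = [
    [
        ("User", "How do I estimate next quarter's revenue?"),
        ("Chatbot", "Start from this quarter's revenue and apply the expected growth rate."),
    ],
    [
        ("System", "You are a financial projections assistant."),
        ("User", "What inputs do you need for a projection?"),
        ("Chatbot", "Historical revenue, the growth assumption, and the time horizon."),
    ],
]

records = [make_chat_record(turns) for turns in conversations]
dataset_path = os.path.join(tempfile.gettempdir(), "chat_finetune_sample.jsonl")
num_written = write_jsonl(records, dataset_path)
```

Validating records locally before submitting a job is a design choice rather than a requirement: it catches malformed conversations early, before any training time is spent.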

Quality Improvements with Fine-Tuning

One of our most frequent requests from enterprise users is to develop tailored models designed specifically for their unique domain, which can meaningfully improve model performance across their targeted use cases. With this launch, enterprises can run supervised fine-tuning on various use cases in-house. Cohere provides additional fine-tuning controls with hyperparameter tuning, where customers can tune up to six hyperparameters to maximize performance from their fine-tuned model.

Let’s take a look at an example use case. Consider a financial service provider looking to offer analysts a detailed and accurate view of the current state of the business. An out-of-the-box model could provide some general answers, but it may struggle to provide precise domain-relevant responses. Financial queries tend to require extensive calculations, and responses often need to demonstrate how different financial elements are interrelated or dependent on each other, with many requiring multiple conversation turns.

To explore this use case, we compared an out-of-the-box Command Light model with a fine-tuned Command Light model trained with a financial dataset to answer questions about revenue projections.

Example Output: Financial Projections Chatbot

In our comparison, the fine-tuned model produced more natural language output and more precise responses. This is consistent with the accuracy improvements we observed as a result of fine-tuning: 60% for Cohere's Generate solution and 40% for Cohere's Chat solution.

We consistently observed such enhancements in quality and accuracy as a result of fine-tuning across the various use cases we tested, including in sectors like financial services, legal, human resources, technology, and retail.

Faster Time to Production

Another key benefit of our refined fine-tuning infrastructure is that it facilitates swift training and deployment of fine-tuned models for enterprises. Developers can now train a fine-tuned model in just 30 minutes. This is possible due to the enhancements in our fine-tuning framework and our efficient TPU allocation process, which minimizes the chances of failed jobs and eliminates prolonged wait times during training.

After training is completed, enterprises can then deploy their fine-tuned models in production in under a minute. Developers can easily manage and confidently scale their AI applications, knowing that they are supported by a system designed for stability and efficiency in high-demand environments.

Affordable Fine-Tuning Pricing

The cost efficiency of AI applications is essential to determining their scalability, particularly with respect to training and inference costs. Cohere aims to provide a range of pricing options that meet the needs of various enterprise use cases. To access a powerful LLM at a lower price point, we recommend fine-tuning a smaller generative model like Command Light, which can be optimized for a specific task while remaining cost-efficient to train and run.

For example, training a fine-tuned Command Light model on a dataset of 1M tokens over two epochs (training cycles) has an expected cost of $2 USD. Early evaluations have also shown that enterprises can expect similar or even better performance, at lower latency and cost, by fine-tuning Command Light rather than using Command out of the box. Fine-tuning Command Light also produced more consistent outcomes than Command because it required no prompt engineering.
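The arithmetic behind that example can be sketched as follows. The sketch assumes that training cost scales linearly with the number of trained tokens (dataset tokens times epochs). The rate of $1 per 1M trained tokens is inferred from the example above, not taken from a published price list, and the function name `training_cost_usd` is hypothetical. Use the pricing calculator in the dashboard for actual figures.

```python
def training_cost_usd(dataset_tokens, epochs, usd_per_million_trained_tokens):
    """Estimate training cost, assuming cost is linear in trained tokens."""
    trained_tokens = dataset_tokens * epochs
    return trained_tokens / 1_000_000 * usd_per_million_trained_tokens


# Assumed rate, inferred from the example: $2 for 1M tokens x 2 epochs
# implies $1 per 1M trained tokens. This is not an official price.
IMPLIED_RATE_USD = 1.0

cost = training_cost_usd(
    dataset_tokens=1_000_000,
    epochs=2,
    usd_per_million_trained_tokens=IMPLIED_RATE_USD,
)
print(f"${cost:.2f}")  # prints "$2.00"
```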

We are excited to announce our fine-tuning offering at the same inference costs as our base models and cannot wait for developers to elevate AI precision with fine-tuning.

Inference pricing for fine-tuned models is the same as that of base models.

Cohere customers can now access the fine-tuning dashboard and begin customizing models for the Generate, Chat, Classify, and Rerank solutions. Developers can follow our guides to learn more about how each one works.