Context by Cohere
Large Language Models and Where to Use Them: Part 2

Over the past few years, large language models (LLMs) have evolved from emerging to mainstream technology. In this blog post, we'll explore some of the most common natural language processing (NLP) use cases that they can address. This is part two of a two-part series.

You can find Part 1 here.

It can be overwhelming for someone new to LLMs to know when and where to apply them in NLP use cases. In this blog series, we simplify things by mapping out seven broad categories of use cases that you can address with Cohere’s LLM.

In Part 1 of our series, we covered the first four use case categories: Generate, Summarize, Rewrite, and Extract. In this post, we will cover the other three: Search, Cluster, and Classify. Finally, we’ll look at how we can combine the different types, making their applications much more interesting and useful.

5. Search/Similarity


Any mention of LLMs will most likely spark discussion around their text generation capabilities, as we’ve seen in the previous four use cases. A less-discussed but equally powerful capability is text representation.

While text generation is about creating new text, text representation is about making sense of existing text. Think about the amount of unstructured text data being generated today, a volume that keeps growing as the internet becomes ever more ubiquitous. It would not be possible for humans to process this much information without NLP-powered automation.

One such use case category for text representation is similarity search. Given a text query, the goal is to find documents that are most similar to the query.

The most obvious example use case for this is search engines. As users, we expect the search results to return links and documents that are highly relevant to our query. What makes modern search engines work very well is their ability to match the query to the appropriate results not just via keyword-matching, but by semantic similarity.

Put simply, they can match on meaning, context, themes, and ideas: abstract concepts that may be expressed in entirely different words yet are closely related.

Let’s say a user enters the search string “ground transportation at the airport.” The search engine must be able to know that the user is looking for taxis, car rentals, trains, or other similar services, even if the user doesn’t explicitly mention them.

When we input a piece of text into a representation model, instead of generating more text, the model generates a set of numbers that represent the meaning or context of the input text. These numbers are called “text embeddings”. In LLMs, they tend to be long sequences of numbers, often hundreds or thousands of values, and longer embeddings generally have room to capture more information about the text.
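To make the idea concrete, here is a minimal sketch of similarity search over embeddings. It assumes the embeddings have already been produced by some representation model; the three-dimensional vectors, the FAQ texts, and the helper names `cosine_similarity` and `search` are all made up for illustration, and cosine similarity is just one common way to compare embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, doc_vecs, top_k=2):
    """Return (document, score) pairs for the top_k most similar documents."""
    scored = [(doc, cosine_similarity(query_vec, vec)) for doc, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy embeddings, invented for this sketch.
# Real embeddings would come from a model and be far longer.
faq_embeddings = {
    "Where can I rent a car at the airport?": [0.9, 0.1, 0.0],
    "Is there a train from the terminal to downtown?": [0.8, 0.3, 0.1],
    "What time does the food court open?": [0.1, 0.9, 0.2],
}
# Stands in for the embedding of "ground transportation at the airport".
query_embedding = [0.85, 0.2, 0.05]

results = search(query_embedding, faq_embeddings)
for doc, score in results:
    print(f"{score:.3f}  {doc}")
```

With these toy vectors, the car rental and train questions rank above the food court question, even though none of them contains the words "ground transportation".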

With Cohere, you can access this type of model via the Embed endpoint. This Python notebook provides an example of a semantic search application, where given a question, the search engine would return other frequently asked questions (FAQ) whose text embeddings are the most similar to the question.

It goes on to show all the questions on a two-dimensional plot, shown in the image below, where the closer two points are on the plot, the more semantically similar they are.

Two examples of similar questions about sharks and Boxing Day

This concept can be applied to a much broader range of use cases, for example:

  • Retrieval of related and useful documents within an organization
  • Similar product recommendations
  • eCommerce product search
  • Next article recommendations based on reading history
  • Selecting chatbot responses from an available list

6. Cluster


Clustering is another use case category that leverages text embeddings. The idea is to take a group of documents and make sense of how they are organized and how they are related to each other.

In the previous use case, we visualized a set of documents on a plot to get a sense of how similar, or different, they are from each other. Clustering uses the same principles but adds another step: organizing the documents into groups. This can be done via clustering algorithms such as k-means, where we specify the number of clusters and the algorithm assigns each document to one of them.
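As a rough illustration of what k-means does, here is a small pure-Python sketch run on hypothetical two-dimensional "embeddings". The data and function names are invented for this example. Starting the centroids at the first k points is a simplification to keep the sketch deterministic; in practice you would more likely use a library implementation with better initialization.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(coords) / len(points) for coords in zip(*points)]

def kmeans(points, k, iterations=10):
    """Plain k-means with centroids initialised to the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids

# Hypothetical toy embeddings forming two obvious groups.
embeddings = [
    [0.1, 0.2], [5.0, 5.1], [0.2, 0.1], [5.2, 4.9], [0.0, 0.3], [4.8, 5.0],
]
labels, centroids = kmeans(embeddings, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

The returned labels are arbitrary cluster numbers, not topic names. Naming each group, for example by extracting keywords, is a separate step.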

This Python notebook, also leveraging the Embed endpoint, goes into detail about how to make sense of three thousand “Ask HN” (Hacker News) posts. First, the text embeddings for each are generated. This is followed by clustering them into smaller groups by the theme or topic of the posts, supplemented by the keywords that represent the topic of each group.

Finally, these posts are visualized on a plot, shown in the image below, where one color represents a topic cluster. Below you can see a few topics emerging, such as life, career, coding, startups, and computer science.

Eight clusters from the top 3,000 Ask HN posts, with each set of keywords representing a topic

This technique can be applied to a number of different tasks, such as:

  • Organizing customer feedback and requests into topics
  • Segmenting products into categories based on product descriptions
  • Turning ESG reports and news into themes
  • Organizing a huge corpus of company documents
  • Discovering emerging themes in survey responses analysis

7. Classify


Last but not least is text classification, probably the most widely applicable use of NLP today. You can think of it as similar to clustering, with a slight twist.

Clustering is an “unsupervised learning” task. That’s because we don’t know what the clusters are beforehand. We choose a number of clusters (any number we like), and the algorithm groups the documents we give it into that many clusters.

On the other hand, classification is a “supervised learning” task, because this time we already know what those clusters, or more precisely classes, are.

For example, say we have a list of eCommerce customer inquiries, and for routing purposes, we would like to categorize each of them into one of three classes: Shipping, Returns, and Tracking. To make the classifier work, we first need to train it by showing it enough examples of a piece of text, such as “Do you offer same day shipping?”, and its actual class, which in this case is Shipping.

With LLMs, there are a couple of possible approaches to doing this. The first is via text embeddings, demonstrated in this Python notebook, which shows an example of training a classifier on text embeddings. First, it generates the embeddings of each piece of text. Next, it uses these embeddings as the input for training the classifier. For this kind of setup, the number of training examples required depends on the task, but it typically ranges from hundreds to thousands.
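The embed-then-train flow can be sketched with a deliberately simple classifier. The example below uses a nearest-centroid classifier, chosen here only because it fits in a few lines; it is not necessarily what the notebook uses. The embeddings are hypothetical three-dimensional vectors standing in for real model output, and `train_centroids` and `predict` are illustrative names.

```python
from collections import defaultdict

def train_centroids(examples):
    """examples: list of (embedding, label). Returns {label: mean embedding}."""
    grouped = defaultdict(list)
    for vec, label in examples:
        grouped[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def predict(centroids, vec):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(label):
        return sum((x - y) ** 2 for x, y in zip(vec, centroids[label]))
    return min(centroids, key=dist)

# Hypothetical embeddings of customer inquiries, paired with their classes.
training = [
    ([0.9, 0.1, 0.0], "Shipping"),
    ([0.8, 0.2, 0.1], "Shipping"),
    ([0.1, 0.9, 0.1], "Returns"),
    ([0.2, 0.8, 0.0], "Returns"),
    ([0.0, 0.1, 0.9], "Tracking"),
    ([0.1, 0.2, 0.8], "Tracking"),
]
model = train_centroids(training)
prediction = predict(model, [0.85, 0.15, 0.05])
print(prediction)  # Shipping
```

A real setup would have many more examples per class and would usually use a stronger classifier, but the shape is the same: text goes in, embeddings come out, and the classifier only ever sees the embeddings.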

The other approach is “few-shot” classification. Here, we use prompt engineering to provide classification examples to the model. This has been shown to work well with as few as five training examples per class, though it still depends on the kind of task we are working on. This option allows us to build a working classifier when we don’t have many training examples, an all-too-common problem.
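One way to picture a few-shot prompt is as a list of labelled examples followed by the unlabelled input. The sketch below only assembles such a prompt as a string and does not call any model. The "Inquiry:" and "Class:" layout, the helper name, and every example except the same-day shipping question are assumptions made for illustration, and a real prompt would carry around five examples per class rather than one.

```python
def build_few_shot_prompt(examples, query):
    """Format labelled examples, then the query with its class left blank."""
    blocks = [f"Inquiry: {text}\nClass: {label}" for text, label in examples]
    blocks.append(f"Inquiry: {query}\nClass:")
    return "\n\n".join(blocks)

# One example per class to keep the sketch short.
examples = [
    ("Do you offer same day shipping?", "Shipping"),
    ("How do I send an item back?", "Returns"),
    ("Where is my package right now?", "Tracking"),
]
prompt = build_few_shot_prompt(examples, "Can I return shoes I have already worn?")
print(prompt)
```

The model is then expected to continue the text after the final "Class:" with one of the class names it has seen in the examples.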

Here’s how we would build the eCommerce inquiries classifier with a few-shot approach. The following is a screenshot from the Cohere Playground, where we leverage the Classify endpoint to build a classifier.

First, we prepare the prompt containing examples of text-class pairs. With a minimum of five examples per class, and three classes, we give it a total of fifteen examples.

The list of examples used to build the classifier

Next, we add any number of inputs that we would like to classify — here we have two inputs as examples.

The list of inputs for the classifier to classify

We can then trigger the classification, in which the model will output the predicted class for each input and the accompanying confidence level values, which indicate how confident the model is in its prediction of each class.

The predictions given by the classifier together with the confidence levels

You can test it out by accessing the saved preset.

Some example areas where text classification can be useful include:

  • Content moderation for toxic comments on online platforms
  • Intent classification in chatbots
  • Sentiment analysis on social media activity
  • eCommerce product categorization
  • Assigning customer support tickets to the right teams

Getting the best out of the Cohere API

Now that we’ve covered the seven main use case categories for LLMs, let’s consider how we can build really interesting applications — by stacking these different capabilities together. Let’s look at a few examples and start with a fun one.

Imagine that you are creating a chatbot that needs to have a certain voice or style. In our case, that bot happens to be a pirate!

Let’s make it a game where people can enter a phrase, and the bot will decide whether the phrase is “pirate” enough. And if it’s not, the bot will even correct the phrase and turn it into pirate lingo!

This is something our team has experimented with. Without going into the implementation details, here is how it works: we first classify whether or not a phrase is acceptable pirate speak. If it’s not, we put the phrase through a pirate paraphraser. We then compare the generated phrase with the original, and the bot returns the new phrase only if the two are similar enough.

To make this happen, we made use of three use case categories: Classify, Rewrite, and Search/Similarity.
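The control flow of such a bot can be sketched as follows. This is not the team's implementation: each of the three steps is replaced by a trivial stand-in (a keyword check for Classify, a word lookup table for Rewrite, and word overlap for Similarity) so that the example runs on its own. The word lists, the threshold value, and all function names are hypothetical.

```python
PIRATE_WORDS = {"ahoy", "matey", "arr", "ye", "aye"}
SUBSTITUTIONS = {"hello": "ahoy", "friend": "matey", "you": "ye", "yes": "aye"}

def is_pirate(phrase):
    """Classify step (stub): pirate speak if it contains a known pirate word."""
    return any(word in PIRATE_WORDS for word in phrase.lower().split())

def to_pirate(phrase):
    """Rewrite step (stub): swap words using a small lookup table."""
    return " ".join(SUBSTITUTIONS.get(w, w) for w in phrase.lower().split())

def similarity(a, b):
    """Similarity step (stub): Jaccard overlap of the words in both phrases."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0

def pirate_bot(phrase, threshold=0.3):
    """Return the phrase, a pirate rewrite of it, or None if the rewrite drifts."""
    if is_pirate(phrase):
        return phrase  # already pirate enough
    candidate = to_pirate(phrase)
    if similarity(phrase, candidate) >= threshold:
        return candidate
    return None  # rewrite is too far from the original, so offer no correction

print(pirate_bot("hello friend how are you today"))
# ahoy matey how are ye today
```

In a real system, each stub would be replaced by a model call, and the similarity check would compare embeddings rather than raw words. The point of the final check is the same either way: it guards against a rewrite that sounds like a pirate but no longer says what the user said.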

A summary flow of the pirate paraphraser

A more serious example would be a chatbot that answers questions on a forum. Here’s one possible basic implementation. First, we implement a classification step to determine if a user has entered a question or just a general comment or chat. And if it’s a question, then we proceed to search for the question from our database that is the most similar to the query, so we can provide a relevant answer.

A summary flow of the question answering chatbot

In another example, let’s say we are building an article recommendation system, where the goal is to provide a list of other articles most relevant to the one that a user is currently reading. This article demonstrates an example of implementing similarity search, classification, and extraction in a basic recommender.

A summary flow of the article recommender

We can take it even further by combining these steps with other APIs. In a recent blog post, we describe how to build complete, fully playable Magic: The Gathering cards with AI, combining the capabilities of Cohere’s API with a text-to-image generation API.

A summary flow of the Magic: The Gathering card generator


With these examples, we are only just scratching the surface. The possibilities of using LLMs are limited only by our imagination. This is an exciting time where any developer and team, not just the big players anymore, can tackle some of the toughest NLP challenges by leveraging cutting-edge AI technologies that are made available via simple API calls.
