In this chapter, you’ll learn about connectors and how to build RAG applications using the web search connector.

We’ll use Cohere’s Python SDK for the code examples. Follow along in this notebook.

Contents

What Are Connectors?
Step-by-Step Guide
Setup
Create the Chatbot Component
Run the Chatbot
Conclusion

In the previous chapter, we built the chatbot using the Chat endpoint’s document mode. Document mode provides developers with the flexibility to customize each component of a RAG stack.

There is another way to build RAG systems with the Chat endpoint: connector mode. Connector mode simplifies the development of RAG systems by abstracting away some of the complexities.

We’ll explore connectors over the next three chapters, starting in this one with the web search connector.

What Are Connectors?

Connectors are independent REST APIs that can be used in a RAG workflow to provide secure, real-time access to private data.

In enterprises, data lives in many different places. The ability of enterprises to realize the full value of RAG rests on their ability to bring these data sources together. Cohere’s build-your-own connectors framework enables developers to develop a connector to any datastore that offers an accompanying search API.

Cohere’s connectors framework simplifies connecting RAG systems to datastores

At a high level, here’s what connectors do. When the Chat endpoint calls a connector, it sends a query to that connector’s search endpoint. The connector then returns the list of documents it deems most relevant to the query.

The build-your-own connectors framework allows developers to build any logic behind a connector. For example, you can define the retrieval implementation—whether it’s running a semantic similarity search over a vector database, searching over an existing full-text search engine, or utilizing the existing search APIs of platforms like Google Drive or Notion.
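The query-in, documents-out exchange described above can be sketched as a plain Python function, without the REST layer a deployed connector would have. This is only an illustration under stated assumptions: the request and response shapes (a `query` string in, a `results` list out), the in-memory `DATASTORE`, and the word-overlap scoring are all hypothetical stand-ins for a real search backend.

```python
from typing import Dict, List

# Hypothetical in-memory datastore standing in for a real search backend
DATASTORE: List[Dict[str, str]] = [
    {"id": "doc_0", "title": "Onboarding guide", "text": "How to set up your laptop and accounts"},
    {"id": "doc_1", "title": "Expense policy", "text": "How to submit travel expenses for approval"},
    {"id": "doc_2", "title": "Holiday calendar", "text": "Company holidays and office closures"},
]


def search(request: Dict[str, str], top_k: int = 2) -> Dict[str, List[Dict[str, str]]]:
    """Return the documents that share the most words with the query."""
    query_words = set(request["query"].lower().split())
    scored = []
    for doc in DATASTORE:
        doc_words = set((doc["title"] + " " + doc["text"]).lower().split())
        overlap = len(query_words & doc_words)
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest overlap first; the sort is stable, so ties keep datastore order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return {"results": [doc for _, doc in scored[:top_k]]}


response = search({"query": "submit travel expenses"})
print([doc["id"] for doc in response["results"]])  # prints ['doc_1']
```

Swapping the body of `search` for a vector database lookup or a call to an existing search API would leave the calling side unchanged, which is the point of putting the retrieval logic behind a connector.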

Additionally, in connector mode, most of the RAG building blocks are taken care of by the endpoint. This includes deciding whether to retrieve information, generating queries, retrieving documents, chunking and reranking documents (post-retrieval), and generating the response.

Recall that in the previous chapter (document mode), we implemented the following steps.

Step 1: Get the user message
Step 2: Call the Chat endpoint in query-generation mode
If at least one query is generated:
- Step 3: Retrieve and rerank relevant documents
- Step 4: Call the Chat endpoint in document mode to generate a grounded response with citations
If no query is generated:
- Step 4: Call the Chat endpoint in normal mode to generate a direct response

In connector mode, this is simplified to the following two steps.

Step 1: Get the user message
Step 2: Call the Chat endpoint in connector mode to generate a response (this can be either a grounded response with citations or a direct response)
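
The contrast between the two flows can be sketched with stand-in functions. Everything below is hypothetical: `generate_queries`, `retrieve_and_rerank`, `generate`, and `chat_with_connectors` are stubs that only mimic the role each step plays and are not Cohere SDK calls.

```python
from typing import List


# Hypothetical stubs standing in for the real endpoint calls and retrieval code
def generate_queries(message: str) -> List[str]:
    # Pretend the model only wants to search when the message is a question
    return [message] if message.endswith("?") else []


def retrieve_and_rerank(queries: List[str]) -> List[str]:
    return [f"document about: {query}" for query in queries]


def generate(message: str, documents: List[str]) -> str:
    mode = "grounded" if documents else "direct"
    return f"{mode} response to '{message}'"


def document_mode_turn(message: str) -> str:
    """Document mode: the application orchestrates every step."""
    queries = generate_queries(message)           # Step 2
    if queries:
        documents = retrieve_and_rerank(queries)  # Step 3
        return generate(message, documents)       # Step 4 (grounded)
    return generate(message, [])                  # Step 4 (direct)


def chat_with_connectors(message: str) -> str:
    # Stub for the endpoint, which handles the same steps on its side
    return document_mode_turn(message)


def connector_mode_turn(message: str) -> str:
    """Connector mode: a single call replaces the orchestration above."""
    return chat_with_connectors(message)          # Step 2


print(connector_mode_turn("What is RAG?"))  # prints grounded response to 'What is RAG?'
print(connector_mode_turn("Hello"))         # prints direct response to 'Hello'
```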

Step-by-Step Guide

Below is a diagram that provides an overview of what we’ll build. We’ll build a RAG chatbot that can search the web, retrieve relevant results to a user query, and generate grounded responses to the query.

Setup

First, let’s install and import the cohere library, and then create a Cohere client using an API key.

pip install cohere

import uuid
import cohere
from cohere import ChatConnector
from typing import List

co = cohere.Client("COHERE_API_KEY")
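
Passing the key as a string literal works for a quick test. As an optional alternative, the key can be read from an environment variable so that it never appears in the notebook. The helper name `load_api_key` and the variable name `COHERE_API_KEY` below are assumptions made for this sketch.

```python
import os


def load_api_key(env_var: str = "COHERE_API_KEY") -> str:
    """Read the API key from an environment variable, failing early if it is missing."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the chatbot.")
    return api_key
```

With this helper in place, the client could be created with `co = cohere.Client(load_api_key())`.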

Create the Chatbot Component

Moving from document mode to connector mode requires just one change to the Chat endpoint call: replacing the documents parameter with the connectors parameter.

Here’s how it looks with the web search connector. We supply the connector ID, web-search, as an argument to the connectors parameter.

response = co.chat_stream(message="What is LLM university",
                          connectors=[ChatConnector(id="web-search")])

The one line of code above is enough to get a full RAG-enabled response—the response text, the citations, and the source documents, which in this case are snippets from the most relevant information available on the web based on a given user message.

But in order to run this in a multi-turn chatbot scenario, we need to build the chatbot component. The good news is that we can adapt the chatbot we built in the previous chapter.

There are a few changes to make, including:

Remove the query generation logic (done by the endpoint)
Remove the retrieval logic (done by the endpoint)
Change the Chatbot initialization to use connectors instead
Use the connectors parameter instead of documents in the Chat endpoint call

class Chatbot:
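    def __init__(self, connectors: List[str]):
        """
        Initializes an instance of the Chatbot class.

        Parameters:
        connectors (List[str]): IDs of the connectors to use, for example "web-search".

        """
        # A unique conversation ID lets the endpoint keep track of the chat history
        self.conversation_id = str(uuid.uuid4())
        self.connectors = [ChatConnector(id=connector) for connector in connectors]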

    def run(self):
        """
        Runs the chatbot application.

        """
        while True:
            # Get the user message
            message = input("User: ")

            # Typing "quit" ends the conversation
            if message.lower() == "quit":
                print("Ending chat.")
                break
            else:                          # If using Google Colab, remove this line to avoid printing the same thing twice
                print(f"User: {message}")  # If using Google Colab, remove this line to avoid printing the same thing twice

            # Generate response
            response = co.chat_stream(
                    message=message,
                    model="command-r-plus",
                    conversation_id=self.conversation_id,
                    connectors=self.connectors,
            )

            # Print the chatbot response, citations, and documents
            print("\nChatbot:")
            citations = []
            cited_documents = []

            # Display response
            for event in response:
                if event.event_type == "text-generation":
                    print(event.text, end="")
                elif event.event_type == "citation-generation":
                    citations.extend(event.citations)
                elif event.event_type == "stream-end":
                    cited_documents = event.response.documents

            # Display citations and source documents
            if citations:
                print("\n\nCITATIONS:")
                for citation in citations:
                    print(citation)

                print("\nDOCUMENTS:")
                for document in cited_documents:
                    print({'id': document['id'],
                           'snippet': document['snippet'][:400] + '...',
                           'title': document['title'],
                           'url': document['url']})

            print(f"\n{'-'*100}\n")
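
The event handling inside `run` can also be pulled into a small helper, which makes it possible to try out the logic without calling the API. The sketch below is one way to do that. The helper name `collect_stream` and the hand-written fake events are assumptions for illustration, and the events only imitate the attributes used in the class above.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def collect_stream(events: Iterable[Any]) -> Tuple[str, List[Any], List[Any]]:
    """Accumulate the text, citations, and documents from a stream of chat events."""
    text_parts: List[str] = []
    citations: List[Any] = []
    documents: List[Any] = []
    for event in events:
        if event.event_type == "text-generation":
            text_parts.append(event.text)
        elif event.event_type == "citation-generation":
            citations.extend(event.citations)
        elif event.event_type == "stream-end":
            # Defensive default in case no documents were attached to the response
            documents = event.response.documents or []
    return "".join(text_parts), citations, documents


# Hand-written fake events that imitate the attributes used in Chatbot.run
fake_events = [
    SimpleNamespace(event_type="text-generation", text="LLMU is "),
    SimpleNamespace(event_type="text-generation", text="a learning resource."),
    SimpleNamespace(
        event_type="citation-generation",
        citations=[SimpleNamespace(start=10, end=27, text="learning resource",
                                   document_ids=["web-search_0"])],
    ),
    SimpleNamespace(
        event_type="stream-end",
        response=SimpleNamespace(documents=[{"id": "web-search_0", "title": "LLM University"}]),
    ),
]

text, citations, documents = collect_stream(fake_events)
print(text)  # prints LLMU is a learning resource.
```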

Run the Chatbot

And that’s about it. We are now ready to run the chatbot.

First we define the connector to use, which is web-search. Next, we create an instance of the Chatbot class using the connector, and then we run the chatbot.

# Define the connector
connectors = ["web-search"]

# Create an instance of the Chatbot class
chatbot = Chatbot(connectors)

# Run the chatbot
chatbot.run()

We get the same type of response as in the previous chapter: the text response followed by the citations and the source documents used.

User: What is Cohere's LLM University

Chatbot:
Cohere's LLM University (LLMU) is a set of comprehensive learning resources for anyone interested in natural language processing (NLP), from beginners to advanced learners. The curriculum covers everything from the basics of LLMs to the most advanced topics, including generative AI. The course is designed to give learners a solid foundation in NLP and help them develop their own applications.

CITATIONS:
start=24 end=30 text='(LLMU)' document_ids=['web-search_0', 'web-search_1']
start=36 end=75 text='set of comprehensive learning resources' document_ids=['web-search_1']
start=101 end=134 text='natural language processing (NLP)' document_ids=['web-search_0', 'web-search_1']
start=141 end=172 text='beginners to advanced learners.' document_ids=['web-search_0', 'web-search_1']
start=177 end=187 text='curriculum' document_ids=['web-search_0', 'web-search_1']
start=215 end=229 text='basics of LLMs' document_ids=['web-search_0', 'web-search_1']
start=237 end=283 text='most advanced topics, including generative AI.' document_ids=['web-search_1']
start=326 end=349 text='solid foundation in NLP' document_ids=['web-search_0', 'web-search_1']
start=364 end=395 text='develop their own applications.' document_ids=['web-search_0', 'web-search_1']

DOCUMENTS:
{'id': 'web-search_0', 'snippet': 'Guides and ConceptsAPI ReferenceRelease NotesApplication ExamplesLLMU\n\nCoralDashboardDocumentationPlaygroundCommunityLog In\n\nCoralDashboardDocumentationPlaygroundCommunityLog In\n\nWelcome to LLM University!\n\nWelcome to LLM University by Cohere!\n\nWe’re so happy that you’ve chosen to learn Natural Language Processing and Large Language Models with us.\n\nOur comprehensive curriculum aims to give you a ...', 'title': 'LLM University (LLMU) | Cohere', 'url': 'https://docs.cohere.com/docs/llmu'}
{'id': 'web-search_1', 'snippet': "Introducing LLM University — Your Go-To Learning Resource for NLP🎓\n\nDiscover our comprehensive NLP curriculum at LLM University. From the fundamentals of LLMs all the way to the most advanced topics, including generative AI\n\nWe're excited to announce the launch of LLM University (LLMU), a set of comprehensive learning resources for anyone interested in natural language processing (NLP), from begin...", 'title': 'Introducing LLM University — Your Go-To Learning Resource for NLP🎓', 'url': 'https://txt.cohere.com/llm-university/'}

----------------------------------------------------------------------------------------------------

Ending chat.
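
In the output above, each citation's `start` and `end` values are character offsets into the response text, so they can be used to mark which parts of the answer are grounded in which documents. The sketch below shows one possible way to do this. The `annotate` helper is hypothetical, and the citations are rebuilt by hand from the first two entries of the sample output rather than taken from a live response.

```python
from types import SimpleNamespace
from typing import Any, List


def annotate(text: str, citations: List[Any]) -> str:
    """Insert a [document_ids] marker after each cited span of the response text."""
    # Work from the last span to the first so earlier offsets stay valid
    for citation in sorted(citations, key=lambda c: c.end, reverse=True):
        marker = " [" + ", ".join(citation.document_ids) + "]"
        text = text[:citation.end] + marker + text[citation.end:]
    return text


text = "Cohere's LLM University (LLMU) is a set of comprehensive learning resources"
citations = [
    SimpleNamespace(start=24, end=30, text="(LLMU)",
                    document_ids=["web-search_0", "web-search_1"]),
    SimpleNamespace(start=36, end=75, text="set of comprehensive learning resources",
                    document_ids=["web-search_1"]),
]
print(annotate(text, citations))
```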

Conclusion

In this chapter, you learned about the concept of connectors and how to build a RAG-powered chatbot using connectors. In particular, we used the web search connector, which is a Cohere-managed connector that you can use immediately.

Continue to the next chapter to learn how to connect RAG applications to datastores by leveraging Cohere’s pre-built quickstart connectors.
