After an incredible year for Cohere For AI (C4AI), we are taking this opportunity to look back at and celebrate all that we accomplished in 2023! C4AI launched in June 2022 as a non-profit research lab dedicated to contributing fundamental research in machine learning.
This year, our lab published over 30 papers on a range of cutting-edge ML research topics, including efficiency at scale [1,2,3], safety [4,5,6], generalization and evaluation [7,8] and policy [9,10,11]. This research is the result of collaborations across 40+ institutions and organizations, spanning academia, industry and civil society. Cohere For AI and Cohere’s technical staff had over 20 main-track papers accepted this year at major ML conferences, including ICLR, ACL, EMNLP and NeurIPS. An up-to-date list of C4AI’s research publications is available here.
Cohere For AI Scholars Program
In January, Cohere For AI welcomed the inaugural cohort of its Scholars Program. We launched the Cohere For AI Scholars Program to give rising research talent around the world the opportunity to work alongside some of the world's best AI researchers and engineers. In our inaugural year, we welcomed talented researchers based around the globe, including in Brazil, Nigeria, Germany, Canada and the United States. We are thrilled to share their work on open-ended questions, such as Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models, Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning and Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation.
In January 2024, we will be welcoming our second cohort of research scholars!
Growing our Open Science Community
Our open science community is a space where researchers, engineers, linguists, social scientists and lifelong learners connect and collaborate with each other, from all over the world. By the end of 2023, our open science community involved 2,430 researchers from 119 countries.
In 2023, our open science community-led programs hosted over 200 events, including talks by more than 35 guest speakers who presented their research findings. This year, members of our open science community also published over 75 papers, many of which were presented at conferences such as ICML, EMNLP and NeurIPS.
Aya: An Open Science Initiative to Accelerate Multilingual AI Progress
In January 2023, we launched the largest open science ML research project to date: Aya. Together with 3,000+ independent researchers spanning 119 countries, we are building a state-of-the-art multilingual generative language model that harnesses the collective wisdom and contributions of people from all over the world. The Aya dataset and model support 101 languages and will be released as open source in early 2024.
Exploring the unknown, together.
This year, we enjoyed connecting with the research community in many ways. In June, C4AI celebrated its one-year anniversary at the Cohere headquarters in Toronto. This was a great opportunity to network with our community members and to highlight some of our research.
In 2023, we attended conferences on five continents to discuss the future of machine learning research and support rising stars in research around the world. In March, our team headed to Uruguay for Khipu 2023, where we presented two posters highlighting the latest Cohere For AI and Cohere research. Luiza Pozzobon presented our work with Beyza Ermis, Patrick Lewis, and Sara Hooker on the hazards of using black box APIs to evaluate toxicity. David Cairuz and Luísa Moura co-presented a poster on the research behind Cohere’s summarization and conversational AI efforts.
The C4AI team traveled to Kigali for the ICLR 2023 conference in May, where we showcased work such as “Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics” by Shoaib Ahmed Siddiqui, Nitarshan Rajkumar, Tegan Maharaj, David Krueger and Sara Hooker.
In August, our team visited Bangkok, Thailand for MLRS, an event that aims to give Thai students and researchers a basic understanding of ML and to promote this research area in Thailand. One of our goals is to build open collaborations across continents to further breakthrough research in ML.
AI Policy and Safety
Cohere For AI continues to share research insights and offer technical expertise surrounding effective governance of AI, and strategies to minimize harm through development and deployment. In April 2023, C4AI attended the World Economic Forum’s AI Governance Summit in San Francisco. We contributed to safety and policy research such as Evaluating the Social Impact of Generative AI Systems in Systems and Society, Goodtriever: Adaptive Toxicity Mitigation with Retrieval-augmented Models and FAIR-Ensemble: When Fairness Naturally Emerges From Deep Ensembling. Our team met with leading experts to discuss the economic and societal implications of Generative AI, and communicated ways to address these challenges. In November, our team headed to London for the UK AI Safety Summit. The summit brought together international governments, leading AI companies, civil society groups and experts in research to discuss how the risks of AI can be mitigated through international coordination.
Research Grant Program
In alignment with our mission to drive meaningful progress in machine learning research through open collaboration, and to empower different perspectives to ensure responsible innovation, we launched the Cohere For AI Research Grant Program. These research grants are designed to support academic partners who are conducting research with the goal of releasing a peer-reviewed scientific artifact.
Since this program launched in July, we have awarded over 20 research grants, providing academic partners, developers, researchers, and other members of our community with subsidized access to the Cohere API. We are proud to support initiatives focused on enhancing natural language understanding for underrepresented languages, biomedical knowledge integration in language models, implicature understanding and more.
Looking Forward to 2024: Join C4AI
While we are extremely proud of all that we accomplished in 2023, there are still many things we hope to achieve. As we enter 2024, our goal is to continue to show that top-tier research can be done while changing where, how and by whom research is done. We will also remain fiercely dedicated to contributing fundamental research in machine learning and breakthroughs at the frontier. We are truly grateful for your support, your attendance at our events, and your shared enthusiasm for exploring the unknown. We look forward to the year to come!