Introducing Aya
I am proud to be one of 3,000 humans who built Aya, a new massively multilingual generative LLM that outperforms existing open-source models and covers 101 different languages. Check out Aya here, including the dataset, the model, two new papers on arXiv, and a nice documentary on the entire process.
https://cohere.com/research/aya
The paper here details exactly how we put together the dataset and relied on communities of speakers of 101 different languages around the world. Submitted to the Association for Computational Linguistics, 2024.
Are Emergent Abilities of Large Language Models a Mirage?
At NeurIPS in December I met Rylan Schaeffer from Stanford, author (with Brando Miranda and Sanmi Koyejo) of this fascinating paper about the benchmarks used to measure the capabilities of LLMs. He found that many of the most common benchmarks use non-linear or discontinuous metrics to measure capabilities that should really be measured with linear metrics. The non-linear metrics show sudden jumps in ability as models get bigger: the so-called emergent abilities. But if you change the metric so it's linear, models show steady, predictable progress as they get bigger. Nothing magical about it.
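To make the argument concrete, here is a minimal sketch (all numbers are invented for illustration; they are not taken from the paper): assume per-token accuracy improves smoothly with model scale. A nonlinear metric like exact match over a 30-token answer, which requires every token to be correct, turns that same smooth curve into what looks like a sudden jump.

```python
import numpy as np

# Toy illustration of the paper's core argument. All numbers here are
# invented for demonstration purposes; they are not from the paper.

params = np.logspace(7, 11, 9)  # hypothetical model sizes: 10M to 100B

# Assume per-token accuracy climbs smoothly (linearly in log-scale),
# from 0.50 at 10M parameters to 0.95 at 100B parameters.
per_token_acc = 0.50 + 0.45 * (np.log10(params) - 7) / 4

# Exact match over a 30-token answer requires every token to be right,
# so the score is per_token_acc ** 30: a nonlinear transform that stays
# near zero for small models, then shoots upward, an apparent "emergence".
exact_match = per_token_acc ** 30

for n, tok, em in zip(params, per_token_acc, exact_match):
    print(f"{float(n):>15,.0f} params | per-token {tok:.2f} | exact-match {em:.4f}")
```

Plotting exact_match against params reproduces the sharp "emergence" curve, even though the underlying per-token ability never jumps.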
Click here for a reprint of an article I wrote for American Scientist, March-April 2024, Vol. 112.
Combat by Algorithm: Trials of Strength in Artificial Intelligence Research
My latest for Anthropology News covers academic feuds, the emerging ‘voice’ of ChatGPT, and ensuring equal access to multilingual datasets.
Large Language Model Reading Group
Elena Yunusov and I are excited to announce the Fall series of our reading group, with scholars from OpenAI, the University of Oxford, Cohere, Hugging Face, Stanford University, and others, as we delve into pivotal papers in NLP and large language models.
Series Kick-off: September 27, 2023 at 3 PM (EST)
Opening Paper: “Training language models to follow instructions with human feedback”
First Speaker: Long Ouyang, OpenAI
Join Us on Discord: https://lnkd.in/dgjcbZrn
Upcoming Sessions:
Training language models to follow instructions with human feedback with Long Ouyang, OpenAI, on Sept. 27 at 3 PM (EST).
The Curse of Recursion: Training On Generated Data Makes Models Forget with Ilia Shumailov, University of Oxford, on Oct. 11 at 12 PM (EST).
Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning with Lianhui Qin, University of California San Diego, on Oct. 25 at 12 PM (EST).
Theory of Mind May Have Spontaneously Emerged in Large Language Models with Michal Kosinski, Stanford University, on Nov. 8 at 12 PM (EST).
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks with Patrick Lewis, Cohere, on Nov. 22 at 12 PM (EST).
Evaluating the Social Impact of Generative AI Systems in Systems and Society with Irene Solaiman and Zeerak Talat, Hugging Face, on Dec. 6 at 12 PM (EST).
Mark your calendars and join us for a fun, in-depth exploration of large language models and their expanding role in technology and society.
Shifting Voices in Dungeons & Dragons
Babel: The Language Magazine is a fantastic magazine out of the UK for “lovers of language and linguistics.” I wrote this feature about the phenomenon of “voicing” in D&D with Adam Axbey, a game designer (Ubisoft) and veteran Dungeon Master.
Speakers use a wide range of techniques to establish different voices for different personas in everyday conversation. They can use features of speech such as pitch, intonation, accent, and rhythm to invoke stock characters like ‘the valley girl’ or ‘the nerd’, or to parody the speech of a specific politician or celebrity. Shifts in voice can also be accomplished through the selection of specific words or syntax. My vocabulary as a father is very different to my vocabulary as a graduate student.
[snip]
Click here for a preprint, or head to Babel for full access to the magazine.