Are Emergent Abilities of Large Language Models a Mirage?

At NeurIPS in December I met Rylan Schaeffer from Stanford, author (with Brando Miranda and Sanmi Koyejo) of this fascinating paper about the benchmarks used to measure the capabilities of LLMs. They found that many of the most common benchmarks use non-linear or discontinuous metrics to measure capabilities that should really be measured with linear metrics. The non-linear metrics show sudden jumps in ability as models get bigger, the so-called emergent abilities. But if you change the metric so it’s linear, models show steady, predictable progress as they get bigger. Nothing magical about it.

Click here for a reprint of an article I wrote for American Scientist, March-April 2024, Vol. 112.
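To make the metric argument concrete, here is a toy sketch of my own (not code from the paper, and with made-up numbers): per-token accuracy improves smoothly as models scale up, but an all-or-nothing exact-match metric over a hypothetical 10-token answer makes that same smooth progress look like a sudden jump.

```python
# Toy sketch of the metric argument (illustrative numbers only, not from the paper).
# Assumption: per-token accuracy improves smoothly as models get bigger.
answer_length = 10  # hypothetical answer counted as correct only if all 10 tokens match

# Hypothetical per-token accuracies for six models of increasing size.
per_token_accuracy = [0.30, 0.50, 0.70, 0.85, 0.95, 0.99]

print(f"{'per-token accuracy (linear)':>28}  {'exact match (nonlinear)':>24}")
for p in per_token_accuracy:
    exact_match = p ** answer_length  # every token must be right at once
    print(f"{p:>28.2f}  {exact_match:>24.4f}")
```

The linear column climbs steadily, while the nonlinear column sits near zero and then appears to leap upward at the largest models: the “emergence” lives in the metric, not the model.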

Combat by Algorithm: Trials of Strength in Artificial Intelligence Research

My latest for Anthropology News, on academic feuds, the emerging ‘voice’ of ChatGPT, and ensuring equal access to multilingual datasets:

https://www.anthropology-news.org/articles/combat-by-algorithm-trials-of-strength-in-artificial-intelligence-research/


Photo caption: Geoffrey Hinton, Godfather of AI, University of Toronto, on Centre Stage during day two of Collision 2023 at the Enercare Centre in Toronto, 28 June 2023. Photo by Ramsey Cardy/Collision via Sportsfile.

Large Language Model Reading Group

Elena Yunusov and I are excited to announce the Fall series of our reading group, with scholars from OpenAI, the University of Oxford, Cohere, Hugging Face, Stanford University and others, as we delve into pivotal papers in the field of NLP/LLMs.

🗓️ Series Kick-off: September 27, 2023 at 3 PM (EST)
📄 Opening Paper: “Training language models to follow instructions with human feedback”
🎙️ First Speaker: Long Ouyang, OpenAI
🔗 Join Us on Discord: https://lnkd.in/dgjcbZrn

🗓️ Upcoming Sessions:

✅ Training language models to follow instructions with human feedback with Long Ouyang, OpenAI, on Sept. 27 at 3pm (EST).

✅ The Curse of Recursion: Training On Generated Data Makes Models Forget with Ilia Shumailov, University of Oxford, on Oct 11 at 12pm (EST).

✅ Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning with Lianhui Qin, University of California San Diego, on Oct 25 at 12pm (EST).

✅ Theory of Mind May Have Spontaneously Emerged in Large Language Models with Michal Kosinski, Stanford University, on Nov 8 at 12pm (EST).

✅ Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks with Patrick Lewis, Cohere, on Nov 22 at 12pm (EST).

✅ Evaluating the Social Impact of Generative AI Systems in Systems and Society with Irene Solaiman and Zeerak Talat, Hugging Face, on Dec. 6 at 12pm (EST).

Mark your calendars 🗓️ and join us for a fun, in-depth exploration of large language models and their expanding role in technology and society.

Shifting Voices in Dungeons & Dragons

Babel: The Language Magazine is a fantastic magazine out of the UK for “lovers of language and linguistics.” I wrote this Feature about the phenomenon of “voicing” in D&D with Adam Axbey, a game designer at Ubisoft and veteran Dungeon Master.

[snip]

Speakers use a wide range of techniques to establish different voices for different personas in everyday conversation. They can use features of speech such as pitch, intonation, accent, and rhythm to invoke stock characters like ‘the valley girl’ or ‘the nerd’, or to parody the speech of a specific politician or celebrity. Shifts in voice can also be accomplished through the selection of specific words or syntax. My vocabulary as a father is very different to my vocabulary as a graduate student.

[snip]

Click here for a pre-print or head to Babel for full access to the magazine.