At NeurIPS in December I met Rylan Schaeffer from Stanford, author (with Brando Miranda and Sanmi Koyejo) of this fascinating paper about the benchmarks used to measure the capabilities of LLMs. They found that many of the most common benchmarks use non-linear or discontinuous metrics to measure capabilities that should really be measured with linear metrics. The non-linear metrics show sudden jumps in ability as models get bigger: the so-called emergent abilities. But if you change the metric so it's linear, models show steady, predictable progress as they get bigger. Nothing magical about it.
Click here for a re-print of an article I wrote for American Scientist, March-April, 2024, Vol. 112.
28 June 2023: Geoffrey Hinton, Godfather of AI, University of Toronto, on Centre Stage during day two of Collision 2023 at Enercare Centre in Toronto, Canada. Photo by Ramsey Cardy/Collision via Sportsfile
✅ Training language models to follow instructions with human feedback with Long Ouyang, OpenAI, on Sept. 27 at 3pm (EST).
✅ The Curse of Recursion: Training on Generated Data Makes Models Forget with Ilia Shumailov, University of Oxford, on Oct. 11 at 12pm (EST).
✅ Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning with Lianhui Qin, University of California San Diego, on Oct. 25 at 12pm (EST).
✅ Theory of Mind May Have Spontaneously Emerged in Large Language Models with Michal Kosinski, Stanford University, on Nov. 8 at 12pm (EST).
✅ Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks with Patrick Lewis, Cohere, on Nov. 22 at 12pm (EST).
✅ Evaluating the Social Impact of Generative AI Systems in Systems and Society with Irene Solaiman and Zeerak Talat, Hugging Face, on Dec. 6 at 12pm (EST).
Mark your calendars 🗓️ and join us for a fun in-depth exploration into large language models and their expanding role in technology and society.
Babel: The Language Magazine is a fantastic magazine out of the UK for “lovers of language and linguistics.” I wrote this Feature about the phenomenon of “voicing” in D&D with Adam Axbey, a game designer at Ubisoft and veteran Dungeon Master.
[snip]
Speakers use a wide range of techniques to establish different voices for different personas in everyday conversation. They can use features of speech such as pitch, intonation, accent, and rhythm to invoke stock characters like ‘the valley girl’ or ‘the nerd’, or to parody the speech of a specific politician or celebrity. Shifts in voice can also be accomplished through the selection of specific words or syntax. My vocabulary as a father is very different to my vocabulary as a graduate student.
[snip]
Click here for a pre-print or head to Babel for full access to the magazine.