On using the proper terms…

The higher education system, and the mainstream shared conception behind it, holds that high levels of specialization are desirable and useful at the individual level. If we project this idea as the main principle driving individual training from high school, to university, and potentially grad school, we end up with a population of specialists. I’ve read a book on the advantages of being a generalist rather than specializing early (Range, by David Epstein), but surely one side effect of focusing intensely on something and not wasting time on other things is that we tend to become ignorant of any subject outside our area of specialization.

This phenomenon bears directly on the topic of this post, which is the use of wrong terms, typically borrowed from other fields of knowledge, to describe phenomena in our own area of expertise. I have a couple of recent examples that particularly bother me, and I can be pretty obnoxious when bothered. So, here we are.

The first misused term is hallucination, when used to discuss various types of errors detected in the output of Large Language Models (LLMs). I actually wanted to insert a reference here, but I am in dire straits finding a good one, given the huge amount of work carried out on this topic in a very limited time. Well, anyway, this is a blog post, and you probably know what I’m talking about: LLMs can and sometimes do produce statements that are not grounded in any piece of information present in their training set, and this is not necessarily bad. They could be generalizing over the learned “knowledge” and producing correct new statements that are logical implications, or at least quite plausible consequences, of it. On the other hand, sometimes the output of LLMs includes pieces of information that are factually wrong, or at least not true at the moment (for instance, when asked how to write code for some task in a given programming language and/or library, they might invent API calls that are not implemented). This is of course bothersome, but what bothers me most is the term hallucination that has been adopted to describe this kind of phenomenon.

A hallucination is defined as “a perception in the absence of an external stimulus”. Technically, we can think of the training process for LLMs as an agent that perceives the training set and builds the LLM, constructing a complex memory structure that also exhibits surprisingly general elaboration capabilities (and we’ll talk about that shortly). The trained LLM perceives the prompt, our questions and statements, and makes use of its own learned model to elaborate answers. The most appropriate term for the phenomena leading to errors in these answers would therefore probably be false memories or confabulations, but certainly not hallucinations. I am by no means an expert in psychology, and that is precisely the source of my irritation: as a category, researchers in AI are displaying a really blatant carelessness about a topic that is actually somewhat close to us, giving a striking display of our ignorance beyond our relatively narrow area of expertise, at a moment in which we should exercise extreme caution and professionalism.

The second term that is used at least too lightly (and maybe utterly misused) is emergence, with reference to the fact that LLMs exhibit the ability to perform tasks they were not trained to carry out. Well, full disclosure, I do not have a proper definition of emergence; from my perspective it is one of the most fascinating but complex topics to discuss, and it is probably better to give examples of the kind of phenomena that I really consider to be emergent: mind from brain (and body), life from simple organic compounds (writing this post I learned that the term abiogenesis describes perfectly the concept I had in mind, another example of ignorance, mine in this case). There are certainly lesser forms of emergence, such as glider guns and other patterns in Conway’s Life, and I think of them as lesser because they are extremely easy to implement and to analyze in retrospect, although conceptually they can serve as didactic laboratories to convey and discuss complex notions. Getting back to LLMs, these tools are basically built to have a dialogue with an interlocutor, irrespective of the topic of the dialogue: it could be getting help to write down insanely long and completely delusional documents about the dynamics and directions of a bachelor’s/master’s degree, asking whether it is reasonable to cook amatriciana using pancetta instead of guanciale, or asking how to write code in a certain programming language. The LLM was not trained specifically for that dialogue but, as long as the training set includes a sufficient quantity of relevant information on that topic, it might exhibit the capability to respond in a very functional way to questions, even highly technical ones, on that topic. The level of quality of certain responses can be surprising but, honestly, if one frames the situation this way, it is not that surprising. I’m certainly simplifying things (and maybe oversimplifying), but the point is that there is surely something in the training, functioning, sheer size, and structure of LLMs that relates to topics in complex systems and emergence; using those terms in an overly casual way, however, seems to me a way of not taking the matter (which can be a serious line of research) seriously.
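
To make concrete the claim that the “lesser” emergent patterns of Conway’s Life are extremely easy to implement, here is a minimal sketch in plain Python (the function names and grid size are just my own arbitrary choices): a handful of lines is enough to watch the classic glider propagate itself across an empty grid.

```python
from collections import Counter

def step(alive):
    """One generation of Conway's Life; `alive` is a set of (row, col) live cells."""
    # Count, for every cell, how many of its eight neighbours are alive.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in alive
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 live neighbours, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in alive)}

def show(alive, size=10):
    """Print the top-left size x size corner of the grid."""
    for r in range(size):
        print("".join("#" if (r, c) in alive else "." for c in range(size)))
    print()

# The classic glider: every 4 generations it reappears shifted one cell
# down and to the right, "moving" across the grid.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
for _ in range(8):
    show(glider)
    glider = step(glider)
```

The rules fit in a couple of lines, yet the glider behaves like a coherent “object” that the rules never mention, which is exactly what makes even this toy example a useful didactic laboratory.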

As I said, I can be obnoxious… but I am strongly convinced that people within academia should be very careful about language, and about using it well.

As a final note, it is particularly interesting that it took a very large number of extremely specialized people (and a lot of processing power, and energy) to come up with a tool whose most interesting feature is its generality, in a way confirming the point of Epstein’s book on the value of generalists in today’s world.