Is ChatGPT hallucinating us into a Borgean universe?

Juan Ortiz Freuler
Apr 8, 2023

“The methodical fabrication of hronir has performed prodigious services for archaeologists. It has made possible the interrogation and even the modification of the past, which is now no less plastic and docile than the future” (Jorge Luis Borges, “Tlön, Uqbar, Orbis Tertius”, 1940)

Our reality competes with the fantasies our artists brew. The hallucinations of ChatGPT are the latest edition of this phenomenon. By “hallucinations” I mean that ChatGPT regularly and confidently mixes facts with plausible fiction. It is all the more unsettling when a user requests a source for such information and ChatGPT responds with hyperlinks that lead nowhere.

A concern noted by a Washington Post reporter is that once humans take these fictions as fact and insert them into content that is distributed across the web, they can subsequently be treated as truth by others, including smart search engines like Bing or Google’s Bard. Truth isn’t necessarily the norm on the web, but with millions of people relying on ChatGPT for information, this might represent a leap in the speed and breadth at which fiction is introduced into the web. Over time, this process might place us in the fantastic scenario of the short story “Tlön, Uqbar, Orbis Tertius”, published in 1940 by Jorge Luis Borges: a world in which fiction slowly overruns fact. Exploring the meaning of hallucinations through this Borgean lens is the humble purpose of this piece.

In Borges’ short story, two friends run into an edition of the Anglo-American Cyclopaedia, which features a dubious entry detailing an enigmatic land called Tlön. There, the inhabitants embrace a language, algebra, zoology, and a perspective on reality unlike anything known on Earth. Soon, seemingly otherworldly objects that defy the laws of physics begin appearing across the globe, sparking intrigue and confusion. Eventually, these two friends uncover a grand plot by a benevolent secret society, which dates back to the 1600s and has been working tirelessly over the centuries to shape the world. The society had distributed fake editions of the encyclopedia containing references to Tlön and then created the objects that were being discovered around the world and heralded as proof of the existence of the civilization described in the encyclopedia. Thus, the secret society brought Tlön into existence in the minds of people of our world, and along with it, the value system of its fictional inhabitants. Having Tlön exist in the minds of people was achievement enough for the secret society…

The secret society in Borges’ story espouses idealism, a philosophy which asserts that the material world is nothing more than a creation of the mind, and that reality exists only in the perceptions of individuals. What struck me in re-reading this story is that idealism seems to offer a useful perspective from which to understand the workings of OpenAI’s GPT engine. Unlike humans, who (arguably) have physical experiences, the GPT model is a computer program with no direct exposure to the physical world or its objects. Yet, it was exposed to billions of texts describing objects as they are experienced by humans. This is how the Large Language Model approach is deployed by ChatGPT. It consists of exposing the system to an extremely large corpus of text from which it will identify patterns within the structure of language, and from those patterns abstract a set of generalizable rules.

Under this framework, we can better understand the need to expose the system to a larger corpus of text. The rules created from a small corpus of text would fail to generalize well. To explore this, let’s go back to the idealism of Tlön: Borges tells us that “there are no nouns in Tlön… For example: there is no word corresponding to the word ‘moon’, but there is a verb which in English would be ‘to moon’ or ‘to moonate’”. This would be the structure under which ChatGPT would process the word moon in a sentence, if it were exposed to it only once.

To better understand this, let’s look at how these systems deal with simple math or riddles. Large language models struggle with multi-step reasoning unless they are prompted to lay out a chain of thought. Below is an example used by Google engineers to show that the system has to be taught how to deal with questions an average person would find simple to address.

Screenshot of a table comparing two types of prompts by Google Engineers (2022)

The lack of material objects makes processing of causality difficult for GPT models. In Tlön, “The perception of a cloud of smoke on the horizon and then of the burning field and then of the half-extinguished cigarette that produced the blaze is considered an example of association of ideas.” Engineers seem to be working around this challenge by unpacking language.

Wrapped around the idealist core of GPT engines is a predictive machine that calculates the probability that one string of letters will follow the previous one. Many of us are familiar with predictive keyboards, where the autocomplete engine suggests reasonable options as we type. In ChatGPT-style implementations, the predictive model is more powerful and the “autocomplete” is on steroids: it keeps prompting itself forward, writing whole paragraphs based on such probabilities.
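
To make the “autocomplete on steroids” idea concrete, here is a minimal sketch in Python. It is not how ChatGPT is built (real models use neural networks over sub-word tokens and far larger corpora); it is a toy word-pair counter, and the corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
CORPUS = (
    "the moon rose over the river . "
    "the moon rose over the hills . "
    "the river ran under the moon ."
)

def train_bigrams(text):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probabilities(counts, word):
    """Turn the raw follow-up counts for one word into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}

def autocomplete(counts, start, max_words=6):
    """Keep appending the single most probable next word (greedy decoding)."""
    out = [start]
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word was never seen with a follower
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

counts = train_bigrams(CORPUS)
print(next_word_probabilities(counts, "the"))
print(autocomplete(counts, "the"))
```

On this corpus the sketch completes “the” into “the moon rose over the moon rose”: every step is the most probable continuation, yet the sentence as a whole describes nothing. The model only ever consults which words tend to follow which, never a moon or a river.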

ChatGPT is mind-boggling because of the speed at which it can deliver content that a human would struggle to produce. For example, if prompted to “write a poem in the style of Shakespeare describing the World Cup final of 1986”, ChatGPT can generate novel content that is coherent (and in Shakespearean iambic pentameter!) in 2 seconds.[1] On the other hand, the system can also fail at seemingly simple tasks, like the basic math questions or riddles described above, which is equally surprising.

Fig. 1 The emergence of a world. Source: DALL-E 2. Prompt: a 3d render of a server farm farm dripping a purple planet into existence, in a retro futurist style

Many experts are particularly concerned with ChatGPT’s hallucination of sources. At its root, the problem is that the system operationalized our language as a system of mere probabilities, where what we would call fact is simply yet another more or less plausible string of symbols. As such, when users request that the model show the URL from which it is sourcing its (fantastic) information, the system responds by creating a URL based on the same probabilistic model it relies on to write a sentence. This is what leads to URLs that go nowhere. As noted by The Guardian’s head of editorial innovation, “This specific wrinkle — the invention of sources — is particularly troubling for trusted news organisations. … And for readers and the wider information ecosystem, it opens up whole new questions about whether citations can be trusted in any way.”
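
The mechanism can be sketched with a toy model in Python. This is only an illustration under loud assumptions: the three “known” URLs are hypothetical addresses on the reserved example.com domain, and the model is a simple segment-by-segment sampler rather than anything resembling ChatGPT’s internals. It shows how a system that has only learned which pieces tend to follow which can assemble addresses that look right but were never in its training data.

```python
import random
from collections import defaultdict

# Hypothetical "training" URLs on a reserved example domain.
KNOWN_URLS = [
    "https://example.com/news/2023/ai-report",
    "https://example.com/news/2022/chatbots",
    "https://example.com/blog/2023/chatbots",
]
PREFIX = "https://example.com/"

def train(urls):
    """Record which path segment has been seen following which."""
    followers = defaultdict(list)
    for url in urls:
        segments = ["<start>"] + url[len(PREFIX):].split("/") + ["<end>"]
        for prev, nxt in zip(segments, segments[1:]):
            followers[prev].append(nxt)
    return followers

def generate_url(followers, rng):
    """Sample a URL one segment at a time, looking only at the last segment."""
    segment, path = "<start>", []
    while True:
        segment = rng.choice(followers[segment])
        if segment == "<end>":
            return PREFIX + "/".join(path)
        path.append(segment)

followers = train(KNOWN_URLS)
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = {generate_url(followers, rng) for _ in range(200)}

# URLs the model produced that were never in its training data.
fabricated = sorted(samples - set(KNOWN_URLS))
for url in fabricated:
    print(url)
```

Each fabricated address is built entirely from segments and transitions the model has seen, which is exactly why it looks credible. Nothing in the procedure checks whether a page exists at the end of it.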

The way in which ChatGPT fabricates facts and sources reminds me of the concept of hronir in Borges’ Tlön. Borges describes these as duplicates: they pop into the world when someone expects them. At first, in Tlön, they would appear accidentally. For example, “Two persons look for a pencil; the first finds it and says nothing; the second finds a second pencil, no less real, but…somewhat longer.” This resembles the type of challenge that image-generation models like DALL-E have to navigate when prompted to create something like faces in a crowd, where details like the shape of a nose or an eye are often missing from the output images. The system is probabilistic and trained on pictures of crowds where, for some of the faces in the training data, those features are not visible because they are partially covered by another person or a shadow. The system merely translates that observed pattern into the idea that, when assembled into a crowd, a proportion of people have one eye, a weirdly shaped nose or a big shadow that looks like a hole in their faces. Again, its idealist core is at play: if the training set had included just one labelled picture of a crowd, the model would process what we see as distinct elements of that crowd as one complex yet single experience. That crowds are an assemblage of distinct persons is an abstraction that machines apparently struggle with. Similarly, in Tlön, “there are famous poems made up of one enormous word. This word forms a poetic object created by the author.”[2] In short, the larger the corpus of data the system is exposed to, the lower the chances that the underlying idealism will be exposed as such.

In Tlön, according to the forged records, people eventually harnessed the power of these hronir for productive purposes. Researchers would go to prisons and high schools, show them photographs of ancient relics that they should expect to find in that area and invite them to shovel. In one of these experiments, “the students unearthed — or produced — a gold mask, an archaic sword, two or three clay urns and the moldy and mutilated torso of a king whose chest bore an inscription which it has not yet been possible to decipher.”

And this is where things get most interesting. In Borges’ story, the secret society starts populating our world with (forged) evidence of the existence of Tlön. Material evidence consistent with the claims made throughout the forged written records that described life in Tlön. Similarly, as time goes by, the non-facts of ChatGPT will continue to populate this world and be included in government reports, academic papers, media articles and TV scripts. Eventually, such pieces of information, now included in reputable sources, will become evidence supporting what was in fact originally the hallucination of a machine. A hallucination that unsuspecting humans helped transcribe into our physical world.

In the fictional story published in 1940, this proved to be a dramatic moment. According to Borges, “Almost immediately, reality yielded on more than one account. The truth is that it longed to yield. Ten years ago any symmetry with a resemblance of order — dialectical materialism, anti-Semitism, Nazism — was sufficient to entrance the minds of men. How could one do other than submit to Tlön, to the minute and vast evidence of an orderly planet? It is useless to answer that reality is also orderly. Perhaps it is, but in accordance with divine laws. I translate: inhuman laws — which we never quite grasp. Tlön is surely a labyrinth, but it is a labyrinth devised by men, a labyrinth destined to be deciphered by men.”

Today, decisions made by a couple of hundred coders affect the lives of billions of people. We have no existing institutional framework to understand or oversee such power. This piece suggests we may be facing a new series of challenges. These hallucinations are presented to us not as glitches but as an inevitable collateral effect of a powerful system built on probabilities. Perhaps these hallucinations should be perceived as a canary in the coal mine, reminding us that these probability-driven technologies bring with them risks that are difficult to assess under existing technical, legal and institutional arrangements. Perhaps they will become a focal point around which we coordinate to move beyond the black box approaches to technological development that have become pervasive. Or perhaps we are already wading knee-deep through the idealist marshlands of Tlön… Perhaps “[t]he contact and the habit of Tlön have disintegrated this world.”

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — -

NOTES:

Open to comments, corrections and suggestions!

Add them below or via Twitter DM (@juanof9)

PDF: Tlön, Uqbar, Orbis Tertius (1940)

See also, Is artificial intelligence a ticket to Borges’ Babylon? (2017)

and When Human Rights turned into an AI-driven lottery (OGR, 2023)

— —

Footnotes

[1] In ’86 the world did see,

A final match, for all to be,

Maradona’s hand, a goal did make,

But his next, a true marvel, didst make.

Argentina won, with joy and pride,

The world didst marvel, at Maradona’s ride.

A moment in time, forever to be,

The ’86 final, for all to see.

Note by the author: The “Hand of God” goal was not scored in the final (it came in the quarter-final against England). It is yet another example of a ChatGPT hallucination.

[2] For the sake of play, I prompted ChatGPT:

Me: create a word that you think would fulfill this characteristic: “There are famous poems made up of one enormous word. This word forms a poetic object created by the author.”

ChatGPT: Sure, here’s a possible word that could fit the bill: “omniuniverseternity.” This word combines the concepts of the omniverse, which refers to all possible universes, and eternity, which suggests the infinite and timeless nature of existence. The word’s length and complexity create a sense of grandeur and cosmic scale, making it a potential subject for a poetic object.

Juan Ortiz Freuler

Justice & participation. ICTs & Data. Affiliate @BKCHarvard. Alumni: @oiiOxford & @blavatnikSchool . Chevening Scholar. Views=personal. Here-> open discussion.