If an AI told you to smoke a pack of cigarettes to feel better after a stressful week at work, would you do it?
“No! That’s unhealthy!” you might protest.
But if the Large Language Model (LLM) AIs that have become popular in the last couple of years had existed in the early 20th century, that is a recommendation they could easily have made, fully supported by the government, doctors, and the media. In fact, during WWII cigarettes were included in American soldiers’ rations to help with the stressful conditions of war.
Tech entrepreneur Elon Musk has insisted on multiple recent occasions that it’s imperative for AI to be “maximally truth-seeking.”
But the uncomfortable thing most people don’t realize about a maximally truth-seeking AI is that it must include wrong information both in its training data and its output. Feeding AI all the information—even information we think is false—is essential.
Can AI Find Truth?
How do we determine truth? Throughout most of human history, people held a shared worldview with those around them. But with the explosion of information available on the internet, siloed by algorithms and selection bias, people today can live right next to each other in completely different worlds. The sheer amount of information available (much of it contradictory) makes it very difficult to know what’s true.
An AI capable of seeking truth through analysis and reason might be able to help us here, but that’s not what we have today. Current LLMs only mimic reason by using vast amounts of content to generate a probabilistic answer to a user’s prompt.
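To make that concrete, here is a deliberately tiny sketch in Python of that final probabilistic step. The three-word vocabulary and the hand-picked scores are invented for illustration; a real LLM computes its scores with billions of learned parameters, but it still ends up sampling the next token from a probability distribution rather than reasoning its way to a true answer.

```python
import math
import random

# Toy illustration only: a hand-built score table standing in for a real
# model's next-token scores. The values below are made up for this example.
vocab_scores = {"healthy": 2.1, "harmful": 0.3, "relaxing": 1.2}

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = {token: math.exp(s) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

probs = softmax(vocab_scores)
print(probs)  # roughly {'healthy': 0.64, 'harmful': 0.11, 'relaxing': 0.26}

# Complete the prompt "Smoking is ..." by sampling in proportion to probability,
# not by checking which completion is actually true.
next_token = random.choices(list(probs), weights=list(probs.values()))[0]
print("Smoking is", next_token)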
The main problem with this model is that it puts a small group of people in charge of what to include in and exclude from the AI’s training data, how to weight the sources relative to each other, and which safety or content guardrails to code into the algorithm. Because the output is determined probabilistically from the training data, it is prone to perpetuate the majority or official position of the time, which is not always correct (see cigarettes in the early 20th century).
In other words, current LLMs are likely to reinforce the status quo.
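As a rough sketch of that dynamic (nothing like a real LLM, just a frequency count over an invented corpus), imagine a model that simply answers with whatever claim dominates its training data. Its answer tracks whatever its curators fed it, at the time they fed it.

```python
from collections import Counter

# Deliberately oversimplified stand-in for training-data influence.
# All corpus contents below are invented for illustration.
def most_probable_answer(corpus):
    """Return the dominant claim in the corpus and its share of the data."""
    counts = Counter(corpus)
    claim, n = counts.most_common(1)[0]
    return claim, n / len(corpus)

# A mid-20th-century-style corpus: official and majority sources agree.
corpus_1950 = ["smoking is fine"] * 90 + ["smoking causes cancer"] * 10
print(most_probable_answer(corpus_1950))   # ('smoking is fine', 0.9)

# Same procedure, later corpus: the answer flips only after the data does.
corpus_1990 = ["smoking is fine"] * 10 + ["smoking causes cancer"] * 90
print(most_probable_answer(corpus_1990))   # ('smoking causes cancer', 0.9)
```

The point is not that real LLMs count strings, but that probabilistic generation inherits whatever balance of views the curated data contains, and whatever that balance gets wrong.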
In one horrifying example, a study released in March 2024 found that ChatGPT was more likely to assign the death penalty to defendants who provided a statement in an African American dialect than to those whose statement was in standard American English.
Why Truth-Seeking AI Should Be Trained on Wrong Information
It’s important that a maximally truth-seeking AI’s training set and output include a wide breadth of information, even information currently thought to be wrong by the mainstream, for four major reasons.
The first is that people today operate under a wide variety of worldviews, and they need to know that the AI has seen the same information they have, and considered it fairly, before they will even entertain changing their opinions. While it’s true that humans can be motivated to base their opinions on things like identity and group cohesion rather than facts, most people have also been exposed to a set of facts that support those opinions. Presenting those facts alongside counterevidence or additional context gives people who do care about truth an opportunity to gain knowledge and update their perspectives.
The second reason is that being exposed to the steelman arguments or evidence for positions other than their own can help people to better understand and relate to those who think differently. There is value in being exposed to the worldviews of others, since it’s possible they could have something to add to our own knowledge.
The third reason is that it allows non-mainstream positions to be considered, with the potential to update mainstream positions more quickly as alternative views prove convincing and gain traction. This is an appeal to expose people to heterodox ideas instead of trying to censor them, since sometimes those ideas will be more correct than the official positions.
And finally, who could even determine with absolute certainty what counts as “right” or “wrong” information in order to include or exclude it? The only alternative to intentionally including some wrong information in the training data and output is for a centralized, authoritarian power to decide for everyone which sources are trusted and what is true and worthy of inclusion.
For many controversial topics there is an objective truth and it matters, but we may never know it perfectly. Where our knowledge is imperfect, people must have the right to judge for themselves, and they can only do so if they are permitted to see the contested information, with the best available context, and come to their own conclusions.
Conclusion
As we’ve seen with leaded gasoline and lead-based paint, asbestos, and the food pyramid, or with so-called “conspiracy theories” such as the Tuskegee Syphilis Study and MK-Ultra that were later confirmed to be true, bad information can sometimes be the majority opinion or come from official sources.
A maximally truth-seeking LLM must include in its training set information that is widely considered to be wrong, and acknowledge those positions in its output: to build trust among people of different worldviews, to help people understand those who think differently from themselves, to provide an opportunity to correct course when the mainstream is wrong, and to prevent centralized powers from becoming the authorities of truth for the rest of us.