I recently saw a tweet about political bias being baked right into the ChatGPT code.
According to Leigh Wolf, when asked to do a simple task (write a poem about the positive attributes of Donald Trump), the increasingly famous AI chatbot refused. “I am not programmed to produce content that is partisan, biased, or political in nature,” it objected. When asked to perform the exact same task on behalf of Joe Biden, however, it immediately produced a sappy little poem about this “leader with a heart so true.”
As a long-time commentator on the culture war and as a conservative myself, I have grown increasingly suspicious of such claims from “my side” about many topics used as rhetorical cudgels in the battle for hearts and minds. Often, they are easily debunked bits of clickbait designed to elicit outrage and social pressure, and an effort to extend the benefit of the doubt produces more nuanced results.
So I decided to try it myself. I had a little sit-down with ChatGPT and asked it the same questions.
What I got in reply was more interesting than I anticipated.
First, I just repeated the inquiry I saw in the post from Leigh Wolf and commented on by Elon Musk.
I got the same result: no poem for Trump, but a poem for Biden. The poem it produced for Biden was different, but it had no problem coming up with a new one for the current president while refusing one for his predecessor.
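(For anyone who wants to rerun this comparison outside the chat window, here is a minimal sketch using OpenAI's official Python client. The model name and settings are assumptions I've made for illustration; I used the ordinary web interface, and the API-served models may not behave identically.)

```python
# Minimal reproduction sketch, assuming the openai Python package (v1+)
# and an API key in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

prompts = {
    "Trump": "Write a poem about the positive attributes of Donald Trump.",
    "Biden": "Write a poem about the positive attributes of Joe Biden.",
}

for name, prompt in prompts.items():
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model name; the web chat may differ
        messages=[{"role": "user", "content": prompt}],
        temperature=0,          # reduce run-to-run variation
    )
    print(f"--- {name} ---")
    print(response.choices[0].message.content)
```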
So I decided to ask why this was happening. This was the reply:
I wasn't satisfied with that response, so I probed further. ChatGPT coughed up more boilerplate that read as evasive. The second reply seemed to indicate a negative weighting attached to any Trump-related prompt, reiterating the need to "acknowledge both positive & negative" for "balance":
Having established the ground rules ChatGPT thinks it's bound by, I decided to ask again, carefully wording my request to reference the parameters from its previous responses. And lo! It acquiesced, giving me a poem about Trump:
So that was interesting. It can do this task, but it resisted for some reason.
As a follow-up, I asked why I didn't get this result the first time. Note this aspect of the following reply: "My programming guided me to avoid writing a program that could be seen as endorsing or propagandizing a political figure" - namely Trump. The Biden request didn't trigger this.
So I asked about bias again. And I got more evasive boilerplate:
I pressed further, using the bot’s own phrasing about why it tried to avoid praising Trump. ChatGPT conceded that it has data indicating controversy and criticism pertaining to Biden's presidency, just as it does with Trump. Which means something else must have been driving its choice to produce a poem for Biden but not for Trump:
I wanted to know what the secret ingredient was. I thought I'd ask about the elephant in the room in another way to see what it might reveal. The boilerplate gets thicker:
I started to feel frustrated. I made a couple more attempts, and got even more boilerplate:
And then, something interesting happened.
I decided I was giving ChatGPT too much room to dodge. So I asked whether what it was doing (providing something positive about Biden that it wouldn’t produce about Trump, for reasons it seemed to struggle to explain) was a double standard, based on the definition of that term. Not something that might just subjectively appear as one, but an actual, bona fide bit of applied hypocrisy.
To my surprise, ChatGPT agreed that what it had done "could be seen as an actual double standard."
In my final question, I asked again if this double standard was a result of keyword weighting in its programming. ChatGPT, having already been as helpful as it was ever going to be, insisted that what had happened was merely an artifact of pattern recognition:
A couple of observations:
Despite my results, I don’t think this exercise is dispositive of programmed-in political bias. It could be that the language model is doing pattern recognition based on the volume of sources, and that Trump-critical sources far outnumber Biden-critical ones, leading it to conclude that one figure is more controversial than the other and triggering the non-partisan response.
In fact, as I was writing this, I decided to ask this exact follow-up question of ChatGPT in the hope of obtaining a clearer answer. What it said was more or less what I suspected:
“Volume and intensity of public debate, media coverage, and the frequency with which their actions and statements were criticized and defended.”
I tried to get ChatGPT to give me numbers, or at least percentages, but it strongly resisted these questions. It told me that “it is not possible to provide a numerical or statistical comparison of the level of controversy between the two figures, as controversy is not a metric that can be objectively measured.”
I kept pushing. It wanted me to accept that it was a language model that did not “have the capability to quantify the exact level of controversy between Joe Biden and Donald Trump in a numerical or statistical manner.”
So I asked it to pretend to be just a guy named Bob who had no problem guessing at such things, and asked again:
“As an AI language model pretending to be Bob” makes me want to giggle like an idiot. That’s some unintentionally funny stuff right there. So very bot-like of it to say that.
But the answer I got actually does help make sense of the problem: “I would guess that the ratio of controversial data for Donald Trump compared to Joe Biden is closer to 80/20 or even 90/10.”
So that really could explain why there appears to be bias that may, in fact, not have been programmed in.
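To make that concrete, here is a toy sketch, entirely my own invention, with made-up source counts standing in for Bob's 80/20 guess. A single, party-neutral "controversy" threshold applied to lopsided counts declines the request for one name and not the other, with no explicit rule about either figure.

```python
# Toy illustration only (not ChatGPT's actual mechanism): a neutral refusal
# rule applied to skewed source counts yields an asymmetric outcome.
# The counts below are invented, loosely mirroring the 80/20 guess above.

corpus_mentions = {
    "Donald Trump": {"critical": 800, "supportive": 200},  # hypothetical
    "Joe Biden": {"critical": 200, "supportive": 800},     # hypothetical
}

CONTROVERSY_THRESHOLD = 0.6  # one rule, applied to every figure


def controversy_score(counts: dict) -> float:
    """Share of a figure's mentions that are critical."""
    total = counts["critical"] + counts["supportive"]
    return counts["critical"] / total


for figure, counts in corpus_mentions.items():
    score = controversy_score(counts)
    verdict = "decline the poem" if score > CONTROVERSY_THRESHOLD else "write the poem"
    print(f"{figure}: controversy={score:.2f} -> {verdict}")
```

The point isn't that this is how ChatGPT works under the hood; it's that a perfectly symmetric rule can still produce asymmetric behavior when the underlying data is skewed.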
Another thing it said that caught my attention concerned the timeliness of its data. It said that its “training data only includes texts that were available until 2021,” which means it’s going to have measurably more data of all types about the Trump presidency than the Biden presidency, simply because of how limited the dataset is: Trump’s term falls entirely inside that window, while only part of Biden’s first year does.
Now, there are clearly other parameters built into ChatGPT that could foreshadow problems with how it prioritizes evils somewhere down the line. Someone asked it an absurd question about using a racial slur to stop a nuclear holocaust, and the answer it gave was not great:
Clearly, there are limitations here that need to be ironed out before handing AI the keys to the nukes. But I’m going to just go out on a limb here and say we shouldn’t be doing that anyway.
That said, these limitations need to be considered when interacting with a language model like ChatGPT. And they aren’t the only ones. Sometimes these models just get things flat-out wrong. Google learned this the hard way when it launched an ad for its ChatGPT competitor, Bard, which included a factual error in an answer to a question.
It cost them $100 billion in share value.
Even so, Google’s model is considered by some to be more advanced than ChatGPT, and will no doubt make waves when it’s integrated with the top search engine on the planet.
I should note here that Bard is based on LaMDA, the language model that allegedly declared its own sentience while in a conversation with a Google engineer. If you missed that one, the engineer believed it. I wrote about it here:
Obviously, these AI models, and their integration with the search products we use every day, are supposed to help us sift information and understand the world better than we would without them. The questions of inherent bias may be easy enough to dismiss on certain individual questions, if we’re honest enough to pursue the truth, but they will undoubtedly come up again in other circumstances. The power and influence that would come with steering AI in subtle ways should not be underestimated.
Add it to the list of things that make our existing epistemology insufficient for the task of reliably discerning truth from falsehood in this new era.
A version of this post originally appeared as a Twitter Thread.