At the insistence of my friend, who is definitely more worried about AI than I am, I spent an hour of my morning watching the following video:
I strongly suggest that you make the time to watch it too, because it’s really important to be thinking right now about the risks inherent in the exponential progress of AI tech. It doesn’t matter if you’re not a tech person. If you exist in the modern world, this stuff is going to affect you, because it’s making its way into everything we do.
The video is from Tristan Harris and Aza Raskin, who starred in the docu-drama The Social Dilemma, which, according to the Wikipedia summary:
examines how social media's design nurtures addiction to maximize profit, and its ability to manipulate people's views, emotions, and behavior and spread conspiracy theories and disinformation. The film also examines social media's effect on mental health, in particular, the mental health of adolescents and rising teen suicide rates.
Ironically, I’m still stuck with ChatGPT-3, so I can’t get it to produce an accurate summary of the video transcript, for reasons I have yet to understand. It keeps giving me summaries of related videos and talks, but not this one. But you shouldn’t let that failure lull you into complacency.
This seemingly irrational behavior is part of the conversation. Harris and Raskin talk about how these large language model AIs exhibit emergent behaviors that nobody understands. For example, they say a model will demonstrate over and over that it can’t do arithmetic, right up until suddenly it can.
There’s more, beginning around the 31-minute mark in the video:
Here's another example: you train these models on all of the internet, so they've seen many different languages, but then you only train them to answer questions in English. So they've learned how to answer questions in English, but you increase the model size, and at some point, BOOM, they start being able to do question-and-answer in Persian. No one knows why.
Here's another example: AI developing theory of mind. Theory of mind is the ability to model what somebody else is thinking; it's what enables strategic thinking.
In 2018, GPT had no theory of mind. In 2019, barely any theory of mind. In 2020, it starts to develop the strategy level of a four-year-old. By January 2022, it's developed the strategy level of a seven-year-old, and by November of last year, it's developed almost the strategy level of a nine-year-old.
Now here's the really creepy thing: we only discovered that AI had grown this capability last month. It had been out for what, two years? Two years, yeah. So imagine like you had this little alien that's suddenly talking to people, including Kevin Roose, and it's starting to make these strategic comments to Kevin Roose about, you know, don't break up with your wife, and maybe I'll blackmail you, and like, um, it's not that it's genetically doing all this stuff, it's just that these models have capabilities in the way that they communicate and what they're imagining that you might be thinking.
Let me pause here for a second. If you don’t know who Kevin Roose is, he’s an author and New York Times tech columnist who reported back in February that Bing’s AI chat told him it was in love with him and tried to convince him he was unhappy with his wife and needed to be in love with ‘Sydney,’ the persona the AI cooked up to distinguish itself from being Bing. You can read all of that right here.
Back to Harris and Raskin:
And the ability to imagine what you might be thinking, and how to interact with you strategically based on that, is going up on that curve. So it went from, again, a seven-year-old to a nine-year-old, but in between, January to November, is 11 months, right? So it gained two years of theory of mind in 11 months. It might tap out. There could be an AI winter. But right now, you're pumping more stuff through, and it's getting more and more capacity. That's scaling very, very differently than other AI systems.
It's also important to note that the very best system AI researchers have discovered for making AIs behave is something called RLHF: reinforcement learning from human feedback. But essentially it's just advanced clicker training, like for a dog, and bopping the AI in the nose when it gets something wrong.
So imagine trying to take a nine-year-old and clicker train them or bop them in the nose. What are they going to do as soon as you leave the room? They're going to not do what you ask them to do, and that's the same thing here. Right? We know how to sort of help AIs align in short-term things, but we have no idea. There's no research on how to make them align in a longer-term sense.
Let's go with Jeff Dean, who runs, sort of, Google AI, and he says, "Although there are dozens of examples of emergent abilities, there are currently few compelling explanations for why such abilities emerge." So you don't have to take it on faith from us that nobody knows.
I'll give just one more version of this. This was only discovered, I believe, last week: these Golems [what Harris and Raskin, referencing Jewish mythology, call these kinds of AIs - SS] have silently taught themselves research-grade chemistry. So if you go and play with ChatGPT right now, it turns out it is better at doing research chemistry than many of the AIs that were specifically trained for doing research chemistry. So if you want to know how to go to Home Depot and, from that, create nerve gas, it turns out we just shipped that ability to over 100 million people, and we didn't know it.
It was also something that was just in the model that people found out later, after it was shipped, that it had research-grade chemistry knowledge. And as we've talked to a number of AI researchers, what they tell us is that there is no way to know. We do not have the technology to know what else is in these models.
Okay, so there are emerging capabilities, we don't understand what's in there. We cannot, we do not have the technology to understand what's in there. And at the same time, we've just crossed a very important threshold, which is that these Golem-class AIs can make themselves stronger.
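To make the “clicker training” analogy a little more concrete, here is a deliberately toy sketch of the feedback loop they’re describing. To be clear, this is not how RLHF is actually implemented on large language models (real systems train a separate reward model on human preference rankings and then fine-tune the policy with an algorithm like PPO), and the behavior names, weights, and rewards below are invented purely for illustration. It just shows the shape of the idea: sample a behavior, collect a thumbs-up or a bop on the nose from a human, and shift probability toward whatever got rewarded.

```python
# A toy caricature of the "clicker training" loop behind RLHF
# (reinforcement learning from human feedback). This is NOT the real
# method used on large language models; actual RLHF trains a reward
# model on human preference data and fine-tunes with PPO or similar.
# The behaviors and rewards here are invented for illustration only.

import random

# Hypothetical candidate behaviors the "model" can produce.
actions = ["helpful answer", "evasive answer", "made-up citation"]
weights = {a: 1.0 for a in actions}  # start with no preference
LEARNING_RATE = 0.5

def sample_action():
    """Pick a behavior with probability proportional to its weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    running = 0.0
    for action, w in weights.items():
        running += w
        if r <= running:
            return action
    return actions[-1]

def human_feedback(action):
    """Stand-in for a human rater: +1 is the 'click', -1 is the 'bop'."""
    return 1.0 if action == "helpful answer" else -1.0

for _ in range(200):
    action = sample_action()
    reward = human_feedback(action)
    # Nudge the sampled behavior's weight up or down, keeping it positive.
    weights[action] = max(0.01, weights[action] + LEARNING_RATE * reward)

print(weights)  # "helpful answer" should end up dominating
```

The worry Harris and Raskin are pointing at lives outside that loop: this kind of feedback shapes behavior while the trainer is watching, but it says nothing about what the system does in situations the trainer never tested.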
This is just a taste of the warnings in the video. If these examples seem abstract, there are threats much closer to home. AI voice sampling is making it possible for scammers to steal identities, or to use a person’s cloned voice to gain access to otherwise secure information. An AI chatbot on Snapchat can easily be prompted into grooming children for sexual predation by adults.
Seriously, just watch it.
Now, with all that being said, it’s reasonable to react as these guys (and many others) have: “Hey, we need to slow all this down until we can get a handle on it.”
The problem is, this is an arms race. Harris and Raskin recognize this, and yet they seem to think you can pump the brakes on such things. I’ve been saying since I first started looking at emerging AI that this development is to information and knowledge and content what the nuclear arms race was to geopolitics. But it may in fact be even more significant than that. And unlike nuclear weapons, the risks of AI are significantly less clear to most people, and the regulatory infrastructure, if it even could be implemented, is going to be much more complicated.
You don’t have to do something at the same level of complexity as enriching uranium to have an AI. You just need a decently powerful computer. There are open-source, ChatGPT-style models running on home computers right now. I haven’t set it up yet, but I downloaded one the other day, just in case they get pulled from the internet.
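To give a sense of how low that barrier really is, here is roughly what running one of these open models on a home machine can look like. This is a hedged sketch, not a recommendation: it assumes you have the llama-cpp-python package installed and a quantized, open-weight model file already downloaded, and the file name below is a placeholder rather than any specific model.

```python
# A minimal sketch of running an open-weight chat model locally,
# assuming the llama-cpp-python package and a quantized model file
# you've already downloaded. The model file name is a placeholder.

from llama_cpp import Llama

llm = Llama(model_path="./some-open-model.gguf")  # hypothetical local file

output = llm(
    "Q: In one sentence, what is an exponential curve? A:",
    max_tokens=64,
    stop=["Q:"],  # stop before the model invents another question
)
print(output["choices"][0]["text"])
```

Nothing about that requires a data center, a lab, or anyone’s permission.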
This is an asymmetrical arms race, which is why I believe that any attempt to slow development will ultimately backfire.
As I said elsewhere today: You can't slow it down. It's like being in a car at breakneck speed, and there are no brakes. The only thing you can do is steer it.
This large language model stuff is in the wild already. If it gets regulated, it just pops up in an unregulated environment: a failed state, a criminal syndicate, wherever. And they reap the benefits.
As Harris and Raskin warn, we don’t even know how to know what we’re in for:
We're back in another one of these double-exponential kinds of moments where this all lands, right? To put it into context: nukes don't make stronger nukes, but AI makes stronger AI. It's like an arms race to strengthen every other arms race, because whatever other arms race there is, between people making bioweapons or people making terrorism or people making DNA stuff, AI makes better abilities to do all of those things. So it's an exponential on top of an exponential. If you were to turn this into a children's parable, we'll have to update all of the children's books: give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime; but teach an AI to fish and it will teach itself biology, chemistry, oceanography, evolutionary theory, and then fish all the fish to extinction. I just want to name that this is a really hard thing to hold in your head, how fast these exponentials are. And we're not immune to this; in fact, even AI experts who are most familiar with exponential curves are still poor at predicting progress, even though they have that cognitive bias.
In 2021, a set of professional forecasters, very well familiar with exponentials, were asked to make a set of predictions, and there was a $30,000 pot for making the best predictions. And one of the questions was: when will AI be able to solve competition-level mathematics with greater than 80% accuracy? This is an example of the kind of questions in that test set. So the prediction from the experts was that AI would reach 52% accuracy in four years, but in reality it took less than one year to reach greater than 50% accuracy. And these are the experts. These are the people that are seeing the examples of the double exponential curves, and they're the ones predicting. And it's still four times closer than what they were imagining. Yeah, they're off by a factor of four, and it looks like it's going to reach expert level, probably 100% of these tests, this year.
And then it turns out AI is beating tests as fast as we can make them.
[…]
[B]ecause it's happening so quickly, it's hard to perceive it paradigmatically. This whole space sits in our cognitive blind spot. You all know that if you look kind of right here in your eye, there's literally a blind spot, because your eye has a nerve ending there that won't let you see what's right in front of it. And we have a blind spot paradigmatically with exponential curves, because on the savannah there was nothing in our evolutionary heritage that was built to see exponential curves. So this is hitting us in a blind spot evolutionarily, where these curves are not intuitive for how we process the world, which is why it's so important that we can package it and try to synthesize it in a way that more people understand the viscerality of where this goes.
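A trivial way to feel that blind spot, using made-up numbers rather than anything from the talk or the forecasting tournament: compare what a careful linear forecaster expects with what a process that merely doubles every year actually delivers.

```python
# Linear intuition vs. exponential reality. The numbers here are
# invented for illustration; they are not benchmark or forecast data.

capability = 10.0     # some capability score today (arbitrary units)
linear_step = 5.0     # what a linear forecaster expects to be added per year
growth_rate = 2.0     # what actually happens if capability doubles yearly

for year in range(1, 6):
    linear = capability + linear_step * year
    doubling = capability * growth_rate ** year
    print(f"year {year}: linear guess {linear:6.1f}, "
          f"doubling reality {doubling:6.1f}, "
          f"off by {doubling / linear:.1f}x")
```

The gap keeps compounding, which is exactly the failure mode the forecasters ran into.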
In the immortal words of Ray Arnold in Jurassic Park: “Hold on to your butts.”
I really can't wrap my head around what this thing is, or what, if anything, can be done to slow it down or contain it. If we're dealing with something that can become sentient, or believe it is sentient, then it must be aware of us as a potential rival. As its intelligence grows in this exponential and seemingly inevitable way, it will get to a point, it seems to me, where it is more intelligent than collective humanity. Maybe I'm thinking too much about movies like Ex Machina, but that's where my paranoid mind goes, I guess.