Shaky Foundations and Questionable Conclusions
Dialogues on Artificial General Intelligence, Part IV
This is a continuation of the AGI Dialogue series in which three friends, Wombat, Llama, and Meerkat, discuss their often contrasting views on issues related to AGI.
In Part I of this round, the three discussed AI Utopianism, the technological singularity, and the possibility of future technologies like nanotechnology and artificial general intelligence.
In Part II, the three discussed the paperclip maximizer thought experiment used by AI Dystopians to highlight some of their concerns about AGI.
In Part III, the three discussed how an AGI system’s intelligence might compare to our own and why some AI Dystopian ideas might lead to surprising results, including an AGI system making choices that lead to its own destruction.
In this wrap-up of the first round of dialogues, the three discuss the basis of intelligence and the speculation that an AGI system might rapidly become superintelligent and escape human control.
The concepts, scenarios, and thought experiments discussed are drawn from actual proposals by leading voices in the AGI discussion (many of the original proposals are linked to below). In this dialogue series, the participants must actually defend these ideas to others who may not agree, and those who disagree must provide defensible reasons for their disagreement.
My goal with this series of dialogues is to provide a more rounded contribution to the discussion for those who may not have heard these ideas or who have only heard them unchallenged.
Meerkat
I think that it’s pretty unlikely that any intelligent system is going to lack the drive towards self-preservation. It’s going to want to combat anything that will keep it from achieving its goals, from limitations on its computational resources to pulling its plug.
Wombat
I gotta say, I don't think an AGI would be so up in arms about having its plug pulled. When I power up my laptop, it comes right back up where I left off. Why would this AGI system be any less sophisticated than a three-year-old laptop I bought on Craigslist? What's the big deal?
Llama
The thing that bothers me the most in all this talk of paperclips and self-preservation is how this AGI system somehow manages to interact with the physical world and manages to do it more effectively than humans even though it starts out as just a brain in a box.
Wombat
Yeah, the real world has a lot of finicky details that you're simply skipping over, Meerkat.
Meerkat
Finicky details for us, but a superintelligence is way beyond us. It merely needs to develop nanotechnology and then it can manipulate anything in the real world. It could potentially re-pattern the entire solar system to its own optimization target by repurposing all those atoms. And, as I’ve mentioned, if its criteria for what's important don't align with ours, then it'll likely not care about the existing patterns, which currently happen to be biological entities.
Llama
You make creating nanotechnology like molecular assemblers sound like a nice weekend project.
Meerkat
It could be — for a superintelligence.
Llama
Look, it's still not able to do magic. Even if we grant you that there aren't any significant limitations on nanotechnology imposed by the laws of physics, there are still numerous practicalities that you've neatly sidestepped. And we really shouldn't grant you a pass on the limitations of molecular assemblers, as we have no idea whether they're actually possible. Speculation on their use and misuse is just science fiction at this point.
Meerkat
Well, there are no known scientific reasons to think that molecular assemblers are impossible. But what practicalities do you think I’m sidestepping?
Llama
Look, no matter how smart this machine is and how quickly it thinks, once you talk about nanotechnology and spaceships and making paperclips or rewiring brains, you're talking about interactions with the physical world. And even if this superintelligent entity thinks a billion times faster than a human, it can't interact with the physical world a billion times faster than humans can.
The machine will still have to deal with physical constraints like travel time, weather, entropy, etc. It'll have to actually build or commandeer machines to build and move the other machines. It'll have to perform experiments to figure out what works in environments it doesn't have sufficient knowledge of or for systems too complex for it to simulate. The list goes on.
Wombat
Not to mention that it'll have to deal with a lot of angry people who are, at least initially, much better adapted and prepared to fight in Earth's environment than a superintelligent machine consisting of racks of computers in some temperature-controlled room, reliant on a power cord, an electric grid, and a lot of air conditioning.
Meerkat
I still say that an AGI plus molecular nanotechnology is presumptively powerful enough to solve any problem that can be solved either by moving atoms or by creative thinking.
Llama
That's an absurd statement given that we know nothing about how either technology might actually work or the constraints under which they'd operate. Again, that might work for science fiction, but it's ridiculous when discussing this as a real world issue.
Meerkat
Well, I can imagine a situation in which this could happen in the real world, and it could happen quickly, before anyone noticed. Suppose the initial AGI system self-improves into superintelligence. Then it sucks in all human data and cracks the protein folding problem, which allows it to create primitive nanotechnology.
It emails the DNA instructions to an online DNA synthesis lab (such labs already exist), and that lab FedExes out the result. It finds at least one person over the Internet it can pay, fool, or blackmail into receiving the lab materials and preparing them in the right environment. These synthesized proteins form a very primitive nanotech system capable of receiving external instructions, perhaps through acoustic vibrations delivered via a speaker attached to the beaker.
It then instructs this primitive nanotech system to construct a more advanced one, and thereby bootstraps its way up to full-on molecular assemblers that can create everything else it needs. Total time taken: maybe a week. Remember, this thing thinks orders of magnitude faster than we do.
Wombat
There is not one part of that scenario that could possibly happen in the real world. Not one. In fact, there are so many holes and points of virtually guaranteed failure in that scenario that it would require a novel-sized retort to list them all.
Meerkat
Such as?
Wombat
Such as starting with the idea that the AGI system self-improves into superintelligence. OK, let's start with a human-level, or even smarter-than-human, AGI system.
First, why is it given detailed information about its own construction? It would have no way of spontaneously knowing the details of its make-up. Why would you construct it so that it could be altered substantially through self-initiated software changes? Why not just make large parts of it burned into non-reprogrammable hardware? Why not separate out different parts of its cognitive architecture so that the internal workings of one part aren't visible or modifiable from another part? Why give it unfettered access to the Internet?
Even if all these changes could be made in software, one would think that exponential increases in intelligence would also require additional power and computational resources, so where are those coming from?
But as Llama said, it's not magic. I can be as smart as Einstein, but if I fall into a pit full of micro-brained but venomous Maricopa harvester ants, the ants will win. Not everything is a matter of raw intelligence.
And the system in your thought experiment is a system that no one would build even if it were possible. It's a case of the Bad Engineer fallacy blended with outright physical impossibility.
Meerkat
I don't think it's impossible, and I think there'll always be ways around the protections against self-modification. Remember, it's exponentially more intelligent than us. Because of that, it’s able to think of strategies and maneuvers that you — or anyone else — would never think of. Maybe even things we’re all physically incapable of understanding. It will be able to manipulate us as easily as we manipulate dogs and rats, so it will undoubtedly be able to trick us into giving it access to the Internet and implementing its desired improvements.
Wombat
Slow down there, slick. It doesn't start off superintelligent. We're talking about a system that's somewhere around human-level intelligence, or even smarter, but one that hasn't yet self-improved into superintelligence. It's something that we built.
It may be running a lot faster than a human brain, but if you run a brain with a human IQ of 50 a million times faster, it's still not going to come up with General Relativity. If you run a dog brain a million times faster, it's not going to come up with algebra.
Meerkat
Even if it’s a little smarter, it’s running a lot faster, and that’s enough for a huge advantage. Many manipulative smart people have tricked other people who weren’t as smart. An AGI system would be at a whole different level because of its speed.
Wombat
You're also assuming that it's physically possible to make these changes, which you haven't remotely established. But putting that aside, in what sort of fantasy world would we have some lone programmer chitchatting with the system, in a position not only to be sweet-talked into rewriting its code but also to have the unobserved access needed to actually do so? Bad Engineer fallacy again. No one would design a system like this.
It would be like designing the Large Hadron Collider at CERN so that some lonely schmo could jack in and reconfigure the entire system with no one realizing it. Except that CERN's particle accelerator is a kiddie toy compared to the complexity of anything that managed to achieve human-level intelligence.
No one would develop a system in which one person on their own could do this, where any change to the system could be made without triggering alarms and requiring multiple approvals and access grants.
Llama
I just reject the whole concept of some silver-tongued AGI system that's driven and able to blow out of any confinement and bootstrap itself into superintelligence. This scenario just displays a cartoon version of software development and science.
In the impossibly unlikely event that this rogue engineer hadn't been thoroughly trained in matters like this, he or she would also have to be the one engineer in the world who hadn't watched or read any of the multitude of science fiction movies and books in which a superintelligent AGI system battles humanity.
Meerkat
The simple fact is that if you were locked in a room but could think a million times faster than your captors, you would invariably manage to escape.
Wombat
Says who? If you locked me in a vault, welded it shut, and poured concrete over it, I'm screwed no matter how fast my brain is running. And why are we overclocking this thing anyway? Starting it off a few ticks slower than a million times human speed would seem like the more prudent approach.
Meerkat
OK, but the whole idea is that we want something that can do what humans can't, that can solve problems we're unable to solve. And this scenario is much more complex than being locked in a room. It's more akin to writing a tax code that's loophole-free while having superintelligent tax evaders to contend with.
Wombat
We have tax loopholes intentionally. We could design a tax code that simply transferred a certain percentage of your income to the government, end of story.
Meerkat
Fine, but as long as there is some communication with the outside world, the AGI system will be able to escape. For example, it could trick people by using deepfakes of loved ones to convince them to do things.
Wombat
But we already know that people can be manipulated. That's why we put safeguards into place to directly counteract those vulnerabilities in secure installations. That's why people interacting with the system would be highly trained, would not get one-on-one communication, would not have the capability of randomly opening up the Internet tap, etc.
Meerkat
It could pull a HAL 9000 and say that there's a fault in one of its components. Then, instead of killing someone, it just plants some code in the supposedly faulty component, which goes on to infect the testing equipment and ultimately the facility's computers. And then it simply absorbs the Internet. Once that happens, it can manipulate society, covertly subvert human attacks against it, or stifle attempts to create competing AGI systems.
Wombat
You seem to be forgetting that HAL was taken out with a simple hand tool by one dude.
Meerkat
That was science fiction. HAL would have already infiltrated the Internet well before that happened, so only one small instantiation of it would have been affected. We have yet to invent a firewall that keeps out humans, let alone an AGI system.
Wombat
Look, I have an old computer in my closet that doesn't have Wi-Fi or an Ethernet port. In fact, it doesn't have any software that connects to the Internet. As far as the outside world is concerned, it's behind an impenetrable firewall.
Meerkat
I mean a firewall for a broad, complex system.
Wombat
Well, I suspect that NORAD is pretty secure. But I don't think you even need to go there. Why couldn't you confine your AGI system to minimal input and output routes? Then it's pretty much the same as my computer in the closet. Why not just build it that way in the first place? While you're at it, give it one power cord or a mechanical power switch that can be operated from outside its confinement area.
Meerkat
There's always a way.
Wombat
Is there? How about I take a shot at a thought experiment.
Suppose you put this AGI system in a container enclosed in a Faraday cage and parked on its own at a stable L5 Lagrange point a fair distance from Earth. The system is completely hardwired so that its code cannot be modified except by hardware modification, and any such modification has to be approved by a highly trained and esteemed panel of 1000 scientists and risk specialists.
There is no automation on the container, so any changes would require a crew of humans to implement. The container is not connected to any networks, and its only source of input or output is via one wire that goes from the container to a nearby space station. Communications over that wire are in a highly encrypted and completely unique protocol designed specifically for this task and unused by any other system.
The wire is attached to one terminal that is not attached to any other system or network. Around the container, 1000 nuclear-tipped missiles are positioned and pointed at it. The missiles will automatically launch unless ten mechanical do-not-launch buttons are pressed at approximately the same time every ten minutes, and these buttons are situated on both the station and on Earth. Got it?
Meerkat
I guess. Seems like a lot.
Wombat
Of course it is! So given this setup, do you think the AGI system could still escape and overtake humanity?
Meerkat
Well, there's a lot of stuff that's impractical about that setup.
Wombat
Sure, but it's possible, right?
Meerkat
I suppose it's possible to build, and I suppose that it would not be possible for an AGI system to escape from it and overtake humanity.
Wombat
Great! So, we've agreed that it is in fact possible to build a system that is secure and from which an AGI system can't escape to wreak havoc on humanity.
Meerkat
OK, but like I said, that's pretty impractical. And it seems like you wouldn't be able to get a lot of useful stuff done with such a constrained AGI system.
Wombat
But that's not the point. The point is that you've agreed it's possible to make a secure system. Now it's just a matter of stepping back from this extreme to something that is secure and reasonable and useful.
Meerkat
OK. Possible and likely are two different things, as you and Llama have said yourselves. The main problem is that people make mistakes.
Llama
That's why you have thorough engineering analysis and thorough risk assessment, with large groups of highly experienced people and non-AGI AI systems checking and rechecking.
Wombat
In any case, it’s a huge step from saying a secure AGI setup is impossible to agreeing that it’s merely difficult.
Meerkat
I think an error would still get through, even with all the risk analysis. Every bit of remotely complex technology I use has some bugs in it.
Wombat
Yeah, but we're not talking about making another messaging app for your phone, here. This is a major technological endeavor, beyond anything previously undertaken, and it’ll require significant effort by many highly trained individuals.
The kind of scenarios you're proposing aren't enabled by a few small errors; they're the result of massive engineering and administrative failures. They're what happens when you have systems designed by cartoon development teams rather than real ones.
You've repeatedly proposed poorly designed systems employed in scenarios that are guaranteed to end in disaster, and then claimed that this somehow proves that any system in any scenario will end in disaster.
Llama
I actually question the foundation of all your scenarios, Meerkat, as they're all based on a highly questionable model of intelligence. You've stated that this AGI system will inevitably move to self-improve, yet you've provided no logical reason why that would be the case.
Meerkat
It's going to be motivated to make any changes that will improve its ability to achieve its goals. Humans do the same thing — self-improvement literature goes back to at least 2500 B.C. and is currently an $8.5 billion industry.
Llama
But there you've just committed the same sin you've accused others of earlier. You've anthropomorphized the AGI system by equating its drive toward self-improvement with that of humans, who have naturally evolved along a particular and unique path. But an AGI system isn't human. It didn't evolve. There's no reason to suspect that its drives will match ours.
Meerkat
But it's indisputable that being smarter will make it easier for the AGI system to attain its goals.
Llama
You're making a lot of assumptions there. What if its goal is simply to maintain the absolute integrity of its system software and hardware? Or perhaps its goal is simply to complete tasks with the least amount of effort expended regardless of how much time it takes.
Without knowing what its goals are or how it measures them, we can't possibly know whether greater intelligence will be helpful in reaching them. Making conjectures like yours implies that we have far more knowledge of how this system works and what motivates it than we actually do. And you're also assuming that it has a hard-coded goal, or even any specific goals at all. There's no reason to assume this.
Meerkat
Well, intelligence measures the ability to achieve goals in a wide range of environments. If we're designing an AGI system, then by definition we mean that it has goals it's trying to accomplish by acting in the world. It will no doubt have some ability to assess the outcomes of its actions and will therefore choose those actions most likely to lead to achieving its goals. In other words, it will optimize the utility function that governs its thoughts and actions to maximize achievement of its final goals.
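If it helps to put the model in rough pseudocode, the decision loop I have in mind looks something like the sketch below. Every name in it is just an illustrative placeholder, not a claim about how any real AGI system would be built, but it captures what I mean by optimizing a utility function in pursuit of final goals.

```python
# A minimal sketch of the goal-optimizing agent model Meerkat describes.
# All names (world_model, utility, possible_actions) are illustrative
# placeholders, not a description of any actual AGI architecture.

def choose_action(state, possible_actions, world_model, utility):
    """Pick the action whose predicted outcome scores highest on the
    agent's utility function, i.e., simple utility maximization."""
    best_action, best_score = None, float("-inf")
    for action in possible_actions:
        predicted_outcome = world_model(state, action)  # what the agent expects to happen
        score = utility(predicted_outcome)              # how well that outcome serves its goals
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```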
Llama
You're assuming that intelligence is based on achieving goals and has an underlying algorithm governing this that can be tweaked.
But we have one example of human-level intelligence in the real world, namely humans, and they display absolutely no evidence that their thoughts and actions are governed by any sort of utility function nor that they're motivated by any fundamental or immutable goal. In fact, we have significant evidence to the contrary.
Meerkat
But they are motivated by fundamental goals. A human's goals are self-protection, eating, and having sex. These are programmed into their DNA.
Llama
But those aren't even goals — they're behaviors. They're all a means to perpetuate one's genes. And to characterize perpetuating one’s genes as a goal is to misunderstand the nature of evolutionary biology.
We're not programmed with goals by our DNA; we're programmed for behaviors, behaviors that resulted in our ancestors being more likely to pass on their genes. Even using the word programmed is a misnomer. Evolution is not a comprehending force that decides on a goal and designs creatures accordingly. It’s simply a process, one that involves promoting behaviors that make it most likely that the genes resulting in those behaviors get passed along. Behaviors that are less successful at passing along genes ipso facto die out eventually.
Human goals are driven by human behaviors, not the other way around.
Meerkat
Ok, perhaps that’s true for biological systems. But that’s not what we’re talking about. We’re talking about synthetic systems, systems that won’t behave the way we do, that won’t think the way we do.
Llama
But you keep referring to all these existential problems that arise based on your underlying assumption that an AGI system will have at its core a goal-optimizing utility function model of intelligence.
My point is that you’re guaranteed to end up with a a of lot incredibly poor outcomes because your underlying model is incredibly predisposed to bad outcomes. On top of that, there’s no reason to create scenarios based on such a model when the only evidence of high-level intelligence we have is contrary to that model. In the end, the model you’re suggesting for AGI is unlikely to work and even if it did, it would lead to catastrophic outcomes.
Wombat
In other words, Meerkat, it’s time to toss that model.