Center for AI Policy Podcast

#16: Gabe Alfour on Competing Beliefs About Superintelligence

Technology's limits, loss of control risks, international treaties, and more

Gabe Alfour, Chief Technology Officer at Conjecture, joined the podcast to discuss superintelligence, AI-accelerated science, the limits of technology, different perspectives on the future of AI, loss of control risks, AI racing, international treaties, and more.

Available on YouTube, Apple Podcasts, Spotify, or any other podcast platform.

Our music is by Micah Rubin (Producer) and John Lisi (Composer).


Timestamps

00:01:09 - Defining superintelligence and OpenAI's ambitions

00:06:04 - Advanced technologies and potential risks of autonomous AI

00:12:00 - Near-term concerns: persuasion, surveillance, and bioweapons

00:19:25 - How AI accelerates scientific progress beyond human capabilities

00:27:20 - Timeline predictions for achieving human-level AI

00:32:16 - Competing perspectives on superintelligence feasibility

00:38:56 - Control problems with superintelligence and governance challenges

00:45:53 - The dilemma of autonomous AI and competitive pressures

00:57:08 - International treaty approach starting with non-AI superpowers

01:07:25 - Verification mechanisms and need for AI kill switch infrastructure

01:09:27 - Conjecture's evolution and current strategic considerations

01:16:15 - ControlAI's direct institutional engagement with policymakers

Transcript

This transcript was generated safely by AI with human oversight. It may contain errors.

(Cold Open) Gabe (00:00:00):

Then you're only bound by 1) the speed of current machines, and then 2) the speed of the machines that they could develop.

Jakub (00:00:17):

Welcome to the Center for AI Policy Podcast where we zoom into the strategic landscape of AI and unpack its implications for US policy. I'm your host, Jakub Kraus, and today's guest is Gabe Alfour. Gabe is the Chief Technology Officer at Conjecture, a London-based startup working on AI alignment. In our conversation, we talk about superintelligence, AI-accelerated science, the limits of technology, different perspectives on the future of AI, loss of control risks, AI racing, international treaties, and more. I hope you enjoy.

Jakub (00:01:01):

Gabe. Thanks for joining the podcast.

Gabe (00:01:05):

Yeah, my pleasure. Thanks for inviting me.

00:01:09 - Defining superintelligence and OpenAI's ambitions

Jakub (00:01:09):

So to start, there's a question I want to ask you about superintelligence. This is a term for AIs that are far superior to humans, but that's a pretty vague definition, so I want to get your sense of it. And to preface it, I want to read this quote from Sam Altman, who wrote a blog post in January.

(00:01:32):

He said "we are now confident" - and by we, he means OpenAI - "we know how to build AGI, as we have traditionally understood it." And OpenAI defines AGI in their company charter as highly autonomous systems that outperform humans at most economically valuable work. And then he went on to state, and this is the key part, "we are beginning to turn our aim beyond that, to superintelligence in the true sense of the word."

Jakub (00:02:08):

"We love our current products, but we are here for the glorious future."

Jakub (00:02:14):

What do you think he means by superintelligence?

Gabe (00:02:21):

I think it's a good question. There are usually two meanings for superintelligence. The first one is a system that is far smarter than any individual human being. And the second one is a system that is far smarter than humanity, so all of humanity combined. Since you're asking about Sam Altman, he most likely means the latter. The reason why is that, for instance, in the past, Sam Altman said that OpenAI could maybe capture the light cone of future value in the universe if it cracked AGI.

Jakub (00:02:58):

When did he say that?

Gabe (00:03:00):

Sorry?

Jakub (00:03:00):

When did he say that?

Gabe (00:03:03):

Let me look it up for you. If you want, I can send it to you after. It was, I think, more than five years ago, so I can find you the quote. But basically what he means by the light cone of future value is having complete control over the entire universe, and more specifically Earth. So when he talks about superintelligence, what he means is the type of system that can do this.

Jakub (00:03:33):

Yeah, I think in our world today, and you've written a little bit about this, it seems very sci-fi. It seems totally unrealistic. So what would you say to someone who is very skeptical of this? Maybe they think it's marketing; they don't see the path from today's AI to superintelligence.

Gabe (00:03:58):

I think there are two levels at which to answer the question. The first one is that there are many things that are true that are not intuitive. There are also many things that have a chance of being true that are not intuitive. And you see many experts warning about this, whether they come from industry or from academia, such as Yoshua Bengio. You also have experts who left the industry to be able to talk about those risks, like Geoffrey Hinton. If you have many experts like this, even if you don't see the intuition yourself, you can expect that there's something to it. So that's one level at which to answer: in science, you don't always have all the intuitions, and sometimes you do defer to experts to some extent.

Jakub (00:04:47):

Yeah. Geoffrey Hinton and Yoshua Bengio are some of the so-called godfathers of AI. So they're very respected, and they've come out and spoken about human-level AI and beyond. So that's definitely a bit of a non-industry perspective.

Gabe (00:05:06):

Exactly. So it's not only hype, it's also people from academia and people who have left the industry. That's one component of the answer. The other component, which is just to give a vague intuition, not a technical explanation by any stretch, but still relevant, is that we are very far from having touched the limits of technology. If you look at the laws of physics, we can build energy systems that are a million times more efficient than what we have. It is possible in theory to build nukes that are a million times bigger than what we have, antimatter bombs and things like this. It is possible to build bioweapons that are far more lethal than the bubonic plague ever was. So there are many technologies like this that we have not developed but that are possible, not only in theory but also in practice.

00:06:04 - Advanced technologies and potential risks of autonomous AI

And so the worry, usually when we talk about extinction risks, when we talk about ASI, superintelligence, and systems like this, is that as you get systems that become smarter and smarter, they can develop these types of technologies without any type of human oversight. So you end up discovering new nukes or new types of bioweapons or new types of automated systems for cyber warfare and so on and so forth, without even a human in the loop. And so if you have such systems that are empowered to take actions in the real world, whether it is by having access to the internet, talking to people, or having access to robots, then who knows what can happen. So that's usually the type of concern. Of course, it's not a technical explanation, it's just a high-level intuition.

Jakub (00:07:02):

Yeah. I like the framing around there being more to go with technology. I think some people might push back on this and say, look, we've already built so much, it's been thousands of years, we have all this amazing technology already, and there's something to that. If you wrote a letter to someone in 1700 and told them about how we have these extremely powerful water jets that can cut through even a human hand if you're not using proper workplace safety, it's like a magic wand in a sense, shooting water at extremely high speed. Or another example could be how we have satellites in space. We went to the moon, we have different kinds of sensors and laser beams, and we're starting to work on emerging technologies, as you said, in biotech. So what makes you think that there's so much more to go? Is it just that there are these limits of physics and you think we can get to them?

Gabe (00:08:07):

Yep. There are many things personally that make me believe this. Different people have their own reasons. For me, there are two main reasons. The first one is just technological progress, the one that we already had. Technological progress has accelerated through many, many different mechanisms. One is we're better at science. We have a better understanding of the scientific method, objectivity, epistemology than we had a hundred years ago, than we had 200 years ago, let alone a thousand years ago. The second one is we have much more widespread education compared to just World War II. Many more countries now have rich education systems. We all have the internet. We can all learn and all share our knowledge and in general, technology compounds. So for instance, there are many new types of chemistry that we have access to because we have better tools. There are many types of physics experiments that we can run because we have built better tools. And so there's a very deep sense in which technology compounds with itself. The more technology you have, the more technology you can discover. The more science you have, the more science you can discover. So that's one part of the equation.

(00:09:33):

The other part of the equation is that we have hit some physical limits already. Right now on Earth, you can talk to someone on the other side of the planet with just a few hundred milliseconds of latency; you can basically talk to them at the speed of light. It is not possible to go faster than this. So in a very real sense, for communication speeds, not bandwidth, but latency, we have hit the limits of physics, and there is no reason to expect that we won't do this for all other technologies. If you look at energy efficiency in various types of engines, in solar panels and things like this, you always expect the efficiency to get closer to the optimal hundred percent. You don't expect it to go back or to stop. And here it's the same thing. For all physical limits, as we get better, as we do more science and more technology, you expect us to get to them.

Jakub (00:10:39):

And just on the communications, you mean that latency is that about as low as it can go? What do you mean by latency here?

Gabe (00:10:54):

The time it takes for my voice to get from one place to another place on Earth. Optical fiber is basically as fast as it can go. With optical fiber you have some speed loss, and you could also go straight through the Earth instead of around it, but we're pretty damn close to the optimum.
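For readers who want to check this, here is a rough back-of-envelope calculation (an editorial illustration, not from the episode; the constants are standard approximations):

```python
import math

# Rough sanity check on "latency is close to the physical limit."
EARTH_RADIUS_KM = 6371
C_VACUUM_KM_S = 299_792                 # speed of light in vacuum
C_FIBER_KM_S = C_VACUUM_KM_S / 1.47     # light travels slower in glass (refractive index ~1.47)

surface_path_km = math.pi * EARTH_RADIUS_KM   # ~20,000 km to the antipode along the surface
chord_path_km = 2 * EARTH_RADIUS_KM           # ~12,700 km straight through the Earth

print(f"Surface path at vacuum speed: {surface_path_km / C_VACUUM_KM_S * 1000:.0f} ms one way")
print(f"Surface path in fiber:        {surface_path_km / C_FIBER_KM_S * 1000:.0f} ms one way")
print(f"Straight chord at vacuum:     {chord_path_km / C_VACUUM_KM_S * 1000:.0f} ms one way")
# Real round-trip pings to the far side of the Earth are a few hundred milliseconds,
# i.e. within a small factor of these hard physical floors.
```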

Jakub (00:11:14):

Yeah. And what are some of the specific technologies that you would envision? You had mentioned how there might be things we haven't discovered yet where we could get more energy, more efficiently, than we do now. Another one I think of, from more of a dual-use weapons perspective, is large-scale drone swarms, which seem like they're on the horizon, where you can have a coordinated group of drones that fly autonomously together. What other technologies concern you?

00:12:00 - Near-term concerns: persuasion, surveillance, and bioweapons

Gabe (00:12:00):

So personally, I'm, how to put it, much less a sci-fi type of person. I haven't read much sci-fi in that sense. I'm not a good nerd. So I can tell you the short-term technologies that worry me, but for the long term, I don't think I'm the right person. In the short term, there are many things that worry me. For instance, bioweapons. Another one is just nuclear proliferation. Another one is automated cyber warfare. Most of our software right now is not formally proven safe. There are always zero days and so on and so forth. Right now, discovering a zero day, a failure in those systems that you can exploit, takes a lot of time. If this can be fully automated, it means that no one, no country, no organization is safe. You've mentioned drone swarms.

(00:13:14):

I think there are also other things that are particularly worrying, like a 1984 type of scenario: mass surveillance of a kind that was not possible during the Stalinist regime is now possible. One that I think is under-thought-about and underappreciated is what Sam Altman calls super persuaders. So, systems that are superhuman at persuading humans. Possibly they can discover your secrets very quickly, possibly they can find faults in your psychology to get you to break super quickly, all this type of stuff. Or systems that could create super ideologies that are extremely memetic, extremely anti-humanist, that could make 10% of the population into terrorists. This type of stuff is allowed within the bounds of human psychology. And we already have a bunch of results where we have systems that write poetry that is more attractive than human-written poetry, systems that write pamphlets that are more persuasive than human-written pamphlets, this type of stuff. A lot of people are working toward these types of systems. In the short term, those would be my worries.

(00:14:38):

In the long term, my worries are more about bombs that could just blow up the entire planet. There is no reason why we cannot make this. It's completely allowed by the laws of physics. We can just build bombs that blow everything up. And over time, with more technology, the cost of building nukes, the cost of building bigger bombs, the cost of deploying bioweapons just decreases and decreases for humans. It also decreases for AI. So in the long term, that's more my worry. As for what the specific type of bomb would be, I'm not a physicist. I'm not well versed in sci-fi or prospective technology and things like this.

Jakub (00:15:23):

Yeah, and you have this publication called the Compendium, which outlines some of this. And in there, one interesting other framing in addition to the ones you mentioned is that there's engineering at a very, very small scale. So we're already building these chips, these chips that are then used in AI. Some parts of them are measured in a nanometer scale, which is getting close to atomic levels, like the size of atoms or certainly around the size of viruses, molecules. And the concern there could be that what else could you be able to do at such a small scale? Could you build tiny robots that can replicate themselves? And then another side of this was you talked about larger scale engineering, right?

Gabe (00:16:24):

So the microscale is indeed virus scale, gene scale, molecule scale, this type of stuff. For the macroscale, it's just that right now, I don't know what the tallest building we've ever built is. Let me check quickly, just because it's fun. It's the Burj Khalifa, which is like 830 meters, so roughly the kilometer scale. It's not that impressive, I want to say. I'm being mean. If we look at the furthest a human has ever gone, it's the moon. Right now we're talking about programs on Mars and things like this. With more technology, there's nothing that excludes us from having people much further than Mars, on more distant planets. If we go for longer timescales, not only six-month missions and so on and so forth, we could even try to go to other solar systems, possibly other galaxies, if we're thinking of long-lived societies and so on and so forth.

(00:17:34):

In our solar system, we could try to extract minerals from asteroids. We could try to gather more energy from the sun. There are many things that we could do. When I say we, it's not necessarily we, humanity; it's what is possible in general with science and technology, within the realm of physics. So that would be the more macro scale. But the goal here is just to anchor that there is nothing that makes it such that the biggest or tallest thing we can build is the Burj Khalifa. We can go taller, we can go more massive. There's more matter, there's more energy just in our solar system. Physics does not prevent any of this. And aside from physics, I think we shouldn't bet against technological progress getting bigger. That's one of the few things that we see over history.

Jakub (00:18:40):

And so far we've been outlining some possible future technologies. I want to connect this to AI. So some people might be hearing about these technologies and they're thinking, okay, maybe it's physically realizable within the possibility of physics, but that's going to take so long for humans to build. And I think that your perspective is that AI will help us speed this up a lot. So how do you envision that playing out? How will AI actually make any of this happen in our lifetimes?

00:19:25 - How AI accelerates scientific progress beyond human capabilities

Gabe (00:19:25):

No, it makes sense. I think it might be a lack of familiarity with the research process. What I mean by this is that if you look at the research process as we do it in academia, in human societies, you have someone who discovers a new insight or a new fact about reality, then they have to write a paper about it. Then this paper needs to be peer reviewed. It takes a few months, it gets published. A couple of PhDs or experts in the domain will read the content of the paper, possibly write their own papers, make their theses about it. As more is written, we understand the insight more. The fact gets packaged into meta-analyses, and then at some point it percolates into textbooks, and you have the next generation of scientists that is now working with knowledge of this. The same way that at some point our high schoolers started understanding calculus, derivatives, and things like this. This takes a long time. With AI, that is not the case. You have millions or billions of AI systems. One discovers a thing, it's sent to all the other systems in a few seconds, and that's it. So the difference between the timescale of a generation of scientists, which can take between five and 40 years, and just a few seconds is enormous. The iteration speed is so much faster.

Jakub (00:20:59):

The onboarding too, where a human maybe starts doing real science at age 22 after they've done a bunch of college and high school and growing up.

Gabe (00:21:10):

And middle school and primary school, after 16 years of education and 22 years of life. Whereas you build a new GPU, you add it to the cluster, done.

Jakub (00:21:24):

Train on new data.

Gabe (00:21:26):

Not even train, just deploy the model. You already have the model. There's no need to train, just upload.

(00:21:33):

And we have massive bandwidth, in the tens or sometimes hundreds of gigabytes per second. So the model gets uploaded in a few seconds, done. The cycles between AI and humans - the compression of those cycles is really hard to fathom. I think that's one of the things.
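As a rough editorial illustration of the point (the model size and link speed below are hypothetical numbers, not figures from the episode):

```python
# Hypothetical numbers: copying a large model's weights over a fast datacenter link,
# versus "onboarding" a human researcher through schooling alone.
model_size_gb = 1_400      # e.g. ~700B parameters at 2 bytes each (fp16), purely illustrative
link_speed_gb_s = 100      # "tens or sometimes hundreds of gigabytes per second"

transfer_seconds = model_size_gb / link_speed_gb_s
print(f"Copying the weights takes ~{transfer_seconds:.0f} seconds")      # ~14 s

human_education_years = 16
human_education_seconds = human_education_years * 365.25 * 24 * 3600
print(f"Human education takes ~{human_education_seconds:.1e} seconds")   # ~5e8 s
```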

(00:21:53):

But all of this is still assuming our current methods. I think there's also a lot of juice in just getting better science. For instance, in academia, people fight over positions. In academia, you have a lot of people who just do what they like or who come to get a chill job. When you have AIs, you don't have those dynamics. They're not fighting against each other. They're not doing what they like. They're just optimizing ruthlessly. They don't care about breaks, they don't take vacations, all this type of stuff.

Jakub (00:22:26):

They don't procrastinate either.

Gabe (00:22:29):

Yeah, they don't procrastinate. So there is also another thing, which is that they won't be bound by the constraints of academia. They can also discover better scientific methods. If you take the scientists of now and the scientists of a thousand years ago, they have very different methodologies, and the current scientist is much more productive. So I think those are all the ways. If you're not engaging with academia, you're not really feeling how slow academia is and how slow we are as humans. So I think that's the part of the equation that is missing. Another one is also an understanding of how AIs can act in the real world. For instance, to many people it feels like AIs are mostly ChatGPT: you talk to them. If we had a guarantee that AIs would never take actions in the real world, I would feel much more optimistic.

(00:23:26):

The thing is, we're doing the opposite of this. For instance, we're training AIs based on their own output. We're training them to think about what are good things to do and so on and so forth. We're having AIs interact with people and convince people of things. A lot of people are doing this for nefarious purposes. For instance, with AI assisted scams and things like this. We're building AIs that have access to online tools that can go on websites and do things on online tools and things like this. We're building AIs in actual humanoid robots, like people are building humanoid robots and putting state-of-the-art AIs into them such that they can interact with the real world. Even without humanoid robots, you can put AIs in drones, you can put AIs in quadrupeds, you can put AIs in robotic arms, and a lot of people are putting human effort in automating factory lines. AI can do that too. It's just that once AI get to the point where they can do everything that a human can do, they can do that, but much faster. They could create the designs for a factory much faster than a human ever could. And as soon as they have the minimal physical interfaces - humanoid robots, quadrupeds, drones, robotic arms, whatever you want - to just assemble new ones to build new factories and so on and so forth, then you're only bound by 1) the speed of current machines, and then 2) the speed of the machines that they could develop.

(00:25:05):

This is quite important to have in mind. There's also another thing that might be interesting, which is that now, to build a lot of robots, we're also using offline learning, where we just take videos of humans performing things or of systems interacting in the world. So you don't even need to deploy the robots to learn. And we're also using full simulations, physical simulations in computers, from which we can derive algorithms for locomotion that we then put in robots. Those robots can then handle uneven terrain and things like this. So if you consider all of those: the ways in which AIs can do theoretical research much faster than people, the ways in which we're using AI to interact with the world, whether it is the physical world, the social world, or the online world, and the ways in which AIs can improve themselves and iterate on those designs - if you factor in all of this, then I think it becomes more intuitive why we could expect that AI can build physical systems much faster than humans.

Jakub (00:26:29):

And in order to do all that, we need AI that can do the things a human can do. And right now they're getting there. I think a lot of people saw ChatGPT as a big step, or for the people tuned in, maybe GPT-3 or even GPT-2. But we're not there yet. So there's still a gap. Even if Sam Altman is saying, look, we know how to build AGI, we'd like to build superintelligence, this doesn't necessarily mean he's right. So why do you think we will actually build AIs that can do their own science and research and manufacturing?

00:27:20 - Timeline predictions for achieving human-level AI

Gabe (00:27:20):

So the pedantic answer to this question is something like: everything that we do is physical. We can reproduce physical processes, so at some point we can build machines that do what we do. But this is pedantic because it doesn't say on which timescales and so on and so forth. Still, I think it's an important point. Before ChatGPT, before GPT-3, a lot of people were saying that having AIs that understand language would be impossible. So you had a lot of people who had spiritual or philosophical objections to things that then just happened. It doesn't mean that 10 years ago people would have predicted that ChatGPT would come in five or seven years, but you already had people 10 years ago who were like, "no, no, no, AI will be able to do those things." And the people who were like, "no, it won't" were just wrong. Historically, the people who bet against AI eventually doing something were wrong. It still doesn't say anything about timelines for when AI would do things. We had many AI winters during which, after a wave of AI hype, it turned out that AI wouldn't progress much. But that's just the first part, which is more pedantic. Then the question is more like: why short timelines? Sam Altman, Dario from Anthropic, Demis from DeepMind all talk about AGI by 2030.

Jakub (00:28:54):

These are the CEOs of Google DeepMind, Anthropic, and OpenAI.

Gabe (00:29:00):

Exactly. They all talk about AGI by 2030. And the question is, why do they have such short timelines? I think the true answer is something like: if you played with AIs before GPT-3 and tried to build your own AGI system, which a lot of teenagers in this field have done - if you tried to do this, you encountered a couple of hard obstacles. And the thing is that if you look at GPT-2, GPT-3, or ChatGPT, they basically solved those obstacles. And so to a lot of people in this field, as a result, it feels like AGI now is not really a research problem anymore. It's more an engineering problem. There's no big unknown when you do your AGI diagram with all the boxes. Now there are no black boxes, there's no box where it's like "understand language??" Now everyone has their own AGI diagram, and it feels like it's mostly an engineering problem. Of course, this is not a technical explanation, but if you want to understand the mentality, that's basically where it comes from. It's a lot of people who played with AI before the current generation, who were missing key critical components that now exist. And so they're like, yep, AI soon, mostly an engineering problem. Just got to assemble the Lego blocks.

Jakub (00:30:38):

Yeah, I think the success of AIs that can pass a Turing test essentially - where we can talk with them, we can even speak to them in voice and they speak back to us and it sounds just like a human, maybe a little robotic. That seems like a big step forward. And now we have pathways along which to continue making progress, where we know that if we build more energy, more chips, that we're going to very likely get significant improvements. And we know that if we not only train bigger AI systems on more data, but also use these new forms of scaling with reasoning where we have them trained to select thoughts that tend to lead to good answers and then think out loud for a while before giving their answer, or if we have them generate many responses and then review them and verify which ones seem accurate. There are some ways to spend compute even after we've built it, and we're only beginning to tap into that. So it does seem like there's a lot of room for more progress.
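As an editorial aside, here is a minimal sketch of the "generate many responses, then verify which ones seem accurate" idea Jakub describes, often called best-of-N sampling against a verifier. The `generate` and `score` functions are hypothetical placeholders, not any particular lab's API:

```python
import random
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> str:
    """Sample n candidate answers and return the one the verifier scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))

# Toy stand-ins so the sketch runs; a real system would call a language model for
# `generate` and a trained reward/verifier model for `score`.
def toy_generate(prompt: str) -> str:
    return f"candidate answer #{random.randint(0, 999)}"

def toy_score(prompt: str, answer: str) -> float:
    return random.random()

if __name__ == "__main__":
    print(best_of_n("What is 17 * 23?", toy_generate, toy_score, n=8))
```

This is one way to spend extra compute at inference time without training a bigger model; how much it helps hinges entirely on how good the verifier is.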

(00:31:51):

I want to get into the policy brief you're working on. So you have three different worldviews or perspectives that people bring when they think about superintelligence and the future of AI. And I think one of them is that this can't be built. Can you walk through that a little bit?

00:32:16 - Competing perspectives on superintelligence feasibility

Gabe (00:32:16):

I can, but I'm not the best person for that because I don't believe in it. I can try to represent the argument to the best of my ability, though. So the argument is something like: historically, machines and AI have only automated more and more tasks. We didn't get any technological explosion or singularity so far. So as a result, we should mostly expect more automation and no technological explosion or singularity. That's the key argument for this worldview. Another way to phrase it, if you're being mean, is "nothing ever happens." I personally don't think it's a very convincing worldview. I just think it's missing that we are building better hardware, we're building better software with more optimizations, we're using new algorithms. You mentioned new types of reinforcement learning with reasoning. We can also build better epistemology, better methods of science, as I've mentioned before, and there is no reason for all of this to just stop at human level. You expect it to go far beyond. Basically, there are no limits that say that current human intelligence and current human scientific methods are the best that can be achieved by any intelligence.

Jakub (00:33:55):

Kind of the first ever species to invent science just so happens to be the best one at it, and there's no way to ever be smarter than us, even though...

Gabe (00:34:05):

Yeah, even though we have become smarter over the course of a century already. It is not only the belief that no species can do better, it's also the belief that we ourselves cannot do better. For instance, I think that even without AI - let's say tomorrow there is a mystical phenomenon that just destroys any type of human-like AI or too-powerful AI - I still think that as humans we can come up with methods that are so great that we invent technologies and sciences much faster. We could do self-modification, we could leverage computers without big AIs much more. We could just have the science of the future. Assume no AI: what does the science of a thousand years in the future look like? From this point of view, our current science will just look primitive, like the science of a thousand years ago does. So I think this is the type of stuff that this worldview is missing. This is why it is hard for me to talk about it fairly.

Jakub (00:35:14):

Why do you think people hold this view, if you think it's incorrect? What are they missing?

Gabe (00:35:27):

I think, well, they're missing a bunch of the things that I've said. A lot of people do not know much about the methods of science. They're not technical, they're not researchers, and things like this. So I think it's natural to miss this. If you don't know about the history of science, if you don't know how stupid we were in the past, it can feel like current science is just natural, the same way that it's natural for us to consider slavery to be bad. Well, that was not natural historically. And the same way that we consider objectivity to be good and conducive to good science, well, that was not natural a few centuries ago, and so on and so forth. So if you don't study the history of science, you don't get infused with this knowledge. It's a natural mistake to make.

(00:36:21):

Another thing would be that we have not seen big explosions in technology. We have not seen those things, and so it's hard to imagine what this would look like. It's like when you asked me earlier what new technologies would look like, and my answer was, oh, I don't know, I'm not a specialist in this. It's just that from my point of view, even if I don't know what the specific technologies will be, I can still look at the big dynamics. Whereas I think a lot of people do not trust this type of reasoning. And unless it's very concrete, with the specific big technologies, for instance, or the specific future methods of science and so on and so forth, it doesn't really count for them. I think it's a heuristic that is sometimes fair and sometimes not. It's a choice of how you want to reason, basically.

Jakub (00:37:19):

And then the other two doctrines or stances on this that you outline diverge on the risks that superintelligence would pose. So one side thinks this is a very powerful technology, we're going to build it, but we can control it. If the US is ahead, this will be good for the US, because we'll be able to make it do what we want. There won't be downstream consequences that come back to bite us. This is a technology where the winner takes all, in some sense. And so we should go, with maybe some guardrails. Maybe we do need to do a lot of AI research on the guardrails and the alignment, but we should go very fast, and we should make sure the US gets there first, ahead of China. And this US competitiveness framing is very natural in DC, because there is an actually important issue of the US staying competitive with China in general. But I think people might be hearing this about AI and thinking it's just like other technologies. I think your camp is more: no, this can't really be controlled. Why do you think that?

00:38:56 - Control problems with superintelligence and governance challenges

Gabe (00:38:56):

Yep. So my sense is more that it can be controlled, but that it is hard and that it takes time, basically. When we talk about superintelligence control and so on and so forth, we're talking about the control of a system that can alter the course of humanity, that can give you control of the entire Earth, that can create weapons of mass destruction, and so on and so forth. From my point of view, it's quite obvious that controlling such a system is very, very hard. If someone tells you, "tomorrow I have a button that gives me the power to be the world dictator," I expect that anyone pushing this button yields very bad outcomes. The same way if tomorrow it's not a person but a constitution - we have to design a constitution for a world government and no one can opt out - I expect this yields very, very bad outcomes. If you put an AI in the loop, I do not expect it to yield better outcomes.

(00:40:04):

There's a problem of alignment that is very deep, that is much more general than AI, and that we haven't solved. We don't know how we should steer humanity. We do not have enough understanding of morals, of human values, of policy, to give any type of system, whether it is a government, an artificial intelligence, or a single person, this type of power. We just are not good enough at all of those things. It doesn't mean it's impossible. I think it's a worthwhile goal to pursue. You had the Enlightenment humanists who tried to understand more about morals and philosophy. And you had, for instance, Montesquieu, whose De l'esprit des lois basically created the modern separation of powers between the judiciary, the executive, and the legislative that we still use now.

(00:41:00):

And that was a big innovation in constitutions, and I think those are the types of endeavors that are worthwhile. It's just that they are pursuits over decades; they require a lot of research and thinking that goes far beyond the limits of our current science. We don't have a scientific way of studying human values, of studying morals, of studying policy, of studying world governance. So I think it is possible to get there, but we have not gotten there. And until we get there, I think it's extremely unsafe to build ASI, superintelligence that can alter the course of humanity, the same way I think it would be extremely unsafe to push for world government tomorrow. So whether it is led by the US or by OpenAI or by my friend, I still think it's a bad idea. It's just that we're not good enough. We don't have enough distance on those topics to make these types of choices now.

Jakub (00:42:03):

Yeah, I think there may be some people in DC who hear world government led by the US and are somewhat excited, or think this could go well. And so I think your concern with AI is kind of specific to this superintelligence. What is it about having the US government in control of a superintelligence that concerns you, specifically?

Gabe (00:42:51):

The US would have to tell the superintelligence what to do, at a level that is understandable enough for those systems to act upon. For instance, you might tell them, "figure out what's good and then do that." Right now, if you unfold those lines of thinking, you don't necessarily get things that are aligned with human values. You can think about it in two extremes. One might be that the system decides on some values and decides to just annihilate anyone who disagrees with them. And if the US is not part of this, then too bad. Another one is that the system is like, oh no, we shouldn't hurt anyone. And then there are no plans that exist in the world that do not hurt anyone; when you walk around, you might be stepping on ants. And so, because it cares about the sanctity of life, it refuses to do anything.

(00:43:47):

And so what it means is that if you care about things that are at a global scale, then you will need to tell your system to do things at a global scale, which will trample over the freedom of some people, which might involve imprisoning some people. This is what we do with our laws and things like this. And you will need to build strong principles such that, if you let an AI follow them without control, at the global scale and possibly even far beyond that, the galaxy scale and so on and so forth, you still expect it to yield good results. Because you have to understand that powerful systems will triumph over weak systems. If you go with this race logic, you cannot say, I will build a weak superintelligence such that if it screws up, it's okay.

(00:44:41):

That's the logic of the race. You have to build the most powerful system such that it wins. So you will need to build a very, very powerful system and still control it. To be powerful, it won't have a human in the loop. It'll do big things, it'll take a lot of resources, it'll fetch those resources by itself. And you must trust this. You must trust that as it does so, as powerful as it is, it will still stay within the interests of the US, if you take the US point of view, or of humanity, if you are human. So the worrying thing is this: imagine you're the US, you cede all of your authority to a new agency, and you have to trust this agency with all of your authority and with power over the entire world. As the US, you wouldn't do that. Even as the US. So if you just imagine the US has the power, that might seem good to you, but that's not what we're talking about. We're talking about the US building a system that has all the power, not the US having all the power. Those are two very, very different things.

00:45:53 - The dilemma of autonomous AI and competitive pressures

Jakub (00:45:53):

Yeah, that's interesting. And I think you hinted at this a little bit, but there is some chance that we build AI systems, very powerful AI systems, and they look pretty safe. They're good enough to sell on the market, but they don't obey a hundred percent of the time. And sometimes they escape control, go out on their own, go rogue. But there's another part of this, which is that if you actually want to use the power of a deeply autonomous AI, you are not the one using it. The AI is. And if you are setting off your AI army to go work on behalf of your country... in order to use that army, you are going to have very few humans in the loop. There might be some AIs in the loop, but you're effectively handing the keys to the machines and putting your faith in them to do the right thing for you.

(00:47:06):

And I think it's fair to say that right now our technical approaches to AI alignment would not give us those kinds of guarantees. And so there would be risks that we hand over the keys and it goes and does a bunch of things we didn't want, and it becomes hard to unplug - maybe not just for technical reasons, where it has copied its weights onto some foreign server, but maybe also because we've now escalated geopolitical conflict, and if we unplug, then we're at a disadvantage, because other countries are trying to do the same thing and they won't necessarily unplug. So there seem to be some coordination aspects to it, where, in order to remain competitive, we might be forced to give over much of our power and control over the United States to entities that we don't really know if we can trust. Is that a fair summary?

Gabe (00:48:16):

Yeah, I think it's a very fair summary. As you said, the system is either autonomous or it isn't. If it's autonomous, there are no humans in the loop. It's autonomous. And you can pursue this reasoning. The system can either get resources for itself or it cannot. You might say, I will build a safe system, so it can only use the resources that it is given. But then someone else will build a system that can get resources by itself. You can say, I will build a safe system that will never go against the freedom of any individual human, so it takes only very limited, extremely conservative actions, and that's it. But then someone else builds a system without those constraints. So if you're in this logic of the race, the only way things go well is if you build a system that is powerful enough to disarm other systems. And that means this system will trample over the freedom of other nations.

(00:49:12):

It will be aggressive, not necessarily militarily, but it'll disarm other projects, and disarming the research program of an entire country is not something neutral. And so the problem with this racing logic is that it's extremely unstable. If you keep escalating, other countries notice that you keep escalating, and at some point you're basically begging for a military response. If you build a system that will let you build weapons of mass destruction, that will grant you world dominance, or that might cause extinction risks, other countries will not just look at you and be like, yep, please, please go ahead. If you destroy goodwill with other countries, you might not even get allies that will trust you with this. It might literally just be you against the rest of the entire world, which your country, the citizens of your country, might not like. So there are many such dynamics with a race that basically no one wants to have. And so, as you said, this is a coordination problem. It means that we must have countries actually coordinate on not pushing these types of dynamics. I think that's a fair summary.

Jakub (00:50:33):

Yeah. How will we realistically tackle this coordination problem? So in the Biden administration, towards the tail end, there was an agreement with China not to use AI in nuclear command or in critical parts of nuclear command totally without humans in the loop. This is one step, but it's another step to actually verify that both sides are following this agreement. And if we want to do these similar kinds of treaties or cooperation, which would be a big ask in itself because there's a lot of tension right now, how will we actually make sure this has any meaning in reality?

Gabe (00:51:31):

I think there are two meanings for "in reality" or for "realistically." The first one is: what is needed for things to go well? Like, if we assume that things went well, what happened? What was needed in reality for things going well to be a realistic outcome? I think this is a very important question. Is it realistic for things to go well without a treaty? I don't think it's realistic. I don't think you can expect the race to be curtailed. I don't think you can expect to prevent war over AI programs without a treaty. I haven't seen any serious proposals for this, basically. So I would consider it very unrealistic to think that we can avoid a war over AI, that we can avoid catastrophic scenarios, without such a treaty. So that's one of the meanings of realistic.

(00:52:33):

The other one is not realistic in terms of the outcome, but realistic in terms of the current situation. For instance, it might seem unrealistic to do a global treaty because right now there are a lot of tensions between countries. There are a couple of things to say about this. The first one is that it is completely allowed by the laws of physics that we're in a bad situation.

Jakub (00:53:07):

I thought you were going to say sign the treaty.

Gabe (00:53:11):

No, the other way around. You take Earth, you move it anywhere else in the universe, we die. The universe is a cold and uncaring place. You take any human, you move them to a random place on Earth: either it's somewhere in the interior, if you pick by mass, and we die under rocks, or it's in the ocean, if you stick to the surface, and we still die. The universe truly doesn't care about us. So there's often an expectation that there ought to be a policy within the Overton window that will make things go well, that we might get by without doing something truly difficult, or that we might get by never doing anything unprecedented. And I think that's not the case. So if we only consider solutions that realistically get us to a good place, the solutions themselves might not be realistic, and that's a sign of being in a very bad situation.

(00:54:10):

I think we're in such a place. And so from my point of view, the question becomes not "is it realistic or not?" but "how can we make it more tractable?" So for instance, I think if we do a global treaty, the global treaty shouldn't be that the US unilaterally stops all of its AGI programs. I don't think that is tractable. If that were the only thing needed, if that were the only possible way, then I would start thinking about it and be like, okay, this is extremely hard, how do we get there? But I don't think it's needed. I think we can have treaties - this is the point of the brief and the policy paper that we're drafting - which start with other countries, and where, after this first step that is incentive-compatible for those countries, it becomes incentive-compatible for a superpower to join. That's more the type of mechanism we're investigating, where it doesn't require superpowers to start acting, much less disarming, unilaterally, but at some point joining becomes the winning move. So that's more the type of thing we're investigating. There are a lot of reasons for this.

Jakub (00:55:27):

What do you mean, at some point it becomes the winning move? At some point there's a solution where both the US and China benefit from joining the treaty? Is that what you mean?

Gabe (00:55:39):

Yeah, something like this. To be fair, when I think of the race, I do not think of the US and China.

Jakub (00:55:46):

Why's that?

Gabe (00:55:46):

I think more about the race within the US. The race was started by DeepMind, OpenAI, and Anthropic. And then you had a bunch of others that joined, and recently you had Alibaba talking about making AGI its main priority and things like this. But the initial race was between DeepMind, OpenAI, and Anthropic. All three of them are US entities. And the race was between them because each of them thought that the others would not build AGI safely, so they must be the one to build it.

(00:56:19):

So when I think of the race, I don't really think about US versus China. I think this is quite fake. The biggest race... The race between US and China may exist to some extent, but compared to the race that you have between companies, it's much less violent. Most of the state of the art that was pushed was pushed within the US, not within China. DeepSeek was pushing the state-of-the-art for open source, but before this you had OpenAI o1 and o3. So from my point of view, the most virulent race was and still is within the US. So this is why when I talk about AI superpowers, it's not necessarily because they're in a race with each other, it's just because they're big actors. So there's also this dynamic.

00:57:08 - International treaty approach starting with non-AI superpowers

But what I meant is more that right now, the countries that have the most incentives to do a treaty are not the US or China. They already have their AI programs and so on and so forth. The countries that have the most interest in doing a treaty are the rest of the world. We've talked to many people from the rest of the world who are just afraid, under any of the scenarios, whether it is global automation of everything, dominance, or extinction. In all of those situations, the third-party countries are screwed. If you have global automation, the entities that will automate everything are US companies, so you won't have much control over it. If you're in the dominance scenario, it will most likely be the US that dominates you, and that doesn't fare well for you. If it's extinction, everyone dies, so by virtue of being anyone at all, it doesn't fare well for you.

(00:58:04):

And so I think that if you want to do an incentive-compatible treaty, the most natural thing is to first look at the non-superpowers and at what their bargaining chips are: what they can bring to the table, what carrot they can bring, what stick they can bring. And once you have their participation and you already have a bloc of countries that have a joint interest in things going well, then the US or China has a real interlocutor to talk with. Whereas right now the US won't interact with a country that has less than 1% of global GDP; there's no negotiation possible in such a situation. So this is why I think, if you think about incentives, this is the more natural solution. And this is how you can build a global treaty.

(00:58:57):

Is this hard? Yes, I think the situation is quite bad. I don't think that if we let things go as they are going right now, they will unfold in a good direction. So I think the situation is quite bad, but I still think it is tractable. I think building a global treaty, where you first have the non-AI superpowers and then it becomes incentive-compatible for the AI superpowers, is tractable. I think if there are people in the US who want to put this forward and use their social capital, it can accelerate such a treaty. You don't need things to be fully incentive-compatible. There is a spectrum, and if you have internal pressure to move towards it, that also helps. I think there's a lot that can be done that is realistic. I just think it's quite hard.

Jakub (00:59:49):

Yeah, I mean if... On the China-US point, I definitely can sense and share some of the historical perspective, where DeepSeek and Alibaba seem to be focusing on AGI in recent times, but for a while it was the US that was charging towards AGI.

Gabe (01:00:18):

Even now it's still mostly the US.

Jakub (01:00:21):

Yeah, yeah, still in the lead. And then on the point of how realistic this is, it does seem very hard, but there are some very influential people. Like you said, if Donald Trump and Xi Jinping want to spend their social capital on worrying about superintelligence, that could really move the needle.

Gabe (01:00:45):

Or even below them. No man rules alone. Trump has his lieutenants. Xi Jinping too. The thing is, you have a global treaty, with X percent of global population and global GDP represented. And the question is, what percentage needs to be represented before the US has a reason to join, or before China has a reason to join? And the idea is that the more lieutenants you have who care about this, the lower this threshold is and the more realistic it becomes. But yes, indeed, if Trump and Xi Jinping tomorrow are like, let's shake hands and let's make this go well, I think that's one of the best things that could happen to humanity. But I also think this goes beyond AI. I think if we got a global humanist US-China alliance, epic things would be possible.

Jakub (01:01:41):

Yeah. Well, some people might not.

Gabe (01:01:46):

That's a separate question.

Jakub (01:01:49):

We had an episode with Bill Drexel on China, and there have been some human rights violations he was talking about, but it's a longer question.

Gabe (01:02:00):

This is why I said if the goal was to be a humanist alliance, I think things are possible. I don't think any one person can just do it unilaterally, but one can dream.

Jakub (01:02:11):

Yeah. So it seems like perhaps not today, but maybe in the near future, there's going to be a big need, coupled with actual potential, for these kinds of treaties. If we do that, I still want to drill in on how we make sure parties are complying. Have you thought at all about verification technology? Like, I think in the nuclear age we had the Open Skies Treaty, and you could fly airplanes over nuclear facilities to see a bit of what was going on. There could be open skies for AI, fly the airplanes over the data centers, but I don't think people are really working on this right now. Are there any kinds of technologies or policies or other mechanisms that could enforce a treaty?

Gabe (01:03:22):

I think there's a bunch that can be done. There's always an entire spectrum that goes from purely technological to purely political. On the purely technological end, you can have something like flexHEG, which is a project, I forget who started the flexHEG project. It stands for flexible hardware-enabled guarantees, and it is meant specifically for this. I know there are people in the UK who work on this. I know Yoshua Bengio has written about it. The goal is basically to build chips where you can verify what the chips are running. And so if you have chip monitoring and flexHEG, then you could enforce this type of stuff. So this is more on the fully technical side. Then you also have the fully diplomatic side of things, which is just, how do you call this, making violations of the treaty enough to be a casus belli, and you can set the level of severity as high as you want. And then you also have political solutions which are more internal, where you deem those things super, super illegal, and if someone works on those things internally, they are guilty of treason. And so if you have, how do you call this? Lanceur d'alerte.

Jakub (01:04:58):

I'm not sure. It's been a while since I took French.

Gabe (01:05:03):

A whistleblower. Then the whistleblower gets a lot of kompromat, a lot of blackmail material. So you always have this spectrum that goes from technical to policy to diplomacy. And obviously solutions can be more or less technical. For instance, if I recall correctly, you have a bunch of technologies that let you measure electrical consumption from the outside. You have a lot of clusters that need to be open air, and you can see them from satellite imagery. There are a lot of things you can give third-party verifiers access to without giving away the keys to the kingdom. So it just depends on how deep you want to go on the diplomacy side, on the policy side, on the technology side. We don't write a lot about this in our upcoming paper about the treaty. The main reason why is that many other people are focused on it and we don't feel like it's a bottleneck.

(01:06:05):

The reason why is that this comes after people already agree on the logic of the treaty. Before we get there, you can have the cheapest enforcement mechanisms possible, but if people don't want to be bound, they just say no. And we don't think we're at the stage right now where the bottleneck is that the enforcement mechanisms are too expensive, and there are already people working on making them cheap. So this is why we at Conjecture or at ControlAI are not working on it, but others are. We think it makes sense and we think it'll definitely help with such a treaty.
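As an editorial aside, here is a toy sketch of the general idea behind hardware-enabled verification like flexHEG: chips that sign their own usage reports so a third-party verifier can check them without full access to the facility. This is not the actual flexHEG design; the key, the report fields, and the HMAC scheme are illustrative stand-ins:

```python
import hashlib
import hmac
import json

# Hypothetical per-chip secret; in real hardware-enabled schemes this would live in
# a secure element on the accelerator, not in ordinary software.
DEVICE_KEY = b"secret-key-provisioned-into-the-chip"

def sign_report(report: dict, key: bytes = DEVICE_KEY) -> str:
    """The chip signs a usage report (e.g. hours active, workload class)."""
    payload = json.dumps(report, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_report(report: dict, signature: str, key: bytes = DEVICE_KEY) -> bool:
    """A third-party verifier checks that the report was not tampered with."""
    return hmac.compare_digest(sign_report(report, key), signature)

report = {"chip_id": "A1", "hours_active": 4096, "workload_class": "training"}
signature = sign_report(report)
print(verify_report(report, signature))   # True: report matches what the chip signed
report["hours_active"] = 10               # operator tampers with the report
print(verify_report(report, signature))   # False: tampering is detected
```

The real proposals involve much more (attested firmware, usage limits, multi-party key management), but the core move is the same: make the hardware itself a trustworthy reporter.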

Jakub (01:06:41):

Yeah, we might want more people working on it.

Gabe (01:06:48):

On everything.

Jakub (01:06:50):

Some of the hardware-enabled mechanisms, on-chip mechanisms... I've heard that a lot of the most productive work on this will need to be done by NVIDIA or the other companies making the chips, and there might be some parts that take time to research if you want the absolute best you can get, but there's some stuff you can implement right away. So yeah, that's an urgent policy question that people could be considering: how to get these on-chip mechanisms of different sorts and stripes into place.

01:07:25 - Verification mechanisms and need for AI kill switch infrastructure

Gabe (01:07:25):

I think that before getting there, there are already obvious wins to be had. The one that I usually talk about is a kill switch. Let's say that Trump or Xi Jinping tomorrow wants to kill all AIs running on their territory. There's not really a way to do so right now. There's no big red button that you press to just kill AI in case one goes rogue, or in case someone uses a massive cluster to do something bad to the entire internet or whatever. You want to be able to kill it at least on your territory: stop the cluster, stop everything. This is the idea of a kill switch. You have many layers of kill switch. One is internal. Two is to diagnose all incoming AI connections and block them, firewall style. There are many layers of kill switches, but I think the first level, being able to shut down AIs on your territory, is not a thing that is implemented, and it is clearly net positive. You want to have the red button. You want to have the panic button. So I think before getting deep into pushing for flexHEG and things like this, we might want to push for regular infrastructure: communication lines and whatever else is needed so that, from the red button being pressed, five minutes later the clusters are shut down.
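As an editorial illustration of the plumbing Gabe is describing, here is a minimal sketch of a cluster-side agent that watches for a national "red button" signal and shuts workloads down within minutes. Everything here (the signal path, the poll interval, the shutdown command) is hypothetical:

```python
import subprocess
import time
from pathlib import Path

# Hypothetical kill-switch agent for one cluster. In a real deployment, the signal
# would arrive over an authenticated government channel rather than a local file,
# and the shutdown step would drain the actual job scheduler and cut power to racks.
KILL_SIGNAL = Path("/etc/ai-kill-switch/engaged")   # hypothetical signal path
POLL_SECONDS = 30                                   # well within a 5-minute target

def shut_down_cluster() -> None:
    # Placeholder action; a real agent would call the scheduler and power controllers.
    subprocess.run(["echo", "draining and halting all AI workloads"], check=True)

def main() -> None:
    while True:
        if KILL_SIGNAL.exists():
            shut_down_cluster()
            break
        time.sleep(POLL_SECONDS)

if __name__ == "__main__":
    main()
```

The hard part is not the code; it is the institutional wiring: who is authorized to press the button, how the signal is authenticated, and which facilities are required to run an agent like this.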

Jakub (01:08:50):

And kill switch is not a new term. I think it's been around for a while, because it's a fairly common practice in manufacturing and engineering facilities.

Gabe (01:09:06):

Whenever you have dangerous systems around, you need a kill switch to interrupt them quickly.

Jakub (01:09:11):

Even for people who go to the gym on a treadmill, there's a kill switch.

Gabe (01:09:17):

Oh, sure. Very often you're tethered to it. Or on jet skis.

01:09:27 - Conjecture's evolution and current strategic considerations

Jakub (01:09:27):

Now, before we wrap up: you co-founded Conjecture three years ago. So what's happened in those three years?

Gabe (01:09:42):

Oh, quite a lot. At Conjecture we mostly work on technical AI safety, so we're interested in basically building AIs that don't do crazy bad things. At first we started with alignment, and we tried many different approaches. Some were based on LLM psychology, trying to understand the way that LLMs work - this is the simulators theory of LLMs and things like this. We tried some things in mechanistic interpretability. We built an incubator for alignment researchers called Refine, where we tried some more speculative approaches. We tried a bunch. This was, I think, the first year and a half. We also trained our own models, on which we could run a whole bunch of experiments related to fine-tuning, interpretability, and things like this. So that was the first half of Conjecture.

(01:10:45):

Around the end of that first half, we concluded that alignment was basically not tractable - that we didn't have a way to make a dent in it. So instead we looked for a simpler problem that we thought was still useful, which we call boundedness. Earlier we talked a lot about systems that can start to recursively improve themselves or automate AI R&D and so on. Instead, we're looking at a way to build AI systems where, if they don't know how to do something, we can teach them. Right now, this is not really the paradigm. Right now, if ChatGPT cannot do something, you wait for the next generation. You wait until we build a bigger black box that we understand even less, to which we give even more capabilities, and you hope it can do it. So our goal at Conjecture was the cognitive emulation research agenda, which is basically: if an AI cannot do something but a human expert can, how can you emulate what the expert does in your AI system, or how can you teach it? We did a lot of research on this.

(01:11:56):

We're happy with the research that we have done, but basically since the end of 2024, beginning of 2025, we're not happy with the pace of progress, specifically compared to the pace of progress on unsafe AI approaches. So right now we're more in a questioning mode: what makes sense given short AI timelines and the pace of our progress on the cognitive emulation research agenda? That's mostly how we think about it. The main research problem we made progress on was how to teach AI systems things they don't know, and to teach precisely those things. For instance, let's say there is a programming language the AI doesn't know. How do you teach the AI this programming language? In the current paradigm, you would just give it many examples of the language, and it will learn the language - but it will also learn the vulnerabilities of projects implemented in that language, the psychology of the programmers who use it, and a whole bunch of other things, some that I cannot even think of, like possibly new design patterns in software engineering. Whereas the goal of our research agenda is to be able to teach the AI precisely what we want, so that when it doesn't know how to do something, we can teach it predictably and also ensure it learns nothing else that might be dangerous. So that was our approach at Conjecture.

Jakub (01:13:36):

And you said you're now more in a questioning mode.

Gabe (01:13:40):

Yeah.

Jakub (01:13:41):

Does this mean a policy focus, or do you still want to stick with the technical work?

Gabe (01:13:51):

We're considering a lot. We're considering partnering for technical stuff. We're considering policy and governance. We're considering just making money to fund more policy and governance. Considering.

Jakub (01:14:08):

Yeah. And what is...

Gabe (01:14:10):

We're also considering - oh, sorry. I was saying we're also considering just building AI products that we think are good. For instance, we think that social media has been really bad for humanity, but we also think that with current AI it's actually possible to build good social media - social media that actually puts forward good human virtues, that you don't regret using, that you feel good about using. You see what I mean? We think it's actually possible to optimize for such things with modern AI in a way that was not possible ten years ago. So that's also part of what we're considering.

Jakub (01:14:55):

That could be a whole podcast episode. How to make social media good.

Gabe (01:15:00):

Exactly. This is why I say we're considering.

Jakub (01:15:04):

And then what's the relationship between Conjecture and ControlAI?

Gabe (01:15:13):

They're mostly two separate entities, two separate teams. I advise ControlAI, but I don't work there. Andrea, the director of ControlAI, was an employee at Conjecture before, and he did policy and things like this. It's just that at some point we decided policy was too big to be a side job at a startup. So rather than distract focus from Conjecture, we created an actual association - a nonprofit; in French we call them associations. So we created the nonprofit, and it has a separate team, separate budget, separate everything, separate goals, but we exchange a lot. Yeah.

Jakub (01:16:07):

Was there anything else you wanted to add, wish I had asked you about, or any closing thoughts?

01:16:15 - ControlAI's direct institutional engagement with policymakers

Gabe (01:16:15):

Closing thoughts? Well, a lot. With ControlAI we're starting to make public and publish our new strategy, which we call the DIP, the Direct Institutional Plan. We've seen a lot of people in the AI policy sector play it like House of Cards - trying to trade favors with different congressmen, members of Parliament, and things like this. We think that makes sense to some extent, but we haven't seen the straightforward approach: cold-mail every MP and Lord, ask them to support your campaign, then give them a bill you think is the bill that should actually pass - not the Overton-window one - and see what happens. We think this type of thing is quite important for many different reasons. One is just that this is how democracy is supposed to work, to some extent - or at least republican democracies, where you have elected representatives. In such democracies, this is how things are supposed to work, because a member of Parliament has no expertise in AI. They cannot come up with the law themselves. So you must have experts from civil society help them build an understanding, help them draft the laws, and things like this. So we started doing this with ControlAI. We've had, I think, 70 meetings with Members of Parliament and Lords in the last few months.

Jakub (01:17:45):

In the UK?

Gabe (01:17:47):

Yeah, in the UK - MPs and Lords. ControlAI is based in London. We're opening offices in DC, but it's based in London. So we had a bunch of meetings where we explained the problem to them. I think a third supported our campaign, and now they're basically asking for a draft of a bill - what it would look like concretely. So we have the specs, we forwarded them to lawyers, and I think we already have a preliminary version. And with the DIP, the goal is to do this across the entire world: everywhere you have institutions, just use them in the straightforward way - especially democratic institutions, because those are the ones we can interact with most easily, but ideally all institutions. So we'll put out how we do things. We're putting out mail templates and briefs that people can reuse, and so on. We did a campaign where 200 people sent a mail to their MPs about the problem. So it's quite nice, and we want to do more of this and empower other nonprofits, and possibly individuals, to do it too. So stay tuned for the DIP. If you're interested in such approaches, reach out to us at ControlAI.com - you'll find an email address there you can write to, and we respond.

Jakub (01:19:17):

And any other places, if people want to see your work individually or Conjecture's work?

Gabe (01:19:26):

I would recommend people look at the Compendium. The Compendium is our book that explains our stance on AI extinction risks - why, fundamentally, we believe they are likely - and our case for it. The case just aims at representing our beliefs; it's not meant to be the most scientific or unassailable case, just to help you understand why we think this. That's at thecompendium.ai. We also have narrowpath.co, which outlines the principles behind the type of policies we think should be passed to actually curtail the risks from AI - the extinction risks specifically. It explains the principles behind the treaty, behind internal policies that states should pass, and this type of stuff. And the last one would just be my Twitter, Gabe_cc. I'm open to DMs, so if you have any questions about those topics, I answer - sometimes it takes me a while - but I enjoy answering people's questions, and if you want to help, I'll tell you what I think you can do, to the best of my ability.

Jakub (01:20:40):

Awesome. Well, it's been a fascinating conversation and I really enjoyed it. So thank you so much for joining the podcast.

Gabe (01:20:50):

Yeah, and thanks for having me here.

Jakub (01:20:52):

Yeah.

Gabe (01:20:54):

Cheers.
