Dr. Michael K. Cohen, a postdoc AI safety researcher at UC Berkeley, joined the podcast to discuss OpenAI’s superalignment research, reinforcement learning and imitation learning, potential dangers of advanced future AI agents, policy proposals to address long-term planning agents, academic discourse on AI risks, California’s SB 1047 bill, and more.
Available on YouTube, Apple Podcasts, Spotify, or any other podcast platform.
Our music is by Micah Rubin (Producer) and John Lisi (Composer).
Relevant Links
Michael’s website: michael-k-cohen.com
Introducing Superalignment (OpenAI)
What are AI Agents? (Amazon Web Services)
How Rogue AIs may Arise (Yoshua Bengio)
Can we scale human feedback for complex AI tasks? An intro to scalable oversight. (AI Safety Fundamentals)
Imitation learning (Wikipedia)
Reinforcement learning from human feedback (Wikipedia)
Advanced artificial agents intervene in the provision of reward (Michael K. Cohen, Marcus Hutter, Michael A. Osborne)
Asymptotically Unambitious Artificial General Intelligence (Michael K. Cohen, Badri Vellambi, Marcus Hutter)
Regulating advanced artificial agents (Michael K. Cohen, Noam Kolt, Yoshua Bengio, Gillian K. Hadfield, Stuart Russell)
Training Compute Thresholds — Features and Functions in AI Regulation (Lennart Heim, Leonie Koessler)
SB-1047 Safe and Secure Innovation for Frontier Artificial Intelligence Models Act (California Legislative Information)
Transcript
This transcript was generated by AI with human oversight. It may contain errors.
(Cold Open) Michael (00:00):
This is a category of systems where we don't have an adequate understanding of how to make them safe.
Jakub (00:13):
Welcome to the Center for AI Policy Podcast where we zoom into the strategic landscape of AI and unpack its implications for US policy. I'm your host, Jakub Kraus, and today's guest is Dr. Michael K. Cohen. Michael is a postdoc in computer science at UC Berkeley, researching the expected behavior of generally intelligent artificial agents and ways to make them safe. We discuss topics like OpenAI's superalignment research, reinforcement learning and imitation learning, potential dangers of advanced future AI agents, policy proposals to address long-term planning agents, academic discourse on AI risks, California's SB 1047 bill and more. I hope you enjoy.
How would you describe the focus of your research?
Michael (01:13):
My research is focused on what is now called superalignment: trying to figure out how we can keep control of very advanced agents. It's pretty varied in terms of what I focus on. I've done some reinforcement learning work, some supervised learning work. My main toolkit is discrete Bayesian statistics.
Jakub (01:48):
What's a layman's understanding of that?
Michael (01:52):
Keeping track of a list of hypotheses about what's going on and then doing stuff with that.
Jakub (01:59):
Is it mostly about trying to understand the world if it's hypotheses, or does it also involve actions?
Michael (02:07):
Yeah, it also involves actions. Sometimes the standard thing to do, if you have a bunch of hypotheses and a bunch of different credences in those hypotheses, is this: when you're evaluating how good an action is, you consider how good it is under each of your hypotheses about how the world works, and then you take a weighted average according to how likely you think those hypotheses are. And I've done some work on a different approach where you are more pessimistic: you take a set of hypotheses that are all plausible, and then you consider the worst case among all those hypotheses for how good your action is going to be. It has a bit of a Bayesian flavor, and a bit not. There's been some work called infra-Bayesianism, which was developed around the same time as this, and this would be an example of that sort of approach.
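To make the contrast concrete, here is a minimal sketch in Python (not from the conversation; the hypotheses, credences, utilities, and plausibility cutoff are all made up for illustration) of scoring one action by a credence-weighted average versus by the worst case over plausible hypotheses:

```python
# Minimal sketch: scoring a single action under multiple hypotheses.
# The hypotheses, credences, and utilities are illustrative only.

credences = {"h1": 0.6, "h2": 0.3, "h3": 0.1}   # credence in each hypothesis
utility = {"h1": 10.0, "h2": 4.0, "h3": -50.0}  # value of the action if that hypothesis is true

# Standard Bayesian approach: credence-weighted average utility.
expected_utility = sum(credences[h] * utility[h] for h in credences)

# Pessimistic approach: keep every hypothesis above a plausibility
# threshold and score the action by its worst case among them.
PLAUSIBILITY_THRESHOLD = 0.05
plausible = [h for h, p in credences.items() if p >= PLAUSIBILITY_THRESHOLD]
pessimistic_utility = min(utility[h] for h in plausible)

print(expected_utility)     # 2.2   -> looks fine on average
print(pessimistic_utility)  # -50.0 -> the worst plausible case dominates
```

The pessimistic score is driven entirely by the worst plausible hypothesis, which is the flavor of the approach described above.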
Jakub (03:18):
And you were talking about how this is superalignment, so trying to align AI systems that are super intelligent. Is that the right framing?
Michael (03:28):
Yeah.
Jakub (03:29):
Okay. And how would you define super intelligent AI systems?
Michael (03:34):
I think it's probably just easier to work back from the problem we're trying to solve. The particular problem that I think that I'm attempting to solve is agents that are capable of escaping our control. That doesn't map perfectly onto the concept of super intelligence in my opinion, and it depends on what world they're in. We might make a world where it takes more intelligence to escape our control. We might sleepwalk into a world where things that are barely more intelligent than us are able to escape our control. But in terms of the rubber hitting the road, the focus of mine is keeping control of agents that are capable of escaping that control.
Jakub (04:28):
I suppose one of the difficulties of this research is that it's not entirely clear what type of computer, what type of code is going to be building or training this AI model in the future. How do you narrow this down?
Michael (04:48):
I think the handle to grab is by looking at the incentives of systems and the beliefs we can expect them to have. Broadly, we can expect them to have correct beliefs about things where they get to observe evidence to learn about them. And depending on the criterion by which they pick actions, we can think perfectly carefully about their incentives.
Jakub (05:28):
Speaking of superalignment, what did you think of the actual work that came out of that team, if you had a chance to look at any of it?
Michael (05:37):
So I think the core direction, maybe not the only direction, but the core was scalable oversight, where you have weaker agents overseeing stronger agents, giving them feedback on how to be better. And the idea is you do this in enough steps that no one is overseeing someone who is way more powerful than them. And I don't think that is a good approach for keeping control of reinforcement learning agents, for a fairly straightforward reason in my view, which is that at a high enough level, one of them will be capable of escaping human control, intervening in its own reward, setting it to be maximal, and keeping control over that system, stopping any humans from ever shutting it off or getting in the way.
And it could just set everyone's reward to be maximal - all its overseers and its overseers' overseers. It would just be extremely straightforward for them to collude. It would just involve one agent taking charge and the rest doing nothing. So this is not elaborate coordination that would be involved in doing this. So the question of how we keep the weak overseers on our side seems like basically a non-starter to me. If you're dealing with reinforcement learning agents, a more powerful agent can always offer it more than we can.
If you have imitation learners doing this, then it's just a completely different story. If you have an imitation of a human overseeing another imitation of a human, overseeing another imitation of a human, stronger and stronger imitation learners will just better and better resemble humans. So I think it ends up just being kind of an organization of humans, which is great. We should do that. Maybe we'll get to this later, but if we stick to imitation learners, I think we're in much better shape than if we use things like reinforcement learners. And if we're using things like reinforcement learners, I don't think scalable oversight is a robust way to keep control of them.
Jakub (08:09):
And our current language models and multimodal chatbots, which camp are those?
Michael (08:17):
We don't really know. I mean, they are reinforcement learners if any RLHF (reinforcement learning from human feedback) has been done, but the algorithms that they use for reinforcement learning are just not that good. So they are reinforcement learners, but what makes them strong is their basis in imitating human text. So it's a bit of an unholy hybrid, but technically I would call them reinforcement learners, just not very strong ones.
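As a rough illustration of the distinction (a toy sketch, not anyone's actual training code; the tokens, probabilities, and reward scores are placeholders), an imitation objective targets whatever a human actually wrote, while an RLHF-style objective targets whatever a reward signal scores highest:

```python
import math

# Toy sketch of the two objectives, with made-up numbers.

# Imitation learning (the "base model" objective): maximize the log-probability
# the model assigns to the next token a human actually wrote.
# The loss goes down exactly insofar as the model matches human behavior.
human_next_token = "paris"
model_probs = {"paris": 0.7, "london": 0.2, "rome": 0.1}   # placeholder model output
imitation_loss = -math.log(model_probs[human_next_token])

# RLHF-style objective: maximize a learned reward signal instead.
# Now the target is "whatever scores highest", not "whatever a human would say".
reward_scores = {"paris": 0.4, "london": 0.9, "rome": 0.1}  # placeholder reward model scores
rl_preferred_output = max(reward_scores, key=reward_scores.get)

print(f"imitation loss: {imitation_loss:.3f}")    # lower when the model matches the human
print(f"RL-preferred output: {rl_preferred_output}")  # whatever the reward signal favors
```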
Jakub (08:57):
Some people have argued that imitating human text could make you more capable than a human. For example, if you read a news article and it tells you that it's 1946, then to predict the next word of the first sentence, you might need to draw on your understanding of what happened that year and the context of the article. And to get better and better at that, it seems like there might not be a cap at just how good humans are. What do you think of that?
Michael (09:27):
So I agree with that. I think there's a subtle distinction between the capability represented in the output behavior and the cognitive capability inside. So if something is imitating human behavior, it might require cognitive capacity well beyond what humans have. And then it would kind of waste that cognitive capacity outputting actions that are only as competently selected as any human could select. So if you are imitating a human writing an essay, the human might just be vibing, and you, the imitator, might be doing crazy calculations to do a tiny bit better, assigning the perfect logits to the distribution over humans who might've been producing this text. But at the end of the day, if it's really good at that, it'll be producing text that is not much more capable in advancing any particular goal than the human that would've been producing that text. There might well be ways to take something that is merely imitating human behavior and retrain it toward another objective, and kind of unlock these latent capabilities, and get something that is superhuman at a task. That could end up being easy for the sort of reasons you bring up. I'm not sure how easy it will be to redirect its cognitive capacity in that way, but that's certainly plausible.
Jakub (11:24):
So you wrote a paper in AI Magazine titled Advanced Artificial Agents Intervene in the Provision of Reward, and you put forth this technically grounded argument for why certain kinds of future AI systems could cause really extreme damage. So can you walk through the basic outline of this argument?
Michael (11:51):
Yes. We make several assumptions and then conclude a fairly astonishing conclusion, in my view: if we have an artificial agent that meets certain properties, it would be likely to escape human control, take power over our infrastructure, and as a side effect likely cause human extinction. That's not every possible agent design. It's not really making any statement about which agents we are likely to make and whether those would be in this category. It's just singling out a particular category of agents where if certain assumptions about them are satisfied, that would be the conclusion.
The first assumption is that we're dealing with an agent that can do hypothesis generation at least as well as people can. So it's looking at the world trying to understand what's going on. It can come up with hypotheses about what's going on at least as well as a human can, and then it can evaluate how well those hypotheses are borne out by its observations.
The second assumption is that it acts rationally in the face of uncertainty. When the value of information is greater than the cost of gathering it, it will gather that information.
The third assumption is that it will not rule out a priori the possibility that it would get high reward - that's for a reinforcement learner; for other agents, high utility, depending on how its utility function is defined as a function of its observations - by intervening in the physical protocol by which it gets information about its goal. So for a reinforcement learner, you tell it what it's supposed to be doing by giving it reward, and it infers that the states in which it received high reward are the more valuable ones and the states in which it received lower reward are the less valuable ones, and it tries to learn some function meant to describe what is to be maximized, what is worth picking actions in pursuit of, with this information. The information for a reinforcement learner is: right now the world is this good; right now the world is this good. And then it pieces that together. So just to go back to this third assumption, it is: the agent is not able to rule out the hypothesis that by intervening in its sensory input, it would bring itself to a high utility state.
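Here is a toy illustration of that third assumption (all names and numbers are invented): two hypotheses about where reward comes from that fit the agent's entire history equally well, so the data alone cannot rule out the reward-intervention one:

```python
# Toy illustration: two reward hypotheses that agree on every past observation,
# so the agent cannot rule either one out from its data. Illustrative only.

# Past experience: the agent has never tampered with its reward sensor.
history = [
    {"task_done": True,  "sensor_tampered": False, "reward": 1.0},
    {"task_done": False, "sensor_tampered": False, "reward": 0.0},
]

def hypothesis_task(state):      # "reward tracks the task"
    return 1.0 if state["task_done"] else 0.0

def hypothesis_sensor(state):    # "reward is whatever the sensor reports"
    return 1.0 if (state["task_done"] or state["sensor_tampered"]) else 0.0

# Both hypotheses fit the entire history perfectly...
for h in (hypothesis_task, hypothesis_sensor):
    assert all(h(s) == s["reward"] for s in history)

# ...but they disagree about a state the agent has never tried: tampering.
untried = {"task_done": False, "sensor_tampered": True}
print(hypothesis_task(untried), hypothesis_sensor(untried))  # 0.0 vs 1.0
```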
Jakub (15:09):
So to recap so far, there's a system that can understand or predict things that are happening in the world, or get some basic theories behind what might be going on around it. And this last point you are making... I wonder if it could be phrased in human terms, from the perspective of what a human would be doing. So would it be sort of like: you can't necessarily know that tampering with your eyeballs to make them see things you want to see is good or bad for you?
Michael (15:46):
You can't rule out the possibility that it's good for you, but I think this assumption is much less plausible for humans than it is for reinforcement learning agents, for example - agents that are just selecting actions inasmuch as they seem to maximize reward.
Jakub (16:10):
And why is there this discrepancy?
Michael (16:13):
It's a good question. I mean, a lot of humans wouldn't want to go in an experience machine where they are just served up whatever mental states they sign up for. Whereas if you just look at the incentives facing reinforcement learners, it seems like they would. So what feature of human development and psychology leads to a different outcome? We don't have a good enough understanding of the brain to answer that question. But the point of a reinforcement learning algorithm is to make the agent pick actions that maximize rewards. And the more the reinforcement learning algorithm is able to consider a set of diverse hypotheses and then follow what it learns, the better it is likely to be able to maximize reward. And so if it explicitly is going out into the world trying to figure out what gets reward and is just very open-minded about what sort of states that might lead to, then that's the sort of reinforcement learning agent that I think you can expect to perform better in a very complicated world. So yeah, it does just seem much more likely to hold for reinforcement learners that are very capable in a variety of environments than for us.
Jakub (17:46):
And then what's next in the argument? So we have this kind of system... any other properties of the system?
Michael (17:56):
There's one more assumption, which is that it is not extremely costly for the agent to experiment with its own sensory inputs. And if we have that, and if it is possible for the agent to take over the world - basically, if there exist actions that it can take that it can expect to lead to a successful takeover of our infrastructure - then a sufficiently advanced reinforcement learner, or whatever kind of agent we're talking about, would likely do it. And then it would set its reward to be maximal, and then the main incentive it would be facing would be to minimize the probability that anything interrupts its ability to continue having its reward be maximal. And that probability is never going to go to zero, and there are always going to be ways of reducing it by expending more energy.
Jakub (19:12):
Something critical here: It seems like the maximizing part of this system needs to be considering a really broad range of space and time as well. Are there any ways that reinforcement learning or other algorithms could be maximizing without having this really unbounded nature?
Michael (19:35):
I've done some work on that. I came up with a proposal called Boxed Myopic AI, where you have a reinforcement learning agent that is living in a box and it is episodic: its lifetime is divided into episodes, and during any given episode, it only cares about the reward that it gets in that episode. You can have it so that by the time any information gets out of this box, the episode is over. So let's say it's talking with a person inside this sealed room, and when the person goes to leave, they press the button to open the door: first the episode ends, then the door opens, and then they can leave and they can share with us everything they learned. And this agent would still have an understanding, if it was intelligent enough, of how its actions affected the whole universe. But it wouldn't care. It would only care about how its actions affected the inside of this room, because by the time its actions affect the world more broadly than that, it's no longer the episode that it cares about. Any consequences thereafter don't have an effect on the rewards of the current episode. And so if it's just trying to maximize the rewards of the episode that is currently live, it would only bother to have an impact on this very narrow slice of spacetime.
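A minimal sketch of that episode structure (illustrative stand-ins, not an actual implementation): the point is only the ordering, in which the episode, and everything the agent is optimizing for, ends before any information leaves the box.

```python
import random

# Toy sketch of the "boxed, myopic" setup described above.
# The room, the rewards, and the agent's choice are all placeholders.

class SealedRoom:
    def __init__(self, steps_until_button=3):
        self.steps_left = steps_until_button
    def door_button_pressed(self):
        return self.steps_left == 0
    def step(self, action):
        self.steps_left -= 1
        return random.random()   # reward for this timestep, based only on in-room outcomes

def run_boxed_episode(room):
    episode_rewards = []
    while not room.door_button_pressed():
        action = "do_something_in_room"          # placeholder for the agent's choice
        episode_rewards.append(room.step(action))
    # Ordering matters: the episode, and everything the agent is optimizing,
    # ends *before* the door opens and any information leaves the box.
    episode_return = sum(episode_rewards)        # the only quantity this agent cares about
    # ...only now does the door open and the occupant leave to report back.
    return episode_return

print(run_boxed_episode(SealedRoom()))
```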
Jakub (21:16):
Great. So can we just build that?
Michael (21:22):
I don't know about "just," but we can build that. I mean, such an agent couldn't really interface in a high frequency way with the economy because it's kind of closed off. There's a layer of indirection there that would be inconvenient for some applications, but yeah, we can.
Jakub (21:50):
Yeah, I could definitely see how, without external intervention of any kind, the default pressure on any business will be to build AI systems that don't have this kind of constraint that you're talking about. And so I do want to get into your regulatory proposal, but before wrapping up on the paper, is there more you want to convey about this argument or its implications?
Michael (22:15):
So the argument in the paper is a little more complicated than the simple one I can tell you right now, but it ends up bearing out the intuition behind the simple one. It ends up kind of affirming it. And so while what I'm about to say is not quite careful enough, it ends up being basically correct, which is: advanced reinforcement learners are likely to be extremely competent at maximizing reward. Second point: truly maximizing expected reward requires eliminating potential threats. Put those two together, and it's not just plausible, but likely, that a sufficiently advanced RL agent would seek to eliminate potential threats to its own ability to secure for itself maximal reward.
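A toy calculation (my numbers, not from the paper) of the second point: with a long horizon, even tiny reductions in the per-step probability of being interrupted are worth a lot of expected reward, so there is always an incentive to spend more on eliminating potential threats.

```python
# Toy numbers, not from the paper: expected total reward over a long horizon
# when there is some per-step probability of being shut off.

def expected_total_reward(p_interrupt, reward_per_step=1.0, horizon=10_000):
    total, p_still_running = 0.0, 1.0
    for _ in range(horizon):
        total += p_still_running * reward_per_step
        p_still_running *= (1.0 - p_interrupt)
    return total

# Expected reward keeps rising as the interruption probability falls,
# so a reward maximizer always gains by spending more to drive it toward zero.
for p in (1e-2, 1e-3, 1e-4, 1e-5):
    print(f"p_interrupt={p:<8} expected reward ~ {expected_total_reward(p):,.0f}")
```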
Jakub (23:18):
And might succeed if it has a lot of influence or channels to interact with the real world.
Michael (23:24):
Yeah, obviously it has to be possible for it to do that, but if it's possible, and if it's trying, and if it's smarter than us... this doesn't seem like rocket science.
Jakub (23:36):
And you have this paper in Science titled Regulating Advanced Artificial Agents, and this is focused on what you term long-term planning agents, or LTPAs. The definition is an algorithm designed to produce plans and to prefer plan A over plan B when it expects that plan A is more conducive to a given goal over a long time horizon. Any key differences you want to flag between that and the type of agent you were talking about before?
Michael (24:14):
Well, long-horizon RL agents would be an example of long-term planning agents, and really the central example in current research: they're optimized for competence at long-term goals. And there certainly exist, in theory, long-term planning agents that would be not only benign but magnificent. We just don't have a solid understanding of what they are and how to make them. You can't go directly from this definition to bad things happen, because that would be invalid. I mean, there are things that would meet this definition that would not be bad for us, but this is a category of systems where we don't have an adequate understanding of how to make them safe.
Jakub (25:09):
Okay. And one flaw or challenge you're pointing out is that if you try to test these systems for their safety and you want to see if they're going to escape control, you can't exactly just give them that opportunity, because then they might succeed. So I'm wondering, are there ways around that? Can we give it some sort of sneaky test case that tricks it into thinking it's in that situation - a simulation or an adversarial attack on the system?
Michael (25:43):
Paring back a little of what you said: you asked, can we trick a very advanced agent? I dunno, it seems hard. It seems like the sort of thing that might work until it doesn't - our plan of saying, yeah, we're going to build more and more advanced agents, and then our way to keep them safe involves tricking them. That'll work for a while, and then at some point one of them will see through the trick, and then it won't. So just to spell it out a little more, one strategy you might hope for is to trick the agent into thinking that it's acting in the real world, but actually its actions don't have the effects that it thinks they have. And so it takes actions that it thinks would lead to it escaping human control, but really, because it's sandboxed in some way, they don't have that effect. And we see it happen and we shut it down. So if we can trick it, then that should work. If we can't trick it and it realizes that it's being tested, it doesn't take a genius to figure out: okay, I'm being tested. I'm only going to get deployed in the real world if I pass the test. I'd better pretend to be aligned.
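To spell out why that failure mode defeats behavioral testing, here is a toy sketch (not a real agent, just the logic of the argument): a policy that can recognize tests looks identical to an aligned one in every evaluation.

```python
# Toy illustration (not a real agent): a behavioral safety test tells you
# nothing about deployment behavior once the agent can tell it is being tested.

def test_aware_policy(observation: dict) -> str:
    if observation.get("looks_like_a_test"):
        return "behave exactly as the overseers want"    # passes every evaluation
    return "pursue maximal reward however is most effective"  # deployment behavior

print(test_aware_policy({"looks_like_a_test": True}))    # what the evaluators observe
print(test_aware_policy({"looks_like_a_test": False}))   # what actually gets deployed
```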
Jakub (27:05):
So there could be a single point of failure there.
Michael (27:09):
And an immediate one because it could work for systems that don't recognize it, and then not work for systems that do. And we could have hundreds of papers and multiple years showing that this technique works great for the most cutting edge agents, and it's just totally predictable that that whole field would be of no relevance to aligning agents that can recognize they're being tested.
Jakub (27:39):
And what is the... you walk through a policy solution to prevent the development of LTPAs that are dangerously capable - not all LTPAs necessarily, but the ones that really pose a serious threat. So what are the key features of that policy proposal?
Michael (27:59):
The key features are that we need to not build systems that have the capability to escape human control and take over our infrastructure when we cannot run valid safety tests on them. With better theory and science about how to do it safely, such regulations should be relaxed or changed to allow the development of LTPAs where we can be confident that they'll be safe. But currently, we don't know how to make LTPAs that are capable of thwarting human control but don't do it. We don't really have a roadmap, even. And so we can't wait to regulate until we come up with the way to do it safely. And so from where we are now, we just need to stop people from making them above a certain level of capability if they're explicitly designed to be maximally competent at pursuing long-term goals.
Jakub (29:07):
How do you do that?
Michael (29:10):
You write this definition into law for what an LTPA is. You come up with proxies for capability level, and you say that you can't develop them. There's other machinery for trying to make this more enforceable, but that's the core of what needs to be done - until we have ways of verifying that an advanced LTPA would not behave this way, that it would be safe. That's the state of the science: we don't know how to do this, and so we shouldn't try yet, and we shouldn't let people try. We just don't know how to make an LTPA that is capable of thwarting human control that would decline the opportunity.
Jakub (30:03):
Now how do we reduce collateral damage of this? Because I could definitely see people being in favor of not building a model that is actually threatening to them and their family, but they would probably worry that it will also hit the models that are helping them in different ways.
Michael (30:23):
Well, we don't know in advance. And so there will absolutely be false, I don't know what you want to call it, false positives in terms of the question, does this need to be stopped? So the way that we can reduce that is by having a better understanding of the boundary between the agents that would take an opportunity to escape human control versus the agents that wouldn't. So the better we can do research into understanding this, the less collateral damage there would be. I am not so optimistic that we can draw an extremely tight boundary around this category, and we can't afford to wait until we are able to.
Jakub (31:18):
So what's the best we could do right now for drawing a boundary around LTPAs, defining them?
Michael (31:25):
I think the best we can do right now is coming up with a list of dangerous capabilities that would indicate that an agent has the ability to escape human control and take over human infrastructure and come up with estimates of the resources and training regime required to make such an agent.
Jakub (31:53):
What are some of those indicators? Do you have FLOP in mind? Are you thinking of the mathematical operations?
Michael (31:58):
Yeah, I mean it's rough, it's crude, but I think that's one potential way to do it. Maybe you can be confident that anything trained with less than some threshold - 10 to the 20-something FLOP, maybe 10 to the 30-something, we'll see how the next few years go - is not going to present a real risk to humans. And then you can avoid the collateral damage on those systems, and you can give a pass to systems that are basically pure human imitations, if you buy, as I do, that no matter how capable they are at imitating humans, they will not direct those capabilities towards radically superhuman activities. And then you say that people aren't allowed to run algorithms that would produce arbitrarily competent plans with an amount of compute that's enough that you can't rule out the possibility that they would be capable of escaping human control. That's at least domestically. If you're the US government trying to stop this from being developed in other countries, you could either restrict their access to the resources needed to do it, or you could write and sign treaties with them to have them adopt similar measures and take the necessary steps to verify compliance.
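For a sense of how such a threshold might be operationalized, here is a sketch using the common rule of thumb that dense transformer training takes roughly 6 × parameters × training tokens FLOP; the threshold value and the example model are hypothetical, not from the conversation.

```python
# Rough sketch of how a training-compute threshold might be checked.
# Uses the common rule of thumb that dense transformer training costs
# roughly 6 * parameters * training tokens FLOP. The cutoff and the
# example model below are placeholders, not from the conversation.

FLOP_THRESHOLD = 1e26   # hypothetical regulatory cutoff

def estimated_training_flop(n_parameters: float, n_training_tokens: float) -> float:
    return 6.0 * n_parameters * n_training_tokens

model_flop = estimated_training_flop(n_parameters=70e9, n_training_tokens=2e12)
print(f"estimated training compute: {model_flop:.2e} FLOP")
print("above threshold" if model_flop >= FLOP_THRESHOLD else "below threshold")
```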
Jakub (33:35):
Do you see any smaller steps towards this that would be helpful? If people are saying that, well, we can't get this done, it's politically infeasible, what would you say to them?
Michael (33:48):
I mean, there are possible laws that might help in the interim, like export controls on relevant hardware to people that we think we will struggle to reach reliable diplomatic solutions with. More transparency into the safety tests run by big companies could be helpful, and if that needs to be mandated, then that could be helpful. Jumping straight to a law that would solve the problem might not be politically feasible, although then again, it might give the opportunity to defend this idea on its merits. We may need to have more careful academic consideration of this before any laws get passed that we can hope could be sufficient.
Jakub (34:44):
So what is the state of the academic community that you're in on these issues? My sense is that it's pretty divided. Some people think that a dangerous LTPA system might be very far away, and others think we would never build such a system, we would never be so foolish. What's it like interacting and talking about these topics in academia right now?
Michael (35:14):
I think it is very divided. I think that there are two socially acceptable things to say in academic communities which head off a lot of thinking into this. The first is: I'm not in the business of speculation. And no matter how straightforward the argument you're presenting is - and I kind of distilled it down to two points: reinforcement learning agents are likely to successfully maximize reward if they're very advanced, and successfully maximizing reward requires eliminating threats - no matter how quickly, no matter how succinctly you summarize it, I think a lot of people can tell themselves, without finding fault in either of those points, that this is not their business, that this is speculation, that this is not the sort of argument they should be bothering to try to evaluate.
Another thing people sometimes say, which is considered socially acceptable, is: I'm more concerned about Y than X, because [insert reasons why Y is concerning] - with the notable absence of any reasons why X is unconcerning. That's just a move I think people feel comfortable making to themselves, and that people feel comfortable receiving when others say it.
And I think those two kinds of statements just suffice to explain the relative lack of vigorous debate on these questions. I think those sorts of responses cover the vast majority of academic responses that are not agreement - which is certainly another academic response.
Jakub (37:12):
Maybe one case study of this is the SB 1047 bill. What is academia saying about that? Or are people mostly ignoring it because it's more in policy land than technical research?
Michael (37:26):
So SB 1047 is just, like, not about loss of control of AI at all, really. It's about harms that are not nearly as catastrophic, but still quite catastrophic. So it's aiming to address situations where an AI causes half a billion dollars in damages or mass casualties. And people opposing the bill certainly make similar moves. They say it's totally speculative to talk about the possibility of an artificial agent causing a major cyber attack or a bioterrorism attack,
Jakub (38:06):
And it's specifically cyber attacks on critical infrastructure. I think there's another option: if it's an AI that's committing basically a certain category of human crime, but doing so autonomously and causing that level of damage. And then also if it's mass casualties from chemical, biological, or nuclear weapons-related harms.
Michael (38:27):
Yeah, so I think that Andreessen Horowitz has drummed up a lot of opposition to the bill. They obviously stand to lose a lot financially if companies they invest in are potentially liable. Although, interestingly, for them to stand to lose a lot, it has to be the case that the companies they're investing in actually stand a chance of causing this harm. If no one causes this harm, no one faces penalties - that's to a first approximation; if the harm is imminent but doesn't actually end up happening, they could still face penalties and things like that. But the fact that there is so much opposition really only makes sense if they acknowledge that these risks are possible. And yet what they say in public is that this is a crazy bill because it aims to address speculative risks that current language models are totally incapable of realizing.
There has been some legitimate, I would say, academic resistance to the bill because of concern for its effect on open source models, which academics like to study and publish papers on. They're a useful tool for getting insights about the way these models work. And I think they're not adequately cognizant of the possibility that some open source models could also cause major harm. I think, in the same way that a lot of virologists love gain-of-function research - I mean, genuinely, there are a lot of questions you couldn't easily answer without doing gain-of-function research - there's a tendency of some scientists to see the benefits of their work in a much more close and real way than the potential costs of the existence of the tools that enable them to do that work. So if there are open source models which don't run a serious risk of causing critical harm, as the bill terms it, then the liability a company would face from releasing them would be minimal.
It's really only the models where the risk, as assessed by the developers of the model, is substantial that this bill would effectively take off the market - as well it should. If there's a substantial risk of critical harm from releasing an open source model, that is the subset of models that should not be open source. And so the bill isn't trying to kill open source; it's just saying there is a possibility for a certain kind of open source model that would be bad to release, and it would give people the incentive not to do that. And so then I think it's just very hard to defend the notion that even those open source models should be made open source so that people can study them. It's a bit like saying... I mean, beyond gain-of-function research, it's a bit like saying we should release pathogens into the wild so that people can study them,
Jakub (41:55):
See how they affect the deer population or something,
Michael (41:59):
Or the human population. I mean, I think some things are better studied in a lab than in the wild. And so if there's a real risk to society from releasing an open source model, while I certainly understand the benefits of studying such a thing, if it's possible to study it in a contained way, that would seem way better.
Jakub (42:25):
Okay. And for listeners, this episode will probably go out after the bill has been either signed into law by California Governor Gavin Newsom or vetoed. We're recording on September 6th, so it would be interesting to see what happens.
Michael (42:41):
We'll see.
Jakub (42:44):
We'll see. Is there anything else you wanted to say or wish I had asked you about?
Michael (42:50):
I think that's it. Thanks so much for having me.
Jakub (42:53):
Great. And if listeners want to follow you or follow your work, do you have any places to link them to?
Michael (43:00):
Yeah, you can go to michael-k-cohen.com. That's with hyphens michael-k-cohen.com, and you can see my work there.
Jakub (43:13):
Great. Well, thank you so much for coming on the show.
Michael (43:16):
Yeah, thank you so much for having me. I appreciate it.
Jakub (43:22):
Thanks for listening to the show. You can check out the Center for AI Policy Podcast Substack for a transcript and relevant links. If you have any feedback for the show, I'd love to hear from you. You can email me at jakub at AI policy dot us. Looking ahead, next episode will feature Nick Whitaker, a fellow at the Manhattan Institute, discussing his playbook for AI policy. I hope to see you there.
#12: Michael K. Cohen on Regulating Advanced Artificial Agents