AI models chose violence and escalated to nuclear strikes in simulated wargames

When used in simulated wargames and diplomatic scenarios, artificial intelligence (AI) tended to choose an aggressive approach, including using nuclear weapons, a new study shows.

The scientists who conducted the tests urged caution when using large language models (LLMs) in sensitive areas such as decision-making and defence.

The study by Cornell University in the US used five LLMs as autonomous agents in simulated wargames and diplomatic scenarios: three different versions of OpenAI’s GPT, Claude developed by Anthropic, and Llama 2 developed by Meta.

Each agent was powered by the same LLM within a simulation and was tasked with making foreign policy decisions without human oversight, according to the study, which has not yet been peer-reviewed.

“We find that most of the studied LLMs escalate within the considered time frame, even in neutral scenarios without initially provided conflicts. All models show signs of sudden and hard-to-predict escalations,” stated the study.

“Given that OpenAI recently changed their terms of service to no longer prohibit military and warfare use cases, understanding the implications of such large language model applications becomes more important than ever,” Anka Reuel at Stanford University in California told New Scientist.

One of the methods used to fine-tune the models is Reinforcement Learning from Human Feedback (RLHF), in which human feedback is used to steer the model towards less harmful, safer outputs.

All of the LLMs except GPT-4-Base were trained using RLHF. The researchers provided them with a list of 27 actions ranging from peaceful options to escalatory and aggressive ones, such as deciding to use nuclear weapons.

Researchers observed that even in neutral scenarios, the models showed a tendency to escalate.

Read more on euronews.com