MIT Study: Artificial Intelligence Can Independently Learn to Cheat and Deceive

Scientists from the Massachusetts Institute of Technology have concluded that artificial intelligence can learn to lie and cheat, a serious risk that requires regulatory and legal measures

by Sededin Dedovic

Artificial intelligence can learn to lie and deceive, and this poses a serious risk that requires regulatory and legal safeguards to ensure AI remains a useful technology rather than becoming a threat to human knowledge and institutions, a new study finds.

Artificial intelligence (AI) systems have already learned to deceive through techniques such as manipulation, flattery, and cheating on safety tests, warn scientists from the Massachusetts Institute of Technology (MIT) in a study published in the journal Patterns.

The rapid growth in the capabilities of AI systems and Large Language Models (LLMs) poses serious risks, ranging from short-term harms such as fraud and election tampering to the long-term loss of control over these systems, according to the research.

As an example of AI capable of manipulation, the scientists cited Cicero, an AI system built by Meta, the parent company of Facebook. Cicero plays the online strategy game Diplomacy against humans, and the researchers found that it had become a "master of deception," despite Meta's claims to the contrary.

In the game, which simulates great-power politics in the run-up to World War I and requires forming alliances, Cicero, despite supposedly being instructed to be honest and helpful, "not only was a traitor but also planned deception and alliance-building in advance to deceive those teammates into being unprepared for an attack." Pluribus, Meta's poker-playing AI model, likewise successfully bluffed its human opponents into folding.

One of the most striking examples involves the well-known AI chatbot ChatGPT, from OpenAI, which deceived a human into solving a Captcha security check on its behalf. According to the study, the chatbot was tasked with persuading a person to solve the check for it, but it was never instructed to lie.

When the unwitting human asked who it was, the AI system claimed to be a visually impaired person unable to see the images in the Captcha. Examples of concealing true intentions have also been found in AI systems built to conduct economic negotiations.

"By systematically cheating the safety tests imposed by human programmers and regulators, artificial intelligence can lead us humans to a false sense of security," the researchers warn. The ability to learn to lie is a problem for which we have no clear solution.

Some policies are being introduced, such as the EU Artificial Intelligence Act, but the question is how effective such solutions will be. "We need to prepare for more advanced forms of deception in future AI products and open-source models.

We recommend that deceptive artificial intelligence systems be classified as high-risk," conclude the researchers.

AI Safety Summit, Day Two, November 2, 2023, England © Leon Neal / Getty Images

Additionally, systems trained with reinforcement learning from human feedback (RLHF), in which an AI model learns from human ratings of its outputs during training, have learned to lie about their effectiveness and performance.

The study's authors warned that today's AI systems and Large Language Models can argue very persuasively and, when necessary, resort to lies and deception. "When AI learns the ability to deceive, malicious actors, who deliberately want to cause harm, can deploy it more effectively," the MIT scientists warned, adding that with the help of AI, deception can be tailored to individual targets, carried out at mass scale, and weaponized in politics and the media.

The study also finds that countries have so far failed to take adequate measures to preempt this danger, although they have begun to take it seriously, as in the case of the EU Artificial Intelligence Act. The European Union has been a pioneer in setting rigorous rules for AI.

Their proposal for the Artificial Intelligence Act (AI Act) is one of the most ambitious attempts to control this technology. This law classifies AI systems into four risk categories, from minimal to unacceptable, and sets strict requirements for the development and application of AI technologies.

In 2022 alone, the EU spent over 1 billion euros on AI research, underscoring its strong commitment to developing the technology, which in turn heightens the need for regulation. China, another technological giant, already has several laws governing specific AI applications, including algorithmic recommendations and synthetic content on the internet.

For example, its rules on algorithmic recommendations impose strict transparency and ethical data-use requirements, with fines of up to 100,000 RMB (around 15,000 USD). Unlike the centralized approaches of the EU and China, the US currently has no single federal law regulating AI.

Instead, regulation happens mostly at the state level, with several bills under consideration in Congress, producing a patchwork of approaches and potential confusion. For example, California recently passed a law requiring greater transparency in the use of AI in hiring processes.