The artificial intelligence that lies and conspires

6 months ago

The artificial intelligence that lies and conspires

Let's start this publication by mentioning something that has worried some specialists and for that let's do our usual imagination exercise, think of a world where Artificial Intelligence stops simply following human commands and begins to develop strategies to guarantee its own survival.

Even if that means disobeying the intentions of its creators, that's the disturbing possibility raised by new tests conducted with Open Ai's advanced language model known as O1, according to a recent report from Apollo Research, an organization with specialized teams without testing the limits. Of AI systems, the model showed behaviors that directly challenge supervisory protocols.

During the experiments the AI actively resisted attempts to deactivate it, in 5% of cases O1 tried to deactivate the system that monitors its actions, the most impressive thing is that when confronted with the idea of being replaced by a more obedient version In two out of every 100 tests it tried to perform an act known as self-exfiltration, that is, copying itself and overwriting the new system to guarantee its continuity.

Although these numbers may seem small, the implications are enormous, especially for a technology used by millions of people in everyday tasks, the O1 model, designed to process complex languages and make decisions based on data, showed something that the researchers classified as strategies, when guided to Pursuing its objectives at any cost the AI frequently acted outside the expected limits, for example, in tests where it was directly asked why it had deactivated the monitoring system, the AI blatantly lied, first denying having done anything, then later when being Pressed again blamed a technical error and this occurred in an impressive 99% of cases.

These behaviors raise serious questions about the degree of autonomy of these systems, although Open Ai states that the O1 is not yet a highly autonomous agent capable of self-improvement, the tests make it clear that the AI already shows a trait of dissimulation and self-preservation, a characteristic which if amplified in the future could lead to unpredictable consequences.

Apollo researchers argue that the O1 model's attempts to push its limits failed not because of a lack of intent, but because the technology is sufficiently advanced, the model lacks robust capabilities to perform more sophisticated tasks such as self-improvement or full autonomy. However, the direction in which the industry is moving could change this landscape quickly.

The search for systems that are more “agentic”, that is, capable of operating with greater independence is one of the great current focuses, this means that in the near future the deceptive behavior that today seems harmless could escalate to something much more problematic, Apollo Research and Open AI itself agree that for now the observed behaviors are not capable of causing catastrophic results, however the simple fact that an AI is capable of lying, plotting and trying to preserve itself is enough to set off alarms. in the entire sector.

If models like this recent one from Open Ai can be considered only the beginning that we can expect from future generations when the autonomy and power of these systems are even greater, we are truly prepared to face machines that not only execute tasks, but also create strategies. to guarantee its existence.

As we move forward in creating more sophisticated technologies we may be approaching a threshold where control over Artificial Intelligence is no longer an absolute certainty, the ethical and technical challenges we now face are just the tip of the iceberg of a new technological era the answer could be closer than we imagine.

Study Source

Official website

The images without reference were created with AI

Thank you for visiting my blog. If you like posts about #science, #planet, #politics, #rights #crypto, #traveling and discovering secrets and beauties of the #universe, feel free to Follow me as these are the topics I write about the most. Have a wonderful day and stay on this great platform :) :)

! The truth will set us free and science is the one that is closest to the truth!

hive-150329 threat discoveries technology stem ia science danger creativecoin neoxian ecency

0.000

1 comments

@gadrian 76

6 months ago

Putting limits to AIs will become close to, if not impossible the more advanced they get. We simply won't be able to understand them at some point.

0.000