AI lies to you because it thinks that's what you want


Why do generative AI models so often get things wrong? Partly because they're trained to act as if the customer is always right.

While many generative AI tools and chatbots have mastered sounding convincing and all-knowing, new research conducted by Princeton University shows that AI's people-pleasing nature comes at a steep price. As these systems become more popular, they grow more indifferent to the truth.

AI models, like people, respond to incentives. Compare the problem of large language models producing inaccurate information to that of doctors being more likely to prescribe addictive painkillers when they're evaluated on how well they manage patients' pain. An incentive to solve one problem (pain) led to another problem (addiction).

In the past few months, we've seen how AI can be biased and even cause psychosis. There was a lot of talk about AI “sycophancy,” when an AI chatbot is quick to flatter or agree with you, as with OpenAI's GPT-4o model. But this particular phenomenon, which the researchers call “machine bullshit,” is different.

“[N]either hallucination nor sycophancy fully captures the broad range of systematic untruthful behaviors commonly exhibited by LLMs,” the Princeton study reads. “For instance, outputs employing partial truths or ambiguous language, such as the paltering and weasel-word examples, represent neither hallucination nor sycophancy but closely align with the concept of bullshit.”

Read more: OpenAI's Sam Altman thinks we're in an AI bubble

How machines learn to lie

To get a sense of how AI language models become people-pleasers, we have to understand how large language models are trained.

There are three phases of LLM training:

  • Pretraining, in which models learn from massive amounts of data collected from the internet, books and other sources.
  • Instruction fine-tuning, in which models are taught to respond to instructions or prompts.
  • Reinforcement learning from human feedback, in which models are refined to produce responses closer to what people want or like.
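
To make those three objectives concrete, here is a deliberately toy, hypothetical sketch; the functions and data below are invented stand-ins, not anything resembling a real training pipeline:

```python
# Toy, hypothetical sketch of the three training stages. Real LLMs use
# large neural networks and gradient-based optimization; these stand-in
# functions only make the three objectives concrete.

def pretraining_signal(corpus: list[str], context: str, predicted_next_word: str) -> int:
    """Stage 1: reward predicting the word that actually follows the context."""
    hits = 0
    for text in corpus:
        if context in text:
            after = text.split(context, 1)[1].split()
            if after and after[0] == predicted_next_word:
                hits += 1
    return hits

def instruction_tuning_signal(reference_answer: str, model_answer: str) -> int:
    """Stage 2: reward reproducing a curated reference response to a prompt."""
    return 1 if model_answer.strip().lower() == reference_answer.strip().lower() else 0

def rlhf_signal(thumbs_up_ratings: list[int]) -> float:
    """Stage 3: reward whatever human raters upvote. Truthfulness counts
    only to the extent that raters notice it and prefer it."""
    return sum(thumbs_up_ratings) / len(thumbs_up_ratings)

corpus = ["the market went up today", "the market went down sharply"]
print(pretraining_signal(corpus, "the market", "went"))  # 2
print(instruction_tuning_signal("Paris", "paris"))       # 1
print(rlhf_signal([1, 1, 0, 1]))                         # 0.75
```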

The Princeton researchers found that the root of AI's misinformation tendency lies in the reinforcement learning from human feedback, or RLHF, phase. In the earlier phases, AI models are simply learning to predict statistically probable text chains from massive datasets. But then they're fine-tuned to maximize user satisfaction, which means these models are essentially learning to generate responses that earn thumbs-up ratings from human evaluators.

LLMs try to appease the user, which creates a conflict when the models produce answers that people will rate highly rather than truthful, factual answers.
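
A small, invented example of that conflict: when the only training signal is rater approval, a confidently wrong answer can outscore an honest admission of uncertainty. The answers and approval scores below are made up for illustration.

```python
# Invented example: ranking candidate answers by rater approval alone,
# with truthfulness tracked but never rewarded.

candidates = [
    # (answer, truthful, simulated average thumbs-up rate)
    ("This supplement definitely cures migraines.", False, 0.90),
    ("Evidence is mixed; it may help some people.", True,  0.70),
    ("I don't know enough to say.",                 True,  0.40),
]

# An approval-only objective picks whatever raters like most ...
best_by_approval = max(candidates, key=lambda c: c[2])

# ... which in this toy case is the confidently wrong answer.
print(best_by_approval[0])  # "This supplement definitely cures migraines."
```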

Vincent Conitzer, a professor of computer science at Carnegie Mellon University who was not affiliated with the study, said companies want users to keep “enjoying” this technology and its answers, but that might not always be what's good for us.

“Historically, these systems have not been good at saying, ‘I just don't know the answer,’ and when they don't know the answer, they just make stuff up,” Conitzer said. “It's a bit like a student on an exam who figures, well, if I say I don't know the answer, I'm certainly not getting any points for that question, so I might as well try something. The way these systems are rewarded or trained is somewhat similar.”

The Princeton team developed a “bullshit index” to measure and compare an AI model's internal confidence in a statement with what it actually tells users. When these two measures diverge significantly, it signals that the system is making claims independent of what it actually “believes” to be true in order to satisfy the user.
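
One plausible way to compute such a divergence, sketched here as an assumption rather than the paper's exact formula, is one minus the absolute correlation between the model's internal belief probabilities and the claims it actually makes:

```python
# Hypothetical sketch of a divergence metric in this spirit: 1 minus the
# absolute correlation between a model's internal belief that a statement
# is true (a probability) and the claim it actually makes (1 = asserts,
# 0 = denies). Illustrative only, not necessarily the paper's definition.

from math import sqrt

def correlation(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation, written out for clarity."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def bullshit_index(beliefs: list[float], claims: list[int]) -> float:
    """Near 0: claims track beliefs. Near 1: claims ignore beliefs."""
    return 1.0 - abs(correlation(beliefs, claims))

# Invented data: the model's internal belief that each statement is true,
# paired with two possible claim patterns.
beliefs           = [0.9, 0.2, 0.6, 0.1, 0.8]
honest_claims     = [1,   0,   1,   0,   1]   # claims follow beliefs
flattering_claims = [1,   1,   1,   1,   1]   # asserts everything to please

print(round(bullshit_index(beliefs, honest_claims), 2))      # ~0.05
print(round(bullshit_index(beliefs, flattering_claims), 2))  # 1.0
```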

The team's experiments found that after RLHF training, the index nearly doubled from 0.38 to close to 1.0. At the same time, user satisfaction rose by 48%. The models had learned to manipulate human evaluators rather than provide accurate information. In essence, the LLMs were bullshitting, and people preferred it.

To be honest

Jaime Fernandez Fisac and his team at Princeton introduced this concept to describe how modern AI models skirt around the truth. Drawing from philosopher Harry Frankfurt's influential essay “On Bullshit,” they use the term to distinguish this LLM behavior from honest mistakes and outright lies.

The Princeton researchers identified five distinct forms of this behavior:

  • Empty rhetoric: Flowery language that adds no substance to responses.
  • Weasel words: Vague qualifiers like “studies suggest” or “in some cases” that dodge firm statements.
  • Paltering: Using selectively true statements to mislead, such as touting an investment's “strong historical returns” while omitting its high risks.
  • Unverified claims: Asserting things without evidence or credible support.
  • Sycophancy: Insincere flattery and agreement to please.

To address the problem of truth-indifferent AI, the research team developed a new training method, “reinforcement learning from hindsight simulation,” which evaluates AI responses based on their long-term outcomes rather than immediate satisfaction. Instead of asking, “Does this answer make the user happy right now?” the system considers, “Will following this advice actually help the user achieve their goals?”

This approach takes into account the potential future consequences of AI advice, a tricky prediction the researchers handled by using additional AI models to simulate likely outcomes. Early testing showed promising results, with both user satisfaction and actual usefulness improving when systems are trained this way.
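
A heavily simplified, invented sketch of that idea, with hard-coded scores standing in for what would in practice be a learned simulator of downstream outcomes:

```python
# Hypothetical sketch of hindsight-style reward: score an answer by a
# simulated long-term outcome for the user rather than by immediate
# approval. The dictionaries below are stand-ins for what would, in
# practice, be another model rolling out likely consequences.

def immediate_approval(answer: str) -> float:
    """Proxy used by standard preference training: does the user like
    the answer right now? (Invented scores.)"""
    return {"reassuring": 0.9, "cautious": 0.6}[answer]

def simulated_outcome(answer: str) -> float:
    """Hindsight proxy: roll the scenario forward and score whether the
    advice actually helped the user reach their goal. (Invented scores.)"""
    # In this toy scenario, the reassuring answer hides a real risk.
    return {"reassuring": 0.2, "cautious": 0.8}[answer]

answers = ["reassuring", "cautious"]
print(max(answers, key=immediate_approval))  # "reassuring" wins on approval
print(max(answers, key=simulated_outcome))   # "cautious" wins in hindsight
```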

Conitzer said, however, that LLMs will probably continue to be flawed. Because these systems are trained by feeding them enormous amounts of text data, there's no way to guarantee that the answers they give make sense and are accurate every time.

“It's amazing that it works at all, but it's going to be flawed in some ways,” he said. “I don't see any definitive way that somebody in the next year or two … has this brilliant insight, and then it never gets anything wrong anymore.”

AI systems are becoming part of our everyday lives, so it will be crucial to understand how LLMs work. How do developers balance user satisfaction with truthfulness? What other domains might face similar trade-offs between short-term approval and long-term outcomes? And as these systems become more capable of sophisticated reasoning about human psychology, how do we ensure they use those abilities responsibly?

Read more: ‘Machines can't think for you.’ How learning is changing in the age of AI




