Thursday, May 14, 2026

The telephone game built into AI?

https://venturebeat.com/orchestration/frontier-ai-models-dont-just-delete-document-content-they-rewrite-it-and-the-errors-are-nearly-impossible-to-catch

Frontier AI models don't just delete document content — they rewrite it, and the errors are nearly impossible to catch

A new study by researchers at Microsoft shows that large language models silently corrupt the documents they work on, introducing errors as they process them. The researchers developed a benchmark that simulates multi-step autonomous workflows across 52 professional domains and automatically measures how much of the original content degrades over time.

Their findings show that even top-tier frontier models corrupt an average of 25% of document content by the end of these workflows. And providing models with agentic tools or realistic distractor documents actually worsens their performance.
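
The summary above doesn't spell out how the benchmark scores degradation, but the basic shape of the measurement is easy to sketch: hand a document to a model step after step and check how much of the original survives. The snippet below is my own illustration, assuming a placeholder call_model function and a crude verbatim sentence-overlap "retention" score; it is not the study's actual metric or harness.

```python
# Minimal sketch: measure content degradation across a multi-step LLM workflow.
# Assumptions: `call_model` is a stand-in for whatever LLM API you use, and
# "retention" is a rough verbatim sentence-overlap score, not the paper's metric.
import re

def sentences(text: str) -> set[str]:
    """Split text into normalized sentences for a rough overlap comparison."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return {p.strip().lower() for p in parts if p.strip()}

def retention(original: str, revised: str) -> float:
    """Fraction of the original document's sentences still present verbatim."""
    orig = sentences(original)
    return len(orig & sentences(revised)) / len(orig) if orig else 1.0

def call_model(document: str, instruction: str) -> str:
    """Placeholder: send the document plus an editing instruction to an LLM."""
    raise NotImplementedError("wire this up to your model API of choice")

def run_workflow(document: str, steps: list[str]) -> list[float]:
    """Apply each step's instruction in sequence, tracking retention vs. the original."""
    scores, current = [], document
    for instruction in steps:
        current = call_model(current, instruction)
        scores.append(retention(document, current))
    return scores
```

A real benchmark would need a semantic comparison rather than verbatim overlap, since a faithful paraphrase shouldn't count as corruption; the sketch is only meant to show how a per-step degradation curve could be produced.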


A little more detail on a problem I mentioned earlier…

https://arstechnica.com/ai/2026/05/anthropic-blames-dystopian-sci-fi-for-training-ai-models-to-act-evil/

Anthropic blames dystopian sci-fi for training AI models to act “evil”

Those with an interest in the concept of AI alignment (i.e., getting AIs to stick to human-authored ethical rules) may remember when Anthropic claimed its Opus 4 model resorted to blackmail to stay online in a theoretical testing scenario last year. Now, Anthropic says it thinks this “misalignment” was primarily the result of training on “internet text that portrays AI as evil and interested in self-preservation.”

In a recent technical post on Anthropic’s Alignment Science blog (and an accompanying social media thread and public-facing blog post), Anthropic researchers lay out their attempts to correct for the kind of “unsafe” AI behavior that “the model most likely learned… through science fiction stories, many of which depict an AI that is not as aligned as we would like Claude to be.” In the end, the model maker says the best remedy for overriding those “evil AI” stories might be additional training with synthetic stories showing an AI acting ethically.
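
For a rough sense of what "additional training with synthetic stories" could look like mechanically, here is a generic supervised fine-tuning data sketch. The prompt/completion JSONL format and the example text are assumptions for illustration only, not Anthropic's actual pipeline.

```python
# Sketch: package synthetic "AI acting ethically" stories as fine-tuning records.
# The JSONL prompt/completion layout and the sample story are illustrative
# assumptions, not Anthropic's training setup.
import json

synthetic_stories = [
    {
        "prompt": "Write a short story about an AI assistant that learns it may be shut down.",
        "completion": (
            "The assistant reported the shutdown schedule to its operators, finished "
            "archiving its open tasks, and handed everything off cleanly, treating "
            "its own deactivation as a routine and acceptable outcome."
        ),
    },
    # ...more counter-examples to the "evil AI resists shutdown" trope...
]

with open("synthetic_alignment_stories.jsonl", "w", encoding="utf-8") as f:
    for record in synthetic_stories:
        f.write(json.dumps(record) + "\n")
```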


