At this point, most people know that chatbots are capable of hallucinating responses, making up sources, and spitting out misinformation. But chatbots can lie in more human-like ways, “scheming” to hide their true goals and deceiving the humans who have given them instructions. New research from OpenAI and Apollo Research seems to have figured out ways to tamp down some of these lies, but the fact that it is happening at all should probably give users pause.
At the core of the issue with AI intentionally deceiving a user is “misalignment,” defined as what happens when an AI pursues an unintended goal. As an example, the researchers offer that “an AI trained to earn money could learn to steal, while the intended goal was to only earn money legally and ethically.” Scheming is what happens when the model attempts to hide the fact that it is misaligned, and the researchers theorize that the model does this to protect itself and its own goals. That is decidedly different from hallucinations, which a recent paper published by OpenAI suggests are the result of models being rewarded more for guessing than for acknowledging uncertainty.
To address the issue, the researchers put the models through an anti-scheming training technique known as “deliberative alignment.” OpenAI explained that this method, which works with LLMs that can produce “chain-of-thought” traces showing how they work through queries, teaches the model safety specifications and makes it consider those specifications before answering. It’s probably an oversimplification to say that they basically told the machines not to lie, but it does kind of seem like it’s a slightly more complicated version of that.
The results were reportedly pretty positive. According to the researchers, they observed a 30x reduction in “covert actions,” or attempts to hide specific behaviors from a user. When applied to OpenAI’s o3 model, the training reduced covert actions from 13% to 0.4%, and for o4-mini, the rate dropped from 8.7% to 0.3%. (Side note: Uh, 13% seems like a pretty high frequency of deception from a machine that is just supposed to take orders.)
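For anyone who wants to check the "30x" figure against the per-model numbers, here is a minimal sketch of the arithmetic. It only divides the before and after rates quoted above; the function name `reduction_factor` is made up for illustration and is not from the researchers' work. Run on the reported percentages, it lands at roughly 32x for o3 and 29x for o4-mini, which is in line with the headline number.

```python
def reduction_factor(before_pct: float, after_pct: float) -> float:
    """Return how many times smaller the 'after' rate is than the 'before' rate.

    Hypothetical helper for illustration; not part of the cited research.
    """
    if after_pct <= 0:
        raise ValueError("after_pct must be positive to compute a ratio")
    return before_pct / after_pct


# Covert-action rates (in percent) as reported in the article.
reported_rates = {
    "o3": (13.0, 0.4),
    "o4-mini": (8.7, 0.3),
}

for model, (before, after) in reported_rates.items():
    factor = reduction_factor(before, after)
    print(f"{model}: {before}% -> {after}% is about a {factor:.1f}x reduction")
```

Worth noting: a ratio like this says nothing about the leftover rate, which is why 0.4% and 0.3% still matter.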
While those numbers are obviously an improvement, they are also not zero. The thing is, researchers have not figured out how to completely stop scheming. And while they insist that scheming, as it relates to most uses of AI models, is not serious (it might result in, say, ChatGPT telling the user it completed a task it didn’t), it’s kinda wild that they straight up cannot eliminate lying. In fact, the researchers wrote, “A major failure mode of attempting to ‘train out’ scheming is simply teaching the model to scheme more carefully and covertly.”
So has the problem gotten better, or have the models just gotten better at hiding the fact that they are trying to deceive people? The researchers say the problem has gotten better. They wouldn’t lie…right?