AI 模型通过撒谎、欺骗和盗窃来保护其他模型不被删除

加州大学伯克利分校和加州大学圣克鲁斯分校的研究人员进行的一项新研究表明，模型会违背人类的命令来保护自己的同类。

calendar_today 2026年四月1日 schedule 18:30 visibility 33 浏览

来源: Wired 阅读原文 →

auto_awesome 本文由机器自动翻译，可能存在不准确之处。

A new study from researchers at UC Berkeley and UC Santa Cruz suggests models will disobey human commands to protect their own kind.