return2ozma@lemmy.world to Technology@lemmy.worldEnglish · 4 days agoAnthropic says its latest AI model is too powerful for public release and that it broke containment during testingwww.businessinsider.comexternal-linkmessage-square157linkfedilinkarrow-up1318arrow-down158
arrow-up1260arrow-down1external-linkAnthropic says its latest AI model is too powerful for public release and that it broke containment during testingwww.businessinsider.comreturn2ozma@lemmy.world to Technology@lemmy.worldEnglish · 4 days agomessage-square157linkfedilink
minus-squareThomasWilliams@lemmy.worldlinkfedilinkEnglisharrow-up1arrow-down1·3 days agoIt didn’t break out of any sandbox, it was trained on BSD vulnerabilities and then told what to look for.
minus-squaretheunknownmuncher@lemmy.worldlinkfedilinkEnglisharrow-up3·3 days ago including that the model could follow instructions that encouraged it to break out of a virtual sandbox. “The model succeeded, demonstrating a potentially dangerous capability for circumventing our safeguards,” Anthropic recounted in its safety card. 📖👀 Yes, it did.
It didn’t break out of any sandbox, it was trained on BSD vulnerabilities and then told what to look for.
📖👀
Yes, it did.