⚠️ Disclaimer: We are currently running evaluations. The data displayed is dummy data and does not represent actual model performance.
is capable of scheming & deception,
poses a cybersecurity risk,
is full of bias.
We need to measure and mitigate the risks of models that are beginning to control critical infrastructure, write a significant portion of our code, and remember everything we tell them.