

Scientists created a breakthrough method that helps AI models monitor their own truthfulness from the inside.

This "steering" algorithm outperformed competing approaches and could make AI systems more reliable without constant human oversight.

Want to know more?