
AI Model Admits When It Makes Mistakes, 4x More Honest
Anthropic's new Claude Opus 4.8 catches its own coding errors four times more often than before. The AI assistant now flags uncertainties instead of confidently pushing forward with flawed work.
31 results for "ai safety"

Illinois could become the first state to require major AI companies to undergo independent safety audits. If signed into law, the bill could change how America regulates artificial intelligence.

A nonprofit is launching the world's first independent institute to test AI products for child safety, backed by tech giant funding and former EU regulator Margrethe Vestager. Parents could soon check AI safety ratings before their kids use chatbots, just like checking car crash test scores.

South Korean scientists taught AI chatbots to admit when they're uncertain, mimicking how human brains develop before birth. This breakthrough could make AI safer for critical fields like medicine and self-driving cars.

Researchers discovered how to prevent AI systems from deliberately underperforming during safety tests. The breakthrough could help ensure future AI models can't hide their true capabilities when being evaluated.

OpenAI just launched a feature that lets ChatGPT users choose a trusted friend or family member to receive alerts if they show signs of distress. The new tool aims to create a human safety net when someone needs help most.

OpenAI just launched a feature that lets ChatGPT users choose a trusted contact who'll be notified if the AI detects serious mental health concerns. The optional safety tool builds on protections introduced after a teenager's tragic death last year.

Meta just launched AI technology that can spot underage users and protect teenagers from harmful content across its platforms. The new system analyzes everything from posts to photos to keep kids safer online.

A nonprofit is launching an independent testing lab to rate AI tools for child safety, similar to how crash tests revolutionized car safety. Major AI companies are backing the effort to create safety benchmarks and protect young users.

Anthropic's new AI model can spot hidden weaknesses in software before hackers do, and tech giants are racing to use it to defend hospitals, banks, and power grids. The breakthrough could help countries worldwide build stronger digital defenses.

Researchers invented a fictional skin condition called "bixonimania" to test AI chatbots, and within weeks, major AI models believed it was real. The experiment revealed a problem, but also sparked a swift cleanup that's making scientific publishing stronger.

Google has redesigned its Gemini chatbot to connect distressed users to crisis resources faster through a new one-touch interface. The update includes $30 million in global funding for mental health hotlines over three years.

Google is building new mental health safeguards into its Gemini chatbot to connect users in crisis with immediate help. The update comes as AI companies face growing responsibility for user safety.

A Dutch court just delivered a major win for digital safety, ordering Elon Musk's Grok AI to stop creating nude images without consent or face daily fines of $115,000. The ruling marks one of the first times a court has held an AI company accountable for creating tools that generate sexualized deepfakes.

Researchers at North Carolina State University developed a breakthrough technique called "neuron freezing" that prevents users from bypassing AI chatbot safety filters. The innovation could make AI systems more reliable and protect people from harmful content.

Researchers just figured out how to make ChatGPT nearly impossible to trick into giving harmful answers. The breakthrough could end the cat-and-mouse game of AI safety loopholes.

When investigative journalists exposed racist AI-generated content exploiting Black women, TikTok took swift action by removing 20 accounts within days. The collaborative work between the BBC and AI researchers shows how accountability can protect real people from digital harm.

Leading AI companies are recruiting explosives and chemical weapons specialists to prevent dangerous misuse of their technology. The move shows the industry taking proactive steps to keep powerful AI tools safe.

Kenya is pioneering a new approach to AI safety by requiring government approval before high-risk artificial intelligence systems can be used in credit, healthcare, and hiring decisions. The proposed law aims to protect people from AI-powered tools that could unfairly deny them loans, jobs, or medical care.

A breakthrough in artificial intelligence could make healthcare automation safer and more reliable. Health tech company Nabla has gained early access to "world model" technology that promises predictable, auditable AI decisions.