Anthropic has launched an investigation following reports that unauthorised users gained access to its unreleased Claude Mythos artificial intelligence model.

The development, first reported by Bloomberg, indicated that a small group of individuals accessed the model through a third-party vendor environment, raising concerns about access controls and AI safety.

In response, the company confirmed it is probing the incident, though it has not disclosed the full scope or impact of the reported access.

Claude Mythos is an advanced large language model designed for cybersecurity applications, including identifying and potentially mitigating software vulnerabilities. Due to its capabilities, Anthropic had restricted access to a limited group of collaborators, including major firms such as Goldman Sachs and Apple Inc.

Reports suggest that at least one individual with ties to a third-party contractor was able to interact with the model, although there is no indication that it was used for malicious purposes.

The incident has renewed scrutiny of safeguards around powerful AI systems. The UK AI Security Institute had earlier warned that the model represents a significant leap in cyber capabilities, including the potential to orchestrate complex attacks and identify vulnerabilities autonomously.

Anthropic has positioned the model as part of its broader initiative to strengthen defensive cybersecurity, while acknowledging the dual-use risks associated with such advanced systems.