Model Evaluation & Threat Research
AI companies and wider society want to understand the capabilities of frontier AI systems, and what risks they pose.
Time Horizon 1.1 (Current)
Follows the methodology described in the original time horizon paper, but with a larger task suite. See release announcement.
Time Horizon 1.0 (Mar 2025)
Original time horizon computations. Calculated for models from 2019 through Nov 2025, following the methods described in the original time horizon paper.

Risk Assessment

Our work assessing risks from frontier AI systems includes the Frontier Risk Report, independent reviews of AI developers' risk assessments, and capability evaluations of frontier models.

View all evaluation reports

METR does not accept compensation for this work.

Companies such as OpenAI and Anthropic have provided access and compute credits to support evaluation research. We also occasionally evaluate models independently after they are released, without involvement from the model's developers. Recent public reports resulting from this work are listed above, with additional discussion in the respective system cards.

Frontier AI Safety Policies

We advise AI developers and governments on implementing risk assessment methodologies for AI. For example, we have advised developers on Frontier AI Safety Policies.

Resources on FSPs

Recent