Microsoft Launches ExCyTIn-Bench: An Open-Source Benchmark for Real-World AI Cybersecurity Evaluation

Microsoft has introduced ExCyTIn-Bench, an open-source benchmarking framework designed to evaluate how effectively AI systems perform real-world cybersecurity investigations. Unlike traditional benchmarks that focus on static knowledge or trivia, ExCyTIn-Bench simulates dynamic, multi-stage cyberattacks within a virtual Security Operations Center (SOC) in Microsoft Azure. Using 57 log tables from Microsoft Sentinel and related services, it mirrors the complexity and noise of genuine security incidents.

The tool helps CISOs and IT leaders assess how well AI models reason, adapt, and explain findings in realistic threat scenarios, providing actionable insights into detection and response capabilities. Microsoft also uses ExCyTIn-Bench internally to enhance its own AI-powered security tools, including Security Copilot, Sentinel, and Defender.

ExCyTIn-Bench’s key innovations include fine-grained, transparent scoring metrics, realistic investigative workflows, and extensibility for custom benchmarks. Early results show GPT-5 achieving a 56.2% average reward, outperforming earlier models and highlighting the importance of advanced reasoning in cyber defense. Open-source models are also closing performance gaps, making sophisticated security automation more accessible.

Through ExCyTIn-Bench, Microsoft aims to accelerate global collaboration, improve trust in AI-driven cybersecurity, and foster innovation in automated threat investigation.
