Future Tense

The Safety Theatre of Agentic AI

Or: how we learned to stop worrying and deploy the benchmark

A Z Mackay
Apr 14, 2026

In January, researchers at Carnegie Mellon and Fujitsu presented FieldWorkArena at the AAAI conference in Singapore. A benchmark designed to measure whether AI agents are safe enough to field in live industrial settings. Factories. Warehouses. Places where the wrong answer doesn’t embarrass the product manager but puts someone in a sling.

FieldWorkArena …
