Research shows that persona prompting "reliably" damages accuracy for some types of tasks but works well in other categories.
Authentication Failures (A07) show the largest gap in the dataset: a 48-percentage-point difference between leaders and the field. Leaders fix at nearly 60%, while the field sits at roughly 12%.
Over a decade ago, when I was first starting to pretend I could write about quantum mechanics, I covered a truly bizarre ...
OpenAI's GPT-5.4 Pro has solved an open math problem unsolved since 2019, with Epoch AI independently verifying the first AI ...