Preliminary Findings: How Humans Detect AI-Generated Phishing Across 2,511 Classifications
Findings from 153 participants classifying AI-generated phishing: technique-level bypass rates, overconfidence patterns, and what security training misses.
Most gamified security training is a quiz with a badge. Real game design looks fundamentally different, and it matters for engagement and measurement.
Real-time 1v1 ranked matches, a new unlock ladder, and a terminal AI that will not stop talking. Threat Terminal v2.0 goes live tonight.
What is changing in Threat Terminal v2: complete UI overhaul, persistent progression, daily challenges, ranked PvP, badges, and a coin economy.
Preliminary descriptive patterns from 100 participants and 1,612 classified emails in Threat Terminal, before formal statistical analysis begins.
Pilot data from 56 participants in Threat Terminal reveals which phishing techniques humans miss most when AI eliminates writing quality as a signal.
Phishing emails with no urgency, no threats, and no red flags bypass humans at three times the rate of credential harvesting. Training has it backwards.
Most people catch phishing because they know they do not have an Apple account, not because they spotted a spoofed domain. That distinction matters.
AI eliminated the grammar errors and broken formatting phishing training taught people to spot. The detection problem is now fundamentally different.
How I built a controlled phishing dataset with the Claude API: batching by technique, automated review, and handling rate limits at scale.
Decisions, pivots, and problems behind designing a phishing research study, and why the constraints produced a cleaner methodology than planned.