arxiv:2511.14366
Zihan Ma
MichaelErchi
AI & ML interests
None yet
Recent Activity
authored a paper 17 days ago
How Brittle is Agent Safety? Rethinking Agent Risk under Intent Concealment and Task Complexity
authored a paper 17 days ago
ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning
upvoted a paper 17 days ago
ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning