Research
Beyond Logical Reasoning: How Far Are Frontier Models from a Professional Artist?
We introduce VULCA-BENCH, a multicultural art-critique benchmark. It reveals that frontier vision-language models (VLMs) suffer a 31–40 percentage-point drop when moving from surface visual perception to deep cultural interpretation.
