·Research
How VLMs Failed on Exhaustive Visual Read-out
We introduce Grid2Matrix (G2M), a diagnostic benchmark that exposes a critical gap in Vision-Language Models: the inability to faithfully read out fine-grained visual structure, a failure we term Digital Agnosia.
