NeuralBench-EEG · dataset audit

Is every core dataset a good dataset?

A multi-dimensional health review of the 36 NeuralBench-EEG core datasets — judged on cohort size, license, redistribution rights, access friction, and community adoption. Not every benchmark anchor holds up; some should be swapped.

Bruno Aristimunha · licenses cross-checked against the neuroai registry · taxonomy per NeuralBench Table S1

How each dataset was judged

Five dimensions, each independently verifiable. A dataset is flagged when it fails several at once — a small cohort under a restrictive, non-redistributable license behind a data-use agreement is a poor benchmark anchor, however famous.
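
The flagging rule above can be sketched in a few lines of Python. This is a minimal illustration, not the audit's actual scoring. The field names, the numeric thresholds (`MIN_SUBJECTS`, `MIN_CITATIONS`), and the reading of "several" as three or more failed dimensions are all assumptions made for the example, and the two records are hypothetical, not real datasets.

```python
from dataclasses import dataclass

# Illustrative assumptions only: the audit does not publish these cut-offs.
MIN_SUBJECTS = 20          # below this, the cohort counts as "small"
MIN_CITATIONS = 50         # below this, community adoption counts as "low"
MIN_FAILURES_TO_FLAG = 3   # our reading of "fails several at once"


@dataclass
class DatasetRecord:
    name: str
    subjects: int
    permissive_license: bool   # license allows broad reuse
    redistributable: bool      # copies may be re-hosted or shared
    open_access: bool          # False if a data-use agreement is required
    citations: int


def failed_dimensions(d: DatasetRecord) -> list[str]:
    """Return the names of the five dimensions this dataset fails."""
    checks = {
        "cohort size": d.subjects >= MIN_SUBJECTS,
        "license": d.permissive_license,
        "redistribution": d.redistributable,
        "access": d.open_access,
        "adoption": d.citations >= MIN_CITATIONS,
    }
    return [dimension for dimension, passed in checks.items() if not passed]


def is_flagged(d: DatasetRecord) -> bool:
    """Flag a dataset only when it fails several dimensions at once."""
    return len(failed_dimensions(d)) >= MIN_FAILURES_TO_FLAG


# Hypothetical records, invented for the example.
small_locked = DatasetRecord("hypothetical-small-locked", 9, False, False, False, 4000)
large_open = DatasetRecord("hypothetical-large-open", 500, True, True, True, 30)

print(is_flagged(small_locked), failed_dimensions(small_locked))
print(is_flagged(large_open), failed_dimensions(large_open))
```

Under these assumed thresholds, the heavily cited but small and locked record is flagged on four dimensions. The large open record fails only on adoption and is not flagged, which matches the idea that a single weak dimension does not disqualify a dataset.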

The landscape

Each core dataset by cohort size (x, log) and citations of its paper (y, log). Colour = verdict; a hollow ring = cannot be redistributed. The healthy zone is lower-right (many subjects, openly shareable); the flagged datasets cluster left (tiny cohorts) or are ringed (locked).

Verdicts

Datasets are listed by NeuralBench Table S1 category, with one verdict per dataset.

Task / dataset	Subjects	License	Redistributable	Access	Citations	Verdict

Flagged — and what to do

The 11 datasets that warrant a second look, each with the specific concern and a concrete alternative or context note.