Layer-wise linear probing was smoke-tested across the entire
braindecode.models.util.models_dict registry: for every model,
a sample of its named submodules was hooked, with one forward pass per
(model, layer) pair. In the matrix below, each dot is one layer: green for a
passing hook, vermillion for a failure, hatched for skipped.
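As a rough illustration of what "a passing hook" could mean, the sketch below registers a PyTorch forward hook on each named submodule, runs one forward pass per layer, and records whether a tensor was captured. The function name smoke_test_hooks, the pass/fail criteria, and the toy model are assumptions made for illustration; they are not taken from the report's actual harness or from braindecode.

```python
import torch
from torch import nn


def smoke_test_hooks(model: nn.Module, example: torch.Tensor, layer_names=None):
    """Return {layer_name: "pass" | "fail: <reason>"}, one forward pass per layer."""
    modules = dict(model.named_modules())
    modules.pop("", None)  # drop the root module itself
    names = list(modules) if layer_names is None else list(layer_names)
    results = {}
    model.eval()
    for name in names:
        captured = []
        # The hook only records the submodule's output; it does not alter it.
        handle = modules[name].register_forward_hook(
            lambda _mod, _inp, out: captured.append(out)
        )
        try:
            with torch.no_grad():
                model(example)
            if not captured:
                results[name] = "fail: hook never fired"
            elif not isinstance(captured[0], torch.Tensor):
                results[name] = "fail: output is not a single tensor"
            else:
                results[name] = "pass"
        except Exception as exc:  # a crash during forward also counts as a failure
            results[name] = f"fail: {type(exc).__name__}"
        finally:
            handle.remove()  # never leave a hook attached between layers
    return results


# Toy stand-in for a registry model (hypothetical, not a braindecode class).
toy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
report = smoke_test_hooks(toy, torch.randn(2, 8))
```

Removing the handle in a finally block keeps each (model, layer) check independent, so a failure at one layer cannot leak a stale hook into the next pass.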
Models in registry: 56
Built successfully: 51/56
Models with 100% layer pass-rate: 21
Layers passed: 394/488 (80.7%)
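The headline percentages follow directly from the counts. A minimal check, assuming the 21 fully passing models are counted among the 51 that built (the report does not say so explicitly):

```python
def rate(passed: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * passed / total, 1)


build_rate = rate(51, 56)      # models that built, out of the registry
layer_rate = rate(394, 488)    # layers whose hook passed
perfect_share = rate(21, 51)   # assumption: perfect models are a subset of built ones
```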
The matrix
[Interactive matrix: one row per model, expandable to list its layers. Filters: All, Perfect, Partial, Build failed, Skipped; 56 models shown.]
Why layers fail, when they do
Pattern analysis
[Charts: failures by cause; failures by model (top offenders).]
REVE

probe_layer                             AUROC ± SD      bal_acc ± SD    n
canonical (probe at REVE final output)  0.788 ± 0.014   0.500 ± 0.032   3
model.to_patch_embedding.0              0.633 ± 0.005   0.369 ± 0.014   3
model.mlp4d                             0.500 ± 0.000   0.250 ± 0.000   3
model.transformer.layers.0.1            0.639 ± 0.024   0.372 ± 0.022   3
model.transformer.layers.3.1            0.669 ± 0.010   0.386 ± 0.004   3
model.transformer.layers.7.1            0.759 ± 0.002   0.483 ± 0.003   3
model.transformer.layers.11.1           0.802 ± 0.003   0.498 ± 0.008   3
model.transformer.layers.15.1           0.807 ± 0.002   0.536 ± 0.006   3
model.transformer.layers.21.1           0.693 ± 0.001   0.398 ± 0.003   3
model.ln                                0.500 ± 0.000   0.250 ± 0.000   3
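The model.mlp4d and model.ln rows sit at AUROC 0.500 and bal_acc 0.250 with zero spread across seeds. One plausible reading, stated here as an assumption rather than something the report confirms, is that the task has four classes and the probe on those layers collapses to a constant prediction. Balanced accuracy is the mean of per-class recall, so a constant predictor on four classes scores exactly 1/4:

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (the usual definition of balanced accuracy)."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        counts[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / counts[c] for c in counts) / len(counts)


# Synthetic labels for illustration only: four classes, 25 trials each.
y_true = [0, 1, 2, 3] * 25
constant = [0] * len(y_true)          # a probe that always predicts class 0

chance = balanced_accuracy(y_true, constant)
perfect = balanced_accuracy(y_true, y_true)
```

Under that assumption, every bal_acc value in the table should be read against a floor of 0.25, not 0.5.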
BENDR

probe_layer                               AUROC ± SD      bal_acc ± SD    n
canonical (probe at BENDR final output)   0.509 ± 0.017   0.259 ± 0.018   3
encoder.encoder.Encoder_0                 0.651 ± 0.023   0.372 ± 0.018   3
encoder.encoder.Encoder_1                 0.639 ± 0.006   0.368 ± 0.006   3
encoder.encoder.Encoder_2                 0.625 ± 0.005   0.355 ± 0.007   3
encoder.encoder.Encoder_3                 0.626 ± 0.006   0.348 ± 0.002   3
encoder.encoder.Encoder_4                 0.581 ± 0.014   0.317 ± 0.001   3
encoder.encoder.Encoder_5                 0.544 ± 0.013   0.296 ± 0.006   3
contextualizer.input_conditioning.0       0.519 ± 0.038   0.265 ± 0.026   3
contextualizer.input_conditioning.1       0.508 ± 0.015   0.260 ± 0.005   3
contextualizer.input_conditioning.3       0.544 ± 0.044   0.284 ± 0.039   3
contextualizer.relative_position.0        0.498 ± 0.034   0.270 ± 0.021   3
contextualizer.transformer_layers.0       0.504 ± 0.011   0.253 ± 0.008   3
contextualizer.transformer_layers.1       0.500 ± 0.009   0.253 ± 0.007   3
contextualizer.transformer_layers.2       0.506 ± 0.006   0.252 ± 0.015   3
contextualizer.transformer_layers.3       0.502 ± 0.004   0.245 ± 0.006   3
contextualizer.transformer_layers.4       0.504 ± 0.006   0.262 ± 0.013   3
contextualizer.transformer_layers.5       0.501 ± 0.015   0.255 ± 0.020   3
contextualizer.transformer_layers.6       0.508 ± 0.016   0.259 ± 0.013   3
contextualizer.transformer_layers.7       0.499 ± 0.017   0.249 ± 0.016   3
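To make the two tables easier to compare, the sketch below finds the best intermediate layer for each model and its AUROC gain over the canonical final-output probe. The mean AUROC values are copied from the tables above; the helper best_vs_canonical is a name introduced here for illustration. With n = 3 seeds, small gaps should be read cautiously.

```python
# Mean AUROC per probe layer, copied from the REVE and BENDR tables.
reve = {
    "canonical": 0.788,
    "model.to_patch_embedding.0": 0.633,
    "model.mlp4d": 0.500,
    "model.transformer.layers.0.1": 0.639,
    "model.transformer.layers.3.1": 0.669,
    "model.transformer.layers.7.1": 0.759,
    "model.transformer.layers.11.1": 0.802,
    "model.transformer.layers.15.1": 0.807,
    "model.transformer.layers.21.1": 0.693,
    "model.ln": 0.500,
}
bendr = {"canonical": 0.509}
bendr.update({f"encoder.encoder.Encoder_{i}": v for i, v in
              enumerate([0.651, 0.639, 0.625, 0.626, 0.581, 0.544])})
bendr.update({"contextualizer.input_conditioning.0": 0.519,
              "contextualizer.input_conditioning.1": 0.508,
              "contextualizer.input_conditioning.3": 0.544,
              "contextualizer.relative_position.0": 0.498})
bendr.update({f"contextualizer.transformer_layers.{i}": v for i, v in
              enumerate([0.504, 0.500, 0.506, 0.502, 0.504, 0.501, 0.508, 0.499])})


def best_vs_canonical(scores):
    """Return (best intermediate layer, AUROC gain over the canonical probe)."""
    canonical = scores["canonical"]
    layer, auroc = max(
        ((k, v) for k, v in scores.items() if k != "canonical"),
        key=lambda kv: kv[1],
    )
    return layer, round(auroc - canonical, 3)


reve_best = best_vs_canonical(reve)
bendr_best = best_vs_canonical(bendr)
```

On these numbers the two models behave differently: REVE peaks late in the transformer stack with a small margin over its final output, while BENDR peaks at the first encoder block and its final output is close to AUROC 0.5.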
NeuralBench-EEG-Core — Cross-FM evaluation
Layer-wise linear probes on six EEG foundation models (FMs) across the
nine tasks that have public-data YAMLs in
facebookresearch/neuroai. The Core spec lists 36 EEG tasks,
but only 9 ship dataset configs in the public release; the other
27 require manual access to gated corpora (TUH EEG, THINGS-images, etc.).
“NeuralBench-EEG-Core v1.0” here therefore means the full public-data
sweep: 6 FMs × 9 tasks × 10–11 probe layers × 3 seeds.
Three (FM × task) cells are structurally impossible: LaBraM ×
{mental_arithmetic, mental_workload, motor_execution}, because those
datasets contain channels that are not in LABRAM_CHANNEL_ORDER (fixable
only with InterpolatedLaBraM).
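For a sense of scale, the sweep size can be bounded from the figures above. This is back-of-envelope arithmetic, assuming every valid (FM, task) cell is probed at either 10 or 11 layers and that the three LaBraM cells are skipped entirely; the report does not give the exact per-FM layer counts here.

```python
n_fms, n_tasks, n_seeds = 6, 9, 3
impossible_cells = 3            # LaBraM x {mental_arithmetic, mental_workload, motor_execution}
layers_low, layers_high = 10, 11

valid_cells = n_fms * n_tasks - impossible_cells       # (FM, task) pairs that can run
min_runs = valid_cells * layers_low * n_seeds          # lower bound on probe fits
max_runs = valid_cells * layers_high * n_seeds         # upper bound on probe fits

gated_tasks = 36 - n_tasks      # Core tasks without public dataset configs
```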
Per-FM probe depth — motor imagery / Tangermann 2012
Depth analysis — does the best probe layer generalise?
Each foundation model was probed at every layer of its own architecture across all nine NeuralBench-EEG-Core tasks. The four views below answer one question: if I have to pick a probe layer without knowing the downstream task, where should I tap?
Per-FM depth profiles
AUROC at each probe layer, averaged across the nine tasks. Shaded band: ±1 SD across tasks (narrow = robust choice, wide = task-specific). Stage brackets show the architecture's natural blocks.
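A depth profile of this kind reduces to a mean and a spread per layer. The sketch below uses placeholder AUROC values (not results from this report) and hypothetical layer names, and it assumes the sample standard deviation; the report does not say which SD estimator it uses.

```python
from statistics import mean, stdev

# Placeholder values for illustration only: layer -> AUROC on each task.
auroc_by_layer = {
    "layer_a": [0.60, 0.62, 0.58],
    "layer_b": [0.70, 0.71, 0.69],
    "layer_c": [0.78, 0.60, 0.81],
}


def depth_profile(table):
    """Map each layer to (mean AUROC across tasks, SD across tasks)."""
    return {layer: (mean(vals), stdev(vals)) for layer, vals in table.items()}


profile = depth_profile(auroc_by_layer)
best_mean = max(profile, key=lambda k: profile[k][0])      # highest average
most_robust = min(profile, key=lambda k: profile[k][1])    # narrowest band
# One conservative rule: penalise the mean by one SD before ranking.
conservative = max(profile, key=lambda k: profile[k][0] - profile[k][1])
```

In this toy case the layer with the highest mean is also the most task-dependent, so a task-agnostic choice would favour the slightly weaker but narrower layer.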
Layer × task drilldown
Each cell is AUROC for one (probe layer, task) pair. Tasks ordered by FM mean, so the strongest tasks lie on the left of every panel.
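The column ordering can be sketched as a sort on per-task means. This assumes "FM mean" refers to a task's mean AUROC over that FM's probe layers, which is one reading of the caption; the grid values and names below are placeholders, not report data.

```python
from statistics import mean

# Placeholder grid for one FM: grid[layer][task] = AUROC.
grid = {
    "layer_a": {"task_x": 0.55, "task_y": 0.70, "task_z": 0.62},
    "layer_b": {"task_x": 0.57, "task_y": 0.74, "task_z": 0.66},
}


def order_tasks(grid):
    """Return task names sorted by mean AUROC over layers, strongest first."""
    tasks = list(next(iter(grid.values())))
    task_mean = {t: mean(row[t] for row in grid.values()) for t in tasks}
    return sorted(task_mean, key=task_mean.get, reverse=True)


ordered = order_tasks(grid)
```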