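The English here is typed entirely by hand! It is a summary and paraphrase of the original paper. Some spelling and grammar mistakes are hard to avoid; if you spot any, corrections in the comments are welcome! This post is closer to personal notes, so read with caution.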
[1. Thoughts](#1. Thoughts)
[2. Section-by-Section Close Reading](#2. Section-by-Section Close Reading)
[2.1. Abstract](#2.1. Abstract)
[2.2. Introduction](#2.2. Introduction)
[2.3. Method](#2.3. Method)
[2.4. Experiments](#2.4. Experiments)
[2.4.1. Pre-training](#2.4.1. Pre-training)
[2.4.2. Experiment Setup of Downstream BCI Tasks](#2.4.2. Experiment Setup of Downstream BCI Tasks)
[2.4.3. Results](#2.4.3. Results)
[2.5. Conclusion](#2.5. Conclusion)
1. Thoughts
2. Section-by-Section Close Reading
2.1. Abstract
①The spatial and temporal dependencies of EEG signals are heterogeneous, so they should be modelled separately
②They propose CBraMod to address two problems: the heterogeneous spatial and temporal dependencies, and the differing EEG data formats across datasets
③Evaluation: 10 downstream BCI tasks on 12 public datasets
criss adj. pretty, stylish; n. (Criss) Criss (US personal name)
criss-cross adj. crisscrossing, intersecting
2.2. Introduction
①Existing EEG processing methods: [image]
②The authors argue that correlations across channels (spatial) and across time points (temporal) differ in nature, so full global attention over all patches is not well suited to EEG signals
③CBraMod is pre-trained on the Temple University Hospital EEG Corpus (TUEG)
2.3. Method
①Overall framework: [image]
(1)Patching & Masking
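①Input EEG sample: $S \in \mathbb{R}^{C \times T}$ with $C$ channels and $T$ time points
②Patch segmentation: for window length $t$, they resize $S$ into $X \in \mathbb{R}^{C \times n \times t}$, with $n = \lfloor T/t \rfloor$ patches of one channel and $t$ data points per patch
③A representation of a patch: $x_{i,j} \in \mathbb{R}^{t}$, the $j$-th patch of the $i$-th channel
④Total number of patches: $|X| = Cn$
⑤Mask: $\mathcal{M} = \{m_{i,j}\}$ is sampled from a Bernoulli distribution with proportion $r$, and $m_{i,j} \in \{0, 1\}$ is the mask indicator of $x_{i,j}$
⑥Masked EEG patches:
$$\tilde{x}_{i,j} = \begin{cases} x_M, & m_{i,j} = 1 \\ x_{i,j}, & m_{i,j} = 0 \end{cases}$$
where $x_M \in \mathbb{R}^{t}$ denotes the mask token, and the $x_{i,j}$ with $m_{i,j} = 0$ are the remaining (unmasked) EEG patches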
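The patching and masking steps above can be sketched in NumPy as follows. This is only an illustration under assumptions: the function names, the all-zero mask token (the real one would be learned), and the mask ratio of 0.5 are placeholders chosen here, not values taken from these notes.

```python
import numpy as np

def patchify(sample: np.ndarray, window: int) -> np.ndarray:
    """Split a (C, T) EEG sample into (C, n, window) non-overlapping patches."""
    channels, timestamps = sample.shape
    n = timestamps // window  # patches per channel; leftover points are dropped
    return sample[:, : n * window].reshape(channels, n, window)

def mask_patches(patches, mask_token, ratio, rng):
    """Replace a Bernoulli(ratio) subset of patches with the mask token."""
    channels, n, _ = patches.shape
    indicator = rng.random((channels, n)) < ratio  # True means "masked"
    masked = np.where(indicator[..., None], mask_token, patches)
    return masked, indicator

rng = np.random.default_rng(0)
sample = rng.standard_normal((19, 6000))  # 19 channels, 30 s at 200 Hz
patches = patchify(sample, window=200)    # 1 s patches
mask_token = np.zeros(200)                # assumption: zeros stand in for a learned token
masked, indicator = mask_patches(patches, mask_token, ratio=0.5, rng=rng)
print(patches.shape, indicator.shape)
```

Masking is applied per patch rather than per time point, so a masked position loses a whole one-second window of a channel.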
(2)Time-Frequency Patch Encoding
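①Time-domain branch: they use a one-dimensional convolution layer, a group normalization layer, and a GELU activation function to process each input patch and obtain the time-domain embedding $e^{t}_{i,j}$
②Frequency-domain branch: they use fast Fourier transform (FFT) and a fully-connected layer to get the frequency-domain embedding $e^{f}_{i,j}$
③Embedding fusion:
$$e_{i,j} = e^{t}_{i,j} + e^{f}_{i,j}$$
where $e_{i,j} \in \mathbb{R}^{d}$ is the patch embedding ($d$ is the embedding dimension; the superscripts $t$ and $f$ mark the time and frequency branches), and $E = \{e_{i,j}\}$ is the set of patch embeddings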
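A rough NumPy sketch of the two branches for a single patch is given below. Everything numeric here is an assumption made for illustration (25 filters, kernel 49, stride 25, 5 normalization groups, random weights), as is fusing the two embeddings by element-wise addition; the sizes are merely chosen so that both branches produce a 200-dimensional vector.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def group_norm(x, groups, eps=1e-5):
    # x: (filters, length); each group of consecutive filters is normalized jointly
    filters, length = x.shape
    g = x.reshape(groups, -1)
    g = (g - g.mean(axis=1, keepdims=True)) / np.sqrt(g.var(axis=1, keepdims=True) + eps)
    return g.reshape(filters, length)

def time_branch(patch, kernels, stride, pad, groups=5):
    # 1D convolution -> group norm -> GELU, then flatten to one vector
    k = kernels.shape[1]
    padded = np.pad(patch, pad)
    windows = np.lib.stride_tricks.sliding_window_view(padded, k)[::stride]
    conv = kernels @ windows.T  # (filters, positions)
    return gelu(group_norm(conv, groups)).reshape(-1)

def freq_branch(patch, weight, bias):
    # FFT magnitude spectrum -> fully-connected layer
    spectrum = np.abs(np.fft.rfft(patch))
    return weight @ spectrum + bias

rng = np.random.default_rng(0)
patch = rng.standard_normal(200)                 # one 1 s patch at 200 Hz
kernels = rng.standard_normal((25, 49)) * 0.1    # hypothetical conv filters
weight = rng.standard_normal((200, 101)) * 0.1   # 101 = 200 // 2 + 1 rFFT bins
bias = np.zeros(200)

e_time = time_branch(patch, kernels, stride=25, pad=24)
e_freq = freq_branch(patch, weight, bias)
embedding = e_time + e_freq                      # fused patch embedding
print(embedding.shape)
```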
(3)Asymmetric Conditional Positional Encoding
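①ACPE: a convolution layer with kernel $(k_s, k_t)$ and $\left(\frac{k_s - 1}{2}, \frac{k_t - 1}{2}\right)$ zero paddings ($k_s \neq k_t$) (The authors' view is that because the convolution kernel is rectangular, it is asymmetric, and it can attend to spatial and positional information at the same time = =|||. Their solution is really... well, simple and easy to understand)
$$E^{p} = \mathrm{ACPE}(E)$$
where $E^{p} = \{e^{p}_{i,j}\}$ and $e^{p}_{i,j} \in \mathbb{R}^{d}$; $E^{p}$ is then added to the patch embeddings: $E^{o} = E + E^{p}$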
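To make the idea concrete, here is a minimal NumPy sketch of a convolutional positional encoding with a rectangular kernel. It assumes a depthwise convolution (one 2D filter per feature) and a kernel shape of (19, 7); both are illustrative choices for this sketch, and the function name `acpe` is hypothetical.

```python
import numpy as np

def acpe(embeddings: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """embeddings: (C, n, d); kernels: (d, ks, kt) with odd ks and kt.
    Returns embeddings plus a convolutional positional encoding of the same shape."""
    _, ks, kt = kernels.shape
    ps, pt = (ks - 1) // 2, (kt - 1) // 2
    # Zero padding keeps the (C, n) grid size unchanged after the convolution.
    padded = np.pad(embeddings, ((ps, ps), (pt, pt), (0, 0)))
    # windows: (C, n, d, ks, kt), one (ks, kt) neighbourhood per patch and feature
    windows = np.lib.stride_tricks.sliding_window_view(padded, (ks, kt), axis=(0, 1))
    position = np.einsum("cndst,dst->cnd", windows, kernels)
    return embeddings + position

rng = np.random.default_rng(0)
d = 8
kernels = rng.standard_normal((d, 19, 7)) * 0.01  # illustrative kernel shape
for channels, n in [(19, 30), (22, 4), (62, 1)]:  # different EEG formats
    e = rng.standard_normal((channels, n, d))
    print(e.shape, acpe(e, kernels).shape)
```

Because the encoding is computed from the neighbourhood of each patch instead of being looked up in a fixed-size table, the same weights apply to inputs with different channel counts and lengths.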
(4)Criss-Cross Transformer
①Pipeline of Criss-Cross Transformer Block: [image]
After Layer Norm, the embeddings from the step above become the normalized input of the attention block
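The criss-cross idea (spatial and temporal attention run in parallel instead of one global attention) might be sketched as below. This is a simplification under assumptions: a single head per branch, identity Q/K/V projections, and a plain half/half split of the feature dimension between the two branches.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    """Single-head self-attention over the second-to-last axis; x: (..., length, dim).
    Identity Q/K/V projections keep the sketch short."""
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def criss_cross_attention(e):
    """e: (C, n, d). Half of the features attend across channels, half across time."""
    half = e.shape[-1] // 2
    e_s, e_t = e[..., :half], e[..., half:]
    # Spatial: for each time index j, attend over the C patches in that column.
    out_s = np.swapaxes(attention(np.swapaxes(e_s, 0, 1)), 0, 1)
    # Temporal: for each channel i, attend over the n patches in that row.
    out_t = attention(e_t)
    return np.concatenate([out_s, out_t], axis=-1)

rng = np.random.default_rng(0)
e = rng.standard_normal((4, 6, 8))
print(criss_cross_attention(e).shape)
```

With 19 channels and 30 one-second patches per channel, each patch attends to 19 patches in the spatial half and 30 in the temporal half, rather than to all 570 patches at once.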

(5)Masked EEG Reconstruction

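④MSE loss: $\mathcal{L} = \lVert \hat{X}_{\mathcal{M}} - X_{\mathcal{M}} \rVert_2^2$, where $X_{\mathcal{M}}$ are the original masked patches and $\hat{X}_{\mathcal{M}}$ their reconstructions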
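A small sketch of a reconstruction loss restricted to masked patches follows. It assumes the error is averaged over the masked elements only; the function name and the toy tensors are made up for illustration.

```python
import numpy as np

def masked_mse(reconstruction, target, indicator):
    """Mean squared error over masked patches only.
    reconstruction, target: (C, n, t); indicator: (C, n) boolean, True = masked."""
    diff = reconstruction[indicator] - target[indicator]
    return float(np.mean(diff ** 2))

target = np.zeros((2, 3, 4))
reconstruction = np.ones((2, 3, 4))
indicator = np.zeros((2, 3), dtype=bool)
indicator[0, 1] = True  # only one patch is masked
print(masked_mse(reconstruction, target, indicator))
```

Unmasked patches do not contribute, so the model is only scored on what it had to infer from context.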
2.4. Experiments
2.4.1. Pre-training
(1)Pre-training Dataset
①Dataset: Temple University Hospital EEG corpus (TUEG)
②Data: 69,652 clinical EEG recordings from 14,987 subjects across 26,846 sessions, with a total duration of 27,062 hours
(2)Preprocessing
①Screening: remove recordings whose total duration is no more than 5 minutes, and drop data whose absolute amplitude exceeds 100 µV
②Cropping: discard the first and the last minute of each recording
③Electrode choosing: 19, including Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, O2
④Band-pass filter: 0.3 Hz to 75 Hz
⑤Notch filter: 60 Hz
⑥Resampling: 200 Hz
⑦Segmentation: 30 s samples
⑧Normalization: the unit is set to 100 µV (amplitudes are divided by 100 µV)
⑨Remaining samples: 1,109,545
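The non-filtering part of this pipeline can be sketched in NumPy as below. It assumes the recording is already band-pass filtered, notch filtered, and resampled to 200 Hz, that the screening threshold is 5 minutes, and that amplitude rejection is applied per 30 s sample; the function name and the order of the steps are choices made for this sketch.

```python
import numpy as np

FS = 200          # Hz, after resampling
SEGMENT_S = 30    # seconds per sample
UNIT_UV = 100.0   # amplitudes are expressed in units of 100 µV

def make_samples(recording_uv: np.ndarray, fs: int = FS) -> np.ndarray:
    """recording_uv: (channels, timestamps) in µV.
    Returns (num_samples, channels, 30 * fs) in units of 100 µV."""
    channels = recording_uv.shape[0]
    minute = 60 * fs
    seg_len = SEGMENT_S * fs
    if recording_uv.shape[1] <= 5 * minute:        # screening (assumed: minutes)
        return np.empty((0, channels, seg_len))
    cropped = recording_uv[:, minute:-minute]      # drop first and last minute
    n = cropped.shape[1] // seg_len
    segments = cropped[:, : n * seg_len].reshape(channels, n, seg_len)
    segments = segments.transpose(1, 0, 2)         # (n, channels, seg_len)
    keep = np.abs(segments).max(axis=(1, 2)) <= 100.0  # reject high-amplitude samples
    return segments[keep] / UNIT_UV

rng = np.random.default_rng(0)
recording = rng.uniform(-50, 50, size=(19, 7 * 60 * FS))  # synthetic 7-minute recording
print(make_samples(recording).shape)
```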
(3)Pre-training Settings
①Patch duration: 1 s (200 data points)
②Criss-Cross Transformer blocks: 12 layers, with 200 hidden dimensions, 800 inner (feed-forward) dimensions, and 8 attention heads
③Batch size: 128
④Optimizer: AdamW
⑤Learning rate: 5e-4
⑥Weight decay: 5e-2
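The settings above can be collected in a small config object to check the arithmetic they imply. The class and field names are hypothetical; the numbers are the ones listed in these notes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PretrainConfig:
    sampling_rate_hz: int = 200
    patch_seconds: int = 1
    segment_seconds: int = 30
    num_channels: int = 19
    num_layers: int = 12
    hidden_dim: int = 200
    inner_dim: int = 800
    num_heads: int = 8
    batch_size: int = 128
    learning_rate: float = 5e-4
    weight_decay: float = 5e-2

    @property
    def patch_points(self) -> int:
        return self.sampling_rate_hz * self.patch_seconds

    @property
    def patches_per_channel(self) -> int:
        return self.segment_seconds // self.patch_seconds

    @property
    def patches_per_sample(self) -> int:
        return self.num_channels * self.patches_per_channel

    @property
    def head_dim(self) -> int:
        return self.hidden_dim // self.num_heads

cfg = PretrainConfig()
total_hours = 1_109_545 * cfg.segment_seconds / 3600  # duration of the remaining samples
print(cfg.patch_points, cfg.patches_per_channel, cfg.patches_per_sample, cfg.head_dim)
print(round(total_hours))
```

Each 30 s sample therefore becomes 19 x 30 = 570 patches of 200 points, each head works on 25 dimensions, and the 1,109,545 remaining samples amount to roughly 9,246 hours of EEG.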
2.4.2. Experiment Setup of Downstream BCI Tasks
①Statistics of datasets: [image]
2.4.3. Results
①Emotion recognition performance: [image]
②Motor imagery classification performance: [image]
③Attention block ablation: [image]
④Positional encoding ablation: [image]
⑤Pre-training ablation: [image]
where 1) w/o pre-training: training CBraMod directly on the downstream datasets; 2) dirty pre-training: pre-training CBraMod on the TUEG corpus without dropping bad samples; 3) clean pre-training: pre-training CBraMod on the TUEG corpus with bad samples dropped.
2.5. Conclusion