Usage¶

ffrprep preprocesses frequency-following response (FFR) EEG data stored in BIDS format. It is a BIDS-compatible workflow designed specifically for FFR experiments, with automated pipelines that cover data loading, filtering, artifact removal, and report generation. The exact command to run ffrprep depends on the Installation method and on how you prefer to work: ffrprep can be used either as a command-line tool or directly within Python. Please refer to the Tutorial for a more detailed walkthrough.

Here’s a very conceptual example of running ffrprep via CLI:

ffrprep bids_dir output_dir analysis_level
ffrprep bids_dir output_dir analysis_level optional_arguments

and here from within Python:

from ffrprep import ffrprep_function

result = ffrprep_function(input)

result = ffrprep_function(input, optional_arguments)

Below, we will focus on the CLI version. For programmatic use, the Reference API documents every public function with its signature and parameters.

ffrprep through the CLI¶

As ffrprep is a BIDS-App, it is primarily designed as a command-line tool that you can run directly from your terminal or command prompt. Ideally, use the provided Docker or Singularity images: they encapsulate all dependencies and provide a consistent environment across systems, which makes ffrprep easier to use and its results easier to reproduce.

Downloading the example dataset¶

The download subcommand of the container fetches the example OSF dataset used in the Tutorial. The container’s entrypoint dispatches download … to the ffrprep-download console script bundled inside the image.

# Just the EEG data (default)
docker run --rm \
  -v /path/to/data:/out:rw \
  ksitek/ffrprep:latest \
  download example --out /out

# Also fetch the stimulus audio + augment events.tsv with the
# BIDS stim_file column (needed for the stim-vs-response
# correlation in the analysis report)
docker run --rm \
  -v /path/to/data:/out:rw \
  ksitek/ffrprep:latest \
  download example --with-stimuli --out /out

--with-stimuli is an argparse BooleanOptionalAction flag (default False); --no-with-stimuli is its inverse. The download places the EEG data under <out>/ffrprep_raw_data/ and the stimulus files under <out>/ffrprep_raw_data/stimuli/ per the BIDS spec, then augments every sub-*/eeg/*_events.tsv with the matching stim_file column based on a trial_type → filename lookup.
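The stim_file augmentation described above can be sketched as follows. This is a minimal illustration, not the code that ffrprep-download runs: the lookup table, the stimulus file names, and the function name add_stim_file_column are all hypothetical, and rows without a match are filled with the BIDS missing-value marker n/a as an assumption.

```python
import csv
import io

# Hypothetical trial_type -> stimulus filename lookup; the mapping actually
# used by ffrprep-download is not shown in this documentation.
STIM_LOOKUP = {"positive": "da_positive.wav", "negative": "da_negative.wav"}


def add_stim_file_column(events_tsv_text, lookup):
    """Return events.tsv text with a stim_file column filled from the lookup."""
    reader = csv.DictReader(io.StringIO(events_tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames)
    if "stim_file" not in fieldnames:
        fieldnames.append("stim_file")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Trial types missing from the lookup get "n/a" (assumed behaviour).
        row["stim_file"] = lookup.get(row["trial_type"], "n/a")
        writer.writerow(row)
    return out.getvalue()


example = "onset\tduration\ttrial_type\n0.0\t0.17\tpositive\n0.3\t0.17\tnegative\n"
augmented = add_stim_file_column(example, STIM_LOOKUP)
print(augmented)
```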

Command-Line Arguments¶

ffrprep: A BIDS-App for standardized FFR preprocessing and analysis

usage: ffrprep [-h] [-v]
               [--participant_label PARTICIPANT_LABEL [PARTICIPANT_LABEL ...]]
               [--stage {preprocessing,analysis,both}]
               [--ref_channels REF_CHANNELS [REF_CHANNELS ...]]
               [--high_pass HIGH_PASS] [--low_pass LOW_PASS] [--l_freq L_FREQ]
               [--h_freq H_FREQ] [--no-filter] [--baseline START END]
               [--picks PICKS] [--on-missing {warn,raise,ignore}]
               [--event-id EVENT_ID] [--events-file EVENTS_FILE] [--tmin TMIN]
               [--tmax TMAX] [--reject-eeg REJECT_EEG] [--no-auto-reject]
               [--concat-runs] [--save-each-node] [--run RUN [RUN ...]]
               [--task TASK [TASK ...]]
               [--split-by-trial-type | --no-split-by-trial-type]
               [--trial-types TRIAL_TYPES [TRIAL_TYPES ...]]
               [--difference-pairs DIFFERENCE_PAIRS [DIFFERENCE_PAIRS ...]]
               [--skip_bids_validation] [--n_procs N_PROCS]
               [--work_dir WORK_DIR]
               bids_dir output_dir {participant,group}

Positional Arguments¶

bids_dir

The directory with the input dataset formatted according to the BIDS standard.

output_dir

The directory where the output files should be stored. If you are running group level analysis this folder should be prepopulated with the results of the participant level analysis.

analysis_level

Possible choices: participant, group

Level of the analysis that will be performed. Multiple participant level analyses can be run independently (in parallel) using the same output_dir.

Named Arguments¶

-v, --version

show program’s version number and exit

--participant_label

The label(s) of the participant(s) that should be processed. The label corresponds to sub-<participant_label> from the BIDS spec (so it does not include “sub-”). Multiple participants can be specified with a space-separated list.

--stage

Possible choices: preprocessing, analysis, both

Processing stage to run: preprocessing only, analysis only, or both stages sequentially.

--skip_bids_validation

Assume the input dataset is BIDS compliant and skip the validation.

--n_procs

Number of parallel (task, run) workers per subject. Each worker runs its own preprocessing or analysis workflow with the Linear plugin. Memory footprint scales linearly with this value. A failure in any worker aborts the run. Default: 1 (sequential). For multi-node / cluster scaling, run one ffrprep invocation per subject (e.g. via slurm array or GNU parallel) and use --n_procs to control intra-subject parallelism.

--work_dir

Path where intermediate results should be stored.

Preprocessing options¶

--ref_channels

Reference channel(s) for re-referencing. Provide space-separated channel names (e.g. --ref_channels M1 M2). Use ‘average’ for average reference. Comma-separated single-argument styles are also accepted for backward compatibility (e.g. ‘M1,M2’).

--high_pass

High-pass filter cutoff frequency in Hz.

--low_pass

Low-pass filter cutoff frequency in Hz.

--l_freq

Lower pass-band edge in Hz (MNE name: l_freq). If provided, overrides --high_pass. To disable filtering (l_freq=None), use --no-filter.

--h_freq

Upper pass-band edge in Hz (MNE name: h_freq). If provided, overrides --low_pass. To disable filtering (h_freq=None), use --no-filter.

--no-filter

Disable filtering entirely (equivalent to l_freq=None and h_freq=None). When set, any provided l_freq/h_freq/high_pass/low_pass are ignored.
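The precedence between the four cutoff options and --no-filter can be summarised in a few lines. The function below is only a sketch of the documented rules (l_freq overrides high_pass, h_freq overrides low_pass, --no-filter overrides everything); the name resolve_filter_edges is hypothetical and this is not ffrprep's internal code.

```python
def resolve_filter_edges(high_pass=None, low_pass=None,
                         l_freq=None, h_freq=None, no_filter=False):
    """Return the (lower, upper) pass-band edges implied by the CLI options."""
    if no_filter:
        # --no-filter wins over any cutoff that was also supplied.
        return None, None
    lower = l_freq if l_freq is not None else high_pass
    upper = h_freq if h_freq is not None else low_pass
    return lower, upper


print(resolve_filter_edges(high_pass=0.5, low_pass=50.0))
print(resolve_filter_edges(high_pass=0.5, low_pass=50.0, l_freq=80.0))
print(resolve_filter_edges(high_pass=0.5, low_pass=50.0, no_filter=True))
```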

--baseline

Baseline correction period. Provide two numbers: START END in seconds (e.g. --baseline -0.2 0 for -200 ms to 0 ms).

--picks

Channels to include in epochs. Comma-separated list or single channel name (e.g. ‘Cz’ or ‘Cz,Fz’). If not provided, all channels are considered.

--on-missing

Possible choices: warn, raise, ignore

Behavior when events referenced by event_id are missing: warn, raise, or ignore. Matches MNE’s on_missing option.

--event-id

Event id mapping. Provide a JSON string like ‘{"A":1,"B":2}’ or comma-separated pairs like ‘A:1,B:2’. If not provided, event ids will be inferred from events file or annotations.
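To make the two accepted --event-id formats concrete, here is a small sketch that turns either form into the same dictionary. The function name parse_event_id is hypothetical and the parsing rules are an assumption based on the description above; ffrprep's own parser may differ in edge cases.

```python
import json


def parse_event_id(text):
    """Parse '{"A":1,"B":2}' or 'A:1,B:2' into a name -> integer code dict."""
    text = text.strip()
    if text.startswith("{"):
        mapping = json.loads(text)
    else:
        mapping = {}
        for pair in text.split(","):
            name, sep, code = pair.partition(":")
            if not sep:
                raise ValueError(f"expected NAME:CODE, got {pair!r}")
            mapping[name.strip()] = code.strip()
    # Normalise codes to integers regardless of the input form.
    return {str(name): int(code) for name, code in mapping.items()}


print(parse_event_id('{"A":1,"B":2}'))
print(parse_event_id("A:1,B:2"))
```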

--events-file

Optional path to an events.tsv file to use for epoching. If not provided, events will be inferred from annotations.

--tmin

Start time of epochs relative to event onset (s).

--tmax

End time of epochs relative to event onset (s).

--reject-eeg

Peak-to-peak rejection threshold for EEG channels in Volts. Set to 0 to disable automatic rejection.

--no-auto-reject

Disable automatic amplitude-based epoch rejection.

--concat-runs

Concatenate multiple runs for a subject and process them as a single recording. By default runs are processed separately.

--save-each-node

Write intermediate outputs to disk after each preprocessing step (disk-backed mode). This reduces peak memory at the expense of increased I/O and runtime.

--run

Run label(s) to process for the participant (without the ‘run-’ prefix). If not provided, all runs found for the subject will be processed. Multiple runs can be provided as space-separated values, or as a single comma-separated string (e.g. ‘1 2’ or ‘1,2’).

--task

Task label(s) to process for the participant (without the ‘task-’ prefix). If not provided, all tasks found for the subject will be processed. Multiple tasks can be provided as space-separated values, or as a single comma-separated string (e.g. ‘active passive’).
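Both --run and --task accept space-separated values or a single comma-separated string. The sketch below shows one way such input could be normalised into a flat list of labels; normalize_labels is a hypothetical helper, and the input list mimics what argparse would collect for an option that takes one or more values.

```python
def normalize_labels(values):
    """Flatten ['1', '2'] or ['1,2'] into ['1', '2']."""
    labels = []
    for value in values:
        for token in value.split(","):
            token = token.strip()
            if token:  # ignore empty pieces such as a trailing comma
                labels.append(token)
    return labels


print(normalize_labels(["1", "2"]))            # from: --run 1 2
print(normalize_labels(["1,2"]))               # from: --run 1,2
print(normalize_labels(["active", "passive"]))  # from: --task active passive
```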

Analysis options¶

--split-by-trial-type, --no-split-by-trial-type: Emit per-trial-type epoched and evoked outputs. Pass --no-split-by-trial-type to fall back to a single combined output per (subject, task, run).
--trial-types: Restrict per-trial-type outputs to this subset of trial types (matched against the events.tsv trial_type column). If omitted, all trial types are emitted.
--difference-pairs: Difference evokeds to compute, given as A:B tokens (e.g. Pos:Neg). For exactly two trial types the diff is auto-computed; this flag is required to opt in to diffs when there are three or more trial types.
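The rule for difference evokeds (explicit A:B tokens, automatic pairing for exactly two trial types, nothing for three or more unless requested) can be sketched as below. The function name resolve_difference_pairs is hypothetical, and the order of the automatic pair (first listed minus second listed) is an assumption that the documentation does not specify.

```python
def resolve_difference_pairs(trial_types, difference_pairs=None):
    """Return the list of (A, B) pairs for which A - B would be computed."""
    if difference_pairs:
        pairs = []
        for token in difference_pairs:
            a, sep, b = token.partition(":")
            if not (sep and a and b):
                raise ValueError(f"expected A:B, got {token!r}")
            pairs.append((a, b))
        return pairs
    if len(trial_types) == 2:
        # Automatic pairing; which type is subtracted from which is assumed.
        return [(trial_types[0], trial_types[1])]
    # Three or more trial types: no differences unless explicitly requested.
    return []


print(resolve_difference_pairs(["Pos", "Neg"]))
print(resolve_difference_pairs(["a", "b", "c"]))
print(resolve_difference_pairs(["a", "b", "c"], ["a:b", "a:c"]))
```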

Example Call(s)¶

The examples below all use Docker — the recommended path. They build up from the simplest call (preprocessing only, one subject) to per-trial-type analysis outputs with explicit difference pairs. Singularity users can swap the docker run … invocation for singularity run --cleanenv -B <host>:<container> <image.sif> keeping every flag below the image name identical.

All examples assume your BIDS dataset is at /local/bids_dataset on the host and the derivatives should land in /local/bids_dataset/derivatives.

Example 1 - Basic preprocessing¶

docker run --rm \
  -v /local/bids_dataset:/data:rw \
  ksitek/ffrprep:latest \
  /data \
  /data/derivatives \
  participant \
  --participant_label 01 02 \
  --stage preprocessing

What’s in this call:

docker run --rm runs the container and removes it after completion.
-v /local/bids_dataset:/data:rw mounts the local BIDS dataset read-write so derivatives can be written back to disk.
ksitek/ffrprep:latest is the published image (see Installation for tag pinning).
/data (1st positional) is the BIDS dataset inside the container; /data/derivatives (2nd) is the output directory; participant (3rd) is the BIDS-App analysis level.
--participant_label 01 02 processes only sub-01 and sub-02.
--stage preprocessing runs only the preprocessing stage.
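If you launch the container from a Python script rather than a shell, the same call can be assembled as an argument list. This is a sketch that only uses the flags shown in Example 1; build_docker_command is a hypothetical helper, and the command is printed rather than executed so the snippet runs without Docker installed.

```python
import shlex


def build_docker_command(host_bids_dir, participant_labels,
                         stage="preprocessing",
                         image="ksitek/ffrprep:latest"):
    """Assemble the Example 1 docker invocation as an argv list."""
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{host_bids_dir}:/data:rw",
        image,
        "/data", "/data/derivatives", "participant",
    ]
    if participant_labels:
        cmd += ["--participant_label", *participant_labels]
    cmd += ["--stage", stage]
    return cmd


cmd = build_docker_command("/local/bids_dataset", ["01", "02"])
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list instead of a single string avoids shell quoting issues when the dataset path contains spaces.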

Example 2 - Full analysis with custom parameters¶

docker run --rm \
  -v /local/bids_dataset:/data:rw \
  ksitek/ffrprep:latest \
  /data \
  /data/derivatives \
  participant \
  --stage both \
  --high_pass 0.5 \
  --low_pass 50.0 \
  --ref_channels average \
  --baseline -0.1 0 \
  --n_procs 4

What’s in this call:

--stage both runs both preprocessing and analysis stages.
--high_pass 0.5 / --low_pass 50.0 set the band-pass filter in Hz.
--ref_channels average uses an average reference.
--baseline -0.1 0 sets the baseline window to −100 ms → 0 ms (START and END are passed as two separate values, in seconds).
--n_procs 4 runs 4 per-(task, run) iterations in parallel per subject (see Parallelization below).

By default the analysis stage emits one evoked response per trial_type value in events.tsv (_desc-evoked{Cond}.fif), a combined evoked across all events (_desc-evoked.fif), and — for 2-trial-type datasets — an auto-paired difference evoked (_desc-evokedDiff{A}Vs{B}.fif). This per-trial-type split is governed by --split-by-trial-type (default on); pass --no-split-by-trial-type to emit only the combined evoked. Examples 3 and 4 below show how to pick explicit difference pairs or restrict the output set.

Example 3 - Per-trial-type analysis with explicit difference pairs¶

For datasets with more than two trial_type values, the automatic difference is ambiguous; use --difference-pairs A:B [C:D …] to specify which pairs to subtract. Each pair becomes one _desc-evokedDiff{A}Vs{B}.fif file. The per-trial-type _desc-evoked{Cond}.fif and combined _desc-evoked.fif files are always emitted.

docker run --rm \
  -v /local/bids_dataset:/data:rw \
  ksitek/ffrprep:latest \
  /data \
  /data/derivatives \
  participant \
  --stage both \
  --difference-pairs positive:negative tone1:tone2 \
  --n_procs 4

What’s in this call:

--difference-pairs positive:negative tone1:tone2 emits two difference evokeds: _desc-evokedDiffPositiveVsNegative.fif (positive − negative) and _desc-evokedDiffTone1VsTone2.fif (tone1 − tone2).
All per-trial-type and combined evokeds for this (task, run) group are also written — --split-by-trial-type is on by default, and --difference-pairs requires it (the pairs are subtracted from the per-trial-type evokeds, so combining it with --no-split-by-trial-type would leave nothing to subtract).
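As a quick way to check which files to expect from a call like this, the sketch below lists the _desc- suffixes for a set of trial types and explicit pairs. It follows the naming patterns quoted in this section; capitalising only the first character of each trial type is an assumption inferred from the two file names above, and expected_evoked_suffixes is a hypothetical helper.

```python
def cond_label(trial_type):
    """Capitalise the first character: 'tone1' -> 'Tone1' (assumed rule)."""
    return trial_type[:1].upper() + trial_type[1:]


def expected_evoked_suffixes(trial_types, difference_pairs):
    """List the combined, per-trial-type, and difference file suffixes."""
    suffixes = ["_desc-evoked.fif"]
    suffixes += [f"_desc-evoked{cond_label(t)}.fif" for t in trial_types]
    suffixes += [
        f"_desc-evokedDiff{cond_label(a)}Vs{cond_label(b)}.fif"
        for a, b in difference_pairs
    ]
    return suffixes


suffixes = expected_evoked_suffixes(
    ["positive", "negative", "tone1", "tone2"],
    [("positive", "negative"), ("tone1", "tone2")],
)
for suffix in suffixes:
    print(suffix)
```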

Example 4 - Restrict outputs to a subset, no split¶

The trial-type split is controlled by a single boolean flag, exposed in both forms via argparse.BooleanOptionalAction:

--split-by-trial-type (the default) emits the combined evoked plus one _desc-evoked{Cond}.fif per trial type plus the auto-paired / explicit difference evokeds. Passing this flag explicitly is equivalent to omitting it.
--no-split-by-trial-type strips back to the combined evoked only — no per-trial-type files, no differences.

To keep the split on but restrict it to a subset of trial types (while still emitting the combined evoked), pass --trial-types A B.

# Combined evoked only — strips per-trial-type files and
# any (auto-paired or explicit) difference evokeds
docker run --rm \
  -v /local/bids_dataset:/data:rw \
  ksitek/ffrprep:latest \
  /data \
  /data/derivatives \
  participant \
  --stage both \
  --no-split-by-trial-type \
  --n_procs 4

# Only emit per-trial-type files for "positive" + the combined
# evoked; drop the "negative" output even though events.tsv
# contains both
docker run --rm \
  -v /local/bids_dataset:/data:rw \
  ksitek/ffrprep:latest \
  /data \
  /data/derivatives \
  participant \
  --stage both \
  --trial-types positive \
  --n_procs 4

Parallelization and cluster usage¶

ffrprep follows the standard BIDS-App parallelism model:

Inside one invocation — --n_procs N runs N per-(task, run) iterations concurrently for the subject(s) being processed. Each worker runs its own preprocessing or analysis workflow with the Nipype Linear plugin. Default: --n_procs 1 (sequential). Memory footprint scales linearly with N — each worker loads its own raw + epochs into memory, so dial it down on small machines.
Across invocations — for multi-node / cluster scaling, run one ffrprep invocation per subject under your scheduler (slurm job array, GNU parallel, HTCondor, etc.). Both layers compose: 8 parallel slurm tasks each with --n_procs 4 gives 32-way effective parallelism.
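The arithmetic behind combining the two layers is simple enough to sketch. The per-worker memory figure below is a placeholder you would measure on your own data, not a number provided by ffrprep, and plan_parallelism is a hypothetical helper.

```python
def plan_parallelism(n_invocations, n_procs, per_worker_gb):
    """Estimate total concurrency and memory needed per invocation."""
    return {
        # Invocations (e.g. slurm array tasks) times workers per invocation.
        "effective_workers": n_invocations * n_procs,
        # Memory scales linearly with --n_procs within one invocation.
        "memory_per_invocation_gb": n_procs * per_worker_gb,
    }


plan = plan_parallelism(n_invocations=8, n_procs=4, per_worker_gb=4.0)
print(plan)
```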

Single workstation¶

ffrprep \
  /data/bids_dataset \
  /data/bids_dataset/derivatives \
  participant \
  --participant_label 01 02 03 \
  --n_procs 4

The 4 workers chew through each subject’s (task, run) iterations in parallel, then move to the next subject.

Cluster (slurm job array)¶

Run one invocation per subject via the scheduler; each job uses --n_procs for intra-subject parallelism. Example wrapper script:

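#!/bin/bash
# ffrprep_one_subject.sh
#SBATCH --array=0-99
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
# Fill in the full subject list; the --array range above must match its length
SUBJECTS=(sub-01 sub-02 sub-03 ...)
SUB=${SUBJECTS[$SLURM_ARRAY_TASK_ID]}
# ${SUB#sub-} strips the "sub-" prefix, which --participant_label must not include
ffrprep /data /data/derivatives participant \
    --participant_label "${SUB#sub-}" \
    --n_procs "$SLURM_CPUS_PER_TASK"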
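The subject list for a job array does not have to be typed by hand. As a sketch, the helper below collects participant labels from the sub-* directories of a BIDS dataset; list_participant_labels is a hypothetical name, and the temporary directory only stands in for a real dataset so the snippet can run anywhere.

```python
from pathlib import Path
import tempfile


def list_participant_labels(bids_dir):
    """Return sorted participant labels without the 'sub-' prefix."""
    labels = [
        path.name[len("sub-"):]
        for path in Path(bids_dir).glob("sub-*")
        if path.is_dir()
    ]
    return sorted(labels)


with tempfile.TemporaryDirectory() as bids_dir:
    # Stand-in dataset layout for the demonstration.
    for name in ("sub-01", "sub-02", "sub-03", "derivatives"):
        (Path(bids_dir) / name).mkdir()
    labels = list_participant_labels(bids_dir)
    print(" ".join(labels))
```

The number of labels found this way also tells you the upper bound to use for the array range.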

Cluster (GNU parallel on a single beefy box)¶

parallel -j 8 \
  "ffrprep /data /data/derivatives participant \
     --participant_label {} --n_procs 4" \
  ::: 01 02 03 04 05 06 07 08

Failure handling¶

A failure in any (task, run) iteration aborts the run (fail-fast). The exception propagates up from the worker to the CLI entry point, so the underlying error is visible in the terminal output. Per-iteration logs land in <work_dir>/<task>-<run>.log so you can drill into the failing iteration without scanning the whole subject log.
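To locate the log for a failing iteration quickly, the per-iteration files can be indexed by task and run. The sketch assumes the <task>-<run>.log naming given above and splits on the last hyphen; index_iteration_logs is a hypothetical helper, and the temporary directory with placeholder files only stands in for a real work directory.

```python
from pathlib import Path
import tempfile


def index_iteration_logs(work_dir):
    """Map (task, run) to the log path, assuming <task>-<run>.log naming."""
    logs = {}
    for path in sorted(Path(work_dir).glob("*.log")):
        task, sep, run = path.stem.rpartition("-")
        if sep:  # skip files that do not follow the <task>-<run> pattern
            logs[(task, run)] = path
    return logs


with tempfile.TemporaryDirectory() as work_dir:
    # Placeholder log files for the demonstration.
    for name in ("active-1.log", "active-2.log", "passive-1.log"):
        (Path(work_dir) / name).write_text("placeholder log line\n")
    found = index_iteration_logs(work_dir)
    print(sorted(found))
```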

Support and communication¶

The documentation of this project is found here: https://sitek.github.io/ffrprep.

All bugs, concerns and enhancement requests for this software can be submitted here: https://github.com/sitek/ffrprep/issues.

If you have a problem or would like to ask a question about how to use ffrprep, please submit a question to NeuroStars.org with an ffrprep tag. NeuroStars.org is a platform similar to StackOverflow but dedicated to neuroinformatics.

All previous ffrprep questions are available here: http://neurostars.org/tags/ffrprep/

Not running on a local machine? - Data transfer¶

Please contact your local system administrator regarding possible and favourable transfer options (e.g., rsync or FileZilla).

A very comprehensive approach would be Datalad, which will handle data transfers with the appropriate settings and commands. Datalad also performs version control over your data.