DATA DICTIONARY

Synthetic Speech Atlas: Feature Schema

The tables below list the features extracted from each sample in our synthetic speech atlas, with a brief explanation of each. Features are split into four groups by class.

Global Metadata (All Datasets)

Column            Type      Description
anon_id           int       Anonymous sequential ID (post-shuffle)
label             string    bonafide or spoof
tier              string    Brouhaha quality tier (1=studio, 2=near-field, UNKNOWN=ungraded)
brouhaha_graded   int       1=real Brouhaha grades computed, 0=defaulted
source_licence    string    SPDX licence identifier for this row's source
source_dataset    string    Originating dataset name
duration_ms       float32   Clip duration (ms); bonafide bucketed to 500 ms
duration_s        float32   Clip duration (seconds)
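
As a rough illustration of how the categorical fields in the table above might be sanity-checked, the following sketch validates one row held as a plain dict. The function name check_global_metadata and the dict-per-row layout are assumptions made for this example; only the allowed values come from the table.

```python
# Allowed values taken from the Global Metadata table.
ALLOWED_LABELS = {"bonafide", "spoof"}
ALLOWED_TIERS = {"1", "2", "UNKNOWN"}  # tier is a string column


def check_global_metadata(row):
    """Return a list of problems found in one row (empty list if none).

    `row` is assumed to be a dict keyed by column name (hypothetical layout).
    """
    problems = []
    if row.get("label") not in ALLOWED_LABELS:
        problems.append(f"unexpected label: {row.get('label')!r}")
    if str(row.get("tier")) not in ALLOWED_TIERS:
        problems.append(f"unexpected tier: {row.get('tier')!r}")
    if row.get("brouhaha_graded") not in (0, 1):
        problems.append(f"unexpected brouhaha_graded: {row.get('brouhaha_graded')!r}")
    return problems


if __name__ == "__main__":
    print(check_global_metadata({"label": "spoof", "tier": "UNKNOWN", "brouhaha_graded": 0}))
```

The duration columns are deliberately not cross-checked here, because bonafide duration_ms values are bucketed and may not equal duration_s multiplied by 1000.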

Dataset-Specific Columns

Column       Datasets              Description
vocoder      LibriSeVoc            Neural vocoder architecture name
tts_system   SONAR                 TTS system name (xTTS, OpenAI, FlashSpeech, etc.)
attack_id    ASVspoof5 Dev, Eval   Attack system identifier from ASVspoof5 protocol
codec_env    ASVspoof5 Dev, Eval   Codec environment condition
codec_type   ASVspoof5 Dev, Eval   Codec type applied to clip
gender       ASVspoof5 Dev, Eval   Speaker gender (from ASVspoof5 protocol TSV)
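
The column-to-dataset mapping above can be kept as a small lookup when selecting features per dataset. This is only a sketch: the dataset strings used as keys ("ASVspoof5 Dev", "ASVspoof5 Eval", and so on) are assumed spellings and may not match the values actually stored in source_dataset, and columns_for is a hypothetical helper name.

```python
# Which datasets carry each dataset-specific column, per the table above.
# The dataset name strings are assumptions; check them against source_dataset.
DATASET_COLUMNS = {
    "vocoder": {"LibriSeVoc"},
    "tts_system": {"SONAR"},
    "attack_id": {"ASVspoof5 Dev", "ASVspoof5 Eval"},
    "codec_env": {"ASVspoof5 Dev", "ASVspoof5 Eval"},
    "codec_type": {"ASVspoof5 Dev", "ASVspoof5 Eval"},
    "gender": {"ASVspoof5 Dev", "ASVspoof5 Eval"},
}


def columns_for(dataset):
    """Return the dataset-specific columns listed for `dataset`, sorted by name."""
    return sorted(col for col, datasets in DATASET_COLUMNS.items() if dataset in datasets)


if __name__ == "__main__":
    for name in ("LibriSeVoc", "SONAR", "ASVspoof5 Dev"):
        print(name, columns_for(name))
```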

Tier 1 — Standard DSP Features

Note: Values are stored as float32. DSP-extracted values are rounded to two decimal places and then reinflated to FP16 resolution; statistical properties are preserved.
PRISTINE-gated features are NaN unless brouhaha_graded = 1. Do not impute them with zero; treat them as structurally missing.
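
To give a feel for the precision implied by the note above, here is one possible reading of the rounding step: round to two decimal places, then snap the result to the nearest IEEE 754 half-precision (FP16) value. This is an assumed interpretation for illustration, not the documented extraction code, and to_fp16_resolution is a hypothetical name.

```python
import math
import struct


def to_fp16_resolution(value, decimals=2):
    """Round `value`, then snap it to the nearest FP16-representable number.

    Assumed reading of the note above, not the project's actual pipeline.
    NaN is passed through unchanged so structurally missing values stay missing.
    """
    if math.isnan(value):
        return value
    rounded = round(value, decimals)
    # struct format 'e' is IEEE 754 binary16; pack/unpack performs the snap.
    # Magnitudes beyond the FP16 range (about 65504) raise OverflowError here.
    (snapped,) = struct.unpack("<e", struct.pack("<e", rounded))
    return snapped


if __name__ == "__main__":
    print(to_fp16_resolution(3.14159))
```

Under this reading, stored values sit on the FP16 grid, so exact equality tests against two-decimal literals such as 3.14 should not be expected to succeed.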

Column                    Units      Description
snr_median                dB         Median SNR (Brouhaha); 99.0 if ungraded
snr_mean                  dB         Mean SNR (Brouhaha)
c50_median                dB         Median room clarity C50; 60.0 if ungraded
speech_ratio              0–1        Active speech proportion
pitch_mean                Hz         Mean F0 (voiced frames)
pitch_std                 Hz         F0 standard deviation
pitch_range               Hz         F0 max–min range
npvi                      -          Normalised Pairwise Variability Index (rhythm)
intensity_mean            dB         Mean RMS intensity
intensity_max             dB         Peak intensity
intensity_range           dB         Dynamic range (peak – minimum)
intensity_velocity_max    dB/frame   Max rate of intensity change
jitter_local              %          Cycle-to-cycle period perturbation (PRISTINE-gated)
shimmer_local             %          Cycle-to-cycle amplitude perturbation (PRISTINE-gated)
hnr_mean                  dB         Harmonics-to-noise ratio (PRISTINE-gated)
cpps                      dB         Cepstral peak prominence, smoothed (PRISTINE-gated)
hnr_c50_ratio             -          HNR adjusted for room acoustics (PRISTINE-gated)
cpps_snr_ratio            -          CPPS normalised for noise floor (PRISTINE-gated)
spectral_centroid_mean    Hz         Mean spectral brightness
spectral_tilt             -          HF vs LF energy slope
mfcc_delta_mean           -          Mean first-order MFCC delta
mfcc_high_variance        -          Upper MFCC band variance (bands 12–20)
zcr_mean                  -          Mean zero-crossing rate
teo_mean                  -          Mean Teager-Kaiser Energy Operator
teo_std                   -          TEO temporal standard deviation
f1_mean                   Hz         Mean first formant
f2_mean                   Hz         Mean second formant
f3_mean                   Hz         Mean third formant
formant_dispersion        Hz         F3–F1 vocal tract length proxy
articulation_rate         syl/s      Estimated syllables per second
phoneme_count             -          Estimated phoneme count
emotion_score             0–1        Affective charge heuristic
spectral_7k8k_entropy     bits       7–8 kHz entropy; NaN = codec gate triggered
fam_75hz_sharpness        -          Acoustic mode sharpness at 75 Hz
fam_86hz_sharpness        -          Acoustic mode sharpness at 86 Hz
drr_hf_lf_slope_ratio     -          Direct-to-reverberant HF/LF slope
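
The Tier 1 table mixes two kinds of "no real measurement": NaN for PRISTINE-gated features, and fixed defaults (99.0 for snr_median, 60.0 for c50_median) on ungraded rows. A minimal sketch of one way to surface both as missing, rather than imputing zero, follows. Treating the ungraded defaults as missing is a choice made for this example, and usable_value and the dict-per-row layout are hypothetical.

```python
import math

# Defaults written to ungraded rows, per the Tier 1 table.
UNGRADED_DEFAULTS = {"snr_median": 99.0, "c50_median": 60.0}

# Columns marked "(PRISTINE-gated)" in the Tier 1 table.
PRISTINE_GATED = {
    "jitter_local", "shimmer_local", "hnr_mean",
    "cpps", "hnr_c50_ratio", "cpps_snr_ratio",
}


def usable_value(row, column):
    """Return row[column], or None when the value is structurally missing.

    `row` is assumed to be a dict keyed by column name (hypothetical layout).
    """
    value = row.get(column)
    if value is None or math.isnan(value):
        return None  # NaN means missing; never replace it with 0.0
    if row.get("brouhaha_graded") != 1:
        if column in PRISTINE_GATED:
            return None
        if UNGRADED_DEFAULTS.get(column) == value:
            return None  # default placeholder, not a measured value
    return value


if __name__ == "__main__":
    row = {"brouhaha_graded": 0, "snr_median": 99.0, "pitch_mean": 120.5}
    print(usable_value(row, "snr_median"), usable_value(row, "pitch_mean"))
```

Returning None keeps the missingness explicit, so downstream code can mask or drop these entries instead of treating them as measurements.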

Tier 2 — Biomechanical Features

Note: Only Z-scores are provided. Raw extracted values have been permanently dropped to ensure model-agnosticism.

Column                         Description                          Known Signature
bico_f0_f1_z                   Bicoherence F0–F1 phase coupling     Universal deepfake marker; architecture-invariant
bico_f1_f2_z                   Bicoherence F1–F2 phase coupling     Vocoder formant band independence
modgd_var_z                    Modified group delay variance        TTS collapses to low variance
pgv_magnitude_correlation_z    Phase group velocity correlation     Near-zero across all synthetics
pgv_total_z                    Total phase group velocity energy    Architecture-dependent
f1_velocity_z                  F1 transition rate                   Impossible tongue acceleration; |Z|>9 in production fakes
f2_velocity_z                  F2 transition rate                   Impossible lip acceleration
inertial_decay_residual_z      Biomechanical inertia decay          ~59% instant-kill on SONAR
teo_std_high_z                 TEO high-band std                    Digital vacuum in neural vocoders
teo_std_low_z                  TEO low-band std                     Synthetic LF TEO too smooth
pitch_velocity_max_z           Max F0 rate-of-change                [Placeholder: complete description...]
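
Because Tier 2 columns are already Z-scores, simple magnitude screens can be applied directly. The sketch below flags rows whose f1_velocity_z exceeds the |Z|>9 level mentioned in the table. It is an illustrative screen under that single stated threshold, not a classifier, and exceeds_z_threshold and the dict-per-row layout are hypothetical.

```python
import math


def exceeds_z_threshold(row, column="f1_velocity_z", threshold=9.0):
    """Return True when |Z| for `column` is above `threshold`.

    The default of 9.0 follows the f1_velocity_z note in the Tier 2 table;
    thresholds for other columns are not given there and would be assumptions.
    Missing or NaN values are never flagged.
    """
    z = row.get(column)
    if z is None or math.isnan(z):
        return False
    return abs(z) > threshold


if __name__ == "__main__":
    print(exceeds_z_threshold({"f1_velocity_z": -11.2}))
```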