REFERENCE LIBRARY

Research & Provenance Literature

This repository collects the foundational physics documentation, acoustic theory, and architecture research papers that underpin our extraction methodologies and quality-gating protocols.

Deepfake Detection & Anti-Spoofing

Article Title | Year | Author(s) | Reference / Link
Quantizer-Aware Hierarchical Neural Codec Modeling for Speech Deepfake Detection | 2026 | Unknown (arXiv) | arXiv:2603.16914
VoxAnchor: Grounding Speech Authenticity in Throat Vibration | 2026 | Unknown (arXiv) | arXiv:2603.27562
How to Label Resynthesized Audio: The Dual Role of Neural Audio Codecs in Audio Deepfake Detection | 2026 | Unknown (arXiv) | arXiv:2602.16343
Measuring the Robustness of Audio Deepfake Detectors | 2025 | Unknown (arXiv) | arXiv:2503.17577
Beyond Identity: Generalizable Deepfake Audio Detection | 2025 | Unknown (arXiv) | arXiv:2505.06766
Forensic Deepfake Audio Detection Using Segmental Speech Features | 2025 | Unknown (arXiv) | arXiv:2505.13847
Phoneme-Level Analysis for Person-of-Interest Speech Deepfake Detection | 2025 | Unknown (arXiv) | arXiv:2507.08626
AUDETER: A Large-scale Dataset for Deepfake Audio Detection in Open Worlds | 2025 | Unknown (arXiv) | arXiv:2509.04345
I Can Hear You: Selective Robust Training for Deepfake Audio Detection | 2024 | Unknown (arXiv) | arXiv:2411.00121
CLAD: Robust Audio Deepfake Detection Against Manipulation Attacks with Contrastive Learning | 2024 | Unknown (arXiv) | arXiv:2404.15854
Towards the Detection of Speech Deepfakes for Scam Prevention | 2024 | White, A., & Watson, C. | SST2024
Linear Frequency Residual Cepstral Features for Replay Spoof Detection on ASVSpoof 2019 | 2022 | Singh, P., et al. | IEEE EUSIPCO
Detecting AI-Synthesized Speech Using Bispectral Analysis | 2019 | AlBadawy, E. A., Lyu, S., & Farid, H. | CVPR Workshops 2019
Replay detection using CQT-based modified group delay feature and ResNeWt network in ASVspoof 2019 | 2019 | Unknown | IEEE APSIPA ASC
Audio Deepfake Detection: What Has Been Achieved and What Lies Ahead | N/A | Unknown | PMC11991371

Acoustic Theory & Classical Systems

Article Title | Year | Author(s) | Reference / Link
Speech Representation and Transformation using Adaptive Interpolation of Weighted Spectrum: Vocoder Revisited | 1997 | Kawahara, H., et al. | IEEE ICASSP
Algebraic Code-Excited Linear Prediction (ACELP) | 1995 | Salami, R., et al. | ITU-T G.729
A Mixed Excitation LPC Vocoder Model for Low Bit Rate Speech Coding | 1995 | McCree, A., & Barnwell, T. P. | IEEE Trans. on Speech and Audio Processing
Linear Prediction: A Tutorial Review | 1975 | Makhoul, J. | Proceedings of the IEEE
Acoustic Theory of Speech Production | 1970 | Fant, G. | Mouton
Adaptive Predictive Coding of Speech Signals | 1970 | Atal, B. S., & Schroeder, M. R. | Bell System Technical Journal
Analysis Synthesis Telephony based on the Maximum Likelihood Method | 1968 | Itakura, F., & Saito, S. | Proc. 6th Int. Congress on Acoustics

Generative Speech Models & Neural Codecs

Article Title | Year | Author(s) | Reference / Link
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale | 2023 | Le, M., et al. (Meta) | arXiv:2306.15687
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers (VALL-E) | 2023 | Wang, C., et al. (Microsoft) | arXiv:2301.02111
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis | 2020 | Kong, J., et al. | arXiv:2010.05646
FastSpeech: Fast, Robust and Controllable Text to Speech | 2019 | Ren, Y., et al. | arXiv:1905.09263
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions (Tacotron 2) | 2018 | Shen, J., et al. (Google) | arXiv:1712.05884
WaveNet: A Generative Model for Raw Audio | 2016 | van den Oord, A., et al. (DeepMind) | arXiv:1609.03499