REFERENCE LIBRARY

Research & Provenance Literature

This repository collects the foundational physics documentation, acoustic theory, and architecture research papers that underpin our extraction methodologies and quality-gating protocols.

Deepfake Detection & Anti-Spoofing

Article Title	Year	Author(s)	Reference / Link
Quantizer-Aware Hierarchical Neural Codec Modeling for Speech Deepfake Detection	2026	Unknown (arXiv)	arXiv:2603.16914
VoxAnchor: Grounding Speech Authenticity in Throat Vibration	2026	Unknown (arXiv)	arXiv:2603.27562
How to Label Resynthesized Audio: The Dual Role of Neural Audio Codecs in Audio Deepfake Detection	2026	Unknown (arXiv)	arXiv:2602.16343
Measuring the Robustness of Audio Deepfake Detectors	2025	Unknown (arXiv)	arXiv:2503.17577
Beyond Identity: Generalizable Deepfake Audio Detection	2025	Unknown (arXiv)	arXiv:2505.06766
Forensic Deepfake Audio Detection Using Segmental Speech Features	2025	Unknown (arXiv)	arXiv:2505.13847
Phoneme-Level Analysis for Person-of-Interest Speech Deepfake Detection	2025	Unknown (arXiv)	arXiv:2507.08626
AUDETER: A Large-scale Dataset for Deepfake Audio Detection in Open Worlds	2025	Unknown (arXiv)	arXiv:2509.04345
I Can Hear You: Selective Robust Training for Deepfake Audio Detection	2024	Unknown (arXiv)	arXiv:2411.00121
CLAD: Robust Audio Deepfake Detection Against Manipulation Attacks with Contrastive Learning	2024	Unknown (arXiv)	arXiv:2404.15854
Towards the Detection of Speech Deepfakes for Scam Prevention	2024	White, A., & Watson, C.	SST2024
Linear Frequency Residual Cepstral Features for Replay Spoof Detection on ASVSpoof 2019	2022	Singh, P., et al.	IEEE EUSIPCO
Detecting AI-Synthesized Speech Using Bispectral Analysis	2019	AlBadawy, E. A., Lyu, S., & Farid, H.	CVPR Workshops 2019
Replay Detection Using CQT-Based Modified Group Delay Feature and ResNeWt Network in ASVspoof 2019	2019	Unknown	IEEE APSIPA ASC
Audio Deepfake Detection: What Has Been Achieved and What Lies Ahead	N/A	Unknown	PMC11991371

Acoustic Theory & Classical Systems

Article Title	Year	Author(s)	Reference / Link
Speech Representation and Transformation using Adaptive Interpolation of Weighted Spectrum: Vocoder Revisited	1997	Kawahara, H., et al.	IEEE ICASSP
Algebraic Code-Excited Linear Prediction (ACELP)	1995	Salami, R., et al.	ITU-T G.729
A Mixed Excitation LPC Vocoder Model for Low Bit Rate Speech Coding	1995	McCree, A., & Barnwell, T. P.	IEEE Trans. on Speech and Audio Processing
Linear Prediction: A Tutorial Review	1975	Makhoul, J.	Proceedings of the IEEE
Acoustic Theory of Speech Production	1970	Fant, G.	Mouton
Adaptive Predictive Coding of Speech Signals	1970	Atal, B. S., & Schroeder, M. R.	Bell System Technical Journal
Analysis Synthesis Telephony based on the Maximum Likelihood Method	1968	Itakura, F. & Saito, S.	Proc. 6th Int. Congress on Acoustics

Generative Speech Models & Neural Codecs

Article Title	Year	Author(s)	Reference / Link
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale	2023	Le, M., et al. (Meta)	arXiv:2306.15687
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers (VALL-E)	2023	Wang, C., et al. (Microsoft)	arXiv:2301.02111
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis	2020	Kong, J., et al.	arXiv:2010.05646
FastSpeech: Fast, Robust and Controllable Text to Speech	2019	Ren, Y., et al.	arXiv:1905.09263
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions (Tacotron 2)	2018	Shen, J., et al. (Google)	arXiv:1712.05884
WaveNet: A Generative Model for Raw Audio	2016	van den Oord, A., et al. (DeepMind)	arXiv:1609.03499