Robust Realtime Polyphonic Pitch Detection




Thomas, John M.




Pitch detection is a subset of automatic music transcription, the application of signal processing algorithms to automatically extract musical information from audio signals. The field has been the subject of much research over the years because of its wide range of potential applications. Much work has been done on monophonic signals; far less has been done to tackle the problem of polyphonic music. One major issue for polyphonic pitch detection systems is efficiency: most existing algorithms sacrifice efficiency for accuracy and robustness, while some others make the opposite tradeoff. The purpose of this paper is to work toward new systems that are both robust and efficient enough to run in realtime. First, the background information necessary to explore this topic is presented. Musical terminology and concepts are explained, and some common analytical tools and algorithms used in existing systems are described. Three relatively efficient reference systems are then presented. The first is a multiple fundamental frequency (F0) estimator based on the Auto-Correlation Function (ACF) that uses a unique enhancement algorithm to make individual pitch components easy to identify. The second is a multiple F0 estimator based on the Fast Fourier Transform (FFT) that exploits the harmonic nature of musical sounds. The third reference system outputs pitch numbers directly by using a modified form of an unsupervised learning algorithm called Non-Negative Matrix Factorization (NMF). Investigation of the first two reference systems makes it clear that the two opposing approaches to fundamental frequency estimation (FFT and ACF) are actually quite complementary. Therefore, the remainder of the paper explores the inherent high-frequency versus low-frequency accuracy tradeoffs and proposes potential solutions.
A novel analysis tool called the Combined ACF/FFT Representation (CAFR) is developed and three new pitch detection algorithms are devised from it. These algorithms are then evaluated for both robustness and efficiency and compared against results for the three reference systems.
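
The high-frequency versus low-frequency tradeoff mentioned above can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the function names, the 44.1 kHz sample rate, and the 2048-sample frame are assumptions chosen for illustration, and it considers only raw FFT bins and integer ACF lags (no interpolation or zero-padding). It compares the frequency spacing of each representation against the width of one semitone at a given pitch.

```python
# Illustrative sketch only; names and parameters are assumptions, not the paper's.
SEMITONE_RATIO = 2 ** (1 / 12)  # frequency ratio of one equal-tempered semitone


def semitone_spacing_hz(freq_hz: float) -> float:
    """Distance in Hz from freq_hz to the pitch one semitone above it."""
    return freq_hz * (SEMITONE_RATIO - 1)


def fft_bin_spacing_hz(sample_rate: float, frame_size: int) -> float:
    """Spacing between adjacent FFT bins; constant across the spectrum."""
    return sample_rate / frame_size


def acf_lag_spacing_hz(freq_hz: float, sample_rate: float) -> float:
    """Gap in Hz between the lag matching freq_hz and the next longer lag.

    A period of `lag` samples corresponds to sample_rate / lag Hz, so a
    one-sample step in lag is a large frequency step when the lag is short.
    """
    lag = sample_rate / freq_hz
    return freq_hz - sample_rate / (lag + 1)


if __name__ == "__main__":
    sample_rate, frame_size = 44100.0, 2048  # assumed values
    fft_step = fft_bin_spacing_hz(sample_rate, frame_size)
    for freq in (55.0, 440.0, 3520.0):
        semitone = semitone_spacing_hz(freq)
        acf_step = acf_lag_spacing_hz(freq, sample_rate)
        print(
            f"{freq:7.1f} Hz: semitone={semitone:7.2f} Hz, "
            f"FFT step={fft_step:6.2f} Hz (resolves: {fft_step < semitone}), "
            f"ACF step={acf_step:7.2f} Hz (resolves: {acf_step < semitone})"
        )
```

Under these assumed settings, the FFT bin spacing is wider than a semitone at 55 Hz while the ACF lag spacing is far narrower, and the situation reverses at 3520 Hz. This is one simple way to see why the two representations complement each other.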



Music Transcription, Robust, Pitch Detection, Realtime, Polyphonic, Optimal Accuracy