Improved Detection of Bird Vocalisations Using BirdNET Embeddings and Machine Learning
DOI:
https://doi.org/10.31224/4466
Abstract
Automated bird sound recognition has become an essential tool for biodiversity monitoring, enabling large-scale species detection from audio recordings. BirdNET is a well-known deep learning algorithm that was trained on a vast dataset of weakly labeled recordings and has demonstrated strong performance in identifying bird species. When it is applied to a specific case, such as a single species or a geographical location, its performance can be improved through fine-tuning or by adding a posterior classification step.
In this study, the detection of the Eurasian Woodcock (Scolopax rusticola) is investigated by using BirdNET embeddings as feature representations and training a classifier on them. A strongly labeled dataset was created by manually annotating 97 recent recordings (2023–2024) from Xeno-canto, yielding 501 positive segments and 2,505 negative segments. BirdNET was then evaluated on this dataset, achieving an average precision of 84%. To improve detection accuracy, three machine learning classifiers, namely Support Vector Machine (SVM), Random Forest, and XGBoost, were trained using BirdNET's embeddings as input features. The results show a significant improvement in classification performance, with overall average precision scores of 99–100%, surpassing BirdNET's baseline performance. These findings suggest that a hybrid deep learning and classical machine learning approach can substantially enhance bird species recognition, particularly in challenging acoustic environments.
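The figures above are stated as average precision. The abstract does not say how the metric was computed, so the following is only a minimal sketch of the common ranking-based definition (the mean of the precision values at the rank of each positive segment). The function name `average_precision` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def average_precision(labels, scores):
    """Ranking-based average precision for binary labels (1 = target species).

    Assumption: this is the common definition, i.e. the mean of precision@k
    taken at every rank k where a positive segment appears. Tied scores keep
    their input order (Python's sort is stable).
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for label in labels if label == 1)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank segments from the highest classifier score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / n_pos


# Hypothetical toy data: six segments, three of which contain the target call.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
ap = average_precision(toy_labels, toy_scores)
print(f"average precision = {ap:.3f}")
```

In this toy ranking the positives sit at ranks 1, 3, and 4, so the score is (1/1 + 2/3 + 3/4) / 3. A classifier that ranks every positive segment above every negative one scores exactly 1.0.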
This work contributes to advancing bioacoustic classification methodologies by demonstrating how deep learning embeddings can be combined effectively with traditional classifiers and strongly labeled data for improved species detection. Future research may explore the applicability of this approach to other species and recording conditions, further refining bird sound classification systems.
Posted
Versions
- 2025-05-05 (5)
- 2025-04-23 (4)
- 2025-04-13 (3)
- 2025-03-26 (2)
- 2025-03-25 (1)
License
Copyright (c) 2025 Hakan Dogan

This work is licensed under a Creative Commons Attribution 4.0 International License.