Computational bioacoustics scene analysis

Chapter 11 References

References for Chapter 11 are listed below. You can also download them as a BibTeX file.

[1]	K. Abe and D. Watanabe. Songbirds possess the spontaneous ability to discriminate syntactic rules. Nature Neuroscience, 14:1067--1074, 2011. [ DOI ] Whether the computational systems in language perception involve specific abilities in humans is debated. The vocalizations of songbirds share many features with human speech, but whether songbirds possess a similar computational ability to process auditory information as humans is unknown. We analyzed their spontaneous discrimination of auditory stimuli and found that the Bengalese finch (Lonchura striata var. domestica) can use the syntactical information processing of syllables to discriminate songs. These finches were also able to acquire artificial grammatical rules from synthesized syllable strings and to discriminate novel auditory information according to them. We found that a specific brain region was involved in such discrimination and that this ability was acquired postnatally through the encounter with various conspecific songs. Our results indicate that passerine songbirds spontaneously acquire the ability to process hierarchical structures, an ability that was previously supposed to be specific to humans.
[2]	T. M. Aide, C. Corrada-Bravo, M. Campos-Cerqueira, C. Milan, G. Vega, and R. Alvarez. Real-time bioacoustics monitoring and automated species identification. PeerJ, 1:e103, 2013. [ DOI ] Traditionally, animal species diversity and abundance is assessed using a variety of methods that are generally costly, limited in space and time, and most importantly, they rarely include a permanent record. Given the urgency of climate change and the loss of habitat, it is vital that we use new technologies to improve and expand global biodiversity monitoring to thousands of sites around the world. In this article, we describe the acoustical component of the Automated Remote Biodiversity Monitoring Network (ARBIMON), a novel combination of hardware and software for automating data acquisition, data management, and species identification based on audio recordings. The major components of the cyberinfrastructure include: a solar powered remote monitoring station that sends 1-min recordings every 10 min to a base station, which relays the recordings in real-time to the project server, where the recordings are processed and uploaded to the project website (arbimon.net). Along with a module for viewing, listening, and annotating recordings, the website includes a species identification interface to help users create machine learning algorithms to automate species identification. To demonstrate the system we present data on the vocal activity patterns of birds, frogs, insects, and mammals from Puerto Rico and Costa Rica.
[3]	Ikkyu Aihara, Takeshi Mizumoto, Hiromitsu Awano, and Hiroshi G. Okuno. Call alternation between specific pairs of male frogs revealed by a sound-imaging method in their natural habitat. In Interspeech 2016. International Speech Communication Association, sep 2016. [ DOI ] Male frogs vocalize calls to attract conspecific females as well as to announce their own territories to other male frogs. In the choruses, acoustic interaction allows the male frogs to alternate their calls with each other. Such call alternation is reported in various species of frogs including Japanese tree frogs (Hyla japonica). During call alternation, both male and female frogs are likely to discriminate calls of the male frogs because of small amount of call overlaps. Here, we show that call alternation is observed in natural choruses of male Japanese tree frogs especially between neighboring pairs. First, we demonstrate that caller positions and call timings can be estimated by a sound-imaging method. Second, the occurrence of call alternation is detected on the basis of statistical tests on phase differences of calls between respective pairs. Although our previous study revealed a global synchronization pattern in natural choruses of the male frogs, local chorus structures were not examined well. Through the observation of call alternation between specific pairs, this study suggests the existence of selective attention in the frog choruses.
[4]	Xavier Anguera, Chuck Wooters, and Javier Hernando. Acoustic beamforming for speaker diarization of meetings. IEEE Transactions on Audio, Speech, and Language Processing, 15(7):2011--2022, 2007. [ DOI ] When performing speaker diarization on recordings from meetings, multiple microphones of different qualities are usually available and distributed around the meeting room. Although several approaches have been proposed in recent years to take advantage of multiple microphones, they are either too computationally expensive and not easily scalable or they cannot outperform the simpler case of using the best single microphone. In this paper, the use of classic acoustic beamforming techniques is proposed together with several novel algorithms to create a complete frontend for speaker diarization in the meeting room domain. New techniques we are presenting include blind reference-channel selection, two-step time delay of arrival (TDOA) Viterbi postprocessing, and a dynamic output signal weighting algorithm, together with using such TDOA values in the diarization to complement the acoustic information. Tests on speaker diarization show a 25% relative improvement on the test set compared to using a single most centrally located microphone. Additional experimental results show improvements using these techniques in a speech recognition task.
[5]	Randall Balestriero et al. Scattering decomposition for massive signal classification: from theory to fast algorithm and implementation with validation on international bioacoustic benchmark. In 2015 IEEE International Conference on Data Mining Workshop (ICDMW), pages 753--761. IEEE, 2015. [ DOI ] With the computational power available today, machine learning is becoming a very active field finding its applications in our everyday life. One of its biggest challenges is the classification task involving data representation (the preprocessing part in a machine learning algorithm). In fact, classification of linearly separable data can be easily done. The aim of the preprocessing part is to obtain well represented data by mapping raw data into a "feature space" where simple classifiers can be used efficiently. For example, almost everything around audio/bioacoustic uses MFCC features until now. We present here a toolbox giving the basic tools for audio representation using the C++ programming language by providing an implementation of the Scattering Network which brings a new and powerful solution for these tasks. We focused our implementation on massive datasets and server applications. The toolkit of reference in scattering analysis is SCATNET from Mallat et al. This tool is an attempt to have some of the scatnet features more tractable for Big Data challenges. Furthermore, the use of this toolbox is not limited to machine learning preprocessing. It can also be used for more advanced biological analysis such as animal communication behaviours analysis or any biological study related to signal analysis. This implementation gives out of the box executables that can be used by simple commands without a graphical interface and is thus suited for server applications. As we will review in the next part, we will need to perform data manipulation on huge datasets. It becomes important to have fast and efficient implementations in order to deal with this new "Big Data" era.
[6]	Daniel T. Blumstein, Daniel J. Mennill, Patrick Clemins, Lewis Girod, Kung Yao, Gail Patricelli, Jill L. Deppe, Alan H. Krakauer, Christopher Clark, Kathryn A. Cortopassi, Sean F. Hanser, Brenda McCowan, Andreas M. Ali, and Alexander N. G. Kirschel. Acoustic monitoring in terrestrial environments using microphone arrays: applications, technological considerations and prospectus. Journal of Applied Ecology, 48(3):758--767, apr 2011. [ DOI | http ] 1. Animals produce sounds for diverse biological functions such as defending territories, attracting mates, deterring predators, navigation, finding food and maintaining contact with members of their social group. Biologists can take advantage of these acoustic behaviours to gain valuable insights into the spatial and temporal scales over which individuals and populations interact. Advances in bioacoustic technology, including the development of autonomous cabled and wireless recording arrays, permit data collection at multiple locations over time. These systems are transforming the way we study individuals and populations of animals and are leading to significant advances in our understandings of the complex interactions between animals and their habitats. 2. Here, we review questions that can be addressed using bioacoustic approaches, by providing a primer on technologies and approaches used to study animals at multiple organizational levels by ecologists, behaviourists and conservation biologists. 3. Spatially dispersed groups of microphones (arrays) enable users to study signal directionality on a small scale or to locate animals and track their movements on a larger scale. 4. Advances in algorithm development can allow users to discriminate among species, sexes, age groups and individuals. 5. With such technology, users can remotely and non-invasively survey populations, describe the soundscape, quantify anthropogenic noise, study species interactions, gain new insights into the social dynamics of sound-producing animals and track the effects of factors such as climate change and habitat fragmentation on phenology and biodiversity. 6. There remain many challenges in the use of acoustic monitoring, including the difficulties in performing signal recognition across taxa. The bioacoustics community should focus on developing a common framework for signal recognition that allows for various species’ data to be analysed by any recognition system supporting a set of common standards. 7. Synthesis and applications. Microphone arrays are increasingly used to remotely monitor acoustically active animals. We provide examples from a variety of taxa where acoustic arrays have been used for ecological, behavioural and conservation studies. We discuss the technologies used, the methodologies for automating signal recognition and some of the remaining challenges. We also make recommendations for using this technology to aid in wildlife management.
[7]	Hielke Freerk Boersma. Characterization of the natural ambient sound environment: Measurements in open agricultural grassland. Journal of the Acoustical Society of America, 101:2104, 1997. [ DOI ] The audibility of manmade sound in a natural environment is affected because of masking by ambient sound. In this report results are presented of measurements of the level and spectral composition of natural ambient sound. The statistical L95 level was determined, i.e., the sound pressure level which is exceeded for 95% of the time, at various wind velocities in open agricultural grassland. The total L95 is described by L95 = 37.9 log(v) + 42.5, where v is the wind velocity. The standard deviation with respect to the data points is 2.4 dB. For the A-weighted L95 we found a similar relation [LA95 = 22.6 log(v) + 22.7]. The relation between wind speed and natural ambient sound level for each ⅓-oct band was also determined with nominal midband frequencies ranging from 6.3 Hz to 20 kHz separately. The frequency spectrum of ambient sound shows at low frequencies a behavior typical for turbulent processes. The ⅓-oct band intensity of sound at these frequencies is found to be proportional to f⁻². Dimensionless spectra obtained for low frequencies at wind speeds exceeding 2 m/s collapse into a line almost identical to results of low-turbulence measurements performed by Strasberg.
[8]	A. L. Borker, M. W. McKown, J. T. Ackerman, C. A. Eagles-Smith, B. R. Tershy, and D. A. Croll. Vocal activity as a low cost and scalable index of seabird colony size. Conservation Biology, Mar 2014. [ DOI ] Although wildlife conservation actions have increased globally in number and complexity, the lack of scalable, cost-effective monitoring methods limits adaptive management and the evaluation of conservation efficacy. Automated sensors and computer-aided analyses provide a scalable and increasingly cost-effective tool for conservation monitoring. A key assumption of automated acoustic monitoring of birds is that measures of acoustic activity at colony sites are correlated with the relative abundance of nesting birds. We tested this assumption for nesting Forster's terns (Sterna forsteri) in San Francisco Bay for 2 breeding seasons. Sensors recorded ambient sound at 7 colonies that had 15–111 nests in 2009 and 2010. Colonies were spaced at least 250 m apart and ranged from 36 to 2,571 m². We used spectrogram cross-correlation to automate the detection of tern calls from recordings. We calculated mean seasonal call rate and compared it with mean active nest count at each colony. Acoustic activity explained 71% of the variation in nest abundance between breeding sites and 88% of the change in colony size between years. These results validate a primary assumption of acoustic indices; that is, for terns, acoustic activity is correlated to relative abundance, a fundamental step toward designing rigorous and scalable acoustic monitoring programs to measure the effectiveness of conservation actions for colonial birds and other acoustically active wildlife.
[9]	E. Briefer and A. G. McElligott. Indicators of age, body size and sex in goat kid calls revealed using the source-filter theory. Applied Animal Behaviour Science, 133:175--185, 2011. [ DOI ]
[10]	F. Briggs, R. Raich, and X. Z. Fern. Audio classification of bird species: A statistical manifold approach. In Proceedings of the Ninth IEEE International Conference on Data Mining, pages 51--60, Dec 2009. [ DOI ] Our goal is to automatically identify which species of bird is present in an audio recording using supervised learning. Devising effective algorithms for bird species classification is a preliminary step toward extracting useful ecological data from recordings collected in the field. We propose a probabilistic model for audio features within a short interval of time, then derive its Bayes risk-minimizing classifier, and show that it is closely approximated by a nearest-neighbor classifier using Kullback-Leibler divergence to compare histograms of features. We note that feature histograms can be viewed as points on a statistical manifold, and KL divergence approximates geodesic distances defined by the Fisher information metric on such manifolds. Motivated by this fact, we propose the use of another approximation to the Fisher information metric, namely the Hellinger metric. The proposed classifiers achieve over 90% accuracy on a data set containing six species of bird, and outperform support vector machines.
[11]	Giuseppa Buscaino, Maria Ceraulo, Nadia Pieretti, Valentina Corrias, Almo Farina, Francesco Filiciotto, Vincenzo Maccarrone, Rosario Grammauta, Francesco Caruso, Alonge Giuseppe, et al. Temporal patterns in the soundscape of the shallow waters of a Mediterranean marine protected area. Scientific Reports, 6, 2016. [ DOI ] The study of marine soundscapes is an emerging field of research that contributes important information about biological compositions and environmental conditions. The seasonal and circadian soundscape trends of a marine protected area (MPA) in the Mediterranean Sea have been studied for one year using an autonomous acoustic recorder. Frequencies less than 1 kHz are dominated by noise generated by waves and are louder during the winter; conversely, higher frequencies (4–96 kHz) are dominated by snapping shrimp, which increase their acoustic activity at night during the summer. Fish choruses, below 2 kHz, characterize the soundscape at sunset during the summer. Because there are 13 vessel passages per hour on average, causing acoustic interference with fish choruses 46% of the time, this MPA cannot be considered to be protected from noise. On the basis of the high seasonal variability of the soundscape components, this study proposes a one-year acoustic monitoring protocol using the soundscape methodology approach and discusses the concept of MPA size.
[12]	R. T. Buxton and I. L. Jones. Measuring nocturnal seabird activity and status using acoustic recording devices: applications for island restoration. Journal of Field Ornithology, 83(1):47--60, 2012. [ DOI ] Nocturnal burrow-nesting seabirds breeding on isolated oceanic islands pose challenges to conventional monitoring techniques, resulting in their frequent exclusion from population studies. These seabirds have been devastated by nonnative predator introductions on islands worldwide. After predators are eradicated, recovery has been poorly quantified, but evidence suggests some nocturnal seabird populations have been slow to return. We evaluated the use of automated acoustic recorders and call-recognition software to investigate nocturnal seabird recovery after removal of introduced Arctic foxes (Alopex lagopus) in the Aleutian Archipelago, Alaska. We compared relative seabird abundance among islands by examining levels of vocal activity. We deployed acoustic recorders on Nizki-Alaid, Amatignak, and Little Sitkin islands that had foxes removed in 1975, 1991, and 2000, respectively, and on Buldir, a predator-free seabird colony. Despite frequent gales, only 2.9% of 2230 recording hours from May to August of 2008 and 2009 were unusable due to wind noise. Recording quality and call recognition model success were highest when recording devices were placed at sites offering some wind shelter. We detected greater vocal activity of Fork-tailed (Oceanodroma furcata) and Leach's (O. leucorhoa) storm-petrels and Ancient Murrelets (Synthliboramphus antiquus) on islands with longer time periods since fox eradication. Also, by detecting chick calls in the automated recordings, we confirmed breeding by Ancient Murrelets on an island thought to be abandoned due to fox predation. Acoustic monitoring allowed us to examine the relative abundance of seabirds at remote sites. If a link between vocalizations and population dynamics can be made, acoustic monitoring could be a powerful census method.
[13]	Michael A Casey and Malcolm Slaney. Song intersection by approximate nearest neighbor search. In Proceedings of the International Symposium on Music Information Retrieval (ISMIR), volume 6, pages 144--149, 2006. [ DOI ] We present new methods for computing inter-song similarities using intersections between multiple audio pieces. The intersection contains portions that are similar, when one song is a derivative work of the other for example, in two different musical recordings. To scale our search to large song databases we have developed an algorithm based on locality-sensitive hashing (LSH) of sequences of audio features called audio shingles. LSH provides an efficient means to identify approximate nearest neighbors in a high-dimensional feature space. We combine these nearest neighbor estimates, each a match from a very large database of audio to a small portion of the query song, to form a measure of the approximate similarity. We demonstrate the utility of our methods on a derivative works retrieval experiment using both exact and approximate (LSH) methods. The results show that LSH is at least an order of magnitude faster than the exact nearest neighbor method and that accuracy is not impacted by the approximate method.
[14]	Z. Chen and R. C. Maher. Semi-automatic classification of bird vocalizations using spectral peak tracks. Journal of the Acoustical Society of America, 120(5):2974--2984, Nov 2006. [ DOI ] Automatic off-line classification and recognition of bird vocalizations has been a subject of interest to ornithologists and pattern detection researchers for many years. Several new applications, including bird vocalization classification for aircraft bird strike avoidance, will require real time classification in the presence of noise and other disturbances. The vocalizations of many common bird species can be represented using a sum-of-sinusoids model. An experiment using computer software to perform peak tracking of spectral analysis data demonstrates the usefulness of the sum-of-sinusoids model for rapid automatic recognition of isolated bird syllables. The technique derives a set of spectral features by time-variant analysis of the recorded bird vocalizations, then performs a calculation of the degree to which the derived parameters match a set of stored templates that were determined from a set of reference bird vocalizations. The results of this relatively simple technique are favorable for both clean and noisy recordings.
[15]	A. Coates and A. Y. Ng. Learning feature representations with k-means. In G. Montavon, G. B. Orr, and K.-R. Müller, editors, Neural Networks: Tricks of the Trade, pages 561--580. Springer, 2012. [ DOI ] Many algorithms are available to learn deep hierarchies of features from unlabeled data, especially images. In many cases, these algorithms involve multi-layered networks of features (e.g., neural networks) that are sometimes tricky to train and tune and are difficult to scale up to many machines effectively. Recently, it has been found that K-means clustering can be used as a fast alternative training method. The main advantage of this approach is that it is very fast and easily implemented at large scale. On the other hand, employing this method in practice is not completely trivial: K-means has several limitations, and care must be taken to combine the right ingredients to get the system to work well. This chapter will summarize recent results and technical tricks that are needed to make effective use of K-means clustering for learning large-scale representations of images. We will also connect these results to other well-known algorithms to make clear when K-means can be most useful and convey intuitions about its behavior that are useful for debugging and engineering new systems.
[16]	Juan Gabriel Colonna, Marco Cristo, Mario Salvatierra, and Eduardo Freire Nakamura. An incremental technique for real-time bioacoustic signal segmentation. Expert Systems with Applications, 2015. [ DOI ] A bioacoustical animal recognition system is composed of two parts: (1) the segmenter, responsible for detecting syllables (animal vocalization) in the audio; and (2) the classifier, which determines the species/animal to which the syllables belong. In this work, we first present a novel technique for automatic segmentation of anuran calls in real time; then, we present a method to assess the performance of the whole system. The proposed segmentation method performs an unsupervised binary classification of time series (audio) that incrementally computes two exponentially-weighted features (Energy and Zero Crossing Rate). In our proposal, classical sliding temporal windows are replaced with counters that give higher weights to new data, allowing us to distinguish between a syllable and ambient noise (considered as silences). Compared to sliding-window approaches, the associated memory cost of our proposal is lower, and processing speed is higher. Our evaluation of the segmentation component considers three metrics: (1) the Matthews Correlation Coefficient for point-to-point comparison; (2) the WinPR to quantify the precision of boundaries; and (3) the AEER for event-to-event counting. The experiments were carried out in a dataset with 896 syllables of seven different species of anurans. To evaluate the whole system, we derived four equations that help understand the impact that the precision and recall of the segmentation component have on the classification task. Finally, our experiments show a segmentation/recognition improvement of 37%, while reducing memory and data communication. Therefore, results suggest that our proposal is suitable for resource-constrained systems, such as Wireless Sensor Networks (WSNs).
[17]	D. K. Dawson and M. G. Efford. Bird population density estimated from acoustic signals. Journal of Applied Ecology, 46(6):1201--1209, Nov 2009. [ DOI ] 1. Many animal species are detected primarily by sound. Although songs, calls and other sounds are often used for population assessment, as in bird point counts and hydrophone surveys of cetaceans, there are few rigorous methods for estimating population density from acoustic data. 2. The problem has several parts -- distinguishing individuals, adjusting for individuals that are missed, and adjusting for the area sampled. Spatially explicit capture--recapture (SECR) is a statistical methodology that addresses jointly the second and third parts of the problem. We have extended SECR to use uncalibrated information from acoustic signals on the distance to each source. 3. We applied this extension of SECR to data from an acoustic survey of ovenbird Seiurus aurocapilla density in an eastern US deciduous forest with multiple four-microphone arrays. We modelled average power from spectrograms of ovenbird songs measured within a window of 0·7 s duration and frequencies between 4200 and 5200 Hz. 4. The resulting estimates of the density of singing males (0·19 ha⁻¹ SE 0·03 ha⁻¹) were consistent with estimates of the adult male population density from mist-netting (0·36 ha⁻¹ SE 0·12 ha⁻¹). The fitted model predicts sound attenuation of 0·11 dB m⁻¹ (SE 0·01 dB m⁻¹) in excess of losses from spherical spreading. 5. Synthesis and applications. Our method for estimating animal population density from acoustic signals fills a gap in the census methods available for visually cryptic but vocal taxa, including many species of bird and cetacean. The necessary equipment is simple and readily available; as few as two microphones may provide adequate estimates, given spatial replication. The method requires that individuals detected at the same place are acoustically distinguishable and all individuals vocalize during the recording interval, or that the per capita rate of vocalization is known. We believe these requirements can be met, with suitable field methods, for a significant number of songbird species.
[18]	S. Dixon. Onset detection revisited. In Proc. of the Int. Conf. on Digital Audio Effects (DAFx-06), pages 133--137, Montreal, Quebec, Canada, 2006.
[19]	S. Duan, J. Zhang, P. Roe, J. Wimmer, X. Dong, A. Truskinger, and M. Towsey. Timed probabilistic automaton: a bridge between Raven and Song Scope for automatic species recognition. In Proceedings of the Twenty-Fifth Innovative Applications of Artificial Intelligence Conference, pages 1519--1524. AAAI, 2013. [ DOI ]
[20]	J. E. Elie and F. E. Theunissen. The vocal repertoire of the domesticated zebra finch: a data-driven approach to decipher the information-bearing acoustic features of communication signals. Animal Cognition, pages 1--31, 2015. [ DOI ] Although a universal code for the acoustic features of animal vocal communication calls may not exist, the thorough analysis of the distinctive acoustical features of vocalization categories is important not only to decipher the acoustical code for a specific species but also to understand the evolution of communication signals and the mechanisms used to produce and understand them. Here, we recorded more than 8000 examples of almost all the vocalizations of the domesticated zebra finch, Taeniopygia guttata: vocalizations produced to establish contact, to form and maintain pair bonds, to sound an alarm, to communicate distress or to advertise hunger or aggressive intents. We characterized each vocalization type using complete representations that avoided any a priori assumptions on the acoustic code, as well as classical bioacoustics measures that could provide more intuitive interpretations. We then used these acoustical features to rigorously determine the potential information-bearing acoustical features for each vocalization type using both a novel regularized classifier and an unsupervised clustering algorithm. Vocalization categories are discriminated by the shape of their frequency spectrum and by their pitch saliency (noisy to tonal vocalizations) but not particularly by their fundamental frequency. Notably, the spectral shape of zebra finch vocalizations contains peaks or formants that vary systematically across categories and that would be generated by active control of both the vocal organ (source) and the upper vocal tract (filter).
[21]	Sabrina Engesser, Jodie MS Crane, James L Savage, Andrew F Russell, and Simon W Townsend. Experimental evidence for phonemic contrasts in a nonhuman vocal system. PLoS Biol, 13(6):e1002171, 2015. [ DOI ] The ability to generate new meaning by rearranging combinations of meaningless sounds is a fundamental component of language. Although animal vocalizations often comprise combinations of meaningless acoustic elements, evidence that rearranging such combinations generates functionally distinct meaning is lacking. Here, we provide evidence for this basic ability in calls of the chestnut-crowned babbler (Pomatostomus ruficeps), a highly cooperative bird of the Australian arid zone. Using acoustic analyses, natural observations, and a series of controlled playback experiments, we demonstrate that this species uses the same acoustic elements (A and B) in different arrangements (AB or BAB) to create two functionally distinct vocalizations. Specifically, the addition or omission of a contextually meaningless acoustic element at a single position generates a phoneme-like contrast that is sufficient to distinguish the meaning between the two calls. Our results indicate that the capacity to rearrange meaningless sounds in order to create new signals occurs outside of humans. We suggest that phonemic contrasts represent a rudimentary form of phoneme structure and a potential early step towards the generative phonemic system of human language.
[22]	S. Fagerlund. Bird species recognition using support vector machines. EURASIP Journal on Applied Signal Processing, page 38637, 2007. [ DOI ] Automatic identification of bird species by their vocalization is studied in this paper. Bird sounds are represented with two different parametric representations: (i) the mel-cepstrum parameters and (ii) a set of low-level signal parameters, both of which have been found useful for bird species recognition. Recognition is performed in a decision tree with support vector machine (SVM) classifiers at each node that perform classification between two species. Recognition is tested with two sets of bird species whose recognition has been previously tested with alternative methods. Recognition results with the proposed method suggest better or equal performance when compared to existing reference methods.
[23]	Almo Farina, Nadia Pieretti, and Luigi Piccioli. The soundscape methodology for long-term bird monitoring: a Mediterranean Europe case-study. Ecological Informatics, 6(6):354--363, 2011. [ DOI ] The soundscape represents the acoustic footprint of a landscape, and may well be a source of a vast amount of information that could be used efficiently in, for example, long-term bird aggregation monitoring schemes. To depict such soundscape footprint, specific indexes are requested. In particular, the aim of this paper was to extensively describe the Acoustic Complexity Index (ACI) and to successively apply it to process the sound files recorded in an ecologically fragile area in a Mediterranean maqui (Eastern Liguria, Italy). Daily acoustic animal activity was sampled in 90 one-minute files between the end of May and the end of July, 2010, using a pre-programmed recording procedure (Songmeter, Wildlife Acoustic). The WaveSurfer software, powered by the Soundscape Metric plug-in, was then utilized to quickly process these data. This approach allows the identification of the compositional changes and acoustic fluctuations activity of a local community (in the proposed case prevalently composed by birds and cicadas). In particular, two distinct patterns emerged during the investigation. From 20 May to 4 July, the soundscape was dominated by birds but, after that period, the onset of the cicadas' songs completely changed the sound dynamics. The proposed methodology has been demonstrated to be a powerful tool to identify the complex patterns of the soundscape across different temporal scales (hours, days and intraseason). This approach could also be adopted in long-term studies to monitor animal dynamics under different environmental scenarios.
[24]	A Gasc, J Sueur, F Jiguet, V Devictor, P Grandcolas, C Burrow, M Depraetere, and S Pavoine. Assessing biodiversity with sound: Do acoustic diversity indices reflect phylogenetic and functional diversities of bird communities? Ecological Indicators, 25:279--287, 2013. [ DOI ] Measuring biodiversity is a challenging task for research in taxonomy, ecology and conservation. Biodiversity is commonly measured using metrics related to species richness, phylogenetic-, or functional-trait diversity of species assemblages. Because these metrics are not always correlated with each other, they have to be considered separately. A descriptor of animal diversity based on the diversity of sounds produced by animal communities, named here the Community Acoustic Diversity (CAD), was recently proposed. In many cases, the CAD could be easier to measure than other metrics. Although previous analyses have revealed that acoustic diversity might increase as species richness increases, the ability of CAD to reflect other components of biodiversity has not been formally investigated. The aim of this study is to test theoretically whether functional and phylogenetic diversities could be reflected by acoustic diversity indices in bird communities. Data on species assemblages were collated by the French Breeding Bird Survey describing spatial and temporal variation in community structure and composition across France since 2001. Phylogenetic and functional data were compiled from literature. Acoustic data were obtained from sound libraries. For each of the 19,420 sites sampled, indices of phylogenetic, functional and acoustic diversity of bird communities were calculated based on species’ pair-wise distance matrices and species’ abundances. The different aspects of biodiversity were compared through correlation analyses. 
The results showed that acoustic diversity was correlated with phylogenetic diversity, when the branch lengths of the tree were considered, and to functional diversity, especially body mass and reproduction. Correlations between phylogenetic, functional and acoustic distances among species did not entirely explain the correlations between phylogenetic, functional and acoustic diversity within communities. This result was interpreted as an effect of local ecological processes underpinning how bird communities assemble. Comparing the diversity patterns with a null model, phylogenetic and functional diversities were significantly clustered whereas acoustic diversity was not different from what was expected by chance. A comparison between acoustic indices showed that spectral component of acoustic diversity seems more appropriate to reveal bird phylogenetic diversity whereas temporal component seems more adapted to reveal functional diversity of a bird community. Overall, even if the processes at the origin of the different facets of biodiversity are different, CAD reveals part of phylogenetic diversity and some extent of functional diversity.
[25]	L. F. Gill, W. Goymann, A. Ter Maat, and M. Gahr. Patterns of call communication between group-housed zebra finches change during the breeding cycle. eLife, 4, 2015. [ DOI ] Vocal signals such as calls play a crucial role for survival and successful reproduction, especially in group-living animals. However, call interactions and call dynamics within groups remain largely unexplored because their relation to relevant contexts or life-history stages could not be studied with individual-level resolution. Using on-bird microphone transmitters, we recorded the vocalisations of individual zebra finches (Taeniopygia guttata) behaving freely in social groups, while females and males previously unknown to each other passed through different stages of the breeding cycle. As birds formed pairs and shifted their reproductive status, their call repertoire composition changed. The recordings revealed that calls occurred non-randomly in fine-tuned vocal interactions and decreased within groups while pair-specific patterns emerged. Call-type combinations of vocal interactions changed within pairs and were associated with successful egg-laying, highlighting a potential fitness relevance of calling dynamics in communication systems.
[26]	Douglas Gillespie, David K. Mellinger, Jonathan Gordon, David McLaren, Paul Redmond, Ronald McHugh, Philip Trinder, Xiao-Yan Deng, and Aaron Thode. PAMGUARD: Semiautomated, open source software for real-time acoustic detection and localization of cetaceans. Journal of the Acoustical Society of America, 125(4):2547--2547, apr 2009. [ DOI | http ] PAMGUARD is open‐source, platform‐independent software to address the needs of developers and users of Passive Acoustic Monitoring (PAM) systems. For the PAM operator—marine mammal biologist, manager, or mitigator—PAMGUARD provides a flexible and easy‐to‐use suite of detection, localization, data management, and display modules. These provide a standard interface across different platforms with the flexibility to allow multiple detectors to be added, removed, and configured according to the species of interest and the hardware configuration on a particular project. For developers of PAM systems, an Application Programming Interface (API) has been developed which contains standard classes for the efficient handling of many types of data, interfaces to acquisition hardware and to databases, and a GUI framework for data display. PAMGUARD replicates and exceeds the capabilities of earlier real time monitoring programs such as the IFAW Logger Suite and Ishmael. Ongoing developments include improved real‐time location and automated species classification. [PAMGUARD funded by the OGP E&P Sound and Marine Life project.]
[27]	Hervé Goëau, Hervé Glotin, Willem-Pier Vellinga, Robert Planqué, and Alexis Joly. LifeCLEF bird identification task 2016: The arrival of deep learning. In Working Notes of CLEF 2016-Conference and Labs of the Evaluation forum, Évora, Portugal, 5-8 September, 2016., pages 440--449, 2016. [ DOI ] The LifeCLEF bird identification challenge provides a large-scale testbed for the system-oriented evaluation of bird species identification based on audio recordings. One of its main strength is that the data used for the evaluation is collected through Xeno-Canto, the largest network of bird sound recordists in the world. This makes the task closer to the conditions of a real-world application than previous, similar initiatives. The main novelty of the 2016-th edition of the challenge was the inclusion of soundscape recordings in addition to the usual xeno-canto recordings that focus on a single foreground species. This paper reports the methodology of the conducted evaluation, the overview of the systems experimented by the 6 participating research groups and a synthetic analysis of the obtained results.
[28]	A. Härmä and P. Somervuo. Classification of the harmonic structure in bird vocalization. In Proc International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04), volume 5, pages 701--704, 2004. [ DOI ] The article is related to the development of techniques for automatic recognition of bird species by their sounds. It has been demonstrated earlier that a simple model of one time-varying sinusoid is very useful in classification and recognition of typical bird sounds. However, a large class of bird sounds are not pure sinusoids but have a clear harmonic spectrum structure. We introduce a way to classify bird syllables into four classes by their harmonic structure.
[29]	Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 604--613. ACM, 1998. [ DOI ]
[30]	Peter Jancovic and Munevver Kokuer. Acoustic recognition of multiple bird species based on penalised maximum likelihood. IEEE Signal Processing Letters, pages 1--1, 2015. [ DOI | http ] Automatic system for recognition of multiple bird species in audio recordings is presented. Time-frequency segmentation of the acoustic scene is obtained by employing a sinusoidal detection algorithm, which does not require any estimate of noise and is able to handle multiple simultaneous bird vocalizations. Each segment is characterized as a sequence of frequencies over time, referred to as a frequency track. Each bird species is represented by a hidden Markov model that models the temporal evolution of frequency tracks. The decision on the number and identity of bird species in a given recording is obtained based on maximizing the overall likelihood of the set of detected segments, with a penalization applied for increasing the number of bird models used. Experimental evaluations are performed on audio field recordings containing 30 bird species. The presence of multiple bird species is simulated by joining the set of detected segments from several bird species. Results show that the proposed method can achieve recognition performance for multiple bird species not far from that obtained for single bird species, and considerably outperforms majority voting methods.
[31]	A. T. Johansson and P. R. White. An adaptive filter-based method for robust, automatic detection and frequency estimation of whistles. Journal of the Acoustical Society of America, 130(2):893--903, 2011. [ DOI ] This paper proposes an adaptive filter-based method for detection and frequency estimation of whistle calls, such as the calls of birds and marine mammals, which are typically analyzed in the time-frequency domain using a spectrogram. The approach taken here is based on adaptive notch filtering, which is an established technique for frequency tracking. For application to automatic whistle processing, methods for detection and improved frequency tracking through frequency crossings as well as interfering transients are developed and coupled to the frequency tracker. Background noise estimation and compensation is accomplished using order statistics and pre-whitening. Using simulated signals as well as recorded calls of marine mammals and a human whistled speech utterance, it is shown that the proposed method can detect more simultaneous whistles than two competing spectrogram-based methods while not reporting any false alarms on the example datasets. In one example, it extracts complete 1.4 and 1.8 s bottlenose dolphin whistles successfully through frequency crossings. The method performs detection and estimates frequency tracks even at high sweep rates. The algorithm is also shown to be effective on human whistled utterances.
[32]	A. Kershenbaum, D. T. Blumstein, M. A. Roch, Ç. A. Akçay, G. Backus, M. A. Bee, K. Bohn, Y. Cao, G. Carter, C. Cäsar, et al. Acoustic sequences in non-human animals: a tutorial review and prospectus. Biological Reviews, 2014. [ DOI ]
[33]	A. Kershenbaum, A. E. Bowles, T. M. Freeberg, D. Z. Jin, A. R. Lameira, and K. Bohn. Animal vocal sequences: not the Markov chains we thought they were. Proceedings of the Royal Society B: Biological Sciences, 281(1792):20141370, 2014. [ DOI ]
[34]	Arik Kershenbaum, Holly Root-Gutteridge, Bilal Habib, Janice Koler-Matznick, Brian Mitchell, Vicente Palacios, and Sara Waller. Disentangling canid howls across multiple species and subspecies: Structure in a complex communication channel. Behavioural processes, 124:149--157, 2016. [ DOI ] Wolves, coyotes, and other canids are members of a diverse genus of top predators of considerable conservation and management interest. Canid howls are long-range communication signals, used both for territorial defence and group cohesion. Previous studies have shown that howls can encode individual and group identity. However, no comprehensive study has investigated the nature of variation in canid howls across the wide range of species. We analysed a database of over 2000 howls recorded from 13 different canid species and subspecies. We applied a quantitative similarity measure to compare the modulation pattern in howls from different populations, and then applied an unsupervised clustering algorithm to group the howls into natural units of distinct howl types. We found that different species and subspecies showed markedly different use of howl types, indicating that howl modulation is not arbitrary, but can be used to distinguish one population from another. We give an example of the conservation importance of these findings by comparing the howls of the critically endangered red wolves to those of sympatric coyotes Canis latrans, with whom red wolves may hybridise, potentially compromising reintroduced red wolf populations. We believe that quantitative cross-species comparisons such as these can provide important understanding of the nature and use of communication in socially cooperative species, as well as support conservation and management of wolf populations.
[35]	Daniel Kohlsdorf, Denise Herzing, and Thad Starner. Feature learning and automatic segmentation for dolphin communication analysis. In Interspeech 2016. International Speech Communication Association, sep 2016. [ DOI | http ] The study of dolphin cognition involves intensive research of animal vocalizations recorded in the field. We address the automated analysis of audible dolphin communication and propose a system that automatically discovers patterns in dolphin signals. These patterns are invariant to frequency shifts and time warping transformations. The discovery algorithm is based on feature learning and unsupervised time series segmentation using hidden Markov models. Researchers can inspect the patterns visually and interactively run comparative statistics between the distribution of dolphin signals in different behavioral contexts. Our results indicate that our system provides meaningful patterns to the marine biologist and that the comparative statistics are aligned with the biologists domain knowledge.
[36]	Ryosuke Kojima, Osamu Sugiyama, Reiji Suzuki, Kazuhiro Nakadai, and Charles E Taylor. Semi-automatic bird song analysis by spatial-cue-based integration of sound source detection, localization, separation, and identification. In Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on, pages 1287--1292. IEEE, 2016. [ DOI ] This paper addresses bird song analysis based on semi-automatic annotation. Research in animal behavior, especially with birds, would be aided by automated (or semiautomated) systems that can localize sounds, measure their timing, and identify their source. This is difficult to achieve in real environments where several birds may be singing from different locations and at the same time. Analysis of recordings from the wild has in the past typically required manual annotation. Such annotation is not always accurate or even consistent, as it may vary both within or between observers. Here we propose a system that uses automated methods from robot audition, including sound source detection, localization, separation and identification. In robot audition these technologies have typically been studied separately; combining them often leads to poor performance in real-time application from the wild. We suggest that integration is aided by placing a primary focus on spatial cues, then combining other features within a Bayesian framework. A second problem has been that supervised machine learning methods typically requires a pre-trained model that may require a large training set of annotated labels. We have employed a semi-automatic annotation approach that requires much less pre-annotation. Preliminary experiments with recordings of bird songs from the wild revealed that for identification accuracy our system outperformed a method based on conventional robot audition.
[37]	RF Lachlan, L Verhagen, S Peters, and C. ten Cate. Are there species-universal categories in bird song phonology and syntax? A comparative study of chaffinches (Fringilla coelebs), zebra finches (Taenopygia guttata), and swamp sparrows (Melospiza georgiana). Journal of Comparative Psychology, 124(1):92, 2010. [ DOI ]
[38]	Robert F. Lachlan, Machteld N. Verzijden, Caroline S. Bernard, Peter-Paul Jonker, Bram Koese, Shirley Jaarsma, Willemijn Spoor, Peter J.B. Slater, and Carel ten Cate. The progressive loss of syntactical structure in bird song along an island colonization chain. Current Biology, 23(19):1896--1901, oct 2013. [ DOI ] Cultural transmission can increase the flexibility of behavior, such as bird song. Nevertheless, this flexibility often appears to be constrained, sometimes by preferences for learning certain traits over others, a phenomenon known as “biased” learning or transmission [1]. The sequential colonization of the Atlantic Islands by the chaffinch (Fringilla coelebs) [2] provides a unique model system in which to investigate how the variability of a cultural trait has evolved. We used novel computational methods to analyze chaffinch song from twelve island and continental populations and to infer patterns of evolution in song structure. We found that variability of the subunits within songs (“syllables”) differed moderately between populations but was not predicted by whether the population was continental or not. In contrast, we found that the sequencing of syllables within songs (“syntax”) was less structured in island than continental populations and in fact decreased significantly after each colonization. Syntactical structure was very clear in the mainland European populations but was almost entirely absent in the most recently colonized island, Gran Canaria. Our results suggest that colonization leads to the progressive loss of a species-specific feature of song, syntactical structure.
[39]	Robert F Lachlan and Stephen Nowicki. Context-dependent categorical perception in a songbird. Proceedings of the National Academy of Sciences, 112(6):1892--1897, 2015. [ DOI ] The division of continuously variable acoustic signals into discrete perceptual categories is a fundamental feature of human speech. Other animals have been found to categorize speech sounds much the same as humans do, although little is known of the role of categorical perception by animals in their own natural communication systems. A hallmark of human categorical perception of speech is that linguistic context affects both how speech sounds are categorized into phonemes, and how different versions of phonemes are produced. I first review earlier findings showing that a species of songbird, the swamp sparrow, categorically perceives the notes that constitute its learned songs and that individual neurons in the bird’s brain show categorical responses that map onto its behavioral response. I then present more recent data, using both discrimination and labeling tests, that show how swamp sparrows perceive categorical boundaries differently depending on context. These results demonstrate that there is a more complex relationship between underlying categorical representations and surface forms in the perception of birdsong. To our knowledge, this work suggests for the first time that this higher-order characteristic of human phonology is also found in a nonhuman communication system.
[40]	P. Laiolo. The emerging significance of bioacoustics in animal species conservation. Biological Conservation, 143(7):1635--1645, 2010. [ DOI ] This review reports on the effects of human activities on animal acoustic signals published in the literature from 1970 to 2009. Almost 5% of the studies on variation in animal communication tested or hypothesised on human impacts, and showed that habitat fragmentation, direct human disturbance, introduced diseases, urbanization, hunting, chemical and noise pollution may challenge animal acoustic behaviour. Although acoustic adaptations to anthropogenic habitats have been documented, human impacts have most often generated neutral variation or potential maladaptive responses. Negative impacts have been postulated in the sexual signals of fishes, amphibians, birds, and mammals; these are concerning as any maladaptive alteration of sexual behaviour may have direct bearings on breeding success and ultimately population growth rate. Acoustic communication also facilitates other vital behaviours influenced by human-driven perturbations. Bat and cetacean echolocation, for instance, is disrupted by noise pollution, whereas bird and mammal alarming is also affected by introduced diseases and hunting. Mammal social signals are sensitive to noise pollution and hunting, and birds selecting habitats by means of acoustic cues are especially vulnerable to habitat loss. Anthropogenic intervention in these cases may have a negative impact on individual survival, recruitment and group cohesion, limiting rescue-effects and triggering Allee effects. Published evidence shows that acoustic variation may be used as an early-warning indicator of perturbations even when not directly affecting individual fitness. 
Acoustic signalling can be studied in a broad range of ecosystems, can be recorded, analyzed, synthesised and played back with relative ease and limited economic budget, and is sensitive to many types of impacts, thus can have great conservation significance.
[41]	M. Lasseck. Bird song classification in field recordings: Winning solution for NIPS4B 2013 competition. In H. Glotin, Y. LeCun, T. Artières, S. Mallat, O. Tchernichovski, and X. Halkias, editors, Neural Information Processing Scaled for Bioacoustics, from Neurons to Big Data, pages 176--181, USA, 2013. [ .pdf ]
[42]	Laurent Lellouch, Sandrine Pavoine, Frederic Jiguet, Herve Glotin, and Jerome Sueur. Monitoring temporal change of bird communities with dissimilarity acoustic indices. Methods in Ecology and Evolution, 5(6):495--505, 2014. [ DOI ] A part of biodiversity assessment and monitoring consists in the estimation and track of the changes in species composition and abundance of animal communities. Such a task requires an important sampling over a broad-scale time that is difficult to reach with classical survey methods. Acoustics may offer an alternative to usual techniques by recording the sound produced by vocal animals. Animal species that use sound for communication (sing and/or call) establish an acoustic community when they sing at the same time and at a particular place. The estimation of the acoustic community dynamics could provide indirect cues on what drives changes in community composition and species abundance. Here, new methods were developed to estimate the changes in bird communities recorded at three woodland temperate sites in France. Both field recordings and simulated data were used to test whether acoustic dissimilarity indices can be used to estimate changes in the composition of the community. Four dissimilarity indices found in the literature, and a new one named Dcf were tested on auditory spectra after transformation to the Mel scale, rather than on classical Fourier frequency spectra. All indices were compared with each other and with compositional indices. The results show that bird communities occurring at the three sites were dynamic with changes of composition with time. Dissimilarities computed on simulated acoustic communities were correlated with compositional dissimilarity but those computed on field-recorded communities could not be considered as faithful estimators of community composition variations. However, the indices indicate important dates in community changes around mid-April that were also seen in the composition dynamics. 
Acoustic dissimilarity indices failed to track accurately changes in species composition of the bird communities. However, these indices, which are easy to compute, still provide information on the acoustic dynamics of bird community. Acoustics might not be considered as a proxy of compositional diversity but rather as another facet of animal diversity that needs to be studied and preserved on its own.
[43]	J.P. Lewis. Fast normalized cross-correlation. Vision interface, 10(1):120--123, 1995. Although it is well known that cross correlation can be efficiently implemented in the transform domain, the normalized form of cross correlation preferred for feature matching applications does not have a simple frequency domain expression. Normalized cross correlation has been computed in the spatial domain for this reason. This short paper shows that unnormalized cross correlation can be efficiently normalized using precomputing integrals of the image and image² over the search window.
[44]	D. Lipkind and O. Tchernichovski. Quantification of developmental birdsong learning from the subsyllabic scale to cultural evolution. Proceedings of the National Academy of Sciences, 2011. [ DOI ] Quantitative analysis of behavior plays an important role in birdsong neuroethology, serving as a common denominator in studies spanning molecular to system-level investigation of sensory-motor conversion, developmental learning, and pattern generation in the brain. In this review, we describe the role of behavioral analysis in facilitating cross-level integration. Modern sound analysis approaches allow investigation of developmental song learning across multiple time scales. Combined with novel methods that allow experimental control of vocal changes, it is now possible to test hypotheses about mechanisms of vocal learning. Further, song analysis can be done at the population level across generations to track cultural evolution and multigenerational behavioral processes. Complementing the investigation of song development with noninvasive brain imaging technology makes it now possible to study behavioral dynamics at multiple levels side by side with developmental changes in brain connectivity and in auditory responses.
[45]	Ciira wa Maina, David Muchiri, and Peter Njoroge. A bioacoustic record of a conservancy in the Mount Kenya ecosystem. Biodiversity Data Journal, 4:e9906, oct 2016. [ DOI | http ] Background: Environmental degradation is a major threat facing ecosystems around the world. In order to determine ecosystems in need of conservation interventions, we must monitor the biodiversity of these ecosystems effectively. Bioacoustic approaches offer a means to monitor ecosystems of interest in a sustainable manner. In this work we show how a bioacoustic record from the Dedan Kimathi University wildlife conservancy, a conservancy in the Mount Kenya ecosystem, was obtained in a cost effective manner. A subset of the dataset was annotated with the identities of bird species present since they serve as useful indicator species. These data reveal the spatial distribution of species within the conservancy and also point to the effects of major highways on bird populations. This dataset will provide data to train automatic species recognition systems for birds found within the Mount Kenya ecosystem. Such systems are necessary if bioacoustic approaches are to be employed at the large scales necessary to influence wildlife conservation measures. New information: We provide acoustic recordings from the Dedan Kimathi University wildlife conservancy, a conservancy in the Mount Kenya ecosystem, obtained using a low cost acoustic recorder. A total of 2701 minute long recordings are provided including both daytime and nighttime recordings. We present an annotation of a subset of the daytime recordings indicating the bird species present in the recordings. The dataset contains recordings of at least 36 bird species. In addition, the presence of a few nocturnal species within the conservancy is also confirmed.
[46]	S. Mallat. Group invariant scattering. Communications on Pure and Applied Mathematics, 65(10):1331--1398, 2012. [ http ] This paper constructs translation invariant operators on L2(R^d), which are Lipschitz continuous to the action of diffeomorphisms. A scattering propagator is a path ordered product of non-linear and non-commuting operators, each of which computes the modulus of a wavelet transform. A local integration defines a windowed scattering transform, which is proved to be Lipschitz continuous to the action of diffeomorphisms. As the window size increases, it converges to a wavelet scattering transform which is translation invariant. Scattering coefficients also provide representations of stationary processes. Expected values depend upon high order moments and can discriminate processes having the same power spectrum. Scattering operators are extended on L2(G), where G is a compact Lie group, and are invariant under the action of G. Combining a scattering on L2(R^d) and on L2(SO(d)) defines a translation and rotation invariant scattering on L2(R^d).
[47]	P. R. Marler and H. Slabbekoorn. Nature's Music: the Science of Birdsong. Academic Press, Massachusetts, USA, 2004.
[48]	T. A. Marques, L. Thomas, S. W. Martin, D. K. Mellinger, J. A. Ward, D. J. Moretti, D. Harris, and P. L. Tyack. Estimating animal population density using passive acoustics. Biological Reviews, 2012. [ DOI ] Reliable estimation of the size or density of wild animal populations is very important for effective wildlife management, conservation and ecology. Currently, the most widely used methods for obtaining such estimates involve either sighting animals from transect lines or some form of capture-recapture on marked or uniquely identifiable individuals. However, many species are difficult to sight, and cannot be easily marked or recaptured. Some of these species produce readily identifiable sounds, providing an opportunity to use passive acoustic data to estimate animal density. In addition, even for species for which other visually based methods are feasible, passive acoustic methods offer the potential for greater detection ranges in some environments (e.g. underwater or in dense forest), and hence potentially better precision. Automated data collection means that surveys can take place at times and in places where it would be too expensive or dangerous to send human observers. Here, we present an overview of animal density estimation using passive acoustic data, a relatively new and fast-developing field. We review the types of data and methodological approaches currently available to researchers and we provide a framework for acoustics-based density estimation, illustrated with examples from real-world case studies. We mention moving sensor platforms (e.g. towed acoustics), but then focus on methods involving sensors at fixed locations, particularly hydrophones to survey marine mammals, as acoustic-based density estimation research to date has been concentrated in this area. Primary among these are methods based on distance sampling and spatially explicit capture-recapture. 
The methods are also applicable to other aquatic and terrestrial sound-producing taxa. We conclude that, despite being in its infancy, density estimation based on passive acoustic data likely will become an important method for surveying a number of diverse taxa, such as sea mammals, fish, birds, amphibians, and insects, especially in situations where inferences are required over long periods of time. There is considerable work ahead, with several potentially fruitful research areas, including the development of (i) hardware and software for data acquisition, (ii) efficient, calibrated, automated detection and classification systems, and (iii) statistical approaches optimized for this application. Further, survey design will need to be developed, and research is needed on the acoustic behaviour of target species. Fundamental research on vocalization rates and group sizes, and the relation between these and other factors such as season or behaviour state, is critical. Evaluation of the methods under known density scenarios will be important for empirically validating the approaches presented here.
[49]	A. L. McIlraith and H. C. Card. Birdsong recognition using backpropagation and multivariate statistics. IEEE Transactions on Signal Processing, 45(11):2740--2748, Nov 1997. [ DOI ] An investigation has been made of bird species recognition using recordings of birdsong. Six species of birds native to Manitoba were chosen: song sparrows, fox sparrows, marsh wrens, sedge wrens, yellow warblers, and red-winged blackbirds. These species exhibit overlapping characteristics in terms of frequency content, song components, and length of songs. Songs from multiple individuals in each of these species were employed, with discernible recording noise such as tape hiss and, in some cases, other competing songs in the background. These songs were analyzed using backpropagation learning in two-layer perceptrons, as well as methods from multivariate statistics that included principal components and quadratic discriminant analysis. Preprocessing methods included linear predictive coding and windowed Fourier transforms. Generalization performance ranged from 82-93%.
[50]	D.K. Mellinger, S.W. Martin, R.P. Morrissey, L. Thomas, and J.J. Yosco. A method for detecting whistles, moans, and other frequency contour sounds. Journal of the Acoustical Society of America, pages 4055--4061, 2010. [ DOI ] An algorithm is presented for the detection of frequency contour sounds---whistles of dolphins and many other odontocetes, moans of baleen whales, chirps of birds, and numerous other animal and non-animal sounds. The algorithm works by tracking spectral peaks over time, grouping together peaks in successive time slices in a spectrogram if the peaks are sufficiently near in frequency and form a smooth contour over time. The algorithm has nine parameters, including the ones needed for spectrogram calculation and normalization. Finding optimal values for all of these parameters simultaneously requires a search of parameter space, and a grid search technique is described. The frequency contour detection method and parameter optimization technique are applied to the problem of detecting “boing” sounds of minke whales from near Hawaii. The test data set contained many humpback whale sounds in the frequency range of interest. Detection performance is quantified, and the method is found to work well at detecting boings, with a false-detection rate of 3% for the target missed-call rate of 25%. It has also worked well anecdotally for other marine and some terrestrial species, and could be applied to any species that produces a frequency contour, or to non-animal sounds as well.
[51]	D. J. Mennill, J. M. Burt, K. M. Fristrup, and S. L. Vehrencamp. Accuracy of an acoustic location system for monitoring the position of duetting songbirds in tropical forest. Journal of the Acoustical Society of America, 119:2832--2839, 2006. [ DOI \| http ] A field test was conducted on the accuracy of an eight-microphone acoustic location system designed to triangulate the position of duetting rufous-and-white wrens (Thryothorus rufalbus) in Costa Rica's humid evergreen forest. Eight microphones were set up in the breeding territories of 20 pairs of wrens, with an average intermicrophone distance of 75.2±2.6 m. The array of microphones was used to record antiphonal duets broadcast through stereo loudspeakers. The positions of the loudspeakers were then estimated by evaluating the delay with which the eight microphones recorded the broadcast sounds. Position estimates were compared to coordinates surveyed with a global-positioning system (GPS). The acoustic location system estimated the position of loudspeakers with an error of 2.82±0.26 m and calculated the distance between the “male” and “female” loudspeakers with an error of 2.12±0.42 m. Given the large range of distances between duetting birds, this relatively low level of error demonstrates that the acoustic location system is a useful tool for studying avian duets. Location error was influenced partly by the difficulties inherent in collecting high accuracy GPS coordinates of microphone positions underneath a lush tropical canopy and partly by the complicating influence of irregular topography and thick vegetation on sound transmission.
[52]	Sebastian Menze, Daniel P. Zitterbart, Ilse van Opzeeland, and Olaf Boebel. The influence of sea ice, wind speed and marine mammals on southern ocean ambient sound. Royal Society Open Science, 4(1):160370, Jan 2017. [ DOI \| http ] This paper describes the natural variability of ambient sound in the Southern Ocean, an acoustically pristine marine mammal habitat. Over a 3-year period, two autonomous recorders were moored along the Greenwich meridian to collect underwater passive acoustic data. Ambient sound levels were strongly affected by the annual variation of the sea-ice cover, which decouples local wind speed and sound levels during austral winter. With increasing sea-ice concentration, area and thickness, sound levels decreased while the contribution of distant sources increased. Marine mammal sounds formed a substantial part of the overall acoustic environment, comprising calls produced by Antarctic blue whales (Balaenoptera musculus intermedia), fin whales (Balaenoptera physalus), Antarctic minke whales (Balaenoptera bonaerensis) and leopard seals (Hydrurga leptonyx). The combined sound energy of a group or population vocalizing during extended periods contributed species-specific peaks to the ambient sound spectra. The temporal and spatial variation in the contribution of marine mammals to ambient sound suggests annual patterns in migration and behaviour. The Antarctic blue and fin whale contributions were loudest in austral autumn, whereas the Antarctic minke whale contribution was loudest during austral winter and repeatedly showed a diel pattern that coincided with the diel vertical migration of zooplankton.
[53]	Eduardo Mercado III and Christopher B. Sturdy. Classifying animal sounds with neural networks. In C. H. Brown and T. Riede, editors, Comparative Bioacoustics: An Overview, chapter 10. Bentham Science Publishers, Oak Park, IL, USA, 2016. Humans naturally classify the sounds they hear into different categories, including sounds produced by animals. Bioacousticians have supplemented this type of subjective sorting with quantitative analyses of acoustic features of animal sounds. Using neural networks to classify animal sounds extends this process one step further by not only facilitating objective descriptive analyses of animal sounds, but also by making it possible to simulate auditory classification processes. Critical aspects of developing a neural network include choosing a particular architecture, converting measurements into input representations, and training the network to recognize inputs. When the goal is to sort vocalizations into specific types, supervised learning algorithms make it possible for a neural network to do so with high accuracy and speed. When the goal is to sort vocalizations based on similarities between measured properties, unsupervised learning algorithms can be used to create neural networks that objectively sort sounds or that quantify sequential properties of sequences of sounds. Neural networks can also provide insights into how animals might themselves classify the sounds they hear, and be useful in developing specific testable hypotheses about the functions of different sounds. The current chapter illustrates each of these applications of neural networks in bioacoustics studies of the sounds produced by chickadees (Poecile atricapillus), false killer whales (Pseudorca crassidens), and humpback whales (Megaptera novaeangliae).
[54]	G. Montavon, G. Orr, and K.-R. Müller, editors. Neural Networks: Tricks of the Trade. Springer, 2012.
[55]	Iosif Mporas, Todor Ganchev, Otilia Kocsis, Nikos Fakotakis, Olaf Jahn, Klaus Riede, and Karl L Schuchmann. Automated acoustic classification of bird species from real-field recordings. In 2012 IEEE 24th International Conference on Tools with Artificial Intelligence, volume 1, pages 778--781. IEEE, 2012. [ DOI ] We report on a recent progress with the development of an automated bioacoustic bird recognizer, which is part of a long-term project, aiming at the establishment of an automated biodiversity monitoring system at the Hymettus Mountain near Athens. In particular, employing a classical audio processing strategy, which has been proved quite successful in various audio recognition applications, we evaluate the appropriateness of six classifiers on the bird species recognition task. In the experimental evaluation of the acoustic bird recognizer, we made use of real-field audio recordings for seven bird species, which are common for the Hymettus Mountain. Encouraging recognition accuracy was obtained on the real-field data, and further experiments with additive noise demonstrated significant noise robustness in low SNR conditions.
[56]	Kevin P Murphy and Mark A Paskin. Linear-time inference in hierarchical HMMs. In Advances in Neural Information Processing Systems, volume 2, pages 833--840, 2002. [ DOI ] The hierarchical hidden Markov model (HHMM) is a generalization of the hidden Markov model (HMM) that models sequences with structure at many length/time scales [FST98]. Unfortunately, the original inference algorithm is rather complicated, and takes O(T^3) time, where T is the length of the sequence, making it impractical for many domains. In this paper, we show how HHMMs are a special kind of dynamic Bayesian network (DBN), and thereby derive a much simpler inference algorithm, which only takes O(T) time. Furthermore, by drawing the connection between HHMMs and DBNs, we enable the application of many standard approximation techniques to further speed up inference.
[57]	K. P. Murphy. Machine Learning: A Probabilistic Perspective. MIT Press, 2012. [ DOI ]
[58]	M. Padgham. Reverberation and frequency attenuation in forests---implications for acoustic communication in animals. Journal of the Acoustical Society of America, 115:402, 2004. [ DOI ] Rates of reverberative decay and frequency attenuation are measured within two Australian forests. In particular, their dependence on the distance between a source and receiver, and the relative heights of both, is examined. Distance is always the most influential of these factors. The structurally denser of the forests exhibits much slower reverberative decay, although the frequency dependence of reverberation is qualitatively similar in the two forests. There exists a central range of frequencies between 1 and 3 kHz within which reverberation varies relatively little with distance. Attenuation is much greater within the structurally denser forest, and in both forests it generally increases with increasing frequency and distance, although patterns of variation differ between the two forests. Increasing the source height generally reduces reverberation, while increasing the receiver height generally reduces attenuation. These findings have considerable implications for acoustic communication between inhabitants of these forests, particularly for the perching behaviors of birds. Furthermore, this work indicates the ease with which the general acoustic properties of forests can be measured and compared.
[59]	E. C. Perez, M. S. A. Fernandez, S. C. Griffith, C. Vignal, and H. A. Soula. Impact of visual contact on vocal interaction dynamics of pair-bonded birds. Animal Behaviour, 107:125--137, 2015. [ DOI ] Animal social interactions usually revolve around several sensory modalities. For birds, these are primarily visual and acoustic. However, some habitat specificities or long distances may temporarily hinder or limit visual information transmission making acoustic transmission a central channel of communication even during complex social behaviours. Here we investigated the impact of visual limitation on the vocal dynamics between zebra finch, Taeniopygia guttata, partners. Pairs were acoustically recorded during a separation and reunion protocol with gradually decreasing distance without visual contact. Without visual contact, pairs displayed more correlated vocal exchanges than with visual contact. We also analysed the turn-taking sequences of individuals' vocalizations during an exchange with or without visual contact. In the absence of visual contact, the identity of a vocalizing individual was well predicted by the knowledge of the identity of the previous vocalizer. This property is characteristic of a stochastic process called a Markov chain and we found that turn-taking sequences of birds deprived of visual contact were Markovian. Thus, both the temporal correlation between the calls of the two partners and Markov properties of acoustic interactions indicate that, in the absence of visual cues, the decision to call is taken on a very short-term basis and solely on acoustic information (both temporal and identity of caller). Strikingly, when individuals were in visual contact both these features of their acoustic social interactions disappeared indicating that birds adapted their calling dynamics to cope with limited visual cues.
[60]	Nadia Pieretti, Almo Farina, and Davide Morri. A new methodology to infer the singing activity of an avian community: the Acoustic Complexity Index (ACI). Ecological Indicators, 11(3):868--873, 2011. [ DOI ] The animal soundscape is a field of growing interest because of the implications it has for human–landscape interactions. Yet, it continues to be a difficult subject to investigate, due to the huge amount of information which it contains. In this contribution, the suitability of the Acoustic Complexity Index (ACI) is examined. It is an algorithm created to produce a direct quantification of the complex biotic songs by computing the variability of the intensities registered in audio-recordings, despite the presence of constant human-generated-noise. Twenty audio-recordings were made at equally spaced locations in a beech mountain forest in the Tuscan-Emilian Apennine National Park (Italy) between June and July 2008. The study area is characterized by the absence of recent human disturbance to forest assets but the presence of airplane routes does bring engine noise that overlaps and mixes with the natural soundscape, which resulted entirely composed by bird songs. The intensity values and frequency bin occurrences of soundscapes, the total number of bird vocalizations and the ACI were processed by using the Songscope v2.1 and Avisoft v4.40 software. The Spearman's rho calculation highlighted a significant correlation between the ACI values and the number of bird vocalizations, while the frequency bin occurrence and acoustic intensity were weaker correlated to bird singing activity because of the inclusion of all of the other geo/anthro-phonies composing the soundscape. The ACI tends to be efficient in filtering out anthrophonies (such as airplane engine noise), and demonstrates the capacity to synthetically and efficiently describe the complexity of bird soundscapes. 
Finally, this index offers new opportunities for the monitoring of songbird communities faced with the challenge of human-induced disturbances and other proxies like climate and land use changes.
[61]	Jeffrey Podos, Dana L Moseley, Sarah E Goodwin, Jesse McClure, Benjamin N Taft, Amy VH Strauss, Christine Rega-Brodsky, and David C Lahti. A fine-scale, broadly applicable index of vocal performance: frequency excursion. Animal Behaviour, 116:203--212, 2016. [ DOI ] Our understanding of the evolution and function of animal displays has been advanced through studies of vocal performance. A widely used metric of vocal performance, vocal deviation, is limited by being applicable only to vocal trills, and also overlooks certain fine-scale aspects of song structure that might reflect vocal performance. In light of these limitations we here introduce a new index of vocal performance, ‘frequency excursion’. Frequency excursion calculates, for any given song or song segment, the sum of frequency modulations both within and between notes on a per-time basis. We calculated and compared the two performance metrics in three species: chipping sparrows, Spizella passerina, swamp sparrows, Melospiza georgiana, and song sparrows, Melospiza melodia. The two metrics correlated as expected, yet frequency excursion accounted for subtle variations in performance overlooked by vocal deviation. In swamp sparrows, frequency excursion values varied significantly by song type but not by individual. Moreover, song type performance in swamp sparrows, according to both metrics, varied negatively with the extent to which song types were shared among neighbours. In song sparrows, frequency excursion values of trilled song segments exceeded those of nontrilled song segments, although not to a statistically significant degree. We suggest that application of frequency excursion in birds and other taxa will provide new insights into diverse open questions concerning vocal performance, function and evolution.
[62]	Ladislav Ptacek, Lukas Machlica, Pavel Linhart, Pavel Jaska, and Ludek Muller. Automatic recognition of bird individuals on an open set using as-is recordings. Bioacoustics, 25(1):55--73, 2016. [ DOI ] The most common method used to determine the identity of an individual bird is the capture-mark-recapture technique. The method has several major disadvantages, e.g. some species are difficult to capture/recapture and the capturing process itself may cause significant stress in animals leading even to injuries of more vulnerable species. Some studies introduce systems based on methods used for human identification. An automatic system for recognition of bird individuals (ASRBI) described in this article is based on a Gaussian mixture model (GMM) and a universal background model (GMM-UBM) method extended by an advanced voice activity detection (VAD) algorithm. It is focused on recognizing the bird individuals on an open set, i.e. any number of unknown birds may appear anytime during the identification process as is common in nature. The introduced ASRBI processes the recordings just as if they were recorded by an ornithologist: with durations from seconds to minutes, containing noise and unwanted sounds, as well as masking of the singer, etc. Thanks to the VAD algorithm, the proposed system is fully automatic, no manual pre-processing of recordings is needed, neither by cutting off the songs nor syllables. The overall achieved identification accuracy is 78.5%, the lowest 60.3% and the highest 95.7%. In total, 90% of all experiments reach at least 70% accuracy. The result suggests the application of the GMM-UBM with VAD is feasible for individual identification on the open set processing real-life recordings. The described method is capable of reducing both the time consumption and human intervention in animal monitoring projects.
[63]	R. Ranft. Natural sound archives: Past, present and future. Anais da Academia Brasileira de Ciências, 76(2):456--460, 2004. [ DOI ] Recordings of wild animals were first made in the Palearctic in 1900, in the Nearctic in 1929, in Antarctica in 1934, in Asia in 1937, and in the Neotropics in the 1940s. However, systematic collecting did not begin until the 1950s. Collections of animal sound recordings serve many uses in education, entertainment, science and nature conservation. In recent years, technological developments have transformed the ways in which sounds can be sampled, stored and accessed. Now the largest collections between them hold altogether around 0.5 million recordings with their associated data. The functioning of a major archive will be described with reference to the British Library Sound Archive. Preserving large collections for the long term is a primary concern in the digital age. While digitization and digital preservation has many advantages over analogue methods, the rate of technology change and lack of standardization are a serious problem for the world's major audio archives. Another challenge is to make collections more easily and widely accessible via electronic networks. On-line catalogues and access to the actual sounds via the internet are already available for some collections. Case studies describing the establishment and functioning of sound libraries in Mexico, Colombia and Brazil are given in individually authored sections in an Appendix.
[64]	Y. Ren, M.T. Johnson, P.J. Clemins, M. Darre, S.S. Glaeser, T.S. Osiejuk, and E. Out-Nyarko. A framework for bioacoustic vocalization analysis using hidden Markov models. Algorithms, 2(4):1410--1428, 2009. [ DOI ] Using Hidden Markov Models (HMMs) as a recognition framework for automatic classification of animal vocalizations has a number of benefits, including the ability to handle duration variability through nonlinear time alignment, the ability to incorporate complex language or recognition constraints, and easy extendibility to continuous recognition and detection domains. In this work, we apply HMMs to several different species and bioacoustic tasks using generalized spectral features that can be easily adjusted across species and HMM network topologies suited to each task. This experimental work includes a simple call type classification task using one HMM per vocalization for repertoire analysis of Asian elephants, a language-constrained song recognition task using syllable models as base units for ortolan bunting vocalizations, and a stress stimulus differentiation task in poultry vocalizations using a non-sequential model via a one-state HMM with Gaussian mixtures. Results show strong performance across all tasks and illustrate the flexibility of the HMM framework for a variety of species, vocalization types, and analysis tasks.
[65]	J. C. Ross and P. E. Allen. Random forest for improved analysis efficiency in passive acoustic monitoring. Ecological Informatics, 2013. [ DOI ] Passive acoustic monitoring often leads to large quantities of sound data which are burdensome to process, such that the availability and cost of expert human analysts can be a bottleneck and make ecosystem or landscape-scale projects infeasible. This manuscript presents a method for rapidly analyzing the results of band-limited energy detectors, which are commonly used for the detection of passerine nocturnal flight calls, but which typically are beset by high false positive rates. We first manually classify a subset of the detected events as signals of interest or false detections. From that subset, we build a random forest model to eliminate most of the remaining events as false detections without further human inspection. The overall reduction in the labor required to separate signals of interest from false detections can be 80% or more. Additionally, we present an R package, flightcallr, containing functions which can be used to implement this new workflow.
[66]	The state of nature in the UK and its overseas territories. Technical report, RSPB and 24 other UK organisations, 2013. [ http ]
[67]	JF Ruiz-Muñoz, Zeyu You, Raviv Raich, and Xiaoli Z Fern. Dictionary learning for bioacoustics monitoring with applications to species classification. Journal of Signal Processing Systems, pages 1--15, 2016. [ DOI ] This paper deals with the application of the convolutive version of dictionary learning to analyze in-situ audio recordings for bio-acoustics monitoring. We propose an efficient approach for learning and using a sparse convolutive model to represent a collection of spectrograms. In this approach, we identify repeated bioacoustics patterns, e.g., bird syllables, as words and represent new spectrograms using these words. Moreover, we propose a supervised dictionary learning approach in the multiple-label setting to support multi-label classification of unlabeled spectrograms. Our approach relies on a random projection for reduced computational complexity. As a consequence, the non-negativity requirement on the dictionary words is relaxed. Furthermore, the proposed approach is well-suited for a collection of discontinuous spectrograms. We evaluate our approach on synthetic examples and on two real datasets consisting of multiple birds audio recordings. Bird syllable dictionary learning from a real-world dataset is demonstrated. Additionally, we successfully apply the approach to spectrogram denoising and species classification.
[68]	Maria Sandsten, Mareile Große Ruse, and Martin Jönsson. Robust feature representation for classification of bird song syllables. EURASIP Journal on Advances in Signal Processing, 2016(1), May 2016. [ DOI \| http ] A novel feature set for low-dimensional signal representation, designed for classification or clustering of non-stationary signals with complex variation in time and frequency, is presented. The feature representation of a signal is given by the first left and right singular vectors of its ambiguity spectrum matrix. If the ambiguity matrix is of low rank, most signal information in time direction is captured by the first right singular vector while the signal’s key frequency information is encoded by the first left singular vector. The resemblance of two signals is investigated by means of a suitable similarity assessment of the signals’ respective singular vector pair. Application of multitapers for the calculation of the ambiguity spectrum gives an increased robustness to jitter and background noise and a consequent improvement in performance, as compared to estimation based on the ordinary single Hanning window spectrogram. The suggested feature-based signal compression is applied to a syllable-based analysis of a song from the bird species Great Reed Warbler and evaluated by comparison to manual auditive and/or visual signal classification. The results show that the proposed approach outperforms well-known approaches based on mel-frequency cepstral coefficients and spectrogram cross-correlation.
[69]	T. Scott Brandes. Automated sound recording and analysis techniques for bird surveys and conservation. Bird Conservation International, 18(S1):163--173, Aug 2008. [ DOI ] There is a great need for increased use and further development of automated sound recording and analysis of avian sounds. Birds are critical to ecosystem functioning so techniques to make avian monitoring more efficient and accurate will greatly benefit science and conservation efforts. We provide an overview of the hardware approaches to automated sound recording as well as an overview of the prominent techniques used in software to automatically detect and classify avian sound. We provide a comparative summary of examples of three general categories of hardware solutions for automating sound recording which include a hardware interface for a scheduling timer to control a standalone commercial recorder, a programmable recording device, and a single board computer. We also describe examples of the two main approaches to improving microphone performance for automated recorders through small arrays of microphone elements and using waveguides. For the purposes of thinking about automated sound analysis, we suggest five basic sound fragment types of avian sound and discuss a variety of techniques to automatically detect and classify avian sounds to species level, as well as their limitations. A variety of the features to measure for the various call types are provided, along with a variety of classification methods for those features. They are discussed in context of general performance as well as the monitoring and conservation efforts they are used in.
[70]	P. Somervuo, A. Härma, and S. Fagerlund. Parametric representations of bird sounds for automatic species recognition. IEEE Transactions on Audio, Speech and Language Processing, 14(6):2252--2263, Nov 2006. [ DOI ] This paper is related to the development of signal processing techniques for automatic recognition of bird species. Three different parametric representations are compared. The first representation is based on sinusoidal modeling which has been earlier found useful for highly tonal bird sounds. Mel-cepstrum parameters are used since they have been found very useful in the parallel problem of speech recognition. Finally, a vector of various descriptive features is tested because such models are popular in audio classification applications, and bird song is almost like music. We briefly introduce the methods and evaluate their performance in the classification and recognition of both individual syllables and song fragments of 14 common North-European Passerine bird species.
[71]	North American Bird Conservation Initiative. State of North America's birds 2016. Technical report, Environment and Climate Change Canada, Ottawa, Ontario, 2016. [ http ]
[72]	Robert Carrington Stein. Modulation in bird sounds. The Auk, 85(2):229--243, 1968. [ DOI ]
[73]	Philip K. Stoddard and Michael J. Owren. Filtering in bioacoustics. In C. H. Brown and T. Riede, editors, Comparative Bioacoustics: An Overview, chapter 7. Bentham Science Publishers, Oak Park, IL, USA, 2016. Working in bioacoustics requires knowledge of filtering, which is the application of frequency-dependent energy attenuation. General filter types include low-pass, high-pass, band-pass, and band-stop versions, each of which involves selecting a target frequency range, corresponding corner frequencies, and an optimized combination of attenuation slope and pass-band ripple. Filters can be constructed in either analog (hardware) or digital (software) forms, the former being necessary when converting signals between these two kinds of representations. However, the latter are more flexible, less expensive, and the more common when working with digital signals. Readily available programs allow even novice users to easily design and use digital filters. Filtering applications include removing various kinds of noise, simulating environmental degradation effects, and searching for signals embedded in noise. While easily performed, each of these applications requires some background knowledge. There is also good reason to avoid unnecessary use of filtering, as it is easy to create unintended effects. This chapter discusses these and other issues in the context of the everyday work of bioacoustics.
[74]	D. Stowell and M. D. Plumbley. Segregating event streams and noise with a Markov renewal process model. Journal of Machine Learning Research, 14:1891--1916, 2013. [ .html ] We describe an inference task in which a set of timestamped event observations must be clustered into an unknown number of temporal sequences with independent and varying rates of observations. Various existing approaches to multi-object tracking assume a fixed number of sources and/or a fixed observation rate; we develop an approach to inferring structure in timestamped data produced by a mixture of an unknown and varying number of similar Markov renewal processes, plus independent clutter noise. The inference simultaneously distinguishes signal from noise as well as clustering signal observations into separate source streams. We illustrate the technique via synthetic experiments as well as an experiment to track a mixture of singing birds. Source code is available.
[75]	D. Stowell and M. D. Plumbley. Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning. PeerJ, 2:e488, 2014. [ DOI \| arXiv ]
[76]	D. Stowell and M. D. Plumbley. Large-scale analysis of frequency modulation in birdsong databases. Methods in Ecology and Evolution, 2014. [ DOI \| http ] Birdsong often contains large amounts of rapid frequency modulation (FM). It is believed that the use or otherwise of FM is adaptive to the acoustic environment and also that there are specific social uses of FM such as trills in aggressive territorial encounters. Yet temporal fine detail of FM is often absent or obscured in standard audio signal analysis methods such as Fourier analysis or linear prediction. Hence, it is important to consider high-resolution signal processing techniques for analysis of FM in bird vocalizations. If such methods can be applied at big data scales, this offers a further advantage as large data sets become available. We introduce methods from the signal processing literature which can go beyond spectrogram representations to analyse the fine modulations present in a signal at very short time-scales. Focusing primarily on the genus Phylloscopus, we investigate which of a set of four analysis methods most strongly captures the species signal encoded in birdsong. We evaluate this through a feature selection technique and an automatic classification experiment. In order to find tools useful in practical analysis of large data bases, we also study the computational time taken by the methods, and their robustness to additive noise and MP3 compression. We find three methods which can robustly represent species-correlated FM attributes and can be applied to large data sets, and that the simplest method tested also appears to perform the best. We find that features representing the extremes of FM encode species identity supplementary to that captured in frequency features, whereas bandwidth features do not encode additional information. FM analysis can extract information useful for bioacoustic studies, in addition to measures more commonly used to characterize vocalizations. 
Further, it can be applied efficiently across very large data sets and archives.
[77]	D. Stowell, L. F. Gill, and D. Clayton. Detailed temporal structure of communication networks in groups of songbirds. Journal of the Royal Society Interface, 13(119), 2016. [ DOI ] Animals in groups often exchange calls, in patterns whose temporal structure may be influenced by contextual factors such as physical location and the social network structure of the group. We introduce a model-based analysis for temporal patterns of animal call timing, originally developed for networks of firing neurons. This has advantages over cross-correlation analysis in that it can correctly handle common-cause confounds and provides a generative model of call patterns with explicit parameters for the influences between individuals. It also has advantages over standard Markovian analysis in that it incorporates detailed temporal interactions which affect timing as well as sequencing of calls. Further, a fitted model can be used to generate novel synthetic call sequences. We apply the method to calls recorded from groups of domesticated zebra finch (Taeniopygia guttata) individuals. We find that the communication network in these groups has stable structure that persists from one day to the next, and that “kernels” reflecting the temporal range of influence have a characteristic structure for a calling individual’s effect on itself, its partner, and on others in the group. We further find characteristic patterns of influences by call type as well as by individual.
[78]	Dan Stowell, Mike Wood, Yannis Stylianou, and Hervé Glotin. Bird detection in audio: a survey and a challenge. In Proceedings of MLSP 2016, 2016. Many biological monitoring projects rely on acoustic detection of birds. Despite increasingly large datasets, this detection is often manual or semi-automatic, requiring manual tuning/postprocessing. We review the state of the art in automatic bird sound detection, and identify a widespread need for tuning-free and species-agnostic approaches. We introduce new datasets and an IEEE research challenge to address this need, to make possible the development of fully automatic algorithms for bird sound detection.
[79]	D. Stowell, E. Benetos, and L. F. Gill. On-bird sound recordings: Automatic acoustic recognition of activities and contexts. IEEE/ACM Transactions on Audio, Speech, and Language Processing, accepted.
[80]	Jerome Sueur, Sandrine Pavoine, Olivier Hamerlynck, and Stephanie Duvail. Rapid acoustic survey for biodiversity appraisal. PloS one, 3(12):e4065, 2008. [ DOI ] Biodiversity assessment remains one of the most difficult challenges encountered by ecologists and conservation biologists. This task is becoming even more urgent with the current increase of habitat loss. Many methods–from rapid biodiversity assessments (RBA) to all-taxa biodiversity inventories (ATBI)–have been developed for decades to estimate local species richness. However, these methods are costly and invasive. Several animals–birds, mammals, amphibians, fishes and arthropods–produce sounds when moving, communicating or sensing their environment. Here we propose a new concept and method to describe biodiversity. We suggest to forego species or morphospecies identification used by ATBI and RBA respectively but rather to tackle the problem at another evolutionary unit, the community level. We also propose that a part of diversity can be estimated and compared through a rapid acoustic analysis of the sound produced by animal communities. We produced α and β diversity indexes that we first tested with 540 simulated acoustic communities. The α index, which measures acoustic entropy, shows a logarithmic correlation with the number of species within the acoustic community. The β index, which estimates both temporal and spectral dissimilarities, is linearly linked to the number of unshared species between acoustic communities. We then applied both indexes to two closely spaced Tanzanian dry lowland coastal forests. Indexes reveal for this small sample a lower acoustic diversity for the most disturbed forest and acoustic dissimilarities between the two forests suggest that degradation could have significantly decreased and modified community composition. 
Our results demonstrate for the first time that an indicator of biological diversity can be reliably obtained in a non-invasive way and with a limited sampling effort. This new approach may facilitate the appraisal of animal diversity at large spatial and temporal scales.
[81]	Jérôme Sueur, Almo Farina, Amandine Gasc, Nadia Pieretti, and Sandrine Pavoine. Acoustic indices for biodiversity assessment and landscape investigation. Acta Acustica united with Acustica, 100(4):772--781, 2014. [ DOI ] Bioacoustics is historically a discipline that essentially focuses on individual behaviour in relation to population and species evolutionary levels but rarely in connection with higher levels of ecological complexity like community, landscape or ecosystem. However, some recent bioacoustic researches have operated a change of scale by developing acoustic indices which aim is to characterize animal acoustic communities and soundscapes. We here review these indices for the first time. The indices can be divided into two classes: the α or within-group indices and the β or between-group indices. Up to 21 α acoustic indices were proposed in less than six years. These indices estimate the amplitude, evenness, richness, heterogeneity of an acoustic community or soundscape. Seven β diversity indices were suggested to compare amplitude envelopes or, more often, frequency spectral profiles. Both α and β indices reported congruent and expected results but they may still suffer some bias due, for instance, to anthropic background noise or variations in the distances between vocalising animals and the sensors. Research is still needed to improve the reliability of these new mathematical tools for biodiversity assessment and monitoring. We recommend the contemporary use of some of these indices to obtain complementary information. Eventually, we foresee that this new field of research which tries to build bridges between animal behaviour and ecology should meet an important success in the next years for the assessment and monitoring of marine, freshwater and terrestrial biodiversity from individual-based level to landscape dimension.
[82]	Jérôme Sueur and Almo Farina. Ecoacoustics: the ecological investigation and interpretation of environmental sound. Biosemiotics, pages 1--10, 2015. [ DOI ] The sounds produced by animals have been a topic of research into animal behaviour for a very long time. If acoustic signals are undoubtedly a vehicle for exchanging information between individuals, environmental sounds embed as well a significant level of data related to the ecology of populations, communities and landscapes. The consideration of environmental sounds for ecological investigations opens up a field of research that we define with the term ecoacoustics. In this paper, we draw the contours of ecoacoustics by detailing: the main theories, concepts and methods used in ecoacoustic research, and the numerous outcomes that can be expected from the ecological approach to sound. Ecoacoustics has several theoretical and practical challenges, but we firmly believe that this new approach to investigating ecological processes will generate abundant and exciting research programs.
[83]	Reiji Suzuki, Shiho Matsubayashi, Kazuhiro Nakadai, and Hiroshi G. Okuno. Localizing bird songs using an open source robot audition system with a microphone array. In Interspeech 2016. International Speech Communication Association, Sep. 2016. [ DOI \| http ] Auditory scene analysis is critical in observing bio-diversity and understanding social behavior of animals in natural habitats because many animals and birds sing or call and environmental sounds are made. To understand acoustic interactions among songbirds, we need to collect spatiotemporal data for a long period of time during which multiple individuals and species are singing simultaneously. We are developing HARKBird, which is an easily-available and portable system to record, localize, and analyze bird songs. It is composed of a laptop PC with an open source robot audition system HARK (Honda Research Institute Japan Audition for Robots with Kyoto University) and a commercially available low-cost microphone array. HARKBird helps us annotate bird songs and grasp the soundscape around the microphone array by providing the direction of arrival (DOA) of each localized source and its separated sound automatically. In this paper, we briefly introduce our system and show an example analysis of a track recorded at the experimental forest of Nagoya University, in central Japan. We demonstrate that HARKBird can extract birdsongs successfully by combining multiple localization results with appropriate parameter settings that took account of ecological properties of environment around a microphone array and species-specific properties of bird songs.
[84]	O. Tchernichovski, F. Nottebohm, C. E. Ho, B. Pesaran, and P. P. Mitra. A procedure for an automated measurement of song similarity. Animal Behaviour, 59(6):1167--1176, 2000. [ DOI ]
[85]	A. Ter Maat, L. Trost, H. Sagunsky, S. Seltmann, and M. Gahr. Zebra finch mates use their forebrain song system in unlearned call communication. PLOS ONE, 9(10):e109334, Oct. 2014. [ DOI ] Unlearned calls are produced by all birds whereas learned songs are only found in three avian taxa, most notably in songbirds. The neural basis for song learning and production is formed by interconnected song nuclei: the song control system. In addition to song, zebra finches produce large numbers of soft, unlearned calls, among which “stack” calls are uttered frequently. To determine unequivocally the calls produced by each member of a group, we mounted miniature wireless microphones on each zebra finch. We find that group living paired males and females communicate using bilateral stack calling. To investigate the role of the song control system in call-based male female communication, we recorded the electrical activity in a premotor nucleus of the song control system in freely behaving male birds. The unique combination of acoustic monitoring together with wireless brain recording of individual zebra finches in groups shows that the neuronal activity of the song system correlates with the production of unlearned stack calls. The results suggest that the song system evolved from a brain circuit controlling simple unlearned calls to a system capable of producing acoustically rich, learned vocalizations.
[86]	M. Towsey, B. Planitz, A. Nantes, J. Wimmer, and P. Roe. A toolbox for animal call recognition. Bioacoustics, 21(2):107--125, 2012. [ DOI ] Monitoring the natural environment is increasingly important as habitat degradation and climate change reduce the world's biodiversity. We have developed software tools and applications to assist ecologists with the collection and analysis of acoustic data at large spatial and temporal scales. One of our key objectives is automated animal call recognition, and our approach has three novel attributes. First, we work with raw environmental audio, contaminated by noise and artefacts and containing calls that vary greatly in volume depending on the animal's proximity to the microphone. Second, initial experimentation suggested that no single recognizer could deal with the enormous variety of calls. Therefore, we developed a toolbox of generic recognizers to extract invariant features for each call type. Third, many species are cryptic and offer little data with which to train a recognizer. Many popular machine learning methods require large volumes of training and validation data and considerable time and expertise to prepare. Consequently we adopt bootstrap techniques that can be initiated with little data and refined subsequently. In this paper, we describe our recognition tools and present results for real ecological problems.
[87]	M. Towsey, L. Zhang, M. Cottman-Fields, J. Wimmer, J. Zhang, and P. Roe. Visualization of long-duration acoustic recordings of the environment. Procedia Computer Science, 29:703--712, 2014. [ DOI \| http ]
[88]	E. Vannoni and A. G. McElligott. Fallow bucks get hoarse: vocal fatigue as a possible signal to conspecifics. Animal Behaviour, 78(1):3--10, 2009. [ DOI ] Many studies of sexually selected vocal communication assume that calls remain stable throughout the breeding season. However, during this period, physiological and social factors change and these can have strong effects on the structure of calls and calling rates. During the rut, fallow bucks, Dama dama, reduce their feeding and increase the time and energy spent on vocalizing and fighting to gain matings, and consequently their body condition declines greatly. The availability of matings and intensity of competition between males also change. Therefore, we predicted that male vocal signalling would vary over time in response to the changing intersexual and intrasexual selective environment. We measured the structure of fallow buck groans and the groaning rate throughout the rut. Fundamental frequency-related parameters were highest at the beginning and at the end of the rut, and lowest during the middle when most matings occur. The fundamental frequency perturbation along the groan (Jitter) remained stable throughout the rut, whereas the number of pulses and duration of the groans decreased linearly. The minimum formant dispersion did not vary significantly over the rut. Groaning rate increased towards the middle of the rut and then rapidly decreased afterwards. We suggest that changes in the structure of groans and groaning rate are associated with the declining body condition of males and variation in the availability of mating opportunities. The breakdown in some aspects of call structure towards the end of the breeding season may represent an honest signal that could be widespread in other species.
[89]	E. Vannoni and A. G. McElligott. Low frequency groans indicate larger and more dominant fallow deer (Dama dama) males. PLoS One, 3(9):e3113, 2008. [ DOI ] Background: Models of honest advertisement predict that sexually selected calls should signal male quality. In most vertebrates, high quality males have larger body sizes that determine higher social status and in turn higher reproductive success. Previous research has emphasised the importance of vocal tract resonances or formant frequencies of calls as cues to body size in mammals. However, the role of the acoustic features of vocalisations as cues to other quality-related phenotypic characteristics of callers has rarely been investigated. Methodology/Principal Findings: We examined whether the acoustic structure of fallow deer groans provides reliable information on the quality of the caller, by exploring the relationships between male quality (body size, dominance rank, and mating success) and the frequency components of calls (fundamental frequency, formant frequencies, and formant dispersion). We found that body size was not related to the fundamental frequency of groans, whereas larger males produced groans with lower formant frequencies and lower formant dispersion. Groans of high-ranking males were characterised by lower minimum fundamental frequencies and to a lesser extent, by lower formant dispersions. Dominance rank was the factor most strongly related to mating success, with higher-ranking males having higher mating success. The minimum fundamental frequency and the minimum formant dispersion were indirectly related to male mating success (through dominance rank). Conclusion/Significance: Our study is the first to show that sexually selected vocalisations can signal social dominance in mammals other than primates, and reveals that independent acoustic components encode accurate information on different phenotypic aspects of male quality.
[90]	T. M. Ventura, A. G. de Oliveira, T. D. Ganchev, J. M. de Figueiredo, O. Jahn, M. I. Marques, and K.-L. Schuchmann. Audio parameterization with robust frame selection for improved bird identification. Expert Systems with Applications, Jul. 2015. [ DOI \| http ] A major challenge in the automated acoustic recognition of bird species is the audio segmentation, which aims to select portions of audio that contain meaningful sound events and eliminates segments that contain predominantly background noise or sound events of other origin. Here we report on the development of an audio parameterization method with integrated robust frame selection that makes use of morphological filtering applied on the spectrogram seen as an image. The morphological filtering allows to exclude from further processing certain audio events, which otherwise could cause misclassification errors. The Mel Frequency Cepstral Coefficients (MFCCs) computed for the selected audio frames offer a good representation of the spectral information for dominant vocalizations because the morphological filtering eliminates short bursts of noise and suppresses weak competing signals. Experimental validation of the proposed method on the identification of 40 bird species from Brazil demonstrated superior accuracy and faster operation than three traditional and recent approaches. This is expressed as reduction of the relative error rate by 3.4% and the overall operational time by 7.5% when compared to the second best result. The improved frame selection robustness, precision, and operational speed facilitate applications like multi-species identification of real-field recordings.
[91]	Emmanuel Vincent, Shoko Araki, Fabian Theis, Guido Nolte, Pau Bofill, Hiroshi Sawada, Alexey Ozerov, Vikrham Gowreesunker, Dominik Lutter, and Ngoc QK Duong. The signal separation evaluation campaign (2007--2010): Achievements and remaining challenges. Signal Processing, 92(8):1928--1936, 2012. [ DOI ] We present the outcomes of three recent evaluation campaigns in the field of audio and biomedical source separation. These campaigns have witnessed a boom in the range of applications of source separation systems in the last few years, as shown by the increasing number of datasets from 1 to 9 and the increasing number of submissions from 15 to 34. We first discuss their impact on the definition of a reference evaluation methodology, together with shared datasets and software. We then present the key results obtained over almost all datasets. We conclude by proposing directions for future research and evaluation, based in particular on the ideas raised during the related panel discussion at the Ninth International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA 2010).
[92]	Jeffrey S. Vitter. Random sampling with a reservoir. ACM Transactions on Mathematical Software, 11(1):37--57, Mar. 1985. [ DOI \| http ] We introduce fast algorithms for selecting a random sample of n records without replacement from a pool of N records, where the value of N is unknown beforehand. The main result of the paper is the design and analysis of Algorithm Z; it does the sampling in one pass using constant space and in O(n(1 + log(N/n))) expected time, which is optimum, up to a constant factor. Several optimizations are studied that collectively improve the speed of the naive version of the algorithm by an order of magnitude. We give an efficient Pascal-like implementation that incorporates these modifications and that is suitable for general use. Theoretical and empirical results indicate that Algorithm Z outperforms current methods by a significant margin.
[93]	C. L. Walters, R. Freeman, A. Collen, C. Dietz, M. Brock Fenton, G. Jones, M. K. Obrist, S. J. Puechmaille, T. Sattler, B. M. Siemers, et al. A continental-scale tool for acoustic identification of European bats. Journal of Applied Ecology, 49:1064--1074, 2012. [ DOI ]
[94]	Michael S. Webster and Gregory F. Budney. Sound archives and media specimens in the 21st century. In C. H. Brown and T. Riede, editors, Comparative Bioacoustics: An Overview, chapter 11. Bentham Science Publishers, Oak Park, IL, USA, 2016. Audio recordings of birds and other animals, and also other forms of ‘biodiversity media’ (e.g., video recordings), capture the behavioral phenotype in ways that traditional museum specimens cannot, and natural history audio/media archives hold collections of recordings that span geography, time, and taxonomy. As such, these recordings can be used for a broad range of studies in ecology, evolution, and animal behavior, and newly developed tools for collecting and analyzing these recordings promise to further increase that research potential. Moreover, the digital revolution has made it easier than ever for high quality recordings to be collected and deposited in an archive, opening the door for large-scale citizen science efforts. But this potential also brings new challenges that must be met by the research community with regard to digital standards and accessibility. We recommend that researchers and other recordists deposit their materials in a suitable archive, that sound/media archives build strong partnerships with other types of natural history collections, that these archives also embrace technological advances to make their assets more accessible, and that archives and acoustic researchers harness “the power of the crowd” through crowd-sourcing and similar approaches. In doing so, sound archives and bioacoustic research will play an ever-increasing role in understanding our natural world, including responses of natural systems to human activities, in the 21st century.
[95]	G. Wichern, J. Xue, H. Thornburg, B. Mechtley, and A. Spanias. Segmentation, indexing, and retrieval for environmental and natural sounds. IEEE Transactions on Audio, Speech, and Language Processing, 18(3):688--707, 2010. [ DOI ] We propose a method for characterizing sound activity in fixed spaces through segmentation, indexing, and retrieval of continuous audio recordings. Regarding segmentation, we present a dynamic Bayesian network (DBN) that jointly infers onsets and end times of the most prominent sound events in the space, along with an extension of the algorithm for covering large spaces with distributed microphone arrays. Each segmented sound event is indexed with a hidden Markov model (HMM) that models the distribution of example-based queries that a user would employ to retrieve the event (or similar events). In order to increase the efficiency of the retrieval search, we recursively apply a modified spectral clustering algorithm to group similar sound events based on the distance between their corresponding HMMs. We then conduct a formal user study to obtain the relevancy decisions necessary for evaluation of our retrieval algorithm on both automatically and manually segmented sound clips. Furthermore, our segmentation and retrieval algorithms are shown to be effective in both quiet indoor and noisy outdoor recording conditions.
[96]	R. Haven Wiley. Associations of song properties with habitats for territorial oscine birds of eastern North America. American Naturalist, pages 973--993, 1991. [ DOI ]
[97]	H. Williams, I.I. Levin, D.R. Norris, A.E.M. Newman, and N.T. Wheelwright. Three decades of cultural evolution in savannah sparrow songs. Animal Behaviour, 85, 2013. [ DOI ] Cultural evolution can result in changes in the prevalence not only of different learned song types within bird populations but also of different segments within the song. Between 1980 and 2011, we examined changes within different segments of the single songs of male Savannah sparrows, Passerculus sandwichiensis, in an island population. Introductory notes did not change. The buzz segment showed similar stability; although a rare low-frequency variant appeared and then disappeared, the buzz segments from 1980 and 2011 were essentially identical. The middle segment, made up of discrete notes assembled into several types, was variable. However, the form of the middle segment did not affect fitness and may serve to denote individual identity. The terminal trill decreased steadily in frequency and duration over three decades. Longer trills were associated with lower reproductive success, suggesting that trill duration was under sexual selection. The notes sung between introductory notes were also associated with reproductive success. A high cluster sung in 1980–1982 disappeared altogether by 2011, and was gradually replaced by click trains, which were associated with greater reproductive success. During the final decade of the study, more clicks were added to click trains. Longer click trains, which may require vocal virtuosity and so indicate male quality, were also associated with greater reproductive success. Both trill duration and the number of clicks increased in variance during the three-decade span of the study. We suggest that such increases in variance might be a signature of directional cultural selection. 
Within the Savannah sparrow's relatively short and simple learned song, cultural evolution appears to be mediated by different mechanisms for different song segments, perhaps because the segments convey different information.
[98]	David R. Wilson, Laurene M. Ratcliffe, and Daniel J. Mennill. Black-capped chickadees, Poecile atricapillus, avoid song overlapping: evidence for the acoustic interference hypothesis. Animal Behaviour, 114:219--229, Apr. 2016. [ DOI \| http ] Many animals produce sounds that overlap the sounds of others. In some animals, overlapping is thought to be an aggressive signal important in resource defence. Yet, overlapping can also occur by chance, and therefore its function is controversial. In this study, we conducted two experiments to test the function of overlapping in black-capped chickadees, Poecile atricapillus. In experiment 1, we simulated territorial intrusions by broadcasting songs inside established chickadee territories. Resident males overlapped the playback-simulated intruders significantly less than expected by chance, as in most species in which overlapping has been described. Chickadees also overlapped more when they were farther from the intruder. This pattern suggests that chickadees avoid overlapping as a mechanism for reducing acoustic interference (‘interference avoidance hypothesis’). However, the pattern could also constitute submissive signalling if chickadees signal de-escalation (associated with greater distance between opponents) through increasing rates of overlapping (‘submissive signalling hypothesis’). Therefore, in experiment 2, we contrasted these two hypotheses by comparing responses to playback stimuli with low or high interference potential and low or high signal value. We manipulated interference potential by broadcasting stimuli at different amplitudes. We manipulated signal value by broadcasting either song stimuli, which elicit aggression, or white noise stimuli with matching time-amplitude characteristics. If overlapping is a submissive signal, then we predicted that chickadees would avoid overlapping song stimuli, but not white noise stimuli, which lack signal value.
Contrary to this prediction, chickadees overlapped song and white noise stimuli equally often, but significantly less often than expected by chance. Furthermore, chickadees overlapped both types of stimuli more often when they were broadcast at lower amplitudes (i.e. lower interference potential). Together, these findings provide compelling evidence that overlapping is not a signal in this species, and that chickadees avoid overlapping both biotic and abiotic sounds as a mechanism for reducing interference.
[99]	R. A. Zann. The zebra finch: a synthesis of field and laboratory studies, volume 5. Oxford University Press, Oxford, 1996.
[100]	Willem Zuidema. Context-freeness revisited. Proceedings of CogSci 2013, 2013. [ DOI ] A series of papers have appeared investigating the ability of various species to learn context-free languages. From a computational point of view, the experiments in this tradition suffer from a number of problems concerning the stimuli used in the training phase of the experiments, the controls presented in the test phase of the experiments, and the motivation for and the conclusions drawn from the experiments. This paper discusses in some detail the problems with the existing work in this domain before presenting a new design for this type of experiments that avoids the problems identified in existing studies. Finally, the paper presents results from a small study demonstrating the benefits of the new design.

Chapter 11

Computational bioacoustics scene analysis

Dan Stowell
