Author
Thakur, A
Abrol, V
Sharma, P
Rajan, P
IEEE
Journal title
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
DOI
10.1109/ICASSP.2018.8461814
Volume
2018-April
Last updated
2020-10-03T14:19:47.813+01:00
Page
261-265
Abstract
© 2018 IEEE. This paper focuses on the problem of bird species identification using audio recordings. Following recent developments in deep learning, we propose a multi-layer alternating sparse-dense framework for bird species identification. Temporal and frequency modulations in bird vocalizations are captured by concatenating frames of spectrograms, resulting in a high-dimensional super-frame-based representation. These super-frame representations are highly sparse. Hence, we propose to use random projections to compress these super-frames. This is followed by class-specific archetypal analysis, employed on these compressed super-frames for acoustic modeling, to obtain a convex-sparse representation. These convex-sparse representations are referred to as compressed convex spectral embeddings (CCSE). It is observed that these representations efficiently capture species-specific discriminative information. Experimental results provide compelling evidence that the proposed approach performs comparably to existing methods such as deep neural networks (DNN) and dynamic-kernel-based SVMs.
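The first two stages described in the abstract, super-frame construction and random-projection compression, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the context width of 5 frames, the 257-bin spectrogram, the target dimension of 200, and the choice of a Gaussian projection matrix are all assumptions made for illustration, and the class-specific archetypal analysis step is left out.

```python
import numpy as np


def make_super_frames(spectrogram, context):
    """Concatenate `context` consecutive frames into one super-frame.

    spectrogram: array of shape (n_bins, n_frames).
    Returns an array of shape (n_bins * context, n_frames - context + 1).
    """
    n_bins, n_frames = spectrogram.shape
    n_super = n_frames - context + 1
    if n_super <= 0:
        raise ValueError("context is longer than the spectrogram")
    # Column-major flattening places frame i, then frame i+1, and so on.
    cols = [
        spectrogram[:, i:i + context].reshape(-1, order="F")
        for i in range(n_super)
    ]
    return np.stack(cols, axis=1)


def random_projection(super_frames, out_dim, seed=0):
    """Compress super-frames with a Gaussian random matrix (an assumed choice)."""
    rng = np.random.default_rng(seed)
    in_dim = super_frames.shape[0]
    # Scaling by 1/sqrt(out_dim) roughly preserves Euclidean norms on average.
    proj = rng.standard_normal((out_dim, in_dim)) / np.sqrt(out_dim)
    return proj @ super_frames


# Hypothetical input: a random 257-bin, 100-frame magnitude spectrogram.
spec = np.random.default_rng(0).random((257, 100))
super_frames = make_super_frames(spec, context=5)
compressed = random_projection(super_frames, out_dim=200)
print(super_frames.shape, compressed.shape)
```

In this sketch each column of `compressed` is one compressed super-frame, which would then be passed to the acoustic modeling stage.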
Symplectic ID
925636
Download URL
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000446384600053&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=4fd6f7d59a501f9b8bac2be37914c43e
Publication type
Conference Paper
ISBN-13
9781538646588
Publication date
13 September 2018
Created on 04 Feb 2019 - 14:11.