Author
Sharma, P
Abrol, V
Nivedita
Sao, A
Journal title
COMPUTER SPEECH AND LANGUAGE
DOI
10.1016/j.csl.2018.05.003
Volume
52
Last updated
2021-09-26T04:19:01.12+01:00
Page
191-208
Abstract
© 2018 Elsevier Ltd In this paper, we have explored the framework of compressed sensing (CS) and sparse representation (SR) to reduce the footprint of unit selection based speech synthesis (USS) system. In the CS based framework, footprint reduction is achieved by storing either CS measurements or signs of CS measurements, instead of storing the raw speech waveforms. For efficient reconstruction using CS measurements, the speech signal should have a sparse representation over a predefined basis/dictionary. Hence, in this work, we have also studied the effectiveness of sparse representation for compressing the speech waveform. The experimental results are demonstrated using an analytical dictionary (DCT matrix), and several learned dictionaries, derived using K-singular value decomposition (KSVD), method of optimal directions (MOD), greedy adaptive dictionary (GAD) and principal component analysis (PCA) algorithms. To further increase compression in SR based framework of footprint reduction, the significant coefficients of sparse vector are selected adaptively, based on the type of speech segment (e.g., voiced, unvoiced etc.). Experimental studies on two different Indian languages suggest that CS/SR based footprint reduction methods can be used as an alternative to existing compression methods employed in USS system.
Symplectic ID
925635
Download URL
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000440789000010&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=4fd6f7d59a501f9b8bac2be37914c43e
Publication type
Journal Article
Publication date
November 2018
Please contact us with feedback and comments about this page. Created on 04 Feb 2019 - 14:11.