Speech Coding Based on Spectral Dynamics

Motlíček, Petr; Hermansky, Hynek; Garudadri, Harinath; Srinivasamurthy, Naveen

doi:10.1007/11846406_59

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4188)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

In this paper we present first experimental results with a novel audio coding technique based on approximating Hilbert envelopes of relatively long segments of audio signal in critical-band-sized sub-bands by an autoregressive model. We exploit the generalized autocorrelation linear predictive technique, which allows better control over how the model fits the peaks and troughs of the envelope in each sub-band. Although the technique introduces a longer algorithmic delay, it achieves improved coding efficiency. Since the technique does not directly model short-term spectral envelopes of the signal, it is suitable not only for coding speech but also for coding other audio signals.
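
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one standard route to an autoregressive envelope model, namely linear prediction applied to the DCT of a segment, and it works on a single full-band segment instead of critical-band-sized sub-bands. The function names, the compression exponent `r` (one reading of "generalized autocorrelation", where the power spectrum is raised to a power before the autocorrelation is computed), and the small `loading` term added for numerical stability are all assumptions made for illustration.

```python
import numpy as np


def dct_ii(x):
    """Unnormalised DCT-II, written as an explicit matrix product."""
    n = len(x)
    k = np.arange(n)[:, None]   # DCT index; the AR model runs along this axis
    t = np.arange(n)[None, :]   # sample index within the segment
    return np.cos(np.pi * k * (2 * t + 1) / (2 * n)) @ x


def generalized_autocorr(c, order, r=1.0):
    """Autocorrelation obtained from the power spectrum raised to the power r.

    r = 1 gives the ordinary linear autocorrelation; r < 1 compresses the
    dynamic range before the all-pole fit (hypothetical parameter name).
    """
    spec = np.abs(np.fft.rfft(c, 2 * len(c))) ** 2   # zero-padding avoids wrap-around
    return np.fft.irfft(spec ** r)[: order + 1]


def levinson(acf, order):
    """Levinson-Durbin recursion: returns AR coefficients and residual energy."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = acf[0]
    for i in range(1, order + 1):
        k = -(acf[i] + a[1:i] @ acf[i - 1:0:-1]) / err   # reflection coefficient
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def ar_temporal_envelope(x, order=20, r=1.0, loading=1e-6):
    """All-pole approximation of the squared temporal envelope of x."""
    n = len(x)
    acf = generalized_autocorr(dct_ii(x), order, r)
    acf[0] *= 1.0 + loading                      # assumption: tiny lag-0 loading for stability
    a, err = levinson(acf, order)
    w = np.pi * (np.arange(n) + 0.5) / n         # sample t maps to angle pi * (t + 0.5) / n
    A = np.exp(-1j * np.outer(w, np.arange(order + 1))) @ a
    return err / np.abs(A) ** 2


# Synthetic test segment: a Gaussian-shaped burst on a sinusoidal carrier.
n, t0 = 512, 150
t = np.arange(n)
x = np.exp(-0.5 * ((t - t0) / 20.0) ** 2) * np.cos(2 * np.pi * 0.2 * t)

env = ar_temporal_envelope(x, order=20)               # ordinary autocorrelation
env_compressed = ar_temporal_envelope(x, order=20, r=0.5)
peak = int(np.argmax(env))                            # expected to lie near t0
```

Only the `order + 1` model parameters per segment would need to be transmitted to describe the envelope, which is where a coding gain of this kind would come from. A real coder would also have to encode the sub-band carrier, which this sketch leaves out.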

Author information

Authors and Affiliations

IDIAP Research Institute, Rue du Simplon 4, CH 1920, Martigny, Switzerland
Petr Motlíček & Hynek Hermansky
Faculty of Information Technology, Brno University of Technology, Božetěchova 2, Brno, 612 66, Czech Republic
Petr Motlíček & Hynek Hermansky
École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Hynek Hermansky
Qualcomm Inc., San Diego, California, USA
Harinath Garudadri & Naveen Srinivasamurthy

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Botanická 68a, CZ-602 00, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 60200, Brno, Czech Republic
Karel Pala

About this paper

Cite this paper

Motlíček, P., Hermansky, H., Garudadri, H., Srinivasamurthy, N. (2006). Speech Coding Based on Spectral Dynamics. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_59

DOI: https://doi.org/10.1007/11846406_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6
eBook Packages: Computer Science; Computer Science (R0); Springer Nature Proceedings Computer Science
