Some Notes on Nonlinearities of Speech

Esposito, Anna; Marinaro, Maria

doi:10.1007/11520153_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3445)

Included in the following conference series:

International School on Neural Networks, Initiated by IIASS and EMFCSC

Abstract

Speech is exceedingly nonlinear. Efforts to propose nonlinear models of its dynamics are worthwhile but difficult to implement, since nonlinearity is not easily handled from an engineering and mathematical point of view. This paper attempts to make the notion of nonlinearity in speech accessible to non-specialists, reviewing several nonlinear speech phenomena and the engineering efforts to model them.

Author information

Authors and Affiliations

Dipartimento di Psicologia, Seconda Università di Napoli, Via Vivaldi 43, Caserta, Italy
Anna Esposito
IIASS, Via Pellegrino 19, 84019, Vietri sul Mare, INFM Salerno, Italy
Anna Esposito & Maria Marinaro
Università di Salerno, Via S. Allende, Baronissi, Salerno, Italy
Maria Marinaro

Editor information

Editors and Affiliations

CNRS LTCI/TSI Paris, 46 rue Barrault, 75634, Paris Cedex 13, France
Gérard Chollet
Department of Psychology, Second University of Naples, and IIASS, Via Pellegrino 19, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Escola Universitària Politècnica de Mataró, Universitat Politècnica de Catalunya, Barcelona, Spain
Marcos Faundez-Zanuy
Dipartimento di Fisica “E.R. Caianiello”, Università degli Studi di Salerno, Via S. Allende, 84081, Baronissi, SA, Italy
Maria Marinaro

Cite this paper

Esposito, A., Marinaro, M. (2005). Some Notes on Nonlinearities of Speech. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_1

DOI: https://doi.org/10.1007/11520153_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27441-4
Online ISBN: 978-3-540-31886-6
eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science
