Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3445)

Abstract

Speech is highly nonlinear. Nonlinear models of its dynamics are worth pursuing but difficult to implement, since nonlinearity is not easily handled from either an engineering or a mathematical point of view. This paper attempts to make the notion of nonlinearity in speech accessible to non-specialists, reviewing several nonlinear speech phenomena and the engineering efforts to model them.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Esposito, A., Marinaro, M. (2005). Some Notes on Nonlinearities of Speech. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_1
