Nima Mesgarani, PhD

Academic Appointments

  • Assistant Professor of Electrical Engineering (Columbia University)

Gender

  • Male

Research

My research focuses on the information processing of acoustic signals at the interface of engineering and neuroscience. Using an interdisciplinary approach, my group aims to bridge the gap between these two disciplines by reverse engineering the signal processing performed by the brain, which in turn inspires novel approaches for emulating human abilities in machines.

This integrated approach leads to a better scientific understanding of the brain, novel speech processing algorithms for automated systems, and innovative methods for brain-machine interfaces and neural prostheses.
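
As a concrete illustration of one decoding technique from this line of work, the sketch below reconstructs a stimulus spectrogram from simulated neural responses using ridge regression. It is a minimal, hypothetical example (the data are synthetic and every variable name is invented here), not code from any of the publications listed below.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated data: T time bins, E recording electrodes, F spectrogram frequency channels.
    T, E, F = 2000, 64, 32
    stimulus = rng.standard_normal((T, F))              # "true" spectrogram (arbitrary units)
    encoder = rng.standard_normal((F, E)) / np.sqrt(F)  # unknown stimulus-to-response mapping
    responses = stimulus @ encoder + 0.5 * rng.standard_normal((T, E))  # noisy neural responses

    # Linear stimulus reconstruction: fit a decoder G that maps responses back to the
    # stimulus with ridge regression, G = (R'R + lam*I)^-1 R'S.
    lam = 1.0
    R, S = responses, stimulus
    G = np.linalg.solve(R.T @ R + lam * np.eye(E), R.T @ S)

    # Reconstruct the spectrogram and score it with a per-channel correlation.
    S_hat = R @ G
    corr = [np.corrcoef(S[:, f], S_hat[:, f])[0, 1] for f in range(F)]
    print(f"mean reconstruction correlation: {np.mean(corr):.2f}")

In practice, decoders of this kind are usually fit on time-lagged copies of the neural responses rather than single time bins, but the ridge-regression step is the same.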

Research Interests

  • Neural Engineering
  • Neural Processing of Speech and Sound
  • Neuro-Inspired Computation

Selected Publications

Zai, A., Bhargava, S., Mesgarani, N., & Liu, S. C. (2015). Reconstruction of audio waveforms from spike trains of artificial cochlea models. Frontiers in Neuroscience, 9, 347
Mesgarani, N. (2014). Stimulus Reconstruction from Cortical Responses. In Encyclopedia of Computational Neuroscience (pp. 1-3). Springer New York
Suied, C., Agus, T. R., Thorpe, S. J., Mesgarani, N., Pressnitzer, D. (2014). Auditory gist: Recognition of very short sounds from timbre cues. The Journal of the Acoustical Society of America, 135(3), 1380-1391
Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014). Representation of phonetic features in human non-primary auditory cortex. Science, 1245994
Mesgarani, N., David, S. V., Fritz, J., & Shamma, S. (2014). Mechanisms of noise robust representation of speech in primary auditory cortex. Proceedings of the National Academy of Sciences, 111(18)
O'Sullivan, J. A., Power, A. J., Mesgarani, N., …, Lalor, E. C. (2014). Attentional selection in a cocktail party environment can be decoded from single-trial EEG. Cerebral Cortex
Bouchard, K., Mesgarani, N., Johnson, K., & Chang, E. F. (2013). Functional organization of human ventral sensorimotor cortex in speech articulation. Nature, 1476
Mesgarani, N., & Chang, E. F. (2012). Selective cortical representation of attended speaker in multi-talker speech perception. Nature, 485
Pasley, B., David, S., Mesgarani, N., Flinker, A., Shamma, S., Crone, N., Knight, R., & Chang, E. (2012). Reconstructing speech from human auditory cortex. PLoS Biology, 10(1), 175
Mesgarani, N., Thomas, S., & Hermansky, H. (2011). Toward optimizing stream fusion in multistream recognition of speech. The Journal of the Acoustical Society of America, 130(1), EL14-EL18
Sivaram, G. S., Nemala, S. K., Mesgarani, N., & Hermansky, H. (2010). Data-driven and feedback based spectro-temporal features for speech recognition. IEEE Signal Processing Letters, 17(11), 957-960
Mesgarani, N., David, S. V., Fritz, J. B., & Shamma, S. A. (2009). Influence of context and behavior on stimulus reconstruction from neural activity in primary auditory cortex. Journal of Neurophysiology, 102(6), 3329-3339
Mesgarani, N., Fritz, J., & Shamma, S. (2010). A computational model of rapid task-related plasticity of auditory cortical receptive fields. Journal of Computational Neuroscience, 28(1), 19-27
David, S. V., Mesgarani, N., Fritz, J. B., & Shamma, S. A. (2009). Rapid synaptic depression explains nonlinear modulation of spectro-temporal tuning in primary auditory cortex by natural stimuli. The Journal of Neuroscience, 29(11), 3374-3386
David, S. V., Mesgarani, N., & Shamma, S. A. (2007). Estimating sparse spectro-temporal receptive fields with natural stimuli. Network: Computation in Neural Systems, 18(3), 191-212
Mesgarani, N., David, S. V., Fritz, J. B., & Shamma, S. A. (2008). Phoneme representation and classification in primary auditory cortex. The Journal of the Acoustical Society of America, 123(2), 899-909
Mesgarani, N., & Shamma, S. (2007). Denoising in the domain of spectrotemporal modulations. EURASIP Journal on Audio, Speech, and Music Processing, 2007(3), 3
Mesgarani, N., Slaney, M., & Shamma, S. (2006). Discrimination of speech from non-speech based on multiscale spectro-temporal modulations. IEEE Transactions on Audio, Speech, and Language Processing, 14(3), 920-930

Refereed Conference Publications
Nagamine, T., Seltzer, M. L., & Mesgarani, N. (2015). Exploring How Deep Neural Networks Form Phonemic Categories. In Sixteenth Annual Conference of the International Speech Communication Association, Dresden, Germany
Yang, M., Sheth, S. A., Schevon, C. A., McKhann, G. M., II, & Mesgarani, N. (2015). Speech reconstruction from human auditory cortex with deep neural networks. In Sixteenth Annual Conference of the International Speech Communication Association, Dresden, Germany
Mahajan, N., Mesgarani, N., & Hermansky, H. (2014). Principal Components of Auditory Spectro-Temporal Receptive Fields. In Fifteenth Annual Conference of the International Speech Communication Association
Ng, T., Zhang, B., Nguyen, L., Matsoukas, S., Zhou, X., Mesgarani, N., ... & Matejka, P. (2012). Developing a Speech Activity Detection System for the DARPA RATS Program. In Proceedings of Interspeech
Plchot, O., et al. (2013). Developing a speaker identification system for the DARPA RATS project. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Mesgarani, N., & Chang, E. F. (2012). Speech and speaker separation in human auditory cortex. In Proceedings of Interspeech, Portland
Zhou, X., Garcia-Romero, D., Mesgarani, N., Stone, M. C., Espy-Wilson, C. Y., & Shamma, S. A. (2012). Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations. In Proceedings of Interspeech
Thomas, S., Mallidi, S. H. R., Janu, T., Hermansky, H., Mesgarani, N., Zhou, X., ... & Matsoukas, S. (2012). Acoustic and Data-driven Features for Robust Speech Activity Detection. In Proceedings of Interspeech
Hermansky, H., Mesgarani, N., & Thomas, S. (2011). Performance monitoring for robustness in automatic recognition of speech. In MLSLP (pp. 31-34)
Mesgarani, N., Thomas, S., & Hermansky, H. (2011). Adaptive Stream Fusion in Multistream Recognition of Speech. In INTERSPEECH (pp. 2329-2332)
Mesgarani, N., & Shamma, S. (2011). Speech processing with a cortical representation of audio. In Acoustics, Speech and Signal Processing (ICASSP)
Mesgarani, N., & Shamma, S. (2010). Noise robust encoding of speech in primary auditory cortex. In Proceedings of the IEEE Asilomar Conference on Signals, Systems and Computers, Asilomar, CA
Jansen, A., Mesgarani, N., & Niyogi, P. (2010). Point process models of spectro-temporal modulation events for speech recognition. In Proceedings of the IEEE Asilomar Conference on Signals, Systems and Computers, Asilomar, CA
Mesgarani, N., Thomas, S., & Hermansky, H. (2010). A multistream multiresolution framework for phoneme recognition. In Proceedings of Interspeech, Makuhari, Japan
Thomas, S., Patil, K., Ganapathy, S., Mesgarani, N., & Hermansky, H. (2010). A phoneme recognition framework based on auditory spectro-temporal receptive fields. In Proceedings of Interspeech (pp. 2458-2461)
Liu, S., Mesgarani, N., & Hermansky, H. (2010). The use of spike-based representations for hardware audition systems. In IEEE International Symposium on Circuits and Systems (ISCAS), France
Mirbagheri, M., Mesgarani, N., & Shamma, S. (2010). Speech enhancement using nonlinear filtering of spectrotemporal representation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas
Sivaram, G., Nemala, S., Mesgarani, N., Elhilali, M., & Hermansky, H. (2010). Augmented discriminant spectrotemporal features for speech recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas
Mesgarani, N., Sivaram, G., & Hermansky, H. (2009). Discriminant spectrotemporal features for phoneme recognition. In Proceedings of Interspeech, U.K.