Keynote Speakers

We are pleased to welcome three distinguished keynote speakers at WASPAA 2017:

  • Ville Pulkki, Aalto University, Finland
  • Augusto Sarti, Politecnico di Milano, Italy
  • Mark Plumbley, University of Surrey, UK


Parametric Time-Frequency-Domain Spatial Audio — Delivering Sound According to Human Spatial Resolution

Ville Pulkki, Aalto University, Finland


The application of time-frequency-domain techniques in spatial audio is relatively new: the first attempts were published about 15 years ago. A common property of these techniques is that the sound field is captured with multiple microphones, and its properties are analyzed at each time instant and individually for different frequency bands. These properties can be described by a set of parameters, which are subsequently used in processing to achieve different tasks, such as perceptually motivated reproduction of spatial sound, spatial filtering, or spatial sound synthesis. Such signal-dependent processing relies on more-or-less implicit assumptions about the spatial and spectral resolution of the listener, and it typically provides a prominent enhancement in quality compared to signal-independent techniques sharing the same input signal. In the talk, I will describe the main approaches and techniques, as well as their motivations and underlying assumptions. Directional audio coding (DirAC), one of the major techniques in the field, is described in detail. DirAC is discussed in the contexts of spatial sound reproduction and the synthesis of virtual acoustic environments, with first- or higher-order spherical-harmonic microphone input.


Ville Pulkki is an associate professor in the Department of Signal Processing and Acoustics at Aalto University, Helsinki, Finland. He has been working in the field of spatial audio for over 20 years. He developed the vector-base amplitude panning (VBAP) method during his PhD (2001), and directional audio coding with his research group after the PhD. He has also contributed to research on the perception of spatial sound, laser-based measurement of room responses, and binaural auditory models.

Looking into Perfection: the Art and Science of Stradivarius Violins

Augusto Sarti, Politecnico di Milano, Italy


It was only five years ago that I was offered the task of establishing a new research lab devoted to the acoustics of violins in the city of Cremona, UNESCO World Heritage site of the intangible practice of lutherie. It made sense because it was in this very city that the families of Stradivarius, “Guarneri del Gesù” and Amati thrived and taught the world how to seek perfection in musical instrument making. The Musical Acoustics Lab has been hard at work ever since. Settled in the world-renowned Violin Museum, we have been analysing the vibrational, acoustic and timbral properties of historical violins, with the purpose of shedding light on the most prestigious “sound generators” ever made. We have done so using tools of signal processing, computational acoustics and machine intelligence.
We learnt how to perform a model-based estimation of the so-called “bridge admittance” to capture the vibrational response of the violin’s body. We developed a technique for measuring the violin’s radiance pattern while it is being played, using a calibrated pair of plenacoustic cameras for simultaneous localization/tracking of the instrument’s pose and high-resolution radiance estimation along a given path. We also developed specific machine learning techniques (typically based on Deep Belief Networks) for studying the timbral qualities of the instruments in relation to other acoustic and vibrational properties, all in a non-invasive (or minimally invasive) fashion. We applied these methods to all the historical violins in the collection of the Violin Museum of Cremona, as well as to all the winners of the renowned world competition of lutherie that takes place in Cremona every three years.
I will talk about these experiences (particularly about the challenges of non-invasive multimodal analysis) and discuss what we have learnt about the art and craftsmanship of the great luthiers of the past.

Cremona, Museo del Violino


Augusto Sarti is a professor at the Politecnico di Milano, Italy. He received his Ph.D. in information engineering from the University of Padua, Italy, in 1993, through a joint graduate program with the University of California, Berkeley. His research interests are in the area of multimedia signal processing, with particular focus on audio and acoustic signal processing. He has coauthored over 250 scientific publications and 20 patents in the area of multimedia signal processing. He has promoted, coordinated, or contributed to more than 20 European projects. He is currently the scientific director of the Musical Acoustics Lab and of the Sound and Music Computing Lab of the Politecnico di Milano. He is a member of the IEEE Technical Committee on Audio and Acoustic Signal Processing, and the chairman of the EURASIP Special Area Team on Acoustic, Sound and Music Signal Processing.


Making Sense of Sounds: Machine Listening in the Real World

Mark Plumbley, University of Surrey, UK


Imagine you are standing on a street corner in a city. Close your eyes: what do you hear? Perhaps some cars and buses driving on the road, footsteps of people on the pavement, beeps from a pedestrian crossing, rustling and clonks from shopping bags and boxes, and the hubbub of talking shoppers. Just by listening, you know what is happening around you, without even needing to open your eyes. You can do the same in a kitchen as someone is making breakfast, or if you are listening to a tennis match on the radio. For most people, this skill of listening to everyday events and scenes is so natural that it is taken for granted. However, this is a very challenging task for computers, and it is an open problem how to build “machine listening” algorithms that can automatically recognize sound events and scenes. In this talk, I will discuss some of the techniques and approaches that we are using to recognize and explore different types of real-world sounds, and we will see how these machine listening algorithms offer the potential to make sense of the huge amount of sound in our digital world, bringing benefits to areas such as health, security, creative industries and the environment.


Mark Plumbley is Professor of Signal Processing at the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey, in Guildford, UK. After receiving his Ph.D. degree in neural networks in 1991, he became a Lecturer at King’s College London, before moving to Queen Mary University of London in 2002. He subsequently became Professor and Director of the Centre for Digital Music, before joining the University of Surrey in 2015. He is known for his work on analysis and processing of audio and music, using a wide range of signal processing techniques, including independent component analysis, sparse representations, and deep learning. He is also keen to promote the importance of research software and data in audio and music research, including training researchers to follow the principles of reproducible research, and he led the first data challenge on Detection and Classification of Acoustic Scenes and Events (D-CASE), at WASPAA 2013. He currently leads two EU-funded research training networks in sparse representations, compressed sensing and machine sensing, as well as EPSRC-funded projects on audio source separation and on making sense of everyday sounds, and is a co-editor of the forthcoming book “Computational Analysis of Sound Scenes and Events”. He is a Fellow of the IET and IEEE.
