Creating super-hearing capabilities with real-time AI
Shyam Gollakota
Abstract: In noisy environments, human auditory perception can be limited. Imagine yourself in a crowded room with a cacophony of sounds, yet having the ability to focus on specific sounds or remove unwanted ones based on their semantic descriptions. This capability entails understanding and manipulating an acoustic scene, isolating each sound, and associating a 2D spatial context or semantic meaning with each individual sound -- a formidable challenge even for the human brain.
In this talk, I will showcase a series of projects that aim to enhance human auditory perception using AI. I will demonstrate how AI can augment humans to tackle tasks that are difficult for the human auditory system alone.
Lastly, I will speculate on the future of augmented intelligence and how we are on the brink of a new era for intelligent mobile systems. These systems have the potential to turn science fiction into reality and revolutionize various domains, from redefining the future of billions of earbuds and headphones to developing cutting-edge hearing aids.
Bio: Shyam Gollakota is a Washington Research Foundation Endowed Professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. His work has been licensed by ResMed Inc. and by his startup Sound Life Sciences (acquired by Google), and is in use by millions of users. He was also CEO of a startup that obtained FDA 510(k) clearance for technology developed in his lab. His lab also worked closely with the Washington Department of Agriculture to wirelessly track invasive "murder" hornets, which resulted in the destruction of the first nest in the United States. He received the ACM Grace Murray Hopper Award in 2020 and was named a Moore Inventor Fellow in 2021. He has also been named to MIT Technology Review's 35 Innovators Under 35, Popular Science's "Brilliant 10", and twice to Forbes' 30 Under 30 list. His group's research has earned Best Paper awards at MOBICOM, SIGCOMM, UbiComp, SenSys, NSDI, and CHI; has appeared in interdisciplinary journals including Nature, Nature Communications, Nature Biomedical Engineering, Science Translational Medicine, and Science Robotics; and has been recognized as an MIT Technology Review Breakthrough Technology of 2016 and one of Popular Science's top innovations of 2015. He is an alumnus of MIT (Ph.D., 2013, winner of the ACM Doctoral Dissertation Award) and IIT Madras.
EEG on earables - an industry perspective
Geoffrey Mackellar
Abstract: The human auditory system is exquisitely designed to translate ambient sounds into sensory neural signals which are processed by the brain in the nearby temporal and parietal cortices. Hearing sensitivity is logarithmic, covering an extraordinary range of sound pressure levels. We can perceive tiny differences in sound frequencies and appreciate the full spectrum of sounds in parallel. Strong connections between the left and right lobes and the directionality of the ears allow us to localise sound sources by estimating the timing difference across a wide range of frequencies and amplitudes. Ambient sounds contribute to a fused sensory model of our surroundings, utilised by the brain in planning movements and behaviour. Components of the soundscape are classified and directed to appropriate cortical structures to understand speech, hear music, perceive threats and opportunities and interpret our surroundings. The brain is the organ we hear with – the auditory system serves to bring the sounds into the brain as electrical impulses that set off a cascade of downstream processing.
Electroencephalography (EEG) non-invasively measures the pattern of tiny voltages induced on the surface of the skin by the interaction of groups of neurons in the cortical layer of the brain. In this presentation I will discuss the brain’s architecture, how we can measure its operation and what we can infer about the wearer’s state of mind, task processing, intentions, preferences and behaviour in real time. I will review some of the applications that already exist for EEG, and the engineering challenges for integrating direct brain measurements into wearable devices. I will discuss some of the novel and powerful opportunities presented by integrating EEG sensing and interpretation into hearables.
Bio: Dr Geoffrey Mackellar (BSc (Hons 1), PhD (Physics)) has had a long and distinguished career in private industry as a research and product development leader for many high-tech medical, industrial, and scientific products. He has deep experience in all aspects of business, from the invention and development of products to business development, intellectual property, technology licensing, regulatory affairs, government and private funding, marketing, and sales of over 50 products. Geoff leads Emotiv's Research, Hardware, and AI / Machine Learning efforts.
Geoff is a co-inventor on more than 10 US patents and several pending applications. He has co-authored several significant journal articles in neuroscience, laser physics, and the detection of cervical pre-cancer and melanoma. Instruments of his own design, coupled with machine learning algorithms, have outperformed existing diagnostic methods.
Geoff co-founded Metalaser Technologies, Visiray, Dynamic Light, and Emotiv Lifesciences, and was an R&D leader with Polartechnics. He has led the development of successful products including medical and scientific laser systems, diagnostic and screening systems for pre-cancer of the cervix and melanoma, patented machine learning platforms and techniques, and the highly successful Emotiv EPOC, Insight, Flex, and MN8 neuroheadsets.
Solarscan, a product for the early detection of melanoma, won the prestigious Bradfield Prize awarded by the Australian Institute of Engineers, the Red Dot Design Award, and the Australian International Design Awards in 2005.
EMOTIV EPOC won two categories of the Engineering Excellence Awards, the Australian International Design Awards, the Red Dot Design Award, and the AutoVision Innovations Award in 2010.
Session 1: Earables for Motion Tracking and Gesture Recognition
10:40 - 11:00, Coffee Break
11:00 - 11:15, GCCRR: A Short Sequence Gait Cycle Segmentation Method Based on Ear-Worn IMU (Zhenye Xu et al.)
11:15 - 11:30, BrushBuds: Toothbrushing Tracking Using Earphone IMUs (Qiang Yang et al.)
11:30 - 11:45, Boxing Gesture Recognition in Real-Time using Earable IMUs (Ozlem Durmaz et al.)
11:45 - 12:00, Fit2Ear: Generating Personalized Earplugs from Smartphone Depth Camera Images (Haibin Zhao et al.)
12:00 - 13:00, Lunch and Networking
Session 2: Earables for Health and Physiological Monitoring
14:00 - 14:15, Identification of Vagal Activation Metrics in Heart Rate Variability during Transcutaneous Auricular Vagal Nerve Stimulation (Daniel Ehrens et al.)
14:15 - 14:30, I know it's EEG: An in-situ selection of ear-EEG channels (Tanuja Jayas et al.)
14:30 - 14:45, On the Production and Measurement of Cardiac Sounds in the Ear Canal (Kenneth Christofferson et al.)
14:45 - 15:00, Quality Assessment and Neurophysiological Signal Resemblance of Ear EEG (Jonathan Berent et al.)
15:00 - 15:15, EarTune: Exploring the Physiology of Music Listening (Kayla-Jade Butkow et al.)
15:15 - 15:30, Closing Remarks and Best Paper Award announcement
All papers will be presented at the workshop; each presentation slot is 15 minutes (10 minutes for the talk + 5 minutes for Q&A).
We will solicit three categories of papers.
Full papers (up to 6 pages including references) should report reasonably mature work with earables and are expected to demonstrate concrete and reproducible results, albeit possibly at limited scale.
Experience papers (up to 4 pages including references) should present extensive experience with the implementation, deployment, and operation of earable-based systems. Desirable papers contain real data as well as descriptions of the practical lessons learned.
Short papers (up to 2 pages including references) are encouraged to report novel and creative ideas that have yet to produce concrete research results but are at a stage where community feedback would be useful.
Moreover, we will have a special submission category - "Dataset Paper" - soliciting a 1-2 page document describing a well-curated and labelled dataset collected with earables (ideally accompanied by the dataset itself).
All papers must use the two-column ACM sigconf template, and all accepted papers (regardless of category) will be included in the ACM Digital Library. All papers will be available digitally through the workshop website and the UbiComp/ISWC 2024 Adjunct Proceedings. We will offer "Best Paper" and "Best Dataset" awards sponsored by Nokia Bell Labs.
Topics of interest (NOT an exhaustive list):
Acoustic Sensing with Earables
Kinetic Sensing with Earables
Multi-Modal Learning with Earables
Multi-Task Learning with Earables
Active Learning with Earables
Low-Power Sensing Systems for Earables
Authentication & Trust Mechanisms for Earables
Quality-Aware Data Collection with Earables
Experience Sampling with Earables
Crowd Sourcing with Earables
Novel UI and UX for Earables
Auditory Augmented Reality Applications with Earables
Lightweight Deep Learning on Earables
Tiny Machine Learning on Earables
Health and Wellbeing Applications of Earables
Emerging applications of Earables
While the workshop will accept papers describing completed work as well as work-in-progress, the emphasis is on early discussion of novel and radical ideas (potentially of a controversial nature) rather than detailed description and evaluation of incremental advances.
Submissions must be in PDF format and no longer than 6 pages for Full Papers, 4 pages for Experience Papers, and 2 pages for Interactive Posters and Vision Papers (page limits include references).
Reviews will be double-blind: no names or affiliations should be included in the submission.
The submission template can be downloaded from the ACM site. Alternatively, the Overleaf version can be found here. LaTeX documents should use the "sigconf" template style. Word users should use the interim template downloadable from the ACM site.
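For LaTeX users, a minimal sketch of a sigconf document set up for double-blind review might look like the following; the "anonymous" class option, placeholder title, and the references.bib file name are illustrative assumptions, not workshop requirements.

% Minimal sketch of an anonymised ACM sigconf submission (illustrative only).
% The "anonymous" option suppresses author identities in the compiled PDF
% for double-blind review.
\documentclass[sigconf,anonymous]{acmart}

\begin{document}

\title{Placeholder EarComp Submission Title}

% Author details stay in the source but are hidden while "anonymous" is active.
\author{Anonymous Author(s)}
\affiliation{%
  \institution{Anonymous Institution}
  \country{Country}}

% In acmart, the abstract must be declared before \maketitle.
\begin{abstract}
One-paragraph summary of the contribution.
\end{abstract}

\maketitle

\section{Introduction}
Body text, figures, and references go here, within the page limit for the chosen category.

\bibliographystyle{ACM-Reference-Format}
\bibliography{references} % assumes a references.bib file alongside the source

\end{document}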
Submission Site: https://new.precisionconference.com/submissions
Submission Instructions: to select the appropriate track, choose "SIGCHI" in the Society field, "Ubicomp/ISWC 2024" as the Conference, and, finally, pick "Ubicomp/ISWC 2024 EarComp" as the Track.
- Submission Deadline: July 5th 2024
- Acceptance Notification: July 15th 2024
- Camera Ready Deadline: July 26th 2024
For any questions or concerns, get in touch with earcomp@esense.io.
General Chairs
Alessandro Montanari, Nokia Bell Labs Cambridge
Andrea Ferlini, Nokia Bell Labs Cambridge
Program Chairs
Nirupam Roy, University of Maryland, College Park (UMD)
Xiaoran "Van" Fan, Google
Longfei Shangguan, University of Pittsburgh
Steering Committee
Fahim Kawsar, Nokia Bell Labs, Cambridge
Alessandro Montanari, Nokia Bell Labs Cambridge
Andrea Ferlini, Nokia Bell Labs Cambridge
Web, Publicity and Publication
Jake Stuchbury-Wass, University of Cambridge
Yang Liu, Nokia Bell Labs Cambridge
Program Committee
Dong Ma, Singapore Management University
Marios Costantinides, Nokia Bell Labs Cambridge
Yang Liu, University of Cambridge
Yang Liu, Nokia Bell Labs Cambridge
Ting Dang, University of Melbourne
Khaldoon Al-Naimi, Nokia Bell Labs Cambridge
Ashok Thangarajan, Nokia Bell Labs Cambridge
Shijia Pan, UC Merced
Jagmohan Chauhan, University of Southampton
Jay Prakash, Silence Laboratories, Singapore
Jun Han, KAIST
Wen Hu, UNSW Sydney
Zhenyu Yan, Chinese University of Hong Kong
Mi Zhang, Michigan State University
Rajalakshmi Nandakumar, Cornell Tech
Bashima Islam, Worcester Polytechnic Institute
VP Nguyen, University of Texas at Arlington/UMass Amherst
Ho-Hsiang Wu, Bosch, USA
Ananta Balaji, Nokia Bell Labs Cambridge