ASPED: Audio Sensing for PEdestrian Detection Dataset

Overview

ASPED (Audio Sensing for PEdestrian Detection) is a large-scale audio and video dataset for pedestrian detection from sound. ASPED.a comprises almost 2,600 hours of audio, more than 3.4 million continuous video frames, and corresponding pedestrian-count annotations for each audio and video segment. For more information, please see our paper published and presented at IEEE ICASSP 2024.
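To illustrate how audio and pedestrian-count annotations line up, here is a minimal sketch that splits a waveform into fixed-length frames and pairs each frame with one count label. The sample rate, frame length, and function name are illustrative assumptions, not part of the ASPED specification.

```python
import numpy as np

def frame_audio_with_counts(audio, counts, sr=16000, frame_sec=1.0):
    """Split a mono waveform into fixed-length frames and pair each
    frame with one pedestrian-count label.

    NOTE: sr=16000 and 1-second frames are assumptions for this sketch,
    not values taken from the ASPED dataset documentation.
    """
    frame_len = int(sr * frame_sec)
    n_frames = min(len(audio) // frame_len, len(counts))
    # Truncate to whole frames, then reshape into (n_frames, frame_len)
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    return list(zip(frames, counts[:n_frames]))

# Toy example: 3 seconds of silence with per-second pedestrian counts
audio = np.zeros(3 * 16000, dtype=np.float32)
pairs = frame_audio_with_counts(audio, [0, 2, 1])
print(len(pairs))  # 3 (frame, count) pairs
```

The actual dataset ships its own file layout and annotation format; this snippet only shows the general frame-to-label alignment idea.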

We are currently processing the second round of collected data, ASPED.b, for public release. Unlike ASPED.a, which was gathered in a vehicle-free campus environment, ASPED.b was collected amidst vehicular noise. We are investigating the differences in pedestrian-sensing performance between these two environments.

  1. Download ASPED.a
  2. Download ASPED.b

Model

  • Our model is available in our GitHub repository.