Searching for a sound snippet in audio files

The technology you are looking for is called acoustic fingerprinting. Wikipedia defines an acoustic fingerprint as follows:

An acoustic fingerprint is a condensed digital summary, deterministically generated from an audio signal, that can be used to identify an audio sample or quickly locate similar items in an audio database.

The Wikipedia article linked above lists many applications, but most of them are commercial.

Another list of freeware and commercial products can be found in the AudioFingerprint article on MusicBrainz, a user-maintained open community that collects music metadata and makes it available to the public as a relational database.

Some free and open-source projects from that list that you might examine:

jHears
an acoustic fingerprinting framework.

Acoustid
open source project that aims to create a free database of audio fingerprints with mapping to the MusicBrainz metadata database and provide a web service for audio file identification using this database.

libFooID
an open source acoustic fingerprinting library.


You could try the algorithm that Avery Wang developed for Shazam, which solves the same problem. It stores a fingerprint for each song in a library, so that a snippet can be checked quickly: a match is found when a song's fingerprint contains a constellation of points that lines up with the constellation extracted from the snippet.
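To make the constellation idea more concrete, here is a minimal sketch of the matching step only. It is not Shazam's actual code. It assumes the spectrogram peaks have already been extracted as (time_frame, freq_bin) pairs, and every name and constant in it (FAN_OUT, MAX_DT, hashes, build_index, best_match, the synthetic songs) is made up for illustration. Pairs of nearby peaks are turned into hashes, and a snippet is matched by counting how many of its hashes agree on the same song and the same time offset.

```python
from collections import Counter, defaultdict

FAN_OUT = 3   # assumed: how many later peaks each anchor peak is paired with
MAX_DT = 20   # assumed: largest time gap (in frames) allowed inside a pair


def hashes(peaks):
    """Yield ((f1, f2, dt), t1) for pairs of nearby peaks.

    peaks is a list of (time_frame, freq_bin) tuples.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > MAX_DT:
                break          # peaks are sorted, so later ones are even further away
            if dt == 0:
                continue       # skip peaks in the same frame as the anchor
            yield (f1, f2, dt), t1
            paired += 1
            if paired == FAN_OUT:
                break


def build_index(library):
    """Map each hash to the (song_id, anchor_time) places where it occurs."""
    index = defaultdict(list)
    for song_id, peaks in library.items():
        for h, t in hashes(peaks):
            index[h].append((song_id, t))
    return index


def best_match(index, snippet_peaks):
    """Return (song_id, offset_in_frames, votes), or None if nothing matches."""
    votes = Counter()
    for h, t_snip in hashes(snippet_peaks):
        for song_id, t_song in index.get(h, ()):
            # A true match makes many hashes agree on one time offset.
            votes[(song_id, t_song - t_snip)] += 1
    if not votes:
        return None
    (song_id, offset), count = votes.most_common(1)[0]
    return song_id, offset, count


# Synthetic peak lists standing in for real spectrogram peaks.
library = {
    "song_a": [(t, (t * t) % 101) for t in range(60)],
    "song_b": [(t, (t * t * t + 5) % 103) for t in range(60)],
}
index = build_index(library)

# A snippet cut from song_a starting at frame 20, with its own clock starting at 0.
snippet = [(t - 20, f) for t, f in library["song_a"] if 20 <= t < 35]
print(best_match(index, snippet))
```

With this synthetic data the snippet is reported as coming from song_a at an offset of 20 frames. A real system would also need the peak-picking step and would have to cope with noise, so some snippet hashes would be missing or spurious; the offset voting is what keeps the match robust in that case.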

You can get Wang's whitepaper and links to several other systems and ideas here.