Audio-Processing
A handy Node.js package for audio processing.
For example, it can extract frequencies from audio and compute the pitch.
The source code is available at https://github.com/fengzhang2011/audio-processing.
The npm package is available at https://www.npmjs.com/package/audio-processing.
1. HOW TO USE
1.1 A simple test on the library
Execute the following commands.
$ cd build
$ cmake ..
$ make
$ ./audio_processing
1.2 Use it in JavaScript
This package is published in the npm registry, so it can be installed via npm.
$ mkdir your_project
$ cd your_project
$ npm init
$ npm install audio-processing
NOTE:
If you encounter issues such as permission denied while installing the package, especially in a Docker container, try the following command.
Reason: setting the unsafe-perm boolean to true suppresses the UID/GID switching when running package scripts.
# npm config set unsafe-perm true
Now you can use the package. The example code is as follows.
const ap = require('audio-processing');
console;
{
  let audio = await ap;
  console;
  ap;
  console;
  console;
  console;
  // console.log(ap.detectPitch(audio.wavdataL, audio.samplerate, 'goertzel'));
  // console.log(ap.detectPitch(audio.wavdataL, audio.samplerate, 'dft'));
  let ampfreq = await ap;
  // console.log('ampfreq=', ampfreq);
  let data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19];
  // console.log(data);
  let freq_data = await ap;
  // console.log(freq_data);
  let td_data = await ap;
  // console.log(td_data);
}
;
2. CREDITS
This code uses FFTS, Pitch-Detection, AudioFile, Opencore-AMR, MiniMp3, and libsamplerate.
Thanks to their authors for the great work.
The versions in use are as follows:
# | Project | Version | Date |
---|---|---|---|
1 | FFTS | fe86885 | Jun-17-2017 |
2 | Pitch-Detection | 7799a62 | Oct-07-2018 |
3 | AudioFile | a6430a0 | Jun-06-2017 |
4 | Opencore-AMR | 0.1.5 | Mar-16-2017 |
5 | MiniMp3 | 7295650 | Sep-26-2018 |
6 | libsamplerate | 313685a | Mar-07-2019 |
3. THIRD-PARTY LIBRARIES
3.1 Supported Audio Format
3.1.1 Format: .wav (WAV)
Clone the AudioFile source code and compile it into a static library
$ git clone https://github.com/adamstark/AudioFile
$ cd AudioFile
$ mkdir build
$ cd build
$ g++ -ansi -pedantic -Werror -O3 -std=c++17 -fPIC -fext-numeric-literals -ffast-math -c ../*.cpp
$ ar rcs libaudiofile.a *.o
Once these commands are done, libaudiofile.a will be generated under the build folder.
Copy the header file src/AudioFile.h and libaudiofile.a to the ./include and ./lib folders of this repository, respectively.
3.1.2 Format: .amr (AMR)
OpenAMR WB/NB
https://sourceforge.net/projects/opencore-amr/files/opencore-amr/
3.1.3 Format: .mp3 (MP3)
MP3 decoder
https://github.com/lieff/minimp3
3.2 Resample
3.2.1 Sample Rate Converter
$ git clone https://github.com/anthonix/libsamplerate.git
$ cd libsamplerate
$ echo "set(CMAKE_C_FLAGS \"\${CMAKE_C_FLAGS} -fPIC\")" >> CMakeLists.txt
$ mkdir build
$ cd build
$ cmake ..
$ make
3.3 FFT and MFCC
3.3.1 Compile the FFTS static library
$ git clone https://github.com/anthonix/ffts.git
$ cd ffts
$ echo "set(CMAKE_C_FLAGS \"\${CMAKE_C_FLAGS} -fPIC\")" >> CMakeLists.txt
$ mkdir build
$ cd build
$ cmake ..
$ make ffts_static
NOTE:
The -fPIC flag must be enabled when compiling the FFTS static library so that it is built as position-independent code. Otherwise, linking it into a shared object fails with the following error:
/usr/bin/ld: ../lib/libffts.a(ffts.c.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile with -fPIC
../lib/libffts.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make: *** [Release/obj.target/ssrc.node] Error 1
Once these commands are done, libffts.a will be generated under the build folder.
Copy the header file include/ffts.h and libffts.a to the ./include and ./lib folders of this repository, respectively.
3.3.2 MFCC computation
The code is written based on these great documents. Thanks to the authors.
- [1] Mel Frequency Cepstral Coefficient (MFCC) tutorial. James Lyons.
- [2] Speech Processing for Machine Learning: Filter banks, Mel-Frequency Cepstral Coefficients (MFCCs) and What's In-Between. Haytham Fayek.
- [3] Hamming Window, SPECTRAL AUDIO SIGNAL PROCESSING. Julius Orion Smith III.
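To illustrate two building blocks covered by the documents above, here is a minimal JavaScript sketch of a Hamming window and of the Hz-to-mel conversion used to place filter-bank centers. The function names (hammingWindow, hzToMel, melToHz, melCenterFrequencies) are hypothetical and are not part of this package's API; the package's own implementation may differ in constants and details.

```javascript
// Hypothetical helpers for illustration; not part of the audio-processing API.

// Hamming window of length n: w[i] = 0.54 - 0.46 * cos(2*pi*i / (n - 1)).
function hammingWindow(n) {
  if (n === 1) return [1];
  const w = new Array(n);
  for (let i = 0; i < n; i++) {
    w[i] = 0.54 - 0.46 * Math.cos((2 * Math.PI * i) / (n - 1));
  }
  return w;
}

// Convert Hz to mel using the common 2595 * log10(1 + f / 700) form.
function hzToMel(hz) {
  return 2595 * Math.log10(1 + hz / 700);
}

// Inverse of hzToMel.
function melToHz(mel) {
  return 700 * (Math.pow(10, mel / 2595) - 1);
}

// Center frequencies (Hz) of numFilters triangular filters spaced evenly in mel.
function melCenterFrequencies(numFilters, lowHz, highHz) {
  const lowMel = hzToMel(lowHz);
  const highMel = hzToMel(highHz);
  const step = (highMel - lowMel) / (numFilters + 1);
  const centers = [];
  for (let i = 1; i <= numFilters; i++) {
    centers.push(melToHz(lowMel + i * step));
  }
  return centers;
}

// Example: a 25 ms window at 16 kHz and 26 filters up to the Nyquist frequency.
const exampleWindow = hammingWindow(400);
const exampleCenters = melCenterFrequencies(26, 0, 8000);
console.log(exampleWindow.length, exampleCenters.length);
```

Each frame is multiplied by the window before the FFT, and the filter centers are then mapped to FFT bins to build the triangular filters. The frame length and filter count in the example are illustrative values only.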
3.4 Audio features
3.4.1 Pitch
Compile the Pitch-Detection static library
$ git clone https://github.com/sevagh/pitch-detection.git
$ cd pitch-detection
$ mkdir build
$ cd build
$ g++ -I../include -ansi -pedantic -Werror -Wall -O3 -std=c++17 -fPIC -fext-numeric-literals -ffast-math -c ../src/*.cpp
$ ar rcs --plugin $(gcc --print-file-name=liblto_plugin.so) libpitch_detection.a *.o
Once these commands are done, libpitch_detection.a will be generated under the build folder.
Copy the header file include/pitch_detection.h and libpitch_detection.a to the ./include and ./lib folders of this repository, respectively.
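For readers unfamiliar with frequency-domain pitch estimation, the following is a minimal JavaScript sketch of the Goertzel algorithm, which measures signal energy at a single frequency, plus a naive scan over candidate frequencies. It is an illustration only: goertzelPower and estimatePitch are hypothetical names, not functions of this package or of the Pitch-Detection library, and the actual detectors may work differently.

```javascript
// Hypothetical helpers for illustration; not part of the audio-processing API.

// Goertzel algorithm: squared magnitude of the signal at one target frequency.
function goertzelPower(samples, sampleRate, targetHz) {
  const omega = (2 * Math.PI * targetHz) / sampleRate;
  const coeff = 2 * Math.cos(omega);
  let sPrev = 0;
  let sPrev2 = 0;
  for (const x of samples) {
    const s = x + coeff * sPrev - sPrev2;
    sPrev2 = sPrev;
    sPrev = s;
  }
  return sPrev * sPrev + sPrev2 * sPrev2 - coeff * sPrev * sPrev2;
}

// Scan candidate frequencies and return the one carrying the most energy.
function estimatePitch(samples, sampleRate, minHz, maxHz, stepHz) {
  let bestHz = minHz;
  let bestPower = -Infinity;
  for (let hz = minHz; hz <= maxHz; hz += stepHz) {
    const power = goertzelPower(samples, sampleRate, hz);
    if (power > bestPower) {
      bestPower = power;
      bestHz = hz;
    }
  }
  return bestHz;
}

// Example: a synthetic 440 Hz sine sampled at 8 kHz.
const toneSampleRate = 8000;
const tone = Array.from({ length: 2048 }, (_, n) =>
  Math.sin((2 * Math.PI * 440 * n) / toneSampleRate)
);
const tonePitchHz = estimatePitch(tone, toneSampleRate, 100, 1000, 5);
console.log(tonePitchHz);
```

Picking the strongest frequency works for a clean sine but can lock onto a harmonic for real voices and instruments, which is one reason dedicated pitch-detection algorithms exist.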
3.5 Noise reduction
3.5.1 Wiener filter for noise reduction and speech enhancement
Pascal Scalart. Wiener Noise Suppressor based on Decision-Directed method with TSNR and HRNR algorithms.
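As a rough illustration of the decision-directed idea named in the reference above, the sketch below computes per-bin Wiener gains for one frame: the a priori SNR is a weighted mix of the previous frame's clean-speech estimate and the current a posteriori SNR, and the gain is SNR / (1 + SNR). This covers only the basic decision-directed step, not the TSNR or HRNR refinements. The function name, its parameters, and the smoothing factor 0.98 are assumptions made for this example and are not taken from this package.

```javascript
// Hypothetical helper for illustration; not part of the audio-processing API.
// Computes per-bin Wiener gains for one frame with the decision-directed rule.
//   noisyPower[k]     : |X(k)|^2 of the current noisy frame
//   noisePower[k]     : estimated noise power for bin k
//   prevCleanPower[k] : |S_hat(k)|^2 from the previous frame (zeros at start)
//   beta              : smoothing factor (0.98 is an assumed, commonly used value)
function wienerGainsDecisionDirected(noisyPower, noisePower, prevCleanPower, beta = 0.98) {
  const gains = new Array(noisyPower.length);
  for (let k = 0; k < noisyPower.length; k++) {
    const noise = Math.max(noisePower[k], 1e-12); // avoid division by zero
    const posteriorSnr = noisyPower[k] / noise;
    const prioriSnr =
      beta * (prevCleanPower[k] / noise) +
      (1 - beta) * Math.max(posteriorSnr - 1, 0);
    gains[k] = prioriSnr / (1 + prioriSnr); // Wiener gain in [0, 1)
  }
  return gains;
}

// Example: bin 0 is dominated by speech, bin 1 contains only noise.
const exampleGains = wienerGainsDecisionDirected([100, 1], [1, 1], [90, 0]);
console.log(exampleGains);
```

In a full pipeline, each gain multiplies the corresponding noisy spectrum bin, and the power of the resulting enhanced bin becomes prevCleanPower for the next frame.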