Use sox from SoX to analyze a short audio sample:
sox -t .wav "|arecord -d 2" -n stat
With -t .wav we specify that we process the WAV type, "|arecord -d 2" runs the arecord program for two seconds, -n sends the output to the null file, and stat specifies that we want statistics.
The output of this command, on my system with some background speech, is:
Recording WAVE 'stdin' : Unsigned 8 bit, Rate 8000 Hz, Mono
Samples read:             16000
Length (seconds):      2.000000
Scaled by:         2147483647.0
Maximum amplitude:     0.312500
Minimum amplitude:    -0.421875
Midline amplitude:    -0.054688
Mean    norm:          0.046831
Mean    amplitude:    -0.000044
RMS     amplitude:     0.068383
Maximum delta:         0.414063
Minimum delta:         0.000000
Mean    delta:         0.021912
RMS     delta:         0.036752
Rough   frequency:          684
Volume adjustment:        2.370
The RMS amplitude can then be extracted via:
grep -e "RMS.*amplitude" | tr -d ' ' | cut -d ':' -f 2
We grep for the line we want, use tr to strip the space characters, then cut the line at the : character and take the second part, which gives us 0.068383 in this example. As suggested in the comments, RMS is a better measure of energy than maximum amplitude.
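As a minimal sketch, the extraction step can be tried without a microphone by feeding it a few lines copied from the output above. The helper name extract_rms is hypothetical. On a live recording, note that sox prints the stat report on stderr, so the pipeline needs a 2>&1 before the grep.

```shell
# A few lines copied from the sample `sox ... stat` output above.
sample_output='Maximum amplitude:     0.312500
Minimum amplitude:    -0.421875
Mean    norm:          0.046831
RMS     amplitude:     0.068383
RMS     delta:         0.036752'

# Hypothetical helper: read a stat report on stdin, print the RMS amplitude.
extract_rms() {
    grep -e "RMS.*amplitude" | tr -d ' ' | cut -d ':' -f 2
}

# Offline check against the sample text.
rms=$(printf '%s\n' "$sample_output" | extract_rms)
echo "$rms"

# Live version (needs sox and arecord); stat goes to stderr, hence 2>&1:
#   sox -t .wav "|arecord -d 2" -n stat 2>&1 | extract_rms
```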
Finally, you can use bc on the result to compare floating-point values from the command line:
if (( $(echo "$value > $threshold" | bc -l) )) ; # ...
If you build a loop (see Bash examples) that calls sleep for 1 minute, tests the volume, and then repeats, you can leave it running in the background. The last step is to add it to the init scripts or service files (depending on your OS / distro), such that you do not even have to launch it manually.
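A rough sketch of such a loop follows. The threshold, the interval, and the function names (measure_rms, is_louder, on_loud, monitor_loop) are assumptions for illustration, and the comparison uses awk rather than bc, only so that the sketch does not depend on bc being installed.

```shell
# Assumed values; tune both for your room and microphone.
THRESHOLD=0.05   # RMS amplitude above which we react
INTERVAL=60      # seconds to sleep between measurements

# Record two seconds and print the RMS amplitude.
# sox writes the stat report to stderr, hence the 2>&1.
measure_rms() {
    sox -t .wav "|arecord -d 2" -n stat 2>&1 |
        grep -e "RMS.*amplitude" | tr -d ' ' | cut -d ':' -f 2
}

# Succeed (exit status 0) when $1 is numerically greater than $2.
is_louder() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v > t) }'
}

# Placeholder action (hypothetical); replace with your own command.
on_loud() {
    echo "volume $1 exceeded threshold $THRESHOLD"
}

# Measure, react if too loud, sleep, repeat.
monitor_loop() {
    while true; do
        rms=$(measure_rms)
        if is_louder "$rms" "$THRESHOLD"; then
            on_loud "$rms"
        fi
        sleep "$INTERVAL"
    done
}

# To leave it running in the background, call:
#   monitor_loop &
```

The loop is wrapped in a function and not started automatically, so the file can be sourced and the pieces tried one at a time before wiring it into an init script or service file.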
Here’s how it can be done with Pure Data:
metro is a metronome, and “metro 100” sends a bang every 100 ms.
The audio comes from adc~, and the volume is calculated by env~. “pd dsp 0” turns off the DSP when banged, and “pd dsp 1” turns it on. “shell” executes the passed command in a shell. I use the xrandr command (X11) to set the brightness to X; you will need to adapt this for Wayland.
As you can see, the grace period and locking take up far more space than the audio code does.
Building a solution with ring buffers and/or moving averages should be much easier than doing it with sox, so I don’t think it’s a bad idea to use Pure Data for this. But the screen blanking itself and the locking don’t fit the dataflow paradigm.
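For comparison, here is a small shell sketch of what a moving average over the last few RMS readings could look like when done by hand on the sox side. The helper name moving_average and the window size are hypothetical, and this is not taken from the Pure Data patch.

```shell
# Hypothetical helper: the first argument is the window size, the rest are
# readings (oldest first). Prints the mean of the most recent readings.
moving_average() {
    window=$1
    shift
    printf '%s\n' "$@" | tail -n "$window" |
        awk '{ sum += $1 } END { if (NR) printf "%.6f\n", sum / NR }'
}

# Average of the two most recent of four readings.
moving_average 2 0.1 0.2 0.3 0.4
```

Smoothing like this keeps a single short spike from triggering the action, at the cost of carrying the recent readings around between measurements.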
The PD file is at gist.github.com: ysangkok – kidsyell.pd.
Check “How to detect the presence of sound/audio” by Thomer M. Gil.
Basically, it records the sound every 5 seconds, then checks the sound amplitude using sox and decides whether to trigger a script. I think you can easily adapt the Ruby script for your children! Or you can choose to hack away on the Python script (using PyAudio) that he has provided as well.