When we got the request to build an app to measure the RT60 value with an iPhone, my first thought was: “RT60, you say?” Right. I had no idea what it was, let alone how to measure it. I was told it involved measuring a “machine gun” sound, followed by some simple Excel-style calculation to get the RT60 result.
Well, it turned out to be a little different. But we dug in deep and shot on sight. This is the story of a battle against Fourier Transformations, ANSI S1.11-2004 and audio ring buffering.
First step: Wikipedia. I learned about the length of the sound decay in a room, caused by echoes. We’re not talking about the kind of echoes you know from the Alps here, but echoes that arrive very quickly because they are generated inside a room or office. Instead of distinct echoes, we’re talking about a constant decay of “the sound”. RT60 is basically the time it takes for the sound to decay by 60 dB. Smart people can guess what RT30 and RT15 will be. 😉
Step 2: we had to measure the sound levels and then monitor the decay over a certain time. “Easy!” I thought. But after a visit to the sound experts, things turned out to be a little more complicated. We didn’t need a single RT60 figure; we were interested in the RT60 values across the full audible frequency range. In other words, we needed to transform the time recording into the frequency domain. That immediately triggered a vague memory of “Fourier transforms”, something I once learned at university, never to use again… or so I thought.
Next: capture the time recording. Since we’re looking at time durations, the latency had to be very small. This means we needed to use the lowest level of audio services in iOS. Goodbye to all the convenient libraries, and start up the cold-turkey C-programming machine! Extra bottleneck: an iPhone is essentially designed to record a normal conversation, and the hardware reflects that restriction. For the RT60 application this meant dealing with poor frequency response flatness. Pimping the iPhone with an external microphone helped a lot.
Step 4: frequencies. Luckily there is native support for the Fast Fourier Transform, so after some fiddling we got the FFT running. But then I realised that we required octave bands. In octave bands, the centre frequency doubles with each successive band. With an FFT, we get frequency bins evenly spread over the whole spectrum. This means that, when trying to fit them to octave bands, the higher bands will get lots of frequency bins but the lower bands hardly get any. On top of that, some frequency bins will straddle two bands, which leads to spectral leakage. (In case you’re still reading: “Congratz!”)
“Tackling the octave bands” was my next challenge. Instead of relying on the FFT, I chose to get octave band levels by band-pass filtering. There is an official specification (ANSI S1.11-2004) which defines the minimum and maximum attenuation (loss of intensity) for each octave band. When it comes to band-pass filters, there are a few options: Butterworth, Bessel, Chebyshev, Elliptic. Since the specification is rather stringent about pass-band flatness, the first two seemed the best candidates. The required attenuation slope demands at least a 4th-order filter. Since iOS has good support for biquad filter calculations, the most efficient way to implement the real-time filters was a cascade of two biquad filters. The result is that we’re not quite meeting the specs 100%, but we’re getting really close. With a bit of tweaking we can probably get even closer.
Final stage: wrapping it up and making it work as a real-time system. With all the octave band filtering calculations we needed to perform, the key goal was not to lose any recorded audio buffer. A ring buffer solves this issue: while the recorder writes fresh data to the ring buffer at the sampling rate, the analyser pulls samples from the ring buffer at its own pace, as long as it keeps up well enough to avoid a buffer overrun. Note: when there is a hiccup in the analysing speed (especially while writing to disk), there is no immediate risk of data loss. In practice, we see that this works great.
The result is an application with a slick interface including some very sweet touch gestures. Actually, we (and our client) are convinced the app is a real showpiece.
What’s left? The validation of the design. We’ll meet up with the audio experts again this week and will put the app to the test in a professional recording studio. Yep, the big test.
This post describes the fun we have in the work we do. It’s about accepting challenges and trying new things every day. And when the mission is accomplished, we’re all proud & happy (and ready for some cold beer).
[written by Bas Pellis, partner at WebComrades]