Recording, syncing and exporting web audio

I'm currently working on a side project that lets musicians collaborate and record music. Building this web app meant overcoming several technical challenges, which I'll walk through here!


1) Supporting audio recording/playback
The first challenge is supporting web audio playback for as many people as possible, bearing in mind that Web Audio technologies are fairly new to the scene and not all features are supported everywhere. The first step is to look at the supported browsers for each technology:
http://caniuse.com/#search=audio

I settled on Web Audio as it is supported by the most popular browsers for desktop and mobile, Chrome, Firefox and Safari. It also provides an array of advanced features which I could use to make my web app better!

For my version I decided to use Recorder.js to help with the Web Audio API. It has several advantages, such as doing its work in Web Workers, which helps prevent glitches and jumps in the audio recording.

You can get Recorder.js here:
https://github.com/mattdiamond/Recorderjs

Here is my code to check for Web Audio API support and then set up Recorder.js:

var me = this;
window.onload = function () {
    // check for web audio support
    try {
        window.AudioContext = window.AudioContext || window.webkitAudioContext  || window.mozAudioContext || window.msAudioContext;
        navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia || navigator.msGetUserMedia;
        window.URL = window.URL || window.webkitURL || window.mozURL  || window.msURL;
        me.context = new window.AudioContext();
        me.context.createGain = me.context.createGain || me.context.createGainNode;
    } catch (e) {
        window.alert('Your browser does not support WebAudio, try Google Chrome');
        return; // no audio context, so don't go on to request the microphone
    }

    // if recording is supported then load Recorder.js
    if (navigator.getUserMedia) {
        navigator.getUserMedia({audio: true}, function (stream) {
            var input = me.context.createMediaStreamSource(stream);
            me.recorder = new Recorder(input);
        }, function (e) {
            window.alert('Please enable your microphone to begin recording');
        });
    } else {
        window.alert('Your browser does not support recording, try Google Chrome');
    }
};
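
The same checks can also be pulled into a small function that takes the window and navigator objects as arguments, which makes them easy to try out without a browser. This is only a sketch: detectAudioSupport is a hypothetical helper name, and the extra check for navigator.mediaDevices.getUserMedia (the newer promise-based API) is an addition that the code above does not make.

```javascript
// Hypothetical helper: reports which audio features a browser exposes.
// `win` and `nav` would normally be `window` and `navigator`.
function detectAudioSupport(win, nav) {
    var AudioContextClass = win.AudioContext || win.webkitAudioContext,
        legacyGetUserMedia = nav.getUserMedia || nav.webkitGetUserMedia ||
            nav.mozGetUserMedia || nav.msGetUserMedia,
        // assumption: also accept the newer promise-based API if present
        modernGetUserMedia = nav.mediaDevices && nav.mediaDevices.getUserMedia;
    return {
        playback: typeof AudioContextClass === 'function',
        recording: typeof (modernGetUserMedia || legacyGetUserMedia) === 'function'
    };
}
```

In the browser you would call it as detectAudioSupport(window, navigator) and show the relevant alert when playback or recording comes back false.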


2) Preloading audio files
The next challenge is how to preload audio files in the background. This allows them to be played on demand, e.g. in time with a button click or another sound. You can accomplish this by loading the file with a normal XHR, then using the Web Audio API to decode the data into an audio buffer for fast playback:

cue: function (url, callback) {
    // abort the previous ajax request
    var me = this;
    if (this.request) {
        this.request.abort();
    } else {
        this.request = new XMLHttpRequest();
    }
    this.request.open('GET', url, true);
    this.request.responseType = 'arraybuffer';
    this.request.onload = function () {
        // convert data into audio buffer
        me.context.decodeAudioData(me.request.response, function (buffer) {
            callback(buffer);
        });
    };
    this.request.send();
}
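
Because cue aborts any request that is still in flight, it only ever loads one file at a time. If you need several files ready before playback starts, one option is to count the callbacks until they have all returned. The sketch below assumes a loader with the same (url, callback) shape as cue but using its own request per file; preloadAll and loadOne are hypothetical names.

```javascript
// Hypothetical helper: loads every URL with `loadOne(url, callback)` and
// calls `done` once, with the results in the same order as `urls`.
function preloadAll(urls, loadOne, done) {
    var results = [],
        remaining = urls.length;
    if (remaining === 0) {
        done(results);
        return;
    }
    urls.forEach(function (url, index) {
        loadOne(url, function (buffer) {
            results[index] = buffer; // keep the original ordering
            remaining -= 1;
            if (remaining === 0) {
                done(results);
            }
        });
    });
}
```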

3) Playing and looping audio
Next we want to play back and loop the audio buffers! In this example I am also recording the start time of the audio. This is important for later on when we want to sync another audio file to this one:

play: function (data, callback) {
    // create audio node and play buffer
    var me = this,
        source = this.context.createBufferSource(),
        gainNode = this.context.createGain();
    if (!source.start) { source.start = source.noteOn; }
    if (!source.stop) { source.stop = source.noteOff; }
    source.connect(gainNode);
    gainNode.connect(this.context.destination);
    source.buffer = data;
    source.loop = true;
    source.startTime = this.context.currentTime; // important for later!
    source.start(0);
    return source;
}


4) Recording audio
Now that we have a backing track looping, we want to be able to record a vocal over the top. For this we are using Recorder.js and its Web Workers. Again, I'm manually saving the start time for use later on:

record: function () {
    // start recording using Recorder.js
    this.recorder.clear();
    this.recorder.startTime = this.context.currentTime;
    this.recorder.record();
}


5) Stop recording and export buffer
When we stop the recording, we can export the buffer using Recorder.js, ready for playback:

recordStop: function (callback) {
    // stop recording and get the recorded buffer data
    this.recorder.stop();
    this.recorder.getBuffer(function (buffers) {
        callback(buffers);
    });
}


6) Play audio in sync with another file
We now need to time the recorded audio buffer to play back exactly in time with the loop. How do we do that? I created a sync function which works out how far we are through the current pass of the loop (the time elapsed since the loop started, modulo the loop length) and subtracts that from the loop length to get the time left until the loop restarts. This uses the custom start time variable from the play function in step 3. It then runs the function you pass in after that delay, to put it in sync with the loop.

sync: function (action, target, param, runThis) {
    // calculate difference between current time and loop length
    var me = this,
        offset = (this.context.currentTime - target.startTime) % target.buffer.duration,
        time = target.buffer.duration - offset;
    // clear previous timers then run function after time difference
    if (this.syncTimer) {
        window.clearTimeout(this.syncTimer);
    }
    this.syncTimer = window.setTimeout(function () {
        runThis();
    }, time * 1000);
}
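
To make the arithmetic concrete, here is the same calculation as a standalone function with a worked example. secondsUntilNextLoop is a hypothetical name used only for illustration.

```javascript
// Time remaining (in seconds) until a loop that began at `startTime`
// and lasts `loopDuration` seconds next wraps around.
function secondsUntilNextLoop(currentTime, startTime, loopDuration) {
    var elapsed = (currentTime - startTime) % loopDuration;
    return loopDuration - elapsed;
}

// A 4 second loop started at 10s. At 12.5s we are 2.5s into the loop,
// so the next restart is 1.5s away.
var wait = secondsUntilNextLoop(12.5, 10, 4); // 1.5
```

Browser timers are not sample-accurate, so an alternative worth considering is to pass context.currentTime plus this value as the first argument to source.start(), which schedules the sound on the audio clock. That is a different approach from the setTimeout version above.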

7) Adjusting for latency and small timing issues
Now we have working playback, where both audio files start at the same time. But there is another problem: different devices and browsers introduce a slight delay between the sound being played and the recording being captured, even though the audio files start at the same time! We need a way to tweak the recording, pushing it earlier or later to match the original loop.

There are several approaches to this, including getting the user to set the offset before recording and adjusting the timers. However, I opted to go for the best user experience, which is to complete the recording, then allow the user to adjust the recording sync dynamically during playback. Quite a tough challenge!

The way I achieved this was to force the recording to be shorter than the original loop length, then fill in the gaps at the start and end with blank audio. The size of these gaps can then be adjusted dynamically, based on a slider.

The first function creates an audio buffer from the raw audio data. This is required to work out the real length of the audio based on the playback sample rate:

createBuffer: function (buffers, channelTotal) {
    // create an audio buffer from data, so we can play it back and get the real length
    var channel = 0,
        buffer = this.context.createBuffer(channelTotal, buffers[0].length, this.context.sampleRate);
    for (channel = 0; channel < channelTotal; channel += 1) {
        buffer.getChannelData(channel).set(buffers[channel]);
    }
    return buffer;
}

The next function works out the difference in length (in samples) between the original loop and the recording. This gives us the number of blank samples needed before and after the recording. The offset (in milliseconds) that the user can change is added to the recording's start time, which moves samples from the end to the start or vice versa:

getOffset: function (vocalsRecording, backingInstance, offset) {
    // work out the difference in samples between the length of two recordings
    var diff = (this.recorder.startTime + (offset / 1000)) - backingInstance.startTime;
    return {
        before: Math.round((diff % backingInstance.buffer.duration) * this.context.sampleRate),
        after: Math.round((backingInstance.buffer.duration - ((diff + vocalsRecording.duration) % backingInstance.buffer.duration)) * this.context.sampleRate)
    };
}
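
A worked example helps to check the numbers. The sketch below repeats the same calculation with plain arguments instead of the recorder and audio node objects; computePadding and its parameter names are hypothetical.

```javascript
// Same arithmetic as getOffset, but with plain numbers as inputs.
// Times are in seconds, offsetMs is in milliseconds.
function computePadding(recordStart, loopStart, loopDuration, recordingDuration, sampleRate, offsetMs) {
    var diff = (recordStart + offsetMs / 1000) - loopStart;
    return {
        before: Math.round((diff % loopDuration) * sampleRate),
        after: Math.round((loopDuration - ((diff + recordingDuration) % loopDuration)) * sampleRate)
    };
}

// A 4s loop started at 10s; a 1s recording started at 12.5s; 44100 Hz.
// The recording begins 2.5s into the loop and ends 0.5s before it restarts.
var padding = computePadding(12.5, 10, 4, 1, 44100, 0);
// padding.before is 110250 samples (2.5s), padding.after is 22050 samples (0.5s)
// before + recording + after = 110250 + 44100 + 22050 = 176400 = 4s of audio
```

The padded recording comes out exactly as long as the loop, which is what keeps the two in step when both are looped.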

The last function takes the recorded audio data, and the number of before and after samples. It creates a new audio data array and fills in the blanks before and after the recording:

offsetBuffer: function (vocalsBuffers, before, after) {
    // create a new audio buffer and fill in the gaps before and after of the recording
    var i = 0,
        channel = 0,
        channelTotal = 2,
        num = 0,
        audioBuffer = this.context.createBuffer(channelTotal, before + vocalsBuffers[0].length + after, this.context.sampleRate),
        buffer = null;
    for (channel = 0; channel < channelTotal; channel += 1) {
        buffer = audioBuffer.getChannelData(channel);
        num = 0; // restart the write position for each channel
        for (i = 0; i < before; i += 1) {
            buffer[num] = 0;
            num += 1;
        }
        for (i = 0; i < vocalsBuffers[channel].length; i += 1) {
            buffer[num] = vocalsBuffers[channel][i];
            num += 1;
        }
        for (i = 0; i < after; i += 1) {
            buffer[num] = 0;
            num += 1;
        }
    }
    return audioBuffer;
}
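
Typed arrays start out filled with zeros, so the padding for a single channel can also be written more compactly by copying the recording into place with set(). This is a minimal sketch of the same idea working on one Float32Array at a time; padChannel is a hypothetical name.

```javascript
// Returns a new Float32Array with `before` silent samples, then the
// recording, then `after` silent samples.
function padChannel(samples, before, after) {
    var padded = new Float32Array(before + samples.length + after); // zero-filled
    padded.set(samples, before); // copy the recording in after the leading gap
    return padded;
}

var example = padChannel(new Float32Array([0.5, -0.5]), 2, 1);
// example holds [0, 0, 0.5, -0.5, 0]
```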

Whew... that's a lot of maths. But amazingly it works!

8) Exporting the modified audio to wav
Once the user is happy with the recording and the offset is lining up correctly, they will want to save it as an audio file. For this we need to export the modified audio data and adjust the sample rate to change the filesize. Recorder.js unfortunately doesn't support passing our modified audio through to its Web Worker, and it also doesn't support different sample rates, so we will need to modify recorderWorker.js to support our new features.

I added before and after params to the exportWAV function, and a rate variable that lets you drop the sample rate to reduce the quality/filesize:

var rate = 22050;

function exportWAV(type, before, after){
    if (!before) { before = 0; }
    if (!after) { after = 0; }

    var channel = 0,
        buffers = [];
    for (channel = 0; channel < numChannels; channel++){
        buffers.push(mergeBuffers(recBuffers[channel], recLength));
    }

    var i = 0,
        offset = 0,
        newbuffers = [];

    for (channel = 0; channel < numChannels; channel += 1) {
        offset = 0;
        newbuffers[channel] = new Float32Array(before + recLength + after);
        if (before > 0) {
            for (i = 0; i < before; i += 1) {
                newbuffers[channel].set([0], offset);
                offset += 1;
            }
        }
        newbuffers[channel].set(buffers[channel], offset);
        offset += buffers[channel].length;
        if (after > 0) {
            for (i = 0; i < after; i += 1) {
                newbuffers[channel].set([0], offset);
                offset += 1;
            }
        }
    }

    if (numChannels === 2){
        var interleaved = interleave(newbuffers[0], newbuffers[1]);
    } else {
        var interleaved = newbuffers[0];
    }

    var downsampledBuffer = downsampleBuffer(interleaved, rate);
    var dataview = encodeWAV(downsampledBuffer, rate);
    var audioBlob = new Blob([dataview], { type: type });

    this.postMessage(audioBlob);
}
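
The downsampleBuffer function called above is not shown here. To give a rough idea of what such a step involves, the following is a minimal sketch that lowers the sample rate of a single channel by averaging neighbouring samples. It is written under stated assumptions: downsampleChannel is a hypothetical name, it takes the source rate as an explicit argument, and it works on one channel rather than interleaved data, so it may differ from the helper used in the modified recorderWorker.js.

```javascript
// Reduces `samples` from `fromRate` to `toRate` by averaging each group
// of input samples that maps onto one output sample.
function downsampleChannel(samples, fromRate, toRate) {
    if (toRate >= fromRate) {
        return samples.slice(); // nothing to reduce, return a copy
    }
    var ratio = fromRate / toRate,
        outLength = Math.floor(samples.length / ratio),
        out = new Float32Array(outLength),
        i, j, start, end, sum;
    for (i = 0; i < outLength; i += 1) {
        start = Math.floor(i * ratio);
        end = Math.min(Math.floor((i + 1) * ratio), samples.length);
        sum = 0;
        for (j = start; j < end; j += 1) {
            sum += samples[j];
        }
        out[i] = sum / (end - start);
    }
    return out;
}

var halved = downsampleChannel(new Float32Array([0, 1, 2, 3, 4, 5, 6, 7]), 44100, 22050);
// halved holds [0.5, 2.5, 4.5, 6.5]
```

Averaging adjacent samples of an interleaved stereo buffer would blend the left and right channels together, which is why this sketch treats each channel separately.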

There you have it, a completed audio workflow!

If you want to have a play with some live code you can see my working demo here:
http://kmturley.github.io/Recorderjs/loop.html