Stefan Popp

Software Development, Trainer & Consultant

Capture iPhone microphone

By Stefan Popp | November 7, 2011 | 11 Comments

[Screenshot: the audio example app]

Processing input as stream

Before you start, you should already have some basic knowledge of Objective-C and C. Sometimes you need more than just a basic recording of external sounds. In my case I needed the direct input stream from the iPhone microphone or headset.

There are many cases where you want to process the stream on the fly, for things like echoes or fancy delayed effects on the input signal. You can make funny voices or the next Shazam. It is up to you, and here is the first step for getting a callback on the stream.

It is pretty hard to find information about common techniques for processing the audio stream on iOS. This example, with a sample project available on GitHub, is just a start. It gives you a basic overview of the things you have to build and covers a simple gain stage to boost the audio input. Since iOS 5 you no longer need to write your own gain function, because the API now offers direct control over the input level.
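If you only need a plain input boost on iOS 5 or later, a minimal sketch using the Audio Session input gain property could look like the following. This is an illustration, not part of the sample project, and it assumes the audio session has already been initialized and that the current input route actually supports gain control:

#import <AudioToolbox/AudioToolbox.h>

// Raise the hardware input gain to 80% if the current route supports it (iOS 5+).
static void boostInputGainIfAvailable(void)
{
    UInt32 gainAvailable = 0;
    UInt32 size = sizeof(gainAvailable);
    AudioSessionGetProperty(kAudioSessionProperty_InputGainAvailable, &size, &gainAvailable);

    if (gainAvailable) {
        Float32 inputGain = 0.8f; // valid range is 0.0 ... 1.0
        AudioSessionSetProperty(kAudioSessionProperty_InputGainScalar, sizeof(inputGain), &inputGain);
    }
}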

Project Setup

Just start with an empty or window-based project. You will only need one class to manage your signals. In my example project the view controller holds an AudioProcessor object, which contains all the functionality: starting, stopping, and configuring the AU.
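To give you an idea how the class is used, here is a minimal sketch of a view controller driving it. The class name and button actions are made up for illustration; check the project on GitHub for the real wiring:

// View controller sketch, not part of AudioProcessor itself
#import <UIKit/UIKit.h>
#import "AudioProcessor.h"

@interface AudioViewController : UIViewController
{
    AudioProcessor *audioProcessor;
}
@end

@implementation AudioViewController

- (void)viewDidLoad
{
    [super viewDidLoad];
    // creating the processor also configures the audio unit
    audioProcessor = [[AudioProcessor alloc] init];
}

- (IBAction)startPressed:(id)sender
{
    [audioProcessor setGain:2.0]; // small input boost
    [audioProcessor start];       // start pulling data from the microphone
}

- (IBAction)stopPressed:(id)sender
{
    [audioProcessor stop];
}

@end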

In our case the header file is named AudioProcessor.h

//
//  AudioProcessor.h
//
//  Created by Stefan Popp on 21.09.11.
//  Copyright 2011 www.stefanpopp.de . All rights reserved.
//

#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#import <AudioUnit/AudioUnit.h>

// return max value for given values
#define max(a, b) (((a) > (b)) ? (a) : (b))
// return min value for given values
#define min(a, b) (((a) < (b)) ? (a) : (b))

#define kOutputBus 0
#define kInputBus 1

// our default sample rate
#define SAMPLE_RATE 44100.00

We use the Audio Unit API, which ships with the AudioToolbox framework. kInputBus and kOutputBus later identify the microphone (bus 1) and the default output speaker (bus 0) of the RemoteIO unit.
The sample rate is 44.1 kHz.
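To keep the bus numbers straight, here is the mental model for the RemoteIO unit (a conceptual sketch, not code from the project):

/*
 RemoteIO audio unit, conceptually:

   microphone        -> [ bus 1 (kInputBus)  ] -> our recordingCallback
   playbackCallback  -> [ bus 0 (kOutputBus) ] -> speaker

 kInputBus addresses the microphone element, kOutputBus the speaker element.
*/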

@interface AudioProcessor : NSObject
{
    // Audio unit
    AudioComponentInstance audioUnit;

    // Audio buffers
    AudioBuffer audioBuffer;

    // gain
    float gain;
}

@property (readonly) AudioBuffer audioBuffer;
@property (readonly) AudioComponentInstance audioUnit;
@property (nonatomic) float gain;

Our class needs an audio unit instance, an input buffer and, for our DSP (digital signal processing) section, a variable that holds the current gain multiplier.

-(AudioProcessor*)init;

-(void)initializeAudio;
-(void)processBuffer: (AudioBufferList*) audioBufferList;

// control object
-(void)start;
-(void)stop;

// gain
-(void)setGain:(float)gainValue;
-(float)getGain;

// error managment
-(void)hasError:(int)statusCode:(char*)file:(int)line;

Record callback

Besides the initializers, I implemented a simple error check method and start and stop functions for the AU.
The implementation is a lot more complex and I commented it heavily. If something is unclear, consult the Apple documentation, even if it is not the best documentation on this topic.


#import "AudioProcessor.h"

#pragma mark Recording callback

static OSStatus recordingCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData) {

    // the data gets rendered here
    AudioBuffer buffer;

    // a variable where we check the status
    OSStatus status;

    /**
    This is the reference to the object that owns the callback.
    */
    AudioProcessor *audioProcessor = (AudioProcessor*) inRefCon;

    /**
    At this point we define the number of channels, which is mono
    for the iPhone. The number of frames is usually 512 or 1024.
    */
    buffer.mDataByteSize = inNumberFrames * 2; // 2 bytes per sample (16-bit mono)
    buffer.mNumberChannels = 1; // one channel
    buffer.mData = malloc( inNumberFrames * 2 ); // buffer size in bytes

    // we put our buffer into a bufferlist array for rendering
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0] = buffer;

    // render input and check for error
    status = AudioUnitRender([audioProcessor audioUnit], ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames,     &bufferList);
    [audioProcessor hasError:status:__FILE__:__LINE__];

    // process the bufferlist in the audio processor
    [audioProcessor processBuffer:&bufferList];

    // clean up the buffer
    free(bufferList.mBuffers[0].mData);

    return noErr;
}

Playback callback

The recording callback is called every time new packets are available. The stream has to be rendered by the audio unit before it can be processed further in the processBuffer function of the audio processor object. At the end the buffer has to be freed to avoid memory leaks.

#pragma mark Playback callback

static OSStatus playbackCallback(void *inRefCon,
								 AudioUnitRenderActionFlags *ioActionFlags,
								 const AudioTimeStamp *inTimeStamp,
								 UInt32 inBusNumber,
								 UInt32 inNumberFrames,
								 AudioBufferList *ioData) {

    /**
     This is the reference to the object that owns the callback.
     */
    AudioProcessor *audioProcessor = (AudioProcessor*) inRefCon;

    // iterate over the incoming stream and copy it to the output stream
	for (int i=0; i < ioData->mNumberBuffers; i++) {
		AudioBuffer buffer = ioData->mBuffers[i];

                // find minimum size
		UInt32 size = min(buffer.mDataByteSize, [audioProcessor audioBuffer].mDataByteSize);

                // copy buffer to audio buffer which gets played after function return
		memcpy(buffer.mData, [audioProcessor audioBuffer].mData, size);

                // set data size
		buffer.mDataByteSize = size;
    }
    return noErr;
}

The playback callback only matters if you want to loop the processed signal back to the output. That is pretty useful for debugging your processed buffers.
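If you only want to record and process without hearing yourself, one option is to keep the render callback in place but hand silence to the output. A small sketch of such a variant (my addition, not part of the original project):

// Alternative playback callback that plays silence instead of the loopback
static OSStatus silentPlaybackCallback(void *inRefCon,
                                       AudioUnitRenderActionFlags *ioActionFlags,
                                       const AudioTimeStamp *inTimeStamp,
                                       UInt32 inBusNumber,
                                       UInt32 inNumberFrames,
                                       AudioBufferList *ioData) {
    // zero every output buffer so nothing audible is rendered
    for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
        memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
    }
    return noErr;
}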

#pragma mark objective-c class

@implementation AudioProcessor
@synthesize audioUnit, audioBuffer, gain;

-(AudioProcessor*)init
{
    self = [super init];
    if (self) {
        gain = 0;
        [self initializeAudio];
    }
    return self;
}

Audio component description

I don't think this part needs a lot of explanation; the code below is well commented. If you have any questions, take a look at the Core Audio documentation =). The code below describes the input and output processing: we define the channel properties such as the input format, the audio unit is created from our component description, and it gets initialized at the bottom.

-(void)initializeAudio
{
        OSStatus status;

	// We define the audio component
	AudioComponentDescription desc;
	desc.componentType = kAudioUnitType_Output; // we want output
	desc.componentSubType = kAudioUnitSubType_RemoteIO; // we want input and output
	desc.componentFlags = 0; // must be zero
	desc.componentFlagsMask = 0; // must be zero
	desc.componentManufacturer = kAudioUnitManufacturer_Apple; // select provider

	// find the AU component by description
	AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);

	// create audio unit by component
	status = AudioComponentInstanceNew(inputComponent, &audioUnit);

	[self hasError:status:__FILE__:__LINE__];

        // enable IO for recording on the input bus
        UInt32 flag = 1;
	status = AudioUnitSetProperty(audioUnit,
								  kAudioOutputUnitProperty_EnableIO, // use io
								  kAudioUnitScope_Input, // scope to input
								  kInputBus, // select input bus (1)
								  &flag, // set flag
								  sizeof(flag));
	[self hasError:status:__FILE__:__LINE__];

	// enable IO for playback on the output bus
	status = AudioUnitSetProperty(audioUnit,
								  kAudioOutputUnitProperty_EnableIO, // use io
								  kAudioUnitScope_Output, // scope to output
								  kOutputBus, // select output bus (0)
								  &flag, // set flag
								  sizeof(flag));
	[self hasError:status:__FILE__:__LINE__];

	/*
         We need to specify the format we want to work with.
         We use linear PCM because it is uncompressed and we work on the raw data.

         We want 16 bits, 2 bytes per frame, one frame per packet, at 44.1 kHz.
        */
	AudioStreamBasicDescription audioFormat;
	audioFormat.mSampleRate			= SAMPLE_RATE;
	audioFormat.mFormatID			= kAudioFormatLinearPCM;
	audioFormat.mFormatFlags		= kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
	audioFormat.mFramesPerPacket	= 1;
	audioFormat.mChannelsPerFrame	= 1;
	audioFormat.mBitsPerChannel		= 16;
	audioFormat.mBytesPerPacket		= 2;
	audioFormat.mBytesPerFrame		= 2;

	// set the format on the output scope of the input bus (the data we read from the mic)
	status = AudioUnitSetProperty(audioUnit,
								  kAudioUnitProperty_StreamFormat,
								  kAudioUnitScope_Output,
								  kInputBus,
								  &audioFormat,
								  sizeof(audioFormat));

	[self hasError:status:__FILE__:__LINE__];

        // set the format on the input scope of the output bus (the data we hand to the speaker)
	status = AudioUnitSetProperty(audioUnit,
								  kAudioUnitProperty_StreamFormat,
								  kAudioUnitScope_Input,
								  kOutputBus,
								  &audioFormat,
								  sizeof(audioFormat));
	[self hasError:status:__FILE__:__LINE__];

        /**
        We need to define a callback structure which holds
        a pointer to the recordingCallback and a reference to
        the audio processor object
        */
	AURenderCallbackStruct callbackStruct;

        // set recording callback
	callbackStruct.inputProc = recordingCallback; // recordingCallback pointer
	callbackStruct.inputProcRefCon = self;

        // set input callback to recording callback on the input bus
	status = AudioUnitSetProperty(audioUnit,
                                  kAudioOutputUnitProperty_SetInputCallback,
								  kAudioUnitScope_Global,
								  kInputBus,
								  &callbackStruct,
								  sizeof(callbackStruct));

        [self hasError:status:__FILE__:__LINE__];

        /*
         We do the same on the output stream to hear what is coming
         from the input stream
         */
	callbackStruct.inputProc = playbackCallback;
	callbackStruct.inputProcRefCon = self;

        // set playbackCallback as callback on our renderer for the output bus
	status = AudioUnitSetProperty(audioUnit,
								  kAudioUnitProperty_SetRenderCallback,
								  kAudioUnitScope_Global,
								  kOutputBus,
								  &callbackStruct,
								  sizeof(callbackStruct));
	[self hasError:status:__FILE__:__LINE__];

        // reset flag to 0
	flag = 0;

        /*
         Setting this flag to 0 tells the audio unit not to allocate its own
         render buffer on the input bus. We pass our own buffer to AudioUnitRender
         in the recording callback, so we can write into it directly.
         */
	status = AudioUnitSetProperty(audioUnit,
								  kAudioUnitProperty_ShouldAllocateBuffer,
								  kAudioUnitScope_Output,
								  kInputBus,
								  &flag,
								  sizeof(flag));

        /*
         We set the number of channels to mono and allocate our own buffer of
         512 samples (1024 bytes).
        */
	audioBuffer.mNumberChannels = 1;
	audioBuffer.mDataByteSize = 512 * 2;
	audioBuffer.mData = malloc( 512 * 2 );

	// Initialize the Audio Unit and cross fingers =)
	status = AudioUnitInitialize(audioUnit);
	[self hasError:status:__FILE__:__LINE__];

       NSLog(@"Started");

}

AudioUnit control

I need some control over the AU, so I added start and stop functions to the class.

#pragma mark control stream

-(void)start
{
    // start the audio unit. You should hear something, hopefully :)
    OSStatus status = AudioOutputUnitStart(audioUnit);
    [self hasError:status:__FILE__:__LINE__];
}
-(void)stop
{
    // stop the audio unit
    OSStatus status = AudioOutputUnitStop(audioUnit);
    [self hasError:status:__FILE__:__LINE__];
}

These two methods just expose the gain so it can be set and read from outside.

-(void)setGain:(float)gainValue
{
    gain = gainValue;
}

-(float)getGain
{
    return gain;
}

Audio stream manipulation

I am not a fan of splitting a function across a blog post, so here is the audio buffer processor with the explanation in the comments. Hopefully in better English than the rest of this post ;)

#pragma mark processing

-(void)processBuffer: (AudioBufferList*) audioBufferList
{
    AudioBuffer sourceBuffer = audioBufferList->mBuffers[0];

    // we check here if the input data byte size has changed
    if (audioBuffer.mDataByteSize != sourceBuffer.mDataByteSize) {
        // clear old buffer
        free(audioBuffer.mData);
        // assign the new byte size and allocate a matching mData buffer
        audioBuffer.mDataByteSize = sourceBuffer.mDataByteSize;
        audioBuffer.mData = malloc(sourceBuffer.mDataByteSize);
     }

    /**
     Here we modify the raw data buffer now.
     In my example this is a simple input volume gain.
     iOS 5 has this on board now, but as example quite good.
     */
    SInt16 *editBuffer = audioBufferList->mBuffers[0].mData;

    // loop over every sample (16-bit values, hence byte size / 2)
    for (int nb = 0; nb < (audioBufferList->mBuffers[0].mDataByteSize / 2); nb++) {

        // we only process if the gain has been modified, to save resources
        if (gain != 0) {
            // we need more accuracy in our calculation so we calculate with doubles
            double gainSample = ((double)editBuffer[nb]) / 32767.0;

            /*
            At this point we multiply by our gain factor.
            We do not add a constant, because that would generate sound where there is none:

             no noise
             0 * 10 = 0

             noise even at zero
             0 + 10 = 10
            */
            gainSample *= gain;

            /**
             Our signal range cannot go beyond -1.0/1.0.
             Here we clamp the signal so it stays inside that range.
             */
            gainSample = (gainSample < -1.0) ? -1.0 : (gainSample > 1.0) ? 1.0 : gainSample;

            /*
             This line is a little waveshaper for the incoming signal.
             The sound gets noticeably warmer and the noise is reduced a lot.
             Feel free to comment this line out and compare the result.

             You can see what it does at http://silentmatt.com/javascript-function-plotter/
             Paste this into the plotter and hit enter: plot y=(1.5*x)-0.5*x*x*x
             */

            gainSample = (1.5 * gainSample) - 0.5 * gainSample * gainSample * gainSample;

            // scale the sample back up to the 16-bit range
            gainSample = gainSample * 32767.0;

            // write the calculated sample back to the buffer
            editBuffer[nb] = (SInt16)gainSample;
        }
    }

    // copy incoming audio data to the audio buffer
    memcpy(audioBuffer.mData, audioBufferList->mBuffers[0].mData, audioBufferList->mBuffers[0].mDataByteSize);
}

My little error handler does not really handle errors, but that is not important for this example.

#pragma mark Error handling

-(void)hasError:(int)statusCode:(char*)file:(int)line
{
	if (statusCode) {
		printf("Error Code responded %d in file %s on line %d\n", statusCode, file, line);
        exit(-1);
	}
}
@end

Conclusion

It is pretty easy to get a callback and define a simple audio component. The hard part is finding the right tool for the things you want to do. While researching this post I have seen a lot of ways to manipulate audio streams, but this one is, in my opinion, the easiest and most controllable.

You can read more about this topic on the following sites. MusicDSP is my personal favorite if you need professional ways of coding effects or methods for counting beats per minute. It is the best knowledge base for audio DSP I know.

CoreAudio documentation
MusicDSP.org
Apple Audio Session Programming Guide

Project source code

You can download the project source code on GitHub:
https://github.com/fuxx/MicInput

 

11 Comments

02 February 2012

Hi
This is very cool stuff.
I am looking for a way to stream the voice (output) to my server.
Did you ever try to do that?
Do you know where I can find an example that does such a thing?

Best

Samantha

Stefan Popp

02 February 2012

Hey Sam!

I have not tried that. I don't know exactly what you need, but it should not be too hard to extend my example. One approach is to open a direct connection via TCP or UDP to a remote server and send the audio chunks to it.

I would do something like this:
Create something like a web service which can receive data, on port 1234 for example, analyze the PCM data and send something back.

Like:

A webservice communication class running on a background thread. It should have 2-3 functions, e.g. open a connection to the web server, send data (compressing it on the fly), and receive events from the web service and handle them.

On another thread you have your microphone recorder, which holds an instance of the webservice object.

The object should do something like this:
Record into the iPhone audio buffer -> check if the webservice is available and connect if not -> hand the chunk to the webservice send function.

It is pretty abstract and just an idea. Maybe there is a better way, but I think this is a good approach if you want to keep the analysis on your server (a rough sketch of the sending part follows below).

Best,
Stefan
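
A very rough sketch of the sending side, assuming a hypothetical AudioUploader class and a plain TCP connection (the names and the port are made up; error handling and threading are left out):

// AudioUploader (hypothetical) - pushes raw PCM chunks over a TCP socket
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>

@interface AudioUploader : NSObject
{
    NSOutputStream *outputStream;
}
- (void)connectToHost:(NSString *)host port:(UInt32)port;
- (void)sendBuffer:(AudioBuffer)buffer;
@end

@implementation AudioUploader

- (void)connectToHost:(NSString *)host port:(UInt32)port
{
    CFWriteStreamRef writeStream = NULL;
    // we only need the write half of the socket stream pair
    CFStreamCreatePairWithSocketToHost(NULL, (CFStringRef)host, port, NULL, &writeStream);
    outputStream = (NSOutputStream *)writeStream; // toll-free bridged (non-ARC)
    [outputStream open];
}

// called with the chunk from processBuffer:
- (void)sendBuffer:(AudioBuffer)buffer
{
    if ([outputStream hasSpaceAvailable]) {
        [outputStream write:(const uint8_t *)buffer.mData maxLength:buffer.mDataByteSize];
    }
}

@end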

03 February 2012

Thanks Stefan

I will try that; if it comes out well I will share it.

Best

Sam

27 July 2012

Hi,

I want to route the audio to an AirPlay device, but the audio unit does not recognize the AirPlay device. Please help.

27 July 2012

Hi,

I'm trying to develop something with audio units and I'm getting a distorted sound. I came across your post, put my settings into your app, and noticed I'm getting distorted noise there as well. I need to use ALaw and an 8000 Hz sampling rate, but my voice sounds robotic. Do you see anything wrong with your app and the settings below? I'd really appreciate some help, as clearly I'm doing something wrong but can't see what it is.

Thanks

AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 8000;
audioFormat.mFormatID = kAudioFormatALaw;
audioFormat.mFormatFlags = kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 1;
audioFormat.mBitsPerChannel = 8;
audioFormat.mBytesPerPacket = 1;
audioFormat.mBytesPerFrame = 1;

12 December 2012

[...] is using the iPhone microphone to capture the sound of your heart beat. A pretty accurate BPM detection analyzes the signal and returns the [...]

10 October 2013

Hi,

Excellent tutorial. Can you tell me the way to find the beat count per minute, and a way to show some nice graphical waves on the screen?

thanks,
bala

05 December 2013

Hi Stefan,
Hope this message finds you in good spirits.

I am doing a POC in which I need to create an app which fetches input from the iPhone mic and routes the output to a Bluetooth headset/speakers.

I referred to your code

The code works flawlessly, but it produces the output via the in-call speaker.

Could you suggest where I should edit the code to re-route the output to Bluetooth Speakers?

30 May 2014

I'm having the same problem as the last commenter. How can I route the output to Bluetooth? It keeps going back to the internal speaker. Thanks!

Great article by the way :D

25 October 2014

Hi, very nice article here! Recently, Apple released AVAudioEngine, with which I can tap the microphone input and get an AVAudioPCMBuffer. How do I stream the AVAudioPCMBuffer to a remotely connected device, and how does the receiving side reconstruct the AVAudioPCMBuffer?
