Monday, August 2, 2010


Abstract: I demonstrate three AudioUnit samples using MonoTouch: a sine waveform generator, an AIFF audio file player, and a microphone monitor. Compared to other frameworks (Audio Queue Service and OpenAL), AudioUnit offers real-time audio rendering and the lowest output latency (~3 msec), which makes it suitable for instrument or sound-triggered applications. The sample code is on GitHub under the MIT license, so you can use it freely [1].

 In my software modem project, I first decided to use Audio Queue Service to access the speaker and microphone. But I need an output latency smaller than 23 msec, which is the lowest value achievable with Audio Queue Service. Therefore I tried AudioUnit, the lowest-level audio API, whose output latency can be as low as about 3 msec.

MonoTouch AudioUnit marshaling

 AudioUnit is a set of C functions included in the AudioToolbox framework. Therefore interoperating with unmanaged code is required to use it from MonoTouch [2]. Details of the interoperation are described in the source code of the wrapper classes (the MonoTouch.AudioUnit project in the sample code), which correspond one-to-one to the AudioUnit function and struct definitions.
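
 To give a feel for what such a wrapper involves, here is a minimal P/Invoke sketch for one C function, AudioComponentFindNext, and the struct it takes. It is not the code from the MonoTouch.AudioUnit project: the names NativeAudioComponentDescription, NativeAudioComponent, FourCC and RemoteIODescription are hypothetical, and the library path is an assumption about where the framework binary lives on the device.

```csharp
using System;
using System.Runtime.InteropServices;

// Mirrors the C struct AudioComponentDescription:
// three OSType (four-character code) fields followed by two UInt32 flag fields.
[StructLayout(LayoutKind.Sequential)]
public struct NativeAudioComponentDescription
{
    public uint componentType;
    public uint componentSubType;
    public uint componentManufacturer;
    public uint componentFlags;
    public uint componentFlagsMask;
}

public static class NativeAudioComponent
{
    // Assumption: path of the AudioToolbox framework binary on the device.
    const string AudioToolboxLibrary =
        "/System/Library/Frameworks/AudioToolbox.framework/AudioToolbox";

    // C prototype:
    // AudioComponent AudioComponentFindNext(AudioComponent inComponent,
    //                                       const AudioComponentDescription *inDesc);
    [DllImport(AudioToolboxLibrary)]
    public static extern IntPtr AudioComponentFindNext(
        IntPtr inComponent, ref NativeAudioComponentDescription inDesc);

    // Packs a four-character code such as "auou" into a big-endian 32-bit value,
    // which is how the C headers define constants like kAudioUnitType_Output.
    public static uint FourCC(string code)
    {
        if (code == null || code.Length != 4)
            throw new ArgumentException("A four-character code is required.", "code");
        return ((uint)code[0] << 24) | ((uint)code[1] << 16)
             | ((uint)code[2] << 8)  |  (uint)code[3];
    }

    // Description of the Remote IO output unit.
    public static NativeAudioComponentDescription RemoteIODescription()
    {
        return new NativeAudioComponentDescription
        {
            componentType = FourCC("auou"),         // kAudioUnitType_Output
            componentSubType = FourCC("rioc"),      // kAudioUnitSubType_RemoteIO
            componentManufacturer = FourCC("appl"), // kAudioUnitManufacturer_Apple
            componentFlags = 0,
            componentFlagsMask = 0
        };
    }
}
```

 On a device, passing IntPtr.Zero as the first argument of AudioComponentFindNext together with the description above should return a handle to the Remote IO component, or IntPtr.Zero if none is found.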

How to output an arbitrary signal using audio unit
 Arbitrary signal generation with Audio Unit is basically the same as with Audio Queue Service, which I described in my previous article.
To output an arbitrary signal:
  1. Set up the output audio unit,
  2. Set an event handler for rendering,
  3. Fill the buffer with the waveform.

 The procedure for creating an output audio unit (Remote IO unit) is already described in Apple's developer documentation. First, the output audio unit (Remote IO unit) is created with the following code:
    void prepareAudioUnit()
    {
        // Creating AudioComponentDescription instance of RemoteIO Audio Unit
        AudioComponentDescription cd = new AudioComponentDescription()
        {
            componentType    = AudioComponentDescription.AudioComponentType.kAudioUnitType_Output,
            componentSubType = AudioComponentDescription.AudioComponentSubType.kAudioUnitSubType_RemoteIO,
            componentManufacturer = AudioComponentDescription.AudioComponentManufacturerType.kAudioUnitManufacturer_Apple,
            componentFlags = 0,
            componentFlagsMask = 0
        };

        // Getting AudioComponent from the description
        _component = AudioComponent.FindComponent(cd);

        // Getting AudioUnit
        _audioUnit = AudioUnit.CreateInstance(_component);

        // Setting AudioStreamBasicDescription.
        // The sample size depends on where the code runs: float on the simulator, int on a device.
        int AudioUnitSampleTypeSize;
        if (MonoTouch.ObjCRuntime.Runtime.Arch == MonoTouch.ObjCRuntime.Arch.SIMULATOR)
        {
            AudioUnitSampleTypeSize = sizeof(float);
        }
        else
        {
            AudioUnitSampleTypeSize = sizeof(int);
        }
        AudioStreamBasicDescription audioFormat = new AudioStreamBasicDescription()
        {
            SampleRate = _sampleRate,
            Format = AudioFormatType.LinearPCM,
            //kAudioFormatFlagsAudioUnitCanonical = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked | kAudioFormatFlagIsNonInterleaved | (kAudioUnitSampleFractionBits << kLinearPCMFormatFlagsSampleFractionShift),
            FormatFlags = (AudioFormatFlags)((int)AudioFormatFlags.IsSignedInteger | (int)AudioFormatFlags.IsPacked | (int)AudioFormatFlags.IsNonInterleaved | (int)(kAudioUnitSampleFractionBits << (int)AudioFormatFlags.LinearPCMSampleFractionShift)),
            ChannelsPerFrame = 2,
            // Non-interleaved format: each buffer holds one channel, so sizes are per channel
            BytesPerPacket = AudioUnitSampleTypeSize,
            BytesPerFrame = AudioUnitSampleTypeSize,
            FramesPerPacket = 1,
            BitsPerChannel = 8 * AudioUnitSampleTypeSize,
            Reserved = 0
        };
        _audioUnit.SetAudioFormat(audioFormat, AudioUnit.AudioUnitScopeType.kAudioUnitScope_Input, 0);

        // Setting callback
        _audioUnit.RenderCallback += new EventHandler<AudioUnitEventArgs>(callback);
    }

 Because the audio unit sample type differs between the simulator (32-bit float) and a device (32-bit int), the code absorbs the difference by choosing AudioUnitSampleTypeSize from Runtime.Arch. Each time the audio unit needs a buffer rendered, the render callback event handler registered at the end of prepareAudioUnit() is invoked, as follows:

    void callback(object sender, AudioUnitEventArgs args)
    {
        // Phase increment per sample for a 440 Hz sine waveform
        double dphai = 440 * 2.0 * Math.PI / _sampleRate;

        // Getting pointers to the buffers to be filled (left and right channels)
        IntPtr outL = args.Data.mBuffers[0].mData;
        IntPtr outR = args.Data.mBuffers[1].mData;

        // Filling the sine waveform.
        // AudioUnitSampleType is different between a simulator (float32) and a real device (int32).
        if (MonoTouch.ObjCRuntime.Runtime.Arch == MonoTouch.ObjCRuntime.Arch.SIMULATOR)
        {
            unsafe
            {
                var outLPtr = (float *) outL.ToPointer();
                var outRPtr = (float *) outR.ToPointer();
                for (int i = 0; i < args.NumberFrames; i++)
                {
                    float sample = (float)Math.Sin(_phase) / 2048;
                    *outLPtr++ = sample;
                    *outRPtr++ = sample;
                    _phase += dphai;
                }
            }
        }
        else
        {
            unsafe
            {
                var outLPtr = (int*)outL.ToPointer();
                var outRPtr = (int*)outR.ToPointer();
                for (int i = 0; i < args.NumberFrames; i++)
                {
                    int sample = (int)(Math.Sin(_phase) * int.MaxValue / 128); // signal waveform format is fixed-point (8.24)
                    *outLPtr++ = sample;
                    *outRPtr++ = sample;
                    _phase += dphai;
                }
            }
        }

        // Keeping the phase within one period
        _phase %= 2 * Math.PI;
    }
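
 The fixed-point (8.24) conversion used in the device branch can be isolated into a small helper. The following is only a sketch: the class name FixedPoint824 and its methods are hypothetical and not part of the sample code. In 8.24 format the lower 24 bits hold the fraction, so 1.0 is stored as 1 << 24, which is almost the same scale as the int.MaxValue / 128 factor used in the callback.

```csharp
using System;

public static class FixedPoint824
{
    // Number of fractional bits in the 8.24 format.
    public const int FractionBits = 24;

    // Converts a floating-point sample to 8.24 fixed-point,
    // clamping values that do not fit in a 32-bit signed integer.
    public static int FromDouble(double value)
    {
        double scaled = value * (1 << FractionBits);
        if (scaled >= int.MaxValue) return int.MaxValue;
        if (scaled <= int.MinValue) return int.MinValue;
        return (int)Math.Round(scaled);
    }

    // Converts an 8.24 fixed-point sample back to floating point.
    public static double ToDouble(int fixedValue)
    {
        return (double)fixedValue / (1 << FractionBits);
    }
}
```

 For example, FixedPoint824.FromDouble(1.0) gives 16777216 and FixedPoint824.FromDouble(-0.5) gives -8388608.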

 The audio buffer is an array of 32-bit floats or 32-bit integers, so an arbitrary waveform can be written into it within an unsafe block as above. The array length can be set using the AudioSession framework; its default value is 512 and it must be more than 128. A smaller array length gives lower output latency but needs more processor power.
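
 The relation between array length and latency is simple arithmetic: one buffer lasts frames / sample rate seconds. A small sketch, assuming a 44100 Hz sample rate (the article does not state the value of _sampleRate) and using a hypothetical helper name:

```csharp
using System;

public static class BufferLatency
{
    // Duration of one audio buffer in milliseconds.
    // This is only the buffering part of the latency, not a measured end-to-end figure.
    public static double Milliseconds(int frames, double sampleRate)
    {
        return 1000.0 * frames / sampleRate;
    }
}
```

 Under that assumption, 512 frames last about 11.6 msec, 128 frames about 2.9 msec, and 1024 frames about 23.2 msec, which is in line with the ~3 msec and 23 msec figures quoted earlier.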

Other examples
 I omit the explanation of the other examples, playing an audio file and recording from the microphone. Please read the source code :-P


  [1] Sample codes
  [2] Interoperating with unmanaged code


  1. Good stuff!

    One suggestion: rather than an if/then in the callback, I'd prefer to see two separate callback methods, one for Simulator and one for Device, and assign the appropriate handler to the RenderCallback event.

  2. Thank you!
    I agree with you. Your implementation performs better than mine. I will update the example code on GitHub.

  3. Very nice.

    I'd like to see a MonoTouch example of how to record audio in a compressed format like AAC (.m4a) on iPhone.


  4. It is possible by replacing the AudioStreamBasicDescription format in the sample code; ExtAudioFile supports various audio formats including AAC.

  5. Thanks,

    I tried with these settings:

    AudioStreamBasicDescription outputFormat = new AudioStreamBasicDescription()
    {
        SampleRate = 44100,
        Format = AudioFormatType.AppleIMA4,
        FramesPerPacket = 64,
        ChannelsPerFrame = 1,
        BytesPerPacket = 32
    };


    _extAudioFile = ExtAudioFile.CreateWithURL(url, AudioFileType.CAF, outputFormat, AudioFileFlags.EraseFlags);

    I think it works in the simulator (no audio is played back, but no crash)

    On the phone (iPhone 3G and iPhone 4 with iOS 4) it crashes in "public static ExtAudioFile CreateWithURL".

    A MonoTouch sample that records to AAC would be perfect; I know I am not the only one looking for this.

  6. I have not tried to handle the AppleIMA4 format, so the AudioUnit wrapper classes may have some problem.
    The AudioUnit wrapper classes throw an exception when an AudioUnit library function returns an error code, which is an OSStatus enum value (you can look up what each value means here: ).
    Checking the error code within the CreateWithURL function by setting a breakpoint may be a good starting point.

  7. OK I got it working on simulator and iPhone.

    The AudioStreamBasicDescription has to be (on iPhone 4):
    SampleRate = 44100,
    Format = AudioFormatType.MPEG4AAC,

    _extAudioFile = ExtAudioFile.CreateWithURL(url, AudioFileType.CAF, outputFormat, AudioFileFlags.EraseFlags);

    The iPhone 3G (not S) doesn't do AAC, so there I have to use: Format = AudioFormatType.AppleIMA4

    But I do have two more questions.

    1. How do I turn off the routing mic > speaker (I only need recording)
    2. VU meters how? With AVAudioRecorder I just use "MeteringEnabled"

    Thanks again!

  8. > 1. mic > speaker
    I am just writing it ;-P.
    You can get C/C++ sample code from .
    To record, get a Remote IO audio unit by calling the audio component functions, and call the AURender function in the callback function to get the mic signal.
    The AURender function needs an audio buffer list with allocated memory, and I am just writing a wrapper class for it (it may take a month, as it is not a high-priority task item...).

    > 2. VU meters how?
    I do not know. Perhaps I have to read the technical document .

  9. Hi again,

    Well, my C++ is not that good, and Mac... C# is my thing.

    I simply have to wait for your AURender wrapper and monotouch code, please hurry up ;-)

    I found this, maybe helps

  10. Thank you for a good blog entry. I will do ;-P

  11. I also found that AudioQueue can get levels for a VU meter, but I'm not sure if and how I can use that with AudioUnit.

  12. Very nice progress on the AudioUnit examples.

    "SoundTrigger" sound level, could be used as a VU meter properly.

    One question: in the "PlayThrough" example, I got AAC recording working just fine. But how can I mute (or set the volume to zero on) the speaker, or better, the mic signal that goes to the speaker, without muting the sound going to the recording?

    Thanks again!

  13. Muting the speaker while still rendering the sound signal can be achieved by modifying the sound waveform that is passed in the event argument.

  14. Are there any apps in the appstore that do the same thing?


  15. I do not know, but I'm afraid there is no app like this, because it is too simple, so Apple may reject it.