Copyright © 2006-2019 MultiMedia Soft

How to perform a recording session


A recording session receives a sound stream from the input channel of a physical sound card, optionally applies custom DSP in order to perform analysis and/or special effects, encodes the stream into a specific audio format (WAV, MP3, WMA, etc.) and finally stores the encoded audio stream inside an output file or inside a memory buffer. The architecture of the control is based upon the graph below:

 

 

Modern sound cards can perform a recording session through DirectSound drivers, ASIO drivers and, starting from Windows Vista, WASAPI drivers. You can choose the type of driver you want to support through the InitDriversType method; if you omit this call, the control will use DirectSound by default.

 

Another recording possibility, not tied to a sound card installed in the system, is an external device (for example a hardware device or a telephony-based application) that receives data in input and produces RAW audio in output; through the use of a queue, this audio can be "manually" sent to the recorder, as seen in the schema below:

 

 

Before starting a recording session, the component needs to be initialized: for this purpose a call to the InitRecordingSystem method is mandatory. The best place to call this method is usually the initialization function of the container form: for example, when using Visual Basic 6, it will be the Form_Load subroutine.

 

At this point we must decide the output format of the recorded sound: for this purpose you need to set the EncodeFormats.ForRecording property to one of the available encoding formats.

Each encoding format set into the EncodeFormats.ForRecording property has its own settings, which can be controlled through sub-properties of the EncodeFormats property: for example, if the chosen encoding format is ENCODING_FORMAT_WAV you would modify the settings of the EncodeFormats.WAV property, if it is ENCODING_FORMAT_MP3 you would modify the settings of the EncodeFormats.MP3 property, and so on for the remaining accepted encoding formats.

 

Before starting a recording session, you need to check that the Status property is set to RECORD_STATUS_READY: if the property value is RECORD_STATUS_NOT_READY, the initialization of the control and the negotiation with the underlying drivers are not yet completed.
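
The readiness check described above can be pictured as a polling loop with a timeout. The following is only a minimal sketch in Python, not code for the control itself: the get_status callable is a hypothetical stand-in for reading the Status property, and representing the two status values as strings is an assumption made for illustration.

```python
import time
from typing import Callable

# Status names come from the text; modelling them as strings is an assumption.
RECORD_STATUS_READY = "RECORD_STATUS_READY"
RECORD_STATUS_NOT_READY = "RECORD_STATUS_NOT_READY"


def wait_until_ready(get_status: Callable[[], str],
                     timeout_s: float = 5.0,
                     poll_interval_s: float = 0.05) -> bool:
    """Poll get_status() until it reports READY or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == RECORD_STATUS_READY:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Hypothetical stand-in for the Status property: not ready twice, then ready.
_fake_statuses = iter([RECORD_STATUS_NOT_READY,
                       RECORD_STATUS_NOT_READY,
                       RECORD_STATUS_READY])
ready = wait_until_ready(lambda: next(_fake_statuses),
                         timeout_s=1.0, poll_interval_s=0.01)
```

Returning False on timeout, instead of waiting forever, lets the container application report a driver problem to the user.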

 

If incoming sound data needs to be resampled, for example to reduce the final recording size, check the EncodeFormats.ResampleMode property and choose which of the available resample modes best fits your needs.

 

After having decided the supported driver type, initialized the control and chosen the output format, we must decide from which input device we want to receive sound data. For this purpose we need to enumerate the available audio devices; the enumeration varies depending upon the type of driver chosen when the InitDriversType method was invoked:

 

Recording through DirectSound

Recording through ASIO

Recording through WASAPI (Only available on Windows Vista and later versions)

Recording through a raw audio queue

 

 

Recording through DirectSound

 

When dealing with Windows XP or Windows Server 2003 and earlier versions, the number and friendly descriptions of available input devices (intended as sound cards) can be obtained through a call to the GetInputDevicesCount and GetInputDeviceDesc methods. The number and friendly descriptions of available input channels (Microphone, Line-In, etc.), relative to a certain input device, can be obtained through a call to the GetInputDeviceChannelsCount and GetInputDeviceChannelDesc methods.

 

When dealing with Windows Vista and later versions, each input channel is seen as a separate input device, so it is enough to enumerate them through the GetInputDevicesCount and GetInputDeviceDesc methods; any call to the GetInputDeviceChannelsCount method will always return 1 and the GetInputDeviceChannelDesc method will return the string "Master volume".

 

The system default input channel for a certain input device can be changed at system level (so the change will be reflected inside the "Sounds" applet of the Windows Control Panel) through a call to the SetInputDeviceChannelDefault method. The volume of the input channel can be changed, again at system level, through the SetInputDeviceChannelVolume method.

 

At this point we can start the recording session by calling the StartFromDirectSoundDevice method, which needs to know the output file location. If an absolute pathname is specified, the output will be a file on the system hard disk; if the pathname is left empty, the output will be a memory file that can be used at a later time, for example in order to apply custom encryption before saving it to a file on disk. It is highly recommended to avoid memory-based recording sessions longer than 5 minutes because they could require a huge amount of memory: for longer durations, output to a file should be preferred.
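
To give a rough idea of the memory involved, the sketch below computes the size of 5 minutes of audio. It assumes, purely for illustration, that the memory file holds uncompressed 16-bit stereo PCM at 44100 Hz; with a compressed encoding format the real figure would be smaller.

```python
def pcm_bytes(sample_rate_hz: int, channels: int,
              bits_per_sample: int, seconds: float) -> int:
    """Size in bytes of uncompressed PCM audio of the given format and duration."""
    bytes_per_frame = channels * bits_per_sample // 8
    return int(sample_rate_hz * bytes_per_frame * seconds)


five_minutes = pcm_bytes(44100, 2, 16, 5 * 60)
print(five_minutes)                    # 52920000
print(round(five_minutes / 2**20, 1))  # 50.5 (MiB)
```

Every additional minute in this format adds about 10 MiB, which is why file-based output is the safer choice for long sessions.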

 

 

Recording through ASIO

 

Each ASIO device is seen as a combination of input and output channels so the way to enumerate ASIO devices and respective input channels is the following:

 

1. Enumerate available ASIO devices through the ASIO.DeviceGetCount and ASIO.DeviceGetDesc methods.
2. For each ASIO device, enumerate the list of available input channels through the ASIO.DeviceGetChannelsCount and ASIO.DeviceGetChannelDesc methods by setting the respective bInputChannel parameter to "true".

 

At this point we need to start the device through the ASIO.DeviceStart method: you can check whether a device is already started through the ASIO.DeviceIsStarted method.

 

Finally we can start the recording session through the StartFromAsioDevice method, which needs to know the output file location. If an absolute pathname is specified, the output will be a file on the system hard disk; if the pathname is left empty, the output will be a memory file that can be used at a later time, for example in order to apply custom encryption before saving it to a file on disk. It is highly recommended to avoid memory-based recording sessions longer than 5 minutes because they could require a huge amount of memory: for longer durations, output to a file should be preferred.

 

You can refer to the How to manage ASIO drivers tutorial for further details about ASIO drivers.

 

 

Recording through WASAPI (Only available on Windows Vista and later versions)

 

Each WASAPI device is seen as a combination of input, output and loopback devices; for recording purposes we can choose between input devices and loopback devices. WASAPI input and loopback devices are enumerated through the WASAPI.DeviceGetCount and WASAPI.DeviceGetDesc methods by setting the respective nDeviceType parameter to WASAPI_DEVICE_TYPE_CAPTURE for input devices or to WASAPI_DEVICE_TYPE_LOOPBACK for loopback devices.

 

At this point we need to start the device through the WASAPI.DeviceStartShared method or through the WASAPI.DeviceStartExclusive method: you can check whether a device is already started through the WASAPI.DeviceIsStarted method.

 

Finally we can start the recording session through the StartFromWasapiCaptureDevice method for input devices or through the StartFromWasapiLoopbackDevice method for loopback devices. Both methods need to know the output file location. If an absolute pathname is specified, the output will be a file on the system hard disk; if the pathname is left empty, the output will be a memory file that can be used at a later time, for example in order to apply custom encryption before saving it to a file on disk. It is highly recommended to avoid memory-based recording sessions longer than 5 minutes because they could require a huge amount of memory: for longer durations, output to a file should be preferred.

 

When using WASAPI it is also possible to record from multiple input devices (both capture and loopback devices) at the same time and to store the mixed result into an output file. After having started the input devices of interest, you can add them to the WASAPI input mixer through separate calls to the WASAPI.MixerInputDeviceAttach method (one call for each input device) and then start the recording session through the StartFromWasapiMixer method.
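
As a conceptual illustration of what mixing several inputs means, the sketch below sums two streams of signed 16-bit samples and clips the result to the valid range. This is a generic textbook approach written in Python and is not claimed to be the algorithm used by the WASAPI input mixer; the function and variable names are hypothetical.

```python
from array import array


def mix_pcm16(a: array, b: array) -> array:
    """Mix two streams of signed 16-bit samples by summing and clipping."""
    length = max(len(a), len(b))
    mixed = array("h")
    for i in range(length):
        sa = a[i] if i < len(a) else 0   # a shorter stream is padded with silence
        sb = b[i] if i < len(b) else 0
        mixed.append(max(-32768, min(32767, sa + sb)))
    return mixed


capture = array("h", [1000, -2000, 30000])
loopback = array("h", [500, 500, 10000, 42])
print(mix_pcm16(capture, loopback).tolist())  # [1500, -1500, 32767, 42]
```

The third sample shows why clipping is needed: 30000 + 10000 exceeds the 16-bit maximum and is limited to 32767.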

 

You can refer to the How to manage audio flow through WASAPI tutorial for further details about WASAPI.

 

Recording through a raw audio queue

 

In order to start a recording session from an external physical or logical device that produces raw audio data in output, you need to call the StartFromQueueRaw method. As seen for other kinds of recording sessions, this method needs to know the output file location. If an absolute pathname is specified, the output will be a file on the system hard disk; if the pathname is left empty, the output will be a memory file that can be used at a later time, for example in order to apply custom encryption before saving it to a file on disk. It is highly recommended to avoid memory-based recording sessions longer than 5 minutes because they could require a huge amount of memory: for longer durations, output to a file should be preferred.

 

In addition, the StartFromQueueRaw method needs to know the format of the raw audio data: for example, it could be uncompressed PCM in stereo with 16 bits per sample and a sample rate of 44100 Hz, or compressed GSM 6.10 in mono with a sample rate of 8000 Hz.
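
For uncompressed PCM, the format determines how many bytes the external device produces per second and how large a buffer covering a given time span must be. The Python sketch below shows the arithmetic; the PcmFormat class and its member names are hypothetical helpers introduced for illustration and are not part of the control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcmFormat:
    sample_rate_hz: int
    channels: int
    bits_per_sample: int

    @property
    def block_align(self) -> int:
        """Bytes per sample frame (one sample for every channel)."""
        return self.channels * self.bits_per_sample // 8

    @property
    def bytes_per_second(self) -> int:
        return self.sample_rate_hz * self.block_align

    def chunk_bytes(self, milliseconds: int) -> int:
        """Bytes covering the given duration, rounded down to whole frames."""
        frames = self.sample_rate_hz * milliseconds // 1000
        return frames * self.block_align


cd_quality = PcmFormat(44100, 2, 16)
print(cd_quality.bytes_per_second)  # 176400
print(cd_quality.chunk_bytes(20))   # 3528
```

Keeping buffers aligned to whole frames avoids splitting a left/right sample pair across two buffers.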

 

Once the recording session is started, your code can start feeding the queue with raw audio data taken from the output of the external device through the SendDataToQueueRaw method.
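
The feeding pattern described above can be sketched as a producer and a consumer that share a queue. The Python code below is only an illustration under stated assumptions: read_device stands in for the external device, and the send callable is a hypothetical placeholder for a call to SendDataToQueueRaw, whose real parameters are not shown here.

```python
import queue
import threading
from typing import Callable, Iterator

CHUNK_BYTES = 3528  # 20 ms of 44100 Hz, 16-bit stereo PCM


def read_device(raw: bytes, chunk_bytes: int) -> Iterator[bytes]:
    """Stand-in for the external device: yields its output in fixed-size chunks."""
    for offset in range(0, len(raw), chunk_bytes):
        yield raw[offset:offset + chunk_bytes]


def feed_recorder(chunks: queue.Queue, send: Callable[[bytes], None]) -> None:
    """Forward queued chunks to the recorder until the None sentinel arrives."""
    while True:
        chunk = chunks.get()
        if chunk is None:
            break
        send(chunk)


received = []  # what the hypothetical recorder was given
pending: queue.Queue = queue.Queue()
feeder = threading.Thread(target=feed_recorder, args=(pending, received.append))
feeder.start()

device_output = bytes(10000)  # 10000 bytes of silence used as sample data
for chunk in read_device(device_output, CHUNK_BYTES):
    pending.put(chunk)
pending.put(None)  # signal the end of the stream
feeder.join()

print(len(received), sum(len(c) for c in received))  # 3 10000
```

Decoupling the device from the recorder through a queue means a short stall on either side does not lose audio data.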

 

 

 

During the recording session, the container application can be notified about the current duration through the RecordingDuration event and about the current disk or memory occupation through the RecordingSize event. Note that, to reduce the number of bytes required for storing sound data, the control can discard silent data, i.e. portions of sound that cannot be heard: see the section How the Sound Activation System works for further details about this feature.

 

The current recording session can be stopped at any time through the Stop method.

 

The current recording session can be paused through the Pause method and resumed through the Resume method.

 

After stopping the current recording session, you can start a new one without discarding the existing one: for this purpose you need to define how the new recording session will behave through a call to the SetRecordingMode method. This allows appending the new recording session to the existing one, or inserting, mixing or overwriting the new recording session at a given position of the existing one: see the SetInsertPos, SetMixingPos and SetOverwritePos methods for defining where the new recording session will be inserted, mixed or overwritten.

It's important to note that, after stopping a restarted recording session, the total recorded sound will not be immediately available because the new recording session must first be finalized and joined with the previous one: the container application will be informed about the start and the end of the automatic finalization process through the RecordingFinalizationStarted and RecordingFinalizationDone events.
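
The difference between appending, inserting, overwriting and mixing can be pictured on plain lists of samples. The Python sketch below is only a conceptual model: positions are expressed as sample indexes that fall inside the existing recording, the function names are hypothetical, and the control's real handling of positions and of clipping during mixing may differ.

```python
from typing import List


def append(existing: List[int], new: List[int]) -> List[int]:
    """New session is placed after the end of the existing one."""
    return existing + new


def insert(existing: List[int], new: List[int], pos: int) -> List[int]:
    """New session is placed at pos; the existing tail is shifted forward."""
    return existing[:pos] + new + existing[pos:]


def overwrite(existing: List[int], new: List[int], pos: int) -> List[int]:
    """New session replaces the existing samples starting at pos."""
    return existing[:pos] + new + existing[pos + len(new):]


def mix(existing: List[int], new: List[int], pos: int) -> List[int]:
    """New session is summed onto the existing samples starting at pos."""
    length = max(len(existing), pos + len(new))
    out = existing + [0] * (length - len(existing))
    for i, sample in enumerate(new):
        out[pos + i] += sample
    return out


old = [1, 2, 3, 4]
new = [10, 20]
print(append(old, new))        # [1, 2, 3, 4, 10, 20]
print(insert(old, new, 1))     # [1, 10, 20, 2, 3, 4]
print(overwrite(old, new, 1))  # [1, 10, 20, 4]
print(mix(old, new, 3))        # [1, 2, 3, 14, 20]
```

Only overwriting and mixing can leave the total duration unchanged; appending and inserting always lengthen the recording.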

 

Once a recording session has been stopped, the control can manage recorded sound data, stored on a disk file or inside a memory buffer, through the following methods:

 

RecordedSound.Play
RecordedSound.PlayRange
RecordedSound.Pause
RecordedSound.Resume
RecordedSound.Stop
RecordedSound.SeekPlayPosition
RecordedSound.GetPlaybackPosition
RecordedSound.GetFormattedPlaybackPosition
RecordedSound.GetDuration
RecordedSound.GetFormattedDuration

 

The playback rate of the recorded stream can be modified, for example to allow faster listening of the recorded audio stream, through the following methods:

 

RecordedSound.PlaybackRateSet
RecordedSound.PlaybackTempoSet

 

As a further playback feature, you may change the playback direction through the RecordedSound.SoundDirectionSet method: this requires enabling playback direction management through a previous call to the RecordedSound.SoundDirectionEnable method or, if the waveform scroller is in use, through the WaveformScroller.PlaybackOnScrollEnable method.

 

If you need more advanced playback features, consider using this control in conjunction with our Active DJ Studio ActiveX control.

 

The recorded sound can be edited through one of the following methods:

 

RecordedSound.RequestReduceToRange
RecordedSound.RequestDeleteRange
RecordedSound.RequestInsertSilence
RecordedSound.TrimSilence

 

Note that editing capabilities are quite limited: if you need more sophisticated editing features consider using this control in conjunction with our Active Sound Editor ActiveX control.

 

The recorded sound can be exported to a disk file or to a memory buffer using the RecordedSound.RequestExportToFile method: see the How to export a recorded sound section for further details.

 

The recorded sound can also be uploaded to an FTP server through the RecordedSound.RequestUploadToFTP method (the upload session can be cancelled through the RecordedSound.CancelUploadToFTP method) and can also be copied into the system clipboard in CF_WAVE format through the RecordedSound.CopyRangeToClipboard method.

 

As already stated, if the strOutputPath parameter of the StartFromDirectSoundDevice method has been left blank, recorded sound data will be stored inside a memory buffer; the operations that can be performed on this memory buffer are the following:

 

RecordedSound.GetMemoryPtr
RecordedSound.GetMemorySize
RecordedSound.SaveToFile
RecordedSound.FreeMemory

 

While performing a recording session whose output is stored inside a file on the hard disk, it can be useful to save the recording session in more than one file without any interruption of the audio flow: for this purpose you can use the SwitchOutputFile or the SwitchOutputFileEx methods. A call to one of these methods generates a RecordingOutputFileSwitch event.

After switching the output file one or more times, you can verify how many times the output file was switched through the SwitchedOutputFileGetCount method and, for each single output file, you can obtain the absolute pathname through the SwitchedOutputFileGetPathname method, the size in bytes through the SwitchedOutputFileGetSize method and its duration through the SwitchedOutputFileGetDuration method.

 

If you need to split the recording session into separate files containing the left and right channels respectively, you can use the StartSplitFromDirectSoundDevice method. In this case the recording session is not kept inside the RecordedSound object, so the only accepted recording mode, set through the SetRecordingMode method, is REC_MODE_NEW.

The same splitting feature is also available for WASAPI capture and loopback devices through the StartSplitFromWasapiCaptureDevice and StartSplitFromWasapiLoopbackDevice methods.
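
Conceptually, splitting a stereo stream into left and right channels means de-interleaving the samples. The Python sketch below illustrates this on signed 16-bit samples stored in the usual left, right, left, right order; it is a generic illustration with hypothetical names, not the control's internal implementation.

```python
from array import array
from typing import Tuple


def split_stereo(interleaved: array) -> Tuple[array, array]:
    """Split interleaved L/R 16-bit samples into separate left and right streams."""
    if len(interleaved) % 2:
        raise ValueError("stereo data must contain an even number of samples")
    return interleaved[0::2], interleaved[1::2]


stereo = array("h", [1, -1, 2, -2, 3, -3])  # L, R, L, R, L, R
left, right = split_stereo(stereo)
print(left.tolist(), right.tolist())        # [1, 2, 3] [-1, -2, -3]
```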

 

While performing a split recording session whose output is stored inside two files on the hard disk, it can be useful to save the recording session in more than one file without any interruption of the audio flow: for this purpose you can use the SwitchOutputFileOnSplit method, whose call makes the engine raise the RecordingOutputFileSwitch event.

 

 

POSITION MARKERS

 

During a recording session it's sometimes useful to add one or more position markers: a position marker is like a bookmark used to mark a location within a document. It lets you return to that location quickly or frequently without manually searching for a specific position inside a long recording session.

 

During a recording session you can add position markers using the PositionMarkerAdd method and, when the recording session is completed and stopped, you can add further position markers through the PositionMarkerAddOffline method. Position markers can be removed at any time through the PositionMarkerRemove method.

 

You can retrieve the current number of available position markers through the PositionMarkerCountGet method. Each position marker is identified by a "unique identifier" which can be retrieved through the PositionMarkerUniqueIdGet method. The position in milliseconds of each marker can be obtained or changed through the PositionMarkerPosGet and PositionMarkerPosSet methods and its friendly descriptor can be obtained through the PositionMarkerDescGet method.

 

Position markers can be saved for later use through the PositionMarkerSave method: you can choose to store them inside an external XML file or, if supported by the audio format of the output sound file created by the recording session, directly into the tag of the file. After loading the sound file again into the recorder through the StartFromFile method, saved position markers can be retrieved through the PositionMarkerLoad method.
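
To illustrate the idea of markers that carry a unique identifier, a position in milliseconds and a description, and that survive a save and reload cycle, here is a minimal Python sketch. The XML element and attribute names are hypothetical and chosen only for this example; the actual layout written by PositionMarkerSave is not documented here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class PositionMarker:
    unique_id: int
    position_ms: int
    description: str


def markers_to_xml(markers: List[PositionMarker]) -> str:
    """Serialize markers, ordered by position, to an XML string."""
    root = ET.Element("markers")  # hypothetical element and attribute names
    for m in sorted(markers, key=lambda m: m.position_ms):
        ET.SubElement(root, "marker", id=str(m.unique_id),
                      pos_ms=str(m.position_ms), desc=m.description)
    return ET.tostring(root, encoding="unicode")


def markers_from_xml(text: str) -> List[PositionMarker]:
    """Rebuild the marker list from the XML produced by markers_to_xml."""
    return [PositionMarker(int(e.get("id")), int(e.get("pos_ms")), e.get("desc"))
            for e in ET.fromstring(text).iter("marker")]


saved = markers_to_xml([PositionMarker(2, 90_000, "Q&A"),
                        PositionMarker(1, 5_000, "Intro")])
restored = markers_from_xml(saved)
print([m.description for m in restored])  # ['Intro', 'Q&A']
```

Storing the unique identifier alongside the position keeps a marker recognizable even after its position has been changed.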

 

During a playback session, position markers can be used to seek the current playback position through the PositionMarkerSeekPlayPositionTo method; alternatively, they can be used to start the playback of a specific range within the recorded sound through the PositionMarkerPlayRange method.