How to compose a sound
Important note
The SoundComposer object has become partially obsolete after the introduction of the TracksBoard class, which provides a better editing experience.
For details about using the TracksBoard, refer to the How to use the TracksBoard to visually compose songs tutorial; for further details about its methods, refer to the TracksBoard class section.
Audio DJ Studio for .NET allows composing new sound and/or music files by mixing together audio data coming from several sources; an important aspect of this feature is that all of the sound items composing the session are layered, meaning that they can be added, removed and modified before the final mix is rendered.
The main sources of audio data are the following:
• Sound files stored on disk or inside memory buffers: these files can be in any of the audio formats supported by the LoadSound and LoadSoundFromRawFile methods
• The Microsoft Speech API, which allows creating audio data from a string of text or from a text file through synthesized voices
The sound composer can create mono, stereo and multi-channel (up to 7.1) sessions and allows defining, for each item added to the session, the destination channel and the offset in milliseconds with respect to the beginning of the final audio stream.
The main actor in sound composition is the internal sound composer object, implemented through the SoundComposerMan class and exposed through the SoundComposer property; the steps needed to perform a sound composition are the following:
• Initialize the sound composer's session by defining the characteristics of the audio stream that will be composed, mainly sample rate and number of audio channels, through the SoundComposer.SessionCreate method (a minimal sketch of the full flow follows this list).
• Once the session has been initialized, start adding items through the set of methods mentioned in the next paragraphs; each item added to the session is identified by a unique identifier, which can be used whenever you need to modify the item itself, for example changing its amplitude, its offset, its channel and so on.
• When all of the needed items have been added and placed on the selected offsets and audio channels, the current session can be saved, before starting playback, into an XML file on disk (with the only exception of memory-based items) through the SoundComposer.SessionSave method, allowing the same session to be reloaded at a later time through the SoundComposer.SessionLoad method.
• Start rendering the audio stream through the PlaySound method. It's important to note that the sound composer outputs a continuous audio stream and, by default, when playback of the last item inside the session has completed, the involved player stays in "playback status" until the StopSound method is invoked. You may change this behaviour by invoking the SoundComposer.SessionAutomaticStopWhenDoneSet method with its bAutomaticStop parameter set to "true": in this case the audio stream is automatically stopped when playback of the last item inside the session has completed and the SoundDone event is raised.
• When the session is started, you can still add new items to the sound composer, keeping in mind that, in this case, the offset of the item is relative to the current playback position.
• When playback of an item starts, the container application is notified through the SoundComposerItemStart event. You may also check whether a given item is in playback state through the SoundComposer.ItemIsPlaying method and obtain its current playback position through the SoundComposer.ItemPlaybackPositionGet method.
• When playback of an item completes, the item is automatically removed from the session and its resources are deallocated; its unique identifier becomes invalid from that moment on. The container application is notified through the SoundComposerItemDone event.
• To completely remove a sound composer session and discard all of the previously added items, call the CloseSound method.
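To make the steps above concrete, here is a minimal sketch of a complete session lifecycle. The SessionCreate, ItemSoundFileAdd and PlaySound calls reuse the signatures shown in the snippets later on this page, while the parameter lists used for SessionAutomaticStopWhenDoneSet and SessionSave are assumptions based on the descriptions above: verify them against the SoundComposer class reference.
Visual C#
// create a 44.1 kHz stereo session on the given player
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2);

// add a sound file at offset 0, loaded in full (same signature as the snippets below)
Int32 nUniqueId = 0;
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "Jingle", 0,
    @"c:\myfolder\jingle.mp3", false, 0, 0, -1, ref nUniqueId);

// ASSUMED parameter list: stop the stream automatically when the last item
// completes, so that the SoundDone event will be raised
audioDjStudio1.SoundComposer.SessionAutomaticStopWhenDoneSet (Player_1, true);

// ASSUMED parameter list: save the session into an XML file on disk
// (memory-based items, if any, would be discarded)
audioDjStudio1.SoundComposer.SessionSave (Player_1, @"c:\myfolder\session.xml");

// start rendering the composed audio stream
audioDjStudio1.PlaySound (Player_1);

// once the session is no longer needed, discard it and all of its items:
// audioDjStudio1.CloseSound (Player_1);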
As mentioned in the second point above, new items can be added to the sound composer session through a set of specific methods which vary depending upon the type of item:
• Sound files
• Audio generated from Microsoft Speech API
After adding an item to the session, you can still modify the item or obtain related information by leveraging the following set of methods:
- SoundComposer.ItemAmplitudeGet to obtain the amplitude of the item
- SoundComposer.ItemAmplitudeSet to modify the amplitude of the item
- SoundComposer.ItemChannelGet to obtain the channel of the audio stream that will reproduce the item
- SoundComposer.ItemChannelSet to modify the channel of the audio stream that will reproduce the item
- SoundComposer.ItemRemove to remove the item from the sound composition
- SoundComposer.ItemInfoGet to obtain the duration, expressed in milliseconds, of the item and the number of audio channels of the original sound file
- SoundComposer.ItemOffsetGet to obtain the offset, expressed in milliseconds, of the item with respect to the beginning of the audio stream
- SoundComposer.ItemOffsetSet to modify the offset, expressed in milliseconds, of the item with respect to the beginning of the audio stream
- SoundComposer.ItemTypeGet to obtain the item's type
- SoundComposer.ItemFriendlyNameGet to obtain the friendly name of the item
- SoundComposer.ItemFriendlyNameSet to modify the friendly name of the item
Other methods, specific to each type of item, are described in the sections below.
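As an illustration, the fragment below queries and modifies an item previously added to the session; this is only a sketch, whose parameter lists are assumptions modeled on the Get/Set pairs above, so check the SoundComposer class reference for the exact signatures.
Visual C#
// ASSUMED parameter lists for all of the calls below
Int32 nUniqueId = 0; // replace with an identifier returned by one of the Item...Add methods

// obtain the current amplitude of the item, then halve it
float fAmplitude = 0;
audioDjStudio1.SoundComposer.ItemAmplitudeGet (Player_1, nUniqueId, ref fAmplitude);
audioDjStudio1.SoundComposer.ItemAmplitudeSet (Player_1, nUniqueId, fAmplitude / 2);

// move the item so that it starts 5 seconds after the beginning of the stream
audioDjStudio1.SoundComposer.ItemOffsetSet (Player_1, nUniqueId, 5000);

// give the item a more descriptive friendly name
audioDjStudio1.SoundComposer.ItemFriendlyNameSet (Player_1, nUniqueId, "Background bed");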
Adding sound files
The sound composer allows adding sound files stored in different audio formats and media:
• Sound files stored on the local disk, through the SoundComposer.ItemSoundFileAdd method or the SoundComposer.ItemSoundFileRawAdd method: accepted audio formats are the same supported by the LoadSound and LoadSoundFromRawFile methods.
• Sound files stored inside a memory buffer, through the SoundComposer.ItemSoundFileMemoryAdd method or the SoundComposer.ItemSoundFileMemoryRawAdd method: accepted audio formats are the same supported by the LoadSoundFromMemory and LoadSoundFromRawMemory methods; as mentioned before, memory-based items are ignored and discarded when the SoundComposer.SessionSave method is invoked.
Once the sound file has been added to the session, you can modify some of its settings through the following set of methods:
- SoundComposer.ItemSoundFileLoadRangeGet to obtain the loading range of the item
- SoundComposer.ItemSoundFileVolumeSmoothingGet to obtain the volume smoothing (fade-in/fade-out), if any, applied to the item
- SoundComposer.ItemSoundFileVolumeSmoothingSet to modify the volume smoothing (fade-in/fade-out) applied to the item
As you can see, there is a considerable number of methods that can be used to modify how a specific sound file will be mixed into the audio stream.
If you need to add a new sound file item or speech item at the exact end of an existing item, you may append the new sound file by invoking the SoundComposer.ItemAppendNext method immediately before invoking the SoundComposer.ItemSoundFileAdd method (or one of the other methods for adding files in raw format or memory-based files).
The SoundComposer.ItemAppendNext method receives the unique identifier of the already existing item, allowing the next call to the SoundComposer.ItemSoundFileAdd method to calculate the exact offset at which the new item will be placed inside the audio stream.
When the SoundComposer.ItemAppendNext method is invoked successfully, the next sound file is appended to the given existing item on the same audio channel of the sound composer and with the same "downmix to mono" setting. After the new item has been appended, the "append mode" is automatically reset.
Let's see a small code snippet where we create a session with a stereo audio stream; on this stream we will perform the following actions:
• add two stereo sound files (myfile1.mp3 and myfile2.wma) to the session
• limit the loading range of both sound files to 30 seconds, starting from second 10 of each sound file
• load the first file at offset 0 on the final audio stream
• load the second file at offset 28000 (28 seconds) on the final audio stream
• apply a fade-in of 2 seconds at the beginning of each sound file
• apply a fade-out of 3 seconds at the end of each sound file
As you can see, the two sound files will overlap for a small chunk of 2 seconds, during the fade-in/fade-out, on the final audio stream:
Visual Basic.NET
' create the audio stream on a specific player (44100, stereo)
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2)

' add the two sound files by limiting their loading range from second 10 to second 40 (30 seconds)
Dim nUniqueIdFile1 As Int32 = 0
Dim nUniqueIdFile2 As Int32 = 0
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "First sound", 0, _
    "c:\myfolder\myfile1.mp3", False, 0, 10000, 40000, nUniqueIdFile1)
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "Second sound", 0, _
    "c:\myfolder\myfile2.wma", False, 28000, 10000, 40000, nUniqueIdFile2)

' add a 2 seconds fade-in and a 3 seconds fade-out to the two sound files
audioDjStudio1.SoundComposer.ItemSoundFileVolumeSmoothingSet (Player_1, nUniqueIdFile1, 2000, 3000)
audioDjStudio1.SoundComposer.ItemSoundFileVolumeSmoothingSet (Player_1, nUniqueIdFile2, 2000, 3000)

' start playback of the final audio stream
audioDjStudio1.PlaySound (Player_1)
Visual C#
// create the audio stream on a specific player (44100, stereo)
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2);

// add the two sound files by limiting their loading range from second 10 to second 40 (30 seconds)
Int32 nUniqueIdFile1 = 0;
Int32 nUniqueIdFile2 = 0;
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "First sound", 0,
    @"c:\myfolder\myfile1.mp3", false, 0, 10000, 40000, ref nUniqueIdFile1);
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "Second sound", 0,
    @"c:\myfolder\myfile2.wma", false, 28000, 10000, 40000, ref nUniqueIdFile2);

// add a 2 seconds fade-in and a 3 seconds fade-out to the two sound files
audioDjStudio1.SoundComposer.ItemSoundFileVolumeSmoothingSet (Player_1, nUniqueIdFile1, 2000, 3000);
audioDjStudio1.SoundComposer.ItemSoundFileVolumeSmoothingSet (Player_1, nUniqueIdFile2, 2000, 3000);

// start playback of the final audio stream
audioDjStudio1.PlaySound (Player_1);
Now let's see another small code snippet where we again create a session with a stereo audio stream; on this stream we will perform the following actions:
• add a sound file (myfile1.mp3) to the session
• append a second sound file (myfile2.wma) to the previous one, separating them with 2 seconds of silence
Visual Basic.NET
' create the audio stream on a specific player (44100, stereo)
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2)

Dim nUniqueIdFile1 As Int32 = 0
Dim nUniqueIdFile2 As Int32 = 0

' add the first sound file
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "First sound", 0, _
    "c:\myfolder\myfile1.mp3", False, 0, 0, -1, nUniqueIdFile1)

' predispose the next item to be appended and leave 2 seconds of silence in between
audioDjStudio1.SoundComposer.ItemAppendNext (Player_1, nUniqueIdFile1, 2000)

' append the second item to the previous one
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "Second sound", 0, _
    "c:\myfolder\myfile2.wma", False, 0, 0, -1, nUniqueIdFile2)

' start playback of the final audio stream
audioDjStudio1.PlaySound (Player_1)
Visual C#
// create the audio stream on a specific player (44100, stereo)
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2);

Int32 nUniqueIdFile1 = 0;
Int32 nUniqueIdFile2 = 0;

// add the first sound file
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "First sound", 0,
    @"c:\myfolder\myfile1.mp3", false, 0, 0, -1, ref nUniqueIdFile1);

// predispose the next item to be appended and leave 2 seconds of silence in between
audioDjStudio1.SoundComposer.ItemAppendNext (Player_1, nUniqueIdFile1, 2000);

// append the second item to the previous one
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "Second sound", 0,
    @"c:\myfolder\myfile2.wma", false, 0, 0, -1, ref nUniqueIdFile2);

// start playback of the final audio stream
audioDjStudio1.PlaySound (Player_1);
Adding audio generated from Microsoft Speech API
Text to speech is the artificial production of human speech, obtained by leveraging the Microsoft Speech API installed on Windows systems. Text is converted into an audio stream containing spoken voice through the voices installed inside the system: you can enumerate installed voices through the combination of the SoundGenerator.SpeechVoicesNumGet and SoundGenerator.SpeechVoiceAttributeGet methods. The sound composer can generate audio streams starting from a string of text, through the SoundComposer.ItemSpeechFromStringAdd method, or from a file containing text, through the SoundComposer.ItemSpeechFromFileAdd method; in both cases the provided text may optionally contain XML markup: see the MSDN documentation for a tutorial about the XML markup syntax.
Once the speech item has been added to the session, you can modify some of its settings through the following set of methods:
- SoundComposer.ItemSpeechFileSet to modify the file containing the text to speak (only if the item was added through the SoundComposer.ItemSpeechFromFileAdd method)
- SoundComposer.ItemSpeechStringSet to modify the string of text to speak (only if the item was added through the SoundComposer.ItemSpeechFromStringAdd method)
- SoundComposer.ItemContentGet to obtain the current string of text to speak or the pathname of the file containing the text
- SoundComposer.ItemSpeechVoiceGet to obtain the speaking voice
- SoundComposer.ItemSpeechVoiceSet to modify the speaking voice
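The fragment below sketches how installed voices could be enumerated and one of them assigned to an existing speech item; the parameter lists of SpeechVoicesNumGet, SpeechVoiceAttributeGet and ItemSpeechVoiceSet, as well as the "Name" attribute key, are assumptions, so check the respective class references for the exact signatures.
Visual C#
// ASSUMED parameter lists for all of the calls below
Int32 nUniqueIdSpeech = 0; // replace with the identifier of an existing speech item

// enumerate the voices installed inside the system
Int32 nVoices = audioDjStudio1.SoundGenerator.SpeechVoicesNumGet ();
for (Int32 nIndex = 0; nIndex < nVoices; nIndex++)
{
    // query a descriptive attribute of the voice (the attribute key is hypothetical)
    string strName = "";
    audioDjStudio1.SoundGenerator.SpeechVoiceAttributeGet (nIndex, "Name", ref strName);
    Console.WriteLine ("Voice {0}: {1}", nIndex, strName);
}

// assign the second installed voice (index 1) to the speech item
audioDjStudio1.SoundComposer.ItemSpeechVoiceSet (Player_1, nUniqueIdSpeech, 1);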
If you need to add a new speech item at the exact end of an existing item, which could be a sound file or another speech item, you may append the new item by invoking the SoundComposer.ItemAppendNext method immediately before invoking the SoundComposer.ItemSpeechFromStringAdd or SoundComposer.ItemSpeechFromFileAdd method.
The SoundComposer.ItemAppendNext method receives the unique identifier of the already existing item, allowing the next call to the SoundComposer.ItemSpeechFromStringAdd or SoundComposer.ItemSpeechFromFileAdd method to calculate the exact offset at which the new item will be placed inside the audio stream.
When the SoundComposer.ItemAppendNext method is invoked successfully, the speech item or sound file is appended to the given existing item on the same audio channel of the sound composer and with the same "downmix to mono" setting. After the new item has been appended, the "append mode" is automatically reset.
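For example, a spoken station identification could be appended right after a jingle as in the sketch below; the calls reuse the signatures shown in the snippets on this page, while the file name and the values of the individual speech parameters are illustrative assumptions (refer to the method's documentation for their exact meaning).
Visual C#
// create the audio stream on a specific player (44100, stereo)
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2);

// add the jingle at offset 0, loaded in full (the file name is hypothetical)
Int32 nUniqueIdJingle = 0;
audioDjStudio1.SoundComposer.ItemSoundFileAdd (Player_1, "Jingle", 0,
    @"c:\myfolder\jingle.mp3", false, 0, 0, -1, ref nUniqueIdJingle);

// predispose the next item to be appended half a second after the jingle's end
audioDjStudio1.SoundComposer.ItemAppendNext (Player_1, nUniqueIdJingle, 500);

// append the spoken identification (same parameters as the speech snippet below)
Int32 nUniqueIdSpeech = 0;
audioDjStudio1.SoundComposer.ItemSpeechFromStringAdd (Player_1, "Station ID", 0,
    "You are listening to my radio station", 0, true, 1, 0, ref nUniqueIdSpeech);

// start playback of the final audio stream
audioDjStudio1.PlaySound (Player_1);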
Let's see a small code snippet where we create a session with a stereo audio stream; on this stream we will add two mono speech items, one for each channel, generated from a string of text (in order to keep the code simpler we have used a single unique identifier variable for both items; in a real situation you may want to use a separate variable for each item):
Visual Basic.NET
' create the audio stream on a specific player (44100, stereo)
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2)

Dim nDummyUniqueId As Int32 = 0
Dim nChannel As Integer = 0

' add the string of text on the two channels of the audio stream (channel 0 and 1)
audioDjStudio1.SoundComposer.ItemSpeechFromStringAdd (Player_1, "", nChannel, _
    "This is a string of text to speech", 0, True, 1, 0, nDummyUniqueId)
audioDjStudio1.SoundComposer.ItemSpeechFromStringAdd (Player_1, "", nChannel+1, _
    "This is a string of text to speech", 0, True, 1, 0, nDummyUniqueId)

' start playback of the final audio stream
audioDjStudio1.PlaySound (Player_1)
Visual C#
// create the audio stream on a specific player (44100, stereo)
audioDjStudio1.SoundComposer.SessionCreate (Player_1, 44100, 2);

Int32 nDummyUniqueId = 0;
int nChannel = 0;

// add the string of text on the two channels of the audio stream (channel 0 and 1)
audioDjStudio1.SoundComposer.ItemSpeechFromStringAdd (Player_1, "", nChannel,
    "This is a string of text to speech", 0, true, 1, 0, ref nDummyUniqueId);
audioDjStudio1.SoundComposer.ItemSpeechFromStringAdd (Player_1, "", nChannel+1,
    "This is a string of text to speech", 0, true, 1, 0, ref nDummyUniqueId);

// start playback of the final audio stream
audioDjStudio1.PlaySound (Player_1);
An example of usage of the sound composer object in Visual Basic.NET and Visual C# can be found inside the following sample installed with the product's setup package:
- SoundComposer