*One component of the Modulation Toolbox for Matlab.

1. Introduction

The graphical user interface (GUI) version of the Modulation Toolbox adds a function called modspecgramgui to the Matlab environment. Type the function name in the Matlab console to start the GUI, which supports the analysis, modification, and synthesis of signals with respect to their modulation spectrum. When loaded with a signal, the main window of the GUI will appear as in Figure 1.


Figure 1. Modulation spectrogram using default settings.


Try it yourself by loading 'speech_short.wav' located in the \sounds folder. In general, there are two ways to load a signal into the modulation spectrogram interface, as described next.

Loading a Signal from File

To load a signal from disk, select File->Open... from the menu. A dialog box will appear in which you can select the wave-file (*.wav) you want to open. If you load a stereo recording, the left and right channels will be averaged together to form a single-channel signal.
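The channel averaging described above can be pictured with a few lines of base Matlab. This is only a sketch of the idea, not the GUI's own code: the synthetic two-channel signal, the sampling rate, and all variable names are assumptions made for illustration.

```matlab
% Synthetic two-channel signal standing in for a stereo recording.
% (With a real file you might instead call: [stereo, fs] = audioread('speech_short.wav');)
fs = 8000;                          % assumed sampling rate in Hz
t  = (0:fs-1)' / fs;                % one second of samples, as a column vector
left   = sin(2*pi*440*t);           % hypothetical left channel
right  = 0.5 * sin(2*pi*440*t);     % hypothetical right channel
stereo = [left, right];             % N-by-2 matrix, one column per channel

% Average across the columns (dimension 2) to obtain a single-channel signal.
mono = mean(stereo, 2);
```

Averaging keeps the result within the amplitude range of the original channels, whereas summing the channels could exceed it.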

Loading a Signal from Workspace

To load a signal from the Matlab workspace, go to the menu bar and select File->Import... A dialog box will appear listing all workspace variables that could represent an audio signal (i.e., all vectors of doubles), as well as all workspace variables that could be a sampling frequency (all doubles). Select the signal/sampling frequency pair that you want to import, or select a signal and specify an alternate sampling frequency. Check the normalize checkbox if you want to subtract the mean and scale the signal so that its maximum amplitude is equal to 1.
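As a rough illustration of what the normalize checkbox is described as doing, the following sketch creates a workspace signal and sampling frequency, then removes the mean and rescales to a peak amplitude of 1. The variable names x, fs, and xNorm and the test signal are hypothetical, and the GUI's internal implementation may differ.

```matlab
% Hypothetical workspace variables: a vector of doubles and a scalar double.
fs = 16000;                             % assumed sampling frequency in Hz
t  = (0:fs-1)' / fs;                    % one second of samples
x  = 0.3 + 0.2 * sin(2*pi*220*t);       % test signal with a DC offset

% Subtract the mean, then scale so the maximum absolute amplitude is 1.
xNorm = x - mean(x);
peak  = max(abs(xNorm));
if peak > 0                             % guard against an all-constant signal
    xNorm = xNorm / peak;
end
```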

A word of warning
The modulation spectrogram GUI was not designed with memory efficiency in mind. As a consequence, loading a long signal on a computer with limited memory will probably slow the computer down considerably. We recommend limiting the signals you load to 100,000 samples.
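One simple way to stay within the recommended length is to truncate a workspace signal before importing it. The sketch below assumes a hypothetical signal x; the names maxSamples and xShort are illustrative only.

```matlab
maxSamples = 100000;                    % recommended upper limit from the text
x = randn(250000, 1);                   % hypothetical long signal (column vector)

% Keep at most the first maxSamples samples; shorter signals are left unchanged.
xShort = x(1:min(numel(x), maxSamples));
```

At a sampling frequency of fs, the retained segment lasts maxSamples/fs seconds, so the limit corresponds to a shorter duration at higher sampling rates.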

Viewing Modulation Spectra

The central figure in the graphical interface is the joint-frequency representation, or modulation spectrum, of one data frame from the input signal. It shows the modulation frequency content (horizontal axis) versus the carrier frequency bin (vertical axis). For more explanation, refer to the second tutorial, tutorial2_modspecgram.m included with the toolbox installation.

To see the modulation spectrum for a different segment in time, simply click on the signal spectrogram or on the time-domain plot of the signal itself. The dashed lines indicate the extent of the input data currently being analyzed.

We are now ready to modify the modulation spectrum parameters.

2. Adjusting Demodulation Parameters

The Modulation Toolbox is designed to allow you to compare the performance of various demodulation methods. Broadly speaking, demodulation is either incoherent or coherent. The difference is that coherent demodulation detects carrier signals that are constrained by certain properties that allow more effective modulation filtering. To try out some different demodulation settings, click on Options->Demodulation Options... You will see a dialog box that looks like Figure 2.


Figure 2. Demodulation options dialog box.


Data Frame Settings

First, note the "Data frame settings" rectangle. The drop-down box allows you to set the length of the window used to segment the time-domain input signal. A modulation spectrogram is computed for each frame of the input signal. You can also set the overlap between successive frames to either 50% or 75%.
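The relationship between frame length, overlap, and the number of frames can be sketched as follows. The frame length of 0.5 seconds, the signal, and the choice to drop any incomplete final frame are assumptions for illustration; the text does not say how the GUI treats the end of the signal or which window shape it applies.

```matlab
fs       = 16000;                       % assumed sampling frequency in Hz
x        = randn(3 * fs, 1);            % hypothetical 3-second signal
frameLen = fs / 2;                      % assumed frame length: 0.5 s = 8000 samples
overlap  = 0.75;                        % 75% overlap (the other option is 0.50)

% The hop size is the part of each frame that does not overlap the previous one.
hop = round(frameLen * (1 - overlap));  % 2000 samples here

% Number of complete frames, and the starting sample of each frame.
numFrames = floor((numel(x) - frameLen) / hop) + 1;
starts    = (0:numFrames-1) * hop + 1;

% Extract, for example, the sixth frame.
k      = 6;
frameK = x(starts(k) : starts(k) + frameLen - 1);
```

With these values a higher overlap gives more frames for the same signal: 75% overlap produces roughly twice as many frames as 50%.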

Demodulation Methods

Next, look at the "Demodulation methods" rectangle. The radio buttons allow you to pick one of three methods: Hilbert envelope (incoherent), spectral center-of-gravity (coherent), and pitch-synchronous (coherent).
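To give some intuition for the name "spectral center-of-gravity," the sketch below computes the power-weighted mean frequency of a narrowband test signal. It is a simplified, assumed illustration using one global value for one band; it is not the toolbox's demodulation algorithm, whose details are not given here, and all names and signal parameters are hypothetical.

```matlab
fs = 8000;                              % assumed sampling frequency in Hz
N  = 8000;                              % one second, so FFT bins are 1 Hz apart
t  = (0:N-1)' / fs;

% Narrowband test signal: a 1 kHz carrier with a slow 4 Hz amplitude modulation.
x = (1 + 0.5 * cos(2*pi*4*t)) .* cos(2*pi*1000*t);

% Power spectrum over the non-negative frequencies.
X     = fft(x);
half  = (1:N/2 + 1)';                   % indices of bins from 0 Hz to fs/2
freqs = (half - 1) * fs / N;            % frequency of each bin in Hz
P     = abs(X(half)).^2;

% Center of gravity: the power-weighted average frequency.
cogHz = sum(freqs .* P) / sum(P);
```

For this signal the sidebands at 996 Hz and 1004 Hz carry equal power, so the center of gravity falls on the 1 kHz carrier.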

Figure 1 shows the default setting using spectral center-of-gravity on the first frame of 'speech_short.wav'. To see the Hilbert envelope spectrum, select "Hilbert envelope" and click 'Ok'. You should see the results shown in Figure 3.


Figure 3. Modulation spectra using Hilbert envelope (incoherent) demodulation.
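For readers unfamiliar with the Hilbert envelope, the following sketch computes it for a single band using only base Matlab: the analytic signal is formed by zeroing the negative-frequency half of the FFT, and the envelope is its magnitude. The test signal and variable names are assumptions, and the GUI applies its demodulation per filterbank subband rather than to the full-band signal as done here.

```matlab
fs = 8000;                              % assumed sampling frequency in Hz
N  = 8000;                              % one second of samples
t  = (0:N-1)' / fs;

m = 1 + 0.5 * cos(2*pi*4*t);            % known modulator (4 Hz), always positive
x = m .* cos(2*pi*1000*t);              % modulator times a 1 kHz carrier

% Build the analytic signal: keep DC, double positive frequencies,
% and zero the negative frequencies.
X = fft(x);
h = zeros(N, 1);
h(1) = 1;                               % DC bin
if mod(N, 2) == 0
    h(2:N/2)   = 2;                     % positive-frequency bins
    h(N/2 + 1) = 1;                     % Nyquist bin
else
    h(2:(N + 1)/2) = 2;                 % positive-frequency bins (odd N)
end
analytic = ifft(X .* h);

% The Hilbert envelope is the magnitude of the analytic signal.
envelope = abs(analytic);
```

Because the magnitude discards the phase of the analytic signal, the recovered envelope is real and non-negative, which is the sense in which this demodulation is called incoherent.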


Or, observe the pitch-synchronous modulator spectrum, as shown in Figure 4, where this time we have shifted attention to the sixth analysis frame.


Figure 4. Modulation spectra using pitch-synchronous (coherent) demodulation.


The next section discusses filterbank settings, which only apply to the spectral COG and the Hilbert envelope demodulation methods. Before proceeding, open the Demodulation Options dialog box again and select "Hilbert envelope (incoherent)." This method is the quickest to compute and will allow you to easily test the effects of the filterbank settings.

Filterbank Settings

Finally, the "Filterbank settings" rectangle contains parameters that control how many carriers are detected and how they are constrained in frequency (but they only apply to Hilbert envelope and spectral center-of-gravity demodulation methods). The plot next to it shows representative subband frequency responses from the filterbank.


Figure 5. Updated demodulation options dialog using a smaller downsampling rate.



Figure 6. Updated demodulation options dialog showing reduced subband spectral overlap.


Choosing the Modulation Transform

Referring to the "modulation spectrum" often involves taking the Fourier transform of the modulators of a signal. As in the previous version of the modulation spectrogram GUI, other transforms are offered as well. When you go to the Transform menu, you will see three options:

After selecting the Daubechies 4 wavelet transform, the modulation spectrogram will appear as in Figure 7.


Figure 7. Modulation spectrum using the Daubechies-4 wavelet transform.
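The Fourier case mentioned at the start of this section, taking the Fourier transform of a modulator, can be sketched for one modulator as follows. The modulator, its sampling rate, and the variable names are hypothetical; the GUI performs the corresponding step for every carrier to build the joint-frequency display.

```matlab
fs = 1000;                              % assumed modulator sampling rate in Hz
N  = 2000;                              % two seconds, so bins are 0.5 Hz apart
t  = (0:N-1)' / fs;

% Hypothetical modulator with a strong 4 Hz and a weaker 30 Hz component.
m = 1 + 0.5 * cos(2*pi*4*t) + 0.2 * cos(2*pi*30*t);

% Fourier transform of the modulator, with the mean removed so that
% the DC term does not dominate.
M        = fft(m - mean(m));
modFreqs = (0:N-1)' * fs / N;           % modulation frequency of each bin in Hz

% Locate the strongest modulation frequency among the non-negative bins.
half       = 1:floor(N/2) + 1;
[~, idx]   = max(abs(M(half)));
peakHz     = modFreqs(idx);
```

Here the largest peak lies at 4 Hz, the dominant rate at which the modulator fluctuates.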


Modifying Modulation Spectra

Looking at the main interface, you can easily design masks in frequency to attenuate, amplify, or zero out selected parts of the modulation spectrogram. Simply click and drag within the joint-frequency axes to make a selection, and then choose an option from the menu on the right. While designing a masking function, click "Symmetrize" at any time to make the mask left-right symmetric. Having completed your mask design, click "Apply." To listen to the resulting signal, click "Play Masked." For instructional purposes, the following screenshot shows an arbitrary masking function, with three zeroed-out portions and a fourth selection that has not yet been modified, as applied to the coherent modulation spectra found via spectral COG.


Figure 8. Modulation spectra with filtering mask applied.
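A mask of this kind can be thought of as a matrix of gains over the joint-frequency plane. The sketch below zeroes one rectangular selection and then mirrors it about 0 Hz. The grid sizes and names are hypothetical, and taking the element-wise minimum with the flipped mask is only one assumed interpretation of what "Symmetrize" does.

```matlab
numCarriers = 8;                        % hypothetical number of carrier bins (rows)
modFreqs    = -20:20;                   % assumed modulation axis in Hz, centered on 0

% Start from an all-pass mask: a gain of 1 everywhere.
mask = ones(numCarriers, numel(modFreqs));

% Zero out a rectangular selection: carriers 2 to 4, modulation 5 to 12 Hz.
sel = (modFreqs >= 5) & (modFreqs <= 12);
mask(2:4, sel) = 0;

% Assumed symmetrize step: combine the mask with its left-right mirror image,
% keeping the smaller gain at each point.
maskSym = min(mask, fliplr(mask));
```

A left-right symmetric mask treats positive and negative modulation frequencies alike, so applying it to a real-valued modulator, whose spectrum is conjugate symmetric, leaves the filtered modulator real-valued.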


Or, we can construct a simple 10 Hz lowpass modulation filter by applying the mask seen in Figure 9.


Figure 9. A 10-Hz lowpass filter applied to the modulation spectra.
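The effect of such a lowpass mask on a single modulator can be sketched in base Matlab. The modulator and all names below are assumptions for illustration; the GUI applies its mask across all carriers and then resynthesizes the audio signal, which this sketch does not attempt.

```matlab
fs = 1000;                              % assumed modulator sampling rate in Hz
N  = 2000;                              % two seconds of samples
t  = (0:N-1)' / fs;

slow = 0.5 * cos(2*pi*4*t);             % 4 Hz component (inside the passband)
fast = 0.2 * cos(2*pi*30*t);            % 30 Hz component (outside the passband)
m    = 1 + slow + fast;                 % hypothetical modulator

% Signed modulation-frequency axis in FFT bin order.
k        = (0:N-1)';
modFreqs = k * fs / N;
neg      = (k >= N/2);                  % upper half of the bins are negative frequencies
modFreqs(neg) = modFreqs(neg) - fs;

% Lowpass mask: pass modulation frequencies within +/-10 Hz, zero the rest.
mask = double(abs(modFreqs) <= 10);

% Apply the mask in the modulation-frequency domain and return to time.
mFiltered = real(ifft(fft(m) .* mask));
```

After filtering, the 30 Hz fluctuation is removed while the mean level and the 4 Hz component are retained.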


For example, we can recreate the results from application1_musicSeparation.m, in which a lowpass modulation filter of 12 Hz isolates the saxophone in a jazz recording using pitch-synchronous demodulation. To verify, load the signal "saturn1.wav" from the \sounds folder. Set the demodulation settings to "pitch synchronous" with min F0 = 179 Hz, max F0 = 550 Hz, and harmonic range = 6050 Hz (these are the settings used in application1_musicSeparation.m). Also, set the data frame size to 1 second, so that the resulting modulation spectrum appears as in Figure 10.


Figure 10. Pitch-synchronous modulation spectrogram of the saxophone/drums mix.


Then, applying a +/-12 Hz lowpass filter in modulation frequency yields the modified spectrum seen in Figure 11. Click the "Play Masked" button to hear the result, which should not contain any of the drumming from the original signal.


Figure 11. The same modulation spectrum as in Figure 10, with a lowpass mask applied to zero out modulation frequencies beyond +/-12 Hz.


Try Your Own Experiments

The Modulation Toolbox for Matlab is publicly available for non-profit research purposes. All of the sound files and screenshots on this page were generated using the toolbox, which includes the modulation spectrogram GUI as well as standalone functions that can be arranged in a variety of experimental topologies.