The ISDL modulation toolbox is a set of Matlab files that enables you to perform modulation spectral analysis of speech and other sounds.
We are currently maintaining two versions of the modulation spectral analysis software in the toolbox: a command line version and a graphical user interface. See the webpage for each of the versions for more information.
If you plan to use the modulation toolbox, please note that the toolbox was designed using Matlab 6.5 and has not yet been tested on other Matlab versions.
July 7, 2003 (revision 1.00)
First release of the modulation toolbox under a non-profit license. See download instructions for information on how to obtain your copy of the software.
January 20, 2004 (revision 1.15)
Replaced Matlab 6.5's short-circuit logical operators (|| and &&) with the pre-6.5 element-wise operators (| and &) for compatibility with older Matlab versions.
April 8, 2004 (revision 1.20)
Added GUI option to evaluate functions on the modulation spectrogram.
March 25, 2005 (revision 1.23)
Added colormap option to the GUI menu.
The command-line version of the modulation toolbox adds a function called 'modspecgram' to the Matlab environment.
modspecgram(a);
b = modspecgram(a);
[b,f,t] = modspecgram(a,fs,basewindow,baseoverlap,basenfft,modwindow,modnfft);
modspecgram calculates the modulation spectrogram from a signal.
b = modspecgram(a,fs,basewindow,baseoverlap,basenfft,modwindow,modnfft)
calculates the modulation spectrogram for the signal in vector a.
modspecgram splits the signal into overlapping segments, windows each with the basewindow vector and forms the intermediate columns of b with their zero-padded, length basenfft discrete Fourier transforms (similar to a conventional spectrogram). modspecgram then takes the magnitude of b, windows each row of b with the modwindow vector and computes a second transform across the rows of b using either a length modnfft discrete Fourier transform, or a Hierarchical Lapped Transform. Thus each column of b contains an estimate of the short-term, modulation frequency localized frequency content of the signal a. Modulation frequency increases linearly across the columns of b, from left to right. (Acoustic) frequency increases linearly down the rows, starting at 0.
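The two-stage procedure described above can be sketched in a few lines of plain Matlab. This is an illustrative sketch under stated assumptions, not the toolbox's source code: the sine-window formula, the test signal, and the names S, M, hop and fm are hypothetical, and only the Fourier variant of the second transform is shown.

```matlab
% Hypothetical two-stage modulation spectrogram (not the toolbox source).
fs = 8000;                                   % sampling frequency (Hz)
n  = 0:2*fs-1;                               % two seconds of samples
a  = sin(2*pi*800*n/fs) .* (1 + 0.8*sin(2*pi*3*n/fs)) / 2;  % 800 Hz carrier, 3 Hz modulator

basenfft    = 256;                           % length of the base DFT
baseoverlap = 192;                           % 75% overlap between segments
hop         = basenfft - baseoverlap;        % samples between segment starts
basewindow  = sin(pi*((0:basenfft-1)' + 0.5)/basenfft);     % assumed sine window

% Stage 1: windowed DFT of each overlapping segment (an ordinary spectrogram).
ncols = fix((length(a) - baseoverlap)/hop);
S = zeros(basenfft/2+1, ncols);
for k = 1:ncols
    seg = a((k-1)*hop + (1:basenfft)).';     % k-th segment as a column
    X   = fft(seg .* basewindow, basenfft);
    S(:,k) = abs(X(1:basenfft/2+1));         % magnitude, redundant bins dropped
end

% Stage 2: window each row and take a DFT along the rows (across time).
modnfft   = ncols;
modwindow = sin(pi*((0:ncols-1) + 0.5)/ncols);              % row vector
M = fft(S .* repmat(modwindow, size(S,1), 1), modnfft, 2);
b = M(:, 1:floor(modnfft/2)+1);              % one-sided in modulation frequency

f  = (0:basenfft/2)' * fs/basenfft;              % acoustic frequency axis (Hz)
fm = (0:floor(modnfft/2)) * (fs/hop)/modnfft;    % modulation frequency axis (Hz)
```

Plotting abs(b) against fm and f with imagesc and axis xy should show the energy of this test signal concentrated near 800 Hz acoustic frequency and 3 Hz modulation frequency, apart from the zero-modulation-frequency column.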
If a is a complex signal, b is a complex matrix with basenfft rows and modnfft/2+1 columns. If a is real, b still has modnfft/2+1 columns, but the higher (acoustic) frequency components are truncated (because they are redundant); in that case, modspecgram returns b with basenfft/2+1 rows for basenfft even and (basenfft+1)/2 rows for basenfft odd.
If you specify a scalar for basewindow, modspecgram uses a sine-window of that length. basewindow must have length smaller than or equal to basenfft and greater than baseoverlap. baseoverlap is the number of samples by which the sections of a overlap. modwindow must have length smaller than or equal to the number of columns in the spectrogram of a and smaller than or equal to modnfft. fs is the sampling frequency, which does not affect the modulation spectrogram but is used for scaling plots.
[b,f,t] = modspecgram(a,fs,basewindow,baseoverlap,basenfft,modwindow,modnfft)
returns a column of frequencies f and one of modulation frequencies t at which the modulation spectrogram is computed. f has length equal to the number of rows of b, and t has length equal to the number of columns of b. If you leave fs unspecified, modspecgram assumes a default of 2 Hz.
b = modspecgram(a) produces the modulation spectrogram of the signal a using default settings; the defaults are basewindow = basenfft = 256, baseoverlap = basesize*3/4 (75% of the base-window length) and modwindow = modnfft = the number of columns in the spectrogram of a. You can tell modspecgram to use the default for any parameter by leaving it off or using [] for that parameter.
modspecgram(a,...,'fft') performs the second transform using the
discrete Fourier transform (this is the default).
modspecgram(a,...,'hlt') performs the second transform using the
Hierarchical Lapped Transform.
modspecgram with no output arguments plots the absolute value of the modulation spectrogram in the current figure, using axis xy and colormap(jet). The low modulation frequency content of the lowest frequency is displayed in the lower left corner of the axes.
- Load one of the predefined audio signals in Matlab and display its modulation spectrogram:
>> load train;
>> modspecgram(y,Fs);
This will bring up a figure with the modulation spectrogram of the train whistle.
- Specifying the base-window:
>> load train;
>> modspecgram(y,Fs,128);
>> modspecgram(y,Fs,hann(128));
The third parameter controls the base-window, either by specifying its length or by specifying a window function. The length of the base-window controls the resolution on the acoustic frequency axis (the vertical axis in the modulation spectrogram). The default base-window size is 256, so specifying a base-window size of 128 reduces the resolution on the acoustic frequency axis, for an increase in resolution on the modulation frequency axis, as shown in the figure below. Specifying a window function gives full control over the length and shape of the base-window.
- Specifying window overlap:
>> load train;
>> modspecgram(y,Fs,256,128);
The fourth parameter controls overlap between windows in the base-transform. The default value is 75% of the base-window length, but you may also specify 50% of the base-window length, as is done in this example.
- Selecting a Hierarchical Lapped Transform as the second transform:
>> load train;
>> modspecgram(y,Fs,'hlt');
The last parameter on the list specifies which transform to use as a second transform. If this parameter is omitted or set to 'fft', the Fourier transform is selected. If it is set to 'hlt', the Hierarchical Lapped Transform is selected. The output of this example is quite different from the previous examples, since the horizontal axis no longer shows modulation frequency, but a number of different time scales.
- Getting numerical output from modspecgram:
>> load train;
>> b = modspecgram(y,Fs);
In this example, modspecgram does not create a figure, but instead returns the coefficients of the modulation spectrogram in the matrix b.
- Concentrating signal energy in a single coefficient in the modulation spectrogram:
>> Fs = 8000;
>> y = sin(2*pi*800*[0:Fs*2]/Fs) .* ...
       (1 + 0.8 * sin(2*pi*3*[0:Fs*2]/Fs)) / 2;
>> modspecgram(y,Fs);
These instructions construct a modulated carrier signal, with a carrier at 800 Hz and a modulator at 3 Hz. Most of the energy of this signal is in a single coefficient in the modulation spectrogram, at acoustic frequency 800 Hz and modulation frequency 3 Hz.
- Increasing the resolution of the modulation spectrogram:
>> load train;
>> modspecgram(y,Fs,256,128,1024,[],1024);
The fifth and seventh parameters (basenfft and modnfft) control the resolution of the modulation spectrogram in acoustic frequency and modulation frequency respectively. This example sets both parameters to 1024, and as a result the output figure contains more detail (compare with example 3).
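The effect of the base-window length in the examples above can be checked with some quick arithmetic. The sketch below is only a back-of-the-envelope calculation with illustrative variable names; it assumes Fs = 8192 Hz for the train signal, basenfft equal to the base-window length, and the default 75% overlap.

```matlab
% Rough resolution figures for two base-window lengths (illustrative only).
Fs          = 8192;                      % assumed sampling frequency of 'train' (Hz)
basewindow  = [256 128];                 % base-window lengths to compare
baseoverlap = basewindow * 3/4;          % default overlap: 75% of the window
hop         = basewindow - baseoverlap;  % samples between segment starts

acoustic_res = Fs ./ basewindow;         % spacing of acoustic frequency bins (Hz)
frame_rate   = Fs ./ hop;                % spectrogram columns per second
mod_range    = frame_rate / 2;           % highest modulation frequency covered (Hz)

disp([basewindow; acoustic_res; mod_range]);
```

Under these assumptions, halving the base window from 256 to 128 samples doubles the acoustic bin spacing from 32 Hz to 64 Hz, while the range of modulation frequencies covered grows from 64 Hz to 128 Hz.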
The graphical user interface (GUI) version of the modulation toolbox adds a function called 'modspecgramgui' to the Matlab environment.
modspecgramgui starts a graphical user interface for the analysis,
modification and synthesis of signals with respect to their modulation spectrum.
When loaded with a signal, the main window of the GUI looks like this.
The titlebar of the window contains information about the loaded signal; its source
(file or workspace), the name of the file or the workspace variable, the duration
of the signal and its sampling frequency. The graphical axes show (from top to
bottom): the time waveform of the signal, the spectrogram of the signal, and one
frame of the modulation spectrogram of the signal. The location of the
frame is marked in the time waveform and the spectrogram by a dashed black box.
In the figure above, the variable
zero was loaded from the Matlab
workspace. It has a duration of 0.7 seconds at 10 kHz. The bottom axes
show the sixth (and last) modulation spectrogram frame of the input signal.
How to use the GUI
Loading a signal
To load a signal from disk into the modulation spectrogram interface, select File->Open... from the menu. A dialog box will appear in which you can select the wave-file (*.wav) you want to open. After selecting Open, the wave-file will be loaded.
To load a signal from the Matlab workspace into the modulation spectrogram interface, select File->Import... from the menu. A dialog box will appear listing all workspace variables that could represent an audio signal (i.e. all vectors of doubles), as well as all workspace variables that could be a sampling frequency (all doubles). Select the signal/sampling frequency pair that you want to import, or select a signal and specify an alternate sampling frequency. Check the normalize checkbox if you want to normalize the signal prior to loading. When a signal is normalized, its mean is subtracted from the signal and its maximum amplitude is set to 1.
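The normalization described above can be reproduced at the command line before importing a signal. This is a minimal sketch of the two steps as the text states them, not the GUI's own code; the variable names x and xn and the sample values are made up for illustration.

```matlab
% Normalize a signal: remove the mean, then scale the peak amplitude to 1.
x  = [0.2 0.5 -0.1 0.9 0.4];   % made-up example signal
xn = x - mean(x);              % subtract the mean
xn = xn / max(abs(xn));        % largest absolute value becomes 1
```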
A word of warning
The modulation spectrogram GUI was not designed with memory efficiency in mind. As a consequence, loading large signals into the interface on a computer without a large amount of memory will probably slow the computer down considerably. We recommend limiting the signals you load to 100,000 samples.
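One simple way to stay within this limit is to truncate a workspace signal before importing it. The sketch below is only a suggestion; the names maxlen and y_short and the stand-in signal are hypothetical.

```matlab
% Keep at most 100000 samples of a signal before importing it into the GUI.
maxlen  = 100000;                         % recommended upper limit
y       = sin(2*pi*440*(0:249999)/8000);  % stand-in for a long signal
y_short = y(1:min(length(y), maxlen));    % shorter signals are left unchanged
```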
Adjusting the modulation spectrogram parameters
To change the parameters that are used in the modulation spectrogram transform, open the Transform menu. The first three items allow you to select the type of second transform used. Currently, the GUI supports the Fourier transform, a Hierarchical Lapped Transform that uses an odd-DFT basis, and a Hierarchical Lapped Transform that uses a modified DCT basis. Technical details on the second transform will be posted shortly on this webpage.
The last item of the Transform menu allows you to change the transform parameters. Selecting this item shows a dialog box featuring two sets of parameters. The base window parameters control the window size and window overlap of the base transform. Choosing a large base window size will give you good (acoustic) frequency resolution, at the price of less range in modulation frequency.