

The ISDL modulation toolbox is a set of Matlab files that enables you to perform modulation spectral analysis of speech and other sounds.

We are currently maintaining two versions of the modulation spectral analysis software in the toolbox: a command line version and a graphical user interface. See the webpage for each of the versions for more information.

If you plan to use the modulation toolbox, please note that the toolbox was designed using Matlab 6.5 and has not yet been tested on other Matlab versions.


July 7, 2003 (revision 1.00)

First release of the modulation toolbox under a non-profit license. See download instructions for information on how to obtain your copy of the software.

January 20, 2004 (revision 1.15)

Replaced the short-circuit logical operators (|| and &&) introduced in Matlab 6.5 with the element-wise operators (| and &) for compatibility with earlier Matlab versions.

April 8, 2004 (revision 1.20)

Added GUI option to evaluate functions on the modulation spectrogram.

March 25, 2005 (revision 1.23)

Added colormap option to the GUI menu.

Command line version

The command line version of the modulation toolbox adds a function called 'modspecgram' to the Matlab environment.


b = modspecgram(a);
[b,f,t] = modspecgram(a,fs,basewindow,baseoverlap,basenfft,modwindow,modnfft);


modspecgram calculates the modulation spectrogram from a signal.

b = modspecgram(a,fs,basewindow,baseoverlap,basenfft,modwindow,modnfft) calculates the modulation spectrogram for the signal in vector a. modspecgram splits the signal into overlapping segments, windows each with the basewindow vector and forms the intermediate columns of b with their zero-padded, length basenfft discrete Fourier transforms (similar to the specgram function). modspecgram then takes the magnitude of b, windows each row of b with the modwindow vector and computes a second transform across the rows of b using either a length modnfft discrete Fourier transform, or a Hierarchical Lapped Transform. Thus each column of b contains an estimate of the short-term, modulation frequency localized frequency content of the signal a. Modulation frequency increases linearly across the columns of b, from left to right. (Acoustic) frequency increases linearly down the rows, starting at 0.
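The two stages described above can be sketched with base Matlab functions only. This is a minimal illustration under stated assumptions, not the toolbox source: the variable names (baselen, hop, nframes, s, env), the sine-window formula, and the segment-count rule are all assumptions, and only the 'fft' variant of the second transform is shown. The test signal is the modulated carrier used in the examples further down.

```matlab
% Sketch of a two-stage (modulation) transform; NOT the toolbox code.
fs = 8000;                          % sampling frequency in Hz
n  = (0:2*fs-1)';                   % two seconds of samples
a  = sin(2*pi*800*n/fs) .* (1 + 0.8*sin(2*pi*3*n/fs)) / 2;

basenfft    = 256;                  % length of the first (base) DFT
baselen     = 256;                  % base-window length (assumed name)
baseoverlap = 192;                  % overlap between segments, in samples
hop         = baselen - baseoverlap;
basewin     = sin(pi*((0:baselen-1)' + 0.5)/baselen);   % assumed sine window

% Stage 1: windowed, zero-padded DFT of each overlapping segment.
nframes = floor((length(a) - baseoverlap) / hop);       % assumed count rule
s = zeros(basenfft, nframes);
for k = 1:nframes
    seg = a((k-1)*hop + (1:baselen));
    s(:,k) = fft(seg .* basewin, basenfft);
end
s = s(1:basenfft/2+1, :);           % real input: drop the redundant rows

% Stage 2: magnitude, window each row, then a DFT across the rows.
modnfft = nframes;
modwin  = sin(pi*((0:nframes-1) + 0.5)/nframes);        % row vector
env = abs(s) .* repmat(modwin, size(s,1), 1);
b = fft(env, modnfft, 2);
b = b(:, 1:floor(modnfft/2)+1);     % keep non-negative modulation frequencies
```

With these settings the intermediate spectrogram has 247 columns, so b ends up with 129 rows (acoustic frequency) and 124 columns (modulation frequency).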

If a is a complex signal, b is a complex matrix with basenfft rows and modnfft/2+1 columns. If a is real, b still has modnfft/2+1 columns, but the higher (acoustic) frequency components are truncated because they are redundant; in that case, modspecgram returns b with basenfft/2+1 rows for basenfft even and (basenfft+1)/2 rows for basenfft odd.
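The redundancy mentioned above is the conjugate symmetry of the DFT of a real signal. The short check below illustrates it with Matlab's built-in fft on an arbitrary real vector; the variable names are illustrative only.

```matlab
% Conjugate symmetry of the DFT of a real vector (illustration only).
x    = ((1:8)') .^ 2;             % any real test vector
X    = fft(x);
nfft = length(x);
keep = floor(nfft/2) + 1;         % 5 non-redundant rows for nfft = 8

% Rows above 'keep' mirror rows 2..keep-1: X(k+1) == conj(X(nfft-k+1)).
mirror_err = max(abs(X(2:keep-1) - conj(X(nfft:-1:keep+1))));
```

Here mirror_err is zero up to rounding, which is why only floor(nfft/2)+1 rows need to be kept; this expression covers both the even and the odd row counts given above.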

If you specify a scalar for basewindow or modwindow, modspecgram uses a sine window of that length. basewindow must have length smaller than or equal to basenfft and greater than baseoverlap. baseoverlap is the number of samples by which the sections of a overlap. modwindow must have length smaller than or equal to the number of columns in the spectrogram of a and smaller than or equal to modnfft. fs is the sampling frequency, which does not affect the modulation spectrogram but is used for scaling plots.
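The scalar-to-window expansion and the base-transform constraints can be sketched as follows. The exact sine-window definition used by the toolbox is not given here, so the formula sin(pi*(n+0.5)/N) is an assumption (it is one common choice), and the flag names ok_nfft and ok_overlap are hypothetical.

```matlab
% Sketch only; the sine-window formula is an assumption.
basewindow  = 256;     % scalar, so it is expanded into a window
baseoverlap = 192;
basenfft    = 256;

if length(basewindow) == 1
    N = basewindow;
    basewindow = sin(pi*((0:N-1)' + 0.5)/N);
end

% Constraints stated above for the base transform.
ok_nfft    = length(basewindow) <= basenfft;
ok_overlap = length(basewindow) >  baseoverlap;
if ~(ok_nfft & ok_overlap)
    error('need baseoverlap < length(basewindow) <= basenfft');
end
```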

[b,f,t] = modspecgram(a,fs,basewindow,baseoverlap,basenfft,modwindow,modnfft) returns a column of frequencies f and one of modulation frequencies t at which the modulation spectrogram is computed. f has length equal to the number of rows of b, t has length equal to the number of columns of b. If you leave fs unspecified, modspecgram assumes a default of 2 Hz.
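How the toolbox computes f and t is not spelled out here. One plausible reading, assuming uniform DFT bin spacing on both axes, even transform lengths, and a frame rate of fs divided by the hop between segments, is sketched below; the names hop and framerate are assumptions.

```matlab
% Assumed axis construction; not taken from the toolbox source.
fs = 8000; basenfft = 256; baselen = 256; baseoverlap = 192; modnfft = 248;

hop       = baselen - baseoverlap;            % 64 samples between segments
framerate = fs / hop;                         % 125 spectrogram columns per second

f = (0:basenfft/2)' * fs / basenfft;          % acoustic frequencies in Hz
t = (0:modnfft/2)'  * framerate / modnfft;    % modulation frequencies in Hz
```

Under these assumptions f runs from 0 to 4000 Hz in 129 steps and t runs from 0 to 62.5 Hz in 125 steps.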

b = modspecgram(a) produces the modulation spectrogram of the signal a using default settings; the defaults are basewindow = basenfft = 256, baseoverlap = 3/4 of the base-window length, and modwindow = modnfft = the number of columns in the spectrogram of a.

You can tell modspecgram to use the default for any parameter by leaving it off or using [] for that parameter, e.g. modspecgram(a,1000,[],[],1024).
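The "[] means default" convention can be pictured as a simple merge of supplied values over a table of defaults. The snippet below is a hypothetical illustration, not the toolbox's argument handling; in particular it treats each default as independent of the others, which the toolbox may not do.

```matlab
% Hypothetical illustration of the [] convention.
args     = {1000, [], [], 1024};     % fs, basewindow, baseoverlap, basenfft
defaults = {2, 256, 192, 256};       % defaults as described above

params = defaults;
for k = 1:length(args)
    if ~isempty(args{k})
        params{k} = args{k};         % a supplied value overrides the default
    end
end
```

After the loop, params holds fs = 1000 and basenfft = 1024, while basewindow and baseoverlap keep their default values.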

modspecgram(a,...,'fft') performs the second transform using the discrete Fourier transform (this is the default). modspecgram(a,...,'hlt') performs the second transform using the Hierarchical Lapped Transform.

modspecgram with no output arguments plots the absolute value of the modulation spectrogram in the current figure, using imagesc(t,f,20*log10(abs(b))), axis xy, colormap(jet). The low modulation frequency content of the lowest frequency is displayed in the lower left corner of the axes.


  1. Load one of the predefined audio signals in matlab and display its modulation spectrogram:
         >> load train;
         >> modspecgram(y,Fs);
    This will bring up a figure with the modulation spectrogram of the train whistle.

  2. Specifying the base-window:
         >> load train;
         >> modspecgram(y,Fs,128);
         >> modspecgram(y,Fs,hann(128));
    The third parameter controls the base-window, either by specifying its length or by specifying a window function. The length of the base-window controls the resolution on the acoustic frequency axis (the vertical axis in the modulation spectrogram). The default base-window size is 256, so specifying a base-window size of 128 reduces the resolution on the acoustic frequency axis, for an increase in resolution on the modulation frequency axis, as shown in the figure below. Specifying a window function gives full control over the length and shape of the base-window.

  3. Specifying window overlap:
         >> load train;
         >> modspecgram(y,Fs,256,128);
    The fourth parameter controls overlap between windows in the base-transform. The default value is 75% of the base-window length, but you may also specify 50% of the base-window length, as is done in this example.

  4. Selecting a Hierarchical Lapped Transform as the second transform:
         >> load train;
         >> modspecgram(y,Fs,'hlt');
    The last parameter on the list specifies which transform to use as a second transform. If this parameter is omitted or set to 'fft', the Fourier transform is selected. Otherwise, by specifying 'hlt', the Hierarchical Lapped Transform is selected. The output of this example is quite different from the previous examples, since the horizontal axis is now no longer modulation frequency, but a number of different time scales.

  5. Getting numerical output from modspecgram:
         >> load train;
         >> b = modspecgram(y,Fs);
    In this example, modspecgram does not create a figure, but instead returns the coefficients of the modulation spectrogram in the matrix b.

  6. Concentrating signal energy in a single coefficient in the modulation spectrogram:
         >> Fs = 8000;
         >> y = sin(2*pi*800*[0:Fs*2]/Fs) .* ...
                      (1 + 0.8 * sin(2*pi*3*[0:Fs*2]/Fs)) / 2;
         >> modspecgram(y,Fs);
    These instructions construct a modulated carrier signal, with a carrier at 800 Hz and a modulator at 3 Hz. Most of the energy of this signal is in a single coefficient in the modulation spectrogram, at acoustic frequency 800 Hz and modulation frequency 3 Hz.

  7. Increasing the resolution of the modulation spectrogram:
         >> load train;
         >> modspecgram(y,Fs,256,128,1024,[],1024);
    The fifth and seventh parameters (basenfft and modnfft) control the resolution of the modulation spectrogram in acoustic frequency and modulation frequency respectively. This example sets both parameters to 1024, and as a result the output figure contains more detail (compare with example 3).
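The signal in example 6 can be examined with a plain fft, without the toolbox. Expanding the product gives a 0.5-amplitude carrier at 800 Hz plus two 0.2-amplitude sidebands at 797 Hz and 803 Hz. The sketch below checks this numerically; it uses exactly 2*Fs samples (one fewer than example 6) so that all three frequencies fall on DFT bins, and the names Y, df, carrier, lower and upper are illustrative.

```matlab
% Spectrum of the modulated carrier from example 6 (illustration only).
Fs = 8000;
n  = 0:2*Fs-1;                                  % exactly two seconds
y  = sin(2*pi*800*n/Fs) .* (1 + 0.8*sin(2*pi*3*n/Fs)) / 2;

Y  = abs(fft(y)) / (length(y)/2);               % single-sided amplitudes
df = Fs / length(y);                            % 0.5 Hz per bin

carrier = Y(800/df + 1);                        % amplitude at 800 Hz
lower   = Y(797/df + 1);                        % amplitude at 797 Hz
upper   = Y(803/df + 1);                        % amplitude at 803 Hz
```

A 256-sample base window at 8 kHz has a bin spacing of 31.25 Hz, far too coarse to separate sidebands 3 Hz from the carrier. The three components therefore land in the same acoustic frequency band, where they show up as a 3 Hz fluctuation of the magnitude, which is what the second transform picks up.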

GUI version

The graphical user interface (GUI) version of the modulation toolbox adds a function called 'modspecgramgui' to the Matlab environment.




modspecgramgui starts a graphical user interface for the analysis, modification and synthesis of signals with respect to their modulation spectrum.

When loaded with a signal, the main window of the GUI looks like this.

The titlebar of the window contains information about the loaded signal: its source (file or workspace), the name of the file or the workspace variable, the duration of the signal and its sampling frequency. The graphical axes show (from top to bottom): the time waveform of the signal, the spectrogram of the signal, and one frame of the modulation spectrogram of the signal. The location of the frame is marked in the time waveform and the spectrogram by a dashed black box. In the figure above, the variable zero was loaded from the Matlab workspace. It has a duration of 0.7 seconds at 10 kHz. The bottom axes show the sixth (and last) modulation spectrogram frame of the input signal.

How to use the GUI

Loading a signal
To load a signal from disk into the modulation spectrogram interface, select File->Open... from the menu. A dialog box will appear in which you can select the wave-file (*.wav) you want to open. After selecting Open, the wave-file will be loaded.

To load a signal from the Matlab workspace into the modulation spectrogram interface, select File->Import... from the menu. A dialog box will appear listing all workspace variables that could represent an audio signal (i.e. all vectors of doubles), as well as all workspace variables that could be a sampling frequency (all doubles). Select the signal/sampling frequency pair that you want to import, or select a signal and specify an alternate sampling frequency. Check the normalize checkbox if you want to normalize the signal prior to loading. When a signal is normalized, its mean is subtracted from the signal and its maximum amplitude is set to 1.
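The normalization described above amounts to two lines of Matlab. The sketch below assumes the mean is removed first and the peak magnitude is scaled afterwards; the GUI's exact order of operations is not documented here, and the vector x is a made-up example.

```matlab
% Assumed normalization: remove the mean, then scale the peak to 1.
x = [0.2 0.5 1.4 -0.3 0.7]';      % hypothetical signal vector
x = x - mean(x);                  % subtract the mean
x = x / max(abs(x));              % largest absolute sample becomes 1
```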

A word of warning
The modulation spectrogram GUI was not designed with memory efficiency in mind. As a consequence, loading large signals into the interface on a computer without a large amount of memory will probably slow it down considerably. We recommend limiting the signals you load to 100,000 samples.

Adjusting the modulation spectrogram parameters
To change the parameters that are used in the modulation spectrogram transform, open the Transform menu. The first three items allow you to select the type of second transform used. Currently, the GUI supports the Fourier transform, a Hierarchical Lapped Transform that uses an odd-DFT basis, and a Hierarchical Lapped Transform that uses a modified DCT basis. Technical details on the second transform will be posted shortly on this webpage.

The last item of the Transform menu allows you to change the transform parameters. Selecting this item shows a dialog box featuring two sets of parameters. The base window parameters control the window size and window overlap of the base transform. Choosing a large base window size will give you good (acoustic) frequency resolution, at the price of less range in modulation frequency.
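The trade-off can be put into numbers with a little arithmetic. The sketch below assumes a sampling frequency of 8 kHz, a 75% window overlap, and a base DFT length equal to the window size; the names fres and modmax are illustrative and none of this is taken from the GUI code.

```matlab
% Base window size versus acoustic resolution and modulation range (sketch).
fs     = 8000;
winlen = [128 256 512];           % candidate base window sizes in samples
hop    = winlen / 4;              % 75% overlap assumed

fres   = fs ./ winlen;            % acoustic bin spacing in Hz
modmax = (fs ./ hop) / 2;         % highest modulation frequency in Hz
```

Doubling the window size halves the acoustic bin spacing (62.5, 31.25, 15.625 Hz) and also halves the highest representable modulation frequency (125, 62.5, 31.25 Hz).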

Download instructions

Version 1.23 of the Modulation Toolbox is deprecated. We encourage you to refer to the most recent version instead.