November 30

Sound & Music


So far, we have concentrated on images for many reasons. Images, particularly abstract images of the sort that computers can easily create, are generally more familiar than the sonic equivalents. Images also reveal the computation spatially rather than temporally: you can see in an instant everything that your program computes, whereas with sound, you often have to play a sound from beginning to end and listen carefully. Computation (at least with JavaScript) is inherently sequential, whereas making complex sounds often requires doing things in parallel, resulting in more complex programming.

Computing Sound

Nevertheless, computers can make sounds. (For examples, review Looking Outwards 04.) There are several reasons to consider computers as sound generators (and these reasons hold for images, video, and other media as well):

  1. Precision. Computers can make sounds more precisely than any other means. This can be a problem as much as an advantage, but composers, scientists, and even performing musicians are drawn to the possibility of realizing sounds exactly as they are conceived.
  2. Refinement. Related to precision, one can make small, incremental adjustments to sounds in order to refine and improve them over many trials.
  3. Algorithms. Computers can execute complex procedures quickly. Composers can think about sound and music as a generative process, focusing on the rules and procedures without fully understanding what will result from them. Since composers (like all artists) often look for original sounds and forms, this approach can help free the creator from old habits and tendencies to create what is familiar.
  4. Interaction. New works that engage the audience or performers in interaction are possible with real-time computation. One could argue that all live music is interactive, but with computers, the composer/creator/designer has much more direct control over the nature of interaction, and very complex interactions are possible. For example, a composer can have a computer produce a musical response to input from a live performer. Of course, this happens in jazz all the time, but in a jazz performance, it is the performer, not the composer who creates the musical response, and the response will be characteristic of that performer rather than something the composer can design.

Types of Interaction

In this class, we will focus on interaction. Other aspects of sound and music are covered in courses such as Introduction to Computer Music (15-322) (emphasizing synthesis techniques and algorithmic composition) and Experimental Sound Synthesis (57-344) (emphasizing the use of computing in real-time music performance).

Let’s consider interaction as a linkage between things. On the one hand, we have images, animation, and user input such as mouse coordinates and key clicks (and more generally, any number of instrument-like controllers, motion sensors, environmental sensors, etc.). On the other hand, we have sounds. How can these be linked?

The Direction of Control Flow

First, we can think about the direction of control flow. Does information flow to or from sounds? There are a number of possibilities:

  • Image to Sound. We can use information created to build images and translate that into sound. Example: make a sound when two visual objects collide.
  • Sound to Image. Reversing the direction, parameters of sounds (pitch, loudness, type, etc.) can be translated into images. Example: Make a new particle each time a musical note is generated.
  • Controllers to Sound. Sounds can be controlled directly from control input. Example: a classic program often used to introduce interactive concepts uses mouseX to control amplitude (loudness) and mouseY to control frequency (pitch). This is often referred to as a “Theremin” which is really an electronic instrument that uses two proximity detectors to control amplitude and frequency.
  • Sound as Control. Sound from a microphone can be analyzed and used as control input in the same way we can use mouseX, mouseY, or other input devices. It is simple to analyze the overall amplitude (effectively how loud is the sound) and with a little more work, we can extract pitch and other features. The results are just numbers which can be mapped to control a visual scene.

Can you think of other forms of interaction?

Events and Continuous Control

In the computer music world, we often distinguish two kinds of control: events and continuous control.

Event-based control means that the control information takes place at a single instant in time. You can think of an event as a procedure call with parameters. For example, suppose we can make a musical note by calling note(pitch, loudness, duration). We might create the note when the user clicks with the mouse, and we might use mouseX and mouseY to determine some of the parameters. The note can be called a “sound event.” Another common form of “sound event” is simply the playback of a sound file containing a recording, which could be a short sound effect or minutes of background music. In all these cases, the control is passed in an instant, and typically a new sound is created.
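As a sketch of this idea, here is how a mouse click might be turned into parameters for the hypothetical note(pitch, loudness, duration) call from the text. The helper name and the particular mappings are our own illustrative choices:

```javascript
// Turn a click position into note parameters.
// Higher on the screen (smaller y) means higher pitch;
// further right (larger x) means louder.
function noteParamsFromClick(x, y, width, height) {
  var pitch = 40 + 40 * (1 - y / height); // a MIDI-like pitch, 40..80
  var loudness = x / width;               // 0 (silent) .. 1 (full)
  return { pitch: pitch, loudness: loudness, duration: 0.5 };
}

// In p5.js this would run inside mousePressed(), and the result
// would be passed to note(pitch, loudness, duration).
var p = noteParamsFromClick(200, 100, 400, 400);
// p.pitch is 70, p.loudness is 0.5
```

Note that all of the control information is delivered in the instant of the click; after that, the note is on its own.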

Continuous control means that the control information extends over a period of time and represents continuous change in some parameter(s) of sound. For example, change the pitch and amplitude of a sound at every animation frame using mouseX and mouseY. Continuous control is often claimed to be more interactive, more expressive and more responsive than event-based control.

This discussion is not limited to computers: is the piano (essentially event-based, because once the hammer is launched by pressing a key, the pianist has almost no further control over the sound except to dampen the string to stop the note) less expressive than the violin (essentially continuous control, because the violinist continuously adjusts the pitch and amplitude by moving fingers on the fingerboard while changing the bow velocity and orientation)? A more interesting question for us is: how is piano music (event-based music) different from violin music (continuous control music)? The nature of control has a direct impact on how we use it.

Basic Sound in p5.js

For all these techniques, you need to use a version of p5.js that includes support for the p5.sound library. Please see the p5.sound reference page for details on all of the functions introduced below.

Making a Continuous Tone

Here is how to create a continuous sinusoid (a pretty boring sound by itself, so changing parameters and perhaps creating many sinusoids is key to getting something interesting). A “sinusoid” means that the vibration actually has the shape of a sinusoid — the output of the sine function you’ve seen before. These are also called “sine tones”, “harmonics”, or “partials” depending on the context.

myOsc = new p5.Oscillator();

Now you have an object (myOsc) that can make sound, but you need to set some parameters and tell it to start.

Frequency should be (roughly) between 200 and 2000 Hz. You can hear down to 20 Hz, but your laptop speakers will not put out much sound below 200 Hz. You can hear up to 20,000 Hz, but 2000 Hz is already very high in terms of musical pitch.
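Putting those steps together, a minimal sketch might look like this. It assumes the p5.js and p5.sound libraries are loaded in the page (so it will not run standalone), and the 440 Hz and 0.5 values are just illustrative choices:

```javascript
var myOsc;

function setup() {
  createCanvas(400, 400);
  myOsc = new p5.Oscillator(); // defaults to a sine wave
  myOsc.freq(440);  // set the frequency in Hz (here, concert A)
  myOsc.amp(0.5);   // set the amplitude, between 0 and 1
  myOsc.start();    // tell it to start making sound
}
```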


To change the amplitude (how loud) or frequency (pitch), you can update the oscillator from within draw() or anywhere:

myOsc.amp(x); // where 0 <= x <= 1
myOsc.freq(f); // frequency in Hz
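This is all it takes to build the "Theremin" mapping described earlier: mouseX controls amplitude and mouseY controls frequency, updated continuously in draw(). This sketch assumes the p5.js and p5.sound libraries are loaded in the page:

```javascript
var myOsc;

function setup() {
  createCanvas(400, 400);
  myOsc = new p5.Oscillator(); // defaults to a sine wave
  myOsc.amp(0);  // start silent
  myOsc.start();
}

function draw() {
  background(220);
  // left edge = silent, right edge = full amplitude
  myOsc.amp(map(mouseX, 0, width, 0, 1));
  // bottom = 200 Hz, top = 2000 Hz (the comfortable range noted above)
  myOsc.freq(map(mouseY, height, 0, 200, 2000));
}
```

Because the parameters are updated every frame, this is continuous control, not an event.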

Making “Pings”

Rather than controlling amplitude with some visual parameter (e.g. particle velocity), you can have the amplitude change with time. A particularly simple and effective strategy is to make a “ping” where the sine tone dies away exponentially. This is almost exactly what happens with any physical object that you strike — the resulting vibrations decay by a factor over each equal unit of time. (Real objects vibrate at many frequencies and each frequency has a different decay rate, but we’ll stick to a single sine tone for now.)

To create a decaying “ping” sound, you need a global variable to remember the amplitude. Initialize the variable to 1 when you want to start the “ping” — this is therefore an event:

var amp = 0; // initially zero, no sound

// somewhere, start the sound like this:
amp = 1;

Then, in draw(), cause the amplitude to decay by a constant factor each frame:

amp = amp * 0.95; // compute new amplitude
myOsc.amp(amp); // update the amplitude

You can use a smaller factor (closer to 0) for faster decay, or a factor closer to 1 for slower decay.
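To get a feel for how fast the decay is, here is the same computation as plain arithmetic (ampAfterFrames is our own helper name, not a p5 function): multiplying by the factor once per frame means the amplitude after n frames is factor^n times the starting amplitude.

```javascript
// Amplitude after n frames of multiplying by `factor` each frame.
function ampAfterFrames(start, factor, n) {
  var amp = start;
  for (var i = 0; i < n; i++) {
    amp = amp * factor;
  }
  return amp;
}

// At 60 frames per second with factor 0.95, a ping starting at
// amplitude 1 fades to about 0.046 after one second:
var after1s = ampAfterFrames(1, 0.95, 60);
```

So 0.95 gives roughly a one-second ping, while 0.99 would ring on for several seconds.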

Playing Sound Files

You can play sounds from sound files. Each sound should be preloaded into a variable from a file. You will need a local server to load files (just like images):

        function preload() {
            mySnd = loadSound("samplefile.wav");
        }

        // to play the sound from within draw() or anywhere:
        mySnd.play();

Have fun!