Sound Synthesis Theory/Sound in the Time Domain
Sound in the Time Domain
The appearance and behaviour of sound waves
Sound is a variation in air pressure, caused by the propagation of energy through a medium as waves. Human hearing systems sense these waves as they cause the ear drum to move; this movement is transduced into other types of energy inside the ear and is finally sent to the brain as electrical impulses for analysis. Since sound waves are variations in air pressure over time, it is typical to represent the waves as a varying voltage or a stream of data over time, in order to capture, analyse, and reproduce the sounds. When visualising the behaviour of sound waves over time, that is, in the time domain, we use the term amplitude to describe the sound level at a point in time. Amplitude is typically represented as a value between -1 and 1, where 1 and -1 represent the maximum positive and negative amplitude of the signal, and 0 represents zero amplitude.
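
As a rough illustration of the -1 to 1 convention, the sketch below rescales a short stream of made-up readings so that the largest one sits at full amplitude. The function name `normalise` and the sample values are assumptions chosen for this example, not part of any particular audio system.

```python
def normalise(samples):
    """Scale samples so the largest absolute value becomes 1.0."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return [0.0 for _ in samples]  # silence: nothing to scale
    return [s / peak for s in samples]


# Made-up readings in arbitrary units (hypothetical data).
raw = [0, 1200, -3000, 1500, -600]
scaled = normalise(raw)
print(scaled)
```

After scaling, the loudest reading maps to 1 or -1 and every other reading keeps its proportion to it.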

The waveform in Fig. 1.1 is called a sine wave or sinusoid. Sine waves can be considered the fundamental building blocks of sound and are very smooth-sounding, basic tones. The figure demonstrates that the amplitude varies over time, but that the pattern of variation repeats periodically. This short, constant period gives the sine wave its particular qualities.

The waveform in Fig. 1.2 is more complicated than the sinusoid in Fig. 1.1. There are peaks and troughs of different amplitudes and, although the pattern does repeat itself over time (see if you can find it), it is harder to spot. In the same way that a sine wave behaves in a simple way and sounds simple, this sound behaves with greater complexity and also sounds more complex. For this reason, detailed, complex sounds that change over time often have no discernible features when viewed this close up: there may be no repeating pattern or behaviour that tells us something about the sound. As you can see from Fig. 1.1 and 1.2, we are looking at a section of the sound over a very short time scale; it may be necessary to lengthen the time scale in order to gain some information about it.

In Fig. 1.3 we are given a look at a sound over the course of about 2 seconds rather than 2 milliseconds. From this perspective, we can see the way the overall sound amplitude changes over time; in particular, the high-amplitude parts are easily recognised as drum hits: they appear suddenly and drop in amplitude very quickly, as one would expect from striking a drum head. It would have been very difficult to tell what kind of instrument was being played had this sound been viewed over a range of only a few milliseconds. From this, we can conclude that short and long time scales reveal different kinds of information, and that selecting the right perspective to suit one's needs is important.
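
One way to get the long-time-scale view from raw samples is to keep only the largest absolute amplitude in each block of samples. The sketch below is a minimal illustration under stated assumptions: the "drum hit" is a synthetic, exponentially decaying 200 Hz sine rather than the recording in Fig. 1.3, and the sample rate, decay constant, and the name `peak_envelope` are all chosen for the example.

```python
import math


def peak_envelope(samples, window):
    """Return the largest absolute amplitude in each consecutive window."""
    return [
        max(abs(s) for s in samples[i:i + window])
        for i in range(0, len(samples), window)
    ]


SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

# Hypothetical stand-in for a drum hit: a 200 Hz sine that decays exponentially.
hit = [
    math.exp(-8.0 * n / SAMPLE_RATE)
    * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 2)  # half a second of sound
]

envelope = peak_envelope(hit, window=400)  # one value per 50 ms
print([round(v, 2) for v in envelope])
```

Each envelope value summarises 50 ms of signal, so the individual cycles disappear and only the sudden onset and rapid decay remain visible.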
Sinusoids, frequency and pitch
As indicated in Fig. 1.1, the sine wave has a periodic form that repeats every $T$ seconds, which is known as the period or cycle length (and, loosely, the wavelength). The wave also has a positive maximum amplitude, $A$, and a negative maximum amplitude, $-A$. The frequency, $f$, of a sine wave is the number of cycles per second and is measured in hertz (Hz). We can obtain the frequency from the period with the following equation:

$$f = \frac{1}{T}$$
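Furthermore, we can express a sine wave with the following mathematical form (with angles in radians). This form may be useful to programmers interested in creating their own controllable sine functions in code:

$$y(t) = A \sin(2\pi f t + \phi)$$

where $A$ is the peak amplitude, $f$ is the frequency in Hz, $t$ is the time in seconds, and $\phi$ is the phase offset in radians.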
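
A controllable sine function of this kind might be sketched as follows, assuming the usual form $A \sin(2\pi f t + \phi)$ evaluated at evenly spaced sample times. The function name `sine_wave`, its parameters, and the default sample rate of 44100 samples per second are assumptions made for this example.

```python
import math


def sine_wave(amplitude, frequency, phase=0.0, duration=1.0, sample_rate=44100):
    """Return samples of amplitude * sin(2*pi*frequency*t + phase).

    t advances in steps of 1/sample_rate seconds; phase is in radians.
    """
    count = round(duration * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * frequency * n / sample_rate + phase)
        for n in range(count)
    ]


# One hundredth of a second of a 440 Hz tone at half amplitude.
tone = sine_wave(amplitude=0.5, frequency=440.0, duration=0.01)
print(len(tone), max(tone), min(tone))
```

Changing `amplitude` scales the height of the wave, `frequency` changes how many cycles fit into each second, and `phase` shifts the starting point within the cycle.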
High frequencies are often associated with words such as 'brightness', whereas low frequencies are often associated with 'depth' or 'bass'. For example, an instrument such as an electric guitar played clean may be called 'bright' or 'sharp', whereas an acoustic double-bass may be referred to as 'dark' and 'warm'. Words like these are not objective quantities we can measure precisely, but are often used in describing the timbre of a particular sound. The frequencies present in a sound make a large contribution to timbre, and there are many different shades of timbre that can be achieved through combinations of the different frequencies that make up a sound. The human hearing system also associates frequency with pitch if a particular frequency is sustained or perceived for a period of time, and we associate particular frequencies with particular notes in the standard Western scale:
| Cycle length T (s) | Frequency f (Hz) | Note name |
|---|---|---|
| 0.0045 | 220.0 | A3 |
| 0.0040 | 246.94 | B3 |
| 0.0038 | 261.63 | C4 |
| 0.0034 | 293.66 | D4 |
| 0.0030 | 329.63 | E4 |
| 0.0028 | 349.23 | F4 |
| 0.0025 | 392.0 | G4 |
| 0.0022 | 440.0 | A4 |
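
The cycle lengths in the table follow from the frequencies, since one cycle lasts one divided by the frequency. The sketch below recomputes them from the frequencies copied from the table; the names `NOTE_FREQUENCIES` and `cycle_length` are assumptions made for this example.

```python
# Frequencies in Hz, copied from the table above.
NOTE_FREQUENCIES = {
    "A3": 220.0, "B3": 246.94, "C4": 261.63, "D4": 293.66,
    "E4": 329.63, "F4": 349.23, "G4": 392.0, "A4": 440.0,
}


def cycle_length(frequency_hz):
    """Return the length of one cycle in seconds for a frequency in Hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


for note, freq in NOTE_FREQUENCIES.items():
    print(f"{note}: {freq:7.2f} Hz -> {cycle_length(freq) * 1000:.3f} ms per cycle")
```

The table gives cycle lengths to only four decimal places, so small differences in the last digit are to be expected. Note also that A4 has exactly half the cycle length of A3: doubling the frequency raises the note by one octave.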
Construction and deconstruction of sinusoids
It has already been mentioned that sine waves can be considered the building blocks of sound. This is possible because a single sine wave represents a single frequency. If we combine a series of different sinusoids, we can theoretically recreate the frequency spectrum of an entire sound, be it real or imagined. In the same way, we can also break a complex sound down into its individual frequency components, allowing us to analyse or even control its minutest characteristics. Both processes are typically simplified in practice, because real-world or "realistic" sounds are incredibly complex and their analysis and modification place heavy demands on the systems performing the task.
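
Combining sinusoids amounts to adding their values together at each point in time. The sketch below is a minimal illustration, not a full synthesis method: the function name `additive`, the sample rate, and the three frequency and amplitude pairs are assumptions chosen for the example, with amplitudes that add up to 1 so the result stays within -1 and 1.

```python
import math


def additive(partials, duration=0.01, sample_rate=44100):
    """Sum sine waves given as (frequency_hz, amplitude) pairs, sample by sample."""
    count = round(duration * sample_rate)
    return [
        sum(
            amp * math.sin(2 * math.pi * freq * n / sample_rate)
            for freq, amp in partials
        )
        for n in range(count)
    ]


# Illustrative components only: a 220 Hz tone plus two quieter, higher ones.
partials = [(220.0, 0.6), (440.0, 0.3), (660.0, 0.1)]
wave = additive(partials)
print(len(wave), round(max(wave), 3), round(min(wave), 3))
```

Adding or removing pairs in `partials` changes the shape of the resulting waveform, which is the idea that additive synthesis builds on.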

Fig. 1.5 demonstrates the appearance of two sine waves summed together. The characteristics of both waves are combined in the resultant waveform, which, due to its increased complexity, develops new features. We can continue this process by adding more and more sine waves, each one representing a single frequency component of our desired sound. This technique is the basis of additive synthesis, which is covered later in the book. Furthermore, because of the way we have constructed this sound, it is possible to filter out the two component frequencies from the whole; this is typically done by analysing the waveform in the frequency domain, which is covered in the subsequent chapter.
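
As a small preview of that idea, the sketch below sums two sine waves and then measures how strongly a few candidate frequencies are present by correlating the mixture with a sine and a cosine at each frequency. This is a measurement rather than a filter, and it is only a sketch under stated assumptions: the frequencies (200 Hz and 300 Hz), amplitudes, sample rate, and the names `sine` and `strength` are chosen for the example and are not taken from Fig. 1.5.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def sine(frequency, amplitude, count):
    """Return `count` samples of a sine wave with zero phase offset."""
    return [
        amplitude * math.sin(2 * math.pi * frequency * n / SAMPLE_RATE)
        for n in range(count)
    ]


def strength(samples, frequency):
    """Estimate the amplitude of one frequency by correlating with cos and sin."""
    re = sum(
        s * math.cos(2 * math.pi * frequency * n / SAMPLE_RATE)
        for n, s in enumerate(samples)
    )
    im = sum(
        s * math.sin(2 * math.pi * frequency * n / SAMPLE_RATE)
        for n, s in enumerate(samples)
    )
    return 2 * math.hypot(re, im) / len(samples)


count = SAMPLE_RATE  # exactly one second, so whole-Hz frequencies fit evenly
mixed = [x + y for x, y in zip(sine(200.0, 0.5, count), sine(300.0, 0.25, count))]

for freq in (100.0, 200.0, 300.0, 400.0):
    print(freq, round(strength(mixed, freq), 3))
```

With these settings the measured strengths should come out at about 0.5 for 200 Hz and 0.25 for 300 Hz, and close to zero for the frequencies that were never added, showing that the two components remain recoverable from the summed waveform.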