Sound effects (or audio effects) are artificially created or enhanced sounds, or sound processes used to emphasize artistic or other content of films, television shows, live performance, animation, video games, music, or other media. In motion picture and television production, a sound effect is a sound recorded and presented to make a specific storytelling or creative point without the use of dialogue or music. The term often refers to a process applied to a recording, without necessarily referring to the recording itself. In professional motion picture and television production, dialogue, music, and sound effects recordings are treated as separate elements. Dialogue and music recordings are never referred to as sound effects, even though the processes applied to them, such as reverberation or flanging, are often called "sound effects".
The term sound effect dates back to the early days of radio. In its Year Book 1931 the BBC published a major article about "The Use of Sound Effects". It considers sound effects deeply linked with broadcasting and states: "It would be a great mistake to think of them as analogous to punctuation marks and accents in print. They should never be inserted into a programme already existing. The author of a broadcast play or broadcast construction ought to have used Sound Effects as bricks with which to build, treating them as of equal value with speech and music." It lists six "totally different primary genres of Sound Effect":
- Realistic, confirmatory effect
- Realistic, evocative effect
- Symbolic, evocative effect
- Conventionalised effect
- Impressionistic effect
- Music as an effect
According to the author, "It is axiomatic that every Sound Effect, to whatever category it belongs, must register in the listener's mind instantaneously. If it fails to do so its presence could not be justified."
In the context of motion pictures and television, sound effects refers to an entire hierarchy of sound elements, whose production encompasses many different disciplines, including:
- Hard sound effects are common sounds that appear on screen, such as door alarms, weapons firing, and cars driving by.
- Background (or BG) sound effects are sounds that do not explicitly synchronize with the picture, but indicate setting to the audience, such as forest sounds, the buzzing of fluorescent lights, and car interiors. The sound of people talking in the background is also considered a "BG," but only if the speaker is unintelligible and the language is unrecognizable (this is known as walla). These background noises are also called ambience or atmos ("atmosphere").
- Foley sound effects are sounds that synchronize on screen, and require the expertise of a foley artist to record properly. Footsteps, the movement of hand props (e.g., a tea cup and saucer), and the rustling of cloth are common foley units.
- Design sound effects are sounds that do not normally occur in nature, or are impossible to record in nature. These sounds are used to suggest futuristic technology in a science fiction film, or are used in a musical fashion to create an emotional mood.
Each of these sound effect categories is specialized, with sound editors known as specialists in an area of sound effects (e.g. a "Car cutter" or "Guns cutter").
Foley is another method of adding sound effects. Foley is more of a technique for creating sound effects than a type of sound effect, but it is often used for creating the incidental real world sounds that are very specific to what is going on onscreen, such as footsteps. With this technique the action onscreen is essentially recreated to try to match it as closely as possible. If done correctly it is very hard for audiences to tell what sounds were added and what sounds were originally recorded (location sound).
In the early days of film and radio, foley artists would add sounds in real time, or pre-recorded sound effects would be played back from analogue discs in real time (while watching the picture). Today, with effects held in digital format, it is easy to create any required sequence to be played in any desired timeline.
In the days of silent film, sound effects were added by the operator of a theater organ or photoplayer, both of which also supplied the soundtrack of the film. Theater organ sound effects are usually electric or electro-pneumatic, and activated by a button pressed with the hand or foot. Photoplayer operators activate sound effects either by flipping switches on the machine or pulling "cow-tail" pull-strings, which hang above. Sounds like bells and drums are made mechanically, sirens and horns electronically. Due to its smaller size, a photoplayer usually has fewer special effects than a theater organ, or less complex ones.
The principles involved with modern video game sound effects (since the introduction of sample playback) are essentially the same as those of motion pictures. Typically a game project requires two jobs to be completed: sounds must be recorded or selected from a library and a sound engine must be programmed so that those sounds can be incorporated into the game's interactive environment.
In earlier computers and video game systems, sound effects were typically produced using sound synthesis. In modern systems, increases in storage capacity and playback quality have allowed sampled sound to be used. Modern systems also frequently use positional audio, often with hardware acceleration, and real-time audio post-processing, which can also be tied to the 3D graphics development. Based on the internal state of the game, multiple different calculations can be made, allowing for effects such as realistic sound dampening, echoes and the Doppler effect.
Historically the simplicity of game environments reduced the number of sounds needed, and thus only one or two people were directly responsible for the sound recording and design. As the video game business has grown and computer sound reproduction quality has increased, however, the team of sound designers dedicated to game projects has likewise grown and the demands placed on them may now approach those of mid-budget motion pictures.
Some pieces of music use sound effects that are made by a musical instrument or by other means. An early example is the 18th-century Toy Symphony. Richard Wagner, in the opera Das Rheingold (1869), lets a choir of anvils introduce the scene of the dwarfs who have to work in the mines, similar to the introduction of the dwarfs in the 1937 Disney movie Snow White. Klaus Doldinger's soundtrack for the 1981 movie Das Boot includes a title score with a sonar sound to reflect the U-boat setting. John Barry integrated into the title song of Moonraker (1979) a sound representing the beep of a Sputnik-like satellite.
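As a rough illustration of the kind of calculation a game sound engine might derive from game state, the sketch below computes a Doppler-shifted frequency and a distance-based gain. The function names, the inverse-distance model and the 343 m/s speed of sound are assumptions chosen for illustration, not the method of any particular engine.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed constant)


def doppler_frequency(source_hz, source_speed_toward_listener,
                      listener_speed_toward_source=0.0, c=SPEED_OF_SOUND):
    """Perceived frequency for motion along the source-listener line.

    Positive speeds mean the two are closing in on each other;
    negative speeds mean they are moving apart.
    """
    if abs(source_speed_toward_listener) >= c:
        raise ValueError("source speed must be below the speed of sound")
    return source_hz * (c + listener_speed_toward_source) / (c - source_speed_toward_listener)


def distance_gain(distance_m, reference_m=1.0):
    """Inverse-distance gain: amplitude halves for each doubling of distance.

    Distances closer than the reference are clamped so the gain never exceeds 1.
    """
    return reference_m / max(distance_m, reference_m)


if __name__ == "__main__":
    # A 440 Hz siren approaching, then receding, at 10% of the speed of sound.
    print(doppler_frequency(440.0, 34.3))
    print(doppler_frequency(440.0, -34.3))
    print(distance_gain(8.0))
```

A real engine would typically recompute these values every audio frame from the positions and velocities of the listener and each sound source.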
The most realistic sound effects may originate from original sources; the closest sound to machine-gun fire that we can replay should be an original recording of actual machine guns.
However, real life and actual practice do not always coincide with theory. Often recordings of real life do not sound realistic on playback. That is why we have Foley and f/x. The realistic sound of bacon frying is the crumpling of cellophane. Rain may be recorded as salt falling on a piece of tinfoil.
Less realistic sound effects are digitally synthesized or sampled and sequenced (the same recording played repeatedly using a sequencer). When the producer or content creator demands high-fidelity sound effects, the sound editor usually must augment his available library with new sound effects recorded in the field.
When the required sound effect is of a small subject, such as scissors cutting, cloth ripping, or footsteps, the sound effect is best recorded in a studio, under controlled conditions. Such small sounds are often delegated to a foley artist and foley editor. Many sound effects cannot be recorded in a studio, such as explosions, gunfire, and automobile or aircraft maneuvers. These effects must be recorded by a sound effects editor or a professional sound effects recordist.
When such "big" sounds are required, the recordist will begin contacting professionals or technicians in the same way a producer may arrange a crew; if the recordist needs an explosion, he may contact a demolition company to see if any buildings are scheduled to be destroyed with explosives in the near future. If the recordist requires a volley of cannon fire, he may contact historical re-enactors or gun enthusiasts.
Depending on the effect, recordists may use several DAT, hard disk, or Nagra recorders and a large number of microphones. During a cannon- and musket-fire recording session for the 2003 film The Alamo, conducted by Jon Johnson and Charles Maynes, two to three DAT machines were used. One machine was stationed near the cannon itself, so it could record the actual firing. Another was stationed several hundred yards away, below the trajectory of the ball, to record the sound of the cannonball passing by. When the crew recorded musket-fire, a set of microphones was arrayed close to the target (in this case a swine carcass) to record the musket-ball impacts.
A counter-example is the common technique for recording an automobile. For recording "Onboard" car sounds (which include the car interiors), a three-microphone technique is common. Two microphones record the engine directly: one is taped to the underside of the hood, near the engine block. The second microphone is covered in a wind screen and tightly attached to the rear bumper, within an inch or so of the tail pipe. The third microphone, which is often a stereo microphone, is stationed inside the car to get the car interior.
Having all of these tracks at once gives a sound designer or audio engineer a great deal of control over how he wants the car to sound. In order to make the car more ominous or low, he can mix in more of the tailpipe recording; if he wants the car to sound like it is running full throttle, he can mix in more of the engine recording and reduce the interior perspective.
In cartoons, a pencil being dragged down a washboard may be used to simulate the sound of a sputtering engine.
What we would consider today to be the first recorded sound effect was of Big Ben striking 10:30, 10:45, and 11:00. It was recorded on a brown wax cylinder by technicians at Edison House in London on July 16, 1890. This recording is currently in the public domain.
As the car example demonstrates, the ability to make multiple simultaneous recordings of the same subject—through the use of several DAT or multitrack recorders—has made sound recording into a sophisticated craft. The sound effect can be shaped by the sound editor or sound designer, not just for realism, but for emotional effect.
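The mixing decision described for the car recording can be pictured as a weighted sum of the separate tracks. The following is a minimal sketch, assuming each track is a list of float samples of equal length; `mix_tracks` and the gain values are hypothetical and only illustrate the idea.

```python
def mix_tracks(tracks, gains):
    """Sum equal-length tracks sample by sample, scaling each by its gain."""
    if len(tracks) != len(gains):
        raise ValueError("need exactly one gain per track")
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("all tracks must have the same length")
    return [
        sum(gain * track[i] for track, gain in zip(tracks, gains))
        for i in range(length)
    ]


if __name__ == "__main__":
    # Tiny stand-ins for the three microphone recordings.
    engine = [0.2, 0.4, -0.1]
    tailpipe = [0.5, -0.3, 0.2]
    interior = [0.1, 0.1, 0.1]

    # "Ominous" mix: favor the tailpipe. "Full throttle": favor the engine.
    ominous = mix_tracks([engine, tailpipe, interior], [0.3, 1.0, 0.4])
    full_throttle = mix_tracks([engine, tailpipe, interior], [1.0, 0.4, 0.1])
    print(ominous)
    print(full_throttle)
```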
Once the sound effects are recorded or captured, they are usually loaded into a computer integrated with an audio non-linear editing system. This allows a sound editor or sound designer to heavily manipulate a sound to meet his or her needs.
The most common sound design tool is the use of layering to create a new, interesting sound out of two or three old, average sounds. For example, the sound of a bullet impact into a pig carcass may be mixed with the sound of a melon being gouged to add to the "stickiness" or "gore" of the effect. If the effect is featured in a close-up, the designer may also add an "impact sweetener" from his or her library. The sweetener may simply be the sound of a hammer pounding hardwood, equalized so that only the low-end can be heard. The low end gives the three sounds together added weight, so that the audience actually "feels" the weight of the bullet hit the victim.
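The layering idea above, including a sweetener equalized so that only the low end remains, can be sketched in a few lines. This is an assumption-laden illustration: a simple one-pole low-pass filter stands in for the equalizer, sounds are plain lists of float samples, and the names `one_pole_lowpass` and `layer` are hypothetical.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Smooth a signal with a first-order low-pass filter (6 dB per octave)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)  # smoothing factor between 0 and 1
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)  # move a fraction of the way toward the input
        out.append(y)
    return out


def layer(*sounds):
    """Sum several sounds sample by sample; shorter sounds simply end early."""
    length = max(len(sound) for sound in sounds)
    return [
        sum(sound[i] for sound in sounds if i < len(sound))
        for i in range(length)
    ]


if __name__ == "__main__":
    sample_rate = 8000
    impact = [0.8, -0.6, 0.4, -0.2, 0.1]      # stand-in for the bullet impact
    melon = [0.3, 0.3, -0.2, 0.1]             # stand-in for the melon gouge
    hammer = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # stand-in for the hammer hit

    sweetener = one_pole_lowpass(hammer, cutoff_hz=100.0, sample_rate=sample_rate)
    print(layer(impact, melon, sweetener))
```

In practice a designer would use a steeper filter and adjust the level of each layer by ear; the structure of the operation is the same.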
If the victim is the villain, and his death is climactic, the sound designer may add reverb to the impact, in order to enhance the dramatic beat. And then, as the victim falls over in slow motion, the sound editor may add the sound of a broom whooshing by a microphone, pitch-shifted down and time-expanded to further emphasize the death. If the film is science-fiction, the designer may apply a phaser to the "whoosh" to give it a more sci-fi feel. (For a list of many sound effects processes available to a sound designer, see the bottom of this article.)
When creating sound effects for films, sound recordists and editors do not generally concern themselves with the verisimilitude or accuracy of the sounds they present. The sound of a bullet entering a person from a close distance may sound nothing like the sound designed in the above example, but since very few people are aware of how such a thing actually sounds, the job of designing the effect is mainly an issue of creating a conjectural sound which feeds the audience's expectations while still suspending disbelief.
In the previous example, the phased 'whoosh' of the victim's fall has no analogue in real life experience, but it is emotionally immediate. If a sound editor uses such sounds in the context of emotional climax or a character's subjective experience, they can add to the drama of a situation in a way visuals simply cannot. If a visual effects artist were to do something similar to the 'whooshing fall' example, it would probably look ridiculous or at least excessively melodramatic.
The "Conjectural Sound" principle applies even to happenstance sounds, such as tires squealing, doorknobs turning or people walking. If the sound editor wants to communicate that a driver is in a hurry to leave, he will cut the sound of tires squealing when the car accelerates from a stop; even if the car is on a dirt road, the effect will work if the audience is dramatically engaged. If a character is afraid of someone on the other side of a door, the turning of the doorknob can take a second or more, and the mechanism of the knob can possess dozens of clicking parts. A skillful Foley artist can make someone walking calmly across the screen seem terrified simply by giving the actor a different gait.
In music and film/television production, typical effects used in recording and amplified performances are:
- echo - to simulate the effect of reverberation in a large hall or cavern, one or several delayed signals are added to the original signal. To be perceived as echo, the delay has to be on the order of 35 milliseconds or above. Short of actually playing a sound in the desired environment, the effect of echo can be implemented using either analog or digital methods. Analog echo effects are implemented using tape delays and/or spring reverbs. When large numbers of delayed signals are mixed over several seconds, the resulting sound has the effect of being presented in a large room, and it is more commonly called reverberation or reverb for short.
- flanger - to create an unusual sound, a delayed signal is added to the original signal with a continuously variable delay (usually smaller than 10 ms). This effect is now done electronically using DSP, but originally the effect was created by playing the same recording on two synchronized tape players, and then mixing the signals together. As long as the machines were synchronized, the mix would sound more-or-less normal, but if the operator placed his finger on the flange of one of the players (hence "flanger"), that machine would slow down and its signal would fall out-of-phase with its partner, producing a phasing effect. Once the operator took his finger off, the player would speed up until its tachometer was back in phase with the master, and as this happened, the phasing effect would appear to slide up the frequency spectrum. This phasing up-and-down the register can be performed rhythmically.
- phaser - another way of creating an unusual sound; the signal is split, a portion is filtered with an all-pass filter to produce a phase-shift, and then the unfiltered and filtered signals are mixed. The phaser effect was originally a simpler implementation of the flanger effect since delays were difficult to implement with analog equipment. Phasers are often used to give a "synthesized" or electronic effect to natural sounds, such as human speech. The voice of C-3PO from Star Wars was created by taking the actor's voice and treating it with a phaser.
- chorus - a delayed signal is added to the original signal with a constant delay. The delay has to be short in order not to be perceived as echo, but above 5 ms to be audible. If the delay is too short, it will destructively interfere with the un-delayed signal and create a flanging effect. Often, the delayed signals will be slightly pitch shifted to more realistically convey the effect of multiple voices.
- equalization - different frequency bands are attenuated or boosted to produce desired spectral characteristics. Moderate use of equalization (often abbreviated as "EQ") can "fine-tune" the tone quality of a recording; extreme use of equalization, such as heavily cutting a certain frequency, can create more unusual effects.
- filtering - Equalization is a form of filtering. In the general sense, frequency ranges can be emphasized or attenuated using low-pass, high-pass, band-pass or band-stop filters. Band-pass filtering of voice can simulate the effect of a telephone because telephones use band-pass filters.
- overdrive effects such as the use of a fuzz box can be used to produce distorted sounds, such as for imitating robotic voices or to simulate distorted radiotelephone traffic (e.g., the radio chatter between starfighter pilots in the science fiction film Star Wars). The most basic overdrive effect involves clipping the signal when its absolute value exceeds a certain threshold.
- pitch shift - similar to pitch correction, this effect shifts a signal up or down in pitch. For example, a signal may be shifted an octave up or down. This is usually applied to the entire signal, and not to each note separately. One application of pitch shifting is pitch correction. Here a musical signal is tuned to the correct pitch using digital signal processing techniques. This effect is ubiquitous in karaoke machines and is often used to assist pop singers who sing out of tune. It is also used intentionally for aesthetic effect in such pop songs as Cher's Believe and Madonna's Die Another Day.
- time stretching - the opposite of pitch shift, that is, the process of changing the speed of an audio signal without affecting its pitch.
- resonators - emphasize harmonic frequency content on specified frequencies.
- robotic voice effects are used to make an actor's voice sound like a synthesized human voice.
- synthesizer - generate artificially almost any sound by either imitating natural sounds or creating completely new sounds.
- modulation - to change the frequency or amplitude of a carrier signal in relation to a predefined signal. Ring modulation, a form of amplitude modulation in which the signal is multiplied by a carrier, is an effect made famous by Doctor Who's Daleks and commonly used throughout sci-fi.
- compression - the reduction of the dynamic range of a sound to avoid unintentional fluctuation in the dynamics. Level compression is not to be confused with audio data compression, where the amount of data is reduced without affecting the amplitude of the sound it represents.
- 3D audio effects - place sounds outside the stereo basis
- reverse echo - a swelling effect created by reversing an audio signal and recording echo and/or delay whilst the signal runs in reverse. When played back forward, the last echoes are heard before the effected sound, creating a rush-like swell preceding and during playback.
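Several of the effects listed above reduce to short operations on a stream of samples. The sketch below shows deliberately simplified versions of echo, clipping-based overdrive, ring modulation and level compression, assuming a sound is a list of float samples. All function names and parameters are hypothetical. The compressor in particular acts on each sample independently, whereas practical compressors usually track a smoothed level with attack and release times.

```python
import math


def echo(samples, delay_samples, decay=0.5, repeats=3):
    """Add `repeats` delayed copies, each quieter than the last by `decay`."""
    out = list(samples) + [0.0] * (delay_samples * repeats)
    for k in range(1, repeats + 1):
        gain = decay ** k
        offset = delay_samples * k
        for i, x in enumerate(samples):
            out[offset + i] += gain * x
    return out


def hard_clip(samples, threshold):
    """Most basic overdrive: clip when the absolute value exceeds the threshold."""
    return [max(-threshold, min(threshold, x)) for x in samples]


def ring_modulate(samples, carrier_hz, sample_rate):
    """Multiply the signal by a sine-wave carrier."""
    return [
        x * math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        for n, x in enumerate(samples)
    ]


def compress(samples, threshold, ratio):
    """Reduce the level above `threshold` by `ratio` (per-sample simplification)."""
    out = []
    for x in samples:
        level = abs(x)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, x))
    return out


if __name__ == "__main__":
    sample_rate = 8000
    # 35 ms delay, the approximate point at which a repeat is heard as echo.
    delay = int(0.035 * sample_rate)
    tone = [math.sin(2.0 * math.pi * 220.0 * n / sample_rate) for n in range(400)]

    print(len(echo(tone, delay)))
    print(max(hard_clip(tone, 0.3)))
    print(len(ring_modulate(tone, 30.0, sample_rate)))
    print(max(compress(tone, 0.5, 4.0)))
```

Delay-based effects such as flanger and chorus follow the same pattern as `echo`, with a much shorter delay that is varied over time or held constant.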