Pitch Shifting
Pitch shifting is the process of changing the perceived pitch of an audio signal without necessarily affecting its duration. It is a fundamental tool in music production, sound design, and audio post-production. Pitch shifting can be applied creatively for harmonization, transposition, or special effects, and technically for correcting tuning or aligning samples.
Learning Objectives
By the end of this topic, you should be able to:
- Understand the principles of pitch perception and how it relates to audio frequency.
- Differentiate between naive pitch shifting (resampling), which also changes duration, and independent pitch shifting, which preserves duration.
- Apply pitch shifting techniques to audio while controlling artifacts.
- Recognize common algorithms: phase vocoder, granular synthesis, and frequency-domain processing.
- Explore creative and technical applications of pitch shifting in music, film, and sound design.
Naive Pitch Shifting (Resampling):
Naive pitch shifting is performed by resampling the signal:
$$y(t) = x(\alpha t)$$
Where:
- $x(t)$ = original audio signal
- $y(t)$ = resampled audio signal
- $\alpha$ = time-scaling factor (e.g., 2.0 speeds up, 0.5 slows down)
Effects:
- Duration changes: $T' = T / \alpha$
- Pitch changes: frequencies are scaled by the same factor, $f' = \alpha f$
- Simple, but unsuitable when pitch and duration must be controlled independently.
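The resampling relationship above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production resampler: the function name `resample_naive` is hypothetical, it uses linear interpolation, and it applies no anti-aliasing filter, so speeding up a signal with high-frequency content may alias.

```python
import numpy as np

def resample_naive(x, alpha):
    """Read the input at positions alpha * n (hypothetical helper).

    alpha > 1 speeds up and raises pitch; alpha < 1 slows down and lowers it.
    Assumption: linear interpolation, no anti-aliasing filter.
    """
    x = np.asarray(x, dtype=float)
    n_out = int(round(len(x) / alpha))       # duration scales by 1 / alpha
    read_pos = alpha * np.arange(n_out)      # fractional read positions in x
    return np.interp(read_pos, np.arange(len(x)), x)

# Usage: a 440 Hz tone resampled with alpha = 2.0
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
faster = resample_naive(tone, 2.0)
```

With `alpha = 2.0` the output has half as many samples as the input, and its energy sits near 880 Hz: both duration and pitch change together.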
Independent Pitch Shifting:
To shift pitch without changing duration, we either modify the signal's frequency content with the short-time Fourier transform (STFT) or rearrange short segments in the time domain with granular synthesis.
Frequency-Domain Method (Phase Vocoder)
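- Compute the STFT of the signal:
$$X(m, k) = \sum_{n=0}^{N-1} x(n + mH)\, w(n)\, e^{-j 2\pi k n / N}$$
Where:
- $N$ = FFT size
- $H$ = hop size
- $w(n)$ = analysis window
- $k$ = frequency bin
- $m$ = frame index
- Scale frequencies by factor $\alpha$, rounding or interpolating non-integer bin indices:
$$Y(m, k) = X(m, k / \alpha)$$
- Adjust the phase of each bin to maintain coherence between frames, then apply the inverse STFT to reconstruct the signal.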
Effect:
- Duration preserved
- Pitch scaled independently
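The three steps above (STFT, frequency scaling, phase adjustment and inverse STFT) can be sketched as follows. This is a simplified bin-remapping approach under stated assumptions, not a definitive phase vocoder: the function name `pitch_shift_stft` and the defaults `n_fft=1024`, `hop=256` are illustrative choices, source bins are picked by rounding rather than interpolation, and no transient handling or phase locking is attempted.

```python
import numpy as np

def pitch_shift_stft(x, alpha, n_fft=1024, hop=256):
    """Shift pitch by factor alpha while keeping the length (hypothetical sketch)."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(n_fft + 1)[:-1]               # periodic Hann window
    n_bins = n_fft // 2 + 1
    k = np.arange(n_bins)
    expected = 2.0 * np.pi * hop * k / n_fft          # phase advance per hop at bin centre

    # Output bin k takes its content from input bin k / alpha (rounded).
    src = np.round(k / alpha).astype(int)
    valid = src < n_bins

    padded = np.concatenate([x, np.zeros(n_fft)])
    out = np.zeros(len(padded))
    last_phase = np.zeros(n_bins)
    sum_phase = np.zeros(n_bins)

    for start in range(0, len(padded) - n_fft + 1, hop):
        spec = np.fft.rfft(window * padded[start:start + n_fft])
        mag, phase = np.abs(spec), np.angle(spec)

        # Analysis: estimate each bin's true frequency (in bin units)
        # from how far its phase advance deviates from the expected one.
        delta = phase - last_phase - expected
        last_phase = phase
        delta = (delta + np.pi) % (2.0 * np.pi) - np.pi   # wrap to [-pi, pi)
        true_bin = k + delta * n_fft / (2.0 * np.pi * hop)

        # Frequency scaling: move magnitudes and scale their frequencies.
        new_mag = np.zeros(n_bins)
        new_bin = np.zeros(n_bins)
        new_mag[valid] = mag[src[valid]]
        new_bin[valid] = true_bin[src[valid]] * alpha

        # Synthesis: accumulate phase so consecutive frames stay coherent.
        sum_phase += 2.0 * np.pi * hop * new_bin / n_fft
        frame = np.fft.irfft(new_mag * np.exp(1j * sum_phase), n=n_fft)
        out[start:start + n_fft] += window * frame

    out *= hop / np.sum(window ** 2)                  # overlap-add gain correction
    return out[:len(x)]
```

Calling `pitch_shift_stft(tone, 1.5)` on a one-second 440 Hz tone should return a signal of the same length whose energy is concentrated near 660 Hz. Rounding the source bin keeps the sketch short; interpolating between bins is a common refinement.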
Granular Synthesis Method
- Split audio into small overlapping grains (10–50 ms).
- Adjust playback rate of each grain to achieve desired pitch.
- Overlap and crossfade grains to preserve continuous duration.
$$y(t) = \sum_i g(t - iT_g)\, x\big(iT_g + \alpha (t - iT_g)\big)$$
Where:
- $g(t)$ = grain window function
- $T_g$ = grain spacing
- $\alpha$ = pitch shift factor
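A minimal granular sketch of the steps above, assuming Hann-windowed grains spaced at half the grain length (so the windows sum to one) and linear interpolation for the per-grain playback rate. The function name `granular_pitch_shift` and the 40 ms default grain length are illustrative assumptions.

```python
import numpy as np

def granular_pitch_shift(x, alpha, sr, grain_ms=40.0):
    """Shift pitch by factor alpha with overlapping grains (hypothetical sketch)."""
    x = np.asarray(x, dtype=float)
    grain_len = int(sr * grain_ms / 1000.0)
    grain_len += grain_len % 2                    # keep the length even
    spacing = grain_len // 2                      # 50% overlap between grains
    window = np.hanning(grain_len + 1)[:-1]       # periodic Hann: overlaps sum to 1
    local = np.arange(grain_len)
    src_index = np.arange(len(x))
    out = np.zeros(len(x) + grain_len)

    for start in range(0, len(x), spacing):
        # Each grain starts at the same place in input and output,
        # but reads the input at rate alpha, which changes its pitch.
        read_pos = start + alpha * local
        grain = np.interp(read_pos, src_index, x, left=0.0, right=0.0)
        out[start:start + grain_len] += window * grain

    return out[:len(x)]
```

Because grains are placed at their original positions, the output keeps the input's length. Grain boundaries are not phase-aligned in this sketch, so a steady tone may show sidebands spaced at the grain rate; choosing grain positions by waveform similarity is one common way to reduce that.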
Semitone Conversion
Pitch shift is often specified in semitones, related to frequency scaling:
$$f' = f \cdot 2^{s/12}$$
Where:
- $s$ = number of semitones (positive or negative)
- $f$ = original frequency
Example:
- +12 semitones → $f' = 2f$ (one octave up)
- -12 semitones → $f' = f/2$ (one octave down)
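As a quick check of the semitone relationship, the conversion in both directions can be written as two small helpers. The names `semitones_to_ratio` and `ratio_to_semitones` are hypothetical, and equal temperament (12 equal steps per octave) is assumed.

```python
import math

def semitones_to_ratio(semitones):
    """Frequency ratio for a shift in semitones (equal temperament assumed)."""
    return 2.0 ** (semitones / 12.0)

def ratio_to_semitones(ratio):
    """Inverse conversion: semitones corresponding to a frequency ratio."""
    return 12.0 * math.log2(ratio)

octave_up = semitones_to_ratio(12)      # 2.0
octave_down = semitones_to_ratio(-12)   # 0.5
fifth_up = semitones_to_ratio(7)        # roughly 1.498
```

The resulting ratio is the scaling factor that a pitch shifter applies to every frequency in the signal.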
Practical Considerations
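- Aggressive pitch shifts may introduce artifacts: aliasing when content is pushed above the Nyquist frequency, and phasiness (a smeared, reverberant quality) in phase-vocoder output.
- Windowing and overlap-add techniques reduce these artifacts.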
- Granular methods excel for large shifts; phase vocoders are smoother for moderate shifts.
Key Takeaways
- Pitch shifting alters the frequency content of an audio signal to change its perceived pitch.
- Independent pitch shifting preserves the original duration of the sound using techniques like phase vocoder, granular synthesis, or FFT-based processing.
- Naive pitch shifting (resampling) changes both pitch and duration, useful for speed effects but not for precise audio editing.
- Applications include vocal tuning, instrument transposition, sound design, and creative effects like chipmunk or monster voices.
- Understanding time–pitch relationships helps maintain naturalness and avoid unwanted artifacts such as phasiness or aliasing.
Quick Quiz
1) What happens when you naively speed up a sample by 2×?
2) Which algorithm allows independent pitch shifting without changing duration?
3) If a sound is transposed up by +7 semitones, what is the frequency ratio?
4) Which application uses pitch shifting without changing duration?
5) What is a common artifact when performing aggressive pitch shifting?