Prepared by: Doaa Gamal
Lecturer Assistant, Faculty of Engineering – Suez Canal University
Outline
Introduction
Applications
History of time and pitch modification
Time-domain techniques
Frequency-domain techniques
Parametric techniques
Conclusion
Introduction
Time-scale modification: slow down or speed up a given signal, possibly in a time-varying manner, without altering the signal’s spectral content (and in particular its pitch when the signal is periodic).
Pitch-scale modification: modify the pitch of the signal, possibly in a time-varying manner, without altering the signal’s time evolution (and in particular its duration).
Time-scaling or pitch-scaling is not easy because the time and frequency characteristics of a signal, being related by the Fourier transform, are not independent.
The simplest method of time-scaling a sound is to replay it at a different rate. With magnetic tape, for example, the tape speed may be varied, but this incurs a simultaneous change in the pitch of the signal.
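The coupling between duration and pitch under simple replay can be checked numerically. The sketch below is illustrative only: the helper names `replay` and `zero_crossing_frequency`, the 8 kHz sampling rate and the 200 Hz test tone are all assumptions, not part of the original slides.

```python
import math

def replay(signal, rate):
    """Replay `signal` at `rate` times the original speed (linear interpolation)."""
    n_out = int((len(signal) - 1) / rate) + 1
    out = []
    for n in range(n_out):
        pos = n * rate                      # read position in the original signal
        i = int(pos)
        frac = pos - i
        nxt = signal[min(i + 1, len(signal) - 1)]
        out.append((1.0 - frac) * signal[i] + frac * nxt)
    return out

def zero_crossing_frequency(signal, fs):
    """Rough frequency estimate: two sign changes per cycle."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    return crossings * fs / (2.0 * len(signal))

fs = 8000                                   # assumed sampling rate in Hz
tone = [math.sin(2 * math.pi * 200 * n / fs + 0.1) for n in range(fs)]  # 1 s tone
fast = replay(tone, 2.0)                    # "tape" running at double speed

f_tone = zero_crossing_frequency(tone, fs)
f_fast = zero_crossing_frequency(fast, fs)
print(len(tone) / fs, f_tone)               # duration in seconds, estimated pitch
print(len(fast) / fs, f_fast)
```

Doubling the replay rate halves the duration and, at the same time, roughly doubles the estimated frequency, which is exactly the side effect that time-scale and pitch-scale modification methods try to avoid.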
Applications
Speech Synthesizers
Post-synchronization
Data compression
Reading for the blind
Foreign language learning
Voice transformation
History of time and pitch modification
Signal type | Method | Technique
Analog | Tape recorder machine | Time-domain
Digital | Digital tape recorder | Time-domain
Digital | Periodicity-driven methods | Time-domain
Digital | STFT | Frequency-domain
Digital | Linear prediction models & sinusoidal models | Parametric models
Time and pitch modification techniques
Non-parametric
  Frequency-domain techniques
  Time-domain techniques
Parametric
Time-domain techniques
Pitch-independent methods
They require very few calculations.
They lend themselves very well to real-time implementation.
They are prone to artifacts because no precaution is taken at the splicing points, other than to guarantee continuity.
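A minimal sketch of a pitch-independent splicing scheme is given here as plain overlap-add with a fixed frame length. The function name `ola_time_scale`, the Hann cross-fade and the frame length of 256 samples are assumptions chosen for illustration; the slides do not specify a particular algorithm.

```python
import math

def ola_time_scale(signal, alpha, frame_len=256):
    """Blind overlap-add time scaling: alpha > 1 lengthens, alpha < 1 shortens.

    Frames are read every hop_in samples and written every hop_out samples.
    Nothing aligns the waveforms at the splice points; the window only
    guarantees a continuous cross-fade.
    """
    hop_out = frame_len // 2                # 50 % overlap on the output side
    hop_in = hop_out / alpha                # read faster or slower than we write
    # Periodic Hann window: two copies shifted by frame_len/2 sum to one.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_frames = int((len(signal) - frame_len) / hop_in) + 1
    out = [0.0] * ((n_frames - 1) * hop_out + frame_len)
    for k in range(n_frames):
        start_in = int(round(k * hop_in))
        start_out = k * hop_out
        for n in range(frame_len):
            out[start_out + n] += window[n] * signal[start_in + n]
    return out

# Usage: stretch a 4000-sample signal to roughly twice its length.
stretched = ola_time_scale([1.0] * 4000, 2.0)
```

The cost is one multiply-add per output sample per overlapping frame, which is why such methods suit real-time use. On a periodic signal, however, the overlapped frames are generally out of phase with each other, which is the source of the artifacts mentioned above.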
Time-domain techniques
Periodicity-driven methods
The most popular method using pitch information is TD-PSOLA (Time-Domain Pitch-Synchronous Overlap-Add).
It is suited to moderate modification factors (between 0.5 and 2).
TD-PSOLA analysis-synthesis process without modification
[Figure: analysis-synthesis process, panels labelled 1 to 4]
The output speech waveform of PSOLA analysis-synthesis is perceptually indistinguishable from the original waveform.
[Figure: pitch-scaling (lowering) using TD-PSOLA]
[Figure: time-scaling (lengthening) using TD-PSOLA]
Computation of synthesis pitch-marks for pitch modification
[Figure: computation of synthesis pitch-marks for pitch modification (raising)]
Computation of synthesis pitch-marks for duration modification
[Figure: computation of synthesis pitch-marks for time-scale modification (lengthening)]
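One common way to place synthesis pitch-marks is sketched below. It assumes constant modification factors and analysis pitch-marks given as sample indices. The names `local_period`, `synthesis_marks`, `alpha` (time-scale factor) and `beta` (pitch-scale factor) are illustrative assumptions, and the recursion is not necessarily the exact procedure shown in the original figures.

```python
def local_period(analysis_marks, t):
    """Pitch period (in samples) of the analysis interval that contains time t."""
    for a, b in zip(analysis_marks, analysis_marks[1:]):
        if t < b:
            return b - a
    return analysis_marks[-1] - analysis_marks[-2]

def synthesis_marks(analysis_marks, alpha=1.0, beta=1.0):
    """Place synthesis pitch-marks on the output time axis.

    alpha > 1 lengthens the signal, beta > 1 raises the pitch.
    Each new mark lies one modified period (local period / beta) after the
    previous one; the local period is read at the virtual position, i.e. the
    synthesis time mapped back onto the analysis time axis.
    """
    start, end = analysis_marks[0], analysis_marks[-1]
    marks = [float(start)]
    while True:
        virtual = start + (marks[-1] - start) / alpha
        nxt = marks[-1] + local_period(analysis_marks, virtual) / beta
        if start + (nxt - start) / alpha > end:   # ran past the analysed signal
            break
        marks.append(nxt)
    return marks

# Usage: a steady 100-sample pitch period, marked at 0, 100, ..., 1000.
analysis = list(range(0, 1001, 100))
raised = synthesis_marks(analysis, beta=2.0)      # marks every 50 samples
lengthened = synthesis_marks(analysis, alpha=2.0) # same spacing, twice as many
```

For pitch raising the marks move closer together over the same duration; for lengthening the spacing is unchanged but more marks are generated, so analysis frames end up being repeated.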
From the synthesis pitch-marks to the modified waveform
The simple way is:
For each synthesis pitch-mark, find the analysis pitch-mark nearest to its virtual pitch-mark (the synthesis pitch-mark mapped back onto the analysis time axis).
Center the frames that correspond to these nearest analysis pitch-marks on the synthesis pitch-marks.
Add the overlapping regions together.
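The simple mapping described above can be sketched as follows. The sketch assumes integer analysis pitch-marks (at least two), a Hann window two pitch periods long, and a constant time-scale factor `alpha`; the function names `extract_frame` and `psola_overlap_add` are hypothetical.

```python
import math

def extract_frame(signal, marks, i):
    """Hann-windowed short-time signal, two pitch periods long, centred on marks[i]."""
    centre = marks[i]
    period = marks[i + 1] - centre if i + 1 < len(marks) else centre - marks[i - 1]
    frame = []
    for n in range(-period, period + 1):
        idx = centre + n
        sample = signal[idx] if 0 <= idx < len(signal) else 0.0
        frame.append((0.5 + 0.5 * math.cos(math.pi * n / period)) * sample)
    return frame, period

def psola_overlap_add(signal, analysis_marks, synth_marks, alpha=1.0):
    """Nearest-frame selection followed by overlap-add at the synthesis marks."""
    start = analysis_marks[0]
    out = [0.0] * (int(round(synth_marks[-1])) + 1)
    for ts in synth_marks:
        # Virtual pitch-mark: synthesis time mapped back to the analysis axis.
        virtual = start + (ts - start) / alpha
        i = min(range(len(analysis_marks)),
                key=lambda k: abs(analysis_marks[k] - virtual))
        frame, period = extract_frame(signal, analysis_marks, i)
        centre_out = int(round(ts))
        for n, value in enumerate(frame):
            idx = centre_out - period + n
            if 0 <= idx < len(out):
                out[idx] += value           # overlapping regions are summed
    return out

# Usage: identity case (synthesis marks equal to analysis marks).
signal = [math.sin(2 * math.pi * n / 100 + 0.3) for n in range(1001)]
marks = list(range(0, 1001, 100))
copy = psola_overlap_add(signal, marks, marks)
```

With unchanged pitch-marks and a constant period, adjacent Hann windows sum to one, so the output reproduces the input. This is consistent with the analysis-synthesis process without modification being transparent.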
• In more sophisticated systems, the mapping involves linear interpolation between the two successive short-time analysis signals lying closest to the virtual pitch-mark.
The perceptual quality of prosody-modified speech produced by PSOLA methods depends on the accuracy of the pitch-mark estimation. Estimating epochs from the speech signal provides more accurate pitch-mark locations.
LP-PSOLA & FD-PSOLA
The Frequency-Domain PSOLA (FD-PSOLA) and the Linear-Predictive PSOLA (LP-PSOLA) approaches are theoretically more appropriate than the time-domain PSOLA method for pitch-scale modifications because they provide independent control over the spectral envelope of the synthesis signal.
Frequency-domain techniques
Frequency-domain algorithms operate on a short-time spectrum of the signal (phase vocoder):
1. Calculate the short-time Fourier transform (STFT) of the signal.
2. Modify the phase of each frequency channel.
3. Synthesize a signal using the inverse STFT with a different time stride.
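The three steps above can be sketched as a basic phase vocoder for time-scaling. This is a dependency-free illustration under stated assumptions: a naive O(N^2) DFT stands in for an FFT, the frame length is 64 samples with an analysis hop of a quarter frame, and the names `dft`, `wrap` and `phase_vocoder` are hypothetical. It is not meant as a production implementation.

```python
import cmath
import math

def dft(frame, inverse=False):
    """Naive discrete Fourier transform (an FFT would be used in practice)."""
    n = len(frame)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(frame[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def wrap(phase):
    """Wrap a phase value to the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi

def phase_vocoder(signal, alpha, frame_len=64):
    """Time-scale `signal` by alpha (> 1 lengthens) without changing its pitch."""
    hop_a = frame_len // 4                       # analysis stride
    hop_s = int(round(alpha * hop_a))            # synthesis stride
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_frames = (len(signal) - frame_len) // hop_a + 1
    out = [0.0] * ((n_frames - 1) * hop_s + frame_len)
    norm = [0.0] * len(out)
    prev_phase = synth_phase = None
    for m in range(n_frames):
        # Step 1: short-time spectrum of the windowed frame.
        seg = signal[m * hop_a: m * hop_a + frame_len]
        spectrum = dft([w * s for w, s in zip(window, seg)])
        mags = [abs(c) for c in spectrum]
        phases = [cmath.phase(c) for c in spectrum]
        # Step 2: per channel, estimate the true frequency from the phase
        # advance and accumulate the phase for the new stride.
        if m == 0:
            synth_phase = list(phases)
        else:
            for k in range(frame_len):
                omega = 2 * math.pi * k / frame_len
                deviation = wrap(phases[k] - prev_phase[k] - omega * hop_a)
                synth_phase[k] += (omega + deviation / hop_a) * hop_s
        prev_phase = phases
        # Step 3: inverse transform and overlap-add at the synthesis stride.
        frame = dft([cmath.rect(a, p) for a, p in zip(mags, synth_phase)],
                    inverse=True)
        for n in range(frame_len):
            out[m * hop_s + n] += window[n] * frame[n].real
            norm[m * hop_s + n] += window[n] ** 2
    # Divide by the summed squared window to undo the double windowing.
    return [o / g if g > 1e-8 else 0.0 for o, g in zip(out, norm)]

# Usage: lengthen a sinusoid with an 8-sample period by a factor of two.
tone = [math.sin(2 * math.pi * n / 8 + 0.3) for n in range(1024)]
slow = phase_vocoder(tone, 2.0)
```

For a single stationary sinusoid the output is the same sinusoid, only longer. For real signals the channel phases are not perfectly coherent after modification, which leads to the distortion discussed in the conclusion.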
Parametric techniques
Linear prediction models
Sinusoidal models
The Harmonic plus Noise Model (HNM)
Wideband models
STRAIGHT
Conclusion
Time-domain approaches are computationally cheap and perform well for small modification factors.
They are well suited to real-time implementations.
It is possible to incorporate such systems in consumer products such as telephone answering systems.
They can suffer from echoes.
In particular, time- or pitch-scale modifications by large factors cannot be carried out by time-domain methods and usually require the more elaborate frequency-domain techniques.
Frequency-domain techniques are capable of providing very high quality output. However, they still suffer from some distortion, mainly due to the effects of “phase dispersion.”
They are computationally intensive.
Parametric techniques tend to outperform non-parametric methods when the match between the signal to be modified and the underlying model is good. When this is not the case, however, the methods break down and the results are unreliable.
Parametric techniques are usually more costly in terms of computation, because they require an explicit preliminary analysis stage to estimate the model parameters.