C2. Audio Processing

Chapter 2: audio and image processing

  • Chapter II: Audio Processing

    Speech processing: overview of speech processing; predictive coding (DPCM, ADPCM); vocoders; hybrid coding; speech coding standards.

    Audio processing: the psychoacoustic model and human hearing; the basic steps of perceptual audio coding; MPEG audio coding.

  • Speech processing

    Speech coding is the process of representing a digitized speech signal with as few bits as possible while keeping a reasonable level of speech quality (speech compression).

    Voice service is the basic telecommunications service.

    Speech coding technology attracts the attention of researchers, standardization bodies and industry. It benefits from advances in microelectronics and from the availability of low-cost programmable processors and dedicated memory chips, which overcome earlier drawbacks and limitations.

    Speech coding methods are standardized for different applications.

  • Applications of speech coding

  • Goal of speech coding

    Sampling frequency = 8 kHz, bits per sample = 16

    Bit rate = 8 kHz x 16 bits = 128 kbps

    Target: 2.4 kbps, a compression ratio of about 53:1

  • Structure of a speech coding system

    [Figure: block diagram of the speech coding system, with the digital speech coder marked as the focus]

  • Requirements for a speech coder

    Low bit rate: lower transmission bandwidth and more efficient use of the system, traded off against speech quality. The target depends on the application.

    High speech quality: the quality must be acceptable for the intended application. It is characterized by intelligibility, naturalness, pleasantness and speaker recognizability.

    Robustness: across different languages and against noise.

    Good performance on non-speech signals: announcement tones, music.

    Small memory size and low computational complexity.

    Low coding delay.

  • Methods for assessing speech quality

    Mean Opinion Score (MOS), ITU-T Recommendation P.800: 5 = Excellent, 4 = Good, 3 = Fair, 2 = Poor, 1 = Bad. PCM at 64 kbps (A-law or µ-law) has a MOS of 4.5-5.0.

    PSQM (Perceptual Speech Quality Measurement), ITU-T Recommendation P.861, is based on a perceptual model: the PSQM score expresses the deviation between the reference signal and the transmitted signal.

    PESQ (Perceptual Evaluation of Speech Quality) compares the original signal X(t) with the degraded signal Y(t) obtained by passing X(t) through the communication system. The output of PESQ is an estimate of the perceived speech quality of Y(t).

    The E-model (ETR 250) is a transmission rating model: it estimates two-way speech quality and takes into account factors such as echo and delay.

  • Classification of speech coders

    By bit rate; by coding technique; by mode of use.

  • Classification of speech coders (by coding technique)

  • Speech quality versus bit rate of the coders

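  • Waveform coding

    In the time domain:

    Pulse code modulation (PCM): each sample of the signal is coded independently of the other samples.

    Differential pulse code modulation (DPCM): neighboring samples are significantly correlated, so the amplitude difference between consecutive samples is fairly small.

    The coding model exploits this property to reduce the output data rate of the source: it codes the difference between consecutive samples instead of coding each sample independently.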

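  • Predictive coding (LPC, DPCM)

    Observation: neighboring samples are very strongly correlated.

    Predictive coding: predict the current sample from the previous samples, then quantize and code the prediction error instead of the sample value itself.

    If the prediction is accurate, the prediction error is concentrated near 0 and can be coded with fewer bits than the original sample.

    The predictor most commonly used is the linear predictor:

    $x_p(n) = \sum_{k=1}^{P} a_k\, x(n-k)$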

  • Block diagram of the DPCM encoder

  • Block diagram of the DPCM decoder

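  • Example 1

    Encode the following sample sequence with a DPCM coder:

    - Sequence: {1, 3, 4, 4, 7, 8, 6, 5, 3, 1, ...}

    - Predictor: the current value is predicted by the previous value: $x_p(n) = x(n-1)$

    - Three-level quantizer: $Q(d) = \begin{cases} 2, & d \ge 1 \\ 0, & -1 < d < 1 \\ -2, & d \le -1 \end{cases}$

    - Write the reconstructed sample sequence.

    - Write the binary output sequence, given the codewords: error 0 is coded as 1, error 2 as 01, error -2 as 00.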
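    Example 1 can be traced with a short script. This is a minimal sketch under two assumptions that the slide does not state: the predictor uses the previous reconstructed value (the usual closed-loop DPCM arrangement, so that encoder and decoder stay in step), and the first prediction equals the first sample. The function names are illustrative only.

```python
def quantize(d):
    """Three-level quantizer: +2 for d >= 1, -2 for d <= -1, otherwise 0."""
    if d >= 1:
        return 2
    if d <= -1:
        return -2
    return 0


# Codewords given in Example 1.
CODEWORDS = {0: "1", 2: "01", -2: "00"}


def dpcm_encode(samples, first_prediction):
    """Return (reconstructed samples, bit string) for a first-order DPCM coder."""
    prediction = first_prediction          # assumed known to the decoder as well
    reconstructed, bits = [], []
    for x in samples:
        q = quantize(x - prediction)       # quantized prediction error
        bits.append(CODEWORDS[q])
        prediction = prediction + q        # decoder output, reused as next prediction
        reconstructed.append(prediction)
    return reconstructed, "".join(bits)


samples = [1, 3, 4, 4, 7, 8, 6, 5, 3, 1]
reconstructed, bitstream = dpcm_encode(samples, first_prediction=samples[0])
print(reconstructed)
print(bitstream)
```

    With a different starting prediction the first codeword changes, but the loop itself stays the same.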

  • Delta modulation

    Delta modulation (DM) uses a prediction-error quantizer with only two levels, $\pm\Delta$. Each sample is coded with 1 bit.
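    A delta modulator can be sketched in a few lines. The sketch below assumes that bit 1 means a non-negative prediction error (step up by delta) and bit 0 a negative one (step down by delta), and that the first prediction is 1. Both are assumptions chosen for illustration. The input is the sample sequence of Example 1 with delta = 2.

```python
def delta_modulate(samples, delta, first_prediction):
    """One-bit DM: return (bits, reconstructed samples)."""
    prediction = first_prediction
    bits, reconstructed = [], []
    for x in samples:
        bit = 1 if x - prediction >= 0 else 0      # sign of the prediction error
        prediction += delta if bit else -delta     # reconstructed value = next prediction
        bits.append(bit)
        reconstructed.append(prediction)
    return bits, reconstructed


dm_bits, dm_reconstructed = delta_modulate(
    [1, 3, 4, 4, 7, 8, 6, 5, 3, 1], delta=2, first_prediction=1
)
print(dm_bits)
print(dm_reconstructed)
```

    The reconstruction can only move by one step per sample, which is why it oscillates around slowly varying input and lags behind fast changes.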

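  • Exercises

    Exercise 3: Encode the following sequence using DM: {1, 3, 4, 4, 7, 8, 6, 5, 3, 1, ...}. Use delta = 2 and assume the first predicted sample is given. Give the reconstructed sequence and the binary code sequence.

    Exercise 4: Consider a predictive system that uses delta modulation. The predictor predicts the current sample from the previous reconstructed sample. The prediction error is quantized by a two-level quantizer.

    Given the sample sequence {3, 4, 5, 3, 1, ...}, compute the predicted value, the prediction error, the quantized prediction error and the reconstructed value for each sample. The first predicted sample is 3 and is used in both the encoder and the decoder. Assume that 1 represents e ≥ 0 and 0 represents e < 0.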

  • Subband coding (SBC)

    The signal is split into several narrow bands, and the time-domain signal of each band is coded independently.

    In speech coding, the low-frequency band contains most of the signal energy, and the ear is much less sensitive to quantization noise outside it. Low-band signals are therefore coded with more bits than high-band signals.

  • Adaptive transform coding (ATC)

    At the transmitter, the source samples are divided into frames of Nf samples. The data in each frame are transformed to the frequency domain, coded and transmitted.

    At the receiver, each frame of spectral samples is transformed back to the time domain and the signal is resynthesized from the samples.

    For efficient coding, more bits are used for the important spectral components and fewer bits for the unimportant ones.

    The transform is chosen so that the spectral samples are uncorrelated: KLT (Karhunen-Loève, optimal but complex) or DCT.

  • Waveform coding: summary

    Reconstructs a waveform that resembles the original signal.

    Complexity, cost, delay and power consumption are low.

    High-quality speech is produced only at rates above 16 kbps, not at rates below 16 kbps.

  • How can the bit rate be reduced further?

    ADPCM cannot deliver good quality at bit rates below 16 kbps.

    To reduce the bit rate further, the speech production model must be exploited: model-based coding, also called vocoder coding.

    Coding methods that are not based on a model are called waveform coding.

  • Source coding

    The speech signal is assumed to be generated by a model (the AR model) controlled by a small number of parameters. During encoding, the model parameters are estimated from the input speech, coded and sent to the decoder.

    For speech, the source coder is called a vocoder. It is based on a model of the vocal tract, excited by a white-noise source for unvoiced segments or by a pulse train whose period equals the pitch period for voiced segments.

    The information sent to the decoder consists of the filter parameters.

  • AR model

    A general method for modeling random signals is to represent the signal as the output of an all-pole linear filter driven by white noise. The power spectrum at the filter output is then the product of the white-noise spectrum and the squared magnitude of the filter transfer function.

    By choosing a filter with a suitable denominator polynomial, the desired spectral characteristics can be obtained for random signals.

    Let the values x[n], x[n-1], ..., x[n-M] be a realization of an autoregressive (AR) process of order M. They satisfy a difference equation in which a1, a2, ..., aM are the AR parameters and v[n] is a white-noise process.

    The current value x[n] of the process is therefore a linear combination of its past values x[n-1], ..., x[n-M], plus v[n]. The process x[n] can be viewed as regressed on its own previous values, hence the name autoregressive (AR).

  • Transfer function of the AR analyzer

    Taking the Z-transform of the AR difference equation (3.41) gives the transfer function of the AR analyzer, an all-zero (FIR) filter whose input is x[n] and whose output is v[n].

    The AR analyzer thus transforms an AR process at its input into white noise at its output.

  • Transfer function of the AR synthesizer

    With white noise v[n] as input, the synthesizer transfer function Hs(z) produces the AR process x[n] at the output.

    The AR synthesizer is an all-pole filter with an infinite impulse response (IIR). It takes white noise as input and produces an AR signal at the output. Equation 3.44 shows that the transfer function of the analyzer is the inverse of the transfer function of the synthesizer.

    The poles p1, p2, ..., pM of Hs(z) are the roots of the characteristic equation.

  • Synthesis filter

    When the synthesis filter (transfer function Hs(z)) is excited with white noise, the filter output has a PSD close to that of the original signal.

  • Linear prediction

    Linear prediction plays an important role in speech coding algorithms.

    Within a signal frame, the weights used to compute the linear combination (the linear prediction coefficients) are found by minimizing the mean-squared prediction error.

    The same coefficients are then used to represent that signal frame.

    The basic ingredient of the prediction method is the AR model. Linear prediction analysis is the process of estimating the AR parameters from the signal samples, under the assumption that speech is modeled as an AR signal.

    LP is also regarded as a spectrum estimation method: LP analysis yields the AR parameters, which determine the PSD of the signal itself. From the LPC coefficients of a frame, another signal can be generated whose spectral content is close to that of the original.

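  • The linear prediction problem

    Linear prediction is an identification problem in which the AR parameters are estimated from the AR signal itself (figure 4.1). A white-noise signal x[n] is filtered by the AR process synthesizer to give the output s[n], an AR signal with its own AR parameters. The LP predictor is used to predict s[n] from the M previous samples.

    The coefficients ai of the predictor are the estimates of the AR parameters and are called the LPC coefficients.

    The prediction error is the difference between s[n] and its prediction.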

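  • Minimizing the error

    System identification problem: estimate the AR parameters from s[n]; the estimates are the LPC coefficients. To carry out the estimation, a criterion must be set. Here the criterion is the mean-squared prediction error J.

    J is minimized by the choice of the LPC coefficients. J is a second-order (quadratic) function of the LPC coefficients, which makes its dependence on them explicit.

    The optimal LPC coefficients are found by setting the derivative of J with respect to each ak to zero (equation 4.4).

    When equation 4.4 is satisfied, the LPC coefficients equal the AR parameters. Once the LPC coefficients are found, the system uses them to generate an AR signal (AR synthesizer).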

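  • Minimizing the error (cont.)

    Equation 4.4 can be rewritten as equation 4.6, which defines the optimal LPC coefficients in terms of the autocorrelation values Rs[l] of the signal s[n].

    In matrix form, equation 4.6 becomes equation 4.9, where Rs is the autocorrelation matrix of the signal.

    If the inverse of the correlation matrix Rs exists, the optimal LPC coefficients can be found (equation 4.13).

    Prediction gain (the ratio of the variance of the input signal to the variance of the prediction error) measures the performance of the predictor.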

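  • Minimum mean-squared prediction error

    From figure 4.1, when the LPC coefficients equal the AR parameters, the prediction error equals the white noise used to generate the AR signal s[n]: e[n] = v[n]. The mean-squared error is then at its minimum and the prediction gain at its maximum.

    Condition for optimality: the order of the predictor must be at least the order of the AR synthesizer. In practice M is not known in advance, so the prediction gain is evaluated as a function of the prediction order. The order can then be chosen at the point where the gain saturates.

    If the order M is known, J is minimized when the LPC coefficients equal the AR parameters used to generate s[n].

    Combining equations 4.16 and 4.9 gives the result in matrix form, where 0 is the M x 1 zero vector.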

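  • LP analysis of non-stationary signals

    Speech is dynamic in nature, so the LPC coefficients must be computed for every frame, under the assumption that the statistics do not change within a frame. The LPC coefficients are computed from N data points ending at time m: s[m-N+1], s[m-N+2], ..., s[m]. They form the LPC vector, where M is the prediction order.

    Equation 4.9 is then rewritten in a time-dependent form.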

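  • Prediction schemes

    There are two basic techniques: internal prediction and external prediction.

    Internal prediction: the LPC coefficients are computed from autocorrelation values estimated on the data of the frame being processed, and are applied to that same frame.

    External prediction: the LPC coefficients are applied to a future (later) frame. External prediction works because the statistics of the signal change slowly over time. If the frame is not too long, the statistical properties can be taken from previous frames that are not too far away.

    Typical frame length: 160 to 240 samples. A finite-length window must be used to extract the samples.

    Longer frames: lower computational complexity and lower bit rate, because the LPC coefficients are computed and transmitted less often. The coding delay is larger, since the system has to wait to collect the samples, and the prediction gain is not high.

    Shorter frames: a more accurate representation, but a higher computational load and a higher bit rate.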

  • Levinson-Durbin algorithm

    Equation 4.9 can be solved through the solution 4.13, but this is in general computationally complex.

    The Levinson-Durbin (LD) and Leroux-Gueguen (LG) algorithms are well suited to LP analysis in systems deployed in practice.

    Goal: find the coefficients ai from the given autocorrelation values.

    The correlation values are estimated from the signal samples. J is the minimum mean-squared prediction error, which in practice is not known in advance.

    The LD algorithm finds the solution for the predictor of order M from that of the predictor of order M-1 (iterative recursion).

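  • Levinson-Durbin algorithm: principle

    The algorithm relies on the basic invariance property of the correlation matrix.

    It starts from the zero-order predictor (equation 4.29) and then extends the dimension of equation 4.29.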

  • Levinson-Durbin algorithm (cont.)

    Because a1 = 0, the optimality condition is not met. A term $\Delta_0$ is added to balance the equation, with $\Delta_0 = R[1]$ (equation 4.30).

    By the property of the correlation matrix, equation 4.30 is equivalent to equation 4.32.

    Equations 4.30 and 4.32 are used in the next step, the first-order predictor, which requires solving equation 4.34.

    Equation 4.34 has two unknowns: the prediction coefficient of the first-order predictor, and J1, the minimum mean-squared prediction error achievable with a first-order predictor.

  • First-order predictor (cont.)

    Solving gives the reflection coefficient k1, the prediction coefficient of the first-order filter, and J1.

    Algorithm:
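    The order-by-order recursion can be sketched as follows. The sketch assumes the sign convention in which the predictor is s_hat[n] = sum over k of a[k] * s[n-k], so the normal equations read sum over k of a[k] * R[|i-k|] = R[i]. With the opposite convention (prediction written with a minus sign) every coefficient simply changes sign. The function name is illustrative, and the autocorrelation values are those of a sample speech frame, R(0) = 1, R(1) = 0.866, R(2) = 0.554, R(3) = 0.225.

```python
def levinson_durbin(r, order):
    """Solve sum_k a[k] * r[|i - k|] = r[i] for i = 1..order.

    r      -- autocorrelation values r[0], r[1], ..., r[order]
    return -- (a, reflection, error): a[k-1] multiplies s[n-k];
              reflection holds k_1..k_order; error is the minimum
              mean-squared prediction error J at the final order.
    """
    a = []                 # coefficients of the current-order predictor
    reflection = []
    error = r[0]           # J for the zero-order predictor
    for i in range(1, order + 1):
        # Part of r[i] that the order-(i-1) predictor fails to explain.
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k = acc / error
        # Update the old coefficients and append the new one.
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        error *= 1.0 - k * k
        reflection.append(k)
    return a, reflection, error


R = [1.0, 0.866, 0.554, 0.225]
lpc, refl, J = levinson_durbin(R, order=3)
print(lpc, refl, J)
```

    Each pass costs a number of operations proportional to the current order, which is why the recursion is preferred over a general matrix inversion.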

  • Exercises

    1. A frame of speech data has the autocorrelation values R(0) = 1, R(1) = 0.866, R(2) = 0.554 and R(3) = 0.225. Find the coefficients ai (i = 1, 2, 3).

    2. An LPC system has the prediction coefficients a1 = 1.793, a2 = -1.401, a3 = 0.566, a4 = -0.147. The gain is G = 2 and the pitch period is 60 samples; assume a voiced sound. With zero initial conditions at the start of the pitch period, synthesize the first 10 samples.

    3a. Given the block diagram of a linear prediction model of the signal x(m): write the equation describing the input-output relation of the model in the time domain; take the Z-transform of the input/output equation to find the transfer function of the model (all-pole model); write the equation of the inverse linear prediction filter; find the minimum squared error with respect to the linear prediction coefficients.

    3b. The first three autocorrelation coefficients of the signal are r(0) = 1, r(1) = 0.865, r(2) = 0.521. Find the coefficients of the second-order linear prediction model and express them in pole form. Use this model to compute the frequency response of the process and plot the spectrum of the predictor.

  • Source coding

    Several models have been proposed. The most successful one is based on linear prediction: a time-varying filter that is used to generate speech.

    Typical bit rate: 2 to 5 kbps.

    Variants: linear predictive coding (LPC) and mixed excitation linear prediction (MELP).

  • Modeling the speech production system

    Anatomical structures that make up the human speech production system:

    Velum: soft palate. Pharyngeal cavity: pharynx. Larynx: voice box. Trachea: windpipe.

    Speech (a sound wave) is radiated from the nose and mouth as air is expelled from the lungs. The nasal, oral and pharyngeal cavities together form the basic acoustic filter, whose frequency response varies with time and which is excited by the airflow. The vocal tract consists of the pharyngeal cavity and the oral cavity. The resonance frequencies of the vocal tract are the formant frequencies; they depend on the shape and size of the vocal tract.

  • Modeling the speech production system (cont.)

    Vocal cords (inside the larynx): they open and close rapidly during speech.

    Voiced sounds: the vocal cords vibrate and interrupt the airflow from the lungs periodically, producing a pulse train that excites the vocal tract.

    Unvoiced sounds: the air escapes without making the vocal cords vibrate; the excitation is not periodic and is turbulent.

    In the time domain, voiced sounds are strongly periodic, with a fundamental frequency equal to the pitch frequency.

  • Modeling the speech production system (AR model: autoregressive)

    Lungs: they supply the air, that is, the energy that excites the vocal tract; this is represented by a white-noise source.

    An identification technique (linear prediction) is used to estimate the parameters of the time-varying filter from the observed signal.

  • Speech model of the vocal tract

    The output of the digital filter (the LPC filter) is the digital speech signal.

    The input is a pulse train or a white-noise sequence.

    Relation between the two models:

  • Vocoder

    Information delivered to the decoder: the parameters that characterize the filter; the unvoiced/voiced decision; the required changes of the excitation signal and the pitch period.

    The input/output relation of the filter is expressed as a linear difference equation, from which the transfer function of the filter follows.

  • Vocoder coding

    The filter model is represented as a vector A.

    A is updated every 20 ms, because speech is non-stationary. At a sampling frequency of 8000 Hz, 20 ms corresponds to 160 samples, so the speech signal is divided into frames of 20 ms (50 frames per second).

    The 160 values of S are thus represented by the 13 values of A.

    Two kinds of problem: Synthesis: given A, generate S. Analysis: given S, find the best A.

  • Source coder: implementation

    Finding the filter parameters is linear prediction analysis. With these parameters, a frame can be generated whose spectral content matches that of the original frame and which sounds similar.

    A frame can therefore be represented by 10 filter parameters plus a scale factor computed from the power level of the original frame. Total: 45 bits (40 bits for the parameters, 5 bits for the scale factor).

  • LPC vocoder at 2.4 kbps

    Operates at about 2.4 kbps or lower.

    Produces speech that is intelligible but does not sound faithful to the natural human voice.

    The LPC coefficients are represented as line spectrum pair (LSP) parameters. LSPs are mathematically in one-to-one correspondence with the LPC coefficients. The LSPs are computed as follows:

  • LPC vocoder at 2.4 kbps (cont.)

    Factoring the equations above yields the LSP parameters.

    The LSPs are ordered and bounded, and they are more strongly correlated from one frame to the next than the LPC coefficients.

    The frame size is 20 ms (50 frames per second, so 2400 bps = 48 bits per frame). The bits are assigned as follows:

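  • LPC vocoder at 2.4 kbps: bit allocation

    34 bits for the LSPs, assigned as in table B1.

    The gain G is coded with a 7-bit non-uniform scalar quantizer (a one-dimensional vector quantizer).

    For voiced sounds, the pitch period T takes values from 20 to 146. V/UV and T are coded as in table B2.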

  • Generalization: structure of a source coder

  • Block diagram of the LPC encoder

  • Block diagram of the LPC decoder

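  • Source coder: encoding process (frame by frame)

    At the transmitter: find the filter coefficients from the speech frame; find the scale factor from the speech frame; send the filter coefficients and the scale factor to the receiver.

    At the receiver: generate a white-noise sequence; multiply the white-noise samples by the scale factor; build the filter from the coefficients received from the transmitter and filter the scaled white-noise sequence. The filter output is the synthesized speech.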
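    The receiver steps can be sketched as an all-pole filter driven by a scaled excitation. The sketch assumes the convention s[n] = sum over k of a[k] * s[n-k] + G * u[n] and zero initial conditions. The coefficient values, the gain and the pitch period of 60 are illustrative numbers only. A pulse train stands in for voiced excitation and seeded Gaussian noise for unvoiced excitation.

```python
import random


def lpc_synthesize(coeffs, gain, excitation):
    """All-pole synthesis: s[n] = sum_k coeffs[k-1] * s[n-k] + gain * u[n]."""
    out = []
    for n, u in enumerate(excitation):
        past = sum(a * out[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(past + gain * u)
    return out


def pulse_train(length, pitch_period):
    """Voiced excitation: a unit pulse at the start of every pitch period."""
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(length)]


def white_noise(length, seed=0):
    """Unvoiced excitation: zero-mean, unit-variance Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


coeffs = [1.793, -1.401, 0.566, -0.147]      # illustrative filter coefficients
voiced = lpc_synthesize(coeffs, gain=2.0, excitation=pulse_train(10, 60))
unvoiced = lpc_synthesize(coeffs, gain=2.0, excitation=white_noise(160))
print([round(v, 3) for v in voiced])
```

    In a real decoder the filter state would be carried over from one frame to the next instead of being reset to zero.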

  • Advantages and disadvantages of vocoders

    The quality depends on the speech model.

    Vocoders can sound rather artificial.

    The quality is poor, and vocoders are very sensitive to errors.

    They can provide digital speech at rates below 2 kbps.

  • Hybrid coding

    Combines the two technologies, waveform coding and vocoder coding.

    Achieves good speech quality at bit rates of 2-16 kbps.

    The most widespread hybrid scheme is analysis-by-synthesis (AbS) coding: RPE-LTP (Regular-Pulse-Excited Long-Term Prediction), CELP, ACELP, CS-ACELP, etc.

  • Hybrid coding (cont.)

    Produces more natural sound: the excitation signal is arbitrary and is chosen so that the generated speech waveform is as close as possible to the real waveform.

    A hybrid coder codes the filter model and codes the excitation signal as a waveform.

    Code-excited linear prediction (CELP) selects the excitation signal from the codewords of a pre-designed codebook.

    This principle gives acceptable speech quality in the range 4.8-16 kbps in wireless telephone systems.

  • Analysis-by-synthesis (AbS) coding

    Closed-loop optimization: the best parameters are chosen so that the synthesized speech signal is as close as possible to the original signal.

    The signal is synthesized during encoding for the purpose of analysis, hence the name analysis-by-synthesis.

  • Analysis-by-synthesis (AbS) coding (cont.)

    AbS also uses the model of the human vocal tract.

    Instead of simple excitation models, the excitation signal is chosen so that the reconstructed speech waveform is as close as possible to the original speech waveform.

    The algorithm that searches for the excitation waveform determines the complexity of the coder.

    AbS is widely used in speech coding standards for mobile networks.

  • The analysis-by-synthesis loop in CELP

  • Long-term prediction filter

    Experiments with real speech data show that the filter order must be large enough to span at least one pitch period in order to model voiced signals.

    A linear predictor of order 10 cannot accurately model the periodicity of a voiced signal with a pitch period of 50.

    Increasing the predictor order removes the periodicity from the prediction error, which raises the prediction gain.

    A high predictor order, however, increases the implementation cost and the bit rate, since more bits are needed to represent the prediction coefficients, and adds computation to the analysis. A solution is needed that is both simple and able to model the signal accurately.

    Experimental observation: the prediction gain grows mainly over the first 8-10 prediction coefficients, plus the coefficient at the pitch period, 49. The coefficients of order 11-48 and above 49 do not contribute to improving the prediction gain (figure 4.9).

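  • Long-term prediction filter (cont.)

    The short-term predictor has a relatively low prediction order (M = 8-12) and removes the correlation between neighboring samples. The long-term predictor targets the correlation between samples that are one pitch period apart.

    The transfer function of the long-term filter depends on two parameters that must be determined: the pitch period T and the long-term gain b (long-term LP analysis).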
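    One common way to determine T and b is an exhaustive search over candidate lags, sketched below. The sketch assumes a one-tap predictor with error e[n] = s[n] - b * s[n-T] (other texts write the tap with the opposite sign), a lag range of 20 to 146 samples, and a synthetic test signal built with a known lag of 50 and gain of 0.9. All names and the test signal are illustrative.

```python
import random


def long_term_analysis(s, start, t_min=20, t_max=146):
    """Return (lag, gain, error) minimizing sum (s[n] - gain * s[n - lag])^2."""
    if start < t_max:
        raise ValueError("start must be at least t_max so that s[n - lag] exists")
    frame = range(start, len(s))
    best_lag, best_gain, best_error = None, 0.0, float("inf")
    for lag in range(t_min, t_max + 1):
        energy = sum(s[n - lag] ** 2 for n in frame)
        if energy == 0.0:
            continue
        # For a fixed lag the optimal gain has a closed form.
        gain = sum(s[n] * s[n - lag] for n in frame) / energy
        error = sum((s[n] - gain * s[n - lag]) ** 2 for n in frame)
        if error < best_error:
            best_lag, best_gain, best_error = lag, gain, error
    return best_lag, best_gain, best_error


# Synthetic signal: white noise plus a copy of itself delayed by 50 samples.
rng = random.Random(1)
signal = []
for n in range(600):
    echo = 0.9 * signal[n - 50] if n >= 50 else 0.0
    signal.append(rng.uniform(-1.0, 1.0) + echo)

lag, gain, error = long_term_analysis(signal, start=146)
print(lag, round(gain, 2))
```

    The search costs one pass over the frame per candidate lag, which is far cheaper than raising the order of the short-term predictor to cover a whole pitch period.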

  • Frame/subframe structure

    Short-term LP analysis is applied over a relatively long interval (a frame of 240 samples).

    The frame is divided into subframes of 60 samples (a shorter interval). Long-term LP analysis is performed on these subframes (CELP coder).

  • CELP coder

    Uses long-term and short-term linear prediction models to synthesize speech, which avoids the voiced/unvoiced classification of LPC. These are combined with an excitation codebook, which is searched during encoding to find the best excitation sequence.

    The pitch synthesis filter generates a periodic signal at the fundamental (pitch) frequency and feeds it to the formant synthesis filter, which shapes the spectral envelope.

    Codebook: fixed or adaptive, containing deterministic pulses or random noise.

    Simplest case: a fixed codebook containing white-noise samples.

    The excitation index selects the white-noise sequence that is fed into the filters.
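    The codebook search can be sketched as a loop that synthesizes every candidate and keeps the one with the smallest squared error. This is a deliberately reduced sketch: it uses only a short-term all-pole filter (no pitch filter and no perceptual weighting, both of which real CELP coders include), a tiny hand-made codebook and a one-coefficient filter. All of these are assumptions for illustration.

```python
def synthesize(excitation, coeffs):
    """All-pole filter: y[n] = u[n] + sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, u in enumerate(excitation):
        past = sum(a * out[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(u + past)
    return out


def search_codebook(target, codebook, coeffs):
    """Return (index, gain, error) of the codevector that best matches target."""
    best_index, best_gain, best_error = None, 0.0, float("inf")
    for index, codevector in enumerate(codebook):
        y = synthesize(codevector, coeffs)           # synthesis inside the analysis loop
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        gain = sum(t * v for t, v in zip(target, y)) / energy   # optimal gain
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if error < best_error:
            best_index, best_gain, best_error = index, gain, error
    return best_index, best_gain, best_error


codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [1, -1, 1, -1], [1, 1, 0, 0]]
coeffs = [0.5]
# Target built from entry 2 with gain 3, so the search should recover both.
target = [3.0 * v for v in synthesize(codebook[2], coeffs)]
index, gain, error = search_codebook(target, codebook, coeffs)
print(index, gain, error)
```

    Only the index and the quantized gain need to be transmitted, because the decoder holds the same codebook.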

  • CELP coder (block diagram)

  • RPE-LTP coder (Regular-Pulse-Excited Long-Term Prediction)

    It is an ADPCM-type coder in which the predictor is computed from the signal, and the prediction error is found and quantized adaptively.

    It has two predictors, short-term and long-term, which raise the average prediction gain.

    Encoder:

    - The parameters of each frame/subframe are extracted and packed to form the bit stream.

    - The input speech samples are divided into frames (160 samples, 20 ms), and the frames into subframes (40 samples).

    - Preprocessing block: a high-pass filter removes the DC component.

    - LP analysis: performed on each frame with a prediction order of 8. Nine autocorrelation values are computed from the frame using a rectangular window, and these are used to find the 8 reflection coefficients.

  • Long-term prediction analysis, filtering and encoding block

  • RPE-LTP decoder

  • Remarks

    Almost all hybrid coders are based on the LPC model. Depending on how the excitation signal is generated, different hybrid coders are obtained:

    - Multi-pulse excitation: MPE-LTP

    - Regular-pulse excitation: RPE-LTP

    - Code-excited coding: CELP, ACELP, CS-ACELP

    - Vector-sum excited coding: VSELP, etc.

    Hybrid coders overcome the drawbacks of LPC and provide low-rate voice service with relatively good quality.

  • Speech coding standards

  • Audio coding

    Sound; model of an audio encoder and decoder; perceptual coders.

    Psychoacoustics; auditory masking: frequency-domain masking and temporal masking.

    MPEG audio compression standards.

  • Sound

    Sound is a continuous signal produced by the compression and rarefaction of air.

    The changes in air pressure make the eardrum vibrate.

    The frequency range from 16 Hz to 20000 Hz is called the audio frequency range.

    The wavelength of sound in the audio range runs from 21.25 m down to 0.017 m.

    Sounds with frequencies below 16 Hz are called infrasound; sounds with frequencies above 20000 Hz are called ultrasound.
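    The two wavelength limits follow from wavelength = speed / frequency. The check below assumes a speed of sound of 340 m/s, which is the value implied by the quoted figures rather than one stated on the slide.

```python
SPEED_OF_SOUND_M_S = 340.0   # assumed; consistent with the wavelengths quoted above


def wavelength_m(frequency_hz):
    """Wavelength in metres of a sound wave at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


print(wavelength_m(16.0), wavelength_m(20000.0))
```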

  • Characteristics of sound

    Frequency: the number of oscillations of the air carrying the sound per unit of time (one second). Frequency determines the pitch of the sound: the higher the frequency, the higher the pitch, and vice versa.

    Intensity: the amount of energy carried by the sound wave per unit of time through a unit area perpendicular to the direction of propagation. Unit: W/m2.

    Loudness: the sound level as perceived by humans. It is measured in phon, the perceived intensity at 1000 Hz (the reference sound intensity).

    Power: the sound energy passing through an area S in one second.

    Quality (timbre): timbre is known as the "quality" or "color" of a sound; it makes it possible to tell different musical instruments apart.

  • Transmission of digital audio

    Single-channel (monaural) sound is rare. CD: 2 channels (stereo). DVD: 7.1 channels (surround sound): 7 normal channels + 1 low-frequency effects (LFE) channel.

  • Audio coding

    Music has a wider bandwidth and multiple channels.

    Waveform coding preserves natural sound quality.

    The characteristics of the human ear are used to determine the number of quantization levels in the different frequency bands.

    Each frequency component is quantized with a step size that depends on the hearing threshold.

    Frequency components that the human ear cannot hear are not coded.

  • Audio coding: bit rate

    Higher audio quality means a faster sampling rate, more bits per sample and more channels.

    Bit rate of an audio signal with Nch channels: B0 = b x Fs x Nch, where b is the number of bits per sample.

    DVD-Video: 48 kHz x 24 bits/sample = 1152 kbps per channel; 2304 kbps for 2 channels; 6912 kbps for 5.1; 9216 kbps for 7.1.

    The required bandwidth is large, while audio is delivered to customers over media of limited capacity (wireless: the requirement is more than 36 times the allocated channel bandwidth).

    Solutions: increase the channel capacity (very costly, not feasible), or reduce the bandwidth requirement by lowering the bit rate (digital audio coding).
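    The uncompressed bit rates follow directly from B0 = b x Fs x Nch. The sketch below counts 5.1 as 6 channels and 7.1 as 8 channels, which is what the quoted figures imply. It also reproduces the speech case of 8 kHz x 16 bits against a 2.4 kbps target. The helper name is illustrative.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate B0 = b * Fs * Nch, in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000


layouts = [("mono", 1), ("stereo", 2), ("5.1", 6), ("7.1", 8)]
dvd_rates = {name: pcm_bit_rate_kbps(48000, 24, n) for name, n in layouts}
print(dvd_rates)

speech_rate = pcm_bit_rate_kbps(8000, 16)      # telephone-band speech
speech_ratio = speech_rate / 2.4               # compression ratio at 2.4 kbps
print(speech_rate, round(speech_ratio, 1))
```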

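  • Block diagram of audio coding

    Goal: fewer bits. Compression ratio: r = B0/B, where B is the bit rate needed to transmit the compressed version.

    The channel coder, modulator, physical channel and demodulator introduce bit errors.

    Lossless: the recovered audio signal is identical to the source audio signal.

    Lossy: the recovered signal is only close to the source; some information is lost and the audio signal is distorted (ideally imperceptibly).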

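  • Idea

    Information theory: the minimum average bit rate needed to transmit a source signal is its entropy H, which is determined by the probability distribution of the source signal.

    The difference R = B0 - H is the statistical redundancy.

    Lossless audio coding removes as much statistical redundancy from the source signal as possible, so that B is as close to H as possible (figure 1.2).

    Entropy coding is the coding technique that removes the statistical redundancy.

    Remark: the compression ratio of lossless coding is limited (about 2:1) and does not meet practical requirements (36:1). At that level of compression some information in the source signal has to be lost and cannot be recovered by the decoder.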

  • Idea (cont.)

    Information that is lost and cannot be recovered distorts the audio signal reconstructed at the decoder output.

    Problem: design a coder such that the ear does not perceive the distortion, or perceives it only at a level that is not annoying.

    The part of the source information whose removal causes distortion without affecting perception, or without being annoying, is perceptually irrelevant. It can be removed from the source signal, which reduces the bit rate considerably (lossy coder).

    A lossy coder removes the perceptually irrelevant information as well as the statistical redundancy (figure 1.3).

  • Data model

    A data model makes audio coding more efficient, but also more complex.

    Audio signals are highly correlated and have internal structure that can be represented by data models.

    Example: a 1000 Hz sine wave sampled at 48 kHz with 16 bits per sample. Its periodicity shows that it is highly correlated. The periodicity can be modeled, and the correlation removed, by linear prediction, by an orthogonal (decorrelating) transform, or by a lapped transform (subband coding). An orthogonal transform such as the DCT converts the input signal into uncorrelated coefficients and concentrates the energy in a small number of them.

  • Perceptual model

    The model determines the optimal level at which perceptually irrelevant information can safely be removed.

    It is designed in the frequency domain (masking threshold).

  • Basic architecture (p. 15): audio encoder

  • Basic architecture (p. 15): audio decoder

  • Perceptual audio coder

  • Analysis and synthesis filter banks

    Filter bank: an important component of most audio coders. It converts the signal from the time domain to the frequency domain and back.

  • Down-sampling

    Down-sampling by a factor of N means keeping only every N-th sample (the samples with index nN).

    Example of an analysis filter bank: N = 1024 filters; sampling frequency fs = 44100 Hz; Nyquist frequency fg = 22050 Hz.

  • Up-sampling

    Up-sampling by a factor of N means inserting N-1 zero samples between the input samples.

    Synthesis filter bank
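    Both operations are one-liners on a list of samples. The sketch below covers only the rate change itself; the filtering that accompanies it in a filter bank is left out. The function names are illustrative.

```python
def downsample(samples, factor):
    """Keep every factor-th sample (indices 0, factor, 2 * factor, ...)."""
    return samples[::factor]


def upsample(samples, factor):
    """Insert factor - 1 zeros after every input sample."""
    out = []
    for value in samples:
        out.append(value)
        out.extend([0] * (factor - 1))
    return out


x = [1, 2, 3]
print(upsample(x, 3))
print(downsample(upsample(x, 3), 3))
```

    Up-sampling followed by down-sampling with the same factor returns the input unchanged. The reverse order does not, because the discarded samples are lost.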

  • Modulated filter banks

    Improvements to the filter bank: a larger window size and different window shapes.

  • Psychoacoustics

    Block diagram of a perceptual audio coder.

  • Structure of the ear

  • Sound preprocessing in the peripheral auditory system

    Frequency selectivity of the basilar membrane.

  • Sound processing in the auditory system

    Basilar membrane = filter bank; cochlea.

  • The human ear as a filter bank

    The auditory system can be modeled as a filter bank made of 25 overlapping band-pass filters covering 0 to 20 kHz.

    The ear cannot distinguish sounds that occur simultaneously within the same band.

    Each band is called a critical band. The bandwidth of a critical band is about 100 Hz for signals below 500 Hz, and increases linearly above 500 Hz up to 5000 Hz.

    1 bark = the width of one critical band:

    $B_{\text{bark}} = \begin{cases} f/100, & f < 500\ \text{Hz} \\ 9 + 4\log_2(f/1000), & f > 500\ \text{Hz} \end{cases}$
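    The frequency-to-bark mapping is easy to tabulate. The sketch below uses the piecewise rule with f/1000 inside the logarithm, which is the form that makes the two branches meet at 500 Hz (both give 5 bark). Treating exactly 500 Hz as part of the upper branch is an arbitrary choice.

```python
import math


def hz_to_bark(frequency_hz):
    """Approximate critical-band number (bark) for a frequency in Hz."""
    if frequency_hz < 500:
        return frequency_hz / 100
    return 9 + 4 * math.log2(frequency_hz / 1000)


for f in (100, 500, 1000, 4000, 16000):
    print(f, hz_to_bark(f))
```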

  • Sound perception

    Frequency and the audio frequency range.

  • Hearing threshold

    - The hearing threshold is a function of the sound frequency.

    - When frequency components fall below the threshold level, sounds at those frequencies cannot be heard.

    - The human ear is most sensitive in the range 2-4 kHz.

  • Frequency-domain masking

    A signal whose pressure is above the hearing threshold can still be masked by signals of higher pressure located close to it in frequency; the signal at that frequency is then not heard. The masking signal shifts the hearing threshold.

  • Temporal masking

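  • Perceptual audio coding

    Split the signal into separate frequency bands with a filter bank.

    Analyze the signal energy in the different bands and determine the total masking threshold of each band caused by the signals in the other bands.

    Quantize the samples in the different bands with a precision proportional to the masking level.

    A signal below the masking level does not need to be coded.

    A signal above the masking level is quantized, and the bits are assigned across the bands so that each additional bit gives the largest possible reduction in perceived distortion.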

  • MPEG standards

    MPEG: the Moving Picture Experts Group of the International Organization for Standardization (ISO).

    MPEG-1: defines coding standards for audio and video, and the way audio and video bits are packetized with time synchronization. Total rate: 1.5 Mbps. Video (352x240 pels/frame, 30 frames/s): from 30 Mbps down to 1.2 Mbps. Audio (2 channels, 48 K samples/s, 16 bits/sample): from 2 x 768 kbps down to less than 0.3 Mbps. Applications: web movies, MP3 audio, video CD.

    MPEG-2: for better audio and video quality. Video (720x480 pels/frame, 30 frames/s): from 216 Mbps down to 3-5 Mbps. Audio (5.1 channels), advanced audio coding (AAC).

    MPEG-4: aims at a wide variety of applications, with a wide range of quality levels and bit rates; the quality improvement is mainly at low bit rates. Intended for Internet audio and video streaming.

  • MPEG coding standard

    The MPEG approach: MPEG standardizes only the bit stream format and the decoder; it makes no recommendation about the encoding algorithm.

    MPEG-1: bit rates from 32 kb/s to 448 kb/s. Three layers:

    - Layer 1: lowest complexity.

    - Layer 2: higher complexity and higher quality.

    - Layer 3: highest complexity; achieves the highest quality at low bit rates.

    Target bit rates: 384 kbps; 256 kbps;

  • MPEG audio coding algorithm

    - Subband filtering.

    - Masking of a band by neighboring bands, using a psychoacoustic model.

    - Removal of bands whose pressure lies below the masking threshold.

    - Quantization, bit allocation and coding.

    - Bit stream formatting.

  • Basic steps of MPEG-1 audio coding

    Use convolution filters to split the audio signal into 32 subbands (subband filtering).

    Determine the masking level of each band from its frequency (absolute masking threshold, or threshold in quiet) and from the energy of the bands that are adjacent in frequency and in time (frequency-domain masking and temporal masking).

    If the energy in a band lies below the masking threshold, do not code it.

    Otherwise, determine the number of bits needed to represent the coefficients of that band so that the noise introduced by quantization stays below the masking level (each additional bit reduces the quantization noise by 6 dB).

    Bit stream formatting: insert the appropriate headers, code the side information such as the quantized scale factors of the different bands, and apply variable-length coding (Huffman).

  • The layers of MPEG-1

    Layer 1:

    - Frame length: 384 samples (8 ms at fs = 48 kHz)

    - Frequency resolution: 32 subbands

    - Quantization: block companding (12 samples); the amplitude of the subband samples is indicated by a scale factor (SCF) with a resolution of 2 dB.

    Layer 2:

    - Frame length: 1152 samples (24 ms at fs = 48 kHz)

    - Frequency resolution: 32 subbands

    - Quantization: block companding (12 samples), with scale factor select information (SCFSI).

    Layer 3:

    - Frame length: 1152 samples (24 ms at fs = 48 kHz)

    - Frequency resolution: 576/192 subbands

    - Quantization: non-uniform, with coding, and with scale factor select information (SCFSI).
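    The frame durations follow from frame length divided by sampling frequency. The short check below also relates the frame lengths to the 32-subband structure: 12 samples per subband in Layer 1 and three such groups of 12 in Layer 2. The helper name is illustrative.

```python
def frame_duration_ms(samples_per_frame, sample_rate_hz):
    """Duration of one frame in milliseconds."""
    return 1000 * samples_per_frame / sample_rate_hz


SUBBANDS = 32
layer1_frame = SUBBANDS * 12          # one block of 12 samples per subband
layer2_frame = SUBBANDS * 12 * 3      # three blocks of 12 samples per subband

print(layer1_frame, frame_duration_ms(layer1_frame, 48000))
print(layer2_frame, frame_duration_ms(layer2_frame, 48000))
```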

  • MPEG-1 Layer 1: block diagram

  • MPEG-1 Layer 1: frame structure

  • MPEG-1 Layer 3

    MP3 = ISO/IEC IS 11172-3 (MPEG-1 Layer 3) and 13818-3 (MPEG-2 Layer 3).

    The file format has no file header, and none is required.

    The minimum delay of the encoder/decoder pair is 59 ms.

  • MPEG-1 Layer 3 (cont.)

    The basic structure is the same as Layer 2: frame length (24 ms at 48 kHz) and polyphase filter bank.

    Differences:

    - Hybrid filter bank with MDCT (32 x 18 = 576 subbands or 32 x 6 = 192 subbands).

    - Non-uniform quantization.

    - Huffman coding.

    - Analysis-by-synthesis structure.

    - Support for variable bit rate.

  • Structure of the MPEG-1 decoder

    Decoding process for Layers 1 and 2.

  • Decoding process for MPEG-1 Layer 3

  • MPEG decoding process: synthesis subband filter bank

  • Exercise 8

    1. Name the three basic properties of hearing that determine the lowest level at which a sound can be heard.

    2. Suppose an audio signal is divided into 16 frequency bands with the following energy levels:

    Band        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
    Level (dB)  0  8 12 10  6  2 20 60 14 20 15  2  3  5  3  1

    Suppose that band 8, at a level of 60 dB, masks 12 dB in band 7 and 15 dB in band 9. Determine the number of bits needed to code bands 7 and 9, given that the original signal is represented with 8 bits per sample per band.

    3. Name the three basic steps of perceptual audio coding. Draw a block diagram showing these three components.

    4. State the differences between the MPEG audio layers in terms of the techniques used and the audio quality at the same bit rate.

  • Solutions

    Exercise 1: The absolute hearing threshold, frequency-domain masking and temporal masking. The absolute hearing threshold gives the lowest level that can be heard when only a single tone is present; it depends on frequency.

    Exercise 2: The energy of band 7 is 20 dB, which is above 12 dB, so the band must be coded. With a masking level of 12 dB, it can be coded with 12 dB of quantization noise, which saves 2 bits at 6 dB per bit. Only 6 bits are needed.

    For band 9, the signal energy is 14 dB, below 15 dB. Band 9 does not need to be coded.

    Exercise 3: Subband filtering, to obtain the subband signals in the different frequency ranges. Computation of the masking level from the energy of the different subbands. Bit allocation among the subbands according to the masking level of each subband.
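    The reasoning of Exercise 2 can be written as a small rule. This is a sketch that assumes 6 dB of tolerated quantization noise per bit removed and simply drops a band whose level is at or below its masking level. It is not the bit allocation procedure of any particular standard, and the function name is illustrative.

```python
DB_PER_BIT = 6   # each bit changes the quantization noise by about 6 dB


def bits_for_band(level_db, mask_db, full_bits=8):
    """Bits needed for a band given its level and its masking level in dB."""
    if level_db <= mask_db:
        return 0                              # fully masked: not coded
    return full_bits - mask_db // DB_PER_BIT  # masked noise needs no bits


band7_bits = bits_for_band(level_db=20, mask_db=12)
band9_bits = bits_for_band(level_db=14, mask_db=15)
print(band7_bits, band9_bits)
```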