Signal Processing 19 (1990) 91-102 (Vol. 19, No. 2, February 1990), Elsevier
FAST DISCRETE SINE TRANSFORM ALGORITHMS*
Zhongde WANG
Beijing University of Posts and Telecommunications, Beijing, People's Rep. China
Received 10 February 1988; revised 2 August 1988, 16 January 1989 and 11 September 1989
Abstract. A novel class of algorithms for the discrete sine transform (DST) is introduced in this paper. By using a basic trigonometric identity, these algorithms successively reduce the summation size in a simple manner, which leads to a very simple structure. The indexing of the algorithms involves the Hadamard order, whose generation is given in this paper. The algorithms use cosines and sines as multipliers and therefore cause less computational error than algorithms with secant multipliers. The multipliers can be generated recursively in a simple way, without calling any trigonometric functions. FORTRAN subroutines to compute various types of the DST are provided.
Keywords. Discrete sine transform, fast algorithms.
1. Introduction
The discrete sine transform (DST) was first introduced into digital image processing by Jain [2] and later classified by Wang and Hunt [10]. Besides its application in image processing [2], the DST is applicable to adaptive filtering [12] and transmultiplexing [16]. Wang has shown that, under certain conditions, the Karhunen-Loeve transform of a Markov-I signal reduces to the DST, more precisely, to the DST-I (type I of the DST) and the DST-II [9, 11]. In addition, all the different types of the DST relate closely to the different types of the discrete W transform (DWT) [10], which are real approaches to spectral analysis, and to the different types of the discrete cosine transform (DCT), as well as to the discrete Fourier transform (DFT) [6]. New algorithms for the DST may therefore be applied as a basic scheme to implement the computation of the DCT, the DWT and the DFT.

* Supported by the National Natural Science Foundation of China.
0165-1684/90/$3.50 © 1990, Elsevier Science Publishers B.V.

In the last few years, a number of algorithms were developed for the DST [5, 8, 13-15]. Among them, [5] establishes the relationship between the DCT and the DST, which enables one to compute the DST via algorithms for the DCT or, conversely, to compute the DCT via DST algorithms. The algorithms in [14, 15] possess a simple structure; their use of secant multipliers, however, is a disadvantage. Although the algorithm in [13] uses cosines and sines as multipliers, it contains some errors [8] and is hard to implement.
In this paper, a novel class of algorithms for the DST is introduced. These algorithms use cosine and sine multipliers and require as few arithmetic operations as the most efficient known algorithms (see Section 6). In addition, their very simple structure makes them easy to implement in either hardware or software.
2. Four even types of the DST

There are two categories of DSTs [10]: the even types and the odd types. Since all present applications of the DST involve the even types only, we concentrate on the even types in this paper. The four even-type DST matrices are defined as [10]

$$S^{I}_{N-1} = \sqrt{2/N}\,\bigl[\sin(mn\pi/N)\bigr], \quad m, n = 1, 2, \ldots, N-1,$$

$$S^{II}_{N} = \sqrt{2/N}\,\bigl[k_m \sin\bigl(m(n-\tfrac12)\pi/N\bigr)\bigr], \quad m, n = 1, 2, \ldots, N,$$

$$S^{III}_{N} = \sqrt{2/N}\,\bigl[k_n \sin\bigl((m-\tfrac12)n\pi/N\bigr)\bigr], \quad m, n = 1, 2, \ldots, N,$$

$$S^{IV}_{N} = \sqrt{2/N}\,\bigl[\sin\bigl((m+\tfrac12)(n+\tfrac12)\pi/N\bigr)\bigr], \quad m, n = 0, 1, \ldots, N-1,$$

where the subscript represents the order of the matrix and the Roman superscript represents the type of the DST; $k_m = \sqrt{2}/2$ for $m = N$ and $k_m = 1$ otherwise.

The DSTs relate to each other by [6]

[Equations (1) and (2): factorizations of $S^{I}_{N-1}$ and $S^{II}_{N}$ of the form $P_{N-1}[\,\cdot\,]A_{N-1}$ and $P_{N}[\,\cdot\,]A_{N}$, where the bracket is a block-diagonal matrix of lower-order DST matrices. The block entries and the explicit matrices $A_{N-1}$, $A_N$, $P_{N-1}$ and $P_N$ are illegible in this copy.]

and

$$S^{III}_{N} = [S^{II}_{N}]^{-1} = [S^{II}_{N}]^{T},$$

where T represents matrix transposition and $\bar I$ is the antidiagonal identity matrix, all of whose elements are zero except those along the NE-SW diagonal.

With the relations given above, one may decompose $S^{I}$, $S^{II}$ and $S^{III}$ into $S^{IV}$ without any multiplications.
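The definitions above can be checked numerically. The following is a minimal sketch in plain Python (the function names `dst_matrices`, `matmul`, `transpose`, `max_abs_diff` and `identity` are illustrative and do not come from the paper); it builds the four matrices exactly as defined and lets one confirm that $S^{III}_N$ is the transpose, and hence the inverse, of $S^{II}_N$.

```python
import math


def dst_matrices(N):
    """Return (S1, S2, S3, S4): DST-I of order N-1 and DST-II/III/IV of order N."""
    c = math.sqrt(2.0 / N)

    def k(m):
        # k_m = sqrt(2)/2 for m = N and 1 otherwise
        return math.sqrt(0.5) if m == N else 1.0

    S1 = [[c * math.sin(m * n * math.pi / N) for n in range(1, N)]
          for m in range(1, N)]
    S2 = [[c * k(m) * math.sin(m * (n - 0.5) * math.pi / N)
           for n in range(1, N + 1)] for m in range(1, N + 1)]
    S3 = [[c * k(n) * math.sin((m - 0.5) * n * math.pi / N)
           for n in range(1, N + 1)] for m in range(1, N + 1)]
    S4 = [[c * math.sin((m + 0.5) * (n + 0.5) * math.pi / N)
           for n in range(N)] for m in range(N)]
    return S1, S2, S3, S4


def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def max_abs_diff(A, B):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def identity(n):
    return [[1.0 if r == s else 0.0 for s in range(n)] for r in range(n)]
```

For example, with `S1, S2, S3, S4 = dst_matrices(8)`, the value `max_abs_diff(matmul(S2, S3), identity(8))` should be at the level of floating-point rounding.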
3. Algorithm for the DST-IV
Since the constant $\sqrt{2/N}$ in the definitions of the DSTs does not affect the analysis of the algorithm, it will be omitted in the following discussion.

Let $N = 2^M$ be a power of 2, and let $x(n)$, $n = 0, 1, \ldots, N-1$, be the input sequence. Its DST-IV coefficients are given by

$$X(m) = \sum_{n=0}^{N-1} x(n)\sin\bigl((m+\tfrac12)(n+\tfrac12)\pi/N\bigr) = \sum_{n=0}^{N-1} x(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m = 0, 1, \ldots, N-1, \tag{3}$$

where

$$\theta_m = (m+\tfrac12)\pi/N. \tag{4}$$

We partition the summation into two parts, each containing $N/2$ terms. The first part is given by

$$\Sigma_1 = \sum_{n=0}^{N/2-1} x(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \tag{5}$$

and the second part is given by

$$\Sigma_2 = \sum_{n=N/2}^{N-1} x(n)\sin\bigl((n+\tfrac12)\theta_m\bigr) = \sum_{n=0}^{N/2-1} x(\tfrac12 N+n)\sin\bigl((\tfrac12 N+n+\tfrac12)\theta_m\bigr). \tag{6}$$

By using the basic trigonometric identity

$$\sin(\alpha+\beta) = 2\cos\alpha\,\sin\beta + \sin(\alpha-\beta), \tag{7}$$

one has

$$\sin\bigl((\tfrac12 N+n+\tfrac12)\theta_m\bigr) = 2\cos(\tfrac12 N\theta_m)\sin\bigl((n+\tfrac12)\theta_m\bigr) + \sin\bigl((\tfrac12 N-(n+\tfrac12))\theta_m\bigr).$$

$\Sigma_2$ may then be written as

$$\Sigma_2 = 2\cos(\tfrac12 N\theta_m)\sum_{n=0}^{N/2-1} x(\tfrac12 N+n)\sin\bigl((n+\tfrac12)\theta_m\bigr) + \sum_{n=0}^{N/2-1} x(\tfrac12 N+n)\sin\bigl((\tfrac12 N-(n+\tfrac12))\theta_m\bigr).$$

Replacing $n$ by $\tfrac12 N-1-n$ in the last summation, one gets

$$\Sigma_2 = \sum_{n=0}^{N/2-1}\bigl[2\cos(\tfrac12 N\theta_m)\,x(\tfrac12 N+n) + x(N-1-n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr). \tag{8}$$

Substituting (5) and (8) into (3), one obtains

$$X(m) = \sum_{n=0}^{N/2-1}\bigl[x(n) + x(N-1-n) + 2\cos(\tfrac12 N\theta_m)\,x(\tfrac12 N+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m = 0, 1, \ldots, N-1. \tag{9}$$

Since

$$\cos(\tfrac12 N\theta_m) = \cos\bigl((m+\tfrac12)\pi/2\bigr) = \begin{cases}\cos(\pi/4), & m \bmod 4 = 0, 3,\\ -\cos(\pi/4), & m \bmod 4 = 1, 2,\end{cases} \tag{10}$$

we let

$$x_{1,0}(n) = x(n) + x(N-1-n) + 2\cos(\pi/4)\,x(\tfrac12 N+n), \tag{11}$$

$$x_{1,1}(n) = x(n) + x(N-1-n) - 2\cos(\pi/4)\,x(\tfrac12 N+n). \tag{12}$$

$X(m)$ may then be partitioned into two groups, each containing $N/2$ coefficients:

$$X(m) = \sum_{n=0}^{N/2-1} x_{1,0}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 4 = 0, 3, \tag{13}$$

$$X(m) = \sum_{n=0}^{N/2-1} x_{1,1}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 4 = 1, 2. \tag{14}$$
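The first stage of the decomposition, equations (9) to (14), can be verified numerically. The sketch below (plain Python; the function names `dst4_direct` and `dst4_one_stage` are illustrative, not from the paper) evaluates the unnormalised DST-IV both directly from (3) and through the two half-length sequences $x_{1,0}$ and $x_{1,1}$.

```python
import math


def dst4_direct(x):
    """Unnormalised DST-IV by direct evaluation of equation (3)."""
    N = len(x)
    return [sum(x[n] * math.sin((m + 0.5) * (n + 0.5) * math.pi / N)
                for n in range(N)) for m in range(N)]


def dst4_one_stage(x):
    """Same coefficients via one decomposition stage, equations (11)-(14)."""
    N = len(x)
    half = N // 2
    c = math.cos(math.pi / 4)
    # equations (11) and (12): two sequences of length N/2
    x10 = [x[n] + x[N - 1 - n] + 2 * c * x[half + n] for n in range(half)]
    x11 = [x[n] + x[N - 1 - n] - 2 * c * x[half + n] for n in range(half)]
    X = []
    for m in range(N):
        theta = (m + 0.5) * math.pi / N
        seq = x10 if m % 4 in (0, 3) else x11   # grouping of (13) and (14)
        X.append(sum(seq[n] * math.sin((n + 0.5) * theta)
                     for n in range(half)))
    return X
```

Each coefficient now needs a sum of only $N/2$ terms; repeating the step on $x_{1,0}$ and $x_{1,1}$ is what the following stages do.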
Equations (13) and (14) possess the same structure as (3). Repeating the same procedure from (3) to (9), one may represent these two groups by

$$X(m) = \sum_{n=0}^{N/4-1}\bigl[x_{1,0}(n) + x_{1,0}(\tfrac12 N-1-n) + 2\cos(\tfrac14 N\theta_m)\,x_{1,0}(\tfrac14 N+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 4 = 0, 3, \tag{15}$$

and

$$X(m) = \sum_{n=0}^{N/4-1}\bigl[x_{1,1}(n) + x_{1,1}(\tfrac12 N-1-n) + 2\cos(\tfrac14 N\theta_m)\,x_{1,1}(\tfrac14 N+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 4 = 1, 2. \tag{16}$$

Since

$$\cos(\tfrac14 N\theta_m) = \begin{cases}\cos(\pi/8), & m \bmod 8 = 0, 7,\\ -\cos(\pi/8), & m \bmod 8 = 3, 4,\end{cases} \tag{17}$$

and

$$\cos(\tfrac14 N\theta_m) = \begin{cases}\cos(3\pi/8), & m \bmod 8 = 1, 6,\\ -\cos(3\pi/8), & m \bmod 8 = 2, 5,\end{cases} \tag{18}$$

we let

$$x_{2,0}(n) = x_{1,0}(n) + x_{1,0}(\tfrac12 N-1-n) + 2\cos(\pi/8)\,x_{1,0}(\tfrac14 N+n),$$
$$x_{2,1}(n) = x_{1,0}(n) + x_{1,0}(\tfrac12 N-1-n) - 2\cos(\pi/8)\,x_{1,0}(\tfrac14 N+n), \tag{19}$$

$$x_{2,2}(n) = x_{1,1}(n) + x_{1,1}(\tfrac12 N-1-n) + 2\cos(3\pi/8)\,x_{1,1}(\tfrac14 N+n),$$
$$x_{2,3}(n) = x_{1,1}(n) + x_{1,1}(\tfrac12 N-1-n) - 2\cos(3\pi/8)\,x_{1,1}(\tfrac14 N+n). \tag{20}$$

Then each group of $X(m)$ in (13) and (14) may be further partitioned into two parts. That is to say, $X(m)$ may be partitioned into four parts:

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,0}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 8 = 0, 7, \tag{21}$$

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,1}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 8 = 3, 4, \tag{22}$$

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,2}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 8 = 1, 6, \tag{23}$$

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,3}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 8 = 2, 5. \tag{24}$$

Thus the second stage of decomposition is completed.
In the above expressions, two subscripts are used for $x(n)$. The first subscript, denoted by $k$, represents the stage of the decomposition, while the second subscript, denoted by $i$ and ranging from 0 to $2^k - 1$, represents the group number. Each group $i$ in the previous stage generates the groups $2i$ and $2i+1$ in the next stage.
In general, the partition of m in the previous stage generates the partition of m in the next stage. Observation of the generation yields a structure as illustrated in Fig. 1.
Let the new order of $m$ after $k$ stages of decomposition be represented by $h_{2K}(m) \bmod 2K$, where $K = 2^k$; then $h_{2K}(m)$ is generated from $h_K(m)$ as follows:

$$h_{2K}(2m) = h_K(m), \qquad h_{2K}(2m+1) = 2K - 1 - h_K(m), \tag{25}$$

with the initial condition $h_2(0) = 0$ and $h_2(1) = 1$.

[Fig. 1. Partitioning m into groups. Tree diagram with columns: stage, Hadamard order, modulus.]

After $k$ stages of decomposition, the $X(m)$ are partitioned into $K$ groups. Each group contains $N/K$ DST-IV coefficients. Now we shall show that the $i$th group of the $k$th stage,

$$X(m) = \sum_{n=0}^{N/K-1} x_{k,i}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 2K = h_{2K}(2i),\ h_{2K}(2i+1), \tag{26}$$

which is arranged according to the order of $h_{2K}(i)$, will generate exactly two groups,

$$X(m) = \sum_{n=0}^{N/2K-1} x_{k+1,2i}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 4K = h_{4K}(4i),\ h_{4K}(4i+1), \tag{27}$$

$$X(m) = \sum_{n=0}^{N/2K-1} x_{k+1,2i+1}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 4K = h_{4K}(4i+2),\ h_{4K}(4i+3), \tag{28}$$

which are arranged according to the order of $h_{4K}(i)$.

Repeating the derivation from (3) to (9), the $i$th group of $X(m)$ of (26) may be represented as

$$X(m) = \sum_{n=0}^{N/2K-1}\bigl[x_{k,i}(n) + x_{k,i}(N/K-1-n) + 2\cos(N\theta_m/2K)\,x_{k,i}(N/2K+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \quad m \bmod 2K = h_{2K}(2i),\ h_{2K}(2i+1). \tag{29}$$
Since the period of

$$\cos(N\theta_m/2K) = \cos\bigl((m+\tfrac12)\pi/2K\bigr)$$

is $4K$, $h_{2K}(2i)+2jK$ and $h_{2K}(2i+1)+2jK$ take four different values in one period. According to (25), they are

$$m_0 = h_{2K}(2i) = h_K(i),$$
$$m_1 = h_{2K}(2i+1) = 2K - 1 - h_K(i),$$
$$m_2 = 2K + h_{2K}(2i) = 2K + h_K(i),$$

and

$$m_3 = 2K + h_{2K}(2i+1) = 4K - 1 - h_K(i).$$

On the other hand, we have

$$\cos\bigl((m_0+\tfrac12)\pi/2K\bigr) = \cos\bigl((m_3+\tfrac12)\pi/2K\bigr) = \cos\bigl((2h_K(i)+1)\pi/4K\bigr), \tag{30}$$

$$\cos\bigl((m_1+\tfrac12)\pi/2K\bigr) = \cos\bigl((m_2+\tfrac12)\pi/2K\bigr) = -\cos\bigl((2h_K(i)+1)\pi/4K\bigr). \tag{31}$$

Thus, the two newly generated groups are partitioned according to

$$m = m_0, m_3 \quad\text{or}\quad m = m_1, m_2.$$

Moreover,

$$m_0 = h_{2K}(2i) = h_{4K}(4i),$$
$$m_3 = 4K - 1 - h_K(i) = 4K - 1 - h_{2K}(2i) = h_{4K}(4i+1),$$
$$m_1 = h_{2K}(2i+1) = h_{4K}(4i+2),$$
$$m_2 = 2K + h_K(i) = 4K - 1 - [2K - 1 - h_K(i)] = 4K - 1 - h_{2K}(2i+1) = h_{4K}(2(2i+1)+1) = h_{4K}(4i+3).$$

This confirms our argument that the new groups are arranged according to the order of $h_{4K}(i)$.
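The recursion (25) is short enough to state in code. The following is a minimal sketch in plain Python (the function name `hadamard_order` is illustrative); as an assumption made only for convenience, it seeds the recursion with $h_1 = [0]$, which after one doubling reproduces the stated initial condition $h_2(0) = 0$, $h_2(1) = 1$.

```python
def hadamard_order(size):
    """Return [h_size(0), ..., h_size(size - 1)] built with recursion (25).

    `size` must be a power of 2.
    """
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of 2")
    h = [0]  # assumed seed h_1; one doubling gives h_2 = [0, 1]
    while len(h) < size:
        top = 2 * len(h) - 1          # this is 2K - 1 when len(h) == K
        # h_2K(2m) = h_K(m) and h_2K(2m + 1) = 2K - 1 - h_K(m)
        h = [v for u in h for v in (u, top - u)]
    return h


print(hadamard_order(4))   # [0, 3, 1, 2]
print(hadamard_order(8))   # [0, 7, 3, 4, 1, 6, 2, 5]
```

Reading the second list in pairs gives the residues of the four second-stage groups in (21) to (24): {0, 7}, {3, 4}, {1, 6} and {2, 5}.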
It is interesting to notice that the order of $h_K(i)$ is identical with the sequency (number of zero-crossings) of the $i$th row of a $K$ by $K$ Hadamard matrix [1]. For this reason, the order of $h_K(i)$ will be referred to as the Hadamard order. Equations (30) and (31) show that the cosine multipliers are arranged in a Hadamard order too.

After $M = \log_2 N$ stages of decomposition, $X(m)$ is partitioned into $N$ groups. Each group contains only one DST-IV coefficient, and there is only one term in the summation:

$$X(m) = x_{M,i}(0)\sin\bigl(\tfrac12\theta_m\bigr), \quad m = h_N(i). \tag{32}$$

The modulus is not needed in this expression. At this point the decomposition is completed. To show the above fast sine transform (FST) algorithm clearly, the signal-flow graph of an 8-point FST-IV is given in Fig. 2. Following the above lines, a signal-flow graph of the FST-III is shown in Fig. 3. Fig. 4 is the signal-flow graph of the FST-II.
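The complete decomposition of this section can be sketched as a short recursive routine. This is an illustrative Python sketch under stated assumptions, not the author's FORTRAN implementation: it calls `math.cos` and `math.sin` for the multipliers instead of generating them recursively, it allocates new lists instead of working in place, and the names `fst4`, `split` and `dst4_direct` are hypothetical.

```python
import math


def dst4_direct(x):
    """Unnormalised DST-IV by direct evaluation of equation (3)."""
    N = len(x)
    return [sum(x[n] * math.sin((m + 0.5) * (n + 0.5) * math.pi / N)
                for n in range(N)) for m in range(N)]


def fst4(x):
    """Unnormalised DST-IV of x (length a power of 2), in natural order."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    X = [0.0] * N

    def split(seq, K, h):
        # seq is the group sequence x_{k,i}, K = 2**k, and h = h_K(i)
        L = len(seq)
        if L == 1:
            # equation (32): a single term remains and m = h_N(i) = h
            X[h] = seq[0] * math.sin((h + 0.5) * math.pi / (2 * N))
            return
        half = L // 2
        d = math.cos((2 * h + 1) * math.pi / (4 * K))   # multiplier of (30)
        plus = [seq[n] + seq[L - 1 - n] + 2 * d * seq[half + n]
                for n in range(half)]
        minus = [seq[n] + seq[L - 1 - n] - 2 * d * seq[half + n]
                 for n in range(half)]
        split(plus, 2 * K, h)                # group 2i:   h_2K(2i)   = h_K(i)
        split(minus, 2 * K, 2 * K - 1 - h)   # group 2i+1: h_2K(2i+1) = 2K-1-h_K(i)

    split(list(x), 1, 0)
    return X
```

The routine writes each coefficient straight to position $m = h_N(i)$, so no separate reordering pass is needed here; an in-place implementation would instead leave the output in Hadamard order.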
4. The Matrix representation of the FST algorithms
The FST algorithms given in the last section can be represented compactly by matrix products, as will be shown in this section.
[Fig. 2. Signal-flow graph of the FST-IV for N = 8. ($d_1 = \sqrt{0.5}$; $d_{2i} = \sqrt{0.5(1+d_i)}$; $d_{2i+1} = \sqrt{0.5(1-d_i)}$.)]

[Fig. 3. Signal-flow graph of the FST-III for N = 8. ($d_1 = \sqrt{0.5}$; $d_{2i} = \sqrt{0.5(1+d_i)}$; $d_{2i+1} = \sqrt{0.5(1-d_i)}$.)]

[Fig. 4. Signal-flow graph of the FST-II for N = 8. ($d_1 = \sqrt{0.5}$; $d_{2i} = \sqrt{0.5(1+d_i)}$; $d_{2i+1} = \sqrt{0.5(1-d_i)}$.)]
Let $H_N$ be a permutation matrix of order $N$ which permutes a sequency ordered matrix into a Hadamard ordered one. Then

$$S^{(0)}_N = H_N S^{IV}_N = \bigl[\sin[(h_N(m)+\tfrac12)(n+\tfrac12)\pi/N]\bigr], \quad m, n = 0, 1, \ldots, N-1, \tag{33}$$

is the DST-IV matrix in a Hadamard order, where the $h_N(m)$ are generated by (25).

Let $K = 2^k \le N$. We partition the first $N/K$ columns of $S^{(0)}_N$ into $K$ square submatrices; the $i$th submatrix is given by

$$S^{(i)}_{N/K} = \bigl[\sin[(2h_N(iN/K+j)+1)(n+\tfrac12)\pi/2N]\bigr], \quad j, n = 0, 1, \ldots, N/K-1. \tag{34}$$

Then $S^{(0)}_N$ may be factorized into

$$S^{(0)}_N = \begin{bmatrix} S^{(0)}_{N/2} & 0 \\ 0 & S^{(1)}_{N/2} \end{bmatrix} B_N R^{(0)}_N, \tag{35}$$

where

$$B_N = \begin{bmatrix} I_{N/2} & I_{N/2} \\ I_{N/2} & -I_{N/2} \end{bmatrix}, \tag{36}$$

$$R^{(0)}_N = \begin{bmatrix} I_{N/2} & \bar I_{N/2} \\ 0 & 2d(1) I_{N/2} \end{bmatrix}, \quad d(1) = \cos(\pi/4). \tag{37}$$

Each submatrix $S^{(i)}_{N/K}$ may be further factorized in the same manner:

$$S^{(i)}_{N/K} = \begin{bmatrix} S^{(2i)}_{N/2K} & 0 \\ 0 & S^{(2i+1)}_{N/2K} \end{bmatrix} B_{N/K} R^{(i)}_{N/K}, \tag{38}$$

where $B_{N/K}$ is given by (36), while

$$R^{(i)}_{N/K} = \begin{bmatrix} I_{N/2K} & \bar I_{N/2K} \\ 0 & 2d(K+i) I_{N/2K} \end{bmatrix}, \tag{39}$$

$$d(K+i) = \cos[(2h_K(i)+1)\pi/4K], \quad i = 0, 1, \ldots, K-1. \tag{40}$$

Therefore, $S^{(0)}_N$ may be decomposed recursively until the smallest submatrices, which contain only one element on the first column of $S^{(0)}_N$, are reached.
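The first factorization step can be checked numerically. The sketch below is plain Python and rests on assumptions that should be stated plainly: it takes $B_N$ to be the butterfly $[I, I; I, -I]$ and $R^{(0)}_N$ to be $[I, \bar I; 0, 2d(1)I]$, which is the reading of (35) to (37) that agrees with equations (11) to (14), and it takes the submatrix entries to be $\sin[(2h+1)(n+\tfrac12)\pi/2N]$. All function names are illustrative.

```python
import math


def hadamard_order(size):
    """Hadamard order from recursion (25), seeded with the assumed h_1 = [0]."""
    h = [0]
    while len(h) < size:
        top = 2 * len(h) - 1
        h = [v for u in h for v in (u, top - u)]
    return h


def s_sub(N, K, i):
    """i-th square submatrix of order N/K of the Hadamard-ordered DST-IV."""
    h = hadamard_order(N)
    L = N // K
    return [[math.sin((2 * h[i * L + j] + 1) * (n + 0.5) * math.pi / (2 * N))
             for n in range(L)] for j in range(L)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def first_stage_factors(N):
    """Return (D, B, R): block-diagonal, butterfly and pre-addition factors."""
    half = N // 2
    d1 = math.cos(math.pi / 4)
    D = [[0.0] * N for _ in range(N)]
    B = [[0.0] * N for _ in range(N)]
    R = [[0.0] * N for _ in range(N)]
    top, bottom = s_sub(N, 2, 0), s_sub(N, 2, 1)
    for r in range(half):
        R[r][r] = 1.0
        R[r][N - 1 - r] = 1.0            # antidiagonal identity block
        R[half + r][half + r] = 2 * d1   # scaling of the second half
        B[r][r] = B[r][half + r] = B[half + r][r] = 1.0
        B[half + r][half + r] = -1.0
        for c in range(half):
            D[r][c] = top[r][c]
            D[half + r][half + c] = bottom[r][c]
    return D, B, R
```

With `D, B, R = first_stage_factors(8)`, the product `matmul(D, matmul(B, R))` should agree with `s_sub(8, 1, 0)` to rounding error.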
5. Generation of multipliers
The multipliers encountered in the proposed algorithm are all cosines. They may be generated by calling the internal cosine function and then be arranged properly. However, as will be shown in the following, they can be generated recursively, without referring to any trigonometric functions.
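A minimal sketch of this recursive generation, in plain Python: starting from $d(1) = \sqrt{0.5}$, each $d(i)$ yields $d(2i) = \sqrt{0.5(1+d(i))}$ and $d(2i+1) = \sqrt{0.5(1-d(i))}$, and the result can be compared with the closed form $d(K+i) = \cos[(2h_K(i)+1)\pi/4K]$ of (40). The function names are illustrative, and the seed $h_1 = [0]$ for the Hadamard order is an assumption that reproduces $h_2 = [0, 1]$.

```python
import math


def multipliers_recursive(count):
    """d[1..count-1] from half-angle square roots only; d[0] is unused."""
    d = [0.0] * count
    if count > 1:
        d[1] = math.sqrt(0.5)                   # d(1) = cos(pi/4)
    for i in range(1, count // 2):
        d[2 * i] = math.sqrt(0.5 * (1.0 + d[i]))
        if 2 * i + 1 < count:
            d[2 * i + 1] = math.sqrt(0.5 * (1.0 - d[i]))
    return d


def hadamard_order(size):
    """Hadamard order from recursion (25), seeded with the assumed h_1 = [0]."""
    h = [0]
    while len(h) < size:
        top = 2 * len(h) - 1
        h = [v for u in h for v in (u, top - u)]
    return h


def multipliers_closed_form(count):
    """d(K + i) = cos((2 h_K(i) + 1) pi / 4K), evaluated with math.cos."""
    d = [0.0] * count
    K = 1
    while K < count:
        h = hadamard_order(K)
        for i in range(K):
            if K + i < count:
                d[K + i] = math.cos((2 * h[i] + 1) * math.pi / (4 * K))
        K *= 2
    return d
```

Because every value is computed from an earlier one, rounding errors propagate from stage to stage, which is why the first few stages are worth computing in higher precision.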
According to (25),

$$2h_K(2i+1) + 1 = 2K - 1 - 2h_{K/2}(i) = 2K - [2h_K(2i) + 1]. \tag{41}$$

Substituting (41) into (40) yields

$$d(K+2i+1) = \cos[(2h_K(2i+1)+1)\pi/4K] = \sin[(2h_K(2i)+1)\pi/4K].$$

Hence

$$d(K+2i)^2 + d(K+2i+1)^2 = 1,$$

$$d(K+2i)^2 - d(K+2i+1)^2 = \cos\bigl((h_K(2i)+\tfrac12)\pi/K\bigr) = d(K/2+i).$$

Then, the following formulas are obtained:

$$d(K+2i) = [0.5(1 + d(K/2+i))]^{1/2}, \tag{42a}$$

$$d(K+2i+1) = [0.5(1 - d(K/2+i))]^{1/2}. \tag{42b}$$

Replacing $K/2+i$ by $i$, (42) can be rewritten more compactly as

$$d(2i) = [0.5(1 + d(i))]^{1/2}, \tag{43a}$$

$$d(2i+1) = [0.5(1 - d(i))]^{1/2}. \tag{43b}$$

Starting from

$$d(1) = \cos(\pi/4) = \sqrt{0.5}, \tag{44}$$

all multipliers required by the DST-IV of order $N$, including the sines of the last step, can be generated recursively. In general, computing a square root is faster than computing a cosine. In addition, reordering the cosines needs some more time. Therefore, the recursive generation of the multipliers is more efficient. However, since the round-off error accumulates during the recursive procedure, the round-off errors of the multipliers in later stages are generally larger than those in earlier stages. In order to reduce the error accumulation, one may use double precision for the multipliers in the first few stages. The accuracy is improved at the cost of a small increase in computation and memory requirements.

6. Operation counts

For each stage of decomposition, $N/2$ multiplications and $3N/2$ additions are required. After $M$ stages of decomposition, $N$ more multiplications are still needed. A total of $N(M+2)/2$ multiplications together with $3MN/2$ additions are therefore required for the computation of an $N$-point DST-IV. With the relations given in [6] (eqns. (75)-(90) in [6]), and adopting its symbolic notation, i.e., $\mu$ and $\alpha$ represent the required numbers of multiplications and additions respectively, the arithmetic operations for the other types of the DST are given by

$$\mu(S^{II}_N) = \mu(S^{III}_N) = MN/2 + 1,$$

$$\alpha(S^{II}_N) = \alpha(S^{III}_N) = 3MN/2 - N + 1,$$

$$\mu(S^{I}_{N-1}) = MN/2 - N + 1,$$

$$\alpha(S^{I}_{N-1}) = 3MN/2 - 2N - M + 2.$$
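The DST-IV totals can be reproduced by walking through the stages and tallying operations. The sketch below is plain Python with an illustrative function name; it assumes that the doubled multipliers $2d$ are precomputed, so that each stage costs one multiplication and three additions per output pair, as stated in the text.

```python
def fst4_operation_counts(N):
    """Return (multiplications, additions) of the M-stage FST-IV, N = 2**M."""
    mults = adds = 0
    length, groups = N, 1
    while length > 1:
        half = length // 2
        mults += groups * half        # 2d * x(L/2 + n), one per n
        adds += groups * 3 * half     # x(n) + x(L-1-n), then one + and one -
        length, groups = half, 2 * groups
    mults += N                        # final sine multiplications of (32)
    return mults, adds


print(fst4_operation_counts(8))   # (20, 36)
```

For $N = 8$ ($M = 3$) this matches $N(M+2)/2 = 20$ multiplications and $3MN/2 = 36$ additions.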
Since the DST-II is equivalent to the DCT-II in the sense of computational complexity [5], these figures show that, in terms of arithmetic operations, the proposed FST algorithm is as efficient as some of the most efficient algorithms for the discrete cosine transform, such as Lee's FCT [3], Vetterli and Nussbaumer's FFCT [4] and Wang's rotation factor algorithm [7], and is a little more efficient than the FST algorithms proposed by Yip and Rao [14, 15]. In addition, the present algorithm possesses two advantages:

(1) Instead of the secant multipliers used in [1, 14, 15], the present algorithm uses cosine multipliers. Therefore, it causes less computational error than the algorithms given in [1, 14, 15].

(2) The structure and the indexing of the proposed algorithm are simpler than those of the algorithms given in [4] and [7]. The algorithm is very easy to implement in either software or hardware.

FORTRAN subroutines to compute all four DSTs via the proposed algorithm are provided in the appendix, where care has been taken to reduce the round-off errors of the multipliers.
Acknowledgment
The author wishes to thank one of the reviewers for pointing out the error accumulation of the recursive generation of the multipliers. The author's consideration of reducing the error accumulation is owed to that reviewer.
Appendix
Eight subroutines (author: Zhongde Wang) are included in this appendix. They are: FST1, FST2, FST3, FST4, COEFS, REORD, HDMOD and IVHDM. The first four are fast DST subroutines. COEFS is a subroutine to initialise the multiplication coefficients required by the FSTs. The last three subroutines are for reordering a sequence into an appropriate order.
The fast DST-I subroutine

Parameters:
F: Array of input and output data.
G: Working array.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to N/2.
L: An integer array for reordering the transformed data into a sequence order. L is obtained by calling REORD.
N = 2**M: The dimensions of F, G and L are all N - 1.

      SUBROUTINE FST1 (F, G, D, L, M, N)
      DIMENSION F(1), G(1), D(1), L(1)
      M1 = M - 1
      N1 = N/2
      N2 = N1 - 1
      DO 10 I = 1, N2
      G(I) = F(I) + F(N - I)
   10 F(N - I) = F(I) - F(N - I)
      G(N1) = F(N1)
      CALL FST3 (G, D, M1, N1)
      N0 = 0
   20 N0 = N1 + N0
      N1 = N1/2
      M1 = M1 - 1
      N2 = N1 - 1
      N3 = N0 + N1
      N4 = N3 + N3
      DO 30 I = N0 + 1, N3 - 1
      G(I) = F(I) + F(N4 - I)
   30 F(N4 - I) = F(N4 - I) - F(I)
      G(N3) = F(N3)
      CALL FST3 (G(N0 + 1), D, M1, N1)
      IF (N1.GE.2) GOTO 20
      G(N - 1) = F(N - 1)
      DO 40 I = 1, N - 1
   40 F(L(I)) = G(I)
      RETURN
      END
The fast DST-II subroutine

Parameters:
F: Array of input and output data. The input array is in a Hadamard order. The output array is in a natural order.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to N.
N = 2**M: The dimension of F.

      SUBROUTINE FST2 (F, D, M, N)
      DIMENSION F(N), D(1)
      N1 = N/2
      N2 = 2
      J3 = 1
      DO 10 I = 1, N1
      J3 = -J3
      I2 = I + I
      I1 = I2 - 1
      T = F(I1)
      F(I1) = (T + F(I2))*D(N1 + I - J3)
   10 F(I2) = T - F(I2)
      DO 50 I = 1, M - 1
      N1 = N1/2
      N7 = N2
      N2 = N2 + N2
      N6 = 0
      IF (N1.EQ.1) J3 = 0
      DO 40 J = 1, N1
      J3 = -J3
      J2 = N1 + J
      B = D(J2) + D(J2)
      N3 = N6 + 1
      N6 = N6 + N2
      N8 = N6 - N7
      N5 = N8 - 1
      N4 = N8 + N8
      T = F(N8)
      F(N8) = (T + F(N6))*D(J2 - J3)
      F(N6) = T - F(N6)
      DO 20 K = N3, N5
      K1 = N7 + K
      T = F(K)
      F(K) = T + F(K1)
   20 F(K1) = B*(T - F(K1))
      DO 30 K = N3, N5
   30 F(N4 - K) = F(N4 - K) + F(K)
   40 CONTINUE
   50 CONTINUE
      F(N) = D(1)*F(N)
      RETURN
      END
The fast DST-III subroutine

It is also used in FST1. For the DST-I, the 4th statement should not be executed.

Parameters:
F: Array of input and output data. The input array is in a natural order. The output array is in a Hadamard order.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to N.
N = 2**M: The dimension of F.

      SUBROUTINE FST3 (F, D, M, N)
      DIMENSION F(N), D(1)
      J1 = 1
      N1 = N
      J3 = -1
      F(N) = D(1)*F(N)
      DO 40 I = 1, M - 1
      N2 = N1
      N1 = N1/2
      N6 = 0
      DO 30 J = 1, J1
      J2 = J1 + J
      J3 = -J3
      A = D(J2 - J3)
      B = D(J2) + D(J2)
      N3 = N6 + 1
      N6 = N2 + N6
      N5 = N6 - N1
      N4 = N5 + N5
      DO 10 K = N3, N5 - 1
      K1 = N4 - K
      F(K) = F(K) + F(K1)
   10 F(K1) = F(K1)*B
      F(N5) = F(N5)*A
      DO 20 K = N3, N5
      T = F(K + N1)
      F(K + N1) = F(K) - T
   20 F(K) = F(K) + T
   30 CONTINUE
   40 J1 = J1 + J1
   50 DO 60 I = 1, J1
      I2 = I + I
      I1 = I2 - 1
      J3 = -J3
      T = D(J1 + I - J3)*F(I2)
      F(I2) = T + F(I1)
   60 F(I1) = T - F(I1)
      RETURN
      END

The fast DST-IV subroutine

Parameters:
F: Array of input and output data. The input array is in a natural order. The output array is in a Hadamard order.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to 2*N.
N = 2**M: The dimension of F.

      SUBROUTINE FST4 (F, D, M, N)
      DIMENSION F(N), D(1)
      J1 = 1
      N1 = N
C     M STAGES OF DECOMPOSITION, J1 GROUPS PER STAGE
      DO 40 I = 1, M
      J2 = J1
      N2 = N1
      N1 = N1/2
      N7 = 0
      DO 30 J = 1, J1
      A = D(J2 + J)
      A = A + A
      N6 = N7
      N7 = N7 + N2
      N3 = N6 + 1
      N4 = N3 + N7
      N5 = N6 + N1
      DO 10 K = N3, N5
      K1 = N4 - K
      F(K) = F(K) + F(K1)
   10 F(K1) = A*F(K1)
      DO 20 K = N3, N5
      T = F(K + N1)
      F(K + N1) = F(K) - T
   20 F(K) = F(K) + T
   30 CONTINUE
   40 J1 = J1 + J1
C     FINAL SINE MULTIPLICATIONS, ONE PAIR OF OUTPUTS PER PASS.
C     J2 = N/2 HERE, SO THAT I1 NEVER EXCEEDS N.
      DO 50 I = 1, J2
      I1 = I + I
      I2 = I1 - 1
      F(I1) = D(J1 + I2)*F(I1)
   50 F(I2) = D(J1 + I1)*F(I2)
      RETURN
      END
Subroutine to obtain the multiplication coefficient array D to be used by all FST subroutines

For FST-IV, N1 >= 2*N; for FST-III and FST-II, N1 >= N; for FST-I, N1 >= N/2. N is the length of the sequence to be transformed.

      SUBROUTINE COEFS (D, N1)
      DIMENSION D(N1), C(15)
      DOUBLE PRECISION C, T
      C(1) = DSQRT (5D-1)
      D(1) = C(1)
      M = 2
      M0 = 1
   10 DO 20 I = M0, M - 1
      I1 = I + I
      I2 = I1 + 1
      C(I1) = DSQRT (5D-1*(1D0 + C(I)))
      C(I2) = DSQRT (5D-1*(1D0 - C(I)))
      D(I1) = C(I1)
   20 D(I2) = C(I2)
      M0 = M0 + M0
      M = M0 + M0
      IF (M.GE.N1) GOTO 70
      IF (M.GE.16) GOTO 30
      GOTO 10
   30 DO 40 I = M0, M - 1
      I1 = I + I
      I2 = I1 + 1
      T = DSQRT (5D-1*(1D0 + C(I)))
      D(I + I) = T
      T = DSQRT (5D-1*(1D0 - C(I)))
   40 D(I + I + 1) = T
      M0 = M0 + M0
      M = M0 + M0
      IF (M.GE.N1) GOTO 70
   50 DO 60 I = M0, M - 1
      D(I + I) = SQRT (.5*(1. + D(I)))
   60 D(I + I + 1) = SQRT (.5*(1. - D(I)))
      M0 = M0 + M0
      M = M0 + M0
      IF (M.GE.N1) GOTO 70
      GOTO 50
   70 DO 80 I = N1, 2, -1
   80 D(I) = D(I - 1)
      RETURN
      END
Subroutine for permuting a natural ordered array into a Hadamard ordered one

Parameters:
F: Array to be permuted.
G: Working array, the dimension of which has to be greater than or equal to N/2.
N: The dimension of F.

      SUBROUTINE HDMOD (F, G, N)
      DIMENSION F(N), G(1)
      IF (N.LE.2) RETURN
      N1 = N/2
      N0 = N1
      N2 = 1
   10 I1 = N
      I2 = -N2
      DO 20 I = 1, N1
      I1 = I1 - N2
      I2 = I2 + N2
      DO 20 J = 1, N2
   20 G(I2 + J) = F(I1 + J)
      N3 = 0
      N4 = N2 + N2
      I1 = N
      DO 30 I = 1, N1
      N3 = N3 + N2
      I1 = I1 - N4
      I2 = N0 - N3
      DO 30 J = 1, N2
   30 F(I1 + J) = F(I2 + J)
      I1 = -N2
      I2 = -N2
      DO 40 I = 1, N1
      I1 = I1 + N4
      I2 = I2 + N2
      DO 40 J = 1, N2
   40 F(I1 + J) = G(I2 + J)
      N2 = N2 + N2
      N1 = N1/2
      IF (N1.GE.2) GOTO 10
      RETURN
      END
Subroutine to permute a Hadamard ordered array into a natural ordered one

The parameters are the same as those of HDMOD.

      SUBROUTINE IVHDM (F, G, N)
      DIMENSION F(N), G(1)
      IF (N.LE.2) RETURN
      N1 = N/4
      I1 = 1
   10 N2 = N1 + N1
      N3 = N2 + N2
      I2 = 0
      DO 40 I = 1, I1
      I3 = I2
      I2 = I2 + N3
      I4 = I3 + N2
      DO 20 J = 1, N2
   20 G(J) = F(I3 + J + J)
      DO 30 J = 2, N2
   30 F(I3 + J) = F(I3 + J + J - 1)
      DO 40 J = 1, N1
      J1 = J + J
      J2 = J1 - 1
      F(I4 + J1) = G(J2)
   40 F(I4 + J2) = G(J1)
      N1 = N1/2
      I1 = I1 + I1
      IF (N1.GE.1) GOTO 10
      RETURN
      END
The REORD subroutine

This subroutine makes an index array L of length N to be used in FST1 for reordering the transformed data into a natural order.

      SUBROUTINE REORD (L, N)
      DIMENSION L(N)
      L(1) = 1
      L(2) = 3
      N2 = 2
   10 N1 = N2
      N2 = N1 + N1
      DO 20 I = N1, 1, -1
      I1 = I + I
      L(I1) = N2 - L(I)
   20 L(I1 - 1) = L(I)
      IF (N1.LT.N/2) GOTO 10
      N0 = 0
      N2 = N/2
      N3 = N2
   30 N2 = N2/2
      DO 40 I = 1, N2
      I1 = N0 + I + I - 1
   40 L(N3 + I) = L(I1) + L(I1)
      IF (N2.EQ.1) RETURN
      N0 = N3
      N3 = N3 + N2
      GOTO 30
      RETURN
      END
References
[1] N. Ahmed and K.R. Rao, Orthogonal Transforms for Digital Signal Processing, Springer, New York, 1975.
[2] A.K. Jain, "A fast Karhunen-Loeve transform for a class of stochastic processes", IEEE Trans. Commun., Vol. COM-24, 1976, pp. 1023-1029.
[3] B.G. Lee, "A new algorithm for the discrete cosine transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-32, 1984, pp. 1243-1245.
[4] M. Vetterli and H. Nussbaumer, "Simple FFT and DCT algorithms with reduced number of operations", Signal Process., Vol. 6, 1984, pp. 267-278.
[5] Zhongde Wang, "A fast algorithm for the discrete sine transform implemented by the fast cosine transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-30, 1982, pp. 814-815.
[6] Zhongde Wang, "Fast algorithms for the discrete W transform and for the discrete Fourier transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-32, 1984, pp. 803-816.
[7] Zhongde Wang, "On computing the discrete Fourier and cosine transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-33, 1985, pp. 1341-1344.
[8] Zhongde Wang, "Comments on 'A fast algorithm for the discrete sine transform'", IEEE Trans. Commun., Vol. COM-34, 1986, pp. 204-205.
[9] Zhongde Wang, "Some special cases of the Karhunen-Loeve transform", Acta Electronica Sinica, Vol. 13, No. 3, 1985, pp. 132-133 (In Chinese).
[10] Zhongde Wang and B.R. Hunt, "The discrete W transform", Appl. Math. Comput., Vol. 16, 1985, pp. 19-48.
[11] Zhongde Wang and Shenglong Xu, "Comments on 'On the computation and the effectiveness of discrete sine transform'", Comput. Elec. Eng., Vol. 12, No. 1/2, 1986, pp. 23-27.
[12] J.L. Wang and Z.Q. Ding, "Discrete sine transform domain LMS adaptive filtering", Int. Conf. Acoust. Speech Signal Process. (ICASSP-85), Beijing, China, 1985, pp. 260-263.
[13] P. Yip and K.R. Rao, "A fast computational algorithm for the discrete sine transform", IEEE Trans. Commun., Vol. COM-28, 1980, pp. 304-307.
[14] P. Yip and K.R. Rao, "Fast decimation-in-time algorithm for a family of discrete sine and cosine transforms", Circuits Syst. Signal Process., Vol. 3, 1984, pp. 387-408.
[15] P. Yip and K.R. Rao, "DIF algorithms for a family of discrete sine and cosine transforms", Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP-85), 1985, pp. 776-779.
[16] Zhao Zhen-Gang, "A new scheme of polyphase network TDM-FDM transmultiplexer using all-pass filters", Signal Process. (China), Vol. 1, No. 1, 1985, pp. 25-31 (In Chinese).