Signal Processing 19 (1990) 91-102 91 Elsevier

FAST DISCRETE SINE TRANSFORM ALGORITHMS*

Zhongde WANG

Beijing University of Posts and Telecommunications, Beijing, People's Rep. China

Received 10 February 1988 Revised 2 August 1988, 16 January 1989 and 11 September 1989

Abstract. A novel class of algorithms for the discrete sine transform (DST) is introduced in this paper. By using a basic trigonometric identity, these algorithms successively reduce the size of the summation in a simple manner, which results in a very simple structure. The indexing of the algorithms involves the Hadamard order, whose generation is given in this paper. The algorithms use cosines and sines as multipliers and therefore cause less computational error than algorithms with secant multipliers. The multipliers can be generated recursively in a simple way, without calling any trigonometric functions. FORTRAN subroutines to compute various types of the DST are provided.

Zusammenfassung. Neuartige Algorithmen für die diskrete Sinustransformation (DST) werden eingeführt. Durch die Anwendung einer grundlegenden trigonometrischen Identität realisieren diese Algorithmen auf einfache Weise eine sukzessive Verringerung des Summationsumfangs; so entsteht eine sehr einfache Struktur. Die Indizierung dieser Algorithmen folgt der Hadamard-Ordnung, deren Erzeugung in diesem Aufsatz gezeigt wird. Diese Algorithmen verwenden Sinus- und Kosinusterme als Multiplikatoren und verursachen so geringere Rechenfehler als die Verfahren mit Sekans-Multiplikatoren. Die Multiplikatoren können in einfacher Weise rekursiv erzeugt werden, ohne daß man auf irgendwelche trigonometrische Funktionen zurückgreifen muß. FORTRAN-Unterprogramme zur Berechnung verschiedener DST-Arten werden angegeben.

Résumé. Cet article introduit un nouveau type d'algorithmes pour la transformée discrète en sinus (en anglais: discrete sine transform, DST). Utilisant une identité trigonométrique de base, ces algorithmes réalisent une réduction successive de la taille de la somme à effectuer de manière simple, et induisent de ce fait une structure très simple. L'indexation dans cet algorithme implique l'ordre d'Hadamard, dont la génération est donnée dans cet article. Ces algorithmes utilisent des sinus et cosinus comme multiplicateurs. Ceci cause moins d'erreurs que dans les algorithmes employant des sécantes comme multiplicateurs. Ces multiplicateurs peuvent être générés récursivement de manière simple, sans aucun appel à des fonctions trigonométriques. Des routines FORTRAN de calcul de différents types de DST sont fournies.

Keywords. Discrete sine transform, fast algorithms.

1. Introduction

The discrete sine transform (DST) was first introduced into digital image processing by Jain [2] and was later classified by Wang and Hunt [10]. Besides its application in image processing [2], the DST is applicable to adaptive filtering [12] and transmultiplexing [16]. Wang has shown that, under certain conditions, the Karhunen-Loeve transform of a Markov-I signal reduces to the DST, more precisely, to the DST-I (type I of the DST) and the DST-II [9, 11]. In addition, the different types of the DST relate closely to the different types of the discrete W transform (DWT) [10], which are real approaches to spectral analysis, to the different types of the discrete cosine transform (DCT), and to the discrete Fourier transform (DFT) [6]. New algorithms for the DST may therefore serve as a basic scheme for computing the DCT, the DWT and the DFT.

In the last few years, a number of algorithms have been developed for the DST [5, 8, 13-15]. Among them, [5] establishes the relationship between the DCT and the DST, which enables one to compute the DST via algorithms for the DCT or, conversely, the DCT via algorithms for the DST. The algorithms in [14, 15] possess a simple structure, but their use of secant multipliers is a disadvantage. Although the algorithm in [13] uses cosines and sines as multipliers, it contains some errors [8] and is hard to implement.

* Supported by the National Natural Science Foundation of China.

0165-1684/90/$3.50 © 1990, Elsevier Science Publishers B.V.

In this paper, a novel class of algorithms for the DST is introduced. These algorithms use cosine and sine multipliers and require the minimum number of arithmetic operations. In addition, their very simple structure makes them easy to implement in either hardware or software.

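2. Four even types of the DST

There are two categories of DSTs [10]: the even types and the odd types. Since all present applications of the DST involve only the even types, we concentrate on the even types in this paper. The four even-type DST matrices are defined as [10]

$$S^{I}_{N-1} = \sqrt{2/N}\,\bigl[\sin(mn\pi/N)\bigr], \qquad m, n = 1, 2, \ldots, N-1,$$

$$S^{II}_{N} = \sqrt{2/N}\,\bigl[k_m \sin\bigl(m(n-\tfrac12)\pi/N\bigr)\bigr], \qquad m, n = 1, 2, \ldots, N,$$

$$S^{III}_{N} = \sqrt{2/N}\,\bigl[k_n \sin\bigl((m-\tfrac12)n\pi/N\bigr)\bigr], \qquad m, n = 1, 2, \ldots, N,$$

$$S^{IV}_{N} = \sqrt{2/N}\,\bigl[\sin\bigl((m+\tfrac12)(n+\tfrac12)\pi/N\bigr)\bigr], \qquad m, n = 0, 1, \ldots, N-1,$$

where the subscript represents the order of the matrix, and the Roman superscript represents the type of the DST. $k_m = 1/\sqrt{2}$ for $m = N$, and $k_m = 1$ for other cases.

The DSTs relate to each other by [6]

$$S^{I}_{N-1} = P_{N-1}\begin{bmatrix} S^{III}_{N/2} & 0 \\ 0 & \bar I_{N/2-1}\, S^{I}_{N/2-1}\, \bar I_{N/2-1} \end{bmatrix} A_{N-1}, \tag{1}$$

$$S^{II}_{N} = P_{N}\begin{bmatrix} S^{IV}_{N/2} & 0 \\ 0 & \bar I_{N/2}\, S^{II}_{N/2}\, \bar I_{N/2} \end{bmatrix} A_{N}, \tag{2}$$

and

$$S^{III}_{N} = [S^{II}_{N}]^{-1} = [S^{II}_{N}]^{T},$$

where $T$ represents the matrix transposition, and $\bar I$ is the antidiagonal identity matrix, the elements of which are all zeros except those along the NE-SW diagonal. The matrices $A$ combine the two halves of the input by additions and subtractions:

$$A_{N-1} = \frac{1}{\sqrt{2}}\begin{bmatrix} I_{N/2-1} & 0 & \bar I_{N/2-1} \\ 0 & \sqrt{2} & 0 \\ \bar I_{N/2-1} & 0 & -I_{N/2-1} \end{bmatrix}, \qquad A_{N} = \frac{1}{\sqrt{2}}\begin{bmatrix} I_{N/2} & \bar I_{N/2} \\ \bar I_{N/2} & -I_{N/2} \end{bmatrix}.$$

$P_{N-1}$ and $P_{N}$ are permutation matrices that restore the natural order of the coefficients. Row $2j-1$ has its single 1 in column $j$, so the odd-indexed coefficients are taken in order from the first block. Row $2j$ has its single 1 in column $N-j$ for $P_{N-1}$ and in column $N+1-j$ for $P_{N}$, so the even-indexed coefficients are taken in reverse order from the second block.

With the relations given above, one may decompose $S^{I}$, $S^{II}$ and $S^{III}$ into $S^{IV}$ without any multiplications.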
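As a quick numerical sanity check of the definitions in this section, the following Python sketch builds the four even-type DST matrices and verifies that the DST-III matrix is both the transpose and the inverse of the DST-II matrix. It is an illustration only and not part of the original paper; the function names (`dst_matrix`, `matmul`, and so on) are hypothetical helpers, and the matrices follow the standard normalised definitions of [10] with the scale factor $k = 1/\sqrt{2}$ applied at index $N$.

```python
import math

def dst_matrix(kind, N):
    """Even-type DST matrix of the given kind ("I".."IV"), as a list of rows."""
    c = math.sqrt(2.0 / N)
    k = lambda idx: math.sqrt(0.5) if idx == N else 1.0  # k_m (or k_n)
    if kind == "I":    # order N-1, m, n = 1..N-1
        return [[c * math.sin(m * n * math.pi / N) for n in range(1, N)]
                for m in range(1, N)]
    if kind == "II":   # m, n = 1..N, last row scaled by k_m
        return [[c * k(m) * math.sin(m * (n - 0.5) * math.pi / N)
                 for n in range(1, N + 1)] for m in range(1, N + 1)]
    if kind == "III":  # m, n = 1..N, last column scaled by k_n
        return [[c * k(n) * math.sin((m - 0.5) * n * math.pi / N)
                 for n in range(1, N + 1)] for m in range(1, N + 1)]
    if kind == "IV":   # m, n = 0..N-1
        return [[c * math.sin((m + 0.5) * (n + 0.5) * math.pi / N)
                 for n in range(N)] for m in range(N)]
    raise ValueError("kind must be 'I', 'II', 'III' or 'IV'")

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def max_abs_diff(A, B):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
```

For example, `max_abs_diff(dst_matrix("III", 8), transpose(dst_matrix("II", 8)))` should be zero up to rounding, and the product of the two matrices should be the identity.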


3. Algorithm for the DST-IV


Since the constant $\sqrt{2/N}$ in the definitions of the DSTs does not affect the analysis of the algorithm, it is omitted in the following discussion.

Let $N = 2^M$ be a power of 2, and let $x(n)$, $n = 0, 1, \ldots, N-1$, be the input sequence. Its DST-IV coefficients are given by

$$X(m) = \sum_{n=0}^{N-1} x(n)\sin\bigl((m+\tfrac12)(n+\tfrac12)\pi/N\bigr) = \sum_{n=0}^{N-1} x(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m = 0, 1, \ldots, N-1, \tag{3}$$

where

$$\theta_m = (m+\tfrac12)\pi/N. \tag{4}$$

We partition the summation into two parts, each containing $N/2$ terms. The first part is given by

$$\Sigma_1 = \sum_{n=0}^{N/2-1} x(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \tag{5}$$

and the second part is given by

$$\Sigma_2 = \sum_{n=N/2}^{N-1} x(n)\sin\bigl((n+\tfrac12)\theta_m\bigr) = \sum_{n=0}^{N/2-1} x(\tfrac12 N+n)\sin\bigl((\tfrac12 N+n+\tfrac12)\theta_m\bigr). \tag{6}$$

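By using the basic trigonometric identity

$$\sin(\alpha+\beta) = 2\cos\alpha\,\sin\beta + \sin(\alpha-\beta), \tag{7}$$

one has

$$\sin\bigl((\tfrac12 N+n+\tfrac12)\theta_m\bigr) = 2\cos(\tfrac12 N\theta_m)\sin\bigl((n+\tfrac12)\theta_m\bigr) + \sin\bigl((\tfrac12 N-(n+\tfrac12))\theta_m\bigr).$$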

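$\Sigma_2$ may then be written as

$$\Sigma_2 = 2\cos(\tfrac12 N\theta_m)\sum_{n=0}^{N/2-1} x(\tfrac12 N+n)\sin\bigl((n+\tfrac12)\theta_m\bigr) + \sum_{n=0}^{N/2-1} x(\tfrac12 N+n)\sin\bigl((\tfrac12 N-(n+\tfrac12))\theta_m\bigr).$$

Replacing $n$ by $\tfrac12 N-1-n$ in the last summation, one gets

$$\Sigma_2 = \sum_{n=0}^{N/2-1}\bigl[2\cos(\tfrac12 N\theta_m)\,x(\tfrac12 N+n) + x(N-1-n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr). \tag{8}$$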

Substituting (5) and (8) into (3), one obtains

$$X(m) = \sum_{n=0}^{N/2-1}\bigl[x(n) + x(N-1-n) + 2\cos(\tfrac12 N\theta_m)\,x(\tfrac12 N+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m = 0, 1, \ldots, N-1. \tag{9}$$

Since

$$\cos(\tfrac12 N\theta_m) = \cos\bigl((m+\tfrac12)\pi/2\bigr) = \begin{cases} \cos(\pi/4), & m \bmod 4 = 0, 3, \\ -\cos(\pi/4), & m \bmod 4 = 1, 2, \end{cases} \tag{10}$$

let

$$x_{1,0}(n) = x(n) + x(N-1-n) + 2\cos(\pi/4)\,x(\tfrac12 N+n), \tag{11}$$

$$x_{1,1}(n) = x(n) + x(N-1-n) - 2\cos(\pi/4)\,x(\tfrac12 N+n). \tag{12}$$

$X(m)$ may then be partitioned into two groups, each containing $N/2$ coefficients:

$$X(m) = \sum_{n=0}^{N/2-1} x_{1,0}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 4 = 0, 3, \tag{13}$$

$$X(m) = \sum_{n=0}^{N/2-1} x_{1,1}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 4 = 1, 2. \tag{14}$$
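The first stage of the decomposition can be checked numerically. The Python sketch below is an illustration and not part of the original paper: it forms the two half-length sequences of (11) and (12), evaluates the half-length sums (13) and (14), and compares the result with a direct evaluation of (3). The function names are hypothetical, and the constant $\sqrt{2/N}$ is omitted as in the text.

```python
import math

def dst4_direct(x):
    """Unnormalised DST-IV by direct evaluation of (3); O(N^2) operations."""
    N = len(x)
    return [sum(x[n] * math.sin((m + 0.5) * (n + 0.5) * math.pi / N)
                for n in range(N)) for m in range(N)]

def first_stage(x):
    """Return the sequences x_{1,0} and x_{1,1} of (11) and (12); len(x) must be even."""
    N = len(x)
    half = N // 2
    c = 2.0 * math.cos(math.pi / 4)
    x10 = [x[n] + x[N - 1 - n] + c * x[half + n] for n in range(half)]
    x11 = [x[n] + x[N - 1 - n] - c * x[half + n] for n in range(half)]
    return x10, x11

def dst4_via_first_stage(x):
    """Evaluate X(m) through the half-length sums (13) and (14)."""
    N = len(x)
    x10, x11 = first_stage(x)
    X = []
    for m in range(N):
        theta = (m + 0.5) * math.pi / N
        seq = x10 if m % 4 in (0, 3) else x11   # group selected by m mod 4
        X.append(sum(seq[n] * math.sin((n + 0.5) * theta) for n in range(N // 2)))
    return X
```

Each coefficient now needs only $N/2$ sine terms; the remaining stages repeat the same step on `x10` and `x11`.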

Equations (13) and (14) possess the same structure as (3). Repeating the same procedure from (3) to (9), one may represent these two groups by

$$X(m) = \sum_{n=0}^{N/4-1}\bigl[x_{1,0}(n) + x_{1,0}(\tfrac12 N-1-n) + 2\cos(\tfrac14 N\theta_m)\,x_{1,0}(\tfrac14 N+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 4 = 0, 3, \tag{15}$$

and

$$X(m) = \sum_{n=0}^{N/4-1}\bigl[x_{1,1}(n) + x_{1,1}(\tfrac12 N-1-n) + 2\cos(\tfrac14 N\theta_m)\,x_{1,1}(\tfrac14 N+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 4 = 1, 2. \tag{16}$$

Since

$$\cos(\tfrac14 N\theta_m) = \begin{cases} \cos(\pi/8), & m \bmod 8 = 0, 7, \\ -\cos(\pi/8), & m \bmod 8 = 3, 4, \end{cases} \tag{17}$$

and

$$\cos(\tfrac14 N\theta_m) = \begin{cases} \cos(3\pi/8), & m \bmod 8 = 1, 6, \\ -\cos(3\pi/8), & m \bmod 8 = 2, 5, \end{cases} \tag{18}$$

let

$$\begin{aligned} x_{2,0}(n) &= x_{1,0}(n) + x_{1,0}(\tfrac12 N-1-n) + 2\cos(\pi/8)\,x_{1,0}(\tfrac14 N+n), \\ x_{2,1}(n) &= x_{1,0}(n) + x_{1,0}(\tfrac12 N-1-n) - 2\cos(\pi/8)\,x_{1,0}(\tfrac14 N+n), \end{aligned} \tag{19}$$

$$\begin{aligned} x_{2,2}(n) &= x_{1,1}(n) + x_{1,1}(\tfrac12 N-1-n) + 2\cos(3\pi/8)\,x_{1,1}(\tfrac14 N+n), \\ x_{2,3}(n) &= x_{1,1}(n) + x_{1,1}(\tfrac12 N-1-n) - 2\cos(3\pi/8)\,x_{1,1}(\tfrac14 N+n). \end{aligned} \tag{20}$$

Then, each group of $X(m)$ in (13) and (14) may be further partitioned into two parts. That is to say, $X(m)$ may be partitioned into four parts:

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,0}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 8 = 0, 7, \tag{21}$$

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,1}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 8 = 3, 4, \tag{22}$$

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,2}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 8 = 1, 6, \tag{23}$$

$$X(m) = \sum_{n=0}^{N/4-1} x_{2,3}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 8 = 2, 5. \tag{24}$$

Thus the second stage of decomposition is completed.

In the above expressions, two subscripts are used for $x(n)$. The first subscript, denoted by $k$, represents the stage of the decomposition, while the second subscript, denoted by $i$ and ranging from 0 to $2^k - 1$, represents the group number. Each group $i$ in the previous stage generates the groups $2i$ and $2i+1$ in the next stage.

In general, the partition of m in the previous stage generates the partition of m in the next stage. Observation of the generation yields a structure as illustrated in Fig. 1.

Representing the new order of $m$ after $k$ stages of decomposition by $h_{2K}(m) \bmod 2K$, where $K = 2^k$, $h_{2K}(m)$ is generated from $h_K(m)$ as follows:

$$h_{2K}(2m) = h_K(m), \qquad h_{2K}(2m+1) = 2K - 1 - h_K(m), \tag{25}$$

with the initial conditions $h_2(0) = 0$ and $h_2(1) = 1$. After $k$ stages of decomposition, the $X(m)$ are partitioned into $K$ groups, each containing $N/K$ DST-IV coefficients. We now show that the $i$th group of the $k$th stage,

$$X(m) = \sum_{n=0}^{N/K-1} x_{k,i}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 2K = h_{2K}(2i),\ h_{2K}(2i+1), \tag{26}$$

which is arranged according to the order of $h_{2K}(i)$, generates exactly two groups,

$$X(m) = \sum_{n=0}^{N/2K-1} x_{k+1,2i}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 4K = h_{4K}(4i),\ h_{4K}(4i+1), \tag{27}$$

$$X(m) = \sum_{n=0}^{N/2K-1} x_{k+1,2i+1}(n)\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 4K = h_{4K}(4i+2),\ h_{4K}(4i+3), \tag{28}$$

which are arranged according to the order of $h_{4K}(i)$.

Fig. 1. Partitioning m into groups.

Repeating the derivation from (3) to (9), the $i$th group of $X(m)$ of (26) may be represented as

$$X(m) = \sum_{n=0}^{N/2K-1}\bigl[x_{k,i}(n) + x_{k,i}(N/K-1-n) + 2\cos(N\theta_m/2K)\,x_{k,i}(N/2K+n)\bigr]\sin\bigl((n+\tfrac12)\theta_m\bigr), \qquad m \bmod 2K = h_{2K}(2i),\ h_{2K}(2i+1). \tag{29}$$

Since the period of

$$\cos(N\theta_m/2K) = \cos\bigl((m+\tfrac12)\pi/2K\bigr)$$

is $4K$, $h_{2K}(2i) + 2jK$ and $h_{2K}(2i+1) + 2jK$ take four different values in one period. According to (25), they are

$$\begin{aligned} m_0 &= h_{2K}(2i) = h_K(i), \\ m_1 &= h_{2K}(2i+1) = 2K - 1 - h_K(i), \\ m_2 &= 2K + h_{2K}(2i) = 2K + h_K(i), \\ m_3 &= 2K + h_{2K}(2i+1) = 4K - 1 - h_K(i). \end{aligned}$$

On the other hand, we have

$$\cos\bigl((m_0+\tfrac12)\pi/2K\bigr) = \cos\bigl((m_3+\tfrac12)\pi/2K\bigr) = \cos\bigl((2h_K(i)+1)\pi/4K\bigr), \tag{30}$$

$$\cos\bigl((m_1+\tfrac12)\pi/2K\bigr) = \cos\bigl((m_2+\tfrac12)\pi/2K\bigr) = -\cos\bigl((2h_K(i)+1)\pi/4K\bigr). \tag{31}$$

Thus, the two newly generated groups are partitioned according to $m = m_0, m_3$ or $m = m_1, m_2$. Moreover,

$$\begin{aligned} m_0 &= h_{2K}(2i) = h_{4K}(4i), \\ m_3 &= 4K - 1 - h_K(i) = 4K - 1 - h_{2K}(2i) = h_{4K}(4i+1), \\ m_1 &= h_{2K}(2i+1) = h_{4K}(4i+2), \\ m_2 &= 2K + h_K(i) = 4K - 1 - [2K - 1 - h_K(i)] = 4K - 1 - h_{2K}(2i+1) = h_{4K}(2(2i+1)+1) = h_{4K}(4i+3). \end{aligned}$$

This confirms our argument that the new groups are arranged according to the order of $h_{4K}(i)$.
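The recursion (25) is short enough to express directly. The Python sketch below is illustrative and not taken from the paper; the function name `hadamard_order` is hypothetical. It starts from the trivial order $h_1 = [0]$, which reproduces the initial conditions $h_2(0) = 0$ and $h_2(1) = 1$ after one doubling, and it can be used to confirm the four relations between $m_0, \ldots, m_3$ and $h_{4K}$ derived above.

```python
def hadamard_order(K):
    """Return [h_K(0), ..., h_K(K-1)] generated by recursion (25); K is a power of 2."""
    if K < 1 or K & (K - 1):
        raise ValueError("K must be a power of 2")
    h = [0]                      # h_1, so that one doubling gives h_2 = [0, 1]
    while len(h) < K:
        top = 2 * len(h) - 1     # this is 2K - 1 for the doubled order
        # h_{2K}(2m) = h_K(m) and h_{2K}(2m+1) = 2K - 1 - h_K(m)
        h = [v for a in h for v in (a, top - a)]
    return h

def check_group_relations(K):
    """Check m0, m3, m1, m2 against h_{4K}(4i), ..., h_{4K}(4i+3) for every i."""
    hK, h4K = hadamard_order(K), hadamard_order(4 * K)
    for i in range(K):
        expected = [hK[i], 4 * K - 1 - hK[i], 2 * K - 1 - hK[i], 2 * K + hK[i]]
        if h4K[4 * i:4 * i + 4] != expected:
            return False
    return True
```

For $K = 8$ the function returns `[0, 7, 3, 4, 1, 6, 2, 5]`, which matches the pairs of $m \bmod 8$ in (21) to (24).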

It is interesting to note that the order of $h_K(i)$ is identical with the sequency (number of zero-crossings) of the $i$th row of a $K \times K$ Hadamard matrix [1]. For this reason, the order of $h_K(i)$ will be referred to as the Hadamard order. Equations (30) and (31) show that the cosine multipliers are arranged in a Hadamard order too.

After $M = \log_2 N$ stages of decomposition, $X(m)$ is partitioned into $N$ groups. Each group contains only one DST-IV coefficient, and there is only one term in the summation:

$$X(m) = x_{M,i}(0)\sin(\tfrac12\theta_m), \qquad m = h_N(i). \tag{32}$$

The modulus is not needed in this expression. At this point the decomposition is completed. To show the above fast sine transform (FST) algorithm clearly, the signal-flow graph of an 8-point FST-IV is given in Fig. 2. Following the above lines, a signal-flow graph of the FST-III is shown in Fig. 3. Fig. 4 is the signal-flow graph of the FST-II.
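Putting the stages together, the complete decomposition can be sketched in a few lines of Python. This is an illustrative re-expression of (29) to (32) and not a transcription of the author's FORTRAN: it keeps each group as a separate list for clarity instead of working in place, calls `math.cos` and `math.sin` for the multipliers rather than generating them recursively, and returns the coefficients in natural order. All function names are hypothetical.

```python
import math

def hadamard_order(K):
    """h_K(0..K-1) from recursion (25), starting from h_1 = [0]."""
    h = [0]
    while len(h) < K:
        top = 2 * len(h) - 1
        h = [v for a in h for v in (a, top - a)]
    return h

def dst4_direct(x):
    """Reference: unnormalised DST-IV by direct evaluation of (3)."""
    N = len(x)
    return [sum(x[n] * math.sin((m + 0.5) * (n + 0.5) * math.pi / N)
                for n in range(N)) for m in range(N)]

def fst4(x):
    """Unnormalised DST-IV via successive halving; len(x) must be a power of 2."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    groups = [list(x)]               # stage 0: a single group x_{0,0}
    K = 1                            # number of groups at the current stage
    while len(groups[0]) > 1:
        h = hadamard_order(K)
        nxt = []
        for i, g in enumerate(groups):
            L, half = len(g), len(g) // 2
            c = 2.0 * math.cos((2 * h[i] + 1) * math.pi / (4 * K))  # 2*d(K+i), eq. (30)
            u = [g[n] + g[L - 1 - n] for n in range(half)]
            v = [c * g[half + n] for n in range(half)]
            nxt.append([a + b for a, b in zip(u, v)])   # group 2i
            nxt.append([a - b for a, b in zip(u, v)])   # group 2i + 1
        groups, K = nxt, 2 * K
    hN = hadamard_order(N)
    X = [0.0] * N
    for i, g in enumerate(groups):   # final sine multiplication, eq. (32)
        m = hN[i]
        X[m] = g[0] * math.sin((m + 0.5) * math.pi / (2 * N))
    return X
```

Each stage performs $N/2$ multiplications and $3N/2$ additions, and the last loop adds $N$ multiplications, in line with the counts stated for the DST-IV.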

4. The Matrix representation of the FST algorithms

The FST algorithms given in the last section can be represented compactly by matrix products, as will be shown in this section.

Fig. 2. Signal-flow graph of the FST-IV for N = 8. ($d_1 = \sqrt{0.5}$; $d_{2i} = \sqrt{0.5(1+d_i)}$; $d_{2i+1} = \sqrt{0.5(1-d_i)}$.)

Fig. 3. Signal-flow graph of the FST-III for N = 8. ($d_1 = \sqrt{0.5}$; $d_{2i} = \sqrt{0.5(1+d_i)}$; $d_{2i+1} = \sqrt{0.5(1-d_i)}$.)

Fig. 4. Signal-flow graph of the FST-II for N = 8. ($d_1 = \sqrt{0.5}$; $d_{2i} = \sqrt{0.5(1+d_i)}$; $d_{2i+1} = \sqrt{0.5(1-d_i)}$.)

Let $H_N$ be a permutation matrix of order $N$ which permutes a sequency-ordered matrix into a Hadamard-ordered one. Then

$$S^{(0)}_N = H_N S^{IV}_N = \bigl[\sin[(h_N(m)+\tfrac12)(n+\tfrac12)\pi/N]\bigr], \qquad m, n = 0, 1, \ldots, N-1, \tag{33}$$

is the DST-IV matrix in a Hadamard order, where the $h_N(m)$ are generated by (25).

Let $K = 2^k \le N$. We partition the first $N/K$ columns of $S^{(0)}_N$ into $K$ square submatrices; the $i$th submatrix is given by

$$S^{(i)}_{N/K} = \bigl[\sin[(2h_N(iN/K+j)+1)(n+\tfrac12)\pi/2N]\bigr], \qquad j, n = 0, 1, \ldots, N/K-1. \tag{34}$$

Then $S^{(0)}_N$ may be factorized into

$$S^{(0)}_N = \begin{bmatrix} S^{(0)}_{N/2} & 0 \\ 0 & S^{(1)}_{N/2} \end{bmatrix} B_N R^{(0)}_N, \tag{35}$$

where

$$B_N = \begin{bmatrix} I_{N/2} & I_{N/2} \\ I_{N/2} & -I_{N/2} \end{bmatrix}, \tag{36}$$

$$R^{(0)}_N = \begin{bmatrix} I_{N/2} & \bar I_{N/2} \\ 0 & 2d_1 I_{N/2} \end{bmatrix}, \qquad d_1 = \cos(\pi/4). \tag{37}$$

Each submatrix $S^{(i)}_{N/K}$ may be further factorized in the same manner:

$$S^{(i)}_{N/K} = \begin{bmatrix} S^{(2i)}_{N/2K} & 0 \\ 0 & S^{(2i+1)}_{N/2K} \end{bmatrix} B_{N/K} R^{(i)}_{N/K}, \tag{38}$$

where $B_{N/K}$ is given by (36), while

$$R^{(i)}_{N/K} = \begin{bmatrix} I_{N/2K} & \bar I_{N/2K} \\ 0 & 2d(K+i)\, I_{N/2K} \end{bmatrix}, \tag{39}$$

$$d(K+i) = \cos[(2h_K(i)+1)\pi/4K], \qquad i = 0, 1, \ldots, K-1. \tag{40}$$

Therefore, $S^{(0)}_N$ may be decomposed recursively until the smallest submatrices, each of which contains only one element of the first column of $S^{(0)}_N$, are reached.
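The factorization described in this section can be checked numerically. The Python sketch below is illustrative and not from the paper. It assumes one particular reading of the matrices: the submatrix $S^{(i)}_{N/K}$ holds rows $iN/K, \ldots, (i+1)N/K - 1$ and the first $N/K$ columns of the Hadamard-ordered, unnormalised DST-IV matrix; $B$ is the block matrix $[I, I; I, -I]$; and $R$ is $[I, \bar I; 0, 2d\,I]$ with $d = \cos[(2h_K(i)+1)\pi/4K]$. All function names are hypothetical.

```python
import math

def hadamard_order(K):
    """h_K(0..K-1) from recursion (25)."""
    h = [0]
    while len(h) < K:
        top = 2 * len(h) - 1
        h = [v for a in h for v in (a, top - a)]
    return h

def submatrix(N, K, i):
    """Rows i*N/K + j and the first N/K columns of the Hadamard-ordered DST-IV matrix."""
    hN, size = hadamard_order(N), N // K
    return [[math.sin((2 * hN[i * size + j] + 1) * (n + 0.5) * math.pi / (2 * N))
             for n in range(size)] for j in range(size)]

def block_diag(A, B):
    """Block-diagonal matrix diag(A, B)."""
    a, b = len(A), len(B)
    return [row + [0.0] * b for row in A] + [[0.0] * a + row for row in B]

def butterfly_factors(size, d):
    """Return (B, R) with B = [I I; I -I] and R = [I Ibar; 0 2d I], each of order size."""
    half = size // 2
    B = [[0.0] * size for _ in range(size)]
    R = [[0.0] * size for _ in range(size)]
    for r in range(half):
        B[r][r] = B[r][half + r] = B[half + r][r] = 1.0
        B[half + r][half + r] = -1.0
        R[r][r] = 1.0
        R[r][size - 1 - r] = 1.0          # antidiagonal identity in the top-right block
        R[half + r][half + r] = 2.0 * d
    return B, R

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def factorised(N, K, i):
    """Right-hand side of the stage factorization for the i-th submatrix."""
    d = math.cos((2 * hadamard_order(K)[i] + 1) * math.pi / (4 * K))
    B, R = butterfly_factors(N // K, d)
    blocks = block_diag(submatrix(N, 2 * K, 2 * i), submatrix(N, 2 * K, 2 * i + 1))
    return matmul(matmul(blocks, B), R)
```

Comparing `submatrix(N, K, i)` with `factorised(N, K, i)` for every stage and group shows agreement to rounding error under these assumptions.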

5. Generation of multipliers

The multipliers encountered in the proposed algorithm are all cosines. They may be generated by calling the internal cosine function and then be arranged properly. However, as will be shown in the following, they can be generated recursively, without referring to any trigonometric functions.

According to (25),

$$2h_K(2i+1) + 1 = 2K - 1 - 2h_{K/2}(i) = 2K - [2h_K(2i) + 1]. \tag{41}$$

Substituting (41) into (40) yields

$$d(K+2i+1) = \cos[(2h_K(2i+1)+1)\pi/4K] = \sin[(2h_K(2i)+1)\pi/4K].$$

Hence

$$d(K+2i)^2 + d(K+2i+1)^2 = 1,$$

$$d(K+2i)^2 - d(K+2i+1)^2 = \cos\bigl((h_K(2i)+\tfrac12)\pi/K\bigr) = d(K/2+i).$$

Then, the following formulas are obtained:

$$d(K+2i) = [0.5(1 + d(K/2+i))]^{1/2}, \tag{42a}$$

$$d(K+2i+1) = [0.5(1 - d(K/2+i))]^{1/2}. \tag{42b}$$

Replacing $K/2+i$ by $i$, (42) can be rewritten more compactly as

$$d(2i) = [0.5(1 + d(i))]^{1/2}, \tag{43a}$$

$$d(2i+1) = [0.5(1 - d(i))]^{1/2}. \tag{43b}$$

Starting from

$$d(1) = \cos(\pi/4) = \sqrt{0.5}, \tag{44}$$

all multipliers required by the DST-IV of order $N$, including the sines of the last step, can be generated recursively. In general, computing a square root is faster than computing a cosine. In addition, reordering the cosines takes additional time. Therefore, the recursive generation of the multipliers is more efficient. However, since round-off error accumulates during the recursion, the round-off errors of the multipliers in later stages are generally larger than those in earlier stages. To reduce the error accumulation, one may use double precision for the multipliers in the first few stages. The accuracy is improved at the cost of a small increase in computation and memory.

6. Operation counts

For each stage of decomposition, $N/2$ multiplications and $3N/2$ additions are required. After $M$ stages of decomposition, $N$ further multiplications are needed. A total of $N(M+2)/2$ multiplications and $3MN/2$ additions is therefore required for the computation of an $N$-point DST-IV. With the relations given in [6] (eqns. (75)-(90) in [6]), and adopting its symbolic notation, i.e., $\mu$ and $\alpha$ represent the required numbers of multiplications and additions respectively, the arithmetic operations for the other types of the DST are given by

$$\mu(S^{II}_N) = \mu(S^{III}_N) = MN/2 + 1,$$

$$\alpha(S^{II}_N) = \alpha(S^{III}_N) = 3MN/2 - N + 1,$$

$$\mu(S^{I}_{N-1}) = MN/2 - N + 1,$$

$$\alpha(S^{I}_{N-1}) = 3MN/2 - 2N - M + 2.$$
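For concreteness, the operation counts can be tabulated for a given transform length. The short Python helper below is a sketch that only evaluates the closed-form expressions stated in this section; the function name and the dictionary layout are hypothetical.

```python
def fst_operation_counts(N):
    """Return {type: (multiplications, additions)} for N = 2**M, using the stated formulas."""
    M = N.bit_length() - 1
    if N < 2 or (1 << M) != N:
        raise ValueError("N must be a power of 2, N >= 2")
    counts_ii_iii = (M * N // 2 + 1, 3 * M * N // 2 - N + 1)
    return {
        "IV": (N * (M + 2) // 2, 3 * M * N // 2),   # M stages plus N final multiplications
        "II": counts_ii_iii,
        "III": counts_ii_iii,
        "I": (M * N // 2 - N + 1, 3 * M * N // 2 - 2 * N - M + 2),
    }
```

For $N = 8$ ($M = 3$) this gives 20 multiplications and 36 additions for the DST-IV, 13 and 29 for the DST-II and DST-III, and 5 and 19 for the DST-I.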


Since the DST-II is equivalent to the DCT-II in terms of computational complexity [5], these figures show that, in terms of arithmetic operations, the proposed FST algorithm is as efficient as some of the most efficient algorithms for the discrete cosine transform, such as Lee's FCT [3], Vetterli and Nussbaumer's FFCT [4] and Wang's rotation factor algorithm [7], and is slightly more efficient than the FST algorithms proposed by Yip and Rao [14, 15]. In addition, the present algorithm possesses two advantages:

(1) Instead of the secant multipliers used in [1, 14, 15], the present algorithm uses cosine multipliers. Therefore, the present algorithm causes less computational error than the algorithms given in [1, 14, 15].

(2) The structure and the indexing of the proposed algorithm are simpler than those of the algorithms given in [4] and [7]. It is very easy to implement in either software or hardware.

FORTRAN subroutines to compute all four DSTs via the proposed algorithm are provided in the appendix, where caution has been taken to reduce the round-off errors of the multipliers.

Acknowledgment

The author wishes to thank one of the reviewers for pointing out the error accumulation of the recursive generation of the multipliers. The author's consideration of reducing the error accumulation is owed to that reviewer.

Appendix

Eight subroutines (author: Zhongde Wang) are included in this appendix. They are: FST1, FST2, FST3, FST4, COEFS, REORD, HDMOD and IVHDM. The first four are fast DST subroutines. COEFS is a subroutine to initialise the multiplication coefficients required by the FSTs. The last three subroutines are for reordering the sequence into an appropriate order.

The fast DST-I subroutine

Parameters:
F: Array of input and output data.
G: Working array.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to N/2.
L: An integer array for reordering the transformed data into a sequence order. L is obtained by calling REORD.
N = 2**M: The dimensions of F, G and L are all N - 1.

      SUBROUTINE FST1 (F, G, D, L, M, N)
      DIMENSION F(1), G(1), D(1), L(1)
      M1 = M - 1
      N1 = N/2
      N2 = N1 - 1
C     SUMS GO TO G, DIFFERENCES STAY IN THE UPPER HALF OF F
      DO 10 I = 1, N2
      G(I) = F(I) + F(N - I)
   10 F(N - I) = F(I) - F(N - I)
      G(N1) = F(N1)
      CALL FST3 (G, D, M1, N1)
      N0 = 0
C     REPEAT THE SPLIT ON THE REMAINING DIFFERENCE PART
   20 N0 = N1 + N0
      N1 = N1/2
      M1 = M1 - 1
      N2 = N1 - 1
      N3 = N0 + N1
      N4 = N3 + N3
      DO 30 I = N0 + 1, N3 - 1
      G(I) = F(I) + F(N4 - I)
   30 F(N4 - I) = F(N4 - I) - F(I)
      G(N3) = F(N3)
      CALL FST3 (G(N0 + 1), D, M1, N1)
      IF (N1.GE.2) GOTO 20
      G(N - 1) = F(N - 1)
C     REORDER THE RESULT THROUGH THE INDEX ARRAY L
      DO 40 I = 1, N - 1
   40 F(L(I)) = G(I)
      RETURN
      END

The fast DST-II subroutine

Parameters:
F: Array of input and output data. The input array is in a Hadamard order. The output array is in a natural order.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to N.
N = 2**M: The dimension of F.

      SUBROUTINE FST2 (F, D, M, N)
      DIMENSION F(N), D(1)
      N1 = N/2
      N2 = 2
      J3 = 1
C     FIRST STAGE: TWO-POINT BUTTERFLIES
      DO 10 I = 1, N1
      J3 = -J3
      I2 = I + I
      I1 = I2 - 1
      T = F(I1)
      F(I1) = (T + F(I2))*D(N1 + I - J3)
   10 F(I2) = T - F(I2)
C     REMAINING M - 1 STAGES
      DO 50 I = 1, M - 1
      N1 = N1/2
      N7 = N2
      N2 = N2 + N2
      N6 = 0
      IF (N1.EQ.1) J3 = 0
      DO 40 J = 1, N1
      J3 = -J3
      J2 = N1 + J
      B = D(J2) + D(J2)
      N3 = N6 + 1
      N6 = N6 + N2
      N8 = N6 - N7
      N5 = N8 - 1
      N4 = N8 + N8
      T = F(N8)
      F(N8) = (T + F(N6))*D(J2 - J3)
      F(N6) = T - F(N6)
      DO 20 K = N3, N5
      K1 = N7 + K
      T = F(K)
      F(K) = T + F(K1)
   20 F(K1) = B*(T - F(K1))
      DO 30 K = N3, N5
   30 F(N4 - K) = F(N4 - K) + F(K)
   40 CONTINUE
   50 CONTINUE
      F(N) = D(1)*F(N)
      RETURN
      END

The fast DST-III subroutine

It is also used in FST1. For the DST-I, the fourth executable statement, F(N) = D(1)*F(N), should be disabled.

Parameters:
F: Array of input and output data. The input array is in a natural order. The output array is in a Hadamard order.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to N.
N = 2**M: The dimension of F.

      SUBROUTINE FST3 (F, D, M, N)
      DIMENSION F(N), D(1)
      J1 = 1
      N1 = N
      J3 = -1
      F(N) = D(1)*F(N)
C     M - 1 STAGES OF DECOMPOSITION, J1 GROUPS PER STAGE
      DO 40 I = 1, M - 1
      N2 = N1
      N1 = N1/2
      N6 = 0
      DO 30 J = 1, J1
      J2 = J1 + J
      J3 = -J3
      A = D(J2 - J3)
      B = D(J2) + D(J2)
      N3 = N6 + 1
      N6 = N2 + N6
      N5 = N6 - N1
      N4 = N5 + N5
      DO 10 K = N3, N5 - 1
      K1 = N4 - K
      F(K) = F(K) + F(K1)
   10 F(K1) = F(K1)*B
      F(N5) = F(N5)*A
      DO 20 K = N3, N5
      T = F(K + N1)
      F(K + N1) = F(K) - T
   20 F(K) = F(K) + T
   30 CONTINUE
   40 J1 = J1 + J1
C     LAST STAGE: TWO-POINT BUTTERFLIES
   50 DO 60 I = 1, J1
      I2 = I + I
      I1 = I2 - 1
      J3 = -J3
      T = D(J1 + I - J3)*F(I2)
      F(I2) = T + F(I1)
   60 F(I1) = T - F(I1)
      RETURN
      END

The fast DST-IV subroutine

Parameters:
F: Array of input and output data. The input array is in a natural order. The output array is in a Hadamard order.
D: Array of multiplication coefficients obtained by calling COEFS, the dimension of which must be greater than or equal to 2*N.
N = 2**M: The dimension of F.

      SUBROUTINE FST4 (F, D, M, N)
      DIMENSION F(N), D(1)
      J1 = 1
      N1 = N
C     M STAGES OF DECOMPOSITION, J1 GROUPS PER STAGE
      DO 40 I = 1, M
      J2 = J1
      N2 = N1
      N1 = N1/2
      N7 = 0
      DO 30 J = 1, J1
      A = D(J2 + J)
      A = A + A
      N6 = N7
      N7 = N7 + N2
      N3 = N6 + 1
      N4 = N3 + N7
      N5 = N6 + N1
C     FOLD THE GROUP AND SCALE ITS SECOND HALF BY 2*D
      DO 10 K = N3, N5
      K1 = N4 - K
      F(K) = F(K) + F(K1)
   10 F(K1) = A*F(K1)
C     SUM AND DIFFERENCE GIVE THE TWO NEW GROUPS
      DO 20 K = N3, N5
      T = F(K + N1)
      F(K + N1) = F(K) - T
   20 F(K) = F(K) + T
   30 CONTINUE
   40 J1 = J1 + J1
C     FINAL MULTIPLICATIONS. J1 EQUALS N HERE, SO THE LOOP RUNS
C     OVER N/2 PAIRS TO STAY WITHIN THE BOUNDS OF F
      DO 50 I = 1, N/2
      I1 = I + I
      I2 = I1 - 1
      F(I1) = D(J1 + I2)*F(I1)
   50 F(I2) = D(J1 + I1)*F(I2)
      RETURN
      END

Subroutine to obtain the multiplication coefficient array D to be used by all FST subroutines

For FST-IV, N1 >= 2*N; for FST-III and FST-II, N1 >= N; for FST-I, N1 >= N/2. N is the length of the sequence to be transformed.

      SUBROUTINE COEFS (D, N1)
      DIMENSION D(N1), C(15)
      DOUBLE PRECISION C, T
      C(1) = DSQRT (5D-1)
      D(1) = C(1)
      M = 2
      M0 = 1
C     FIRST STAGES: RECURSION KEPT IN DOUBLE PRECISION IN C
   10 DO 20 I = M0, M - 1
      I1 = I + I
      I2 = I1 + 1
      C(I1) = DSQRT (5D-1*(1D0 + C(I)))
      C(I2) = DSQRT (5D-1*(1D0 - C(I)))
      D(I1) = C(I1)
   20 D(I2) = C(I2)
      M0 = M0 + M0
      M = M0 + M0
      IF (M.GE.N1) GOTO 70
      IF (M.GE.16) GOTO 30
      GOTO 10
C     ONE STAGE COMPUTED IN DOUBLE PRECISION FROM C, STORED IN D
   30 DO 40 I = M0, M - 1
      I1 = I + I
      I2 = I1 + 1
      T = DSQRT (5D-1*(1D0 + C(I)))
      D(I + I) = T
      T = DSQRT (5D-1*(1D0 - C(I)))
   40 D(I + I + 1) = T
      M0 = M0 + M0
      M = M0 + M0
      IF (M.GE.N1) GOTO 70
C     LATER STAGES IN SINGLE PRECISION
   50 DO 60 I = M0, M - 1
      D(I + I) = SQRT (.5*(1. + D(I)))
   60 D(I + I + 1) = SQRT (.5*(1. - D(I)))
      M0 = M0 + M0
      M = M0 + M0
      IF (M.GE.N1) GOTO 70
      GOTO 50
C     SHIFT THE COEFFICIENTS UP BY ONE POSITION
   70 DO 80 I = N1, 2, -1
   80 D(I) = D(I - 1)
      RETURN
      END
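The recursion that COEFS implements can be illustrated in Python as follows. This is a sketch rather than a translation of the subroutine: it uses double precision throughout, omits the final shift of the array by one position, and indexes the multipliers as $d(1), d(2), \ldots$ in the notation of (43) and (44). The check against the closed form (40) relies on a hypothetical helper `hadamard_order` that implements recursion (25).

```python
import math

def hadamard_order(K):
    """h_K(0..K-1) from recursion (25)."""
    h = [0]
    while len(h) < K:
        top = 2 * len(h) - 1
        h = [v for a in h for v in (a, top - a)]
    return h

def multipliers(count):
    """Return a list d with d[1..count-1] from (43) and (44); d[0] is unused.

    count is assumed to be a power of 2 with count >= 2.
    """
    d = [0.0] * count
    d[1] = math.sqrt(0.5)                            # d(1) = cos(pi/4), eq. (44)
    for i in range(1, count // 2):
        d[2 * i] = math.sqrt(0.5 * (1.0 + d[i]))     # eq. (43a)
        d[2 * i + 1] = math.sqrt(0.5 * (1.0 - d[i])) # eq. (43b)
    return d

def closed_form(K, i):
    """d(K + i) = cos[(2 h_K(i) + 1) pi / 4K], eq. (40)."""
    return math.cos((2 * hadamard_order(K)[i] + 1) * math.pi / (4 * K))
```

For an $N$-point DST-IV, `multipliers(2 * N)` yields the stage multipliers $d(K+i)$ for $K = 1, 2, \ldots, N/2$ together with the final sines and cosines $d(N), \ldots, d(2N-1)$, with no call to a trigonometric function.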

Subroutine for permuting a natural ordered array into a Hadamard ordered one

Parameters:
F: Array to be permuted.
G: Working array, the dimension of which has to be greater than or equal to N/2.
N: The dimension of F.

      SUBROUTINE HDMOD (F, G, N)
      DIMENSION F(N), G(1)
      IF (N.LE.2) RETURN
      N1 = N/2
      N0 = N1
      N2 = 1
C     COPY THE UPPER HALF INTO G, BLOCK BY BLOCK, IN REVERSE ORDER
   10 I1 = N
      I2 = -N2
      DO 20 I = 1, N1
      I1 = I1 - N2
      I2 = I2 + N2
      DO 20 J = 1, N2
   20 G(I2 + J) = F(I1 + J)
      N3 = 0
      N4 = N2 + N2
      I1 = N
C     SPREAD THE BLOCKS OF THE LOWER HALF OVER EVERY OTHER SLOT
      DO 30 I = 1, N1
      N3 = N3 + N2
      I1 = I1 - N4
      I2 = N0 - N3
      DO 30 J = 1, N2
   30 F(I1 + J) = F(I2 + J)
      I1 = -N2
      I2 = -N2
C     INTERLEAVE THE REVERSED BLOCKS HELD IN G
      DO 40 I = 1, N1
      I1 = I1 + N4
      I2 = I2 + N2
      DO 40 J = 1, N2
   40 F(I1 + J) = G(I2 + J)
      N2 = N2 + N2
      N1 = N1/2
      IF (N1.GE.2) GOTO 10
      RETURN
      END

Subroutine to permute a Hadamard ordered array into a natural ordered one

The parameters are the same as those of HDMOD.

      SUBROUTINE IVHDM (F, G, N)
      DIMENSION F(N), G(1)
      IF (N.LE.2) RETURN
      N1 = N/4
      I1 = 1
   10 N2 = N1 + N1
      N3 = N2 + N2
      I2 = 0
      DO 40 I = 1, I1
      I3 = I2
      I2 = I2 + N3
      I4 = I3 + N2
C     SAVE THE EVEN-POSITIONED ELEMENTS, COMPACT THE ODD ONES
      DO 20 J = 1, N2
   20 G(J) = F(I3 + J + J)
      DO 30 J = 2, N2
   30 F(I3 + J) = F(I3 + J + J - 1)
C     WRITE THE SAVED ELEMENTS BACK WITH EACH PAIR SWAPPED
      DO 40 J = 1, N1
      J1 = J + J
      J2 = J1 - 1
      F(I4 + J1) = G(J2)
   40 F(I4 + J2) = G(J1)
      N1 = N1/2
      I1 = I1 + I1
      IF (N1.GE.1) GOTO 10
      RETURN
      END


The REORD routine

This subroutine makes an index array L of length N to be used in FST1 for reordering the transformed data into a natural order.

      SUBROUTINE REORD (L, N)
      DIMENSION L(N)
      L(1) = 1
      L(2) = 3
      N2 = 2
C     ODD INDICES IN HADAMARD ORDER FOR THE FIRST N/2 OUTPUTS
   10 N1 = N2
      N2 = N1 + N1
      DO 20 I = N1, 1, -1
      I1 = I + I
      L(I1) = N2 - L(I)
   20 L(I1 - 1) = L(I)
      IF (N1.LT.N/2) GOTO 10
      N0 = 0
      N2 = N/2
      N3 = N2
C     EACH LATER BLOCK DOUBLES EVERY OTHER ENTRY OF THE PREVIOUS ONE
   30 N2 = N2/2
      DO 40 I = 1, N2
      I1 = N0 + I + I - 1
   40 L(N3 + I) = L(I1) + L(I1)
      IF (N2.EQ.1) RETURN
      N0 = N3
      N3 = N3 + N2
      GOTO 30
      RETURN
      END
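To make the index array produced by REORD easier to inspect, the Python sketch below gives a line-by-line transliteration of the subroutine together with a direct construction that appears to produce the same result: the odd indices $2h+1$ in Hadamard order for the first $N/2$ entries, followed by the same pattern at half the length and twice the scale, and so on. Both functions are illustrative and not part of the original paper, the names are hypothetical, and $N \ge 4$ is assumed.

```python
def hadamard_order(K):
    """h_K(0..K-1) from recursion (25)."""
    h = [0]
    while len(h) < K:
        top = 2 * len(h) - 1
        h = [v for a in h for v in (a, top - a)]
    return h

def reord_transliterated(N):
    """Transliteration of REORD; returns L(1), ..., L(N-1) as a Python list."""
    if N < 4 or N & (N - 1):
        raise ValueError("N must be a power of 2, N >= 4")
    L = [0] * (N + 1)            # 1-based indexing, L[0] unused
    L[1], L[2] = 1, 3
    n2 = 2
    while True:                  # Fortran loop at label 10
        n1 = n2
        n2 = n1 + n1
        for i in range(n1, 0, -1):
            L[2 * i] = n2 - L[i]
            L[2 * i - 1] = L[i]
        if n1 >= N // 2:
            break
    n0, n2, n3 = 0, N // 2, N // 2
    while True:                  # Fortran loop at label 30
        n2 //= 2
        for i in range(1, n2 + 1):
            L[n3 + i] = 2 * L[n0 + 2 * i - 1]
        if n2 == 1:
            break
        n0 = n3
        n3 += n2
    return L[1:N]

def reord_direct(N):
    """Direct construction: blocks of scale * (2h + 1) with h in Hadamard order."""
    out, size, scale = [], N // 2, 1
    while size >= 1:
        out += [scale * (2 * h + 1) for h in hadamard_order(size)]
        size //= 2
        scale *= 2
    return out
```

For $N = 8$ both functions return `[1, 7, 3, 5, 2, 6, 4]`: the first four entries are the odd coefficient indices delivered by the 4-point FST-III, the next two come from the 2-point stage, and the last entry is the middle coefficient.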

References

[1] N. Ahmed and K.R. Rao, Orthogonal Transforms for Digital Signal Processing, Springer, New York, 1975.

[2] A.K. Jain, "A fast Karhunen-Loeve transform for a class of stochastic processes", IEEE Trans. Commun., Vol. COM-24, 1976, pp. 1023-1029.

[3] B.G. Lee, "A new algorithm for the discrete cosine transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-32, 1984, pp. 1243-1245.

[4] M. Vetterli and H. Nussbaumer, "Simple FFT and DCT algorithms with reduced number of operations", Signal Process., Vol. 6, 1984, pp. 267-278.

[5] Zhongde Wang, "A fast algorithm for the discrete sine transform implemented by the fast cosine transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-30, 1982, pp. 814-815.

[6] Zhongde Wang, "Fast algorithms for the discrete W transform and for the discrete Fourier transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-32, 1984, pp. 803-816.

[7] Zhongde Wang, "On computing the discrete Fourier and cosine transform", IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-33, 1985, pp. 1341-1344.

[8] Zhongde Wang, "Comments on 'A fast algorithm for the discrete sine transform'", IEEE Trans. Commun., Vol. COM-34, 1986, pp. 204-205.

[9] Zhongde Wang, "Some special cases of the Karhunen-Loeve transform", Acta Electronica Sinica, Vol. 13, No. 3, 1985, pp. 132-133 (In Chinese).

[10] Zhongde Wang and B.R. Hunt, "The discrete W transform", Appl. Math. Comput., Vol. 16, 1985, pp. 19-48.

[11] Zhongde Wang and Shenglong Xu, "Comments on 'On the computation and the effectiveness of discrete sine transform'", Comput. Elec. Eng., Vol. 12, No. 1/2, 1986, pp. 23-27.

[12] J.L. Wang and Z.Q. Ding, "Discrete sine transform domain LMS adaptive filtering", Int. Conf. Acoust. Speech Signal Process. (ICASSP-85), Beijing, China, 1985, pp. 260-263.

[13] P. Yip and K.R. Rao, "A fast computational algorithms for the discrete sine transform", IEEE Trans. Commun., Vol. COM-28, 1980, pp. 304-307.

[14] P. Yip and K.R. Rao, "Fast decimation-in-time algorithm for a family of discrete sine and cosine transforms", Circuits Syst. Signal Process., Vol. 3, 1984, pp. 387-408.

[15] P. Yip and K.R. Rao, "DIF algorithms for a family of discrete sine and cosine transforms", Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP-85), 1985, pp. 776-779.

[16] Zhao Zhen-Gang, "A new scheme of polyphase network TDM-FDM transmultiplexer using all-pass filters", Signal Process. (China), Vol. 1, No. 1, 1985, pp. 25-31 (In Chinese).
