• View

  • Download

Embed Size (px)



    Shrey DuttaDept. of Computer Sci. & Engg.Indian Institute of Technology


    Krishnaraj Sekhar PVDept. of Computer Sci. & Engg.Indian Institute of Technology


    Hema A. MurthyDept. of Computer Sci. & Engg.Indian Institute of Technology



    There are at least 100 ragas that are regularly performedin Carnatic music concerts. The audience determines theidentity of ragas within a few seconds of listening to anitem. Most of the audience consists of people who are onlyavid listeners and not performers.

    In this paper, an attempt is made to mimic the listener.A raga verification framework is therefore suggested. Theraga verification system assumes that a specific raga isclaimed based on similarity of movements and motivic pat-terns. The system then checks whether this claimed raga iscorrect. For every raga, a set of cohorts are chosen. A ragaand its cohorts are represented using pallavi lines of com-positions. A novel approach for matching, called LongestCommon Segment Set (LCSS), is introduced. The LCSSscores for a raga are then normalized with respect to itscohorts in two different ways. The resulting systems anda baseline system are compared for two partitionings of adataset. A dataset of 30 ragas from Charsur Foundation 1is used for analysis. An equal error rate (EER) of 12% isobtained.

    1 IntroductionRaga identification by machine is a difficult task in Car-

    natic music. This is primarily because a raga is not definedjust by the solfege but by svaras (ornamented notes) [13].The melodic histograms obtained for the Carnatic musicare more or less continuous owing to the gamaka 2 ladensvaras of the raga [23]. Although the svaras in Carnaticmusic are not quantifiable, for notational purposes an oc-tave is divided into 12 semitones: S, R1, R2(G1), R3(G2),G3, M1, M2, P, D1, D2(N1), D3(N2) and N3. Each raga ischaracterised by atleast 5 svaras. Arohana and avarohanacorrespond to an ordering of svaras in the ascent and de-

    1 http://www.charsurartsfoundation.org2 Gamaka is a meandering of a svara encompassing other permissible

    frequencies around it.

    c Shrey Dutta, Krishnaraj Sekhar PV, Hema A. Murthy.

    Licensed under a Creative Commons Attribution 4.0 International Li-cense (CC BY 4.0). Attribution: Shrey Dutta, Krishnaraj Sekhar PV,Hema A. Murthy. Raga Verification in Carnatic Music Using LongestCommon Segment Set, 16th International Society for Music Informa-tion Retrieval Conference, 2015.

    scent of the raga, respectively. Ragas with linear orderingof svaras are referred to as linear ragas such as Mohonamraga (S R2 G3 P D2 S). Similarly, non linear ragas havenon linear ordering such as Ananda Bhairavi raga (S G2R2 G2 M1 P D2 P S). A further complication arises owingto the fact that although the svaras in different ragas maybe identical, the ordering can be different. Even if the or-dering is the same, in one raga the approach to the svaracan be different, for example, todi and dhanyasi.

    There is no parallel in Western classical music to ragaverification. The closest that one can associate with, iscover song detection [6, 16, 22], where the objective is todetermine the same song rendered by different musicians.Whereas, two different renditions of the same raga maynot contain identical renditions of the motifs.

    Several attempts have been made to identify ragas [24,7,8,12,14,26]. Most of these efforts have used small reper-toires or have focused on ragas for which ordering is notimportant. In [26], the audio is transcribed to a sequence ofnotes and string matching techniques are used to performraga identification. In [2], pitch-class and pitch-dyads dis-tributions are used for identifying ragas. Bigrams on pitchare obtained using a twelve semitone scale. In [18], the au-thors assume that an automatic note transcription systemfor the audio is available. The transcribed notes are thensubjected to HMM based raga analysis. In [12,25], a tem-plate based on the arohana and avarohana is used to deter-mine the identity of the raga. The frequency of the svarasin Carnatic music is seldom fixed. Further, as indicatedin [27] and [28], the improvisations in extempore enuncia-tion of ragas can vary across musicians and schools. Thisbehaviour is accounted for in [10, 11, 14] by decreasingthe binwidth for computing melodic histograms. In [14],steady note transcription along with n-gram models is usedto perform raga identification. In [3] chroma features areused in an HMM framework to perform scale indepen-dent raga identification, while in [4] hierarchical randomforest classifier is used to match svara histograms. Thesvaras are obtained using the Western transcription sys-tem. These experiments are performed on 4/8 differentragas of Hindustani music. In [7], an attempt is made toperform raga identification using semi-continuous Gaus-sian mixtures models. This will work only for linear ragas.Recent research indicates that a raga is characterised bestby a time-frequency trajectory rather than a sequence of


  • Vocal Instruments TotalMale Female Violin Veena Saxophone FluteNumber of Ragas 25 27 8 3 2 2 30 (distinct)Number of Artists 53 37 8 3 1 3 105Number of Recordings 134 97 14 4 2 3 254Total Duration of Recordings 30 h 22 h 3 h 31 m 10 m 58 m 57 hNumber of Pallavi Lines 655 475 69 20 10 15 1244Average Duration of Pallavi Lines 11 s 8 s 10 s 6 s 6 s 8 s 8 s (avg.)Total Duration of Pallavi Lines 2 h 1 h 11 m 2 m 55 s 2 m 3 h

    Table 1. Details of the database used. Durations are given in approximate hours (h), minutes (m) or seconds (s).

    quantised pitches [5, 8, 9, 19, 20, 24]. In [19, 20], the samaof the tala (emphasised by the bol of tabla) is used to seg-ment a piece. The repeating pattern in a bandish in Hin-dustani Khyal music is located using the sama informa-tion. In [8, 19], motif identification is performed for Car-natic music. Motifs for a set of five ragas are defined andmarked carefully by a musician. Motif identification is per-formed using hidden Markov models (HMMs) trained foreach motif. Similar to [20], motif spotting in an alapanain Carnatic music is performed in [9]. In [24], a number ofdifferent similarity measures for matching melodic motifsof Indian music was attempted. It was shown that the in-tra pattern melodic motif has higher variation for Carnaticmusic in comparison with that of Hindustani music. It wasalso shown that the similarity obtained is very sensitive tothe measure used. All these efforts are ultimately aimedat obtaining typical signatures of ragas. It is shown in [9]that there can be many signatures for a given raga. To alle-viate this problem in [5], an attempt was made to obtain asmany signatures for a raga by comparing lines of compo-sitions. Here again, it was observed that the typical motifdetection was very sensitive to the distance measure cho-sen. Using typical motifs/signatures for raga identificationis not scalable, when the number of ragas under consider-ation increases.

    In this paper, this problem is addressed in a differentway. The objective is to mimic a listener in a Carnatic mu-sic concert. There are at least 100 ragas that are activelyperformed today. Most listeners identify ragas by refer-ring to the compositions with similar motivic patterns thatthey might have heard before. In raga verification, a ragasname (claim) and an audio clip is supplied. The machinehas to primarily verify whether the clip belongs to a givenraga or not.

    This task therefore requires the definition of cohorts fora raga. Cohorts of a given raga are the ragas which havesimilar movements while at the same time have subtle dif-ferences, for example, darbar and nayaki. In darbar raga,G2 is repeated twice in avarohana. The first is more or lessflat and short, while the second repetition is inflected. TheG2 in nayaki is characterised by a very typical gamaka.In order to verify whether a given audio clip belongs to aclaimed raga, the similarity is measured with respect to theclaimed raga and compared with its cohorts using a novelalgorithm called longest common segment set (LCSS). LCSS

    scores are then normalized using Z and T norms [1, 17].The rest of the paper is organised as follows. Section 2

    describes the dataset used in the study. Section 3 describesthe LCSS algorithm and its relevance for raga verifica-tion. As the task is raga verification, score normalisation iscrucial. Different score normalisation techniques are dis-cussed in Section 4. The experimental results are presentedin Section 5 and discussed in Section 6. The main conclu-sions drawn from the key results in this paper are discussedin Section 7

    2 Dataset usedTable 1 gives the details of the dataset used in this work.

    This dataset is obtained from the Charsur arts foundation 3 .The dataset consists of 254 vocal and instrument live record-ings spread across 30 ragas, including both target ragasand their cohorts. For every new raga that needs to be ver-ified, templates for the raga and its cohorts are required.

    2.1 Extraction of pallavi lines

    A composition in Carnatic music is composed of threeparts, namely, pallavi, anupallavi and caranam. It is be-lieved that the first phrase of the first pallavi line of a com-position contains the important movements in a raga. Abasic sketch is initiated in the pallavi line, developed fur-ther in the anupallavi and caranam [21] and therefore con-tains the gist of the raga. The algorithm described in [21]is used for extracting pallavi lines from compositions. De-tails of the extracted pallavi lines are given in Table 1. Ex-periments are performed on template and test recordings,selected from these pallavi lines, as discussed in greaterdetail in Section 5.

    2.2 Selection of