Musical Notes Reader

Anna Shmushkin & Lior Abramov

Introduction

Optical music recognition (OMR) has been a subject of research for decades. Many image processing algorithms and techniques have been developed to address it, yet the problem still poses many challenges to researchers today.

The goal of this project is to parse sheet music from a photographed image and provide a playback mechanism for it.

Algorithm

Image Alignment

1) Find the 4 corners of the page.

i. We find the boundary of the largest region using the MATLAB function bwboundaries, and then find its corners.

2) Apply a 2-D projective geometric transformation to the input image using the 4 corners of the page found in step 1.
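
The projective transformation in step 2 can be illustrated with a minimal sketch. It is written in plain Python rather than MATLAB, and it only estimates the 3x3 homography from the 4 corner correspondences and maps single points; resampling the whole image is left out. The function names (solve_linear, homography, apply_homography) are hypothetical and not taken from the project code.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src, dst):
    """Estimate H (with H[2][2] fixed to 1) from 4 point pairs src -> dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, x, y):
    """Map one point through H, dividing by the projective coordinate."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, passing the 4 detected page corners as src and the corners of an upright rectangle as dst gives a matrix that sends each detected corner to its rectangle corner.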

Reading the Note Sheet …

Staff Lines

1) Detection: The first step in processing an input image is to detect the individual staff lines of the piece of music. We computed the Y-projection of the image and then found its peaks using findpeaks with MinPeakHeight=image_width/3.

2) Parameter Extraction: Given the staff line locations from the previous step, we calculated the gap between the staff lines, g, as the median of the distances between adjacent staff lines. We also calculated the staff line thickness, t, as the median of the peak widths from the previous step.

3) Removal: We removed the staff lines using the staff line locations and the parameters g and t. We scanned along the rows of the staff lines, removing a black pixel only if there was no black pixel above or below it.
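
The three staff line steps above can be sketched as follows. This is a simplified plain-Python illustration, not the project's MATLAB code. It assumes a binary image stored as a list of rows with 1 for black pixels, and it replaces findpeaks with a simple search for runs of rows whose projection exceeds image_width/3. All function names are hypothetical.

```python
from statistics import median


def y_projection(img):
    """Number of black pixels in each row."""
    return [sum(row) for row in img]


def find_staff_lines(img):
    """Return (first_row, last_row) for each run of rows above width/3."""
    threshold = len(img[0]) / 3
    runs, start = [], None
    for y, value in enumerate(y_projection(img) + [0]):  # sentinel ends a run
        if value > threshold and start is None:
            start = y
        elif value <= threshold and start is not None:
            runs.append((start, y - 1))
            start = None
    return runs


def staff_parameters(runs):
    """Gap g = median distance between line centers; t = median thickness."""
    centers = [(a + b) / 2 for a, b in runs]
    g = median(c2 - c1 for c1, c2 in zip(centers, centers[1:]))
    t = median(b - a + 1 for a, b in runs)
    return g, t


def remove_staff_lines(img, runs):
    """Clear a staff line column unless ink touches it from above or below."""
    out = [row[:] for row in img]
    height = len(img)
    for top, bottom in runs:
        for x in range(len(img[0])):
            above = top > 0 and img[top - 1][x]
            below = bottom < height - 1 and img[bottom + 1][x]
            if not above and not below:
                for y in range(top, bottom + 1):
                    out[y][x] = 0
    return out
```

Because g is taken as a median, the larger distance between two neighbouring staves does not distort it as long as most gaps are within a staff.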

Segmentation

1) Staff Segmentation: We divided the image into horizontal strips, one strip per staff (5 lines).

2) Note Segmentation: Within each staff strip, vertical strips were identified as note group segments. We computed the X-projection and then segmented it into groups using the thresholds MIN_WIDTH_THRESHOLD=g*2 and MIN_HEIGHT_SUM_THRESHOLD=t*2.
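
A rough plain-Python sketch of both segmentation steps is given below. The report does not spell out how the two thresholds are applied, so this sketch makes an assumption: a column counts as occupied when its X-projection reaches min_height_sum, and a run of occupied columns is kept when it is at least min_width wide. The function names are hypothetical, and the strip is assumed to be a binary image (list of rows, 1 for black) with staff lines already removed.

```python
def group_staves(line_rows, lines_per_staff=5):
    """Split the detected staff line rows into consecutive groups of 5."""
    last_start = len(line_rows) - lines_per_staff
    return [line_rows[i:i + lines_per_staff]
            for i in range(0, last_start + 1, lines_per_staff)]


def x_projection(strip):
    """Number of black pixels in each column of a staff strip."""
    return [sum(column) for column in zip(*strip)]


def note_segments(strip, min_width, min_height_sum):
    """Return (first_col, last_col) for each sufficiently wide ink run.

    Assumes min_height_sum is positive so the trailing sentinel closes a run.
    """
    segments, start = [], None
    for x, value in enumerate(x_projection(strip) + [0]):
        if value >= min_height_sum and start is None:
            start = x
        elif value < min_height_sum and start is not None:
            if x - start >= min_width:
                segments.append((start, x - 1))
            start = None
    return segments
```

With the values from the text, the call would be note_segments(strip, min_width=2 * g, min_height_sum=2 * t).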

Clef Detection

We identified the clef using template matching (MATLAB function normxcorr2). We assumed there is a single clef at the beginning of the staff.
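
To illustrate the idea behind template matching, here is a small plain-Python sketch of normalized cross-correlation. It is not a reimplementation of normxcorr2: it only scores positions where the template fits entirely inside the image, and it is far slower. The names ncc and best_match are hypothetical.

```python
from math import sqrt


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2-D lists."""
    p = [v for row in patch for v in row]
    q = [v for row in template for v in row]
    mean_p, mean_q = sum(p) / len(p), sum(q) / len(q)
    num = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    den = sqrt(sum((a - mean_p) ** 2 for a in p) *
               sum((b - mean_q) ** 2 for b in q))
    return num / den if den else 0.0  # flat patches get a neutral score


def best_match(image, template):
    """Slide the template over the image; return (best score, (row, col))."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_score, best_pos
```

One way to pick the clef under this scheme would be to run best_match once per clef template over the start of the staff and keep the template with the highest score.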

Note Head Detection

Once the note segments had been identified, we could locate the coordinates of the note heads. A simple erosion would be the natural way to do this, but on its own it would not detect half or whole notes, whose heads are hollow. Therefore, we first filled holes with a radius smaller than g/2. At this point, we could detect quarter, half, and whole note heads by performing erosion with a disk structuring element. Since we expected the note heads to have a diameter of approximately the staff line gap g, we chose a radius of 0.75*g/2 for the structuring element.
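
The erosion step can be sketched in plain Python as below. The hole-filling step is not shown, and the image is assumed to be a list of rows with 1 for black pixels. The function names are hypothetical. Pixels outside the image are treated as white, so ink touching the border is eroded away.

```python
def disk_offsets(radius):
    """Integer (dy, dx) offsets that lie inside a disk of the given radius."""
    r = int(radius)
    return [(dy, dx)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= radius * radius]


def erode(img, radius):
    """Keep a pixel only if the whole disk around it is black."""
    h, w = len(img), len(img[0])
    offsets = disk_offsets(radius)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy, dx in offsets))
    return out
```

With the radius from the text, the call would be erode(filled_image, 0.75 * g / 2). Thin structures such as stems vanish, while blobs about the size of a note head leave a small region around their center.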

Note Identification

Given the eroded regions from the previous stage of the processing pipeline, we next classified the type, octave, and pitch of each note.

Note type

We currently support only 8th, quarter, half, and whole notes.

First we tried to detect 8th notes. An 8th note consists of 2 regions with centroids centroid1 and centroid2 (sorted by y), where centroid2.y-centroid1.y ~ g*2 and the horizontal distance |centroid2.x-centroid1.x| ~ g.

We then detected quarter, half, and whole notes. We first determined whether the region around the note head was filled or empty. To do this, we counted the number of filled pixels in a small circular region surrounding the centroid in the original image (before erosion and small-region filling) and thresholded the count to decide whether the note head was full or empty. If the note head was full, we classified the note as a quarter note. If the note head was empty, we looked at the X-projection of the note for the presence of a note stem, whose height we observed to be at least 2.5*g. If we found a peak of this height or greater in the X-projection, we assumed the presence of a stem and classified the note as a half note. Otherwise, we classified the note as a whole note.
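
The decision rules above can be summarised in a short plain-Python sketch. The tolerance tol and the fill threshold fill_threshold are hypothetical values chosen for illustration; the report does not state the ones actually used. Centroids are (x, y) pairs and the image is a list of rows with 1 for black pixels.

```python
def is_eighth_pair(c1, c2, g, tol=0.5):
    """Two centroids about 2*g apart vertically and about g horizontally."""
    (x1, y1), (x2, y2) = sorted((c1, c2), key=lambda c: c[1])
    vertical_ok = abs((y2 - y1) - 2 * g) <= tol * g
    horizontal_ok = abs(abs(x2 - x1) - g) <= tol * g
    return vertical_ok and horizontal_ok


def fill_ratio(img, cy, cx, radius):
    """Fraction of black pixels inside a circle around the centroid."""
    inside = filled = 0
    for y in range(len(img)):
        for x in range(len(img[0])):
            if (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                inside += 1
                filled += img[y][x]
    return filled / inside if inside else 0.0


def classify_head(ratio, stem_height, g, fill_threshold=0.5):
    """Full head: quarter. Empty head: half if a stem is found, else whole."""
    if ratio >= fill_threshold:
        return "quarter"
    if stem_height >= 2.5 * g:
        return "half"
    return "whole"
```

Here stem_height stands for the tallest peak in the X-projection of the note segment.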

Octave and pitch

To determine the octave and pitch, we took the centroid of the region found previously and compared it with the staff line locations in the original image, rounding its vertical position to the nearest multiple of g/2 (half the staff line gap). This position was then converted directly into an octave and pitch.
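
As an illustration, the rounding to steps of g/2 can be sketched in plain Python. The sketch assumes a treble clef, where the bottom staff line is E4 (written e' in LilyPond), and it returns the pitch in LilyPond's absolute notation. The function name and parameters are hypothetical.

```python
NAMES = "cdefgab"  # diatonic note names, starting from C


def pitch_from_y(cy, bottom_line_y, g):
    """Convert a centroid row into a LilyPond pitch name (treble clef)."""
    # Each step of g/2 upward in the image is one diatonic step higher.
    steps = round((bottom_line_y - cy) / (g / 2))
    n = 2 + steps                      # index relative to middle C; E4 is 2
    octave_marks = 1 + n // 7          # c' is middle C in LilyPond
    if octave_marks > 0:
        suffix = "'" * octave_marks
    else:
        suffix = "," * -octave_marks
    return NAMES[n % 7] + suffix
```

For instance, with g = 10 and the bottom line at row 100, a centroid at row 90 lies on the second line and maps to g'.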

MIDI Synthesis

Finally, given the list of notes with their durations and pitches, we were able to generate a MIDI file for the decoded sheet music. We used the LilyPond program to generate it.
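
A minimal plain-Python sketch of turning the note list into LilyPond input is shown below. It mirrors the layout of the LilyPond file in Figure 7, but the function names and the (pitch, note type) tuple format are assumptions made for illustration.

```python
# LilyPond duration suffixes for the supported note types.
DURATIONS = {"whole": 1, "half": 2, "quarter": 4, "eighth": 8}


def staff_to_lilypond(notes):
    """Render one staff from a list of (pitch, note_type) tuples."""
    tokens = " ".join(f"{pitch}{DURATIONS[kind]}" for pitch, kind in notes)
    return "\\new Staff { \\easyHeadsOn \\clef treble " + tokens + " }"


def score_to_lilypond(staves):
    """Wrap all staves in a score with MIDI and layout output blocks."""
    body = "\n".join(staff_to_lilypond(staff) for staff in staves)
    return "\\score {<<\n" + body + "\n>>\\midi {}\n\\layout {}\n}\n"
```

The resulting text can be saved to a .ly file and passed to LilyPond; the \midi block requests MIDI output, and the --png option requests a PNG rendering.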


Figure 1: Original Image

Figure 2: Aligned Image

Figure 3: Staff Line Detection

Interface

We used a client-server model.

1) On the client side, the user chooses an input image (from the gallery or the camera).

2) The input image is sent to the server.

3) The server returns a zip file containing the processed result, which consists of:

a. A MIDI file, output by the LilyPond program

b. A PNG image file, output by the LilyPond program

4) The client shows the PNG result and plays the MIDI file.
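
Step 3 of the exchange above, packaging the two LilyPond outputs into one zip file, might look like the following plain-Python sketch. The archive member names result.midi and result.png and both function names are hypothetical; the report does not describe the actual server code.

```python
import io
import zipfile


def pack_results(midi_bytes, png_bytes):
    """Server side: bundle the MIDI and PNG outputs into one zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("result.midi", midi_bytes)
        archive.writestr("result.png", png_bytes)
    return buffer.getvalue()


def unpack_results(zip_bytes):
    """Client side: read the archive back into a name -> bytes mapping."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

Building the archive in memory avoids temporary files on the server, at the cost of holding both outputs in RAM, which is acceptable for files of this size.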

Visual Results


Figure 4: Staff Segmentation

Figure 5: Staff Lines Removal and Notes Segmentation

Figure 6: Notes Identification


Limitations of the solution:

We currently support only 8th, quarter, half, and whole notes.

When the image is not aligned correctly, the staff lines come out jagged, and the algorithm then behaves poorly in the staff line removal phase.

References

https://stacks.stanford.edu/file/druid:yj296hj2790/Khan_Ng_Mobile_Sheet_Music_Player.pdf

http://www.music.mcgill.ca/~ich/research/papers/dalitz08comparative.pdf

\score {<<
  \new Staff { \easyHeadsOn
    \clef treble e'4 e'4 e'2 e'4 e'4 e'2 e'4
    g'4 c'4 d'8 }
  \new Staff { \easyHeadsOn
    \clef treble e'1 f'4 f'4 f'4 f'8 f'4 e'4
    e'4 e'4 }
  \new Staff { \easyHeadsOn
    \clef treble e'4 d'4 d'4 e'4 d'2 g'2 e'4
    e'4 e'2 e'4 e'4 e'2 }
  \new Staff { \easyHeadsOn
    \clef treble e'4 g'4 c'4 d'8 e'1 f'4 f'4
    f'4 f'4 }
  \new Staff { \easyHeadsOn
    \clef treble f'4 e'4 e'4 e'4 g'4 g'4 f'4
    d'4 c'1 }
>>\midi {}
\layout {}
}

Figure 7: LilyPond file