
M532 Final Project Write-Up

Mark Blumstein

1 Introduction

This project concerns the application of geometric data analysis techniques to EEG data. The data was collected from a patient performing 4 mental tasks, for five trials of 10 seconds each. Every second yields 256 readings from the EEG machine. The mental tasks were: counting backwards, thinking about moving one’s fist, mentally rotating an object, and thinking about singing a song. Sometimes I refer to these tasks as Task A, B, C, or D respectively.

The EEG cap used has 8 channels placed on the left and right sides of the head (enumerated by an odd/even number respectively) over the frontal, central, parietal, and occipital lobes of the brain. Thus, each of the 8 channels has 50 total seconds of time series data per mental task.

Professor Chuck Anderson in the computer science department at CSU provided the data. One goal of scientists working with this data is to be able to classify what someone is thinking based on the EEG reading. To some extent this was also my goal; however, the more immediate goal of this project was not quite so lofty: I simply wanted to apply various techniques for geometric data analysis taught in class and see if anything interesting stood out.

In this not-so-rigorous spirit, I did uncover a few items worth mentioning. In Section 2, I show results using the SVD which demonstrate a low dimensional structure in the data. In Section 3, a pattern emerges using principal angles between subspaces: regardless of the task, the pair of channels 2/3 (Right Frontal/Left Central) appears “near” each other, and so does the pair 6/7 (Right Parietal/Left Occipital). In Section 5, I used the radial basis function (RBF) method to build a model which predicts the time series data. I saw an improvement in the RBF prediction when I projected the RBF approximation onto a subspace formed by a basis built using the maximum noise fraction (MNF) method (no measure was used to quantify this improvement; this is simply a qualitative observation).

One important aspect of working with this data set is deciding how to parse the data. My first thought was to consider each channel independently, and I stick with this convention throughout the analysis. Given a task and a channel, there are 50 seconds of data to work with (12,800 data samples). One way to parse the data further is to use a “time window.” For example, I might want to consider all 50 seconds of data for one channel (12,800 samples) and look at one second at a time (256 samples). This yields a 256 × 50 data matrix, with data stored down the columns. In this case, the “time window” is 1 second (256 samples).
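A minimal sketch of this windowing step (in Python/NumPy; the helper name and the synthetic stand-in data are illustrative, not taken from the original analysis):

    import numpy as np

    FS = 256            # samples per second, as described above
    TOTAL_SECONDS = 50  # 5 trials x 10 seconds per task and channel

    def window_channel(x, window_seconds=1.0, fs=FS):
        """Cut one channel's recording into non-overlapping windows.

        x is a 1-D array of length fs * TOTAL_SECONDS (12,800 samples here).
        Returns a (window_length x n_windows) matrix with one window per
        column, e.g. 256 x 50 for a 1-second window.
        """
        window_length = int(round(window_seconds * fs))
        n_windows = len(x) // window_length
        trimmed = x[: n_windows * window_length]
        return trimmed.reshape(n_windows, window_length).T

    # Synthetic data standing in for one task/channel recording:
    x = np.random.randn(FS * TOTAL_SECONDS)
    X = window_channel(x, window_seconds=1.0)   # shape (256, 50)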

2 SVD and low-dimensionality

The tables that follow detail the energy dimension¹ of the subspaces, computed with the SVD basis, as the time window is varied. Recall that the energy dimension is the dimension of the subspace required to retain 95% of the energy. By “Max Rank” is meant the maximum possible rank of the data matrix. Rows of each table are indexed by mental task (Task A through Task D).


Energy Dimension Across Tasks.

        ch.1  ch.2  ch.3  ch.4  ch.5  ch.6  ch.7  ch.8
Task A     5     5     5     5     5     5     5     5
Task B     5     5     5     5     5     5     5     5
Task C     5     5     5     5     5     5     5     5
Task D     5     5     5     5     5     5     5     5

Table 2.1: 10-second time window / Max Rank: 5

        ch.1  ch.2  ch.3  ch.4  ch.5  ch.6  ch.7  ch.8
Task A    19    18    18    20    20    19    19    20
Task B    18    17    18    20    20    19    19    20
Task C    18    18    18    20    20    19    18    20
Task D    18    17    18    19    20    19    19    20

Table 2.2: 2-second time window / Max Rank: 25

        ch.1  ch.2  ch.3  ch.4  ch.5  ch.6  ch.7  ch.8
Task A    23    21    23    26    27    24    23    27
Task B    20    19    21    24    25    22    22    24
Task C    21    20    21    24    25    22    21    24
Task D    21    20    21    23    24    21    22    24

Table 2.3: 1-second time window / Max Rank: 50

        ch.1  ch.2  ch.3  ch.4  ch.5  ch.6  ch.7  ch.8
Task A    18    16    19    22    23    18    18    25
Task B    15    14    16    19    18    15    15    18
Task C    15    14    16    19    19    15    15    17
Task D    15    14    16    17    18    14    15    17

Table 2.4: 0.5-second time window / Max Rank: 100

        ch.1  ch.2  ch.3  ch.4  ch.5  ch.6  ch.7  ch.8
Task A    12    11    13    16    16    11    12    17
Task B     9     9    10    12    11     9     9    11
Task C     9     9    10    12    12     9     9    11
Task D    10     9    10    11    11     8     9    11

Table 2.5: 0.25-second time window / Max Rank: 64
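For reference, a minimal sketch of the energy-dimension computation behind these tables, assuming energy is measured by squared singular values (the Homework 1 definition referenced in the footnote may differ in detail):

    import numpy as np

    def energy_dimension(X, energy=0.95):
        """Smallest k such that the first k singular values of X retain the
        requested fraction of the total energy (squared singular values)."""
        s = np.linalg.svd(X, compute_uv=False)
        cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
        return int(np.searchsorted(cumulative, energy) + 1)

    # e.g. for the 1-second-window matrix X built above:
    # k = energy_dimension(X)      # compare against Max Rank = min(X.shape)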


A first observation is that, with the exception of the first table, all of the tables display traces of low dimensionality. That the first table does not display any low dimensionality is simply a matter of the size of the matrix: a 10-second time window yields a data matrix with only 5 data vectors, so the rank is at most 5 and it would be extremely unlikely to find a lower dimensional representation than 5.

2.1 Plotting The SVD Basis

We may plot the SVD basis vectors for each subspace to obtain a visualization of the SVD decomposition. Each plot below shows the first two components of each SVD basis vector as an (x, y) coordinate pair. Each plot corresponds to one mental task, with the subspaces for all 8 channels plotted.

The second set of plots uses the energy dimension calculations above. I’ve only used the first n SVD vectors in these plots, where n is the minimum energy dimension across the channels for a given mental task. Notice the linearity!

¹ See Homework 1 for the formulas used to compute Shannon’s entropy and the energy dimension.
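The plots themselves are not reproduced in this text. A sketch of how such a scatter plot can be generated (the function name and the dictionary of per-channel matrices are illustrative assumptions):

    import numpy as np
    import matplotlib.pyplot as plt

    def plot_svd_basis_2d(channel_matrices, n_vectors=None, ax=None):
        """Scatter the first two entries of each left singular vector,
        one marker set per channel."""
        if ax is None:
            ax = plt.gca()
        for name, X in channel_matrices.items():
            U, _, _ = np.linalg.svd(X, full_matrices=False)
            k = n_vectors if n_vectors is not None else U.shape[1]
            ax.plot(U[0, :k], U[1, :k], "o", label=name)
        ax.set_xlabel("first component")
        ax.set_ylabel("second component")
        ax.legend()

    # Hypothetical usage for one task, truncated to the minimum energy
    # dimension n across its channels:
    # plot_svd_basis_2d({"ch.1": X1, "ch.2": X2}, n_vectors=n); plt.show()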


3 Principal angles

In this section, I used the sum of the squares of the principal angles between subspaces to get a sense of which channels are related across the varying tasks. The first set of graphics uses a time window of 0.5 seconds. In the graphics, the darker a square is, the smaller the distance. A rough comparison of tasks, then, is to see how dark the pictures are: if there are many dark squares, then there are many subspaces which are near to each other.
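A sketch of the distance used for these graphics, assuming each channel’s windowed data matrix is taken to span its subspace (scipy.linalg.subspace_angles returns the principal angles in radians):

    import numpy as np
    from scipy.linalg import subspace_angles

    def pa_distance(A, B):
        """Sum of squared principal angles between the column spaces of A and B."""
        theta = subspace_angles(A, B)
        return float(np.sum(theta ** 2))

    def distance_grid(task1_channels, task2_channels):
        """8 x 8 grid of distances between the channels of two tasks; darker
        squares in the graphics correspond to smaller entries."""
        D = np.zeros((len(task1_channels), len(task2_channels)))
        for i, A in enumerate(task1_channels):
            for j, B in enumerate(task2_channels):
                D[i, j] = pa_distance(A, B)
        return D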

One observation is that, when looking at the graphic for a task compared with itself, the off-diagonal is darker and the corners are lighter. This might make sense in terms of the data, since the off-diagonal only compares even channels to even channels and odd channels to odd channels; these subspaces therefore represent regions of the brain on the same side of the head. Further, the off-diagonal compares regions which are close to one another (e.g. channel 1 is close to channel 3, 2 is close to 4, etc.). The light yellow color of the corners also “makes sense,” as these channels are farther away from one another.


3.1 Using Low Dimensionality

I wanted to see if I could take advantage of the low dimensionality of the data to better understand the principal angle graphics above. I decided to redo the principal angle computations by projecting the data onto the “reduced” SVD basis (i.e., I only used as many basis vectors as the energy dimension dictated). The graphics below compare Task A to the other tasks. Notice that the graphics have become darker. This in and of itself is not surprising, since by projecting the data onto a lower dimensional space we obtain fewer principal angles.
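A sketch of the reduction step, assuming the energy dimensions from Section 2 are available (variable names are illustrative):

    import numpy as np
    from scipy.linalg import subspace_angles

    def reduced_basis(X, k):
        """First k left singular vectors of X, with k the energy dimension."""
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        return U[:, :k]

    # Distance between two channels A and B after reduction:
    # d = np.sum(subspace_angles(reduced_basis(A, kA), reduced_basis(B, kB)) ** 2)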


3.2 Using MNF

Using a time window of 10 seconds (i.e. each trial is taken as one data vector) revealed a pattern in the principal angle graphics. Notice the four darker squares, and the upper left corner, in each of the following principal angle graphics:

This pattern suggests a relationship between channels 2, 3, 6, and 7. To probe this a little further, I decided to compute the maximum noise fraction (MNF) basis for each of the data sets and see if this pattern was due to “noise.” The following principal angle graphics were computed by removing one “noisy” basis vector at a time, projecting the data onto the reduced basis, and then computing principal angles.

Original graphic, no MNF vectors removed.

Remove the first “noisy” basis vector and project the data.

Remove the second “noisy” basis vector and project the data.


Remove the third “noisy” basis vector and project the data.

Remove the fourth and final “noisy” basis vector and project the data.

It does appear that the “four square” pattern persists, although the top left corner seems to have faded out.
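For reference, here is a sketch of one common way to compute an MNF basis and remove noisy directions, assuming each 10-second-window data matrix stores trials as columns; the noise model (first differences in time) and the normalization are assumptions, since the report does not state which MNF variant it used:

    import numpy as np
    from scipy.linalg import eigh

    def mnf_directions(X):
        """MNF directions for a (samples x trials) matrix X, noisiest first.

        Noise covariance is estimated from first differences in time, and the
        generalized eigenproblem Sigma_noise w = lambda Sigma w is solved."""
        Xc = X - X.mean(axis=0)
        N = np.diff(Xc, axis=0)                    # crude noise estimate
        Sigma = Xc.T @ Xc / (Xc.shape[0] - 1)
        Sigma_noise = N.T @ N / (N.shape[0] - 1)
        _, W = eigh(Sigma_noise, Sigma)            # eigenvalues ascend
        return W[:, ::-1]                          # noisiest direction first

    def remove_noisy_directions(X, n_remove):
        """Zero the n_remove noisiest MNF components and map back."""
        mean = X.mean(axis=0)
        W = mnf_directions(X)
        Y = (X - mean) @ W                         # MNF component scores
        Y[:, :n_remove] = 0.0
        return Y @ np.linalg.inv(W) + mean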

4 Low Dimensional Representations

My partner for this project, Tomojit Ghosh, produced the following pictures using the Self-Organizing Mapping and Laplacian Eigenmap methods. In the SOM picture there is a bit of separation which may be useful for classification; however, he had less success with the Laplacian Eigenmap method.

Self-Organizing Mapping

Laplacian Eigenmap

5 Radial Basis Functions

Finally, I used the radial basis function method to build a model which might predict the EEG time series data. The following graphics show results of training an RBF model on 3 seconds of data, using the Gaussian kernel ϕ(r) = exp(−r²/a²). The method was highly sensitive to the value of a and to the number of centers chosen. For a more accurate model, I would need to check the literature for guidance on how to adjust the parameters. However, I did achieve some “success” just by guessing:
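A sketch of a generic RBF fit of the kind described here, with the time index as input and the EEG value as output; beyond the Gaussian kernel given above, the inputs, center-selection rule, and solver are assumptions, since the report does not specify them:

    import numpy as np

    def gaussian_rbf(r, a):
        """Gaussian kernel phi(r) = exp(-r^2 / a^2)."""
        return np.exp(-(r / a) ** 2)

    def fit_rbf(t_train, x_train, centers, a):
        """Least-squares fit of x(t) ~ sum_j w_j * phi(|t - c_j|)."""
        Phi = gaussian_rbf(np.abs(t_train[:, None] - centers[None, :]), a)
        w, *_ = np.linalg.lstsq(Phi, x_train, rcond=None)
        return w

    def predict_rbf(t_new, centers, w, a):
        Phi = gaussian_rbf(np.abs(t_new[:, None] - centers[None, :]), a)
        return Phi @ w

    # Hypothetical usage: train on the first 3 seconds of one channel x.
    # fs = 256
    # t_train = np.arange(3 * fs) / fs
    # centers = t_train[::8]              # every 8th sample as a center (a guess)
    # w = fit_rbf(t_train, x[:3 * fs], centers, a=0.05)
    # x_hat = predict_rbf(t_future, centers, w, a=0.05)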


As this last graphic shows, the prediction could be wildly inaccurate depending on the parameters and on how far out I wanted to predict. I did have the idea to take the approximation from the RBF method and then project the approximation onto the MNF basis. Surprisingly, this worked pretty well, as the following graphic shows.


Refined RBF predictor in yellow:
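For completeness, a sketch of the projection step used to refine the prediction; the report does not spell out how the RBF approximation was mapped into the MNF subspace, so this shows only a plain least-squares projection onto a retained basis:

    import numpy as np

    def project_onto_subspace(v, B):
        """Least-squares projection of v onto the column span of B.

        The columns of B (e.g. retained low-noise MNF directions, expressed in
        the same coordinates as v) need not be orthonormal."""
        coeffs, *_ = np.linalg.lstsq(B, v, rcond=None)
        return B @ coeffs

    # e.g. refined = project_onto_subspace(x_hat, B_clean)   # hypothetical names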
