Differential Analysis & FDR Correction. Correlation Analysis Steps Step 1: Construction of input data table in EXCEL Step 2: Save EXCEL file into tab

Differential Analysis & FDR Correction

Correlation Analysis Steps

Step 1: Construction of input data table in EXCEL
Step 2: Save EXCEL file into tab delimited txt file
Step 3: Upload data - tab delimited txt file
Step 4: Choose correlation algorithm
Step 5: Enter your email and submit
Step 6: Result interpretation: global FDR
Step 7: Result interpretation: local FDR

Step 1: Construction of input data table in EXCEL

Sample            Clinical parameter    Gene    Gene    Gene
patient.1.name    …                     …       …       …
patient.2.name    …                     …       …       …
…                 …                     …       …       …
…                 …                     …       …       …

Step 1:

Input data format:
• Cell A1: “sample”
• 1st column: patient names or IDs
• 1st row: column headers
  • Cell B1: name of the clinical parameter
  • Cell C1 and onward: gene names
• 2nd column: values of one clinical parameter
• All other cells should be molecular data
  • one sample/patient per row
  • e.g. array intensity or protein quantities

EXCEL file example

Step 2: Save EXCEL file into tab delimited txt file
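Before uploading, it may help to check that the saved txt file matches the layout described in Step 1. The following is a minimal sketch only, not part of the tool: the function name `check_input_table`, the gene names and all values are hypothetical, and it assumes that the clinical parameter and the molecular data are numeric and that “sample” may be written in any letter case.

```python
import csv
import io

def check_input_table(text):
    """Return a list of problems found in a tab-delimited input table."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows or len(rows[0]) < 3:
        return ["header needs at least: sample, clinical parameter, one gene"]
    header = rows[0]
    problems = []
    # Cell A1 is expected to hold the word "sample" (case is ignored here).
    if header[0].strip().lower() != "sample":
        problems.append('cell A1 should be "sample"')
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {line_no}: expected {len(header)} cells, got {len(row)}"
            )
            continue
        if not row[0].strip():
            problems.append(f"row {line_no}: missing patient name or ID")
        # Assumption: every cell after the 1st column must parse as a number.
        for col, cell in enumerate(row[1:], start=2):
            try:
                float(cell)
            except ValueError:
                problems.append(
                    f"row {line_no}, column {col}: not numeric: {cell!r}"
                )
    return problems

# Hypothetical table: one clinical parameter (age) and two genes.
example = (
    "sample\tage\tGENE1\tGENE2\n"
    "patient.1\t54\t7.1\t3.4\n"
    "patient.2\t61\t6.8\t4.0\n"
)
print(check_input_table(example))
```

An empty list means no problems were found under these assumptions; it does not guarantee that the upload will be accepted.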

Step 3: Upload data - tab delimited txt file


Step 3: Upload data - tab delimited txt file

Input data “input.cor.txt” selected

Step 4: Choose algorithm for correlation analysis

Choose correlation algorithm

Step 4: which one to choose?

• Rank based correlation: studies the relationship between different rankings on the same set of items
  • During the analysis, raw scores are converted to rankings
  • Spearman
  • Kendall

• Pearson product-moment correlation coefficient

To correlate a clinical variable to molecular data:

Spearman’s rank, Kendall tau, or Pearson product-moment correlation coefficient analysis?

Spearman’s Rank Correlation Coefficient Analysis:

Spearman rank correlation is used when you have two measurement variables and one “hidden” nominal variable, which groups the measurements into pairs. It is a non-parametric test for correlation and is used when one or both of the variables consist of ranks.

Kendall Tau Correlation Coefficient Analysis:

Kendall's tau correlation coefficient measures the strength of the relationship between two variables. It provides a distribution-free test of independence and a measure of the strength of dependence between the two variables. It requires two variables, X and Y, that are paired observations. Both variables should be at least ordinal.

Pearson Product-Moment Correlation Coefficient Analysis:

The Pearson product-moment correlation coefficient is a common measure of the correlation (linear dependence) between two variables X and Y. It is very widely used in the sciences as a measure of the strength of linear dependence between two variables, giving a value somewhere between +1 and -1 inclusive.
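To make the difference between the three coefficients concrete, here is a small self-contained sketch. It is an illustration only and not the tool's implementation, which these slides do not show. The function names and the data are hypothetical, and the Kendall variant shown is tau-a, which makes no correction for ties.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Note: a constant variable gives sxx or syy of 0 and raises ZeroDivisionError.
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Convert raw scores to 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

# Hypothetical data: the gene value rises monotonically but not linearly.
clinical = [1, 2, 3, 4, 5]
gene = [1, 4, 9, 16, 25]
print(pearson(clinical, gene), spearman(clinical, gene), kendall_tau_a(clinical, gene))
```

On this data the two rank based coefficients equal 1 because the ordering of the samples is identical in both variables, while the Pearson coefficient is slightly below 1 because the relationship is not linear.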

Step 5: Enter your email and submit

Enter your email

Submit

Step 6: Result interpretation: Global FDR

Single hypothesis test = correlation between one gene and one clinical variable


FDR plot red line: Total Discoveries (TD) or Total Discovery rate = 1

FDR plot green line: False Discoveries (MEAN) or False Discovery Rate FDR (MEAN)

FDR plot black bar line: False Discoveries (MEDIAN) or False Discovery Rate FDR (MEDIAN)

FDR plot blue line: False Discoveries (95%) or False Discovery Rate FDR (95%)

FDR plot dotted black line: FDR=0.05
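The legend above describes ratios of false discoveries (FD) to total discoveries (TD) at a given P-value threshold. The sketch below shows one way such ratios could be computed. It rests on assumptions that the slides do not confirm: that FD is counted in several sets of null P-values (for example obtained after shuffling the clinical parameter) and that the mean, median and 95th percentile are taken across those sets. The function name `global_fdr` and all numbers are hypothetical.

```python
import math
import statistics

def global_fdr(observed_p, null_p_sets, threshold):
    """Estimate FD/TD at one P-value threshold.

    observed_p  : P-values from the real data, one per feature
    null_p_sets : list of P-value lists, each from one null data set (assumed)
    """
    td = sum(p <= threshold for p in observed_p)  # total discoveries
    if td == 0:
        return None  # nothing passes the threshold, so the ratio is undefined
    # False discoveries: features passing the threshold in each null set.
    fd_counts = sorted(sum(p <= threshold for p in nulls) for nulls in null_p_sets)
    idx = min(len(fd_counts) - 1, math.ceil(0.95 * len(fd_counts)) - 1)
    return {
        "TD": td,
        "mean": min(1.0, statistics.mean(fd_counts) / td),
        "median": min(1.0, statistics.median(fd_counts) / td),
        "95%": min(1.0, fd_counts[idx] / td),
    }

# Hypothetical P-values for 8 features and 4 null sets.
observed_p = [0.001, 0.002, 0.004, 0.03, 0.2, 0.5, 0.7, 0.9]
null_p_sets = [
    [0.04, 0.30, 0.55, 0.12, 0.80, 0.65, 0.25, 0.95],
    [0.20, 0.45, 0.07, 0.60, 0.33, 0.90, 0.15, 0.75],
    [0.008, 0.50, 0.22, 0.70, 0.41, 0.85, 0.09, 0.66],
    [0.35, 0.10, 0.58, 0.27, 0.92, 0.48, 0.73, 0.19],
]
print(global_fdr(observed_p, null_p_sets, threshold=0.01))
```

Repeating the call over a range of thresholds would give curves comparable in spirit to the mean, median and 95% lines, with each ratio capped at 1 (TD/TD).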

[Figure: global FDR plot. Y-axis: Global FDR. X-axis: single hypothesis test P-value thresholds. Curves: mean FD/TD, median FD/TD and 95% FD/TD. Reference lines: FDR = 0.05 and 1 = TD/TD.]


Step 6: How to read the gFDR plots

• Commonly used global FDR cutoff: 0.05

• If there are no significant features, no data points will show up below the 0.05 dotted horizontal line




[Figure: global FDR plot. Y-axis: Global FDR. Annotations: features which satisfy global FDR < 0.05; commonly used gFDR cutoff: 0.05.]



[Figure: global FDR plots for mean FD/TD and median FD/TD. Y-axis: Global FDR. Annotations: features which satisfy global FDR < 0.05; commonly used gFDR cutoff: 0.05.]

Step 7: Result interpretation: local FDR

[Figure: local FDR plot. Y-axis: Local FDR. X-axis: single hypothesis test P-value.]

Step 7: How to read the lFDR plots

It has been suggested (Aubert et al., 2004) that the first abrupt change in the local FDR indicates a good threshold for selecting genuinely statistically significant features.
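One simple way to look for such a change programmatically is sketched below. This is only one possible reading of "first abrupt change" and is not taken from Aubert et al. or from the tool: features are ordered by P-value, and the first increase in local FDR between neighbours that reaches a user-chosen size is treated as the change point. The function name, the `min_jump` parameter and the data are hypothetical.

```python
def first_abrupt_change(rows, min_jump=0.1):
    """Return the features that come before the first large jump in local FDR.

    rows     : list of (feature name, P value, local FDR) tuples
    min_jump : smallest increase between neighbours counted as abrupt (assumed)
    Returns None when no jump of that size is found.
    """
    ordered = sorted(rows, key=lambda r: r[1])  # sort by P value
    for i in range(1, len(ordered)):
        if ordered[i][2] - ordered[i - 1][2] >= min_jump:
            return ordered[:i]
    return None

# Hypothetical results for five features.
rows = [
    ("GENE1", 0.0001, 0.01),
    ("GENE2", 0.0004, 0.02),
    ("GENE3", 0.002, 0.03),
    ("GENE4", 0.01, 0.35),
    ("GENE5", 0.04, 0.60),
]
selected = first_abrupt_change(rows)
print([name for name, _, _ in selected])
```

The value of `min_jump` is a judgment call, so the result should be checked against the lFDR plot rather than used on its own.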


[Figure: local FDR plot. Y-axis: Local FDR. X-axis: single hypothesis test P-value. Annotation: 1st abrupt change of lFDR.]


Click to download result file


Local FDR results:
• 1st column: feature name
• 2nd column: correlation test P value
• 3rd column: local FDR results
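The three columns above can be loaded for further filtering with a few lines of code. The sketch below assumes that the downloaded result file is tab-delimited and starts with one header row; neither detail is stated in these slides, so adjust `delimiter` and `has_header` to match the actual file. The function name, column headers and values are hypothetical.

```python
import csv
import io

def read_lfdr_results(text, delimiter="\t", has_header=True):
    """Parse result rows into (feature name, P value, local FDR) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the assumed header row
    results = []
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or incomplete lines
        results.append((row[0], float(row[1]), float(row[2])))
    return results

# Hypothetical file content.
example_results = (
    "feature\tpvalue\tlfdr\n"
    "GENE1\t0.0001\t0.01\n"
    "GENE2\t0.03\t0.40\n"
)
parsed = read_lfdr_results(example_results)
print(parsed)
```

For a file on disk, the same function can be applied to the text returned by reading the file.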
