Introduction to JMP for Statistics

Jeff Skinner, M.S.
Biostatistics Specialist
Bioinformatics and Computational Biosciences Branch (BCBB)
NIH/NIAID/OD/OSMO/OCICB
http://bioinformatics.niaid.nih.gov
ScienceApps@niaid.nih.gov


JMP Statistical Discovery

• NIAID researchers can request JMP 8.0 for Mac or Windows PC at http://bioinformatics.niaid.nih.gov

• Desirable combination of features:
  • Spreadsheet data tables with point-and-click user interface
  • Deep statistical modeling capabilities
  • Editable and interactive high quality graphics
  • “Dynamic variable linking” feature helps you explore your data
  • Comprehensive Design of Experiments (DOE) platform
  • Free NIAID training and support: ScienceApps@niaid.nih.gov

JMP Starter Window

• JMP Starter Window provides shortcuts for useful functions:
  • Opening data tables, scripts, journals or projects
  • Opening analysis platforms like Design of Experiments, modeling, etc.
  • Setting user preferences for file locations, displayed output, etc.

• Tip of the Day introduces JMP features from tutorials and help menu

Five Types of JMP Files

• JMP Data Tables
  • Spreadsheet files that store data, scripts, etc.

• JMP Scripts
  • Text files that store JMP Scripting Language (JSL) code

• JMP Reports
  • Interactive graphs and results from a statistical test

• JMP Journals
  • Publishing files that store read-only data tables, URL links, descriptive figures, graphs, statistical test results, etc.

• JMP Projects
  • Collection of data tables, scripts, journals and reports

Preferences in JMP

• General preference settings
  • Show JMP Starter Window and Tip of the Day at startup

• Platforms preferences
  • Customize the output from any statistical analysis or graph

• File location preferences
  • Choose default locations for opening and saving files in Windows

Importing Data

• MS Excel® worksheets can be imported directly
  • Options for first row header, multiple sheets, etc.
  • Remember to format your data in MS Excel® first

• Use Text Import Preview option to import text files
  • Options for delimited and fixed width files
  • Preview your data columns before import to save time
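
To illustrate the difference between delimited and fixed width text files outside of JMP, here is a minimal Python sketch. The sample data, column names and column positions are made up for illustration, and this is not how JMP's Text Import Preview is implemented.

```python
import csv
import io

# Hypothetical sample data: the same two records in both layouts.
delimited_text = "subject,sex,height\nA01,male,70\nA02,female,64\n"
fixed_width_text = "A01male  70\nA02female64\n"


def read_delimited(text, delimiter=","):
    """Parse delimited text; the first row is treated as the header."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_fixed_width(text, fields):
    """Parse fixed width text; fields maps a column name to (start, end)."""
    rows = []
    for line in text.splitlines():
        rows.append({name: line[a:b].strip() for name, (a, b) in fields.items()})
    return rows


# Assumed column positions for the fixed width layout above.
fields = {"subject": (0, 3), "sex": (3, 9), "height": (9, 11)}

delimited_rows = read_delimited(delimited_text)
fixed_rows = read_fixed_width(fixed_width_text, fields)
print(delimited_rows[0])
print(fixed_rows[0])
```

Both readers return every value as text, which is one reason previewing the columns before import is useful: the column types still have to be chosen.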

Importing from a Database

• Windows PC users can open dBASE and MS Access files directly from JMP
  • ODBC drivers are required for Mac users and other databases

• Open subsets of the database files using click-through menus or specific SQL statements
  • Query specific columns or even select rows of data matching a set of conditions (e.g. sex = male or height < 68 inches)
  • Click-through windows generate SQL statements automatically

Tip: MS Excel files can be opened as database files to import subsets of large files, but subsets of smaller files are best created within JMP
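
The kind of SQL statement described above can be sketched with Python's built-in sqlite3 module. The table name, column names and data below are hypothetical, and an in-memory SQLite database stands in for a dBASE, MS Access or ODBC source.

```python
import sqlite3

# In-memory stand-in for a database file; all names and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subjects (id INTEGER, sex TEXT, height REAL, weight REAL)")
conn.executemany(
    "INSERT INTO subjects VALUES (?, ?, ?, ?)",
    [
        (1, "male", 70.0, 180.0),
        (2, "female", 64.0, 130.0),
        (3, "female", 69.0, 150.0),
        (4, "male", 66.0, 160.0),
    ],
)

# Query specific columns, keeping only rows where sex = male or height < 68.
query = "SELECT id, sex, height FROM subjects WHERE sex = ? OR height < ? ORDER BY id"
subset = conn.execute(query, ("male", 68)).fetchall()
conn.close()
print(subset)
```

Only three of the four columns and three of the four rows are returned, which is the point of querying a subset instead of importing the whole table.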

JMP Data Tables

• Table, Columns and Rows menus on the left hand side of the data table

• Three places to access Columns and Rows menus:
  • Columns and Rows windows (left hand side of the data table)
  • Spreadsheet hotspots (upper left corner of spreadsheet)
  • Drop-down menus in JMP tool bar (top of JMP window)

JMP “Hotspots”

• JMP “Hotspots” are clickable red triangles that provide access to additional features in JMP

• Always check the hotspots in your test reports to find additional tests and figures for your analyses

• JMP hotspots make tests more interactive and customizable

Variable Types in JMP

• Three types of variables in JMP:
  • Continuous (blue triangle symbols)
  • Ordinal (ordered green bar symbols)
  • Nominal (unordered red bar symbols)

• Character or text variables can only be ordinal or nominal variables in JMP

• Numeric variables can be any of the three types

• Change variable types in the column info menu or by clicking symbols in the column window (shown right)
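
The rule above can be written as a small Python sketch. The function name and the test for "numeric" are assumptions made for illustration, not part of JMP.

```python
def allowed_modeling_types(values):
    """Return the variable types the rule above permits for a column."""
    # A column counts as numeric here only if every value is an int or float.
    # bool is excluded on purpose, since Python treats it as a kind of int.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if numeric:
        return ["continuous", "ordinal", "nominal"]
    return ["ordinal", "nominal"]


print(allowed_modeling_types([64.0, 70.5, 66.0]))
print(allowed_modeling_types(["male", "female", "female"]))
```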

JMP Columns (Cols) Menu

• Use the column menu to add, delete, select (go to) or reorder columns in the data table

• The Column Info menu allows you to define the general attributes of your variables

• The Preselect Role menu allows you to specify variable attributes within the analyses
  • E.g. ___ is always a response

JMP Columns (Cols) Menu

• Label / Unlabel option produces and removes special labeling in figures for selected columns (tag symbol)

• Hide / Unhide removes columns from view in the data table, but does not remove them from analyses (blindfold symbol)

• Exclude / Unexclude removes certain columns from analyses, but not from the data table (strike through symbol)

• Scroll Lock / Unlock secures the locations of the columns in the data table

JMP Columns (Cols) Menu

• The Validation window allows you to verify the values recorded in your data table
  • List Check produces a list of all text responses
  • Range Check reports the min and max for numeric variables, with a list of possible constraints to remove outliers

• The Recode window allows you to recode text and numeric variables quickly and automatically

• The Formula window allows you to create transformed variables (e.g. ln(x)) and generate random data from theoretical distributions

• The Standardize Attributes window allows you to specify identical properties among many columns
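
As a rough illustration of what the two validation checks report, here is a Python sketch that follows the descriptions above. The function names, the sample columns and the limits of 48 and 84 inches are all assumptions, and JMP's own List Check and Range Check may behave differently.

```python
def list_check(values):
    """Return the distinct text responses in a column, sorted."""
    return sorted(set(values))


def range_check(values, low=None, high=None):
    """Return (min, max, flagged); flagged holds values outside [low, high]."""
    flagged = [
        v
        for v in values
        if (low is not None and v < low) or (high is not None and v > high)
    ]
    return min(values), max(values), flagged


# Hypothetical columns containing typical data-entry problems.
gender = ["female", "male", "Female", "male", "femal"]
height = [64.0, 70.0, 66.5, 6.8, 71.0]

print(list_check(gender))
print(range_check(height, low=48, high=84))
```

Listing the distinct responses exposes inconsistent spellings such as "Female" and "femal", and the range limits flag the height of 6.8 as a likely typing error.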

JMP Rows Menu

• Add, delete, move and select rows

• Exclude/unexclude, hide/unhide and label/unlabel options same as Cols

• Use Colors and Markers options to identify selected rows in a graph

• Powerful data management with Row Editor, Data Filter and Row Selection

Select Where … Option

• Click > Rows > Row Selection > Select Where … to open the Select Where window in JMP

• Select rows meeting multiple user-specified conditions
  • Character variable conditions, e.g. Gender = “female”
  • Numeric variable inequalities, e.g. 20% < Percent Body Fat < 45%
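
The logic of selecting rows on several conditions can be sketched in Python as follows. The data table, the function name and the match_all switch are hypothetical and only illustrate the idea of combining a character condition with a numeric inequality.

```python
# Hypothetical data table stored as a list of rows.
rows = [
    {"Name": "A", "Gender": "female", "Percent Body Fat": 18.0},
    {"Name": "B", "Gender": "female", "Percent Body Fat": 27.5},
    {"Name": "C", "Gender": "male", "Percent Body Fat": 50.0},
    {"Name": "D", "Gender": "female", "Percent Body Fat": 44.9},
    {"Name": "E", "Gender": "female", "Percent Body Fat": 45.0},
]


def select_where(rows, conditions, match_all=True):
    """Return the indices of rows meeting all (or any) of the conditions."""
    combine = all if match_all else any
    return [i for i, row in enumerate(rows) if combine(c(row) for c in conditions)]


conditions = [
    lambda r: r["Gender"] == "female",          # character variable condition
    lambda r: 20 < r["Percent Body Fat"] < 45,  # numeric variable inequality
]

selected = select_where(rows, conditions)
print(selected)
```

Because both inequalities are strict, the row with exactly 45% body fat is not selected.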

JMP Tables Menu

• Summary option creates table of column statistics
  • E.g. mean age, etc.

• Subset option creates a new table from selected rows and columns

• Sort by column values

• Stack and Split columns to rearrange data tables

• Transpose rows and columns to rotate tables

• Concatenate to join two tables top-to-bottom if they share columns

• Join tables with unique columns side-by-side

• Update missing values from a new table

• Tabulate table stats using drag and drop
  • E.g. mean age, etc.

• Missing Data Pattern finds sampling errors
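
Stacking and splitting are easiest to see on a tiny example. The Python sketch below rearranges a made-up table from wide to long form and back; the function names, the column names "Label" and "Data", and the data are assumptions for illustration rather than JMP's implementation.

```python
def stack(rows, id_col, value_cols, label="Label", data="Data"):
    """Wide to long: one output row per (input row, stacked column)."""
    return [
        {id_col: r[id_col], label: c, data: r[c]} for r in rows for c in value_cols
    ]


def split(rows, id_col, label="Label", data="Data"):
    """Long to wide: one output row per id, one column per label value."""
    wide = {}
    for r in rows:
        wide.setdefault(r[id_col], {id_col: r[id_col]})[r[label]] = r[data]
    return list(wide.values())


# Hypothetical weights recorded at two visits, one column per visit.
wide_table = [
    {"subject": "A01", "week0": 150, "week4": 146},
    {"subject": "A02", "week0": 132, "week4": 135},
]

long_table = stack(wide_table, "subject", ["week0", "week4"])
for row in long_table:
    print(row)
```

Splitting the stacked table on the label column returns the original wide layout, so the two operations undo each other.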

JMP Toolbar Options

• Arrow tool
  • Regular cursor arrow

• Help tool
  • Click any JMP object and be directed to its help menus

• Selection tool
  • Select elements of a report or journal to copy and paste

• Scroller tool
  • Precisely scroll in your report

• Grabber tool
  • Grab and drag axes or other features of a figure

• Brush tool
  • Highlight points on a plot

• Lasso tool
  • Select an irregularly-shaped group of points on a graph

• Magnifier tool
  • Zoom in or zoom out

• Crosshairs tool
  • Identify the location of points in a figure or plot

• Annotate tool
  • Add notes or captions to a figure or report

• Line tool, Polygon tool and Simple Shape tool
  • Draw lines, polygons and simple shapes on reports

JMP Graphs Menu

• Use Graph Builder for drag-and-drop

• Use Chart for pie charts and bar charts

• Use Overlay and Scatterplot 3D to create customizable scatter plots

• Animate a Bubble Plot over time

• Use Cell Plot to create “heat maps”

• Use Tree Map to view relationships among categorical variables

[Figure: Bubble Plot of Weight by Calories Sized by Percent Body Fat; x-axis: Calories, y-axis: Weight]

[Figure: Tree Map of Region, Adverse Event; regions: Midwest, Northeast, Northwest, Southeast, Southwest]

JMP Analyze Menu

• Distribution procedure
  • Collect descriptive statistics, create histograms and run hypothesis tests on individual variables

• Fit Y by X procedure
  • Automatically fits the appropriate relationship between two variables
  • Simple linear regression, one-way ANOVA, logistic regression and chi-square tests for table data

• Matched pairs procedure
  • Paired t-tests and related figures

• Fit model procedure
  • Used to fit more complicated models with multiple factors, etc.
  • Includes general linear models (OLS), stepwise and multivariate procedures, generalized linear models (GLIM), etc.

• Modeling procedures
  • Nonlinear procedures for curve fitting
  • Partition for data exploration
  • Neural nets and time series models
  • Categorical platform for log-linear models and share charts (JMP 7.0)
  • Gaussian processes and screening platforms (JMP 7.0)

• Multivariate procedures
  • Correlations, PCA, clustering, discriminant analysis, item analysis and partial least squares (PLS)

• Survival and Reliability
  • Kaplan-Meier and log-rank tests
  • Parametric survival
  • Proportional hazards
  • Recurrence analysis

The Distribution Platform

• Descriptive Statistics
  • Moments: mean, variance, standard deviation, etc.
  • Quantiles: median, interquartile range (IQR), etc.

• Descriptive Figures
  • Histograms and boxplots
  • Stem and leaf, empirical cumulative distribution function (ecdf) and normal quantile-quantile plots

• One-sample Inferences
  • Hypothesis tests for means and standard deviations
  • Confidence, prediction and tolerance intervals
  • Distribution fitting with goodness-of-fit testing
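
For readers who want to see how the moments and quantiles listed above are computed, here is a short sketch using Python's standard statistics module. The data values are made up, and the quartile method used by statistics.quantiles may not match JMP's quantile definition exactly, so small differences in the IQR are possible.

```python
import statistics

# Hypothetical sample of percent body fat measurements.
percent_body_fat = [12.0, 18.5, 22.0, 25.5, 27.0, 31.0, 35.5, 41.0]

# Moments
mean = statistics.mean(percent_body_fat)
variance = statistics.variance(percent_body_fat)  # sample variance (n - 1)
std_dev = statistics.stdev(percent_body_fat)      # sample standard deviation

# Quantiles
median = statistics.median(percent_body_fat)
q1, q2, q3 = statistics.quantiles(percent_body_fat, n=4)  # quartile cut points
iqr = q3 - q1

print(f"mean={mean:.3f} variance={variance:.3f} std_dev={std_dev:.3f}")
print(f"median={median:.2f} q1={q1:.2f} q3={q3:.2f} IQR={iqr:.2f}")
```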

“Broadcasting” Your Hotspot

• Sometimes you want to perform several of the same tests all at the same time

• Two options:
  • Set your platform preferences to request special tests from the hotspot every time
  • “Broadcast” your Hotspot options by control-clicking any of the individual results

• Speed up workflow and reduce “click-thru”

JMP Script Features

• JMP scripting mostly for “power users” who want to simplify processing or create new procedures

• Scripting Window also used for “SAS Integration” to access SAS features

• Two helpful features for casual users
  • Save analyses to their respective data tables
  • Leave helpful notes in data tables

JMP Fit Y by X Platform

• Fit Y by X chooses the correct analysis for any pair of variables
  • Simple linear regression for continuous response and predictor variables
  • Logistic regression for categorical response and continuous predictor
  • One-way ANOVA for a continuous response and categorical predictor
  • Contingency tables for categorical response and predictor variables

• Fit Y by X often has both parametric and nonparametric tests in hotspots

[Figure: Bivariate Fit of Percent Body Fat By Calories; x-axis: Calories, y-axis: Percent Body Fat]
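
The way Fit Y by X picks an analysis from the two variable types can be summarized as a lookup table. The Python sketch below only restates the four cases listed on this slide; the function name and type labels are assumptions, and ordinal and nominal variables are both treated as "categorical" for simplicity.

```python
def fit_y_by_x_analysis(response_type, predictor_type):
    """Map (response, predictor) variable types to the analysis on the slide."""

    def kind(variable_type):
        if variable_type == "continuous":
            return "continuous"
        if variable_type in ("ordinal", "nominal"):
            return "categorical"
        raise ValueError(f"unknown variable type: {variable_type!r}")

    table = {
        ("continuous", "continuous"): "simple linear regression",
        ("categorical", "continuous"): "logistic regression",
        ("continuous", "categorical"): "one-way ANOVA",
        ("categorical", "categorical"): "contingency table",
    }
    return table[(kind(response_type), kind(predictor_type))]


# Percent Body Fat (continuous) by Calories (continuous)
print(fit_y_by_x_analysis("continuous", "continuous"))
```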

JMP Fit Model Platform

• Used to fit most models with three or more predictor variables

• Choose the response variables Y, the model effects X and the model personality to determine the model
  • JMP will help guide you by providing appropriate options for your variables
  • Model types determined by selected variables and model personality
    • Multiple regression
    • Multifactor ANOVA
    • Linear mixed models
    • Generalized linear models (GLIM)
    • Proportional hazards and parametric survival

• Contact ScienceApps@niaid.nih.gov for help building models

Construct Model Effects

• Cross two predictors to evaluate an interaction
  • E.g. synergy of fertilizer and hydration levels on plant yield

• Nest two predictors when the levels of one variable depend on the levels of another
  • E.g. auto make and model effects (e.g. Ford Mustang)

• Choose the Random effect attribute when factor levels represent a random sample from the larger population
  • E.g. Hospital and subject are random effects drawn from populations of all possible hospitals and patients
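
To make the idea of an interaction concrete, the Python sketch below uses made-up plant yields from a 2 x 2 crossed design (fertilizer by hydration). It only compares cell means; a real analysis in Fit Model would also estimate standard errors and test the crossed effect.

```python
from statistics import mean

# Hypothetical yields, keyed by (fertilizer level, hydration level).
yields = {
    ("low", "low"): [10.0, 12.0],
    ("low", "high"): [14.0, 16.0],
    ("high", "low"): [15.0, 17.0],
    ("high", "high"): [27.0, 29.0],
}

cell_means = {cell: mean(values) for cell, values in yields.items()}

# Effect of raising fertilizer, computed separately at each hydration level.
fert_effect_low_hyd = cell_means[("high", "low")] - cell_means[("low", "low")]
fert_effect_high_hyd = cell_means[("high", "high")] - cell_means[("low", "high")]

# The interaction is the difference between those two effects.
interaction = fert_effect_high_hyd - fert_effect_low_hyd

print(fert_effect_low_hyd, fert_effect_high_hyd, interaction)
```

If the interaction were zero, fertilizer would add the same amount of yield at both hydration levels. Here the fertilizer effect is larger under high hydration, which is the kind of synergy a crossed effect is meant to capture.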

Model Personalities

• Standard Least Squares used for most ANOVA and regression models with continuous response variables

• Use Stepwise personality for model selection

• Use MANOVA for multivariate response variables

• Nominal or Ordinal Logistic for categorical responses

• Use Generalized linear models to test for overdispersion effects in least squares, logistic or log-linear models

SAS Integration

• Use JMP to access procedures from a SAS license
  • E.g. NLMixed, GLIMMIX, etc.

• Click > File > SAS > Server connections to access a SAS license on the local machine or on a remote server

• Submit SAS code directly from a JSL script window

JMP DOE Menu

• Design of Experiments (DOE) Platform in JMP allows you to utilize statistics concepts while planning your experiments
  • Estimate power and necessary sample sizes
  • Plan efficient experiments intended to find important effects or optimize responses

• Designed Experiments in JMP presentations are available for any who are interested
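
As a rough idea of what a sample size calculation involves, the Python sketch below uses the common normal-approximation formula for comparing two group means, n = 2 (z_{1-α/2} + z_{power})² σ² / δ² per group. The function name and the example numbers are assumptions, and JMP's own power calculations may use a different method, so its answers can differ slightly.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference delta (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole subject


# Hypothetical planning values: detect a 5-unit difference when sigma is 10.
print(sample_size_per_group(delta=5, sigma=10))
```

Halving the difference to detect roughly quadruples the required sample size, which is why it helps to settle on a meaningful effect size before the experiment is run.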

Thank You

For questions or comments please contact:

ScienceApps@niaid.nih.gov

301.496.4455
