60
Visual Analytics for Linguistics - Day 4 Olga Scrivner Course Info tm Package Statistical Visualization Data Preparation LVS Working with Data Visual Analytics Inferential Analysis Visual Analytics for Linguistics - Day 4 Olga Scrivner

Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Embed Size (px)

Citation preview

Page 1: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Visual Analytics for Linguistics - Day 4

Olga Scrivner

Page 2: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

What You Will Learn

DAY 1 Introduction to Visual Analytics

DAY 2 Visualization Methods, Design, and Tools

DAY 3 Working with Unstructured Data

DAY 4 Working with Structured Data

DAY 5 Advanced Analytics

Page 3: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Our Materials - Web Site

http://obscrivn.wixsite.com/visualization

Page 4: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

What We Need

I Interactive Text Mining Suite

I Language Variation Suite

I Download R code

I Download files from Day 3 and Day 4

I R and Rstudio

I R libraries: tm, party, partykit

Page 5: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

What We Need

I Interactive Text Mining Suite

I Language Variation Suite

I Download R code

I Download files from Day 3 and Day 4

I R and Rstudio

I R libraries: tm, party, partykit

Page 6: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Text Mining Review

I Intro to ITMS - slides Day 3I R coding

Page 7: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Text Mining Review

1. Open RStudio2. Open R file textmining.R:

3. Set up working directory

Page 8: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Text Mining Package - tm

I Main structure - corpus

I Corpus is constructed via DirSource, VectorSource,DataframeSource

mycorpus <- Corpus(VectorSource(file.txt))

Page 9: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Preprocessing with tm

I lower case

mycorpus <- tm_map(mycorpus, tolower)

I remove punctuation

mycorpus <- tm_map(mycorpus,removePunctuation)

I remove numbers

mycorpus <- tm_map(mycorpus, removeNumbers)

I remove stopwords

mycorpus <- tm_map(mycorpus, removeWords,stopwords(’english’))

I mycorpus <- tm_map(mycorpus, stripWhitespace)

I mycorpus <- tm_map(mycorpus,PlainTextDocument)

Page 10: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

What is LVS?

Language Variation Suite

It is a Shiny web application originally designed for dataanalysis in sociolinguistic research.

It can be used for:I Processing spreadsheet data

I Visualizing data

I Analyzing means, regression, conditional trees ...(and much more)

Page 11: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Background

LVS is built in R using Shiny package:

1. R - a free programming language for statisticalcomputing and graphics

2. Shiny App - a web application framework for R

Computational power of R + Web interactivity

Page 12: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Background

http://littleactuary.github.io/blog/Web-application-framework-with-Shiny/

Page 13: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Workspace

Browser

I Chrome, Firefox, Safari - recommendable

I Explorer may cause instability issues

AccessibilityI PC, Mac, Linux

I Data files will be uploaded from any location on yourcomputer

I Smart PhoneI Data files must be on a cloud platform connected to

your phone account (e.g. dropbox)

Page 14: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Server

Since LVS is hosted on a server, Shiny idle time-out settingsmay stop application when it is left inactive (it will grey out).

Solution: Click reload and re-upload your csv file

Page 15: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Data Preparation

Important things to consider before data entry:I File format:

I Comma separated value (CSV) - faster processingI Excel format will slow processing

I Column names should not contain spacesI Permitted: non-accented characters, numbers,

underscore, hyphen, and period

I One column must contain your dependent variableI The rest of the columns contain independent variables

Page 16: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Terminology Review

a. Categorical - non-numerical data with two values

I yes - no; male - female

b. Continuous - numerical data

I duration, age, year

c. Multinomial - non-numerical data with three or morevalues

I regions, nationalities

d. Ordinal - scale: currently not supported

Page 17: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Terminology Review

a. Categorical - non-numerical data with two values

I yes - no; male - female

b. Continuous - numerical data

I duration, age, year

c. Multinomial - non-numerical data with three or morevalues

I regions, nationalities

d. Ordinal - scale: currently not supported

Page 18: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Workshop Files

1. http://obscrivn.wixsite.com/visualizationDownload file Day 4: movie_metadata.csvSimplified set from https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset

2. LVS web site: http://languagevariationsuite.com/

http://cl.indiana.edu/~obscrivn/docs/movie_metadata.csv

Page 19: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Movie Data

I BudgetI DirectorI Actor 1I Director facebook likesI Actor 1 facebook likesI GenreI Year

Page 20: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Language Variation Suite - Structure

1. Data

I Upload file, data summary, adjust data, cross tabulation

2. Visual Analysis

I Plotting, cluster classification

3. Inferential Statistics

I Modeling, regression, conditional trees, random forest

Page 21: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Language Variation Suite - Structure

1. Data

I Upload file, data summary, adjust data, cross tabulation

2. Visual Analysis

I Plotting, cluster classification

3. Inferential Statistics

I Modeling, regression, conditional trees, random forest

Page 22: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Upload File

Upload movie_metadata.csv

Page 23: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Uploaded Dataset

The data content is imported as a table and allows forsorting columns.

Page 24: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Summary

Summary provides a quantitative summary for each variable,e.g. frequency count, mean, median.

Page 25: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Data Structure

1. Total number of observations (rows)

2. Number of variables (columns)

3. Variable types

I Factor - categorical valuesI Num - numeric values (0.95, 1.05)I Int - integer values (1, 2, 3)

Page 26: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Cross Tabulation

Cross-tabulation examines the relationship between variables.

Page 27: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Cross Tabulation Plot

Page 28: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Cross Tabulation Plot

Page 29: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Adjusting Browser - Layout

Shiny pages are fluid and reactive.

Page 30: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Adjusting Browser - Layout

Page 31: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Language Variation Suite - Structure

1. Data

I Upload file, data summary, adjust data, cross tabulation

2. Visual Analysis

I Plotting, cluster classification

3. Inferential statistics

I Modeling, regression, varbrul analysis, conditional trees,random forest

Page 32: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

One Variable Plot

Page 33: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

One Variable Plot

Page 34: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Customizing Plot

Page 35: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Customizing Plot

Page 36: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Saving Plot

Page 37: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Cluster Plot

I Classification of data into sub-groups is based onpairwise similarities

I Groups are clustered in the form of a tree-likedendrogram

Page 38: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Cluster Plot

Page 39: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Cluster Plot

Group 1 Animation, Biography and Group 2 Action, Drama,Comedy

Page 40: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Inferential Statistics

Page 41: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Language Variation Suite - Structure

1. Data

I Upload file, data summary, adjust data, cross tabulation

2. Visual Analysis

I Plotting, cluster classification

3. Inferential statistics

I Modeling, regression, varbrul analysis, conditional trees,random forest

Page 42: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

How to Create a Regression Model

Step 1 Modeling - create a model with dependentand independent variables

Step 2 Regression - specify the type of regression(fixed, mixed) and type of dependent variable(binary, continuous, multinomial)

Step 3 Stepwise Regression - compare models(Log-likelihood, AIC, BIC)

Step 4 Conditional Trees - apply non-parametrictests to the model

Page 43: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Modeling

Page 44: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Regression Types

I Model

a.) Fixed effect

b.) Mixed effect - individual speaker/token variation (withingroup)

I Type of Dependent Variable

a.) Binary/categorical (only two values)

b.) Continuous (numeric)

c.) Multinomial - categorical with more than two values

Page 45: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Regression

Page 46: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Model Output

Page 47: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Model Output

Page 48: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Interpretation: Budget and Genre

I Genre Action is the reference value

I Positive coefficient - positive effect

I Negative coefficient - negative effect

http://www.free-online-calculator-use.com/scientific-notation-converter.html

Page 49: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Interpretation: Budget and Genre

I Genre Action is the reference value

I Positive coefficient - positive effect

I Negative coefficient - negative effecthttp://www.free-online-calculator-use.com/scientific-notation-converter.html

exponential notation:1.46e-7

0.000000146

Page 50: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Conditional Tree

Conditional tree: a simple non-parametric regression analysis,commonly used in social and psychological studies

I Linear regression: all information is combined linearly

I Conditional tree regression: visual splitting to captureinteraction between variables

Recursive splitting (tree branches)

Page 51: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Conditional Tree

Page 52: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Conditional Tree

Page 53: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Conditional Tree

1. Genre is the significant factor for budget2. Budget distribution is split in two groups:

I Action and Animation

I Biography, Comedy and Drama

3. Budget is significantly higher for Animation and Action

Page 54: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Random Forest

1. Variable importance for predictors

2. Robust technique with small n large p data

3. All predictors considered jointly (allows for inclusion ofcorrelated factors)

Page 55: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Random Forest

Let’s add more factors!

I Return to Modeling

I Add independent factors: director facebook likes,actor 1 facebook likes, title year

Page 56: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Random Forest

I Genre is the most important predictor for this model.I Close to zero or red-dotted line (cut off values) -

irrelevant for this model

Page 57: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Random Forest

I Genre is the most important predictor for this model.I Close to zero or red-dotted line (cut off values) -

irrelevant for this model

Page 58: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

Assignment

I Select LVS or ITMS

I Upload your own file (csv for LVS or txt/pdf for ITMS)

I Explore your data

Page 59: Visual Analytics for Linguistics - Day 4 ESSLLI - structured data

Visual Analyticsfor Linguistics -

Day 4

Olga Scrivner

Course Info

tm Package

StatisticalVisualization

Data Preparation

LVS

Working withData

Visual Analytics

InferentialAnalysis

References I

Baayen, Harald. 2008. Analyzing linguistic data: A practical introduction to statistics.Cambridge: Cambridge University Press

Gries, Stefan Th. 2015. Quantitative designs and statistical techniques. In DouglasBiber Randi Reppen (eds.), The Cambridge Handbook of English Corpus Linguistics.Cambridge: Cambridge University Press

Schnapp, Jeffrey, and Peter Presner. 2009. Digital Humanities Manifesto 2.0.

http://gifsanimados.espaciolatino.com/x_bob_esponja_8.gif

https://daniellestolt.files.wordpress.com/2013/01/are-you-ready1.gif

http://www.martijnwieling.nl/R/sheets.pdf