Data Input/Output (Session 2)
Department of Statistics, UNPAD
February 2013

Data Structures in R
• Matrix
• Vector
• Array (indexed data)
• Data Frame
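The four structures listed above can be sketched as follows. The object names (`v`, `m`, `a`, `df`) and the values are only illustrative assumptions, not part of the original slides.

```r
# Vector: a one-dimensional sequence of values of the same type
v <- c(10, 20, 30)

# Matrix: two-dimensional, filled column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)

# Array: indexed data with any number of dimensions (here 2 x 3 x 4)
a <- array(1:24, dim = c(2, 3, 4))

# Data frame: columns may hold different types
df <- data.frame(id = 1:3, label = c("a", "b", "c"))

dim(m)   # 2 3
dim(a)   # 2 3 4
str(df)  # shows one integer column and one character/factor column
```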

Data Input/Output in R

• Direct input/output (keyboard)
• Input/output from/to other formats
  - ASCII formats delimited by comma (*.csv), tab (*.txt), or space (*.dat)
  - Excel (*.xls)
  - SPSS (*.sav)
  - Minitab (*.mtw)
  - Stata (*.dta)
  - SAS
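For the delimited ASCII case, a round trip through a CSV file might look like the sketch below. The data values, the object names (`df`, `f`, `back`) and the temporary file are assumptions made for illustration. The binary formats in the list (SPSS, Stata, and so on) usually need an add-on package such as `foreign`, which is not shown here.

```r
# A small illustrative data frame (hypothetical values)
df <- data.frame(Nama = c("surip", "zul"), Usia = c(23, 34))

# Write it to a temporary comma-separated file
f <- tempfile(fileext = ".csv")
write.csv(df, f, row.names = FALSE)

# Read the file back into a new data frame
back <- read.csv(f)
back

# Tab- or space-delimited files can be read with read.table(),
# e.g. read.table(path, header = TRUE, sep = "\t")
```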

Data Frame in R

A data frame is a data array whose columns can hold different data types. data.frame() is the R function for building a data frame.

Nama <- c("surip","zul","budi","nordin")
Usia <- c(23,34,44,12)
Kelas <- c("A","B","C","D")
Domisili <- c("bdg","cjr","jkt","sby")
Siswa <- data.frame(Nama,Usia,Kelas,Domisili)
Siswa

Important Data Frame Functions
• names(Siswa)
• Siswa[,1]
• Siswa$Usia
• mean(Siswa$Usia)
• min(Siswa$Usia)
• table(Siswa$Kelas)
• table(Siswa$Kelas,Siswa$Domisili)
• i<-order(Siswa$Usia);i
• Siswa[i,]
• edit(data.frame(Siswa))
• colnames(Siswa)[1]="MyName"
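A short run of several of the functions above on the Siswa data frame is sketched here. The data frame is rebuilt so the block runs on its own; `sorted` is a name introduced only for this illustration, and the interactive edit() call is left out.

```r
Nama <- c("surip", "zul", "budi", "nordin")
Usia <- c(23, 34, 44, 12)
Kelas <- c("A", "B", "C", "D")
Domisili <- c("bdg", "cjr", "jkt", "sby")
Siswa <- data.frame(Nama, Usia, Kelas, Domisili)

names(Siswa)          # column names
mean(Siswa$Usia)      # 28.25
min(Siswa$Usia)       # 12
table(Siswa$Kelas)    # frequency of each class

# order() returns the row positions that sort Usia ascending
i <- order(Siswa$Usia)
i                     # 4 1 2 3
sorted <- Siswa[i, ]  # rows rearranged from youngest to oldest

# Rename the first column
colnames(Siswa)[1] <- "MyName"
```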

Operators in R
• Arithmetic Operators

Operator    Description
+           Addition
-           Subtraction
*           Multiplication
/           Division
^ or **     Exponentiation
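The arithmetic operators can be tried directly at the prompt. The values of `a` and `b` and the result names below are arbitrary choices for this sketch.

```r
a <- 7
b <- 2

s  <- a + b    # 9
d  <- a - b    # 5
p  <- a * b    # 14
q  <- a / b    # 3.5
e1 <- a ^ b    # 49
e2 <- a ** b   # 49, ** is accepted as a synonym for ^

# Operators work element by element on vectors
dbl <- c(1, 2, 3) * 2   # 2 4 6
```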

• Logical Operators

Operator    Description
<           less than
<=          less than or equal to
>           greater than
>=          greater than or equal to
==          exactly equal to
!=          not equal to

Example

# An example
x <- c(1:10)
x[(x>8) | (x<5)]
# yields 1 2 3 4 9 10

# How it works
x <- c(1:10)
x
1 2 3 4 5 6 7 8 9 10
x > 8
F F F F F F F F T T
x < 5
T T T T F F F F F F
x > 8 | x < 5
T T T T F F F F T T
x[c(T,T,T,T,F,F,F,F,T,T)]
1 2 3 4 9 10
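The same mechanism works with `&` (and) in place of `|` (or). The sketch below also uses which() and sum() on logical vectors; the result names are assumptions for illustration.

```r
x <- c(1:10)

# Both conditions must hold
middle <- x[(x > 3) & (x < 8)]   # 4 5 6 7

# which() gives the positions where the condition is TRUE
pos <- which(x >= 9)             # 9 10

# TRUE counts as 1, so sum() counts how many elements match
n_not5 <- sum(x != 5)            # 9
```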

Important Functions

Math Functions

Function                  Description
abs(x)                    absolute value
sqrt(x)                   square root
ceiling(x)                ceiling(3.475) is 4
floor(x)                  floor(3.475) is 3
trunc(x)                  trunc(5.99) is 5
round(x, digits=n)        round(3.475, digits=2) is 3.48
signif(x, digits=n)       signif(3.475, digits=2) is 3.5
cos(x), sin(x), tan(x)    also acos(x), cosh(x), acosh(x), etc.
log(x)                    natural logarithm
log10(x)                  common logarithm
exp(x)                    e^x
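The difference between the rounding functions is easiest to see on a negative number. This is a small sketch with arbitrarily chosen inputs; the result names are assumptions.

```r
x <- -3.7

r_abs   <- abs(x)       # 3.7
r_ceil  <- ceiling(x)   # -3, the smallest integer not less than x
r_floor <- floor(x)     # -4, the largest integer not greater than x
r_trunc <- trunc(x)     # -3, drops the fractional part toward zero

r_round  <- round(3.14159, digits = 2)   # 3.14
r_signif <- signif(123456, digits = 2)   # 120000

r_sqrt  <- sqrt(16)       # 4
r_log   <- log(exp(1))    # 1, log() is the natural logarithm
r_log10 <- log10(1000)    # 3
```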

String/Character Functions

Function
    Description

substr(x, start=n1, stop=n2)
    Extract or replace substrings in a character vector.
    x <- "abcdef"
    substr(x, 2, 4) is "bcd"
    substr(x, 2, 4) <- "22222" changes x to "a222ef"

grep(pattern, x, ignore.case=FALSE, fixed=FALSE)
    Search for pattern in x. If fixed=FALSE then pattern is a regular expression. If fixed=TRUE then pattern is a text string. Returns matching indices.
    grep("A", c("b","A","c"), fixed=TRUE) returns 2

sub(pattern, replacement, x, ignore.case=FALSE, fixed=FALSE)
    Find pattern in x and replace with replacement text. If fixed=FALSE then pattern is a regular expression. If fixed=TRUE then pattern is a text string.
    sub("\\s", ".", "Hello There") returns "Hello.There"

strsplit(x, split)
    Split the elements of character vector x at split.
    strsplit("abc", "") returns a list holding the 3-element vector "a","b","c"

paste(..., sep=" ")
    Concatenate strings, using the sep string to separate them.
    paste("x", 1:3, sep="") returns c("x1","x2","x3")
    paste("x", 1:3, sep="M") returns c("xM1","xM2","xM3")
    paste("Today is", date())

toupper(x)
    Uppercase

tolower(x)
    Lowercase
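Applied to a character vector rather than a single string, the same functions operate on every element. The vector `kota` reuses the domicile codes from the Siswa example; the name itself and the result names are assumptions for this sketch.

```r
kota <- c("bdg", "cjr", "jkt", "sby")

upper  <- toupper(kota)                   # "BDG" "CJR" "JKT" "SBY"
first2 <- substr(kota, 1, 2)              # "bd" "cj" "jk" "sb"
has_j  <- grep("j", kota)                 # 2 3, positions containing "j"
capJ   <- sub("j", "J", kota)             # "bdg" "cJr" "Jkt" "sby"
label  <- paste("kota", kota, sep = "_")  # "kota_bdg" "kota_cjr" ...

# strsplit() returns a list with one element per input string
chars <- strsplit("abc", "")[[1]]         # "a" "b" "c"
```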

Date/Time Functions

Symbol    Meaning
%d        day as a number (01-31)
%a        abbreviated weekday
%A        unabbreviated weekday
%m        month (01-12)
%b        abbreviated month
%B        unabbreviated month
%y        2-digit year
%Y        4-digit year

d <- Sys.Date()
as.numeric(format(d, format = "%Y"))
as.numeric(format(d, format = "%m"))
as.numeric(format(d, format = "%d"))

# use as.Date( ) to convert strings to dates
mydates <- as.Date(c("2007-06-22", "2004-02-13"))
# number of days between 6/22/07 and 2/13/04
days <- mydates[1] - mydates[2]

# print today's date
today <- Sys.Date()
format(today, format="%B %d %Y")
# e.g. "June 20 2007"

# convert date info in format 'mm/dd/yyyy'
strDates <- c("01/05/1965", "08/16/1975")
dates <- as.Date(strDates, "%m/%d/%Y")

# convert dates to character data
strDates <- as.character(dates)
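Date arithmetic can also be used to turn a birth date into an age. The sketch below is one possible approach, not the only one: it assumes a fixed reference date (`ref`) so the result is reproducible, and it approximates a year as 365.25 days. The names `ref`, `lahir`, `age_days` and `age_years` are introduced here for illustration.

```r
# Difference between two dates, in days
mydates <- as.Date(c("2007-06-22", "2004-02-13"))
days <- mydates[1] - mydates[2]
as.numeric(days)   # 1225

# Extract parts of a date as numbers
yr <- as.numeric(format(mydates[1], "%Y"))   # 2007
mo <- as.numeric(format(mydates[1], "%m"))   # 6

# Approximate age in whole years at an assumed reference date
ref   <- as.Date("2013-02-01")
lahir <- as.Date(c("1990-08-02", "1989-11-22"))
age_days  <- as.numeric(ref - lahir)
age_years <- floor(age_days / 365.25)        # 22 23
```

Replacing `ref` with Sys.Date() gives the age as of today.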

Basic Statistical Functions

Function
    Description

mean(x, trim=0, na.rm=FALSE)
    mean of object x
    # trimmed mean, removing any missing values and
    # 5 percent of highest and lowest scores
    mx <- mean(x, trim=.05, na.rm=TRUE)

sd(x)
    standard deviation of object x. Also look at var(x) for variance and mad(x) for median absolute deviation.

median(x)
    median

quantile(x, probs)
    quantiles, where x is the numeric vector whose quantiles are desired and probs is a numeric vector with probabilities in [0,1].
    # 30th and 84th percentiles of x
    y <- quantile(x, c(.3,.84))

range(x)
    range

sum(x)
    sum

diff(x, lag=1)
    lagged differences, with lag indicating which lag to use

min(x)
    minimum

max(x)
    maximum

scale(x, center=TRUE, scale=TRUE)
    column center or standardize a matrix.
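A quick run of these functions on a small made-up vector is sketched below. The data and the names `x`, `y`, `z` are assumptions chosen so the results are easy to check by hand.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(x)      # 5
median(x)    # 4.5
sd(x)        # sample standard deviation, sqrt(32/7)
range(x)     # 2 9
sum(x)       # 40
diff(x)      # 2 0 0 1 0 2 2

# Missing values propagate unless na.rm = TRUE is given
y <- c(x, NA)
mean(y)                 # NA
mean(y, na.rm = TRUE)   # 5

# scale() centers to mean 0 and rescales to standard deviation 1
z <- scale(x)
```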

Exercise

Name      NPM        GPA   Sex   Date of Birth
Fulan     D1G99234   3.4   L     1990-08-2
Dede      D1G99224   2.7   L     1989-11-22
Sondakh   D1G98344   2.6   P     1991-12-9
Nurdin    D1G98211   2.3   L     1989-08-2
John      D1G98833   3.5   L     1988-07-4
Lung      D1D00234   3.7   P     1991-02-25
Yaris     D1D00345   3.1   L     1987-04-24
Asep      D1G00566   2.9   L     1990-03-25
Dedi      D1C01546   2.3   L     1988-04-26
Zeni      D1A01234   2.8   P     1991-05-27
Nia       D1A01233   2.9   P     1990-08-14
Sinto     D1B02344   3.0   L     1988-09-12
Cucu      D1B02455   3.1   P     1989-03-14
Fika      D1B99008   3.4   P     1992-02-12
Neo       D1C98001   3.6   L     1989-02-11

(NPM = student ID number; Sex: L = male, P = female)

Code Book for NPM

Code    Meaning
G       Department of Statistics
C       Department of Physics
A       Department of Mathematics
B       Department of Chemistry
xx      Cohort (entry year)

Example: D1G99234 = Department of Statistics, class of 1999, serial number 234

• Compute the mean, standard deviation, and median GPA for each cohort, department, and sex.

• Compute the mean age for each cohort, department, and sex.

• Of all the departments, which one has the relatively most uniform GPA?
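One possible way to start is sketched below, using only seven rows copied from the table, so the numbers are not the answer for the full data set. The names `mhs`, `Jurusan`, `Angkatan` and `cv` are hypothetical, and reading "relatively uniform" as the smallest coefficient of variation (sd divided by mean) is an assumption, not something the exercise states.

```r
# A subset of the exercise table (not all 15 rows)
mhs <- data.frame(
  Nama = c("Fulan", "Dede", "Sondakh", "Nurdin", "John", "Zeni", "Nia"),
  NPM  = c("D1G99234", "D1G99224", "D1G98344", "D1G98211",
           "D1G98833", "D1A01234", "D1A01233"),
  IPK  = c(3.4, 2.7, 2.6, 2.3, 3.5, 2.8, 2.9),
  stringsAsFactors = FALSE
)

# Per the code book: character 3 is the department, characters 4-5 the cohort
mhs$Jurusan  <- substr(mhs$NPM, 3, 3)
mhs$Angkatan <- substr(mhs$NPM, 4, 5)

# Mean GPA for each cohort and for each department
mean_cohort <- tapply(mhs$IPK, mhs$Angkatan, mean)
mean_dept   <- tapply(mhs$IPK, mhs$Jurusan, mean)

# Coefficient of variation per department (assumed measure of uniformity)
cv <- tapply(mhs$IPK, mhs$Jurusan, sd) / mean_dept
names(which.min(cv))   # department with the smallest relative spread
```

The same tapply() pattern extends to median and sd, and to grouping by sex once that column is added.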