【Coursera】R Language

Background Material

L4

Get the current working directory (same on a Mac):
> getwd()
Read a CSV (comma-separated values) file:
> read.csv("xx.csv")
List the files and directories under the current directory:
> dir()
Change the working directory:
> setwd("F:/workspace/")
List all the objects (functions included) defined in the current workspace:
> ls()

Week1

L4

Assignment:
> x <- 5
Comments:
# hi, this is a comment

L5

5 basic "atomic" classes

character
numeric (real numbers)
integer
complex
logical (TRUE/FALSE)

The basic object is the vector. The vector() function creates one: vector(mode, length).
By default R treats every number as numeric (a real number); if you want the integer 1, type 1L.
Infinity is written Inf.
An undefined value such as 0/0 is NaN ("not a number"), which is also treated as a missing value.
R objects can have attributes:

names, dimnames
dimensions
class (e.g. numeric)
length

Attributes can be read and modified with the attributes() function.
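As a small illustration of the attributes listed above, here is a sketch of reading and setting them. The vectors `v` and `w` are made up for this example.

```r
# A plain integer vector has no attributes yet
v <- 1:6
attributes(v)            # NULL

# Setting the dim attribute turns the vector into a 2 x 3 matrix
dim(v) <- c(2, 3)
attributes(v)$dim        # 2 3

# attributes() can also appear on the left-hand side of an assignment
w <- 1:3
attributes(w) <- list(names = c("a", "b", "c"))
names(w)                 # "a" "b" "c"
```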

L6

c() creates a vector:
> x <- c(0.5, 0.6)
> x <- c(TRUE, FALSE)
vector() works similarly; a numeric vector is initialized with 0:
> x <- vector("numeric", length = 10)
Mixing classes coerces every element to a common class:
> x <- c(TRUE, 3)   # x will be numeric: 1 3
Explicit coercion:
> x <- 0:2
> class(x)
[1] "integer"
> as.logical(x)
[1] FALSE  TRUE  TRUE
Use list() to create a list, which can hold elements of different classes:
> y <- list("a", 1, TRUE)

L7

Create a matrix:
> m <- matrix(nrow = 2, ncol = 3)
> attributes(m)
$dim
[1] 2 3
Matrices are filled column-wise. To turn a vector into a matrix, set its dim attribute:
> m <- 1:10
> dim(m) <- c(2, 5)
Or use cbind() or rbind():
> x <- 1:3
> y <- 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x, y)   # similar, but x and y become rows
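To make the column-wise filling concrete, here is a minimal sketch comparing the default with `byrow = TRUE`. The matrices `m` and `r` are illustrative only.

```r
# matrix() fills column by column unless byrow = TRUE is given
m <- matrix(1:6, nrow = 2, ncol = 3)
m[2, 1]   # 2: the second value goes below the first, not beside it
m[1, 2]   # 3

# Same data, filled row by row
r <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
r[1, 2]   # 2
r[2, 1]   # 4
```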

L8

Factors represent categorical data and can be ordered or unordered. Like the keys of a PHP array, a factor can be thought of as an integer vector in which each integer has a label. For example:
> x <- factor(c("yes", "yes", "no"))
> x
[1] yes yes no
Levels: no yes
> table(x)
x
 no yes
  1   2
> unclass(x)
[1] 2 2 1
attr(,"levels")
[1] "no"  "yes"
That is how the factor x is represented in R underneath! The first level is called the baseline level. By default it is determined by alphabetical order, but you can change the order:
> x <- factor(c("yes", "yes", "no"), levels = c("yes", "no"))
Now yes comes first.
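The effect of the levels argument on the underlying integer codes can be checked directly. A short sketch, with `answers`, `f1` and `f2` as made-up names:

```r
answers <- c("yes", "yes", "no")

# Default: levels sorted alphabetically, so "no" is the baseline
f1 <- factor(answers)
levels(f1)        # "no" "yes"
as.integer(f1)    # 2 2 1

# Explicit levels: "yes" becomes the baseline
f2 <- factor(answers, levels = c("yes", "no"))
levels(f2)        # "yes" "no"
as.integer(f2)    # 1 1 2
```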

L9-Missing Values

is.na() tests for NA and is.nan() tests for NaN. A NaN value is also treated as NA, but the converse is not true.
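A quick check of that asymmetry, using a made-up vector that contains both kinds of missing value:

```r
x <- c(1, NA, 0/0, 4)   # 0/0 evaluates to NaN

is.na(x)    # FALSE  TRUE  TRUE FALSE  (NaN counts as NA)
is.nan(x)   # FALSE FALSE  TRUE FALSE  (NA does not count as NaN)
```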

L10-Data Frames

Data frames store tabular data. A data frame is to a matrix what a list is to a vector: unlike a matrix, each column of a data frame can have a different class.
Special attribute: row.names
Data frames are usually created by read.table() or read.csv(), and can be converted to a matrix with data.matrix().
An example:
> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> x
  foo   bar
1   1  TRUE
2   2  TRUE
3   3 FALSE
4   4 FALSE

L10-Names Attribute

> x <- 1:3
> names(x) <- c("foo", "bar", "norf")
> x
 foo  bar norf
   1    2    3

L12-Reading Tabular Data

read.table, read.csv
readLines
source, for reading in R code files (inverse of dump)
dget, also for reading in R code files, but for deparsed R objects (inverse of dput)
load, for reading in saved workspaces
unserialize, for reading single R objects in binary form

write

write.table
writeLines
dump
dput
save
serialize
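The read and write functions above come in pairs. As a minimal sketch, here is a round trip through dput/dget and through save/load; the data frame and the temporary file names are made up for this example.

```r
# A small data frame to round-trip (made-up data)
df <- data.frame(a = 1:2, b = c("x", "y"))

# dput writes a deparsed, textual representation; dget reads it back
path <- tempfile(fileext = ".R")
dput(df, file = path)
df2 <- dget(path)
all.equal(df, df2)   # compares the restored copy with the original

# save/load store a binary representation of the named objects instead
rda <- tempfile(fileext = ".rda")
save(df, file = rda)
rm(df)
load(rda)            # restores an object called df
```

The textual form is easier to edit and to keep under version control; the binary form is more compact.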

read.table

file, the name of a file, or a connection
header, logical indicating if the file has a header line
sep, a string indicating how the columns are separated
colClasses, a character vector indicating the class of each column in the dataset
nrows, the number of rows in the dataset
comment.char, a character string indicating the comment character
skip, the number of lines to skip from the beginning
stringsAsFactors, should character variables be coded as factors?

For small files, calling read.table with nothing but the file name is fine, and the result is a data frame. The default separator for read.table is whitespace (for read.csv it is a comma). Be sure to read the help page: ?read.table
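To show how the arguments above fit together, here is a sketch that parses a made-up table from a string through the `text` argument of read.table, so no file is needed.

```r
# Made-up file contents; the first line is a comment that should be skipped
raw <- "# exported from a hypothetical grading tool
id,score,passed
1,9.5,TRUE
2,7.0,FALSE"

df <- read.table(text = raw,
                 header = TRUE,            # first non-comment line holds the names
                 sep = ",",                # columns are comma-separated
                 comment.char = "#",       # skip everything after a #
                 colClasses = c("integer", "numeric", "logical"),
                 stringsAsFactors = FALSE)
str(df)
```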

L12-Reading Large Tables

Set the arguments explicitly instead of letting R guess! If all the columns are numeric, a single value is enough: colClasses = "numeric"
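One common way to obtain colClasses for a large file is to read a few rows first and reuse the classes R detects. The sketch below assumes a small temporary file standing in for the large table; the column names are made up.

```r
# Stand-in for a large file: write a small made-up table to a temp file
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:200, value = seq(0.5, 100, by = 0.5)),
            file = path, row.names = FALSE)

# Read only the first rows and let R work out the classes
initial <- read.table(path, header = TRUE, nrows = 100)
classes <- sapply(initial, class)

# Re-read the whole file with the classes fixed in advance
full <- read.table(path, header = TRUE, colClasses = classes)
dim(full)   # 200 2
```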

L13-Textual Data Format

L17-Subsetting-Basics

> x <- c("a", "b", "c")
> x[1]
[1] "a"
> x[1:3]
[1] "a" "b" "c"
> x[x > "a"]
[1] "b" "c"
> u <- x > "a"
> u
[1] FALSE  TRUE  TRUE

L17-Subsetting-Lists

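> x <- list(foo = 1:4, bar = 0.6)
> x[1]
$foo
[1] 1 2 3 4
Single brackets return a list.
> x[[1]]
[1] 1 2 3 4
Double brackets return the element itself, here an integer sequence.
> x$bar
[1] 0.6
> x[["bar"]]   # equivalent to x$bar
> x["bar"]     # returns a list
> x[c(1, 2)]   # extracts several elements at once; returns a list
> name <- "foo"
> x[[name]]    # useful: [[ ]] accepts a computed name, which $ does not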

L17-Subsetting-Matrices

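> x <- matrix(1:6, 2, 3)
> x[1, ]   # a missing index is fine: this selects the whole first row
By default a single element or a single row comes back as a vector. To prevent the dimensions from being dropped:
> x[1, 2, drop = FALSE]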
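The difference that `drop = FALSE` makes can be seen by inspecting the dimensions of the result. A small sketch with a made-up 2 x 3 matrix:

```r
x <- matrix(1:6, nrow = 2, ncol = 3)

x[1, 2]                     # 3, a plain vector of length 1
x[1, 2, drop = FALSE]       # still a 1 x 1 matrix
dim(x[1, , drop = FALSE])   # 1 3
is.null(dim(x[1, ]))        # TRUE: the row was dropped to a vector
```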

L18-partial matching

> x <- list(aardvark = 1:5)
> x$a   # $ matches partial names
[1] 1 2 3 4 5
> x[["a"]]   # [[ ]] requires an exact match by default
NULL
> x[["a", exact = FALSE]]
[1] 1 2 3 4 5

L19-Removing NA Values

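> x <- c(1, 2, NA, 4, NA, 5)
> bad <- is.na(x)
> x[!bad]
[1] 1 2 4 5
For several objects at once, complete.cases() marks the positions where none of them is missing:
> y <- c("a", "b", NA, "d", NA, "f")
> good <- complete.cases(x, y)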
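complete.cases() also works on a whole data frame, which is a common way to drop incomplete rows. A sketch with made-up data:

```r
df <- data.frame(a = c(1, 2, NA, 4),
                 b = c("w", NA, "y", "z"))

good <- complete.cases(df)   # TRUE FALSE FALSE TRUE
clean <- df[good, ]          # keeps rows 1 and 4 only
nrow(clean)                  # 2
```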

L20-Vectorized Operations

For matrices:
> x * y     # element-wise multiplication
> x %*% y   # true matrix multiplication
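A small worked example of the two kinds of multiplication, using made-up 2 x 2 matrices:

```r
x <- matrix(1:4, 2, 2)          # columns: (1, 2) and (3, 4)
y <- matrix(rep(10, 4), 2, 2)   # every entry is 10

ew <- x * y     # element-wise: rows are (10, 30) and (20, 40)
mp <- x %*% y   # matrix product: rows are (40, 40) and (60, 60)
```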

Week2

L2 if-else

if (x > 3) {
  y <- 10
} else {
  y <- 0
}
This is also valid, because if-else returns a value:
y <- if (x > 3) {
  10
} else {
  0
}

L2 For loops

for (i in 1:10) {
}
x <- c("a", "b", "c", "d")
for (i in seq_along(x)) {
  print(x[i])     # loop over the indices
}
for (letter in x) {
  print(letter)   # loop over the elements directly
}

L4 Functions

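Set a default value:
abc <- function(a = 10) {
}
An example that computes the mean of each column:
columnmean <- function(y, removeNA = TRUE) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in 1:nc) {
    means[i] <- mean(y[, i], na.rm = removeNA)   # store in means, not mean
  }
  means   # return the vector of column means
}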
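A usage sketch for a column-mean function like the one above. The definition is repeated so the block runs on its own, and the matrix `m` is made up.

```r
columnmean <- function(y, removeNA = TRUE) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in seq_len(nc)) {
    means[i] <- mean(y[, i], na.rm = removeNA)
  }
  means
}

m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)   # columns: (1, 2, NA) and (4, 5, 6)
columnmean(m)                     # 1.5 5.0
columnmean(m, removeNA = FALSE)   # NA 5
```

With the default `removeNA = TRUE`, the missing value is ignored; passing `FALSE` lets it propagate into the result.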

L6 Functions

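The ... argument indicates a variable number of arguments, which are usually passed on to other functions:
myplot <- function(x, y, type = "l", ...) {
  plot(x, y, type = type, ...)
}
Any argument that appears after ... in the argument list must be named explicitly and in full; it cannot be matched positionally or partially.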
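The rule about arguments after ... is easiest to see with a small function. `join` below is a hypothetical helper written for this sketch; it simply forwards its arguments to paste().

```r
# Hypothetical helper: the sep argument comes after ...
join <- function(..., sep = "-") {
  paste(..., sep = sep)
}

join("a", "b")              # "a-b"
join("a", "b", sep = "+")   # "a+b"
join("a", "b", "+")         # "a-b-+": the unnamed "+" is absorbed by ...
join("a", "b", se = "+")    # "a-b-+": partial matching does not apply after ...
```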

L7 Functions could be made dynamically!

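Lexical vs. dynamic scoping: R uses lexical scoping, so a function looks up its free variables in the environment where it was defined.
make.power <- function(n) {
  pow <- function(x) {
    x^n
  }
  pow
}
> cube <- make.power(3)   # note that cube is a function
> cube(3)
[1] 27
> ls(environment(cube))
[1] "n"   "pow"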
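A sketch showing that each function returned by make.power keeps its own n, regardless of any n defined elsewhere. The definition is repeated so the block is self-contained.

```r
make.power <- function(n) {
  pow <- function(x) {
    x^n          # n is a free variable, looked up where pow was defined
  }
  pow
}

square <- make.power(2)
cube   <- make.power(3)

n <- 10          # a global n does not affect the closures
square(4)        # 16
cube(2)          # 8
get("n", environment(square))   # 2
```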

L8 code style

Indenting

L10 Date and times

x <- as.Date("1970-01-01")   # a Date object, stored as days since 1970-01-01
x <- Sys.time()              # the current time, as a POSIXct object
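Because dates are stored as numbers internally, ordinary arithmetic works on them. A brief sketch, with all dates chosen only for illustration:

```r
d <- as.Date("1970-01-01")
unclass(d)                       # 0: days since 1970-01-01
unclass(as.Date("1970-01-10"))   # 9

# Date arithmetic and formatting
d + 31                           # "1970-02-01"
format(d, "%Y/%m/%d")            # "1970/01/01"
gap <- as.numeric(as.Date("2000-03-01") - as.Date("2000-02-01"))
gap                              # 29 (February 2000 has 29 days)

# Sys.time() returns a POSIXct: seconds since 1970-01-01 UTC
now <- Sys.time()
class(now)                       # "POSIXct" "POSIXt"
```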

Point

The style is very similar to Windows/DOS commands.
A cute assignment sign.
The same comment sign as in bash.
Complex numbers! Like Fortran.
Interesting... attributes! Like NCL!
Be careful with c(); it is not like other languages.
In fact, I think this is really convenient.
Merging vectors into a matrix is really user-friendly!
Impressive! Like a PHP array, but much easier to understand.
Like NCL or MATLAB.
See it? Plenty of data types, very user-friendly.
You can imagine how simple it is to process Excel-type files with data frames.
Brilliant!!!
Very much like $$ in NCL.
To drop dimensions or not, that is the question.
