Background Material
L4
get the current working directory
>getwd() # same on Mac
read a CSV (comma-separated values) file
>read.csv("xx.csv")
list the files and directories in the current directory
>dir()
change the working directory
>setwd("F:/workspace/")
list all objects (variables and functions) defined in the current workspace
>ls()
Week1
L4
assignment
>x <- 5
comments
# hi, this is a comment
L5
5 basic "atomic" classes
- character
- numeric (real numbers)
- integer
- complex
- logical (TRUE/FALSE)
basic object: vector
vector function: vector(mode, length)
by default R treats every number as a real (double); if you want the integer 1, add the L suffix:
1L
Infinity:
Inf
not a number:
0/0 gives NaN (an undefined value, also treated as a missing value)
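A quick check of the number types above; a minimal sketch in base R (the variable names are only for illustration):

```r
cls_double  <- class(1)    # "numeric": plain numbers are doubles
cls_integer <- class(1L)   # "integer": the L suffix forces an integer
pos_inf <- 1 / 0           # Inf
zero    <- 1 / Inf         # 0: Inf can be used in ordinary arithmetic
not_num <- 0 / 0           # NaN
is.nan(not_num)            # TRUE
```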
R objects can have attributes:
- names, dimnames
- dimensions
- class (e.g. numeric)
- length
function to access or modify attributes
attributes()
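A small sketch of reading and setting attributes, assuming a plain integer vector as the starting point (the "note" attribute is made up for illustration):

```r
x <- 1:6
before <- attributes(x)      # NULL: a plain vector has no attributes yet
dim(x) <- c(2, 3)            # setting dim turns x into a 2 x 3 matrix
after <- attributes(x)       # a list with one element, $dim
attr(x, "note") <- "demo"    # attr() sets a single attribute by name
```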
L6
the c() function creates a vector:
x <- c(0.5, 0.6)
x <- c(TRUE, FALSE)
the vector() function does the same, with initial value 0 for numeric:
x <- vector("numeric", length = 10)
mixed classes
x <- c(TRUE, 3) # coerced to numeric: 1 3
Explicit Coercion
> x<-0:2
> class(x)
[1] "integer"
> as.logical(x)
[1] FALSE TRUE TRUE
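Implicit and explicit coercion side by side; a sketch, with warnings suppressed only to keep the output quiet:

```r
mixed_chr <- c(1.7, "a")     # character: "1.7" "a"
mixed_num <- c(TRUE, 2)      # numeric: 1 2
mixed_lgl <- c("a", TRUE)    # character: "a" "TRUE"

x <- 0:2
as_num <- as.numeric(x)      # 0 1 2
as_chr <- as.character(x)    # "0" "1" "2"

# coercion that makes no sense gives NA (normally with a warning)
bad <- suppressWarnings(as.numeric(c("a", "b")))
```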
use list() to create a list; unlike a vector, its elements can have different classes.
>y<-list("a", 1, TRUE)
L7
create matrix
>m<-matrix(nrow=2, ncol=3)
>attributes(m)
$dim
[1] 2 3
matrices are filled column-wise
turn a vector into a matrix by setting its dim attribute
>m<-1:10
>dim(m)<-c(2,5)
or use cbind or rbind
>x<-1:3
>y<-10:12
>cbind(x,y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
>rbind(x,y)
similar, but x and y become the rows
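The column-wise filling and the two binding functions can be checked like this (a sketch in base R):

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
# m is filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

x <- 1:3
y <- 10:12
by_col <- cbind(x, y)   # 3 x 2 matrix, columns named x and y
by_row <- rbind(x, y)   # 2 x 3 matrix, rows named x and y
```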
L8
factors represent categorical data and can be ordered or unordered. Like the keys of a PHP array, a factor can be treated as an integer vector with labels.
an example:
>x<-factor(c("yes","yes","no"))
>x
[1] yes yes no
Levels: no yes
>table(x)
x
 no yes
  1   2
>unclass(x)
[1] 2 2 1
attr(,"levels")
[1] "no"  "yes"
that is how the factor x is stored in R underneath!
The first level is called the baseline level. By default the levels are in alphabetical order, but you can set the order yourself:
>x<-factor(c("yes","yes","no"),levels=c("yes","no"))
and "yes" will be the first level.
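To see the integer codes change with the level order, a small sketch (x2, codes and counts are illustrative names):

```r
x <- factor(c("yes", "yes", "no"))
levels(x)                   # "no" "yes": alphabetical by default
codes <- as.integer(x)      # 2 2 1

x2 <- factor(c("yes", "yes", "no"), levels = c("yes", "no"))
levels(x2)                  # "yes" "no": "yes" is now the baseline
codes2 <- as.integer(x2)    # 1 1 2
counts <- table(x2)         # counts follow the level order
```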
L9-Missing Values
is.na()
is.nan()
NaN values are also treated as NA (is.na(NaN) is TRUE), but the converse is not true.
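The difference between the two tests on one vector, as a minimal sketch:

```r
x <- c(1, NA, NaN, 4)
na_flags  <- is.na(x)    # FALSE TRUE TRUE FALSE: NaN counts as NA
nan_flags <- is.nan(x)   # FALSE FALSE TRUE FALSE: NA is not NaN
```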
L10-Data Frames
used for tabular data
data frames are to matrices what lists are to vectors:
unlike a matrix, each column of a data frame can have a different class.
Special attributes: row.names
created by
read.table() or read.csv()
convert to matrix
data.matrix()
an example:
>x<- data.frame(foo=1:4, bar = c(T,T,F,F))
>x
foo bar
1 1 TRUE
2 2 TRUE
3 3 FALSE
4 4 FALSE
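A few things one can ask of the data frame above; a sketch, using TRUE/FALSE spelled out instead of T/F:

```r
x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
n_rows <- nrow(x)                 # 4
n_cols <- ncol(x)                 # 2
col_classes <- sapply(x, class)   # foo "integer", bar "logical"
m <- data.matrix(x)               # the logical column becomes 1/0
```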
L10-Names Attribute
>x <- 1:3
> names(x) <- c("foo","bar","norf")
>x
foo bar norf
1 2 3
L12-Reading Tabular Data
- read.table, read.csv
- readLines
- source, for reading in R code files (inverse of dump)
- dget, for reading in R objects that were deparsed into text files (inverse of dput)
- load, for reading in saved workspaces
- unserialize, for reading single R objects in binary form
write
- write.table
- writeLines
- dump
- dput
- save
- serialize
read.table
- file, the name of a file, or a connection
- header, logical indicating if the file has a header line
- sep, a string indicating how the columns are separated
- colClasses, a character vector indicating the class of each column in the dataset
- nrows, the number of rows in the dataset
- comment.char, a character string indicating the comment character
- skip, the number of lines to skip from the beginning
- stringsAsFactors, should character variables be coded as factors?
for small files, giving only the file name is fine; the result is a data frame.
the default separator of read.table is whitespace; for read.csv it is a comma.
be sure to read the documentation: ?read.table
L12-Reading Large Tables
set the arguments (colClasses, nrows, comment.char) explicitly, it makes reading faster!
if all columns are numeric, a single value is enough:
colClasses = "numeric"
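One way to get colClasses without typing them by hand is to read a few rows first. A sketch under the assumption of a small whitespace-separated file; the temp file and its columns are made up so the example runs on its own:

```r
# write a small demo file (hypothetical data, for illustration only)
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:5, value = c(0.5, 1.5, 2.5, 3.5, 4.5)),
            file = path, row.names = FALSE)

# read only a few rows and let R work out the column classes
initial <- read.table(path, header = TRUE, nrows = 3)
classes <- sapply(initial, class)

# reuse the classes for the full read, so R does not have to guess again
full <- read.table(path, header = TRUE, colClasses = classes)
unlink(path)
```

For a really large file the first read stays cheap because of nrows, and the second read skips the type guessing.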
L13-Textual Data Format
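dput and dget from the lists above store an object as R code (text) and read it back. A minimal round trip, assuming a temporary file as the target:

```r
y <- data.frame(a = 1, b = "a")
path <- tempfile(fileext = ".R")
dput(y, file = path)      # write the R code that rebuilds y
restored <- dget(path)    # evaluate that code to get the object back
unlink(path)
```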
L17-Subsetting-Basics
>x <- c("a","b","c")
>x[1]
[1] "a"
> x[1:3]
>x[x>"a"]
>u <- x>"a"
>u
[1] FALSE TRUE TRUE
L17-Subsetting-Lists
>x <- list (foo =1:4, bar =0.6)
>x[1]
$foo
[1] 1 2 3 4
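[ ] always returns a list!
>x[[1]]
[1] 1 2 3 4
[[ ]] returns the element itself, here an integer sequence!
>x$bar
[1] 0.6
>x[["bar"]] # same as x$bar
>x["bar"] # returns a list
>x[c(1, 2)] # several elements at once, returns a list
>name <- "foo"
>x[[name]]
[[ ]] accepts a computed name, $ does not; this is useful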
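The three operators compared on the same list; a sketch (the result names are illustrative):

```r
x <- list(foo = 1:4, bar = 0.6)
single <- x["foo"]       # a list of length 1
inner  <- x[["foo"]]     # the integer vector 1 2 3 4

name <- "bar"
by_name   <- x[[name]]   # 0.6: [[ ]] evaluates the variable
by_dollar <- x$name      # NULL: $ looks for an element literally called "name"
```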
L17-Subsetting-Matrices
>x <- matrix(1:6, 2, 3)
>x[1, ] # a missing index selects the whole row
keep the result as a matrix (no dimension dropping):
>x[1, 2, drop =FALSE]
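What drop changes, checked through dim(); a sketch on a 2 x 3 matrix:

```r
x <- matrix(1:6, nrow = 2, ncol = 3)
elem    <- x[1, 2]                  # 3, a plain vector of length 1
elem_m  <- x[1, 2, drop = FALSE]    # 1 x 1 matrix
first   <- x[1, ]                   # 1 3 5, a plain vector
first_m <- x[1, , drop = FALSE]     # 1 x 3 matrix
```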
L18-partial matching
>x<-list(aardvark=1:5)
>x$a
[1] 1 2 3 4 5
>x[["a"]]
NULL
>x[["a", exact = FALSE]]
[1] 1 2 3 4 5
L19-Removing NA Values
>x <- c(1, 2, NA, 4, NA, 5)
>bad <- is.na(x)
>x[!bad]
>good<-complete.cases(x, y) # y: a second vector of the same length as x
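complete.cases with two vectors, as a sketch; y is a made-up character vector with NAs in the same places as x:

```r
x <- c(1, 2, NA, 4, NA, 5)
y <- c("a", "b", NA, "d", NA, "f")
good <- complete.cases(x, y)   # TRUE where neither x nor y is missing
x_clean <- x[good]             # 1 2 4 5
y_clean <- y[good]             # "a" "b" "d" "f"
```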
L20-Vectorized Operations
matrix
x*y # element-wise multiplication
x%*%y # true matrix multiplication
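The two products on small matrices, worked out as a sketch:

```r
x <- matrix(1:4, 2, 2)          # columns: (1, 2) and (3, 4)
y <- matrix(rep(10, 4), 2, 2)   # every entry is 10
elementwise <- x * y            # rows: 10 30 / 20 40
matprod     <- x %*% y          # rows: 40 40 / 60 60
```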
Week1
L2 if-else
if(x>3) {
  y<-10
} else {
  y<-0
}
also valid, since if-else returns a value:
y <- if(x>3) {
  10
} else {
  0
}
L2 For loops
for (i in 1:10){
}
x <- c("a","b","c","d")
for (i in seq_along(x)){
  print(x[i])
}
for (letter in x){
  print(letter)
}
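Why seq_along(x) rather than 1:length(x): a sketch of the empty-vector case (the variable names are illustrative):

```r
empty <- character(0)
idx_safe  <- seq_along(empty)    # integer(0): a loop over it never runs
idx_risky <- 1:length(empty)     # 1 0: a loop over it runs twice
```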
L4 Functions
set a default value:
abc <- function(a = 10){
}
columnmean <- function(y, removeNA = TRUE){
  nc <- ncol(y)
  means <- numeric(nc)
  for(i in 1:nc) {
    means[i] <- mean(y[,i], na.rm = removeNA) # store the mean of column i
  }
  means # the last expression is the return value
}
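Trying columnmean on a small matrix with one NA; a sketch that defines its own copy of the function, writing to means and returning it, and compares against the built-in colMeans:

```r
columnmean <- function(y, removeNA = TRUE) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in seq_len(nc)) {
    means[i] <- mean(y[, i], na.rm = removeNA)
  }
  means
}

m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)     # columns: (1, 2, NA) and (4, 5, 6)
with_rm    <- columnmean(m)                     # 1.5 5.0
without_rm <- columnmean(m, removeNA = FALSE)   # NA 5
builtin    <- colMeans(m, na.rm = TRUE)         # same values as with_rm
```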
L6 Functions
the ... argument stands for a variable number of arguments, which are usually passed on to other functions.
myplot <- function(x, y, type = "l", ...) {
  plot(x, y, type = type, ...)
}
arguments that come after ... in a function's signature must be matched explicitly by their full name.
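paste() is a handy case: its first argument is ... and sep comes after it. A sketch of what happens with and without the full name:

```r
named      <- paste("a", "b", sep = ":")   # "a:b"
positional <- paste("a", "b", ":")         # "a b :" (":" is absorbed by ...)
partial    <- paste("a", "b", se = ":")    # "a b :" (no partial matching after ...)
```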
L7 Functions can be created dynamically!
lexical vs. dynamic scoping
make.power <- function(n) {
  pow <- function(x) {
    x^n
  }
  pow
}
>cube <- make.power(3) # note cube is a function
> cube(3)
[1] 27
ls(environment(cube))
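Each function returned by make.power keeps its own n in its environment. A sketch that looks inside (square is an extra example not in the notes):

```r
make.power <- function(n) {
  pow <- function(x) {
    x^n
  }
  pow
}
cube   <- make.power(3)
square <- make.power(2)

env_names <- ls(environment(cube))           # "n" "pow"
n_cube    <- get("n", environment(cube))     # 3
n_square  <- get("n", environment(square))   # 2
```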
L8 code style
indentation
L10 Date and times
x <- as.Date("1970-01-01")
x <- Sys.time()
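Dates are stored as days since 1970-01-01 and times as seconds since then. A sketch, with the time zone fixed to UTC so the result does not depend on the machine:

```r
x <- as.Date("1970-01-01")
day_zero <- unclass(x)                        # 0
day_one  <- unclass(as.Date("1970-01-02"))    # 1

p  <- as.POSIXct("2012-10-25 01:00:00", tz = "UTC")
lt <- as.POSIXlt(p, tz = "UTC")               # list-like form with fields
hr <- lt$hour                                 # 1

gap <- as.Date("2012-03-01") - as.Date("2012-02-28")   # 2 days, 2012 is a leap year
```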
Point
here we can see the style is very much like Windows/DOS
cute assignment sign
same comment sign as bash
complex numbers! Like Fortran
interesting...
Attributes! Like NCL!
be careful with c(), it is not like other languages
in fact, I think this is really convenient
merging vectors into a matrix is really user-friendly!
Impressive! Like a PHP array, but much easier to understand
Like NCL or MATLAB
See it? Plenty of data types, very user-friendly. You can imagine how simple it is to process Excel-type files with data frames.
Brilliant!!!
Very much like $$ in NCL
To drop dimensions or not, that is the question