a <- 3
a is the variable, 3 is the value
b <- 4
b is another variable
c <- 2
c is a third variable
(a + b)^c
the result is displayed on the screen
d <- a*b/c
the result of the operation is stored in a new variable d
print(d)
the value of d is printed
d
A shortcut for print(d): if you type the name of a variable without any other instruction, its value is printed on the screen.
log(a)
## Each R function has a help page
help(log)
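As a small illustration of what the help page documents, log() accepts an optional base argument. The sketch below uses only base R functions; the variable names are illustrative. At the interactive prompt, ?log is a shorthand for help(log).

```r
# Natural logarithm (the default base is e)
ln_val <- log(exp(2))        # 2

# The help page documents an optional 'base' argument
log2_val <- log(8, base = 2) # 3

# Dedicated functions exist for common bases
log10_val <- log10(1000)     # 3

# args() shows the arguments of a function without opening its help page
args(log)
```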
sqrt(b)
Generate a series from 0 to 10
0:10
With the colon operator the step is always 1, but multiplying by a real coefficient gives a series with any step:
-1 + 0.2*0:10
Series can also be generated with the R function seq().
seq(from=-5, to=5, by=0.5)
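The colon notation and seq() can describe the same series. A minimal sketch comparing the two (the variable names are illustrative; all.equal() is used instead of == because decimal steps such as 0.2 are not represented exactly in floating point):

```r
# The colon operator binds more tightly than * and +,
# so this is evaluated as -1 + 0.2 * (0:10)
s_colon <- -1 + 0.2*0:10

# The same series written with seq()
s_seq <- seq(from = -1, to = 1, by = 0.2)

length(s_colon)                     # 11
isTRUE(all.equal(s_colon, s_seq))   # TRUE

# seq() can also take the number of points instead of the step
s_len <- seq(from = -1, to = 1, length.out = 11)
```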
c() combines a series of values into a vector
v <- c(2.12,5.04,3.17,77.1,11.09,62)
print(v)
v is a vector containing numeric values.
class(v) ## Get the data type of a variable
If a vector contains at least one string, all its elements are converted to strings as well.
w <- c(3,"hello",5,"this",7,"is",9,"a",23,"string")
print(w)
This is indicated by the double quotes around the “numbers”.
In fact, a, b, c, and d are also vectors (the simplest object in R), but each of them contains a single element. This is why a [1] appears when their value is printed.
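The conversion to strings and the single-element vectors can be checked directly. A short sketch with illustrative variable names:

```r
# Mixing numbers and strings converts every element to character
mixed <- c(3, "hello", 5)
class(mixed)        # "character"
mixed               # "3" "hello" "5"

# Converting back works only for elements that look like numbers;
# the others become NA (with a warning, suppressed here)
back <- suppressWarnings(as.numeric(mixed))
back                # 3 NA 5

# A single number is a vector of length 1
single <- 3
is.vector(single)   # TRUE
length(single)      # 1
```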
Functions and operations apply directly to each element of a vector, and a vector of the same size is returned.
sqrt(v)
3*v + v^2
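The element-wise behaviour is easier to check on a small vector whose results can be computed by hand. A sketch with a hypothetical vector u:

```r
u <- c(1, 4, 9, 16)

# A function is applied to each element in turn
roots <- sqrt(u)          # 1 2 3 4

# Arithmetic operations are element-wise too
combo <- 3*u + u^2        # 4 28 108 304

# The result has the same length as the input
length(roots) == length(u)   # TRUE
```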
A vector can be read from a text file with the function scan().
test.file <- "http://pedagogix-tagc.univ-mrs.fr/courses/statistics_bioinformatics/data/orf_lengths/yeast_orf_length_enum.txt"
v <- scan(test.file)
Note that in R we generally work with tables rather than vectors. The function scan() is rarely used.
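scan() works the same way on a local file as on a URL. The sketch below writes a few made-up numbers to a temporary file so that it runs without a network connection; the file content and variable names are assumptions for illustration only.

```r
# Write a few numbers to a temporary file (hypothetical example data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("12", "7", "30", "5"), tmp)

# By default scan() reads whitespace-separated numeric values;
# quiet = TRUE suppresses the "Read N items" message
vals <- scan(tmp, quiet = TRUE)

vals            # 12 7 30 5
class(vals)     # "numeric"

# Remove the temporary file
unlink(tmp)
```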
Get the fourth element of the vector v:
v[4]
Elements 2 to 7:
v[2:7]
Elements can also be selected in a custom order specified within the concatenation c():
v[c(2,6,8,1,3,5)]
Get the first 100 elements (note the numbering at the beginning of each row):
v[1:100]
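The same selections can be tried on a short vector where the expected result is easy to verify by eye. A sketch with a hypothetical vector x; head() is a base R function shown here as one possible alternative for taking the first elements.

```r
x <- c(10, 20, 30, 40, 50, 60, 70, 80)

x[4]                      # 40
x[2:7]                    # 20 30 40 50 60 70

# Any order can be given through c()
x[c(2, 6, 8, 1, 3, 5)]    # 20 60 80 10 30 50

# head() returns the first n elements
head(x, 3)                # 10 20 30
```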
Read a data frame from a text file. This takes a few seconds (the loading time depends on the network speed).
test.table <- "http://pedagogix-tagc.univ-mrs.fr/courses/statistics_bioinformatics/data/gene_expression/gasch2000/carbon_sources/zscores_carbon_sources.tab"
carbon <- read.table(test.table, header=TRUE, row.names=1)
dim(carbon)
Print the column names for a data frame.
names(carbon)
Print the row names for a data frame.
row.names(carbon)
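To see what the header and row.names arguments do, it can help to read a tiny tab-delimited file. The sketch below builds one in a temporary location; the gene and chip names are made up and are not taken from the carbon-source table.

```r
# Build a small tab-delimited file (hypothetical data)
tmp <- tempfile(fileext = ".tab")
writeLines(c("gene\tchipA\tchipB",
             "g1\t0.5\t-1.2",
             "g2\t1.1\t0.3",
             "g3\tNA\t2.0"), tmp)

# header = TRUE : the first line holds the column names
# row.names = 1 : the first column holds the row names
expr <- read.table(tmp, header = TRUE, row.names = 1, sep = "\t")

dim(expr)          # 3 2  (rows, columns)
names(expr)        # "chipA" "chipB"
row.names(expr)    # "g1" "g2" "g3"

unlink(tmp)
```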
Return the first row, together with the header. Because there are many columns, the output wraps over several lines.
carbon[1,]
Return the first 10 data rows
carbon[1:10,]
The following notation returns the 6152 values found in the first column, “mannose” (these are the measurements for all the genes at time point 1)
carbon[,1]
Return the names of the columns (column headers)
names(carbon)
Another way to get the same column, specified by its name rather than index.
carbon[,"mannose"]
Yet another way to access the same column
carbon$mannose
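The three notations return the same vector. A minimal check on a small data frame; the values and the second column name are hypothetical.

```r
toy <- data.frame(mannose = c(0.1, -0.4, 1.3),
                  other   = c(0.0, 0.7, -0.2),
                  row.names = c("g1", "g2", "g3"))

by_index  <- toy[, 1]           # column selected by position
by_name   <- toy[, "mannose"]   # column selected by name
by_dollar <- toy$mannose        # column selected with the $ operator

identical(by_index, by_name)    # TRUE
identical(by_name, by_dollar)   # TRUE
```

Selecting by name is usually more robust than selecting by position, since it keeps working if the column order changes.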
The content of a single column (in this case the 4th chip) can be assigned to a vector.
v <- carbon[,4]
An element of the vector can then be accessed by adding a single index. This returns the value of the 3201st gene on the 4th chip.
v[3201]
This directly returns the value of the 3201st gene on the 4th chip.
carbon[3201,4]
This returns a rectangular section of the data frame (genes 100 to 120, chips 1 to 3).
carbon[100:120,1:3]
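The [row, column] notation can be verified on a small data frame where each value is easy to locate. A sketch with made-up values and column names:

```r
toy <- data.frame(chip1 = 1:5, chip2 = 11:15, chip3 = 21:25, chip4 = 31:35)

# Two steps: extract the column, then index the resulting vector
col4 <- toy[, 4]
col4[3]          # 33

# One step: row index first, column index second
toy[3, 4]        # 33

# A rectangular section: rows 2 to 4, columns 1 to 3
block <- toy[2:4, 1:3]
dim(block)       # 3 3
```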
For technical reasons, a chip might contain some undefined values. R has a specific value for undefined values, which is displayed as NA.
The presence of NA values poses problems for some calculation methods. R functions generally include options that specify how to treat these values.
One simple way to get rid of NA values is to filter out all the rows that contain at least one of them.
dim(carbon)
carbon.filtered <- na.omit(carbon)
dim(carbon.filtered)
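Both strategies, a per-function option and row filtering, can be seen on a small example. The sketch below uses a hypothetical data frame; na.rm is the option that mean() and several other base R summary functions provide for ignoring NA values.

```r
toy <- data.frame(chip1 = c(1.0, NA, 3.0, 4.0),
                  chip2 = c(0.5, 1.5, NA, 2.5))

# By default, a single NA makes the result NA
mean(toy$chip1)                  # NA

# Option of the function: ignore NA values
mean(toy$chip1, na.rm = TRUE)    # 2.666667

# Count the NA values in the whole data frame
sum(is.na(toy))                  # 2

# Filtering: drop every row that contains at least one NA
clean <- na.omit(toy)
dim(toy)                         # 4 2
dim(clean)                       # 2 2
```

Note that na.omit() discards the whole row, so valid values on other chips are lost along with the NA.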
This returns columns 5 to 7 for all genes with a value greater than 3 on the 7th chip.
carbon.filtered[carbon.filtered[,7] > 3,5:7]
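The selection works in two steps: the comparison produces a vector of TRUE/FALSE values, one per row, and that vector is then used as the row index. A sketch on a hypothetical data frame that makes the intermediate step visible:

```r
toy <- data.frame(chip5 = c(0.2, 1.5, -0.3, 2.1),
                  chip6 = c(1.0, 3.5, 0.1, 2.9),
                  chip7 = c(3.4, 0.8, 4.2, 2.0),
                  row.names = c("g1", "g2", "g3", "g4"))

# Step 1: one logical value per row
keep <- toy[, "chip7"] > 3
keep                     # TRUE FALSE TRUE FALSE

# Step 2: keep only the rows where the condition is TRUE
selected <- toy[keep, ]
row.names(selected)      # "g1" "g3"
```

If the tested column still contained NA values, the comparison would return NA for those rows, which is one reason to filter them out first.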
q()
The program asks you "Save workspace image? [y/n/c]:"
Answer “n” for the time being. Later, it can be useful to save the state of the program between two working sessions.
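Saving the state between sessions can also be done explicitly, without quitting. The sketch below saves one object with save() and restores it with load(); the object and file names are hypothetical, and a temporary file is used so that nothing is written to the working directory.

```r
result <- c(1, 2, 3)

# Save the object to a file
state_file <- tempfile(fileext = ".RData")
save(result, file = state_file)

# Simulate a new session by removing the object
rm(result)
exists("result")     # FALSE

# Restore it from the file
load(state_file)
result               # 1 2 3

unlink(state_file)
```

save.image() does the same for all objects of the workspace at once.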