R Programming Cheat Sheet, Just the Basics, Cheat Sheet of Advanced Computer Programming

Massachusetts Institute of Technology (MIT), Advanced Computer Programming

A cheat sheet for beginners in R programming. Learn the basic functions, syntax, and paradigms of R.

Typology: Cheat Sheet

2020/2021

Uploaded on 04/27/2021

ehimay 🇺🇸

Manipulating Strings

Putting Together Strings:
paste('string1', 'string2', sep = '/')  # separator ('sep') is a space by default
paste(c('1', '2'), collapse = '/')  # returns '1/2'
Split String: stringr::str_split(string = v1, pattern = '-')  # returns a list
Get Substring: stringr::str_sub(string = v1, start = 1, end = 3)
Match String:
isJohnFound <- stringr::str_detect(string = df1$col1, pattern = stringr::regex('John', ignore_case = TRUE))  # returns TRUE/FALSE for each element, depending on whether 'John' was found
df1[isJohnFound, c('col1', ...)]

General

• R 3.0.0 and later support long vectors (more than 2^31 - 1 elements) on 64-bit builds; the integer type itself is still 32-bit
• R is case sensitive
• R indexing starts from 1

Help
Help: help(functionName) or ?functionName
Help Home Page: help.start()
Special Character Help: help('[')
Search Help: help.search(..) or ??..
Search Function with Partial Name: apropos('mea')
See Example(s): example(topic)

Objects in Current Environment
Display Object Name: objects() or ls()
Remove Object: rm(object1, object2, ..)
Notes:
1. Objects whose names start with a period are accessible but hidden, so ls() does not list them unless it is called with all.names = TRUE.
2. To force a garbage collection, call gc(), which can release unused memory to the OS. R also runs garbage collection automatically from time to time.

Symbol Name Environment
• If multiple packages use the same function name, the function from the package that was loaded last is the one that gets called.
• To avoid this, prefix the function with the name of the package, e.g. packageName::functionName(..)

Library
Only trust reliable R packages, e.g., 'ggplot2' for plotting, 'sp' for dealing with spatial data, 'reshape2', 'survival', etc.
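The masking rule and the packageName::functionName(..) form noted above can be sketched with base packages only. The user-defined `sd` below is a hypothetical stand-in for a function from a later-loaded package; it is not part of the original cheat sheet.

```r
# Sketch: resolving a masked function name with `::` (base packages only).
# `sd` here is a hypothetical user-defined function that masks stats::sd.
sd <- function(x) "masked version"

x <- c(2, 4, 4, 4, 5, 5, 7, 9)

masked_result <- sd(x)            # the definition found first on the search path is called
qualified_result <- stats::sd(x)  # always calls the function exported by 'stats'

print(masked_result)
print(qualified_result)

rm(sd)  # remove the masking definition; sd() resolves to stats::sd again
```

Qualifying the call costs a few extra characters but makes the result independent of the order in which packages were attached.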
Load Package: library(packageName) or require(packageName)
Unload Package: detach(package:packageName)
Note: require() returns the status (TRUE/FALSE)

Data Structures

Vector
• Group of elements of the SAME type
• R is a vectorized language; operations are applied to each element of the vector automatically
• R has no concept of column vectors or row vectors
• Special vectors: letters and LETTERS, which contain the lower-case and upper-case letters
Create Vector: v1 <- c(1, 2, 3)
Get Length: length(v1)
Check if All or Any is True: all(v1); any(v1)  # expects a logical vector, e.g. all(v1 > 0)
Integer Indexing: v1[1:3]; v1[c(1, 6)]
Boolean Indexing: v1[is.na(v1)] <- 0
Naming: c(first = 'a', ..) or names(v1) <- c('first', ..)

Factor
• as.factor(v1) converts v1 to a factor; its levels are the unique values of v1
• Factors can reduce the size of a variable because each unique value is stored only once, but they can cause bugs if not used properly

List
• Stores any number of items of ANY type
Create List: list1 <- list(first = 'a', ...)
Create Empty List: vector(mode = 'list', length = 3)
Get Element: list1[[1]] or list1[['first']]
Append Using Numeric Index: list1[[6]] <- 2
Append Using Name: list1[['newElement']] <- 2
Note: repeatedly appending to a list, vector, data.frame, etc. is expensive; it is best to create a list of a certain size, then fill it.

data.frame
• Each column is a variable, each row is an observation
• Internally, each column is a vector
• idata.frame (from the 'plyr' package) is a data structure that creates a reference to a data.frame, so no copying is performed
Create Data Frame: df1 <- data.frame(col1 = v1, col2 = v2, v3)
Dimension: nrow(df1); ncol(df1); dim(df1)
Get/Set Column Names: names(df1); names(df1) <- c(...)
Get/Set Row Names: rownames(df1); rownames(df1) <- c(...)
Preview: head(df1, n = 10); tail(...)
Get Data Type: class(df1)  # is data.frame
Index by Column(s): df1['col1'] or df1[1];† df1[c('col1', 'col3')] or df1[c(1, 3)]
Index by Rows and Columns: df1[c(1, 3), 2:3]  # returns data from rows 1 and 3, columns 2 to 3
† Index method: df1$col1 or df1[, 'col1'] or df1[, 1] returns a vector. To return a single-column data.frame while using single square brackets, use 'drop': df1[, 'col1', drop = FALSE]

Data Types

Check data type: class(variable)
Four Basic Data Types
1. Numeric: includes float/double, int, etc. is.numeric(variable)
2. Character (string): nchar(variable)  # number of characters in a character or numeric value
3. Date/POSIXct
• Date: stores just a date. In numeric form, the number of days since 1/1/1970. date1 <- as.Date('2012-06-28'); as.numeric(date1)
• POSIXct: stores a date and time. In numeric form, the number of seconds since 1/1/1970. date2 <- as.POSIXct('2012-06-28 18:00')
Note: Use the 'lubridate' and 'chron' packages to work with dates
4. Logical
• (TRUE = 1, FALSE = 0)
• Use ==/!= to test equality and inequality
as.numeric(TRUE) => 1

data.table

What is a data.table
• Extends and enhances the functionality of data.frames
Differences: data.table vs. data.frame
• Before R 4.0.0, data.frame() turned character data into factors by default, while data.table does not
• When you print a data.frame, all of its data prints to the console; a large data.table prints only its first and last five rows
• Key difference: data.tables are fast because they have an index like a database. For example, the search dt1$col1 > number does a sequential scan (vector scan). After you create a key for the column, it will be much faster via binary search.
Create data.table from data.frame: data.table(df1)
Index by Column(s)*: dt1[, 'col1', with = FALSE] or dt1[, list(col1)]
Show info for each data.table in memory (i.e., size, ...):
tables()
Show Keys in data.table: key(dt1)
Create index for col1 and reorder data according to col1: setkey(dt1, col1)
Use Key to Select Data: dt1[c('col1Value1', 'col1Value2'), ]
Multiple Key Select: dt1[J('1', c('2', '3')), ]
Aggregation**:
dt1[, list(col1 = mean(col1)), by = col2]
dt1[, list(col1 = mean(col1), col2Sum = sum(col2)), by = list(col3, col4)]
* Columns must be accessed via a list of actual (unquoted) names, not as character strings. If column names are given as character strings, set the 'with' argument to FALSE.
** aggregate() and the d*ply functions will work, but the built-in aggregation functionality of data.table is faster

Matrix
• Similar to data.frame except every element must be the SAME type, most commonly all numeric
• Functions that work with data.frame should work with matrix as well
Create Matrix: matrix1 <- matrix(1:10, nrow = 5)  # fills rows 1 to 5: column 1 with 1:5 and column 2 with 6:10
Matrix Multiplication: matrix1 %*% t(matrix2)  # where t() is transpose

Array
• Multidimensional vector of the SAME type
• array1 <- array(1:12, dim = c(2, 3, 2))
• Using arrays is not recommended
• Matrices are restricted to two dimensions, while arrays can have any number of dimensions
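The matrix and array entries above can be tried out with a short base R sketch. Here `matrix2` is a hypothetical 2 x 2 matrix chosen only so that the multiplication is conformable; the cheat sheet itself does not define it.

```r
# Sketch of the matrix and array operations listed above (base R only).
matrix1 <- matrix(1:10, nrow = 5)   # column-major fill: column 1 = 1:5, column 2 = 6:10
matrix2 <- matrix(1:4, nrow = 2)    # hypothetical 2 x 2 matrix chosen for this example

product <- matrix1 %*% t(matrix2)   # (5 x 2) %*% (2 x 2) gives a 5 x 2 matrix

array1 <- array(1:12, dim = c(2, 3, 2))  # two 2 x 3 slices, also filled column-major

print(dim(product))
print(array1[2, 3, 1])   # row 2, column 3 of the first slice
```

Indexing an array takes one subscript per dimension, which is the main practical difference from a matrix.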
