R Cheat Sheet: Functions and Examples for Data Analysis with R, Summaries of Statistics

R Programming, Data Analysis, Statistics

A cheat sheet of R functions for data analysis, covering installing and loading packages, formula syntax, numerical summaries, general graphics, distributions, hypothesis tests, linear regression, and data manipulation. It covers functions from packages such as mosaic and ggplot2, and from base R.

What you will learn

  • What functions are available for numerical summaries in R?
  • How do you install and load a package in R?
  • What is the formula syntax for using functions in R?

Typology: Summaries

2021/2022

Uploaded on 08/05/2022

jacqueline_nel 🇧🇪

R-cheatsheet

Help

    ?
    ??
    example()
    apropos()
    help.search()

Packages

In order to use functionality from a certain package, we first need to install and then load the package:

    # install package (do once):
    install.packages("")
    # load package (do once in every script):
    require(); library()

Formula syntax

Most of the functions that we need for this course use a formula syntax:

    goal(y ~ x | z, data = mydata, ...)

where goal may be a function for plotting, calculating numerical summaries or making inference.

For plots:

  • y is the y-axis variable
  • x is the x-axis variable
  • z is a conditioning variable (separate panels).

For other things: ‘y ~ x | z’ can usually be read ‘y is modeled by (or depends on) x differently for each z’.

Numerical summaries

These functions from the mosaic package use a formula syntax.

    favstats()        # min/max, median, etc.
    tally()           # tabulate data
    mean()
    median()
    sd()              # standard deviation
    cor()             # correlation

General graphics

    gf_boxplot()
    gf_point()        # scatter plot
    gf_histogram()
    gf_bar()          # bar graph
    mplot(HELPrct)    # different plots
    splom()           # matrix of scatter plots

Distributions

    plotDist()        # plot theoretical distribution
    pdist()           # find prob. from percentile
    qdist()           # find percentile from prob.

Hypothesis tests

    t.test()          # t-test
    binom.test()      # binomial (exact) test
    prop.test()       # approximate test
    fisher.test()     # Fisher's exact test
    cor.test()        # correlation test
    chisq.test()      # chi-square test

Linear regression

    model <- lm()     # fit linear model
    summary(model)    # model fit summary
    coef(model)       # estimated parameters
    confint(model)    # CI for estimates
    anova(model)      # F-tests, etc.
    drop1(model)
    rstudent(model)   # Studentized residuals
    fitted(model)     # fitted values
    plotModel(model)  # plot regression lines
    model <- glm()    # generalized linear model

Data

    # Load data:
    read.file(); read.delim(); read.csv()

    # Data information:
    nrow(); ncol()    # data dimensions
    head()            # extract first part of data
    tail()            # extract last part of data
    colnames()        # column names
    rownames()        # row names
    summary()

    # Alter/create data:
    subset()          # subset data by condition
    factor()          # create grouping variable
    relevel()         # change reference level
    cut()             # cut numeric into intervals
    round()           # rounding numbers
    c()               # concatenate numerics
    seq()             # create sequence
    with()
    aggregate()
    margin.table()    # sum table entries
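
Example

The formula syntax and a handful of the functions listed above can be tried out in a short session. The sketch below is only an illustration, not part of the original sheet: it assumes the built-in mtcars data set, and it sticks to base R functions so that it runs without installing mosaic. The variable names (cars, group_means, tt, efficient, hp_group) are made up for this example.

```r
# Illustrative sketch: base R only, using the built-in mtcars data set
# (an assumption; the cheat sheet itself mostly lists mosaic functions).
cars <- mtcars

# factor(): turn the 0/1 transmission code into a labelled grouping variable
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("automatic", "manual"))

# Data information
print(nrow(cars))      # number of rows
print(ncol(cars))      # number of columns
print(head(cars, 3))   # first three rows

# Formula syntax y ~ x: mean mpg computed separately for each transmission type
group_means <- aggregate(mpg ~ am, data = cars, FUN = mean)
print(group_means)

# Hypothesis test with the same formula: two-sample t-test of mpg by am
tt <- t.test(mpg ~ am, data = cars)
print(tt$p.value)

# Linear regression: mpg modelled by weight (wt)
model <- lm(mpg ~ wt, data = cars)
print(coef(model))     # estimated intercept and slope
print(confint(model))  # confidence intervals for both estimates

# subset(): keep only the cars with more than 25 miles per gallon
efficient <- subset(cars, mpg > 25)
print(rownames(efficient))

# cut(): bin horsepower into three intervals, then tabulate the bins
hp_group <- cut(cars$hp, breaks = c(0, 100, 200, 400))
print(table(hp_group))
```

The same formula, mpg ~ am, is passed unchanged to aggregate() and t.test(); that reuse is the main convenience of the formula interface. With mosaic loaded, calls such as favstats(mpg ~ am, data = cars) should follow the same pattern.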