SAS-R cheat sheet, compile sas with R, Cheat Sheet of Programming Languages

Typology: Cheat Sheet

2021/2022

Uploaded on 01/30/2022

zakaria-hamane

SAS <-> R :: CHEAT SHEET

Introduction

This guide aims to familiarise SAS users with R. The R examples make use of the tidyverse collection of packages.

Install tidyverse:
    install.packages("tidyverse")
Attach tidyverse packages for use:
    library(tidyverse)

R data here is held in 'data frames', and occasionally in vectors (via c( )). Other R structures (lists, matrices…) are not explored here.

Keyboard shortcuts:
    <-     Alt + -
    %>%    Ctrl + Shift + m

Datasets; drop, keep & rename variables

SAS:  data new_data; set old_data; run;
R:    new_data <- old_data

SAS:  data new_data (keep=id); set old_data (drop=job_title); run;
R:    new_data <- old_data %>% select(-job_title) %>% select(id)

SAS:  data new_data (drop=temp:); set old_data; run;
R:    new_data <- old_data %>% select(-starts_with("temp"))
Note: C.f. contains( ), ends_with( )

SAS:  data new_data; set old_data; rename old_name = new_name; run;
R:    new_data <- old_data %>% rename(new_name = old_name)
Note: order differs

Conditional filtering

SAS:  data new_data; set old_data; if Sex = "M"; run;
R:    new_data <- old_data %>% filter(Sex == "M")

SAS:  data new_data; set old_data; if year in (2010,2011,2012); run;
R:    new_data <- old_data %>% filter(year %in% c(2010,2011,2012))

SAS:  data new_data; set old_data; by id; if first.id; run;
R:    new_data <- old_data %>% group_by(id) %>% slice(1)
Note: could use slice(n( )) for last

SAS:  data new_data; set old_data; if dob > "25APR1990"d; run;
R:    new_data <- old_data %>% filter(dob > as.Date("1990-04-25"))

New variables, conditional editing

SAS:  data new_data; set old_data; total_income = wages + benefits; run;
R:    new_data <- old_data %>% mutate(total_income = wages + benefits)

SAS:  data new_data; set old_data;
        if hours > 30 then full_time = "Y";
        else full_time = "N";
      run;
R:    new_data <- old_data %>% mutate(full_time = if_else(hours > 30, "Y", "N"))

SAS:  data new_data; set old_data;
        if temp > 20 then weather = "Warm";
        else if temp > 10 then weather = "Mild";
        else weather = "Cold";
      run;
R:    new_data <- old_data %>%
        mutate(weather = case_when(
          temp > 20 ~ "Warm",
          temp > 10 ~ "Mild",
          TRUE ~ "Cold"
        ))

Counting and Summarising

SAS:  proc freq data = old_data; table job_type; run;
R:    old_data %>% count(job_type)
Note: for percent, add: %>% mutate(percent = n*100/sum(n))

SAS:  proc freq data = old_data; table job_type*region; run;
R:    old_data %>% count(job_type, region)

SAS:  proc summary data = old_data nway;
        class job_type region;
        output out = new_data;
      run;
R:    new_data <- old_data %>%
        group_by(job_type, region) %>%
        summarise(Count = n( ))
Note: equivalent without nway not trivially produced

SAS:  proc summary data = old_data nway;
        class job_type region;
        var salary;
        output out = new_data sum(salary) = total_salaries;
      run;
R:    new_data <- old_data %>%
        group_by(job_type, region) %>%
        summarise(total_salaries = sum(salary), Count = n( ))
Note: lots of summary functions in both languages
Note: swap summarise( ) for mutate( ) to add summary data to original data

Combining datasets

SAS:  data new_data; set data_1 data_2; run;
R:    new_data <- bind_rows(data_1, data_2)
Note: C.f. rbind( ), which produces an error if columns are not identical

SAS:  data new_data; merge data_1 (in=in_1) data_2; by id; if in_1; run;
R:    new_data <- left_join(data_1, data_2, by = "id")
Note: C.f. full_join( ), right_join( ), inner_join( )

Some plotting in R

    ggplot(my_data, aes(year, sales)) +
      geom_point( ) + geom_line( )

    ggplot(my_data, aes(year, sales)) +
      geom_point( ) + geom_line( ) +
      ylim(0, 40) + labs(x = "", y = "Sales per year")

    ggplot(my_data, aes(year, sales, colour = dept)) +
      geom_point( ) + geom_line( )

    ggplot(my_data, aes(year, sales, fill = dept)) +
      geom_col( )

    ggplot(my_data, aes(year, sales, fill = dept)) +
      geom_col(position = "dodge") + coord_flip( )

Note: 'colour' for lines & points, 'fill' for shapes
Note: C.f. position = "fill" for 100% stacked bars/cols

CC BY SA Brendan O'Dowd • brendanjodowd@gmail.com • Updated 2021-09