Jeong Lim
August 31, 2016
Focus on how to make your analysis really reproducible
chunk options
cache
```{r}
```
Code chunk with options
```{r NAME, OPTIONS HERE}
```
Global options
```{r, include=FALSE }
knitr::opts_chunk$set(OPTIONS HERE)
```
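As a concrete illustration, a setup chunk might set a few of the options listed below as defaults for every chunk in the document. This is only a sketch: the particular options chosen here are an assumption for illustration, not settings taken from the slides.

```r
library(knitr)

# Hypothetical global defaults: hide code, warnings and messages in every chunk
opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)

# Inspect the current defaults to confirm they were applied
opts_chunk$get(c("echo", "warning", "message"))
```

An individual chunk can still override any of these defaults in its own header.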
echo=FALSE Don't include the code
results="hide" Don't include the output
include=FALSE Don't show code or output
eval=FALSE Don't evaluate the code at all
collapse=TRUE Collapse all the source and output blocks into a single block
warning=FALSE Don't show R warnings
message=FALSE Don't show R messages
error=FALSE Stop knitting when R raises an error instead of showing the error in the output
cache=TRUE Cache code chunk
tidy=TRUE Reformat code in a tidy way
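comment=NA Remove the ## prefix from output lines
echo=1, echo=c(1,3), echo=-(1:2) Show only the selected expressions by position; negative indices hide those expressions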
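A numeric echo value can be tried out without a separate file by knitting a small document held in memory. The sketch below assumes the knitr package is installed; the two-line chunk is a made-up example, not one from the slides.

```r
library(knitr)

# A tiny in-memory document with one chunk; echo=2 keeps only the second expression
doc <- c(
  "```{r, echo=2}",
  "x <- 1",
  "y <- 2",
  "```"
)

# knit() returns the rendered Markdown as a character string when text= is used
out <- knit(text = doc, quiet = TRUE)
cat(out)
```

Both expressions are still evaluated; echo only controls which source lines appear in the rendered output.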
code chunk without options
```{r withoutoption}
cars[1:4,]
sum(cars$speed)
sum(cars$dist)
mean(cars)
plot(cars$speed, cars$dist)
```
It looks like
cars[1:4,]
## speed dist
## 1 4 2
## 2 4 10
## 3 7 4
## 4 7 22
sum(cars$speed)
## [1] 770
sum(cars$dist)
## [1] 2149
mean(cars)
## Warning in mean.default(cars): argument is not numeric or logical:
## returning NA
## [1] NA
plot(cars$speed, cars$dist)
code chunk with options
```{r withoption, echo=2:3, warning=FALSE, collapse=TRUE, comment=NA, fig.align='center', fig.width=4}
cars[1:4,]
sum(cars$speed)
sum(cars$dist)
mean(cars)
plot(cars$speed, cars$dist)
```
It looks like
speed dist
1 4 2
2 4 10
3 7 4
4 7 22
sum(cars$speed)
[1] 770
sum(cars$dist)
[1] 2149
[1] NA
tidy
The tidy_source() function in the formatR package tidies code to improve readability.
library(formatR)
source("C:/Users/limje/Desktop/Reproducible/Jeong/UglyScript.R")
tidy_source("UglyScript.R", file="BeautifulScript.R", arrow=getOption("formatR.arrow", TRUE))
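To see what tidying does without touching any files, tidy_source() can also take code as text. The following is a minimal sketch, assuming formatR is installed; the two untidy lines are invented for illustration.

```r
library(formatR)

# Hypothetical untidy input, passed as text rather than read from a file
ugly <- c("x<-c(1,2,3)", "y=x+1")

# output = FALSE returns the result instead of printing it to the console
res <- tidy_source(text = ugly, arrow = TRUE, output = FALSE)

# The tidied lines, with spacing added and = replaced by <- for assignment
res$text.tidy
```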
Example: this is example R code: MeanRiver <- mean(rivers)
Example: the mean length of 141 major rivers in North America is 591.
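The numbers in the sentence above can be computed rather than typed by hand, which is what keeps the text in step with the data. A minimal sketch using the built-in rivers dataset follows; building the sentence with sprintf() is an illustrative choice here, and in an R Markdown document the same values would more typically be embedded with inline R expressions.

```r
# rivers is a built-in dataset of river lengths (in miles)
MeanRiver <- mean(rivers)

# Build the sentence from the data instead of hard-coding the numbers
sentence <- sprintf(
  "The mean length of %d major rivers in North America is %d",
  length(rivers), round(MeanRiver)
)
sentence
```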
Source from a local file
source("C:/Users/limje/Desktop/Reproducible/Jeong/MainAnalysis.R")
Source from a secure URL
library(devtools)
source_url("http://bit.ly/1D5p1w6")
## SHA-1 hash of file is ff75a88b90decfcaefc9903bbc283e1fc4cd2339
The SHA-1 hash is a fingerprint of the file's contents, for practical purposes unique to that file. If the file changes, its SHA-1 hash changes too.
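The behaviour of the hash can be checked locally. The sketch below assumes the digest package is installed (it is not used in the slides) and hashes a throwaway temporary file before and after a change.

```r
library(digest)  # assumption: the digest package is installed

# Write a small throwaway script to a temporary file
path <- tempfile(fileext = ".R")
writeLines("x <- 1", path)
hash_before <- digest(path, algo = "sha1", file = TRUE)

# Change the file's contents and hash it again
writeLines("x <- 2", path)
hash_after <- digest(path, algo = "sha1", file = TRUE)

# The contents differ, so the two hashes differ
hash_before == hash_after
```

devtools::source_url() also accepts a sha1 argument, so a known hash can be supplied to have the download checked before the script is run.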
cache=TRUE
cache.path: set the cache directory
Create an object Sample and save it to a file called Sample.RData
```{r gen-data, cache=TRUE}
# create data
Sample<-rnorm(n=1000, mean=5, sd=2)
# save sample
save(Sample, file="Sample.RData")
```
A later code chunk creates the histogram
```{r histogram, cache=TRUE, dependson='gen-data'}
# load Sample
load(file="Sample.RData")
# create histogram
hist(Sample)
```
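Caching and the cache directory can also be set once for the whole document instead of per chunk. This is a sketch only: the directory name "cache/" is a hypothetical choice, not one taken from the slides.

```r
library(knitr)

# Turn caching on globally and store cache files under a hypothetical "cache/" directory
opts_chunk$set(cache = TRUE, cache.path = "cache/")

# Confirm where cached results will be written
opts_chunk$get("cache.path")
```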
engine
```{r, engine="sas", engine.path="C:/Program Files/SASHome/SASFoundation/9.4/sas.exe"}
proc means data=sashelp.class;
run;
```
SAS code and output
proc means data=sashelp.class;
run;
Variable     N          Mean       Std Dev       Minimum       Maximum
----------------------------------------------------------------------
Age         19    13.3157895     1.4926722    11.0000000    16.0000000
Height      19    62.3368421     5.1270752    51.3000000    72.0000000
Weight      19   100.0263158    22.7739335    50.5000000   150.0000000
----------------------------------------------------------------------
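SAS is one of several language engines that knitr registers. A quick way to see which engine names are available in the installed version of knitr is sketched below; whether a given engine actually runs still depends on that program being installed and found, for example through engine.path.

```r
library(knitr)

# Names of all language engines registered in this knitr installation
engines <- names(knit_engines$get())
sort(engines)

# Check that the engine used in the chunk above is among them
"sas" %in% engines
```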
SAS HTML output
set.seed(123)
Draw1<-rnorm(1000, mean=0, sd=2)
summary(Draw1)
##      Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
## -5.62000 -1.25700  0.01842  0.03226  1.32900  6.48200
hist(Draw1)
set.seed(125)
Draw2<-rnorm(1000, mean=0, sd=2)
summary(Draw2)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## -7.2110 -1.4070 -0.1040 -0.1215  1.3160  5.6770
hist(Draw2)
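The two draws above differ because the seeds differ. The point of set.seed() for reproducibility is the converse: resetting the same seed reproduces exactly the same random numbers. A minimal sketch, with variable names chosen only for illustration:

```r
# Same seed, same draws
set.seed(123)
first <- rnorm(5, mean = 0, sd = 2)
set.seed(123)
second <- rnorm(5, mean = 0, sd = 2)
identical(first, second)

# A different seed gives a different sequence
set.seed(125)
third <- rnorm(5, mean = 0, sd = 2)
identical(first, third)
```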