#+TITLE: An Org-mode Demo #+AUTHOR: Eric Schulte #+OPTIONS: num:nil ^:nil f:nil #+LATEX_HEADER: \usepackage{amscd} #+STARTUP: hideblocks #+PROPERTY: session *R* #+PROPERTY: results silent # some minor customization for nicer looking LaTeX output #+begin_LaTeX \hypersetup{ linkcolor=blue, pdfborder={0 0 0 0} } \lstset{basicstyle=\ttfamily\bfseries\small} #+end_LaTeX #+begin_center Adapted from /[[http://www.stat.umn.edu/~charlie/Sweave/foo.Rnw][An Sweave Demo]]/ by Charles J. Geyer. #+end_center This is a demo for using Org-babel to produce LaTeX documents with embedded R code. To get started fire up Emacs and create a text file with the =.org= suffix. You should see Org-mode become your major mode -- denoted by =Org= in your status bar. Press =C-c C-e= while viewing this Org-mode buffer and you will see a menu appear with options for export to a variety target formats -- herein we'll only consider export to LaTeX. So now we have a more complicated file chain $$ \begin{CD} \texttt{foo.org} @>\texttt{Org-mode}>> \texttt{foo.tex} @>\texttt{latex}>> \texttt{foo.dvi} @>\texttt{xdvi}>> \text{view of document} \end{CD} $$ and what have we accomplished other than making it twice as annoying to the WYSIWYG crowds (having to use both =Org-mode= and =latex= to get anything that looks like the document)? Well, we can now include =R= in our document. Here's a simple example #+begin_src R :exports both 2 + 2 #+end_src What I actually typed in =foo.org= was : #+begin_src R :exports both : 2 + 2 : #+end_src This is a "code block" to be processed by Org-babel. When Org-babel hits such a thing, it processes it, runs R to get the results, and stuffs the output in the LaTeX file it is creating. The LaTeX between code chunks is copied verbatim (except for in-line src code, about which see below). Hence to create a /active/ document you just write plain old text interspersed with "code blocks" which are plain old R. #+LaTeX: \pagebreak[3] Plots get a little more complicated. First we make something to plot (simulated regression data). #+name: reg #+begin_src R :results output :exports both n <- 50 x <- seq(1, n) a.true <- 3 b.true <- 1.5 y.true <- a.true + b.true * x s.true <- 17.3 y <- y.true + s.true * rnorm(n) out1 <- lm(y ~ x) summary(out1) #+end_src (for once we won't show the code chunk itself, look at =foo.org= if you want to see what the actual code chunk was). Figure \ref{fig:one} (p. \pageref{fig:one}) is produced by the following code #+name: fig1plot #+begin_src R :exports code plot(x, y) abline(out1) #+end_src Note that =x=, =y=, and =out1= are remembered from the preceding code chunk. We don't have to regenerate them. All code chunks are part of one R "session". #+name: fig1 #+begin_src R :exports results :noweb yes :file fig1.pdf <> #+end_src #+attr_latex: width=0.8\textwidth,placement=[p] #+label: fig:one #+caption: Scatter Plot with Regression Line #+results: fig1 [[file:fig1.pdf]] Now this was a little tricky. We did this with two code chunks, one visible and one invisible. First we did : #+name: fig1plot : #+begin_src R :exports code :file fig1plot.pdf : plot(x, y) : abline(out1) : #+end_src where the =:exports code= indicates that only the return value (not code) should be exported and the =#+name: fig1plot= gives the code block a name (to be used later). And "later" is almost immediate. Next we did : #+name: fig1 : #+begin_src R :exports results :noweb yes :file fig1.pdf : <> : #+end_src In this code block the =:file fig1.pdf= header argument indicates that the block generates a figure. Org-babel automagically makes a PDF file for the figure, and Org-mode handles the export to LaTeX. The =<>= is an example of "code block reuse". It means that we reuse the code of the code chunk named =fig1plot=. The =:exports results= in the code block means just what it says (we've already seen the code---it was produced by the preceding chunk---and we don't want to see it again, we only want to see the results). It is important that we observe the DRY/SPOT rule (/don't repeat yourself/ or /single point of truth/) and only have one bit of code for generating the plot. What the reader sees is guaranteed to be the code that made the plot. If we had used cut-and-paste, just repeating the code, the duplicated code might get out of sync after edits. The rest of this should be recognizable to anyone who has ever done a LaTeX figure. So making a figure is a bit more complicated in some ways, but much simpler than others. Note the following virtues - The figure is guaranteed to be the one described by the text (at least by the R in the text). - No messing around with sizing or rotations. It just works! #+name: fig2 #+begin_src R :exports results :file fig2.pdf out3 <- lm(y ~ x + I(x^2) + I(x^3)) plot(x, y) curve(predict(out3, newdata=data.frame(x=x)), add = TRUE) #+end_src Note that if you don't care to show the R code to make the figure, it is simpler still. Figure \ref{fig:two} shows another plot. What I actually typed in =foo.org= was : #+name: fig2 : #+begin_src R :exports results :file fig2.pdf : out3 <- lm(y ~ x + I(x^2) + I(x^3)) : plot(x, y) : curve(predict(out3, newdata=data.frame(x=x)), add = TRUE) : #+end_src #+attr_latex: width=0.8\textwidth,placement=[p] #+label: fig:two #+caption: Scatter Plot with Cubic Regression Curve #+results: fig2 [[file:fig2.pdf]] #+LaTeX: \pagebreak Now we just excluded the code for the plot from the figure (with =:exports results= so it doesn't show). Also note that every time we re-export Figures \ref{fig:one} and \ref{fig:two} change, the latter conspicuously (because the simulated data are random). Everything just works. This should tell you the main virtue of Org-babel. It's always correct. There is never a problem with stale cut-and-paste. #+begin_src R :exports none options(scipen=10) #+end_src #+results: : 0 Simple numbers can be plugged into the text with the =src_R= command, for example, the quadratic and cubic regression coefficients in the preceding regression were \beta_2 = src_R{round(out3$coef[3], 4)} and \beta_3 = src_R{round(out3$coef[4], 4)}. Just magic! What I actually typed in =foo.org= was : were \beta_2 = src_R{round(out3$coef[3], 4)} : and \beta_3 = src_R{round(out3$coef[4], 4)} #+begin_src R :exports none options(scipen=0) #+end_src The =xtable= command is used to make tables. (The following is the Org-babel output of another code block that we don't explicitly show. Look at =foo.org= for details.) #+begin_src R :exports both :results output out2 <- lm(y ~ x + I(x^2)) foo <- anova(out1, out2, out3) foo #+end_src #+begin_src R :exports both :results output class(foo) #+end_src #+begin_src R :exports both :results output dim(foo) #+end_src #+name: foo-as-matrix #+begin_src R :exports both :results output foo <- as.matrix(foo) foo #+end_src #+LaTeX: \pagebreak #+begin_src R :results output latex :exports results library(xtable) xtable(foo, caption = "ANOVA Table", label = "tab:one", digits = c(0, 0, 2, 0, 2, 3, 3)) #+end_src #+results: foo-as-matrix So now we are ready to turn the matrix =foo= into Table \ref{tab:one} using the R chunk : #+begin_src R :results output latex :exports results : library(xtable) : xtable(foo, caption = "ANOVA Table", label = "tab:one", : digits = c(0, 0, 2, 0, 2, 3, 3)) : #+end_src (note the difference between arguments to the =xtable= function and to the =xtable= method of the =print= function) To summarize, Org-babel is terrific, so important that soon we'll not be able to get along without it. Its virtues are - The numbers and graphics you report are actually what they are claimed to be. - Your analysis is reproducible. Even years later, when you've completely forgotten what you did, the whole write-up, every single number or pixel in a plot is reproducible. - Your analysis actually works---at least in this particular instance. The code you show actually executes without error. - Toward the end of your work, with the write-up almost done you discover an error. Months of rework to do? No! Just fix the error and re-export. One single problem like this and you will have all the time invested in Org-babel repaid. - This methodology provides discipline. There's nothing that will make you clean up your code like the prospect of actually revealing it to the world. Whether we're talking about homework, a consulting report, a textbook, or a research paper. If they involve computing and statistics, this is the way to do it.