Org-mode mailing list
From: Dan Davison <davison@stats.ox.ac.uk>
To: emacs org-mode mailing list <emacs-orgmode@gnu.org>
Subject: org tables and R
Date: Tue, 30 Dec 2008 19:35:50 +0000
Message-ID: <20081230193550.GA7961@stats.ox.ac.uk>

Hi all,

I've had a go at taking the org tables and R thing a bit further. I'm
using two different kinds of #+ line in the org buffer. Lines starting
with #+TBLR: (one colon) use the standard org option:value style and
specify certain transformations of the table and standard plots of the
table data. Lines starting with #+TBLR:: (two colons) take literal R
code, giving you full control over what you do with the table. M-x
org-table-R-apply carries out whatever those lines specify. As long as
the result is reasonably one- or two-dimensional, it is written to the
org buffer as an org table (you can choose whether or not it replaces
the original table). You need to have R running in an
inferior-ess-mode buffer. Then, if you have this table,

| rowname | col1 | col2 |
|---------+------+------|
| row 1   |    1 |    2 |
| row 2   |    3 |    4 |
| total   |      |      |
#+TBLR:: x[3,] <- x[1,] + x[2,]
#+TBLR: rownames:1

org-table-R-apply turns it into 

| rownames(x) | col1 | col2 |
|-------------+------+------|
| row 1       |    1 |    2 |
| row 2       |    3 |    4 |
| total       |    4 |    6 |
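
As a rough illustration of what the #+TBLR:: line does, here is the
same computation in plain R. This is only a sketch: the data frame is
built by hand, and the assumption that the empty cells of the "total"
row arrive in R as NA is mine, not something taken from
org-table-R.el.

```r
# Hand-built stand-in for the org table. Column 1 of the org table is
# used as row names (rownames:1); the empty "total" cells are ASSUMED
# to be read as NA.
x <- data.frame(
  col1 = c(1, 3, NA),
  col2 = c(2, 4, NA),
  row.names = c("row 1", "row 2", "total")
)

# The user-supplied code from the #+TBLR:: line
x[3, ] <- x[1, ] + x[2, ]

print(x)
```

Because the row names live outside the data columns, row 3 can be
overwritten with a sum without touching the "total" label.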

The action:<something> option specifies off-the-shelf actions, without
having to write any R code. E.g.

| col1 | col2 |
|------+------|
|    1 |    2 |
|    3 |    4 |
#+TBLR: action:transpose

produces

|      | V1 | V2 |
|------+----+----|
| col1 |  1 |  3 |
| col2 |  2 |  4 |

and

#+TBLR: action:plot columns:((1)(2)) lines:t rownames:1

would plot column 2 against column 1.
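
The transpose example above can be approximated in plain R as
follows. This is a sketch of the idea, not the function that
org-table-R-apply generates. as.data.frame() labels unnamed matrix
columns V1, V2, ..., which would be consistent with the header shown
in the transposed table.

```r
# Hand-built stand-in for the two-column org table
x <- data.frame(col1 = c(1, 3), col2 = c(2, 4))

# t() returns a matrix; converting back to a data frame gives
# something that can be printed as a table again
tx <- as.data.frame(t(x))

print(tx)
```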

You can mix user-code and off-the-shelf code: in this somewhat
solipsistic example user-supplied code is used to extract the day of
week, and then action:tabulate is used to build a 2-way table:

| author              | date                            |
|---------------------+---------------------------------|
| Carsten Dominik     | Thu, 12 Jun 2008 12:51:54 +0200 |
| Carsten Dominik     | Wed, 11 Jun 2008 08:57:39 +0200 |
| Adam Spiers         | Wed, 11 Jun 2008 12:06:23 +0100 |
| Eddward DeVilla     | Wed, 11 Jun 2008 12:15:11 -0500 |
| Eddward DeVilla     | Wed, 11 Jun 2008 20:09:50 -0500 |
| Harri Kiiskinen     | Wed, 04 Jun 2008 16:38:37 +0200 |
| Carsten Dominik     | Thu, 12 Jun 2008 14:15:49 +0200 |
| Harri Kiiskinen     | Thu, 12 Jun 2008 14:31:49 +0200 |
| Carsten Dominik     | Thu, 12 Jun 2008 16:17:59 +0200 |
| Manoj Srivastava    | Mon, 09 Jun 2008 01:52:03 -0500 |
| Daniel Clemente     | Wed, 04 Jun 2008 16:35:01 +0200 |
| Carsten Dominik     | Mon, 9 Jun 2008 09:56:09 +0200  |
| Carsten Dominik     | Tue, 10 Jun 2008 10:05:24 +0200 |
| Adam Spiers         | Tue, 10 Jun 2008 10:57:52 +0100 |
| Manuel Hermenegildo | Tue, 10 Jun 2008 13:50:44 +0200 |
| Christian Egli      | Tue, 24 Jun 2008 13:27:05 +0200 |
#+TBLR: columns:(1 2) action:tabulate
#+TBLR:: x[,2] <- substr(x[,2], 1, 3)

results in

|                     | Mon | Thu | Tue | Wed |
|---------------------+-----+-----+-----+-----|
| Adam Spiers         |   0 |   0 |   1 |   1 |
| Carsten Dominik     |   1 |   3 |   1 |   1 |
| Christian Egli      |   0 |   0 |   1 |   0 |
| Daniel Clemente     |   0 |   0 |   0 |   1 |
| Eddward DeVilla     |   0 |   0 |   0 |   2 |
| Harri Kiiskinen     |   0 |   1 |   0 |   1 |
| Manoj Srivastava    |   1 |   0 |   0 |   0 |
| Manuel Hermenegildo |   0 |   0 |   1 |   0 |
#+TBLR: action:barplot rownames:1 columns:(1 2 3 4) showcode:t

The #+TBLR: line below that result table produces a bar plot of the
data.
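
For readers who want to try the day-of-week tabulation directly at
the R prompt, the following sketch reproduces it with substr() and
table(). The data frame is typed in by hand from the table above; how
org-table-R.el actually passes the table to R is not shown here, and
using table() for action:tabulate is my assumption.

```r
# The author/date table from above, entered by hand
x <- data.frame(
  author = c("Carsten Dominik", "Carsten Dominik", "Adam Spiers",
             "Eddward DeVilla", "Eddward DeVilla", "Harri Kiiskinen",
             "Carsten Dominik", "Harri Kiiskinen", "Carsten Dominik",
             "Manoj Srivastava", "Daniel Clemente", "Carsten Dominik",
             "Carsten Dominik", "Adam Spiers", "Manuel Hermenegildo",
             "Christian Egli"),
  date = c("Thu, 12 Jun 2008 12:51:54 +0200",
           "Wed, 11 Jun 2008 08:57:39 +0200",
           "Wed, 11 Jun 2008 12:06:23 +0100",
           "Wed, 11 Jun 2008 12:15:11 -0500",
           "Wed, 11 Jun 2008 20:09:50 -0500",
           "Wed, 04 Jun 2008 16:38:37 +0200",
           "Thu, 12 Jun 2008 14:15:49 +0200",
           "Thu, 12 Jun 2008 14:31:49 +0200",
           "Thu, 12 Jun 2008 16:17:59 +0200",
           "Mon, 09 Jun 2008 01:52:03 -0500",
           "Wed, 04 Jun 2008 16:35:01 +0200",
           "Mon, 9 Jun 2008 09:56:09 +0200",
           "Tue, 10 Jun 2008 10:05:24 +0200",
           "Tue, 10 Jun 2008 10:57:52 +0100",
           "Tue, 10 Jun 2008 13:50:44 +0200",
           "Tue, 24 Jun 2008 13:27:05 +0200"),
  stringsAsFactors = FALSE
)

# User-supplied step (the #+TBLR:: line): keep only the day of week
x[, 2] <- substr(x[, 2], 1, 3)

# Off-the-shelf step: count joint occurrences of columns 1 and 2
counts <- table(x[, 1], x[, 2])

print(counts)
```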

There are more details below. The code is at

http://www.stats.ox.ac.uk/~davison/software/org-table-R/org-table-R.el

It would be great to get any feedback on this. My thought was that
something like this has the potential to provide a unified plotting
and table formula interface, which might be attractive to people who
know and/or like and/or want to learn R. There's lots more that could
be done with this, and there must be all sorts of bugs in it at this
stage. But if there's any interest in it then it could be
improved. Anyway, read on if you're interested in hearing more details
about the options and actions available.

Dan

Currently, the available actions are

- plot
   A simple plot of the x and y values. If no x-values are specified
   then the y values are plotted against 1,2,...,length(y). If
   lines:t then the points are joined by lines.
- lines
   Equivalent to action:plot lines:t
- points
   Equivalent to action:plot lines:nil
- barplot
   Create a bar plot. A vertical bar is drawn for each row, with
   height given by the value in that row. If multiple columns are
   selected the bars for different columns are placed side-by-side.
- hist
   A histogram
- density
   A smoothed histogram
- image
   A plot of a table in which each cell is coloured according to its
   numeric value.
- tabulate
   Create a table containing counts of the distinct values of the
   columns selected (if v columns are selected, the table will be
   v-dimensional, giving the counts of joint occurrences of the
   different values of the columns).
- transpose
   Transpose the table
   

Apart from tabulate and transpose, these actions produce plots of the
selected columns using the R function of the same name (type
e.g. ?barplot at the R prompt to see the help page).
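
To give a feel for the R functions behind the plotting actions, here
is a small sketch that calls each of them on made-up data. The
mapping in the comments is my reading of the list above, not a
transcript of the generated code. Note that in R density() only
computes the estimate, so plot() is needed to draw it.

```r
pdf(NULL)  # null graphics device: nothing is written to disk

y <- c(2, 4, 3, 7, 5)  # made-up data
m <- matrix(c(1, 3, 2, 4), nrow = 2,
            dimnames = list(c("r1", "r2"), c("col1", "col2")))

plot(y)                       # action:plot or action:points
plot(y, type = "l")           # action:lines (action:plot lines:t)
h <- hist(y)                  # action:hist
d <- density(y)               # action:density: compute the estimate
plot(d)                       #   and then draw it
barplot(t(m), beside = TRUE)  # action:barplot: one group per table row,
                              #   selected columns side by side
image(m)                      # action:image

invisible(dev.off())
```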

In addition to the action: option, the following options can be given
on the #+TBLR: line:
   
- showcode:t
   org-table-R-apply creates an R function which hopefully implements
   the requested actions (explicit user-supplied code comes first;
   off-the-shelf afterwards). With this option that function
   definition is displayed in a new R-mode buffer. That could serve as
   a starting point for fine-tuning the behaviour. One option would be
   to edit that function definition (say you call it f), save it in a
   file, and then use
#+TBLR:: source("/path/to/file.R") ; f(x)

- rownames:<integer>
   Specifies the number of the column that contains the names of the
   rows of the table. These names must be unique.

- replace:t
   The original org-table is replaced by the text output (which will be
   an org-table if the result is like a 1- or 2-dimensional array).

- columns:<lisp-list>
   This specifies the columns that the off-the-shelf action will
   operate on (e.g. the columns you want to plot). The simplest case
   is columns:j, where j is an integer. This could also be written
   columns:(j). columns:((1)(2 3)) says that you want a graphic in
   which columns 2 and 3 are plotted on the y-axis, and column 1 is
   plotted on the x-axis. What form that will take depends on the
   plotting function used (action:<something>). It might involve
   multiple plots in a single figure, although to be fair I haven't
   implemented most of the multiple column options so you're likely to
   get an error with anything except for
   action:<plot/lines/points>. I've given a description of how columns
   are specified, and what sort of behaviour might be expected, in the
   docstring to org-table-R-make-index-vectors. Basically, my
   intention was that columns:((1)(2 3)) should correspond to
   xy.coords(x=1, y=c(2,3)) in R. (See ?xy.coords if you want to get
   involved in this.)
		
- lines:t
   When action:plot is given, this means that the points are joined
   with lines. That's the same behaviour as action:lines.

- output-to-buffer:t
   This specifies that the text output from R goes into the org
   buffer. You shouldn't normally need to use this option as the code
   tries to work out whether it's appropriate. The rule it follows is
   that the org buffer gets the output if any bespoke code has been
   supplied on the #+TBLR:: line, or if an action: has been requested
   that results in text (action:<tabulate/transpose> at the moment).
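
The intended meaning of columns:((1)(2 3)) described above can be
sketched in plain R like this. The variable names x_col and y_cols
and the use of matplot() are my own choices for illustration; the
real index handling is in org-table-R-make-index-vectors and may
differ.

```r
pdf(NULL)  # null graphics device: nothing is written to disk

# Made-up table with one x column and two y columns
x <- data.frame(time = 1:5,
                a = c(2, 4, 3, 7, 5),
                b = c(1, 1, 2, 3, 5))

# columns:((1)(2 3)) read as: column 1 on the x-axis,
# columns 2 and 3 on the y-axis
x_col  <- 1
y_cols <- c(2, 3)

# lines:t corresponds to type = "l"; type = "p" would give points only
matplot(x[, x_col], as.matrix(x[, y_cols]), type = "l",
        xlab = names(x)[x_col], ylab = "value")

# Single-series case, columns:((1)(2)): xy.coords() pairs the values up
xy <- xy.coords(x = x[, 1], y = x[, 2])

invisible(dev.off())
```

matplot() is used here because plot() takes a single y series, while
matplot() draws one line per column of a matrix.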
  
p.s.
I agree with Eric that we could do with a way of referencing tables
from remote areas of an org file.


-- 
http://www.stats.ox.ac.uk/~davison
