From 874f3b0dc644f15de2f842617852f1ab187ea7ce Mon Sep 17 00:00:00 2001
From: Jarmo Hurri
Date: Sun, 14 Oct 2012 14:21:33 +0300
Subject: [PATCH] Added a new tutorial on table lookup functions.
---
org-tutorials/org-lookups.org | 229 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 229 insertions(+)
create mode 100644 org-tutorials/org-lookups.org
diff --git a/org-tutorials/org-lookups.org b/org-tutorials/org-lookups.org
new file mode 100644
index 0000000..c73777c
--- /dev/null
+++ b/org-tutorials/org-lookups.org
@@ -0,0 +1,229 @@
+#+OPTIONS: H:3 num:nil toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t TeX:t LaTeX:t skip:nil d:(HIDE) tags:not-in-toc
+#+STARTUP: align fold nodlcheck hidestars oddeven lognotestate
+#+SEQ_TODO: TODO(t) INPROGRESS(i) WAITING(w@) | DONE(d) CANCELED(c@)
+#+TAGS: Write(w) Update(u) Fix(f) Check(c)
+#+TITLE: Org tutorial on table lookup functions
+#+AUTHOR: Jarmo Hurri
+#+EMAIL: jarmo.hurri AT syk DOT fi
+#+LANGUAGE: en
+#+PRIORITIES: A C B
+#+CATEGORY: org-tutorial
+* Introduction
+
+ Org provides three different functions for performing searches and
+ data dependent calculations in tables. These functions can, among
+ other things, be used to implement associative arrays, count
+ matching cells, rank results, or group data. The following examples
+ will hopefully help you in getting started with these functions.
+
+* Associative array with unique keys
+
+ The most straightforward use of lookups is to treat part of an org
+ table as an associative array: a key can be used to look up a
+ corresponding value.
+
+ Say you are taking a trip to Scandinavia, and you want to keep track
+ of how much money you have spent on the trip. You decide to convert
+ all sums to euros. Before your trip you write down the following
+ table of approximate currency rates.
+ #+TBLNAME: rates
+ | currency | abbreviation | euros |
+ |-----------------+--------------+-------|
+ | euro | eur | 1 |
+ | Norwegian krone | nok | 0.14 |
+ | Swedish krona | sek | 0.12 |
+ | US dollar | usd | 0.77 |
+
+ In what follows we will use the function =org-lookup-first= and the
+ previous table =rates= to automatically convert the sums in
+ different currencies to euros. The signature of function
+ =org-lookup-first= looks as follows:
+ #+BEGIN_SRC elisp
+ (org-lookup-first VAL S-LIST R-LIST &optional PREDICATE)
+ #+END_SRC
+ Assuming that =PREDICATE= is =nil=, in which case the default
+ predicate =equal= is used, this function does a search for the first
+ instance of =VAL= in =S-LIST= and returns the a value from the
+ corresponding position in =R-LIST=. In the table below, each sum is
+ assigned a currency abbreviation; a lookup is done in table =rates=
+ above in the second column for the corresponding abbreviation, and
+ then the corresponding rate is returned from the third column. For
+ each row only the first four columns need to filled; columns 5 and 6
+ are calculated automatically. Notice that an error results if the
+ key is not found: in the last row, an empty key is being searched
+ for.
+
+ | date | expense | sum | currency | rate | euros |
+ |-------+------------------+------+----------+--------+--------|
+ | 1.3. | flights | 324 | eur | 1 | 324 |
+ | 4.6. | books and maps | 243 | usd | 0.77 | 187.11 |
+ | 30.7. | rental car | 8300 | sek | 0.12 | 996. |
+ | 2.7. | hotel | 1150 | sek | 0.12 | 138. |
+ | 2.7. | lunch | 190 | sek | 0.12 | 22.8 |
+ | 3.7. | fishing licenses | 1400 | nok | 0.14 | 196. |
+ | 3.7. | gasoline | 340 | | #ERROR | #ERROR |
+ #+TBLFM: $5='(org-lookup-first $4 '(remote(rates,@2$2..@>$2)) '(remote(rates,@2$3..@>$3)))::$6=$5*$3
+
+* Multiple matches with preferred ordering
+
+ A common task for teachers is the assignment of exam grades from
+ total marks. The starting point for such grading is a table with
+ grade boundaries. Below is one such table, with the rows in
+ increasing order of the lower bound required for a particular grade.
+
+ #+TBLNAME: grade-boundaries
+ | lower bound | grade |
+ |-------------+-------|
+ | 0 | F |
+ | 10 | D |
+ | 20 | C |
+ | 30 | B |
+ | 40 | A |
+
+ We will use the function =org-lookup-last= and the previous table
+ =grade-boundaries= to assign grades to students based on their
+ marks. The signature of function =org-lookup-last= is exactly like
+ the signature of =org-lookup-first=:
+ #+BEGIN_SRC elisp
+ (org-lookup-last VAL S-LIST R-LIST &optional PREDICATE)
+ #+END_SRC
+ However, this function does a search for the /last/ match in
+ =S-LIST= and returns the a value from the corresponding position in
+ =R-LIST=. Here the idea of the lookup used in assigning the grade is
+ as follows. Say a student's exam result is 33 marks. We look for
+ the /last/ row in the table for which the students marks are greater
+ than or equal to the lower bound; in this case it is the row with
+ lower bound 30. The students grade is the corresponding element from
+ the second column, in this case a B.
+
+ Thus, given the number of marks =VAL= of a student, we find the last
+ row of the first column of table =grade-boundaries= for which the
+ lower bound =S= fulfils ~(>= VAL S)~. Thus we will use ~>=~ as
+ =PREDICATE= to perform the matching. Note that =VAL= and =S= are
+ assigned as arguments to the predicate in the order they are in the
+ signature of =org-lookup-last=, where =VAL= precedes =S-LIST=. The
+ following table does the conversion from total marks to the final
+ grade. Notice the literal interpolation =L= of table values into
+ the Elisp formula, which is needed because some values are numbers
+ and some are symbols.
+
+ | student | marks | grade |
+ |---------+-------+-------|
+ | X | 30 | B |
+ | Y | 29 | C |
+ | Z | 5 | F |
+ | W | 55 | A |
+ #+TBLFM: $3='(org-lookup-last $2 '(remote(grade-boundaries,@2$1..@>$1)) '(remote(grade-boundaries,@2$2..@>$2)) '>=);L
+
+* Counting matching cells
+
+ The function =org-lookup-all= can not be used by itself in a table
+ equation, because it returns a list of values. However, powerful
+ lookup tasks can be performed by combining the function with other
+ Elisp functions.
+
+ As a simple example consider counting the number of missing values
+ in a table. The signature of function =org-lookup-all= is exactly
+ like the signatures of the other two lookup functions:
+ #+BEGIN_SRC elisp
+ (org-lookup-all VAL S-LIST R-LIST &optional PREDICATE)
+ #+END_SRC
+ However, this function does a search for the /all/ matches in
+ =S-LIST= and returns the all corresponding values from the
+ corresponding positions in =R-LIST=. As is the case with
+ =org-lookup-first= and =org-lookup-last=, if =R-LIST= is =nil=, then
+ the corresponding matching values of =S-LIST= are returned
+ directly. Notice the use of the =E= flag to retain empty fields in
+ the range. Also notice that in this case we are doing the lookup in
+ a true two-dimensional range, which is thus also possible
+
+ | group | round 1 | round 2 |
+ |-------+---------+---------|
+ | A | | 2.4 |
+ | B | 4.7 | 11 |
+ | C | | |
+ | D | 5 | |
+ | E | | 7.2 |
+ | F | 3.2 | 4.3 |
+ | G | | 4.4 |
+ | H | | 8 |
+ |-------+---------+---------|
+ | total | missing | 7 |
+ #+TBLFM: @>$3='(length(org-lookup-all "" '(@2$2..@-1$3) nil));E
+* Ranking results
+
+ Another example application of =org-lookup-all= is an automatic
+ ranking of results. In the table below, a larger total number of
+ marks is better. Notice that the Elisp expression also
+ automatically takes care of ties.
+
+ | group | marks | rank |
+ |-------+-------+------|
+ | A | 22 | 2 |
+ | B | 22 | 2 |
+ | C | 14 | 4 |
+ | D | 28 | 1 |
+ | E | 9 | 5 |
+ #+TBLFM: $3='(+ 1 (length (org-lookup-all $2 '(@2$2..@>$2) nil '<)));N
+* Frequency counts from raw data
+ A common situation in the analysis of data is the classification
+ (grouping) of raw data values for, e.g., visualisation. Often this
+ is done by counting the frequencies of observations within certain
+ bounds. The function =org-lookup-all=, combined with other Elisp
+ functions, can be used to perform this task. This example also shows
+ how to construct more complicated lookup rules using multiple values
+ from a table.
+
+ Consider the following table with different results from different
+ groups A-I.
+ #+TBLNAME: raw-data
+ | group | result |
+ |-------+--------|
+ | A | 2.3 |
+ | B | 4.2 |
+ | C | 1.1 |
+ | D | 3.6 |
+ | E | 4.5 |
+ | F | 2.4 |
+ | G | 1.0 |
+ | H | 2.3 |
+ | I | 2.8 |
+
+ We will classify the results into different, mutually exclusive
+ classes. For example, the observations that will belong to the first
+ class are in the interval =[1, 1.9]= (endpoints included). In order
+ to perform this classification, we define the following two-place
+ predicate function =in-interval=. Notice that the first parameter of
+ this function is a pair whose first element is the lower bound and
+ second member the upper bound of the interval.
+
+ #+BEGIN_SRC emacs-lisp
+ (defun in-interval (bounds el)
+ (and (>= el (car bounds)) (<= el (cadr bounds))))
+ #+END_SRC
+
+ #+RESULTS:
+ : in-interval
+
+ Using this predicate function, we can construct a table with class
+ boundaries and corresponding frequencies. Note that the first
+ argument to the function =org-lookup-all=, which is then passed over
+ as the first argument to the predicate =in-interval=, is the pair of
+ bounds.
+
+ | lower bound | upper bound | frequency |
+ |-------------+-------------+-----------|
+ | 1 | 1.9 | 2 |
+ | 2 | 2.9 | 4 |
+ | 3 | 3.9 | 1 |
+ | 4 | 4.9 | 2 |
+ #+TBLFM: $3='(length (org-lookup-all '($1 $2) '(remote(raw-data,@2$2..@>$2)) nil 'in-interval));N
+* Conclusion
+
+ The org lookup functions can be used for a large number of different
+ data-dependent calculations. For example, the following spreadsheet
+ operations familiar to libreoffice or Excel users can be implemented
+ using them: =HLOOKUP=, =VLOOKUP=, =COUNTIF=, =SUMIF= and
+ =FREQUENCY=. If you have other interesting examples of the use of
+ these functions, feel free to send them to the [[https://lists.gnu.org/mailman/listinfo/emacs-orgmode][org mailing list]] and
+ we will be happy to add them on this page.
--
2.1.4