From: Jarmo Hurri Date: Sun, 14 Oct 2012 11:21:33 +0000 (+0300) Subject: Added a new tutorial on table lookup functions. X-Git-Tag: release_7.9.3~76 X-Git-Url: http://orgmode.org/w/?p=worg.git;a=commitdiff_plain;h=874f3b0dc644f15de2f842617852f1ab187ea7ce Added a new tutorial on table lookup functions. --- diff --git a/org-tutorials/org-lookups.org b/org-tutorials/org-lookups.org new file mode 100644 index 0000000..c73777c --- /dev/null +++ b/org-tutorials/org-lookups.org @@ -0,0 +1,229 @@ +#+OPTIONS: H:3 num:nil toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t TeX:t LaTeX:t skip:nil d:(HIDE) tags:not-in-toc +#+STARTUP: align fold nodlcheck hidestars oddeven lognotestate +#+SEQ_TODO: TODO(t) INPROGRESS(i) WAITING(w@) | DONE(d) CANCELED(c@) +#+TAGS: Write(w) Update(u) Fix(f) Check(c) +#+TITLE: Org tutorial on table lookup functions +#+AUTHOR: Jarmo Hurri +#+EMAIL: jarmo.hurri AT syk DOT fi +#+LANGUAGE: en +#+PRIORITIES: A C B +#+CATEGORY: org-tutorial +* Introduction + + Org provides three different functions for performing searches and + data dependent calculations in tables. These functions can, among + other things, be used to implement associative arrays, count + matching cells, rank results, or group data. The following examples + will hopefully help you in getting started with these functions. + +* Associative array with unique keys + + The most straightforward use of lookups is to treat part of an org + table as an associative array: a key can be used to look up a + corresponding value. + + Say you are taking a trip to Scandinavia, and you want to keep track + of how much money you have spent on the trip. You decide to convert + all sums to euros. Before your trip you write down the following + table of approximate currency rates. + #+TBLNAME: rates + | currency | abbreviation | euros | + |-----------------+--------------+-------| + | euro | eur | 1 | + | Norwegian krone | nok | 0.14 | + | Swedish krona | sek | 0.12 | + | US dollar | usd | 0.77 | + + In what follows we will use the function =org-lookup-first= and the + previous table =rates= to automatically convert the sums in + different currencies to euros. The signature of function + =org-lookup-first= looks as follows: + #+BEGIN_SRC elisp + (org-lookup-first VAL S-LIST R-LIST &optional PREDICATE) + #+END_SRC + Assuming that =PREDICATE= is =nil=, in which case the default + predicate =equal= is used, this function does a search for the first + instance of =VAL= in =S-LIST= and returns the a value from the + corresponding position in =R-LIST=. In the table below, each sum is + assigned a currency abbreviation; a lookup is done in table =rates= + above in the second column for the corresponding abbreviation, and + then the corresponding rate is returned from the third column. For + each row only the first four columns need to filled; columns 5 and 6 + are calculated automatically. Notice that an error results if the + key is not found: in the last row, an empty key is being searched + for. + + | date | expense | sum | currency | rate | euros | + |-------+------------------+------+----------+--------+--------| + | 1.3. | flights | 324 | eur | 1 | 324 | + | 4.6. | books and maps | 243 | usd | 0.77 | 187.11 | + | 30.7. | rental car | 8300 | sek | 0.12 | 996. | + | 2.7. | hotel | 1150 | sek | 0.12 | 138. | + | 2.7. | lunch | 190 | sek | 0.12 | 22.8 | + | 3.7. | fishing licenses | 1400 | nok | 0.14 | 196. | + | 3.7. | gasoline | 340 | | #ERROR | #ERROR | + #+TBLFM: $5='(org-lookup-first $4 '(remote(rates,@2$2..@>$2)) '(remote(rates,@2$3..@>$3)))::$6=$5*$3 + +* Multiple matches with preferred ordering + + A common task for teachers is the assignment of exam grades from + total marks. The starting point for such grading is a table with + grade boundaries. Below is one such table, with the rows in + increasing order of the lower bound required for a particular grade. + + #+TBLNAME: grade-boundaries + | lower bound | grade | + |-------------+-------| + | 0 | F | + | 10 | D | + | 20 | C | + | 30 | B | + | 40 | A | + + We will use the function =org-lookup-last= and the previous table + =grade-boundaries= to assign grades to students based on their + marks. The signature of function =org-lookup-last= is exactly like + the signature of =org-lookup-first=: + #+BEGIN_SRC elisp + (org-lookup-last VAL S-LIST R-LIST &optional PREDICATE) + #+END_SRC + However, this function does a search for the /last/ match in + =S-LIST= and returns the a value from the corresponding position in + =R-LIST=. Here the idea of the lookup used in assigning the grade is + as follows. Say a student's exam result is 33 marks. We look for + the /last/ row in the table for which the students marks are greater + than or equal to the lower bound; in this case it is the row with + lower bound 30. The students grade is the corresponding element from + the second column, in this case a B. + + Thus, given the number of marks =VAL= of a student, we find the last + row of the first column of table =grade-boundaries= for which the + lower bound =S= fulfils ~(>= VAL S)~. Thus we will use ~>=~ as + =PREDICATE= to perform the matching. Note that =VAL= and =S= are + assigned as arguments to the predicate in the order they are in the + signature of =org-lookup-last=, where =VAL= precedes =S-LIST=. The + following table does the conversion from total marks to the final + grade. Notice the literal interpolation =L= of table values into + the Elisp formula, which is needed because some values are numbers + and some are symbols. + + | student | marks | grade | + |---------+-------+-------| + | X | 30 | B | + | Y | 29 | C | + | Z | 5 | F | + | W | 55 | A | + #+TBLFM: $3='(org-lookup-last $2 '(remote(grade-boundaries,@2$1..@>$1)) '(remote(grade-boundaries,@2$2..@>$2)) '>=);L + +* Counting matching cells + + The function =org-lookup-all= can not be used by itself in a table + equation, because it returns a list of values. However, powerful + lookup tasks can be performed by combining the function with other + Elisp functions. + + As a simple example consider counting the number of missing values + in a table. The signature of function =org-lookup-all= is exactly + like the signatures of the other two lookup functions: + #+BEGIN_SRC elisp + (org-lookup-all VAL S-LIST R-LIST &optional PREDICATE) + #+END_SRC + However, this function does a search for the /all/ matches in + =S-LIST= and returns the all corresponding values from the + corresponding positions in =R-LIST=. As is the case with + =org-lookup-first= and =org-lookup-last=, if =R-LIST= is =nil=, then + the corresponding matching values of =S-LIST= are returned + directly. Notice the use of the =E= flag to retain empty fields in + the range. Also notice that in this case we are doing the lookup in + a true two-dimensional range, which is thus also possible + + | group | round 1 | round 2 | + |-------+---------+---------| + | A | | 2.4 | + | B | 4.7 | 11 | + | C | | | + | D | 5 | | + | E | | 7.2 | + | F | 3.2 | 4.3 | + | G | | 4.4 | + | H | | 8 | + |-------+---------+---------| + | total | missing | 7 | + #+TBLFM: @>$3='(length(org-lookup-all "" '(@2$2..@-1$3) nil));E +* Ranking results + + Another example application of =org-lookup-all= is an automatic + ranking of results. In the table below, a larger total number of + marks is better. Notice that the Elisp expression also + automatically takes care of ties. + + | group | marks | rank | + |-------+-------+------| + | A | 22 | 2 | + | B | 22 | 2 | + | C | 14 | 4 | + | D | 28 | 1 | + | E | 9 | 5 | + #+TBLFM: $3='(+ 1 (length (org-lookup-all $2 '(@2$2..@>$2) nil '<)));N +* Frequency counts from raw data + A common situation in the analysis of data is the classification + (grouping) of raw data values for, e.g., visualisation. Often this + is done by counting the frequencies of observations within certain + bounds. The function =org-lookup-all=, combined with other Elisp + functions, can be used to perform this task. This example also shows + how to construct more complicated lookup rules using multiple values + from a table. + + Consider the following table with different results from different + groups A-I. + #+TBLNAME: raw-data + | group | result | + |-------+--------| + | A | 2.3 | + | B | 4.2 | + | C | 1.1 | + | D | 3.6 | + | E | 4.5 | + | F | 2.4 | + | G | 1.0 | + | H | 2.3 | + | I | 2.8 | + + We will classify the results into different, mutually exclusive + classes. For example, the observations that will belong to the first + class are in the interval =[1, 1.9]= (endpoints included). In order + to perform this classification, we define the following two-place + predicate function =in-interval=. Notice that the first parameter of + this function is a pair whose first element is the lower bound and + second member the upper bound of the interval. + + #+BEGIN_SRC emacs-lisp + (defun in-interval (bounds el) + (and (>= el (car bounds)) (<= el (cadr bounds)))) + #+END_SRC + + #+RESULTS: + : in-interval + + Using this predicate function, we can construct a table with class + boundaries and corresponding frequencies. Note that the first + argument to the function =org-lookup-all=, which is then passed over + as the first argument to the predicate =in-interval=, is the pair of + bounds. + + | lower bound | upper bound | frequency | + |-------------+-------------+-----------| + | 1 | 1.9 | 2 | + | 2 | 2.9 | 4 | + | 3 | 3.9 | 1 | + | 4 | 4.9 | 2 | + #+TBLFM: $3='(length (org-lookup-all '($1 $2) '(remote(raw-data,@2$2..@>$2)) nil 'in-interval));N +* Conclusion + + The org lookup functions can be used for a large number of different + data-dependent calculations. For example, the following spreadsheet + operations familiar to libreoffice or Excel users can be implemented + using them: =HLOOKUP=, =VLOOKUP=, =COUNTIF=, =SUMIF= and + =FREQUENCY=. If you have other interesting examples of the use of + these functions, feel free to send them to the [[https://lists.gnu.org/mailman/listinfo/emacs-orgmode][org mailing list]] and + we will be happy to add them on this page.