Advanced searching
Introduction
Org-mode has many powerful built-in search functions. These tools transform hierarchical org files into robust plain text "databases" that can be queried in sophisticated ways. Outline headings in Org-mode not only function as document sections or todo items; each heading can also store an unlimited amount of text and various types of metadata. And, of course, since Org-mode files are plain text, any number of tools (grep, awk, perl, etc.) can be used to filter and manipulate the data they contain.
The goal of this tutorial is to offer an introduction to the built-in commands and syntax for querying Org-mode outlines. While these are explained in various places in the Org-mode manual, this tutorial attempts to provide an overview in one place. It is particularly aimed at those who would like to use Org-mode as a note-taking and reference management tool. Nonetheless, it should prove useful to anyone who needs to locate specific information buried in an ever-growing collection of Org-mode data.
Outline nodes as "data containers"
Before discussing specific search commands, it is worth taking a few moments to consider the basic structure of an Org-mode entry.
Though outline entries can be nested within one another hierarchically, each node is also a discrete container of data. Indeed, a large variety of metadata—a todo keyword, tags, timestamps, logging information, and properties (i.e., arbitrary data pairs)—can be attached to each heading. Similarly each outline entry can store an unlimited amount of text.
Here is an example:
* TODO Buy clothes for wedding :wedding:important:errands:
SCHEDULED: <2010-12-01 Wed>
:PROPERTIES:
:estimated-cost: 100
:END:
[2010-11-17 Wed 12:22]
I need to look spiffy for the big day!
- [ ] Suit
- [ ] Tie
- [ ] Shoes
- [ ] Hat
Possible stores to visit:

| Store           | Location               | Miles away |
|-----------------+------------------------+------------|
| The Suit King   | 1000 E. Washington St. |        5.1 |
| Mr. Haberdasher | 259 Western Rd.        |        7.2 |
The sample entry above has the following metadata:
- todo keyword
- TODO
- scheduled timestamp
- 2010-12-01 Wed
- inactive timestamp
- 2010-11-17 Wed 12:22
- property
- estimated-cost ⇒ 100
- tags
- "wedding", "important", and "errands"
The entry also contains some text, including a checklist and a table.
Normally, an Org-mode file/outline contains several entries such as
the one above, nested hierarchically. Moreover, Org users typically
make their most important files available for easy searching by adding
them to their list of agenda files, either selecting them one-by-one
with C-c [
or by setting the variable org-agenda-files
.
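For the second approach, a minimal sketch of what the setting might look like in an init file. The paths are hypothetical placeholders; only the variable org-agenda-files comes from the text above.

```elisp
;; Hypothetical paths: replace them with your own files or directories.
;; A directory entry contributes the files in it that match
;; `org-agenda-file-regexp' (by default, visible files ending in .org).
(setq org-agenda-files
      '("~/org/notes.org"
        "~/org/projects.org"
        "~/org/reference/"))
```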
The agenda as a search engine
For querying a collection of org files, Org-mode includes a powerful
built-in search-engine, the agenda (C-c a
). As its name suggests,
the most common use of the agenda is to pull together, from all the
agenda files, a daily or weekly schedule or a list of todos. But the
agenda is also a powerful search engine that offers various tools
for querying both the metadata and the text of org-mode entries. In
fact, Org-mode's default agenda view (C-c a a
or org-agenda-list
)
is simply a search tool that gathers and displays all org-mode entries
with certain types of metadata—timestamps that fall within a given
range of dates.
Typing C-c a
or M-x org-agenda
brings up the agenda dispatcher, an
overview of Org-mode's various search tools:
Press key for an agenda command:        <   Buffer, subtree/region restriction
--------------------------------        >   Remove restriction
a   Agenda for current week or day      e   Export agenda views
t   List of all TODO entries            T   Entries with special TODO kwd
m   Match a TAGS/PROP/TODO query        M   Like m, but only TODO entries
L   Timeline for current buffer         #   List stuck projects (!=configure)
s   Search for keywords                 C   Configure custom agenda commands
/   Multi-occur                         ?   Find :FLAGGED: entries
A quick perusal of the commands here reveals that one can query for a wide variety of data. This tutorial will focus on three searches in particular:
C-c a T
- for todo keywords
C-c a m
- for tags and properties
C-c a s
- for full text searches
Now, let's look at the syntax for each of these search tools.
Searching metadata (todos, tags, and properties)
Todo keyword searches
The simplest type of metadata query in org-mode is org-todo-list
(invoked with C-c a T
). This function prompts the user for a search
string and then retrieves a list of outline headings containing the
TODOs specified in the search string.1
Since each outline heading can contain only one TODO keyword, the
search syntax is quite simple, consisting either of a single keyword
or two or more keywords bound together by the boolean operator |
("or").
For instance, the following query…
TODO
…retrieves all entries marked with a TODO keyword, whereas…
TODO|PROJECT|MAYBE
…displays a list of all headlines containing either TODO or PROJECT or MAYBE.
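The same query can also be run from Lisp, which is handy for experimenting: place the form in any buffer and evaluate it with C-x C-e. This is only a sketch. It assumes that PROJECT and MAYBE are actually defined as todo keywords in your files; otherwise only TODO entries can match.

```elisp
(require 'org-agenda)

;; Passing a string to `org-todo-list' selects the keywords directly,
;; much like typing them at the C-c a T prompt.
(org-todo-list "TODO|PROJECT|MAYBE")
```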
Tag searches
Though org-todo-list serves its purpose well, it is limited to
only one type of metadata. If you would like to search for other types
of metadata, or mix and match a search for todo keywords with, say, a
search for tags, org-mode offers a more powerful tool,
org-tags-view
, which is called with the following keys:
C-c a m
- searches all headlines
C-c a M
- searches only headlines with active todos
At its simplest, org-tags-view does exactly what it says: it queries for headlines marked with particular combinations of tags. The syntax for such searches follows a simple boolean logic:
|
- or
&
- and
+
- include matches
-
- exclude matches
Here are a few examples:
+computer&+urgent
…will result in all items tagged both computer and urgent, while the search…
+computer|+urgent
…will result in all items tagged either computer or urgent. Meanwhile, the query…
+computer&-urgent
…will display all items tagged computer and not urgent.
As you may have noticed, the syntax above can be a little verbose, so
org-mode offers convenient ways of shortening it. First, -
and +
imply "and" if no boolean operator is stated, so example three above
could be rewritten simply as:
+computer-urgent
Second, inclusion of matches is implied if no +
or -
is present,
so example three could be further shortened to:
computer-urgent
Example number two, meanwhile, could be shortened to:
computer|urgent
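A tag query that you use often can be stored as a custom agenda command so that it does not have to be retyped. The following is a minimal sketch; the key "u" and the description are arbitrary choices, not Org-mode defaults.

```elisp
(with-eval-after-load 'org-agenda
  ;; "u" is a hypothetical key; pick any letter not already in use.
  ;; The entry has the form (KEY DESCRIPTION TYPE MATCH).
  (add-to-list 'org-agenda-custom-commands
               '("u" "Computer items that are not urgent"
                 tags "computer-urgent")))
```

With this in place, C-c a u should run the third example query above.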
Property searches
Org-mode allows outline entries to contain any number of arbitrary data pairs, which are conveniently hidden within a folding PROPERTIES drawer, e.g.:
* TODO Evensong's magisterial work on the Amazon :science:read:BIB:
SCHEDULED: <2010-11-20 Sat>
[2010-11-16 Tue 23:11]
:PROPERTIES:
:BIB_AUTHOR: Walter Evensong
:BIB_TITLE: Mysteries of the Amazon
:BIB_PAGES: 1234
:BIB_PUBLISHER: Humbug University Press
:END:
Lots of good stuff on Brazil.
Let's imagine a free software aficionado named Mr. Gnu has added a
number of similar bibliographical outline nodes to his org files and
that he would like to find all entries that contain "Walter Evensong"
in their BIB_AUTHOR
field. He can construct such a search by
calling org-tags-view
and entering the desired key/value match:
C-c a m Match: BIB_AUTHOR="Walter Evensong"
Property searches can be mixed and matched with tag searches. If Mr.
Gnu would like to see all books by "Walter Evensong" with the tag
"read", he can simply join the two desired matches together with the
+
sign:
BIB_AUTHOR="Walter Evensong"+read
Properties with numeric values can be queried with inequalities. If Mr. Gnu would like to retrieve all books by the prolific Walter Evensong that span over 1000 pages, he could enter the following:
BIB_AUTHOR="Walter Evensong"+BIB_PAGES>1000
The comparison operators for searches are as follows:
= (equal), > (greater than), >= (greater than or equal to), < (less than), <= (less than or equal to), <> (not equal)
What if Mr. Gnu would like to find all books by Walter Evensong or any books over 1000 pages?
BIB_AUTHOR="Walter Evensong"|BIB_PAGES>1000
For his own clarity, Mr. Gnu can always insert "+" signs, though they are not required:
+BIB_AUTHOR="Walter Evensong"|+BIB_PAGES>1000
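The same match syntax is accepted by org-map-entries, which makes it possible to collect results in Lisp instead of viewing them in the agenda. The sketch below is illustrative only: the function name is hypothetical, and it assumes the BIB_ properties are set as in the sample entry.

```elisp
(require 'org)

(defun my/evensong-long-books ()
  "Return the headings of agenda entries by Walter Evensong over 1000 pages.
The function name is a hypothetical example."
  (org-map-entries
   ;; Called with point on each matching headline;
   ;; the two arguments drop tags and the todo keyword from the result.
   (lambda () (org-get-heading t t))
   ;; Inner double quotes must be escaped inside a Lisp string.
   "BIB_AUTHOR=\"Walter Evensong\"+BIB_PAGES>1000"
   ;; Search all agenda files rather than just the current buffer.
   'agenda))
```

Evaluating (my/evensong-long-books) returns a list of headline strings, one per matching entry.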
It is important to note that the equal sign in the searches above implies an exact match. If Mr. Gnu is searching for a string, such as "Mysteries of the Amazon", the entire search query must match. Thus, the search…
BIB_TITLE="Amazon"
…will not match the entry above.
How then can you search for partial matches? The answer is regular expressions. Instead of surrounding your query with quotation marks (which require a precise and complete match), you can enclose it in curly brackets, which instructs Org-mode to treat the query as a regular expression. Thus, the search…
BIB_TITLE={Amazon}
…will locate all entries whose BIB_TITLE contains the sequence "Amazon" and pull them up in the agenda:
Headlines with TAGS match: BIB_TITLE={Amazon}
Press `C-u r' to search again with new search string
  org:        TODO Evensong's magisterial work on the Amazon   :science:read:BIB:
Mr. Gnu jots down the following rule in his growing org file collection:
* Tags/property search matching
- For exact matches, use quotation marks.
- For partial matches, use curly brackets.
Regular expressions allow for more flexible searches. Let's say that for some strange reason Mr. Gnu would like to find all books containing either "Amazon" or "Amazing" in their titles. The following regular expression search should do the trick:
BIB_TITLE={Amaz\(on\|ing\)}
Let's break this expression down:
Amaz
- This is the string shared by both words.
\(...\)
- These parentheses create a grouping to set off the alternative matches that follow "Amaz".
on\|ing
\|
is the "or" expression. Since it is placed within the parentheses, it means that a match must begin with "Amaz" but can end either in "on" or "ing".
You may be wondering why the search query contains so many
backslashes. It is because Emacs' regular expression engine gives the
characters (
, )
, and |
a special meaning only when they are
"escaped" (i.e., preceded by a backslash). Thus, if Mr. Gnu had simply
typed BIB_TITLE={Amaz(on|ing)}
, he would have instructed Org-mode to
match entries with the exact sequence Amaz(on|ing)
(an unlikely
match, unless he has a large collection of literary theory from the
1990s).
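The difference between the escaped and unescaped forms can be checked directly with the built-in function string-match-p. Note one extra wrinkle in this sketch: inside Lisp source code each backslash has to be doubled, because the Lisp string reader consumes one level of escaping. At the agenda prompt a single backslash is enough.

```elisp
;; Escaped group and alternation: matches "Amazon" or "Amazing".
(string-match-p "Amaz\\(on\\|ing\\)" "Mysteries of the Amazon") ; non-nil

;; Unescaped: searches for the literal text "Amaz(on|ing)".
(string-match-p "Amaz(on|ing)" "Mysteries of the Amazon")       ; nil
```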
Here's a simpler example. If Mr. Gnu would like to find all entries with either "Walter" or "Evensong" in the author field, he could type:
BIB_AUTHOR={Walter\|Evensong}
If he would like to pull up all entries that have a defined value for
the BIB_TITLE
property, he can simply use a single dot to match any
character:
BIB_TITLE={.}
Special Properties
In addition to any explicitly declared key/value property pairs, each
Org-mode entry also has a number of special (i.e., implicit)
properties that can be queried with org-tags-view
(C-c a m
). These
include, among other things, the entry's TODO state, tags (local and
inherited), category, priority, and timestamps (DEADLINE, SCHEDULED,
active, and inactive). See the sample entry above for an illustration
of where these properties are typically found in an outline node.
To see all of the properties (both explicit and implicit) defined for an Org-mode entry, place the following text in an org-mode entry and evaluate it by typing C-x C-e after the closing parenthesis:
(org-entry-properties nil)
Here's an example of how such "special properties" can be put to good use in a search:
C-c a m Match: Effort>1+PRIORITY="A"+SCHEDULED<"<tomorrow>"+ALLTAGS={computer\|email}
This query finds all items with:
- An estimated effort greater than one hour
- A priority of "A"
- A scheduled date "less than" tomorrow (i.e., today or earlier).
- Either the tag "computer" or the tag "email"
- Note: the ALLTAGS property includes inherited tags, while the TAGS property includes only local tags.
- This search is also a good example of how to achieve a grouping logic without parentheses while querying tags.
Please consult the manual for a fuller explanation of the syntax of such searches.
Querying timestamps
A few words should be said here about querying timestamps contained in
the following properties: DEADLINE
, SCHEDULED
, TIMESTAMP
(the
first active timestamp in an entry), and TIMESTAMP_IA
(the first
inactive timestamp in an entry).
The basic syntax for querying timestamps is a time string enclosed in double quotes and angle brackets. E.g., the search…
C-c a m Match: +SCHEDULED="<2010-08-20 Fri>"
…will find all items scheduled for Friday, August 20, 2010 without a time of day specification. This last caveat is important to note: if you have a timestamp with time of day information, such as…
* Some task SCHEDULED: <2010-08-20 Fri 10:30>
…the search above will not retrieve it. (This is not normally a problem, since the daily/weekly agenda view provides a far superior mechanism for viewing all timestamps that fall on a particular day.)
The true value of timestamp property queries lies in the use of inequalities to capture a range of dates. To assist with this task, Org-mode provides a number of convenient shortcuts:
<today> and <tomorrow>
- timestamps for today and tomorrow (without a time of day specification)
<now>
- right now, including time of day (e.g., 2010-11-20 Sat 12:42)
<-5d>, <-10w>, <+3m>, <+1y>
- relative date indicators; these examples mean five days ago, ten weeks ago, three months from now, and one year from now
To see all items SCHEDULED far in the future, say, more than a year from now, you could type:
C-c a m Match: SCHEDULED>"<+1y>"
Here's another scenario. Imagine you use org-capture to take all your notes and that you automatically stamp each note with an inactive timestamp. To find all notes you took in the past two weeks with the tag "chimpanzees", you could perform the following search:
C-c a m Match: chimpanzees+TIMESTAMP_IA>="<-2w>"
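Because relative dates such as <-2w> are resolved when the search runs, a query like this stays useful over time and can be kept as a Lisp form. A minimal sketch, assuming the "chimpanzees" tag from the example; the second argument of org-tags-view is the match string that would otherwise be typed at the prompt.

```elisp
(require 'org-agenda)

;; A nil first argument searches all headlines, as C-c a m does.
;; The double quotes around the timestamp are escaped for the Lisp reader.
(org-tags-view nil "chimpanzees+TIMESTAMP_IA>=\"<-2w>\"")
```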
Limit tags and properties searches by TODO state
You can limit any of these tags/property searches to active todo
states simply by using C-c a M
instead of C-c a m
.
You can also, of course, limit the searches to a particular todo keyword (say, NEXT) by adding…
+TODO="NEXT"
…to any of the searches above. But Org-mode also provides a
convenient (and more efficient) syntax for limiting searches to
particular TODO keywords. Simply add a /
followed by a TODO search
in the form we've already discussed. For instance, to limit the
chimpanzee search above to items marked DONE, you could type:
C-c a m Match: chimpanzees+TIMESTAMP_IA>="<-2w>"/DONE
As with normal todo searches, you can use or (|
) to expand the
allowed matches. For instance, the query…
chimpanzees+TIMESTAMP_IA>="<-2w>"/TODO|NEXT
…will match against items marked either TODO or NEXT.
If you are matching only against active todos (i.e., things not marked done), you can make your search more efficient by adding an exclamation point. E.g., the following search…
computer/!TODO|NEXT
…will result in all items tagged "computer" and either a TODO or
NEXT keyword. The exclamation mark will speed up the search, because
org-mode will only query items that have an active todo keyword (as
defined either in the variable org-todo-keywords
or in #+TODO
declarations at the top of an org file). For instance, if you had
placed the following line at the top of your org files…
#+TODO: TODO NEXT STARTED WAITING | DONE CANCELED
…an exclamation point would limit the possible matches to items marked TODO, NEXT, STARTED, or WAITING.
You can use a negative (-
) to exclude TODO states. The search…
computer/!-WAITING
…will result only in items marked TODO, NEXT, or STARTED.
Be careful to avoid using "and" logic when you query TODOs, since each item, by definition, can have only one TODO state. Take a look at the following two searches:
computer/!WAITING+TODO
chimpanzees+TODO="TODO"+SCHEDULED<="<+1w>"+TODO="WAITING"
These searches will never return any positive results, since an org entry cannot have both a TODO and a WAITING keyword.
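The two ways of restricting a search to active todos can be compared side by side from Lisp. This is a sketch rather than a recommended setup; evaluate one form at a time, since each call redraws the agenda buffer.

```elisp
(require 'org-agenda)

;; A non-nil first argument restricts the search to active todos,
;; which is what C-c a M does.
(org-tags-view t "computer")

;; The restriction can also live in the match string itself:
;; items tagged "computer" whose keyword is TODO or NEXT.
(org-tags-view nil "computer/!TODO|NEXT")
```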
Searching the full text of entries
Keyword searches
Thus far, we have explored different ways to query the various types of metadata attached to an org entry. But what if you would like to search the entire text of your org entries?
The answer: call org-search-view
with C-c a s
. In the agenda
dispatcher, this appears as…
s Search for keywords
Don't be fooled by the word "keywords," which some programs use as a synonym for tags. Here, a keyword search scours the full text of org entries.
Let's start with an example:
Desperately in need of typing practice (as if Emacs does not provide enough keyboarding practice), our friend Mr. Gnu would like to locate the following entry, which is buried somewhere in his agenda files:
* A sentence to test my keyboarding skills
The quick brown fox jumped over the lazy dog.
Mr. Gnu vaguely remembers that the entry contains the word "fox", so he pecks at the keyboard to enter…
C-c a s
He is confronted with the prompt…
[+-]Word/{Regexp} ...:
…so he enters…
fox
…and receives an agenda buffer with the correct results:
Search words: fox
Press `[', `]' to add/sub word, `{', `}' to add/sub regexp, `C-u r' to edit
  typing:     A sentence to test my keyboarding skills
Here, we should note that Org-mode's keyword searches are case-insensitive, so "fox" will match any of the following: "fox", "Fox", "FOX", etc.
Let's say, however, that Mr. Gnu's day job involves studying the behavior of foxes, so he knows ahead of time that a simple search will bring up hundreds of results. In addition, he recalls that the desired entry also contains the word "dog". Thus, he enters the following:
C-c a s [+-]Word/{Regexp} ...: fox dog
Somewhat puzzlingly, Mr. Gnu's search yields no results. What went wrong?
Mr. Gnu consults the manual and finds that the default behavior of
org-search-view
is to treat the entered query as a single string, so
when he typed fox dog
, Org-mode looked quite literally for
fox[whitespace]dog
.
Mr. Gnu further finds that to treat "dog" and "fox" as boolean
keywords that can be located anywhere in the entry, he needs to
precede each term with a +
. (Technically, he only needs to precede
the first search term with +
to initiate a boolean search, but he
decides to put +
in front of both for the sake of clarity.) So he
types…
C-c a s [+-]Word/{Regexp} ...: +fox +dog
…and is overjoyed to retrieve the expected results.
Mr. Gnu makes a mental note: unless the first character of the search
query is a +
, Org-mode will treat the entire query as a single
string. Thus, the query…
fox +dog
…will prompt Org-mode to search for the single string "fox +dog". (To change this behavior, please read the section for "Google addicts" below.)
Later, while at work, Mr. Gnu wants to find all entries on foxes that do not contain the word dog, so he types…
C-c a s [+-]Word/{Regexp} ...: +fox -dog
If Mr. Gnu wants to incorporate a substring/phrase into a boolean
search (i.e., a query with a +
at the beginning), he can use
quotation marks:
+fox +"lazy dog"
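Boolean keyword queries can be passed to org-search-view as a string as well. The sketch below simply replays two of Mr. Gnu's searches; evaluate one form at a time with C-x C-e.

```elisp
(require 'org-agenda)

;; Entries that mention "fox" but not "dog".
(org-search-view nil "+fox -dog")

;; A phrase inside a boolean search: the inner quotes are escaped
;; because the whole query is a Lisp string.
(org-search-view nil "+fox +\"lazy dog\"")
```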
At home again, while practicing typing, Mr. Gnu wants to find all entries that contain either the word "keyboarding" or the word "typing". Remembering his lessons on tag searches, he tries the following search query:
+keyboarding|+typing
Alas, the search returns no results, because Mr. Gnu just instructed
Org-mode to look for the entire string "keyboarding|+typing." Reading
the manual, Mr. Gnu discovers that, unlike todo and tag searches,
keyword searches require separate terms to be separated by whitespace
(e.g., +fox +dog
). In addition, Mr. Gnu realizes that keyword
searches have only two simple boolean expressions: +
("and") and -
("and not"). There is no "or" symbol, such as |
. What then should
Mr. Gnu do to find entries containing keyboarding or typing?
Full text search using regular expressions
The solution to Mr. Gnu's puzzle is found in regular expressions. Indeed, Mr. Gnu deduced as much by glancing at the org-search-view prompt:
[+-]Word/{Regexp} ...:
As the prompt suggests, Mr. Gnu can search org-entries using Emacs' powerful regular expression engine. To do so, he simply needs to enclose the regular expression in curly brackets. So he types…
C-c a s [+-]Word/{Regexp} ...: +{keyboarding\|typing}
…to find all entries that contain either "keyboarding" or "typing".
(Mr. Gnu could also have used parentheses to create a more compact
search query, such as +{\(keyboard\|typ\)ing}
. Also, it is good to
recall here that (
, |
, and )
become special characters only
when escaped with a \
.)
Regular expressions, Mr. Gnu finds, can be combined with words. The query…
+{keyboarding\|typing} +fox
…finds the "quick brown fox" entry above, while…
+{keyboarding\|typing} -fox
…excludes it, finding only those entries that contain either the word "keyboarding" or "typing" and not the word fox.
Again, Org-mode's default behavior is to treat the entire query as a
single string unless it sees a +
or a {
at the beginning of the
line. So if Mr. Gnu types…
dog +{keyboarding\|typing}
…Org-mode will search for the entire substring "dog +{keyboarding\|typing}". (If you don't like this behavior, please read the section for "Google addicts" below.)
Regular expression syntax
The possibilities afforded by regular expressions are myriad. The examples discussed here are relatively basic. For a thorough introduction to regular expression syntax, please consult the Emacs Lisp manual.
Let's look at a couple of examples:
Imagine you've entered a lot of contact entries with phone numbers in the conventional U.S. format: 123-456-6789. To find all Org-mode entries with such numbers, you could type:
C-c a s [+-]Word/{Regexp} ...: +{[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}}
The square brackets here are special characters; they match any of the
characters they enclose. For instance, [abc]
matches either a or b
or c. In this particular case, the [0-9]
matches any digit between 0
and 9. In addition, the escaped curly brackets (\{...\}
) that
immediately follow the square brackets indicate how many times in a row
the character should occur. In this case, Org-mode will search for
the following sequence:
- exactly three digits
- a hyphen
- exactly three digits
- a hyphen
- exactly four digits
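A pattern like this can be tried out before it is used in a search. The following sketch tests it with the built-in string-match-p against two made-up sample strings; as in any Lisp string, each backslash of the pattern is doubled.

```elisp
(let ((phone "[0-9]\\{3\\}-[0-9]\\{3\\}-[0-9]\\{4\\}"))
  (list
   ;; Three digits, hyphen, three digits, hyphen, four digits: a match.
   (string-match-p phone "Home: 123-456-6789")  ; non-nil
   ;; Too few digits in each group: no match.
   (string-match-p phone "Code: 12-3456")))     ; nil
```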
Instead of specifying the precise number of times a match such as
[0-9]
must repeat, you can also use the following special
characters:
*
- match any number of times (including none)
+
- match at least once and possibly more
?
- match either once or not at all
Now, imagine our friend Mr. Gnu is a new fan of Org-mode and has jotted down a lot of notes on his favorite PIM. However, he has entered the name Org-mode inconsistently, sometimes as "orgmode", other times as "Org mode", and still other times as "Org-mode". He'd like to find all his references to Org-mode, taking into account the various spellings. Here's a simple query that will accomplish this:
+{org\(-\|\s-\)?mode}
Mr. Gnu just instructed Org-mode to search for any entry that contains the character sequence "org", followed by a hyphen, a space, or no character, followed by "mode". Since the search is case-insensitive, it will match "org-mode", "org mode", or "orgmode".
Limiting full text searches
There are several convenient ways to refine and limit full text searches.
First, if you find that a search produces too many results, you can easily add a new word or regexp by typing any of the following in the agenda buffer:
[
- add a word (i.e., +)
]
- exclude a word (i.e., -)
{
- add a regexp (i.e., +{})
}
- exclude a regexp (i.e., -{})
Let's say Mr. Gnu searches for the words Carsten and Dominik:
C-c a s [+-]Word/{Regexp} ...: +Carsten +Dominik
Since Mr. Gnu is an avid reader of the Org-mode mailing list and a
heavy user of org-capture, he discovers that he has hundreds of
entries that include Carsten's name. He wants to limit the search only
to entries with an inactive timestamp from November of 2010. So he
types [
in the agenda buffer to add a new search term and receives the
following prompt…
[+-]Word/{Regexp} ...: +Carsten +Dominik +
…with the cursor conveniently located after the plus sign. He completes the query to find inactive timestamps from November…
[+-]Word/{Regexp} ...: +Carsten +Dominik +[2010-11-
…and voilà, he retrieves a smaller subset of results.
If Mr. Gnu wants to find both active and inactive timestamps, he could
instead type {
to add a regular expression:
[+-]Word/{Regexp} ...: +Carsten +Dominik +{[\[<]2010-11-}
Similarly, if Mr. Gnu wants to guarantee the precision of his match, he could use a detailed regular expression…
+{\[2010-11-[0-9]\{2\}\s-[A-Za-z]\{3\}\(\s-[0-9]\{2\}:[0-9]\{2\}\)?\]}
But Mr. Gnu quickly decides that searching for the string "[2010-11-" is good enough for his purposes.
Org-mode also provides convenient syntax for limiting full text searches.
If you place an asterisk at the beginning of your search, Org-mode will search only headlines (and not entry text). E.g., to find all entries with "emacs" in the headline, you could type:
C-c a s [+-]Word/{Regexp} ...: *+emacs
If you place an exclamation mark at the beginning of the query, Org-mode will only pull up entries that are active todos:
!+emacs
(You can also limit your search to active todos by using a prefix argument: C-u C-c a s.)
Finally, if you place a colon at the beginning of a query, the boolean words you provide will only match entire words. Thus the following search…
:+emacs
…will match "emacs" but not "emacswiki".
You can mix and match these three limiting symbols, but they will only
work if they appear in the correct order: i.e., *
-> !
-> :
. If
you type :!+emacs
, your search will not retrieve any results.
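Limited searches of this kind are good candidates for custom agenda commands of type search. A minimal sketch follows; the key "x" and the description are arbitrary assumptions, and the prefix characters appear in the required order.

```elisp
(with-eval-after-load 'org-agenda
  ;; "x" is a hypothetical key; choose any letter that is free.
  ;; "*" restricts the search to headlines, "!" to active todos.
  (add-to-list 'org-agenda-custom-commands
               '("x" "Active todos with emacs in the headline"
                 search "*!+emacs")))
```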
Combining metadata and full text queries
As an expert on tag and property searches, you might ask: is it
possible to combine metadata and full text searches? For instance, how
could Mr. Gnu find all entries with "Walter Evensong" in the
BIB_AUTHOR
field, the todo keyword "DONE", and the word "Brazil" in
the full text of the entry?
- It is not possible to simply combine the syntax of metadata and full text searches. Org-mode parses each query in fundamentally different ways.
- You can, however, easily accomplish "mixed" queries by using regular expressions and org-search-view. In some instances, org-search-view offers an easier and more efficient way of querying metadata than the tags and property search.
The simplest way to think about Org-mode metadata is as different
types of markup patterns. Tags are enclosed in colons, todo keywords
directly follow the asterisks that mark outline headings, timestamps
are contained in brackets and have the pattern YYYY-MM-DD DOW HH:MM
,
and so on. Thus, to query for particular types of metadata, one simply
has to construct regular expressions that match these patterns.
Back to Mr. Gnu, our mediocre typist and reader of very long books. Today, he would like to find all entries in which:
- the BIB_AUTHOR is "Walter Evensong"
- the todo keyword is "DONE"
- the word "Brazil" appears in the full text.
First, he invokes org-search-view
:
C-c a s
At the prompt, he adds a plus sign and the word "brazil":
[+-]Word/{Regexp} ...: +brazil
He remembers that he must add the plus sign to instruct Org-mode to treat this search as a boolean search. Otherwise it will simply look for the entire string entered at the prompt.
Next he needs to search for the todo keyword "DONE". Since todo keywords immediately follow the markup for outline headings, he can simply add a regexp that matches an outline heading immediately followed by the word DONE:
[+-]Word/{Regexp} ...: +brazil +{^\*+\s-+DONE\s-}
This regexp begins with ^
, which forces a match at the beginning of
the line. It is followed by an asterisk, which needs to be escaped,
since an asterisk is a special character in regular expressions. The +
after the asterisk instructs Org-mode to look for one or more
asterisks, while the \s-+
indicates that at least one space follows
the asterisk(s). So Mr. Gnu is searching for at least one asterisk at
the beginning of the line followed by a space—the very definition of
an outline heading in Org-mode. And the keyword DONE followed by
whitespace completes the match. If Mr. Gnu would like to match more
than one todo keyword, say DONE or WAITING, he could use grouping:
+{^\*+\s-+\(DONE\|WAITING\)\s-}
Finally, Mr. Gnu finishes his query by searching for the property
BIB_AUTHOR
. He recalls that a property line looks like this:
:BIB_AUTHOR: Walter Evensong
With this in mind, he can easily construct a regexp to search for the
string :BIB_AUTHOR:
followed by an arbitrary amount of whitespace
followed in turn by the string "Walter Evensong".
[+-]Word/{Regexp} ...: +brazil +{^\*+\s-+DONE\s-} +{:BIB_AUTHOR:\s-+Walter Evensong}
Mr. Gnu is surprised at the speed with which Org-mode returns his results. Indeed, he finds that regexp searches (especially those querying properties) usually return their results more quickly than property and tag searches. And he deduces the reason: whereas property searches have to query each headline to determine whether a given property contains a value, keyword searches simply scan each file for matches and then return the appropriate headlines.
In many instances, of course, the DONE regexp above may be overkill. Searching for the string "* DONE" will often do the trick. E.g.,
[+-]Word/{Regexp} ...: +brazil +"* DONE"
Indeed, Mr. Gnu could probably also dispense with the :BIB_AUTHOR:
regexp above, simply typing…
[+-]Word/{Regexp} ...: +brazil +"* DONE" +":BIB_AUTHOR: Walter Evensong"
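For reference, here is how the regexp version of this mixed query might look when called from Lisp. It is only a sketch of the query built up above; the one change is that every backslash is doubled, as Lisp strings require.

```elisp
(require 'org-agenda)

;; Full text word, DONE headline regexp, and property line regexp,
;; combined in a single boolean keyword search.
(org-search-view
 nil
 "+brazil +{^\\*+\\s-+DONE\\s-} +{:BIB_AUTHOR:\\s-+Walter Evensong}")
```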
Here's another example. Let's say Mr. Gnu would like to find all active todos directly tagged "urgent" (i.e., not inherited) with the word "wedding" somewhere in the entry text. The following keyword search does the trick:
C-c a s [+-]Word/{Regexp} ...: !+wedding :urgent:
If Mr. Gnu wants to see either the tag "urgent" or the tag "important", he could use a regular expression:
!+wedding +{:\(urgent\|important\):}
The main limitation of such searches is that keyword searches know
nothing of outline tree inheritance. Thus, if Mr. Gnu is interested in
all entries that inherit the tag "urgent", he should always use
org-tags-view
.
Searching additional files
Often, the set of files one would like to search by keyword is larger than one's set of active agenda files. For instance, one might archive old projects in separate files so that they no longer contribute to the agenda. Yet one would still like to search the reference material in these projects by keyword/regexp.
The solution lies in the variable
org-agenda-text-search-extra-files
. Adding a list of files to this
variable instructs org-search-view
to query those files in addition
to the agenda files. Note that setting
org-agenda-text-search-extra-files
has no effect on other types of
agenda commands, such as todo and tags/property searches.
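A minimal sketch of such a setting, assuming two archive files at hypothetical paths:

```elisp
;; Hypothetical archive files: searched by C-c a s only,
;; and ignored by todo and tags/property searches.
(setq org-agenda-text-search-extra-files
      '("~/org/archive/2009-projects.org"
        "~/org/archive/old-notes.org"))
```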
Keyword searches for Google addicts
As noted before, org-search-view
will treat a search query as a
boolean expression only if it begins with either a +
or a {
(i.e.,
a regular expression). Without these characters, Org-mode will treat
the query as a single substring.
This default syntax of org-search-view
is thus different than the
behavior of search engines such as Google, which treat searches as
lazy boolean queries by default. If you type "emacs org-mode" into
Google, it will not search for the literal string "emacs org-mode",
but rather assume the space implies a boolean expression: "emacs and
org-mode".
If you find yourself often forgetting to add an initial +
to your
org-search-view
queries, you can make "lazy booleans" the default
behavior by adding the following to your .emacs:
(setq org-agenda-search-view-always-boolean t)
Then you can happily type your lazy searches:
C-c a s [+-]Word/{Regexp} ...: org mode Carsten :email:
If you would like to include a substring or phrase in your search, you can do so by enclosing it in quotation marks. And if you want to exclude items or use regular expressions, you will, of course, still have to use a minus sign and curly brackets, respectively.
Searching org files line-by-line
All the searches we have discussed thus far return their results as a list of org headlines in the agenda buffer. Sometimes, however, you might prefer to see each line in which a word or regular expression occurs. There are different ways to do this:
Multi-occur
Org-mode uses Emacs' multi-occur command to search for any lines in the agenda files containing a regular expression. Simply type C-c a / followed by a word or regular expression and you will be presented with a buffer containing all lines that match the query, with each line conveniently linked to its original location.
External commands and scripts
Emacs provides convenient interfaces to common Unix search commands, such as grep. Simply type M-x grep and complete the query (the working directory is usually that of the current buffer in Emacs). Using grep is especially convenient when you want to quickly search org files that are not in org-agenda-files or org-agenda-text-search-extra-files. And, of course, grep can be used outside of Emacs.
Since org-mode files are plain text, you can use your favorite scripting language (perl, awk, python, etc.) to develop new and ever more creative ways to search and analyze them.
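If you run the same kind of grep often, a small wrapper command can save some typing. The following is only a sketch: the names my/org-notes-directory and my/org-rgrep are hypothetical, the directory is an assumption, and the command relies on the standard Emacs rgrep function, which in turn needs the external find and grep programs to be installed.

```elisp
(require 'grep)

;; Hypothetical variable: the directory that holds your org files.
(defvar my/org-notes-directory "~/org/"
  "Directory searched by `my/org-rgrep'.  An assumption for this sketch.")

(defun my/org-rgrep (regexp)
  "Recursively grep for REGEXP in all .org files under `my/org-notes-directory'."
  (interactive "sSearch org files for regexp: ")
  ;; Initialize grep's command templates before calling rgrep from Lisp.
  (grep-compute-defaults)
  (rgrep regexp "*.org" (expand-file-name my/org-notes-directory)))
```

After evaluating this, M-x my/org-rgrep prompts for a regular expression and lists the matching lines in a grep buffer, each linked to its location.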
Sparse trees
The commands we have examined so far typically search multiple files and display the resulting headings in a separate agenda buffer. But sometimes you might want to search for various types of data within a single file, so as to see all the matching headlines and entries in context.
The way to accomplish this is via a sparse tree view (C-c /), which collapses the outline in the current file, showing only the portions that match a query.
Calling org-sparse-tree with C-c / brings up a prompt with several search options:
Sparse tree: [r]egexp [/]regexp [t]odo [T]odo-kwd [m]atch [p]roperty [d]eadlines [b]efore-date [a]fter-date
Some of these searches, such as "todo" (t) and "deadlines" (d), are quite simple, showing all headlines in a buffer that contain an active todo keyword or a deadline, respectively. Others, such as "property" (p), prompt for a single key/value pair.
One search that may be of particular interest is "match" (m). This query uses exactly the same syntax as org-tags-view, allowing us to use complex metadata searches to create sparse trees.
For instance, to highlight all active todos without a timestamp in the current buffer, you could type:
C-c / m Match: -SCHEDULED={.}/!
This instructs Org-mode to look for any active todo (/!) without a SCHEDULED timestamp.
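The same sparse tree can also be produced from Lisp, which is handy if you want to bind it to a key. The sketch below assumes the function org-match-sparse-tree, which is what the "match" option calls; the command name my/org-unscheduled-todos-tree is a hypothetical choice.

```elisp
(require 'org)

(defun my/org-unscheduled-todos-tree ()
  "Show a sparse tree of active todos that lack a SCHEDULED timestamp."
  (interactive)
  ;; First argument nil: the todo restriction is already expressed
  ;; by the trailing /! in the match string itself.
  (org-match-sparse-tree nil "-SCHEDULED={.}/!"))
```

Call it with M-x my/org-unscheduled-todos-tree from within an org buffer.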
Custom agenda commands
If there are searches you perform again and again, you can easily save them by adding them to your custom agenda commands.
As we know, Mr. Gnu is an avid collector of very large books (which, of course, he manages in very long org files). Moreover, he often likes to peruse his inventory of books over 1,000 pages, querying his custom BIB_PAGES field. To save time and energy, Mr. Gnu could add a custom command such as the following to his .emacs:
(add-to-list 'org-agenda-custom-commands '("b" "Big books" tags "+BIB_PAGES>1000"))
Note that "tags" here indicates org-tags-view. Thus, the query uses the tags/property search syntax.
Mr. Gnu realizes he can save an even faster version of the search above:
(add-to-list 'org-agenda-custom-commands '("B" "Big books (fast)" search "{:BIB_PAGES:\\s-+[0-9]\\{4\\}}"))
The symbol "search", as you might have guessed, instructs Org-mode to use org-search-view. And the saved search finds all items whose BIB_PAGES property contains at least four digits (i.e., 1,000 pages or more).
You might notice that the search query here, compared with the one above, contains extra backslashes. That is because the backslash is a special character in emacs-lisp strings and thus needs to be escaped when placed in an .emacs file.
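To see the escaping at work, you can test the regular expression on its own, for instance with M-: or in the scratch buffer. This is only an illustration: the curly brackets that mark a regexp in an org-search-view query are left out, and the sample property lines are made up.

```elisp
;; In Lisp source, each backslash of the regexp is written twice.
(let ((re ":BIB_PAGES:\\s-+[0-9]\\{4\\}"))
  (list (string-match-p re ":BIB_PAGES: 1200")   ; four digits: matches at position 0
        (string-match-p re ":BIB_PAGES: 350")))  ; three digits: no match
;; => (0 nil)
```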
If Mr. Gnu frequently needs to perform the "urgent wedding tasks" search above, he could add a command such as the following:
(add-to-list 'org-agenda-custom-commands '("w" "Getting married next week!" search "!+wedding +{:\\(urgent\\|important\\):}"))
Finally, one can use custom commands to run searches with different local settings. For instance, one can set up a custom agenda command to run a tags/property search on files other than the agenda files:
(add-to-list 'org-agenda-custom-commands '("r" "Reference material" tags "" ((org-agenda-files (file-expand-wildcards "~/ref/*.org")))))
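Custom commands need not display their results in an agenda buffer. As a sketch, assuming the tags-tree command type (which builds a sparse tree in the current org buffer from a tags/property match), a query for active todos without a SCHEDULED timestamp could be saved as follows. The key "u" and the description are arbitrary choices.

```elisp
;; Sparse tree of unscheduled active todos in the current org buffer.
;; tags-tree uses the same match syntax as org-tags-view.
(add-to-list 'org-agenda-custom-commands
             '("u" "Unscheduled todos (sparse tree)" tags-tree "-SCHEDULED={.}/!"))
```

Because the result is a sparse tree, this command has to be called from within an org buffer rather than from an arbitrary one.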
For a full introduction to custom agenda commands, please see this tutorial.
Footnotes:
Note that the lowercase variant of the command (C-c a t) does not provide a search prompt, but simply pulls up all active TODOs.