Advanced searching

Introduction

Org-mode has many powerful built-in search functions. These tools transform hierarchical org files into robust plain text "databases" that can be queried in sophisticated ways. Outline headings in Org-mode not only function as document sections or todo items; each heading can also store an unlimited amount of text and various types of metadata. And, of course, since Org-mode files are plain text, any number of tools (grep, awk, perl, etc.) can be used to filter and manipulate the data they contain.

The goal of this tutorial is to offer an introduction to the built-in commands and syntax for querying Org-mode outlines. While these are explained in various places in the Org-mode manual, this tutorial attempts to provide an overview in one place. It is particularly aimed at those who would like to use Org-mode as a note-taking and reference management tool. Nonetheless, it should prove useful to anyone who needs to locate specific information buried in an ever-growing collection of Org-mode data.

Outline nodes as "data containers"

Before discussing specific search commands, it is worth taking a few moments to consider the basic structure of an Org-mode entry.

Though outline entries can be nested within one another hierarchically, each node is also a discrete container of data. Indeed, a large variety of metadata—a todo keyword, tags, timestamps, logging information, and properties (i.e., arbitrary data pairs)—can be attached to each heading. Similarly each outline entry can store an unlimited amount of text.

Here is an example:

* TODO Buy clothes for wedding                    :wedding:important:errands:
  SCHEDULED: <2010-12-01 Wed>
  :PROPERTIES:
  :estimated-cost: 100
  :END:
  [2010-11-17 Wed 12:22]

  I need to look spiffy for the big day!

   - [ ] Suit
   - [ ] Tie
   - [ ] Shoes
   - [ ] Hat

  Possible stores to visit:

  | Store           | Location               | Miles away |
  |-----------------+------------------------+------------|
  | The Suit King   | 1000 E. Washington St. |        5.1 |
  | Mr. Haberdasher | 259 Western Rd.        |        7.2 |

The sample entry above has the following metadata:

todo keyword: TODO
scheduled timestamp: 2010-12-01 Wed
inactive timestamp: 2010-11-17 Wed 12:22
property: estimated-cost ⇒ 100
tags: "wedding", "important", and "errands"

The entry also contains some text, including a checklist and a table.

Normally, an Org-mode file/outline contains several entries such as the one above, nested hierarchically. Moreover, Org users typically make their most important files available for easy searching by adding them to their list of agenda files, either selecting them one-by-one with C-c [ or by setting the variable org-agenda-files.
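
As a minimal sketch of the second approach, the following lines for an Emacs init file set org-agenda-files directly. The file and directory names are hypothetical placeholders, not paths from this tutorial; substitute your own.

```emacs-lisp
;; Hypothetical paths: replace them with your own Org files.
;; A directory entry adds the Org files found in that directory.
(setq org-agenda-files '("~/org/notes.org"
                         "~/org/projects.org"
                         "~/org/reference/"))
```

Setting the variable this way replaces any list built up earlier with C-c [, so it is probably best to pick one of the two methods and stick with it.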

The agenda as a search engine

For querying a collection of org files, Org-mode includes a powerful built-in search engine, the agenda (C-c a). As its name suggests, the most common use of the agenda is to pull together, from all the agenda files, a daily or weekly schedule or a list of todos. But the agenda also offers various tools for querying both the metadata and the text of org-mode entries. In fact, Org-mode's default agenda view (C-c a a or org-agenda-list) is simply a search tool that gathers and displays all org-mode entries with a certain type of metadata: timestamps that fall within a given range of dates.

Typing C-c a or M-x org-agenda brings up the agenda dispatcher, an overview of Org-mode's various search tools:

Press key for an agenda command:        <   Buffer, subtree/region restriction
--------------------------------        >   Remove restriction
a   Agenda for current week or day      e   Export agenda views
t   List of all TODO entries            T   Entries with special TODO kwd
m   Match a TAGS/PROP/TODO query        M   Like m, but only TODO entries
L   Timeline for current buffer         #   List stuck projects (!=configure)
s   Search for keywords                 C   Configure custom agenda commands
/   Multi-occur                         ?   Find :FLAGGED: entries

A quick perusal of the commands here reveals that one can query for a wide variety of data. This tutorial will focus on three searches in particular:

C-c a T: for todo keywords
C-c a m: for tags and properties
C-c a s: for full text searches

Now, let's look at the syntax for each of these search tools.

Searching metadata (todos, tags, and properties)

Todo keyword searches

The simplest type of metadata query in org-mode is org-todo-list (invoked with C-c a T). This function prompts the user for a search string and then retrieves a list of outline headings containing the TODOs specified in the search string.¹

Since each outline heading can contain only one TODO keyword, the search syntax is quite simple, consisting either of a single keyword or two or more keywords bound together by the boolean operator | ("or").

For instance, the following query…

TODO

…retrieves all entries marked with the keyword TODO, whereas…

TODO|PROJECT|MAYBE

…displays a list of all headlines containing either TODO or PROJECT or MAYBE.
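
A query that is run often can be stored as a custom agenda command. The sketch below assumes the key p is free in your agenda dispatcher (the key and the description are arbitrary choices, not part of the tutorial) and registers a command in org-agenda-custom-commands that runs the keyword query shown above.

```emacs-lisp
;; The key "p" and the description string are arbitrary choices.
;; `todo' is the custom-command type that corresponds to org-todo-list.
(with-eval-after-load 'org-agenda
  (add-to-list 'org-agenda-custom-commands
               '("p" "TODO, PROJECT, or MAYBE entries" todo "TODO|PROJECT|MAYBE")))
```

Once this has been evaluated, C-c a p should bring up the list without prompting for a search string.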

Tag searches

Though org-todo-list serves its purpose well, it is limited to only one type of metadata. If you would like to search for other types of metadata, or mix and match a search for todo keywords with, say, a search for tags, org-mode offers a more powerful tool, org-tags-view, which is called with the following keys:

C-c a m: searches all headlines
C-c a M: searches only headlines with active todos

At its simplest, org-tags-view does exactly what it says: it queries for headlines marked with particular combinations of tags. The syntax for such searches follows a simple boolean logic:

|: or
&: and
+: include matches
-: exclude matches

Here are a few examples:

+computer&+urgent

…will result in all items tagged both computer and urgent, while the search…

+computer|+urgent

…will result in all items tagged either computer or urgent. Meanwhile, the query…

+computer&-urgent

…will display all items tagged computer and not urgent.

As you may have noticed, the syntax above can be a little verbose, so org-mode offers convenient ways of shortening it. First, - and + imply "and" if no boolean operator is stated, so example three above could be rewritten simply as:

+computer-urgent

Second, inclusion of matches is implied if no + or - is present, so example three could be further shortened to:

computer-urgent

Example number two, meanwhile, could be shortened to:

computer|urgent
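
The same match strings can be passed to org-tags-view from Emacs Lisp. The following is a rough sketch, with hypothetical command names, of two small commands that run the shortened form of example three without prompting. The first argument of org-tags-view selects between the behavior of C-c a m (nil) and C-c a M (non-nil).

```emacs-lisp
;; Hypothetical command names; choose whatever suits your setup.
(defun my/computer-not-urgent ()
  "Show all headlines tagged computer but not urgent (like C-c a m)."
  (interactive)
  (org-tags-view nil "computer-urgent"))

(defun my/computer-not-urgent-todos ()
  "Same match, restricted to active todos (like C-c a M)."
  (interactive)
  (org-tags-view t "computer-urgent"))
```

Either command can then be called with M-x or bound to a key of your choice.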

Grouping tags

There is no way (as yet) to express search grouping with parentheses. The "and" operators (&, +, and -) always bind terms together more strongly than "or" (|). For instance, the following search…

computer|work+email

…results in all headlines tagged either with "computer" or both "work" and "email". An expression such as (computer|work)&email is not supported at the moment.

There are, however, several other ways to achieve the grouping effect of parentheses:

Use a regular expression

To invoke the "grouping" logic of parentheses, you can construct a regular expression:
```
+{computer\|work}+email
```
Note: you can also use the special property ALLTAGS (which queries the same data as a normal tags search) together with a regular expression:
```
ALLTAGS={computer\|work}+email
```
(In the next section we'll learn more about how property searches and regular expressions are constructed.)

Use a slightly more verbose query as a substitute for the logic of parentheses. E.g.,
```
computer&email|work&email
```
This search will match all headlines tagged either with "computer" and "email" or with "work" and "email."

If you are combining a tags search with a TODO search, you can use the following:
```
computer|email/!NOW
```
Because the TODO match after the slash applies to the entire tag expression, this search behaves like (computer|email) combined with the todo keyword NOW.

Use agenda filtering.

Simply search for all headlines tagged with "computer" or "work" and then use the agenda's tag filtering capabilities (/) to see only those headlines among the results that have the tag "email."

Property searches

Org-mode allows outline entries to contain any number of arbitrary data pairs, which are conveniently hidden within a folding PROPERTIES drawer, e.g.:

* TODO Evensong's magisterial work on the Amazon           :science:read:BIB:
  SCHEDULED: <2010-11-20 Sat>
  :PROPERTIES:
  :BIB_AUTHOR: Walter Evensong
  :BIB_TITLE: Mysteries of the Amazon
  :BIB_PAGES: 1234
  :BIB_PUBLISHER: Humbug University Press
  :END:
  [2010-11-16 Tue 23:11]

  Lots of good stuff on Brazil.

Let's imagine a free software aficionado named Mr. Gnu has added a number of similar bibliographical outline nodes to his org files and that he would like to find all entries that contain "Walter Evensong" in their BIB_AUTHOR field. He can construct such a search by calling org-tags-view and entering the desired key/value match:

C-c a m 
Match: BIB_AUTHOR="Walter Evensong"

Property searches can be mixed and matched with tag searches. If Mr. Gnu would like to see all books by "Walter Evensong" with the tag "read", he can simply join the two desired matches together with the + sign:

BIB_AUTHOR="Walter Evensong"+read

Properties with numeric values can be queried with inequalities. If Mr. Gnu would like to retrieve all books by the prolific Walter Evensong that span over 1000 pages, he could enter the following:

BIB_AUTHOR="Walter Evensong"+BIB_PAGES>1000

The comparison operators for searches are as follows:

= (equal), > (greater than), >= (greater than or equal to), 
< (less than), <= (less than or equal to), <> (not equal)

What if Mr. Gnu would like to find all books by Walter Evensong or any books over 1000 pages?

BIB_AUTHOR="Walter Evensong"|BIB_PAGES>1000

For his own clarity, Mr. Gnu can always insert "+" signs, though they are not required:

+BIB_AUTHOR="Walter Evensong"|+BIB_PAGES>1000

It is important to note that the equal sign in the searches above implies an exact match. If Mr. Gnu is searching for a string, such as "Mysteries of the Amazon", the query must match the entire property value. Thus, the search…

BIB_TITLE="Amazon"

…will not match the entry above.

How then can you search for partial matches? The answer is regular expressions. Instead of surrounding your query with quotation marks (which will necessitate a precise and complete match), you can enclose it in curly brackets, which instructs Org-mode to treat the query as a regular expression. Thus, the search…

BIB_TITLE={Amazon}

…will locate all entries whose BIB_TITLE contains the sequence "Amazon" and pull them up in the agenda:

Headlines with TAGS match: BIB_TITLE={Amazon}
Press `C-u r' to search again with new search string
 org:        TODO Evensong's magisterial work on the Amazon  :science:read:BIB:

Mr. Gnu jots down the following rule in his growing org file collection:

* Tags/property search matching
 - For exact matches, use quotation marks.
 - For partial matches, use curly brackets.

Regular expressions allow for more flexible searches. Let's say that for some strange reason Mr. Gnu would like to find all books containing either "Amazon" or "Amazing" in their titles. The following regular expression search should do the trick:

BIB_TITLE={Amaz\(on\|ing\)}

Let's break this expression down:

Amaz: This is the string shared by both words.
\(...\): These parentheses create a grouping to set off the alternative matches that follow "Amaz".
on\|ing: \| is the "or" expression. Since it is placed within the parentheses, it means that a match must begin with "Amaz" but can end either in "on" or "ing".

You may be wondering why the search query contains so many backslashes. It is because Emacs' regular expression engine gives the characters (, ), and | a special meaning only when they are "escaped" (i.e., preceded by a backslash). Thus, if Mr. Gnu had simply typed BIB_TITLE={Amaz(on|ing)}, he would have instructed Org-mode to match entries with the exact sequence Amaz(on|ing) (an unlikely match, unless he has a large collection of literary theory from the 1990s).

Here's a simpler example. If Mr. Gnu would like to find all entries with either "Walter" or "Evensong" in the author field, he could type:

BIB_AUTHOR={Walter\|Evensong}

If he would like to pull up all entries that have a defined value for the BIB_TITLE property, he can simply use a single dot to match any character:

BIB_TITLE={.}
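
Property matches like the ones above can also be run from Emacs Lisp, for instance with org-map-entries, which accepts the same match syntax. The sketch below assumes bibliographic entries like the sample above exist somewhere in your agenda files. The detail to watch is quoting: inside a Lisp string, each double quote is written \" and each regexp backslash is doubled.

```emacs-lisp
(require 'org)

;; Collect the headings of all entries by Walter Evensong with more
;; than 1000 pages, looking through every agenda file.
(org-map-entries
 (lambda () (org-get-heading t t))   ; heading without tags or todo keyword
 "BIB_AUTHOR=\"Walter Evensong\"+BIB_PAGES>1000"
 'agenda)

;; The "Amazon or Amazing" regexp match, written as a Lisp string.
(org-map-entries
 (lambda () (org-get-heading t t))
 "BIB_TITLE={Amaz\\(on\\|ing\\)}"
 'agenda)
```

Each form returns a list with one heading string per matching entry, or nil if nothing matches.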

Special Properties

In addition to any explicitly declared key/value property pairs, each Org-mode entry also has a number of special (i.e., implicit) properties that can be queried with org-tags-view (C-c a m). These include, among other things, the entry's TODO state, tags (local and inherited), category, priority, and timestamps (DEADLINE, SCHEDULED, active, and inactive). See the sample entry above for an illustration of where these properties are typically found in an outline node.

To see all of the properties (both explicit and implicit) defined for an Org-mode entry, place the following text in an org-mode entry and evaluate it by typing C-x C-e after the closing parenthesis:

(org-entry-properties nil)

Here's an example of how such "special properties" can be put to good use in a search:

C-c a m
Match: Effort>1+PRIORITY="A"+SCHEDULED<"<tomorrow>"+ALLTAGS={computer\|email}

This query finds all items with:

An estimated effort greater than one hour
A priority of "A"
A scheduled date "less than" tomorrow (i.e., today or earlier).
Either the tag "computer" or the tag "email"
- Note: the ALLTAGS property includes inherited tags, while the TAGS property includes only local tags.
- This search is also a good example of how to achieve a grouping logic without parentheses while querying tags.

Please consult the manual for a fuller explanation of the syntax of such searches.

Querying timestamps

A few words should be said here about querying timestamps contained in the following properties: DEADLINE, SCHEDULED, TIMESTAMP (the first active timestamp in an entry), and TIMESTAMP_IA (the first inactive timestamp in an entry).

The basic syntax for querying timestamps is a time string enclosed in double quotes and angular brackets. E.g., the search…

C-c a m
Match: +SCHEDULED="<2010-08-20 Fri>"

…will find all items scheduled for Friday, August 20, 2010 without a time of day specification. This last caveat is important to note: if you have a timestamp with time of day information, such as…

* Some task
  SCHEDULED: <2010-08-20 Fri 10:30>

…the search above will not retrieve it. (This is not normally a problem, since the daily/weekly agenda view provides a far superior mechanism for viewing all timestamps that fall on a particular day.)

The true value of timestamp property queries lies in the use of inequalities to capture a range of dates. To assist with this task, Org-mode provides a number of convenient shortcuts:

<today> and <tomorrow>: timestamps for today and tomorrow (without a time of day specification)
<now>: right now, including time of day (e.g., 2010-11-20 Sat 12:42)
<-5d>, <-10w>, <+3m>, <+1y>: relative date indicators (here: five days ago, ten weeks ago, three months from now, and one year from now)

To see all items SCHEDULED far in the future, say, more than a year from now, you could type:

C-c a m
Match: SCHEDULED>"<+1y>"

Here's another scenario. Imagine you use org-capture to take all your notes and that you automatically stamp each note with an inactive timestamp. To find all notes you took in the past two weeks with the tag "chimpanzees", you could perform the following search:

C-c a m
Match: chimpanzees+TIMESTAMP_IA>="<-2w>"

Limit tags and properties searches by TODO state

You can limit any of these tags/property searches to active todo states simply by using C-c a M instead of C-c a m.

You can also, of course, limit the searches to a particular todo keyword (say, NEXT) by adding…

+TODO="NEXT"

…to any of the searches above. But Org-mode also provides a convenient (and more efficient) syntax for limiting searches to particular TODO keywords. Simply add a / followed by a TODO search in the form we've already discussed. For instance, to limit the chimpanzee search above to items marked DONE, you could type:

C-c a m
Match: chimpanzees+TIMESTAMP_IA>="<-2w>"/DONE

As with normal todo searches, you can use or (|) to expand the allowed matches. For instance, the query…

chimpanzees+TIMESTAMP_IA>="<-2w>"/TODO|NEXT

…will match against items marked either TODO or NEXT.

If you are matching only against active todos (i.e., things not marked done), you can make your search more efficient by adding an exclamation point. E.g., the following search…

computer/!TODO|NEXT

…will result in all items tagged "computer" and marked with either the TODO or the NEXT keyword. The exclamation mark will speed up the search, because org-mode will only query items that have an active todo keyword (as defined either in the variable org-todo-keywords or in #+TODO declarations at the top of an org file). For instance, if you had placed the following line at the top of your org files…

#+TODO: TODO NEXT STARTED WAITING | DONE CANCELED

…an exclamation point would limit the possible matches to items marked TODO, NEXT, STARTED, or WAITING.

You can use a minus sign (-) to exclude TODO states. The search…

computer/!-WAITING

…will result only in items marked TODO, NEXT, or STARTED.

Be careful to avoid using "and" logic when you query TODOs, since each item, by definition, can have only one TODO state. Take a look at the following two searches:

computer/!WAITING+TODO

chimpanzees+TODO="TODO"+SCHEDULED<="<+1w>"+TODO="WAITING"

These searches will never return any positive results, since an org entry cannot have both a TODO and a WAITING keyword.

Searching the full text of entries

Keyword searches

Thus far, we have explored different ways to query the various types of metadata attached to an org entry. But what if you would like to search the entire text of your org entries?

The answer: call org-search-view with C-c a s. In the agenda dispatcher, this appears as…

s  Search for keywords

Don't be fooled by the word "keywords," which some programs use as a synonym for tags. Here, a keyword search scours the full text of org entries.

Let's start with an example:

Desperately in need of typing practice (as if Emacs does not provide enough keyboarding practice), our friend Mr. Gnu would like to locate the following entry, which is buried somewhere in his agenda files:

* A sentence to test my keyboarding skills

The quick brown fox jumped over the lazy dog.

Mr. Gnu vaguely remembers that the entry contains the word "fox", so he pecks at the keyboard to enter…

C-c a s

He is confronted with the prompt…

[+-]Word/{Regexp} ...:

…so he enters…

fox

…and receives an agenda buffer with the correct results:

Search words: fox
Press `[', `]' to add/sub word, `{', `}' to add/sub regexp, `C-u r' to edit
 typing:        A sentence to test my keyboarding skills

Here, we should note that Org-mode's keyword searches are case-insensitive, so "fox" will match any of the following: "fox", "Fox", "FOX", etc.

Let's say, however, that Mr. Gnu's day job involves studying the behavior of foxes, so he knows ahead of time that a simple search will bring up hundreds of results. In addition, he recalls that the desired entry also contains the word "dog". Thus, he enters the following:

C-c a s
[+-]Word/{Regexp} ...: fox dog

Somewhat puzzlingly, Mr. Gnu's search yields no results. What went wrong?

Mr. Gnu consults the manual and finds that the default behavior of org-search-view is to treat the entered query as a single string, so when he typed fox dog, Org-mode looked quite literally for fox[whitespace]dog.

Mr. Gnu further finds that to treat "dog" and "fox" as boolean keywords that can be located anywhere in the entry, he needs to precede each term with a +. (Technically, he only needs to precede the first search term with + to initiate a boolean search, but he decides to put + in front of both for the sake of clarity.) So he types…

C-c a s
[+-]Word/{Regexp} ...: +fox +dog

…and is overjoyed to retrieve the expected results.

Mr. Gnu makes a mental note: unless the first character of the search query is a +, Org-mode will treat the entire query as a single string. Thus, the query…

fox +dog

…will prompt Org-mode to search for the single string "fox +dog". (To change this behavior, please read the section for "Google addicts" below.)

Later, while at work, Mr. Gnu wants to find all entries on foxes that do not contain the word dog, so he types…

C-c a s
[+-]Word/{Regexp} ...: +fox -dog

If Mr. Gnu wants to incorporate a substring/phrase into a boolean search (i.e., a query with a + at the beginning), he can use quotation marks:

+fox +"lazy dog"

At home again, while practicing typing, Mr. Gnu wants to find all entries that contain either the word "keyboarding" or the word "typing". Remembering his lessons on tag searches, he tries the following search query:

+keyboarding|+typing

Alas, the search returns no results, because Mr. Gnu just instructed Org-mode to look for the entire string "keyboarding|+typing." Reading the manual, Mr. Gnu discovers that, unlike todo and tag searches, keyword searches require separate terms to be separated by whitespace (e.g., +fox +dog). In addition, Mr. Gnu realizes that keyword searches have only two simple boolean expressions: + ("and") and - ("and not"). There is no "or" symbol, such as |. What then should Mr. Gnu do to find entries containing keyboarding or typing?

Full text search using regular expressions

The solution to Mr. Gnu's puzzle is found in regular expressions. Indeed, Mr. Gnu deduced as much by glancing at the org-search-view prompt:

[+-]Word/{Regexp} ...:

As the prompt suggests, Mr. Gnu can search org-entries using Emacs' powerful regular expression engine. To do so, he simply needs to enclose the regular expression in curly brackets. So he types…

C-c a s
[+-]Word/{Regexp} ...: +{keyboarding\|typing}

…to find all entries that contain either "keyboarding" or "typing". (Mr. Gnu could also have used parentheses to create a more compact search query, such as +{\(keyboard\|typ\)ing}. Also, it is good to recall here that (, |, and ) become special characters only when escaped with a \.)

Regular expressions, Mr. Gnu finds, can be combined with words. The query…

+{keyboarding\|typing} +fox

…finds the "quick brown fox" entry above, while…

+{keyboarding\|typing} -fox

…excludes it, finding only those entries that contain either the word "keyboarding" or "typing" and not the word "fox".

Again, Org-mode's default behavior is to treat the entire query as a single string unless it sees a + or a { at the beginning of the line. So if Mr. Gnu types…

dog +{keyboarding\|typing}

…Org-mode will search for the entire substring "dog +{keyboarding\|typing}". (If you don't like this behavior, please read the section for "Google addicts" below.)

Regular expression syntax

The possibilities afforded by regular expressions are myriad. The examples discussed here are relatively basic. For a thorough introduction to regular expression syntax, please consult the Emacs Lisp manual.

Let's look at a couple of examples:

Imagine you've entered a lot of contact entries with phone numbers in the conventional U.S. format: 123-456-6789. To find all Org-mode entries with such numbers, you could type:

C-c a s
[+-]Word/{Regexp} ...: +{[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}}

The square brackets here are special characters; they match any of the characters they enclose. For instance, [abc] matches either a or b or c. In this particular case, the [0-9] matches any digit between 0 and 9. In addition, the escaped curly brackets (\{...\}) that immediately follow the square brackets indicate how many times in a row the character should occur. In this case, Org-mode will search for the following sequence:

exactly three digits
a hyphen
exactly three digits
a hyphen
exactly four digits

Instead of specifying the precise number of times a match such as [0-9] must repeat, you can also use the following special characters:

*: match any number of times (including none)
+: match at least once and possibly more
?: match either once or not at all

Now, imagine our friend Mr. Gnu is a new fan of Org-mode and has jotted down a lot of notes on his favorite PIM. However, he has entered the name Org-mode inconsistently, sometimes as "orgmode", other times as "Org mode", and still other times as "Org-mode". He'd like to find all his references to Org-mode, taking into account the various spellings. Here's a simple query that will accomplish this:

+{org[- ]?mode}

Mr. Gnu just instructed Org-mode to search for any entry that contains the character sequence "org", followed by a hyphen, a space, or no character, followed by "mode". Since the search is case-insensitive, it will match "org-mode", "org mode", or "orgmode".
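
Before trusting a regular expression in a search, it can help to try it against a sample string. One way to do that, sketched here with made-up test strings, is to evaluate a string-match-p form in the *scratch* buffer. Note that the phone-number regexp from above is typed with single backslashes at the search prompt but needs doubled backslashes inside a Lisp string.

```emacs-lisp
;; Evaluate with C-x C-e after the closing parenthesis.
;; The sample strings are invented for illustration.
(let ((phone "[0-9]\\{3\\}-[0-9]\\{3\\}-[0-9]\\{4\\}"))
  (list (string-match-p phone "Call 123-456-6789")   ; an integer: match found
        (string-match-p phone "Call 123-4567")))     ; nil: no match
```

string-match-p returns the position where the match starts, or nil when the regexp does not match, so any non-nil result means the pattern would find that text.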

Limiting full text searches

There are several convenient ways to refine and limit full text searches.

First, if you find that a search produces too many results, you can easily add a new word or regexp by typing any of the following in the agenda buffer:

[: add a word (i.e., +)
]: exclude a word (i.e., -)
{: add a regexp (i.e., +{})
}: exclude a regexp (i.e., -{})

Let's say Mr. Gnu searches for the words Carsten and Dominik:

C-c a s
[+-]Word/{Regexp} ...: +Carsten +Dominik

Since Mr. Gnu is an avid reader of the Org-mode mailing list and a heavy user of org-capture, he discovers that he has hundreds of entries that include Carsten's name. He wants to limit the search only to entries with an inactive timestamp from November of 2010. So he types [ in the agenda buffer to add a new search term and receives the following prompt…

[+-]Word/{Regexp} ...: +Carsten +Dominik +

…with the cursor conveniently located after the plus sign. He completes the query to find inactive timestamps from November…

[+-]Word/{Regexp} ...: +Carsten +Dominik +[2010-11-

…and voilà, he retrieves a smaller subset of results.

If Mr. Gnu wants to find both active and inactive timestamps, he could instead type { to add a regular expression:

[+-]Word/{Regexp} ...: +Carsten +Dominik +{[\[<]2010-11-}

Similarly, if Mr. Gnu wants to guarantee the precision of his match, he could use a detailed regular expression…

+{\[2010-11-[0-9]\{2\}\s-[A-Za-z]\{3\}\(\s-[0-9]\{2\}:[0-9]\{2\}\)?\]}

But Mr. Gnu quickly decides that searching for the string "[2010-11-" is good enough for his purposes.

Org-mode also provides convenient syntax for limiting full text searches.

If you place an asterisk at the beginning of your search, Org-mode will search only headlines (and not entry text). E.g., to find all entries with "emacs" in the headline, you could type:
```
C-c a s
[+-]Word/{Regexp} ...: *+emacs
```
If you place an exclamation mark at the beginning of the query, Org-mode will only pull up entries that are active todos:
```
!+emacs
```
(You can also limit your search to active todos by using a prefix argument: C-u C-c a s.)

Finally, if you place a colon at the beginning of a query, the boolean words you provide will only match entire words. Thus the following search…
```
:+emacs
```
…will match "emacs" but not "emacswiki".

You can mix and match these three limiting symbols, but they will only work if they appear in the correct order: i.e., * -> ! -> :. If you type :!+emacs, your search will not retrieve any results.

Combining metadata and full text queries

As an expert on tag and property searches, you might ask: is it possible to combine metadata and full text searches? For instance, how could Mr. Gnu find all entries with "Walter Evensong" in the BIB_AUTHOR field, the todo keyword "DONE", and the word "Brazil" in the full text of the entry?

It is not possible to simply combine the syntax of metadata and full text searches. Org-mode parses each query in fundamentally different ways.
You can, however, easily accomplish "mixed" queries by using regular expressions and org-search-view. In some instances, org-search-view offers an easier and more efficient way of querying metadata than the tags and property search.

The simplest way to think about Org-mode metadata is as different types of markup patterns. Tags are enclosed in colons, todo keywords directly follow the asterisks that mark outline headings, timestamps are contained in brackets and have the pattern YYYY-MM-DD DOW HH:MM, and so on. Thus, to query for particular types of metadata, one simply has to construct regular expressions that match these patterns.

Back to Mr. Gnu, our mediocre typist and reader of very long books. Today, he would like to find all entries in which:

the BIB_AUTHOR is "Walter Evensong"
the todo keyword is "DONE"
the word "Brazil" appears in the full text.

First, he invokes org-search-view:

C-c a s

At the prompt, he adds a plus sign and the word "brazil":

[+-]Word/{Regexp} ...: +brazil

He remembers that he must add the plus sign to instruct Org-mode to treat this search as a boolean search. Otherwise it will simply look for the entire string entered at the prompt.

Next he needs to search for the todo keyword "DONE". Since todo keywords immediately follow the markup for outline headings, he can simply add a regexp that matches an outline heading immediately followed by the word DONE:

[+-]Word/{Regexp} ...: +brazil +{^\*+\s-+DONE\s-}

This regexp begins with ^, which forces a match at the beginning of the line. It is followed by an asterisk, which needs to be escaped, since an asterisk is a special character in regular expressions. The + after the asterisk instructs Org-mode to look for one or more asterisks, while the \s-+ indicates that at least one space follows the asterisk(s). So Mr. Gnu is searching for at least one asterisk at the beginning of the line followed by a space, which is the very definition of an outline heading in Org-mode. And the keyword DONE followed by whitespace completes the match. If Mr. Gnu would like to match more than one todo keyword, say DONE or WAITING, he could use grouping: +{^\*+\s-+\(DONE\|WAITING\)\s-}

Finally, Mr. Gnu finishes his query by searching for the property BIB_AUTHOR. He recalls that a property line looks like this:

:BIB_AUTHOR: Walter Evensong

With this in mind, he can easily construct a regexp to search for the string :BIB_AUTHOR: followed by an arbitrary amount of whitespace followed in turn by the string "Walter Evensong".

[+-]Word/{Regexp} ...: +brazil +{^\*+\s-+DONE\s-} +{:BIB_AUTHOR:\s-+Walter Evensong}

Mr. Gnu is surprised at the speed with which Org-mode returns his results. Indeed, he finds that regexp searches (especially those querying properties) usually return their results more quickly than property and tag searches. And he deduces the reason: whereas property searches have to query each headline to determine whether a given property contains a value, keyword searches simply scan each file for matches and then return the appropriate headlines.

In many instances, of course, the DONE regexp above may be overkill. Searching for the string "* DONE" will often do the trick. E.g.,

[+-]Word/{Regexp} ...: +brazil +"* DONE"

Indeed, Mr. Gnu could probably also dispense with the :BIB_AUTHOR: regexp above, simply typing…

[+-]Word/{Regexp} ...: +brazil +"* DONE" +":BIB_AUTHOR: Walter Evensong"
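
A combined query like this is tedious to retype, so it may be worth saving. The following sketch stores the regexp version as a custom agenda command of type search, which runs org-search-view. The key z and the description are arbitrary assumptions, and every backslash of the prompt version is doubled because the query now lives inside a Lisp string.

```emacs-lisp
;; The key "z" and the description string are arbitrary choices.
(with-eval-after-load 'org-agenda
  (add-to-list 'org-agenda-custom-commands
               '("z" "DONE Evensong entries mentioning Brazil" search
                 "+brazil +{^\\*+\\s-+DONE\\s-} +{:BIB_AUTHOR:\\s-+Walter Evensong}")))
```

After evaluation, C-c a z should run the stored search directly.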

Here's another example. Let's say Mr. Gnu would like to find all active todos directly tagged "urgent" (i.e., not inherited) with the word "wedding" somewhere in the entry text. The following keyword search does the trick:

C-c a s
[+-]Word/{Regexp} ...: !+wedding :urgent:

If Mr. Gnu wants to see either the tag "urgent" or the tag "important", he could use a regular expression:

!+wedding +{:\(urgent\|important\):}

The main limitation of such searches is that keyword searches know nothing of outline tree inheritance. Thus, if Mr. Gnu is interested in all entries that inherit the tag "urgent", he should always use org-tags-view.

Searching additional files

Often, the set of files one would like to search by keyword is larger than one's set of active agenda files. For instance, one might archive old projects in separate files so that they no longer contribute to the agenda. Yet one would still like to search the reference material in these projects by keyword/regexp.

The solution lies in the variable org-agenda-text-search-extra-files. Adding a list of files to this variable instructs org-search-view to query those files in addition to the agenda files. Note that setting org-agenda-text-search-extra-files has no effect on other types of agenda commands, such as todo and tags/property searches.
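
A minimal sketch of such a setting, assuming old projects are archived under a hypothetical ~/org/archive/ directory (adjust the path to your own layout). The special symbol agenda-archives, which must be the first element of the list, additionally pulls in the archive files of all agenda files:

```elisp
;; Hypothetical location of archived project files; not from the tutorial.
(setq org-agenda-text-search-extra-files
      (append '(agenda-archives)   ; must come first in the list
              (file-expand-wildcards "~/org/archive/*.org")))
```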

Keyword searches for Google addicts

As noted before, org-search-view will treat a search query as a boolean expression only if it begins with either a + or a { (i.e., a regular expression). Without these characters, Org-mode will treat the query as a single substring.

This default syntax of org-search-view thus differs from the behavior of search engines such as Google, which treat searches as lazy boolean queries by default. If you type "emacs org-mode" into Google, it will not search for the literal string "emacs org-mode", but rather assume the space implies a boolean expression: "emacs and org-mode".

If you find yourself often forgetting to add an initial + to your org-search-view queries, you can make "lazy booleans" the default behavior by adding the following to your .emacs:

(setq org-agenda-search-view-always-boolean t)

Then you can happily type your lazy searches:

C-c a s
[+-]Word/{Regexp} ...: org mode Carsten :email:

If you would like to include a substring or phrase in your search, you can do so by enclosing it in quotation marks. And if you want to exclude items or use regular expressions, you will, of course, still have to use a minus sign and curly brackets, respectively.

Searching org files line-by-line

All the searches we have discussed thus far return their results as a list of org headlines in the agenda buffer. Sometimes, however, you might prefer to see each line in which a word or regular expression occurs. There are different ways to do this:

Multi-occur

Org-mode uses Emacs' multi-occur command to search for any lines in the agenda files containing a regular expression. Simply type C-c a / followed by a word or regular expression and you will be presented with a buffer listing all lines that match the query, with each line conveniently linked to its original location.
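
The command behind C-c a / is org-occur-in-agenda-files, so a frequently used line-by-line search can be wrapped in a small command of its own. The following is only a sketch: the name my/org-occur-author is hypothetical, and it assumes the BIB_AUTHOR property used in the examples above.

```elisp
(require 'org)

;; Hypothetical helper, not part of Org-mode itself.
(defun my/org-occur-author (name)
  "List every line in the agenda files whose BIB_AUTHOR property mentions NAME."
  (interactive "sAuthor: ")
  (org-occur-in-agenda-files
   ;; regexp-quote keeps characters such as "." in NAME literal.
   (concat ":BIB_AUTHOR:.*" (regexp-quote name))))
```

Calling M-x my/org-occur-author and entering "Evensong" should produce the same kind of occur buffer as the interactive command.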

External commands and scripts

Emacs provides convenient interfaces to common Unix search commands, such as grep. Simply type M-x grep and complete the query (the working directory is usually that of the current buffer in Emacs). Using grep is especially convenient when you want to quickly search org files that are not in org-agenda-files or org-agenda-text-search-extra-files. And, of course, grep can be used outside of Emacs.
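
If the files to search live in a fixed place, the grep call can also be scripted. The sketch below uses the standard rgrep command from Emacs' grep library; the function name my/grep-org-notes and the ~/notes/ directory are hypothetical, and it assumes the grep and find programs are installed.

```elisp
(require 'grep)

;; Hypothetical helper; the directory is an assumption, adjust as needed.
(defun my/grep-org-notes (regexp)
  "Recursively grep for REGEXP in every .org file under ~/notes/."
  (interactive "sGrep org notes for: ")
  ;; Make sure the grep/find command templates are initialised
  ;; before rgrep is called from Lisp.
  (grep-compute-defaults)
  (rgrep regexp "*.org" (expand-file-name "~/notes/")))
```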

Since Org-mode files are plain text, you can use your favorite scripting language (perl, awk, python, etc.) to develop new and ever more creative ways to search and analyze them.

Sparse trees

The commands we have examined so far typically search multiple files and display the resulting headings in a separate agenda buffer. But sometimes, you might want to search for various types of data within a single file, so as to see all the matching headlines and entries in context.

The way to accomplish this is via a sparse tree view (C-c /), which collapses the outline in the current file, showing only the portions that match a query.

Calling org-sparse-tree with C-c / brings up a prompt with several search options:

Sparse tree: [r]egexp [/]regexp [t]odo [T]odo-kwd [m]atch [p]roperty
             [d]eadlines [b]efore-date [a]fter-date

Some of these searches, such as "todo" (t) and "deadlines" (d), are quite simple, showing all headlines in a buffer that contain an active todo keyword or a deadline, respectively. Others, such as "property" (p), prompt for a single key/value pair.

One search that may be of particular interest is "match" (m). This query uses exactly the same syntax as org-tags-view, allowing us to use complex metadata searches to create sparse trees.

For instance, to highlight all active todos without a SCHEDULED timestamp in the current buffer, you could type:

C-c / m 
Match: -SCHEDULED={.}/!

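This instructs Org-mode to look for any active todo (/!) without a SCHEDULED timestamp. The expression SCHEDULED={.} matches any entry whose SCHEDULED value contains at least one character, and the leading minus sign negates that match.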
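
A sparse tree that is needed often can be bound to a command rather than retyped. The following is a sketch only: the name my/org-unscheduled-todos is hypothetical, and it relies on org-match-sparse-tree accepting the match string as its second argument.

```elisp
(require 'org)

;; Hypothetical helper, not part of Org-mode itself.
(defun my/org-unscheduled-todos ()
  "Show a sparse tree of active todos that lack a SCHEDULED timestamp."
  (interactive)
  ;; Same match string as typed at the "Match:" prompt above.
  (org-match-sparse-tree nil "-SCHEDULED={.}/!"))
```

Run M-x my/org-unscheduled-todos in an org buffer to get the same view as C-c / m with the query typed by hand.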

Custom agenda commands

If there are searches you perform again and again, you can easily save them by adding them to your custom agenda commands.

As we know, Mr. Gnu is an avid collector of very large books (which, of course, he manages in very long org files). Moreover, he often likes to peruse his inventory of books over 1,000 pages, querying his custom BIB_PAGES field. To save time and energy, Mr. Gnu could add a custom command such as the following to his .emacs:

(add-to-list 'org-agenda-custom-commands
             '("b" "Big books" tags "+BIB_PAGES>1000"))

Note that "tags" here indicates org-tags-view. Thus, the query uses the tags/property search syntax.

Mr. Gnu realizes he can save an even faster version of the search above:

(add-to-list 'org-agenda-custom-commands
             '("B" "Big books (fast)" search "{:BIB_PAGES:\\s-+[0-9]\\{4\\}}"))

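The symbol "search", as you might have guessed, instructs Org-mode to use org-search-view. And the saved search finds all items whose BIB_PAGES property value begins with at least four digits (i.e., 1,000 pages or more). Note that, unlike the tags version above, this regexp also matches a book of exactly 1,000 pages.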

You might notice that the search query here, compared with the one above, contains extra backslashes. That is because the backslash is the escape character inside Emacs Lisp strings, so each backslash in a regexp must itself be escaped (written twice) when the regexp is placed in an .emacs file.
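
The effect of the doubled backslashes can be checked in the *scratch* buffer. This is a small illustration only; the variable name my/big-books-regexp is hypothetical and the regexp is the one from the saved search, minus the curly brackets that org-search-view uses to mark a regexp.

```elisp
;; As written in Lisp source: every regexp backslash is doubled.
(defvar my/big-books-regexp ":BIB_PAGES:\\s-+[0-9]\\{4\\}"
  "Hypothetical variable holding the BIB_PAGES regexp.")

;; The stored string has single backslashes, i.e. exactly what one
;; would type at the interactive search prompt:
(message "%s" my/big-books-regexp)
;; displays :BIB_PAGES:\s-+[0-9]\{4\}

;; Quick sanity checks against sample property lines:
(string-match-p my/big-books-regexp ":BIB_PAGES: 1200") ; non-nil (match)
(string-match-p my/big-books-regexp ":BIB_PAGES: 350")  ; nil (no match)
```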

If Mr. Gnu frequently needs to perform the "urgent wedding tasks" search above, he could add a command such as the following:

(add-to-list 'org-agenda-custom-commands
             '("w" "Getting married next week!" 
               search "!+wedding +{:\\(urgent\\|important\\):}"))

Finally, one can use custom commands to run searches with different local settings. For instance, one can set up a custom agenda command to run a tags/property search on files other than the agenda files:

(add-to-list 'org-agenda-custom-commands
             '("r" "Reference material" tags ""
               ((org-agenda-files (file-expand-wildcards "~/ref/*.org")))))

For a full introduction to custom agenda commands, please see this tutorial.

Footnotes:

Note that the lowercase variant of the command (C-c a t) does not provide a search prompt, but simply pulls up all active TODOs.