# Org Syntax

DRAFT v2β

## 1. Introduction

Org is a plaintext format composed of simple, yet versatile, forms which represent formatting and structural information. It is designed to be both intuitive to use, and capable of representing complex documents. Like Markdown (RFC7763), Org may be considered a lightweight markup language. However, while Markdown refers to a collection of similar syntaxes, Org is a single syntax.

Should markdown be mentioned at all?

This document describes and comments on Org syntax as it is currently read by its parser (org-element.el) and, therefore, by the export framework.

## 2. Terminology and conventions

### 2.1. Objects and Elements

The components of this syntax can be divided into two classes: “objects” and “elements”. To better understand these classes, consider the paragraph as a unit of measurement. Elements are syntactic components that exist at the same or greater scope than a paragraph, i.e. which could not be contained by a paragraph. Conversely, objects are syntactic components that exist with a smaller scope than a paragraph, and so can be contained within a paragraph.

Elements can be stratified into “headings”, “sections”, “greater elements”, and “lesser elements”, from broadest scope to narrowest. Along with objects, these sub-classes define categories of syntactic environments. Only headings, sections, property drawers, and planning lines are context-free [1, 2]; every other syntactic component only exists within specific environments. This is a core concept of the syntax.

Expanding on the stratification of elements, lesser elements are elements that cannot contain any other elements. As such, a paragraph is considered a lesser element. Greater elements can themselves contain greater elements or lesser elements. Sections contain both greater and lesser elements, and headings can contain a section and other headings.

### 2.2. The minimal and standard sets of objects

To simplify references to common collections of objects, we define two useful sets. The minimal set of objects refers to plain text, text markup, entities, LaTeX fragments, superscripts and subscripts. The standard set of objects refers to the entire set of objects, excluding citation references and table cells.

### 2.3. Blank lines

A line containing only spaces, tabs, newlines, and carriage returns (\t\n\r) is considered a blank line. Blank lines can be used to separate paragraphs and other elements.

With the exception of list items, blank lines belong to the preceding element with the narrowest possible scope. For example, if at the end of a section we have a paragraph and a blank line, that blank line is considered part of the paragraph.
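The blank-line test above can be sketched in a few lines of Python. This is only an illustration of the rule as stated; the function name `is_blank_line` is hypothetical and not part of Org.

```python
def is_blank_line(line: str) -> bool:
    """Return True when the line holds only spaces, tabs, or line terminators."""
    # Assumption: the line may be passed with or without its trailing terminator.
    return all(ch in " \t\n\r" for ch in line)
```

Under this reading, an empty string also counts as a blank line.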

### 2.4. Indentation

Indentation consists of a series of space and tab characters at the beginning of a line. Most elements can be indented, with the exception of headings, inlinetasks, footnote definitions, and diary sexps. Indentation is only syntactically meaningful in plain lists.

### 2.5. Syntax patterns

#### 2.5.1. General form

Most elements and objects will be described with the help of syntax patterns, consisting of a series of named tokens written in uppercase and separated by a space, like so:

TOKEN1 TOKEN2


These tokens are often named roughly according to their semantic meaning. For instance, “KEY” and “VALUE” when describing Keywords. Tokens will be specified as either a string, or a series of elements or objects.

Unless otherwise specified, a space in a pattern represents one or more horizontal whitespace characters.

Patterns will often also contain static structures that serve to differentiate a particular element or object type from others, but have no semantic meaning. These are simply included in the pattern verbatim. For instance, if a pattern consists of two plus signs (+) immediately followed by a TOKEN it would be written like so:

++TOKEN


Since tokens are written in uppercase, any letters in static structures are distinguished by being written in lowercase.

#### 2.5.2. Special tokens

In a few cases, an instance of an element or object must be preceded or succeeded by a certain pattern, which is not itself part of the element or object. These patterns are specified using the PRE and POST tokens respectively, like so:

PRE TOKEN POST


#### 2.5.3. Case significance

In this document, unless specified otherwise, case is insignificant.

## 3. Elements

### 3.1. Headings and Sections

#### 3.1.1. Headings

A Heading is an unindented line structured according to the following pattern:

STARS KEYWORD PRIORITY TITLE TAGS

STARS
A string consisting of one or more asterisks (up to org-inlinetask-min-level if the org-inlinetask library is loaded) suffixed by a space character. The number of asterisks is used to define the level of the heading.
KEYWORD (optional)
A string which is a member of org-todo-keywords-1 [3]. Case is significant. This is called a “todo keyword”. [4]
PRIORITY (optional)
A single alphanumeric character preceded by a hash sign # and enclosed within square brackets (e.g. [#A] or [#1]). This is called a “priority cookie”.
TITLE (optional)
A series of objects from the standard set, excluding line break objects. It is matched after every other part.
TAGS (optional)
A series of colon-separated strings consisting of alpha-numeric characters, underscores, at signs, hash signs, and percent signs (_@#%).

Examples

*
** DONE
*** Some e-mail
**** TODO [#A] COMMENT Title :tag:a2%:


If the first word appearing in the title is COMMENT, the heading will be considered as “commented”. Case is significant.

If the TITLE of a heading is exactly the value of org-footnote-section (Footnotes by default), it will be considered as a “footnote section”. Case is significant.

If ARCHIVE is one of the tags given, the heading will be considered as “archived”. Case is significant.

All content following a heading, up to either the next heading or the end of the document, forms a section contained by the heading. This is optional, as the next heading may occur immediately, in which case no section is formed.
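As a rough illustration of how the heading pattern decomposes, here is a minimal Python sketch. The function name `parse_heading` and the `TODO_KEYWORDS` tuple are assumptions made for the example (the real keyword list comes from the user's configuration), TAGS are limited to ASCII characters, and the TITLE is kept as a plain string rather than parsed into objects.

```python
import re

# Assumption: a stand-in for the configurable list of todo keywords.
TODO_KEYWORDS = ("TODO", "DONE")


def parse_heading(line, todo_keywords=TODO_KEYWORDS):
    """Split a heading line into STARS, KEYWORD, PRIORITY, TITLE and TAGS."""
    m = re.match(r"^(\*+) (.*)$", line)  # STARS must be followed by a space
    if m is None:
        return None
    stars, rest = m.group(1), m.group(2).strip()
    result = {"level": len(stars), "keyword": None, "priority": None,
              "title": "", "tags": []}
    # KEYWORD: case-sensitive and matched as a whole word.
    first, _, remainder = rest.partition(" ")
    if first in todo_keywords:
        result["keyword"] = first
        rest = remainder.lstrip()
    # PRIORITY: a single alphanumeric character inside [#...].
    pm = re.match(r"^\[#([A-Za-z0-9])\](?:\s+|$)", rest)
    if pm:
        result["priority"] = pm.group(1)
        rest = rest[pm.end():]
    # TAGS: colon-separated strings at the end of the line.
    tm = re.search(r"(?:^|\s+)(:(?:[A-Za-z0-9_@#%]+:)+)$", rest)
    if tm:
        result["tags"] = tm.group(1).strip(":").split(":")
        rest = rest[:tm.start()]
    # TITLE is whatever remains, matched after every other part.
    result["title"] = rest.strip()
    return result
```

Note that the COMMENT marker is left inside the title here, since the text above describes it as the first word of the title.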

#### 3.1.2. Sections

Sections contain one or more non-heading elements. With the exception of the text before the first heading in a document (which is considered a section), sections only occur within headings.

Since sections are usually thought of as a larger group that includes nested content (e.g. “section 3”), and this isn’t what Org sections are, maybe this should be called something slightly different?

Example

Consider the following document:

An introduction.
* A Heading
Some text.
** Sub-Topic 1
** Sub-Topic 2
*** Additional entry


Its internal structure could be summarized as:

(document
 (section)
 (heading
  (section)
  (heading)
  (heading
   (heading))))


#### 3.1.3. The zeroth section

All elements before the first heading in a document lie in a special section called the zeroth section. It may be preceded by blank lines. Unlike a normal section, the zeroth section can immediately contain a property drawer, optionally preceded by comments. It cannot, however, contain planning.

### 3.2. Greater Elements

Unless otherwise specified, greater elements can directly contain any greater or lesser element except:

#### 3.2.1. Greater Blocks

Greater blocks are structured according to the following pattern:

#+begin_NAME PARAMETERS
CONTENTS
#+end_NAME

NAME
A string consisting of any non-whitespace characters, which is not the NAME of a lesser block. Greater blocks are treated differently based on their subtype, which is determined by the NAME as follows:
• center, a “center block”
• quote, a “quote block”
• any other value, a “special block”
PARAMETERS (optional)
A string consisting of any characters other than a newline.
CONTENTS

A collection of zero or more elements, subject to two conditions:

• No line may start with #+end_NAME.
• Lines beginning with an asterisk must be quoted by a comma (,*).

Furthermore, lines starting with #+ may be quoted by a comma (,#+).
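The comma-quoting rules can be sketched as a pair of helpers. This covers only the two cases stated above (a leading asterisk and a leading #+); the names `quote_line` and `unquote_line` are hypothetical, and the sketch makes no attempt to handle lines that already begin with a comma.

```python
def quote_line(line: str) -> str:
    """Prefix a comma to lines that would otherwise read as a heading or #+ line."""
    if line.startswith("*") or line.startswith("#+"):
        return "," + line
    return line


def unquote_line(line: str) -> str:
    """Remove the quoting comma from ,* and ,#+ lines; leave other lines alone."""
    if line.startswith(",*") or line.startswith(",#+"):
        return line[1:]
    return line
```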

#### 3.2.2. Drawers and Property Drawers

Drawers are structured according to the following pattern:

:NAME:
CONTENTS
:end:

NAME
A string consisting of word-constituent characters, hyphens and underscores (-_).
CONTENTS
A collection of zero or more elements, except another drawer.

#### 3.2.3. Dynamic Blocks

Dynamic blocks are structured according to the following pattern:

#+begin: NAME PARAMETERS
CONTENTS
#+end:

NAME
A string consisting of non-whitespace characters.
PARAMETERS (optional)
A string consisting of any characters but a newline.
CONTENTS
A collection of zero or more elements, except another dynamic block.

#### 3.2.4. Footnote Definitions

Footnote definitions must occur at the start of an unindented line, and are structured according to the following pattern:

[fn:LABEL] CONTENTS

LABEL
Either a number or an instance of the pattern fn:WORD, where WORD represents a string consisting of word-constituent characters, hyphens and underscores (-_).
CONTENTS (optional)
A collection of zero or more elements. It ends at the next footnote definition, the next heading, two consecutive blank lines, or the end of buffer.

Examples

[fn:1] A short footnote.

[fn:2] This is a longer footnote.

It even contains a single blank line.


#### 3.2.5. Inlinetasks

Inlinetasks are syntactically a heading with a level of at least org-inlinetask-min-level [5], i.e. starting with at least that many asterisks.

Optionally, inlinetasks can be ended with a second heading with a level of at least org-inlinetask-min-level [5], with no optional components (i.e. only STARS and TITLE provided) and the string END as the TITLE. This allows the inlinetask to contain elements.

Urgh, this syntax is ugly. — Tom G, Timothy

Examples

*************** TODO some tiny task
This is a paragraph, it lies outside the inlinetask above.
*************** TODO some small task
DEADLINE: <2009-03-30 Mon>
:PROPERTIES:
:SOMETHING: or other
:END:
And here is some extra text
*************** END


Inlinetasks are only recognized after the org-inlinetask library is loaded.

#### 3.2.6. Items

Items are structured according to the following pattern:

BULLET COUNTER-SET CHECK-BOX TAG CONTENTS

BULLET
One of the two forms below, followed by either a whitespace character or line ending.
• An asterisk, hyphen, or plus sign character (i.e., *, -, or +).
• Either the pattern COUNTER. or COUNTER).
COUNTER
Either a number or a single letter (a-z).
COUNTER-SET (optional)
An instance of the pattern [@COUNTER].
CHECK-BOX (optional)
A single whitespace character, an X character, or a hyphen enclosed by square brackets (i.e. [ ], [X], or [-]).
TAG (optional)
An instance of the pattern TAG-TEXT :: where TAG-TEXT represents a string consisting of non-newline characters that does not contain the substring " :: " (two colons surrounded by whitespace, without the quotes).
CONTENTS (optional)
A collection of zero or more elements, ending at the first instance of one of the following:
• The next item.
• The first line less or equally indented than the starting line, not counting lines within other elements or inlinetask boundaries.
• Two consecutive blank lines.

Since it is stated that CONTENTS will end at the next item,

Examples

- item
3. [@3] set to three
+ [-] tag :: item contents
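The first line of an item can be split into its tokens with a single regular expression. The sketch below is illustrative only: `parse_item` is a hypothetical name, letters and digits are limited to ASCII, and multi-line CONTENTS are not handled. It also treats an unindented asterisk followed by a space as a heading rather than an item, following the heading definition in section 3.

```python
import re

ITEM_RE = re.compile(
    r"^(?P<indent>[ \t]*)"
    r"(?P<bullet>[-+*]|(?:[0-9]+|[A-Za-z])[.)])"             # BULLET
    r"(?:[ \t]+|$)"
    r"(?:\[@(?P<counter_set>[0-9]+|[A-Za-z])\][ \t]*)?"      # COUNTER-SET
    r"(?:\[(?P<checkbox>[ X-])\](?:[ \t]+|$))?"              # CHECK-BOX
    r"(?:(?P<tag>.*?)[ \t]+::(?:[ \t]+|$))?"                 # TAG
    r"(?P<contents>.*)$"                                     # first line of CONTENTS
)


def parse_item(line):
    """Return the tokens of an item's first line as a dict, or None."""
    m = ITEM_RE.match(line)
    if m is None:
        return None
    # An unindented "* " starts a heading, which takes precedence over an item.
    if m.group("bullet") == "*" and m.group("indent") == "":
        return None
    return m.groupdict()
```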


#### 3.2.7. Plain Lists

A plain list is a set of consecutive items of the same indentation.

At a glance it may appear as though nested lists are not possible. They are, as items may themselves contain lists.

If the first item in a plain list has a COUNTER in its BULLET, the plain list will be an “ordered plain-list”. If it contains a TAG, it will be a “descriptive list”. Otherwise, it will be an “unordered list”. List types are mutually exclusive at the same level of indentation; if items of different types occur consecutively, they parse as separate lists.

For example, consider the following excerpt of an Org document:

1. item 1
2. [X] item 2
   - some tag :: item 2.1


Its internal structure is as follows:

(ordered-plain-list
 (item)
 (item
  (descriptive-plain-list
   (item))))


#### 3.2.8. Property Drawers

Property drawers are a special type of drawer containing properties attached to a heading or inlinetask. They are located right after a heading and its planning information, as shown below:

HEADLINE
PROPERTYDRAWER

HEADLINE
PLANNING
PROPERTYDRAWER


Property Drawers are structured according to the following pattern:

:properties:
CONTENTS
:end:

CONTENTS
A collection of zero or more node properties, not separated by blank lines.

The failure mode for malformed contents needs to be determined more clearly here. We don’t want property drawers to suddenly become plain drawers just because a user has a malformed line; that could be disastrous if certain settings in the property drawer mask settings from further up the tree. In short, malformed contents should not poison the whole property drawer. — Tom G

Example

:PROPERTIES:
:CUSTOM_ID: someid
:END:


#### 3.2.9. Tables

Tables are started by a line beginning with either:

• A vertical bar (|), forming an “org” type table.
• The string +- followed by a sequence of plus (+) and minus (-) signs, forming a “table.el” type table.

Maybe drop table.el from the spec?

Tables cannot be immediately preceded by such lines, as the current line would then be part of the earlier table.

Org tables contain table rows, and end at the first line not starting with a vertical bar. An Org table can be followed by a number of #+TBLFM: FORMULAS lines, where FORMULAS represents a string consisting of any characters but a newline.

Table.el tables end at the first line not starting with either a vertical bar or a plus sign.

Example

| Name  | Phone | Age |
|-------+-------+-----|
| Peter |  1234 |  24 |
| Anna  |  4321 |  25 |


### 3.3. Lesser Elements

Lesser elements cannot contain any other element.

Only keywords which are a member of org-element-parsed-keywords [6], verse blocks, paragraphs or table rows can contain objects.

#### 3.3.1. Blocks

Like greater blocks, blocks are structured according to the following pattern:

#+begin_NAME DATA
CONTENTS
#+end_NAME

NAME
A string consisting of any non-whitespace characters. The type of the block is determined based on the value as follows:
• comment, a “comment block”,
• example, an “example block”,
• export, an “export block”,
• src, a “source block”,
• verse, a “verse block”.
The NAME must be one of these values. Otherwise, the pattern forms a greater block.
DATA (optional)
A string consisting of any characters but a newline.
• In the case of an export block, this is mandatory and must be a single word.
• In the case of a source block, this is mandatory and must follow the pattern LANGUAGE SWITCHES ARGUMENTS with:
LANGUAGE
A string consisting of any non-whitespace characters
SWITCHES
Any number of SWITCH patterns, separated by a single space character
SWITCH
Either the pattern -l "FORMAT" where FORMAT represents a string consisting of any characters but a double quote (") or newline, or the pattern -S or +S where S represents a single alphabetic character
ARGUMENTS
A string consisting of any character but a newline.
CONTENTS (optional)

A string consisting of any characters (including newlines) subject to the same two conditions of greater block’s CONTENTS, i.e.

• No line may start with #+end_NAME.
• Lines beginning with an asterisk must be quoted by a comma (,*).

As with greater blocks, lines starting with #+ may be quoted by a comma (,#+). CONTENTS will contain Org objects when the block is a verse block; otherwise, it is not parsed.

Can we drop switch support? This seems like a fairly good idea.
“For the love of all that is sane” — Tom G

Example

#+begin_verse
There was an old man of the Cape
Who made himself garments of crepe.
When asked, “Do they tear?”
He replied, “Here and there,
But they’re perfectly splendid for shape!”
#+end_verse


#### 3.3.2. Clock

A clock element is structured according to the following pattern:

clock: INACTIVE-TIMESTAMP
clock: INACTIVE-TIMESTAMP-RANGE DURATION

INACTIVE-TIMESTAMP
An inactive timestamp object.
INACTIVE-TIMESTAMP-RANGE
An inactive range timestamp object.
DURATION
An instance of the pattern => HH:MM.
HH
A number consisting of any number of digits.
MM
A two digit number.

Example

clock: [2024-10-12]
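The two clock patterns can be recognised with one regular expression. In this sketch the timestamps are matched loosely as bracketed text instead of being parsed as timestamp objects, and `parse_clock` is a hypothetical name chosen for the example.

```python
import re

CLOCK_RE = re.compile(
    r"^[ \t]*clock:[ \t]+"
    r"(?P<start>\[[^\]\n]+\])"                 # INACTIVE-TIMESTAMP (loosely matched)
    r"(?:--(?P<end>\[[^\]\n]+\])"              # second half of a range
    r"[ \t]+=>[ \t]+(?P<hh>[0-9]+):(?P<mm>[0-9]{2}))?"  # DURATION
    r"[ \t]*$",
    re.IGNORECASE,
)


def parse_clock(line):
    """Return start, end and duration in minutes for a clock line, or None."""
    m = CLOCK_RE.match(line)
    if m is None:
        return None
    minutes = None
    if m.group("hh") is not None:
        minutes = int(m.group("hh")) * 60 + int(m.group("mm"))
    return {"start": m.group("start"), "end": m.group("end"), "minutes": minutes}
```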


#### 3.3.3. Diary Sexp

A diary sexp [7] element is an unindented line structured according to the following pattern:

%%SEXP

SEXP
A string starting with an open parenthesis (, with balanced opening and closing parentheses.

Example

%%(org-calendar-holiday)
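A simple way to check the "balanced parentheses" condition is to count nesting depth. The sketch below is an assumption-laden illustration: `is_diary_sexp` is a hypothetical name, and parentheses inside strings or character literals are not treated specially.

```python
def is_diary_sexp(line: str) -> bool:
    """Check for an unindented %%( line whose parentheses balance."""
    if not line.startswith("%%("):
        return False
    depth = 0
    for ch in line[2:].rstrip("\n"):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing parenthesis with no matching opener
                return False
    return depth == 0
```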


#### 3.3.4. Planning

A planning element is structured according to the following pattern:

HEADING
PLANNING

PLANNING
A line consisting of one or more KEYWORD: TIMESTAMP patterns (termed “info” patterns).
KEYWORD
Either the string DEADLINE, SCHEDULED, or CLOSED.
TIMESTAMP
A timestamp object.

When a keyword is repeated in a planning element, the last instance of it has priority.

Tom G has requested adding an OPENED keyword to track task creation/registration.

Example

*** TODO watch "The Matrix"
SCHEDULED: <1999-03-31 Wed>
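The "last instance has priority" rule falls out naturally when info patterns are collected into a dictionary. In this sketch `parse_planning` is a hypothetical name, timestamps are matched loosely as a single bracketed span (ranges written with -- are not handled), and only uppercase keywords are recognised as a simplification.

```python
import re

INFO_RE = re.compile(
    r"\b(DEADLINE|SCHEDULED|CLOSED):[ \t]+(<[^>\n]+>|\[[^\]\n]+\])"
)


def parse_planning(line):
    """Map each planning keyword on the line to its timestamp string."""
    info = {}
    for keyword, timestamp in INFO_RE.findall(line):
        info[keyword] = timestamp  # a repeated keyword overwrites the earlier one
    return info
```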


#### 3.3.5. Comments

A “comment line” starts with a hash character (#) and either a whitespace character or the immediate end of the line.

Comments consist of one or more consecutive comment lines.

Example

# Just a comment
#
# Over multiple lines


#### 3.3.6. Fixed Width Areas

A “fixed-width line” starts with a colon character (:) and either a whitespace character or the immediate end of the line.

Fixed-width areas consist of one or more consecutive fixed-width lines.

Example

: This is a
: fixed width area


#### 3.3.7. Horizontal Rules

A horizontal rule is formed by a line consisting of at least five consecutive hyphens (-----).

#### 3.3.8. Keywords

Keywords are structured according to the following pattern:

#+KEY: VALUE

KEY
A string consisting of any non-whitespace characters, other than call (which would form a babel call element).
VALUE
A string consisting of any characters but a newline.

Perhaps this should be changed to be #+KEY[OPT]: VAL? It would make the syntax more regular, considering affiliated keywords. I can’t see any backwards compatibility concerns.
This was suggested by Tom G, but I’m a fan — Timothy.

When KEY is a member of org-element-parsed-keywords [6], VALUE can contain the standard set of objects, excluding footnote references.

Note that while instances of this pattern are preferentially parsed as affiliated keywords, a keyword with the same KEY as an affiliated keyword may occur so long as it is not immediately preceding a valid element that can be affiliated. For example, an instance of #+caption: hi followed by a blank line will be parsed as a keyword, not an affiliated keyword.
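A minimal sketch of matching the #+KEY: VALUE pattern follows. The name `parse_keyword` is hypothetical; the sketch lowercases KEY because case is insignificant, tolerates a missing space after the colon (an assumption), and does not decide whether a keyword is affiliated, since that depends on the element that follows. It also does not exclude #+begin: lines, which form dynamic blocks.

```python
import re

KEYWORD_RE = re.compile(r"^[ \t]*#\+(?P<key>\S+?):[ \t]*(?P<value>.*?)[ \t]*$")


def parse_keyword(line):
    """Return (kind, key, value) for a keyword line, or None."""
    m = KEYWORD_RE.match(line)
    if m is None:
        return None
    key = m.group("key").lower()
    # A KEY of "call" forms a babel call rather than an ordinary keyword.
    kind = "babel-call" if key == "call" else "keyword"
    return kind, key, m.group("value")
```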

1. Babel Call

Babel calls are structured according to one of the following patterns:

#+call: NAME(ARGUMENTS)
#+call: NAME[HEADER1](ARGUMENTS)
#+call: NAME(ARGUMENTS)[HEADER2]
#+call: NAME[HEADER1](ARGUMENTS)[HEADER2]

NAME
A string consisting of any non-newline characters except for square brackets or parentheses ([]()).
ARGUMENTS (optional)
A string consisting of any non-newline characters. Opening and closing parentheses must be balanced.
HEADER1 (optional), HEADER2 (optional)
A string consisting of any non-newline characters. Opening and closing square brackets must be balanced.

Should this be distinguished from other keywords at the AST interpretation stage, instead of the base syntax? — Tom G

2. Affiliated Keywords

With the exception of comments, clocks, headings, inlinetasks, items, node properties, planning, property drawers, sections, and table rows, every other element type can be assigned attributes.

This is done by adding specific keywords, named affiliated keywords, immediately above the element considered (a blank line cannot lie between the affiliated keyword and element). Structurally, affiliated keywords are not considered elements in their own right but properties of the element they apply to.

Affiliated keywords are structured according to one of the following patterns:

#+KEY: VALUE
#+KEY[OPTVAL]: VALUE
#+attr_BACKEND: VALUE

KEY
A string which is a member of org-element-affiliated-keywords [8].
BACKEND
A string consisting of alphanumeric characters, hyphens, or underscores (-_).
OPTVAL (optional)
A string consisting of any characters but a newline. Opening and closing square brackets must be balanced. This term is only valid when KEY is a member of org-element-dual-keywords [9].
VALUE
A string consisting of any characters but a newline, except in the case where KEY is a member of org-element-parsed-keywords [6], in which case VALUE is a series of objects from the standard set, excluding footnote references.

Should this even be described at a syntax level instead of an AST processing level? — Tom G

Repeating an affiliated keyword before an element will usually result in the prior VALUEs being overwritten by the last instance of KEY. The sole exception to this is #+header: keywords, where in the case of multiple :opt val declarations the last declaration on the first line it occurs on has priority.

Maybe this should be first-line-wins for all affiliated keywords? This would be a breaking change though. — Timothy

There are two situations under which the VALUEs will be concatenated:

1. If KEY is a member of org-element-dual-keywords [9].
2. If the affiliated keyword is an instance of the pattern #+attr_BACKEND: VALUE.

When no element immediately follows an instance of the “affiliated keyword” pattern, the keyword is a normal, non-affiliated keyword.

The following example contains three affiliated keywords:

#+name: image-name
#+caption: This is a caption for
#+caption: the image linked below
[[file:some/image.png]]


#### 3.3.9. LaTeX Environments

LaTeX environments are structured according to the following pattern:

\begin{NAME}
CONTENTS
\end{NAME}

NAME
A non-empty string consisting of alphanumeric or asterisk characters.
CONTENTS (optional)
A string which does not contain the substring \end{NAME}.

Examples

\begin{align*}
2x - 5y &= 8 \\
3x + 9y &= -12
\end{align*}


#### 3.3.10. Node Properties

Node properties can only exist in property drawers, and are structured according to one of the following patterns:

:NAME: VALUE
:NAME:
:NAME+: VALUE
:NAME+:

NAME
A non-empty string containing any non-whitespace characters which does not end in a plus character (+).
VALUE (optional)
A string containing any characters but a newline.
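The four node property patterns differ only in the optional plus sign and the optional VALUE, so one expression can cover them. This is a sketch under assumptions: `parse_node_property` is a hypothetical name, and the caller is expected to have already excluded the :properties: and :end: lines of the drawer, which this expression would otherwise also match.

```python
import re

NODE_PROPERTY_RE = re.compile(
    r"^[ \t]*:(?P<name>\S+?)(?P<plus>\+)?:"   # :NAME: or :NAME+:
    r"(?:[ \t]+(?P<value>.*?))?[ \t]*$"       # optional VALUE
)


def parse_node_property(line):
    """Return (name, has_plus, value) for a node property line, or None."""
    m = NODE_PROPERTY_RE.match(line)
    if m is None:
        return None
    return m.group("name"), m.group("plus") is not None, m.group("value")
```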

#### 3.3.11. Paragraphs

Paragraphs are the default element, which means that any unrecognized context is a paragraph.

Empty lines and other elements end paragraphs.

Paragraphs can contain the standard set of objects.

#### 3.3.12. Table Rows

A table row consists of a vertical bar (|) followed by:

• Any number of table cells, forming a “standard” type row.
• A hyphen (-), forming a “rule” type row. Any non-newline characters can follow the hyphen and this will still be a “rule” type row.

Table rows can only exist in tables.
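The distinction between the two row types can be sketched as follows. The name `classify_table_row` is hypothetical, cell contents are returned as trimmed strings rather than parsed objects, and the sketch assumes that the final vertical bar of a row may be omitted.

```python
def classify_table_row(line: str):
    """Return ("rule", []) or ("standard", cells), or None if not a table row."""
    stripped = line.strip()
    if not stripped.startswith("|"):
        return None
    if stripped.startswith("|-"):
        # Anything may follow the hyphen; the row is still a rule.
        return ("rule", [])
    body = stripped[1:]
    if body == "":
        return ("standard", [])  # a lone bar: zero cells
    if body.endswith("|"):
        body = body[:-1]  # drop the closing bar of the last cell
    return ("standard", [cell.strip() for cell in body.split("|")])
```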

## 4. Objects

Objects can only be found in the following elements:

Most objects cannot contain objects. Those which can will be specified. Furthermore, while many objects may contain newlines, a blank line often terminates the element that the object is a part of, such as a paragraph.

### 4.1. Entities

Entities are structured according to the following pattern:

\NAME POST


Where NAME and POST are not separated by a whitespace character.

NAME
A string with a valid association in either org-entities [10] or org-entities-user.
POST
Either:
• The end of line.
• The string {}.
• A non-alphabetic character.

Example

\cent


### 4.2. LaTeX Fragments

LaTeX fragments are structured according to one of the following patterns:

\NAME BRACKETS
\(CONTENTS\)
\[CONTENTS\]

NAME
A string consisting of alphabetic characters which does not have an association in either org-entities or org-entities-user.
BRACKETS (optional)

An instance of one of the following patterns, not separated from NAME by whitespace.

[CONTENTS1]
{CONTENTS2}

CONTENTS1
A string consisting of any characters but {, }, [, ], or a newline.
CONTENTS2
A string consisting of any characters but {, }, or a newline.
CONTENTS
A string consisting of any characters, so long as it does not contain the substring \) in the case of the second template, or \] in the case of the third template.

Examples

\enlargethispage{2\baselineskip}
\(e^{i \pi}\)


Org also supports TeX-style inline LaTeX fragments, structured according to one of the following patterns:

$$CONTENTS$$
PRE$CHAR$POST
PRE$BORDER1 BODY BORDER2$POST

PRE
Either the beginning of line or a character other than $.
CHAR
A non-whitespace character that is not ., ,, ?, ;, or a double quote (").
POST
Any punctuation character (including parentheses and quotes), a space character, or the end of line.
BORDER1
A non-whitespace character that is not ., ,, ;, or $.
BODY
A string consisting of any characters except $, and which does not span more than three lines.
BORDER2
A non-whitespace character that is not ., ,, or $.

Example

$$1+1=2$$


It would introduce incompatibilities with previous Org versions, but support for $...$ (and for symmetry, $$...$$) constructs ought to be removed.

They are slow to parse, fragile, redundant and imply false positives. — NGZ

Strong support for removing these. — Tom G

I’m strongly in support of dropping $-syntax. — Timothy

### 4.3. Export Snippets

Export snippets are structured according to the following pattern:

@@BACKEND:VALUE@@

BACKEND
A string consisting of zero or more alphanumeric characters and hyphens.
VALUE (optional)
A string containing anything but the string @@.

### 4.4. Footnote References

Footnote references are structured according to one of the following patterns:

[fn:LABEL]
[fn:LABEL:DEFINITION]
[fn::DEFINITION]

LABEL
A string containing one or more word constituent characters, hyphens and underscores (-_).
DEFINITION (optional)
One or more objects from the standard set, so long as opening and closing square brackets are balanced within DEFINITION.

If the reference follows the second pattern, it is called an “inline footnote”. If it follows the third pattern, i.e. if LABEL is omitted, it is called an “anonymous footnote”.

Note that the first pattern may not occur on an unindented line, as it is then a footnote definition.

### 4.5. Citations

Citations are structured according to the following pattern:

[cite CITESTYLE: GLOBALPREFIX REFERENCES GLOBALSUFFIX]

Where “cite” and CITESTYLE, REFERENCES and GLOBALSUFFIX are not separated by whitespace. REFERENCES, GLOBALPREFIX, and GLOBALSUFFIX must be separated by semicolons. Whitespace after the leading colon or before the closing square bracket is not significant. All other whitespace is significant.

CITESTYLE (optional)
An instance of either the pattern /STYLE or /STYLE/VARIANT
STYLE
A string made of any alphanumeric character, _, or -.
VARIANT
A string made of any alphanumeric character, _, -, or /.
GLOBALPREFIX (optional)
One or more objects from the standard set, so long as all square brackets are balanced within GLOBALPREFIX, and it does not contain any semicolons (;) or subsequence that matches @KEY.
REFERENCES
One or more citation reference objects, separated by semicolons (;).
GLOBALSUFFIX (optional)
One or more objects from the standard set, so long as all square brackets are balanced within GLOBALSUFFIX, and it does not contain any semicolons (;) or subsequence that matches @KEY.

Examples

[cite:@key]
[cite/t:see;@foo p. 7;@bar pp. 4;by foo]
[cite/a/f:c.f.;the very important @@atkey @ once;the crucial @baz vol. 3]

### 4.6. Citation references

A reference to an individual resource is given in a citation reference object. Citation references are only found within citations, and are structured according to the following pattern:

KEYPREFIX @KEY KEYSUFFIX

Where KEYPREFIX, @KEY, and KEYSUFFIX are not separated by whitespace.

KEYPREFIX (optional)
One or more objects from the minimal set, so long as all square brackets are balanced within KEYPREFIX, and it does not contain any semicolons (;) or subsequence that matches @KEY.
KEY
A string made of any word-constituent character, -, ., :, ?, !, `, ', /, *, @, +, |, (, ), {, }, <, >, &, _, ^, $, #, %, or ~.
KEYSUFFIX (optional)
One or more objects from the minimal set, so long as all square brackets are balanced within KEYSUFFIX, and it does not contain any semicolons (;).

### 4.7. Inline Babel Calls

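Inline Babel calls are structured according to one of the following patterns:

call_NAME(ARGUMENTS)
call_NAME[HEADER1](ARGUMENTS)
call_NAME(ARGUMENTS)[HEADER2]
call_NAME[HEADER1](ARGUMENTS)[HEADER2]

NAME
A string consisting of any non-whitespace characters except for square brackets or parentheses ([]()).
ARGUMENTS (optional), HEADER1 (optional), HEADER2 (optional)
A string consisting of any characters but a newline. Opening and closing square brackets must be balanced.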

### 4.8. Inline Source Blocks

Inline source blocks follow any of the following patterns:

src_LANG{BODY}
src_LANG[HEADERS]{BODY}

LANG
A string consisting of any non-whitespace characters.
HEADERS (optional), BODY (optional)
A string consisting of any characters but a newline. Opening and closing square brackets must be balanced.

### 4.9. Line Breaks

Line breaks must occur at the end of an otherwise non-blank line, and are structured according to the following pattern:

\\SPACE

SPACE
Zero or more tab and space characters.

### 4.11. Macros

Macros are structured according to one of the following patterns:

{{{NAME}}}
{{{NAME(ARGUMENTS)}}}

NAME
A string starting with an alphabetic character followed by any number of alphanumeric characters, hyphens and underscores (-_).
ARGUMENTS (optional)
A string consisting of any characters, so long as it does not contain the substring }}}. Values within ARGUMENTS are separated by commas. Non-separating commas have to be escaped with a backslash character.

Examples

{{{title}}}
{{{one_arg_macro(1)}}}
{{{two_arg_macro(1, 2)}}}
{{{two_arg_macro(1\,a, 2)}}}
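The comma-escaping rule for ARGUMENTS can be illustrated with a short Python sketch. The name `parse_macro` is hypothetical, NAME is limited to ASCII characters, and trimming whitespace around each argument is an assumption made for readability rather than something the text above specifies.

```python
import re

MACRO_RE = re.compile(
    r"\{\{\{(?P<name>[A-Za-z][A-Za-z0-9_-]*)(?:\((?P<args>.*?)\))?\}\}\}",
    re.DOTALL,
)


def parse_macro(text):
    """Return (name, arguments) for a string that is exactly one macro, or None."""
    m = MACRO_RE.fullmatch(text)
    if m is None:
        return None
    raw = m.group("args")
    if raw is None:
        return m.group("name"), []
    # Split on commas that are not preceded by a backslash, then unescape.
    parts = re.split(r"(?<!\\),", raw)
    return m.group("name"), [p.replace("\\,", ",").strip() for p in parts]
```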


### 4.12. Targets and Radio Targets

Targets are structured according to the following pattern:

<<TARGET>>

TARGET
A string containing any character but <, >, or \n. It cannot start or end with a whitespace character.

Radio targets are structured according to the following pattern:

<<<CONTENTS>>>

CONTENTS
One or more objects from the minimal set, starting and ending with a non-whitespace character, and containing any character but <, >, or \n.

### 4.13. Statistics Cookies

Statistics cookies are structured according to one of the following patterns:

[PERCENT%]
[NUM1/NUM2]

PERCENT (optional)
A number.
NUM1 (optional)
A number.
NUM2 (optional)
A number.
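Because every number in a statistics cookie is optional, forms such as [%] and [/] are valid. The sketch below shows one way to match both patterns; `parse_cookie` is a hypothetical name and numbers are assumed to be non-negative decimal integers.

```python
import re

COOKIE_RE = re.compile(
    r"\[(?:(?P<percent>[0-9]*)%|(?P<num1>[0-9]*)/(?P<num2>[0-9]*))\]"
)


def parse_cookie(text):
    """Return ("percent", n) or ("fraction", n1, n2); missing numbers are None."""
    m = COOKIE_RE.fullmatch(text)
    if m is None:
        return None

    def to_int(s):
        return int(s) if s else None

    if m.group("percent") is not None:
        return ("percent", to_int(m.group("percent")))
    return ("fraction", to_int(m.group("num1")), to_int(m.group("num2")))
```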

### 4.14. Subscript and Superscript

Subscripts are structured according to the following pattern:

CHAR_SCRIPT


Superscripts are structured according to the following pattern:

CHAR^SCRIPT

CHAR
Any non-whitespace character.
SCRIPT
One of the following constructs:
• A single asterisk character (*).
• An expression enclosed in curly brackets ({, }), which may itself contain balanced curly brackets.
• An instance of the pattern:

SIGN CHARS FINAL


With no whitespace between SIGN, CHARS and FINAL.

SIGN
Either a plus sign character (+), a minus sign character (-), or the empty string.
CHARS
Either the empty string, or a string consisting of any number of alphanumeric characters, commas, backslashes, and dots.
FINAL
An alphanumeric character.
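The three SCRIPT constructs can be checked in order: asterisk, braced expression, then SIGN CHARS FINAL. In this sketch `match_script` is a hypothetical helper that returns the index just past the SCRIPT, letters and digits are limited to ASCII, and the CHAR and _ or ^ that precede the SCRIPT are assumed to have been matched by the caller.

```python
import re

# SIGN CHARS FINAL with no whitespace in between; FINAL must be alphanumeric.
SIMPLE_SCRIPT_RE = re.compile(r"[+-]?[A-Za-z0-9,\\.]*[A-Za-z0-9]")


def match_script(text, start):
    """Return the end index of a SCRIPT beginning at `start`, or None."""
    if start >= len(text):
        return None
    if text[start] == "*":
        return start + 1
    if text[start] == "{":
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:  # the brace opened at `start` is now closed
                    return i + 1
        return None  # unbalanced braces
    m = SIMPLE_SCRIPT_RE.match(text, start)
    return m.end() if m else None
```

Because FINAL must be alphanumeric, a trailing dot or comma is left outside the script, so a sentence-ending period after x^2 is not swallowed.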

### 4.15. Table Cells

Table cells are structured according to the following pattern:

CONTENTS SPACES|

CONTENTS
Zero or more objects not containing the vertical bar character (|). It can contain the minimal set of objects, citations, export snippets, footnote references, links, macros, radio targets, targets, and timestamps.
SPACES
A string consisting of zero or more of space characters, used to align the table columns.

The final vertical bar (|) may be omitted in the last cell of a table row.

### 4.16. Timestamps

Timestamps are structured according to one of the seven following patterns:

<%%(SEXP)>                                                     (diary)
<DATE TIME REPEATER-OR-DELAY>                                  (active)
[DATE TIME REPEATER-OR-DELAY]                                  (inactive)
<DATE TIME REPEATER-OR-DELAY>--<DATE TIME REPEATER-OR-DELAY>   (active range)
<DATE TIME-TIME REPEATER-OR-DELAY>                             (active range)
[DATE TIME REPEATER-OR-DELAY]--[DATE TIME REPEATER-OR-DELAY]   (inactive range)
[DATE TIME-TIME REPEATER-OR-DELAY]                             (inactive range)

SEXP
A string consisting of any characters but > and \n.
DATE

An instance of the pattern:

YYYY-MM-DD DAYNAME

Y, M, D
A digit.
DAYNAME (optional)
A string consisting of non-whitespace characters except +, -, ], >, a digit, or \n.
TIME (optional)
An instance of the pattern H:MM where H represents a one to two digit number (and can start with 0), and M represents a single digit.
REPEATER-OR-DELAY (optional)

An instance of the following pattern:

MARK VALUE UNIT


Where MARK, VALUE and UNIT are not separated by whitespace characters.

MARK
Either the string + (cumulative type), ++ (catch-up type), or .+ (restart type) when forming a repeater, and either - (all type) or -- (first type) when forming a warning delay.
VALUE
A number.
UNIT
Either the character h (hour), d (day), w (week), m (month), or y (year).

There can be two instances of REPEATER-OR-DELAY in the timestamp: one as a repeater and one as a warning delay.

Tom G has some syntax extensions he’d like to suggest for historical / far-future dates, timezone offsets, and second/sub-second times.

Examples

<1997-11-03 Mon 19:15>
<%%(diary-float t 4 2)>
[2004-08-24 Tue]--[2004-08-26 Thu]
<2012-02-08 Wed 20:00 ++1d>
<2030-10-05 Sat +1m -3d>
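
A minimal sketch of how the single-timestamp patterns could be recognised with a regular expression follows. It is not the expression used by org-element.el. The names `parse_timestamp`, `_TIMESTAMP` and `_REPEATER_OR_DELAY` are hypothetical, VALUE is assumed to be an unsigned integer, and the diary form and the `--` range forms are deliberately left out.

```python
import re

# MARK VALUE UNIT with no whitespace in between (VALUE as integer: assumption).
_REPEATER_OR_DELAY = r"(?:\+\+|\.\+|\+|--|-)\d+[hdwmy]"

_TIMESTAMP = re.compile(
    r"(?P<open>[<\[])"
    r"(?P<date>\d{4}-\d{2}-\d{2})"                    # YYYY-MM-DD
    r"(?: +(?P<dayname>[^\s+\-\]>\d]+))?"             # optional DAYNAME
    r"(?: +(?P<start>\d{1,2}:\d{2})"                  # optional TIME
    r"(?:-(?P<end>\d{1,2}:\d{2}))?)?"                 # optional TIME-TIME range
    r"(?P<marks>(?: +" + _REPEATER_OR_DELAY + r"){0,2})"
    r" *(?P<close>[>\]])"
)


def parse_timestamp(text):
    """Parse one active or inactive timestamp; return a dict or None."""
    m = _TIMESTAMP.fullmatch(text)
    # Brackets must pair up: <...> is active, [...] is inactive.
    if m is None or (m["open"], m["close"]) not in {("<", ">"), ("[", "]")}:
        return None
    marks = m["marks"].split()
    return {
        "active": m["open"] == "<",
        "date": m["date"],
        "dayname": m["dayname"],
        "start": m["start"],
        "end": m["end"],
        # Repeater marks start with + or ., warning delay marks with -.
        "repeater": next((x for x in marks if x[0] in "+."), None),
        "delay": next((x for x in marks if x[0] == "-"), None),
    }
```

Applied to `<2030-10-05 Sat +1m -3d>`, the sketch separates the repeater `+1m` from the warning delay `-3d`; a timestamp with mismatched brackets is rejected.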


### 4.17. Text Markup

There are six text markup objects, which are all structured according to the following pattern:

PRE MARKER CONTENTS MARKER POST


Where PRE, MARKER, CONTENTS, MARKER and POST are not separated by whitespace characters.

PRE
Either a whitespace character, -, (, {, ', ", or the beginning of a line.
MARKER
A character that determines the object type, as follows:
• *, a bold object,
• /, an italic object,
• _, an underline object,
• =, a verbatim object,
• ~, a code object,
• +, a strike-through object.
CONTENTS

An instance of the pattern:

BORDER BODY BORDER


Where BORDER and BODY are not separated by whitespace.

BORDER
Any non-whitespace character.
BODY
Either a string (when MARKER represents code or verbatim) or a series of objects from the standard set, not spanning more than three lines.
POST
Either a whitespace character, -, ., ,, ;, :, !, ?, ', ), }, [, ", or the end of a line.

Examples

Org is a /plaintext markup syntax/ developed with *Emacs* in 2003.
The canonical parser is =org-element.el=, which provides a number of
functions starting with ~org-element-~.
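
The PRE MARKER CONTENTS MARKER POST pattern can be approximated with a single regular expression, as in the sketch below. It is a simplification under stated assumptions: the names `find_markup`, `_MARKUP` and `_KINDS` are hypothetical, only markup contained on one line is found, nested markup is not parsed, and single-character contents are accepted although the BORDER BODY BORDER pattern does not spell that case out.

```python
import re

_KINDS = {"*": "bold", "/": "italic", "_": "underline",
          "=": "verbatim", "~": "code", "+": "strike-through"}

_MARKUP = re.compile(
    r"(?:^|(?<=[\s\-({'\"]))"        # PRE: start of line or an allowed character
    r"(?P<marker>[*/_=~+])"          # MARKER
    r"(?P<contents>\S(?:.*?\S)?)"    # CONTENTS: non-whitespace at both borders
    r"(?P=marker)"                   # the same MARKER again
    r"(?=[\s\-.,;:!?')}\[\"]|$)",    # POST: allowed character or end of line
    re.MULTILINE,
)


def find_markup(text):
    """Return a list of (kind, contents) pairs for single-line markup."""
    return [(_KINDS[m["marker"]], m["contents"])
            for m in _MARKUP.finditer(text)]
```

Run on the example paragraph above, this reports one italic, one bold, one verbatim and one code object. An expression such as `2 * 3 * 4` is not matched, because whitespace follows the opening marker.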


#### 4.17.1. Plain Text

Any string that doesn’t match any other object can be considered a plain text object.12 Within a plain text object, all whitespace is collapsed to a single space. For instance, hello\n there is equivalent to hello there.
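
The collapsing rule can be stated in one line of code. The helper name `collapse_whitespace` is hypothetical, and the sketch describes the semantic equivalence only, not how a parser stores the text.

```python
import re


def collapse_whitespace(text):
    """Replace every run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text)
```

With this helper, `collapse_whitespace("hello\n there")` and `"hello there"` compare equal.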

## 5. Appendix

### 5.1. Summary of changes compared to the current org-syntax document

• Rename “Headlines” -> “Headings”. Both forms are currently used in the docs, but a change to consistently use the latter seems imminent, if delayed by (what looks like) an ongoing wait for Bastien’s final say
• Describe patterns with consistent phrasing, “Xs are structured according to the following pattern:”
• Describe string patterns with consistent phrasing, “a string constituted of” (and other forms) -> “a string consisting of”
• Describe components of a pattern using description lists
• Use verbatim objects for verbatim text over quotes
• Change the way inlinetasks are described
• Add CONTENTS component to the Item structure
• Some whitespace/capitalisation changes
• (Lots of) miscellaneous wording changes for clarity
• Fix some minor errors (like referencing a variable which was removed seven years ago, or saying that switches in source block headers should be separated by blank lines)
• Change the babel call element syntax description to the more detailed form found in the manual
• Change the inline babel call object syntax description to be consistent with the babel call element syntax. This does not precisely match the parser behaviour, but matches a very slight subset. The previous description in some parts matched a superset of the parser behaviour, and in other places a subset.
• Change “Greater Elements” / “Elements” to “Greater Elements” / “Lesser Elements”
• Put all Elements under the new level-1 heading “Elements”
• Separate list definition into four sub-definitions
• Add a “Terminology and conventions” section
• Mention plain-text objects (see plain-text) for the sake of consistency (When something can “contain an object” it can contain unformatted text. Without naming plain-text as an object this is a bit funky).
• Specify that whitespace in plain text is semantically collapsed/equivalent to a single space. It is worth noting that this is not indicated by org-element’s parsing, which grabs all the whitespace as-is. However, this feels like something which is done for performance reasons, instead of a deliberate choice to make whitespace significant, and there are a few things which reinforce this view:
    • ox-ascii, the only export backend to a format which doesn’t itself collapse whitespace when interpreted, re-fills paragraphs and collapses whitespace.
    • We have a line break object. If \n was significant in plain text this would be unnecessary.
    • org-fill-paragraph collapses whitespace.
    • Lastly, this is well-established sensible behaviour in every other plaintext format that I can think of (HTML, LaTeX, Markdown, reST, etc.).
• Probably a few bits and pieces that have slipped my mind.

### 5.2. Org Entities

Name Character
Letters
LATIN
Agrave À
agrave à
Aacute Á
aacute á
Acirc Â
acirc â
Amacr Ā
amacr ā
Atilde Ã
atilde ã
Auml Ä
auml ä
Aring Å
AA Å
aring å
AElig Æ
aelig æ
Ccedil Ç
ccedil ç
Egrave È
egrave è
Eacute É
eacute é
Ecirc Ê
ecirc ê
Euml Ë
euml ë
Igrave Ì
igrave ì
Iacute Í
iacute í
Idot İ
inodot ı
Icirc Î
icirc î
Iuml Ï
iuml ï
Ntilde Ñ
ntilde ñ
Ograve Ò
ograve ò
Oacute Ó
oacute ó
Ocirc Ô
ocirc ô
Otilde Õ
otilde õ
Ouml Ö
ouml ö
Oslash Ø
oslash ø
OElig Œ
oelig œ
Scaron Š
scaron š
szlig ß
Ugrave Ù
ugrave ù
Uacute Ú
uacute ú
Ucirc Û
ucirc û
Uuml Ü
uuml ü
Yacute Ý
yacute ý
Yuml Ÿ
yuml ÿ
LATIN (SPECIAL FACE)
fnof ƒ
real
image
weierp
ell
imath ı
jmath ȷ
GREEK
Alpha Α
alpha α
Beta Β
beta β
Gamma Γ
gamma γ
Delta Δ
delta δ
Epsilon Ε
epsilon ε
varepsilon ε
Zeta Ζ
zeta ζ
Eta Η
eta η
Theta Θ
theta θ
thetasym ϑ
vartheta ϑ
Iota Ι
iota ι
Kappa Κ
kappa κ
Lambda Λ
lambda λ
Mu Μ
mu μ
nu ν
Nu Ν
Xi Ξ
xi ξ
Omicron Ο
omicron ο
Pi Π
pi π
Rho Ρ
rho ρ
Sigma Σ
sigma σ
sigmaf ς
varsigma ς
Tau Τ
Upsilon Υ
upsih ϒ
upsilon υ
Phi Φ
phi φ
varphi ϕ
Chi Χ
chi χ
acutex ´x
Psi Ψ
psi ψ
tau τ
Omega Ω
omega ω
piv ϖ
varpi ϖ
partial
HEBREW
alefsym
aleph
gimel
beth
dalet
ICELANDIC
ETH Ð
eth ð
THORN Þ
thorn þ
Punctuation
DOTS AND MARKS
dots
cdots
hellip
middot ·
iexcl ¡
iquest ¿
DASH-LIKE
shy ­
ndash
mdash
QUOTATIONS
quot "
acute ´
ldquo
rdquo
bdquo
lsquo
rsquo
sbquo
laquo «
raquo »
lsaquo
rsaquo
Other
MISC. (OFTEN USED)
circ ˆ
vert |
vbar |
brvbar ¦
S §
sect §
amp &
lt <
gt >
tilde ~
slash /
plus +
under _
equal =
asciicirc ^
dagger
dag
Dagger
ddag
WHITESPACE
nbsp
ensp
emsp
thinsp
CURRENCY
curren ¤
cent ¢
pound £
yen ¥
euro
EUR
dollar $
USD $
PROPERTY MARKS
copy ©
reg ®
trade
SCIENCE ET AL.
minus
pm ±
plusmn ±
times ×
frasl
colon :
div ÷
frac12 ½
frac14 ¼
frac34 ¾
permil
sup1 ¹
sup2 ²
sup3 ³
radic
sum
prod
micro µ
macr ¯
deg °
prime
Prime
infin
infty
prop
propto
not ¬
neg ¬
land
wedge
lor
vee
cap
cup
smile
frown
int
therefore
there4
because
sim
cong
simeq
asymp
approx
ne
neq
equiv
triangleq
le
leq
ge
geq
lessgtr
lesseqgtr
ll
Ll
lll
gg
Gg
ggg
prec
preceq
preccurlyeq
succ
succeq
succcurlyeq
sub
subset
sup
supset
nsub
sube
nsup
supe
setminus
forall
exist
exists
nexist
nexists
empty
emptyset
isin
in
notin
ni
nabla
ang
angle
perp
parallel
sdot
cdot
lceil
rceil
lfloor
rfloor
lang
rang
langle
rangle
hbar
mho
ARROWS
larr
leftarrow
gets
lArr
Leftarrow
uarr
uparrow
uArr
Uparrow
rarr
to
rightarrow
rArr
Rightarrow
darr
downarrow
dArr
Downarrow
harr
leftrightarrow
hArr
Leftrightarrow
crarr
hookleftarrow
FUNCTION NAMES
arccos arccos
arcsin arcsin
arctan arctan
arg arg
cos cos
cosh cosh
cot cot
coth coth
csc csc
deg °
det det
dim dim
exp exp
gcd gcd
hom hom
inf inf
ker ker
lg lg
lim lim
liminf liminf
limsup limsup
ln ln
log log
max max
min min
Pr Pr
sec sec
sin sin
sinh sinh
sup
tan tan
tanh tanh
SIGNS & SYMBOLS
bull
bullet
star *
lowast
ast
odot o
oplus
otimes
check
checkmark
MISCELLANEOUS (SELDOM USED)
para
ordf ª
ordm º
cedil ¸
oline
uml ¨
zwnj
zwj
lrm
rlm
SMILIES
smiley
blacksmile
sad
frowny
SUITS
clubs
clubsuit
spades
spadesuit
hearts
heartsuit
diams
diamondsuit
diamond
Diamond
loz
_ plus one to twenty spaces (twenty entities): whitespace of the corresponding width

## Footnotes:

1

In particular, the parser requires stars at column 0 to be quoted by a comma when they do not define a heading.

2

It also means that only headings and sections can be recognized just by looking at the beginning of the line. Planning lines and property drawers can be recognized by looking at one or two lines above.

As a consequence, using org-element-at-point or org-element-context will move up to the parent heading, and parse top-down from there until context around the original location is found.

3

By default, org-todo-keywords-1 only contains TODO and DONE; however, org-todo-keywords-1 is set on a per-document basis.

4

Implementation note: todo keywords cannot be hardcoded in a tokenizer; the tokenizer must be configurable at runtime so that in-file todo keywords are properly interpreted.

5

The default value of org-inlinetask-min-level is 15.

6

By default, org-element-parsed-keywords contains CAPTION.

7

A common abbreviation for S-expression.

8

By default, org-element-affiliated-keywords contains CAPTION, DATA, HEADER, NAME, PLOT, and RESULTS.

9

By default, org-element-dual-keywords contains CAPTION and RESULTS.

10

See the appendix for a list of entities.

11

By default, org-link-parameters defines links of type shell, news, mailto, https, http, ftp, help, file, and elisp.

12

In org-element.el, plain text objects are abstracted away to strings for performance reasons.

