#+title: Org Syntax
#+subtitle: DRAFT v2_{\beta}
#+author: Nicolas Goaziou, Timothy E Chapman
#+options: toc:t ':t author:nil
#+language: en
#+category: worg
#+bind: sentence-end-double-space t
#+html_link_up: index.html
#+html_link_home: https://orgmode.org/worg/
#+begin_comment
This file is released by its authors and contributors under the GNU
Free Documentation license v1.3 or later, code examples are released
under the GNU General Public License v3 or later.
#+end_comment
* Introduction
Org is a plaintext format composed of simple, yet versatile, forms
which represent formatting and structural information. It is designed
to be both intuitive to use, and capable of representing complex
documents. Like Markdown ([[https://datatracker.ietf.org/doc/html/rfc7763][RFC7763]]), Org may be considered a
lightweight markup language. However, while Markdown refers to a
collection of similar syntaxes, Org is a single syntax.
#+begin_notes
Should markdown be mentioned at all?
#+end_notes
This document describes and comments on Org syntax as it is currently
read by its parser (=org-element.el=) and, therefore, by the export
framework. This is intended as a technical document for developers and
those particularly interested in the syntax. Most users will be better
served by [[https://orgmode.org/manual/][the Org manual]].
* Terminology and conventions
** Objects and Elements
The components of this syntax can be divided into two classes:
"[[#Objects][objects]]" and "[[#Elements][elements]]". To better understand these classes,
consider the paragraph as a unit of measurement. /Elements/ are
syntactic components that exist at the same or greater scope than a
paragraph, i.e. which could not be contained by a paragraph.
Conversely, /objects/ are syntactic components that exist with a smaller
scope than a paragraph, and so can be contained within a paragraph.
Elements can be stratified into "[[#Headings][headings]]", "[[#Sections][sections]]", "[[#Greater_Elements][greater
elements]]", and "[[#Lesser_Elements][lesser elements]]", from broadest scope to
narrowest. Along with objects, these sub-classes define categories of
syntactic environments. Only [[#Headings][headings]], [[#Sections][sections]], [[#Property_Drawers][property drawers]], and
[[#Planning][planning lines]] are context-free[fn:1][fn:2]; every other syntactic
component only exists within specific environments. This is a core
concept of the syntax.
Expanding on the stratification of elements, lesser elements are
elements that cannot contain any other elements. As such, a paragraph
is considered a lesser element. Greater elements can themselves
contain greater elements or lesser elements. Sections contain both
greater and lesser elements, and headings can contain a section and
other headings.
** The minimal and standard sets of objects
To simplify references to common collections of objects, we define two
useful sets. The /<<<minimal set>>> of objects/ refers to [[#Plain_Text][plain text]], [[#Emphasis_Markers][text
markup]], [[#Entities][entities]], [[#LaTeX_Fragments][LaTeX fragments]], and [[#Subscript_and_Superscript][superscripts and subscripts]]. The
/<<<standard set>>> of objects/ refers to the entire set of objects, excluding
citation references and [[#Table_Cells][table cells]].
** Blank lines
A line containing only spaces, tabs, newlines, and carriage returns (=\t\n\r=)
is considered a /blank line/. Blank lines can be used to separate
paragraphs and other elements.
With the exception of [[#Items][list items]], blank lines belong to the preceding
element with the narrowest possible scope. For example, if at the end
of a section we have a paragraph and a blank line, that blank line is
considered part of the paragraph.
** Indentation
Indentation consists of a series of space and tab characters at the
beginning of a line. Most elements can be indented, with the
exception of [[#Headings][headings]], [[#Inlinetasks][inlinetasks]], [[#Footnote_Definitions][footnote definitions]], and [[#Diary_Sexp][diary
sexps]]. Indentation is only syntactically meaningful in plain lists.
** Syntax patterns
*** General form
Most elements and objects will be described with the help of syntax
patterns, consisting of a series of named tokens written in uppercase
and separated by a space, like so:
#+begin_example
TOKEN1 TOKEN2
#+end_example
These tokens are often named roughly according to their semantic
meaning; for instance, "KEY" and "VALUE" when describing
[[#Keywords][Keywords]]. Tokens will be specified as either a string, or a series of
elements or objects.
#+attr_latex: :options [Important]
#+begin_info
Unless otherwise specified, a space in a pattern represents one or
more horizontal whitespace characters.
#+end_info
Patterns will often also contain static structures that serve to
differentiate a particular element or object type from others, but
have no semantic meaning. These are simply included in the pattern
verbatim. For instance, if a pattern consists of two plus signs (=+=)
immediately followed by a TOKEN it would be written like so:
#+begin_example
++TOKEN
#+end_example
Since tokens are written in uppercase, any letters in static
structures are distinguished by being written in lowercase.
*** Special tokens
:PROPERTIES:
:CUSTOM_ID: Special_Tokens
:END:
In a few cases, an instance of an element or object must be preceded
or succeeded by a certain pattern, which is not itself part of the
element or object. These patterns are specified using the /PRE/ and
/POST/ tokens respectively, like so:
#+begin_example
PRE TOKEN POST
#+end_example
*** Case significance
In this document, unless specified otherwise, case is insignificant.
* Elements
:PROPERTIES:
:CUSTOM_ID: Elements
:END:
** Headings and Sections
:PROPERTIES:
:CUSTOM_ID: Headings_and_Sections
:END:
*** Headings
:PROPERTIES:
:CUSTOM_ID: Headings
:END:
A Heading is an /unindented/ line structured according to the following pattern:
#+begin_example
STARS KEYWORD PRIORITY TITLE TAGS
#+end_example
+ STARS :: A string consisting of one or more asterisks (up to
~org-inlinetask-min-level~ if the =org-inlinetask= library is loaded)
suffixed by a space character. The number of asterisks is used to
define the level of the heading.
+ KEYWORD (optional) :: A string which is a member of
~org-todo-keywords-1~[fn:otkw1:By default, ~org-todo-keywords-1~ only
contains =TODO= and =DONE=, however ~org-todo-keywords-1~ is set on a
per-document basis.].
Case is significant. This is called a "todo keyword". [fn::Implementation note:
todo keywords cannot be hardcoded in a tokenizer, the tokenizer must
be configurable at runtime so that in-file todo keywords are properly
interpreted.]
+ PRIORITY (optional) :: A single alphanumeric character preceded by a
hash sign =#= and enclosed within square brackets (e.g. =[#A]= or =[#1]=). This
is called a "priority cookie".
+ TITLE (optional) :: A series of objects from the standard set,
excluding line break objects. It is matched after every other part.
+ TAGS (optional) :: A series of colon-separated strings consisting of
alpha-numeric characters, underscores, at signs, hash signs, and
percent signs (=_@#%=).
*Examples*
#+begin_example
,*
,** DONE
,*** Some e-mail
,**** TODO [#A] COMMENT Title :tag:a2%:
#+end_example
If the first word appearing in the title is =COMMENT=, the heading
will be considered as "commented". Case is significant.
If the TITLE of a heading is exactly the value of ~org-footnote-section~
(=Footnotes= by default), it will be considered as a "footnote section".
Case is significant.
If =ARCHIVE= is one of the tags given, the heading will be considered as
"archived". Case is significant.
All content following a heading, up to either the next heading or the end of the
document, forms a section contained by the heading. A section is optional: when the
next heading occurs immediately, no section is formed.
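As a rough illustration of the heading pattern above, the following
Python sketch splits a heading line into its components. It is not how
=org-element.el= works: the function name =parse_heading=, the fixed
keyword tuple, and the regular expressions are assumptions made for
this example, and the TITLE is returned as a plain string rather than
as parsed objects.
#+begin_src python
import re

# Assumed todo keywords; a real parser must read them from the document's configuration.
TODO_KEYWORDS = ("TODO", "DONE")

# TAGS sit at the end of the line: colon-separated words that may also
# contain at signs, hash signs, and percent signs.
TAGS_RE = re.compile(r"(?:^|[ \t]+)(:(?:[\w@#%]+:)+)[ \t]*$")


def parse_heading(line, todo_keywords=TODO_KEYWORDS):
    """Split a heading line into its parts, or return None for non-headings."""
    m = re.match(r"(\*+) (.*)$", line)
    if m is None:
        return None
    level, rest = len(m.group(1)), m.group(2).strip()
    keyword = priority = None
    tags = []
    # KEYWORD: case-sensitive comparison with the configured todo keywords.
    first, _, tail = rest.partition(" ")
    if first in todo_keywords:
        keyword, rest = first, tail.lstrip()
    # PRIORITY: a cookie such as [#A] or [#1].
    cookie = re.match(r"\[#([A-Za-z0-9])\][ \t]*", rest)
    if cookie:
        priority, rest = cookie.group(1), rest[cookie.end():]
    # TAGS are removed from the end; TITLE is matched last, from what remains.
    tagged = TAGS_RE.search(rest)
    if tagged:
        tags = tagged.group(1).strip(":").split(":")
        rest = rest[:tagged.start()]
    return {
        "level": level,
        "keyword": keyword,
        "priority": priority,
        "title": rest,
        "tags": tags,
        "commented": rest == "COMMENT" or rest.startswith("COMMENT "),
        "archived": "ARCHIVE" in tags,
    }
#+end_src
Passing the last example line above (without its quoting comma) should
yield level 4, the keyword =TODO=, the priority =A=, the tags =tag= and
=a2%=, and a title that marks the heading as commented.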
*** Sections
:PROPERTIES:
:CUSTOM_ID: Sections
:END:
Sections contain one or more non-heading elements. With the exception
of the text before the first heading in a document (which is
considered a section), sections only occur within headings.
#+begin_notes
Since sections are usually thought of as a larger group that includes
nested content (e.g. "section 3"), and this isn't what Org sections are,
maybe this should be called something slightly different?
#+end_notes
*Example*
Consider the following document:
#+begin_example
An introduction.
,* A Heading
Some text.
,** Sub-Topic 1
,** Sub-Topic 2
,*** Additional entry
#+end_example
Its internal structure could be summarized as:
#+begin_example
(document
 (section)
 (heading
  (section)
  (heading)
  (heading
   (heading))))
#+end_example
*** The zeroth section
:PROPERTIES:
:CUSTOM_ID: Zeroth_section
:END:
All elements before the first heading in a document lie in a special
section called the /zeroth section/. It may be preceded by blank
lines. Unlike a normal section, the zeroth section can immediately
contain a [[#Property_Drawers][property drawer]], optionally preceded by [[#Comments][comments]]. It cannot,
however, contain [[#Planning][planning]].
** Greater Elements
:PROPERTIES:
:CUSTOM_ID: Greater_Elements
:END:
Unless otherwise specified, greater elements can directly contain
any greater or [[#Lesser_Elements][lesser element]] except:
+ Elements of their own type.
+ [[#Planning][Planning]], which may only occur in a [[#Headings][heading]].
+ [[#Property_Drawers][Property drawers]], which may only occur in a [[#Headings][heading]] or the [[#Zeroth_section][zeroth
section]].
+ [[#Node_Properties][Node properties]], which can only be found in [[#Property_Drawers][property drawers]].
+ [[#Items][Items]], which may only occur in [[#Plain_Lists][plain lists]].
+ [[#Table_Rows][Table rows]], which may only occur in [[#Tables][tables]].
*** Greater Blocks
:PROPERTIES:
:CUSTOM_ID: Greater_Blocks
:END:
Greater blocks are structured according to the following pattern:
#+begin_example
,#+begin_NAME PARAMETERS
CONTENTS
,#+end_NAME
#+end_example
+ NAME :: A string consisting of any non-whitespace characters, which
is not the NAME of a [[#Blocks][lesser block]]. Greater blocks are treated
differently based on their subtype, which is determined by the NAME
as follows:
- =center=, a "center block"
- =quote=, a "quote block"
- any other value, a "special block"
+ PARAMETERS (optional) :: A string consisting of any characters other
than a newline.
+ CONTENTS :: A collection of zero or more elements, subject to two
conditions:
- No line may start with =#+end_NAME=.
- Lines beginning with an asterisk must be quoted by a comma (=,*=).
*** Drawers and Property Drawers
:PROPERTIES:
:CUSTOM_ID: Drawers
:END:
Drawers are structured according to the following pattern:
#+begin_example
:NAME:
CONTENTS
:end:
#+end_example
+ NAME :: A string consisting of word-constituent characters, hyphens
and underscores (=-_=).
+ CONTENTS :: A collection of zero or more elements, except another drawer.
*** Dynamic Blocks
:PROPERTIES:
:CUSTOM_ID: Dynamic_Blocks
:END:
Dynamic blocks are structured according to the following pattern:
#+begin_example
,#+begin: NAME PARAMETERS
CONTENTS
,#+end:
#+end_example
+ NAME :: A string consisting of non-whitespace characters.
+ PARAMETERS (optional) :: A string consisting of any characters but a newline.
+ CONTENTS :: A collection of zero or more elements, except another
dynamic block.
*** Footnote Definitions
:PROPERTIES:
:CUSTOM_ID: Footnote_Definitions
:END:
Footnote definitions must occur at the start of an /unindented/ line,
and are structured according to the following pattern:
#+begin_example
[fn:LABEL] CONTENTS
#+end_example
+ LABEL :: Either a number or an instance of the pattern =WORD=, where
=WORD= represents a string consisting of word-constituent characters,
hyphens and underscores (=-_=).
+ CONTENTS (optional) :: A collection of zero or more elements. It
ends at the next footnote definition, the next heading, two
consecutive blank lines, or the end of buffer.
*Examples*
#+begin_example
[fn:1] A short footnote.

[fn:2] This is a longer footnote.

It even contains a single blank line.
#+end_example
*** Inlinetasks
:PROPERTIES:
:CUSTOM_ID: Inlinetasks
:END:
Inlinetasks are syntactically a [[#Headings][heading]] with a level of at least
~org-inlinetask-min-level~[fn:oiml:The default value of
~org-inlinetask-min-level~ is =15=.], i.e. starting with at least that
many asterisks.
Optionally, inlinetasks can be ended with a second heading with a
level of at least ~org-inlinetask-min-level~[fn:oiml], with no optional
components (i.e. only STARS and TITLE provided) and the string =END= as
the TITLE. This allows the inlinetask to contain elements.
#+begin_notes
Urgh, this syntax is ugly. --- Tom G, Timothy
#+end_notes
*Examples*
#+begin_example
,*************** TODO some tiny task
This is a paragraph, it lies outside the inlinetask above.
,*************** TODO some small task
DEADLINE: <2009-03-30 Mon>
:PROPERTIES:
:SOMETHING: or other
:END:
And here is some extra text
,*************** END
#+end_example
Inlinetasks are only recognized after the =org-inlinetask= library is
loaded.
*** Items
:PROPERTIES:
:CUSTOM_ID: Items
:END:
Items are structured according to the following pattern:
#+begin_example
BULLET COUNTER-SET CHECK-BOX TAG CONTENTS
#+end_example
+ BULLET :: One of the two forms below, followed by either a
whitespace character or line ending.
- An asterisk (=*=), hyphen (=-=), or plus sign (=+=) character.
Note that an asterisk (=*=) at the beginning of a line and
followed by whitespace cannot be an item, as it would
match a [[#Headings][heading]].
- Either the pattern =COUNTER.= or =COUNTER)=.
+ COUNTER :: Either a number or a single letter (a-z).
+ COUNTER-SET (optional) :: An instance of the pattern =[@COUNTER]=.
+ CHECK-BOX (optional) :: A single whitespace character, an =X=
character, or a hyphen enclosed by square brackets (i.e. =[ ]=, =[X]=, or =[-]=).
+ TAG (optional) :: An instance of the pattern =TAG-TEXT ::= where
=TAG-TEXT= represents a string consisting of non-newline characters
that does not contain the substring ="\nbsp{}::\nbsp{}"= (two colons surrounded by
whitespace, without the quotes).
+ CONTENTS (optional) :: A collection of zero or more elements, ending
at the first instance of one of the following:
- The next item.
- The first line indented less than or equally to the starting line,
not counting lines within other non-paragraph elements or
[[#Inlinetasks][inlinetask]] boundaries.
- Two consecutive blank lines.
*Examples*
#+begin_example
- item
3. [@3] set to three
+ [-] tag :: item contents
 * item, note whitespace in front
,* not an item, but heading - heading takes precedence
#+end_example
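The first line of an item can be approximated with a single regular
expression. The sketch below is only an illustration under stated
assumptions: =ITEM_RE= and =parse_item= are hypothetical names, only
the first line of the item is examined, and the rules that end CONTENTS
are not modelled.
#+begin_src python
import re

ITEM_RE = re.compile(
    r"(?P<indent>[ \t]*)"
    r"(?P<bullet>[-+*]|(?:[0-9]+|[A-Za-z])[.)])"          # BULLET
    r"(?:[ \t]+|$)"                                       # whitespace or line ending
    r"(?:\[@(?P<counter_set>[0-9]+|[A-Za-z])\][ \t]*)?"   # COUNTER-SET
    r"(?:\[(?P<checkbox>[ X-])\][ \t]*)?"                 # CHECK-BOX
    r"(?:(?P<tag>.*?)[ \t]+::(?:[ \t]+|$))?"              # TAG
    r"(?P<contents>.*)$"
)


def parse_item(line):
    """Return the named parts of an item's first line, or None."""
    m = ITEM_RE.match(line)
    if m is None:
        return None
    # An unindented asterisk followed by whitespace is a heading, not an item.
    if m.group("bullet") == "*" and not m.group("indent"):
        return None
    return m.groupdict()
#+end_src
Applied to the examples above, the indented asterisk line is accepted
as an item, while the unindented one is rejected because the heading
pattern takes precedence.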
*** Plain Lists
:PROPERTIES:
:CUSTOM_ID: Plain_Lists
:END:
A /plain list/ is a set of consecutive [[#Items][items]] of the same indentation.
#+begin_info
At a glance it may appear as though nested lists are not possible. They are, as
items may themselves contain lists.
#+end_info
If the first item in a plain list has a COUNTER in its BULLET, the plain
list will be an "ordered plain-list". If it contains a TAG, it will
be a "descriptive list". Otherwise, it will be an "unordered list".
List types are mutually exclusive at the same level of indentation; if
different types are present consecutively, they parse as separate
lists.
For example, consider the following excerpt of an Org document:
#+begin_example
1. item 1
2. [X] item 2
   - some tag :: item 2.1
#+end_example
Its internal structure is as follows:
#+begin_example
(ordered-plain-list
 (item)
 (item
  (descriptive-plain-list
   (item))))
#+end_example
*** Property Drawers
:PROPERTIES:
:CUSTOM_ID: Property_Drawers
:END:
Property drawers are a special type of [[#Drawers][drawer]] containing properties
attached to a [[#Headings][heading]] or [[#Inlinetasks][inlinetask]]. They are located right after a heading
and its [[#Planning][planning]] information, as shown below:
#+begin_example
HEADING
PROPERTYDRAWER

HEADING
PLANNING
PROPERTYDRAWER
#+end_example
Property Drawers are structured according to the following pattern:
#+begin_example
:properties:
CONTENTS
:end:
#+end_example
+ CONTENTS :: A collection of zero or more [[#Node_Properties][node properties]], not
separated by blank lines.
#+begin_notes
The failure mode for malformed contents needs to be determined more clearly
here. We don't want property drawers to suddenly become plain drawers just because
a user has a malformed line, that could be disastrous if certain settings in the
property drawer mask settings from further up the tree. In short, malformed
contents should not poison the whole property drawer. --- Tom G
#+end_notes
*Example*
#+begin_example
:PROPERTIES:
:CUSTOM_ID: someid
:END:
#+end_example
*** Tables
:PROPERTIES:
:CUSTOM_ID: Tables
:END:
Tables are started by a line beginning with either:
+ A vertical bar (=|=), forming an "org" type table.
+ The string =+-= followed by a sequence of plus (=+=) and minus (=-=)
signs, forming a "table.el" type table.
#+begin_notes
Maybe drop table.el from the spec?
#+end_notes
Tables cannot be immediately preceded by such lines, as the current
line would then be part of the earlier table.
Org tables contain [[#Table_Rows][table rows]], and end at the first line not starting
with a vertical bar. An Org table can be followed by a number of
=#+TBLFM: FORMULAS= lines, where =FORMULAS= represents a string consisting
of any characters but a newline.
Table.el tables end at the first line not starting with either
a vertical bar or a plus sign.
*Example*
#+begin_example
| Name  | Phone | Age |
|-------+-------+-----|
| Peter |  1234 |  24 |
| Anna  |  4321 |  25 |
#+end_example
** Lesser Elements
:PROPERTIES:
:CUSTOM_ID: Lesser_Elements
:END:
Lesser elements cannot contain any other element.
Only [[#Keywords][keywords]] which are a member of ~org-element-parsed-keywords~[fn:oepkw], [[#Blocks][verse
blocks]], [[#Paragraphs][paragraphs]] or [[#Table_Rows][table rows]] can contain objects.
*** Blocks
:PROPERTIES:
:CUSTOM_ID: Blocks
:END:
Like [[#Greater_Blocks][greater blocks]], blocks are structured according to the following pattern:
#+begin_example
,#+begin_NAME DATA
CONTENTS
,#+end_NAME
#+end_example
+ NAME :: A string consisting of any non-whitespace characters. The
type of the block is determined based on the value as follows:
- =comment=, a "comment block",
- =example=, an "example block",
- =export=, an "export block",
- =src=, a "source block",
- =verse=, a "verse block".
The NAME must be one of these values. Otherwise, the pattern
forms a greater block.
+ DATA (optional) :: A string consisting of any characters but a newline.
- In the case of an export block, this is mandatory and must be a
single word.
- In the case of a source block, this is mandatory and must follow
the pattern =LANGUAGE SWITCHES ARGUMENTS= with:
+ LANGUAGE :: A string consisting of any non-whitespace characters
+ SWITCHES :: Any number of SWITCH patterns, separated by a single
space character
- SWITCH :: Either the pattern =-l "FORMAT"= where =FORMAT=
represents a string consisting of any characters but a double
quote (="=) or newline, or the pattern =-S= or =+S= where =S=
represents a single alphabetic character
+ ARGUMENTS :: A string consisting of any character but a newline.
+ CONTENTS (optional) :: A string consisting of any characters
(including newlines) subject to the same two conditions of greater
block's CONTENTS, i.e.
- No line may start with =#+end_NAME=.
- Lines beginning with an asterisk or =#+= must be quoted by a comma
(=,*=, =,#+=).
When the block is a verse block, CONTENTS is parsed for Org objects and
comma-quoting is not supported; otherwise CONTENTS is not parsed.
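To make the split between lesser and greater blocks concrete, here is
a small Python sketch that classifies an opening line by its NAME. The
names =classify_block=, =LESSER_BLOCKS=, and =GREATER_BLOCKS= are
invented for this example; the sketch only inspects the opening line
and does not check DATA or look for the matching closing line.
#+begin_src python
import re

# NAME values that form lesser blocks; any other NAME forms a greater block.
LESSER_BLOCKS = {
    "comment": "comment block",
    "example": "example block",
    "export": "export block",
    "src": "source block",
    "verse": "verse block",
}
GREATER_BLOCKS = {"center": "center block", "quote": "quote block"}

BEGIN_RE = re.compile(r"[ \t]*#\+begin_(\S+)(?:[ \t]+(.*))?$", re.IGNORECASE)


def classify_block(begin_line):
    """Return (class, subtype) for a block opening line, or None."""
    m = BEGIN_RE.match(begin_line)
    if m is None:
        return None
    name = m.group(1).lower()  # case is insignificant
    if name in LESSER_BLOCKS:
        return ("lesser", LESSER_BLOCKS[name])
    return ("greater", GREATER_BLOCKS.get(name, "special block"))
#+end_src
Under these assumptions an opening line with the NAME =src= is
reported as a lesser source block, while one with the NAME =notes= is
reported as a greater special block.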
#+begin_notes
Can we drop switch support? This seems like a fairly good idea.
The functionality can simply be shifted to ARGUMENTS with the
well-established =:key val= forms.\\
"For the love of all that is sane" --- Tom G
#+end_notes
*Example*
#+begin_example
,#+begin_verse
There was an old man of the Cape
Who made himself garments of crepe.
When asked, “Do they tear?”
He replied, “Here and there,
But they’re perfectly splendid for shape!”
,#+end_verse
#+end_example
*** Clock
:PROPERTIES:
:CUSTOM_ID: Clocks
:END:
A clock element is structured according to the following pattern:
#+begin_example
clock: INACTIVE-TIMESTAMP
clock: INACTIVE-TIMESTAMP-RANGE DURATION
#+end_example
+ INACTIVE-TIMESTAMP :: An inactive [[#Timestamps][timestamp]] object.
+ INACTIVE-TIMESTAMP-RANGE :: An inactive range [[#Timestamps][timestamp]] object.
+ DURATION :: An instance of the pattern ==> HH:MM=.
- HH :: A number consisting of any number of digits.
- MM :: A two digit number.
*Example*
#+begin_example
clock: [2024-10-12]
#+end_example
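The DURATION token lends itself to a short worked example. The sketch
below, with the hypothetical helper =duration_minutes=, extracts the
==> HH:MM= suffix from a clock line and converts it to minutes. It
does not validate the timestamps themselves.
#+begin_src python
import re

# DURATION: "=>", whitespace, any number of hour digits, a colon, two minute digits.
DURATION_RE = re.compile(r"=>[ \t]+([0-9]+):([0-9]{2})$")


def duration_minutes(clock_line):
    """Return the DURATION of a clock line in minutes, or None if absent."""
    m = DURATION_RE.search(clock_line.rstrip())
    if m is None:
        return None
    return int(m.group(1)) * 60 + int(m.group(2))
#+end_src
A clock line without a range, such as the example above, has no
DURATION, so the helper returns =None= for it.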
*** Diary Sexp
:PROPERTIES:
:CUSTOM_ID: Diary_Sexp
:END:
A diary sexp[fn::A common abbreviation for S-expression] element is an
/unindented/ line structured according to the following pattern:
#+begin_example
%%SEXP
#+end_example
+ SEXP :: A string starting with an open parenthesis =(=, with balanced
opening and closing parentheses.
*Example*
#+begin_example
%%(org-calendar-holiday)
#+end_example
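The balance requirement on SEXP can be checked with a simple counter.
The following Python sketch uses the hypothetical name
=is_diary_sexp= and makes one simplifying assumption: parentheses
inside string literals are counted like any others, which a real
S-expression reader would not do.
#+begin_src python
def is_diary_sexp(line):
    """Check that LINE is unindented, starts with %%( and has balanced parentheses."""
    if not line.startswith("%%("):
        return False
    depth = 0
    for ch in line[2:]:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing parenthesis without a matching opener
                return False
    return depth == 0
#+end_src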
*** Planning
:PROPERTIES:
:CUSTOM_ID: Planning
:END:
A planning element is structured according to the following pattern:
#+begin_example
HEADING
PLANNING
#+end_example
+ HEADING :: A [[#Headings][heading]] element.
+ PLANNING :: A line consisting of one or more =KEYWORD: TIMESTAMP=
patterns (termed "info" patterns).
- KEYWORD :: Either the string =DEADLINE=, =SCHEDULED=, or =CLOSED=.
- TIMESTAMP :: A [[#Timestamps][timestamp]] object.
PLANNING must directly follow HEADING without any blank lines in
between.
When a keyword is repeated in a planning element, the last instance of it has
priority.
#+begin_notes
Tom G has requested adding a =OPENED= keyword to track task creation/registration.
#+end_notes
*Example*
#+begin_example
,*** TODO watch "The Matrix"
SCHEDULED: <1999-03-31 Wed>
#+end_example
*** Comments
:PROPERTIES:
:CUSTOM_ID: Comments
:END:
A "comment line" starts with a hash character (=#=) and either a whitespace
character or the immediate end of the line.
Comments consist of one or more consecutive comment lines.
*Example*
#+begin_example
# Just a comment
#
# Over multiple lines
#+end_example
*** Fixed Width Areas
:PROPERTIES:
:CUSTOM_ID: Fixed_Width_Areas
:END:
A "fixed-width line" starts with a colon character (=:=) and either a whitespace
character or the immediate end of the line.
Fixed-width areas consist of one or more consecutive fixed-width lines.
*Example*
#+begin_example
: This is a
: fixed width area
#+end_example
*** Horizontal Rules
:PROPERTIES:
:CUSTOM_ID: Horizontal_Rules
:END:
A horizontal rule is formed by a line consisting of at least five
consecutive hyphens (=-----=).
*** Keywords
:PROPERTIES:
:CUSTOM_ID: Keywords
:END:
Keywords are structured according to the following pattern:
#+begin_example
,#+KEY: VALUE
#+end_example
+ KEY :: A string consisting of any non-whitespace characters, other
than =call= (which would form a [[#Babel_Call][babel call]] element).
+ VALUE :: A string consisting of any characters but a newline.
#+begin_notes
Perhaps this should be changed to be =#+KEY[OPT]: VAL=? It would make the syntax
more regular, considering affiliated keywords. I can't see any backwards
compatibility concerns. \\
This was suggested by Tom G, but I'm a fan --- Timothy.
#+end_notes
When KEY is a member of ~org-element-parsed-keywords~[fn:oepkw], VALUE can contain
the standard set of objects, excluding footnote references.
Note that while instances of this pattern are preferentially parsed as
[[#Affiliated_Keywords][affiliated keywords]], a keyword with the same KEY as an affiliated
keyword may occur so long as it is not immediately preceding a valid
element that can be affiliated. For example, an instance of
=#+caption: hi= followed by a blank line will be parsed as a keyword,
not an affiliated keyword.
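A keyword line can be recognised with a short regular expression, as
in the Python sketch below. The name =parse_keyword= is hypothetical.
Two choices are assumptions of this example rather than part of the
syntax: KEY is taken to end at the first colon, and KEY is upper-cased
in the result. The sketch does not decide whether the keyword is
affiliated.
#+begin_src python
import re

# KEY is taken to end at the first colon; VALUE is the rest of the line.
KEYWORD_RE = re.compile(r"[ \t]*#\+(\S+?):[ \t]*(.*)$")


def parse_keyword(line):
    """Return (KEY, VALUE) for a keyword line, or None."""
    m = KEYWORD_RE.match(line)
    if m is None:
        return None
    key, value = m.group(1), m.group(2).strip()
    if key.lower() == "call":  # this would form a babel call instead
        return None
    return key.upper(), value
#+end_src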
**** Babel Call
:PROPERTIES:
:CUSTOM_ID: Babel_Call
:END:
Babel calls are structured according to one of the following patterns:
#+begin_example
,#+call: NAME(ARGUMENTS)
,#+call: NAME[HEADER1](ARGUMENTS)
,#+call: NAME(ARGUMENTS)[HEADER2]
,#+call: NAME[HEADER1](ARGUMENTS)[HEADER2]
#+end_example
+ NAME :: A string consisting of any non-newline characters except for
square brackets or parentheses (=[]()=).
+ ARGUMENTS (optional) :: A string consisting of any non-newline
characters. Opening and closing parentheses must be balanced.
+ HEADER1 (optional), HEADER2 (optional) :: A string consisting of any
non-newline characters. Opening and closing square brackets must be
balanced.
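Because HEADER1, ARGUMENTS, and HEADER2 must each be balanced, a plain
regular expression is not enough to separate them. The Python sketch
below scans for matching delimiters instead. The helper names
=_take_balanced= and =parse_babel_call= are assumptions of this
example, and trailing text after the last group is ignored.
#+begin_src python
import re


def _take_balanced(text, start, open_ch, close_ch):
    """Return (inner text, next index) for a balanced group at START, else (None, START)."""
    if start >= len(text) or text[start] != open_ch:
        return None, start
    depth = 0
    for i in range(start, len(text)):
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
    return None, start  # delimiters are not balanced


def parse_babel_call(line):
    """Split a babel call line into NAME, HEADER1, ARGUMENTS and HEADER2."""
    m = re.match(r"[ \t]*#\+call:[ \t]+([^\[\]()\n]+)", line, re.IGNORECASE)
    if m is None:
        return None
    name, pos = m.group(1).strip(), m.end()
    header1, pos = _take_balanced(line, pos, "[", "]")
    arguments, pos = _take_balanced(line, pos, "(", ")")
    if arguments is None:  # the parenthesised part is required by every pattern
        return None
    header2, pos = _take_balanced(line, pos, "[", "]")
    return {"name": name, "header1": header1,
            "arguments": arguments, "header2": header2}
#+end_src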
#+begin_notes
Should this be distinguished from other keywords at the AST
interpretation stage, instead of the base syntax? --- Tom G
#+end_notes
**** Affiliated Keywords
:PROPERTIES:
:CUSTOM_ID: Affiliated_Keywords
:END:
With the exception of [[#Comments][comments]], [[#Clocks][clocks]], [[#Headings][headings]], [[#Inlinetasks][inlinetasks]],
[[#Items][items]], [[#Node_Properties][node properties]], [[#Planning][planning]], [[#Property_Drawers][property drawers]], [[#Sections][sections]], and
[[#Table_Rows][table rows]], every other element type can be assigned attributes.
This is done by adding specific [[#Keywords][keywords]], named /affiliated/ keywords,
immediately above the element considered (a blank line cannot lie
between the affiliated keyword and element). Structurally, affiliated
keywords are not considered elements in their own right but
properties of the element they apply to.
Affiliated keywords are structured according to one of the following patterns:
#+begin_example
,#+KEY: VALUE
,#+KEY[OPTVAL]: VALUE
,#+attr_BACKEND: VALUE
#+end_example
+ KEY :: A string which is a member of
~org-element-affiliated-keywords~[fn:oeakw:By default,
~org-element-affiliated-keywords~ contains =CAPTION=, =DATA=, =HEADER=,
=NAME=, =PLOT=, and =RESULTS=.].
+ BACKEND :: A string consisting of alphanumeric characters, hyphens,
or underscores (=-_=).
+ OPTVAL (optional) :: A string consisting of any characters but a
newline. Opening and closing square brackets must be balanced.
This term is only valid when KEY is a member of
~org-element-dual-keywords~[fn:oedkw:By default,
~org-element-dual-keywords~ contains =CAPTION= and =RESULTS=.].
+ VALUE :: A string consisting of any characters but a newline, except
in the case where KEY is a member of
~org-element-parsed-keywords~[fn:oepkw:By default,
~org-element-parsed-keywords~ contains =CAPTION=.] in which case VALUE
is a series of objects from the standard set, excluding footnote
references.
#+begin_notes
Should this even be described at a syntax level instead of an AST
processing level? --- Tom G
#+end_notes
Repeating an affiliated keyword before an element will usually result
in the prior VALUEs being overwritten by the last instance of KEY.
The sole exception to this is =#+header:= keywords, where in the case of multiple
=:opt val= declarations the last declaration on the first line it occurs on has
priority.
#+begin_notes
Maybe this should be first-line-wins for all affiliated keywords?
This would be a breaking change though. --- Timothy
#+end_notes
There are two situations under which the VALUEs will be concatenated:
1. If KEY is a member of ~org-element-dual-keywords~[fn:oedkw].
2. If the affiliated keyword is an instance of the pattern
=#+attr_BACKEND: VALUE=.
When no element immediately follows an instance of the "affiliated
keyword" pattern, the keyword is a normal, non-affiliated keyword.
The following example contains three affiliated keywords:
#+begin_example
,#+name: image-name
,#+caption: This is a caption for
,#+caption: the image linked below
[[file:some/image.png]]
#+end_example
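The overwrite and concatenation rules can be sketched as a small
accumulator. The Python below is an illustration only: the names
=collect_affiliated=, =AFFILIATED=, and =DUAL= are hypothetical, the
sets hold the default values listed in the footnotes, OPTVAL is
matched but discarded, and the special case for =#+header:= described
above is deliberately not modelled.
#+begin_src python
import re

# Default members, as listed in the footnotes of this section.
AFFILIATED = {"CAPTION", "DATA", "HEADER", "NAME", "PLOT", "RESULTS"}
DUAL = {"CAPTION", "RESULTS"}

AFF_RE = re.compile(r"[ \t]*#\+([^\s\[:]+)(?:\[(.*)\])?:[ \t]*(.*)$")


def collect_affiliated(lines):
    """Gather the affiliated keywords found at the start of LINES."""
    props = {}
    for line in lines:
        m = AFF_RE.match(line)
        if m is None:
            break  # first line that is not a keyword: the element starts here
        key, value = m.group(1).upper(), m.group(3)
        if key.startswith("ATTR_") or key in DUAL:
            props.setdefault(key, []).append(value)  # VALUEs are concatenated
        elif key in AFFILIATED:
            props[key] = value  # the last instance overwrites earlier ones
        else:
            break  # an ordinary keyword, not an affiliated one
    return props
#+end_src
For the example above, this sketch would keep a single =NAME= value
and collect both =CAPTION= lines in order.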
*** LaTeX Environments
:PROPERTIES:
:CUSTOM_ID: LaTeX_Environments
:END:
LaTeX environments are structured according to the following pattern:
#+begin_example
\begin{NAME}
CONTENTS
\end{NAME}
#+end_example
+ NAME :: A non-empty string consisting of alphanumeric or asterisk characters.
+ CONTENTS (optional) :: A string which does not contain the substring
=\end{NAME}=.
*Examples*
#+begin_example
\begin{align*}
2x - 5y &= 8 \\
3x + 9y &= -12
\end{align*}
#+end_example
*** Node Properties
:PROPERTIES:
:CUSTOM_ID: Node_Properties
:END:
Node properties can only exist in [[#Property_Drawers][property drawers]], and are structured
according to one of the following patterns:
#+begin_example
:NAME: VALUE
:NAME:
:NAME+: VALUE
:NAME+:
#+end_example
+ NAME :: A non-empty string containing any non-whitespace characters
which does not end in a plus character (=+=).
+ VALUE (optional) :: A string containing any characters but a newline.
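The four forms can be covered by one regular expression, as in this
Python sketch. The names =NODE_PROPERTY_RE= and =parse_node_property=
are assumptions of the example, which is meant for lines already known
to lie between the opening and closing lines of a property drawer.
#+begin_src python
import re

# NAME must not end in a plus sign; an optional "+" before the closing colon
# selects the ":NAME+:" forms.
NODE_PROPERTY_RE = re.compile(r"[ \t]*:(\S+?(?<!\+))(\+)?:(?:[ \t]+(.*))?[ \t]*$")


def parse_node_property(line):
    """Return (NAME, has_plus, VALUE) for a node property line, or None."""
    m = NODE_PROPERTY_RE.match(line)
    if m is None:
        return None
    name, plus, value = m.groups()
    return name, plus is not None, (value or "").strip()
#+end_src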
*** Paragraphs
:PROPERTIES:
:CUSTOM_ID: Paragraphs
:END:
Paragraphs are the default element, which means that any
unrecognized context is a paragraph.
Empty lines and other elements end paragraphs.
Paragraphs can contain the standard set of objects.
*** Table Rows
:PROPERTIES:
:CUSTOM_ID: Table_Rows
:END:
A table row consists of a vertical bar (=|=) followed by:
+ Any number of [[#Table_Cells][table cells]], forming a "standard" type row.
+ A hyphen (=-=), forming a "rule" type row. Any non-newline characters
can follow the hyphen and this will still be a "rule" type row.
Table rows can only exist in [[#Tables][tables]].
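The distinction between the two row types depends only on the
character after the leading vertical bar, which the following Python
sketch illustrates. The function names =table_row_type= and
=split_cells= are hypothetical, and cells are returned as plain
strings rather than as parsed objects.
#+begin_src python
def table_row_type(line):
    """Return "rule", "standard", or None if LINE is not a table row."""
    stripped = line.strip()
    if not stripped.startswith("|"):
        return None
    # Anything may follow the hyphen; the row is still a rule.
    return "rule" if stripped.startswith("|-") else "standard"


def split_cells(line):
    """Split a standard row into the text of its cells."""
    body = line.strip()[1:]  # drop the leading vertical bar
    if body.endswith("|"):
        body = body[:-1]     # the final vertical bar is optional
    return [cell.strip() for cell in body.split("|")]
#+end_src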
* Objects
:PROPERTIES:
:CUSTOM_ID: Objects
:END:
Objects can only be found in the following elements:
- [[#Keywords][keywords]] or [[#Affiliated_Keywords][affiliated keywords]] VALUEs, when KEY is a member of
~org-element-parsed-keywords~[fn:oepkw],
- [[#Headings][heading]] TITLEs,
- [[#Inlinetasks][inlinetask]] TITLEs,
- [[#Items][item]] TAGs,
- [[#Clocks][clock]] INACTIVE-TIMESTAMP and INACTIVE-TIMESTAMP-RANGE, which can
only contain inactive timestamps,
- [[#Planning][planning]] TIMESTAMPs, which can only be timestamps,
- [[#Paragraphs][paragraphs]],
- [[#Table_Cells][table cells]],
- [[#Table_Rows][table rows]], which can only contain table cell objects,
- [[#Blocks][verse blocks]].
Most objects cannot contain objects. Those which can will be
specified. Furthermore, while many objects may contain newlines, a
blank line often terminates the element that the object is a part of,
such as a paragraph.
** Entities
:PROPERTIES:
:CUSTOM_ID: Entities
:END:
Entities are structured according to the following pattern:
#+begin_example
\NAME POST
#+end_example
Where NAME and POST are not separated by a whitespace character.
+ NAME :: A string with a valid association in either
~org-entities~[fn:oe:See the [[#Entities_List][appendix]] for a list of entities.] or
~org-entities-user~.
+ [[#Special_Tokens][POST]] :: Either:
- The end of line.
- The string ={}=.
- A non-alphabetic character.
#+begin_notes
It's [[https://github.com/lucasvreis/org-parser/blob/main/SPEC.org#entities][been raised]] that "{}" is really part of the entity, and so
probably shouldn't be considered part of POST --- Timothy.
#+end_notes
*Example*
#+begin_example
\cent
#+end_example
** LaTeX Fragments
:PROPERTIES:
:CUSTOM_ID: LaTeX_Fragments
:END:
LaTeX fragments are structured according to one of the following patterns:
#+begin_example
\NAME BRACKETS
\(CONTENTS\)
\[CONTENTS\]
#+end_example
+ NAME :: A string consisting of alphabetic characters which does not
have an association in either ~org-entities~ or ~org-entities-user~.
+ BRACKETS (optional) :: An instance of one of the following patterns,
not separated from NAME by whitespace.
#+begin_example
[CONTENTS1]
{CONTENTS2}
#+end_example
- CONTENTS1 :: A string consisting of any characters but ={=, =}=, =[=,
=]=, or a newline.
- CONTENTS2 :: A string consisting of any characters but ={=, =}=, or a newline.
+ CONTENTS :: A string consisting of any characters, so long as it does
not contain the substring =\)= in the case of the
second template, or =\]= in the case of the third template.
*Examples*
#+begin_example
\enlargethispage{2\baselineskip}
\(e^{i \pi}\)
#+end_example
Org also supports TeX-style inline LaTeX fragments, structured
according to one of the following patterns:
#+begin_example
$$CONTENTS$$
PRE$CHAR$POST
PRE$BORDER1 BODY BORDER2$POST
#+end_example
+ [[#Special_Tokens][PRE]] :: Either the beginning of line or a character other than =$=.
+ CHAR :: A non-whitespace character that is not =.=, =,=, =?=, =;=, or a
double quote (="=).
+ [[#Special_Tokens][POST]] :: Any punctuation character (including parentheses and
quotes), a space character, or the end of line.
+ BORDER1 :: A non-whitespace character that is not =.=, =,=, =;=, or =$=.
+ BODY :: A string consisting of any characters except =$=, and which
does not span more than three lines.
+ BORDER2 :: A non-whitespace character that is not =.=, =,=, or =$=.
*Example*
#+begin_example
$$1+1=2$$
#+end_example
#+begin_notes
It would introduce incompatibilities with previous Org versions,
but support for ~$...$~ (and for symmetry, ~$$...$$~) constructs
ought to be removed.
They are slow to parse, fragile, redundant and imply false
positives. --- NGZ
Strong support for removing these. --- Tom G
I'm strongly in support of dropping $-syntax. --- Timothy
#+end_notes
** Export Snippets
:PROPERTIES:
:CUSTOM_ID: Export_Snippets
:END:
Export snippets are structured according to the following pattern:
#+begin_example
@@BACKEND:VALUE@@
#+end_example
+ BACKEND :: A string consisting of one or more alphanumeric characters and hyphens.
+ VALUE (optional) :: A string containing anything but the string =@@=.
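As a rough illustration, the pattern can be approximated with a
non-greedy regular expression. The names ~EXPORT_SNIPPET_RE~ and
~export_snippets~ are hypothetical, and "alphanumeric" is assumed to
mean ASCII letters and digits.
#+begin_src python
import re

# BACKEND: one or more alphanumerics or hyphens.
# VALUE: anything up to the next "@@" (possibly empty).
EXPORT_SNIPPET_RE = re.compile(r"@@([A-Za-z0-9-]+):(.*?)@@", re.DOTALL)


def export_snippets(text):
    """Return a (backend, value) pair for each export snippet in TEXT."""
    return EXPORT_SNIPPET_RE.findall(text)
#+end_src
For instance, =@@html:<b>@@bold@@html:</b>@@= contains two snippets for
the =html= backend, with the plain text =bold= between them.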
** Footnote References
:PROPERTIES:
:CUSTOM_ID: Footnote_References
:END:
Footnote references are structured according to one of the following patterns:
#+begin_example
[fn:LABEL]
[fn:LABEL:DEFINITION]
[fn::DEFINITION]
#+end_example
+ LABEL :: A string containing one or more word constituent characters,
hyphens and underscores (=-_=).
+ DEFINITION (optional) :: One or more objects from the standard set,
so long as opening and closing square brackets are balanced within
DEFINITION.
If the reference follows the second pattern, it is called an "inline
footnote". If it follows the third pattern, i.e. if LABEL is omitted,
it is called an "anonymous footnote".
Note that the first pattern may not occur on an /unindented/ line, as it
is then a [[#Footnote_Definitions][footnote definition]].
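Because DEFINITION may contain balanced square brackets, a single
regular expression is not enough to find the end of a reference. The
sketch below scans for the balancing bracket and then classifies the
reference. The function name and the kind labels ("reference",
"inline", "anonymous") are illustrative choices, DEFINITION is returned
as raw text rather than parsed objects, and the unindented-line caveat
is not handled.
#+begin_src python
import re

# Word constituents, hyphens and underscores (\w already covers "_").
LABEL_RE = re.compile(r"[\w-]+")


def parse_footnote_reference(text):
    """Parse a footnote reference at the start of TEXT.

    Return (kind, label, definition), or None if TEXT does not start
    with a well-formed reference.
    """
    if not text.startswith("[fn:"):
        return None
    # Find the closing bracket that balances the opening one.
    depth, end = 0, None
    for i, ch in enumerate(text):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                end = i
                break
    if end is None:
        return None
    label, sep, definition = text[4:end].partition(":")
    if label and not LABEL_RE.fullmatch(label):
        return None
    if not sep:
        return ("reference", label, None) if label else None
    if label:
        return ("inline", label, definition)
    return ("anonymous", None, definition)
#+end_src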
** Citations
:PROPERTIES:
:CUSTOM_ID: Citations
:END:
Citations are structured according to the following pattern:
#+begin_example
[cite CITESTYLE: GLOBALPREFIX REFERENCES GLOBALSUFFIX]
#+end_example
Where "cite" and =CITESTYLE=, =REFERENCES= and =GLOBALSUFFIX= are /not/
separated by whitespace. =GLOBALPREFIX=, =REFERENCES=, and =GLOBALSUFFIX=
must be separated by semicolons. Whitespace after the leading colon
or before the closing square bracket is not significant. All other
whitespace is significant.
+ CITESTYLE (optional) :: An instance of either the pattern =/STYLE= or =/STYLE/VARIANT=
- STYLE :: A string made of any alphanumeric character, =_=, or =-=.
- VARIANT :: A string made of any alphanumeric character, =_=, =-=, or =/=.
+ GLOBALPREFIX (optional) :: One or more objects from the standard set,
so long as all square brackets are balanced within GLOBALPREFIX, and
it does not contain any semicolons (=;=) or subsequence that matches
=@KEY=.
+ REFERENCES :: One or more [[#Citation_References][citation reference]] objects, separated by
semicolons (=;=).
+ GLOBALSUFFIX (optional) :: One or more objects from the standard set,
so long as all square brackets are balanced within GLOBALSUFFIX, and
it does not contain any semicolons (=;=) or subsequence that matches
=@KEY=.
*Examples*
#+begin_example
[cite:@key]
[cite/t:see;@foo p. 7;@bar pp. 4;by foo]
[cite/a/f:c.f.;the very important @@atkey @ once;the crucial @baz vol. 3]
#+end_example
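The examples above can be taken apart with a short Python sketch. It is
a simplification under stated assumptions: the body is split on every
semicolon (so semicolons nested inside objects are not handled), the
first and last parts count as global prefix and suffix only when they
contain no =@KEY=, and the names ~split_citation~, ~CITE_RE~ and ~KEY_RE~
are hypothetical.
#+begin_src python
import re

# "@" followed by the characters allowed in a citation KEY.
KEY_RE = re.compile(r"@[\w\-.:?!`'/*@+|(){}<>&^$#%~]+")
# "[cite", optional /STYLE and /VARIANT, a colon, then the body.
CITE_RE = re.compile(r"\[cite(?:/([\w-]+)(?:/([\w/-]+))?)?:(.*)\]", re.DOTALL)


def split_citation(text):
    """Split a citation into style, variant, prefix, references, suffix."""
    m = CITE_RE.fullmatch(text)
    if m is None:
        return None
    style, variant, body = m.groups()
    # Whitespace after the colon and before "]" is not significant.
    parts = body.strip().split(";")
    prefix = parts.pop(0) if parts and not KEY_RE.search(parts[0]) else None
    suffix = parts.pop() if parts and not KEY_RE.search(parts[-1]) else None
    if not parts:
        return None  # REFERENCES needs at least one citation reference
    return {"style": style, "variant": variant, "prefix": prefix,
            "references": parts, "suffix": suffix}
#+end_src
Applied to the second example, this yields the style =t=, the global
prefix =see=, the two references =@foo p. 7= and =@bar pp. 4=, and the
global suffix =by foo=.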
** Citation references
:PROPERTIES:
:CUSTOM_ID: Citation_References
:END:
A reference to an individual resource is given in a /citation reference/
object. Citation references are only found within [[#Citations][citations]], and are
structured according to the following pattern:
#+begin_example
KEYPREFIX @KEY KEYSUFFIX
#+end_example
Where KEYPREFIX, @KEY, and KEYSUFFIX are not separated by whitespace.
+ KEYPREFIX (optional) :: One or more objects from the minimal set,
so long as all square brackets are balanced within KEYPREFIX, and
it does not contain any semicolons (=;=) or subsequence that matches
=@KEY=.
+ KEY :: A string made of any word-constituent character, =-=, =.=, =:=,
=?=, =!=, =`=, ='=, =/=, =*=, =@=, =+=, =|=, =(=, =)=, ={=, =}=, =<=, =>=, =&=, =_=, =^=, =$=, =#=, =%=, or
=~=.
+ KEYSUFFIX (optional) :: One or more objects from the minimal set,
so long as all square brackets are balanced within KEYSUFFIX, and
it does not contain any semicolons (=;=).
** Inline Babel Calls
:PROPERTIES:
:CUSTOM_ID: Inline_Babel_Calls
:END:
Inline Babel calls are structured according to one of the following patterns:
#+begin_example
call_NAME(ARGUMENTS)
call_NAME[HEADER1](ARGUMENTS)
call_NAME(ARGUMENTS)[HEADER2]
call_NAME[HEADER1](ARGUMENTS)[HEADER2]
#+end_example
+ NAME :: A string consisting of any non-whitespace characters except
for square brackets or parentheses (=[]()=).
+ ARGUMENTS, HEADER1 (optional), HEADER2 (optional) :: A string
consisting of zero or more non-newline characters. Opening and
closing square brackets must be balanced within HEADER1 and HEADER2,
and opening and closing parentheses within ARGUMENTS.
** Inline Source Blocks
:PROPERTIES:
:CUSTOM_ID: Source_Blocks
:END:
Inline source blocks are structured according to one of the following patterns:
#+begin_example
src_LANG{BODY}
src_LANG[HEADERS]{BODY}
#+end_example
+ LANG :: A string consisting of any characters other than whitespace,
the opening square bracket (=[=), or opening curly bracket (={=).
+ HEADERS (optional), BODY :: A string consisting of zero or more
non-newline characters. Opening and closing square brackets must be
balanced within HEADERS, and opening and closing curly brackets
within BODY.
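The balancing requirement is easiest to see in code. The following
Python sketch parses an inline source block found at the start of a
string. The helper names ~_balanced~ and ~parse_inline_src~ are
hypothetical, and the scanner only counts brackets of the one kind it
is asked about.
#+begin_src python
import re


def _balanced(text, start, open_ch, close_ch):
    """Return the index just past the group opening at START, or None."""
    if start >= len(text) or text[start] != open_ch:
        return None
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "\n":
            return None  # HEADERS and BODY may not contain newlines
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth == 0:
                return i + 1
    return None


def parse_inline_src(text):
    """Return (lang, headers, body) for a block at the start of TEXT."""
    m = re.match(r"src_([^\s\[{]+)", text)
    if m is None:
        return None
    pos, headers = m.end(), None
    if pos < len(text) and text[pos] == "[":
        end = _balanced(text, pos, "[", "]")
        if end is None:
            return None
        headers, pos = text[pos + 1:end - 1], end
    end = _balanced(text, pos, "{", "}")
    if end is None:
        return None
    return (m.group(1), headers, text[pos + 1:end - 1])
#+end_src
The same scanning approach would apply to inline Babel calls, with
parentheses in place of curly brackets.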
** Line Breaks
:PROPERTIES:
:CUSTOM_ID: Line_Breaks
:END:
Line breaks must occur at the end of an otherwise non-blank line, and
are structured according to the following pattern:
#+begin_example
\\SPACE
#+end_example
+ SPACE :: Zero or more tab and space characters.
** Links
:PROPERTIES:
:CUSTOM_ID: Links
:END:
While links are a single object, they come in four subtypes: "radio",
"angle", "plain", and "regular" links.
*** Radio Links
Radio-type links are structured according to the following pattern:
#+begin_example
PRE RADIO POST
#+end_example
+ [[#Special_Tokens][PRE]] :: A non-alphanumeric character.
+ RADIO :: One or more objects matched by some [[#Targets_and_Radio_Targets][radio target]]. It can
contain the minimal set of objects.
+ [[#Special_Tokens][POST]] :: A non-alphanumeric character.
#+begin_notes
Is the raw (unparsed) text or the parsed structure matched with radio links?
#+end_notes
*Example*
#+begin_example
This is some <<<*important* information>>> which we refer to lots.
Make sure you remember the *important* information.
#+end_example
The first instance of =*important* information= defines a radio target,
which is matched by the second instance of =*important* information=,
forming a radio link.
*** Plain links
Plain-type links are structured according to the following pattern:
#+begin_example
PRE PROTOCOL:PATHPLAIN POST
#+end_example
+ [[#Special_Tokens][PRE]] :: A non word constituent character.
+ PROTOCOL :: A string which is one of the link type strings in
~org-link-parameters~[fn:olp:By default, ~org-link-parameters~ defines
links of type =shell=, =news=, =mailto=, =https=, =http=, =ftp=, =help=, =file=, and
=elisp=.].
+ PATHPLAIN :: A string containing non-whitespace non-bracket (=()[]<>=)
characters, optionally containing parenthesis-wrapped non-whitespace
non-bracket substrings up to a depth of two. The string must end
with either a non-punctuation non-whitespace character, a forward
slash, or a parenthesis-wrapped substring.[fn::This overall pattern
may be matched with the following regexp: =(?:[^
\t\n\[\]<>()]|\((?:[^ \t\n\[\]<>()]|\([^
\t\n\[\]<>()]*\))*\))+(?:[^[:punct:] \t\n]|\/|\((?:[^
\t\n\[\]<>()]|\([^ \t\n\[\]<>()]*\))*\))=]
+ [[#Special_Tokens][POST]] :: A non word constituent character.
*Example*
#+begin_example
Be sure to look at https://orgmode.org.
#+end_example
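The regexp given in the footnote can be transcribed into Python roughly
as follows. This is a sketch under assumptions: Python's ~re~ module has
no =[:punct:]= class, so ASCII ~string.punctuation~ stands in for it,
PROTOCOL is limited to the default link types listed above, and PRE and
POST are approximated with word-boundary lookarounds. The name
~find_plain_links~ is hypothetical.
#+begin_src python
import re
import string

# Link types taken from the default list quoted in the footnote above.
_PROTOCOL = "shell|news|mailto|https|http|ftp|help|file|elisp"
_CH = r"[^ \t\n\[\]<>()]"                  # non-whitespace, non-bracket
_PAREN = rf"\((?:{_CH}|\({_CH}*\))*\)"     # parentheses, nested once
_PUNCT = re.escape(string.punctuation)     # stand-in for [:punct:]
_PATH = rf"(?:{_CH}|{_PAREN})+(?:[^{_PUNCT} \t\n]|/|{_PAREN})"
PLAIN_LINK_RE = re.compile(rf"(?<!\w)(?:{_PROTOCOL}):{_PATH}(?!\w)")


def find_plain_links(text):
    """Return every plain link found in TEXT."""
    return [m.group(0) for m in PLAIN_LINK_RE.finditer(text)]
#+end_src
In the example above, the trailing full stop is left out of the link
because PATHPLAIN may not end with a punctuation character.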
*** Angle links
Angle-type links essentially provide a method to disambiguate plain links
from surrounding text, and are structured according to the following
pattern:
#+begin_example
<PROTOCOL:PATHANGLE>
#+end_example
+ PROTOCOL :: A string which is one of the link type strings in
~org-link-parameters~[fn:olp].
+ PATHANGLE :: A string containing any character but =>=, where newlines
and indentation are ignored.
The angle brackets allow for a more permissive PATH syntax, without
accidentally matching surrounding text.
*** Regular links
Regular links are structured according to one of the following two patterns:
#+begin_example
[[PATHREG]]
[[PATHREG][DESCRIPTION]]
#+end_example
+ PATHREG :: An instance of one of the seven following annotated patterns:
#+begin_example
FILENAME ("file" type)
PROTOCOL:PATHINNER ("PROTOCOL" type)
PROTOCOL://PATHINNER ("PROTOCOL" type)
id:ID ("id" type)
#CUSTOM-ID ("custom-id" type)
(CODEREF) ("coderef" type)
FUZZY ("fuzzy" type)
#+end_example
- FILENAME :: A string representing an absolute or relative file path.
- PROTOCOL :: A string which is one of the link type strings in
~org-link-parameters~[fn:olp]
- PATHINNER :: A string consisting of any character besides square brackets.
- ID :: A string consisting of hexadecimal numbers separated by hyphens.
- CUSTOM-ID :: A string consisting of any character besides square brackets.
- CODEREF :: A string consisting of any character besides square brackets.
- FUZZY :: A string consisting of any character besides square brackets.
Square brackets and backslashes can be present in PATHREG so long as
they are escaped by a backslash (i.e. =\]=, =\\=).
+ DESCRIPTION (optional) :: One or more objects enclosed by square
brackets. It can contain the minimal set of objects as well as
[[#Export_Snippets][export snippets]], [[#Inline_Babel_Calls][inline babel calls]], [[#Source_Blocks][inline source blocks]], [[#Macros][macros]],
and [[#Statistics_Cookies][statistics cookies]]. It can also contain another link, but only
when it is a plain or angle link. It can contain square brackets,
but not =]]=.
*Examples*
#+begin_example
[[https://orgmode.org][The Org project homepage]]
[[file:orgmanual.org]]
[[Regular links]]
#+end_example
** Macros
:PROPERTIES:
:CUSTOM_ID: Macros
:END:
Macros are structured according to one of the following patterns:
#+begin_example
{{{NAME}}}
{{{NAME(ARGUMENTS)}}}
#+end_example
+ NAME :: A string starting with an alphabetic character followed by
any number of alphanumeric characters, hyphens and underscores (=-_=).
+ ARGUMENTS (optional) :: A string consisting of any characters, so
long as it does not contain the substring =}}}=. Values within
ARGUMENTS are separated by commas. Non-separating commas have to be
escaped with a backslash character.
*Examples*
#+begin_example
{{{title}}}
{{{one_arg_macro(1)}}}
{{{two_arg_macro(1, 2)}}}
{{{two_arg_macro(1\,a, 2)}}}
#+end_example
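Argument splitting with escaped commas can be sketched in Python as
follows. The names ~MACRO_RE~ and ~parse_macro~ are hypothetical, and
trimming the whitespace around each value is a simplifying assumption
of this sketch rather than documented parser behaviour.
#+begin_src python
import re

# {{{NAME}}} or {{{NAME(ARGUMENTS)}}}; NAME starts with a letter.
MACRO_RE = re.compile(
    r"\{\{\{([A-Za-z][A-Za-z0-9_-]*)(?:\((.*?)\))?\}\}\}", re.DOTALL)


def parse_macro(text):
    """Return (name, [arguments]) for a macro, or None."""
    m = MACRO_RE.fullmatch(text)
    if m is None:
        return None
    name, raw = m.groups()
    if raw is None:
        return (name, [])
    # Split on commas not preceded by a backslash, then unescape "\,".
    args = [a.replace("\\,", ",").strip()
            for a in re.split(r"(?<!\\),", raw)]
    return (name, args)
#+end_src
For the last example above, the escaped comma stays inside the first
value, giving the two arguments =1,a= and =2=.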
** Targets and Radio Targets
:PROPERTIES:
:CUSTOM_ID: Targets_and_Radio_Targets
:END:
Targets are structured according to the following pattern:
#+begin_example
<<TARGET>>
#+end_example
+ TARGET :: A string containing any character but =<=, =>=, or =\n=. It
cannot start or end with a whitespace character.
Radio targets are structured according to the following pattern:
#+begin_example
<<<CONTENTS>>>
#+end_example
+ CONTENTS :: One or more objects from the minimal set, starting and
ending with a non-whitespace character, and containing any character
but =<=, =>=, or =\n=.
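Since targets use two angle brackets and radio targets use three, a
matcher has to take care not to find a target inside a radio target.
The Python sketch below shows one way to do this with lookarounds. The
names are hypothetical, and the contents of a radio target are treated
as raw text rather than parsed objects.
#+begin_src python
import re

# Inner text: no "<", ">" or newline, and no whitespace at either end.
_INNER = r"[^<>\s](?:[^<>\n]*[^<>\s])?"
# Exactly two angle brackets on each side.
TARGET_RE = re.compile(rf"(?<!<)<<({_INNER})>>(?!>)")
# Three angle brackets on each side.
RADIO_TARGET_RE = re.compile(rf"<<<({_INNER})>>>")


def find_targets(text):
    """Return (targets, radio_targets) found in TEXT."""
    return (TARGET_RE.findall(text), RADIO_TARGET_RE.findall(text))
#+end_src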
** Statistics Cookies
:PROPERTIES:
:CUSTOM_ID: Statistics_Cookies
:END:
Statistics cookies are structured according to one of the following patterns:
#+begin_example
[PERCENT%]
[NUM1/NUM2]
#+end_example
+ PERCENT (optional) :: A number.
+ NUM1 (optional) :: A number.
+ NUM2 (optional) :: A number.
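Since every number is optional, =[%]= and =[/]= are valid cookies too.
A minimal Python sketch of the two patterns follows. ~parse_cookie~ is a
hypothetical name, and "a number" is assumed to mean a run of decimal
digits.
#+begin_src python
import re

# Either [PERCENT%] or [NUM1/NUM2]; each number may be absent.
COOKIE_RE = re.compile(r"\[(?:(\d*)%|(\d*)/(\d*))\]")


def _to_int(digits):
    """Convert a possibly empty digit string to an int or None."""
    return int(digits) if digits else None


def parse_cookie(text):
    """Return ("percent", n) or ("fraction", n1, n2), or None."""
    m = COOKIE_RE.fullmatch(text)
    if m is None:
        return None
    percent, num1, num2 = m.groups()
    if percent is not None:
        return ("percent", _to_int(percent))
    return ("fraction", _to_int(num1), _to_int(num2))
#+end_src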
** Subscript and Superscript
:PROPERTIES:
:CUSTOM_ID: Subscript_and_Superscript
:END:
Subscripts are structured according to the following pattern:
#+begin_example
CHAR_SCRIPT
#+end_example
Superscripts are structured according to the following pattern:
#+begin_example
CHAR^SCRIPT
#+end_example
+ CHAR :: Any non-whitespace character.
+ SCRIPT :: One of the following constructs:
- A single asterisk character (=*=).
- An expression enclosed in curly brackets (={=, =}=), which may itself
contain balanced curly brackets and the standard set of objects.
- An instance of the pattern:
#+begin_example
SIGN CHARS FINAL
#+end_example
With no whitespace between SIGN, CHARS and FINAL.
+ SIGN (optional) :: Either a plus sign character (=+=), a minus sign
character (=-=), or the empty string.
+ CHARS :: Either the empty string, or a string consisting of any
number of alphanumeric characters, commas, backslashes, and
dots.
+ FINAL :: An alphanumeric character.
** Table Cells
:PROPERTIES:
:CUSTOM_ID: Table_Cells
:END:
Table cells are structured according to the following pattern:
#+begin_example
CONTENTS SPACES|
#+end_example
+ CONTENTS :: Zero or more objects not containing the vertical bar
character (=|=). It can contain the minimal set of objects,
[[#Citations][citations]], [[#Export_Snippets][export snippets]], [[#Footnote_References][footnote references]], [[#Links][links]], [[#Macros][macros]],
[[#Targets_and_Radio_Targets][radio targets]], [[#Targets_and_Radio_Targets][targets]], and [[#Timestamps][timestamps]].
+ SPACES :: A string consisting of zero or more space characters,
used to align the table columns.
The final vertical bar (=|=) may be omitted in the last cell of a [[#Table_Rows][table row]].
** Timestamps
:PROPERTIES:
:CUSTOM_ID: Timestamps
:END:
Timestamps are structured according to one of the seven following patterns:
#+begin_example
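<%%(SEXP)>                                                     (diary)
<DATE TIME REPEATER-OR-DELAY>                                  (active)
[DATE TIME REPEATER-OR-DELAY]                                  (inactive)
<DATE TIME REPEATER-OR-DELAY>--<DATE TIME REPEATER-OR-DELAY>   (active range)
<DATE TIME-TIME REPEATER-OR-DELAY>                             (active range)
[DATE TIME REPEATER-OR-DELAY]--[DATE TIME REPEATER-OR-DELAY]   (inactive range)
[DATE TIME-TIME REPEATER-OR-DELAY]                             (inactive range)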
#+end_example
+ SEXP :: A string consisting of any characters but =>= and =\n=.
+ DATE :: An instance of the pattern:
#+begin_example
YYYY-MM-DD DAYNAME
#+end_example
- Y, M, D :: A digit.
- DAYNAME (optional) :: A string consisting of non-whitespace
characters except =+=, =-=, =]=, =>=, a digit, or =\n=.
+ TIME (optional) :: An instance of the pattern =H:MM= where =H= represents a one to
two digit number (and can start with =0=), and =M= represents a single
digit.
+ REPEATER-OR-DELAY (optional) :: An instance of the following pattern:
#+begin_example
MARK VALUE UNIT
#+end_example
Where MARK, VALUE and UNIT are not separated by whitespace characters.
- MARK :: Either the string =+= (cumulative type), =++= (catch-up type),
or =.+= (restart type) when forming a repeater, and either =-= (all
type) or =--= (first type) when forming a warning delay.
- VALUE :: A number.
- UNIT :: Either the character =h= (hour), =d= (day), =w= (week), =m=
(month), or =y= (year).
There can be two instances of =REPEATER-OR-DELAY= in the timestamp: one
as a repeater and one as a warning delay.
#+begin_notes
Tom G has some syntax extensions he'd like to suggest for historical /
far-future dates, timezone offsets, and second/sub-second times.
#+end_notes
*Examples*
#+begin_example
<1997-11-03 Mon 19:15>
<%%(diary-float t 4 2)>
[2004-08-24 Tue]--[2004-08-26 Thu]
<2012-02-08 Wed 20:00 ++1d>
<2030-10-05 Sat +1m -3d>
#+end_example
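The single (non-range, non-diary) active and inactive patterns can be
approximated with one regular expression. This Python sketch makes
several assumptions that go beyond the description above: components
are separated by exactly one space, VALUE is a whole number, and a
repeater, when present, comes before a warning delay. The name
~parse_timestamp~ is hypothetical.
#+begin_src python
import re

# DATE with optional DAYNAME, then optional TIME, repeater and delay.
_DATE = r"(?P<date>\d{4}-\d{2}-\d{2})(?: (?P<day>[^\s+\-\]>\d]+))?"
_TIME = r"(?: (?P<time>\d{1,2}:\d{2}))?"
_REPEATER = r"(?: (?P<rmark>\+\+|\.\+|\+)(?P<rvalue>\d+)(?P<runit>[hdwmy]))?"
_DELAY = r"(?: (?P<dmark>--|-)(?P<dvalue>\d+)(?P<dunit>[hdwmy]))?"
TIMESTAMP_RE = re.compile(
    rf"(?P<open>[<\[]){_DATE}{_TIME}{_REPEATER}{_DELAY}(?P<close>[>\]])")


def _triple(mark, value, unit):
    """Return (mark, value, unit), or None when the part is absent."""
    return (mark, int(value), unit) if mark else None


def parse_timestamp(text):
    """Parse a single active or inactive timestamp, or return None."""
    m = TIMESTAMP_RE.fullmatch(text)
    if m is None:
        return None
    # The opening and closing brackets must be of the same kind.
    if (m["open"], m["close"]) not in {("<", ">"), ("[", "]")}:
        return None
    return {
        "active": m["open"] == "<",
        "date": m["date"],
        "day": m["day"],
        "time": m["time"],
        "repeater": _triple(m["rmark"], m["rvalue"], m["runit"]),
        "delay": _triple(m["dmark"], m["dvalue"], m["dunit"]),
    }
#+end_src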
** Text Markup
:PROPERTIES:
:CUSTOM_ID: Emphasis_Markers
:END:
There are six text markup objects, which are all structured according
to the following pattern:
#+begin_example
PRE MARKER CONTENTS MARKER POST
#+end_example
Where PRE, MARKER, CONTENTS, MARKER and POST are not separated by
whitespace characters.
+ [[#Special_Tokens][PRE]] :: Either a whitespace character, =-=, =(=, ={=, ='=, ="=, or the beginning
of a line.
+ MARKER :: A character that determines the object type, as follows:
- =*=, a /bold/ object,
- =/=, an /italic/ object,
- =_=, an /underline/ object,
- ===, a /verbatim/ object,
- =~=, a /code/ object,
- =+=, a /strike-through/ object.
+ CONTENTS :: An instance of the pattern:
#+begin_example
BORDER BODY BORDER
#+end_example
Where BORDER and BODY are not separated by whitespace.
- BORDER :: Any non-whitespace character.
- BODY :: Either a string (when MARKER represents code or verbatim)
or a series of objects from the standard set, not spanning more
than three lines.
+ [[#Special_Tokens][POST]] :: Either a whitespace character, =-=, =.=, =,=, =;=, =:=, =!=, =?=, ='=, =)=, =}=,
=[=, ="=, or the end of a line.
*Examples*
#+begin_example
Org is a /plaintext markup syntax/ developed with *Emacs* in 2003.
The canonical parser is =org-element.el=, which provides a number of
functions starting with ~org-element-~.
#+end_example
*** Plain Text
:PROPERTIES:
:CUSTOM_ID: Plain_Text
:END:
Any string that doesn't match any other object can be considered a
plain text object.[fn::In ~org-element.el~ plain text objects are
abstracted away to strings for performance reasons.]
Within a plain text object, all whitespace is collapsed to a single
space. For instance, =hello\n there= is equivalent to =hello there=.
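A consumer of the parse tree could apply this rule with a one-line
normalisation, sketched here in Python. ~normalize_plain_text~ is a
hypothetical helper, not something provided by ~org-element.el~.
#+begin_src python
import re


def normalize_plain_text(text):
    """Collapse each run of whitespace in TEXT to a single space."""
    return re.sub(r"\s+", " ", text)
#+end_src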
* Footnotes
[fn:1] In particular, the parser requires stars at column 0 to be
quoted by a comma when they do not define a heading.
[fn:2] It also means that only headings and sections can be recognized
just by looking at the beginning of the line. Planning lines and
property drawers can be recognized by looking at one or two lines
above.
As a consequence, using ~org-element-at-point~ or ~org-element-context~
will move up to the parent heading, and parse top-down from there
until context around the original location is found.
#+latex: \appendix
* Appendix
** Summary of changes compared to the current =org-syntax= document
:PROPERTIES:
:CUSTOM_ID: Changes
:END:
+ Rename "Headlines" -> "Headings", since while both forms are
currently used in the docs a change to consistently use the latter
seems imminent if delayed by (what looks like) an ongoing wait for
Bastien's final say
+ Describe patterns with consistent phrasing, "Xs are structured
according to the following pattern:"
+ Describe string patterns with consistent phrasing, "a string
constituted of" (and other forms) -> "a string consisting of"
+ Describe components of a pattern using description lists
+ Use verbatim objects for verbatim text over quotes
+ Change the way inlinetasks are described
+ Add =CONTENTS= component to the Item structure
+ Some whitespace/capitalisation changes
+ (Lots of) miscellaneous wording changes for clarity
+ Fix some minor errors (like referencing a variable which was removed
7y ago, or saying that switches in source block headers should be
separated by blank lines)
+ Change the babel call element syntax description to the more detailed form
found in the manual
+ Change the inline babel call object syntax description to be
consistent with the babel call element syntax. This does not
precisely match the parser behaviour, but matches a very slight
subset. The previous description in some parts matched a superset
of the parser behaviour, and in other places a subset.
+ Change "Greater Elements" / "Elements" to "Greater Elements" /
"Lesser Elements"
+ Put all Elements under the new level-1 heading "Elements"
+ Separate list definition into four sub-definitions
+ Add a "Terminology and conventions" section
+ Mention ~plain-text~ objects (see src_elisp{(org-element-type
"text")}) for the sake of consistency (When something can "contain
an object" it can contain unformatted text. Without naming
~plain-text~ as an object this is a bit funky).
+ Specify that whitespace in plain text is semantically
collapsed/equivalent to a single space. It is worth noting that this
is not indicated by =org-element='s parsing, which grabs all the
whitespace as-is. However, this feels like something which is done
for performance reasons, instead of a deliberate choice to make
whitespace significant, and there are a few things which reinforce
this view
- =ox-ascii=, the only export backend to a format which doesn't itself
collapse whitespace when interpreted, re-fills paragraphs and
collapses whitespace.
- We have a line break object. If =\n= was significant in plain text
this would be unnecessary.
- ~org-fill-paragraph~ collapses whitespace
- Lastly, this is well-established sensible behaviour in every other
plaintext format that I can think of (HTML, LaTeX, Markdown, reST,
etc.).
+ Add a bunch of examples
+ Probably a few bits and pieces that have slipped my mind.
#+latex: \newpage
** Org Entities
:PROPERTIES:
:CUSTOM_ID: Entities_List
:END:
#+begin_src emacs-lisp :results raw :exports results
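;; Build an Org table from `org-entities': strings are section
;; headings, lists are entity definitions.
(concat "| Name | Character |\n|-\n"
        (mapconcat
         (lambda (entity)
           (if (stringp entity)
               ;; Heading row: "**" headings are upcased, "*" headings
               ;; are set in italics.
               (format "| %s | |"
                       (cond
                        ((string-match-p "^\\*\\*" entity)
                         (upcase (replace-regexp-in-string "^\\*+ " "" entity)))
                        ((string-match-p "^\\*" entity)
                         (replace-regexp-in-string "^\\*+ \\(.+\\)$" "/\\1/" entity))
                        (t entity)))
             ;; Entity row: the name in verbatim, then the entity itself.
             (format "| =%s= | \\%s{} |"
                     (car entity)
                     (car entity))))
         org-entities
         "\n"))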
#+end_src
#+attr_latex: :environment longtable :font \small
#+RESULTS:
| Name | Character |
|-----------------------------+--------------------------|
| /Letters/ | |
| LATIN | |
| =Agrave= | \Agrave{} |
| =agrave= | \agrave{} |
| =Aacute= | \Aacute{} |
| =aacute= | \aacute{} |
| =Acirc= | \Acirc{} |
| =acirc= | \acirc{} |
| =Amacr= | \Amacr{} |
| =amacr= | \amacr{} |
| =Atilde= | \Atilde{} |
| =atilde= | \atilde{} |
| =Auml= | \Auml{} |
| =auml= | \auml{} |
| =Aring= | \Aring{} |
| =AA= | \AA{} |
| =aring= | \aring{} |
| =AElig= | \AElig{} |
| =aelig= | \aelig{} |
| =Ccedil= | \Ccedil{} |
| =ccedil= | \ccedil{} |
| =Egrave= | \Egrave{} |
| =egrave= | \egrave{} |
| =Eacute= | \Eacute{} |
| =eacute= | \eacute{} |
| =Ecirc= | \Ecirc{} |
| =ecirc= | \ecirc{} |
| =Euml= | \Euml{} |
| =euml= | \euml{} |
| =Igrave= | \Igrave{} |
| =igrave= | \igrave{} |
| =Iacute= | \Iacute{} |
| =iacute= | \iacute{} |
| =Idot= | \Idot{} |
| =inodot= | \inodot{} |
| =Icirc= | \Icirc{} |
| =icirc= | \icirc{} |
| =Iuml= | \Iuml{} |
| =iuml= | \iuml{} |
| =Ntilde= | \Ntilde{} |
| =ntilde= | \ntilde{} |
| =Ograve= | \Ograve{} |
| =ograve= | \ograve{} |
| =Oacute= | \Oacute{} |
| =oacute= | \oacute{} |
| =Ocirc= | \Ocirc{} |
| =ocirc= | \ocirc{} |
| =Otilde= | \Otilde{} |
| =otilde= | \otilde{} |
| =Ouml= | \Ouml{} |
| =ouml= | \ouml{} |
| =Oslash= | \Oslash{} |
| =oslash= | \oslash{} |
| =OElig= | \OElig{} |
| =oelig= | \oelig{} |
| =Scaron= | \Scaron{} |
| =scaron= | \scaron{} |
| =szlig= | \szlig{} |
| =Ugrave= | \Ugrave{} |
| =ugrave= | \ugrave{} |
| =Uacute= | \Uacute{} |
| =uacute= | \uacute{} |
| =Ucirc= | \Ucirc{} |
| =ucirc= | \ucirc{} |
| =Uuml= | \Uuml{} |
| =uuml= | \uuml{} |
| =Yacute= | \Yacute{} |
| =yacute= | \yacute{} |
| =Yuml= | \Yuml{} |
| =yuml= | \yuml{} |
| LATIN (SPECIAL FACE) | |
| =fnof= | \fnof{} |
| =real= | \real{} |
| =image= | \image{} |
| =weierp= | \weierp{} |
| =ell= | \ell{} |
| =imath= | \imath{} |
| =jmath= | \jmath{} |
| GREEK | |
| =Alpha= | \Alpha{} |
| =alpha= | \alpha{} |
| =Beta= | \Beta{} |
| =beta= | \beta{} |
| =Gamma= | \Gamma{} |
| =gamma= | \gamma{} |
| =Delta= | \Delta{} |
| =delta= | \delta{} |
| =Epsilon= | \Epsilon{} |
| =epsilon= | \epsilon{} |
| =varepsilon= | \varepsilon{} |
| =Zeta= | \Zeta{} |
| =zeta= | \zeta{} |
| =Eta= | \Eta{} |
| =eta= | \eta{} |
| =Theta= | \Theta{} |
| =theta= | \theta{} |
| =thetasym= | \thetasym{} |
| =vartheta= | \vartheta{} |
| =Iota= | \Iota{} |
| =iota= | \iota{} |
| =Kappa= | \Kappa{} |
| =kappa= | \kappa{} |
| =Lambda= | \Lambda{} |
| =lambda= | \lambda{} |
| =Mu= | \Mu{} |
| =mu= | \mu{} |
| =nu= | \nu{} |
| =Nu= | \Nu{} |
| =Xi= | \Xi{} |
| =xi= | \xi{} |
| =Omicron= | \Omicron{} |
| =omicron= | \omicron{} |
| =Pi= | \Pi{} |
| =pi= | \pi{} |
| =Rho= | \Rho{} |
| =rho= | \rho{} |
| =Sigma= | \Sigma{} |
| =sigma= | \sigma{} |
| =sigmaf= | \sigmaf{} |
| =varsigma= | \varsigma{} |
| =Tau= | \Tau{} |
| =Upsilon= | \Upsilon{} |
| =upsih= | \upsih{} |
| =upsilon= | \upsilon{} |
| =Phi= | \Phi{} |
| =phi= | \phi{} |
| =varphi= | \varphi{} |
| =Chi= | \Chi{} |
| =chi= | \chi{} |
| =acutex= | \acutex{} |
| =Psi= | \Psi{} |
| =psi= | \psi{} |
| =tau= | \tau{} |
| =Omega= | \Omega{} |
| =omega= | \omega{} |
| =piv= | \piv{} |
| =varpi= | \varpi{} |
| =partial= | \partial{} |
| HEBREW | |
| =alefsym= | \alefsym{} |
| =aleph= | \aleph{} |
| =gimel= | \gimel{} |
| =beth= | \beth{} |
| =dalet= | \dalet{} |
| ICELANDIC | |
| =ETH= | \ETH{} |
| =eth= | \eth{} |
| =THORN= | \THORN{} |
| =thorn= | \thorn{} |
| /Punctuation/ | |
| DOTS AND MARKS | |
| =dots= | \dots{} |
| =cdots= | \cdots{} |
| =hellip= | \hellip{} |
| =middot= | \middot{} |
| =iexcl= | \iexcl{} |
| =iquest= | \iquest{} |
| DASH-LIKE | |
| =shy= | \shy{} |
| =ndash= | \ndash{} |
| =mdash= | \mdash{} |
| QUOTATIONS | |
| =quot= | \quot{} |
| =acute= | \acute{} |
| =ldquo= | \ldquo{} |
| =rdquo= | \rdquo{} |
| =bdquo= | \bdquo{} |
| =lsquo= | \lsquo{} |
| =rsquo= | \rsquo{} |
| =sbquo= | \sbquo{} |
| =laquo= | \laquo{} |
| =raquo= | \raquo{} |
| =lsaquo= | \lsaquo{} |
| =rsaquo= | \rsaquo{} |
| /Other/ | |
| MISC. (OFTEN USED) | |
| =circ= | \circ{} |
| =vert= | \vert{} |
| =vbar= | \vbar{} |
| =brvbar= | \brvbar{} |
| =S= | \S{} |
| =sect= | \sect{} |
| =amp= | \amp{} |
| =lt= | \lt{} |
| =gt= | \gt{} |
| =tilde= | \tilde{} |
| =slash= | \slash{} |
| =plus= | \plus{} |
| =under= | \under{} |
| =equal= | \equal{} |
| =asciicirc= | \asciicirc{} |
| =dagger= | \dagger{} |
| =dag= | \dag{} |
| =Dagger= | \Dagger{} |
| =ddag= | \ddag{} |
| WHITESPACE | |
| =nbsp= | \nbsp{} |
| =ensp= | \ensp{} |
| =emsp= | \emsp{} |
| =thinsp= | \thinsp{} |
| CURRENCY | |
| =curren= | \curren{} |
| =cent= | \cent{} |
| =pound= | \pound{} |
| =yen= | \yen{} |
| =euro= | \euro{} |
| =EUR= | \EUR{} |
| =dollar= | \dollar{} |
| =USD= | \USD{} |
| PROPERTY MARKS | |
| =copy= | \copy{} |
| =reg= | \reg{} |
| =trade= | \trade{} |
| SCIENCE ET AL. | |
| =minus= | \minus{} |
| =pm= | \pm{} |
| =plusmn= | \plusmn{} |
| =times= | \times{} |
| =frasl= | \frasl{} |
| =colon= | \colon{} |
| =div= | \div{} |
| =frac12= | \frac12{} |
| =frac14= | \frac14{} |
| =frac34= | \frac34{} |
| =permil= | \permil{} |
| =sup1= | \sup1{} |
| =sup2= | \sup2{} |
| =sup3= | \sup3{} |
| =radic= | \radic{} |
| =sum= | \sum{} |
| =prod= | \prod{} |
| =micro= | \micro{} |
| =macr= | \macr{} |
| =deg= | \deg{} |
| =prime= | \prime{} |
| =Prime= | \Prime{} |
| =infin= | \infin{} |
| =infty= | \infty{} |
| =prop= | \prop{} |
| =propto= | \propto{} |
| =not= | \not{} |
| =neg= | \neg{} |
| =land= | \land{} |
| =wedge= | \wedge{} |
| =lor= | \lor{} |
| =vee= | \vee{} |
| =cap= | \cap{} |
| =cup= | \cup{} |
| =smile= | \smile{} |
| =frown= | \frown{} |
| =int= | \int{} |
| =therefore= | \therefore{} |
| =there4= | \there4{} |
| =because= | \because{} |
| =sim= | \sim{} |
| =cong= | \cong{} |
| =simeq= | \simeq{} |
| =asymp= | \asymp{} |
| =approx= | \approx{} |
| =ne= | \ne{} |
| =neq= | \neq{} |
| =equiv= | \equiv{} |
| =triangleq= | \triangleq{} |
| =le= | \le{} |
| =leq= | \leq{} |
| =ge= | \ge{} |
| =geq= | \geq{} |
| =lessgtr= | \lessgtr{} |
| =lesseqgtr= | \lesseqgtr{} |
| =ll= | \ll{} |
| =Ll= | \Ll{} |
| =lll= | \lll{} |
| =gg= | \gg{} |
| =Gg= | \Gg{} |
| =ggg= | \ggg{} |
| =prec= | \prec{} |
| =preceq= | \preceq{} |
| =preccurlyeq= | \preccurlyeq{} |
| =succ= | \succ{} |
| =succeq= | \succeq{} |
| =succcurlyeq= | \succcurlyeq{} |
| =sub= | \sub{} |
| =subset= | \subset{} |
| =sup= | \sup{} |
| =supset= | \supset{} |
| =nsub= | \nsub{} |
| =sube= | \sube{} |
| =nsup= | \nsup{} |
| =supe= | \supe{} |
| =setminus= | \setminus{} |
| =forall= | \forall{} |
| =exist= | \exist{} |
| =exists= | \exists{} |
| =nexist= | \nexist{} |
| =nexists= | \nexists{} |
| =empty= | \empty{} |
| =emptyset= | \emptyset{} |
| =isin= | \isin{} |
| =in= | \in{} |
| =notin= | \notin{} |
| =ni= | \ni{} |
| =nabla= | \nabla{} |
| =ang= | \ang{} |
| =angle= | \angle{} |
| =perp= | \perp{} |
| =parallel= | \parallel{} |
| =sdot= | \sdot{} |
| =cdot= | \cdot{} |
| =lceil= | \lceil{} |
| =rceil= | \rceil{} |
| =lfloor= | \lfloor{} |
| =rfloor= | \rfloor{} |
| =lang= | \lang{} |
| =rang= | \rang{} |
| =langle= | \langle{} |
| =rangle= | \rangle{} |
| =hbar= | \hbar{} |
| =mho= | \mho{} |
| ARROWS | |
| =larr= | \larr{} |
| =leftarrow= | \leftarrow{} |
| =gets= | \gets{} |
| =lArr= | \lArr{} |
| =Leftarrow= | \Leftarrow{} |
| =uarr= | \uarr{} |
| =uparrow= | \uparrow{} |
| =uArr= | \uArr{} |
| =Uparrow= | \Uparrow{} |
| =rarr= | \rarr{} |
| =to= | \to{} |
| =rightarrow= | \rightarrow{} |
| =rArr= | \rArr{} |
| =Rightarrow= | \Rightarrow{} |
| =darr= | \darr{} |
| =downarrow= | \downarrow{} |
| =dArr= | \dArr{} |
| =Downarrow= | \Downarrow{} |
| =harr= | \harr{} |
| =leftrightarrow= | \leftrightarrow{} |
| =hArr= | \hArr{} |
| =Leftrightarrow= | \Leftrightarrow{} |
| =crarr= | \crarr{} |
| =hookleftarrow= | \hookleftarrow{} |
| FUNCTION NAMES | |
| =arccos= | \arccos{} |
| =arcsin= | \arcsin{} |
| =arctan= | \arctan{} |
| =arg= | \arg{} |
| =cos= | \cos{} |
| =cosh= | \cosh{} |
| =cot= | \cot{} |
| =coth= | \coth{} |
| =csc= | \csc{} |
| =deg= | \deg{} |
| =det= | \det{} |
| =dim= | \dim{} |
| =exp= | \exp{} |
| =gcd= | \gcd{} |
| =hom= | \hom{} |
| =inf= | \inf{} |
| =ker= | \ker{} |
| =lg= | \lg{} |
| =lim= | \lim{} |
| =liminf= | \liminf{} |
| =limsup= | \limsup{} |
| =ln= | \ln{} |
| =log= | \log{} |
| =max= | \max{} |
| =min= | \min{} |
| =Pr= | \Pr{} |
| =sec= | \sec{} |
| =sin= | \sin{} |
| =sinh= | \sinh{} |
| =sup= | \sup{} |
| =tan= | \tan{} |
| =tanh= | \tanh{} |
| SIGNS & SYMBOLS | |
| =bull= | \bull{} |
| =bullet= | \bullet{} |
| =star= | \star{} |
| =lowast= | \lowast{} |
| =ast= | \ast{} |
| =odot= | \odot{} |
| =oplus= | \oplus{} |
| =otimes= | \otimes{} |
| =check= | \check{} |
| =checkmark= | \checkmark{} |
| MISCELLANEOUS (SELDOM USED) | |
| =para= | \para{} |
| =ordf= | \ordf{} |
| =ordm= | \ordm{} |
| =cedil= | \cedil{} |
| =oline= | \oline{} |
| =uml= | \uml{} |
| =zwnj= | \zwnj{} |
| =zwj= | \zwj{} |
| =lrm= | \lrm{} |
| =rlm= | \rlm{} |
| SMILIES | |
| =smiley= | \smiley{} |
| =blacksmile= | \blacksmile{} |
| =sad= | \sad{} |
| =frowny= | \frowny{} |
| SUITS | |
| =clubs= | \clubs{} |
| =clubsuit= | \clubsuit{} |
| =spades= | \spades{} |
| =spadesuit= | \spadesuit{} |
| =hearts= | \hearts{} |
| =heartsuit= | \heartsuit{} |
| =diams= | \diams{} |
| =diamondsuit= | \diamondsuit{} |
| =diamond= | \diamond{} |
| =Diamond= | \Diamond{} |
| =loz= | \loz{} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |
| =_ = | \_ {} |