13.17 Advanced Export Configuration

Export hooks

The export process executes two hooks before the actual exporting begins. The first hook, org-export-before-processing-functions, runs before any expansions of macros, Babel code, and include keywords in the buffer. The second hook, org-export-before-parsing-functions, runs before the buffer is parsed.

Functions added to these hooks are called with a single argument: the export backend actually used, as a symbol. You may use them for heavy-duty structural modifications of the document. For example, you can remove every headline in the buffer during export like this:

(defun my-headline-removal (backend)
  "Remove all headlines in the current buffer.
BACKEND is the export backend being used, as a symbol."
  (org-map-entries
   (lambda ()
     (delete-region (point) (line-beginning-position 2))
     ;; We need to tell `org-map-entries' to not skip over heading at
     ;; point. Otherwise, it would continue from _next_ heading. See
     ;; the docstring of `org-map-entries' for details.
     (setq org-map-continue-from (point)))))

(add-hook 'org-export-before-parsing-functions #'my-headline-removal)
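
A hook function can also restrict itself to one family of backends by testing its BACKEND argument. The following is a minimal sketch, assuming a hypothetical convention where lines starting with ‘DRAFT:’ should be dropped from LaTeX exports; the function name my-strip-draft-lines and the ‘DRAFT:’ marker are illustrative and not part of Org.

```emacs-lisp
(require 'ox)
(require 'ox-latex) ; registers the `latex' backend used in the test below

(defun my-strip-draft-lines (backend)
  "Delete lines starting with \"DRAFT:\" for LaTeX-derived backends.
BACKEND is the export backend being used, as a symbol."
  (when (org-export-derived-backend-p backend 'latex)
    (save-excursion
      (goto-char (point-min))
      ;; Match case-sensitively so that \"draft:\" is left alone.
      (let ((case-fold-search nil))
        (while (re-search-forward "^DRAFT:.*\n?" nil t)
          (replace-match ""))))))

(add-hook 'org-export-before-processing-functions #'my-strip-draft-lines)
```

Because the hook runs on the temporary copy of the buffer, the original Org file is left untouched.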

Filters

Filters are lists of functions applied to specific parts of the exported document for a given backend. Each function in a filter receives the output of the previous one, and the output of the last function is the final result.
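
The chaining can be observed in isolation. The sketch below calls org-export-filter-apply-functions, the helper in ox.el that runs a filter list, directly on a string; calling it by hand is only for illustration, and the three my-filter-* functions are hypothetical. Per that helper's docstring, a filter that returns nil is skipped and the value is passed on unchanged.

```emacs-lisp
(require 'ox)

(defun my-filter-upcase (text _backend _info)
  "Return TEXT in upper case."
  (upcase text))

(defun my-filter-skip (_text _backend _info)
  "Return nil, so the value is passed on unchanged."
  nil)

(defun my-filter-exclaim (text _backend _info)
  "Append an exclamation mark to TEXT."
  (concat text "!"))

;; The third argument is the INFO plist; nil is enough for this demo.
(org-export-filter-apply-functions
 (list #'my-filter-upcase #'my-filter-skip #'my-filter-exclaim)
 "hello" nil)
;; ⇒ "HELLO!"
```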

The Org export process has many filter sets applicable to different types of objects, plain text, parse trees, export options, and final output formats. The filters are named after the element type or object type: org-export-filter-TYPE-functions, where TYPE is the type targeted by the filter. Valid types are:

body                 bold                babel-call
center-block         clock               code
diary-sexp           drawer              dynamic-block
entity               example-block       export-block
export-snippet       final-output        fixed-width
footnote-definition  footnote-reference  headline
horizontal-rule      inline-babel-call   inline-src-block
inlinetask           italic              item
keyword              latex-environment   latex-fragment
line-break           link                node-property
options              paragraph           parse-tree
plain-list           plain-text          planning
property-drawer      quote-block         radio-target
section              special-block       src-block
statistics-cookie    strike-through      subscript
superscript          table               table-cell
table-row            target              timestamp
underline            verbatim            verse-block

Here is an example filter that replaces non-breaking spaces (U+00A0) in the Org buffer with ‘~’ for the LaTeX backend.

(defun my-latex-filter-nobreaks (text backend info)
  "Ensure non-breaking spaces are properly handled in LaTeX export."
  (when (org-export-derived-backend-p backend 'latex)
    ;; "\u00A0" is the non-breaking space, written as an escape so it
    ;; cannot be confused with an ordinary space.
    (replace-regexp-in-string "\u00A0" "~" text)))

(add-to-list 'org-export-filter-plain-text-functions
             'my-latex-filter-nobreaks)

A filter function takes three arguments: the text to be transformed, the name of the backend as a symbol, and a plist holding information about the export process. The third argument can be safely ignored. Note the use of the org-export-derived-backend-p predicate, which tests for the latex backend or any backend derived from it, such as beamer.

Defining filters for individual files

Export filters can be set not only per backend but also per file, through the ‘BIND’ keyword. Here is an example with two filters: one removes brackets from timestamps, and the other removes strike-through text. The filter functions are defined in a code block in the same Org file, which is a handy location for debugging.

#+BIND: org-export-filter-timestamp-functions (tmp-f-timestamp)
#+BIND: org-export-filter-strike-through-functions (tmp-f-strike-through)
#+BEGIN_SRC emacs-lisp :exports results :results none
  (defun tmp-f-timestamp (s backend info)
    (replace-regexp-in-string "&[lg]t;\\|[][]" "" s))
  (defun tmp-f-strike-through (s backend info) "")
#+END_SRC
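
‘BIND’ keywords only take effect when the variable org-export-allow-bind-keywords is non-nil, since letting a file set arbitrary variables is a potential security risk. The following sketch enables it for a single export of a string, so the effect of a file-level filter can be tried from Lisp; the helper my-export-with-bind is hypothetical, and the ASCII backend is chosen only to keep the output short.

```emacs-lisp
(require 'ox-ascii)

(defun tmp-f-strike-through (_s _backend _info)
  "Remove strike-through text by returning the empty string."
  "")

(defun my-export-with-bind (org-text)
  "Export ORG-TEXT to ASCII, body only, with BIND keywords enabled.
Hypothetical helper for experimenting; not part of Org."
  (let ((org-export-allow-bind-keywords t))
    (org-export-string-as org-text 'ascii t)))

;; The strike-through text \"drop\" should be absent from the result.
(my-export-with-bind
 (concat "#+BIND: org-export-filter-strike-through-functions "
         "(tmp-f-strike-through)\n"
         "Keep +drop+ this."))
```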

Summary of the export process

Org mode export is a multi-step process that works on a temporary copy of the buffer. The export process consists of 4 major steps:

  1. Process the temporary copy, making necessary changes to the buffer text;
  2. Parse the buffer, converting plain Org markup into an abstract syntax tree (AST);
  3. Convert the AST to text, as prescribed by the selected export backend;
  4. Post-process the resulting exported text.

Process temporary copy of the source Org buffer (147):

  1. Execute org-export-before-processing-functions (see Export hooks);
  2. Expand ‘#+include’ keywords in the whole buffer (see Include Files);
  3. Remove commented subtrees in the whole buffer (see Comment Lines);
  4. Replace macros in the whole buffer (see Macro Replacement);
  5. When org-export-use-babel is non-nil (default), process code blocks:

Parse the temporary buffer, creating AST:

  1. Execute org-export-before-parsing-functions (see Export hooks). The hook functions may still modify the buffer;
  2. Calculate export option values according to subtree-specific export settings, in-buffer keywords, ‘#+BIND’ keywords, and buffer-local and global customization. The whole buffer is considered;
  3. When org-export-process-citations is non-nil (default), determine contributing bibliographies and record them into export options (see Citations). The whole buffer is considered;
  4. Execute org-export-filter-options-functions;
  5. Parse the accessible portion of the temporary buffer to generate an AST. The AST is a nested list of lists representing Org syntax elements (see Org Element API for more details):
    (org-data ...
     (heading
      (section
       (paragraph (plain-text) (bold (plain-text))))
      (heading)
      (heading (section ...))))
    

    Past this point, modifications to the temporary buffer no longer affect the export; Org export works only with the AST;

  6. Remove elements that are not exported from the AST:
    • Headings according to ‘SELECT_TAGS’ and ‘EXCLUDE_TAGS’ export keywords; ‘task’, ‘inline’, ‘arch’ export options (see Export Settings);
    • Comments;
    • Clocks, drawers, fixed-width environments, footnotes, LaTeX environments and fragments, node properties, planning lines, property drawers, statistics cookies, timestamps, etc according to ‘#+OPTIONS’ keyword (see Export Settings);
    • Table rows containing width and alignment markers, unless the selected export backend changes :with-special-rows export option to non-nil (see Column Width and Alignment);
    • Table columns containing recalc marks (see Advanced features).
  7. Expand environment variables in file link AST nodes according to the ‘expand-links’ export option (see Export Settings);
  8. Execute org-export-filter-parse-tree-functions. These functions can modify the AST by side effects;
  9. When org-export-process-citations is non-nil (default), replace citation AST nodes and ‘#+print_bibliography’ keyword AST nodes as prescribed by the selected citation export processor (see Citation export processors).
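
As an illustration of a parse-tree filter that modifies the AST by side effects, here is a minimal sketch that detaches every horizontal rule before conversion. The function name my-parse-tree-drop-rules is hypothetical. It uses org-element-map to collect the nodes and org-element-extract-element to detach them; newer Org releases may prefer the name org-element-extract for the latter.

```emacs-lisp
(require 'ox)
(require 'org-element)

(defun my-parse-tree-drop-rules (tree _backend info)
  "Remove every horizontal rule from TREE by side effect.
Return the modified TREE."
  ;; Collect the matching nodes first, then detach them, so the tree
  ;; is not altered while `org-element-map' is still walking it.
  (dolist (rule (org-element-map tree 'horizontal-rule #'identity info))
    (org-element-extract-element rule))
  tree)

(add-to-list 'org-export-filter-parse-tree-functions
             #'my-parse-tree-drop-rules)
```

Unlike the hooks, this filter never touches buffer text; it operates on the parsed tree only.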

Convert the AST to text by traversing the AST nodes, depth-first:

  1. Convert the leaf nodes (without children) to text as prescribed by “transcoders” in the selected export backend (148);
  2. Pass the converted nodes through the corresponding export filters (see Filters);
  3. Concatenate all the converted child nodes to produce parent node contents;
  4. Convert the nodes with children to text, passing the nodes themselves and their exported contents to the corresponding transcoders and then to the export filters (see Filters).

Post-process the exported text:

  1. Post-process the converted AST, as prescribed by the export backend (149). This step usually adds generated content (like a Table of Contents) to the exported text;
  2. Execute org-export-filter-body-functions;
  3. Unless body-only export is selected (see The Export Dispatcher), add the necessary metadata to the final document, as prescribed by the export backend. Examples: Document author/title; HTML headers/footers; LaTeX preamble;
  4. When org-export-process-citations is non-nil (default), add bibliography metadata, as prescribed by the citation export processor;
  5. Execute org-export-filter-final-output-functions.
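
The relative order of the body filters and the final-output filters can be checked with a small experiment. This is a sketch only: the two marker filters are hypothetical, and a body-only ASCII export of a string is used to keep the output small. The body filter runs first, so its marker ends up below the one added by the final-output filter.

```emacs-lisp
(require 'ox-ascii)

(defun my-body-filter-mark (body _backend _info)
  "Prefix BODY with a marker line (hypothetical demo filter)."
  (concat "[body]\n" body))

(defun my-final-filter-mark (output _backend _info)
  "Prefix OUTPUT with a marker line (hypothetical demo filter)."
  (concat "[final]\n" output))

(defun my-export-with-markers (org-text)
  "Export ORG-TEXT to ASCII, body only, with both marker filters."
  ;; Bind the filter lists locally so the global setup is untouched.
  (let ((org-export-filter-body-functions '(my-body-filter-mark))
        (org-export-filter-final-output-functions '(my-final-filter-mark)))
    (org-export-string-as org-text 'ascii t)))

;; The result starts with the \"[final]\" line, then the \"[body]\" line.
(my-export-with-markers "Some *bold* text.")
```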

Extending an existing backend

Some parts of the conversion process can be extended for certain elements so as to introduce a new or revised translation. That is how the HTML export backend was extended to handle Markdown format. The extensions work seamlessly: anything not handled by the extending backend is handled by the original backend. Among Org's export customization mechanisms, extending a backend is especially powerful because it operates at the parser level.

For this example, make the ascii backend display the language used in a source code block, but only when a given attribute is non-nil, as in the following:

#+ATTR_ASCII: :language t

Then extend the ASCII backend with a custom “my-ascii” backend.

(defun my-ascii-src-block (src-block contents info)
  "Transcode a SRC-BLOCK element from Org to ASCII.
CONTENTS is nil.  INFO is a plist used as a communication
channel."
  (if (not (org-export-read-attribute :attr_ascii src-block :language))
      (org-export-with-backend 'ascii src-block contents info)
    ;; Draw a box around the code, with the language in the top border.
    (format ",--[ %s ]--\n%s`----"
            (org-element-property :language src-block)
            (replace-regexp-in-string
             "^" "| "
             (org-element-normalize-string
              (org-export-format-code-default src-block info))))))

(org-export-define-derived-backend 'my-ascii 'ascii
  :translate-alist '((src-block . my-ascii-src-block)))

The my-ascii-src-block function looks at the attribute above the current element. If the attribute is nil, it hands the element over to the ascii backend. If it is non-nil, as in this example, it draws a box around the code and inserts the language name into the top border. The last form creates the new backend, which springs into action only when translating src-block type elements.

To use the newly defined backend, evaluate the following from an Org buffer:

(org-export-to-buffer 'my-ascii "*Org MY-ASCII Export*")

Further steps to consider would be an interactive function, self-installing an item in the export dispatcher menu, and other user-friendly improvements. See https://orgmode.org/worg/dev/org-export-reference.html for more details.
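
As one possible next step, here is a sketch of an interactive command and a dispatcher menu item for the backend. It assumes the my-ascii-src-block transcoder defined above is loaded; the command name my-ascii-export-as-buffer and the key ‘M’ are arbitrary choices for illustration. The menu entry reuses ‘t’, the key of the ASCII backend in the dispatcher, so the new item appears under the existing plain text sub-menu.

```emacs-lisp
(require 'ox-ascii)

(defun my-ascii-export-as-buffer
    (&optional async subtreep visible-only body-only)
  "Export the current buffer to a temporary buffer with `my-ascii'.
The optional arguments are passed on to `org-export-to-buffer'."
  (interactive)
  (org-export-to-buffer 'my-ascii "*Org MY-ASCII Export*"
    async subtreep visible-only body-only))

;; Redefine the backend with an additional :menu-entry.  The symbol
;; `my-ascii-src-block' only needs to be defined by export time.
(org-export-define-derived-backend 'my-ascii 'ascii
  :translate-alist '((src-block . my-ascii-src-block))
  :menu-entry
  '(?t 1
       ((?M "As MY-ASCII buffer" my-ascii-export-as-buffer))))
```

After evaluating this, the export dispatcher should offer the new item under the ‘t’ key.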


Footnotes

(147)

Unless otherwise specified, each step of the export process only operates on the accessible portion of the buffer. When subtree export is selected (see The Export Dispatcher), the buffer is narrowed to the body of the selected subtree, so that the rest of the buffer text, except export keywords, does not contribute to the export output.

(148)

See transcoders and :translate-alist in the docstrings of org-export-define-backend and org-export-define-derived-backend.

(149)

See inner-template in the docstring of org-export-define-backend.