Maxim Nikulin writes: > I am trying to minimize number of regexp searches. Mostly it is applied > when information concerning multiple headings is required (agenda, > refile targets). It unlikely will get some benefits during interactive > calls related to single heading. > Having the tree, it is possible to instantly find heading for > *arbitrary* position in the buffer. Likely the next operation is goto to > the heading or to it parent and parsing the line for more detailed > properties. The latter is cacheable, structure for heading properties > can be expanded. Thanks for the explanation! I very much like this idea, though I do not think that it is going to be practical on actual large files. Your code runs for 0.25sec on my largest file: | scan x10 | 2.482179163 | 1 | 0.38996822200002157 | | btree x10 | 0.103329151 | 0 | 0.0 | | scan+btree x10 | 2.666915096 | 1 | 0.3657946569999808 | | find random x10 000 | 0.048668756 | 0 | 0.0 | | nodes | 16641 | | | > Since there is no operations as insertion or deletion of nodes from > tree, no complex code is required to implement balancing rotations. That > is why I think that avl-tree is an overkill. Yet, if we could keep the tree in sync with buffer modifications... And we can, using org-element-cache. It already uses AVL-tree and already syncs it automatically on buffer edits. The only problem is that it does not work with headlines. However, I wrote some code adding headline support to org-element-cache in https://github.com/yantar92/org. Some benchmarks: | scan x10 | 2.179845106 | 0 | 0.0 | | btree x10 | 0.474887597 | 1 | 0.364201286999986 | | scan+btree x10 | 2.318029902 | 0 | 0.0 | | scan using org-element-cache x10 | 2.8941826990000004 | 0 | 0.0 | | populating the cache x1 | 11.332366941 | 5 | 2.040159146999997 | | find random x10 000 | 0.053074866 | 0 | 0.0 | | find random in org-element-cache x10 000 | 0.007963375 | 0 | 0.0 | | nodes | 16641 | | | Scan through the file takes pretty much the same time with the btree approach. Sometimes more, sometimes less. And most of the time is really just re-search-forward. Find random showcases that avl-tree vs. btree really makes a difference on such large number of headings. On the worse side, initial population of the org-element-cache takes quite a bit of time. However, it is normally done asynchronously during idle time and does not really affect the user except the first loading (which might be further optimised if we store cache between Emacs sessions). Later, cache is updated automatically with buffer modifications. Moreover, the org-element-cache will already provide the parsed headlines, including titles (and tags and other useful staff). No need to match headline text with regexp. For me, the idea of storing headlines in cache looks promising. I can imagine many other Org functions using the cache after trivial modifications. At least org-get-tags, org-get-category, org-entry-get can be easily adapted to use the cache. What do you think? > See the attachment for experimental (thus almost untested) code. Likely > you will find code style quite ugly. I am not happy with 0.1 sec for a > moderately large file. It is close to the limit for comfortable > interactive operations. No worries about the style. It is testing anyway. I modified your code slightly for testing the org-element-cache. See the attached diff. Best, Ihor