Org-mode mailing list
 help / color / mirror / Atom feed
From: Ihor Radchenko <yantar92@gmail.com>
To: Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: emacs-orgmode@gnu.org
Subject: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Date: Fri, 05 Jun 2020 16:18:59 +0800
Message-ID: <87mu5iq618.fsf@localhost> (raw)
In-Reply-To: <87r1uuotw8.fsf@nicolasgoaziou.fr>

> See also `gensym'. Do we really need to use it for something else than
> `invisible'? If not, the tool doesn't need to be generic.

For now, I also use it for buffer-local 'invisible stack. The stack is
needed to preserve folding state of drawers/blocks inside folded
outline. Though I am thinking about replacing the stack with separate
text properties, like 'invisible-outline-buffer-local +
'invisible-drawer-buffer-local + 'invisible-block-buffer-local.
Maintaining stack takes a noticeable percentage of CPU time in profiler.

org--get-buffer-local-text-property-symbol must take care about
situation with indirect buffers. When an indirect buffer is created from
some org buffer, the old value of char-property-alias-alist is carried
over. We need to detect this case and create new buffer-local symbol,
which is unique to the newly created buffer (but not create it if the
buffer-local property is already there). Then, the new symbol must
replace the old alias in char-property-alias-alist + old folding state
must be preserved (via copying the old invisibility specs into the new
buffer-local text property). I do not see how gensym can benefit this
logic.

> OK, but this may not be sufficient if we want to do slightly better than
> overlays in that area. This is not mandatory, though.

Could you elaborate on what can be "slightly better"? 

> As discussed before, I don't think you need to use `modification-hooks'
> or `insert-behind-hooks' if you already use `after-change-functions'.
>
> `after-change-functions' are also triggered upon text properties
> changes. So, what is the use case for the other hooks?

The problem is that `after-change-functions' cannot be a text property.
Only `modification-hooks' and `insert-in-front/behind-hooks' can be a
valid text property. If we use `after-change-functions', they will
always be triggered, regardless if the change was made inside or outside
folded region.

>> :asd:
>> :drawer:
>> lksjdfksdfjl
>> sdfsdfsdf
>> :end:
>>
>> If :asd: was inserted in front of folded :drawer:, changes in :drawer:
>> line of the new folded :asd: drawer would reveal the text between
>> :drawer: and :end:.
>>
>> Let me know what you think on this.

> I have first to understand the use case for `modification-hook'. But
> I think unfolding is the right thing to do in this situation, isn't it?

That situation arises because the modification-hooks from ":drawer:"
(they are set via text properties) only have information about the
:drawer:...:end: drawer before the modifications (they were set when
:drawer: was folded last time). So, they will only unfold a part of the
new :asd: drawer. I do not see a simple way to unfold everything without
re-parsing the drawer around the changed text.

Actually, I am quite unhappy with the performance of modification-hooks
set via text properties (I am using this patch on my Emacs during this
week). It appears that setting the text properties costs a significant
CPU time in practice, even though running the hooks is pretty fast.
I will think about a way to handle modifications using global
after-change-functions.

> `org--get-element-region-at-point' is certainly faster, but it is also
> wrong, unfortunately.
>
> Org syntax is not context-free grammar. If you try to parse it locally,
> starting from anywhere, it will fail at some point. For example, your
> function would choke in the following case:
>
>     [fn:1] Def1
>     #+begin_something
>
>     [fn:2] Def2
>     #+end_something

I see. 

> AFAIK, the only proper way to parse it is to start from a known position
> in the buffer. If you have no information about the buffer, the headline
> above is the position you want. With cache could help to start below.
> Anyway, in this particular case, you should not use
> `org--get-element-region-at-point'.

OK

Best,
Ihor

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> Hello,
>
> Ihor Radchenko <yantar92@gmail.com> writes:
>
>> [The patch itself will be provided in the following email]
>
> Thank you.
>
>> I have found char-property-alias-alist variable that controls how Emacs
>> calculates text property value if the property is not set. This variable
>> can be buffer-local, which allows independent 'invisible states in
>> different buffers.
>
> Great. I didn't know about this variable!
>
>> All the implementation stays in
>> org--get-buffer-local-text-property-symbol, which takes care about
>> generating unique property name and mapping it to 'invisible (or any
>> other) text property.
>
> See also `gensym'. Do we really need to use it for something else than
> `invisible'? If not, the tool doesn't need to be generic.
>
>> I simplified the code as suggested, without using pairs of before- and
>> after-change-functions.
>
> Great!
>
>> Handling text inserted into folded/invisible region is handled by a
>> simple after-change function. After testing, it turned out that simple
>> re-hiding text based on 'invisible property of the text before/after the
>> inserted region works pretty well.
>
> OK, but this may not be sufficient if we want to do slightly better than
> overlays in that area. This is not mandatory, though.
>
>> Modifications to BEGIN/END line of the drawers and blocks is handled via
>> 'modification-hooks + 'insert-behind-hooks text properties (there is no
>> after-change-functions analogue for text properties in Emacs). The
>> property is applied during folding and the modification-hook function is
>> made aware about the drawer/block boundaries (via apply-partially
>> passing element containing :begin :end markers for the current
>> drawer/block). Passing the element boundary is important because the
>> 'modification-hook will not directly know where it belongs to. Only the
>> modified region (which can be larger than the drawer) is passed to the
>> function. In the worst case, the region can be the whole buffer (if one
>> runs revert-buffer).
>
> As discussed before, I don't think you need to use `modification-hooks'
> or `insert-behind-hooks' if you already use `after-change-functions'.
>
> `after-change-functions' are also triggered upon text properties
> changes. So, what is the use case for the other hooks?
>
>> It turned out that adding 'modification-hook text property takes a
>> significant cpu time (partially, because we need to take care about
>> possible existing 'modification-hook value, see
>> org--add-to-list-text-property). For now, I decided to not clear the
>> modification hooks during unfolding because of poor performance.
>> However, this approach would lead to partial unfolding in the following
>> case:
>>
>> :asd:
>> :drawer:
>> lksjdfksdfjl
>> sdfsdfsdf
>> :end:
>>
>> If :asd: was inserted in front of folded :drawer:, changes in :drawer:
>> line of the new folded :asd: drawer would reveal the text between
>> :drawer: and :end:.
>>
>> Let me know what you think on this.
>
> I have first to understand the use case for `modification-hook'. But
> I think unfolding is the right thing to do in this situation, isn't it?
>
>> My simplified implementation of element boundary parser
>> (org--get-element-region-at-point) appears to be much faster and also
>> uses much less memory in comparison with org-element-at-point.
>> Moreover, not all the places where org-element-at-point is called
>> actually need the full parsed element. For example, org-hide-drawer-all,
>> org-hide-drawer-toggle, org-hide-block-toggle, and
>> org--hide-wrapper-toggle only need element type and some information
>> about the element boundaries - the information we can get from
>> org--get-element-region-at-point.
>
> [...]
>
>> What do you think about the idea of making use of
>> org--get-element-region-at-point in org code base?
>
> `org--get-element-region-at-point' is certainly faster, but it is also
> wrong, unfortunately.
>
> Org syntax is not context-free grammar. If you try to parse it locally,
> starting from anywhere, it will fail at some point. For example, your
> function would choke in the following case:
>
>     [fn:1] Def1
>     #+begin_something
>
>     [fn:2] Def2
>     #+end_something
>
> AFAIK, the only proper way to parse it is to start from a known position
> in the buffer. If you have no information about the buffer, the headline
> above is the position you want. With cache could help to start below.
> Anyway, in this particular case, you should not use
> `org--get-element-region-at-point'.
>
> Hopefully, we don't need to parse anything. In an earlier message,
> I suggested a few checks to make on the modified text in order to decide
> if something should be unfolded, or not. I suggest to start from there,
> and fix any shortcomings we might encounter. We're replacing overlays:
> low-level is good in this area.
>
> WDYT?
>
>
> Regards,
>
> -- 
> Nicolas Goaziou

-- 
Ihor Radchenko,
PhD,
Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an, China
Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg


  reply	other threads:[~2020-06-05  8:24 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-24  6:55 Ihor Radchenko
2020-04-24  8:02 ` Nicolas Goaziou
2020-04-25  0:29   ` stardiviner
2020-04-26 16:04   ` Ihor Radchenko
2020-05-04 16:56     ` Karl Voit
2020-05-07  7:18       ` Karl Voit
2020-05-09 15:43       ` Ihor Radchenko
2020-05-07 11:04     ` Christian Heinrich
2020-05-09 15:46       ` Ihor Radchenko
2020-05-08 16:38     ` Nicolas Goaziou
2020-05-09 13:58       ` Nicolas Goaziou
2020-05-09 16:22         ` Ihor Radchenko
2020-05-09 17:21           ` Nicolas Goaziou
2020-05-10  5:25             ` Ihor Radchenko
2020-05-10  9:47               ` Nicolas Goaziou
2020-05-10 13:29                 ` Ihor Radchenko
2020-05-10 14:46                   ` Nicolas Goaziou
2020-05-10 16:21                     ` Ihor Radchenko
2020-05-10 16:38                       ` Nicolas Goaziou
2020-05-10 17:08                         ` Ihor Radchenko
2020-05-10 19:38                           ` Nicolas Goaziou
2020-05-09 15:40       ` Ihor Radchenko
2020-05-09 16:30         ` Ihor Radchenko
2020-05-09 17:32           ` Nicolas Goaziou
2020-05-09 18:06             ` Ihor Radchenko
2020-05-10 14:59               ` Nicolas Goaziou
2020-05-10 15:15                 ` Kyle Meyer
2020-05-10 16:30                 ` Ihor Radchenko
2020-05-10 19:32                   ` Nicolas Goaziou
2020-05-12 10:03                     ` Nicolas Goaziou
2020-05-17 15:00                     ` Ihor Radchenko
2020-05-17 15:40                       ` Ihor Radchenko
2020-05-18 14:35                         ` Nicolas Goaziou
2020-05-18 16:52                           ` Ihor Radchenko
2020-05-19 13:07                             ` Nicolas Goaziou
2020-05-23 13:52                               ` Ihor Radchenko
2020-05-23 13:53                                 ` Ihor Radchenko
2020-05-23 15:26                                   ` Ihor Radchenko
2020-05-26  8:33                                 ` Nicolas Goaziou
2020-06-02  9:21                                   ` Ihor Radchenko
2020-06-02  9:23                                     ` Ihor Radchenko
2020-06-02 12:10                                       ` Bastien
2020-06-02 13:12                                         ` Ihor Radchenko
2020-06-02 13:23                                           ` Bastien
2020-06-02 13:30                                             ` Ihor Radchenko
2020-06-02  9:25                                     ` Ihor Radchenko
2020-06-05  7:26                                     ` Nicolas Goaziou
2020-06-05  8:18                                       ` Ihor Radchenko [this message]
2020-06-05 13:50                                         ` Nicolas Goaziou
2020-06-08  5:05                                           ` Ihor Radchenko
2020-06-08  5:06                                             ` Ihor Radchenko
2020-06-08  5:08                                             ` Ihor Radchenko
2020-06-10 17:14                                             ` Nicolas Goaziou
2020-06-21  9:52                                               ` Ihor Radchenko
2020-06-21 15:01                                                 ` Nicolas Goaziou
2020-08-11  6:45                                               ` Ihor Radchenko
2020-08-11 23:07                                                 ` Kyle Meyer
2020-08-12  6:29                                                   ` Ihor Radchenko
2020-09-20  5:53                                                     ` Ihor Radchenko
2020-09-20 11:45                                                       ` Kévin Le Gouguec
2020-09-22  9:05                                                         ` Ihor Radchenko
2020-09-22 10:00                                                           ` Ihor Radchenko
2020-09-23  6:16                                                             ` Kévin Le Gouguec
2020-09-23  6:48                                                               ` Ihor Radchenko
2020-09-23  7:09                                                                 ` Bastien
2020-09-23  7:30                                                                   ` Ihor Radchenko
2020-09-24 18:07                                                                 ` Kévin Le Gouguec
2020-09-25  2:16                                                                   ` Ihor Radchenko
2020-12-15 17:38                                                                     ` [9.4] Fixing logbook visibility during isearch Kévin Le Gouguec
2020-12-16  3:15                                                                       ` Ihor Radchenko
2020-12-16 18:05                                                                         ` Kévin Le Gouguec
2020-12-17  3:18                                                                           ` Ihor Radchenko
2020-12-17 14:50                                                                             ` Kévin Le Gouguec
2020-12-18  2:23                                                                               ` Ihor Radchenko
2020-12-24 23:37                                                                                 ` Kévin Le Gouguec
2020-12-25  2:51                                                                                   ` Ihor Radchenko
2020-12-25 10:59                                                                                     ` Kévin Le Gouguec
2020-12-25 12:32                                                                                       ` Ihor Radchenko
2020-12-25 21:35                                                                                     ` Kévin Le Gouguec
2020-12-26  4:14                                                                                       ` Ihor Radchenko
2020-12-26 11:44                                                                                         ` Kévin Le Gouguec
2020-12-26 12:22                                                                                           ` Ihor Radchenko
2020-12-04  5:58                                                       ` [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers Ihor Radchenko
2021-03-21  9:09                                                         ` Ihor Radchenko
2021-05-03 17:28                                                           ` Bastien

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://orgmode.org

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87mu5iq618.fsf@localhost \
    --to=yantar92@gmail.com \
    --cc=emacs-orgmode@gnu.org \
    --cc=mail@nicolasgoaziou.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Org-mode mailing list

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://orgmode.org/list/0 list/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 list list/ https://orgmode.org/list \
		emacs-orgmode@gnu.org
	public-inbox-index list

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.yhetil.org/yhetil.emacs.orgmode
	nntp://news.gmane.io/gmane.emacs.orgmode


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git