Org-mode mailing list
 help / color / mirror / Atom feed
* [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
@ 2021-01-01 19:58 Emily Bourke
  2021-01-04  3:28 ` Kyle Meyer
  0 siblings, 1 reply; 4+ messages in thread
From: Emily Bourke @ 2021-01-01 19:58 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 474 bytes --]

Hi,

I found publishing when there were no changes to be slower than expected. Profiling showed me that `org-publish-cache-file-needs-publishing' was invoking the `after-find-file' hooks, which I don't think is necessary.

I've changed it to avoid doing that, by using `with-temp-buffer' and `insert-file-contents', and noticed a significant increase in speed.

Is there any reason I'm missing for using `find-file-noselect' in this case?

Best wishes,
Emily Bourke

[-- Attachment #2: 0001-ox-publish.el-Speed-up-org-publish-cache-file-needs-.patch --]
[-- Type: application/octet-stream, Size: 2742 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-01 19:58 [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing Emily Bourke
@ 2021-01-04  3:28 ` Kyle Meyer
  2021-01-06 20:58   ` Emily Bourke
  2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  0 siblings, 2 replies; 4+ messages in thread
From: Kyle Meyer @ 2021-01-04  3:28 UTC (permalink / raw)
  To: Emily Bourke; +Cc: emacs-orgmode

Thank you for the patch.

Emily Bourke writes:

> I found publishing when there were no changes to be slower than
> expected. Profiling showed me that
> `org-publish-cache-file-needs-publishing' was invoking the
> `after-find-file' hooks, which I don't think is necessary.
>
> I've changed it to avoid doing that, by using `with-temp-buffer' and
> `insert-file-contents', and noticed a significant increase in speed.
>
> Is there any reason I'm missing for using `find-file-noselect' in this
> case?

Nothing jumps out to me.  For large files that are already visited, I
suppose find-file-noselect returning an existing buffer can be faster,
so relevant factors would include how many Org files a project has, how
large they are, and how many of those are visited in the current
session.  My guess is that using with-temp-buffer and
insert-file-contents would be a net gain, though that gain would be
narrowed some if the temporary buffer was put into org-mode rather than
kept in fundamental-mode (more below).

> Subject: [PATCH] ox-publish.el: Speed up
>  org-publish-cache-file-needs-publishing
>
> * lisp/ox-publish.el (org-publish-cache-file-needs-publishing): Use
> `with-temp-buffer' with `insert-file-contents' instead of
> `find-file-noselect'.  This avoids running the `after-find-file' hook,
> which can make it significantly faster.

This reads to me like after-find-file is the hook itself.  Perhaps
something like this would be clearer: "... avoids calling
after-find-file and running find-file-hook, ...".

> diff --git a/lisp/ox-publish.el b/lisp/ox-publish.el
> index 7bb2fed6e..e967286cf 100644
> --- a/lisp/ox-publish.el
> +++ b/lisp/ox-publish.el
> @@ -1290,29 +1290,26 @@ the file including them will be republished as well."
>  	 (org-inhibit-startup t)
>  	 included-files-ctime)
>      (when (equal (file-name-extension filename) "org")
> -      (let ((visiting (find-buffer-visiting filename))
> -	    (buf (find-file-noselect filename))
> -	    (case-fold-search t))
> -	(unwind-protect
> -	    (with-current-buffer buf
> -	      (goto-char (point-min))
> -	      (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
> -		(let ((element (org-element-at-point)))
> -		  (when (eq 'keyword (org-element-type element))
> -		    (let* ((value (org-element-property :value element))
> -			   (filename
> -			    (and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
> -				 (let ((m (org-strip-quotes
> -					   (match-string 1 value))))
> -				   ;; Ignore search suffix.
> -				   (if (string-match "::.*?\\'" m)
> -				       (substring m 0 (match-beginning 0))
> -				     m)))))
> -		      (when filename
> -			(push (org-publish-cache-ctime-of-src
> -			       (expand-file-name filename))
> -			      included-files-ctime)))))))
> -	  (unless visiting (kill-buffer buf)))))
> +      (let ((case-fold-search t))
> +	(with-temp-buffer
> +	  (insert-file-contents filename)
> +	  (goto-char (point-min))

The goto-char call can be dropped now because insert-file-contents inserts
after point.

Unlike the previous code, this doesn't activate org-mode in the buffer.
That gives a speedup.  And I don't spot any code downstream that depends
on the major mode being org-mode, so it's probably safe, though perhaps
there's a subtle change in behavior here (e.g., related to syntax
table).

If org-mode isn't called, the org-inhibit-startup binding above could be
dropped.

> +	  (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
> +	    (let ((element (org-element-at-point)))
> +	      (when (eq 'keyword (org-element-type element))
> +		(let* ((value (org-element-property :value element))
> +		       (filename
> +			(and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
> +			     (let ((m (org-strip-quotes
> +				       (match-string 1 value))))
> +			       ;; Ignore search suffix.
> +			       (if (string-match "::.*?\\'" m)
> +				   (substring m 0 (match-beginning 0))
> +				 m)))))
> +		  (when filename
> +		    (push (org-publish-cache-ctime-of-src
> +			   (expand-file-name filename))

This introduces a regression.  With the previous code, the
find-file-noselect call led to default-directory being set to the Org
file's directory, and then this expand-file call on the included file
was relative to that.  With the new code, default-directory isn't
changed, so it points to a non-existing or incorrect file unless the
current default-directory and the Org file's happen to match.

> +			  included-files-ctime)))))))))


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-04  3:28 ` Kyle Meyer
@ 2021-01-06 20:58   ` Emily Bourke
  2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  1 sibling, 0 replies; 4+ messages in thread
From: Emily Bourke @ 2021-01-06 20:58 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: emacs-orgmode

Thanks for the feedback!

> Nothing jumps out to me. For large files that are already visited, I
> suppose find-file-noselect returning an existing buffer can be faster,
> so relevant factors would include how many Org files a project has, how
> large they are, and how many of those are visited in the current
> session. My guess is that using with-temp-buffer and
> insert-file-contents would be a net gain, though that gain would be
> narrowed some if the temporary buffer was put into org-mode rather than
> kept in fundamental-mode (more below).

I'll do some testing with some large org files and see how things compare – it might be worth switching to the buffer for the file if there is one already.

> This reads to me like after-find-file is the hook itself. Perhaps
> something like this would be clearer: "... avoids calling
> after-find-file and running find-file-hook, ...".

Ah yes, I had misunderstood – I'll rephrase it.

> The goto-char call can be dropped now because insert-file-contents inserts
> after point.

I follow, will remove.

> Unlike the previous code, this doesn't activate org-mode in the buffer.
> That gives a speedup. And I don't spot any code downstream that depends
> on the major mode being org-mode, so it's probably safe, though perhaps
> there's a subtle change in behavior here (e.g., related to syntax
> table).
>
> If org-mode isn't called, the org-inhibit-startup binding above could be
> dropped.

Yeah, if you're worried about it I could try manually activating org mode in the temp buffer – I'm not confident I could predict any problems there might be from not activating it.

> This introduces a regression. With the previous code, the
> find-file-noselect call led to default-directory being set to the Org
> file's directory, and then this expand-file call on the included file
> was relative to that. With the new code, default-directory isn't
> changed, so it points to a non-existing or incorrect file unless the
> current default-directory and the Org file's happen to match.

Ah, I hadn't noticed this – I'll change it to set default-directory manually.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-04  3:28 ` Kyle Meyer
  2021-01-06 20:58   ` Emily Bourke
@ 2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  1 sibling, 0 replies; 4+ messages in thread
From: Dr. Arne Babenhauserheide @ 2021-01-07  1:11 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: Emily Bourke, emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]


Kyle Meyer <kyle@kyleam.com> writes:

> Nothing jumps out to me.  For large files that are already visited, I
> suppose find-file-noselect returning an existing buffer can be faster,
> so relevant factors would include how many Org files a project has, how
> large they are, and how many of those are visited in the current
> session.  My guess is that using with-temp-buffer and
> insert-file-contents would be a net gain, though that gain would be
> narrowed some if the temporary buffer was put into org-mode rather than
> kept in fundamental-mode (more below).

If you want to test a rather complex setup, you can try my website build:

hg clone http://hg.sr.ht/~arnebab/draketo; cd draketo/; autoreconf -i; ./configure ; time make

It takes around 3 minutes to build on my machine.

(hg is Mercurial)

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1125 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-01-07  1:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-01 19:58 [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing Emily Bourke
2021-01-04  3:28 ` Kyle Meyer
2021-01-06 20:58   ` Emily Bourke
2021-01-07  1:11   ` Dr. Arne Babenhauserheide

Org-mode mailing list

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://orgmode.org/list/0 list/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 list list/ https://orgmode.org/list \
		emacs-orgmode@gnu.org
	public-inbox-index list

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.yhetil.org/yhetil.emacs.orgmode
	nntp://news.gmane.io/gmane.emacs.orgmode


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git