Org-mode mailing list
 help / color / mirror / Atom feed
* [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
@ 2021-01-01 19:58 Emily Bourke
  2021-01-04  3:28 ` Kyle Meyer
  0 siblings, 1 reply; 6+ messages in thread
From: Emily Bourke @ 2021-01-01 19:58 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 474 bytes --]

Hi,

I found publishing when there were no changes to be slower than expected. Profiling showed me that `org-publish-cache-file-needs-publishing' was invoking the `after-find-file' hooks, which I don't think is necessary.

I've changed it to avoid doing that, by using `with-temp-buffer' and `insert-file-contents', and noticed a significant increase in speed.

Is there any reason I'm missing for using `find-file-noselect' in this case?

Best wishes,
Emily Bourke

[-- Attachment #2: 0001-ox-publish.el-Speed-up-org-publish-cache-file-needs-.patch --]
[-- Type: application/octet-stream, Size: 2742 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-01 19:58 [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing Emily Bourke
@ 2021-01-04  3:28 ` Kyle Meyer
  2021-01-06 20:58   ` Emily Bourke
  2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  0 siblings, 2 replies; 6+ messages in thread
From: Kyle Meyer @ 2021-01-04  3:28 UTC (permalink / raw)
  To: Emily Bourke; +Cc: emacs-orgmode

Thank you for the patch.

Emily Bourke writes:

> I found publishing when there were no changes to be slower than
> expected. Profiling showed me that
> `org-publish-cache-file-needs-publishing' was invoking the
> `after-find-file' hooks, which I don't think is necessary.
>
> I've changed it to avoid doing that, by using `with-temp-buffer' and
> `insert-file-contents', and noticed a significant increase in speed.
>
> Is there any reason I'm missing for using `find-file-noselect' in this
> case?

Nothing jumps out to me.  For large files that are already visited, I
suppose find-file-noselect returning an existing buffer can be faster,
so relevant factors would include how many Org files a project has, how
large they are, and how many of those are visited in the current
session.  My guess is that using with-temp-buffer and
insert-file-contents would be a net gain, though that gain would be
narrowed some if the temporary buffer was put into org-mode rather than
kept in fundamental-mode (more below).

> Subject: [PATCH] ox-publish.el: Speed up
>  org-publish-cache-file-needs-publishing
>
> * lisp/ox-publish.el (org-publish-cache-file-needs-publishing): Use
> `with-temp-buffer' with `insert-file-contents' instead of
> `find-file-noselect'.  This avoids running the `after-find-file' hook,
> which can make it significantly faster.

This reads to me like after-find-file is the hook itself.  Perhaps
something like this would be clearer: "... avoids calling
after-find-file and running find-file-hook, ...".

> diff --git a/lisp/ox-publish.el b/lisp/ox-publish.el
> index 7bb2fed6e..e967286cf 100644
> --- a/lisp/ox-publish.el
> +++ b/lisp/ox-publish.el
> @@ -1290,29 +1290,26 @@ the file including them will be republished as well."
>  	 (org-inhibit-startup t)
>  	 included-files-ctime)
>      (when (equal (file-name-extension filename) "org")
> -      (let ((visiting (find-buffer-visiting filename))
> -	    (buf (find-file-noselect filename))
> -	    (case-fold-search t))
> -	(unwind-protect
> -	    (with-current-buffer buf
> -	      (goto-char (point-min))
> -	      (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
> -		(let ((element (org-element-at-point)))
> -		  (when (eq 'keyword (org-element-type element))
> -		    (let* ((value (org-element-property :value element))
> -			   (filename
> -			    (and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
> -				 (let ((m (org-strip-quotes
> -					   (match-string 1 value))))
> -				   ;; Ignore search suffix.
> -				   (if (string-match "::.*?\\'" m)
> -				       (substring m 0 (match-beginning 0))
> -				     m)))))
> -		      (when filename
> -			(push (org-publish-cache-ctime-of-src
> -			       (expand-file-name filename))
> -			      included-files-ctime)))))))
> -	  (unless visiting (kill-buffer buf)))))
> +      (let ((case-fold-search t))
> +	(with-temp-buffer
> +	  (insert-file-contents filename)
> +	  (goto-char (point-min))

The goto-char call can be dropped now because insert-file-contents inserts
after point.

Unlike the previous code, this doesn't activate org-mode in the buffer.
That gives a speedup.  And I don't spot any code downstream that depends
on the major mode being org-mode, so it's probably safe, though perhaps
there's a subtle change in behavior here (e.g., related to syntax
table).

If org-mode isn't called, the org-inhibit-startup binding above could be
dropped.

> +	  (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
> +	    (let ((element (org-element-at-point)))
> +	      (when (eq 'keyword (org-element-type element))
> +		(let* ((value (org-element-property :value element))
> +		       (filename
> +			(and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
> +			     (let ((m (org-strip-quotes
> +				       (match-string 1 value))))
> +			       ;; Ignore search suffix.
> +			       (if (string-match "::.*?\\'" m)
> +				   (substring m 0 (match-beginning 0))
> +				 m)))))
> +		  (when filename
> +		    (push (org-publish-cache-ctime-of-src
> +			   (expand-file-name filename))

This introduces a regression.  With the previous code, the
find-file-noselect call led to default-directory being set to the Org
file's directory, and then this expand-file call on the included file
was relative to that.  With the new code, default-directory isn't
changed, so it points to a non-existing or incorrect file unless the
current default-directory and the Org file's happen to match.

> +			  included-files-ctime)))))))))


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-04  3:28 ` Kyle Meyer
@ 2021-01-06 20:58   ` Emily Bourke
  2021-05-01 10:22     ` Bastien
  2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  1 sibling, 1 reply; 6+ messages in thread
From: Emily Bourke @ 2021-01-06 20:58 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: emacs-orgmode

Thanks for the feedback!

> Nothing jumps out to me. For large files that are already visited, I
> suppose find-file-noselect returning an existing buffer can be faster,
> so relevant factors would include how many Org files a project has, how
> large they are, and how many of those are visited in the current
> session. My guess is that using with-temp-buffer and
> insert-file-contents would be a net gain, though that gain would be
> narrowed some if the temporary buffer was put into org-mode rather than
> kept in fundamental-mode (more below).

I'll do some testing with some large org files and see how things compare – it might be worth switching to the buffer for the file if there is one already.

> This reads to me like after-find-file is the hook itself. Perhaps
> something like this would be clearer: "... avoids calling
> after-find-file and running find-file-hook, ...".

Ah yes, I had misunderstood – I'll rephrase it.

> The goto-char call can be dropped now because insert-file-contents inserts
> after point.

I follow, will remove.

> Unlike the previous code, this doesn't activate org-mode in the buffer.
> That gives a speedup. And I don't spot any code downstream that depends
> on the major mode being org-mode, so it's probably safe, though perhaps
> there's a subtle change in behavior here (e.g., related to syntax
> table).
>
> If org-mode isn't called, the org-inhibit-startup binding above could be
> dropped.

Yeah, if you're worried about it I could try manually activating org mode in the temp buffer – I'm not confident I could predict any problems there might be from not activating it.

> This introduces a regression. With the previous code, the
> find-file-noselect call led to default-directory being set to the Org
> file's directory, and then this expand-file call on the included file
> was relative to that. With the new code, default-directory isn't
> changed, so it points to a non-existing or incorrect file unless the
> current default-directory and the Org file's happen to match.

Ah, I hadn't noticed this – I'll change it to set default-directory manually.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-04  3:28 ` Kyle Meyer
  2021-01-06 20:58   ` Emily Bourke
@ 2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  1 sibling, 0 replies; 6+ messages in thread
From: Dr. Arne Babenhauserheide @ 2021-01-07  1:11 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: Emily Bourke, emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]


Kyle Meyer <kyle@kyleam.com> writes:

> Nothing jumps out to me.  For large files that are already visited, I
> suppose find-file-noselect returning an existing buffer can be faster,
> so relevant factors would include how many Org files a project has, how
> large they are, and how many of those are visited in the current
> session.  My guess is that using with-temp-buffer and
> insert-file-contents would be a net gain, though that gain would be
> narrowed some if the temporary buffer was put into org-mode rather than
> kept in fundamental-mode (more below).

If you want to test a rather complex setup, you can try my website build:

hg clone http://hg.sr.ht/~arnebab/draketo; cd draketo/; autoreconf -i; ./configure ; time make

It takes around 3 minutes to build on my machine.

(hg is Mercurial)

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1125 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-06 20:58   ` Emily Bourke
@ 2021-05-01 10:22     ` Bastien
  2021-05-05 18:55       ` Emily Bourke
  0 siblings, 1 reply; 6+ messages in thread
From: Bastien @ 2021-05-01 10:22 UTC (permalink / raw)
  To: Emily Bourke; +Cc: Kyle Meyer, emacs-orgmode

Hi Emily,

thanks for the patch and sorry to reactivate this old thread.

Did you find time to make the tests and, perhaps, to update the patch?

Please let us know - thanks!

-- 
 Bastien


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-05-01 10:22     ` Bastien
@ 2021-05-05 18:55       ` Emily Bourke
  0 siblings, 0 replies; 6+ messages in thread
From: Emily Bourke @ 2021-05-05 18:55 UTC (permalink / raw)
  To: Bastien; +Cc: Kyle Meyer, emacs-orgmode

Hi Bastien,

> thanks for the patch and sorry to reactivate this old thread.

No problem, thanks for getting in touch - the reminder is appreciated.

> Did you find time to make the tests and, perhaps, to update the patch?

I'm afraid I haven't had a chance to look at this any further since my last email. I'll try to find some time this week.

Best wishes,
Emily


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-05-05 18:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-01 19:58 [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing Emily Bourke
2021-01-04  3:28 ` Kyle Meyer
2021-01-06 20:58   ` Emily Bourke
2021-05-01 10:22     ` Bastien
2021-05-05 18:55       ` Emily Bourke
2021-01-07  1:11   ` Dr. Arne Babenhauserheide

Org-mode mailing list

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://orgmode.org/list/0 list/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 list list/ https://orgmode.org/list \
		emacs-orgmode@gnu.org
	public-inbox-index list

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.yhetil.org/yhetil.emacs.orgmode
	nntp://news.gmane.io/gmane.emacs.orgmode


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git