emacs-orgmode@gnu.org archives
From: James Harkins <jamshark70@zoho.com>
To: "emacs-orgmode" <emacs-orgmode@gnu.org>
Subject: Bug: ODT export of Chinese text inserts spaces for line breaks
Date: Tue, 29 Jun 2021 11:47:06 +0800
Message-ID: <17a55e0b01d.11be78c6c72761.7557666657037565597@zoho.com>

Consider the following org document.

* Test
1本人不想亲自拿到学历学位证书、急于离校者,可书面委托他人代领学历学位证
书,29日起即可离校;2本人想亲自领取学历学位证书者,按学校规定的程序及有关
要求办理离校手续,领取相关证书后离校;

This was produced by pasting in a single long line and then refilling the paragraph with M-q (alt-Q). That's a normal thing to do, and good for readability, because org-mode doesn't wrap lines by default.

Exporting to ODT produces the following (body text, omitting titles, headers and such).

1本人不想亲自拿到学历学位证书、急于离校者,可书面委托他人代领学历学位证 书,29日起即可离校;2本人想亲自领取学历学位证书者,按学校规定的程序及有关 要求办理离校手续,领取相关证书后离校;

Between 证 and 书, and between 关 and 要, there is a space. Chinese typography does not allow for spaces mid-sentence.

So, it would make sense to add a rule to the exporter: if one of the characters before or after a source-text line break is a Chinese, Japanese or Korean character, do not add a space. (The space is valid, of course, if the characters on either side of the line breaks are Roman or [I would guess] Cyrillic as well.)
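To make the proposed rule concrete, here is a rough sketch in Python (not Org's actual exporter code, which is Emacs Lisp). The function name `join_soft_breaks` and the list of code-point ranges are my own assumptions for illustration; the ranges are only an approximation of "CJK character", and the sketch assumes lines carry no trailing whitespace.

```python
import re

# Approximate CJK coverage (an assumption, not an exhaustive list).
_CJK = (
    "\u2e80-\u2fdf"  # CJK radicals
    "\u3000-\u30ff"  # CJK punctuation, hiragana, katakana
    "\u3400-\u4dbf"  # Han extension A
    "\u4e00-\u9fff"  # Han unified ideographs
    "\uac00-\ud7af"  # Hangul syllables
    "\uf900-\ufaff"  # CJK compatibility ideographs
    "\uff00-\uffef"  # halfwidth and fullwidth forms
)

# A newline with a CJK character immediately before it OR immediately after it.
_BREAK_NEXT_TO_CJK = re.compile(f"(?<=[{_CJK}])\n|\n(?=[{_CJK}])")


def join_soft_breaks(paragraph: str) -> str:
    """Join the source lines of one paragraph (hypothetical helper).

    Line breaks adjacent to a CJK character are dropped; all other
    line breaks become a single space, as they do today.
    """
    without_cjk_breaks = _BREAK_NEXT_TO_CJK.sub("", paragraph)
    return without_cjk_breaks.replace("\n", " ")


if __name__ == "__main__":
    print(join_soft_breaks("可书面委托他人代领学历学位证\n书,29日起即可离校"))
    print(join_soft_breaks("Roman text\nstill gets a space"))
```

This follows the "one of the characters" wording above, so a break between a Roman word and a CJK character is also joined without a space. Requiring CJK on both sides would be a stricter variant; which one is right for mixed text is a judgment call.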

(Side note: Exporting to a LaTeX buffer shows that the line breaks have been copied into the .tex document as is -- but, provided that you have `\usepackage{xeCJK}` in the preamble, LaTeX produces correct, space-free output. So Org "gets away with it" because of LaTeX's handling of CJK text. For ODT, it seems Org needs to handle the spacing within its own logic.)

This is org 9.1.9... bit old, I know, but I'm gonna take a wild guess that this has not been a high-visibility issue.

hjh



Thread overview: 7+ messages
2021-06-29  3:47 James Harkins [this message]
2021-06-29  4:43 ` Re:Bug: ODT export of Chinese text inserts spaces for line breaks tumashu
2021-06-29 17:01   ` Bug: " Maxim Nikulin
2021-06-29 18:19     ` Eric Abrahamsen
2021-06-30 12:22       ` Maxim Nikulin
2022-10-08 13:14         ` Ihor Radchenko
2022-10-21  5:38           ` Ihor Radchenko
