From: Paul Sexton
Subject: Re: Context-sensitive word count in org mode (elisp)
Date: Sun, 20 Feb 2011 21:49:16 +0000 (UTC)
To: emacs-orgmode@gnu.org
References: <87zkptbee7.fsf@gnu.org>

Bastien writes:

> #+begin_src emacs-lisp
> (when (looking-at org-bracket-link-analytic-regexp)
>   (match-string-no-properties 5))
> #+end_src emacs-lisp

Thanks. Here is version 3 of the function, which can now count words in
link descriptions. The code that advances to the next word has been moved
to the end of the loop, which improves accuracy.

Paul

----------------------------------------------------------------------

(defun org-word-count (beg end
                           &optional count-latex-macro-args?
                           count-footnotes?)
  "Report the number of words in the Org mode buffer or selected region.
Ignores:
- comments
- tables
- source code blocks (#+BEGIN_SRC ... #+END_SRC, and inline blocks)
- hyperlinks (but does count words in hyperlink descriptions)
- tags, priorities, and TODO keywords in headers
- sections tagged as 'not for export'.

The text of footnote definitions is ignored, unless the optional argument
COUNT-FOOTNOTES? is non-nil.

If the optional argument COUNT-LATEX-MACRO-ARGS? is non-nil, the word count
includes LaTeX macro arguments (the material between {curly braces}).
Otherwise, and by default, every LaTeX macro counts as 1 word regardless
of its arguments."
  (interactive "r")
  (unless mark-active
    (setf beg (point-min)
          end (point-max)))
  (let ((wc 0)
        (latex-macro-regexp "\\\\[A-Za-z]+\\(\\[[^]]*\\]\\|\\){\\([^}]*\\)}"))
    (save-excursion
      (goto-char beg)
      (while (< (point) end)
        (cond
         ;; Ignore comments and tables.
         ((or (org-in-commented-line) (org-at-table-p))
          nil)
         ;; Ignore hyperlinks. But if link has a description, count
         ;; the words within the description.
         ((looking-at org-bracket-link-analytic-regexp)
          (when (match-string-no-properties 5)
            (let ((desc (match-string-no-properties 5)))
              (save-match-data
                (incf wc (length (remove "" (org-split-string
                                             desc "\\W")))))))
          (goto-char (match-end 0)))
         ((looking-at org-any-link-re)
          (goto-char (match-end 0)))
         ;; Ignore source code blocks.
         ((org-in-regexps-block-p "^#\\+BEGIN_SRC\\W" "^#\\+END_SRC\\W")
          nil)
         ;; Ignore inline source blocks, counting them as 1 word.
         ((save-excursion
            (backward-char)
            (looking-at org-babel-inline-src-block-regexp))
          (goto-char (match-end 0))
          (setf wc (+ 2 wc)))
         ;; Count latex macros as 1 word, ignoring their arguments.
         ((save-excursion
            (backward-char)
            (looking-at latex-macro-regexp))
          (goto-char (if count-latex-macro-args?
                         (match-beginning 2)
                       (match-end 0)))
          (setf wc (+ 2 wc)))
         ;; Ignore footnotes.
         ((and (not count-footnotes?)
               (or (org-footnote-at-definition-p)
                   (org-footnote-at-reference-p)))
          nil)
         (t
          (let ((contexts (org-context)))
            (cond
             ;; Ignore tags and TODO keywords, etc.
             ((or (assoc :todo-keyword contexts)
                  (assoc :priority contexts)
                  (assoc :keyword contexts)
                  (assoc :checkbox contexts))
              nil)
             ;; Ignore sections marked with tags that are
             ;; excluded from export.
             ((assoc :tags contexts)
              (if (intersection (org-get-tags-at) org-export-exclude-tags
                                :test 'equal)
                  (org-forward-same-level 1)
                nil))
             (t
              (incf wc))))))
        (re-search-forward "\\w+\\W*")))
    (message (format "%d words in %s." wc
                     (if mark-active "region" "buffer")))))
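
The link-description branch of org-word-count counts words by splitting the description on non-word characters and discarding the empty pieces. Here is a minimal standalone sketch of that one step, for trying it outside the function. The helper name my/count-words-in-string is hypothetical and not part of the code above, and the built-in split-string (with its OMIT-NULLS argument) is swapped in for org-split-string plus remove:

```emacs-lisp
;; Hypothetical helper, not part of org-word-count: it counts the words
;; in a plain string the same way the link-description branch does.
(defun my/count-words-in-string (str)
  "Return the number of words in STR.
STR is split on runs of non-word characters; empty pieces are dropped."
  ;; A non-nil third argument makes `split-string' omit empty strings,
  ;; which stands in for the explicit (remove \"\" ...) step.
  (length (split-string str "\\W+" t)))

;; Example call with a three-word link description:
(my/count-words-in-string "the Org manual")  ; => 3
```

What counts as a word character depends on the syntax table of the current buffer, so results for text containing apostrophes or hyphens may differ between major modes.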