From: Carsten Dominik
Subject: Re: ...
Date: Thu, 31 Jan 2013 12:59:29 +0100
To: Bastien
Cc: Christopher Schmidt, "emacs-orgmode@gnu.org Mailing List"
List-Id: "General discussions about Org-mode."

Hi Bastien,

as you know, regular expressions are a language for programmed text
search. The pattern string has to be compiled before it can be used.
That compilation is costly, so most languages that have pattern matching
keep some kind of cache of compiled patterns, so that frequently used
patterns can be reused without recompilation.

I know this well from studying Perl. In Perl, a compiled pattern is
associated with a particular instance of a string.
Often you build a pattern by concatenating other parts. In Perl this
means that the pattern is recompiled on each match. You can work around
this in Perl by telling it explicitly, on the programmer's authority:
"yes, this pattern is dynamically constructed, but only once; I
guarantee that it will not change, so compile it only once".

So in Perl the difference is

  /pattern/     will match against pattern
  /$pattern/    will match against the pattern contained in the
                variable $pattern, and recompilation will occur
                each time
  /$pattern/o   will compile only once and trust the programmer.

So I am very aware of this speedup issue. I thought that in Emacs the
caching would work by associating a specific string object with the
compiled pattern. But the code Christopher pointed out seems to suggest
that the pattern cache also works for strings that are `equal', not only
for strings that are `eq'.

If this is the case, there is only a very small difference between

  (defconst my-pattern (concat "^" "xyz"))
  (re-search-forward my-pattern ....) ; many times in different functions

and

  (defconst my-partial-pattern "xyz")
  (re-search-forward (concat "^" my-partial-pattern) ....) ; many times

The difference is only the repeated concatenation operation, not the
recompilation. I always thought that this would work differently, and
that is why a lot of regexps get constructed and then stored in
variables or constants. Of course this is also good practice for
readable and maintainable code, but the impact on efficiency is not as
big as I used to think. So when I saw Christopher's initial patch, I
thought a function to create

  org-outline-regexp-bol

would be a large burden on speed, but it now seems that it would have
only a minor impact.

Still, I think making a local variable in buffers with org-struct-mode
is also a good way to get the functionality Christopher wants.

Clearer?
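
For anyone who wants to check this on their own setup, here is a minimal
sketch that compares the two styles above. The my-demo-* names are made
up for illustration and are not part of Org. It only shows that the two
regexp strings are `equal' without being `eq' and that both styles find
the same matches; the timings reported by `benchmark-run' depend on the
machine and Emacs version, so treat any difference as indicative only.

```elisp
;; Sketch only: every my-demo-* name is hypothetical, not part of Org.
(require 'benchmark)

(defconst my-demo-pattern (concat "^" "xyz")
  "Regexp string built once, when this form is evaluated.")

(defconst my-demo-partial-pattern "xyz"
  "Partial regexp; callers prepend the anchor on every use.")

(defun my-demo-count-matches (make-regexp)
  "Count regexp matches in a scratch buffer.
MAKE-REGEXP is a function called before every search to obtain
the regexp string."
  (with-temp-buffer
    ;; 1000 lines that match "^xyz", interleaved with 1000 that do not.
    (dotimes (_ 1000) (insert "xyz\nabc\n"))
    (goto-char (point-min))
    (let ((count 0))
      (while (re-search-forward (funcall make-regexp) nil t)
        (setq count (1+ count)))
      count)))

;; A freshly concatenated string is a new object with the same content.
(eq my-demo-pattern (concat "^" my-demo-partial-pattern))    ; => nil
(equal my-demo-pattern (concat "^" my-demo-partial-pattern)) ; => t

;; Both styles find the same matches.
(my-demo-count-matches (lambda () my-demo-pattern))
(my-demo-count-matches
 (lambda () (concat "^" my-demo-partial-pattern)))

;; Each call returns (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
(benchmark-run 20
  (my-demo-count-matches (lambda () my-demo-pattern)))
(benchmark-run 20
  (my-demo-count-matches
   (lambda () (concat "^" my-demo-partial-pattern))))
```

If the cache really is keyed on string content, the second timing should
differ from the first mainly by the cost of the repeated `concat' calls.
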
- Carsten

On 31 jan. 2013, at 12:22, Bastien wrote:

> Hi Carsten and Christopher,
>
> Carsten Dominik writes:
>
>> I meant to copy the list, I am doing this again now.
>>
>> Wow, I was not aware that Emacs caches by content, this is an important
>> piece of information. I guess this removed the main concern I had. Thanks
>> for looking it up in the code and showing it to me. I am not sure if I
>> understand that code completely, but I trust your judgment.
>
> I'm not sure I have all the background to understand the issue at
> stake... can anyone educate me? Thanks!
>
> --
> Bastien

--
There is no unscripted life. Only a badly scripted one.
                                        -- Brothers Bloom