From: Utkarsh Singh
To: Nicolas Goaziou
Cc: 47885@debbugs.gnu.org, emacs-orgmode@gnu.org
Subject: Re: [PATCH] org-table-import: Make it more smarter for interactive use
In-Reply-To: <87im4h9irn.fsf@nicolasgoaziou.fr>
References: <87czuq9958.fsf@gmail.com> <8735vmelfs.fsf@nicolasgoaziou.fr> <87k0oyfj4y.fsf@gmail.com> <87im4h9irn.fsf@nicolasgoaziou.fr>
Date: Tue, 20 Apr 2021 22:45:22 +0530
Message-ID: <87r1j4ri6t.fsf@gmail.com>

Hi,

On 2021-04-20, 15:40 +0200, Nicolas Goaziou wrote:

> For the problem we're trying to solve, this sounds like over-engineering
> to me. Do we want so badly to guess a separator?

Earlier I took it as an assignment to learn Elisp, but now I don't think
we should increase complexity this much.

> Thinking again about it, this needs extra care, as end0 might end up on
> an empty line. You tried to avoid this in your first function, but
> I think this was not sufficient either. Actually, beg0 could also start
> on an empty line.
>
> This needs to be tested extensively, but as a first approximation,
> I think `beg' needs to be defined as:
>
>   (save-excursion
>     (goto-char (min beg0 end0))
>     (skip-chars-forward " \t\n")
>     (if (eobp) (point) (line-beginning-position)))
>
> and `end' as
>
>   (save-excursion
>     (goto-char (max beg end0))
>     (skip-chars-backward " \t\n" beg)
>     (if (= beg (point)) (point) (line-end-position)))
>
> Then you need to bail out if beg = end.
>
>>   (sep-rexp '(("," "^[^\n,]+$")
>
> sep-rexp -> sep-regexp
>
>>               ("\t" "^[^\n\t]+$")
>>               (";" "^[^\n;]+$")
>>               (":" "^[^\n:]+$")
>>               (" " "^\\([^'\"][^\n\s][^'\"]\\)+$")))
>
> At this point, I suggest to use `rx' macro instead.
>
> I suggest this (yes, I like pattern-matching, `car' and `cdr' are so
> 80's) instead:
>
>   (save-excursion
>     (goto-char beg)
>     (catch :found
>       (pcase-dolist (`(,sep ,regexp) sep-regexp)
>         (save-excursion
>           (unless (re-search-forward regexp end t)
>             (throw :found sep))))
>       nil))
>

Thanks! I was not aware of the `pcase-dolist' macro.

Here is the function after making the necessary changes:

(defun org-table-guess-separator (beg0 end0)
  "Guess separator for `org-table-convert-region' for region BEG0 to END0.

List of preferred separator: comma, TAB, semicolon, colon or SPACE.

If region contains a line which doesn't contain the required
separator then discard the separator and search again using next
separator."
  (let* ((beg (save-excursion
                (goto-char (min beg0 end0))
                (skip-chars-forward " \t\n")
                (if (eobp) (point) (line-beginning-position))))
         (end (save-excursion
                (goto-char (max beg end0))
                (skip-chars-backward " \t\n" beg)
                (if (= beg (point)) (point) (line-end-position))))
         (sep-regexp '(("," (rx bol (1+ (not (or ?\n ?,))) eol))
                       ("\t" (rx bol (1+ (not (or ?\n ?\t))) eol))
                       (";" (rx bol (1+ (not (or ?\n ?\;))) eol))
                       (":" (rx bol (1+ (not (or ?\n ?:))) eol))
                       (" " (rx bol (1+ (not (or ?' ?\"))
                                        (not (or ?\s ?\;))
                                        (not (or ?' ?\")))
                                eol))))
         sep)
    (unless (= beg end)
      (save-excursion
        (goto-char beg)
        (catch :found
          (pcase-dolist (`(,sep ,regexp) sep-regexp)
            (save-excursion
              (unless (re-search-forward (eval regexp) end t)
                (throw :found sep))))
          nil)))))

> Again all this needs to extensively tested, as there are a lot of
> dangers lurking around.

Summary of things that still require a review:

+ Setting the boundaries right.
+ When using SPACE as the separator, is it sufficient to check for all
  non-quoted SPACEs?

-- 
Utkarsh Singh
http://utkarshsingh.xyz