From mboxrd@z Thu Jan 1 00:00:00 1970
From: Utkarsh Singh
To: Nicolas Goaziou
Subject: Re: [PATCH] org-table-import: Make it more smarter for interactive use
References: <87czuq9958.fsf@gmail.com> <8735vmelfs.fsf@nicolasgoaziou.fr> <87k0oyfj4y.fsf@gmail.com> <87im4h9irn.fsf@nicolasgoaziou.fr>
Date: Fri, 23 Apr 2021 10:28:24 +0530
In-Reply-To: <87im4h9irn.fsf@nicolasgoaziou.fr> (Nicolas Goaziou's message of "Tue, 20 Apr 2021 15:40:12 +0200")
Message-ID: <87zgxpwqa7.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
List-Id: "General discussions about Org-mode."
Cc: 47885@debbugs.gnu.org, emacs-orgmode@gnu.org

--=-=-=
Content-Type: text/plain

Hi,

On 2021-04-20, 15:40 +0200, Nicolas Goaziou wrote:

> Again all this needs to extensively tested, as there are a lot of
> dangers lurking around.

I am attaching my patch, which also includes my previous suggestion of
adding a yes-or-no prompt to org-table-import so that files without a
csv, tsv or txt extension can still be imported.

Here are some concerns which require your attention:

+ When using org-table-import interactively, if we fail to guess the
  separator we are left with a user-error message and an 'unconverted
  table'.  We could make use of a 'temp-buffer' and insert the file
  only after a successful conversion.

+ The conversion part of org-table-convert-region makes a distinction
  between '(4) (comma separator) and the rest of the separators.  We
  should either use the string version of comma as an AND condition or
  rewrite it to simplify it.

I am willing to make these changes, but I am currently waiting for your
review of org-table-guess-separator, as there may be more serious bugs
lurking in my code, which I am treating as the base for these changes.
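
The temporary-buffer idea from the first point could look roughly like
the sketch below.  It is not part of the attached patch:
`my/org-table-import-via-temp-buffer' is a hypothetical name, and the
sketch assumes the stock `org-table-convert-region' is called with an
explicit separator such as '(4).

```elisp
(require 'org-table)

;; Hypothetical helper, not from the patch: convert FILE in a scratch
;; buffer so that a failed conversion never touches the caller's buffer.
(defun my/org-table-import-via-temp-buffer (file separator)
  "Return FILE converted to an Org table, as a string.
SEPARATOR is handed to `org-table-convert-region' unchanged.
Any error is signaled before the current buffer is modified."
  (with-temp-buffer
    (insert-file-contents file)
    (org-table-convert-region (point-min) (point-max) separator)
    (buffer-string)))
```

A caller would insert the returned string at point only once the
function has returned, so a `user-error' raised during conversion
leaves no half-imported text behind.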

All the best,
Utkarsh

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=org-table.patch
Content-Description: org-table

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index 0e93fb271f..84bc981fec 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -846,6 +846,42 @@ org-table-create
       (goto-char pos))
     (org-table-align)))
 
+
+(defun org-table-guess-separator (beg0 end0)
+  "Guess separator for `org-table-convert-region' for region BEG0 to END0.
+
+List of preferred separators:
+comma, TAB, semicolon, colon or SPACE.
+
+If region contains a line which doesn't contain the required
+separator then discard the separator and search again using next
+separator."
+  (let* ((beg (save-excursion
+                (goto-char (min beg0 end0))
+                (skip-chars-forward " \t\n")
+                (if (eobp) (point) (line-beginning-position))))
+         (end (save-excursion
+                (goto-char (max beg end0))
+                (skip-chars-backward " \t\n" beg)
+                (if (= beg (point)) (point) (line-end-position))))
+         (sep-regexp '(("," (rx bol (1+ (not (or ?\n ?,))) eol))
+                       ("\t" (rx bol (1+ (not (or ?\n ?\t))) eol))
+                       (";" (rx bol (1+ (not (or ?\n ?\;))) eol))
+                       (":" (rx bol (1+ (not (or ?\n ?:))) eol))
+                       (" " (rx bol (1+ (not (or ?' ?\" ))
+                                        (not (or ?\s ?\;))
+                                        (not (or ?' ?\"))) eol))))
+         sep)
+    (unless (= beg end)
+      (save-excursion
+        (goto-char beg)
+        (catch :found
+          (pcase-dolist (`(,sep ,regexp) sep-regexp)
+            (save-excursion
+              (unless (re-search-forward (eval regexp) end t)
+                (throw :found sep))))
+          nil)))))
+
 ;;;###autoload
 (defun org-table-convert-region (beg0 end0 &optional separator)
   "Convert region to a table.
@@ -859,20 +895,19 @@ org-table-convert-region
 (4)     Use the comma as a field separator
 (16)    Use a TAB as field separator
 (64)    Prompt for a regular expression as field separator
-integer  When a number, use that many spaces, or a TAB, as field separator
-regexp   When a regular expression, use it to match the separator
-nil      When nil, the command tries to be smart and figure out the
-         separator in the following way:
-         - when each line contains a TAB, assume TAB-separated material
-         - when each line contains a comma, assume CSV material
-         - else, assume one or more SPACE characters as separator."
+integer When a number, use that many spaces, or a TAB, as field separator
+regexp  When a regular expression, use it to match the separator
+nil     When nil, the command tries to be smart and figure out the
+        separator using `org-table-guess-separator'."
   (interactive "r\nP")
   (let* ((beg (min beg0 end0))
          (end (max beg0 end0))
          re)
+
     (if (> (count-lines beg end) org-table-convert-region-max-lines)
         (user-error "Region is longer than `org-table-convert-region-max-lines' (%s) lines; not converting"
                     org-table-convert-region-max-lines)
+
       (when (equal separator '(64))
         (setq separator (read-regexp "Regexp for field separator")))
       (goto-char beg)
@@ -881,17 +916,13 @@ org-table-convert-region
       (goto-char end)
       (if (bolp) (backward-char 1) (end-of-line 1))
       (setq end (point-marker))
-      ;; Get the right field separator
-      (unless separator
-        (goto-char beg)
-        (setq separator
-              (cond
-               ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
-               ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
-               (t 1))))
+      (when (and (not separator)
+                 (not (setq separator
+                            (org-table-guess-separator beg end))))
+        (user-error "Failed to guess separator"))
       (goto-char beg)
       (if (equal separator '(4))
-          (while (< (point) end)
+          (while (< (point) end)
             ;; parse the csv stuff
             (cond
              ((looking-at "^") (insert "| "))
@@ -905,7 +936,7 @@ org-table-convert-region
         (setq re (cond
                   ((equal separator '(4)) "^\\|\"?[ \t]*,[ \t]*\"?")
                   ((equal separator '(16)) "^\\|\t")
-                  ((integerp separator)
+                  ((integerp separator)
                    (if (< separator 1)
                        (user-error "Number of spaces in separator must be >= 1")
                      (format "^ *\\| *\t *\\| \\{%d,\\}" separator)))
@@ -921,12 +952,8 @@ org-table-convert-region
 (defun org-table-import (file separator)
   "Import FILE as a table.
 
-The command tries to be smart and figure out the separator in the
-following way:
-
-- when each line contains a TAB, assume TAB-separated material;
-- when each line contains a comma, assume CSV material;
-- else, assume one or more SPACE characters as separator.
+The command tries to be smart and figure out the separator using
+`org-table-guess-separator'.
 
 When non-nil, SEPARATOR specifies the field separator in the
 lines.  It can have the following values:
@@ -938,7 +965,8 @@ org-table-import
 - regexp    When a regular expression, use it to match the separator."
   (interactive "f\nP")
   (when (and (called-interactively-p 'any)
-             (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file)))
+             (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file))
+             (not (yes-or-no-p "File does not have a .txt, .tsv or .csv extension.  Do you still want to continue? ")))
     (user-error "Cannot import such file"))
   (unless (bolp) (insert "\n"))
   (let ((beg (point))
--=-=-=
Content-Type: text/plain

-- 
Utkarsh Singh
http://utkarshsingh.xyz
--=-=-=--
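
The rule stated in the `org-table-guess-separator' docstring (try the
preferred separators in order and discard any candidate that is
missing from some line) can be restated on a plain string, which may
make it easier to experiment with outside a buffer.  This is only a
simplified sketch under stated assumptions: `my/guess-separator' is a
hypothetical name, it is not the patch's implementation, and it
ignores quoted fields entirely.

```elisp
(require 'seq)

;; Hypothetical, simplified restatement of the docstring's rule; it is
;; not the patch's implementation and ignores quoted fields.
(defun my/guess-separator (text)
  "Return the first preferred separator found on every line of TEXT.
Blank lines are skipped.  Candidates are tried in the order comma,
TAB, semicolon, colon, SPACE.  Return nil when no candidate
qualifies or when TEXT has no non-blank line."
  (let ((lines (seq-remove (lambda (line)
                             (string-match-p "\\`[ \t]*\\'" line))
                           (split-string text "\n")))
        (candidates '("," "\t" ";" ":" " ")))
    (and lines
         (seq-find (lambda (sep)
                     (seq-every-p (lambda (line)
                                    (string-match-p (regexp-quote sep) line))
                                  lines))
                   candidates))))
```

For example, `(my/guess-separator "a;b,c\n1;2\n")' skips the comma,
because the second line has none, and settles on the semicolon.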