emacs-orgmode@gnu.org archives
From: Eric S Fraga <ucecesf@ucl.ac.uk>
To: Torsten Wagner <torsten.wagner@gmail.com>,
	org-mode mailing list <emacs-orgmode@gnu.org>
Subject: Re: [google] updated awk script for google to org conversion
Date: Fri, 09 Jul 2010 10:44:50 +0100
Message-ID: <87sk3tkmzx.wl%ucecesf@ucl.ac.uk>
In-Reply-To: <87wrt5kqfn.wl%ucecesf@ucl.ac.uk>

[-- Attachment #1: Type: text/plain, Size: 598 bytes --]

I've updated the awk script for converting Google calendar iCal export to org:

- the time zone calculation is now done completely within the awk
  script, not requiring any external calculations.

- fixed handling of continuation lines as Google seems to have
  changed, since last night, the format of the export iCal entries.
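
To illustrate the second point, here is a minimal sketch (not the attached script itself) of joining folded iCal lines back onto the field they continue. The shell function name unfold is made up for this example. Like the attached script, it drops every leading space from a continuation line instead of counting them, which is what makes it insensitive to a change in the number of spaces.

```shell
# unfold: join iCal continuation lines (those starting with a space or tab)
# onto the preceding line, removing any trailing carriage return.
# Hypothetical helper for illustration only.
unfold() {
    awk '
    { sub(/\r$/, "") }          # strip the CR of a CRLF line ending
    /^[ \t]/ {                  # continuation: append to the pending line
        sub(/^[ \t]+/, "")
        line = line $0
        next
    }
    NR > 1 { print line }       # a new field starts: flush the pending one
    { line = $0 }
    END { if (NR > 0) print line }
    '
}
```

It reads standard input, so it could be tried with something like

  unfold < icsfile | less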

Again, comments, questions and criticisms all more than welcome!

Please note that the shell command required to use this is now much simpler:

  awk -f thisawkscript < icsfile > orgfile

as there is no longer any need to define the SECONDS variable.
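
For reference, a minimal sketch of how an offset string of the form +hhmm, as returned by gawk's strftime("%z"), can be turned into seconds. This is not the code in the attached script, which multiplies the value by 36 and therefore only handles whole-hour zones. The helper name tzoffset is made up for this example, and it treats hours and minutes separately so that half-hour zones also come out right.

```shell
# tzoffset: convert a "+hhmm" or "-hhmm" string into a signed number of
# seconds.  Hypothetical helper for illustration only.
tzoffset() {
    awk -v z="$1" 'BEGIN {
        sign = (substr(z, 1, 1) == "-") ? -1 : 1
        hh = substr(z, 2, 2) + 0       # hours part
        mm = substr(z, 4, 2) + 0       # minutes part
        print sign * (hh * 3600 + mm * 60)
    }'
}

tzoffset +0100    # prints 3600
tzoffset -0330    # prints -12600
```

Inside a gawk script the same arithmetic could be applied to the result of strftime("%z") directly.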

Enjoy! :)

eric

[-- Attachment #2: ical2org.awk --]
[-- Type: text/plain, Size: 6961 bytes --]

# awk script for converting an iCal formatted file to a sequence of org-mode headings.
# this may not work in general but seems to work for day and timed events from Google's
# calendar, which is really all I need right now...
#
# usage:
#   awk -f THISFILE < icalinputfile.ics > orgmodeentries.org
#
# Note: change org meta information generated below for author and
# email entries!
#
# Known bugs:
# - not so much a bug as a possible assumption: date entries with no time
#   specified are assumed to be independent of the time zone.
#
# Eric S Fraga
# 20100629 - initial version
# 20100708 - added end times to timed events
#          - adjust times according to time zone information
#          - fixed incorrect transfer for entries with ":" embedded within the text
#          - added support for multi-line summary entries (which become headlines)
# 20100709 - incorporated time zone identification
#          - fixed processing of continuation lines as Google seems to
#            have changed, in the last day, the number of spaces at
#            the start of the line for each continuation...
#          - remove backslashes used to protect commas in iCal text entries
#
# Last change: 2010.07.09 10:42:54
#----------------------------------------------------------------------------------

# a function to take the iCal formatted date+time, convert it into an
# internal form (seconds since time 0), and adjust according to the
# local time zone (specified by +-seconds calculated in the BEGIN
# section)

function datetimestamp(input)
{
    # convert the iCal Date+Time entry to a format that mktime can understand
    datespec = gensub("([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])T([0-9][0-9])([0-9][0-9])([0-9][0-9]).*[\r]*", "\\1 \\2 \\3 \\4 \\5 \\6", "g", input);
    # print "date spec : " datespec;
    # convert this date+time into seconds from the beginning of time
    # and include adjustment for time zone, as determined in the BEGIN
    # section below.  For time zone adjustment, I have not tested edge
    # effects, specifically what happens when UTC time is a different
    # day to local time and especially when an event with a duration
    # crosses midnight in UTC time.  It should work but...
    timestamp = mktime(datespec) + seconds;
    # print "adjusted    : " timestamp
    # print "Time stamp  : " strftime("%Y-%m-%d %H:%M", timestamp);
    return timestamp;
}

BEGIN {
    # use a colon to separate the type of data line from the actual contents
    FS = ":";
    
    # determine the number of seconds to use for adjusting for time
    # zone difference from UTC.  This is used in the function
    # datetimestamp above.  The time zone information returned by
    # strftime("%z") has the form +hhmm, which for whole-hour zones
    # is hours * 100, so we multiply by 36 to get seconds.  This does
    # not work for time zones that are not an integral multiple of
    # hours (e.g. Newfoundland)
    seconds = gensub("([+-])0", "\\1", 1, strftime("%z")) * 36;
    
    date = "";
    entry = ""
    first = 1;			# true until an event has been found
    headline = ""
    icalentry = ""  # the full entry for inspection
    id = ""
    indescription = 0;
    time2given = 0;
    
    print "#+TITLE:     Main Google calendar entries"
    print "#+AUTHOR:    Eric S Fraga"
    print "#+EMAIL:     e.fraga@ucl.ac.uk"
    print "#+DESCRIPTION: converted using the ical2org awk script"
    print "#+CATEGORY: google"
    print " "
}

# continuation lines start with one or more spaces (the exact number in
# Google's export has changed, so any leading run of spaces is accepted).
# if the continuation is after a description or a summary, append the entry
# to the respective variable

/^[ ]+/ { 
    if (indescription) {
	entry = entry gensub("\r", "", "g", gensub("^[ ]+", "", 1, $0));
    } else if (insummary) {
	summary = summary gensub("\r", "", "g", gensub("^[ ]+", "", 1, $0))
    }
    icalentry = icalentry "\n" $0
}

/^BEGIN:VEVENT/ {
    # start of an event.  if this is the first, output the preamble from the iCal file
    if (first) {
	print "* COMMENT original iCal preamble"
	print gensub("\r", "", "g", icalentry)
	icalentry = ""
    }
    first = 0;
}

# any line that starts at the left with a non-space character is a new data field

/^[A-Z]/ {
    # we ignore DTSTAMP lines as they change every time you download
    # the iCal format file which leads to a change in the converted
    # org file as I output the original input.  This change, which is
    # really content free, makes a revision control system update the
    # repository and confuses.
    if ($1 !~ /^DTSTAMP/) icalentry = icalentry "\n" $0
    # this line terminates the collection of description and summary entries
    indescription = 0;
    insummary = 0;
}

# this type of entry represents a day entry, not timed, with date stamp YYYYMMDD

/^DTSTART;VALUE=DATE/ {
    # the trailing carriage return is optional so that files without CRLF line endings also match
    date = gensub("([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9]).*[\r]*", "\\1-\\2-\\3", "g", $2)
    # print date
}

# this represents a timed entry with date and time stamp YYYYMMDDTHHMMSS
# we ignore the seconds

/^DTSTART:/ {
    # print $0
    date = strftime("%Y-%m-%d %H:%M", datetimestamp($2));
    # print date;
}

# and the same for the end date; here we extract only the time and append this to the 
# date+time found by the DTSTART entry.  We assume that entry was there, of course.
# should probably add some error checking here!  In time...

/^DTEND:/ {
    # print $0
    time2 = strftime("%H:%M", datetimestamp($2));
    date = date "-" time2;
}

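# The description will be the contents of the entry in org-mode.
# this line may be continued.

/^DESCRIPTION/ { 
    # take everything after the first colon; assigning to $1 would rebuild
    # $0 with spaces in place of any colons embedded in the text
    entry = entry "\n" gensub("\r", "", "g", substr($0, index($0, ":") + 1));
    indescription = 1;
}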

# the summary will be the org heading

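/^SUMMARY/ { 
    # take everything after the first colon so that colons embedded in
    # the text are kept
    summary = gensub("\r", "", "g", substr($0, index($0, ":") + 1));
    insummary = 1;
}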

# the unique ID will be stored as a property of the entry

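/^UID/ { 
    # take everything after the first colon so that colons embedded in
    # the ID are kept
    id = gensub("\r", "", "g", substr($0, index($0, ":") + 1));
}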

# when we reach the end of the event line, we output everything we
# have collected so far, creating a top level org headline with the
# date/time stamp, unique ID property and the contents, if any

/^END:VEVENT/ {
    # replace \n sequences with spaces, as the headline must stay on one line, and unprotect commas (,)
    print "* " gensub("\\\\,", ",", "g", gensub("\\\\n", " ", "g", summary))
    print "  :PROPERTIES:"
    print "  :ID:       " id
    print "  :END:"
    print "  <" date ">"
    print ""
    # for the entry, translate \n sequences to actual newlines and unprotect commas (,)
    print gensub("\\\\,", ",", "g", gensub("\\\\n", "\n", "g", entry));
    print "** COMMENT original iCal entry"
    print gensub("\r", "", "g", icalentry)
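    # reset everything collected for this event so nothing leaks into the next one
    summary = ""
    date = ""
    entry = ""
    id = ""
    icalentry = ""
    indescription = 0
    insummary = 0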
}

# Local Variables:
# time-stamp-line-limit: 1000
# time-stamp-format: "%04y.%02m.%02d %02H:%02M:%02S"
# time-stamp-active: t
# time-stamp-start: "Last change:[ \t]+"
# time-stamp-end: "$"
# End:

[-- Attachment #3: Type: text/plain, Size: 75 bytes --]

-- 
Eric S Fraga
GnuPG: 8F5C 279D 3907 E14A 5C29  570D C891 93D8 FFFC F67D
