Org-mode mailing list
* best practices query: non-emacs packages based on tangled source
@ 2020-10-15 18:11 Greg Minshall
  2020-10-15 21:22 ` Tim Cross
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Greg Minshall @ 2020-10-15 18:11 UTC (permalink / raw)
  To: emacs-orgmode

hi.  i apologize if this has been asked before (especially if by me).
but, since i had a question recently about Org Src... buffers, this came
up.

i'm wondering what people do who want to release a non-emacs'y package
(an R package, say, or ...), and who did their development "from within"
a .org file.

i can "build" whatever files are needed to release the package.  but,
it's nice to be able to let people look at the sources, maybe submit
'pull requests', etc.

if anyone has any techniques they've used, liked (or hated), i'd love to
hear.

thank you very much, Greg



* Re: best practices query: non-emacs packages based on tangled source
  2020-10-15 18:11 best practices query: non-emacs packages based on tangled source Greg Minshall
@ 2020-10-15 21:22 ` Tim Cross
  2020-10-16  9:09 ` Eric S Fraga
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Tim Cross @ 2020-10-15 21:22 UTC (permalink / raw)
  To: emacs-orgmode


There is no great answer that I am aware of. However, I will sometimes
generate a markdown version of the source so that non-emacs users at
least have a better chance of viewing the source in a friendlier format
than a 'raw' org file. Pull requests and the like are more likely to be
diffs against the generated sources, though, as most people will want to
use their preferred editor, and that editor will likely need the
generated source files in order to provide its 'IDE' features etc.

This is one reason I tend not to use org's 'literate programming' model
for anything other than documentation, simple examples, configuration
files, sql and basic scripting. I find for more complex development,
especially when it requires multiple files, namespaces/modules, long
running repl sessions, extensive test suites etc, using org adds another
layer of complexity which soon outstrips the benefits of having
documentation and source in one file. Of course, this will also depend
on the development language/platform. I tend to use languages which
involve a fair bit of 'REPL' based development rather than a more
traditional write, generate, compile, debug loop. On the other hand,
when it comes to documentation, tutorials, configuration files and
workflow automation, org is definitely my preferred tool. 

Tim



-- 
Tim Cross



* Re: best practices query: non-emacs packages based on tangled source
  2020-10-15 18:11 best practices query: non-emacs packages based on tangled source Greg Minshall
  2020-10-15 21:22 ` Tim Cross
@ 2020-10-16  9:09 ` Eric S Fraga
  2020-10-16 14:52 ` Diego Zamboni
  2020-10-16 15:04 ` TEC
  3 siblings, 0 replies; 6+ messages in thread
From: Eric S Fraga @ 2020-10-16  9:09 UTC (permalink / raw)
  To: Greg Minshall; +Cc: emacs-orgmode

On Thursday, 15 Oct 2020 at 21:11, Greg Minshall wrote:
> i can "build" whatever files are needed to release the package.  but,
> it's nice to be able to let people look at the sources, maybe submit
> 'pull requests', etc.

I have recently done this with a Julia project which I make available at
github.  All code and documentation are developed in a single org file,
which is then tangled etc. to create the files in the github repository
for the project.

I include the org file in the repository although the HTML that I
generate from it (as documentation) is hosted on my work web site for
visibility reasons.

If you're interested, you can see all the files including the complete
org file at https://github.com/ericsfraga/Fresa.jl

-- 
: Eric S Fraga via Emacs 28.0.50, Org release_9.4-57-g8402c4



* Re: best practices query: non-emacs packages based on tangled source
  2020-10-15 18:11 best practices query: non-emacs packages based on tangled source Greg Minshall
  2020-10-15 21:22 ` Tim Cross
  2020-10-16  9:09 ` Eric S Fraga
@ 2020-10-16 14:52 ` Diego Zamboni
  2020-10-16 15:04 ` TEC
  3 siblings, 0 replies; 6+ messages in thread
From: Diego Zamboni @ 2020-10-16 14:52 UTC (permalink / raw)
  To: Greg Minshall; +Cc: Org-mode


Hi Greg,

What I do with my Elvish modules
(https://github.com/zzamboni/elvish-modules and
https://github.com/zzamboni/elvish-completions) is to just include the
Org files together with the tangled .elv files.

--Diego



* Re: best practices query: non-emacs packages based on tangled source
  2020-10-15 18:11 best practices query: non-emacs packages based on tangled source Greg Minshall
                   ` (2 preceding siblings ...)
  2020-10-16 14:52 ` Diego Zamboni
@ 2020-10-16 15:04 ` TEC
  2020-10-18  6:01   ` Tom Gillespie
  3 siblings, 1 reply; 6+ messages in thread
From: TEC @ 2020-10-16 15:04 UTC (permalink / raw)
  To: Greg Minshall; +Cc: emacs-orgmode


Hi Greg,

Just one little thing that occurs to me for accepting PRs: add the
header arg :comments link (globally would probably be best). That,
together with M-x org-babel-detangle, should help with accepting PRs.
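
To illustrate why the link comments matter, here is a minimal sketch in
Python of how a tangled file can be split back into its originating
blocks. It is not how org-babel-detangle is implemented. It assumes the
comments follow Org's default format (an opening Org link of the form
[[target][name]] and a closing "name ends here"), and the function name
split_tangled is hypothetical.

```python
import re

# Assumed shape of the comments written with `:comments link`: an opening
# Org link "[[target][name]]" and a closing "name ends here", each placed
# behind the target language's comment prefix.
BEGIN = re.compile(r"\[\[(?P<link>[^\]]+)\]\[(?P<name>[^\]]+)\]\]\s*$")


def split_tangled(text):
    """Return {block name: (link, body)} for each delimited region."""
    blocks = {}
    name = link = None
    body = []
    for line in text.splitlines():
        if name is None:
            match = BEGIN.search(line)
            if match:
                # Start of a region: remember where it came from.
                name, link, body = match.group("name"), match.group("link"), []
        elif line.rstrip().endswith(name + " ends here"):
            # End of the region: store its body under the block name.
            blocks[name] = (link, "\n".join(body))
            name = None
        else:
            body.append(line)
    return blocks
```

Each region carries a link back to its heading in the org file, which is
the information a detangle step needs in order to know which source
block an edited region belongs to.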

Hope that helps,

Timothy.


* Re: best practices query: non-emacs packages based on tangled source
  2020-10-16 15:04 ` TEC
@ 2020-10-18  6:01   ` Tom Gillespie
  0 siblings, 0 replies; 6+ messages in thread
From: Tom Gillespie @ 2020-10-18  6:01 UTC (permalink / raw)
  To: Greg Minshall; +Cc: emacs-orgmode, TEC

Hi Greg,

Great question. This came out a bit longer than I anticipated since I
wrote up a couple of relevant workflows. Sync between org source
blocks and tangled code is something that I think needs improvement. I
have covered the difference in semantics between tangled code and
babel evaluated code (along with some other factors) since it can
have an impact on what workflows you might choose.

Following on Timothy's suggestion, one key thing that I think is
needed is the ability to detangle nested and arbitrary code. Detangle
of code tangled with :comments noweb is not fully implemented. A full
detangling implementation would make it easier and safer to
automatically detangle back to the org source blocks using a
pre-commit hook or similar. Detangling from the org sources is also
something that needs to be implemented for this to work. I think that
a more complete detangling implementation could go a long way toward
making it easier for those who are not used to org to commit to a
project. In the absence of this, I have found that most of my existing
workflows actively avoid keeping tangled code and org sources tracked
in git at the same time unless absolutely necessary, and even with git
there to back me up I have shot myself in the foot tangling over files
that I forgot to detangle.

Below are a few examples. In all cases I have had to consciously work
around the issue of having tangled code that is outside the source of
truth that is the org file.

As an additional note before the examples, I have found that the trade
offs when tangling code also depend on the language you are using. For
example, I consider some languages, such as Python, to be obligate
tanglers since their semantics conflate modules and files. Org babel
might be able to work around this in some cases, but it would mean,
for example, that ob-python would have to explicitly compensate for
this deficiency by implementing the ability to treat source blocks as
modules to be loaded into a session or somehow pulled in during the
prologue by pre-parsing blocks to look for import statements, etc. The
deficiencies of a language mean that if you want certain functionality
for that language then org babel can't just treat the code as text,
and might have to go to great lengths to try to keep the semantics of
babel evaluation and of tangled code aligned.

The workflow that I have found to be the most reasonable I developed
while working on an elisp project (it is public but I'm not quite
ready to link it on this list). In this workflow I add a hook via
~(add-hook 'before-save-hook #'org-babel-tangle nil t)~ for any org
file that should be tangled, that way I don't have to worry about
whether I remember to tangle etc. However, there are a couple of
issues. It does not work in reverse: you still always have to edit the
org blocks. It will become annoyingly slow if you have many blocks to
tangle. You really want it to run only when the source blocks change,
not whenever the org file changes. Finally, the exact semantics of
tangling multiple blocks to the same file can have a major impact on
performance. So even this best case is not the greatest and doesn't
enable your specific use case (detangle issues in particular are a
show stopper). Since this is an elisp project I have to run tests on
the tangled file in a separate instance of Emacs to ensure that it
works as expected.
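
One way to approximate "run only when the source blocks change" is to
hash just the source blocks and compare against the hash from the last
tangle. The following is a minimal sketch in Python, not something Org
provides. The names source_digest and needs_tangle are hypothetical, and
the sketch assumes blocks are delimited by #+begin_src and #+end_src
lines. It would not notice changes to file-level header arguments.

```python
import hashlib
import re

# Matches "#+begin_src ... #+end_src" regions, case-insensitively.
SRC_BLOCK = re.compile(
    r"^[ \t]*#\+begin_src[^\n]*\n(.*?)^[ \t]*#\+end_src",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def source_digest(org_text):
    """Hash the header line and body of every source block, ignoring prose."""
    digest = hashlib.sha256()
    for match in SRC_BLOCK.finditer(org_text):
        digest.update(match.group(0).encode("utf-8"))
        digest.update(b"\0")  # keep adjacent blocks from running together
    return digest.hexdigest()


def needs_tangle(org_text, last_digest):
    """True when any source block differs from the last recorded digest."""
    return source_digest(org_text) != last_digest
```

Edits to the surrounding prose leave the digest unchanged, so a save
hook that consults needs_tangle would skip the tangle step for them.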

In this project I also have a completely unreadable file that is valid
and executable in 3 languages. Tangled blocks that are easily readable
in the org sources are commented out intentionally in the tangled
file. This is a worst case for detangling. I suspect that it can be
done, but it will push the detangling implementation to the limit. At
the moment, there is no way to detangle this file back to readable
form at all, and it is not clear that anyone should try to edit the
tangled file in the first place. All this to say, if we reason from
this extreme example, maybe the best thing is to tangle at the last
possible moment, never keep the tangled form under version control
etc. Unfortunately the use case for this file is to bootstrap Emacs,
which means that in order for it to be useful it _must_ be tangled and
put under version control since the systems it needs to run on don't
have Emacs.

The worst experience I have had was when I was developing python code
where I needed to capture the output of the block in order to populate
tables. Over time the code grew to the point where it needed to become
a library. This is where Python being an obligate tangler reared its
head, and the differences in semantics between tangled and evaluated
code became a major pain. Combine this with the fact that my testing
workflows in Python essentially require me to edit the tangled code
for me not to lose my mind, but I would also forget to detangle, and
sometimes overwrite on retangle, and I was quite unhappy.
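
A guard against overwriting on retangle could look like the following
minimal sketch in Python. It assumes a hypothetical manifest file that
records a hash of each tangled output right after tangling. The function
names and the manifest format are illustrative and are not part of Org.

```python
import hashlib
import json
from pathlib import Path


def _digest(path):
    """SHA-256 of the file contents at PATH."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_tangle(tangled_files, manifest):
    """Call right after tangling: remember what each output looked like."""
    data = {str(p): _digest(p) for p in tangled_files}
    Path(manifest).write_text(json.dumps(data, indent=2))


def edited_since_tangle(manifest):
    """Return the tangled files whose contents no longer match the record."""
    data = json.loads(Path(manifest).read_text())
    return [
        p for p, recorded in data.items()
        if not Path(p).exists() or _digest(p) != recorded
    ]
```

If edited_since_tangle returns anything, the tangled files hold changes
that have not been detangled yet, and a retangle would overwrite them.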

As a result I eventually gave up and moved all python development out
of the org file except for the few critical parts that were needed to
produce the tables. There were simply too many steps between modifying
a file and being able to test changes (my time writing elisp and
common lisp has massively reduced my tolerance for this kind of
thing). The risk of forgetting whether I had or needed to tangle, or
detangle went to zero. I was much more productive and could do sane
things like safely import the python code into other modules etc.

Even if you automatically tangle code to the file system on save, you
still have to be able to use it from the org file. In obligate
tanglers like Python this means that you must figure out how to do
something like setting ~PYTHONPATH~ so that org babel can find
it. There weren't good ways to do this inside a single org file and
adding a random path to your .bashrc for each one of these would be a
nightmare not to mention that it completely defeats the purpose of
using org to simplify documentation of code (this is one use case for
the elisp project discussed above).
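
As a point of comparison, one per-session workaround is to extend
sys.path from inside the session itself instead of exporting
~PYTHONPATH~ globally. This is a minimal sketch in Python, and the
helper name use_tangled_dir is hypothetical. It only helps code that
runs after it in the same interpreter, for example in a babel session,
so it does not remove the problem described above.

```python
import importlib
import sys
from pathlib import Path


def use_tangled_dir(directory):
    """Put DIRECTORY at the front of sys.path for this interpreter only."""
    path = str(Path(directory).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    # Make the import system notice files tangled after startup.
    importlib.invalidate_caches()
    return path
```

Nothing outside the running interpreter changes, so each org file can
point at its own tangle directory without touching .bashrc.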

A slightly better experience with Python is one where I have an
existing code base with a single module containing most of the
implementation. I then wrote a developer guide as an org mode file and
I tangle that code to a submodule.
https://github.com/SciCrunch/sparc-curation/blob/master/docs/developer-guide.org#datasets

This was not nearly as bad as the other python project because I wrote
each source block as if it were its own complete file and module. This
severely limited the style that I could use and recombination and
reuse within the org file is difficult (as noted). I still have to
tangle everything before I test, and I have to (if I have not already)
add the tangled files to .gitignore so that other developers cannot
accidentally edit them (runnable documentation is cool, except when
people don't read it and start modifying just the runnable part). To
compensate for this I now have a build time dependency on Emacs (major
WTF right there ya?) that all the python packaging tools know nothing
about, just so that there is only a single source of truth for the
python code.
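
The build time dependency can be made explicit with a small wrapper
that a packaging script calls before building. This is a minimal sketch
in Python, assuming an emacs executable on PATH and Org's
org-babel-tangle-file function from ob-tangle. The wrapper names
tangle_command and tangle are hypothetical.

```python
import subprocess


def tangle_command(org_file, emacs="emacs"):
    """Build the argv for a batch Emacs that tangles ORG_FILE."""
    # Escape backslashes and quotes so the path survives as an elisp string.
    quoted = org_file.replace("\\", "\\\\").replace('"', '\\"')
    return [
        emacs, "--batch",
        "--eval", "(require 'ob-tangle)",
        "--eval", '(org-babel-tangle-file "%s")' % quoted,
    ]


def tangle(org_file):
    """Run the tangle step and fail loudly if Emacs exits with an error."""
    subprocess.run(tangle_command(org_file), check=True)
```

Calling tangle("docs/developer-guide.org") before the build would
regenerate the ignored files from the single source of truth.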

Now, you would think that I could use the source block header
arguments with the modularized example code to run the code via org
babel directly, but it is not really possible because when tangled I
import code from other source blocks as a module, but in org babel
that means those files would still have to be tangled, otherwise the
python import system could not find the code. Maybe ob-python could be
enhanced to dynamically load other source blocks as modules?  I'm sure
that other languages have similar issues.
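
For what it is worth, the mechanism such an enhancement would need does
exist in Python itself. The following is a minimal sketch, not a feature
of ob-python: the helper register_block_as_module is hypothetical, and
it assumes the text of a source block is already available as a string.

```python
import sys
import types


def register_block_as_module(name, source):
    """Execute SOURCE in a fresh module and make `import NAME` find it."""
    module = types.ModuleType(name)
    module.__file__ = "<org source block: %s>" % name
    # Run the block's code with the module's namespace as its globals.
    exec(compile(source, module.__file__, "exec"), module.__dict__)
    # The import system consults sys.modules before searching the disk.
    sys.modules[name] = module
    return module
```

After registering a block under a module name, a later block in the
same session can import it without any file having been tangled.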

Just to be a bit less harsh on Python, I have had similar issues
developing code in bash that needed to be sourced in order for the
functions to be available for use in a shell. Sometimes I would find
myself accidentally editing the tangled source and forget to detangle,
or was unable to detangle because I was using :comments noweb. As
mentioned above, I think this is the single largest issue preventing
sane workflows for keeping tangled files and org sources in sync.

Here is another example of how the quality and experience of the
workflow depend on the language you are working in. I started a project
(https://github.com/tgbugs/git-share) in common lisp that includes
other languages such as sql, bash, and elisp. I wanted to write
everything in a single org file. In this case I have been able to
develop two separate workflows. For production release I tangle all
the files and then run ~save-lisp-and-die~ in sbcl. For development I
have a workflow where everything is set up and runs via slime and all
modifications can be made and run via org babel directly. For this
project I also explicitly never commit the tangled code to git.

This winds up being less of an issue for this project compared to the
others because dumping the files to disk is only needed to create the
production build (and might not even be required for that). One
disadvantage of this approach (which also applies to a pure elisp
babel approach) is that there aren't concrete source files so you
can't use ~xref-find-definitions~ or ~slime-edit-definition~ to jump
to a definition. I imagine that this is something that could be fixed
though, so that the source location for definitions could point to
lines in an org file.

This kind of split setup is really only possible in languages where
the semantics for an org babel session are the same as the semantics
when tangled (common lisp and elisp being two examples). As mentioned,
in Python this is virtually impossible because the semantics of the
babel session and the semantics of a tangled file that start from the
same block(s) are radically different. This is understandable due to
the fact that the CL community put an enormous amount of effort into
making sure that compiled code and interpreted code, top level and
nested code had semantics that were as close to each other as possible
(and it shows).

Best!
Tom

