Opened 13 years ago
Closed 13 years ago
#10225 closed enhancement (fixed)
Provide Python hooks for Mercurial to call trac-admin
Reported by: | Remy Blank | Owned by: | Remy Blank |
---|---|---|---|
Priority: | normal | Milestone: | plugin - mercurial |
Component: | plugin/mercurial | Version: | |
Severity: | normal | Keywords: | review |
Cc: | eric@… | Branch: | |
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description
Setting up hooks in Mercurial to call trac-admin $ENV changeset
commands is reasonably simple on Unix, but quite difficult on Windows due to the absence of xargs.
We could simplify the setup by providing Python functions for the commit
and changegroup
in MercurialPlugin. This would make for a very simple setup when the plugin is installed globally:
[hooks] commit = python:tracext.hg.hooks:commit changegroup = python:tracext.hg.hooks:changegroup
Or, if the plugin is installed per environment, people could download the file hooks.py
from here and use:
[hooks] commit = python:/path/to/hooks.py:commit changegroup = python:/path/to/hooks.py:changegroup
Of course, the hooks should be cross-platform. The path to the trac-admin
command should be configurable in .hgrc
.
Attachments (1)
Change History (14)
comment:1 by , 13 years ago
Cc: | added |
---|
follow-up: 4 comment:2 by , 13 years ago
comment:3 by , 13 years ago
I don't see any downsides if it works … but it still won't fully solve my problem at least. I'm running Trac in Apache on a Windows box, and Windows doesn't like xargs
or $HG_NODE
. I tried subbing %HG_NODE%
instead without any success, which is why I'm hoping an actual commit/changegroup hook scripted in Python will help.
comment:4 by , 13 years ago
Replying to psuter <petsuter@…>:
I ended up using the
incoming
hook instead ofchangegroup
to avoid the need for xargs. Are there any downsides to that?
Yes, a pretty major one. The incoming
hook is called once for every new changeset. The changegroup
hook is called once with the first new changeset (and the other ones can be determined from the changelog). So with the former, you have to call trac-admin
once for every changeset, whereas the latter allows a single call to trac-admin
with a list of revisions. Considering that the setup phase of a Trac environment is very expensive, there's a major performance gain in using changegroup
.
I'm almost done, and I will need some testers on Windows :)
comment:5 by , 13 years ago
Keywords: | review added |
---|
A hook function has been added to TracMercurial in [10718]. It has been tested on Linux, and I need some volunteers to test on Windows. Copy the file hooks.py somewhere accessible, then configure the hooks in hgrc
:
[hooks] commit = python:C:\path\to\hooks.py:add_changesets changegroup = python:C:\path\to\hooks.py:add_changesets [trac] env = C:\path\to\env trac-admin = C:\path\to\trac-admin
follow-up: 7 comment:6 by , 13 years ago
Simple testing shows it works (on Windows), but it could still be improved.
First, using the chain of first children is not going to work in all cases, and using changectx.descendants
as I first thought neither. Consider the following case, for the incoming sequence b, c, d
where d
is the merge of b
and c
(i.e. we have for example a -> b, a -> c, (b, c) -> d
, with a
being a changeset already present in the repository). If we start with node == b
, we don't want to miss c
even if it's not a descendant of b
.
Fortunately there's a much simpler way to get all the relevant changesets, because the hook is executed in a transaction (at least I verified for pull
, I assume its the same for other means), so we have a guarantee that all changesets after node
are part of the changegroup. So we'll need to use rev numbers here.
Second remark, using subprocess
is one approach… and I understand you could sometimes want to notify environments with different versions of Trac for the same repository; in that situation using trac-admin
is certainly the simpler way to go. But in the usual case where you have one version of Trac installed and reachable from the same version of Python used to run the hooks, you could as well call the relevant Trac Python code directly. I propose doing that when there's no trac-admin
entry specified in the [trac]
section of the .hgrc
.
See follow-up patch, t10255-use-revs-r10718.patch.
by , 13 years ago
Attachment: | t10255-use-revs-r10718.patch added |
---|
add all revs for the changegroup and optionally use directly Trac's API for the notification
follow-up: 8 comment:7 by , 13 years ago
Replying to cboos:
First, using the chain of first children is not going to work in all cases, and using
changectx.descendants
as I first thought neither.
To be honest, I took the code from next_rev()
of the plugin, because I wrongly assumed that it was used to make the changelog. Using revision numbers makes sense, and allows increasing the default number of changesets per call to trac-admin
. I tried to find out how Mercurial implemented its log
command, but the code combines many different cases (with a file specification, with revision specification, …) that I couldn't find an easy way.
Second remark, using
subprocess
is one approach… and I understand you could sometimes want to notify environments with different versions of Trac for the same repository; in that situation usingtrac-admin
is certainly the simpler way to go.
It's not really due to different versions of Trac, but rather for the situation where you run Trac instances in virtualenvs. I actually wanted to add some functionality to handle this case (instead of requiring to write a wrapper script for trac-admin
), but I was afraid that it would be difficult to make it cross-platform.
But in the usual case where you have one version of Trac installed and reachable from the same version of Python used to run the hooks, you could as well call the relevant Trac Python code directly. I propose doing that when there's no
trac-admin
entry specified in the[trac]
section of the.hgrc
.
I'm not sure I like the idea. It means that we load a complete Trac environment into the process executing the operation, which is fine when you run the hg
executable (local operations, ssh) but not so nice when you run hgweb
. All your hgweb
processes end up taking much more memory. The upside is that you only pay once for environment startup, which can be nice if you push a lot.
follow-up: 9 comment:8 by , 13 years ago
Replying to rblank:
But in the usual case where you have one version of Trac installed and reachable from the same version of Python used to run the hooks, you could as well call the relevant Trac Python code directly. I propose doing that when there's no
trac-admin
entry specified in the[trac]
section of the.hgrc
.I'm not sure I like the idea. It means that we load a complete Trac environment into the process executing the operation, which is fine when you run the
hg
executable (local operations, ssh) but not so nice when you runhgweb
. All yourhgweb
processes end up taking much more memory.
It's probably worth to measure this cost … and verify the env gets properly gc'ed.
If memory usage appears to be a real issue, it's easy to use the alternative way with trac-admin
. But if think it's nice to have at least the option to bypass the call to trac-admin
.
The upside is that you only pay once for environment startup, which can be nice if you push a lot.
Well, only if we add use_cache=True
to open_environment()
, which I didn't do here.
follow-up: 10 comment:9 by , 13 years ago
Replying to cboos:
It's probably worth to measure this cost … and verify the env gets properly gc'ed. If memory usage appears to be a real issue, it's easy to use the alternative way with
trac-admin
. But if think it's nice to have at least the option to bypass the call totrac-admin
.
IIRC, Python is notoriously bad at releasing memory back to the OS, so even if the environment is garbage-collected, the process will probably keep the memory around. Measuring should indeed clarify the issue.
Well, only if we add
use_cache=True
toopen_environment()
, which I didn't do here.
Shame, then I see no upside at all :)
I mean, in the configuration where everything is installed globally, the trac-admin
executable should be in PATH
, so the default value of trac-admin
should already be fine.
follow-up: 11 comment:10 by , 13 years ago
Replying to rblank:
Replying to cboos:
It's probably worth to measure this cost … and verify the env gets properly gc'ed. If memory usage appears to be a real issue, it's easy to use the alternative way with
trac-admin
. But if think it's nice to have at least the option to bypass the call totrac-admin
.IIRC, Python is notoriously bad at releasing memory back to the OS, so even if the environment is garbage-collected, the process will probably keep the memory around.
Not on Windows ;-) But even without gc'ing the env, I'd be surprised if from one hook run to the next you'd have any noticeable memory increase.
Measuring should indeed clarify the issue.
Well, only if we add
use_cache=True
toopen_environment()
, which I didn't do here.Shame, then I see no upside at all :)
For me the upside is to avoid having to launch an external process, pass a possibly long list of command line arguments, when the call could have been done instead directly in the current process, in one go. I indeed didn't think about long running processes, that's why I didn't bother using the cache. But if the memory impact is negligible, it might indeed be interesting to activate the cache.
I mean, in the configuration where everything is installed globally, the
trac-admin
executable should be inPATH
, so the default value oftrac-admin
should already be fine.
Again, probably not on Windows.
comment:11 by , 13 years ago
Replying to cboos:
Not on Windows ;-) But even without gc'ing the env, I'd be surprised if from one hook run to the next you'd have any noticeable memory increase.
Not from one run to the next. You would get one increase on first run, and then constant usage (probably).
For me the upside is to avoid having to launch an external process, pass a possibly long list of command line arguments, when the call could have been done instead directly in the current process, in one go.
Ah, you've been lurking on Mercurial mailing list for too long, and are influenced by all the people moaning when they are told to call hg
on the command-line to integrate Mercurial into their software.
In this particular case, I find the subprocess approach quite sensible: start a new process, use whatever memory is necessary, and release all of it to the OS on termination. Nice and clean.
But I don't mind having the option to call Trac directly. Choice is good ;) Patch applied in [10726].
Do you want me to add an option cache_env
to allow using the environment cache?
comment:13 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Added the Mercurial option cache_env
in [10731]. Please reopen if there's anything else to be done.
I ended up using the
incoming
hook instead ofchangegroup
to avoid the need for xargs. Are there any downsides to that?