Opened 20 years ago

Closed 18 years ago

Last modified 9 years ago

#1310 closed defect (fixed)

trac-post-commit-hook and codepages

Reported by: GregZ
Priority: normal
Milestone: 0.10.4
Component: version control/browser
Version: 0.9.3
Severity: minor
Keywords: utf-8 unicode
Cc: vyt@…, blaufalke@…, m@…, trac@…, zak_trac@…, stock@…
Branch:
Release Notes:
API Changes:
Internal Changes:


Sorry for my bad English.

I write my SVN commit messages in Russian. To see these messages correctly on the timeline page, I set my web browser (Opera) to the Unicode (UTF-8) encoding.

But when I use trac-post-commit-hook, the same messages, as added to the tickets they fix and close, are displayed in codepage 866.

Having to switch codepages is very inconvenient.

Server and workstation OS: Windows XP SP2

I would be glad of any help.

trac-post-commit-hook.0.10-stable.diff (2.1 KB ) - added by markus 18 years ago.
Patch for international characters in changeset and tickets
trac-post-commit-hook.0.10-stable.2.diff (2.0 KB ) - added by markus 18 years ago.
Patch for international characters in changeset and tickets v2

comment:1 by anonymous, 20 years ago

Why did you close this, anonymous?

comment:4 by GregZ, 20 years ago

Component: timeline → browser

To solve this problem, I added two lines to trac-post-commit-hook:

    def __init__(self, project=options.project, author=options.user,
                 rev=options.rev, msg=options.msg):
        self.author = author
        self.rev = rev

        # The next two lines are the ones I added:
        csmsg = "(In [%s]) %s" % (rev, msg)
        self.msg = util.to_utf8(csmsg, 'windows-1251')

        self.now = int(time.time())
        self.con = sqlite.connect(os.path.join(project, 'db', 'trac.db'), autocommit=0)
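
For readers on a current Python, the recoding performed by the two added lines can be sketched roughly as follows. This is only an illustration in Python 3: recode_to_utf8 is a hypothetical helper, not Trac's util.to_utf8, and windows-1251 is simply the source encoding used above.

```python
def recode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Hypothetical helper: reinterpret bytes from source_encoding as UTF-8 bytes."""
    # Decode with the encoding the bytes were really written in,
    # then encode the resulting text as UTF-8.
    return raw.decode(source_encoding).encode("utf-8")


# A commit message as a windows-1251 client would hand it over.
raw_message = "Привет".encode("windows-1251")
utf8_message = recode_to_utf8(raw_message, "windows-1251")
```

The weak point of this approach is the hard-coded source encoding: it only works while every committer's client really produces windows-1251.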

Priority: highest → low
Severity: normal → minor

This is not high-priority, being just a contrib script.

Which svn client do you use to write the commit messages? This is strange, since Subversion's normal encoding is UTF-8.

PS: We can discuss this directly - email/jabber: vyt@vzljot.ru

I think the problem lies in the way the log message is obtained.

svnlook recodes its output to the current locale's charset, and there is no way to disable this recoding. Detecting the charset in the post-commit hook would be overhead, so the best way, IMHO, is to use the Subversion Python API to get the log message.
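
A small Python 3 sketch of why guessing the charset inside the hook is fragile: bytes written in one single-byte Cyrillic codepage also decode without error in another, just to the wrong text. The sample word and the list of candidate encodings are assumptions chosen for illustration.

```python
def decodes_cleanly(raw: bytes, encoding: str) -> bool:
    """Return True if raw can be decoded with encoding without an error."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


message = "Привет"
raw = message.encode("cp1251")

# UTF-8 rejects these bytes, but both Cyrillic codepages accept them,
# so a successful decode says nothing about which one is correct.
candidates = [enc for enc in ("utf-8", "cp866", "cp1251")
              if decodes_cleanly(raw, enc)]
wrong_guess = raw.decode("cp866")
```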

comment:8 by dans, 20 years ago

vyt >>

That sounds reasonable, but I'm feeling a bit lost here.

I'm on Windows XP, running the "post-commit.cmd" from #1602. I guess I should replace the line

FOR /F "usebackq delims==" %%i IN (`%%SVNLOOK%% log -r %TXN% %REPOS%`) DO SET LOG=%%i

with something calling the Python-SVN API, but I neither fully understand the current line nor have any clue about the API.

Can I just replace the part between the parentheses? And if so, what do I replace it with?

comment:9 by Christopher Lenz, 19 years ago

Milestone: 0.9

comment:10 by vyt@…, 19 years ago

At the very least, trac-post-commit-hook should contain a note that, on systems with a non-UTF-8 locale, svnlook has to be invoked with a UTF-8 locale, like LANG=ru_RU.UTF-8 svnlook...

in MYREPO/hooks/post-commit

I think this should be pointed out in the hook's documentation.

(I'm working on trac-0.9.3 and trunk/contrib/trac-post-commit-hook)
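
If the hook is driven from Python rather than from a shell script, the same idea (forcing a UTF-8 locale for svnlook only) might look like the sketch below. The helper name svnlook_log_call is hypothetical, and dropping LC_ALL is my own assumption: on POSIX systems LC_ALL takes precedence over LANG, so a leftover value would defeat the override.

```python
import os


def svnlook_log_call(repos, rev, locale_name="ru_RU.UTF-8", base_env=None):
    """Hypothetical helper: build argv and environment for `svnlook log`."""
    env = dict(os.environ if base_env is None else base_env)
    env.pop("LC_ALL", None)      # LC_ALL would override LANG
    env["LANG"] = locale_name    # ask svnlook for UTF-8 output
    argv = ["svnlook", "log", "-r", str(rev), repos]
    return argv, env


argv, env = svnlook_log_call("/path/to/MYREPO", 353,
                             base_env={"LC_ALL": "C", "PATH": "/usr/bin"})
```

The pair can then be passed to subprocess.run(argv, env=env, capture_output=True) and the output decoded as UTF-8. The locale has to be installed on the server for this to have any effect.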

comment:12 by Markus Tacker <m@…>, 19 years ago

comment:13 by Markus Tacker <m@…>, 19 years ago

#1625 has been marked as duplicate of this bug.

comment:14 by Christian Boos, 19 years ago

#2845 has also been marked as duplicate of this bug.

comment:15 by Christian Boos, 19 years ago

#2352 is yet another duplicate.

comment:17 by Markus Tacker <m@…>, 18 years ago

#3732 has been marked as duplicate of this bug.

comment:18 by Christian Boos, 18 years ago

Milestone: 0.10
Owner: changed from Jonas Borgström to Christian Boos
Priority: low → normal
Status: reopened → new

Heh, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

Can you try out the following patch? (be careful, it's for 0.10, don't apply on 0.9)

Index: trac-post-commit-hook
--- trac-post-commit-hook	(revision 3720)
+++ trac-post-commit-hook	(working copy)
@@ -78,6 +78,7 @@
 from trac.ticket import Ticket
 from trac.ticket.web_ui import TicketModule
 # TODO: move grouped_changelog_entries to model.py
+from trac.util.text import to_unicode
 from trac.web.href import Href
@@ -101,6 +102,8 @@
                   help='The user who is responsible for this action')
 parser.add_option('-m', '--msg', dest='msg',
                   help='The log message to search.')
+parser.add_option('-c', '--encoding', dest='encoding',
+                  help='The encoding used by the log message.')
 parser.add_option('-s', '--siteurl', dest='url',
                   help='The base URL to the project\'s trac website (to which '
                        '/ticket/## is appended).  If this is not specified, '
@@ -132,7 +135,9 @@
                        'see':        '_cmdRefs'}
     def __init__(self, project=options.project, author=options.user,
-                 rev=options.rev, msg=options.msg, url=options.url):
+                 rev=options.rev, msg=options.msg, url=options.url,
+                 encoding=options.encoding):
+        msg = to_unicode(msg, encoding)
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)

Since I think you'd normally get UTF-8 strings when Subversion calls the post-commit hook, the -c/--encoding option shouldn't be needed.

However it can be useful when testing the script from the command line, or if for some reason the post-commit hook is actually given the message with a different encoding.
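
As a rough Python 3 sketch of what an optional encoding argument buys you: try the encoding the caller passed, otherwise assume UTF-8, and only then fall back to something that cannot fail. decode_message is a hypothetical function written for this illustration; it is not the implementation of Trac's to_unicode, and the latin-1 last resort is an assumption.

```python
def decode_message(raw: bytes, encoding=None) -> str:
    """Hypothetical helper: decode a log message, preferring an explicit encoding."""
    if encoding:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            pass  # unknown or wrong encoding: fall through to the defaults
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this never raises,
        # but the result may be wrong for other codepages.
        return raw.decode("latin-1")


from_svn = decode_message("für".encode("utf-8"))
from_cli = decode_message("Привет".encode("cp1251"), "cp1251")
```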

Replying to cboos:

Heh, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

The patch works. Thanks!!

comment:20 by Christian Boos, 18 years ago

Resolution: fixed
Status: new → closed

Ok, patch applied in r3743, and I'm going to close this, so we're going to find out if #3732 was really a duplicate ;)

comment:21 by Markus Tacker <m@…>, 18 years ago

Confirmed on r3747

comment:22 by Christian Boos, 18 years ago

Confirmed what, the fix or the bug? I assume the former, otherwise I guess you'd have reopened ;)

comment:24 by blaufalke@…, 18 years ago

Resolution: fixed
Status: closed → reopened

It doesn't work for me with Trac 0.10. The changeset itself displays the log message with the correct German umlauts, but the comments generated on the fixed/addressed tickets are garbled, even with trac-post-commit-hook from the current trunk.

Expected result:

(In [353]) "Testcommit für den trac-post-commit-hook addresses #2 fixes #1"

Actual result:

(In [353]) "Testcommit f?\195?\188r den trac-post-commit-hook addresses #2 fixes #1"
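
The \195 and \188 in the actual result are the decimal values of the two bytes that encode "ü" in UTF-8 (0xC3 0xBC), which suggests the message reached the ticket as escaped UTF-8 bytes rather than as text. A short Python 3 check; decimal_escape is a hypothetical helper for this illustration and does not try to reproduce the "?" characters seen above.

```python
def decimal_escape(raw: bytes) -> str:
    """Hypothetical helper: show non-ASCII bytes as backslash plus decimal value."""
    return "".join(chr(b) if b < 128 else "\\%d" % b for b in raw)


utf8_bytes = "für".encode("utf-8")
byte_values = list(utf8_bytes)          # decimal value of each byte
escaped = decimal_escape(utf8_bytes)
```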

Replying to GregZ:

I have the same problem as blaufalke

comment:29 by Christian Boos, 18 years ago

Milestone: 1.0

See TracDev/Proposals/Journaling. I think that we could eventually get rid of the trac-post-commit-hook (and hence get rid of the encoding issues as well).

comment:30 by stock@…, 18 years ago

I made the following changes to the script, which now uses the Trac API to get the changeset message. This avoids the conversion to the locale encoding, I think. It works for me with German umlauts entered through TortoiseSVN.

Index: trac-post-commit-hook
--- trac-post-commit-hook
+++ trac-post-commit-hook
@@ -80,3 +80,5 @@
 # TODO: move grouped_changelog_entries to model.py
 from trac.util.text import to_unicode
 from trac.web.href import Href
+from trac.versioncontrol.svn_fs import SubversionRepository, SubversionChangeset
+from trac.log import logger_factory
@@ -85,1 +87,1 @@
     from optparse import OptionParser
@@ -137,12 +139,18 @@
     def __init__(self, project=options.project, author=options.user,
                  rev=options.rev, msg=options.msg, url=options.url,
                  encoding=options.encoding):
-        msg = to_unicode(msg, encoding)
+        self.env = open_environment(project)
+        repos_dir = self.env.config.get('trac', 'repository_dir')
+        repos = SubversionRepository(repos_dir,
+                             '', logger_factory('test'))
+        change = repos.get_changeset(rev)
+        msg = change.message
+        to_unicode(msg)
+        msg = msg.decode('utf-8')
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)
         self.now = int(time.time())
-        self.env = open_environment(project)
         if url is None:
             url = self.env.config.get('project', 'url')
         self.env.href = Href(url)
@@ -196,5 +204,5 @@
 if __name__ == "__main__":
-    if len(sys.argv) < 5:
+    if len(sys.argv) < 4:
         print "For usage: %s --help" % (sys.argv[0])
     else:
         CommitHook()

comment:31 by markus, 18 years ago

The above trac api solution works for me!

You could also get the author name that way, which would simplify the calling batch file. Replace

    self.author = author

with

    self.author = change.author

and likewise replace

    if len(sys.argv) < 4:

with

    if len(sys.argv) < 3:

The batch files in the hooks directories of all repositories simply call another batch script in a directory outside the repository directory. post-commit.cmd:

%~dp0\..\..\..\hooks\trac-post-commit-hook.cmd %1 %2

The trac-post-commit-hook.cmd (a simplified version of the one from #1602) calls the Python script. It can be used for all the repositories as long as the Trac environments have the same names as the Subversion repositories:



:: Modify paths and port number here
SET TRAC_ENV=D:\trac\trac_env\%REPNAME%
SET PYTHON="%PYTHON_DIR%\python.exe"

:: Do not execute hook if trac environment does not exist

%PYTHON% "%~dp0\trac-post-commit-hook.py" -p "%TRAC_ENV%" -r "%REV%" -s "%TRAC_URL%"

by markus, 18 years ago

Patch for international characters in changeset and tickets

comment:32 by markus, 18 years ago

I attached a patch, attachment:trac-post-commit-hook.0.10-stable.diff, which applies all the changes from stock@… as well as the ones I described above to the latest version from 0.10-stable. It works for me: Cyrillic characters in the changeset and in the ticket comments are displayed correctly. The same changes could be applied to trunk for 0.11.

comment:33 by Christian Boos, 18 years ago

A small fix to your patch:

repos_dir = self.env.config.get('trac', 'repository_dir') 
repos = self.env.get_repository() # will do a `sync` if needed 

(and then the log related import is not needed)

Besides the above, I think it's a step in the right direction.

comment:34 by Christian Boos, 18 years ago

Milestone: 1.0 → 0.11

Whoops, and I didn't see the msg = msg.decode('utf-8') line. This is unnecessary and potentially harmful.
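
One way this can hurt, assuming change.message is already a unicode object (the ticket does not say so explicitly): in Python 2, calling .decode('utf-8') on a unicode object first encodes it implicitly with the ASCII codec, which fails on any non-ASCII character. Python 3 strings have no .decode() at all, so the sketch below emulates the Python 2 behaviour explicitly; py2_style_decode is a hypothetical name.

```python
def py2_style_decode(text: str) -> str:
    """Emulate Python 2's unicode.decode('utf-8'): implicit ASCII encode first."""
    return text.encode("ascii").decode("utf-8")


ascii_ok = py2_style_decode("fixes #1")   # pure ASCII survives

try:
    py2_style_decode("Testcommit für den trac-post-commit-hook")
    failed = False
except UnicodeEncodeError:
    failed = True                          # the umlaut breaks the hidden ASCII step
```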

by markus, 18 years ago

Patch for international characters in changeset and tickets v2

comment:35 by markus, 18 years ago

I made a second patch, attachment:trac-post-commit-hook.0.10-stable.2.diff, containing your fixes. I hope this is better (it's my first day with Python, I'm just trying to put together a patch from what works on my Windows box…). I tested it with Trac 0.10.3 and Subversion 1.4.2; French accents and Cyrillic characters work fine.

comment:36 by Christian Boos, 18 years ago

Resolution: fixed
Status: reopened → closed

Thanks for the patch and the tests. I've committed a slightly different version, which won't break compatibility with the previous script.

Fixed in r4532, r4533 (trunk) and r4534 (0.10-stable).

Please try it out once more and let me know whether everything works for you.

comment:37 by markus, 18 years ago

Works great for me, thanks!

