Opened 20 years ago

Closed 18 years ago

Last modified 9 years ago

#1310 closed defect (fixed)

trac-post-commit-hook and codepages

Reported by: GregZ
Owned by: Christian Boos
Priority: normal
Milestone: 0.10.4
Component: version control/browser
Version: 0.9.3
Severity: minor
Keywords: utf-8 unicode
Cc: vyt@…, blaufalke@…, m@…, trac@…, zak_trac@…, stock@…
Branch:
Release Notes:
API Changes:
Internal Changes:

Description

Sorry for my bad English.

I write my SVN commit comments in Russian. To see these comments correctly on the timeline page, I set the encoding in my web browser (Opera) to Unicode (UTF-8).

But when I use trac-post-commit-hook, the same comments, as added to the tickets they fix and close, are displayed in code page 866.

Switching code pages back and forth is very inconvenient.

Server and workstation OS: Windows XP SP2

I would be grateful for any help.

Attachments (2)

trac-post-commit-hook.0.10-stable.diff (2.1 KB ) - added by markus 18 years ago.
Patch for international characters in changeset and tickets
trac-post-commit-hook.0.10-stable.2.diff (2.0 KB ) - added by markus 18 years ago.
Patch for international characters in changeset and tickets v2

Change History (37)

comment:1 by anonymous, 20 years ago

Component: general → timeline
Milestone: 0.8.2
Priority: normal → highest

comment:2 by anonymous, 20 years ago

Status: new → closed

comment:3 by Christopher Lenz, 20 years ago

Status: closed → reopened

Why did you close this, anonymous?

comment:4 by GregZ, 20 years ago

Component: timeline → browser

To solve this problem, I added two lines to trac-post-commit-hook:

    def __init__(self, project=options.project, author=options.user, rev=options.rev, msg=options.msg):
        self.author = author
        self.rev = rev

        # The two added lines:
        csmsg = "(In [%s]) %s" % (rev, msg)
        self.msg = util.to_utf8(csmsg, 'windows-1251')

        self.now = int(time.time())
        self.con = sqlite.connect(os.path.join(project, 'db', 'trac.db'), autocommit=0)

comment:5 by Christopher Lenz, 20 years ago

Milestone: 0.8.2 → 0.9
Priority: highest → low
Severity: normal → minor

This is not high-priority, being just a contrib script.

comment:6 by anonymous, 20 years ago

Which svn client do you use to write the comments? This is strange, since Subversion normally uses UTF-8.

PS: We can discuss this directly by email/jabber: vyt@vzljot.ru

comment:7 by vyt@…, 20 years ago

Cc: vyt@… added

I think the problem lies in how the log message is obtained.

svnlook recodes its output to the charset of the current locale, and there is no way to disable this. Detecting the charset inside the post-commit hook would be overkill, so the best way, IMHO, is to use the Subversion Python API to get the log message.
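
To illustrate the recoding problem described above, here is a minimal Python sketch. It assumes the hook receives the log message re-encoded to the console code page (cp866, as in the original report) while Trac expects UTF-8. The sample message and variable names are made up for illustration.

```python
# Illustration only: what happens when a UTF-8 consumer receives bytes
# that were re-encoded to a legacy code page (cp866 is assumed here).
message = "Исправлена ошибка"  # hypothetical log message

as_utf8 = message.encode("utf-8")    # Subversion stores log messages as UTF-8
as_cp866 = message.encode("cp866")   # what a cp866 locale would hand to the hook

# The cp866 bytes are not valid UTF-8, so a strict decode fails.
try:
    as_cp866.decode("utf-8")
    cp866_is_valid_utf8 = True
except UnicodeDecodeError:
    cp866_is_valid_utf8 = False

# Decoding with the encoding that was actually used recovers the text.
recovered = as_cp866.decode("cp866")

print(cp866_is_valid_utf8)
print(recovered == message)
```

If the bytes are stored without conversion, a browser set to UTF-8 shows garbage, and only switching the browser to the legacy code page makes the text readable, which matches the symptom in the description.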

comment:8 by dans, 19 years ago

vyt >>

That sounds reasonable, but I'm feeling a bit lost here.

I'm on Windows XP, running the "post-commit.cmd" from #1602. I guess I should replace the line

FOR /F "usebackq delims==" %%i IN (`%%SVNLOOK%% log -r %TXN% %REPOS%`) DO SET LOG=%%i

with something calling the Python-SVN API, but I neither fully understand the current line nor have any clue about the API.

Can I just replace the part between the parentheses? And if so, what do I replace it with?

comment:9 by Christopher Lenz, 19 years ago

Milestone: 0.9

comment:10 by vyt@…, 19 years ago

At the very least, trac-post-commit-hook should contain a note about invoking svnlook under a UTF-8 locale when the system locale is not UTF-8, e.g. LANG=ru_RU.UTF-8 svnlook...

comment:11 by szaman, 19 years ago

Version: 0.8.1 → 0.9.3

I set:

LC_ALL="pl_PL.UTF-8"

in MYREPO/hooks/post-commit.

I think this should be mentioned in the hook's documentation.

(I'm working on trac-0.9.3 and trunk/contrib/trac-post-commit-hook)
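
For hooks that call svnlook from Python rather than from a shell script, the same locale workaround might look roughly like the sketch below. The function names are hypothetical, and it assumes that the chosen UTF-8 locale is installed on the server and that svnlook is on the PATH.

```python
import os
import subprocess


def svnlook_log_call(repos, rev, locale_name="en_US.UTF-8"):
    """Build the argv and environment for `svnlook log` under a UTF-8 locale."""
    env = dict(os.environ)
    env["LC_ALL"] = locale_name  # takes precedence over LANG and other LC_* variables
    argv = ["svnlook", "log", "-r", str(rev), repos]
    return argv, env


def read_log_message(repos, rev, locale_name="en_US.UTF-8"):
    """Run svnlook and decode its output as UTF-8 (requires svnlook on PATH)."""
    argv, env = svnlook_log_call(repos, rev, locale_name)
    output = subprocess.run(argv, env=env, check=True,
                            stdout=subprocess.PIPE).stdout
    return output.decode("utf-8")
```

Keeping the command construction separate from the call makes it possible to check the environment without a repository at hand.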

comment:12 by Markus Tacker <m@…>, 19 years ago

Cc: m@… added
Keywords: utf-8 unicode added

comment:13 by Markus Tacker <m@…>, 19 years ago

#1625 has been marked as duplicate of this bug.

comment:14 by Christian Boos, 19 years ago

#2845 has also been marked as duplicate of this bug.

comment:15 by Christian Boos, 19 years ago

#2352 is yet another duplicate.

comment:17 by Markus Tacker <m@…>, 18 years ago

#3732 has been marked as duplicate of this bug.

comment:18 by Christian Boos, 18 years ago

Milestone: 0.10
Owner: changed from Jonas Borgström to Christian Boos
Priority: low → normal
Status: reopened → new

He, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

Can you try out the following patch? (Be careful: it's for 0.10, don't apply it to 0.9.)

Index: trac-post-commit-hook
===================================================================
--- trac-post-commit-hook	(revision 3720)
+++ trac-post-commit-hook	(working copy)
@@ -78,6 +78,7 @@
 from trac.ticket import Ticket
 from trac.ticket.web_ui import TicketModule
 # TODO: move grouped_changelog_entries to model.py
+from trac.util.text import to_unicode
 from trac.web.href import Href
 
 try:
@@ -101,6 +102,8 @@
                   help='The user who is responsible for this action')
 parser.add_option('-m', '--msg', dest='msg',
                   help='The log message to search.')
+parser.add_option('-c', '--encoding', dest='encoding',
+                  help='The encoding used by the log message.')
 parser.add_option('-s', '--siteurl', dest='url',
                   help='The base URL to the project\'s trac website (to which '
                        '/ticket/## is appended).  If this is not specified, '
@@ -132,7 +135,9 @@
                        'see':        '_cmdRefs'}
 
     def __init__(self, project=options.project, author=options.user,
-                 rev=options.rev, msg=options.msg, url=options.url):
+                 rev=options.rev, msg=options.msg, url=options.url,
+                 encoding=options.encoding):
+        msg = to_unicode(msg, encoding)
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)

Normally, I think you'd get UTF-8 strings when Subversion calls the post-commit hook, so the -c/--encoding option shouldn't be needed.

However it can be useful when testing the script from the command line, or if for some reason the post-commit hook is actually given the message with a different encoding.
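
As a rough illustration of how such an option can be used, here is a standalone Python sketch. The decode_message helper is a hypothetical stand-in, not Trac's to_unicode, and the command line is simulated rather than read from sys.argv.

```python
from optparse import OptionParser


def decode_message(raw, encoding=None):
    """Decode raw bytes using `encoding`, defaulting to UTF-8.

    Hypothetical helper for illustration; undecodable bytes are replaced
    instead of raising an exception.
    """
    if isinstance(raw, str):
        return raw  # already text, nothing to do
    return raw.decode(encoding or "utf-8", "replace")


parser = OptionParser()
parser.add_option("-m", "--msg", dest="msg",
                  help="The log message to search.")
parser.add_option("-c", "--encoding", dest="encoding",
                  help="The encoding used by the log message.")

# Simulated command line; a real hook would call parser.parse_args().
options, _ = parser.parse_args(["-m", "fixes #1", "-c", "cp1251"])

print(options.encoding)
print(decode_message("Тест".encode("cp1251"), options.encoding))
```

When no encoding is given, the helper falls back to UTF-8, which mirrors the expectation that the option is rarely needed.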

in reply to:  18 comment:19 by sier@…, 18 years ago

Replying to cboos:

He, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

The patch works. Thanks!!

comment:20 by Christian Boos, 18 years ago

Resolution: fixed
Status: new → closed

Ok, patch applied in r3743, and I'm going to close this, so we're going to find out if #3732 was really a duplicate ;)

comment:21 by Markus Tacker <m@…>, 18 years ago

Confirmed on r3747

comment:22 by Christian Boos, 18 years ago

Confirmed what, the fix or the bug? I assume the former, otherwise I guess you'd have reopened ;)

comment:24 by blaufalke@…, 18 years ago

Resolution: fixed
Status: closed → reopened

This doesn't work for me with Trac 0.10. The changeset itself displays the log message with the correct German umlauts, but the comments generated on the fixed/addressed tickets are garbled, even with trac-post-commit-hook from the current trunk.

Expected result:

(In [353]) "Testcommit für den trac-post-commit-hook addresses #2 fixes #1"

Actual result:

(In [353]) "Testcommit f?\195?\188r den trac-post-commit-hook addresses #2 fixes #1"
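
The `?\195?\188` sequence in the actual result looks like a pair of decimal byte escapes: 195 and 188 are 0xC3 and 0xBC, which is the UTF-8 encoding of `ü`. That would suggest the message reached the hook as escaped UTF-8 bytes rather than as decoded text. A small Python sketch, with a hypothetical helper name, that reverses the escaping under that assumption:

```python
import re


def unescape_decimal_bytes(text):
    """Turn '?\\195?\\188'-style decimal byte escapes back into text.

    Assumes the escaped bytes are UTF-8 and the rest of the text is ASCII.
    """
    raw = re.sub(
        rb"\?\\(\d{3})",                      # '?', a backslash, three digits
        lambda m: bytes([int(m.group(1))]),  # decimal value -> single byte
        text.encode("ascii"),
    )
    return raw.decode("utf-8")


garbled = r"Testcommit f?\195?\188r den trac-post-commit-hook"
print(unescape_decimal_bytes(garbled))
```

This is a diagnostic aid only; the actual fix is to obtain the message without the lossy conversion in the first place.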

in reply to:  description comment:25 by anonymous, 18 years ago

Replying to GregZ:

Sorry for my bad English.

I write my SVN commit comments in Russian. To see these comments correctly on the timeline page, I set the encoding in my web browser (Opera) to Unicode (UTF-8).

But when I use trac-post-commit-hook, the same comments, as added to the tickets they fix and close, are displayed in code page 866.

Switching code pages back and forth is very inconvenient.

Server and workstation OS: Windows XP SP2

I would be grateful for any help.

comment:26 by l2k, 18 years ago

Cc: trac@… added

I have the same problem as blaufalke

comment:27 by anonymous, 18 years ago

Cc: zak_trac@… added

comment:28 by anonymous, 18 years ago

Cc: blaufalke@… added

comment:29 by Christian Boos, 18 years ago

Milestone: 1.0

See TracDev/Proposals/Journaling. I think that we could eventually get rid of the trac-post-commit-hook in the future (and hence get rid of the encoding issues as well).

comment:30 by stock@…, 18 years ago

Cc: stock@… added

I made the following changes to the script, which now uses the Trac API to get the change message. That way the conversion to the locale charset is avoided, I think. It works for me with German umlauts entered through TortoiseSVN.

  • trac-post-commit-hook

     
--- trac-post-commit-hook
+++ trac-post-commit-hook
@@ -80,6 +80,8 @@
 # TODO: move grouped_changelog_entries to model.py
 from trac.util.text import to_unicode
 from trac.web.href import Href
+from trac.versioncontrol.svn_fs import SubversionRepository, SubversionChangeset
+from trac.log import logger_factory
 
 try:
     from optparse import OptionParser
@@ -137,12 +139,18 @@
     def __init__(self, project=options.project, author=options.user,
                  rev=options.rev, msg=options.msg, url=options.url,
                  encoding=options.encoding):
-        msg = to_unicode(msg, encoding)
+        self.env = open_environment(project)
+        repos_dir = self.env.config.get('trac', 'repository_dir')
+        repos = SubversionRepository(repos_dir,
+                             '', logger_factory('test'))
+        change = repos.get_changeset(rev)
+        msg = change.message
+        to_unicode(msg)
+        msg = msg.decode('utf-8')
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)
         self.now = int(time.time())
-        self.env = open_environment(project)
         if url is None:
             url = self.env.config.get('project', 'url')
         self.env.href = Href(url)
@@ -194,7 +202,7 @@
 
 
 if __name__ == "__main__":
-    if len(sys.argv) < 5:
+    if len(sys.argv) < 4:
         print "For usage: %s --help" % (sys.argv[0])
     else:
         CommitHook()

comment:31 by markus, 18 years ago

The above Trac API solution works for me!

You could also get the author name that way, this would simplify the calling batch file:

self.author = author

becomes

self.author = change.author

as well as

if len(sys.argv) < 4:

becomes

if len(sys.argv) < 3:

The batch files in the hooks directories of all repositories simply call another batch script located outside the repository directory. post-commit.cmd:

%~dp0\..\..\..\hooks\trac-post-commit-hook.cmd %1 %2

The trac-post-commit-hook.cmd (a cleaned-up version of the one from #1602) calls the Python script. This script can be used for all the repositories as long as the Trac environments have the same names as the Subversion repositories:

@ECHO OFF

SET REV=%2
SET REPNAME=%~nx1

:: Modify paths and port number here
SET TRAC_ENV=D:\trac\trac_env\%REPNAME%
SET PYTHON_DIR=C:\Python24
SET TRAC_URL=http://%COMPUTERNAME%:8080/%REPNAME%
SET PYTHON="%PYTHON_DIR%\python.exe"

:: Do not execute hook if trac environment does not exist
IF NOT EXIST %TRAC_ENV% GOTO :EOF

%PYTHON% "%~dp0\trac-post-commit-hook.py" -p "%TRAC_ENV%" -r "%REV%" -s "%TRAC_URL%"
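
The naming convention the batch file relies on (Trac environment name equals repository name) can be sketched in Python as follows. The function name and the default computer name are hypothetical; the base path and port are the ones from the batch file above.

```python
import ntpath  # Windows path rules, regardless of the platform running this


def hook_settings(repos_path, rev, computer_name="localhost"):
    """Derive the Trac environment path and URL from the repository path."""
    # Equivalent of %~nx1: the last component of the repository path.
    repname = ntpath.basename(repos_path.rstrip("\\/"))
    return {
        "trac_env": ntpath.join(r"D:\trac\trac_env", repname),
        "trac_url": "http://%s:8080/%s" % (computer_name, repname),
        "rev": str(rev),
    }


settings = hook_settings(r"D:\svn\myproject", 12, "BUILDBOX")
print(settings["trac_env"])
print(settings["trac_url"])
```

A real hook would still need to check that the derived environment directory exists before calling the script, as the batch file does.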

by markus, 18 years ago

Patch for international characters in changeset and tickets

comment:32 by markus, 18 years ago

I attached a patch, attachment:trac-post-commit-hook.0.10-stable.diff, which applies all the changes from stock@… as well as the one I described above to the latest version from 0.10-stable. It works for me: Cyrillic characters in the changeset and the ticket comments are displayed correctly. The same changes could be applied to the 0.11 trunk.

comment:33 by Christian Boos, 18 years ago

A small fix to your patch:

repos_dir = self.env.config.get('trac', 'repository_dir') 
repos = self.env.get_repository() # will do a `sync` if needed 

(and then the log related import is not needed)

Besides the above, I think it's a step in the right direction.

comment:34 by Christian Boos, 18 years ago

Milestone: 1.0 → 0.11

Whoops, and I didn't see the msg = msg.decode('utf-8') line. This is unnecessary and potentially harmful.
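
On why the extra decode can be harmful: in Python 2, calling .decode('utf-8') on a value that is already a unicode object first encodes it implicitly with the ASCII codec, which raises UnicodeEncodeError as soon as the message contains a non-ASCII character. A minimal sketch of a guard that decodes only raw bytes (written for Python 3; the helper name is hypothetical):

```python
def ensure_text(value, encoding="utf-8"):
    """Return `value` as text, decoding only if it is still raw bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value  # already decoded: decoding again would be wrong


already_text = "Testcommit für den trac-post-commit-hook"
still_bytes = already_text.encode("utf-8")

print(ensure_text(already_text) == ensure_text(still_bytes))
```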

by markus, 18 years ago

Patch for international characters in changeset and tickets v2

comment:35 by markus, 18 years ago

I made a second patch, attachment:trac-post-commit-hook.0.10-stable.2.diff, containing your fixes. I hope this is better (it's my first day with Python, I'm just trying to put together a patch from what works on my Windows box…). I tested it with Trac 0.10.3 and Subversion 1.4.2; French accents and Cyrillic characters work fine.

comment:36 by Christian Boos, 18 years ago

Milestone: 0.11 → 0.10.4
Resolution: fixed
Status: reopened → closed

Thanks for the patch and the tests. I've committed a slightly different version, which won't break compatibility with the previous script.

Fixed in r4532, r4533 (trunk) and r4534 (0.10-stable).

Please try it out once more and let me know if everything works for you.

comment:37 by markus, 18 years ago

Works great for me, thanks!
