Ticket #1310 (closed defect: fixed)

Opened 7 years ago

Last modified 5 years ago

trac-post-commit-hook and codepages

Reported by: GregZ
Owned by: cboos
Priority: normal
Milestone: 0.10.4
Component: version control/browser
Version: 0.9.3
Severity: minor
Keywords: utf-8 unicode
Cc: vyt@…, blaufalke@…, m@…, trac@…, zak_trac@…, stock@…
Release Notes:
API Changes:

Description

Sorry for my bad English.

I write my SVN commit messages in Russian.
To see these messages correctly on the timeline page, I set the encoding in my web browser (Opera) to Unicode (UTF-8).

But when I use trac-post-commit-hook, the same messages, added as comments to the tickets they fix and close, are displayed in codepage 866.

Switching codepages all the time is very inconvenient.

Server and workstation OS: Windows XP SP2

I would be glad of any help.

Attachments

trac-post-commit-hook.0.10-stable.diff (2.1 KB) - added by markus 5 years ago.
Patch for international characters in changeset and tickets
trac-post-commit-hook.0.10-stable.2.diff (2.0 KB) - added by markus 5 years ago.
Patch for international characters in changeset and tickets v2

Change History

comment:1 Changed 7 years ago by anonymous

  • Component changed from general to timeline
  • Milestone set to 0.8.2
  • Priority changed from normal to highest

comment:2 Changed 7 years ago by anonymous

  • Status changed from new to closed

comment:3 Changed 7 years ago by cmlenz

  • Status changed from closed to reopened

Why did you close this, anonymous?

comment:4 Changed 7 years ago by GregZ

  • Component changed from timeline to browser

To solve this problem, I added two lines to trac-post-commit-hook:

    def __init__(self, project=options.project, author=options.user,
                 rev=options.rev, msg=options.msg):
        self.author = author
        self.rev = rev

        # The next two lines are the addition: build the changeset message,
        # then convert it from windows-1251 to UTF-8.
        csmsg = "(In [%s]) %s" % (rev, msg)
        self.msg = util.to_utf8(csmsg, 'windows-1251')

        self.now = int(time.time())
        self.con = sqlite.connect(os.path.join(project, 'db', 'trac.db'), autocommit=0)

comment:5 Changed 7 years ago by cmlenz

  • Milestone changed from 0.8.2 to 0.9
  • Priority changed from highest to low
  • Severity changed from normal to minor

This is not high-priority, being just a contrib script.

comment:6 Changed 7 years ago by anonymous

Which svn client do you use for writing commit messages? That's strange, since the normal Subversion encoding is UTF-8.

PS: We can talk about this directly, by email/jabber: vyt@vzljot.ru

comment:7 Changed 7 years ago by vyt@…

  • Cc vyt@… added

I think the problem is the way the log message is obtained.

svnlook recodes its output to the current locale's charset, and there is no way to disable this recoding. Detecting the charset in the post-commit hook would be overhead, so the best way, IMHO, is to use the Subversion Python API to get the log message.
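
To make the recoding problem concrete, here is a minimal Python 3 sketch. It is an illustration only and not part of the hook; the sample message and the helper name `decode_log` are assumptions. It encodes a Russian message in CP866, as a tool writing to a CP866 locale would, and shows that reading those bytes back as UTF-8 garbles the text while the matching codec round-trips it.

```python
# Illustration only: bytes written in a locale codepage, then read as UTF-8.
# The sample message and the helper name are hypothetical.

def decode_log(raw: bytes, encoding: str) -> str:
    """Decode raw log bytes, replacing undecodable bytes with U+FFFD."""
    return raw.decode(encoding, errors="replace")

message = "Исправлена ошибка"          # "Fixed a bug"
raw_cp866 = message.encode("cp866")    # bytes as emitted under a CP866 locale

as_utf8 = decode_log(raw_cp866, "utf-8")    # wrong codec: garbled text
as_cp866 = decode_log(raw_cp866, "cp866")   # matching codec: round-trips

print(as_utf8 == message)    # False
print(as_cp866 == message)   # True
```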

comment:8 Changed 7 years ago by dans

vyt >>

That sounds reasonable, but I'm feeling a bit lost here.

I'm on Windows XP, running the "post-commit.cmd" from #1602. I guess I should replace the line

FOR /F "usebackq delims==" %%i IN (`%%SVNLOOK%% log -r %TXN% %REPOS%`) DO SET LOG=%%i

with something calling the Python-SVN API, but I neither fully understand the current line nor have any clue about the API.

Can I just replace the part between the parentheses? And if so, what do I replace it with?

comment:9 Changed 7 years ago by cmlenz

  • Milestone 0.9 deleted

comment:10 Changed 7 years ago by vyt@…

At the very least, trac-post-commit-hook should contain a note about how to invoke svnlook in non-UTF-8 locales, e.g. LANG=ru_RU.UTF-8 svnlook...

comment:11 Changed 6 years ago by szaman

  • Version changed from 0.8.1 to 0.9.3

I set:

LC_ALL="pl_PL.UTF-8"

in MYREPO/hooks/post-commit

I think this should be pointed out in the hook's documentation.

(I'm working on trac-0.9.3 and trunk/contrib/trac-post-commit-hook)
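
For a hook written in Python rather than shell, the same workaround could look roughly like the sketch below. It assumes Python 3, that `svnlook` is on the PATH, and that the chosen UTF-8 locale is installed on the server; the function names `utf8_env` and `svnlook_log` and the default locale are hypothetical.

```python
import os
import subprocess

def utf8_env(locale_name="en_US.UTF-8", base=None):
    """Return a copy of the environment with LC_ALL set to a UTF-8 locale."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = locale_name   # assumption: this locale exists on the server
    return env

def svnlook_log(repos_path, rev, locale_name="en_US.UTF-8"):
    """Run `svnlook log -r REV REPOS` under a UTF-8 locale and return text."""
    result = subprocess.run(
        ["svnlook", "log", "-r", str(rev), repos_path],
        env=utf8_env(locale_name),
        stdout=subprocess.PIPE,
        check=True,               # raise if svnlook exits with an error
    )
    return result.stdout.decode("utf-8")
```

A call such as `svnlook_log("/path/to/repos", 353, "pl_PL.UTF-8")` would then return the log message as text, provided the locale is available.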

comment:12 Changed 6 years ago by Markus Tacker <m@…>

  • Cc m@… added
  • Keywords utf-8 unicode added

comment:13 Changed 6 years ago by Markus Tacker <m@…>

#1625 has been marked as duplicate of this bug.

comment:14 Changed 6 years ago by cboos

#2845 has also been marked as duplicate of this bug.

comment:15 Changed 6 years ago by cboos

#2352 is yet another duplicate.

comment:17 Changed 6 years ago by Markus Tacker <m@…>

#3732 has been marked as duplicate of this bug.

comment:18 follow-up: Changed 6 years ago by cboos

  • Milestone set to 0.10
  • Owner changed from jonas to cboos
  • Priority changed from low to normal
  • Status changed from reopened to new

He, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

Can you try out the following patch?
(be careful, it's for 0.10, don't apply on 0.9)

Index: trac-post-commit-hook
===================================================================
--- trac-post-commit-hook	(revision 3720)
+++ trac-post-commit-hook	(working copy)
@@ -78,6 +78,7 @@
 from trac.ticket import Ticket
 from trac.ticket.web_ui import TicketModule
 # TODO: move grouped_changelog_entries to model.py
+from trac.util.text import to_unicode
 from trac.web.href import Href
 
 try:
@@ -101,6 +102,8 @@
                   help='The user who is responsible for this action')
 parser.add_option('-m', '--msg', dest='msg',
                   help='The log message to search.')
+parser.add_option('-c', '--encoding', dest='encoding',
+                  help='The encoding used by the log message.')
 parser.add_option('-s', '--siteurl', dest='url',
                   help='The base URL to the project\'s trac website (to which '
                        '/ticket/## is appended).  If this is not specified, '
@@ -132,7 +135,9 @@
                        'see':        '_cmdRefs'}
 
     def __init__(self, project=options.project, author=options.user,
-                 rev=options.rev, msg=options.msg, url=options.url):
+                 rev=options.rev, msg=options.msg, url=options.url,
+                 encoding=options.encoding):
+        msg = to_unicode(msg, encoding)
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)

Since I think you'd normally get UTF-8 strings when Subversion calls the post-commit hook, the -c/--encoding option shouldn't be needed.

However it can be useful when testing the script from the command line, or if for some reason the post-commit hook is actually given the message with a different encoding.
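
The idea behind an optional encoding can be sketched in Python 3 as follows. `decode_message` is a hypothetical stand-in written for illustration, not Trac's `to_unicode`, and the fallback order (explicit encoding, then UTF-8, then Latin-1) is an assumption.

```python
def decode_message(raw, encoding=None):
    """Decode raw log bytes to text.

    Hypothetical helper for illustration; not Trac's to_unicode.
    """
    if encoding:
        return raw.decode(encoding)     # caller named the encoding (-c option)
    try:
        return raw.decode("utf-8")      # the usual case for Subversion logs
    except UnicodeDecodeError:
        return raw.decode("latin-1")    # last resort: every byte maps to a char

print(decode_message("für".encode("utf-8")))
print(decode_message("для".encode("cp1251"), "cp1251"))
```

With no encoding given, valid UTF-8 input is decoded as such, so the option only matters when the message arrives in some other encoding.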

comment:19 in reply to: ↑ 18 Changed 6 years ago by sier@…

Replying to cboos:

He, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

The patch works. Thanks!!

comment:20 Changed 6 years ago by cboos

  • Resolution set to fixed
  • Status changed from new to closed

Ok, patch applied in r3743, and I'm going to close this, so we're going to find out if #3732 was really a duplicate ;)

comment:21 Changed 6 years ago by Markus Tacker <m@…>

Confirmed on r3747

comment:22 Changed 6 years ago by cboos

Confirmed what, the fix or the bug? I assume the former, otherwise I guess you'd have reopened ;)

comment:24 Changed 6 years ago by blaufalke@…

  • Resolution fixed deleted
  • Status changed from closed to reopened

It doesn't work for me with Trac 0.10.
The changeset itself displays the log message with the correct German umlauts, but the comments added to the fixed / addressed tickets are garbled, even with trac-post-commit-hook from the current trunk.

Expected result:

(In [353]) "Testcommit für den trac-post-commit-hook addresses #2 fixes #1"

Actual result:

(In [353]) "Testcommit f?\195?\188r den trac-post-commit-hook addresses #2 fixes #1"
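
The `\195\188` in the actual result appears to be the two UTF-8 bytes of `ü` written as decimal escapes. A quick Python 3 check of that reading follows; the helper `decimal_escape` is hypothetical and only mimics the output format seen above.

```python
encoded = "ü".encode("utf-8")
print(list(encoded))          # the two byte values, in decimal

def decimal_escape(text):
    """Keep ASCII as-is; write each non-ASCII UTF-8 byte as ?\\NNN (decimal)."""
    return "".join(
        chr(b) if b < 128 else "?\\%d" % b
        for b in text.encode("utf-8")
    )

print(decimal_escape("für"))
```

This suggests the message reached the hook as UTF-8 bytes that were escaped because they could not be represented in the active locale.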

comment:26 Changed 6 years ago by l2k

  • Cc trac@… added

I have the same problem as blaufalke

comment:27 Changed 6 years ago by anonymous

  • Cc zak_trac@… added

comment:28 Changed 6 years ago by anonymous

  • Cc blaufalke@… added

comment:29 Changed 5 years ago by cboos

  • Milestone set to 1.0

See TracDev/Proposals/Journaling. I think we could eventually get rid of the trac-post-commit-hook (and hence of the encoding issues as well).

comment:30 Changed 5 years ago by stock@…

  • Cc stock@… added

I made the following changes to the script, which now uses the Trac API to get the change message. That avoids the conversion to the locale encoding, I think.
It works for me with German umlauts entered through TortoiseSVN.

Index: trac-post-commit-hook
===================================================================
--- trac-post-commit-hook
+++ trac-post-commit-hook
@@ -80,6 +80,8 @@
 # TODO: move grouped_changelog_entries to model.py
 from trac.util.text import to_unicode
 from trac.web.href import Href
+from trac.versioncontrol.svn_fs import SubversionRepository, SubversionChangeset
+from trac.log import logger_factory
 
 try:
     from optparse import OptionParser
@@ -137,12 +139,18 @@
     def __init__(self, project=options.project, author=options.user,
                  rev=options.rev, msg=options.msg, url=options.url,
                  encoding=options.encoding):
-        msg = to_unicode(msg, encoding)
+        self.env = open_environment(project)
+        repos_dir = self.env.config.get('trac', 'repository_dir')
+        repos = SubversionRepository(repos_dir,
+                             '', logger_factory('test'))
+        change = repos.get_changeset(rev)
+        msg = change.message
+        to_unicode(msg)
+        msg = msg.decode('utf-8')
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)
         self.now = int(time.time())
-        self.env = open_environment(project)
         if url is None:
             url = self.env.config.get('project', 'url')
         self.env.href = Href(url)
@@ -194,7 +202,7 @@
 
 
 if __name__ == "__main__":
-    if len(sys.argv) < 5:
+    if len(sys.argv) < 4:
         print "For usage: %s --help" % (sys.argv[0])
     else:
         CommitHook()

comment:31 Changed 5 years ago by markus

The above trac api solution works for me!

You could also get the author name that way, this would simplify the calling batch file:

self.author = author

becomes

self.author = change.author

as well as

if len(sys.argv) < 4:

becomes

if len(sys.argv) < 3:

The batch files in the hooks directories of all repositories simply call another batch script in a directory outside the repository directory.
post-commit.cmd:

%~dp0\..\..\..\hooks\trac-post-commit-hook.cmd %1 %2

The trac-post-commit-hook.cmd (purified version from #1602) calls the python script. This script can be used for all the repositories as long as the trac environments have the same names as the subversion repositories:

@ECHO OFF

SET REV=%2
SET REPNAME=%~nx1

:: Modify paths and port number here
SET TRAC_ENV=D:\trac\trac_env\%REPNAME%
SET PYTHON_DIR=C:\Python24
SET TRAC_URL=http://%COMPUTERNAME%:8080/%REPNAME%
SET PYTHON="%PYTHON_DIR%\python.exe"

:: Do not execute hook if trac environment does not exist
IF NOT EXIST %TRAC_ENV% GOTO :EOF

%PYTHON% "%~dp0\trac-post-commit-hook.py" -p "%TRAC_ENV%" -r "%REV%" -s "%TRAC_URL%"

Changed 5 years ago by markus

Patch for international characters in changeset and tickets

comment:32 Changed 5 years ago by markus

I attached a patch, attachment:trac-post-commit-hook.0.10-stable.diff, which applies all the changes from stock@… as well as the ones I described above to the latest version from 0.10-stable. It works for me: Cyrillic characters in the changeset and the ticket comments are displayed correctly. The same changes could be applied to the 0.11 trunk.

comment:33 Changed 5 years ago by cboos

A small fix to your patch:

repos_dir = self.env.config.get('trac', 'repository_dir') 
repos = self.env.get_repository() # will do a `sync` if needed 

(and then the log related import is not needed)

Besides the above, I think it's a step in the right direction.

comment:34 Changed 5 years ago by cboos

  • Milestone changed from 1.0 to 0.11

Whoops, and I didn't see the msg = msg.decode('utf-8') line. This is unnecessary and potentially harmful.
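
One way to read this remark: `change.message` is presumably already a unicode string at that point, and in Python 2 calling `.decode('utf-8')` on a unicode string first encodes it implicitly with the ASCII codec, which raises an error on non-ASCII text. The Python 3 sketch below shows the safer pattern of decoding only when the value is still bytes; the helper name `ensure_text` is hypothetical.

```python
def ensure_text(value, encoding="utf-8"):
    """Return text, decoding only if `value` is still a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value                        # already text: leave it untouched

print(ensure_text("für"))                   # text stays as it is
print(ensure_text("für".encode("utf-8")))   # bytes are decoded exactly once
```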

Changed 5 years ago by markus

Patch for international characters in changeset and tickets v2

comment:35 Changed 5 years ago by markus

I made a second patch, attachment:trac-post-commit-hook.0.10-stable.2.diff, containing your fixes. I hope this is better (it's my first day with Python, I'm just trying to put together a patch from what works on my Windows box…). I tested it with Trac 0.10.3 and Subversion 1.4.2; French accents and Cyrillic characters work fine.

comment:36 Changed 5 years ago by cboos

  • Milestone changed from 0.11 to 0.10.4
  • Resolution set to fixed
  • Status changed from reopened to closed

Thanks for the patch and the tests. I've committed a slightly different version, which won't break compatibility with the previous script.

Fixed in r4532, r4533 (trunk) and r4534 (0.10-stable).

Please try it out once again and let me know if everything works for you.

comment:37 Changed 5 years ago by markus

Works great for me, thanks!
