Edgewall Software

Ticket #1310 (closed defect: fixed)

Opened 4 years ago

Last modified 23 months ago

trac-post-commit-hook and codepages

Reported by: GregZ
Owned by: cboos
Priority: normal
Milestone: 0.10.4
Component: version control/browser
Version: 0.9.3
Severity: minor
Keywords: utf-8 unicode
Cc: vyt@…, blaufalke@…, m@…, trac@…, zak_trac@…, stock@…

Description

Sorry for bad English.

I write comments for SVN revisions in Russian. To see these comments correctly on the timeline page, I set the encoding in my web browser (Opera) to Unicode UTF-8.

But when I use trac-post-commit-hook, the same comments, as added to the tickets they fix and close, are displayed in codepage 866.

Switching codepages all the time is very inconvenient.

Server and workstation OS: Windows XP SP2

I would be glad of any help.

Attachments

trac-post-commit-hook.0.10-stable.diff (2.1 KB) - added by markus 23 months ago.
Patch for international characters in changeset and tickets
trac-post-commit-hook.0.10-stable.2.diff (2.0 KB) - added by markus 23 months ago.
Patch for international characters in changeset and tickets v2

Change History

  Changed 4 years ago by anonymous

  • priority changed from normal to highest
  • component changed from general to timeline
  • milestone set to 0.8.2

  Changed 4 years ago by anonymous

  • status changed from new to closed

  Changed 4 years ago by cmlenz

  • status changed from closed to reopened

Why did you close this, anonymous?

  Changed 4 years ago by GregZ

  • component changed from timeline to browser

To solve this problem, I added 2 lines to trac-post-commit-hook:

    def __init__(self, project=options.project, author=options.user, rev=options.rev, msg=options.msg):
        self.author = author
        self.rev = rev

        # The two lines below are the ones I added
        csmsg = "(In [%s]) %s" % (rev, msg)
        self.msg = util.to_utf8(csmsg, 'windows-1251')

        self.now = int(time.time())
        self.con = sqlite.connect(os.path.join(project, 'db', 'trac.db'), autocommit=0)

  Changed 4 years ago by cmlenz

  • priority changed from highest to low
  • severity changed from normal to minor
  • milestone changed from 0.8.2 to 0.9

This is not high-priority, being just a contrib script.

  Changed 4 years ago by anonymous

Which svn client do you use for writing comments? This is strange, since the normal Subversion encoding is UTF-8.

PS We can discuss this directly: email/jabber: vyt@vzljot.ru

  Changed 4 years ago by vyt@…

  • cc vyt@… added

I think the problem is the way the log message is obtained.

svnlook recodes its output to the current locale charset, and there is no way to disable this recoding. Charset detection in the post-commit hook would be overhead, so the best way, IMHO, is to use the Subversion Python API to get the log message.
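
To illustrate the recoding problem, here is a minimal Python sketch. It assumes the hook receives bytes that have already been recoded to codepage 866 (the codepage named in the description); the sample message is made up for the example.

```python
# Hypothetical log message; any Cyrillic text behaves the same way.
message = "Исправлено"

# Assumption: the message reaches the hook recoded to the console codepage.
recoded = message.encode("cp866")

# A consumer that assumes UTF-8 cannot decode these bytes...
try:
    recoded.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# ...while decoding with the codepage that was actually used round-trips.
restored = recoded.decode("cp866")
```

The point is that the bytes themselves are fine; the consumer simply has to know which charset produced them, or avoid the recoding step altogether.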

  Changed 3 years ago by dans

vyt >>

That sounds reasonable, but I'm feeling a bit lost here.

I'm on Windows XP, running the "post-commit.cmd" from #1602. I guess I should replace the line

FOR /F "usebackq delims==" %%i IN (`%%SVNLOOK%% log -r %TXN% %REPOS%`) DO SET LOG=%%i

with something calling the Python-SVN API, but I neither fully understand the current line nor have any clue about the API.

Can I just replace the part between the parentheses? And if so, what do I replace it with?

  Changed 3 years ago by cmlenz

  • milestone 0.9 deleted

  Changed 3 years ago by vyt@…

At the very least, trac-post-commit-hook should contain a note about invoking svnlook in non-UTF-8 locales, e.g. LANG=ru_RU.UTF-8 svnlook...

  Changed 3 years ago by szaman

  • version changed from 0.8.1 to 0.9.3

I set:

LC_ALL="pl_PL.UTF-8"

in MYREPO/hooks/post-commit

I think this should be pointed out in the hook's documentation.

(I'm working on trac-0.9.3 and trunk/contrib/trac-post-commit-hook)
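
For hooks that call svnlook from Python rather than from a shell script, the same locale override can be sketched as follows. The helper names and the default locale are assumptions for illustration, and the chosen locale has to be installed on the server.

```python
import os
import subprocess


def svnlook_log_call(repos, rev, locale_name="en_US.UTF-8"):
    """Return (argv, env) for running `svnlook log` under a UTF-8 locale."""
    argv = ["svnlook", "log", "-r", str(rev), repos]
    env = dict(os.environ)
    env["LC_ALL"] = locale_name  # assumed to exist on the server
    return argv, env


def read_log_message(repos, rev, locale_name="en_US.UTF-8"):
    """Run svnlook and decode its output as UTF-8."""
    argv, env = svnlook_log_call(repos, rev, locale_name)
    raw = subprocess.run(argv, env=env, stdout=subprocess.PIPE, check=True).stdout
    return raw.decode("utf-8")
```

A post-commit hook would call something like `read_log_message(repos_path, rev, "pl_PL.UTF-8")` and pass the result on to the ticket update.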

  Changed 3 years ago by Markus Tacker <m@…>

  • cc m@… added
  • keywords utf-8 unicode added

  Changed 3 years ago by Markus Tacker <m@…>

#1625 has been marked as duplicate of this bug.

  Changed 3 years ago by cboos

#2845 has also been marked as duplicate of this bug.

  Changed 3 years ago by cboos

#2352 is yet another duplicate.

  Changed 2 years ago by Markus Tacker <m@…>

#3732 has been marked as duplicate of this bug.

follow-up: ↓ 19   Changed 2 years ago by cboos

  • owner changed from jonas to cboos
  • priority changed from low to normal
  • status changed from reopened to new
  • milestone set to 0.10

He, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

Can you try out the following patch? (be careful, it's for 0.10, don't apply on 0.9)

Index: trac-post-commit-hook
===================================================================
--- trac-post-commit-hook	(revision 3720)
+++ trac-post-commit-hook	(working copy)
@@ -78,6 +78,7 @@
 from trac.ticket import Ticket
 from trac.ticket.web_ui import TicketModule
 # TODO: move grouped_changelog_entries to model.py
+from trac.util.text import to_unicode
 from trac.web.href import Href
 
 try:
@@ -101,6 +102,8 @@
                   help='The user who is responsible for this action')
 parser.add_option('-m', '--msg', dest='msg',
                   help='The log message to search.')
+parser.add_option('-c', '--encoding', dest='encoding',
+                  help='The encoding used by the log message.')
 parser.add_option('-s', '--siteurl', dest='url',
                   help='The base URL to the project\'s trac website (to which '
                        '/ticket/## is appended).  If this is not specified, '
@@ -132,7 +135,9 @@
                        'see':        '_cmdRefs'}
 
     def __init__(self, project=options.project, author=options.user,
-                 rev=options.rev, msg=options.msg, url=options.url):
+                 rev=options.rev, msg=options.msg, url=options.url,
+                 encoding=options.encoding):
+        msg = to_unicode(msg, encoding)
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)

Normally, I think you'd get UTF-8 strings when Subversion calls the post-commit hook, so the -c/--encoding option shouldn't be needed.

However it can be useful when testing the script from the command line, or if for some reason the post-commit hook is actually given the message with a different encoding.
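
The effect of the -c/--encoding option can be sketched in isolation. This is only an illustration: `decode_message` is a hypothetical stand-in for the decoding step, not Trac's `to_unicode`, and the UTF-8 default mirrors the assumption stated above.

```python
from optparse import OptionParser


def build_parser():
    """Parser with the two options relevant to the log message."""
    parser = OptionParser()
    parser.add_option('-m', '--msg', dest='msg',
                      help='The log message to search.')
    parser.add_option('-c', '--encoding', dest='encoding',
                      help='The encoding used by the log message.')
    return parser


def decode_message(raw, encoding=None):
    """Hypothetical helper: decode raw bytes, defaulting to UTF-8."""
    if isinstance(raw, str):
        return raw  # already text, nothing to do
    return raw.decode(encoding or 'utf-8', errors='replace')


# Simulate a hook that is handed a windows-1251 message.
options, _ = build_parser().parse_args(['-c', 'windows-1251'])
raw = 'Тест'.encode('windows-1251')
text = decode_message(raw, options.encoding)
```

Without `-c`, `options.encoding` is `None` and the helper falls back to UTF-8, which matches the normal case.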

in reply to: ↑ 18   Changed 2 years ago by sier@…

Replying to cboos:

He, not sure that #3732 is really a duplicate of this one, as I wrote a patch that would have fixed #3732, I believe. We'll see.

The patch works. Thanks!!

  Changed 2 years ago by cboos

  • status changed from new to closed
  • resolution set to fixed

Ok, patch applied in r3743, and I'm going to close this, so we're going to find out if #3732 was really a duplicate ;)

  Changed 2 years ago by Markus Tacker <m@…>

Confirmed on r3747

  Changed 2 years ago by cboos

Confirmed what, the fix or the bug? I assume the former, otherwise I guess you'd have reopened ;)

  Changed 2 years ago by blaufalke@…

  • status changed from closed to reopened
  • resolution fixed deleted

This doesn't work for me with Trac 0.10. The changeset itself displays the log message with the correct German umlauts, but the comments generated on the fixed/addressed tickets are garbled, even with trac-post-commit-hook from the current trunk.

Expected result:

(In [353]) "Testcommit für den trac-post-commit-hook addresses #2 fixes #1"

Actual result:

(In [353]) "Testcommit f?\195?\188r den trac-post-commit-hook addresses #2 fixes #1"
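
The \195 and \188 in the actual result are the decimal values of the two UTF-8 bytes of "ü". Output like this suggests that whatever printed the message could not represent the character in its current locale and escaped the bytes instead (an assumption about the origin of the escapes). A small Python sketch, with a hypothetical helper name, shows the correspondence and reverses the escapes:

```python
import re

# Matches a literal '?', a backslash, and three decimal digits.
_ESCAPE = re.compile(rb'\?\\(\d{3})')


def unescape_fuzzy(raw):
    """Turn '?\\195?\\188'-style decimal escapes back into bytes, then decode as UTF-8."""
    restored = _ESCAPE.sub(lambda m: bytes([int(m.group(1))]), raw)
    return restored.decode('utf-8')


umlaut_bytes = list('ü'.encode('utf-8'))
garbled = rb'Testcommit f?\195?\188r den trac-post-commit-hook'
fixed = unescape_fuzzy(garbled)
```

Undoing the escapes after the fact is only a diagnostic aid; the cleaner fix is to avoid the locale conversion in the first place.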

  Changed 2 years ago by l2k

  • cc trac@… added

I have the same problem as blaufalke

  Changed 2 years ago by anonymous

  • cc zak_trac@… added

  Changed 2 years ago by anonymous

  • cc blaufalke@… added

  Changed 2 years ago by cboos

  • milestone set to 1.0

See TracDev/Proposals/Journaling. I think that we could eventually get rid of the trac-post-commit-hook in the future (and hence get rid of the encoding issues as well).

  Changed 2 years ago by stock@…

  • cc stock@… added

I made the following changes to the script, which now uses the Trac API to get the changeset message, so I think the conversion to the locale charset is avoided. It works for me with German umlauts entered through TortoiseSVN.

Index: trac-post-commit-hook
===================================================================
--- trac-post-commit-hook
+++ trac-post-commit-hook
@@ -80,6 +80,8 @@
 # TODO: move grouped_changelog_entries to model.py
 from trac.util.text import to_unicode
 from trac.web.href import Href
+from trac.versioncontrol.svn_fs import SubversionRepository, SubversionChangeset
+from trac.log import logger_factory
 
 try:
     from optparse import OptionParser
@@ -137,12 +139,18 @@
     def __init__(self, project=options.project, author=options.user,
                  rev=options.rev, msg=options.msg, url=options.url,
                  encoding=options.encoding):
-        msg = to_unicode(msg, encoding)
+        self.env = open_environment(project)
+        repos_dir = self.env.config.get('trac', 'repository_dir')
+        repos = SubversionRepository(repos_dir,
+                             '', logger_factory('test'))
+        change = repos.get_changeset(rev)
+        msg = change.message
+        to_unicode(msg)
+        msg = msg.decode('utf-8')
         self.author = author
         self.rev = rev
         self.msg = "(In [%s]) %s" % (rev, msg)
         self.now = int(time.time())
-        self.env = open_environment(project)
         if url is None:
             url = self.env.config.get('project', 'url')
         self.env.href = Href(url)
@@ -194,7 +202,7 @@
 
 
 if __name__ == "__main__":
-    if len(sys.argv) < 5:
+    if len(sys.argv) < 4:
         print "For usage: %s --help" % (sys.argv[0])
     else:
         CommitHook()

  Changed 23 months ago by markus

The above trac api solution works for me!

You could also get the author name that way, this would simplify the calling batch file:

self.author = author

becomes

self.author = change.author

as well as

if len(sys.argv) < 4:

becomes

if len(sys.argv) < 3:

The batch files in the hooks directories of all repositories simply call another batch script that lives outside the repository directory. post-commit.cmd:

%~dp0\..\..\..\hooks\trac-post-commit-hook.cmd %1 %2

The trac-post-commit-hook.cmd (a cleaned-up version of the one from #1602) calls the Python script. This script can be used for all the repositories as long as the Trac environments have the same names as the Subversion repositories:

@ECHO OFF

SET REV=%2
SET REPNAME=%~nx1

:: Modify paths and port number here
SET TRAC_ENV=D:\trac\trac_env\%REPNAME%
SET PYTHON_DIR=C:\Python24
SET TRAC_URL=http://%COMPUTERNAME%:8080/%REPNAME%
SET PYTHON="%PYTHON_DIR%\python.exe"

:: Do not execute hook if trac environment does not exist
IF NOT EXIST %TRAC_ENV% GOTO :EOF

%PYTHON% "%~dp0\trac-post-commit-hook.py" -p "%TRAC_ENV%" -r "%REV%" -s "%TRAC_URL%"
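
The naming convention the batch file relies on (the last component of the repository path is also the name of the Trac environment) can be written out in Python. This is only a sketch of that mapping; `trac_env_for` is a hypothetical helper, and the default root is the path used in the script above.

```python
import ntpath  # Windows path rules, regardless of the platform running this


def trac_env_for(repos_path, trac_root=r"D:\trac\trac_env"):
    """Map a repository path to the Trac environment of the same name."""
    # Equivalent in spirit to %~nx1: take the last path component.
    name = ntpath.basename(repos_path.rstrip("\\/"))
    return ntpath.join(trac_root, name)


env_path = trac_env_for(r"D:\svn\repos\myproject")
```

If an environment with that name does not exist, the batch file above simply skips the hook.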

Changed 23 months ago by markus

Patch for international characters in changeset and tickets

  Changed 23 months ago by markus

I attached a patch, attachment:trac-post-commit-hook.0.10-stable.diff, which applies all the changes from stock@… as well as the one I described above to the latest version from 0.10-stable. It works for me: Cyrillic characters in the changeset and in the ticket comments are displayed correctly. The same changes could be applied to the trunk of 0.11.

  Changed 23 months ago by cboos

A small fix to your patch:

repos_dir = self.env.config.get('trac', 'repository_dir') 
repos = self.env.get_repository() # will do a `sync` if needed 

(and then the log related import is not needed)

Besides the above, I think it's a step in the right direction.

  Changed 23 months ago by cboos

  • milestone changed from 1.0 to 0.11

Whoops, and I didn't see the msg = msg.decode('utf-8') line. This is unnecessary and potentially harmful.
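
Put together, the approach discussed here reads the message and the author from the changeset and does not decode the message a second time. In Python 2, calling .decode('utf-8') on a value that is already unicode first encodes it with the ASCII codec, which fails for non-ASCII text; that is presumably the harm referred to. A minimal sketch follows: the `_Fake*` classes are hypothetical stand-ins so that the example runs without Trac, and only the `get_repository`, `get_changeset`, `message` and `author` names come from this ticket.

```python
def changeset_comment(env, rev):
    """Build the ticket comment from the changeset stored in the repository."""
    repos = env.get_repository()
    change = repos.get_changeset(rev)
    # change.message is assumed to be text already, so no extra .decode() step.
    return change.author, "(In [%s]) %s" % (rev, change.message)


class _FakeChangeset:
    """Hypothetical stand-in for a changeset object."""
    def __init__(self, author, message):
        self.author = author
        self.message = message


class _FakeRepository:
    """Hypothetical stand-in for the repository object."""
    def __init__(self, changesets):
        self._changesets = changesets

    def get_changeset(self, rev):
        return self._changesets[rev]


class _FakeEnvironment:
    """Hypothetical stand-in for the Trac environment."""
    def __init__(self, repos):
        self._repos = repos

    def get_repository(self):
        return self._repos


env = _FakeEnvironment(_FakeRepository(
    {353: _FakeChangeset("alice", "Testcommit für den trac-post-commit-hook fixes #1")}))
author, comment = changeset_comment(env, 353)
```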

Changed 23 months ago by markus

Patch for international characters in changeset and tickets v2

  Changed 23 months ago by markus

I made a second patch, attachment:trac-post-commit-hook.0.10-stable.2.diff, containing your fixes. I hope this is better (it's my first day with Python, I'm just trying to put together a patch from what works on my Windows box...). I tested it with Trac 0.10.3 and Subversion 1.4.2; French accents and Cyrillic characters work fine.

  Changed 23 months ago by cboos

  • status changed from reopened to closed
  • resolution set to fixed
  • milestone changed from 0.11 to 0.10.4

Thanks for the patch and the tests. I've committed a slightly different version, which won't break compatibility with the previous script.

Fixed in r4532, r4533 (trunk) and r4534 (0.10-stable).

Please try it out once again and let me know if everything works for you.

  Changed 23 months ago by markus

Works great for me, thanks!
