Opened 8 years ago
Last modified 8 years ago
#12322 closed defect
UnicodeDecodeError: 'utf8' codec can't decode byte 0xca in position 8: invalid continuation byte — at Version 5
Reported by: | Ryan J Ollos | Owned by: | Ryan J Ollos |
---|---|---|---|
Priority: | normal | Milestone: | 1.0.10 |
Component: | plugin/git | Version: | |
Severity: | normal | Keywords: | |
Cc: | | Branch: | |
Release Notes: | Invalid byte sequence in filepath is replaced when reading Git commits. | | |
API Changes: | | | |
Internal Changes: | | | |
Description
Encountered this error while running trac-admin $env repository resync "(default)":
2016-01-19 00:21:23,635 Trac[console] ERROR: Exception in trac-admin command:
Traceback (most recent call last):
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/admin/console.py", line 109, in onecmd
    rv = cmd.Cmd.onecmd(self, line) or 0
  File "/usr/lib/python2.7/cmd.py", line 220, in onecmd
    return self.default(line)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/admin/console.py", line 287, in default
    return self.cmd_mgr.execute_command(*args)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/admin/api.py", line 127, in execute_command
    return f(*fargs)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/versioncontrol/admin.py", line 156, in _do_resync
    self._sync(reponame, rev, clean=True)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/versioncontrol/admin.py", line 143, in _sync
    repos.sync(self._sync_feedback, clean=clean)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/tracopt/versioncontrol/git/git_fs.py", line 141, in sync
    self._insert_changeset(db, rev, cset)
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/trac/versioncontrol/cache.py", line 285, in _insert_changeset
    for path, kind, action, bpath, brev in cset.get_changes():
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/tracopt/versioncontrol/git/git_fs.py", line 851, in get_changes
    self.repos.git.diff_tree(parent, self.rev, find_renames=True):
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/tracopt/versioncontrol/git/PyGIT.py", line 1044, in diff_tree
    yield __chg_tuple()
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/tracopt/versioncontrol/git/PyGIT.py", line 1036, in __chg_tuple
    chg[5] = self._fs_to_unicode(chg[5])
  File "/var/www/bugs.jqueryui.com/private/pve/local/lib/python2.7/site-packages/tracopt/versioncontrol/git/PyGIT.py", line 380, in <lambda>
    self._fs_to_unicode = lambda s: s.decode(git_fs_encoding)
  File "/var/www/bugs.jqueryui.com/private/pve/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xca in position 8: invalid continuation byte
I'll post more info if I can reproduce at a different debug level.
Change History (5)
comment:1 by , 8 years ago
Milestone: | → 1.0.10 |
---|
comment:2 by , 8 years ago
That commit contains a file whose name has an invalid byte sequence.
$ git show --name-status c1800c59953161d88432ea8a307b5cdf08c5ec41
...
M       ya/demos/accordion/default.html
M       ya/demos/dialog/default.html
A       ya/external/PIE.htc
A       ya/external/border-radius.htc
A       ya/external/jquery.bgiframe-2.1.2.js
A       ya/lib/sl.css
M       ya/lib/sl.js
A       ya/lib/uihelper.js
A       "ya/test/\312\326\267\347\307\331.txt"
A       ya/themes/default/images/ui-icon-arrows.png
A       ya/themes/default/images/ui-icon-close.png
A       ya/themes/default/images/ui-icon-triangle-1-e.png
A       ya/themes/default/images/ui-icon-triangle-1-s.png
A       ya/themes/default/images/ui-icons.png
A       ya/themes/default/jquery.ui.accordion.css
A       ya/themes/default/jquery.ui.dialog.css
A       ya/themes/default/jquery.ui.override.css
M       ya/ui/jquery.ya.accordion0.js
M       ya/ui/jquery.ya.dialog0.js
$ python -c '"ya/test/\312\326\267\347\307\331.txt".decode("utf-8")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xca in position 8: invalid continuation byte
We could tolerate such invalid byte sequences in a Git repository by decoding with the 'replace' error handler:
diff --git a/tracopt/versioncontrol/git/PyGIT.py b/tracopt/versioncontrol/git/PyGIT.py
index 966df98bc..fc61319ed 100644
--- a/tracopt/versioncontrol/git/PyGIT.py
+++ b/tracopt/versioncontrol/git/PyGIT.py
@@ -380,7 +380,8 @@ class Storage(object):
             codecs.lookup(git_fs_encoding)
 
             # setup conversion functions
-            self._fs_to_unicode = lambda s: s.decode(git_fs_encoding)
+            self._fs_to_unicode = lambda s: s.decode(git_fs_encoding,
+                                                     'replace')
             self._fs_from_unicode = lambda s: s.encode(git_fs_encoding)
         else:
             # pass bytestrings as-is w/o any conversion
After the patch:
Python 2.5.6 (r256:88840, Oct 21 2014, 22:26:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from trac.env import open_environment
>>> env = open_environment('/home/jun66j5/var/trac/1.0-sqlite')
>>> repos = env.get_repository('jquery-ui.git')
>>> cset = repos.get_changeset('c1800c59953161d88432ea8a307b5cdf08c5ec41')
>>> for _ in cset.get_changes(): print _[0]
...
ya/demos/accordion/default.html
ya/demos/dialog/default.html
ya/external/PIE.htc
ya/external/border-radius.htc
ya/external/jquery.bgiframe-2.1.2.js
ya/lib/sl.css
ya/lib/sl.js
ya/lib/uihelper.js
ya/test/�ַ���.txt
ya/themes/default/images/ui-icon-arrows.png
ya/themes/default/images/ui-icon-close.png
ya/themes/default/images/ui-icon-triangle-1-e.png
ya/themes/default/images/ui-icon-triangle-1-s.png
ya/themes/default/images/ui-icons.png
ya/themes/default/jquery.ui.accordion.css
ya/themes/default/jquery.ui.dialog.css
ya/themes/default/jquery.ui.override.css
ya/ui/jquery.ya.accordion0.js
ya/ui/jquery.ya.dialog0.js
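For readers who want to try the decoding behaviour outside Trac, here is a minimal standalone sketch. It is written for Python 3 (the ticket itself concerns Python 2), and the helper name fs_to_unicode is only an illustration modelled on the lambda in the patch, not Trac's API. It shows strict decoding failing at byte offset 8 and the 'replace' handler substituting U+FFFD instead.

```python
# Bytes of the offending path, as printed by `git show --name-status`.
raw_path = b"ya/test/\312\326\267\347\307\331.txt"


def fs_to_unicode(raw, encoding="utf-8"):
    """Hypothetical helper: decode a path, replacing undecodable bytes."""
    return raw.decode(encoding, "replace")


# Strict decoding raises, which is what aborted the resync.
strict_error = None
try:
    raw_path.decode("utf-8")
except UnicodeDecodeError as exc:
    strict_error = exc

# With 'replace', decoding always succeeds; bad bytes become U+FFFD.
decoded = fs_to_unicode(raw_path)
print(repr(decoded))
```

Note that the replacement is lossy: encoding the decoded name back to UTF-8 does not reproduce the original bytes, so the displayed path cannot be mapped back to the on-disk name.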
comment:3 by , 8 years ago
Replacing invalid characters seems like a good solution. Thanks for investigating.
comment:4 by , 8 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:5 by , 8 years ago
Release Notes: | modified (diff) |
---|
Change from comment:2 committed to 1.0-stable in [14523], merged to trunk in [14524].
It would be good to have a test case, but I struggled to write one. I was trying to use _git_fast_import with the format used in _generate_data_many_merges, but I'm unsure of the specification of that format and of how to export a Git commit in it.
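One possible starting point for such a test, sketched under assumptions: the function below builds a small `git fast-import` input stream that adds a single file whose name contains the same non-UTF-8 bytes. It does not use Trac's _git_fast_import helper (whose signature is not shown in this ticket); build_fast_import_stream, the committer identity, and the timestamp are all made up for illustration, and the stream layout follows my reading of the git-fast-import documentation (commit, committer, data, then an inline file modification).

```python
def build_fast_import_stream(path_bytes,
                             content=b"hello\n",
                             message=b"Add file with non-UTF-8 name\n"):
    """Hypothetical helper: one commit adding one file, as fast-import input."""
    parts = [
        b"commit refs/heads/master\n",
        # Committer name, email and timestamp are placeholders.
        b"committer Joe <joe@example.com> 1453162883 +0000\n",
        b"data %d\n" % len(message),
        message,
        # Unquoted path: raw bytes up to the end of the line.
        b"M 100644 inline " + path_bytes + b"\n",
        b"data %d\n" % len(content),
        content,
    ]
    return b"".join(parts)


stream = build_fast_import_stream(b"ya/test/\312\326\267\347\307\331.txt")
```

The resulting bytes could then be fed to `git fast-import` on standard input inside a freshly initialised repository; that step is not exercised here and would need checking against the test suite's own helper.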
With log_level at INFO:
The commit can be found here.