Opened 9 years ago
Last modified 9 years ago
#12453 new defect
branch/tag/bookmark name should be decoded as utf-8
Reported by: | | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | plugin/mercurial | Version: | |
Severity: | normal | Keywords: | patch encoding |
Cc: | | Branch: | |
Release Notes: | | | |
API Changes: | | | |
Internal Changes: | | | |
Description
For example, when I set hg.encoding=cp932 (Japanese Shift_JIS), I see broken Japanese tag/branch/bookmark names in the repo browser.
We should use UTF-8 (= HGENCODING) to decode branch/tag/bookmark strings, not hg.encoding (which is meant for file names).
I have two patches for this issue.
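To illustrate the symptom, here is a minimal sketch in plain Python 3 (not the plugin's code, and not one of the patches). The tag name is a made-up example. It assumes, as the discussion in this ticket does, that Mercurial hands over metadata as UTF-8 bytes.

```python
# Illustration only: the tag name below is a hypothetical example.
tag_name = "リリース"

# Assumption: metadata (tag/branch/bookmark names) arrives as UTF-8 bytes.
raw = tag_name.encode("utf-8")

# Decoding those bytes with the file-name encoding (cp932) yields mojibake.
# errors="replace" keeps the sketch from raising on undecodable bytes.
wrong = raw.decode("cp932", errors="replace")

# Decoding the same bytes as UTF-8 recovers the original name.
right = raw.decode("utf-8")

print(wrong == tag_name)  # False: the name is garbled
print(right == tag_name)  # True: the name survives the round trip
```

The garbled value of `wrong` is the kind of text that shows up in the repo browser when hg.encoding is applied to metadata.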
Attachments (0)
Change History (4)
comment:1 by , 9 years ago
Keywords: | patch utf-8 added |
---|
comment:2 by , 9 years ago
Keywords: | encoding added; utf-8 removed |
---|
follow-up: 4 comment:3 by , 9 years ago
No, branch name etc. is metadata, and therefore Mercurial always uses UTF-8 for that (cf. EncodingStrategy#UTF-8_strings) + what we did in #7217 to ensure that it stays in UTF-8 on the hg side.
I'm rather thinking of using to_unicode for the metadata, that would simplify things.
If we still want to support legacy encodings for the metadata, then we probably need to split [hg] encoding into several other options, as this tries to do too many things (also considering the current mess w.r.t. filenames).
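A rough sketch of the "UTF-8 first, legacy encoding as a fallback" idea, in plain Python 3. The helper name decode_metadata and its legacy_encoding parameter are hypothetical. This is not Trac's to_unicode and not an existing plugin option, only an illustration of how metadata decoding could be separated from file-name decoding.

```python
def decode_metadata(raw: bytes, legacy_encoding: str = "cp932") -> str:
    """Hypothetical helper: decode metadata bytes, preferring UTF-8.

    Falls back to a legacy encoding only when the bytes are not valid
    UTF-8, replacing anything that still cannot be decoded.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(legacy_encoding, errors="replace")


# Usage: UTF-8 metadata decodes directly; legacy bytes take the fallback.
name = "ブランチ"  # made-up branch name
print(decode_metadata(name.encode("utf-8")) == name)  # True
print(decode_metadata(name.encode("cp932")) == name)  # True
```

One caveat with this design: a legacy byte sequence that happens to be valid UTF-8 would be decoded as UTF-8, so the fallback is a heuristic and not a guarantee.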
comment:4 by , 9 years ago
Replying to Christian Boos:
> No, branch name etc. is metadata, and therefore Mercurial always uses UTF-8 for that (cf. EncodingStrategy#UTF-8_strings) + what we did in #7217 to ensure that it stays in UTF-8 on the hg side.
Yes, that is the point of this issue.
> I'm rather thinking of using to_unicode for the metadata, that would simplify things.
Another candidate is from_utf8() / to_utf8().
Is #10950 the same issue?