Opened 10 years ago
Last modified 10 years ago
#12453 new defect
branch/tag/bookmark name should be decoded as utf-8
| Reported by: | | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | plugin/mercurial | Version: | |
| Severity: | normal | Keywords: | patch encoding |
| Cc: | | Branch: | |
| Release Notes: | | | |
| API Changes: | | | |
| Internal Changes: | | | |
Description
For example, when I set hg.encoding=cp932 (Japanese Shift_JIS),
Japanese tag/branch/bookmark names appear broken in the repository browser.
We should decode branch/tag/bookmark strings as UTF-8 (=HGENCODING),
not with hg.encoding (which is meant for file names).
I have two patches for this issue.
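The symptom can be reproduced outside Trac with plain Python. This is only a minimal sketch, assuming the branch name reaches the plugin as UTF-8 bytes; the branch name itself is made up for illustration.

```python
# Hypothetical branch name; the assumption is that Mercurial hands over
# metadata such as branch names as UTF-8 encoded bytes.
branch_name = u"日本語ブランチ"
raw = branch_name.encode("utf-8")

# Decoding with the file-name encoding (cp932) garbles the name.
wrong = raw.decode("cp932", "replace")

# Decoding as UTF-8 recovers the original name.
right = raw.decode("utf-8")

print(wrong == branch_name)  # False
print(right == branch_name)  # True
```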
Attachments (0)
Change History (4)
comment:1 by , 10 years ago
| Keywords: | patch utf-8 added |
|---|
comment:2 by , 10 years ago
| Keywords: | encoding added; utf-8 removed |
|---|
follow-up: 4 comment:3 by , 10 years ago
No, branch name etc. is metadata, and therefore Mercurial always uses UTF-8 for that (cf. EncodingStrategy#UTF-8_strings) + what we did in #7217 to ensure that it stays in UTF-8 on the hg side.
I'm rather thinking of using to_unicode for the metadata, that would simplify things.
If we still want to support legacy encodings for the metadata, then we probably need to split [hg] encoding into several separate options, as this single option tries to do too many things (also considering the current mess w.r.t. filenames).
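One way to picture such a split is to keep separate decoders for metadata and for file names. The sketch below is not the plugin's code: `decode_metadata`, `decode_filename` and the `fallback` parameter are hypothetical names, and the "try UTF-8 first, then fall back" behaviour is only an assumption about how legacy metadata could be tolerated.

```python
def decode_metadata(raw, fallback="latin-1"):
    """Decode branch/tag/bookmark bytes: UTF-8 first, then a legacy fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Hypothetical legacy case: metadata that is not valid UTF-8.
        # Decode leniently instead of raising.
        return raw.decode(fallback, "replace")


def decode_filename(raw, encoding):
    """Decode a file name with the separately configured file-name encoding."""
    return raw.decode(encoding, "replace")


# Invented sample data: a tag stored as UTF-8, a path stored as cp932.
tag = u"リリース-1.0".encode("utf-8")
path = u"資料/メモ.txt".encode("cp932")

print(decode_metadata(tag) == u"リリース-1.0")
print(decode_filename(path, "cp932") == u"資料/メモ.txt")
```

Keeping the two paths apart means a file-name setting such as cp932 never influences how branch, tag or bookmark names are decoded.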
comment:4 by , 10 years ago
Replying to Christian Boos:
No, branch name etc. is metadata, and therefore Mercurial always uses UTF-8 for that (cf. EncodingStrategy#UTF-8_strings) + what we did in #7217 to ensure that it stays in UTF-8 on the hg side.
Yes, that is the point of this issue.
I'm rather thinking of using to_unicode for the metadata, that would simplify things.
Another candidate is from_utf8() / to_utf8().



#10950 is the same issue?