Edgewall Software
Opened 14 years ago

Last modified 5 years ago

#4080 new enhancement

[Patch] Detect charset using enca or chardet

Reported by: mpv
Owned by:
Priority: high
Milestone: next-major-releases
Component: version control/changeset view
Version: 0.10
Severity: normal
Keywords: charset encoding patch
Cc: Thijs Triemstra
Branch:
Release Notes:
API Changes:
Internal Changes:

Description (last modified by Thijs Triemstra)

Detect the charset of files in a repository that mixes multiple charsets.

http://gitorious.org/enca

Attachments (2)

trac-enca.diff (858 bytes) - added by mpv@… 14 years ago.
patch
chardet.patch (1.2 KB) - added by Kyosuke Takayama <support@…> 13 years ago.

Change History (12)

by mpv@…, 14 years ago

Attachment: trac-enca.diff added

patch

comment:1 by Christian Boos, 14 years ago

Milestone: 0.12

I'm not sure the mimeview refactoring will still happen in the 0.11 timeframe, so I'm scheduling this for 0.12.

In the mimeview refactoring, there will be entry points for pluggable mimeview and charset detectors.
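
The pluggable charset detectors mentioned here could look roughly like the following sketch. It is plain Python and does not use Trac's actual component API; `DETECTORS`, `register`, `detect_charset`, both detector functions, and the `iso-8859-1` default are hypothetical names and values chosen for illustration.

```python
import codecs
from typing import Callable, List, Optional

# A detector takes raw bytes and returns a charset name, or None if unsure.
Detector = Callable[[bytes], Optional[str]]

# Hypothetical registry; detectors are consulted in registration order.
DETECTORS: List[Detector] = []


def register(detector: Detector) -> Detector:
    """Add a detector to the registry (usable as a decorator)."""
    DETECTORS.append(detector)
    return detector


@register
def utf8_bom_detector(content: bytes) -> Optional[str]:
    # A UTF-8 byte order mark is an unambiguous signal.
    return "utf-8-sig" if content.startswith(codecs.BOM_UTF8) else None


@register
def strict_utf8_detector(content: bytes) -> Optional[str]:
    # Content that decodes cleanly as UTF-8 is treated as UTF-8.
    try:
        content.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "utf-8"


def detect_charset(content: bytes, default: str = "iso-8859-1") -> str:
    """Return the first charset any detector reports, else the default."""
    for detector in DETECTORS:
        charset = detector(content)
        if charset is not None:
            return charset
    return default
```

A detector backed by enca or chardet would be one more registered function in such a scheme, placed after the cheap checks.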

comment:2 by Matthew Good, 14 years ago

Alternatively chardet is pure Python and easy_install-able, so this might be a better choice.
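
For illustration, a minimal sketch of chardet-based detection follows. It is not the attached patch: `guess_charset`, `to_text`, the `iso-8859-1` default, and the 0.5 confidence threshold are assumptions made for this example. The only chardet call used is `chardet.detect`, which returns a dict with `encoding` and `confidence` keys; the import is optional so the sketch still runs where chardet is not installed.

```python
try:
    import chardet  # third-party package; treated as optional in this sketch
except ImportError:
    chardet = None


def guess_charset(content: bytes, default: str = "iso-8859-1",
                  min_confidence: float = 0.5) -> str:
    """Guess a charset with chardet, falling back to `default`."""
    if chardet is None or not content:
        return default
    result = chardet.detect(content)
    encoding = result.get("encoding")            # may be None
    confidence = result.get("confidence") or 0.0
    if encoding and confidence >= min_confidence:
        return encoding
    return default


def to_text(content: bytes, default: str = "iso-8859-1") -> str:
    """Decode bytes using the guessed charset; never raises on bad input."""
    charset = guess_charset(content, default)
    try:
        return content.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: fall back to the default.
        return content.decode(default, errors="replace")
```

Detection on short inputs is a statistical guess, which is why the sketch keeps a threshold and a fallback rather than trusting the result unconditionally.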

comment:3 by Christian Boos, 14 years ago

#4930 presents the problem of garbled content for attachments, where it's typically hard to get metadata about the file content. For repository files, by contrast, the charset can come either from the svn:mime-type property or from a repo-wide setting and convention.
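
To illustrate the repository case: a charset carried in an svn:mime-type value such as `text/plain; charset=ISO-8859-1` can be read with the standard library, falling back to a repo-wide default. This is only a sketch; `charset_from_mime_type`, `pick_charset`, and the `utf-8` default are assumptions, not Trac code.

```python
from email.message import Message
from typing import Optional


def charset_from_mime_type(value: str) -> Optional[str]:
    """Extract the charset parameter from a MIME type string, if present."""
    msg = Message()
    msg["Content-Type"] = value
    # Returns the lower-cased charset, or None when no charset is given.
    return msg.get_content_charset()


def pick_charset(mime_type_prop: Optional[str],
                 repo_default: str = "utf-8") -> str:
    """Prefer the charset from the property, else the repo-wide default."""
    charset = charset_from_mime_type(mime_type_prop) if mime_type_prop else None
    return charset or repo_default
```

Attachments usually have no such property, so for them only the default or content-based detection remains.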

comment:4 by Kyosuke Takayama <support@…>, 13 years ago

I have a patch using chardet.

This works fine for me.

by Kyosuke Takayama <support@…>, 13 years ago

Attachment: chardet.patch added

comment:5 by Christian Boos, 11 years ago

Priority: normal → high

Upping the priority, related to #3332.

comment:6 by Thijs Triemstra, 10 years ago

Cc: Thijs Triemstra added
Keywords: patch added

The patch needs a review.

comment:7 by Thijs Triemstra, 10 years ago

Description: modified (diff)

update project url

comment:8 by Ryan J Ollos, 6 years ago

Owner: Christian Boos removed

comment:9 by figaro, 5 years ago

The Universal Character Encoding Detector, chardet, can currently be found on PyPI.

comment:10 by figaro, 5 years ago

Summary: [PATCH] detect charset using enca → [Patch] Detect charset using enca or chardet
