
Opened 10 years ago

Last modified 8 months ago

#4080 new enhancement

[Patch] Detect charset using enca or chardet

Reported by: mpv
Owned by:
Priority: high
Milestone: next-major-releases
Component: version control/changeset view
Version: 0.10
Severity: normal
Keywords: charset encoding patch
Cc: Thijs Triemstra
Release Notes:
API Changes:

Description (last modified by Thijs Triemstra)

Detect the charset of file content in a repository that contains files in multiple charsets.

http://gitorious.org/enca

Attachments (2)

trac-enca.diff (858 bytes) - added by mpv@… 10 years ago.
patch
chardet.patch (1.2 KB) - added by Kyosuke Takayama <support@…> 8 years ago.


Change History (12)

Changed 10 years ago by mpv@…

Attachment: trac-enca.diff added

patch

comment:1 Changed 10 years ago by Christian Boos

Milestone: 0.12

I'm not sure the mimeview refactoring will still happen in the 0.11 timeframe, so I'm scheduling this for 0.12.

In the mimeview refactoring, there will be entry points for pluggable mimeview and charset detectors.

comment:2 Changed 10 years ago by Matthew Good

Alternatively chardet is pure Python and easy_install-able, so this might be a better choice.
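As a rough illustration of the chardet route (this is not the attached patch, and it does not use any Trac API), the sketch below wraps `chardet.detect()`, which returns a dict with `encoding` and `confidence` keys. The helper name `detect_charset`, the `default` and `min_confidence` parameters, and the injectable `detector` argument are all assumptions made for this example.

```python
def detect_charset(content, default="iso-8859-1", min_confidence=0.5,
                   detector=None):
    """Guess the charset of a byte string, falling back to `default`.

    `detector` is a hypothetical hook: any callable that takes bytes and
    returns a dict with 'encoding' and 'confidence' keys, which is the
    shape returned by chardet.detect().
    """
    if detector is None:
        try:
            import chardet  # third-party, optional dependency
        except ImportError:
            # chardet is not installed: keep the configured default.
            return default
        detector = chardet.detect

    result = detector(content)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0

    # Only trust the guess when the detector is reasonably sure.
    if encoding and confidence >= min_confidence:
        return encoding
    return default
```

A caller would pass the first few kilobytes of the file content as bytes and use the returned name when decoding. Falling back to a configured default when chardet is missing or unsure keeps the existing behaviour as the worst case.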

comment:3 Changed 9 years ago by Christian Boos

#4930 presents the problem of garbled content for attachments, where it's typically hard to get metadata about the file content. For a repository, by contrast, the charset could come either from the svn:mime-type property or from a repository-wide setting and convention.

comment:4 Changed 8 years ago by Kyosuke Takayama <support@…>

I have a patch using chardet.

This works fine for me.

Changed 8 years ago by Kyosuke Takayama <support@…>

Attachment: chardet.patch added

comment:5 Changed 6 years ago by Christian Boos

Priority: normal → high

Upping the priority, related to #3332.

comment:6 Changed 6 years ago by Thijs Triemstra

Cc: Thijs Triemstra added
Keywords: patch added

The patch needs a review.

comment:7 Changed 6 years ago by Thijs Triemstra

Description: modified (diff)

update project url

comment:8 Changed 15 months ago by Ryan J Ollos

Owner: Christian Boos deleted

comment:9 Changed 8 months ago by figaro

The Universal Character Encoding Detector, chardet, can currently be found on PyPI.

comment:10 Changed 8 months ago by figaro

Summary: [PATCH] detect charset using enca → [Patch] Detect charset using enca or chardet
