Edgewall Software

Opened 14 years ago

Last modified 5 years ago

#2775 new enhancement

Automatic matching of similar issues

Reported by: chris@… Owned by:
Priority: normal Milestone: unscheduled
Component: ticket system Version: 0.9.4
Severity: normal Keywords: clustering
Cc: gyozo.papp@… Branch:
Release Notes:
API Changes:


To make prioritizing easier, it would be really cool to have something that shows how many issues have closely matching content. This would help when duplicate tickets are created, by showing the user where the dupes are, and also when deciding which issues should be worked on first (if priority is based on the number of requests).

Attachments (0)

Change History (10)

comment:1 by matthew.donald@…, 14 years ago

Chris, I agree, this would be a very neat feature.

The only question is, how do you implement it? I mean, what algorithm will automatically classify "closely matching content"?

comment:2 by chris@…, 14 years ago

You got me, but I was just thinking word matches (minus words like the/a/etc).
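A minimal sketch of that word-matching idea, assuming Jaccard similarity over the sets of non-stop-words in two ticket texts. The stop-word list and the function names are hypothetical and not part of Trac:

```python
import re

# Assumed, deliberately tiny English stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in",
              "is", "it", "on", "for", "with", "when"}

def significant_words(text):
    """Lower-case the text, split on non-alphanumerics, drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def similarity(text_a, text_b):
    """Jaccard similarity: shared words divided by all distinct words."""
    a = significant_words(text_a)
    b = significant_words(text_b)
    if not a and not b:
        return 0.0  # nothing left to compare after stop-word removal
    return len(a & b) / len(a | b)

print(similarity("The MSN account crashes", "MSN account crashes on login"))
```

A score near 1.0 would suggest a likely duplicate; where to put the threshold is a tuning question this sketch does not answer.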

comment:3 by Christian Boos, 14 years ago

This ticket is a bit vague; it suggests something like a clustering of tickets. You may nevertheless want to have a look at #1069 (searching for similar tickets at ticket creation time).

comment:4 by Christian Boos, 13 years ago

Resolution: duplicate
Status: new → closed

See #1069.

comment:5 by chris@…, 13 years ago

Resolution: duplicate
Status: closed → reopened

That's not really the aspect I was going for with this one, though I was unclear when I filed it. I did think this would help with matching dupe tickets, but that's more of a side point.

From a ticket administration perspective, say on the Adium Trac, I'd like to have queues/clusters/groupings of tickets, so that I know which tickets in a general area are related, and by what percentage.

For instance:

79% of tickets mention the word msn, but only 25% of those mention accounts. Being able to drill down by subsections of tickets to find out where the actual problems are within a superset of a feature would be truly beneficial.
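The drill-down described here could be sketched roughly as follows, assuming tickets are available as plain text strings. The `mentions` and `share` helpers are hypothetical names, not Trac APIs:

```python
import re

def mentions(ticket_text, word):
    """True if the ticket text contains the word as a whole token."""
    return word.lower() in re.findall(r"\w+", ticket_text.lower())

def share(tickets, word):
    """Return (matching tickets, percentage of tickets mentioning word)."""
    matching = [t for t in tickets if mentions(t, word)]
    pct = 100.0 * len(matching) / len(tickets) if tickets else 0.0
    return matching, pct

# Made-up sample data, for illustration only.
tickets = [
    "MSN accounts fail to connect",
    "msn file transfer stalls",
    "AIM login problem",
]
msn_tickets, msn_pct = share(tickets, "msn")
_, accounts_pct = share(msn_tickets, "accounts")  # drill down within the msn subset
print(f"{msn_pct:.0f}% mention msn; {accounts_pct:.0f}% of those mention accounts")
```

Because `share` returns the matching subset, each drill-down step is just another call on the previous result.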

If I maintained a portion of code, it would be great to be able to narrow down, out of what's left open, what to concentrate the most on.

Let me know if I didn't explain this very well; I think it's kind of hard to convey.

comment:6 by Christian Boos, 13 years ago

Keywords: clustering added
Milestone: 2.0

Ok, that's clearer to me now ;-)

I'm setting that as a 2.0 ticket, which doesn't mean someone couldn't propose a patch earlier ;-)

One possible approach would be to use the ticket keywords as tags, and display the groups in a tag cloud style, with precise statistics (percentages, as you suggested) and additional links (to the corresponding query).
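As a rough sketch of the statistics such a tag cloud would need, assuming each ticket's keywords field is a whitespace- or comma-separated string. The `keyword_stats` helper, the `/query` base path, and the `keywords=~term` link format are assumptions made for illustration:

```python
from collections import Counter
from urllib.parse import urlencode

def keyword_stats(ticket_keywords, base_url="/query"):
    """ticket_keywords: one keywords string per ticket.

    Returns (keyword, ticket count, percentage, query link) tuples,
    most frequent keyword first.
    """
    total = len(ticket_keywords)
    counts = Counter()
    for field in ticket_keywords:
        # Count each keyword at most once per ticket.
        counts.update(set(field.replace(",", " ").lower().split()))
    return [
        (kw, n, 100.0 * n / total,
         base_url + "?" + urlencode({"keywords": "~" + kw}))
        for kw, n in counts.most_common()
    ]

for kw, n, pct, link in keyword_stats(["msn accounts", "msn", "clustering, msn", "aim"]):
    print(f"{kw}: {n} tickets ({pct:.0f}%) {link}")
```

The percentage could drive the font size of each tag, with the link leading to the matching ticket query.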

in reply to:  2 comment:7 by gyozo.papp@…, 13 years ago

Cc: gyozo.papp@… added

Replying to chris@growl.info:

You got me, but I was just thinking word matches (minus words like the/a/etc).

Then IMHO it should be dictionary-based, so as not to exclude non-English Trac users. It could probably benefit from spam-filtering techniques somehow. (No clear concept yet, just thinking…)

comment:8 by anonymous, 11 years ago

The major databases (MSSQL, Oracle, MySQL, PostgreSQL) all support "full text" searching. Each differs considerably in richness, ease of use, and of course… setup.

comment:9 by Christian Boos, 10 years ago

Milestone: 2.0 → unscheduled

Milestone 2.0 deleted

comment:10 by Ryan J Ollos, 5 years ago

Owner: Jonas Borgström removed
Status: reopened → new
