Edgewall Software
Opened 13 years ago

Last modified 3 years ago

#2775 new enhancement

Automatic matching of similar issues

Reported by: chris@… Owned by:
Priority: normal Milestone: unscheduled
Component: ticket system Version: 0.9.4
Severity: normal Keywords: clustering
Cc: gyozo.papp@…
Release Notes:
API Changes:

Description

To make prioritizing easier, it would be really cool to see how many issues have closely matching content. This would help when duplicate tickets are created, by showing the user where the dupes are, and also when deciding which issues should be worked on first (if the basis is the number of requests).

Attachments (0)

Change History (10)

comment:1 Changed 13 years ago by matthew.donald@…

Chris, I agree, this would be a very neat feature.

The only question is, how do you implement it? I mean, what algorithm will automatically classify "closely matching content"?

comment:2 Changed 13 years ago by chris@…

You got me, but I was just thinking word matches (minus words like the/a/etc).
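A minimal sketch of that word-match idea, not anything Trac ships: it compares two ticket texts by the overlap of their words after dropping stop words (Jaccard similarity). The stop-word list, the function names and the 0.3 threshold are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny English stop-word list; a real one would be larger.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in",
              "is", "it", "on", "for", "with", "when"}

def significant_words(text):
    """Lower-case the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def similarity(text_a, text_b):
    """Jaccard similarity: shared words divided by all distinct words."""
    a, b = significant_words(text_a), significant_words(text_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def similar_tickets(new_text, tickets, threshold=0.3):
    """Return (ticket_id, score) pairs at or above the threshold, best first.

    `tickets` maps a ticket id to its text; the threshold is an arbitrary guess.
    """
    scored = [(tid, similarity(new_text, text)) for tid, text in tickets.items()]
    matches = [(tid, score) for tid, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

Calling `similar_tickets(summary, open_tickets)` while a ticket is being filed would give a ranked list of possible dupes; counting the matches per ticket would give the "how many similar issues" figure.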

comment:3 Changed 13 years ago by Christian Boos

This ticket is a bit vague; it suggests something like a clustering of tickets. But you may want to have a look at #1069 nevertheless (searching for similar tickets at ticket creation time).

comment:4 Changed 12 years ago by Christian Boos

Resolution: duplicate
Status: new → closed

See #1069.

comment:5 Changed 12 years ago by chris@…

Resolution: duplicate
Status: closed → reopened

That's not really the aspect I was going for on this one, though I was unclear when I filed it. I did think this would help when matching dupe tickets, but that's more of a side point.

From a ticket administration perspective, say on the Adium Trac, I'd like to be able to have queues/clusters/groupings of tickets so that I know which tickets in a general area are related, and by what percentage.

For instance:

79% of tickets mention the word msn, but only 25% of those mention accounts. Being able to drill down by subsections of tickets to find out where the actual problems are within a superset of a feature would be truly beneficial.

If I maintained a portion of code, it would be great to be able to narrow down, out of what's left open, what to concentrate the most on.

Let me know if I didn't explain this very well; I think it's kind of hard to convey.
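The drill-down described above could be sketched roughly as follows. This is only an illustration under assumptions: tickets are plain strings, a "mention" is a case-insensitive whole-word match, and the names `mentions` and `drill_down` are made up for the example.

```python
import re

def mentions(text, word):
    """True if `word` occurs as a whole word in `text`, ignoring case."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

def drill_down(tickets, *words):
    """Narrow the ticket set one word at a time.

    Returns a list of (word, matching_count, percent_of_previous_set),
    where each percentage is relative to the set left by the previous word.
    """
    current = list(tickets)
    report = []
    for word in words:
        matching = [t for t in current if mentions(t, word)]
        percent = 100.0 * len(matching) / len(current) if current else 0.0
        report.append((word, len(matching), percent))
        current = matching
    return report
```

With this shape, `drill_down(open_tickets, "msn", "accounts")` reports the share of all tickets mentioning msn, then the share of those that also mention accounts.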

comment:6 Changed 12 years ago by Christian Boos

Keywords: clustering added
Milestone: 2.0

Ok, that's clearer to me now ;-)

I'm setting that as a 2.0 ticket, which doesn't mean someone couldn't propose a patch earlier ;-)

One possible approach would be to use the ticket keywords as tags and display groups in a tag cloud style, with precise statistics (percentages, as you suggested) and additional links (to the corresponding query).
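As a rough sketch of the statistics behind such a tag cloud, assuming the keywords field is a string separated by whitespace and/or commas (the function name and the input shape are hypothetical, not Trac API):

```python
from collections import Counter

def keyword_stats(ticket_keywords):
    """Count how many tickets carry each keyword and their share of all tickets.

    `ticket_keywords` maps a ticket id to its keywords string.
    Returns (keyword, ticket_count, percent_of_tickets) tuples, most common first.
    """
    counts = Counter()
    for keywords in ticket_keywords.values():
        # A set, so a keyword repeated within one ticket is counted once.
        tags = {k.lower() for k in keywords.replace(",", " ").split()}
        counts.update(tags)
    total = len(ticket_keywords)
    return [(tag, n, 100.0 * n / total) for tag, n in counts.most_common()]
```

The percentage could drive the font size of each tag in the cloud, and each tag could link to a query filtered on that keyword.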

comment:7 in reply to:  2 Changed 11 years ago by gyozo.papp@…

Cc: gyozo.papp@… added

Replying to chris@growl.info:

You got me, but I was just thinking word matches (minus words like the/a/etc).

Then it should be dictionary-based, IMHO, so as not to exclude non-English Trac users. It could probably benefit from spam-filtering techniques somehow. (No clear concept, just thinking…)

comment:8 Changed 9 years ago by anonymous

The major databases (MSSQL, Oracle, MySQL, PostgreSQL) all support "full text" searching. Of course each is completely different in richness, ease of use, and of course… setup.

comment:9 Changed 8 years ago by Christian Boos

Milestone: 2.0 → unscheduled

Milestone 2.0 deleted

comment:10 Changed 3 years ago by Ryan J Ollos

Owner: Jonas Borgström deleted
Status: reopened → new