#2775 new enhancement

Automatic matching of similar issues

Reported by:	chris@…	Owned by:
Priority:	normal	Milestone:	unscheduled
Component:	ticket system	Version:	0.9.4
Severity:	normal	Keywords:	clustering
Cc:	gyozo.papp@…	Branch:
Release Notes:
API Changes:
Internal Changes:

Description

To make prioritizing easier, it would be really cool to see something that displays how many issues have closely matching content. This would help when duplicate tickets are created, by showing the user where the dupes are, and also when deciding which issues should be worked on first (if the basis is the number of requests).

Change History (10)

comment:1 by matthew.donald@…, 20 years ago

Chris, I agree, this would be a very neat feature.

The only question is, how do you implement it? I mean, what algorithm will automatically classify "closely matching content"?

follow-up: 7 comment:2 by chris@…, 20 years ago

You got me, but I was just thinking word matches (minus words like the/a/etc).
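A minimal sketch of that word-match idea, under stated assumptions: the stop-word list and the function names below are hypothetical, and the score is plain Jaccard overlap on the remaining words, which is only one possible way to measure "closely matching". It is not taken from Trac.

```python
import re

# Hypothetical stop-word list; a real one would be longer and per-language.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "when", "on"}

def significant_words(text):
    """Lower-case the text, split it into alphanumeric words, drop stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}

def similarity(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    wa, wb = significant_words(a), significant_words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

score = similarity("MSN account crashes on login", "Crash when MSN account logs in")
```

Here only "msn" and "account" are shared, because "crashes" and "crash" count as different words; stemming would be needed to catch those.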

comment:3 by Christian Boos, 19 years ago

This ticket is a bit vague; it suggests something like a clustering of tickets. But you may want to have a look at #1069 nevertheless (searching for similar tickets at ticket creation time).

comment:4 by Christian Boos, 18 years ago

Resolution:	→ duplicate
Status:	new → closed

See #1069.

comment:5 by chris@…, 18 years ago

Resolution:	duplicate
Status:	closed → reopened

That's not really the aspect I was going for on this one, though I was unclear when I filed this. I did think this would help when matching dupe tickets, but that's more of a side point.

From a ticket administration perspective, say on Adium trac, I'd like to be able to have queues/clusters/groupings of tickets so that I know which tickets in a general area are related, and by what percentage.

For instance:

79% of tickets mention the word msn, but only 25% of those mention accounts. Being able to drill down by subsections of tickets to find out where the actual problems are within a superset of a feature would be truly beneficial.
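A rough sketch of that kind of drill-down, assuming tickets are available as plain strings; the `mentions` and `drill_down` helpers are hypothetical names made up for illustration, and matching is on exact whole words only.

```python
import re

def mentions(text, word):
    """True if `word` appears as a whole word in `text` (case-insensitive)."""
    return word.lower() in re.findall(r"[a-z0-9]+", text.lower())

def drill_down(tickets, *words):
    """For each word in turn, return the share of the current subset that
    mentions it, then narrow the subset to those matching tickets."""
    subset = list(tickets)
    shares = []
    for word in words:
        matching = [t for t in subset if mentions(t, word)]
        shares.append(len(matching) / len(subset) if subset else 0.0)
        subset = matching
    return shares

# Made-up sample data, not real Adium tickets.
tickets = [
    "msn accounts broken",
    "msn login fails",
    "msn crash",
    "msn disconnects",
    "aim icon missing",
]
shares = drill_down(tickets, "msn", "accounts")
```

With this sample, `shares` holds the fraction of all tickets mentioning "msn", followed by the fraction of those that also mention "accounts".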

If I maintained a portion of code, it would be great to be able to narrow down, out of what's left open, what to concentrate the most on.

Let me know if I didn't explain this very well, I think it's kind of hard to relay.

comment:6 by Christian Boos, 18 years ago

Keywords:	clustering added
Milestone:	→ 2.0

Ok, that's clearer to me now ;-)

I'm setting that as a 2.0 ticket, which doesn't mean someone couldn't propose a patch earlier ;-)

One possible approach would be to use the ticket keywords as tags, and display groups in a tag cloud style, with precise statistics (percentages as you suggested) and additional links (to the corresponding query).
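As a sketch of the statistics part only, assuming each ticket's keywords field is a string of keywords separated by spaces or commas; `keyword_stats` is a hypothetical helper, not an existing Trac function, and the tag cloud rendering and query links are left out.

```python
import re
from collections import Counter

def keyword_stats(keyword_fields):
    """Map each keyword to the percentage of tickets carrying it."""
    total = len(keyword_fields)
    if total == 0:
        return {}
    counts = Counter()
    for field in keyword_fields:
        # Split on commas/whitespace; the set counts a keyword once per ticket.
        counts.update(set(re.split(r"[,\s]+", field.strip().lower())) - {""})
    return {kw: 100.0 * n / total for kw, n in counts.most_common()}

# Made-up keyword fields, one string per ticket (the last ticket has none).
stats = keyword_stats(["msn accounts", "msn", "clustering, msn", ""])
```

The percentages are relative to all tickets, including those without keywords, so they can be used directly to size the entries of a tag cloud.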

in reply to: 2 comment:7 by gyozo.papp@…, 18 years ago

Cc:	gyozo.papp@… added

Replying to chris@growl.info:

You got me, but I was just thinking word matches (minus words like the/a/etc).

Then it should be dictionary-based IMHO, so as not to exclude non-English Trac users. It could probably benefit from spam-filtering techniques somehow. (No clear concept yet, just thinking…)

comment:8 by anonymous, 16 years ago

The major databases (MSSQL, Oracle, MySQL, PostgreSQL) all support "full text" searching. Of course each is completely different in richness, ease of use, and of course… setup.

comment:9 by Christian Boos, 15 years ago

Milestone:	2.0 → unscheduled

Milestone 2.0 deleted

comment:10 by Ryan J Ollos, 10 years ago

Owner:	Jonas Borgström removed
Status:	reopened → new
