TY - JOUR

T1 - A practical algorithm to find the best subsequence patterns

AU - Hirao, Masahiro

AU - Hoshino, Hiromasa

AU - Shinohara, Ayumi

AU - Takeda, Masayuki

AU - Arikawa, Setsuo

PY - 2003/1/27

Y1 - 2003/1/27

AB - Given two sets of strings, consider the problem of finding a subsequence that is common to one set but never appears in the other set. We regard this as finding a subsequence pattern that separates the two sets. The problem is known to be NP-complete. We naturally generalize it to an optimization problem, in which we try to find a subsequence pattern that maximally separates the two sets. We provide a practical algorithm that solves it exactly. Our algorithm uses two pruning heuristics based on the properties of subsequence languages, and utilizes a data structure called subsequence automata. We report some experimental results, which show that these heuristics and the data structure help reduce the search time.

KW - Optimal pattern discovery

KW - Subsequence

KW - Subsequence automata

UR - http://www.scopus.com/inward/record.url?scp=0037467657&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037467657&partnerID=8YFLogxK

U2 - 10.1016/S0304-3975(02)00182-2

DO - 10.1016/S0304-3975(02)00182-2

M3 - Article

AN - SCOPUS:0037467657

VL - 292

SP - 465

EP - 479

JO - Theoretical Computer Science

JF - Theoretical Computer Science

SN - 0304-3975

IS - 2

ER -