We prove that the LCSK query is NP-hard, and devise an exact algorithm as well as an approximate algorithm with a provable approximation bound for this problem. The proposed exact algorithm, MergeList, is based on a keyword hash table index structure and explores the candidate space progressively with several pruning strategies. Unfortunately, this approach does not scale to large datasets. We thus develop an approximate algorithm called MaxMargin, which finds the answer by traversing the proposed LIR-tree in a best-first fashion. Moreover, two optimization strategies are used to improve query performance. Experiments on real and synthetic datasets verify that the proposed approximate algorithm runs much faster than the competitor while achieving the desired accuracy.