Content:
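In early 2009, the Database Research Group of the Department of Computer Science had several papers published in or accepted by leading international journals and conferences. Among them, the paper "CONTOUR: An Efficient Algorithm for Discovering Discriminating Subsequences" (authors: Jianyong Wang, Yuzhou Zhang, Lizhu Zhou, George Karypis, and Charu Aggarwal) appeared in the first issue of 2009 of Data Mining and Knowledge Discovery (DMKD), a leading international journal in the data mining field. Another paper, "Incremental Sequence-Based Frequent Query Pattern Mining from XML Queries" (authors: Guoliang Li, Jianhua Feng, Jianyong Wang, and Lizhu Zhou), was completed entirely at Tsinghua University and has been accepted by DMKD. To date, no paper completed entirely by researchers from mainland China has been published in DMKD.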
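In addition, three more papers from the Database Research Group were accepted by ICDE 2009, EDBT 2009, and WWW 2009. "Progressive Top-k Keyword Search in Relational Databases" (authors: Guoliang Li, Xiaofang Zhou, Jianhua Feng, and Jianyong Wang) was accepted as a short paper by ICDE 2009, a leading international conference in the database field. "FOGGER: An Algorithm for Graph Generator Discovery" (authors: Zhiping Zeng, Jianyong Wang, Jun Zhang, and Lizhu Zhou) was accepted by EDBT 2009, another leading international database conference. "Efficient Interactive Fuzzy Keyword Search" (authors: Shengyue Ji, Guoliang Li, Chen Li, and Jianhua Feng) was accepted as a full paper by WWW 2009, a leading international conference in the World Wide Web field.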
Abstract: Existing algorithms for mining frequent XML query patterns (XQPs) employ a candidate generate-and-test strategy. They involve expensive candidate enumeration and costly tree-containment checking. Further, most existing methods periodically compute the frequencies of candidate query patterns from scratch by checking the entire transaction database, which consists of XQPs transferred from user query logs. However, it is not straightforward to maintain such discovered frequent patterns in real XML databases, as there may be frequent updates that not only invalidate some existing frequent query patterns but also generate some new frequent query patterns. Therefore, a drawback of existing methods is that they are rather inefficient when transaction databases evolve. To address the above-mentioned problems, this paper proposes an efficient algorithm, ESPRIT, to mine frequent XQPs without costly tree-containment checking. ESPRIT transforms XML queries into sequences using a one-to-one mapping technique and mines the frequent sequences to generate frequent XQPs. We propose two efficient incremental algorithms, ESPRIT-i and ESPRIT-i+, to incrementally mine frequent XQPs. We devise several novel optimization techniques of query rewriting, cache lookup, and cache replacement to improve the answerability and the hit rate of caching. We have implemented our algorithms and conducted a set of experimental studies on various datasets. The experimental results demonstrate that our algorithms achieve high efficiency and scalability and outperform state-of-the-art methods significantly.
Keywords: XML query patterns; Frequent query patterns; XML frequent pattern mining; Incremental mining; Sequential pattern mining
The other paper, from http://dbgroup.cs.tsinghua.edu.cn/ligl/
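The abstract above mentions mapping XML queries one-to-one onto sequences and then mining frequent sequences instead of performing tree-containment checks. The abstract does not describe the actual ESPRIT encoding, so the following is only a generic sketch under assumptions: a query is modeled as an ordered labeled tree, the encoding is a preorder list of (depth, label) pairs, and support is counted with a plain subsequence test. The names `encode`, `decode`, `is_subsequence`, and `support` are hypothetical and not taken from the paper.

```python
# Illustrative sketch only; NOT the ESPRIT encoding from the paper.
# Assumption: a query pattern is an ordered tree written as (label, [children]).

def encode(tree, depth=0):
    """Preorder list of (depth, label) pairs; invertible for ordered trees."""
    label, children = tree
    seq = [(depth, label)]
    for child in children:
        seq.extend(encode(child, depth + 1))
    return seq

def decode(seq):
    """Rebuild the tree from its (depth, label) sequence."""
    root = None
    stack = []  # stack[d] is the most recent node seen at depth d
    for depth, label in seq:
        node = (label, [])
        if depth == 0:
            root = node
            stack = [node]
        else:
            del stack[depth:]                  # close deeper or sibling branches
            stack[depth - 1][1].append(node)   # attach to the parent
            stack.append(node)
    return root

def is_subsequence(pattern, seq):
    """True if pattern occurs in seq in order, not necessarily contiguously."""
    it = iter(seq)
    return all(item in it for item in pattern)

def support(pattern, database):
    """Number of encoded queries that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in database)

# Three toy queries (hypothetical labels).
q1 = ("book", [("title", []), ("author", [("name", [])])])
q2 = ("book", [("author", [("name", [])])])
q3 = ("book", [("title", [])])
database = [encode(q) for q in (q1, q2, q3)]
pattern = [(0, "book"), (1, "author")]
count = support(pattern, database)
```

In this toy setting the pattern `book/author` is supported by two of the three queries. A plain subsequence test does not by itself guarantee that every matched subsequence corresponds to a valid query pattern; a real miner would need additional constraints, which the abstract does not detail.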