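To compare the output of different analyzers, we first need to obtain each analyzer's tokenization result.
Below is a snippet that retrieves Lucene's tokenization result and returns the tokens as a List<String>: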
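import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Returns the tokens that the given analyzer produces for a string
public static List<String> getAnalyseResult(String analyzeStr, Analyzer analyzer) {
    List<String> response = new ArrayList<String>();
    TokenStream tokenStream = null;
    try {
        // "content" is only a field name passed to the analyzer
        tokenStream = analyzer.tokenStream("content", new StringReader(analyzeStr));
        CharTermAttribute attr = tokenStream.addAttribute(CharTermAttribute.class);
        tokenStream.reset(); // must be called before the first incrementToken()
        while (tokenStream.incrementToken()) {
            response.add(attr.toString());
        }
        tokenStream.end(); // end-of-stream processing, called before close()
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Always release the stream so the analyzer can be reused
        if (tokenStream != null) {
            try {
                tokenStream.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return response;
}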
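Once each analyzer's tokens are available as a list, comparing two analyzers is ordinary collection work. The following is a minimal sketch and not part of the original post: the class name `TokenComparison` and the method `onlyIn` are hypothetical, and the sample token lists are made up for illustration rather than real analyzer output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Lucene): compares two token lists.
final class TokenComparison {

    // Returns the distinct tokens that appear in `left` but not in `right`,
    // keeping the order in which they first occur in `left`.
    static List<String> onlyIn(List<String> left, List<String> right) {
        Set<String> result = new LinkedHashSet<>(left);
        result.removeAll(new HashSet<>(right));
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // Made-up token lists standing in for the output of two analyzers.
        List<String> a = Arrays.asList("lucene", "word", "segmentation");
        List<String> b = Arrays.asList("lucene", "word", "seg", "mentation");

        System.out.println("only in a: " + onlyIn(a, b));
        System.out.println("only in b: " + onlyIn(b, a));
    }
}
```

In practice the two lists would come from calling `getAnalyseResult` on the same input string with two different `Analyzer` instances. This sketch compares distinct tokens only and ignores how often each token occurs.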