Basic Lucene query code:
Searcher searcher = new IndexSearcher(dbpath);
Query query = QueryParser.parse(searchkey, searchfield,
    new StandardAnalyzer());
Hits hits = searcher.search(query);
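Below are the various Query subclasses. Each of them has a corresponding form in QueryParser.
1. TermQuery is the most commonly used. It searches the index for a single Term (the smallest unit of indexing, made up of a field name and a value).
A Term corresponds directly to the key and field arguments of QueryParser.parse.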
IndexSearcher searcher = new IndexSearcher(directory);
Term t = new Term("isbn", "1930110995");
Query query = new TermQuery(t);
Hits hits = searcher.search(query);
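2. RangeQuery is used for range searches. Its third parameter indicates whether the range is open or closed (true makes both endpoints inclusive).
The range is expanded into N queries, one for each indexed term between begin and end, and those queries are then executed.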
Term begin, end;
Searcher searcher = new IndexSearcher(dbpath);
begin = new Term("pubmonth","199801");
end = new Term("pubmonth","199810");
RangeQuery query = new RangeQuery(begin, end, true);
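RangeQuery is essentially an ordering comparison, so the following query is also valid, although its meaning differs somewhat from the one above.
It defines a range, and every term that falls inside the range is found. The comparison rule matters here: strings are compared character by character starting from the first character, so the result does not depend on string length.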
begin = new Term("pubmonth","19");
end = new Term("pubmonth","20");
RangeQuery query = new RangeQuery(begin, end, true);
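The ordering rule described above can be illustrated without Lucene. The following is a minimal sketch that assumes term values are compared the same way as Java's String.compareTo; the class RangeOrderSketch and the helper inRange are hypothetical names used only for illustration and are not part of the Lucene API.

```java
class RangeOrderSketch {
    // Hypothetical helper: true when value lies between begin and end
    // under plain String ordering. The inclusive flag applies to both ends.
    static boolean inRange(String value, String begin, String end, boolean inclusive) {
        int lower = value.compareTo(begin); // > 0 when value sorts after begin
        int upper = value.compareTo(end);   // < 0 when value sorts before end
        return inclusive ? (lower >= 0 && upper <= 0) : (lower > 0 && upper < 0);
    }

    public static void main(String[] args) {
        String[] months = {"199712", "199801", "199810", "200401", "20"};
        for (String month : months) {
            // "19" is a prefix of every 199x value, so those values sort after it;
            // "200401" sorts after "20" because "20" is its prefix.
            System.out.println(month + " in [19, 20]: " + inRange(month, "19", "20", true));
        }
    }
}
```

Under this ordering every value that starts with 19 falls inside the range from 19 to 20, while 200401 falls outside it, which is why the short endpoints behave like a filter on the first two characters.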
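3. PrefixQuery. A TermQuery only finds a record when the term matches exactly (as with fields created by Field.Keyword).
This limits the flexibility of a search. A PrefixQuery only needs to match the beginning of the value. For example, if the field is name and the
records contain jackliu, jackwu, and jackli, then searching for jack returns all of them. QueryParser creates a PrefixQuery
for a term when it ends with an asterisk (*) in query expressions.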
IndexSearcher searcher = new IndexSearcher(directory);
Term term = new Term("category", "/technology/computers/programming");
PrefixQuery query = new PrefixQuery(term);
Hits hits = searcher.search(query);
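4. BooleanQuery. All of the queries above search a single field. To search several fields at once, use a BooleanQuery,
which exists to combine multiple queries. Sub-queries are added through add(Query query, boolean required, boolean prohibited),
and nesting BooleanQuery instances makes it possible to build very complex queries.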
IndexSearcher searcher = new IndexSearcher(directory);
TermQuery searchingBooks =
    new TermQuery(new Term("subject","search"));
RangeQuery currentBooks =
    new RangeQuery(new Term("pubmonth","200401"),
                   new Term("pubmonth","200412"),true);
BooleanQuery currentSearchingBooks = new BooleanQuery();
currentSearchingBooks.add(searchingBooks, true, false);
currentSearchingBooks.add(currentBooks, true, false);
Hits hits = searcher.search(currentSearchingBooks);
The add method of BooleanQuery takes two boolean parameters (required and prohibited):
true & false: the added clause must be satisfied;
false & true: the added clause must not be satisfied;
false & false: the added clause is optional;
true & true: an invalid combination.
QueryParser handily constructs BooleanQuerys when multiple terms are specified.
Grouping is done with parentheses, and the prohibited and required flags are
set when the -, +, AND, OR, and NOT operators are specified.
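The four flag combinations can be read as a small decision procedure. The sketch below models it in plain Java under the assumption that a document with no required clauses must match at least one optional clause to be accepted; BooleanClauseSketch, Clause, and accepts are hypothetical names for illustration, not Lucene classes.

```java
class BooleanClauseSketch {
    // One clause: whether its sub-query matched the document, plus the two add() flags.
    static final class Clause {
        final boolean matched;
        final boolean required;
        final boolean prohibited;

        Clause(boolean matched, boolean required, boolean prohibited) {
            if (required && prohibited) {
                // true & true is the invalid combination from the list above.
                throw new IllegalArgumentException("clause cannot be both required and prohibited");
            }
            this.matched = matched;
            this.required = required;
            this.prohibited = prohibited;
        }
    }

    // Decides whether a document is accepted, given the outcome of every clause.
    static boolean accepts(Clause... clauses) {
        boolean hasRequired = false;
        boolean optionalMatched = false;
        for (Clause c : clauses) {
            if (c.prohibited && c.matched) {
                return false;                 // a prohibited clause matched
            }
            if (c.required) {
                hasRequired = true;
                if (!c.matched) {
                    return false;             // a required clause failed
                }
            } else if (!c.prohibited && c.matched) {
                optionalMatched = true;       // an optional clause matched
            }
        }
        // Assumption: without required clauses, at least one optional clause must match.
        return hasRequired || optionalMatched;
    }
}
```

In the currentSearchingBooks example both clauses are added with true, false, so a book is returned only when it matches the subject term and also falls inside the pubmonth range.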
5. PhraseQuery performs more precise searches. It can constrain the positions of two or more
keywords in the indexed text, for example finding text that contains A and B with exactly one word between them. Terms surrounded by double quotes in
QueryParser parsed expressions are translated into a PhraseQuery.
The slop factor defaults to zero, but you can adjust the slop factor
by adding a tilde (~) followed by an integer.
For example, the expression "quick fox"~3 sets the slop factor to 3.
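The slop factor can be pictured as the number of position moves needed to line the terms up as an exact phrase. The following is a simplified sketch for two-term phrases only, assuming each term occurs once and that positions are counted per token; PhraseSlopSketch, requiredSlop, and matches are hypothetical names, and Lucene's real scorer handles longer phrases and repeated terms.

```java
class PhraseSlopSketch {
    // Smallest slop at which the two-term phrase "first second" would match,
    // given the token positions of the two terms in the document.
    static int requiredSlop(int firstPos, int secondPos) {
        // In an exact phrase the second term sits at firstPos + 1,
        // so the displacement from that ideal position is the slop needed.
        return Math.abs((secondPos - 1) - firstPos);
    }

    static boolean matches(int firstPos, int secondPos, int slop) {
        return requiredSlop(firstPos, secondPos) <= slop;
    }

    public static void main(String[] args) {
        // Document tokens: quick(0) brown(1) fox(2)
        System.out.println("\"quick fox\" needs slop " + requiredSlop(0, 2));
        // Reversed phrase: first term fox is at 2, second term quick is at 0.
        System.out.println("\"fox quick\" needs slop " + requiredSlop(2, 0));
    }
}
```

Under this model "quick fox" matches the text quick brown fox with a slop of 1, while the reversed phrase "fox quick" needs a slop of 3, so "quick fox"~3 would accept both orderings.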
6. WildcardQuery. WildcardQuery offers finer control and more flexibility than PrefixQuery, and it is the easiest query
to understand and use.
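To make the extra flexibility concrete, here is a minimal matcher sketch in plain Java. It assumes the usual wildcard conventions, where * stands for any run of characters (including none) and ? stands for exactly one character; WildcardSketch and matches are hypothetical names, and this is not how Lucene itself enumerates matching terms.

```java
class WildcardSketch {
    // Returns true when text matches pattern, where '*' matches any run of
    // characters (possibly empty) and '?' matches exactly one character.
    static boolean matches(String pattern, String text) {
        int p = 0;          // current index in pattern
        int t = 0;          // current index in text
        int starP = -1;     // index of the most recent '*' in pattern, or -1
        int starT = 0;      // text index where that '*' started matching
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                starP = p;  // remember the star and first try matching nothing
                starT = t;
                p++;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;        // single character matched
                t++;
            } else if (starP >= 0) {
                starT++;    // backtrack: let the last star absorb one more character
                t = starT;
                p = starP + 1;
            } else {
                return false;
            }
        }
        // Only trailing stars may remain once the text is consumed.
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }
}
```

With the names from the PrefixQuery example, the pattern jack* behaves like a prefix search and matches jackliu, jackwu, and jackli, while jack?i matches only jackli and j*u matches jackliu and jackwu.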
7. FuzzyQuery. This query is somewhat special: it finds other records whose terms look very similar to the search keyword. QueryParser
supports FuzzyQuery by suffixing a term with a tilde (~), for example wuzza~.
public void testFuzzy() throws Exception {
    indexSingleFieldDocs(new Field[] {
        Field.Text("contents", "fuzzy"),
        Field.Text("contents", "wuzzy")
    });
    IndexSearcher searcher = new IndexSearcher(directory);
    Query query = new FuzzyQuery(new Term("contents", "wuzza"));
    Hits hits = searcher.search(query);
    assertEquals("both close enough", 2, hits.length());
    assertTrue("wuzzy closer than fuzzy",
        hits.score(0) != hits.score(1));
    assertEquals("wuzza bear","wuzzy", hits.doc(0).get("contents"));
}
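FuzzyQuery measures similarity with the Levenshtein (edit) distance between terms. The sketch below computes that distance in plain Java to show why wuzzy is closer to wuzza than fuzzy is; EditDistanceSketch and distance are hypothetical names, and Lucene's exact scoring formula is not reproduced here.

```java
class EditDistanceSketch {
    // Levenshtein distance: the minimum number of single-character insertions,
    // deletions, and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;                          // build b's prefix from an empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;                          // delete all of a's first i characters
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1,     // insertion
                             prev[j] + 1),        // deletion
                    prev[j - 1] + cost);          // substitution or match
            }
            int[] tmp = prev;                     // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println("wuzza -> wuzzy: " + distance("wuzza", "wuzzy"));
        System.out.println("wuzza -> fuzzy: " + distance("wuzza", "fuzzy"));
    }
}
```

The distance from wuzza to wuzzy is 1 and the distance to fuzzy is 2, which is consistent with the test above expecting both documents to be found with wuzzy ranked first.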