0.Definition
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java.
It is a technology suitable for nearly any application that requires full-text search, especially cross-platform applications.
1.simpleExample: provides the simplest sample
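As a placeholder for the simplest sample, here is a minimal sketch of the core idea behind a full-text search library: an inverted index that maps each term to the documents containing it. It uses only the Java standard library and is not Lucene's API. The class `TinyIndex` and its methods `add`, `search`, and `get` are hypothetical names chosen for illustration, and the tokenizer is a crude stand-in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Lucene.
class TinyIndex {
    // term -> ids of the documents that contain the term (kept sorted)
    private final Map<String, TreeSet<Integer>> postings = new LinkedHashMap<>();
    private final List<String> docs = new ArrayList<>();

    // Split text into lowercase word tokens (a crude stand-in for an analyzer).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Store the document, record each of its terms, and return the new document id.
    int add(String text) {
        int id = docs.size();
        docs.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // Ids of the documents containing the term, in ascending order.
    List<Integer> search(String term) {
        TreeSet<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    String get(int id) {
        return docs.get(id);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("Lucene is a text search engine library");
        index.add("Java runs on many platforms");
        index.add("Full-text search with Lucene in Java");
        for (int id : index.search("lucene")) {
            System.out.println(id + ": " + index.get(id));
        }
    }
}
```

A real Lucene program follows the same two phases (index documents, then search them) but adds analysis, scoring, and on-disk storage, which this sketch leaves out.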
2.Features
Lucene is a high-performance, scalable, cross-platform search engine with many advanced features that
most users leave untapped. This session, designed for those already familiar with Lucene,
examines some of Lucene's more advanced topics and their applications, including:
Term Vectors:
    Manual and pseudo relevance feedback;
    Advanced document collection analysis for domain specialization
Span Queries:
    Better phrase matching;
    Candidate identification for question answering
Tying It All Together:
    Building a search framework for experimentation and rapid deployment
Case Studies from CNLP:
    Cross-lingual/multilingual retrieval in Arabic, English, and Dutch;
    Sublanguage specialization for commercial trouble ticket analysis;
    Passage retrieval and analysis for question answering applications
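The Term Vectors item in the list above mentions pseudo relevance feedback. A rough sketch of that idea, assuming a term vector is simply a per-document map from term to frequency: treat the top-ranked documents of a first search as relevant, sum their term vectors, and use the most frequent terms that are not already in the query as expansion terms. The class `TermVectorFeedback` and its methods are hypothetical and use only the Java standard library, not Lucene's term vector API. Stop-word removal and term weighting are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Lucene.
class TermVectorFeedback {
    // Term vector of one document: term -> number of occurrences in that document.
    static Map<String, Integer> termVector(String text) {
        Map<String, Integer> vector = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) vector.merge(token, 1, Integer::sum);
        }
        return vector;
    }

    // Sum the term vectors of the assumed-relevant top documents and return
    // up to n terms with the highest total frequency that the query lacks.
    static List<String> expansionTerms(Set<String> queryTerms, List<String> topDocs, int n) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String doc : topDocs) {
            termVector(doc).forEach((term, freq) -> totals.merge(term, freq, Integer::sum));
        }
        List<String> candidates = new ArrayList<>(totals.keySet());
        candidates.removeAll(queryTerms);
        // Highest frequency first; the stable sort keeps ties in alphabetical order.
        candidates.sort((a, b) -> Integer.compare(totals.get(b), totals.get(a)));
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }
}
```

In manual relevance feedback the user picks the relevant documents; in the pseudo variant shown here the top results are assumed relevant without asking.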
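The Span Queries item above mentions better phrase matching. The underlying idea is to match on token positions rather than only on term presence. The sketch below is a simplified, ordered, two-term proximity check in the spirit of a span-near query. It is not Lucene's span query API: the class `SpanNearSketch`, the method `findSpans`, and the definition of `slop` as the number of tokens allowed between the two terms are all assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
class SpanNearSketch {
    // Split text into lowercase word tokens; list index = token position.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Find spans where `first` is followed by `second` with at most `slop`
    // tokens in between. Each result is {startPosition, endPosition}, inclusive.
    static List<int[]> findSpans(String text, String first, String second, int slop) {
        List<String> tokens = tokenize(text);
        String a = first.toLowerCase(Locale.ROOT);
        String b = second.toLowerCase(Locale.ROOT);
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(a)) continue;
            // Last position at which `second` may still occur for this start.
            int limit = Math.min(tokens.size() - 1, i + 1 + slop);
            for (int j = i + 1; j <= limit; j++) {
                if (tokens.get(j).equals(b)) {
                    spans.add(new int[] {i, j});
                    break; // keep only the nearest match for this start position
                }
            }
        }
        return spans;
    }
}
```

Because each match carries its start and end positions, the surrounding passage can be extracted afterwards, which is what makes this kind of query useful for identifying answer candidates.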
3.Keywords/Key terms
4.Technology
5.Principles
6.Comparison with similar tools
7.In-depth study
8.Reference
<Lucene in Action>