My First Search Model

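Over the last few days I got my first Lucene-based search working, so here are my notes:
First you need the library (the JAR file), which you can download from the official website, and then you should spend some time studying it. Since I'm still in the learning phase, I downloaded two versions, 1.4.3 and 2.0: lucene-2.0 to keep for later development and lucene-1.4.3 to learn with. After all, a lot changed in the file formats with 2.0, including the format of the generated index, so it's best to have both versions. During development you just add the JAR to the project's classpath (one version per project, since both JARs contain the same class names). At first I honestly didn't know how, how embarrassing! I thought it worked like C++, where you just #include things. Thinking back on it makes my head spin: when I first started writing Java I even named a class Cjavaclass, MFC style, embarrassing! My variables still followed my C habit too, _javaVar_, embarrassing again, but I'm much better now.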
Step 1:
First write a *.java file that defines the constants:
public class Constants {
public final static String INDEX_FILE_PATH = "C:\\Java\\lucene\\DataSource";
public final static String INDEX_STORE_PATH = "C:\\Java\\lucene\\DataIndex";
}
These constants record where the files to be indexed live and where the finished index is stored.
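The hard-coded Windows paths tie the example to one machine. As a hedged variation, the sketch below reads both locations from JVM system properties and falls back to the original values. The class name `ConfigurableConstants` and the property names `lucene.data.dir` and `lucene.index.dir` are made up for illustration; they are not part of Lucene.

```java
// Sketch only: the same two constants, but overridable when the JVM is launched.
// The property names "lucene.data.dir" and "lucene.index.dir" are hypothetical.
final class ConfigurableConstants {
    // Directory holding the files to be indexed
    static final String INDEX_FILE_PATH =
            System.getProperty("lucene.data.dir", "C:\\Java\\lucene\\DataSource");
    // Directory where the finished index is written
    static final String INDEX_STORE_PATH =
            System.getProperty("lucene.index.dir", "C:\\Java\\lucene\\DataIndex");

    private ConfigurableConstants() {
        // constants holder, not meant to be instantiated
    }

    public static void main(String[] args) {
        System.out.println("source files: " + INDEX_FILE_PATH);
        System.out.println("index store:  " + INDEX_STORE_PATH);
    }
}
```

With this variation, a different data directory could be supplied with a `-Dlucene.data.dir=...` option on the `java` command line, without recompiling.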
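Step 2:
Write the class that builds the index:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Date;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// Builds an index over every file in Constants.INDEX_FILE_PATH (written against the Lucene 1.4.3 API)
public class LuceneIndex {
// The index writer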
private IndexWriter writer = null;
// Constructor: opens the writer; "true" creates a new index, overwriting any existing one
public LuceneIndex() {
  try {
   writer = new IndexWriter(Constants.INDEX_STORE_PATH,new StandardAnalyzer(), true);
  } catch (Exception e) {
   e.printStackTrace();
  }
}
// Wraps a file to be indexed in a Document with a "contents" field (the text) and a "path" field
private Document getDocument(File f) throws Exception {
  Document doc = new Document();
  FileInputStream is = new FileInputStream(f);
  Reader reader = new BufferedReader(new InputStreamReader(is));
  doc.add(Field.Text("contents", reader));
  doc.add(Field.Keyword("path", f.getAbsolutePath()));
  return doc;
}
public void writeToIndex() throws Exception {
  File folder = new File(Constants.INDEX_FILE_PATH);
  if (folder.isDirectory()) {
   String[] files = folder.list();
   System.out.println("Building index..........please wait");
   for (int i = 0; i < files.length; i++) {
    File file = new File(folder, files[i]);
    // Skip subdirectories: getDocument can only open regular files
    if (!file.isFile()) {
     continue;
    }
    System.out.println("Indexing file : " + file);
    Document doc = getDocument(file);
    writer.addDocument(doc);
    // Report completion only after the document has actually been added
    System.out.println("Done");
   }
  }
}
public void close() throws Exception {
  writer.close();
}
// Main program for testing
public static void main(String[] args) throws Exception {
  // Create a LuceneIndex object
  LuceneIndex indexer = new LuceneIndex();
  // Build the index
  Date start = new Date();
  indexer.writeToIndex();
  // close() flushes the index to disk, so it is included in the timing
  indexer.close();
  Date end = new Date();
  System.out.println("Index built..........Thank you for Lucene");
  System.out.println("");
  System.out.println("Elapsed time " + (end.getTime() - start.getTime())
    + " ms");
}
}
The index has now been generated: these are the index files used for full-text search over those texts.
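One detail worth checking before indexing Chinese text: `new InputStreamReader(is)` in `getDocument` decodes bytes with the platform's default charset, so files saved in a different encoding may end up in the index as garbage. The sketch below shows one way to open a reader with an explicit charset using only the JDK. The helper class `CharsetReaders` is hypothetical, and UTF-8 is only an assumption about how the source files were saved.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch only: a hypothetical helper, not part of Lucene.
final class CharsetReaders {
    private CharsetReaders() {
    }

    // Opens a buffered reader that decodes the file with the given charset
    // instead of the platform default.
    static Reader open(File f, Charset charset) throws IOException {
        return new BufferedReader(new InputStreamReader(new FileInputStream(f), charset));
    }

    // Reads the whole file into a String; handy for checking the decoding by eye.
    static String readAll(File f, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = open(f, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of a text file assumed to be saved as UTF-8
        System.out.println(readAll(new File(args[0]), StandardCharsets.UTF_8));
    }
}
```

If the decoded text looks right, the reader handed to `Field.Text` in `getDocument` could come from a helper like `CharsetReaders.open(f, StandardCharsets.UTF_8)` instead.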
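Step 3:
Now that the groundwork is in place, what's still needed is a search class. What for? To run queries, of course!
import java.util.Date;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class LuceneSearch {
// The IndexSearcher that runs queries against the index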
private IndexSearcher searcher = null;
// The most recently parsed Query
private Query query = null;
// Constructor: opens the index built by LuceneIndex
public LuceneSearch() {
  try {
   searcher = new IndexSearcher(IndexReader.open(Constants.INDEX_STORE_PATH));
  } catch (Exception e) {
   e.printStackTrace();
  }
}
public final Hits search(String keyword) {
  System.out.println("Searching for keyword : " + keyword);
  try {
   // Parse the keyword into a query against the "contents" field
   query = QueryParser.parse(keyword, "contents",
     new StandardAnalyzer());
   System.out.println(query);
   Date start = new Date();
   Hits hits = searcher.search(query);
   Date end = new Date();
   System.out.println("Search finished......." + " took "+ (end.getTime() - start.getTime()) + " ms");
   System.out.println(" ");
   return hits;
  } catch (Exception e) {
   e.printStackTrace();
   // Callers must handle null, which signals a failed search
   return null;
  }
}

public void printResult(Hits h) {
  // h is null when search() failed, so treat that like an empty result
  if (h == null || h.length() == 0) {
   System.out.println("Sorry, no results were found for your query");
  } else {
   for (int i = 0; i < h.length(); i++) {
    try {
     Document doc = h.doc(i);
     // Number results from 1 for display
     System.out.print("Result " + (i + 1) + ", file name: ");
     System.out.println(doc.get("path"));
    } catch (Exception e) {
     e.printStackTrace();
    }
   }
  }
  System.out.println(" ");
  System.out.println("----------------------------------");
  System.out.println(" ");
}

public static void main(String[] args) throws Exception {
  LuceneSearch test = new LuceneSearch();
  // Sample Chinese queries: "足球" (football) and "世界杯" (World Cup)
  Hits myHits1 = test.search("足球");
  Hits myHits2 = test.search("世界杯");
  test.printResult(myHits1);
  test.printResult(myHits2);
}
}
Step 4:
Run LuceneIndex.java =====> builds the index
Run LuceneSearch.java ====> searches for the keywords
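For intuition about what the indexing run produces and what the search run consults, here is a toy inverted index in plain Java. It is a deliberately simplified illustration, not Lucene's data structure or file format: the class `ToyInvertedIndex`, its split-on-punctuation tokenizer, and the sample file names are all assumptions made for this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Sketch only: maps each term to the set of documents that contain it.
final class ToyInvertedIndex {
    private final Map<String, SortedSet<String>> postings =
            new HashMap<String, SortedSet<String>>();

    // "Indexing": split the text into lowercase terms and record the document under each term.
    void add(String docPath, String text) {
        for (String term : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) {
                continue; // split() yields an empty first token when the text starts with a separator
            }
            SortedSet<String> docs = postings.get(term);
            if (docs == null) {
                docs = new TreeSet<String>();
                postings.put(term, docs);
            }
            docs.add(docPath);
        }
    }

    // "Searching": a single map lookup instead of scanning every file.
    SortedSet<String> search(String term) {
        SortedSet<String> docs = postings.get(term.toLowerCase());
        if (docs == null) {
            return Collections.unmodifiableSortedSet(new TreeSet<String>());
        }
        return Collections.unmodifiableSortedSet(docs);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("a.txt", "World Cup football news");
        index.add("b.txt", "Football transfer rumours");
        System.out.println(index.search("football"));
        System.out.println(index.search("cup"));
    }
}
```

The point of the sketch is the division of labour: all the tokenizing work happens once, at indexing time, so that each later query is a cheap lookup.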
OK, this is my first searcher!
Although it is very simple, it got me started with Lucene search. Thanks Lucene, thanks search!
I will keep studying Lucene and search, and artificial intelligence too!
I love this job!

posted on 2006-12-29 11:49 by 在法律保护下合法地抢银行 | Category: Open-Open
