Ticket #104 (closed QA: fixed)

Opened 14 years ago

Last modified 14 years ago

Goods full-index build errors and optimization during the mall's migration to pg

Reported by: huangzhong Owned by: huangzhong
Priority: major Milestone: 2012 version 6.0
Component: System-related Version: 6.0
Keywords: full index, database migration Cc:
Due Date: 25/05/2012

Description (last modified by huangzhong) (diff)

  • Background

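Search for goods, products, shops and so on in the mall is implemented with Lucene, and a large part of that work is building indexes. Every night a full index is built for goods, products, shops and so on, and incremental indexing runs during the day. In production, the full goods index covers more than 800,000 goods.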

  • Fault

In version 6.0, building the full goods and product indexes cannot run to completion; it fails with an out-of-memory error.

  • Cause

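When the index files are written, the scope of the document variable is too wide, which may prevent the GC from reclaiming objects promptly.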

Document document;
while (res.next()) { // res holds around a million rows
    try {
        // Create the Document and Field for the index
        document = new Document();
        document.add(new Field("online_flag", String.valueOf(onlineFlag), Field.Store.YES, Field.Index.NOT_ANALYZED));
        indexWriter.addDocument(document);
    } catch (Exception ex) {
        // Skip rows that fail to index
        continue;
    }
}
Modified as follows:
while (res.next()) { // res holds around a million rows
    Document document;
    try {
        // Create the Document and Field for the index
        document = new Document();
        document.add(new Field("online_flag", String.valueOf(onlineFlag), Field.Store.YES, Field.Index.NOT_ANALYZED));
        indexWriter.addDocument(document);
    } catch (Exception ex) {
        // Skip rows that fail to index
        continue;
    } finally {
        // Drop the reference at the end of every iteration
        document = null;
    }
}

  • Optimization

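1. With this change the program runs to completion, but a hidden risk remains: fetching over a million rows in a single query puts too much pressure on the application. The data is therefore read page by page in a loop, 100,000 rows at a time. This means more database reads, but the application is more stable.
2. Previously each of the three servers built its own index, which put heavy and unnecessary load on the database. The plan now is to build one copy of the index on a single server and then sync it to the other two machines.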
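
The paged read in item 1 could look roughly like the sketch below. The ticket does not show the real query or loop, so everything here is an assumption made for illustration: the names PagedIndexer, indexInPages, fetchPage and indexRow are hypothetical, and fetchPage stands in for whatever query returns one page of rows (for example a LIMIT/OFFSET query, which needs a stable ORDER BY so that pages do not overlap).

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class PagedIndexer {

    /**
     * Reads rows one page at a time and passes each row to the indexer.
     *
     * fetchPage.apply(offset, limit) is a hypothetical stand-in for the real
     * paging query. It must return at most `limit` rows, and fewer than
     * `limit` rows (possibly none) once the data is exhausted.
     *
     * Returns the total number of rows processed.
     */
    public static <T> long indexInPages(BiFunction<Long, Integer, List<T>> fetchPage,
                                        Consumer<T> indexRow,
                                        int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long offset = 0;
        while (true) {
            // Only one page of rows is held in memory at a time.
            List<T> page = fetchPage.apply(offset, pageSize);
            for (T row : page) {
                indexRow.accept(row);
            }
            offset += page.size();
            // A short page means there is nothing left to read.
            if (page.size() < pageSize) {
                return offset;
            }
        }
    }
}
```

With the page size from the ticket, a call would pass 100000 as pageSize, and indexRow would build the Document and call indexWriter.addDocument for each row.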

Change History

comment:1 Changed 14 years ago by huangzhong

  • Keywords full index, database migration added; full index removed
  • Type changed from defect to QA

comment:2 Changed 14 years ago by huangzhong

  • Description modified (diff)

comment:3 Changed 14 years ago by huangzhong

  • Status changed from new to closed
  • Resolution set to fixed