Ticket #134 (closed defect: fixed) at Version 3
Analysis and resolution of high load when crawlers fetch the product price library at a higher rate
| Reported by: | yuanhuoqing | Owned by: | |
|---|---|---|---|
| Priority: | major | Milestone: | 2012 Price Library 5.0 |
| Component: | Price Library | Version: | Price Library 5.0 |
| Keywords: | thread blocking, read-write lock | Cc: | |
| Due Date: | 31/01/2013 |
Description (last modified by yuanhuoqing)
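Symptom: The product price library servers 237.45 and 237.54 are virtual machines, and 45 has the weaker configuration. On the afternoon of 2013-01-27, load on 45 stayed persistently high, at 4 or above, and resin's thread pool was exhausted, which made responses slow.
Analysis: We first checked the application logs and found nothing abnormal. We then opened http://192.168.237.45:8081/threads.jsp to inspect the thread count and what each thread was doing, and found the thread pool exhausted, with many accumulated threads in a waiting or blocked state. Our initial judgment was that an overly high crawl rate was producing a large number of concurrent requests, and the network team confirmed that the soso crawler was crawling aggressively. A closer look at the threads showed a recurring pattern: a large number of threads calling ProductTypeService.getPathById were stuck inside that method (for the thread snapshot, see the file threads@pricelib-vm237-45.pconline.ctc_192.168.237.45 8081.htm). The code where the read lock blocks: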
private static Map<Long,String> pathMap = new HashMap<Long,String>();
public String getPathById(long id){
    Lock lock = pathLock.readLock();
    try{
        lock.lock();
        String path = pathMap.get(id);
        if(path == null){
            lock.unlock();
            reloadBrandPath();
            lock.lock();
            path = pathMap.get(id);
        }
        return path;
    }finally{
        lock.unlock();
    }
}
private ReadWriteLock pathLock = new ReentrantReadWriteLock();
private void reloadBrandPath(){
    Lock lock = pathLock.writeLock();
    try{
        lock.lock();
        String sql = "select id,pub_dir,type,parent_id from pdl_product_type where type = 3";
        List<Map<String,Object>> rows = productTypeRepository.getSimpleJdbcTemplate().queryForList(sql);
        for(Map<String,Object> row : rows){
            pathMap.put(NumberUtils.toLong(row.get("id").toString()), row.get("pub_dir").toString());
        }
    }finally{
        lock.unlock();
    }
}
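The thread-state check done through threads.jsp can also be approximated from inside the JVM. The following is only a minimal sketch using the standard `Thread.getAllStackTraces()` call; the class name `ThreadStateSummary` is made up for illustration and is not part of the price library code. One point worth keeping in mind when reading such output: threads queued on a `java.util.concurrent` lock such as `pathLock` are parked, so they normally show up as WAITING, while BLOCKED is reserved for `synchronized` monitors.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of the original application.
public class ThreadStateSummary {

    /** Counts the JVM's live threads by Thread.State. */
    public static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print one line per state that has at least one thread.
        for (Map.Entry<Thread.State, Integer> e : countByState().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

A large and growing WAITING count whose stack traces all pass through getPathById would match the pattern described in this ticket.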
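Because the parameters the crawler assembles itself follow no pattern, whenever the subcategory id or brand id it passes is not found in pathMap, the thread releases its read lock and calls reloadBrandPath() to reload the map. reloadBrandPath() takes the write lock, so other threads can read pathMap only after the reload finishes and that lock is released. Many threads pile up here, which makes responses very slow. The main problematic code: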
String path = pathMap.get(id);
if(path == null){
    lock.unlock();
    reloadBrandPath();
    lock.lock();
    path = pathMap.get(id);
}
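To make the blocking mechanism concrete, here is a small standalone sketch, not taken from the application. It assumes only the standard `ReentrantReadWriteLock` semantics; the class `WriteLockBlocksReaders` and the method `readerCanEnter` are hypothetical names. The calling thread optionally holds the write lock, standing in for a reloadBrandPath() call that is busy with its SQL query, while a second thread probes the read lock with the non-blocking `tryLock()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of the original application.
public class WriteLockBlocksReaders {

    /**
     * Reports whether a second thread can take the read lock,
     * optionally while the calling thread holds the write lock.
     */
    public static boolean readerCanEnter(boolean writerActive) {
        ReadWriteLock lock = new ReentrantReadWriteLock();
        ExecutorService reader = Executors.newSingleThreadExecutor();
        if (writerActive) {
            lock.writeLock().lock(); // stands in for reloadBrandPath() running its query
        }
        try {
            return reader.submit(() -> {
                boolean acquired = lock.readLock().tryLock(); // non-blocking probe
                if (acquired) {
                    lock.readLock().unlock();
                }
                return acquired;
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            if (writerActive) {
                lock.writeLock().unlock();
            }
            reader.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("no writer:     " + readerCanEnter(false));
        System.out.println("writer active: " + readerCanEnter(true));
    }
}
```

In getPathById the readers call the blocking `lock()` instead of `tryLock()`, so rather than being turned away they queue up for the whole duration of each reload, and a crawler that keeps sending unknown ids keeps triggering reloads.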
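Resolution: Temporarily block the soso crawler's IPs. Move the virtual machines 237.45 and 237.54 onto better-provisioned 8-core physical machines and, based on an analysis of the business logic, remove the read-write lock for now.
The modified method: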
public String getPathById(long id){
    return pathMap.get(id);
}
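If reload-on-miss is ever needed again, one possible alternative to the read-write lock is sketched below. This is not the fix applied in this ticket, only an illustration under stated assumptions: the class `PathCache`, the `loader` supplier (which would wrap the SQL query) and the `minReloadIntervalMillis` parameter are all hypothetical. Readers look up an immutable snapshot through a volatile reference and never lock, and unknown ids can trigger at most one reload per interval, so a crawler sending random ids cannot force a database query on every request.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical sketch, not the code deployed for this ticket.
public class PathCache {
    private final Supplier<Map<Long, String>> loader; // assumed to wrap the SQL query
    private final long minReloadIntervalMillis;
    private final AtomicLong nextReloadAllowedAt = new AtomicLong();
    private volatile Map<Long, String> snapshot = Collections.emptyMap();

    public PathCache(Supplier<Map<Long, String>> loader, long minReloadIntervalMillis) {
        this.loader = loader;
        this.minReloadIntervalMillis = minReloadIntervalMillis;
        reload(); // load eagerly so the first readers never see an empty map
    }

    /** Lock-free lookup; a miss triggers at most one reload per interval. */
    public String getPathById(long id) {
        String path = snapshot.get(id);
        if (path == null && claimReloadSlot()) {
            reload();
            path = snapshot.get(id);
        }
        return path;
    }

    /** Only the thread that wins the compare-and-set performs the reload. */
    private boolean claimReloadSlot() {
        long now = System.currentTimeMillis();
        long allowedAt = nextReloadAllowedAt.get();
        return now >= allowedAt
                && nextReloadAllowedAt.compareAndSet(allowedAt, now + minReloadIntervalMillis);
    }

    /** Builds a fresh map and publishes it with a single volatile write. */
    private void reload() {
        Map<Long, String> fresh = new HashMap<>(loader.get());
        snapshot = Collections.unmodifiableMap(fresh);
        nextReloadAllowedAt.set(System.currentTimeMillis() + minReloadIntervalMillis);
    }
}
```

The trade-off is that a genuinely new id may return null until the next permitted reload, so the interval would have to be chosen against how quickly new product types need to become visible.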
Change History
Changed 13 years ago by yuanhuoqing
-
attachment
542 threads@pricelib-vm237-45.pconline.ctc_192.168.237.45 8081.htm
added