Ticket #118 (new defect), at Version 7

Opened 14 years ago

Last modified 13 years ago

JSP compilation fails because the system cannot allocate memory

Reported by: chenchongqi
Owned by:
Priority: major
Milestone:
Component: Price Library (报价库)
Version: Price Library (报价库) 5.0
Keywords: JavaCompileException, Cannot allocate memory
Cc:
Due Date:

Description (last modified by chenchongqi)

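Symptoms

On 08-27, while JSPs were being updated on the Price Library front end, the virtual machine 237.45 reported a compilation error: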

[08-27 15:27:29.812] com.caucho.java.JavaCompileException: Resin can't execute the compiler `/bin/sh'.  This usually means that the compiler is not in the operating system's PATH or the compiler is incorrectly specified in the configuration.  You may need to add the full path to <java compiler='/bin/sh'/>.
[08-27 15:27:29.812]
[08-27 15:27:29.812] java.io.IOException: Cannot run program "/bin/sh": java.io.IOException: error=12, Cannot allocate memory
[08-27 15:27:29.812]    at com.caucho.java.ExternalCompiler.executeCompiler(ExternalCompiler.java:435)
[08-27 15:27:29.812]    at com.caucho.java.ExternalCompiler.compileInt(ExternalCompiler.java:151)
[08-27 15:27:29.812]    at com.caucho.java.AbstractJavaCompiler.run(AbstractJavaCompiler.java:102)
[08-27 15:27:29.812]    at java.lang.Thread.run(Thread.java:619)
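
The key detail in this trace is `error=12`. On Linux that is the `ENOMEM` errno the operating system returned when the JVM tried to start `/bin/sh` for the external compiler, so this is an OS-level allocation failure rather than a Java heap `OutOfMemoryError`. A minimal Python sketch that pulls the program and errno out of such a log line (the helper name `parse_exec_failure` is hypothetical, and the errno name lookup assumes a Linux host):

```python
import errno
import re

# The IOException line from the trace above, without the timestamp prefix.
LOG_LINE = ('java.io.IOException: Cannot run program "/bin/sh": '
            'java.io.IOException: error=12, Cannot allocate memory')

def parse_exec_failure(line):
    """Extract the program, errno and message from a 'Cannot run program' line."""
    m = re.search(r'Cannot run program "([^"]+)".*?error=(\d+), (.+)$', line)
    if m is None:
        return None
    code = int(m.group(2))
    return {
        "program": m.group(1),
        "errno": code,
        # Symbolic errno name as known to the local platform.
        "name": errno.errorcode.get(code, "UNKNOWN"),
        "message": m.group(3),
    }

info = parse_exec_failure(LOG_LINE)
print(info)
```

Seeing `ENOMEM` for a process launch points the investigation at system memory and swap rather than at the JVM heap settings alone.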

System state at the time the monitoring script restarted the service:

system stat:
top - 11:52:19 up 38 days, 5 min,  2 users,  load average: 1.23, 1.51, 2.00
Tasks:  84 total,   1 running,  82 sleeping,   1 stopped,   0 zombie
Cpu(s): 14.9%us,  1.2%sy,  0.0%ni, 76.9%id,  1.2%wa,  2.1%hi,  3.6%si,  0.0%st
Mem:   5072544k total,  5025172k used,    47372k free,     5468k buffers
Swap:  2096472k total,    10868k used,  2085604k free,   653936k cached
        
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
19153 root      21   0 4515m 4.0g 9.8m S 99.2 82.9 212:19.36 java
    1 root      15   0 10352  580  544 S  0.0  0.0   0:06.58 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:40.73 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.54 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   1:05.93 migration/1
    5 root      34  19     0    0    0 S  0.0  0.0   0:00.50 ksoftirqd/1
    6 root      RT  -5     0    0    0 S  0.0  0.0   1:13.14 migration/2
    7 root      34  19     0    0    0 S  0.0  0.0   0:00.48 ksoftirqd/2
    8 root      RT  -5     0    0    0 S  0.0  0.0   1:11.65 migration/3
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.52 ksoftirqd/3
   10 root      10  -5     0    0    0 S  0.0  0.0   0:22.16 events/0
   11 root      10  -5     0    0    0 S  0.0  0.0   0:05.40 events/1
   12 root      10  -5     0    0    0 S  0.0  0.0   0:05.29 events/2
   13 root      10  -5     0    0    0 S  0.0  0.0   0:05.33 events/3
   14 root      10  -5     0    0    0 S  0.0  0.0   0:00.28 khelper
   63 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
   70 root      10  -5     0    0    0 S  0.0  0.0   0:56.25 kblockd/0
   71 root      10  -5     0    0    0 S  0.0  0.0   0:13.40 kblockd/1
   72 root      10  -5     0    0    0 S  0.0  0.0   0:13.51 kblockd/2
   73 root      10  -5     0    0    0 S  0.0  0.0   0:13.39 kblockd/3 
   74 root      14  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid
  307 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/0
  308 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/1
  309 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/2
  310 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/3
  313 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 khubd
  315 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kseriod
  407 root      10  -5     0    0    0 S  0.0  0.0  30:25.70 kswapd0
  408 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0
  409 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 aio/1
  410 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 aio/2
  411 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 aio/3
  595 root      15   0  132m 3740 1560 S  0.0  0.1   0:43.84 python
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ 
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0  10868  47364   5468 653936    0    0   130   273    2    2 15  7 77  1  0
 1  0  10868  47460   5480 653908    0    0     0   177  987  188 25  1 74  0  0
 1  0  10868  46912   5480 653624    0    0     0     0  975  230 24  4 72  0  0
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT   
  0.00 100.00  77.18  53.45  98.76   3863  549.840    30   66.716  616.556
  0.00 100.00  95.12  53.45  99.71   3863  549.840    30   66.716  616.556
  0.00  64.81   0.00  54.63 100.00   3865  551.103    31   66.716  617.819
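
The last three rows above look like `jstat -gcutil` output (inferred from the column names; the ticket does not name the tool). Assuming that format, where `P` is permanent generation utilization and `YGCT` is cumulative young GC time in seconds, a short Python sketch can turn the rows into per-collection figures:

```python
# Rows copied from the status dump above.
JSTAT_OUTPUT = """\
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00 100.00  77.18  53.45  98.76   3863  549.840    30   66.716  616.556
  0.00 100.00  95.12  53.45  99.71   3863  549.840    30   66.716  616.556
  0.00  64.81   0.00  54.63 100.00   3865  551.103    31   66.716  617.819
"""

def parse_jstat(text):
    """Return one dict per sample row, keyed by the column names."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(float, row))) for row in rows]

samples = parse_jstat(JSTAT_OUTPUT)
last = samples[-1]

# Average time per young collection since JVM start, in seconds.
avg_young_pause = last["YGCT"] / last["YGC"]
# Highest permanent generation utilization seen in the samples, in percent.
perm_peak = max(s["P"] for s in samples)

print(f"avg young GC time: {avg_young_pause:.3f} s, perm gen peak: {perm_peak:.2f}%")
```

The figures only summarize the three samples shown; a longer capture would be needed to draw conclusions about GC behavior.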

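Analysis

  • Lao Xie wrote up a summary of this kind of problem earlier: http://bbs.pconline.cn/topic-175.html
  • Judging from the symptoms, the operating system could not allocate memory at the moment the JSP was compiled.
  • The JSPs are complex, so compiling them requires a lot of resources.
  • The application is under heavy load, so the JVM itself holds a lot of memory, which reduces the memory headroom left to the operating system.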
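
One plausible mechanism, which the ticket does not confirm, is that starting the compiler requires the JVM to fork, and the kernel may refuse the fork when the parent's address space clearly exceeds what it could still back with free memory, page cache and free swap. The sketch below compares those quantities using the numbers from the `top` snapshot in this ticket. It is a rough approximation under two assumptions: `VIRT` is used as a crude upper bound for what the fork must account for, and the snapshot (taken at 11:52) may not reflect the exact moment of the failure (logged at 15:27).

```python
# Values copied from the `top` snapshot above (kB unless noted).
MEM_FREE_KB = 47372
BUFFERS_KB = 5468
CACHED_KB = 653936
SWAP_FREE_KB = 2085604
JVM_VIRT_MB = 4515  # VIRT of the java process (PID 19153)

def headroom_kb(free, buffers, cached, swap_free):
    """Rough upper bound on memory the kernel could still hand out."""
    return free + buffers + cached + swap_free

available_kb = headroom_kb(MEM_FREE_KB, BUFFERS_KB, CACHED_KB, SWAP_FREE_KB)
jvm_virt_kb = JVM_VIRT_MB * 1024
shortfall_kb = jvm_virt_kb - available_kb

print(f"available: {available_kb} kB, JVM VIRT: {jvm_virt_kb} kB, "
      f"shortfall: {shortfall_kb} kB")
```

Under these assumptions the JVM's address space is larger than the remaining headroom, which is consistent with the point above that a large JVM leaves little margin for the operating system.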

Initial actions

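  • Added 1 GB of memory to 237.45.
  • Plan to take a memory dump to see which part of the application uses the most memory, and whether Resin's memory configuration can be reduced.
  • Run a simulated load test in the test environment to see whether the problem can be reproduced, with live monitoring and analysis.
    • With squid and Resin's built-in cache removed in the test environment, the system state below was captured after 16 hours of simulated SSI load at 50 requests/s on index pages and 100 requests/s on readintf. Modifying JSPs has so far not triggered the compilation memory problem. The thread count stayed at around 100, and young GC was frequent, roughly once per second. The system probably stayed in reasonably good shape because the SSI interfaces used in the test behave much better than in production and rarely time out or fail.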
      top - 08:55:00 up 112 days, 19:28,  1 user,  load average: 4.42, 5.13, 5.35
      Tasks:  76 total,   3 running,  73 sleeping,   0 stopped,   0 zombie
      Cpu(s): 68.0%us,  5.6%sy,  0.0%ni, 19.3%id,  0.0%wa,  1.0%hi,  6.2%si,  0.0%st
      Mem:   4044452k total,  4019004k used,    25448k free,    11736k buffers
      Swap:  3140696k total,   204052k used,  2936644k free,   698764k cached
      
        PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                       
      25617 root      20   0 3469m 2.8g  11m S 288.1 73.2   2486:52 java                                                                                         
      24687 nobody    15   0 18948 3240  872 R 12.0  0.1  55:54.72 nginx                                                                                         
       7561 root      15   0  268m 202m  548 S 10.6  5.1  90:53.67 memcached                                                                                     
      24690 nobody    15   0 18948 3304  872 S  5.0  0.1  56:29.73 nginx                                                                                         
      24688 nobody    15   0 18920 3220  872 R  2.3  0.1  56:56.46 nginx                                                                                         
      24686 nobody    15   0 18796 3160  884 S  0.7  0.1  56:21.20 nginx                                                                                         
      25582 root      23   0  348m  56m 9.8m S  0.3  1.4   0:22.67 java                                                                                          
      30407 root      15   0 12724 1060  820 R  0.3  0.0   0:00.02 top                                                                                           
          1 root      15   0 10348  244  212 S  0.0  0.0   0:01.34 init                                                                                          
          2 root      RT  -5     0    0    0 S  0.0  0.0   0:05.76 migration/0                                                                                   
          3 root      34  19     0    0    0 S  0.0  0.0   0:02.92 ksoftirqd/0                                                                                   
          4 root      RT  -5     0    0    0 S  0.0  0.0   0:04.89 migration/1                                                                                   
          5 root      34  19     0    0    0 S  0.0  0.0   0:00.02 ksoftirqd/1                                                                                   
          6 root      RT  -5     0    0    0 S  0.0  0.0   0:05.21 migration/2                                                                                   
          7 root      34  19     0    0    0 S  0.0  0.0   0:00.02 ksoftirqd/2                                                                                   
          8 root      RT  -5     0    0    0 S  0.0  0.0   0:05.14 migration/3                                                                                   
          9 root      34  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/3                                                                                   
         10 root      10  -5     0    0    0 S  0.0  0.0   0:54.78 events/0                                                                                      
         11 root      10  -5     0    0    0 S  0.0  0.0   0:00.83 events/1                                                                                      
         12 root      10  -5     0    0    0 S  0.0  0.0   0:00.69 events/2                                                                                      
         13 root      10  -5     0    0    0 S  0.0  0.0   0:00.79 events/3                                                                                      
         14 root      10  -5     0    0    0 S  0.0  0.0   0:00.12 khelper                                                                                       
         55 root      11  -5     0    0    0 S  0.0  0.0   0:00.01 kthread                                                                                       
         62 root      10  -5     0    0    0 S  0.0  0.0   0:20.23 kblockd/0                                                                                     
         63 root      10  -5     0    0    0 S  0.0  0.0   0:09.16 kblockd/1                                                                                     
         64 root      10  -5     0    0    0 S  0.0  0.0   0:09.61 kblockd/2                                                                                     
         65 root      10  -5     0    0    0 S  0.0  0.0   0:08.80 kblockd/3                                                                                     
         66 root      16  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid                                                                                        
        294 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/0                                                                                      
        295 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/1                                                                                      
        296 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/2                                                                                      
        297 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/3                                                                                      
        300 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 khubd                                                                                         
        302 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kseriod                                                                                       
        394 root      10  -5     0    0    0 S  0.0  0.0   3:18.66 kswapd0                                                                                       
        395 root      16  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0                                                                                         
        396 root      17  -5     0    0    0 S  0.0  0.0   0:00.00 aio/1                                                                                         
        397 root      18  -5     0    0    0 S  0.0  0.0   0:00.00 aio/2                            
      
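    • Analysis of the memory dump taken on the test server:
      • Overall memory usage looks normal. The dump file is 1.5 GB, but the actual heap is only about 180 MB; the rest is reclaimable. After more than ten hours of running, this suggests there is no memory leak and that the problem is mainly caused by excessive request load. However, the dump also showed that jeasy, a word-segmentation package that is no longer used, occupies tens of MB of memory, so it should be removed and the test run again.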
        One instance of "jeasy.analysis.llIlllIIIlIlllll" loaded by "com.caucho.loader.EnvironmentClassLoader @ 0x2aaac06d7fb8" occupies 83,716,264 (43.67%) bytes. The memory is accumulated in one instance of "java.util.TreeMap$Entry" loaded by "<system class loader>".
        
        Keywords
        java.util.TreeMap$Entry
        jeasy.analysis.llIlllIIIlIlllll
        com.caucho.loader.EnvironmentClassLoader @ 0x2aaac06d7fb8
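
The report text above resembles a leak-suspects summary from a heap analysis tool (the ticket does not name it). Its two numbers are enough to cross-check the heap size quoted in the analysis. A small Python sketch of that arithmetic:

```python
# Figures copied from the report line above.
SUSPECT_BYTES = 83_716_264  # retained by the jeasy.analysis instance
SUSPECT_SHARE = 0.4367      # 43.67% of the analyzed heap

MIB = 1024 * 1024

# If the instance is 43.67% of the heap, the whole heap follows by division.
total_heap_mib = SUSPECT_BYTES / SUSPECT_SHARE / MIB
suspect_mib = SUSPECT_BYTES / MIB

print(f"heap ~ {total_heap_mib:.1f} MiB, jeasy instance ~ {suspect_mib:.1f} MiB")
```

The implied heap of roughly 183 MiB agrees with the figure of about 180 MB given in the analysis, and the jeasy instance alone accounts for about 80 MiB of it.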
        

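Long-term goals

  • Split the Price Library front-end application into an index service and an SSI service.
  • Split the Price Library servers by role: static files + MySQL + squid.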

Change History

comment:1 Changed 14 years ago by chenchongqi

  • Description modified (diff)

comment:2 Changed 14 years ago by chenchongqi

  • Description modified (diff)

comment:3 Changed 14 years ago by chenchongqi

  • Description modified (diff)

comment:4 Changed 14 years ago by chenchongqi

  • Description modified (diff)

comment:5 Changed 14 years ago by chenchongqi

  • Description modified (diff)

comment:6 Changed 14 years ago by chenchongqi

  • Description modified (diff)

comment:7 Changed 14 years ago by chenchongqi

  • Description modified (diff)