本文通过实际业务系统中调整的一个案例,试图给出一个常见CPU消耗问题的一个诊断方法.大多数情况下,系统的性能问题都是由不良SQL代码引起的,那么作为DBA,怎样发现和解决这些SQL问题就显得尤为重要.
本案例平台为UNIX,所以不可避免的应用了一些Unix下常用的工具.如vmstat,top等.
本文适宜读者范围:中高级.
系统环境:
OS: Solaris8
Oracle: 8.1.7.4
问题描述:
开发人员报告系统运行缓慢,已经影响业务系统正常使用.请求协助诊断.
1.登陆数据库主机
使用vmstat检查,发现CPU资源已经耗尽,大量任务位于运行队列:
bash-2.03$ vmstat 3
procs memory page disk faults cpu
r b w swap free re mf pi po fr de sr s6 s9 s1 sd in sy cs us sy id
0 0 0 5504232 1464112 0 0 0 0 0 0 0 0 1 1 0 4294967196 0 0 -84 -5 -145
131 0 0 5368072 1518360 56 691 0 2 2 0 0 0 1 0 0 3011 7918 2795 97 3 0
131 0 0 5377328 1522464 81 719 0 2 2 0 0 0 1 0 0 2766 8019 2577 96 4 0
130 0 0 5382400 1524776 67 682 0 0 0 0 0 0 0 0 0 3570 8534 3316 97 3 0
134 0 0 5373616 1520512 127 1078 0 2 2 0 0 0 1 0 0 3838 9584 3623 96 4 0
136 0 0 5369392 1518496 107 924 0 5 5 0 0 0 0 0 0 2920 8573 2639 97 3 0
132 0 0 5364912 1516224 63 578 0 0 0 0 0 0 0 0 0 3358 7944 3119 97 3 0
129 0 0 5358648 1511712 189 1236 0 0 0 0 0 0 0 0 0 3366 10365 3135 95 5 0
129 0 0 5354528 1511304 120 1194 0 0 0 0 0 0 0 4 0 3235 8864 2911 96 4 0
128 0 0 5346848 1507704 99 823 0 0 0 0 0 0 0 3 0 3189 9048 3074 96 4 0
125 0 0 5341248 1504704 80 843 0 2 2 0 0 0 6 1 0 3563 9514 3314 95 5 0
133 0 0 5332744 1501112 79 798 0 0 0 0 0 0 0 1 0 3218 8805 2902 97 3 0
129 0 0 5325384 1497368 107 643 0 2 2 0 0 0 1 4 0 3184 8297 2879 96 4 0
126 0 0 5363144 1514320 81 753 0 0 0 0 0 0 0 0 0 2533 7409 2164 97 3 0
136 0 0 5355624 1510512 169 566 786 0 0 0 0 0 0 1 0 3002 8600 2810 96 4 0
130 1 0 5351448 1502936 267 580 1821 0 0 0 0 0 0 0 0 3126 7812 2900 96 4 0
129 0 0 5347256 1499568 155 913 2 2 2 0 0 0 0 1 0 2225 8076 1941 98 2 0
116 0 0 5338192 1495400 177 1162 0 0 0 0 0 0 0 1 0 1947 7781 1639 97 3 0
2.使用Top命令
观察进程CPU耗用,发现没有明显过高CPU使用的进程
$ top
last pid: 28313; load averages: 99.90, 117.54, 125.71 23:28:38
296 processes: 186 sleeping, 99 running, 2 zombie, 9 on cpu
CPU states: 0.0% idle, 96.5% user, 3.5% kernel, 0.0% iowait, 0.0% swap
Memory: 4096M real, 1404M free, 2185M swap in use, 5114M swap free
PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
27082 oracle8i 1 33 0 1328M 1309M run 0:17 1.29% oracle
26719 oracle8i 1 55 0 1327M 1306M sleep 0:29 1.11% oracle
28103 oracle8i 1 35 0 1327M 1304M run 0:06 1.10% oracle
28161 oracle8i 1 25 0 1327M 1305M run 0:04 1.10% oracle
26199 oracle8i 1 45 0 1328M 1309M run 0:42 1.10% oracle
26892 oracle8i 1 33 0 1328M 1310M run 0:24 1.09% oracle
27805 oracle8i 1 45 0 1327M 1306M cpu/1 0:10 1.04% oracle
23800 oracle8i 1 23 0 1327M 1306M run 1:28 1.03% oracle
25197 oracle8i 1 34 0 1328M 1309M run 0:57 1.03% oracle
21593 oracle8i 1 33 0 1327M 1306M run 2:12 1.01% oracle
27616 oracle8i 1 45 0 1329M 1311M run 0:14 1.01% oracle
27821 oracle8i 1 43 0 1327M 1306M run 0:10 1.00% oracle
26517 oracle8i 1 33 0 1328M 1309M run 0:33 0.97% oracle
25785 oracle8i 1 44 0 1328M 1309M run 0:46 0.96% oracle
26241 oracle8i 1 45 0 1327M 1306M run 0:42 0.96% oracle
3.检查进程数量
bash-2.03$ ps -ef|grep ora|wc -l
258
bash-2.03$ ps -ef|grep ora|wc -l
275
bash-2.03$ ps -ef|grep ora|wc -l
274
bash-2.03$ ps -ef|grep ora|wc -l
278
bash-2.03$ ps -ef|grep ora|wc -l
277
bash-2.03$ ps -ef|grep ora|wc -l
366
发现系统存在大量Oracle进程,大约在300左右,大量进程消耗了几乎所有CPU资源,而正常情况下Oracle连接数应该在100左右.
4.检查数据库
查询v$session_wait获取各进程等待事件
SQL> select sid,event,p1,p1text from v$session_wait;
SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ----------------------------------------------------------------
124 latch free 1.6144E+10 address
1 pmon timer 300 duration
2 rdbms ipc message 300 timeout
3 rdbms ipc message 300 timeout
11 rdbms ipc message 30000 timeout
6 rdbms ipc message 180000 timeout
4 rdbms ipc message 300 timeout
134 rdbms ipc message 6000 timeout
147 rdbms ipc message 6000 timeout
275 rdbms ipc message 17995 timeout
274 rdbms ipc message 6000 timeout
SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ----------------------------------------------------------------
118 rdbms ipc message 6000 timeout
7 buffer busy waits 17 file#
56 buffer busy waits 17 file#
161 buffer busy waits 17 file#
195 buffer busy waits 17 file#
311 buffer busy waits 17 file#
314 buffer busy waits 17 file#
205 buffer busy waits 17 file#
269 buffer busy waits 17 file#
200 buffer busy waits 17 file#
164 buffer busy waits 17 file#
SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ----------------------------------------------------------------
140 buffer busy waits 17 file#
66 buffer busy waits 17 file#
10 db file sequential read 17 file#
18 db file sequential read 17 file#
54 db file sequential read 17 file#
49 db file sequential read 17 file#
48 db file sequential read 17 file#
46 db file sequential read 17 file#
45 db file sequential read 17 file#
35 db file sequential read 17 file#
30 db file sequential read 17 file#
SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ----------------------------------------------------------------
29 db file sequential read 17 file#
22 db file sequential read 17 file#
178 db file sequential read 17 file#
175 db file sequential read 17 file#
171 db file sequential read 17 file#
123 db file sequential read 17 file#
121 db file sequential read 17 file#
120 db file sequential read 17 file#
117 db file sequential read 17 file#
114 db file sequential read 17 file#
113 db file sequential read 17 file#
SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ----------------------------------------------------------------
111 db file sequential read 17 file#
107 db file sequential read 17 file#
80 db file sequential read 17 file#
222 db file sequential read 17 file#
218 db file sequential read 17 file#
216 db file sequential read 17 file#
213 db file sequential read 17 file#
199 db file sequential read 17 file#
198 db file sequential read 17 file#
194 db file sequential read 17 file#
192 db file sequential read 17 file#
SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ----------------------------------------------------------------
188 db file sequential read 17 file#
249 db file sequential read 17 file#
242 db file sequential read 17 file#
239 db file sequential read 17 file#
236 db file sequential read 17 file#
235 db file sequential read 17 file#
234 db file sequential read 17 file#
233 db file sequential read 17 file#
230 db file sequential read 17 file#
227 db file sequential read 17 file#
336 db file sequential read 17 file#
SID EVENT P1 P1TEXT
---------- ------------------------------ ---------- ----------------------------------------------------------------
333 db file sequential read 17 file#
331 db file sequential read 17 file#
329 db file sequential read 17 file#
327 db file sequential read 17 file#
325 db file sequential read 17 file#
324 db file sequential read 17 file#
320 db file sequential read 17 file#
318 db file sequential read 17 file#
317 db file sequential read 17 file#
316