Server unreachable: large numbers of CLOSE_WAIT connections

Reference 1: https://end0tknr.hateblo.jp/entry/20110724/1311490171
Reference 2: https://hacknote.jp/archives/19502/
Reference 3: https://seesaawiki.jp/w/dehio3/d/solaris%A1%C1CLOSE_WAIT%C8%AF%C0%B8%A1%C1
Reference 4: https://hirofukami.com/2008/08/11/close-wait/

 

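The server has repeatedly become unreachable.

Check the state of the TCP connections (-t: TCP only, -p: show the owning process, -n: numeric addresses)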
---------------------------------------
netstat -tpn
---------------------------------------

Result
---------------------------------------
tcp 0 0 160.16.63.○○○:443 207.46.13.76:15571 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:40457 CLOSE_WAIT 11805/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:11251 CLOSE_WAIT 11567/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:11982 CLOSE_WAIT 11753/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:54793 CLOSE_WAIT 12106/httpd
tcp 0 0 160.16.63.○○○:443 126.34.252.62:59582 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:53009 CLOSE_WAIT 12076/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:57869 CLOSE_WAIT 11894/httpd
tcp 521 0 160.16.63.○○○:443 36.13.167.104:37684 ESTABLISHED -
tcp 0 0 160.16.63.○○○:443 13.66.139.0:19846 TIME_WAIT -
tcp 0 0 160.16.63.○○○:443 112.137.96.56:59062 TIME_WAIT -
tcp 0 0 160.16.63.○○○:443 125.8.175.235:55073 TIME_WAIT -
tcp 0 0 160.16.63.○○○:443 110.232.6.71:62611 TIME_WAIT -
tcp 518 0 160.16.63.○○○:443 210.194.87.121:50331 CLOSE_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:9959 CLOSE_WAIT 12046/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:14902 CLOSE_WAIT 11726/httpd
tcp 0 0 160.16.63.○○○:443 60.153.198.22:42116 ESTABLISHED -
tcp 0 0 160.16.63.○○○:443 133.201.90.32:62121 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:13430 CLOSE_WAIT 11769/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:35291 CLOSE_WAIT 12005/httpd
tcp 0 0 160.16.63.○○○:443 13.66.139.0:19845 TIME_WAIT -
tcp 517 0 160.16.63.○○○:443 111.239.35.98:65278 ESTABLISHED -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:42621 CLOSE_WAIT 12102/httpd
tcp 0 0 160.16.63.○○○:443 211.127.83.36:57156 ESTABLISHED 12020/httpd
tcp 0 0 160.16.63.○○○:443 126.34.252.62:59581 TIME_WAIT -
tcp 0 0 160.16.63.○○○:443 111.64.18.75:62476 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:6615 CLOSE_WAIT 11750/httpd
tcp 0 0 160.16.63.○○○:443 72.14.199.110:37727 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:4216 CLOSE_WAIT 12023/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:15151 CLOSE_WAIT 12123/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:9227 CLOSE_WAIT 11850/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:13495 CLOSE_WAIT 11863/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:32764 CLOSE_WAIT 11987/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:17980 CLOSE_WAIT 11958/httpd
tcp 0 0 160.16.63.○○○:443 133.206.82.0:42008 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:38943 CLOSE_WAIT 12051/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:33367 CLOSE_WAIT 12138/httpd
tcp 517 0 160.16.63.○○○:443 112.137.96.56:59073 ESTABLISHED -
tcp 0 0 160.16.63.○○○:443 112.137.96.56:59064 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:43395 CLOSE_WAIT 12149/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:28548 CLOSE_WAIT 11806/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:39903 CLOSE_WAIT 12145/httpd
tcp 0 0 160.16.63.○○○:443 126.53.57.5:32806 ESTABLISHED 11930/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:49601 CLOSE_WAIT 11609/httpd
tcp 0 0 160.16.63.○○○:443 207.46.13.76:15559 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:15573 CLOSE_WAIT 12009/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:62567 CLOSE_WAIT 12033/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:7693 CLOSE_WAIT 11818/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:21720 CLOSE_WAIT 11801/httpd
tcp 0 0 160.16.63.○○○:443 207.46.13.76:15574 ESTABLISHED -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:49345 CLOSE_WAIT 11882/httpd
tcp 0 0 160.16.63.○○○:443 207.46.13.76:15569 TIME_WAIT -
tcp 0 0 160.16.63.○○○:443 203.133.150.50:55961 TIME_WAIT -
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:17553 CLOSE_WAIT 11804/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:23842 CLOSE_WAIT 11762/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:7292 CLOSE_WAIT 11814/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:23870 CLOSE_WAIT 11528/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:32023 CLOSE_WAIT 12062/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:62955 CLOSE_WAIT 11989/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:8371 CLOSE_WAIT 11919/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:22436 CLOSE_WAIT 11682/httpd
tcp 1 0 160.16.63.○○○:443 94.177.○○.○○:21309 CLOSE_WAIT 11913/httpd
---------------------------------------

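The output shows a large number of CLOSE_WAIT connections, almost all of them from the same IP address (94.177.○○.○○) and each held by an httpd process.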
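
To make this kind of pattern easier to spot, the output can be summarised per state and per remote address. The following is only a minimal sketch, assuming the `netstat -tpn` column layout shown above (remote address in column 5, state in column 6); the function names `count_by_state` and `top_close_wait_peers` are made up for illustration.

```shell
#!/bin/sh
# Count TCP connections per state.
# Reads `netstat -tpn` style lines on stdin, prints "<count> <state>".
count_by_state() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# List remote IPs that have CLOSE_WAIT connections, most frequent first.
# The trailing ":port" is stripped from column 5 before counting.
top_close_wait_peers() {
    awk '$1 ~ /^tcp/ && $6 == "CLOSE_WAIT" { sub(/:[0-9]+$/, "", $5); n[$5]++ }
         END { for (ip in n) print n[ip], ip }' | sort -rn
}
```

Typical use would be `netstat -tpn | count_by_state` followed by `netstat -tpn | top_close_wait_peers`.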

/var/log/httpd/error_log contains repeated "child process ... still did not exit, sending a SIGTERM" warnings. (It is unclear whether they are related.)
---------------------------------------
[Wed Sep 08 21:23:17 2021] [warn] child process 13944 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 13951 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 13955 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 13970 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 13984 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 14001 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 14012 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 14031 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:17 2021] [warn] child process 14044 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:18 2021] [warn] child process 14070 still did not exit, sending a SIGTERM
[Wed Sep 08 21:23:18 2021] [warn] child process 14075 still did not exit, sending a SIGTERM
---------------------------------------

 

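CLOSE_WAIT is the state in which the peer has closed its side of the connection, but the local side has not yet closed its own socket.
CLOSE_WAIT itself has no timeout: the socket stays in this state until the local application closes it. The 7200 seconds (2 hours) often quoted for it is the default of net.ipv4.tcp_keepalive_time, the idle time before the kernel starts sending keepalive probes on sockets that have SO_KEEPALIVE enabled.

When the peer no longer answers (or answers a probe with a reset), the kernel drops the connection, so lowering the net.ipv4.tcp_keepalive_time kernel parameter can shorten how long such CLOSE_WAIT sockets linger.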

Check the current value
---------------------------------------
# sysctl -n net.ipv4.tcp_keepalive_time
7200
---------------------------------------
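
Keepalive behaviour depends on three related parameters, not only on tcp_keepalive_time, so it can help to look at them together. A minimal sketch, assuming the standard Linux /proc/sys/net/ipv4 layout; the function name `show_keepalive` and its optional directory argument are hypothetical and exist only to keep the example self-contained.

```shell
#!/bin/sh
# Print the three TCP keepalive settings in sysctl.conf style.
# $1: directory holding the parameter files (defaults to the live kernel values).
show_keepalive() {
    dir=${1:-/proc/sys/net/ipv4}
    for name in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes; do
        printf '%s = %s\n' "net.ipv4.$name" "$(cat "$dir/$name")"
    done
}
```

Calling `show_keepalive` with no argument reads the values currently in effect.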

 

List all available parameters and confirm that net.ipv4.tcp_keepalive_time is among them
---------------------------------------
sysctl -a
---------------------------------------

Lowering the value from 7200 should improve the situation.

Temporary change (lost on reboot)
Change the value to 3600 seconds
---------------------------------------
echo 3600 > /proc/sys/net/ipv4/tcp_keepalive_time
---------------------------------------

Permanent change
---------------------------------------
# vi /etc/sysctl.conf

net.ipv4.tcp_keepalive_time = 10

net.ipv4.tcp_keepalive_probes = 2

net.ipv4.tcp_keepalive_intvl = 3
---------------------------------------

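Append the lines above to the file.

A single value can also be set at runtime with sysctl -w, which requires a name=value argument
---------------------------------------
# sysctl -w net.ipv4.tcp_keepalive_time=10
---------------------------------------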

Apply the settings in /etc/sysctl.conf
---------------------------------------
# sysctl -p
---------------------------------------
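
For reference, on a socket with SO_KEEPALIVE enabled, a peer that no longer answers is given up on after roughly tcp_keepalive_time + tcp_keepalive_probes × tcp_keepalive_intvl seconds of idleness. The sketch below only does that arithmetic; the helper name `keepalive_timeout` is hypothetical, and the result is an approximation rather than an exact kernel guarantee.

```shell
#!/bin/sh
# Approximate idle seconds before the kernel drops an unresponsive connection:
#   keepalive_time + keepalive_probes * keepalive_intvl
# Arguments: keepalive_time keepalive_probes keepalive_intvl
keepalive_timeout() {
    echo $(( $1 + $2 * $3 ))
}

keepalive_timeout 10 2 3      # values from the sysctl.conf example: prints 16
keepalive_timeout 7200 9 75   # usual Linux defaults: prints 7875
```

With the values above, an unresponsive connection would be dropped after about 16 seconds instead of a little over two hours. A peer that answers a probe with a reset is dropped at the first probe, that is, after tcp_keepalive_time alone.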