The Problem and How I Tracked It Down

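A server in our local development environment runs nginx, which serves the static web front-end pages over HTTP on port 80. For unrelated reasons the server was rebooted. After the reboot I started nginx again, but it would not serve anything on port 80. The symptoms were as follows: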

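  1. The browser got no response, even though telnet from my machine to port 80 on the server connected fine. I captured the traffic with Wireshark, and the exchange looked like this:

    Packets 3-7: the TCP three-way handshake;

    Packet 8: my machine sends the HTTP request to the development server;

    Packet 9: my machine retransmits;

    Packet 10: the server returns an ACK confirming that it received the request, but still sends no response.

    No response ever arrived after that.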

  2. On the server I stripped nginx.conf down to the essentials. After trimming, the main configuration was:

    server {
            listen       80;
            server_name  localhost;
            gzip on;
            gzip_http_version 1.1;
            gzip_comp_level 3;
            gzip_types text/plain application/json application/javascript text/css  image/jpeg image/gif image/png application/zip;
    
            access_log  logs/host.access.log;
    
            # Serves the static web front-end pages; the files live under /police3-web/scm
            location / {
                root   /police3-web/scm;
                try_files $uri index.html /index.html;
                if ($request_filename ~* \.(gif|jpg|jpeg|png|css|js|ico|eot|otf|fon|font|ttf|ttc|woff|woff2)$) {
                  expires   7d;
                }
            }
            location ^~ /scm {
               proxy_set_header X-Real-IP $remote_addr;
               proxy_set_header Host $host;
               proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                 proxy_pass http://localhost:9080;
    
            }
    
            location ~/group([0-9])/M([0-9])([0-9]) {
                    add_header Access-Control-Allow-Origin *;
                    add_header Access-Control-Allow-Methods 'GET, POST, OPTIONS';
                    add_header Access-Control-Allow-Headers 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization';
                    #ngx_fastdfs_module; // commented out on purpose because I suspected FastDFS was involved
            }
    
        }
    
  3. After restarting nginx, I watched logs/error.log:

    2019/12/06 08:56:53 [alert] 10588#0: worker process 10802 exited on signal 11 (core dumped)
    2019/12/06 08:56:53 [alert] 10588#0: worker process 10802 exited on signal 11 (core dumped)
    2019/12/06 08:56:53 [alert] 10588#0: worker process 10802 exited on signal 11 (core dumped)
    2019/12/06 08:56:53 [notice] 10588#0: start worker process 10806
    2019/12/06 08:56:53 [notice] 10588#0: start worker process 10806
    ngx_http_fastdfs_process_init pid=10806
    2019/12/06 08:56:56 [notice] 10588#0: signal 17 (SIGCHLD) received from 10806
    2019/12/06 08:56:56 [notice] 10588#0: signal 17 (SIGCHLD) received from 10806
    2019/12/06 08:56:56 [alert] 10588#0: worker process 10806 exited on signal 11 (core dumped)
    2019/12/06 08:56:56 [alert] 10588#0: worker process 10806 exited on signal 11 (core dumped)
    2019/12/06 08:56:56 [alert] 10588#0: worker process 10806 exited on signal 11 (core dumped)
    2019/12/06 08:56:56 [notice] 10588#0: start worker process 10808
    2019/12/06 08:56:56 [notice] 10588#0: start worker process 10808
    ngx_http_fastdfs_process_init pid=10808
    2019/12/06 08:56:59 [notice] 10588#0: signal 17 (SIGCHLD) received from 10808
    2019/12/06 08:56:59 [notice] 10588#0: signal 17 (SIGCHLD) received from 10808
    2019/12/06 08:56:59 [alert] 10588#0: worker process 10808 exited on signal 11 (core dumped)
    2019/12/06 08:56:59 [alert] 10588#0: worker process 10808 exited on signal 11 (core dumped)
    2019/12/06 08:56:59 [alert] 10588#0: worker process 10808 exited on signal 11 (core dumped)
    2019/12/06 08:56:59 [notice] 10588#0: start worker process 10812
    2019/12/06 08:56:59 [notice] 10588#0: start worker process 10812

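    The log just kept cycling through these entries. Signal 11 is SIGSEGV: each worker process crashed with a segmentation fault about three seconds after it started, and the master process kept spawning replacements. That would also explain the packet capture, since the connection is accepted and the request acknowledged at the TCP level while no worker survives long enough to answer. I then searched Baidu and found this post:

    https://blog.csdn.net/hexuan1/article/details/45222867

    It mentions the dmesg command, so I ran it as well. The output was: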

    [55522.992453] nginx[10844]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55526.018405] nginx[10851]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55529.043590] nginx[10853]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55532.070932] nginx[10855]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55535.097444] nginx[10859]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55538.122466] nginx[10861]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55541.148914] nginx[10871]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55544.201955] nginx[10914]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55547.229135] nginx[10921]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55550.254445] nginx[10923]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55553.279543] nginx[10928]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55556.305874] nginx[10932]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55559.331128] nginx[10936]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55562.356655] nginx[10938]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55565.408411] nginx[10943]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55568.434908] nginx[10948]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]
    [55571.459719] nginx[10950]: segfault at 4 ip 00007f23d017ff84 sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]

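    I noticed that every line mentions libfdfsclient.so. Our nginx acts as a proxy for FastDFS (a colleague set this part up and I have not asked about the exact mechanism yet). From a quick search, it works roughly as described here: https://blog.csdn.net/qq_34301871/article/details/80060235

    Then I remembered that a colleague had once told me FastDFS needs to be started up again, and it occurred to me that FastDFS probably had not come back up after the server reboot.

    So I ran the following commands to start the FastDFS services: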

    /etc/init.d/fdfs_trackerd start
    /etc/init.d/fdfs_storaged start

    Then I watched nginx's error.log again, and sure enough the errors above stopped repeating:

    2019/12/06 08:58:21 [notice] 10588#0: signal 17 (SIGCHLD) received from 10950
    2019/12/06 08:58:21 [notice] 10588#0: signal 17 (SIGCHLD) received from 10950
    2019/12/06 08:58:21 [alert] 10588#0: worker process 10950 exited on signal 11 (core dumped)
    2019/12/06 08:58:21 [alert] 10588#0: worker process 10950 exited on signal 11 (core dumped)
    2019/12/06 08:58:21 [alert] 10588#0: worker process 10950 exited on signal 11 (core dumped)
    2019/12/06 08:58:21 [notice] 10588#0: start worker process 10954
    2019/12/06 08:58:21 [notice] 10588#0: start worker process 10954
    ngx_http_fastdfs_process_init pid=10954
      ## After the FastDFS services were started, the log settled on the line below and the errors stopped repeating:
    [2019-12-06 08:58:23] INFO - fastdfs apache / nginx module v1.21, response_mode=proxy, base_path=/tmp, url_have_group_name=1, group_name=group1, storage_server_port=23000, path_count=1, store_path0=/home/fastdfs/storage, connect_timeout=2, network_timeout=30, tracker_server_count=1, if_alias_prefix=, local_host_ip_count=3, anti_steal_token=0, token_ttl=0s, anti_steal_secret_key length=0, token_check_fail content_type=, token_check_fail buff length=0, load_fdfs_parameters_from_tracker=1, storage_sync_file_max_delay=86400s, use_storage_id=0, storage server id/ip count=0 / 0, flv_support=1, flv_extension=flv
    2019/12/06 09:00:32 [info] 10954#0: *2 client timed out (110: Connection timed out) while waiting for request, client: 10.15.4.46, server: 0.0.0.0:80
  4. That is where the troubleshooting ended. Quite a pitfall.
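
The dmesg lines in step 3 carry more information than just the library name. As a rough sketch (Python is used here only for illustration; the function name `decode_segfault` and the field names are made up for this example), each line can be decoded into the faulting address, the kind of access, and the offset of the crashing instruction inside libfdfsclient.so. The offset is computed as `ip` minus the mapping base, which assumes the mapping shown in brackets starts at file offset 0. The error code is interpreted using the x86 page-fault bits.

```python
import re

# Shape of the kernel's x86 segfault message as it appears in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one dmesg segfault line into a dict, or return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),   # address the process tried to access
        "lib": m["lib"],
        # Assumption: the bracketed mapping starts at file offset 0 of the library.
        "offset": ip - base,
        "page_present": bool(err & 0x1),    # bit 0: 0 = page not mapped, 1 = protection fault
        "access": "write" if err & 0x2 else "read",  # bit 1
        "user_mode": bool(err & 0x4),       # bit 2
    }


if __name__ == "__main__":
    sample = (
        "[55522.992453] nginx[10844]: segfault at 4 ip 00007f23d017ff84 "
        "sp 00007ffdf82783e0 error 4 in libfdfsclient.so[7f23d0179000+18000]"
    )
    info = decode_segfault(sample)
    print(info["lib"], hex(info["offset"]), info["access"])
```

For the lines above this gives offset 0x6f84 and a user-mode read of address 4 on a page that is not mapped, which usually points to a NULL pointer being dereferenced at a small field offset. If debug symbols were available, the offset could be passed to `addr2line -e` against the library to locate the source line; whether symbols exist on this server is not known from the post.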

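Since the crashes stopped once fdfs_trackerd and fdfs_storaged were running, a small pre-flight check before starting nginx might save the same detour next time. The sketch below only tests whether the two ports accept TCP connections. Port 23000 is the storage_server_port from the module's log line in step 3; port 22122 is an assumption (FastDFS's usual tracker default) and should be confirmed in the tracker configuration. The helper names `port_open` and `missing_services` are hypothetical.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_services(services, host="127.0.0.1"):
    """Return the names of the services whose port does not accept connections."""
    return [name for name, port in services.items() if not port_open(host, port)]


if __name__ == "__main__":
    # 23000 comes from the log line above; 22122 is an assumed tracker port.
    fastdfs = {"fdfs_trackerd": 22122, "fdfs_storaged": 23000}
    down = missing_services(fastdfs)
    if down:
        print("not listening:", ", ".join(down))
    else:
        print("FastDFS ports are reachable")
```

A reachable port does not prove the services are healthy, so this is a quick sanity check rather than a guarantee.
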
References

The dmesg command:

https://www.runoob.com/linux/linux-comm-dmesg.html

https://www.cnblogs.com/zhaoxuguang/p/7810651.html

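Copyright notice: this is an original article by grey-wolf, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Article link: https://www.cnblogs.com/grey-wolf/p/11993526.html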