Nginx理解

本文主要帮助理解 Linux操作系统下Nginx 的设计和实现,对源码理解也有一定的帮助。相关代码来自nginx GitHub源码仓上的release-1.16.0个人观点,仅供参考~

1. Nginx是什么&有些什么功能、特点

基本、常用的http服务器功能

  1. 静态文件服务、首页设置
  2. 反向代理、负载均衡
  3. FastCGI协议支持、uwsgi协议支持
  4. https、http/2、虚拟主机支持
  5. 更多

邮件代理服务器功能

不常用,详见:mail_proxy_server_features

TCP/UDP代理服务器功能

不常用,详见:generic_proxy_server_features

架构和可扩展性、跨平台

  1. master/worker进程模型
  2. 模块化设计,功能文档基本上可与代码文件一一对应,阅读、理解源码更方便,开发新功能模块更容易
  3. 灵活的配置,可以说Nginx的所有功能都是配置驱动的
  4. 充分利用操作系统提供的高性能API:异步io(事件驱动)等
  5. 多操作系统支持

2. socket server编程&http协议(tcp/ip)

tcp协议为上层提供数据传输功能,不关心传输的数据的格式;http协议规定了数据的格式、通讯双方(客户端、web服务器程序)的交互,典型的“请求-应答”模式(websocket和http/2打破了这种模式)。这里的数据就是我们写的程序发送的或者接收到的数据。何为协议也需要我们好好思考的,平时定义的json接口也属于协议,同样的json数据可以使用tcp协议发送,也可以使用http协议发送,而http协议又是在tcp协议之上的。

浏览器发送http请求示例(为方便显示\r\n后面加了换行):

GET / HTTP/1.1\r\n
Host: 127.0.0.1\r\n
Connection: keep-alive\r\n
Cache-Control: max-age=0\r\n
Upgrade-Insecure-Requests: 1\r\n
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3\r\n
Accept-Encoding: gzip, deflate, br\r\n
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8\r\n\r\n

http服务器应答示例(为方便显示\r\n后面加了换行):

HTTP/1.1 301 Moved Permanently\r\n
Server: GitHub.com\r\n
Content-Type: text/html\r\n
Location: https://gobuild.tech/\r\n
X-GitHub-Request-Id: F368:05D3:205F88:230D33:5CC40FF1\r\n
Content-Length: 164\r\n
Accept-Ranges: bytes\r\n
Date: Sat, 27 Apr 2019 08:17:00 GMT\r\n
Via: 1.1 varnish\r\n
Age: 11\r\n
Connection: keep-alive\r\n
X-Served-By: cache-hnd18720-HND\r\n
X-Cache: HIT\r\n
X-Cache-Hits: 1\r\n
X-Timer: S1556353020.348737,VS0,VE1\r\n
Vary: Accept-Encoding\r\n
X-Fastly-Request-ID: 20026dc4a1797ecdb889bd0c6ab786f6ab351e65\r\n\r\n
<html><head><title>301 Moved Permanently</title></head><body bgcolor="white"><center><h1>301 Moved Permanently</h1></center><hr><center>nginx</center></body></html>

一个简单的http服务器

#coding:utf-8
#python3 simple-http-server.py
import socket

server_address = ('127.0.0.1', 80)
server_socket = socket.socket(family=socket.AF_INET, type=socket.SOCK_STREAM)
server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR  , True) # 避免测试时发生 Address already in use 错误
server_socket.bind(server_address)
server_socket.listen(8)
print('serving http://%s:%d' % server_address)
while True:
    client_socket, client_address = server_socket.accept()
    print('client', client_address, 'connected')
    data = client_socket.recv(1024)
    if not data:
        print('client', client_address, 'maybe close')
        client_socket.close()
        continue
    print('client', client_address, 'send:')
    print(data)
    client_socket.send(b'HTTP/1.1 200 OK\r\n') # 状态行
    client_socket.send(b'Connection: Close\r\n') # 响应头
    client_socket.send(b'\r\n') # 响应头结束
    client_socket.send(b'<h1>It works!</h1>') # 响应体
    client_socket.close()

思考:

  1. 如何解析&存放请求头
  2. 如何根据请求头执行不同的处理逻辑
  3. 如何生成并发送响应(读取文件并发送文件内容等)
  4. 如何处理并发

一个简单的http客户端

#-*- coding:utf-8 -*-
# python3 simple-http-client.py
import socket

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_address = ('185.199.111.153', 80) # 可域名直连,但python解析到的ip可能连不上
client.connect(server_address)
client.send(b'GET / HTTP/1.1\r\n')
client.send(b'Host: gobuild.tech\r\n')
client.send(b'\r\n')

print('recv:')
print(client.recv(1024))

可以结合两个示例实现简单的代理功能。

相关链接:

  1. Internet Protocol Suite
  2. Network Socket
  3. Python socket docs
  4. Linux C socket(man 2 socket)

一个简单的异步(事件驱动)http服务器示例

#-*- coding:utf-8 -*-
# python3 simple-epoll-http-server.py
import socket
import select

server_address = ('127.0.0.1', 80)
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, True)
server_socket.bind(server_address)
server_socket.listen(8)
server_socket.setblocking(False)

#创建epoll事件对象,后续要监控的事件添加到其中
epoll = select.epoll()
#注册服务器监听fd到等待读事件集合,server_socket是用来accept新连接的,不能读/写数据
epoll.register(server_socket.fileno(), select.EPOLLIN)
#用于关联事件和socket对象;和C语言不同,注册事件时不能关联上下文
fd_to_socket = {server_socket.fileno(): server_socket, }

while True:
  print("http://%s:%d waiting connection active......" % server_address)
  #轮询注册的事件集合,返回值为[(文件句柄,对应的事件),(...),....]
  events = epoll.poll(-1)  # 永不超时
  if not events:
     print("epoll.poll error")
     break

  for fd, event in events:
      #如果活动socket为当前服务器socket,表示有新连接
    if fd == server_socket.fileno():
         connection, client_address = server_socket.accept()
         print(client_address, 'connected')
         connection.setblocking(False)
         client_fd = connection.fileno()
         fd_to_socket[client_fd] = connection
         epoll.register(client_fd, select.EPOLLIN)  # 只添加读事件
         continue

    connection = fd_to_socket[fd]
    close_reason = 'close because of network event'
    if event & select.EPOLLIN:  # 可读事件
        print(connection.getpeername(), 'can read')
        data = connection.recv(1024)
        if data:
           print(connection.getpeername(), 'send:')
           print(data)
           epoll.modify(fd, select.EPOLLOUT)
           continue
        #else: # 关闭
        close_reason = 'close because of recv error'
    if event & select.EPOLLOUT:  # 可写事件
        print(connection.getpeername(), 'can write')
        #发送响应数据
        connection.send(b'HTTP/1.1 200 OK\r\n')
        connection.send(b'Connection: close\r\n')
        connection.send(b'\r\n')
        connection.send(b'<h1>It Works!</h1>')
        # 数据可能只写到操作系统的缓冲区,后面的close调用会尝试把数据发送到网络
        close_reason = 'close after send response data'

    #if event & select.EPOLLHUP:  # 关闭事件
    print(connection.getpeername(), close_reason)
    epoll.unregister(fd)
    connection.close()
    del fd_to_socket[fd]

epoll.unregister(server_socket.fileno())
epoll.close()
server_socket.close()

思考:

  1. Nginx会在哪个进程调用listen、在哪个进程调用accept、如何均衡每个进程上的连接数(Nginx的进程模型
  2. 事件驱动给代码编写(组织)带来哪些挑战(可对比前面两个示例,发送或接收的数据过大可能需要多次调用send/recv)
  3. 事件处理逻辑代码执行时间过长会有什么影响

相关链接:

  1. Python select module docs
  2. Linux C socket(man 2 socket)
  3. Linux C epoll(man 7 epoll)

3. 开发指南

内容主要摘自官方Development guide,参考淘宝团队的Nginx开发从入门到精通-nginx平台初探

内置数据结构

  1. 字符串(带长度、和以’\0’结束的C字符串不同的,结构同go语言的string)
  2. 内存池
  3. 动态数组(可理解成go语言的切片)
  4. 列表(其实是把一块块数组用链表连起来)
  5. 更多

模块开发

  1. 理解请求处理的11个阶段
  2. 模块编译配置、模块类型(–add-module参数所需)
  3. 配置指令定义
  4. 模块功能实现

代码行数最短的Nginx模块:empty_gif,注意指令定义和如何关联模块功能。

// 摘自 https://github.com/nginx/nginx/blob/release-1.16.0/src/http/modules/ngx_http_empty_gif_module.c
/*
 * Copyright (C) Igor Sysoev
 * Copyright (C) Nginx, Inc.
 */

#include <ngx_config.h>
#include <ngx_core.h>
#include <ngx_http.h>


static char *ngx_http_empty_gif(ngx_conf_t *cf, ngx_command_t *cmd,
    void *conf);

static ngx_command_t  ngx_http_empty_gif_commands[] = {

    { ngx_string("empty_gif"),
      NGX_HTTP_LOC_CONF|NGX_CONF_NOARGS,
      ngx_http_empty_gif,
      0,
      0,
      NULL },

      ngx_null_command
};


/* the minimal single pixel transparent GIF, 43 bytes */

static u_char  ngx_empty_gif[] = {

    'G', 'I', 'F', '8', '9', 'a',  /* header                                 */

                                   /* logical screen descriptor              */
    0x01, 0x00,                    /* logical screen width                   */
    0x01, 0x00,                    /* logical screen height                  */
    0x80,                          /* global 1-bit color table               */
    0x01,                          /* background color #1                    */
    0x00,                          /* no aspect ratio                        */

                                   /* global color table                     */
    0x00, 0x00, 0x00,              /* #0: black                              */
    0xff, 0xff, 0xff,              /* #1: white                              */

                                   /* graphic control extension              */
    0x21,                          /* extension introducer                   */
    0xf9,                          /* graphic control label                  */
    0x04,                          /* block size                             */
    0x01,                          /* transparent color is given,            */
                                   /*     no disposal specified,             */
                                   /*     user input is not expected         */
    0x00, 0x00,                    /* delay time                             */
    0x01,                          /* transparent color #1                   */
    0x00,                          /* block terminator                       */

                                   /* image descriptor                       */
    0x2c,                          /* image separator                        */
    0x00, 0x00,                    /* image left position                    */
    0x00, 0x00,                    /* image top position                     */
    0x01, 0x00,                    /* image width                            */
    0x01, 0x00,                    /* image height                           */
    0x00,                          /* no local color table, no interlaced    */

                                   /* table based image data                 */
    0x02,                          /* LZW minimum code size,                 */
                                   /*     must be at least 2-bit             */
    0x02,                          /* block size                             */
    0x4c, 0x01,                    /* compressed bytes 01_001_100, 0000000_1 */
                                   /* 100: clear code                        */
                                   /* 001: 1                                 */
                                   /* 101: end of information code           */
    0x00,                          /* block terminator                       */

    0x3B                           /* trailer                                */
};


static ngx_http_module_t  ngx_http_empty_gif_module_ctx = {
    NULL,                          /* preconfiguration */
    NULL,                          /* postconfiguration */

    NULL,                          /* create main configuration */
    NULL,                          /* init main configuration */

    NULL,                          /* create server configuration */
    NULL,                          /* merge server configuration */

    NULL,                          /* create location configuration */
    NULL                           /* merge location configuration */
};


ngx_module_t  ngx_http_empty_gif_module = {
    NGX_MODULE_V1,
    &ngx_http_empty_gif_module_ctx, /* module context */
    ngx_http_empty_gif_commands,   /* module directives */
    NGX_HTTP_MODULE,               /* module type */
    NULL,                          /* init master */
    NULL,                          /* init module */
    NULL,                          /* init process */
    NULL,                          /* init thread */
    NULL,                          /* exit thread */
    NULL,                          /* exit process */
    NULL,                          /* exit master */
    NGX_MODULE_V1_PADDING
};


static ngx_str_t  ngx_http_gif_type = ngx_string("image/gif");


static ngx_int_t
ngx_http_empty_gif_handler(ngx_http_request_t *r)
{
    ngx_http_complex_value_t  cv;

    if (!(r->method & (NGX_HTTP_GET|NGX_HTTP_HEAD))) {
        return NGX_HTTP_NOT_ALLOWED;
    }

    ngx_memzero(&cv, sizeof(ngx_http_complex_value_t));

    cv.value.len = sizeof(ngx_empty_gif);
    cv.value.data = ngx_empty_gif;
    r->headers_out.last_modified_time = 23349600;

    return ngx_http_send_response(r, NGX_HTTP_OK, &ngx_http_gif_type, &cv);
}


static char *
ngx_http_empty_gif(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_http_core_loc_conf_t  *clcf;

    clcf = ngx_http_conf_get_module_loc_conf(cf, ngx_http_core_module);
    clcf->handler = ngx_http_empty_gif_handler;

    return NGX_CONF_OK;
}