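Tornado is a web server framework implemented in Python. Its defining feature is asynchronous, non-blocking I/O, which is how it sustains high load with good performance. So how is Tornado implemented under the hood? Let's take a look.
A note: the Tornado version shown here is 4.2.1; other versions may differ in places.
Let's start our journey with the official helloworld: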
helloworld.py:
import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web

from tornado.options import define, options

define("port", default=8888, help="run on the given port", type=int)


class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")


def main():
    tornado.options.parse_command_line()
    application = tornado.web.Application([
        (r"/", MainHandler),
    ])
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(options.port)
    tornado.ioloop.IOLoop.current().start()


if __name__ == "__main__":
    main()
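The basic pattern in Tornado is: implement a RequestHandler subclass for each URL pattern, initialize an Application with those patterns and their handlers, use that Application to initialize an HTTPServer, set the listening port, and then start the IOLoop.
Let's look at Application first: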
web.py:
class Application(httputil.HTTPServerConnectionDelegate):
    def __init__(self, handlers=None, default_host="", transforms=None,
                 **settings):
        ...  # transform and UI module setup omitted
        self.handlers = []
        self.named_handlers = {}
        self.default_host = default_host
        self.settings = settings
        if self.settings.get("static_path"):
            path = self.settings["static_path"]
            handlers = list(handlers or [])
            static_url_prefix = settings.get("static_url_prefix",
                                             "/static/")
            static_handler_class = settings.get("static_handler_class",
                                                StaticFileHandler)
            static_handler_args = settings.get("static_handler_args", {})
            static_handler_args['path'] = path
            for pattern in [re.escape(static_url_prefix) + r"(.*)",
                            r"/(favicon\.ico)", r"/(robots\.txt)"]:
                handlers.insert(0, (pattern, static_handler_class,
                                    static_handler_args))
        if handlers:
            self.add_handlers(".*$", handlers)
        ...
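Application's initialization mainly sets up the transforms, loads the UI modules, and registers a StaticFileHandler for static files; further down it also sets up autoreload for debug mode. The most important step is the call to self.add_handlers(".*$", handlers), which adds the handlers: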
def add_handlers(self, host_pattern, host_handlers):
    """Appends the given handlers to our handler list."""
    if not host_pattern.endswith("$"):
        host_pattern += "$"
    handlers = []
    # Keep the catch-all host pattern at the end of self.handlers
    if self.handlers and self.handlers[-1][0].pattern == '.*$':
        self.handlers.insert(-1, (re.compile(host_pattern), handlers))
    else:
        self.handlers.append((re.compile(host_pattern), handlers))

    for spec in host_handlers:
        if isinstance(spec, (tuple, list)):
            assert len(spec) in (2, 3, 4)
            spec = URLSpec(*spec)
        handlers.append(spec)
        if spec.name:
            if spec.name in self.named_handlers:
                app_log.warning(
                    "Multiple handlers named %s; replacing previous value",
                    spec.name)
            self.named_handlers[spec.name] = spec
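The ordering rule in add_handlers is easy to miss: a trailing '.*$' entry stays last, so host patterns added later still come before the catch-all. The following is a minimal standalone sketch of only that rule, written for Python 3 and without Tornado. The names add_host_entry and matching_patterns are hypothetical helpers for illustration, and the lookup is a simplification, not Tornado's own matching code.

```python
import re


def add_host_entry(entries, host_pattern):
    """Insert a compiled host pattern, keeping a trailing '.*$' entry last.

    Mirrors only the ordering logic quoted from add_handlers; the
    per-host handler list is left empty in this sketch.
    """
    if not host_pattern.endswith("$"):
        host_pattern += "$"
    entry = (re.compile(host_pattern), [])
    if entries and entries[-1][0].pattern == ".*$":
        # A catch-all is already at the end: slot the new entry before it.
        entries.insert(-1, entry)
    else:
        entries.append(entry)
    return entry


def matching_patterns(entries, host):
    """Hypothetical lookup: pattern strings that match host, in list order."""
    return [p.pattern for p, _handlers in entries if p.match(host)]


entries = []
add_host_entry(entries, ".*$")                  # what Application.__init__ passes
add_host_entry(entries, r"www\.example\.com")   # added later, more specific
add_host_entry(entries, r"api\.example\.com")

order = [p.pattern for p, _handlers in entries]
```

After these three calls, order lists the two specific hosts first and '.*$' last, so a request for api.example.com sees the specific entry before the catch-all.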
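Next is the HTTPServer class. There is not much to say about its initialization; the one thing worth noting is that it stores the Application in its self.request_callback attribute. Let's go straight to the listen method, which HTTPServer inherits directly from TCPServer.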
tcpserver.py:
def listen(self, port, address=""):
    sockets = bind_sockets(port, address=address)
    self.add_sockets(sockets)
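The body of bind_sockets is not reproduced in this walkthrough. As a rough idea of the work a helper of that kind has to do, here is a Python 3 sketch that uses only the standard library. The function name bind_listening_sockets, its parameters and the socket options chosen are assumptions for illustration, not Tornado's implementation.

```python
import socket


def bind_listening_sockets(port, address=None, backlog=128):
    """Hypothetical helper: return one listening socket per resolved address."""
    sockets = []
    # One name can resolve to several addresses (for example IPv4 and IPv6),
    # which is why the result is a list rather than a single socket.
    infos = socket.getaddrinfo(address, port, socket.AF_UNSPEC,
                               socket.SOCK_STREAM, 0, socket.AI_PASSIVE)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)   # step 1: create
        except OSError:
            continue  # address family not usable on this host
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if family == socket.AF_INET6 and hasattr(socket, "IPPROTO_IPV6"):
            # Keep the IPv6 socket from also claiming the IPv4 port.
            sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
        sock.setblocking(False)   # an event loop needs non-blocking sockets
        sock.bind(sockaddr)       # step 2: bind
        sock.listen(backlog)      # step 3: listen
        sockets.append(sock)
    return sockets
```

Calling bind_listening_sockets(0, "127.0.0.1") asks the operating system for a free port on the loopback interface; the chosen port can then be read back with getsockname() on the returned socket.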
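A server socket goes through three steps: create -> bind -> listen. The bind_sockets function takes care of them: it creates and configures each socket, binds it, and puts it into the listening state. Because the given address may resolve to multiple IP addresses, bind_sockets returns a list of sockets. Now let's look at add_sockets: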
def add_sockets(self, sockets):
    if self.io_loop is None:
        self.io_loop = IOLoop.current()

    for sock in sockets:
        self._sockets[sock.fileno()] = sock
        add_accept_handler(sock, self._handle_connection,
                           io_loop=self.io_loop)
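It starts by obtaining the IOLoop instance; IOLoop.current() returns the current thread's IOLoop and falls back to the global singleton when none is set. IOLoop itself is covered in detail elsewhere. It then calls add_accept_handler so that TCPServer._handle_connection becomes the callback invoked when a socket receives a connection. Let's look at add_accept_handler first.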
netutil.py:
def add_accept_handler(sock, callback, io_loop=None):
    if io_loop is None:
        io_loop = IOLoop.current()

    def accept_handler(fd, events):
        for i in xrange(_DEFAULT_BACKLOG):
            try:
                connection, address = sock.accept()
            except socket.error as e:
                # _ERRNO_WOULDBLOCK indicate we have accepted every
                # connection that is available.
                if errno_from_exception(e) in _ERRNO_WOULDBLOCK:
                    return
                # ECONNABORTED indicates that there was a connection
                # but it was closed while still in the accept queue.
                # (observed on FreeBSD).
                if errno_from_exception(e) == errno.ECONNABORTED:
                    continue
                raise
            callback(connection, address)
    io_loop.add_handler(sock, accept_handler, IOLoop.READ)
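add_accept_handler registers the socket with io_loop so that accept_handler is called whenever an IOLoop.READ event arrives on it. accept_handler is defined inside add_accept_handler; all it does is call sock.accept() to obtain the connection socket and its address, then invoke the callback, which here is TCPServer._handle_connection: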
tcpserver.py:
def _handle_connection(self, connection, address):
    if self.ssl_options is not None:
        ...  # SSL handling omitted
    try:
        if self.ssl_options is not None:
            stream = SSLIOStream(connection, io_loop=self.io_loop,
                                 max_buffer_size=self.max_buffer_size,
                                 read_chunk_size=self.read_chunk_size)
        else:
            stream = IOStream(connection, io_loop=self.io_loop,
                              max_buffer_size=self.max_buffer_size,
                              read_chunk_size=self.read_chunk_size)
        future = self.handle_stream(stream, address)
        if future is not None:
            self.io_loop.add_future(future, lambda f: f.result())
    except Exception:
        app_log.error("Error in connection callback", exc_info=True)
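Leaving the SSL handling aside, the main job here is to wrap the socket in an IOStream and hand it to the upper layer. The handle_stream method is implemented in HTTPServer and returns either None or a future object; futures are covered in more detail later.
At this point Application and HTTPServer are ready. Next, let's look at the core of Tornado: IOLoop.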
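Before moving on, the accept path traced so far (register the listening socket for read events, accept until the queue is empty, pass each connection to a callback) can be imitated with the standard library alone. This is a Python 3 sketch under stated assumptions: the selectors module stands in for IOLoop, and register_acceptor and run_once are hypothetical names, not Tornado APIs.

```python
import selectors
import socket


def register_acceptor(sock, callback, selector, max_accepts=128):
    """Register a non-blocking listening socket; drain it on each read event."""
    def accept_handler():
        # Accept in a bounded loop so one busy listener cannot starve
        # everything else handled by the same loop.
        for _ in range(max_accepts):
            try:
                connection, address = sock.accept()
            except BlockingIOError:
                return      # nothing left in the accept queue
            except ConnectionAbortedError:
                continue    # peer gave up while waiting in the queue
            callback(connection, address)

    # The handler travels as the registration's data, loosely like the
    # callback stored by an event loop's add_handler.
    selector.register(sock, selectors.EVENT_READ, accept_handler)


def run_once(selector, timeout=1.0):
    """One iteration of a toy event loop: wait for events, call handlers."""
    for key, _events in selector.select(timeout):
        key.data()
```

To try it, create a listening socket, call setblocking(False) on it, register it with a selectors.DefaultSelector() through register_acceptor, connect a client, and call run_once; the callback then receives the accepted connection and the peer address.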