Flask Basics: Thread Isolation

1. Processes and Threads

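A process is the smallest unit to which the operating system allocates resources, while a thread is the smallest unit the operating system schedules onto a CPU.

Switching between threads is cheaper than switching between processes.

A process is allocated resources such as memory.

A thread uses the CPU to execute code.

A thread cannot be allocated or own resources itself, but it can access the resources of the process it belongs to. Because threads neither own nor manage resources, a thread switch costs less than a process switch.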
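To make the point about threads accessing their process's resources concrete, here is a minimal sketch. The names shared_results and record are made up for illustration; the only thing it assumes is the standard threading module.

```python
import threading

# Module-level data lives in the memory of the process,
# so every thread in that process can see and modify it.
shared_results = []
results_lock = threading.Lock()


def record(worker_id):
    # Each thread appends to the same list object owned by the process.
    with results_lock:
        shared_results.append(worker_id)


threads = [threading.Thread(target=record, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All four threads wrote into the same list.
print(sorted(shared_results))
```

The threads never receive memory of their own here; they all work on the one list that belongs to the process, which is exactly why shared state needs care once multithreading is enabled.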

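On a multi-core CPU, multiple threads can in principle run in parallel on different cores. In Python, however, the GIL (Global Interpreter Lock) allows only one thread to execute Python bytecode at a time, so multithreaded code cannot take full advantage of multiple cores. The GIL is an implementation detail of CPython rather than part of the Python language; implementations such as Jython do not have one.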

2. Thread Isolation

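In the Flask version used here, the development server runs as a single process with a single thread by default, but multithreading can be turned on. (Since Flask 1.0 the development server is threaded by default.)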

# Set threaded to True
app.run(threaded=True)

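Turning on multithreading introduces some problems, though.

We will use web requests as the running example.

Consider the single-threaded case first.

All requests are handled one after another. When a request arrives, Flask instantiates a Request object and the request variable points to that instance. When the request finishes and a new one arrives, Flask instantiates a new Request object and request is pointed at the new instance. request therefore always refers to the latest request, and there is never any ambiguity about what it points to.

Now consider the multithreaded case.

With multiple threads, a single shared request variable becomes ambiguous: a thread can no longer be sure that it refers to the request that thread is handling, so it may fail to get the correct request.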
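The ambiguity can be reproduced deterministically with a small sketch. This is not Flask code: the module-level request variable, the two handler functions, and the events that force the interleaving are all hypothetical stand-ins chosen for illustration.

```python
import threading

# One module-level name shared by every thread (stand-in for a naive global request).
request = None

a_assigned = threading.Event()
b_assigned = threading.Event()
seen_by_a = []


def handle_a():
    global request
    request = "request A"
    a_assigned.set()
    # While thread A is still working, thread B starts handling its own request.
    b_assigned.wait()
    seen_by_a.append(request)


def handle_b():
    global request
    a_assigned.wait()
    request = "request B"
    b_assigned.set()


ta = threading.Thread(target=handle_a)
tb = threading.Thread(target=handle_b)
ta.start()
tb.start()
ta.join()
tb.join()

# Thread A assigned "request A" but now reads thread B's value.
print(seen_by_a[0])
```

Thread A ends up reading the value written by thread B, which is the "pointing at the wrong request" situation described above.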

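A first solution is to create a dict that uses the thread ID as the key and the Request instance as the value. This guarantees that the request seen by each thread is independent, so there is no ambiguity. This is a rudimentary form of thread isolation: the Request objects of different threads do not affect one another. Thread isolation is an idea, and it does not have to be implemented with a dict.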
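A minimal sketch of this dict-based idea might look as follows. The names requests_by_thread, set_request, and get_request are assumptions made for illustration, and plain strings stand in for Request objects.

```python
import threading

# thread id -> the request that thread is handling
requests_by_thread = {}


def set_request(value):
    requests_by_thread[threading.get_ident()] = value


def get_request():
    return requests_by_thread.get(threading.get_ident())


seen = {}
# The barrier makes both threads store their request before either one reads.
barrier = threading.Barrier(2)


def handle(name):
    set_request("request " + name)
    barrier.wait()
    seen[name] = get_request()


threads = [threading.Thread(target=handle, args=(n,)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
```

Each thread reads back only the value it stored itself. One caveat of this naive version is that thread identifiers may be recycled after a thread exits, so a real implementation also has to remove a thread's entry when it is done.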

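3. The Thread-Isolated Object: Local

Flask implements thread isolation with the Local object from the local module of the Werkzeug library. Local is itself a thin wrapper around a dict.

Let's look at the source: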


class Local(object):
    __slots__ = ('__storage__', '__ident_func__')

    def __init__(self):
        # __storage__ is a dict
        object.__setattr__(self, '__storage__', {})
        object.__setattr__(self, '__ident_func__', get_ident)


    def __setattr__(self, name, value):
        # Get the current thread ID
        ident = self.__ident_func__()
        storage = self.__storage__
        try:
            storage[ident][name] = value
        except KeyError:
            storage[ident] = {name: value}
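The excerpt above shows only the write path. To see how reads and writes could fit together, here is a simplified sketch; MiniLocal is a hypothetical class written for this article, not Werkzeug's implementation, and it assumes threading.get_ident as the identifier function.

```python
import threading


class MiniLocal:
    """Simplified stand-in for Local: {thread id: {attribute name: value}}."""

    def __init__(self):
        # Bypass our own __setattr__ so the storage dict is a normal attribute.
        object.__setattr__(self, "_storage", {})

    def __setattr__(self, name, value):
        ident = threading.get_ident()
        self._storage.setdefault(ident, {})[name] = value

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for per-thread attributes.
        try:
            return self._storage[threading.get_ident()][name]
        except KeyError:
            raise AttributeError(name) from None


local = MiniLocal()
local.a = 1
seen_in_worker = []


def worker():
    # The value set by the main thread is not visible in this thread.
    seen_in_worker.append(getattr(local, "a", None))
    local.a = 2
    seen_in_worker.append(local.a)


t = threading.Thread(target=worker)
t.start()
t.join()

print(seen_in_worker, local.a)
```

Both threads use the same attribute name on the same object, yet each one reads and writes only the inner dict stored under its own thread ID.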

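What is a thread-isolated object?

If operations performed on an object by different threads do not affect one another, that object is a thread-isolated object.

Local is such an object: in a multithreaded program, operations on a Local object are isolated from one thread to the next.

Let's write some code to test this: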

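import threading

from werkzeug.local import Local

my_obj = Local()

my_obj.a = 1


def worker():
    # This assignment is stored under the new thread's own identifier
    my_obj.a = 2
    print("new thread a is " + str(my_obj.a))

new_t = threading.Thread(target=worker, name="new thread")

new_t.start()

# Wait for the new thread to finish before reading from the main thread
new_t.join()

print("main thread a is " + str(my_obj.a))

# Output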
new thread a is 2
main thread a is 1

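4. A Thread-Isolated Stack: LocalStack

Let's revisit the diagram we studied earlier when learning about contexts.

Let's see what _request_ctx_stack and _app_ctx_stack in that diagram refer to:

_request_ctx_stack = LocalStack()
_app_ctx_stack = LocalStack()

The source shows that both are LocalStack instances. So what is a LocalStack?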

class LocalStack(object):

    """This class works similar to a :class:`Local` but keeps a stack
    of objects instead.  This is best explained with an example::

        >>> ls = LocalStack()
        >>> ls.push(42)
        >>> ls.top
        42
        >>> ls.push(23)
        >>> ls.top
        23
        >>> ls.pop()
        23
        >>> ls.top
        42

    They can be force released by using a :class:`LocalManager` or with
    the :func:`release_local` function but the correct way is to pop the
    item from the stack after using.  When the stack is empty it will
    no longer be bound to the current context (and as such released).

    By calling the stack without arguments it returns a proxy that resolves to
    the topmost item on the stack.

    .. versionadded:: 0.6.1
    """

    def __init__(self):
        self._local = Local()

    def __release_local__(self):
        self._local.__release_local__()

    def _get__ident_func__(self):
        return self._local.__ident_func__

    def _set__ident_func__(self, value):
        object.__setattr__(self._local, '__ident_func__', value)
    __ident_func__ = property(_get__ident_func__, _set__ident_func__)
    del _get__ident_func__, _set__ident_func__

    def __call__(self):
        def _lookup():
            rv = self.top
            if rv is None:
                raise RuntimeError('object unbound')
            return rv
        return LocalProxy(_lookup)

    def push(self, obj):
        """Pushes a new item to the stack"""
        rv = getattr(self._local, 'stack', None)
        if rv is None:
            self._local.stack = rv = []
        rv.append(obj)
        return rv

    def pop(self):
        """Removes the topmost item from the stack, will return the
        old value or `None` if the stack was already empty.
        """
        stack = getattr(self._local, 'stack', None)
        if stack is None:
            return None
        elif len(stack) == 1:
            release_local(self._local)
            return stack[-1]
        else:
            return stack.pop()

    @property
    def top(self):
        """The topmost item on the stack.  If the stack is empty,
        `None` is returned.
        """
        try:
            return self._local.stack[-1]
        except (AttributeError, IndexError):
            return None

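The source shows that LocalStack wraps a Local while keeping the behavior of a stack (implemented with a list).

The relationship between LocalStack, Local, and the dict:

Local implements thread isolation by means of a dict; LocalStack wraps Local to provide a thread-isolated stack.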
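The layering can be sketched in a few lines. MiniLocalStack below is a hypothetical, simplified class written for illustration; it is built on the standard library's threading.local instead of Werkzeug's Local, which is an assumption made to keep the example self-contained.

```python
import threading


class MiniLocalStack:
    """Simplified stand-in for LocalStack: one list per thread."""

    def __init__(self):
        # threading.local plays the role that Local plays in Werkzeug.
        self._local = threading.local()

    def push(self, obj):
        stack = getattr(self._local, "stack", None)
        if stack is None:
            self._local.stack = stack = []
        stack.append(obj)

    def pop(self):
        stack = getattr(self._local, "stack", None)
        if not stack:
            return None
        return stack.pop()

    @property
    def top(self):
        stack = getattr(self._local, "stack", None)
        return stack[-1] if stack else None


stack = MiniLocalStack()
stack.push(1)
seen_in_worker = []


def worker():
    # The new thread starts with an empty stack of its own.
    seen_in_worker.append(stack.top)
    stack.push(2)
    seen_in_worker.append(stack.top)


t = threading.Thread(target=worker)
t.start()
t.join()

main_top = stack.top
print(seen_in_worker, main_top)
```

The per-thread storage supplies the isolation, and the list stored inside it supplies the stack behavior; the wrapper only combines the two.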

In software, everything is built by encapsulation; if one layer is not enough, add another.

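5. Using LocalStack as a Stack

Whereas Local is used through direct attribute assignment and access, LocalStack is used through its push, pop, and top members.

Let's look at the example code from the source: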

>>> ls = LocalStack()
>>> ls.push(42)
>>> ls.top
42
>>> ls.push(23)
>>> ls.top
23
>>> ls.pop()
23
>>> ls.top
42

# Last in, first out (LIFO)

top only reads the topmost item without removing it, whereas pop reads it and removes it.

6. Using LocalStack for Thread Isolation

Let's go straight to the code and its output:

import threading

from werkzeug.local import LocalStack

my_stack = LocalStack()
my_stack.push(1)

print("in main thread after push, value is: " + str(my_stack.top))


def worker():
    # The new thread has its own stack, which is empty at first
    print("in new thread before push, value is: " + str(my_stack.top))
    my_stack.push(2)
    print("in new thread after push, value is: " + str(my_stack.top))


new_t = threading.Thread(target=worker, name="new thread")

new_t.start()

# Wait for the new thread to finish before reading from the main thread
new_t.join()

print("in main thread finally push, value is: " + str(my_stack.top))

# Output
in main thread after push, value is: 1
in new thread before push, value is: None
in new thread after push, value is: 2
in main thread finally push, value is: 1

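The code shows that operations in the main thread and in the new thread do not affect each other. In other words, two threads means two stacks.

7. Thread-Isolated Objects in Flask

Why does Flask use LocalStack?

Looking at the diagram above, each request requires pushing both contexts onto a stack, and each push must be isolated per thread. LocalStack meets exactly this need.

Why does Flask use thread isolation at all?

The surface reason: in multithreaded programs every thread creates various objects. Without thread isolation, variables would refer to the wrong objects and the program would misbehave.

The deeper reason: taking Request as the example, with thread isolation the request variable in each thread points to that thread's current Request instance, so there is no ambiguity.

The point of thread isolation: it lets the current thread correctly refer to the objects it created itself, not to objects created by other threads.

Let's now compare a thread-isolated object with one that is not.

First, create an object that is not thread-isolated: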

class NoneLocal:
    def __init__(self, v):
        self.v = v


n = NoneLocal(1)

Then use and modify the non-isolated object inside a view function.

from flask import Flask, request, session, g

# The module that defines NoneLocal and n. It is named none_local here because
# nonelocal is a reserved keyword in Python 3 and cannot appear in an import.
from none_local import n

app = Flask(__name__)


@app.route('/test')
def test():
    print(n.v)
    n.v = 2

    print('----------------')
    print(getattr(request, 'v', None))
    print(getattr(session, 'v', None))
    print(getattr(g, 'v', None))

    setattr(request, 'v', 2)
    setattr(session, 'v', 2)
    setattr(g, 'v', 2)

    return ''

# Remember to enable multithreading
app.run(port=5005, threaded=True, debug=True)

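We request this page twice in a row and look at the output:

Given the code above, n.v should print 1 on the first request and 2 on the second, while the three getattr calls should print None on both requests. From this output we can see that request, session, and g are all thread-isolated objects. Objects that are not thread-isolated affect one another across requests.

Some Flask terminology

Thread-isolated objects (the containers): LocalStack and Local are the objects that provide thread isolation.

Objects that are thread-isolated (the contents): objects that become isolated by being stored in a thread-isolated object. For example, the two contexts become thread-isolated by being pushed onto a LocalStack.

Now let's examine a question: is current_app thread-isolated?

Look at the source first: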

def _find_app():
    top = _app_ctx_stack.top
    if top is None:
        raise RuntimeError(_app_ctx_err_msg)
    return top.app


# context locals
_app_ctx_stack = LocalStack()
current_app = LocalProxy(_find_app)

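What _find_app ultimately returns is Flask's core object, the Flask instance.

The application context and the core object are different things: the core object is stored as an attribute of the application context.

There is only one core object globally.

Why only one?

Because app is created in the entry file, once, when the program first starts. A Request, by contrast, has to be created for every request. That is why there is only one app globally.

From this we conclude that current_app is not thread-isolated: although it is looked up through the thread-isolated _app_ctx_stack, it resolves to the same Flask object in every thread.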
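This conclusion can be illustrated with a small sketch that uses no Flask code at all. App, AppContext, find_app, and the threading.local container are hypothetical stand-ins for the Flask object, its application context, _find_app, and _app_ctx_stack.

```python
import threading


class App:
    """Stand-in for the Flask core object."""


class AppContext:
    """Stand-in for the application context: it stores the app as an attribute."""

    def __init__(self, app):
        self.app = app


# Stand-in for _app_ctx_stack: every thread sees its own stack attribute.
_ctx = threading.local()


def find_app():
    stack = getattr(_ctx, "stack", [])
    if not stack:
        raise RuntimeError("no application context")
    return stack[-1].app


app = App()  # created once, when the program starts
found = {}
barrier = threading.Barrier(2)


def handle(name):
    # Each thread pushes its own context, but every context wraps the same app.
    _ctx.stack = [AppContext(app)]
    barrier.wait()
    found[name] = (_ctx.stack[-1], find_app())


threads = [threading.Thread(target=handle, args=(n,)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(found["A"][0] is found["B"][0])  # the two contexts
print(found["A"][1] is found["B"][1])  # the two resolved apps
```

The contexts differ per thread, while the app they resolve to is one and the same object. The lookup is isolated; the thing it finds is shared.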

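Finally, let's look at how these terms relate to one another:

A dict keyed by thread ID internally -> Local -> LocalStack

AppContext, RequestContext -> LocalStack

On every request the two contexts are pushed onto the LocalStack, and they are popped again when the request ends.

Flask -> AppContext, Request -> RequestContext

AppContext stores Flask's core object as an attribute, and RequestContext likewise wraps and stores the Request.

current_app -> (LocalStack.top == AppContext, top.app == Flask)

current_app points to the app attribute of the top item on the stack, and that top item is the application context.

request -> (LocalStack.top == RequestContext, top.request == Request)

What request actually points to is the Request object.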
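The push-at-start, pop-at-end lifecycle summarized above can be sketched with a context manager. This is only an illustration of the pattern, not Flask's implementation: request_context, current_request, and the dict used as request data are all hypothetical names.

```python
import threading
from contextlib import contextmanager

# Per-thread storage for the stack (stand-in for a LocalStack).
_holder = threading.local()


def _stack():
    if not hasattr(_holder, "stack"):
        _holder.stack = []
    return _holder.stack


@contextmanager
def request_context(request_data):
    # Push when the request starts.
    _stack().append(request_data)
    try:
        yield
    finally:
        # Pop when the request ends, even if the handler raised.
        _stack().pop()


def current_request():
    stack = _stack()
    return stack[-1] if stack else None


log = []
with request_context({"path": "/test"}):
    # Inside the request, the top of the stack is the current request.
    log.append(current_request()["path"])

# After the request, the stack is empty again.
log.append(current_request())
print(log)
```

Popping in a finally clause mirrors the requirement that a context must not outlive its request, so the next request handled by the same thread starts from an empty stack.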