Python Concurrency: Choosing Between threading, asyncio, and multiprocessing Without the Headache

Latest recommended article published on 2026-06-28 21:35:05


This content is licensed under CC 4.0 BY-SA.


Tags

#rust #github #python


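1. Python's concurrency dilemma: the pain of the GIL and the IO bottleneck

Concurrency in Python has long been a headache. Because of the GIL (Global Interpreter Lock), threads give pure-Python CPU-bound code almost no speedup. Yet Python still has plenty of concurrency needs in IO-bound work: network requests, database queries, file reads and writes. Pick the wrong concurrency model and the mild outcome is poor performance; the severe one is deadlocks, resource leaks, and debugging sessions that drive you to despair.

Common pitfalls in production: a crawler written with threading slows down once the thread count grows, because of frequent context switches; a rewrite on asyncio runs into third-party libraries that do not support coroutines, so run_in_executor ends up everywhere and the code is more complicated than the threaded version; a compute job on multiprocessing spends more on inter-process communication than on the computation itself.

The root cause is not that Python's concurrency models are bad, but that the model was not matched to the characteristics of the task. IO-bound, CPU-bound, mixed: each scenario has a best-fit model, but only if you understand the underlying mechanisms.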

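2. How the three concurrency models run and schedule work

Python offers three main concurrency models, and their scheduling mechanisms are completely different.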

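graph TD
    A[Python concurrency models] --> B[threading: multiple threads]
    A --> C[asyncio: coroutines]
    A --> D[multiprocessing: multiple processes]

    B --> B1[Scheduled by the operating system]
    B1 --> B2[GIL limit: only one thread executes Python bytecode at a time]
    B2 --> B3[GIL released during IO operations]

    C --> C1[Scheduled by the event loop]
    C1 --> C2[Cooperative switching within a single thread]
    C2 --> C3[await suspends the coroutine, no lock contention]

    D --> D1[Scheduled by the operating system as processes]
    D1 --> D2[Each process has its own GIL]
    D2 --> D3[True parallelism, but high inter-process communication cost]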

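threading is scheduled by the operating system kernel, and thread switches are preemptive. In CPython, however, the GIL guarantees that only one thread executes Python bytecode at any moment. Threads therefore cannot use multiple cores for CPU computation, but the GIL is released while a thread waits on IO, so IO-bound tasks still benefit.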
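The point that threads still help when the work is mostly waiting can be checked with a small sketch. It assumes `time.sleep` as a stand-in for a blocking IO call (a sleeping thread releases the GIL, as a thread blocked on a socket does). The `fake_io` helper and the 0.2-second delay are illustrative and not part of the original article.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id: int, delay: float = 0.2) -> int:
    """Stand-in for a blocking IO call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return task_id


def run_sequential(n: int) -> float:
    """Run n fake IO calls one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(n):
        fake_io(i)
    return time.perf_counter() - start


def run_threaded(n: int) -> float:
    """Run n fake IO calls on n threads and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        # list() forces all results to be collected before the timer stops.
        list(pool.map(fake_io, range(n)))
    return time.perf_counter() - start
```

The threaded run should take roughly one delay while the sequential run takes n delays. Replacing `fake_io` with a pure-Python computation would be expected to erase that gap on a GIL-enabled CPython build.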

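asyncio is scheduled by an event loop, and coroutine switches are cooperative. The await keyword marks a suspension point; at that point the event loop can switch to another coroutine that is ready to run. Everything happens in a single thread, so coroutines do not contend for locks and no OS-level thread switch is needed to move between them. The price is that asynchronous APIs must be used end to end: as soon as a blocking call slips in, the whole event loop stalls until it returns.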
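Cooperative switching is easiest to see when the suspension points are explicit. The sketch below is illustrative only: `worker` and `interleave` are made-up names, and `asyncio.sleep(0)` is used purely as a way to hand control back to the loop.

```python
import asyncio


async def worker(name: str, steps: int, log: list[str]) -> None:
    """Append one entry per step, yielding to the event loop after each."""
    for i in range(steps):
        log.append(f"{name}{i}")
        # sleep(0) is an explicit suspension point: control returns to the
        # loop, which then resumes the next ready coroutine.
        await asyncio.sleep(0)


async def interleave() -> list[str]:
    """Run two workers concurrently on one thread and return the shared log."""
    log: list[str] = []
    await asyncio.gather(worker("a", 3, log), worker("b", 3, log))
    return log


def run_demo() -> list[str]:
    """Synchronous entry point for the demo."""
    return asyncio.run(interleave())
```

The two workers alternate strictly at each `await`, and the shared list needs no lock because a coroutine can only be switched out at a suspension point.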

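multiprocessing sidesteps the GIL by creating separate processes, each with its own GIL and its own memory space. On a standard GIL-enabled CPython build, this is the usual way to get real CPU parallelism for pure-Python code. Process creation and IPC (inter-process communication) are expensive, though, and data has to be serialized to cross the process boundary, so it is a poor fit for large numbers of small, frequent tasks.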
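Since multiprocessing pickles arguments and return values to move them between processes, the pickled size gives a rough feel for the IPC cost. The following sketch is an assumption-laden illustration: `payload_bytes` and `compare_payloads` are hypothetical helpers, and pickled size is only a proxy for the real transfer cost.

```python
import pickle


def payload_bytes(obj: object) -> int:
    """Size of obj once pickled, a rough proxy for what crosses the process boundary."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def compare_payloads(n: int) -> tuple[int, int]:
    """Return (bytes to ship n integers, bytes to ship only the (start, stop) bounds)."""
    full_list = list(range(n))
    bounds = (0, n)
    return payload_bytes(full_list), payload_bytes(bounds)
```

When a worker can rebuild its input from a small description (here, the bounds of a range), sending the description instead of the data keeps serialization from dominating the run time.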

3. Production-grade concurrent code: the three models in practice

threading: a worker pool for IO-bound tasks

import queue
import threading
from typing import Optional

import requests


class ThreadedFetcher:
    """Concurrent HTTP fetcher backed by a fixed set of worker threads."""

    def __init__(self, worker_count: int = 8):
        self.worker_count = worker_count
        self.task_queue: queue.Queue = queue.Queue()
        self.results: dict[str, Optional[str]] = {}
        self.errors: dict[str, str] = {}
        self._lock = threading.Lock()

    def fetch_all(self, urls: list[str]) -> dict[str, Optional[str]]:
        """Fetch all URLs concurrently and return a {url: response text} mapping."""
        # Start from fresh dicts so repeated calls do not mix their results.
        self.results, self.errors = {}, {}

        # Enqueue every URL before the workers start, so an empty queue means "done".
        for url in urls:
            self.task_queue.put(url)

        # Create and start the worker threads.
        workers = []
        for _ in range(self.worker_count):
            t = threading.Thread(target=self._worker, daemon=True)
            t.start()
            workers.append(t)

        # Block until every queued task has been marked done.
        self.task_queue.join()
        return self.results

    def _worker(self):
        """Worker thread: take URLs off the queue and request them until it is empty."""
        while True:
            try:
                url = self.task_queue.get_nowait()
            except queue.Empty:
                break

            try:
                resp = requests.get(url, timeout=10)
                resp.raise_for_status()
                with self._lock:
                    self.results[url] = resp.text
            except requests.RequestException as e:
                # Record the failure; a failed URL maps to None in results.
                with self._lock:
                    self.errors[url] = str(e)
                    self.results[url] = None
            finally:
                self.task_queue.task_done()
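The same fan-out pattern can also be expressed with the standard library's `concurrent.futures.ThreadPoolExecutor`, which handles the queue and worker lifecycle itself. This is a sketch of an alternative, not the article's implementation: the `fetch` parameter is a hypothetical callable supplied by the caller (for example a thin wrapper around `requests.get`), which keeps the example free of network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional


def fetch_all(
    urls: list[str],
    fetch: Callable[[str], str],
    max_workers: int = 8,
) -> tuple[dict[str, Optional[str]], dict[str, str]]:
    """Call fetch(url) for every URL on a thread pool.

    Returns (results, errors); a failed URL maps to None in results.
    """
    results: dict[str, Optional[str]] = {}
    errors: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        # Futures are collected in the calling thread, so the dicts need no lock.
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # illustrative; narrow this in real code
                errors[url] = str(exc)
                results[url] = None
    return results, errors
```

Because only the calling thread touches `results` and `errors`, the explicit lock from the hand-rolled version is no longer needed.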

asyncio: high-concurrency HTTP requests

import asyncio

import aiohttp


async def fetch_batch(urls: list[str], concurrency: int = 20) -> dict[str, str | None]:
    """Fetch many URLs concurrently with asyncio, capping concurrency with a semaphore."""
    results: dict[str, str | None] = {}
    semaphore = asyncio.Semaphore(concurrency)

    async def fetch_one(session: aiohttp.ClientSession, url: str):
        # At most `concurrency` requests hold the semaphore at the same time.
        async with semaphore:
            try:
                async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
                    resp.raise_for_status()
                    results[url] = await resp.text()
            except (aiohttp.ClientError, asyncio.TimeoutError) as e:
                # A failed URL maps to None so callers can tell it apart from success.
                results[url] = None
                print(f"request failed [{url}]: {e}")

    # One shared session reuses connections across all requests.
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_one(session, url) for url in urls]
        await asyncio.gather(*tasks)

    return results
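To see what the semaphore actually guarantees, the sketch below replaces the HTTP call with a short `asyncio.sleep` and records the peak number of jobs in flight. The names `run_bounded` and `job` and the 10 ms delay are illustrative assumptions.

```python
import asyncio


async def run_bounded(job_count: int, limit: int) -> int:
    """Run job_count simulated IO jobs with at most `limit` in flight; return the peak."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def job() -> None:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a network round trip
            in_flight -= 1

    await asyncio.gather(*(job() for _ in range(job_count)))
    return peak
```

All jobs are created up front, but only `limit` of them are ever past the `async with` line at once; the rest wait on the semaphore without consuming connections.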

multiprocessing: parallel computation for CPU-bound work

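import math
import multiprocessing as mp
from functools import partial


def process_chunk(chunk: list[int], threshold: int) -> list[int]:
    """Run the CPU-bound computation on one chunk and keep values above the threshold."""
    return [x for x in chunk if _heavy_computation(x) > threshold]


def _heavy_computation(n: int) -> float:
    """Simulate an expensive computation."""
    result = 0.0
    for i in range(1000):
        result += (n ** 0.5) * i / (i + 1)
    return result


def parallel_filter(data: list[int], threshold: int, workers: int = 4) -> list[int]:
    """Split the data into chunks and filter them in parallel across processes.

    Call this from under an `if __name__ == "__main__":` guard, because
    platforms that spawn worker processes re-import the main module.
    """
    if not data:
        return []

    # Ceiling division so the trailing remainder is not silently dropped.
    chunk_size = math.ceil(len(data) / workers)
    chunks = [data[i : i + chunk_size] for i in range(0, len(data), chunk_size)]

    # Process the chunks in parallel on a process pool.
    func = partial(process_chunk, threshold=threshold)
    with mp.Pool(processes=workers) as pool:
        chunk_results = pool.map(func, chunks)

    # Flatten the per-chunk results, preserving input order.
    return [item for chunk in chunk_results for item in chunk]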

4. Where each model fits, and the performance trade-offs

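threading fits IO-bound tasks that must stay compatible with synchronous third-party libraries, such as crawling with requests or querying a database with psycopg2. As a rule of thumb, keep the thread count at roughly 2 to 5 times the number of CPU cores; beyond that, context-switching overhead can outweigh what is gained from overlapping IO waits.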

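asyncio fits high-concurrency IO with hundreds of concurrent connections or more, such as WebSocket services or large batches of HTTP requests. Its strength is lock-free, single-threaded execution; its weakness is the demand it places on the ecosystem, since every IO operation has to go through an async library. Mixing in blocking calls is the biggest asyncio pitfall: a single time.sleep() freezes the entire event loop for as long as it runs.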
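The effect of one blocking call can be made visible with a heartbeat coroutine. This sketch assumes Python 3.9 or later for `asyncio.to_thread`; `blocking_call`, `heartbeat`, and `compare` are hypothetical names, and `time.sleep` stands in for any synchronous library call.

```python
import asyncio
import time


def blocking_call(seconds: float) -> str:
    """Stand-in for a synchronous library call that cannot be awaited."""
    time.sleep(seconds)
    return "done"


async def heartbeat(stop: asyncio.Event, interval: float = 0.02) -> int:
    """Count how often the loop gets to run this coroutine before `stop` is set."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(interval)
    return ticks


async def compare(seconds: float = 0.3) -> tuple[int, int]:
    """Return heartbeat ticks seen with the call made (inline, offloaded to a thread)."""

    async def run(offload: bool) -> int:
        stop = asyncio.Event()
        beat = asyncio.create_task(heartbeat(stop))
        await asyncio.sleep(0)  # let the heartbeat record its first tick
        if offload:
            await asyncio.to_thread(blocking_call, seconds)  # loop stays free
        else:
            blocking_call(seconds)  # blocks the whole event loop
        stop.set()
        return await beat

    return await run(offload=False), await run(offload=True)
```

With the call made inline, the heartbeat gets no chance to run until the call returns; offloaded to a thread, it keeps ticking throughout.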

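multiprocessing fits CPU-bound computation where the workload is large enough to amortize the cost of process creation and IPC. Avoid passing large objects between processes, because serializing and deserializing them can cost more than the computation itself. For shared state, prefer multiprocessing.Value or a Manager over files or a database.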

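One trade-off that is often overlooked is debugging difficulty. Race conditions in threaded code are hard to reproduce, coroutine stack traces in asyncio are less intuitive than those of synchronous code, and debugging multiple processes means attaching to child processes. In production, observability matters more than performance: if you cannot find the cause when something breaks, no amount of concurrency speed helps.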

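5. Summary

Each of Python's three concurrency models has its place: threading for IO-bound work that must stay compatible with synchronous libraries, asyncio for high-concurrency, purely asynchronous IO, and multiprocessing for CPU-bound parallel computation. The choice comes down to the task type (IO or CPU) and the ecosystem constraints (synchronous or asynchronous libraries). In mixed scenarios, asyncio can act as the main scheduler while CPU tasks are delegated to a process pool through run_in_executor. There is no silver bullet in concurrent programming; understanding the scheduling mechanisms is what makes the right trade-off possible.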
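A minimal sketch of that hybrid arrangement follows, under stated assumptions: `cpu_task`, `io_task`, `run_mixed`, and `main` are illustrative names, the IO work is simulated with `asyncio.sleep`, and `run_mixed` accepts any `concurrent.futures.Executor` so the pool type is the caller's choice.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor


def cpu_task(n: int) -> int:
    """Illustrative CPU-bound work: sum of squares below n.

    Defined at module level so a process pool can pickle a reference to it.
    """
    return sum(i * i for i in range(n))


async def io_task(n: int) -> int:
    """Illustrative IO-bound work, simulated with a short async sleep."""
    await asyncio.sleep(0.01)
    return n


async def run_mixed(numbers: list[int], executor: Executor) -> tuple[list[int], list[int]]:
    """Run CPU work on `executor` and IO work on the event loop at the same time."""
    loop = asyncio.get_running_loop()
    # run_in_executor returns awaitable futures; the loop stays free meanwhile.
    cpu_futures = [loop.run_in_executor(executor, cpu_task, n) for n in numbers]
    io_coros = [io_task(n) for n in numbers]
    cpu_results, io_results = await asyncio.gather(
        asyncio.gather(*cpu_futures),
        asyncio.gather(*io_coros),
    )
    return list(cpu_results), list(io_results)


async def main() -> None:
    """Entry point that uses a process pool for the CPU-bound part."""
    with ProcessPoolExecutor(max_workers=2) as pool:
        cpu, io = await run_mixed([10, 100, 1000], pool)
        print(cpu, io)
```

To run it, call `asyncio.run(main())` from under an `if __name__ == "__main__":` guard, since platforms that spawn worker processes re-import the main module. Passing a thread pool instead of a process pool keeps the same code shape but gives up CPU parallelism.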