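nikkie-ftnext's diary

Sharing event reports and reading notes

Handling responses as a stream with HTTPX (using OpenAI's Chat completions as the example)

Introduction

Shanimas (シャニマス) is hard... nikkie here.

OpenAI's Chat completions can be processed one token at a time (that is, as a stream) on the client side (the openai library).

I knew the openai library uses HTTPX [1], so I practiced streaming with HTTPX on its own.

Note: if you simply want to use the OpenAI API, I think you are better off with the openai library, which wraps all of this conveniently, rather than with raw HTTPX.
This post reinvents the wheel in order to deepen my understanding.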

OpenAI Chat completions: streaming with curl

This is the example from the API reference (Streaming + curl).
https://platform.openai.com/docs/api-reference/chat/create

curl -N https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."
      }
    ],
    "temperature": 0,
    "stream": true
  }'

Here is an excerpt of the output (the full output is shown in the HTTPX section).

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}

(omitted)


data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

data: [DONE]

curl's -N option is short for --no-buffer:

In normal work situations, curl uses a standard buffered output stream that has the effect that it outputs the data in chunks, not necessarily exactly when the data arrives.
Using this option disables that buffering.

I specified it because I wanted the response printed without buffering (hoping to see it actually arrive as a stream).

My environment:

% curl -V
curl 8.6.0 (x86_64-apple-darwin23.0) libcurl/8.6.0 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.61.0
Release-Date: 2024-01-31
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz MultiSSL NTLM NTLM_WB SPNEGO SSL threadsafe UnixSockets

Streaming with HTTPX

Now I translate the curl command above into HTTPX.
The HTTPX documentation is here:
https://www.python-httpx.org/quickstart/#streaming-responses

I used r.iter_lines(), which the documentation introduces as "stream the text, on a line-by-line basis" [2].
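A minimal sketch of what such a script might look like, assuming the same request body as the curl example and an API key in the OPENAI_API_KEY environment variable. The names build_payload, build_headers, and stream_lines are made up for illustration, and printing the index from enumerate is a guess based on the numbered output shown below.

```python
import os

API_URL = "https://api.openai.com/v1/chat/completions"


def build_payload() -> dict:
    """Request body mirroring the curl example (hypothetical helper)."""
    return {
        "model": "gpt-4o-mini",
        "messages": [
            {
                "role": "user",
                "content": (
                    "Count to 10, with a comma between each number "
                    "and no newlines. E.g., 1, 2, 3, ..."
                ),
            }
        ],
        "temperature": 0,
        "stream": True,
    }


def build_headers(api_key: str) -> dict:
    """Authorization header; json= below adds Content-Type for us."""
    return {"Authorization": f"Bearer {api_key}"}


def stream_lines(api_key: str) -> None:
    # Imported here so the helpers above work even without httpx installed.
    import httpx

    with httpx.stream(
        "POST",
        API_URL,
        headers=build_headers(api_key),
        json=build_payload(),
        timeout=30.0,
    ) as response:
        response.raise_for_status()
        # iter_lines() yields the body one text line at a time as it arrives.
        for i, line in enumerate(response.iter_lines()):
            print(i, line)


if __name__ == "__main__":
    if "OPENAI_API_KEY" in os.environ:
        stream_lines(os.environ["OPENAI_API_KEY"])
    else:
        print("Set OPENAI_API_KEY to send the request.")
```

Passing the body with json= lets HTTPX serialize it and set the Content-Type header, so only the Authorization header has to be written by hand.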

Environment

  • uv 0.4.27
  • Python 3.12.6
  • HTTPX 0.27.2
% uv run httpx_stream.py
0 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}
1 
2 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}
3 
4 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
5 
6 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
7 
8 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"2"},"logprobs":null,"finish_reason":null}]}
9 
10 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
11 
12 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
13 
14 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"3"},"logprobs":null,"finish_reason":null}]}
15 
16 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
17 
18 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
19 
20 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"4"},"logprobs":null,"finish_reason":null}]}
21 
22 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
23 
24 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
25 
26 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"5"},"logprobs":null,"finish_reason":null}]}
27 
28 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
29 
30 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
31 
32 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"6"},"logprobs":null,"finish_reason":null}]}
33 
34 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
35 
36 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
37 
38 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"7"},"logprobs":null,"finish_reason":null}]}
39 
40 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
41 
42 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
43 
44 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"8"},"logprobs":null,"finish_reason":null}]}
45 
46 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
47 
48 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
49 
50 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"9"},"logprobs":null,"finish_reason":null}]}
51 
52 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
53 
54 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
55 
56 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}
57 
58 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
59 
60 data: [DONE]
61 

Streaming with HTTPX, async edition

I practice with async as well.
https://www.python-httpx.org/async/#streaming-responses
Here r.aiter_lines() takes the place of r.iter_lines().
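A rough sketch of the async variant, under the same assumptions as the curl example (same request body, API key in OPENAI_API_KEY). PAYLOAD and stream_lines are illustrative names, not taken from any library.

```python
import asyncio
import os

API_URL = "https://api.openai.com/v1/chat/completions"

# Request body mirroring the curl example.
PAYLOAD = {
    "model": "gpt-4o-mini",
    "messages": [
        {
            "role": "user",
            "content": (
                "Count to 10, with a comma between each number "
                "and no newlines. E.g., 1, 2, 3, ..."
            ),
        }
    ],
    "temperature": 0,
    "stream": True,
}


async def stream_lines(api_key: str) -> None:
    # Imported here so the module can be loaded without httpx installed.
    import httpx

    headers = {"Authorization": f"Bearer {api_key}"}
    async with httpx.AsyncClient(timeout=30.0) as client:
        async with client.stream(
            "POST", API_URL, headers=headers, json=PAYLOAD
        ) as response:
            response.raise_for_status()
            # aiter_lines() is the async counterpart of iter_lines().
            async for line in response.aiter_lines():
                print(line)


if __name__ == "__main__":
    if "OPENAI_API_KEY" in os.environ:
        asyncio.run(stream_lines(os.environ["OPENAI_API_KEY"]))
    else:
        print("Set OPENAI_API_KEY to send the request.")
```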

Output (excerpt)

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}

(omitted)

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

data: [DONE]

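Wrapping up

Streaming with HTTPX worked!

  • For synchronous code, use with httpx.stream() [3]
    • Call iter_lines() on the response and loop over it with for
  • For async code, initialize an AsyncClient and use async with client.stream()
    • The loop becomes something like async for chunk in response.aiter_lines():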

With the OpenAI API, you also need to handle the "data: " prefix on each line.
I assume the openai library takes care of that for you [4].
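As a rough illustration of that prefix handling, here is a small sketch that strips "data: ", stops at [DONE], and joins the delta contents. This is not how the openai library does it; extract_text is a made-up name, and the sample lines are trimmed down to the fields the function reads.

```python
import json
from typing import Iterable

DATA_PREFIX = "data: "
DONE_SENTINEL = "[DONE]"


def extract_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content from Chat completions stream lines."""
    parts = []
    for line in lines:
        if not line.startswith(DATA_PREFIX):
            continue  # skip blank separator lines and non-data lines
        data = line[len(DATA_PREFIX):]
        if data == DONE_SENTINEL:
            break  # end-of-stream marker, not JSON
        chunk = json.loads(data)
        for choice in chunk["choices"]:
            # The first chunk has content "" and the last has an empty delta.
            content = choice["delta"].get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"1"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":","},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
    "",
]
print(extract_text(sample))  # 1,
```

The function accepts any iterable of strings, so the lines yielded by response.iter_lines() could be passed to it directly.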

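I suspect the blank lines are there because the response is sent as server-sent events (homework for later).

"Each notification is sent as a block of text terminated by a pair of newlines." (Using server-sent events - Web APIs | MDN)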
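To make that framing concrete, here is a small sketch that groups lines into events, treating each blank line as the end of one event. iter_events is a hypothetical helper, and it assumes the input has already been split into lines, as iter_lines() does.

```python
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[list[str]]:
    """Yield the lines of each event; a blank line closes the event."""
    event: list[str] = []
    for line in lines:
        if line == "":
            if event:
                yield event
                event = []
        else:
            event.append(line)
    # Lines left over without a closing blank line are an incomplete
    # event, so they are dropped here rather than yielded.


lines = ["data: a", "", "data: b", "", "data: [DONE]", ""]
for event in iter_events(lines):
    print(event)
```

Each chunk in the output above is a one-line event, which is why every data: line is followed by exactly one empty line.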


  1. My favorite HTTP client.
  2. HTTPX advertises a requests-compatible API, but when it comes to streaming it seems to have its own opinion: the interface deliberately departs from requests compatibility. ref: Requests Compatibility - HTTPX
  3. This calls the stream() method of the non-async Client. ref: https://github.com/encode/httpx/blob/0.27.2/httpx/_api.py#L163-L184
  4. There is something called SSEDecoder. ref: https://github.com/openai/openai-python/blob/v1.55.0/src/openai/_streaming.py#L346