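Introduction
Shanimas is hard... nikkie here.
OpenAI's Chat completions can be consumed one token at a time (that is, streamed) on the client side (the openai library).
I knew that the openai library uses HTTPX[1], so I practiced streaming with HTTPX on its own.
Note: if you are actually going to use the OpenAI API, I think you are better off with the openai library, which wraps all of this in an easy-to-use way, than with HTTPX directly.
This article is a reinvention of the wheel, done to deepen my understanding.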
OpenAI Chat completions: streaming with curl
This is the example from the API reference (Streaming + curl).
https://platform.openai.com/docs/api-reference/chat/create
curl -N https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."
      }
    ],
    "temperature": 0,
    "stream": true
  }'
Here is an excerpt of the output (the full output is shown in the HTTPX section).
data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
(omitted)
data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
data: [DONE]
curl's -N option is --no-buffer:
In normal work situations, curl uses a standard buffered output stream that has the effect that it outputs the data in chunks, not necessarily exactly when the data arrives.
Using this option disables that buffering.
I specified it because I wanted the response printed without buffering (I was hoping to see it actually arrive as a stream).
My environment:
% curl -V
curl 8.6.0 (x86_64-apple-darwin23.0) libcurl/8.6.0 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.61.0
Release-Date: 2024-01-31
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz MultiSSL NTLM NTLM_WB SPNEGO SSL threadsafe UnixSockets
Streaming with HTTPX
Let's translate the curl command above into HTTPX.
The HTTPX documentation is here:
https://www.python-httpx.org/quickstart/#streaming-responses
I used r.iter_lines(), described there as "stream the text, on a line-by-line basis"[2].
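As a rough sketch of what a script like httpx_stream.py could look like, the curl request can be expressed with httpx.stream() and r.iter_lines(). The helper name build_request, the 30-second timeout, and the enumerate() index in front of each line are illustrative choices rather than anything prescribed by HTTPX or OpenAI, and the request is only sent when OPENAI_API_KEY is set.

```python
import os

URL = "https://api.openai.com/v1/chat/completions"


def build_request(api_key: str) -> tuple[dict, dict]:
    """Return (headers, JSON body) mirroring the curl example (hypothetical helper)."""
    headers = {"Authorization": f"Bearer {api_key}"}
    body = {
        "model": "gpt-4o-mini",
        "messages": [
            {
                "role": "user",
                "content": "Count to 10, with a comma between each number "
                "and no newlines. E.g., 1, 2, 3, ...",
            }
        ],
        "temperature": 0,
        "stream": True,
    }
    return headers, body


def main(api_key: str) -> None:
    # Third-party import kept local so build_request() is usable without httpx.
    import httpx

    headers, body = build_request(api_key)
    # json= serializes the body and sets Content-Type: application/json.
    # The timeout value is an assumption, not a recommendation.
    with httpx.stream("POST", URL, headers=headers, json=body, timeout=30.0) as r:
        r.raise_for_status()
        # iter_lines() yields the response text one line at a time as it arrives.
        for i, line in enumerate(r.iter_lines()):
            print(i, line)


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        main(key)
    else:
        print("Set OPENAI_API_KEY to send the request.")
```

Printing the index next to each line makes empty lines in the stream easy to spot.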
Environment
- uv 0.4.27
- Python 3.12.6
- HTTPX 0.27.2
% uv run httpx_stream.py
0 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}
1
2 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}
3
4 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
5
6 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
7
8 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"2"},"logprobs":null,"finish_reason":null}]}
9
10 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
11
12 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
13
14 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"3"},"logprobs":null,"finish_reason":null}]}
15
16 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
17
18 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
19
20 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"4"},"logprobs":null,"finish_reason":null}]}
21
22 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
23
24 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
25
26 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"5"},"logprobs":null,"finish_reason":null}]}
27
28 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
29
30 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
31
32 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"6"},"logprobs":null,"finish_reason":null}]}
33
34 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
35
36 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
37
38 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"7"},"logprobs":null,"finish_reason":null}]}
39
40 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
41
42 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
43
44 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"8"},"logprobs":null,"finish_reason":null}]}
45
46 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
47
48 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
49
50 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"9"},"logprobs":null,"finish_reason":null}]}
51
52 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
53
54 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
55
56 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}
57
58 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
59
60 data: [DONE]
61
Streaming with HTTPX: the async version
Let's practice the async version as well.
https://www.python-httpx.org/async/#streaming-responses
Instead of r.iter_lines(), you use r.aiter_lines().
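A minimal async sketch, assuming the same request as the curl example: an AsyncClient is opened, client.stream() is entered with async with, and the lines are consumed with async for. The helper name build_body and the timeout value are hypothetical, and nothing is sent unless OPENAI_API_KEY is set.

```python
import asyncio
import os

URL = "https://api.openai.com/v1/chat/completions"


def build_body() -> dict:
    """JSON body mirroring the curl example (hypothetical helper)."""
    prompt = (
        "Count to 10, with a comma between each number "
        "and no newlines. E.g., 1, 2, 3, ..."
    )
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
        "stream": True,
    }


async def main(api_key: str) -> None:
    # Third-party import kept local so build_body() is usable without httpx.
    import httpx

    headers = {"Authorization": f"Bearer {api_key}"}
    async with httpx.AsyncClient(timeout=30.0) as client:
        async with client.stream(
            "POST", URL, headers=headers, json=build_body()
        ) as response:
            response.raise_for_status()
            # aiter_lines() is the async counterpart of iter_lines().
            async for chunk in response.aiter_lines():
                print(chunk)


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        asyncio.run(main(key))
    else:
        print("Set OPENAI_API_KEY to send the request.")
```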
Output (partial excerpt)
data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
(omitted)
data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}
data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
data: [DONE]
Conclusion
I got streaming working with HTTPX!
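- For synchronous processing, use with httpx.stream()[3]
  - Call iter_lines() on the response and loop over it with for
- For concurrent (async) processing, initialize an AsyncClient and use async with client.stream()
  - The loop then looks like async for chunk in response.aiter_lines():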
With the OpenAI API, you also need to handle the data: prefix on each line.
I assume the openai library takes care of that for you[4].
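For illustration, stripping the data: prefix by hand could look like the sketch below. It assumes lines shaped like the output shown earlier (a JSON chunk per data: line, ending with data: [DONE]); the function name iter_content is hypothetical, and this is not a description of how the openai library does it.

```python
import json
from collections.abc import Iterable, Iterator

PREFIX = "data: "


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the non-empty delta.content strings from streamed lines."""
    for line in lines:
        if not line.startswith(PREFIX):
            continue  # skip blank separator lines and anything unexpected
        payload = line.removeprefix(PREFIX)
        if payload == "[DONE]":
            break  # end-of-stream marker, not JSON
        chunk = json.loads(payload)
        for choice in chunk["choices"]:
            content = choice["delta"].get("content")
            if content:
                yield content


# Trimmed-down sample lines in the same shape as the real output.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"1"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":","},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
    "",
]

print("".join(iter_content(sample)))  # prints: 1,
```

The same generator should accept r.iter_lines() directly, since it only needs an iterable of strings.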
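I suspect the blank lines are there because the response is sent as server-sent events (homework for later).
Each notification is sent as a block of text terminated by a pair of newlines. (Using server-sent events - Web APIs | MDN)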
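To make the blank-line behavior concrete, here is a small sketch of how lines could be grouped into events, with an empty line closing each event. It is a deliberate simplification that only handles data: fields and ignores other fields and comments; the name iter_events is hypothetical, and this is not how openai's SSEDecoder is written.

```python
from collections.abc import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[str]:
    """Group lines into events; an empty line dispatches the buffered data."""
    data: list[str] = []
    for line in lines:
        if line == "":
            # A blank line marks the end of the current event.
            if data:
                yield "\n".join(data)
                data = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data.append(value.removeprefix(" "))
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.
    # Data not followed by a blank line is dropped as an incomplete event.


lines = ["data: a", "", "data: b", "data: c", "", "data: [DONE]", ""]
print(list(iter_events(lines)))  # prints: ['a', 'b\nc', '[DONE]']
```

Read this way, every second line in the iter_lines() output being empty would simply be the terminator of each single-line event.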
[1] My favorite HTTP client.
[2] HTTPX advertises a requests-compatible API, but it seems to have opinions of its own when it comes to streaming; the interface deliberately departs from requests compatibility. ref: Requests Compatibility - HTTPX
[3] This calls the stream() method of the non-async Client. ref: https://github.com/encode/httpx/blob/0.27.2/httpx/_api.py#L163-L184
[4] There is something called SSEDecoder in there. ref: https://github.com/openai/openai-python/blob/v1.55.0/src/openai/_streaming.py#L346