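Introduction
シャニマス (Shiny Colors) is hard... nikkie here.
With OpenAI's Chat completions, the client (the openai library) can process the response one token at a time, that is, as a stream.
I knew that the openai library uses HTTPX [1], so I practiced streaming with HTTPX on its own.
Note: if you are going to use the OpenAI API, I think you are better off with the openai library, which wraps all of this in an easy-to-use way, rather than HTTPX directly.
This article is a reinvention of the wheel for the sake of deeper understanding.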
OpenAI Chat completions: streaming with curl
This is the example from the API reference (Streaming + curl).
https://platform.openai.com/docs/api-reference/chat/create
curl -N https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."
      }
    ],
    "temperature": 0,
    "stream": true
  }'
Here is an excerpt of the output (the full output is shown in the HTTPX section).

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}

(omitted)

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmRzod8CKPO337kranC5XEVrw2cT","object":"chat.completion.chunk","created":1732376103,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_0705bf87c0","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

data: [DONE]
curl's -N option is --no-buffer.
In normal work situations, curl uses a standard buffered output stream that has the effect that it outputs the data in chunks, not necessarily exactly when the data arrives.
Using this option disables that buffering.
I specified it because I wanted the response printed without buffering (I figured I might be able to see it arriving as a stream).
Here is the environment I ran it in.
% curl -V
curl 8.6.0 (x86_64-apple-darwin23.0) libcurl/8.6.0 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.61.0
Release-Date: 2024-01-31
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz MultiSSL NTLM NTLM_WB SPNEGO SSL threadsafe UnixSockets
Streaming with HTTPX
Next, I translate the curl command above into HTTPX.
The HTTPX documentation is here:
https://www.python-httpx.org/quickstart/#streaming-responses
I used r.iter_lines(), which the docs describe as "stream the text, on a line-by-line basis" [2].
Environment
- uv 0.4.27
- Python 3.12.6
- HTTPX 0.27.2
% uv run httpx_stream.py
0 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}
1
2 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}
3
4 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
5
6 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
7
8 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"2"},"logprobs":null,"finish_reason":null}]}
9
10 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
11
12 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
13
14 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"3"},"logprobs":null,"finish_reason":null}]}
15
16 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
17
18 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
19
20 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"4"},"logprobs":null,"finish_reason":null}]}
21
22 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
23
24 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
25
26 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"5"},"logprobs":null,"finish_reason":null}]}
27
28 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
29
30 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
31
32 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"6"},"logprobs":null,"finish_reason":null}]}
33
34 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
35
36 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
37
38 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"7"},"logprobs":null,"finish_reason":null}]}
39
40 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
41
42 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
43
44 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"8"},"logprobs":null,"finish_reason":null}]}
45
46 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
47
48 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
49
50 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"9"},"logprobs":null,"finish_reason":null}]}
51
52 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}
53
54 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":" "},"logprobs":null,"finish_reason":null}]}
55
56 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}
57
58 data: {"id":"chatcmpl-AWmbogNqdsbF6W5f74CtUtJQr6sWd","object":"chat.completion.chunk","created":1732376712,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
59
60 data: [DONE]
61
Streaming with HTTPX: async edition
I practiced with async as well.
https://www.python-httpx.org/async/#streaming-responses
Here r.aiter_lines() takes the place of r.iter_lines().
Output (excerpt)

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"1"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":","},"logprobs":null,"finish_reason":null}]}

(omitted)

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{"content":"10"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-AWmft72XjrHSzcneJzSxRxEBT0XoK","object":"chat.completion.chunk","created":1732376965,"model":"gpt-4o-mini-2024-07-18","system_fingerprint":"fp_3de1288069","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

data: [DONE]
Conclusion
I was able to stream with HTTPX!
- For synchronous code, use with httpx.stream() [3]
  - Call iter_lines() on the response and loop over it with for
- For concurrent (async) code, initialize an AsyncClient and use async with client.stream()
  - The loop then looks like async for chunk in response.aiter_lines():
With the OpenAI API, you also have to handle the data: prefix on each line.
I assume the openai library takes care of this for you [4].
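To make the prefix handling concrete, here is a minimal sketch using only the standard library. It is not how the openai library does it; `extract_content` and `SAMPLE_LINES` are hypothetical names, and the sample lines are abbreviated versions of the chunks shown in the output (fields such as id and model are dropped for brevity).

```python
import json
from typing import Optional

DATA_PREFIX = "data:"
DONE_SENTINEL = "[DONE]"


def extract_content(line: str) -> Optional[str]:
    """Return the text carried by one streamed line, or None if it has none."""
    if not line.startswith(DATA_PREFIX):
        return None  # blank separator lines and anything unexpected
    payload = line[len(DATA_PREFIX):].strip()
    if payload == DONE_SENTINEL:
        return None  # end-of-stream marker, not JSON
    chunk = json.loads(payload)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    # The final chunk has an empty delta, so "content" may be missing.
    return choices[0].get("delta", {}).get("content")


# Abbreviated sample lines in the same shape as the real output.
SAMPLE_LINES = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"1"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":","},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" "},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"2"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
    "",
]

pieces = [extract_content(line) for line in SAMPLE_LINES]
text = "".join(piece for piece in pieces if piece is not None)
print(text)  # prints: 1, 2
```

The same function could be applied to each line yielded by iter_lines() or aiter_lines() to rebuild the assistant's message piece by piece.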
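I suspect the blank lines are there because the response is sent as server-sent events (homework for later).
"Each notification is sent as a block of text terminated by a pair of newlines." (Using server-sent events - Web APIs | MDN)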
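If the blank lines really are server-sent event separators, then grouping lines into events could be sketched roughly as below. This is a simplified reading of the SSE format (only the data field, comment lines, and the blank-line terminator are handled), not the SSEDecoder from the openai library; `iter_sse_data` is a hypothetical name.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Group already-split lines into events and yield each event's data."""
    data_lines: List[str] = []
    for line in lines:
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield "\n".join(data_lines)
                data_lines = []
            continue
        if line.startswith(":"):
            continue  # comment line, ignored
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data_lines.append(value)
    # Anything left without a terminating blank line is an incomplete
    # event and is dropped here.


events = list(
    iter_sse_data(
        ["data: first", "", ": keep-alive", "data: a", "data: b", "", "data: [DONE]", ""]
    )
)
print(events)  # prints: ['first', 'a\nb', '[DONE]']
```

Under this reading, each odd-numbered empty line in the numbered HTTPX output would be the terminator of the data: line just before it.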
[1] My favorite HTTP client.
[2] HTTPX advertises a requests-compatible API, but it seems to have opinions of its own when it comes to streaming: this interface deliberately departs from requests compatibility. ref: Requests Compatibility - HTTPX
[3] It calls the stream() method of the non-async Client. ref: https://github.com/encode/httpx/blob/0.27.2/httpx/_api.py#L163-L184
[4] There is something called SSEDecoder. ref: https://github.com/openai/openai-python/blob/v1.55.0/src/openai/_streaming.py#L346