Introduction
Good evening, it's learning-until-morning time! This is nikkie.
This is a practice run at the mechanism behind ChatGPT's smoothly streaming output.
Table of contents
- Introduction
- Table of contents
- OpenAI Cookbook "How to stream completions"
- Environment
- Without stream specified
- With stream specified
- Summary: OpenAI Chat Completions with and without stream specified
- In closing
- Bonus: async version
OpenAI Cookbook "How to stream completions"
LLMs (large language models) such as OpenAI's GPT generate the text that follows a prompt one token at a time1.
When you call the OpenAI API in the usual way, the response comes back only after all tokens have been generated2.
Since nothing is returned until every token is ready, there is a wait3.
On the other hand, when you use ChatGPT from the Web UI, the text streams onto the screen as it is written.
The feature presumably used there is what you might call a stream call:
it makes the API return the generated tokens one token at a time.
The Cookbook above shows how to have responses generated with stream.
I practiced along with it.
Note that while stream lets you start showing output without making the user wait for generation to finish completely, it makes content moderation4 harder (see "Downsides").
Environment
Using inline script metadata (PEP 723)5 with uv 0.4.27
- Python 3.12.6
- openai 1.55.0
The environment variable OPENAI_API_KEY is set.
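For reference, here is a minimal sketch of what the inline script metadata block at the top of openai_stream.py could look like (the pinned versions simply mirror the environment above; the actual block in my script may differ):

# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "openai==1.55.0",
# ]
# ///

uv run reads this block and sets up an environment with the listed dependencies before executing the script.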
Without stream specified
from openai import OpenAI

prompt = "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
% PYTHONINSPECT=1 uv run openai_stream.py
>>> type(response)
<class 'openai.types.chat.chat_completion.ChatCompletion'>
>>> response.choices[0].message.content
'1, 2, 3, 4, 5, 6, 7, 8, 9, 10'
The generated text comes back in full.
With stream specified
from openai import OpenAI

prompt = "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."

client = OpenAI()
stream_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
    stream=True,
)

for chunk in stream_response:
    print(chunk)
    print(chunk.choices[0].delta.content)
    print("*" * 20)
% PYTHONINSPECT=1 uv run openai_stream.py
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)

********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='1', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
1
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='2', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
2
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='3', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
3
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='4', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
4
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='5', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
5
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='6', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
6
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='7', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
7
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='8', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
8
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='9', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
9
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='10', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
10
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
None
********************
Pay attention to chunk.choices[0].delta.content.
The tokens come back one at a time: an empty string (in the chunk that sets role=assistant) -> 1 -> , -> a space -> 2 -> ...
In the last chunk, content is None (and finish_reason='stop', which looks usable for controlling the loop).
When stream=True is specified, the object returned from client.chat.completions.create also changes.
>>> type(stream_response)
<class 'openai.Stream'>
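Building on the note that finish_reason looks usable for control, here is a minimal sketch (my own, not from the Cookbook) of piecing the streamed tokens back together into the full text. The None check on content and the finish_reason == "stop" check are assumptions based on the output above.

from openai import OpenAI

client = OpenAI()
stream_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ...",
        }
    ],
    temperature=0,
    stream=True,
)

collected = []
for chunk in stream_response:
    choice = chunk.choices[0]
    if choice.delta.content is not None:
        # the first chunk carries content='' and the final chunk carries content=None
        collected.append(choice.delta.content)
    if choice.finish_reason == "stop":
        # assumption based on the output above: 'stop' marks the final chunk
        break

print("".join(collected))  # expected: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10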
Summary: OpenAI Chat Completions with and without stream specified
********** stream=False response **********
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
********** stream=True response **********
1
,
 
2
,
 
3
,
 
4
,
 
5
,
 
6
,
 
7
,
 
8
,
 
9
,
 
10
None
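For reference, a minimal sketch of a comparison script that would produce output in this shape (the exact print formatting here is my guess, not necessarily the script I actually ran):

from openai import OpenAI

PROMPT = "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."
client = OpenAI()

# stream=False: one response object with the complete text
print("*" * 10, "stream=False response", "*" * 10)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": PROMPT}],
    temperature=0,
)
print(response.choices[0].message.content)

# stream=True: iterate over chunks, one token per chunk
print("*" * 10, "stream=True response", "*" * 10)
stream_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": PROMPT}],
    temperature=0,
    stream=True,
)
for chunk in stream_response:
    print(chunk.choices[0].delta.content)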
In closing
This was a practice run with stream in OpenAI's Chat Completions API.
- By default, stream=False
  - response.choices[0].message.content holds the generated text (the complete version)
- Specify stream=True
  - Iterate over the returned response with for (i.e., it is an Iterable)
  - chunk.choices[0].delta.content holds one generated token at a time
  - It looks like role and finish_reason can be used for control (homework for later)
As for how this works, it is touched on in this Cookbook6, and "server-sent events" seems to be the keyword.
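To peek at those server-sent events directly, here is a minimal sketch (not from the Cookbook) that calls the Chat Completions endpoint with httpx and reads the raw SSE lines. The "data: " prefix and the "[DONE]" sentinel reflect my understanding of the API, so treat this as an assumption to verify.

import json
import os

import httpx

headers = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "Content-Type": "application/json",
}
body = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Count to 3, with a comma between each number."}],
    "temperature": 0,
    "stream": True,
}

# Stream the HTTP response and handle each raw server-sent event line
with httpx.stream(
    "POST", "https://api.openai.com/v1/chat/completions", headers=headers, json=body, timeout=30
) as response:
    for line in response.iter_lines():
        if not line.startswith("data: "):
            continue  # payload lines start with "data: "; skip blank keep-alive lines
        data = line.removeprefix("data: ")
        if data == "[DONE]":
            break  # the stream ends with a sentinel event
        chunk = json.loads(data)
        print(chunk["choices"][0]["delta"].get("content"))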
Bonus: async version
The output matched the non-async case.
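For reference, a minimal sketch of what the async version can look like with AsyncOpenAI (my reconstruction, not necessarily the exact script I ran):

import asyncio

from openai import AsyncOpenAI

prompt = "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."


async def main() -> None:
    client = AsyncOpenAI()
    stream_response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        stream=True,
    )
    # the async client returns an async iterable, so iterate with `async for`
    async for chunk in stream_response:
        print(chunk.choices[0].delta.content)


asyncio.run(main())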
1. The points that follow assume greedy decoding. There are other methods, such as beam search, but I chose greedy decoding for clarity. ↩
2. Cookbook: "By default, when you request a completion from the OpenAI, the entire completion is generated before being sent back in a single response." ↩
3. Cookbook: "If you're generating long completions, waiting for the response can take many seconds." ↩
4. OpenAI provides a Moderation API free of charge. ↩
5. A bit of self-promotion here. ↩
6. "This will return an object that streams back the response as data-only server-sent events." ↩