nikkie-ftnextの日記 (nikkie-ftnext's diary)

Sharing event reports and reading notes

Using OpenAI's Chat completions API with streaming (passing stream=True in the Python client)

Introduction

Tonight we're learning until morning! I'm nikkie.

This is a practice run to get closer to how ChatGPT makes its output flow onto the screen.

OpenAI Cookbook: "How to stream completions"

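LLMs (large language models), including OpenAI's GPT, generate the text that follows a prompt one token at a time [1].

When you call the OpenAI API in the ordinary way, the response comes back only after all tokens have been generated [2].
Because nothing is returned until every token is ready, there is a wait [3].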

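By contrast, when you use ChatGPT from the web UI, the text is displayed as it flows in.
What is probably being used there is a feature that could be called a streaming call.
It lets you have the generated tokens returned one at a time.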
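To make the difference concrete, here is a toy sketch that involves no API at all. The token list and both function names are made up for illustration; the only point is that the non-streaming style returns once at the end, while the streaming style hands over each piece as soon as it exists.

```python
from typing import Iterator

# Hypothetical stand-in for a model's output: a fixed list of tokens.
TOKENS = ["1", ",", " ", "2", ",", " ", "3"]


def generate_all() -> str:
    """Non-streaming style: build the whole text, then return it once."""
    return "".join(TOKENS)


def generate_stream() -> Iterator[str]:
    """Streaming style: hand each token to the caller as soon as it exists."""
    for token in TOKENS:
        yield token


full_text = generate_all()
streamed_pieces = list(generate_stream())
print(full_text)
print(streamed_pieces)
```

Joining the streamed pieces gives the same text as the single return value; only the timing of delivery differs.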

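The Cookbook above shows how to generate a response with stream.
I practiced by following it.

Note that specifying stream lets you start showing output without making the user wait until generation has completely finished, but it reportedly makes content moderation [4] harder (see "Downsides").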

Environment

I use uv 0.4.27 with inline script metadata (PEP 723) [5].

The environment variable OPENAI_API_KEY is set.
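For reference, a PEP 723 header for a script like this might look as follows. This is a sketch, not the header actually used in this post: the requires-python value is an assumption, and has_api_key is a hypothetical helper added only to show a check that the environment variable is present.

```python
# /// script
# requires-python = ">=3.11"  # assumption: the post does not state a Python version
# dependencies = [
#     "openai",
# ]
# ///
import os
from collections.abc import Mapping


def has_api_key(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when OPENAI_API_KEY is set to a non-empty value."""
    return bool(environ.get("OPENAI_API_KEY"))
```

With a header like this at the top of openai_stream.py, uv run reads the dependency list from the comment block and prepares the environment before running the script.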

Without stream

from openai import OpenAI

prompt = "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
% PYTHONINSPECT=1 uv run openai_stream.py
>>> type(response)
<class 'openai.types.chat.chat_completion.ChatCompletion'>
>>> response.choices[0].message.content
'1, 2, 3, 4, 5, 6, 7, 8, 9, 10'

The generated text comes back in one piece.

With stream

from openai import OpenAI

prompt = "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."

client = OpenAI()
stream_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
    stream=True,
)
for chunk in stream_response:
    print(chunk)
    print(chunk.choices[0].delta.content)
    print("*" * 20)
% PYTHONINSPECT=1 uv run openai_stream.py
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)

********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='1', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
1
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='2', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
2
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='3', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
3
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='4', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
4
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='5', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
5
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='6', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
6
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='7', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
7
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='8', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
8
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='9', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
9
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
,
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=' ', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
 
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content='10', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
10
********************
ChatCompletionChunk(id='chatcmpl-AWjPGQ6exHoP0NWXH33IkBoOCwPeo', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1732364402, model='gpt-4o-mini-2024-07-18', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_0705bf87c0', usage=None)
None
********************

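Look at chunk.choices[0].delta.content.
Empty string (the chunk that carries role='assistant') -> 1 -> , -> a space -> 2 -> ... and so on: the text is returned one token at a time.
In the last chunk content is None (finish_reason='stop' is set there, so that looks usable for control flow).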
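The observation above can be sketched as a small helper that joins the pieces and records finish_reason. To keep it runnable offline, the chunks here are hand-built stand-ins (fake_chunk is hypothetical) that only mimic the attribute path chunk.choices[0].delta.content seen in the output; collect_text is likewise a name introduced for this sketch.

```python
from types import SimpleNamespace


def fake_chunk(content, finish_reason=None):
    """Build a stand-in with the same attribute path as the chunks printed above."""
    delta = SimpleNamespace(content=content)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])


def collect_text(chunks):
    """Join delta.content pieces; return the text and the last finish_reason seen."""
    pieces = []
    finish_reason = None
    for chunk in chunks:
        choice = chunk.choices[0]
        if choice.delta.content is not None:  # the final chunk carries None
            pieces.append(choice.delta.content)
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason
    return "".join(pieces), finish_reason


chunks = [
    fake_chunk(""), fake_chunk("1"), fake_chunk(","), fake_chunk(" "),
    fake_chunk("2"), fake_chunk(None, finish_reason="stop"),
]
text, reason = collect_text(chunks)
print(repr(text), reason)
```

Passing the real stream_response to collect_text should work the same way, since it only relies on the attributes visible in the printed chunks.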

When stream=True is specified, the object returned from client.chat.completions.create is of a different type.

>>> type(stream_response)
<class 'openai.Stream'>

Summary: comparing Chat completions with and without stream

********** stream=False response **********
1, 2, 3, 4, 5, 6, 7, 8, 9, 10

********** stream=True response **********

1
,
 
2
,
 
3
,
 
4
,
 
5
,
 
6
,
 
7
,
 
8
,
 
9
,
 
10
None
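The script that printed this comparison is not shown here, so the following is only a guess at its shape. compare is a hypothetical function that takes the client as an argument, and FakeCompletions is an offline stand-in so the sketch runs without an API key; with the real library you would pass OpenAI() instead.

```python
from types import SimpleNamespace

PROMPT = "Count to 10, with a comma between each number and no newlines. E.g., 1, 2, 3, ..."


def compare(client, prompt=PROMPT, model="gpt-4o-mini"):
    """Return the lines to print for the stream=False and stream=True calls."""
    params = dict(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    lines = ["*" * 10 + " stream=False response " + "*" * 10]
    response = client.chat.completions.create(**params)
    lines.append(response.choices[0].message.content)
    lines.append("")
    lines.append("*" * 10 + " stream=True response " + "*" * 10)
    for chunk in client.chat.completions.create(**params, stream=True):
        # str() so that the final None is shown as "None", as in the output above
        lines.append(str(chunk.choices[0].delta.content))
    return lines


class FakeCompletions:
    """Hypothetical offline stand-in for client.chat.completions."""

    def create(self, stream=False, **kwargs):
        if stream:
            pieces = ["", "1", ",", " ", "2", None]
            return [
                SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=p))])
                for p in pieces
            ]
        message = SimpleNamespace(content="1, 2")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))
output_lines = compare(fake_client)
print("\n".join(output_lines))
```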

Conclusion

That was a practice run with stream in OpenAI's Chat completions API.

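  • The default is stream=False
    • response.choices[0].message.content holds the generated text (the complete version)
  • Specify stream=True
    • Loop over the returned response with for (that is, it is iterable)
    • chunk.choices[0].delta.content holds one generated token
    • role and finish_reason look usable for control flow (homework for later)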

The Cookbook used here mentions it [6]: server-sent events seems to be the key term.
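As a rough picture of what "data-only server-sent events" means on the wire, here is a minimal parser sketch. The RAW text is a hand-written sample, not captured from the API, and is trimmed to the fields used here; iter_sse_data is a hypothetical helper. It assumes each event is a single "data:" line holding JSON and that the stream ends with a "data: [DONE]" line.

```python
import json


def iter_sse_data(raw: str):
    """Yield the decoded JSON payload of each `data:` line until `[DONE]`."""
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue  # blank separator lines and other fields are skipped
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # sentinel that closes the stream
            return
        yield json.loads(payload)


# Hand-written sample (an assumption about the shape, not a captured response).
RAW = (
    'data: {"choices": [{"delta": {"content": "1"}, "finish_reason": null}]}\n\n'
    'data: {"choices": [{"delta": {"content": ","}, "finish_reason": null}]}\n\n'
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}]}\n\n'
    'data: [DONE]\n\n'
)

events = list(iter_sse_data(RAW))
contents = [e["choices"][0]["delta"].get("content") for e in events]
print(contents)
```

The Python client does this decoding for you, which is why iterating over the stream yields ChatCompletionChunk objects rather than raw lines.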

Extra: async version

The output matched the non-async case.
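The async code itself is not reproduced here, so this is only a sketch of the consuming side. fake_async_stream is a hypothetical stand-in for the stream; with the real library the stream would presumably come from AsyncOpenAI via await client.chat.completions.create(..., stream=True), and the loop becomes async for.

```python
import asyncio
from types import SimpleNamespace


async def fake_async_stream():
    """Hypothetical stand-in for the stream returned by an async client."""
    for content in ["", "1", ",", " ", "2", None]:
        delta = SimpleNamespace(content=content)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


async def collect_contents(stream):
    """Gather delta.content from each chunk with `async for`."""
    contents = []
    async for chunk in stream:
        contents.append(chunk.choices[0].delta.content)
    return contents


async_contents = asyncio.run(collect_contents(fake_async_stream()))
print(async_contents)
```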


  1. The bullet points that follow assume greedy decoding. Other methods such as beam search exist, but I chose it for ease of understanding.
  2. Cookbook: "By default, when you request a completion from the OpenAI, the entire completion is generated before being sent back in a single response."
  3. Cookbook: "If you're generating long completions, waiting for the response can take many seconds."
  4. OpenAI provides the Moderation API free of charge.
  5. Forgive the self-promotion, but
  6. Cookbook: "This will return an object that streams back the response as data-only server-sent events."