LLM updates for the week of 2025/04/14, a personal summary: mostly primary sources from three LLM developers

Introduction

Happy birthday (day 32) to Yuriko Nanao! This is nikkie.

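It was a turbulent week for LLMs (admittedly, every week is turbulent).
I'm writing down what happened as a personal memo, mostly from primary sources.
The selection follows my own interests, so it is not comprehensive.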

OpenAI

GPT-4.1

Three variants: the base model, mini, and nano
Context window: 1M tokens

I'm quoting a figure from the article.
Measured against GPT-4o mini, intelligence ranks as 4.1 nano < 4o mini << 4.1 mini, so I expect 4.1 mini to be my everyday model (I still need to check pricing).

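GPT-4.1 is good at coding and accepts 1M tokens of input, so my impression is that I now have a little less reason to use Gemini.
That said, 4.1 outperforming 4.5 is far too confusing, and I honestly wish they would stop (though I can see it fits the pattern of built-in contradiction, like closed-model "Open"AI).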

What I still need to catch up on is prompting for 4.1, where you spell out every instruction explicitly.

o3 / o4-mini

They are releasing even more in the same week!

The demo uploads a photo of ships and asks, "I took this picture earlier today. Can you find the name of the biggest ship you can see, and where it will dock next?"
The model runs web searches and finds the answer! (I can't wait to try it.)

Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date.

For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation. pic.twitter.com/rDaqV0x0wE
— OpenAI (@OpenAI) April 16, 2025

What they emphasize is "Think with images": Think, not See (are they trying to say this goes a step beyond multimodal Gemini?).

OpenAI o3 and o4-mini are our first models to integrate uploaded images directly into their chain of thought.

That means they don’t just see an image—they think with it. https://t.co/hSJkzeuNQR
— OpenAI (@OpenAI) April 16, 2025

Points of interest in the docs (still on my to-read pile)

Codex CLI

I think of it as an open Claude Code (I haven't tried it yet).

Codex open source fund: I hope word of it reaches OSS developers!
https://openai.com/form/codex-open-source-fund/

Anthropic

Research comes to Claude

Search (using Brave Search¹) became available in March, and now Research is available as well.
I'm grateful that it has been rolled out in Japan too. I'm going to try it!

This tweet gave a clear explanation of what Research is.
Maybe Deep Research comes next.

We've added two major features to claude dot ai today that have been 10x productivity boosts for me:

- Google Docs, Calendar, and Gmail integration
- Research - our first step toward an agentic researching agent

Most of today's AI research tools fall into two extremes: instant… pic.twitter.com/WDiUWIf2ao
— Alex Albert (@alexalbert__) April 15, 2025

Google

Gemini 2.5 Flash

The 2.5 Flash announced at Cloud Next '25 is already here!

Today we’re announcing Gemini 2.5 Flash – our workhorse model optimized specifically for low latency and cost efficiency – is coming to Vertex AI.

You set the thinking budget as a number of tokens, from 0 to 24576.

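- With a thinking budget of 0, you get only the improvements over 2.0 Flash
- With a thinking budget set, you get the improvements over 2.0 Flash plus reasoning
- Example of a task that needs only a small thinking budget (low reasoning)
  - “Thank you” in Spanish
- Example of a task that needs a large thinking budget (high reasoning)
  - Write a function evaluate_cells(cells: Dict[str, str]) -> Dict[str, float] that computes the values of spreadsheet cells. (the specification continues)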
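As a note to self, here is a minimal sketch of how the thinking budget described above could be validated and attached to a request. Only the 0 to 24576 range comes from the announcement. The helper name `build_generation_config`, the nested key names `thinkingConfig` / `thinkingBudget`, and the budget of 8192 for the harder task are my assumptions for illustration, so check the official API reference before relying on them.

```python
from typing import Any, Dict

# Range quoted in this post for Gemini 2.5 Flash, in tokens.
MIN_THINKING_BUDGET = 0
MAX_THINKING_BUDGET = 24576


def build_generation_config(thinking_budget: int) -> Dict[str, Any]:
    """Return a request-config fragment that carries the thinking budget.

    The key names are assumed for illustration, not taken from the post.
    """
    if not MIN_THINKING_BUDGET <= thinking_budget <= MAX_THINKING_BUDGET:
        raise ValueError(
            f"thinking_budget must be between {MIN_THINKING_BUDGET} "
            f"and {MAX_THINKING_BUDGET}, got {thinking_budget}"
        )
    return {"thinkingConfig": {"thinkingBudget": thinking_budget}}


# Low-reasoning task (e.g. translating "Thank you"): no thinking at all.
low_reasoning_config = build_generation_config(0)

# High-reasoning task (e.g. the spreadsheet evaluator): a hypothetical,
# arbitrarily chosen larger budget.
high_reasoning_config = build_generation_config(8192)
```

The idea is simply to pick the budget per task: 0 where the 2.0 Flash-level improvements are enough, and a larger value only where reasoning is actually needed.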

Closing

Whew, what a turbulent week.
OpenAI had been quiet lately in terms of LLM performance, and it feels like they have made a comeback.

I'll keep digesting all of this by getting hands-on.

¹ I've confirmed that the search engine being used by Claude's web search feature is @brave - it's listed in a recent update to their "Trust Center" and the search results are an exact match https://t.co/h48MuVt9eg
— Simon Willison (@simonw) March 21, 2025
