Running OpenAI Codex with a locally running gpt-oss-20b model
llama.cpp
Install llama.cpp
brew install llama.cpp
Run llama-server
llama-server -hf ggml-org/gpt-oss-20b-GGUF --ctx-size 0 --jinja -ub 2048 -b 2048 -ngl 99 -fa on --port 1234
The above command is suggested for devices with less than 96 GB of RAM; see the reference for commands suited to other configurations.
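Before wiring up Codex, you can sanity-check the server's OpenAI-compatible API. This is a sketch assuming the --port 1234 value from the command above:

```shell
# List the models served by llama-server over its OpenAI-compatible API.
# Assumes the server was started with --port 1234 as shown above; prints
# a fallback message if nothing is listening yet.
curl -s http://localhost:1234/v1/models || echo "llama-server is not reachable on port 1234"
```

If the server is running, this returns a JSON object listing the loaded model.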
Codex
brew install codex
Set up the config.toml file
mkdir -p ~/.codex
vim ~/.codex/config.toml
[model_providers.lms]
name = "LM Studio"
base_url = "http://localhost:1234/v1"
[profiles.gpt-oss-20b-lms]
model_provider = "lms"
model = "gpt-oss:20b"
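Optionally, you can make this profile the default so a bare codex invocation uses it; a minimal sketch, assuming the top-level profile key in Codex's config.toml:

```toml
# Optional: set the default profile so plain `codex` picks it up.
profile = "gpt-oss-20b-lms"
```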
Run codex
codex --profile gpt-oss-20b-lms