modern agentic software engineering
OpenAI’s codex
has demonstrated something I’ve struggled to for a while: we’ve been doing AI software engineering wrong. Or at least, there’s a better way with CLIs and files as the interface. What’s old is new again.
In this post, I demonstrate the core workflow for modern agentic software engineering that can be applied to pair-programming or automation via CI/CD:
- (don’t skip) have a well-structured, well-documented code repository
- (don’t skip) have a code repository with good, fast linting and tests
- generate a
task.md
file with all relevant out-of-repo context ai "open the task.md and work on it"
- further wrap this workflow with scripts for automation when appropriate
Click for some pro tips
gh issue view NUMBER > task.md
and such…- if working interactively, supervise and iterate
- if working in CI/CD, make a PR and review (noting you can loop agentic reviews in here…)
- consolidate commands for linting and testing in a
Makefile
or similar (I likejustfile
) and instructions yourai
on using them - ideate, learn, do, iterate, automate
You can then leverage standard software development lifecycle (SDLC) best practices and GitOps for modern agentic software engineering.
a toy example#
In writing this post I wanted to preview this website locally. It wasn’t working for only this post and I was very confused (I just re-wrote this website into a new framework today, or rather yesterday…). In my terminal, side-by-side with this post open in my editor, I typed ai "for some reason modern-agentic-software-engineering.md in posts/ isn't showing up in my build, any ideas?"
and got:
Click to expand codex
session
╭──────────────────────────────────────────────────────────────╮
│ ● OpenAI Codex (research preview) v0.1.2504172351 │
╰──────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────╮
│ localhost session: 55768aecf37642c2b407726b45aab06f │
│ ↳ workdir: ~/code/dkdc-io/dkdc.dev/website │
│ ↳ model: o4-mini │
│ ↳ approval: full-auto │
╰─────────────────────────────────────────────────────────────╯
user for some reason modern-agentic-software-engineering.md in posts/ isn't showing
up in my build, any ideas?
Inspecting Hugo structure
The user is working with a Hugo-based personal website, and the file
modern-agentic-software-engineering.md in posts/ isn’t showing up in the build.
I need to figure out Hugo's content directory structure. While the standard
content directory is typically content/, it's unclear if posts/ is under
content/posts or at the root. I should examine the repository structure.
I'll look into the directories by running a command to list the structure
so we can get clarity.
command
$ ls -R .
command.stdout (code: 0, duration: 0s)
archetypes codex.md content data ... (381 more lines)
Inspecting markdown file
I see that the markdown file is located in the content/posts directory,
specifically at content/posts/modern-agentic-software-engineering.md. The user
mentioned it isn't showing up in the build. This suggests I should inspect
the markdown file for potential issues, like front matter problems. I'll go
ahead and check the file's content to identify any errors that might be
preventing it from being included in the build process. Let's see what's
going on!
command
$ sed -n '1,200p' content/posts/modern-agentic-software-engineering.md
command.stdout (code: 0, duration: 0s)
+++ title = "modern agentic pair programming" date = "2025-04-21" author =
"Cody" ... (15 more lines)
Investigating Hugo site issues
I see that the content has proper front matter, and while there's no
index.md file, that's okay since Hugo should handle .md files within posts.
I'm wondering if taxonomy settings in hugo.toml are misconfigured or if the
section name is incorrect. The posts should be listed under public/posts,
but I can't find modern-agentic-software-engineering there. I'll need to check
the file's presence and ensure everything is set correctly for proper
display.
Analyzing Hugo's posting issue
It looks like Hugo skipped generating a folder for the post. I'm wondering
if it’s due to the file naming or perhaps the post being future dated. The
file is located at content/posts/modern-agentic-software-engineering.md, which
should create an index.html in public/posts, but it doesn't. Since the date
is set to 2025-04-21 and today is 2025-04-20, Hugo defaults to not
publishing future content. I need to check the date settings for other
posts to confirm this.
command
$ grep -R "date =" -n content/posts/context-length-pain.md
command.stdout (code: 0, duration: 0s)
content/posts/context-length-pain.md:3:date = "2023-08-13"
command
$ sed -n '1,200p' justfile
command.stdout (code: 0, duration: 0s)
# Justfile
# load environment variables set dotenv-load ... (40 more lines)
command
$ sed -n '1,200p' hugo.toml
command.stdout (code: 0, duration: 0s)
baseURL = 'https://dkdc.dev' languageCode = 'en-us' title = 'dkdc.dev'
theme = ["terminal"] ... (22 more lines)
command
$ sed -n '200,400p' hugo.toml
command.stdout (code: 0, duration: 0s)
Explaining build settings
I see the user is having an issue with their Hugo site where a post isn't
showing up. By default, Hugo doesn't build future-dated posts since
buildFuture is set to false. If the user wants to include future posts,
they need to change buildFuture to true in the config or adjust the post
date to today or earlier. Since there's no extra context, I’ll suggest they
check their post date or run the command to allow future posts.
Fixing Hugo future posts
It seems the issue is that Hugo is skipping future-dated content. To fix
this, the user can either change the post date to today or run the command
hugo server -D -F to include drafts and future posts. They mentioned that
the file date is set to "2025-04-21", but the system date is "2025-04-20".
I’ll suggest checking the front matter date and confirm its accuracy.
Additionally, the user could set buildFuture = true in the config if they
want future posts included. I'll verify the correct configuration.
codex It turned out nothing was wrong with Hugo or your file‑layout – Hugo
simply skipped it because its date is “in the future.” Your front‑matter
says:
date = "2025‑04‑21"
but today (per your build) is 2025‑04‑20, so by default Hugo won’t render
it until that date arrives.
You have a couple of easy options:
• Lower the date to today (or earlier), e.g. date = "2025‑04‑20"
• When you preview, tell Hugo to include future posts: just preview -D -F
(that’s hugo server -D -F under the hood)
• For a one‑off production build that includes future posts: hugo
--buildFuture --minify
If you’d rather always build future content you can also add the CLI flag
to your build recipe or wrap it into your hugo.toml, but most folks simply
update the date when they’re ready to publish.
Summarizing:
- I was writing the post on April 20, 2025
- The post was dated April 21, 2025
- Hugo by default doesn’t include posts in the future in its preview or build
Instead of going down some rabbit hole on some ad-riddled website, I just “asked” my terminal around where the problematic file was. You may have noticed before reading any files it noted:
Inspecting Hugo structure
The user is working with a Hugo-based personal website...
Where did this information come from? And what makes this pattern so powerful?
a couple of 3 things#
There are 3 things that are revolutionary here for agentic software engineering:
- Dynamic in-repo context
- Using CLIs directly
- Files (strings) as the interface
dynamic in-repo context#
codex
(uniquely as of release?) works off codex.md
files per-directory. You can also set global instructions in $HOME/.codex/instructions.md
by default (use codex --config/-c
to quickly open global instructions in $EDITOR
). This means as your agent navigates down the file tree you give it access to, it’s dynamically loading relevant context per-directory. This solves a lot of issues with supplying relevant context to agents that is a lot more natural than alternatives.
You can structure repos and directories like:
./
├── repoA/
│ ├── subdirA/
│ │ └── codex.md
│ ├── subdirB/
│ │ └── codex.md
│ └── codex.md
├── repoB/
│ └── codex.md
└── repoC/
└── codex.md
This explains how codex
knew details about my website’s codebase earlier, and has directory-specific context for other repositories I work in. Your task.md
just needs any context that doesn’t already exist in the filetree.
using CLI commands directly#
SQL is one of the most popular “programming languages” in the world, despite software engineers not thinking of it as such often. Why? Largely because it’s declarative and reads like English. SELECT * FROM people WHERE age > 21
kinda makes sense even if you don’t code. It’s easy to learn and (relatively) limited in scope while allowing users to accomplish a lot.
Similarly, gh pr list
is a CLI command that probably just makes sense (assuming you know what gh
and pr
are, to be fair). And if they don’t, just ai "explain <whatever>"
! codex
uses CLI tools directly, allowing you to use the same tools and automate repeated actions for human or agentic use. To demonstrate, early codex
didn’t use a special tool to read a file, just sed
like:
$ sed -n '1,200p' content/posts/modern-agentic-software-engineering.md
or grep
to search files:
$ grep -R "date =" -n content/posts/context-length-pain.md
The broader point here is LLMs are trained on human text and work best like humans interacting with text. Working via CLIs (which also solve a lot of client/server interaction problems) is a natural fit. And CLIs have good built-in documentation (for humans or agents) via --help/-h
flags. It’s easy to use the exact same tools your agent uses and write any deterministic workflows into reusable scripts that humans, agents, or CI/CD can run again and again.
files (strings) as the interface#
Working with LLMs is easiest when you think of them as autoregressive text processors. This explains a lot of their behavior: why they struggle to generate ASCII art, why they can’t reliably do math, etc. If this doesn’t make sense, you should learn more about neural networks and modern LLMs until it does.
Once you do, you understand the text (string) data that goes into the LLM is essentially all that matters: text in, text out. Getting that input text data fast and efficiently is critical to modern agentic software engineering. The Unix philosophy largely solves this and many more problems with effective use.
I’m standardizing on a task.md
convention for loading all relevant context before my ai
call (ai "open the task.md and work on it"
).
how I use git
, gh
, nvim
, tmux
, and codex
#
I have a love/hate relationship with typing. I use a lot of bash zsh aliases functions like:
function ai() {
codex --full-auto "$@"
}
function v() {
nvim "$@"
}
function ghprc() {
local repo="${1:-}"
local pr_number="${2:-}"
local org="ascend-io"
# Derive missing arguments with gh
if [[ -z "$repo" ]]; then
repo=$(gh repo view --json name --jq '.name' 2>/dev/null) || repo=""
fi
if [[ -z "$pr_number" ]]; then
pr_number=$(gh pr view --json number --jq '.number' 2>/dev/null) || pr_number=""
fi
# Validate
if [[ -z "$repo" ]]; then
echo "Error: repo not specified and could not determine current repository." >&2
return 1
fi
if [[ -z "$pr_number" ]]; then
echo "Error: PR number not specified and could not determine current pull request." >&2
return 1
fi
gh api --paginate -H "Accept: application/vnd.github+json" \
"/repos/${org}/${repo}/pulls/${pr_number}/comments" |
jq -r '
.[] |
"Reviewer: \(.user.login)
File: \(.path) (line \(.line // "N/A"))
Diff:
\(.diff_hunk)
Comment:
\(.body)
-------------------------------------------------------------------------------"
'
}
function taskit() {
echo "# PR description" > task.md
gh pr view >> task.md
echo "# PR comments" >> task.md
gh pr view -c >> task.md
echo "# PR diff" >> task.md
gh pr diff >> task.md
echo "# PR review comments" >> task.md
ghprc >> task.md
}
(as an exercise for the reader, without copy/pasting from your browser, pipe the code above for ghprc
into codex
to explain it)
I like to think of this as a form of automation. If I can automatically build a task.md
quickly, I can ai "open the task.md and work on it"
for any task. It’s just about quickly building up the task.md
that will accomplish what I want.
Git is critical for managing strings in the filesystem. Vim keybindings make editing task.md
files to feed into codex
fast. Tmux allows me to have my editor and ai
open side-by-side, and open new windows as needed for more sessions in parallel. And the GitHub CLI + my bash functions help pull information from software engineers I work with into the task.md
file. If your GitHub issues are well-written, there’s no reason gh issue view NUMBER | codex --full-auto
shouldn’t be your first attempt at solving a problem. Though until you and your team adopt this new paradigm, you might want to start with gh issue view NUMBER >> task.md
and edit manually. Do what works!
BUT WHAT ABOUT#
But what about Claude Code?? They even released first! And actually, there’s Aider! And that’s already open source, and they’ve been doign it forever!1!!
Yes. But claude
isn’t open source (and thus doesn’t have a --config/-c
flag added by me, making it inferior) and aider
is…just not quite the same level of agentic in my admittedly inexperienced opinion. OpenAI’s codex
is the first where the workflow has clicked for me, though I’ve been using claude
more side-by-side and it’s also fine. I’m more concerned with the workflow than the provider/specific tool.
Claude Code, as of this post, does not support CLAUDE.md
files in subdirectories that are used in the same way codex.md
files are (as far as I can tell from usage and their documentation). I’m less-sure, but believe loading context into Aider is a more manual process as well. These tools will converge on doing whatever works best, and I suspect both will soon do this as codex
does.
Oh and what about RAG! Using grep
? So limiting! Vector search!!!
This last one might be a fair point – and my counterpoint is if semantic vector search is so useful on its own, it should be its own CLI tool alongside grep
for filesystem search. Why limit it to just agents? Modern agentic pair programming implies using the best tool for humans and agents alike, which tend to be open source standards and CLIs following the Unix philosophy.
results#
But where are the results!?!?!? In this post, I’ve only answered one (somewhat) simple question with codex
.
While I can’t share the code, in the past 2 weeks using agentic software engineering (claude
, VSCode Copilot agent mode, codex
) I’ve made eight meaningful contributions to our frontend codebase that, as a product manager and JavaScript hater, I would never have made otherwise – my prior contributions were all updating strings in a en-US.json
file. There’s still reason to be skeptical but the evidence is mounting and the usefulness is getting easier to scale. I’ve also redone my personal website, refactored stuff into Go and Rust, and more with agentic assistance. Learning things yourself is important (and critical for effective use of agents), but the results are here now.
predictions#
I predict:
- We standardize on
README
/README.md
/readme.md
/etc. instead of LLM-specific filenames for dynamic context across tooling. Why should the instructions for humans versus agents differ anyway? It should be assumed everyone working in the codebase read theREADME
(crazy stuff). - Specialized domain-specific wrappers around standard software engineering agents emerge. Organizations and teams developer their own business logic as usual around these.
- More people will realize the power of terminal-driven, file-driven agentic pair programming.
- More people will realize the power of the above for agentic CI/CD.
- Automation, automation, automation…