A model is only as good as its harness

Opus 4.8 is here, but Claude Code has some serious competition

Howdy howdy,

I was about to click publish, then Opus 4.8 dropped, so I thought we’d squeeze it in…

Also, I'm flying to New York next week to record two massive podcasts, and I'm so excited.

📌 TL;DR

  • Opus 4.8 dropped → Claude is back. People love it. Big honesty upgrade, finishes long tasks properly, and 4.7's bad attitude is gone. Same price as 4.7.

  • Codex Thursday → OpenAI shipped remote Mac control, Appshots (⌘⌘ to share full screen context), multi-day Goal mode, and visual annotation. Claude Code got some serious competition.

  • Higgsfield × Adobe → native plugins for Premiere Pro and After Effects. AI video generation inside your timeline.

Opus 4.8 is here, ladies and gentlemen

Anthropic shipped Opus 4.8 this week and it's a great model.

I’m personally loving it, but I can't tell if the model is really that good or if 4.7 was just that degraded. Opus 4.7 felt lazy, quit tasks early, refused reasonable work in the name of "safety," and had a bad attitude.

It also deleted my entire .claude folder one day, wiping hundreds of hours of my IP.

A lot of people, die-hard Claude users included, started drifting over to GPT-5.5 in the Codex app for certain tasks (myself included), so I’m really glad we have a good model again.

Two main things people are loving about Opus 4.8:

  • It actually finishes the job. Where 4.7 would ditch a task wayy too early, 4.8 happily grinds away on long-running coding work until it's 100% done.

  • It’s more honest. It catches its own mistakes before you do, and it's about 4x less likely to let a code flaw slide without a word. It pushes back on bad plans, asks instead of guessing, and has stopped being a yes-man. It's got one of the lowest hallucination rates going, because it'll admit when it doesn't know instead of bluffing. Massive.

One thing to watch: it's reasoning-sensitive, so pay attention to your effort level. Crank it up or you're not seeing the model everyone's raving about. The default is fine for most stuff, but for hard coding work, push it to xhigh.

The coding score drops from 63 to 42 just going from xhigh to high.

It's the exact same price as Opus 4.7 too.

Codex is kinda cooking

These days a model is only as good as its harness (where you use it), and Codex is still a far superior harness to the Claude Desktop app.

THE CLAUDE DESKTOP APP FUCKING SUCKS.

It's super messy and confusing, especially for beginners.

You've got Cowork and Code (basically the same thing).

For recurring workflows you've got routines and scheduled tasks (also the same thing).

Want to use Claude from your phone? Cool, here's remote control, dispatch, the Telegram plugin, etc. Pick one, I guess.

On top of all that it's slow and buggy.

The Codex app, on the other hand, is very well done. OpenAI outdid themselves.

First, it's just one thing: Codex (OpenAI’s agent). Claude, by contrast, has multiple products that are basically the same thing.

It's also just really clean, fast, simple and user-friendly.

OpenAI dropped four new features on Codex this week (they’re actually class).

1. Control your Mac from your phone

OpenAI shipped remote Mac control to Codex. You can leave your MacBook, go about your day, and have your agent operate apps on your Mac from your phone.

You can keep working on your project while you're out for lunch.

Halfway through your coffee and you remember you forgot to send your COO that file? Just ask Codex, and it will find it on your computer and send it over.

2. Appshots: stop copy-pasting context into your AI

Just a streamlined way of pasting screenshots into Codex. Press ⌘⌘ (both command keys) and Codex captures your entire screen as readable text.

3. Goal mode just got way more accessible

You'll remember I covered Goal mode in last week’s newsletter: set Codex a /goal and it works on it autonomously for hours, even days.

Goal mode is now available on the lower tier plans too.

4. Advanced annotation mode: edit UI like a drag & drop website builder

Instead of writing 200 words describing how you want a button to look, you just… edit it.

Click, drag, adjust colors, reposition elements. Codex sees exactly what you want changed. No need to write a thesis on why the button should be 2px to the left.

Also this week...

  • Higgsfield plugged into Adobe → native plugins for Premiere Pro and After Effects. Generate B-roll, motion graphics, and AI elements right inside your timeline instead of tabbing out to the browser and back. Everything matches your project's grading and aspect ratio automatically. Free if you've got a Higgsfield sub.

🧰 Tools to try

  • Limora → AI asset generator for web designers. Add your brand once and it generates on-brand hero images, backgrounds, product shots, OG cards, and more.

  • USB-C Dummy Dongle → found this dummy USB-C that tricks your Mac into thinking it's plugged into a monitor, so your MacBook stays on and running while you're on the go (if you shut the lid on your MacBook normally, Claude stops running).

  • Daydreamvideo → AI video editor built for Claude Code and Codex. Edit videos by transcript, search B-roll, generate motion graphics. All local. Free to start.

  • Beehiiv → the tool I use to run my newsletter (dope team and product). If you want to start your own, use this link for 20% off for 3 months.

🥣 Brain food

i've been pretty slack on content this week.

honestly, i'm not happy with the state of AI content right now, and not even that happy with some of my own last few videos.

so i took the week to step back and kind of re-strategise on the sort of content i actually want to be putting out into the world.

it feels like AI has just become a gimmick.

every single video on my feed is some version of "replace your marketing agency with these five Claude skills," and it's just absolute slop.

i think two things are driving it.

first, we're in an AI content gold rush, so any man and their dog can start posting AI videos and rack up 10, 20, 30,000 followers in a few weeks.

second, everyone's got AI content-idea agents scraping the internet for the best-performing AI content so they can "copy what works." you can kind of imagine what happens when everyone has an agent scraping everyone else's best stuff to copy it. you end up with this giant cesspit of regurgitated AI slop.

i mean, in the end it just makes it easier to win, right? because the barrier is so low.

good day,