Canex
OpenClaw Workshop
Two hours in Kelvin Grove with Mark and Jayden. A working AI stack already in place. Now we sharpen it for cost, consistency, and capability.
Three moves
In this order
Everything below this section expands and explains. If Jayden only has an hour this week, execute these three in sequence. They close the biggest capability gap and knock a large chunk off the monthly bill.
Step 01
Lock down a real skill
Open the .md for the skill we drafted together today and review it line by line. Or ask Hector to write a new one from scratch. Either way, tweak with Sonnet 4.6 until the prose matches the job you actually want done.
Skills are named .md files that describe a job in enough detail that the agent produces the same output every time. Without them, every reply is free-form and every result drifts.
Today's workflow: track down the skill we drafted, open it, and read every line as if you were briefing a new employee. Anywhere the prose is vague, tighten it. Anywhere it skips a step, add the step.
Test the skill three times with slightly different inputs. If the output drifts between runs, the skill isn't tight enough yet. Keep iterating with Sonnet 4.6 until it doesn't.
Step 02
Bring Beatrice online
New Telegram bot via BotFather. Default her to GPT-5.4-mini via OpenRouter. Separate workspace, separate memory, cheaper brain.
Hit BotFather in Telegram, create a new bot, copy the token. In OpenClaw's Channels tab, add Beatrice as a new agent and paste the token in.
Create an OpenRouter account first and load US$10 of credits. In Config, set Beatrice's default model to GPT-5.4-mini. Use paid-tier models only. Never use free models: they'll corrupt the config file.
She gets her own workspace and memory by default. Hector can't see inside them, and she can't see inside his. That isolation is the whole point: it's what keeps context rot from killing performance as you pile on jobs.
Step 03
Cron one job
Daily worker report, or weekly Dropbox backup. Pick the easiest. Let Beatrice run it on schedule. Set it and forget it. That's the stability flywheel.
Don't start Beatrice on discovery work. She'll drift and disappoint. Give her one routine that can be locked into a skill and forgotten.
Weekly Dropbox backup is the safest first pick. File copy is not an intelligence problem, so the dumbest cheap model handles it. When it runs clean for a week, you've proven the whole loop end to end: skill → schedule → tool call → result.
Once the backup is stable, layer on a daily worker report or an activity summary. Now you've got real leverage: the bulk of Canex's recurring work running on cheap scheduled loops while you sleep.
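To show how little intelligence the weekly backup needs, here is a minimal Python sketch of the copy step. The folder names, the `backup_workspaces` helper, and the idea of writing into a locally synced Dropbox folder are all assumptions for illustration, not how OpenClaw does it.

```python
import shutil
import tempfile
from pathlib import Path

def backup_workspaces(sources, dest_root, stamp):
    """Copy each source folder into dest_root/<stamp>/<folder name>; return the snapshot path."""
    snapshot = Path(dest_root) / stamp
    for src in sources:
        src = Path(src)
        # dirs_exist_ok lets a re-run with the same stamp overwrite in place (Python 3.8+).
        shutil.copytree(src, snapshot / src.name, dirs_exist_ok=True)
    return snapshot

# Demo against throwaway folders. The real paths (agent workspaces, the local
# Dropbox sync folder) are assumptions you would substitute.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp) / "hector_workspace"
    workspace.mkdir()
    (workspace / "notes.md").write_text("example")
    snapshot = backup_workspaces([workspace], Path(tmp) / "dropbox_backups", stamp="week-01")
    copied = sorted(p.name for p in (snapshot / "hector_workspace").iterdir())
    print(snapshot.name, copied)
```

A script like this gives the same result on every run, which is why the cheapest model is enough to trigger it.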
The inherited stack
More capable than you realised
Jayden had built the entire thing from the TUI, which we joked was basically DOS. Working, but flying blind. Opening the Dashboard today changed the picture.
- OpenClaw agent harness on the office Mac Mini
- Anthropic API key, Sonnet 4.6 on everything
- Hector configured as assistant to the CEO
- Telegram bot wired to Hector for remote chat
- Dropbox linked · VOIP connection pending
Everything runs on the Sonnet 4.6 API. No model routing. No scheduled jobs yet. The bill is effectively the cost of interactive chat plus the Haiku sub-agent Hector spins up.
Zero custom skills. Only the out-of-the-box set. This is the single biggest short-term value gap, and the fastest one to close.
Four manual backup snapshots on disk. No schedule. No off-machine redundancy yet.
Local only
But load bearing
I opened the Dashboard in your browser. It should now be in Favourites. It only runs locally, so no remote access, but when you chat with Hector from Telegram he's operating on the same data underneath. Get familiar with four tabs.
See which skills are enabled. Confirm the agent actually has the skill you think it does before asking it to use one.
Hector today, Beatrice once you've spun her up. Each has its own workspace, memory, and Telegram bot.
Where each agent's inputs and outputs are wired up. Telegram bots, schedulers, anything that sends work in or delivers work out lives here.
Where model lists and keys live. Editing here is how you wire OpenRouter in. Also where the agent is most likely to write a bug when helping you.
Two streams
Don't mix them up
"Business efficiencies via AI" is too broad to action. We split it in the room into two streams, with different tools for each.
Stream A · Find what's broken
Inefficiencies you don't yet know about. Human led, AI assisted. Mark already has a paid ChatGPT subscription, so run this stream on Codex. It's a stronger build environment than OpenClaw for exploratory work, and you're already paying for it.
Tool · Codex · conversational
Stream B · Kill the known work
Jobs you already do by hand today and want gone. Automated via skills and cron jobs. This is OpenClaw's sweet spot, and where Beatrice lives. Once it's set up you leave it running.
Tool · OpenClaw · scheduled
Start with Stream B. Daily reports, backups, data parsing. Move to Stream A only once you've got a working rhythm with scheduled agents and the bill under control.
Consistency is the product
Skills deliver it
You have zero custom skills today. Every interaction is free-form, which means every output is a little bit different. Skills are how you get the same thing, the same way, every time.
The authoring loop
- 01 Ask Hector (on Sonnet 4.6) to write a skill for a job you already do by hand. Describe the job in detail.
- 02 Find the generated .md file on disk and open it. Read every line.
- 03 Tweak by hand, or instruct Sonnet with very specific corrections. Keep iterating until the prose matches exactly how you'd brief a junior employee.
- 04 Give the skill a clear name. From now on, when you ask the agent to do that job, reference the skill by name: "use the weekly_backup skill."
- 05 Test it three times. If the output drifts, the skill is under-specified. Tighten it.
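Step 05 can be made less subjective with a rough drift check. The sketch below is a Python example that compares three pasted outputs pairwise using the standard library's `difflib`. The 0.9 threshold and the sample outputs are assumptions, and text similarity is only a crude proxy for "on spec".

```python
import difflib

def drift_report(outputs, threshold=0.9):
    """Compare every pair of runs; return the pairs whose similarity falls below threshold."""
    drifting = []
    for i in range(len(outputs)):
        for j in range(i + 1, len(outputs)):
            ratio = difflib.SequenceMatcher(None, outputs[i], outputs[j]).ratio()
            if ratio < threshold:
                drifting.append((i, j, round(ratio, 2)))
    return drifting

# Hypothetical outputs from three runs of the same skill, pasted in by hand.
runs = [
    "Backup complete: 2 folders copied, 0 errors.",
    "Backup complete: 2 folders copied, 0 errors.",
    "I copied your folders! Everything looks good, let me know if you need more.",
]
print(drift_report(runs))
```

An empty list means the runs agree closely. Any pair that shows up is a prompt to reread the skill and tighten the step that allowed the variation.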
Two agents
Different jobs, different brains
We scoped Beatrice today as a second agent. She runs in her own workspace, with her own memory and her own Telegram bot (Jayden, hit BotFather for a fresh token). Agents can share via tools, but they don't see each other's workspace directly. That's the point.
Both agents have full tool access to the machine. But what's immediately visible to each is only its own workspace and memory. They reach each other indirectly, by reading files, summarising Dropbox folders, or calling a shared skill. That separation is what keeps performance high.
Don't burn frontier tokens
On spreadsheet work
Right now Sonnet 4.6 is doing everything, including tasks that a cheaper model will handle just as well. The immediate win is routing by job type. Spin up an OpenRouter account (start with US$10 of credits) and wire it into the config.
Keep on Sonnet 4.6:
- Skill authoring + refinement
- Interactive problem-solving with Mark
- Anything that needs judgement
Route to a cheaper model (~5× cheaper · often as good):
- Bulk data parsing (ingest → spreadsheet)
- Scheduled cron jobs on Beatrice
- File copy / backup / summarisation
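Routing by job type is, at its core, a lookup table. Here is a minimal Python sketch of that idea. The job-type keys and the model label strings are assumptions that mirror the names used in this write-up. The exact identifiers your router expects need to be confirmed in its model list.

```python
# Model labels mirror the names used in this write-up; the identifier strings
# a real router expects are an assumption to confirm before use.
FRONTIER = "sonnet-4.6"
CHEAP = "gpt-5.4-mini"

ROUTES = {
    "skill_authoring": FRONTIER,
    "interactive": FRONTIER,
    "judgement": FRONTIER,
    "bulk_parsing": CHEAP,
    "cron": CHEAP,
    "backup": CHEAP,
    "summarisation": CHEAP,
}

def pick_model(job_type):
    """Return the model for a job type; unknown jobs fall back to the frontier model."""
    return ROUTES.get(job_type, FRONTIER)

print(pick_model("cron"), pick_model("skill_authoring"))
```

Falling back to the frontier model for unknown jobs favours quality over cost. Flip the default if the bill matters more than the occasional weak answer.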
With the router key live, ask Hector to set up a /models command in Telegram so you can switch mid-chat. Filter the list to current-gen paid models only: the latest MiMo Kimiko 2.6, MiniMax 2.7, GPT 5.4 micro / mini / full, and GLM 5.1. Test each one against a skill that Sonnet wrote and see which stays on spec.
Have the AI check the AI
From a different seat
Take the candidate scrape spreadsheet you showed me, with job types landing in the referee column and names in phone fields. Those are exactly the kind of mistakes a model can spot instantly when told what to look for.
The main skill. Performs the job. Owns the happy path.
A second, separate skill. Reads the output of the first. Looks for a catalogued list of known failure modes: wrong column, missing field, nonsense pattern. Flags or fixes.
The checker works best as its own file because it forces the agent to re-read the output from a fresh context. Keep a running list of real mistakes you've caught. That list is the checker skill.
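The catalogue of known failure modes can also be written down as plain rules. Here is a hedged Python sketch of what a checker might look for in the scraped candidate rows. The column names, the phone pattern, and the job-word list are all invented for illustration, and your own list of caught mistakes should replace them.

```python
import re

# Heuristics are assumptions for illustration; replace them with your own catalogue.
PHONE_RE = re.compile(r"^[\d\s()+-]{8,}$")
JOB_WORDS = {"labourer", "electrician", "carpenter", "driver"}  # hypothetical job types

def check_row(row):
    """Return a list of known-failure flags for one scraped candidate row."""
    flags = []
    for field in ("name", "phone", "referee"):
        if not row.get(field, "").strip():
            flags.append(f"missing {field}")
    phone = row.get("phone", "").strip()
    if phone and not PHONE_RE.match(phone):
        flags.append("phone field does not look like a number")
    if row.get("referee", "").strip().lower() in JOB_WORDS:
        flags.append("job type in referee column")
    return flags

rows = [
    {"name": "A. Example", "phone": "0400 000 000", "referee": "B. Example"},
    {"name": "C. Example", "phone": "C. Example", "referee": "Electrician"},
]
for i, row in enumerate(rows):
    print(i, check_row(row))
```

Each new mistake you catch by eye becomes one more rule, whether the checker ends up as prose in a skill or as code like this.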
Cheap model
Predictable jobs
Don't start Beatrice on discovery work. Give her routines that can be locked down in a skill, run on a clock, and forgotten about. Three good candidates:
Weekly Dropbox backup
Copy both agents' configs and workspaces to Dropbox once a week. Use the dumbest model that works. File copy is not an intelligence problem.
Daily worker report
If the data for the 130 workers is easy to access, Beatrice produces the report file. Hector delivers it to Mark via Telegram. Clean handoff.
Activity summary
Cron Beatrice to summarise both her own and Hector's activity daily. Cheap model, consistent format, builds a paper trail of what your agents actually did.
Once the bill is sane
Build deterministic tools
OpenClaw + skills gets you a long way. But for high value recurring work, say the report you'll send Mark every day for the next two years, probabilistic AI is a liability. The same prompt can produce different output. That's fine for discovery. It's wrong for reporting.
Claude Code. AU$169/mo subscription. Gives you ~5× the Sonnet usage of the raw API. Use it as your build environment to write real code that deterministically produces the same report every day.
Deterministic beats probabilistic · for repeat tasks
Codex. OpenAI's coding agent. Arguably ahead right now, because OpenAI has more compute to train and serve. Strong second option if Claude Code hits capacity issues, and already useful for Stream A given Mark's existing ChatGPT subscription.
The pattern: Claude Code writes a small program that produces the report from your actual data sources. The agent doesn't need to run every day. The code does. Once a month you use Claude Code to update the program when a spreadsheet schema changes. You get a live report system that never drifts, and your AI bill collapses to "I wrote some code this month."
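To make the pattern concrete, here is a minimal Python sketch of a report program of that kind. The CSV layout, the column names, and the `build_report` helper are hypothetical. The point is only that identical input always gives identical output.

```python
import csv
import io
from collections import Counter

def build_report(csv_text):
    """Count workers per status and render a fixed-format report."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(row["status"] for row in rows)
    lines = [f"Workers: {len(rows)}"]
    # Sorting the keys fixes the line order, so identical input gives identical output.
    for status in sorted(counts):
        lines.append(f"{status}: {counts[status]}")
    return "\n".join(lines)

# Hypothetical data; the real column names depend on your spreadsheet.
sample = "worker,status\nW1,on site\nW2,on site\nW3,leave\n"
print(build_report(sample))
```

When the spreadsheet schema changes, only the column names in a program like this need updating. The daily run itself involves no model at all.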
Push it until it breaks
Then pull back
Keep adding tools, connectivity, skills, until things start to fail. That's the information. You now know your agent's actual ceiling.
Revert to the last working state. Harden the small number of things that already work. Don't ship half-built.
Prioritise this quadrant ruthlessly. Cheap model routing on cron jobs is the obvious one right now.
Build directly on the office machine for full ease of access. Dial in remotely only ad hoc and after hours.
Your data is their training set
Act accordingly
› Economics: AI companies lose money on every token, so your usage is subsidised.
› IP leakage: Skills you teach the model become available to everyone, eventually.
› Capacity: Anthropic is constrained. Codex is catching up fast.
› Memory: OpenClaw's default memory isn't great.
Where I'm running
So you can see where this all leads
A custom agent harness. Hybrid of Claude Code, Codex, OpenClaw, and Hermes. Each for what it's best at.
Most of my skills are thin wrappers that invoke purpose built local programs. Deterministic output. Data never hits the API. Code stays out of the training set.
Memory is hand-rolled from open-source snippets: a short-term scratchpad, a small RAG index for facts, and a local wiki the agent reads and writes. No heavy vector database unless the scale genuinely demands it.
Follow ups
Who owes what
- Find the Dashboard in browser Favourites. It's already open.
- Locate the skill we created today for Hector. Open the .md. Read every line. Tweak.
- Create an OpenRouter account. Load US$10. Wire key into Config.
- BotFather → new Telegram bot for Beatrice
- Connect VOIP + Dropbox to the agents for full access
- Ask Hector to add a /models command to Telegram (paid router models only)
- Spin up the weekly Dropbox backup cron on Beatrice
- Share memory system upgrade notes if Hector's quality starts slipping
- Introduce RAG and local wiki patterns when we get there
- Send Jayden the Andrej Karpathy wiki post on agent memory, and recommend following him for the clearest signal in this space
You type five words
The model sees five thousand
This is the single most useful thing to understand about OpenClaw, or any agent harness. Before the LLM sees anything, the harness silently wraps your message in layers of context you never wrote. That invisible wrapping is what makes Hector Hector, and not a generic chatbot. It's also why your bill is what it is.
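A rough Python sketch of that wrapping follows. The layer names, their order, and every piece of text in them are invented placeholders. OpenClaw's real prompt assembly may look quite different.

```python
def assemble_prompt(user_message, persona, skills, memory):
    """Join the context layers in front of the user's message, as a harness might."""
    layers = [
        ("persona", persona),
        ("skills", "\n".join(skills)),
        ("memory", "\n".join(memory)),
        ("user", user_message),
    ]
    return "\n\n".join(f"[{name}]\n{text}" for name, text in layers)

# All layer contents here are invented placeholders.
message = "Run the weekly backup now"
prompt = assemble_prompt(
    user_message=message,
    persona="You are Hector, assistant to the CEO.",
    skills=["weekly_backup: copy both workspaces to Dropbox."],
    memory=["Example remembered fact from an earlier chat."],
)
typed = len(message.split())
sent = len(prompt.split())
print(typed, sent)
```

Even in this toy version the model receives several times more words than were typed. In a real harness the persona, skills, and memory layers are far longer, and every one of those words is billed.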
Under the hood
The LLM is a lattice
A bit of bonus conceptual reading. An LLM isn't magic. It's a very large grid of simple nodes wired together in layers. A prompt enters on the left, gets transformed layer by layer, and a response falls out the right. What you see below is a simulation of that lattice. Every bright packet is the kind of activation a real model fires billions of times per response.
Each vertical column is a transformer layer. A production model has dozens to hundreds. Broadly, more layers allow more abstract reasoning.
The dots are individual nodes. Each one does a tiny bit of arithmetic on what the previous layer passed forward. Simple units, colossal networks.
The glowing trails are activations firing between nodes. In a real model these are numbers passed along weighted connections, not little dots, but the shape of the flow is the same.
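For anyone who wants to see the arithmetic, here is a toy Python sketch of two layers passing activations forward. It is a plain feed-forward simplification with made-up weights, not a transformer (real layers also include attention), and the `layer` helper is purely illustrative.

```python
import math

def layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of the inputs, then applies tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Two tiny layers with made-up weights; a real model learns billions of them.
w1, b1 = [[0.5, -0.2], [0.1, 0.8]], [0.0, 0.1]
w2, b2 = [[1.0, -1.0]], [0.0]

hidden = layer([1.0, 2.0], w1, b1)  # activations leaving layer 1
output = layer(hidden, w2, b2)      # activations leaving layer 2
print(hidden, output)
```

The weights are the fixed wiring. The activations are the values that flow through it for one particular input.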
Signed
Chris