Session Debrief · Canex · 2026-04-23
Prepared by Chris

Canex
OpenClaw Workshop

Two hours in Kelvin Grove with Mark and Jayden. A working AI stack already in place. Now we sharpen it for cost, consistency, and capability.

Client: Canex · Labour Hire
Location: Kelvin Grove, QLD
Duration: 2 hours · Workshop
Attending: Mark · Jayden

Do these first

Three moves
In this order

Everything below this section expands and explains. If Jayden only has an hour this week, execute these three in sequence. They close the biggest capability gap and knock a large chunk off the monthly bill.

Step 01

Lock down a real skill

Open the .md for the skill we drafted together today and review it line by line. Or ask Hector to write a new one from scratch. Either way, tweak with Sonnet 4.6 until the prose matches the job you actually want done.

Skills are named .md files that describe a job in enough detail that the agent produces the same output every time. Without them, every reply is free form and every result drifts.

Today's workflow: track down the skill we drafted, open it, and read every line as if you were briefing a new employee. Anywhere the prose is vague, tighten it. Anywhere it skips a step, add the step.

Test the skill three times with slightly different inputs. If the output drifts between runs, the skill isn't tight enough yet. Keep iterating with Sonnet 4.6 until it doesn't.
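If you'd rather measure drift than eyeball it, here is a minimal sketch of the three-run test. It assumes you paste each run's output in as plain text. The function names and the 0.95 similarity threshold are my own illustrative choices, not anything OpenClaw provides.

```python
import difflib


def normalise(text: str) -> str:
    """Collapse whitespace and case so cosmetic differences don't count as drift."""
    return " ".join(text.lower().split())


def drift_report(outputs: list, threshold: float = 0.95) -> dict:
    """Compare every run against the first one and report the lowest similarity.

    The threshold is an assumption. Tune it to how strict the job needs to be.
    """
    baseline = normalise(outputs[0])
    scores = [
        difflib.SequenceMatcher(None, baseline, normalise(other)).ratio()
        for other in outputs[1:]
    ]
    lowest = min(scores) if scores else 1.0
    return {"lowest_similarity": lowest, "stable": lowest >= threshold}


if __name__ == "__main__":
    # Hypothetical outputs from three runs of the same skill.
    runs = [
        "Backup complete: 4 folders copied to Dropbox.",
        "Backup complete: 4 folders copied to Dropbox.",
        "I copied some of the folders, let me know if you need more.",
    ]
    print(drift_report(runs))
```

If `stable` comes back false, the skill is still under-specified. Read the two outputs side by side and tighten the line in the .md that allowed the difference.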

Jump to: Consistency is the product · below

Step 02

Bring Beatrice online

New Telegram bot via BotFather. Default her to GPT-5.4-mini via OpenRouter. Separate workspace, separate memory, cheaper brain.

Hit BotFather in Telegram, create a new bot, copy the token. In OpenClaw's Channels tab, add Beatrice as a new agent and paste the token in.

Create an OpenRouter account first and load US$10 of credits. In Config, set Beatrice's default model to GPT-5.4-mini. Use paid-tier models only, never free ones: they'll corrupt the config file.

She gets her own workspace and memory by default. Hector can't see inside them, and she can't see inside his. That isolation is the whole point: it's what keeps context rot from killing performance as you pile on jobs.

Jump to: Two agents · below

Step 03

Cron one job

Daily worker report, or weekly Dropbox backup. Pick the easiest. Let Beatrice run it on schedule. Set it and forget it. That's the stability flywheel.

Don't start Beatrice on discovery work. She'll drift and disappoint. Give her one routine that can be locked into a skill and forgotten.

Weekly Dropbox backup is the safest first pick. File copy is not an intelligence problem, so the dumbest cheap model handles it. When it runs clean for a week, you've proven the whole loop end to end: skill → schedule → tool call → result.

Once the backup is stable, layer on a daily worker report or an activity summary. Now you've got real leverage: the bulk of Canex's recurring work running on cheap scheduled loops while you sleep.

Jump to: First jobs for Beatrice · below

Where we started

The inherited stack
More capable than you realised

Jayden had built the entire thing from the TUI, which we joked was basically DOS. Working, but flying blind. Opening the Dashboard today changed the picture.

Runtime
  • OpenClaw agent harness on the office Mac Mini
  • Anthropic API key, Sonnet 4.6 on everything
  • Hector configured as assistant to the CEO
  • Telegram bot wired to Hector for remote chat
  • Dropbox linked · VOIP connection pending

Spend (month-to-date)
US$379 ≈ AU$526

All Sonnet 4.6 API. No model routing. No scheduled jobs yet. This is effectively the cost of interactive chat plus the Haiku sub-agent that Hector spins up.

Skills

Zero custom. Only the out-of-the-box set. This is the single biggest short-term value gap, and the fastest one to close.

Backups

Four manual snapshots on disk. No schedule. No off-machine redundancy yet.

The Dashboard

Local only
But load bearing

I opened the Dashboard in your browser. It should now be in Favourites. It only runs locally, so no remote access, but when you chat with Hector from Telegram he's operating on the same data underneath. Get familiar with four tabs.

Tab · Skills

See which skills are enabled. Confirm the agent actually has the skill you think it does before asking it to use one.

Tab · Agents

Hector today, Beatrice once you've spun her up. Each has its own workspace, memory, and Telegram bot.

Tab · Channels

Where each agent's inputs and outputs are wired up. Telegram bots, schedulers, anything that sends work in or delivers work out lives here.

Tab · Config

Where model lists and keys live. Editing here is how you wire OpenRouter in. Also where the agent is most likely to write a bug when helping you.

Reminder · The Dashboard only works locally on the Mac Mini. Chatting via Telegram still benefits from being dashboard literate. You can ask Hector to change a setting and then eyeball it here to confirm.

Framing the work

Two streams
Don't mix them up

"Business efficiencies via AI" is too broad to action. We split it in the room into two streams, with different tools for each.

Stream A · Discovery

Find what's broken

Inefficiencies you don't yet know about. Human led, AI assisted. Mark already has a paid ChatGPT subscription, so run this stream on Codex. It's a stronger build environment than OpenClaw for exploratory work, and you're already paying for it.

Tool · Codex · conversational

Stream B · Automation

Kill the known work

Jobs you already do by hand today and want gone. Automated via skills and cron jobs. This is OpenClaw's sweet spot, and where Beatrice lives. Once it's set up you leave it running.

Tool · OpenClaw · scheduled

Start with Stream B. Daily reports, backups, data parsing. Move to Stream A only once you've got a working rhythm with scheduled agents and the bill under control.

Skills · The Biggest Gap

Consistency is the product
Skills deliver it

You have zero custom skills today. Every interaction is free-form, which means every output is a little bit different. Skills are how you get the same thing, the same way, every time.


The authoring loop

  1. Ask Hector (on Sonnet 4.6) to write a skill for a job you already do by hand. Describe the job in detail.
  2. Find the generated .md file on disk and open it. Read every line.
  3. Tweak by hand, or instruct Sonnet with very specific corrections. Keep iterating until the prose matches exactly how you'd brief a junior employee.
  4. Give the skill a clear name. From now on, when you ask the agent to do that job, reference the skill by name: "use the weekly_backup skill."
  5. Test it three times. If the output drifts, the skill is under-specified. Tighten it.

Today · We created a skill together for Hector. Jayden, track down that file, open it, read it end to end. That's the first skill to polish. Don't skip the manual read. The agent won't catch its own vagueness.

Separation of concerns

Two agents
Different jobs, different brains

We set Beatrice up today as a second agent. She runs in her own workspace, with her own memory, and her own Telegram bot (Jayden, hit BotFather for a fresh token). Agents can share via tools, but they don't see each other's workspace directly. That's the point.

Diagram · who owns what

Hector · CEO Assistant
  • Hector: Sonnet 4.6 · interactive
  • Workspace H: files, context, scratchpad
  • Memory H: long term state
  • Telegram · Hector: Mark's daily channel

Beatrice · Routine Worker
  • Beatrice: GPT-5.4-mini · scheduled
  • Workspace B: isolated from Hector
  • Memory B: long term state
  • Telegram · Beatrice: new BotFather token

Shared · Infra
  • Skills Library: called by name · .md files
  • Config File: model list · keys · routing
  • Tools: FS · Dropbox · VOIP · APIs
  • Cron Scheduler: fires Beatrice on a clock
  • OpenRouter + Anthropic: model providers

Both agents have full tool access to the machine. But what's immediately visible to each is only its own workspace and memory. They reach each other indirectly, by reading files, summarising Dropbox folders, or calling a shared skill. That separation is what keeps performance high.

Context Rot · Agents degrade as you pile on irrelevant information. Keeping Hector and Beatrice on separate concerns isn't an aesthetic choice. It's a performance requirement.

Cost routing

Don't burn frontier tokens
On spreadsheet work

Right now Sonnet 4.6 is doing everything, including tasks that a cheaper model will handle just as well. The immediate win is routing by job type. Spin up an OpenRouter account (start with US$10 of credits) and wire it into the config.

Keep on Sonnet 4.6
  • Skill authoring + refinement
  • Interactive problem-solving with Mark
  • Anything that needs judgement

Move to GPT-5.4-mini
  • Bulk data parsing (ingest → spreadsheet)
  • Scheduled cron jobs on Beatrice
  • File copy / backup / summarisation

~5× cheaper · often as good
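The routing idea above can be sketched as a simple lookup table. This is an illustration only: the job type names, the model labels, and both function names are hypothetical placeholders, not real OpenClaw config keys or OpenRouter model ids. The saving calculation assumes the cheaper model really is about 5× cheaper for the work you move across.

```python
# Hypothetical routing table: job type -> model label.
# The labels are placeholders, not exact provider model ids.
ROUTES = {
    "skill_authoring": "sonnet-4.6",
    "interactive": "sonnet-4.6",
    "bulk_parsing": "gpt-5.4-mini",
    "cron": "gpt-5.4-mini",
    "backup": "gpt-5.4-mini",
}

# Anything unlisted is treated as a judgement call and stays on the frontier model.
DEFAULT_MODEL = "sonnet-4.6"


def pick_model(job_type: str) -> str:
    """Return the model label for a job type, falling back to the frontier model."""
    return ROUTES.get(job_type, DEFAULT_MODEL)


def blended_saving(share_moved: float, price_ratio: float = 5.0) -> float:
    """Fraction of the bill saved when `share_moved` of current spend shifts
    to a model that is `price_ratio` times cheaper."""
    return share_moved * (1 - 1 / price_ratio)


if __name__ == "__main__":
    print(pick_model("cron"))
    print(pick_model("something_new"))
    # Assumption for illustration: half of today's spend is routable work.
    print(f"{blended_saving(0.5):.0%} of the bill saved")
```

Under that illustrative assumption (half the spend is routable, 5× price gap), the bill drops by 40%. The real share is something you'd read off the usage logs, not guess.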

With the router key live, ask Hector to set up a /models command in Telegram so you can switch mid-chat. Filter the list to current-gen paid models only: the latest MiMo Kimiko 2.6, MiniMax 2.7, GPT-5.4 micro / mini / full, and GLM 5.1. Test each one against a skill that Sonnet wrote and see which stays on spec.

Warning · Avoid all free OpenRouter models. They use a different configuration scheme and will corrupt your OpenClaw config file. Paid tier only.

Quality control

Have the AI check the AI
From a different seat

That candidate scrape spreadsheet you showed me had job types landing in the referee column and names in phone fields. Those are exactly the kind of mistakes a model can spot instantly when told what to look for.

Skill · Do

The main skill. Performs the job. Owns the happy path.

Skill · Check

A second, separate skill. Reads the output of the first. Looks for a catalogued list of known failure modes: wrong column, missing field, nonsense pattern. Flags or fixes.

The checker works best as its own file because it forces the agent to re-read the output from a fresh context. Keep a running list of real mistakes you've caught. That list is the checker skill.
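To make the "catalogued list of known failure modes" concrete, here is a rough sketch of the same checks written as plain code. The column names (`name`, `phone`, `referee`), the job type list, and the phone pattern are all assumptions for illustration. Swap in whatever the real spreadsheet uses.

```python
import re

# Assumption: a phone number is at least 8 characters of digits, spaces,
# brackets or hyphens, with an optional leading plus sign.
PHONE_RE = re.compile(r"^\+?[\d\s()-]{8,}$")

# Hypothetical job types that should never appear in the referee column.
KNOWN_JOB_TYPES = {"labourer", "forklift", "traffic control"}


def check_row(row: dict) -> list:
    """Return a list of known-failure-mode flags for one spreadsheet row."""
    problems = []
    for field in ("name", "phone", "referee"):
        if not (row.get(field) or "").strip():
            problems.append(f"missing field: {field}")
    phone = (row.get("phone") or "").strip()
    if phone and not PHONE_RE.match(phone):
        problems.append("phone does not look like a phone number")
    referee = (row.get("referee") or "").strip().lower()
    if referee in KNOWN_JOB_TYPES:
        problems.append("job type found in referee column")
    return problems


if __name__ == "__main__":
    good = {"name": "Sam Lee", "phone": "0412 345 678", "referee": "Pat Jones"}
    bad = {"name": "Sam Lee", "phone": "Sam Lee", "referee": "Forklift"}
    print(check_row(good))
    print(check_row(bad))
```

Every time you catch a new kind of mistake by hand, add one more check. Whether the checks live in a skill's prose or in a small program like this, the list of real failures is the asset.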

First jobs for Beatrice

Cheap model
Predictable jobs

Don't start Beatrice on discovery work. Give her routines that can be locked down in a skill, run on a clock, and forgotten about. Three good candidates:

Job 01

Weekly Dropbox backup

Copy both agents' configs and workspaces to Dropbox once a week. Use the dumbest model that works. File copy is not an intelligence problem.
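For a sense of how little intelligence this job needs, here is a minimal sketch of the copy itself. It assumes Dropbox is synced to a local folder on the Mac Mini, so "backing up to Dropbox" is just a copy into that folder. The function name and folder layout are hypothetical, and `dirs_exist_ok` needs Python 3.8 or later.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def weekly_backup(sources: list, dropbox_root: Path, today: Optional[date] = None) -> Path:
    """Copy each source folder into a dated folder under the Dropbox sync directory."""
    stamp = (today or date.today()).isoformat()
    target = dropbox_root / f"backup-{stamp}"
    target.mkdir(parents=True, exist_ok=True)
    for src in sources:
        # dirs_exist_ok lets a re-run on the same day overwrite instead of failing.
        shutil.copytree(src, target / src.name, dirs_exist_ok=True)
    return target


if __name__ == "__main__":
    # Demo against a throwaway temp folder so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "workspace_hector"
        workspace.mkdir()
        (workspace / "notes.md").write_text("example")
        result = weekly_backup([workspace], Path(tmp) / "Dropbox")
        print(sorted(p.name for p in result.iterdir()))
```

In practice you'd pass the real config and workspace folders for both agents. I haven't assumed their paths here, so find them on disk first.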

Job 02

Daily worker report

If the data for all 130 workers is easy to access, Beatrice produces the report file. Hector delivers it to Mark via Telegram. Clean handoff.

Job 03

Activity summary

Cron Beatrice to summarise both her own and Hector's activity daily. Cheap model, consistent format, builds a paper trail of what your agents actually did.

Reminder · You currently have 4 manual backups and no schedule. Set up the backup cron first. It's the lowest risk way to prove Beatrice's loop works end to end.

The next horizon

Once the bill is sane
Build deterministic tools

OpenClaw + skills gets you a long way. But for high value recurring work, say the report you'll send Mark every day for the next two years, probabilistic AI is a liability. The same prompt can produce different output. That's fine for discovery. It's wrong for reporting.

Tool · Claude Code

AU$169/mo subscription. Gives you ~5× the Sonnet usage of raw API. Use it as your build environment to write real code that deterministically produces the same report every day.

Deterministic beats probabilistic · for repeat tasks

Tool · Codex

OpenAI's coding agent. Arguably ahead right now, because OpenAI has more compute to train and serve. Strong second option if Claude Code hits capacity issues, and already useful for Stream A given Mark's existing ChatGPT subscription.

The pattern: Claude Code writes a small program that produces the report from your actual data sources. The agent doesn't need to run every day. The code does. Once a month you use Claude Code to update the program when a spreadsheet schema changes. You get a live report system that never drifts, and your AI bill collapses to "I wrote some code this month."
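As a rough sketch of what "a small program that produces the report" looks like, here is a deterministic report builder. The CSV columns (`name`, `status`) and the report layout are invented for illustration. The real program would read whatever Canex's worker data actually contains.

```python
import csv
import io
from collections import Counter


def build_report(csv_text: str, report_date: str) -> str:
    """Build a plain-text worker report. Same input always gives the same output."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    by_status = Counter(row["status"].strip().lower() for row in rows)
    lines = [f"Worker report for {report_date}", f"Total workers: {len(rows)}"]
    # Sort the statuses so the line order never changes between runs.
    for status in sorted(by_status):
        lines.append(f"{status}: {by_status[status]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data standing in for the real export.
    sample = "name,status\nA,On site\nB,on site\nC,Sick\n"
    print(build_report(sample, "2026-04-23"))
```

No model is called anywhere in that code. Run it a thousand times on the same data and the report is identical, which is the property you want for anything Mark reads daily.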

How to work

Push it until it breaks
Then pull back

Find the edge

Keep adding tools, connectivity, skills, until things start to fail. That's the information. You now know your agent's actual ceiling.

Pull back to stable

Revert to the last working state. Harden the small number of things that already work. Don't ship half-built.

Highest saving × easiest build

Prioritise this quadrant ruthlessly. Cheap model routing on cron jobs is the obvious one right now.

Work at the Mac Mini

Build directly on the office machine for full ease of access. Dial in remotely only ad hoc and after hours.

Things to stay aware of

Your data is their training set
Act accordingly

Economics · AI companies lose money on every token, so your usage is subsidised.
They're betting on getting more efficient over time. In the meantime, they're farming your prompts and outputs for training. Everything you send and receive is read by a big American company. That includes the skills you author.

IP leakage · Skills you teach the model become available to everyone, eventually.
Unless you build defences. One pattern I use: skills that call purpose-built local programs. The agent only sees the tool to call, not the code or data. That's how you preserve proprietary process while still getting the benefit of agentic orchestration.

Capacity · Anthropic is constrained. Codex is catching up fast.
Anthropic got popular faster than they scaled, so they're the most expensive and the most restricted. OpenAI bought far more compute. Expect Codex to close ground quickly. Keep a second option warm.

Memory · OpenClaw's default memory isn't great.
We didn't cover what memory system you're running. If Hector starts forgetting things or his quality drops, tell me and I'll share an upgrade path. The best options right now combine lightweight open source memory snippets with a small RAG layer and a local wiki the agent can read and write. Worth following Andrej Karpathy as well. His recent wiki post on agent memory is the best plain-English primer I've seen.

For reference · my own setup

Where I'm running
So you can see where this all leads

01 · Harness

A custom agent harness. Hybrid of Claude Code, Codex, OpenClaw, and Hermes. Each for what it's best at.

02 · Skills that call programs

Most of my skills are thin wrappers that invoke purpose built local programs. Deterministic output. Data never hits the API. Code stays out of the training set.
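Here is a minimal sketch of that thin-wrapper idea, not my actual harness. The function name is hypothetical. The point it illustrates is that the wrapper runs a local program and hands back only a one-line status, so the underlying data and code stay on the machine.

```python
import subprocess
import sys


def run_local_tool(command: list, timeout_s: int = 60) -> dict:
    """Run a local program and return only a short status, not its full output."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    lines = result.stdout.strip().splitlines()
    return {
        "ok": result.returncode == 0,
        # Only the final line is passed back. Everything else stays local.
        "summary": lines[-1] if lines else "",
    }


if __name__ == "__main__":
    # Stand-in for a real local program: prints working detail, then a summary line.
    demo = [sys.executable, "-c", "print('row-level detail'); print('report written')"]
    print(run_local_tool(demo))
```

The skill's .md then only needs to say which command to call and what a good summary line looks like. The agent orchestrates, the program does the work.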

03 · Memory

Hand rolled from open source memory snippets. Short term scratchpad, a small RAG index for facts, and a local wiki the agent reads and writes. No heavy vector database unless the scale genuinely demands it.

Open items

Follow ups
Who owes what

Jayden
  • Find the Dashboard in browser Favourites (it's already open)
  • Locate the skill we created today for Hector. Open the .md. Read every line. Tweak.
  • Create an OpenRouter account. Load US$10. Wire key into Config.
  • BotFather → new Telegram bot for Beatrice
  • Connect VOIP + Dropbox to the agents for full access
  • Ask Hector to add a /models command to Telegram (paid router models only)
  • Spin up the weekly Dropbox backup cron on Beatrice

Chris
  • Share memory system upgrade notes if Hector's quality starts slipping
  • Introduce RAG and local wiki patterns when we get there
  • Send Jayden the Andrej Karpathy wiki post on agent memory, and recommend following him for the clearest signal in this space

For interest

You type five words
The model sees five thousand

This is the single most useful thing to understand about OpenClaw, or any agent harness. Before the LLM sees anything, the harness silently wraps your message in layers of context you never wrote. That invisible wrapping is what makes Hector Hector, and not a generic chatbot. It's also why your bill is what it is.

Prompt Assembly

What you typed
"Backup my files to Dropbox"
5 words · ~7 tokens

What the LLM actually receives, after OpenClaw wraps it
  • System Prompt: "You are Hector, assistant to Mark, CEO of Canex Labour Hire..."
  • Memory Recall: "Last backup: 3 days ago. Mark prefers morning summaries..."
  • Skills Available: backup_dropbox · weekly_report · send_telegram · parse_csv ...
  • Tool Manifest: read_file · write_file · call_api · invoke_skill · list_dir ...
  • Your Message: "Backup my files to Dropbox"

≈ 2,400 tokens sent to Sonnet 4.6

The harness wraps your tiny message in layers of context the model needs to act. None of this is written by you. All of it costs tokens, every single call.
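To put rough numbers on the wrapping, here is a small sketch. It uses the common rule of thumb of about four characters of English per token, which is an approximation and not how any real tokenizer counts. The layer names, function names, and the price passed in are all illustrative assumptions.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough rule of thumb: about four characters of English per token."""
    return math.ceil(len(text) / 4)


def assemble_prompt(layers: dict, user_message: str) -> str:
    """Stack the harness layers in order, with the user's message last."""
    parts = [f"[{name}]\n{content}" for name, content in layers.items()]
    parts.append(f"[your_message]\n{user_message}")
    return "\n\n".join(parts)


def monthly_input_cost(tokens_per_call: int, calls_per_day: int,
                       usd_per_million_tokens: float, days: int = 30) -> float:
    """Input-token cost only. Output tokens are billed on top of this."""
    return tokens_per_call * calls_per_day * days / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    # Toy layers, far shorter than what a real harness sends.
    layers = {
        "system_prompt": "You are Hector, assistant to Mark, CEO of Canex Labour Hire...",
        "memory_recall": "Last backup: 3 days ago. Mark prefers morning summaries...",
        "skills_available": "backup_dropbox, weekly_report, send_telegram, parse_csv",
        "tool_manifest": "read_file, write_file, call_api, invoke_skill, list_dir",
    }
    message = "Backup my files to Dropbox"
    print(estimate_tokens(message), "tokens typed")
    print(estimate_tokens(assemble_prompt(layers, message)), "tokens sent (toy layers)")
    # Hypothetical volume and price, purely to show the shape of the maths.
    print(round(monthly_input_cost(2400, 50, 3.0), 2), "USD per month, input only")
```

The useful lever is `tokens_per_call`. Fewer enabled skills, leaner memory, and a cheaper model on routine jobs all shrink the bill without you typing a single word differently.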

What's actually happening

Under the hood
The LLM is a lattice

A bit of bonus conceptual reading. An LLM isn't magic. It's a very large grid of simple nodes wired together in layers. A prompt enters on the left, gets transformed layer by layer, and a response falls out the right. What you see below is a simulation of that lattice. Every bright packet is the kind of activation a real model fires billions of times per response.

Figure · Transformer Lattice (animation): data packets firing across layers, from input through the layers to output. Every response is the sum of billions of these activations.

Layers

Each vertical column is a transformer layer. A production model has dozens to hundreds. More layers, more abstract reasoning.

Neurons

The dots are individual nodes. Each one does a tiny bit of arithmetic on what the previous layer passed forward. Simple units, colossal networks.

Packets

The glowing trails are activations firing between nodes. In a real model these are just numbers, computed from the previous layer's output and the model's learned weights, not little dots, but the shape of the flow is the same.
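If you want to see the arithmetic, here is a toy version of the lattice in a few lines. It is a plain feed-forward network with made-up weights, which is a deliberate simplification: a real transformer layer also includes attention, and has vastly more nodes. The function names are my own.

```python
import math


def layer(inputs: list, weights: list, biases: list) -> list:
    """One layer: each node takes a weighted sum of the previous layer's
    output, adds its bias, and squashes the result with tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def forward(inputs: list, network: list) -> list:
    """Pass the input through every layer in turn, left to right."""
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations


if __name__ == "__main__":
    # Made-up weights: 2 inputs -> 3 nodes -> 1 output node.
    network = [
        ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
        ([[0.7, -0.5, 0.2]], [0.05]),
    ]
    print(forward([1.0, 2.0], network))
```

The weights are the fixed wiring. The activations are the numbers that flow through it for one particular input, which is what the packets in the animation stand for.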

Signed

Chris

Reach me

Mobile · +61 446 537 166

Telegram · @Chris0x88