We wanted an agent to map a web application so the next agent could write tests against it. The transcript was the wrong shape for that, so we put the graph where it belongs and the agents around it.
OpenAI shipped Operator in January. We built a smaller one inside Mockingjay, using the accessibility tree where we can and falling back to vision when it comes up empty.
We let GPT-4 pick selectors and run actions on a real page during a test. The demo went great. Then we met a virtualized list.
Why we shipped a GPT-4 feature on a weekend a few months back, and the prompt I'm still not proud of.
Why we moved from BullMQ to Redpanda once a Go service had to consume from the bus, and the part that was just curiosity.
Why Mockingjay runs its own browsers in containers instead of shipping a Chrome extension, and what that costs.
A side project I stopped maintaining, why it kept growing anyway, and an Oracle trademark notice somewhere in the middle.
Recorder vs code tests, for automation engineers who already think one side is obviously right.
Why e2e test codebases rot, and the founding bet behind Mockingjay.
Inheriting an AWS account at a small fintech after the vendor was fired, with no handover and no AWS on my resume.
How I shipped JADX inside an Android app by rewriting two of its methods at build time, without forking it.
Looking back at a run of GSoC blog posts I wrote in 2016, and realizing they were something other than what I thought they were.