| Jetski | Stage | Key learning |
|---|---|---|
|
ifood-reports Does internal data hold sellable insight? |
Ready for test | Genuine buyer interest. Reworked the metrics to match iFood's own way of measuring. Caught the agent gaming it — naming a park as the best restaurant site — so future versions must penalise non-buildable locations. An account manager is now building the first report.
Watch the demo
[p: 2$nqNutB]
|
|
REST-arena Best simulation for testing agents vs. competitor |
Developing | Blog released. Simulating on real data, model behaviour diverges sharply — GPT conservative, Claude more creative, smaller models riskier. Next: real RL on the arena — the first ecosystem environment where a model actually gets better, in line with Nadella's vision. Read the blog |
|
GPU-MCP Mini-jetski · MCP for GPU workloads |
Ready for test | Feedback in, all of it about Toqan-class usability. People like it, but demand is thin — zero of OLX's 180 data scientists reached out; only one Jetskibay and two Prosus people actually use it.
Watch the demo
[p: K+j7Vd5X]
|
|
Feed.me Can ad→checkout scale across Europe? |
Scaling | Now know which ads and funnel convert best. Two TikToks went viral past 200k as the account grew. Blocked by SDK limits — moving payments into the app is 6+ months out. Native iOS app ships end of month.
Watch the demo
[p: tQrZ#m3Z]
|
|
Vending machine Can agents run a business on light human oversight? |
Live — ep. 1 | Episode one shipped, first revenue in. The human check-in UX is the real bottleneck before scaling agent scope. Project Rest building POS. Watch episode one |
|
Toqan apps Can an autonomous factory ship sellable apps? |
Building | Software factory stable 2+ weeks — the core asset. Off-the-shelf "manage your company" tools (Palacia etc.) all underperformed. Needs a wedge customer + clear job before brand/GTM. Now live: home page and @trytoqanapps on Instagram.
Open dashboard
[p: growbabygrow]
|
|
UX self improvement Can human feedback give the "why" ads miss? |
Exploring | Micro-worker survey panels (e.g. Microworkers) beat every other channel — $10 buys in-depth app feedback in hours, versus days or weeks of running ads or live UX for a faint signal. This is what accelerates self-improvement. |
|
Jetskibay Toolkit Can we ship niche tools fast on Claude? |
Blocked | Communication has to be built into the workflow, not bolted on. The toolkit's job is to make sharing effortless — going from execution to communication with as little friction as possible. Joint effort with Grant; learnings feeding the AI Prosus Lab blog. |