JONAS STAMM
— FIELD NOTES
growth-stories · April 10, 2026 · 4 min read

Growth Metrics That Actually Matter for AI Products

Jonas Stamm
Founder, BauGPT

If you’re building an AI product, stop looking at your DAU (Daily Active Users) for a second. It might be lying to you. 🫢

In normal SaaS, we’re obsessed with engagement. More clicks, more sessions, more time in the app. If a user spends 30 minutes in your dashboard, they’re "engaged."

But in AI? If a user spends 30 minutes chatting with your bot, they might just be frustrated. 🤣

They might be stuck in a hallucination loop, trying to get the right answer and failing. In the AI world, "high engagement" can actually be a signal of a broken product.

Here’s what we learned at BauGPT about the metrics that actually matter when you’re building with LLMs.

The "Chatty User" Trap

When we first launched our construction AI, we were stoked. People were sending dozens of messages. They were "chatting."

Then we looked at the transcripts.

A lot of those messages were: "No, that's not right," "Try again," or "Where did you get that number?"

The user was working hard, but the AI wasn't. We were tracking "Engagement," but we weren't tracking "Value."

We realized we needed a new set of metrics that reflect the unique nature of AI. Metrics that track outcomes, not just activity.

1. Job Completion Rate (JCR)

This is our North Star.

In construction, nobody opens BauGPT to "chat." They open it to find a specific DIN standard, calculate a material cost, or verify a building code.

They have a job to do.

We track JCR by looking at the end of a session. Did the user export the result? Did they copy the text? Did they give a thumbs up? Or did they just close the tab after 5 failed attempts?

If your JCR is low, it doesn't matter how many "active users" you have. You’re just a fancy toy that doesn't work. 😎
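To make this concrete, here is a minimal sketch of how a JCR calculation could look. The event names (`export`, `copy`, `thumbs_up`, `message`) and the `Session` shape are assumptions for illustration, not our actual schema.

```python
from dataclasses import dataclass

# Hypothetical event names. The success signals mirror the ones named above:
# exporting the result, copying the text, or giving a thumbs up.
SUCCESS_EVENTS = {"export", "copy", "thumbs_up"}


@dataclass
class Session:
    session_id: str
    events: list[str]  # ordered event names recorded during the session


def is_completed(session: Session) -> bool:
    """A session counts as a completed job if any success signal occurred."""
    return any(event in SUCCESS_EVENTS for event in session.events)


def job_completion_rate(sessions: list[Session]) -> float:
    """Share of sessions with at least one success signal (0.0 if there are none)."""
    if not sessions:
        return 0.0
    return sum(is_completed(s) for s in sessions) / len(sessions)


sessions = [
    Session("a", ["message", "message", "export"]),
    Session("b", ["message"] * 5),  # five attempts, then the tab closes
    Session("c", ["message", "thumbs_up"]),
    Session("d", ["message", "copy"]),
]
print(f"JCR: {job_completion_rate(sessions):.0%}")  # prints "JCR: 75%"
```

Treating any single success signal as "job done" is a simplification. A stricter version might only count the last event of the session.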

2. Tokens-to-Value Ratio

This is my favorite "nerd" metric. 🤓

Every token costs money. If it takes 5,000 tokens of "chatting" to get a simple answer that should have taken 200 tokens, your product is inefficient.

We try to track how many tokens it takes to reach a "Value Event" (like a successful document extraction).

If this ratio is going up, your prompts are getting bloated or your users are struggling to communicate with the model. We use this to prune our system prompts and keep the AI focused.
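A rough sketch of the ratio, assuming you already have weekly totals for tokens spent and Value Events reached. The function name and the numbers below are made up for illustration.

```python
def tokens_per_value_event(total_tokens: int, value_events: int) -> float:
    """Average tokens spent per Value Event; infinite if no value was reached."""
    if value_events == 0:
        return float("inf")
    return total_tokens / value_events


# Illustrative numbers only: (total tokens, value events) per week.
weekly = {
    "week_1": (120_000, 400),
    "week_2": (150_000, 375),
}

ratios = {
    week: tokens_per_value_event(tokens, events)
    for week, (tokens, events) in weekly.items()
}
print(ratios)  # prints "{'week_1': 300.0, 'week_2': 400.0}"

# A rising ratio is the warning sign described above.
if ratios["week_2"] > ratios["week_1"]:
    print("Ratio went up: check prompt bloat and failed conversations.")
```

Counting both prompt and completion tokens in `total_tokens` keeps bloated system prompts visible in the number.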

3. Human-in-the-loop Correction Rate

Since we deal with construction documents, accuracy is everything. 🏗️

We have a feature where users can correct the AI if it misses a detail in a floor plan or an invoice. We track the percentage of extractions that require a manual fix.

If this number isn't going down week-over-week, your product isn't learning.
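Here is one way the week-over-week check could be sketched. The weekly counts are invented for the example, and `is_improving` is a hypothetical helper, not something from our codebase.

```python
def correction_rate(corrected: int, total: int) -> float:
    """Fraction of extractions that needed a manual fix (0.0 if there were none)."""
    return corrected / total if total else 0.0


def is_improving(rates: list[float]) -> bool:
    """True if every week's rate is strictly lower than the week before."""
    return all(later < earlier for earlier, later in zip(rates, rates[1:]))


# Illustrative data: (corrected extractions, total extractions) per week.
weekly_counts = [(60, 400), (50, 400), (36, 400)]
rates = [correction_rate(corrected, total) for corrected, total in weekly_counts]

for week, rate in enumerate(rates, start=1):
    print(f"week {week}: {rate:.1%}")
print("improving:", is_improving(rates))  # prints "improving: True"
```

A strict week-by-week comparison is noisy at low volumes, so a rolling average may be a better fit if you only see a handful of extractions per week.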

How we track this (The Tech Bit)

We use a mix of Amplitude for the high-level stuff and custom logs in our Postgres database for the LLM specifics.

Every completion gets logged with its surrounding context.

It’s not enough to know that they used it. You need to know why they used it and if it worked.
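As a hedged sketch of what a per-completion log could look like: the table name and every column below are assumptions chosen to cover the "why" and the "did it work" questions, not our real schema. In-memory SQLite stands in for Postgres here so the example runs anywhere.

```python
import sqlite3

# In-memory SQLite as a stand-in for Postgres. All names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE completion_log (
        id                INTEGER PRIMARY KEY,
        session_id        TEXT    NOT NULL,
        intent            TEXT    NOT NULL,   -- why the user came
        prompt_tokens     INTEGER NOT NULL,
        completion_tokens INTEGER NOT NULL,
        value_event       TEXT,               -- e.g. 'export'; NULL if none
        user_corrected    INTEGER NOT NULL DEFAULT 0
    )
    """
)


def log_completion(session_id, intent, prompt_tokens, completion_tokens,
                   value_event=None, user_corrected=False):
    """Store one completion together with its outcome signals."""
    conn.execute(
        "INSERT INTO completion_log (session_id, intent, prompt_tokens, "
        "completion_tokens, value_event, user_corrected) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (session_id, intent, prompt_tokens, completion_tokens,
         value_event, int(user_corrected)),
    )


log_completion("s1", "din_lookup", 180, 60, value_event="copy")
log_completion("s2", "cost_calc", 900, 400)  # no value event reached

worked = conn.execute(
    "SELECT COUNT(*) FROM completion_log WHERE value_event IS NOT NULL"
).fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM completion_log").fetchone()[0]
print(worked, total)  # prints "1 2"
```

With the intent and the outcome on the same row, the three metrics above can each be computed with a single query.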

The Result: Higher Retention, Less Noise

Once we stopped optimizing for "chatting" and started optimizing for "Value Events," something cool happened.

Our session lengths actually decreased. Users were in and out faster.

But our retention went up. 🚀

Because they realized they could rely on BauGPT to get the job done in 2 minutes instead of 20.

Takeaway

If you’re building an AI startup, don’t get blinded by standard SaaS vanity metrics.

The best AI products are the ones that disappear because they work so well. ✌️

Best, Jonas
