AI clustering · built into Supoid

Duplicates merge themselves.

47 customers asked for "dark mode" in 47 different ways. One wrote "night theme." Another wrote "less white please my eyes hurt." You shouldn't have to read each one fresh. Supoid clusters them the moment they land.

Try clustering free · How the AI works

How it works in 3 hops

1. Embeddings on every feedback item
When a feedback item lands, we generate a 1,536-dimension embedding with OpenAI's text-embedding-3-small. That costs ~$0.000001 per item, or about a penny per 10K. Embeddings are stored in pgvector with an HNSW index.
2. Cosine similarity match
We query the workspace's existing cluster centroids and pick the nearest by cosine similarity. If the similarity is above 0.82 (a threshold tuned on 50K real feedback samples), the item joins that cluster. Otherwise a new cluster opens with this item as the seed.
3. Running-mean centroid update
Each cluster's centroid is the running mean of its members' embeddings. Every new member nudges it, so the centroid gradually shifts toward the cluster's true theme. No periodic re-clustering is needed.
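
The three hops above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Supoid's actual code: the `Workspace` and `Cluster` names are hypothetical, centroids live in a plain in-memory list rather than pgvector, and toy 3-dimension vectors stand in for real 1,536-dimension embeddings. Only the 0.82 threshold and the running-mean update come from the description above.

```python
import math
from dataclasses import dataclass

SIMILARITY_THRESHOLD = 0.82  # threshold quoted in the text


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class Cluster:  # hypothetical name, for illustration only
    cluster_id: int
    centroid: list
    size: int = 1

    def add(self, embedding):
        # Running mean: new_mean = old_mean + (x - old_mean) / n
        self.size += 1
        self.centroid = [
            c + (x - c) / self.size for c, x in zip(self.centroid, embedding)
        ]


class Workspace:  # hypothetical name, for illustration only
    def __init__(self, threshold=SIMILARITY_THRESHOLD):
        self.threshold = threshold
        self.clusters = []

    def assign(self, embedding):
        """Join the nearest cluster if it is similar enough, else seed a new one."""
        best, best_sim = None, -1.0
        for cluster in self.clusters:  # O(K) scan over centroids
            sim = cosine_similarity(embedding, cluster.centroid)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim > self.threshold:
            best.add(embedding)
            return best.cluster_id
        # IDs are assigned once at creation and never reused or renumbered.
        new_cluster = Cluster(cluster_id=len(self.clusters), centroid=list(embedding))
        self.clusters.append(new_cluster)
        return new_cluster.cluster_id


# Toy vectors standing in for real embeddings.
ws = Workspace()
first = ws.assign([1.0, 0.0, 0.0])   # no clusters yet, so this seeds cluster 0
second = ws.assign([0.9, 0.1, 0.0])  # close to cluster 0, so it joins it
third = ws.assign([0.0, 1.0, 0.0])   # far from cluster 0, so it seeds cluster 1
```

After the second call, cluster 0's centroid sits halfway between its two members, which is the "nudge" described in step 3.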

Why not k-means or HDBSCAN?

We considered both. Two reasons we shipped threshold-based incremental clustering instead:

No re-clustering cost.
K-means and HDBSCAN both need a periodic full re-run as new data lands. For a feedback tool taking in 100+ items/day, that is expensive and can leave cluster IDs inconsistent from one run to the next. Incremental clustering costs O(K) per insertion, where K is small (~50 active clusters per workspace).
Cluster identity is stable.
Your "dark mode" cluster keeps its ID forever even as you add new dark-mode requests. That stability matters when you're emailing customers "your idea moved to planned" or linking from Linear.
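
To put rough numbers on the O(K) claim, here is a back-of-the-envelope count of multiply-adds. The 10,000-item workspace size is an assumption chosen purely for illustration, and the batch figure covers only a single k-means assignment pass (real runs take several passes, and HDBSCAN has a different cost profile), so treat this as a sketch rather than a benchmark.

```python
EMBEDDING_DIM = 1536     # from the text: 1,536-dimension embeddings
ACTIVE_CLUSTERS = 50     # from the text: ~50 active clusters per workspace
STORED_ITEMS = 10_000    # assumption: hypothetical workspace size


def incremental_cost(k=ACTIVE_CLUSTERS, d=EMBEDDING_DIM):
    """Multiply-adds to compare one new embedding against k centroids."""
    return k * d


def batch_pass_cost(n=STORED_ITEMS, k=ACTIVE_CLUSTERS, d=EMBEDDING_DIM):
    """Multiply-adds for one k-means assignment pass over all n stored items."""
    return n * k * d


print(incremental_cost())  # 76800
print(batch_pass_cost())   # 768000000
```

Under these assumptions, a single insertion touches 76,800 multiply-adds, while one pass of a full re-run touches 10,000 times as many.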

Stop triaging duplicates.

Free plan covers 25 feedback items / month — enough to see clustering kick in on real data before you commit.

Try clustering free

Ready to hear what your customers actually want?

Six minutes from sign-up to your first clustered, tagged, actionable feedback. Free forever for solo founders.

Start free · View pricing

No credit card · Cancel anytime · GDPR self-service
