PM Case Study

Building Sifty AI: From Personal
Frustration to Shipped Product

Identifying a gap in photo management, designing an AI-powered solution, and shipping it to the Play Store — from research through launch.

TL;DR
  • Personal pain point (8,000+ photos, zero motivation to sort) turned into a shipped product
  • End-to-end ownership: research, strategy, design, development, and launch
  • Google Gemini LLM for multimodal photo analysis with composite relevance scoring
  • On-device AI descriptions power a keyword search feature no competitor offers
  • Live on Google Play Store
Status: Live on Play Store
Stack: Flutter, Gemini, SQLite
Platform: Android & iOS
The Problem

8,000+ photos, a decade of accumulation, and zero motivation to sort

I had over 8,000 images on my phone accumulated over nearly a decade. Screenshots of things I'd already dealt with. Food photos I'd never look at again. Dozens of nearly identical shots from the same moment. Memes. Accidental photos. Images that made sense at the time but were digital clutter months later.

The problem wasn't that I lacked tools to delete photos. The problem was that deciding what to keep vs. what to delete is exhausting. Every photo requires a micro-decision. Multiply that by 8,000 and you understand why most people never start.

I searched online for a solution. Every app I found — gallery cleaners, duplicate finders, storage managers — optimized the act of deletion. A better delete button. A faster swipe interface. Bulk select. But none of them touched the real bottleneck: the cognitive load of the decision itself.

The bottleneck was never the delete button — it was the 8,000 decisions required to reach it.

Market Landscape

What exists and why it falls short

I downloaded and tested over 10 photo management apps before building Sifty. Here's what I found.

Google Photos

What it does

Smart storage, compression, “Memories” that resurface old photos

Where it falls short

Doesn't help you decide what to keep. Resurfaces memories but doesn't declutter.

Gallery Cleaner Apps

What it does

Files by Google, Cleaner for iPhone — cache/junk file removal, simple delete UI

Where it falls short

Still requires you to make every individual decision. The cognitive load is unchanged.

Duplicate Finders

What it does

Detect and remove exact or near-duplicate photos

Where it falls short

Solves one narrow problem. Most clutter isn't duplicates — it's photos that outlived their purpose.

AI Photo Organizers

What it does

Categorize and tag photos by content, faces, locations

Where it falls short

They categorize and tag but stop at organization. They don't reduce the collection, and their search remains basic, limited to predefined categories rather than natural-language descriptions.

Every existing solution shifts the UI around deletion. None of them reduce the cognitive load of the decision itself. That's the whitespace Sifty targets.

User Insights

Honest research, not fabricated data

I didn't commission a survey or fabricate statistics. Here's what my research actually looked like:

Personal Dogfooding

Used my own gallery of 8,000+ photos as the primary test case. You can't hide from your own frustrations when you're the user.

Friends & Family

Informal but revealing conversations. Everyone described the same problem. Nobody had ever tried to solve it systematically.

Competitive Analysis

Downloaded and tested 10+ gallery management apps. Documented what each did well and where every one fell short.

App Store Review Mining

Reviewed competitor app listings and user reviews. Users consistently praised the ease and speed of deletion these tools offered. But the deeper frustration was conspicuously absent: nobody mentioned how tedious it is to work through thousands of images, as if people didn't realize this was a problem that could be solved.

Key patterns observed:
“Everyone I talked to said their gallery was a mess but none had tried to fix it.”
“People had emotional attachment to the idea of their photos even when they had no memory of what 80% of them contained.”
“The most common reason for not cleaning up: ‘I wouldn't know where to start.’”
“Existing apps required just as many decisions as manual sorting.”
Vision & Strategy

Eliminate the cognitive load of photo curation

The goal isn't maximum deletion — it's informed decisions. The AI should carry the weight of the decision, with the user confirming or overriding.

North Star Metric
Total images analyzed

Not “photos deleted.” If users find value, they run more photos through the app. A single metric that captures both adoption and engagement.

Product Principles
01

Decide for them, not just show them

The AI should carry the weight of the decision. Users confirm or override, not start from scratch.

02

Trust is built, not assumed

The learning-then-cleaning system, transparent reasoning, and safe trash bin all exist to earn user trust gradually.

03

Personal, not generic

Every user's definition of 'worth keeping' is different. The AI must learn individual preferences, not apply generic rules.

04

Privacy by architecture

Analysis stored on-device. No cloud uploads. Privacy isn't a feature toggle — it's how the system is built.

How this metric tells the full story

Images analyzed is both the adoption metric and the engagement metric. New users analyze their first batch. Satisfied users come back to run more. The number only grows when the product delivers real value — accurate recommendations, useful descriptions, and reclaimed storage that users can see.

Solution Design

A system that earns trust before it acts

Rather than analyzing everything at once, Sifty uses a progressive approach: first learn the user, then clean confidently.

Sifty Dashboard
LEARNING

Learning — Calibrating to You

The AI selects a random subset of photos from the gallery and processes them through Gemini. Each photo is analyzed and presented with a description and recommendation. As the user reviews each result — keep or delete — the system calibrates scoring weights specific to that user's preferences. This phase builds a personalized model of what matters to this person.

  1. Random subset selected from gallery
  2. AI presents recommendations with reasoning
  3. User reviews and decides on each photo
  4. Scoring weights calibrate to user preferences
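The calibration loop above can be sketched as a simple online weight update. Everything here is illustrative: the signal names, baseline weights, and perceptron-style rule are assumptions for the sketch, not Sifty's actual model.

```python
# Hypothetical sketch of the learning-phase calibration loop.
# Signal names and the update rule are illustrative, not Sifty's real code.

BASELINE = {"content": 0.5, "recency": 0.3, "frequency": 0.2}
LEARNING_RATE = 0.1

def relevance(signals, weights):
    """Weighted sum of per-photo signals, each in [0, 1]."""
    return sum(weights[k] * signals[k] for k in weights)

def calibrate(weights, signals, user_kept, threshold=0.5):
    """Nudge weights toward the user's actual keep/delete decision."""
    predicted_keep = relevance(signals, weights) >= threshold
    if predicted_keep == user_kept:
        return weights  # recommendation matched; no change needed
    # Perceptron-style update: push weights toward (keep) or away (delete)
    direction = 1.0 if user_kept else -1.0
    return {k: max(0.0, w + direction * LEARNING_RATE * signals[k])
            for k, w in weights.items()}

weights = dict(BASELINE)
# User keeps a recent photo the baseline model scored below the threshold,
# so the recency weight rises for this user
weights = calibrate(
    weights,
    {"content": 0.1, "recency": 0.9, "frequency": 0.2},
    user_kept=True,
)
```

After a batch of such corrections, the weights reflect what this particular user actually keeps, which is what lets the cleaning phase run with confidence.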
CLEANING

Cleaning — Full Gallery Analysis

Using the calibrated weights from learning, the AI runs through the entire gallery. Each photo is analyzed, scored, and given a recommendation. During this process, rich text descriptions are generated for every image — these descriptions power the keyword search feature.

  1. Custom weights applied across entire gallery
  2. Rich text descriptions generated for every photo
  3. Personalized keep/delete recommendations at scale
  4. Descriptions stored locally for keyword search
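A minimal sketch of that cleaning pass, assuming the same weighted-sum scoring; `analyze_photo` is a hypothetical stand-in for the Gemini request, and the field names are illustrative.

```python
# Hypothetical sketch of the cleaning pass over the full gallery.

def analyze_photo(path):
    """Stub: in the real app this is a Gemini multimodal request that
    returns per-photo signals plus a rich text description."""
    return {"content": 0.4, "recency": 0.6, "frequency": 0.1}, f"description of {path}"

def clean_gallery(paths, weights, threshold=0.5):
    """Score every photo with the calibrated weights; keep the descriptions."""
    results = []
    for path in paths:
        signals, description = analyze_photo(path)
        score = sum(weights[k] * signals[k] for k in weights)
        results.append({
            "path": path,
            "score": score,
            "recommendation": "keep" if score >= threshold else "delete",
            "description": description,  # stored locally, powers keyword search
        })
    return results

batch = clean_gallery(
    ["IMG_001.jpg"],
    {"content": 0.5, "recency": 0.3, "frequency": 0.2},
)
```

The key point the sketch captures: descriptions are produced as a side effect of scoring, which is exactly how the keyword search feature later fell out of this pipeline.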
AI Recommendation Screen
Personalized Weight Calibration

Each photo receives a composite relevance score — not a binary keep/delete flag, but a weighted continuum personalized to the user. The scoring model starts with baseline weights and recalibrates during learning based on the user's actual decisions:

1. Content Analysis: what Gemini sees in the photo (people, places, objects, text)
2. Metadata Signals: when it was taken, recency, location, frequency
3. Calibrated Weights: adjusted during learning to match this user's preferences
4. Pertinent Relevance: how meaningful the photo is to this specific person right now
Trade-offs

The decisions that shaped the product

Every product is a series of trade-offs. Here are the seven decisions that had the most impact on what Sifty became.

Why two phases instead of one?

Options considered

A single-pass analyzer would be simpler to build and faster for users. Why add the complexity of a learning phase?

Decision

Learn first, then clean

Why

A single pass applies generic rules to everyone. But a food photo is trash for one person and a cherished memory for another. The learning phase calibrates weights to individual preferences before the AI touches the full gallery. This is fundamentally different from existing tools that apply one-size-fits-all rules — Sifty earns the right to decide by learning what you care about first.

PM Lens

Progressive disclosure applied to an AI system. User effort invested early compounds into trust and accuracy later.

Choosing the right LLM

Options considered

GPT-4V (strong vision, high cost), Claude (excellent reasoning), Gemini (strong multimodal, generous free tier)

Decision

Google Gemini

Why

For a consumer app processing thousands of photos per user, API cost is existential. Gemini offered the best balance of multimodal quality and cost per image. The free tier made prototyping viable. At 2-3x the cost per image, other models would have made the free tier unsustainable.

PM Lens

A unit-economics decision as much as a technical one. The relationship between AI capability and business model viability is a PM responsibility.

Why on-device, not cloud?

Options considered

Cloud storage is easier to sync and scale. On-device means no cross-device access. Why accept that trade-off?

Decision

On-device SQLite

Why

Photos are deeply personal. Storing analysis on-device eliminated privacy concerns entirely — no data leaves the phone. It also meant zero server costs and enabled offline keyword search. For a gallery app that's inherently device-specific, the sync trade-off was acceptable.

PM Lens

Privacy as architecture, not a checkbox. The system is designed so that compromising user data isn't possible, not just unlikely.

Scoring on a spectrum, not a binary

Options considered

Binary keep/delete would be simpler to present and faster to act on.

Decision

Composite relevance score

Why

Binary classification forces false certainty. A flight confirmation screenshot isn't 'keep forever' or 'delete now' — it's relevant today, irrelevant in three months. The composite score acknowledges that importance exists on a spectrum, and the calibrated weights ensure the spectrum is personal.

PM Lens

Resisting the temptation to oversimplify. The scoring system creates room for future features (time-decay, archiving) without re-architecting.
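The spectrum idea can be sketched as score bands; the thresholds below are hypothetical illustrations, not the app's real cut-offs.

```python
# Illustrative bands only; the real boundaries are calibrated per user.
def recommend(score):
    """Map a composite relevance score in [0, 1] to a recommendation band."""
    if score >= 0.7:
        return "keep"
    if score < 0.3:
        return "delete"
    return "review"  # middle band: surface to the user instead of forcing a call

# A flight confirmation might sit at 0.8 today, 0.5 next month, 0.2 in a year
assert recommend(0.8) == "keep"
assert recommend(0.5) == "review"
assert recommend(0.2) == "delete"
```

The middle band is what binary classification throws away: it marks exactly the photos worth a human glance, and it is the hook where a future time-decay factor could plug in without re-architecting.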

One photo at a time

Options considered

Showing a grid of 20 photos is more 'efficient' — more photos visible, batch operations possible.

Decision

Swipe interface

Why

Testing showed that seeing many photos at once increased decision fatigue — the opposite of what Sifty exists to solve. The swipe interface forces single-photo focus, matching how the AI presents its recommendation. One photo, one decision, one swipe. Cognitive load per decision drops to nearly zero.

PM Lens

The interaction model must align with the core value proposition, even when it looks less efficient on paper.

Deliberate friction before deletion

Options considered

Direct delete gives immediate space recovery and a simpler flow.

Decision

Safe trash bin

Why

Deleting photos is irreversible and emotionally charged. The trash bin adds one step of friction but eliminates the fear of making a mistake. Essential during cleaning where the AI acts autonomously — users need to know they can review and reverse before anything is permanent.

PM Lens

Sometimes making something slightly harder makes the overall experience dramatically better. Trust is the product's most important currency.

Free at launch, monetize later

Options considered

Freemium with limits, subscription, ad-supported, or completely free.

Decision

Completely free at launch

Why

Launching a consumer app in a crowded category with a paywall is a distribution problem. The priority was real usage data and word-of-mouth. Monetization is planned but gating the core experience before proving product-market fit would be premature.

PM Lens

Sequencing decisions correctly. Monetization is a strategy question, not a launch requirement.

Key Differentiator

AI-powered keyword search: find any photo by describing it

During analysis, Gemini generates a rich text description for every photo. These descriptions are stored locally on the device, and this infrastructure byproduct became a standalone feature: keyword search across your entire gallery.

Example searches:
passport photo
white pants at beach
sunset in Bali
receipt from dinner
cat sleeping on couch
kids at playground
AI Context and Search

The descriptions needed for relevance scoring turned out to be a standalone product feature. Good PMs recognize when infrastructure creates unexpected product value.

Why this is different from Google Photos search

Google Photos search requires cloud processing and only works with cloud-stored photos. Sifty's search works entirely on-device, across your full native gallery, with descriptions enriched by context the AI learned about you. No internet required. No data leaves your phone.
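A minimal sketch of how on-device search over stored descriptions could work, using Python's built-in `sqlite3` as a stand-in; the table and column names are illustrative, not Sifty's actual schema.

```python
import sqlite3

# Hypothetical on-device store; schema is illustrative only.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE photos (
        path        TEXT PRIMARY KEY,
        score       REAL,
        description TEXT   -- Gemini-generated, stored locally
    )
""")
db.executemany(
    "INSERT INTO photos VALUES (?, ?, ?)",
    [
        ("IMG_0042.jpg", 0.91, "person in white pants standing on a beach at sunset"),
        ("IMG_0107.jpg", 0.12, "blurry screenshot of a parking receipt"),
    ],
)

def search(keywords):
    """Match every keyword against the stored description (offline, on-device)."""
    clause = " AND ".join("description LIKE ?" for _ in keywords)
    params = [f"%{k}%" for k in keywords]
    return [row[0] for row in
            db.execute(f"SELECT path FROM photos WHERE {clause}", params)]

print(search(["white pants", "beach"]))  # → ['IMG_0042.jpg']
```

Because both the descriptions and the index live in a local database, the query never touches a network, which is what makes the no-internet, no-upload claim hold by construction.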

Architecture

High-level technical architecture

A privacy-first architecture where all user data stays on the device.

Mobile App
Flutter / Dart
Gemini API
Photo analysis & descriptions
On-Device SQLite
User history, scores, descriptions
Local Search
Keyword matching
Launch & GTM

From code to Play Store

The go-to-market was intentionally lean — prove the product works before investing in paid acquisition.

Build

Product development

From initial prototype to production-ready app with learning and cleaning system, composite scoring, and keyword search.

Marketing

siftyai.com

Launched the product site to support the app — clear storytelling, honest positioning, and direct download links.

Launch

Play Store submission and ASO

Published on Google Play Store. Optimized listing with screenshots, feature descriptions, and targeted keywords.

Distribution

Organic growth

Word-of-mouth, portfolio showcase, and organic discovery. No paid acquisition at this stage — the priority is proving product-market fit.

Metrics & Impact

Real metrics, not vanity numbers

I'm committed to sharing honest metrics. These are early-stage numbers that will be updated as usage grows.

  • Accuracy after training: High
  • Photos analyzed per session: 1,000+
  • Architecture: Privacy-first, on-device
Personal usage results

In my own gallery of 8,000+ photos, Sifty helped me identify and remove thousands of images I'd been carrying for years — screenshots, accidental photos, memes, and duplicates. Beyond decluttering, the keyword search feature became something I use weekly to find specific photos without scrolling.

Reflections

What I learned and what's next

1
The hardest part was never the code

The most difficult decisions were product decisions: what to build, what to cut, how to sequence, and when to ship. Getting the technology to work was straightforward compared to getting the product right.

2
Dogfooding is the most honest user research

You can't hide from your own frustrations when you're the user. Every annoyance was a feature request. Every delight was validation.

3
The learning-then-cleaning system was the riskiest and most important decision

It would have been easier to ship a single-pass analyzer. The learning-first approach took longer to build but the quality difference is what makes Sifty work.

4
Build good foundations and unexpected features reveal themselves

Keyword search emerged from the analysis infrastructure. The descriptions needed for scoring became a product feature nobody planned for.

What's next
  • iOS App Store launch: expanding to Apple's ecosystem
  • Freemium monetization: premium search features, higher batch limits
  • Time-decay scoring: photos lose relevance over time; factor this into recommendations
  • Smart albums: auto-generated from AI descriptions and learned preferences

Try Sifty AI yourself

See the product behind this case study. Download Sifty AI free and let the AI learn what matters to you.