Product Manager. AI/LLM workflows and the data platforms underneath them.
Drag the confidence threshold. Watch which mistake you're choosing to make. The numbers are illustrative, the tradeoff is the job.
Three roles spanning Maritime Law and Finance, Renewable Energy Analytics and Finance, and Fixed Income Analysis
Analysts weren't short on insights. They were short on knowing which of them mattered to one specific portfolio.
I worked on PM10x, a portfolio intelligence tool that ranked financial insights by relevance using a weighted scoring formula. The first version was an educated guess at the weights. From there I tuned them by hand, testing which signals mattered and in what proportion. On top of that, every upvote and downvote adjusted how a given insight scored for that analyst, so each person's feed sharpened the more they used it.
That feedback loop taught me the lesson I've been chasing ever since. A score is only as good as the signal you feed it, and the people using the product are the signal.
What customers asked for and what they actually did rarely matched. Building for the first one kept burning roadmap.
I owned three surfaces at once. Customer-facing data APIs, a structured data platform, and a zero-to-one internal tool for user management. Keeping them coherent meant grounding decisions in observed usage instead of stated preferences. The datasets people pulled every day showed me where the value lived. The ones they praised on calls but never opened were noise.
That discipline changed where my time went. Features customers used every day kept earning roadmap space. Features that had shipped but sat unused stopped getting it.
An AI workflow that gets a claim wrong isn't a demo bug. It's a financial decision with consequences.
I owned the production AI claims workflow. I joined once the initiative was already in motion and carried it through to live customers. It ran on multi-agent orchestration with confidence scoring, fallback logic, and evaluation frameworks, all behind a human-in-the-loop audit interface. The signal that mattered most wasn't headline accuracy. It was reversion. When a reviewer overrides the system, that's the clearest tell that the automation boundary sits too aggressively for that kind of case. So I designed the boundaries to treat reversion as the calibration signal rather than leaning on a single eval score. It's the same instinct from PM10x, now built into a system with money on the line.
Most PMs hand off a requirements doc and wait. I open Claude Code, v0, or Figma Make and build the thing first. A working prototype answers in an afternoon what a spec argues about for two weeks, and it puts a real artifact in front of engineering instead of a wishlist.
The interactive widget on this page is the same habit. It's faster to build the idea than to describe it, and the build is the proof that I can.