
2025-09-19 / 7 min / Data Engineering + Markets + AI Agents

Prediction market API and normalizing messy sources

Building one schema across Polymarket, Kalshi, and Manifold exposed the hidden work in data products.

The centralized prediction market API began with a practical annoyance: Polymarket, Kalshi, and Manifold describe similar concepts with different schemas, conventions, and edge cases. If an AI agent is supposed to compare markets, it needs clean data before it needs clever reasoning.

A lot of the project was unglamorous normalization work. Titles, outcomes, probabilities, volumes, settlement status, and market metadata all needed a common shape. The hard part was not fetching data. It was deciding which differences were meaningful and which ones should be abstracted away.
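To make the "common shape" concrete, here is a minimal sketch of what a normalized record and one adapter step could look like. It is an illustration under stated assumptions, not the project's actual schema: the `NormalizedMarket` fields, the `normalize` helper, the payload keys (`id`, `title`, `yes_price`, `volume`, `status`), and the two price scales are all hypothetical and do not describe any platform's real API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class NormalizedMarket:
    """Common shape for a binary market. All field names are illustrative."""
    platform: str
    market_id: str
    title: str
    probability: float   # implied probability of "yes", in [0, 1]
    volume: float        # expressed in the unit named by volume_unit
    volume_unit: str     # kept explicit: units are a meaningful difference
    status: str          # lowercased status label, e.g. "open" or "settled"
    # Original payload, preserved so downstream code can recover context.
    raw: dict[str, Any] = field(default_factory=dict, compare=False)


def to_probability(price: float, scale: str) -> float:
    """Convert a quoted price to a probability in [0, 1].

    Assumes two hypothetical quoting conventions: "cents" (0-100)
    and "unit" (0-1).
    """
    if scale == "cents":
        value = price / 100.0
    elif scale == "unit":
        value = price
    else:
        raise ValueError(f"unknown price scale: {scale!r}")
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"probability out of range: {value}")
    return value


def normalize(platform: str, payload: dict[str, Any],
              scale: str, volume_unit: str) -> NormalizedMarket:
    """Map one source payload onto the common shape.

    The payload keys used here are hypothetical placeholders.
    """
    return NormalizedMarket(
        platform=platform,
        market_id=str(payload["id"]),
        title=payload["title"].strip(),
        probability=to_probability(float(payload["yes_price"]), scale),
        volume=float(payload["volume"]),
        volume_unit=volume_unit,
        status=payload["status"].lower(),
        raw=payload,
    )
```

The price scale is abstracted away because it is an accidental difference, while the volume unit stays visible because summing volumes across units would be a false comparison.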

The arbitrage and mispricing agent depended on that judgment. If the normalized layer hides too much, the agent makes false comparisons. If it exposes every platform detail, the downstream logic becomes brittle. The API had to preserve enough context for reasoning while still giving the agent a stable contract.
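A rough sketch of that tradeoff, assuming a hypothetical `Quote` record and `price_gap` check (neither is taken from the project): the comparison only fires when enough context survives normalization to say the two markets are actually comparable. The exact-match test on `resolution_note` is a crude stand-in for whatever matching logic a real agent would need, and the 0.05 threshold is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quote:
    """Hypothetical slice of a normalized market used for comparison."""
    platform: str
    market_id: str
    probability: float     # implied probability of "yes", in [0, 1]
    status: str            # e.g. "open" or "settled"
    resolution_note: str   # preserved context: how the market resolves


def price_gap(a: Quote, b: Quote, min_gap: float = 0.05) -> Optional[float]:
    """Return the probability gap for a comparable pair, else None.

    A pair is treated as comparable only if both markets are open and
    their resolution notes match exactly (a placeholder rule).
    """
    if a.status != "open" or b.status != "open":
        return None  # closed or settled markets cannot be traded against
    if a.resolution_note != b.resolution_note:
        return None  # similar titles, different questions: not a mispricing
    gap = abs(a.probability - b.probability)
    return gap if gap >= min_gap else None
```

If the schema dropped `resolution_note`, the second guard could not exist and every similar-looking pair would appear to be an opportunity. If it exposed each platform's full rule text in its own format, this function would need a branch per platform.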

This project made data engineering feel less like plumbing and more like product design. The schema is the interface, and every field is an opinion about what future builders should be able to trust.


takeaways.

- Agents are only as useful as the contracts beneath them.

- Normalization requires product judgment, not only mapping code.

- A good API preserves meaningful differences while removing accidental ones.


related project.

Centralized Prediction Market API - Built a unified API aggregating and normalizing Polymarket, Kalshi, and Manifold data into one schema, plus an AI agent that spots arbitrage and market mispricing opportunities.