Polymarket is the largest prediction market by volume, running on the Polygon blockchain. Whether you want to backtest a trading strategy, analyze whale behavior, or build a dashboard, you need the underlying data. This guide covers every way to get it, from free API calls to full historical dumps.
Polymarket trades happen on-chain through two smart contracts on Polygon: the CTF Exchange and the NegRisk Exchange. Every trade emits an OrderFilled event with the maker, taker, amounts, and token IDs.
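To make the fill fields concrete, here is a minimal sketch of turning one OrderFilled record into a side, size, and price. It rests on assumptions not stated above: that an asset ID of "0" marks the USDC leg, that both USDC and outcome tokens use 6-decimal base units, and that the field names `maker_asset_id` and `taker_amount_filled` exist alongside the columns used elsewhere in this guide. The `Fill` class and `describe_fill` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal

BASE_UNITS = Decimal(10) ** 6  # assumption: 6-decimal base units for USDC and tokens


@dataclass(frozen=True)
class Fill:
    maker: str
    taker: str
    maker_asset_id: str   # assumption: "0" means the maker pays USDC
    taker_asset_id: str   # assumption: "0" means the taker pays USDC
    maker_amount_filled: int
    taker_amount_filled: int


def describe_fill(fill: Fill) -> dict:
    """Return the maker's side, the outcome token ID, size in tokens, and price."""
    if fill.maker_asset_id == "0":
        # Maker gives USDC and receives outcome tokens: a buy from the maker's view.
        side, token_id = "BUY", fill.taker_asset_id
        usdc, tokens = fill.maker_amount_filled, fill.taker_amount_filled
    elif fill.taker_asset_id == "0":
        # Maker gives outcome tokens and receives USDC: a sell from the maker's view.
        side, token_id = "SELL", fill.maker_asset_id
        usdc, tokens = fill.taker_amount_filled, fill.maker_amount_filled
    else:
        raise ValueError("neither leg is USDC; cannot derive a price")
    if tokens <= 0:
        raise ValueError("token amount must be positive")
    return {
        "maker_side": side,
        "token_id": token_id,
        "size": Decimal(tokens) / BASE_UNITS,
        "price": Decimal(usdc) / Decimal(tokens),
    }


example = Fill("0xMaker", "0xTaker", "0", "1234", 650_000, 1_000_000)
print(describe_fill(example))
```

If the asset ID convention differs in your data, only the two branch conditions need to change.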
Beyond fills, positions change through splits, merges, redemptions, and ERC-1155 transfers. A complete picture of any wallet's P&L requires tracking all of these events, not just trades.
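The sketch below illustrates why fills alone give the wrong balance. It uses a hypothetical, simplified event shape of `(kind, token_id, amount)` seen from one wallet's point of view; real events carry more fields, and a split is represented here as one row per outcome token it credits.

```python
from collections import defaultdict

# Event kinds that increase or decrease a wallet's token balance
# (hypothetical labels, chosen for this illustration).
CREDITS = {"buy", "split", "transfer_in"}
DEBITS = {"sell", "merge", "redeem", "transfer_out"}


def replay_balances(events):
    """Replay (kind, token_id, amount) tuples into a balance per token."""
    balances = defaultdict(int)
    for kind, token_id, amount in events:
        if kind in CREDITS:
            balances[token_id] += amount
        elif kind in DEBITS:
            balances[token_id] -= amount
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return dict(balances)


events = [
    ("split", "YES", 100), ("split", "NO", 100),  # 100 collateral -> 100 YES + 100 NO
    ("sell", "NO", 100),
    ("buy", "YES", 50),
    ("transfer_out", "YES", 30),
    ("redeem", "YES", 120),
]

fills_only = [e for e in events if e[0] in ("buy", "sell")]
print("all events:", replay_balances(events))
print("fills only:", replay_balances(fills_only))
```

With every event replayed the wallet ends flat on both tokens. With fills only it appears to be short 100 NO and long 50 YES, neither of which is true.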
Here's what's available:
| Data | Records | Since |
|---|---|---|
| Order fills (trades) | 865M+ | 2022 |
| Wallet positions | 150M+ | 2022 |
| Splits & merges | Millions | 2022 |
| Redemptions (payouts) | Millions | 2022 |
| Position conversions | Millions | 2023 |
| Markets (metadata) | Thousands | 2022 |
Polymarket runs a public API at https://clob.polymarket.com and a Gamma API at https://gamma-api.polymarket.com. These are useful for current market data but have significant limitations for historical analysis.
```bash
# Get active markets
curl "https://gamma-api.polymarket.com/markets?active=true&limit=10" | python3 -m json.tool

# Get a specific market
curl "https://gamma-api.polymarket.com/markets?slug=will-trump-win-2024"
```
The API is designed for app integrations, not data analysis. It returns paginated JSON with fixed fields and no way to filter, aggregate, or query across the full dataset. Here are real examples of questions you simply cannot answer with it:
Get all trades in a date range. There's no from_date or to_date parameter. You can't say "give me every trade from October 2024" to analyze the US election period. You'd have to paginate through the entire history and filter client-side.
Find all positions larger than $10,000. There's no amount_gt filter. You can't ask "show me every wallet holding more than $10K on any market." The API has no concept of aggregated positions at all.
Filter by average entry price. Want to find wallets that bought YES above $0.80? The API doesn't track average cost basis. That requires replaying every fill, split, and merge for every wallet — something only an indexer can compute.
Rank wallets by realized P&L. "Who made the most money on the 2024 election?" requires computing P&L across fills, redemptions, and merges for every wallet on every related market. The API returns individual trades, not portfolio-level analytics.
Cross-market analysis. "What percentage of wallets that traded the Trump market also traded the Fed rate market?" Impossible — you can't join across markets, and there's no way to query by wallet across all markets at once.
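To show what the client-side workaround for date ranges involves, here is a minimal sketch of walking every page and filtering locally. `fetch_page(offset, limit)` is a hypothetical callable standing in for whatever paginated endpoint you use; the offset/limit pagination style and the `timestamp` field (Unix seconds) are assumptions, and the demo runs against an in-memory list instead of the network.

```python
from datetime import datetime, timezone
from typing import Callable, Iterator, List


def iter_all_trades(fetch_page: Callable[[int, int], List[dict]],
                    page_size: int = 500) -> Iterator[dict]:
    """Walk every page until the source returns an empty one."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        offset += len(page)


def trades_between(fetch_page, start: datetime, end: datetime,
                   page_size: int = 500) -> List[dict]:
    """Keep trades with start <= timestamp < end, filtering client-side."""
    lo, hi = start.timestamp(), end.timestamp()
    return [t for t in iter_all_trades(fetch_page, page_size)
            if lo <= t["timestamp"] < hi]


# In-memory stand-in for a paginated endpoint (hypothetical, no network).
UTC = timezone.utc
FAKE_TRADES = [
    {"id": i, "timestamp": datetime(2024, 9 + i % 3, 15, tzinfo=UTC).timestamp()}
    for i in range(12)
]
calls = []


def fake_fetch(offset: int, limit: int) -> List[dict]:
    calls.append(offset)  # record each request to show the full walk
    return FAKE_TRADES[offset:offset + limit]


october = trades_between(fake_fetch,
                         datetime(2024, 10, 1, tzinfo=UTC),
                         datetime(2024, 11, 1, tzinfo=UTC),
                         page_size=5)
print(len(october), "October trades found after", len(calls), "page requests")
```

The filter itself is one line. The cost is that every page of history has to be fetched before it can run.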
With Parquet dumps, each of these becomes a single SQL query. Filter by date, aggregate by wallet, compute P&L, join across markets: it's just SQL on local files.
```sql
-- Trades in October 2024 (election month)
-- Half-open range so trades during October 31 are included
SELECT *
FROM 'order_filled_events/*.parquet'
WHERE timestamp >= '2024-10-01' AND timestamp < '2024-11-01';

-- Wallets holding more than $10K on any market
SELECT user_address, token_id, amount / 1e6 as amount_usdc
FROM 'positions/positions_*.parquet'
WHERE amount > 10000000000
ORDER BY amount DESC;

-- Top 20 wallets by realized P&L
SELECT user_address, SUM(realized_pnl) / 1e6 as total_pnl
FROM 'positions/positions_*.parquet'
GROUP BY user_address
ORDER BY total_pnl DESC
LIMIT 20;
```
The Polymarket API works well for building a live dashboard or getting current prices. It does not work for backtesting, P&L analysis, or anything that requires the complete trade history.
All Polymarket data lives on Polygon. You can index it yourself by running a node or using an RPC provider to fetch event logs from the relevant contracts.
| Contract | Address | Events |
|---|---|---|
| CTF Exchange | 0x4bFb...982E | OrderFilled |
| NegRisk Exchange | 0xC5d5...0f80a | OrderFilled |
| ConditionalTokens | 0x4D97...0045 | Split, Merge, Redeem, Transfer |
| NegRisk Adapter | 0xd91E...5296 | Conversions |
```python
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://polygon-rpc.com"))

# CTF Exchange OrderFilled topic
topic = "0xd0a08e8c493f9c94f29311604c9de1d4e1f89571..."

logs = w3.eth.get_logs({
    "fromBlock": 55000000,
    "toBlock": 55001000,
    "address": "0x4bFb41d5B3570DeFd03C39a9A4D8dE6Bd8B8982E",
    "topics": [topic]
})
print(f"Found {len(logs)} fills in 1000 blocks")
```
This approach is free but comes with real engineering costs.
If you have the engineering resources and want full control, this is the way. Budget 2-4 weeks for a production-quality indexer.
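One of the first pieces such an indexer needs is a way to walk a long block span in small windows, since many RPC providers cap how many blocks a single log query may cover. The sketch below is one way to do that; the helper names and the default window of 1,000 blocks are assumptions, and the right window size depends on your provider.

```python
from typing import Callable, Iterator, List, Tuple


def block_ranges(start: int, end: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (from_block, to_block) pairs covering start..end."""
    if step <= 0:
        raise ValueError("step must be positive")
    current = start
    while current <= end:
        upper = min(current + step - 1, end)
        yield current, upper
        current = upper + 1


def fetch_logs_chunked(get_logs: Callable[[dict], list], address: str, topic: str,
                       start: int, end: int, step: int = 1_000) -> List:
    """Call get_logs once per window and concatenate the results."""
    logs = []
    for from_block, to_block in block_ranges(start, end, step):
        logs.extend(get_logs({
            "fromBlock": from_block,
            "toBlock": to_block,
            "address": address,
            "topics": [topic],
        }))
    return logs


print(list(block_ranges(55_000_000, 55_002_499, 1_000)))
```

With web3.py you would pass `w3.eth.get_logs` as the `get_logs` argument. A production indexer would also need retries, rate limiting, and handling for chain reorganizations, none of which are shown here.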
The fastest path from zero to analysis. Instead of indexing the chain yourself, download the complete dataset as Parquet files and load them into whatever tool you use — Python, DuckDB, PostgreSQL, ClickHouse, or even Excel.
```sql
-- Query local Parquet files with DuckDB
SELECT maker, SUM(maker_amount_filled) / 1e6 as volume_usdc
FROM 'order_filled_events/20250115.parquet'
GROUP BY maker
ORDER BY volume_usdc DESC
LIMIT 20;
```
```python
import pandas as pd

# Read daily Parquet files
df = pd.read_parquet("order_filled_events/20250115.parquet")

# Top markets by trade count
df.groupby("taker_asset_id").size().sort_values(ascending=False).head(10)
```
| Table | Records | Description |
|---|---|---|
| order_filled_events | 865M+ | Every fill from both exchanges. Maker, taker, amounts in USDC. |
| positions | 150M+ | Daily snapshot of every wallet's position. Amount, avg price, realized P&L. |
| position_splits | Millions | When users split collateral into outcome tokens. |
| position_merges | Millions | When users merge outcome tokens back into collateral. |
| payout_redemptions | Millions | Winning payouts after market resolution. |
| position_conversions | Millions | NegRisk position conversions. |
| markets | Thousands | Market metadata: question, slug, condition ID, token IDs. |
Files are split by day (e.g., order_filled_events/20250115.parquet) so you can download only the period you need.
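Because the files are named by day, selecting a period comes down to generating the matching file names. The helper below is a small sketch of that; `daily_files` is a hypothetical name, and it assumes the `<table>/YYYYMMDD.parquet` pattern shown above, which may not hold for every table.

```python
from datetime import date, timedelta
from typing import List


def daily_files(table: str, start: date, end: date) -> List[str]:
    """List one Parquet path per day from start to end, both inclusive."""
    days = (end - start).days
    if days < 0:
        raise ValueError("end must not be before start")
    return [
        f"{table}/{start + timedelta(days=i):%Y%m%d}.parquet"
        for i in range(days + 1)
    ]


for path in daily_files("order_filled_events", date(2024, 10, 30), date(2024, 11, 2)):
    print(path)
```

The resulting list can be passed to whichever reader you use, one file at a time or all together.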
| | Polymarket API | Self-index | Parquet dumps |
|---|---|---|---|
| Cost | Free | Free + infra | From $49/mo |
| Historical data | Limited | Full | Full (since 2022) |
| Setup time | Minutes | Weeks | Minutes |
| Maintenance | None | Ongoing | None |
| Positions / P&L | No | You build it | Pre-computed |
| Date filters | No | You build it | SQL WHERE clause |
| Cross-market joins | No | You build it | SQL JOIN |
| Real-time | Yes | You build it | Pro plan: sub-second |
| Best for | Dashboards, current prices | Full control, custom logic | Backtesting, research, analytics |
DuckDB can query Parquet files over HTTP with zero setup. No database to install, no data to download first. This is the fastest way to start exploring Polymarket data.
```bash
# Install DuckDB (macOS)
brew install duckdb

# Or with pip
pip install duckdb
```
```sql
-- Launch DuckDB (run `duckdb` in a terminal) and query directly

-- Total USDC volume per month
SELECT
    strftime(timestamp, '%Y-%m') as month,
    round(SUM(maker_amount_filled) / 1e6, 2) as volume_usdc,
    COUNT(*) as trades
FROM 'order_filled_events/*.parquet'
GROUP BY month
ORDER BY month;
```
Current market data is free through the Polymarket REST API. Full historical trade data requires either building your own indexer (free but weeks of engineering) or using a data provider.
The CTF Exchange launched in 2022. The NegRisk Exchange launched later in 2023. The dumps include all events from both exchanges since deployment.
Everything is Parquet (ZSTD-compressed, columnar). Use DuckDB for the fastest experience — it reads Parquet natively over HTTP with zero setup. pandas, Spark, and ClickHouse also work out of the box.
To compute a wallet's P&L yourself, you need to replay all fills, splits, merges, and redemptions for that wallet's token IDs. The positions table in the dumps has this pre-computed as a daily snapshot with avg_price and realized_pnl.
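As a rough illustration of that replay, the sketch below tracks one token position with the average-cost method. Both the method and the `Position` class are assumptions chosen for clarity, not necessarily how the dumps compute avg_price and realized_pnl. How to price tokens obtained through splits or given up through merges is a convention you have to pick yourself, so the caller supplies the price for every event.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Position:
    size: Decimal = Decimal(0)
    avg_price: Decimal = Decimal(0)
    realized_pnl: Decimal = Decimal(0)

    def acquire(self, qty: Decimal, price: Decimal) -> None:
        """Add tokens (buy, or split at a chosen price) and update average cost."""
        if qty <= 0:
            raise ValueError("qty must be positive")
        total_cost = self.size * self.avg_price + qty * price
        self.size += qty
        self.avg_price = total_cost / self.size

    def dispose(self, qty: Decimal, price: Decimal) -> None:
        """Remove tokens (sell, merge, or redemption) and realize P&L."""
        if qty <= 0 or qty > self.size:
            raise ValueError("qty must be positive and no larger than the position")
        self.realized_pnl += qty * (price - self.avg_price)
        self.size -= qty
        if self.size == 0:
            self.avg_price = Decimal(0)


pos = Position()
pos.acquire(Decimal("100"), Decimal("0.40"))   # buy 100 at $0.40
pos.acquire(Decimal("100"), Decimal("0.60"))   # buy 100 at $0.60, average is now $0.50
pos.dispose(Decimal("50"), Decimal("0.70"))    # sell 50 at $0.70
pos.dispose(Decimal("150"), Decimal("1.00"))   # redeem 150 winning tokens at $1.00
print(pos.realized_pnl)
```

The sale realizes 50 × $0.20 and the redemption realizes 150 × $0.50, so the position closes with $85 of realized P&L.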
The API and the on-chain data are not the same thing. Polymarket's API serves market metadata and recent activity. The on-chain data includes every transaction that ever happened, including trades through aggregators, splits, merges, and direct transfers that never touch Polymarket's frontend.