Shaga logo

The Default Data Rail for World-Models

DePIN Network Subsidizing Cloud Gaming Through World-Model Data Sales

2+ Year DePIN Network Cloud Gaming Subsidized Centralized Can't Replicate

The Trillion Dollar Opportunity

$1T+
World Models market potential
"Gold Rush"
Every major lab racing to build
Selling Shovels
Essential data infrastructure

"Most Mind-Blowing Tech Ever"

"Next trillion dollar business"
"Killer use case for VR"
"Paradigm changing moment"
Industry expert demo and analysis

Investor Quick Reference

Key numbers for your model (click to expand)

Unit Economics
COGS (validated data) per hour
$1.00 ($SHAG)
Market price per hour
$2–5
Margin (3 buyers)
83%
Data Deals
Signed contract (5$/h -> 1.4$/h)
23M hrs/3yr
Monthly cap
1M hrs
Current run-rate
520k hrs/yr
Supply Constrained
Active Nodes
1,140
Invite Only Gamers
7,130
Waitlisted Gamers
~1M

Market Landscape at a Glance

Why Decentralized Networks Win This Market

Why Centralized Efforts Fail

  • Cost prohibitive: Paying human laborers $2-3/hour for gameplay data
  • Low signal: Google's SIMA lacks player diversity, human irrationality & emotions.
  • Scale limitations: Cannot coordinate thousands of simultaneous players
  • Rights complexity: Individual licensing across multiple game publishers

Shaga's DePIN Solution

  • Free gameplay: Players get free cloud gaming—data sales cover costs
  • Authentic behavior: Real players making genuine decisions, not performing tasks
  • Massive scale: ~1M Waitlisted Gamers, 7k+ Gamers (Invite-Only), 1,140 Nodes
  • Network-level licensing: Bulk deals with publishers for entire ecosystem

The Scaling-Law Urgency: Genie3 now = GPT3 in 2020

Video and world models now show power-law scaling (bigger models = better models) like LLMs did in 2020. Labs that acquire the most interactive data the fastest will lead the next wave of AI.

The Unreplicable Moat (Timing)

Labs need this data now. Frontier training runs are active; they can't wait 12–24 months for a new network to bootstrap. We're ready now—the chicken‑and‑egg is already solved on Shaga.

  • 1,140 Active Nodes — Nearly 1M gamers waitlisted, hungering for supply to unlock
  • Streamers + gaming communities are in place; buyers can turn on with minimal lead time

Executive Summary: Why Shaga Wins

The Market Signal

General Intuition raised $134M seed (Khosla, General Catalyst) using Medal's gaming clips. OpenAI offered $500M for Medal. DeepMind, Microsoft, xAI are building world models.

The bottleneck is causality data: how inputs change frames.

Why Shaga's Data Wins

  • Cloud gaming captures better data: Long sessions (10-60+ min), pure native interactions, 120Hz synchronized controls
  • Uncapped scale: 1 PC = many gamers (vs. Medal's 1 PC = 1 gamer). Massive throughput per node
  • Diverse distributions: Casual gamers, not just pro streamers. Richer behavioral patterns

The Strategy:

[DATA] Build relationships with labs selling data & bootstrap network growth today.

[COMPUTE] Become the rails for world model edge-computing tomorrow.

Economic Advantage: The Resale Multiplier

COGS Structure
$1.00/hr
Paid in $SHAG tokens
Only for data passing validation
$1.00
Fixed COGS per validated hour
Sell to N buyers
Marginal revenue ~100%
Conservative
$1.50/hr
1 resale
33%
Gross margin
Base Case
$2/hr × 3
3 buyers = $6 revenue
83%
Gross margin
Target
$3/hr × 5
5 buyers = $15 revenue
93%
Gross margin

Key Insight: Data production costs are paid once; Resale opportunity is infinite.

Traction & Production Run‑Rate

Network Metrics

7,130
Clients
iOS/Android apps.
+60% MoM growth.
340k hrs
Network Traffic (Mar-Oct)
Cloud gaming hours streamed.
Note: Metrics reconciled monthly.
~1M
Waitlisted Gamers
Pending activation → Gaming Activity
Note: Metrics reconciled monthly. Cloud gaming traffic proves supply-side functionality; persistence layer activates Phase-1 data collection.

$SHAG Token: The Data Acceleration Engine

We Have Demand
  • Token rewards accelerate nodes onboarding
  • Higher rewards for rare mechanics & quality data
  • Network effects: more creators → better coverage
Revenue Acceleration
  • Faster data collection → faster revenue scaling
  • Supply/demand matching through token incentives
  • Virtuous cycle: more data sales → bigger token rewards

Phase-1 Target: Data Persistence Layer

Goal: Convert existing network activity into AI-training datasets. Target ≥1M hrs/mo with $SHAG token incentives driving creator participation.

1M+ hrs/mo
Phase-1 data collection target

Product & Pricing Primitives

The "Premium Hour" Standard

Shaga's billable unit isn't raw footage—it's a rigorously defined data product. Each "premium hour" represents:

  • Training-quality (AFK/menu filtered, QA validated)
  • Video + native player controls (time-synced)
  • Clean IP rights for commercial AI training

Current Production SKU

Core

$22M/3yr Deal • 2$/hr avg
720p60 @ 120Hz sync

Production-grade interactive data

  • Video: 720p60 MP4 (H.264, 6–12 Mbps CBR)
  • Controls: 120Hz native capture (not inferred)
  • Sync: ≤4ms p95 video↔controls alignment

What AI Labs Need in Training Data

Title Diversity

Broad coverage of first-person titles across genres—shooters, RPGs, simulations—to capture varied physics, lighting, and interaction mechanics.

Skill Distribution

Full spectrum of player abilities, including valuable mistakes—missed shots, car crashes, failed jumps. Low-skill play enriches behavioral distributions.

Long Sequences

Extended gameplay sessions (10-30+ minutes) to capture temporal dependencies, multi-step decision chains, and evolving strategy—not just isolated moments.

Future Data Enrichment Roadmap

As AI labs evolve from pre-training to fine-tuning and RL phases, Shaga can systematically enrich data without rebuilding supply:

Engagement Signals

  • • Key-hold duration & mouse velocity
  • • Player skill brackets & outcomes
  • • Decision-making moments

Higher Fidelity

  • • 1080p60+ resolution
  • • 240Hz control sync
  • • Ray-tracing & ultra settings

3D Geometric Data

  • • Camera pose extraction
  • • Depth maps (RGB-D)
  • • Object segmentation masks

Buyer Economics & Shaga's Cost Advantage

The "DIY" Approach

For labs considering building their own datasets:

Labor + Equipment ≥$2.00/hour
QA Fallout +30-50%
Legal Uncertainty High Risk
Total True Cost >$2.00/hour

Shaga's Model

"Produce Once, Resell Many" advantage:

COGS (validated data) $1.00/hour
Payment currency $SHAG tokens
Payment trigger Post-validation
Fixed COGS $1.00/hour

The Cloud Gaming Multiplier: 1 Node → Many Gamers

DIY Data Production (Labs' Alternative)

To produce 1,000 hours of data, a lab must provision 1,000 gaming PCs—each tied to a single local gamer. Fixed hardware costs, geographic constraints, and idle time waste.

1 PC = 1 Gamer
High capex, low throughput

Shaga's Cloud Gaming Model

A single gaming node streams to multiple gamers remotely from different regions and time zones. Same hardware, continuous utilization, exponentially higher data throughput per dollar of infrastructure.

1 Node = Many Gamers
Shared infra, massive scale

The Resale Moat in Practice

1 Buyer
Single resale
80% Margin
@ $5.00/hour
5 Buyers
Multiple resales
60% Margin
@ $0.50/hour each
8+ Buyers
High volume
Defense Mode
Price wars & competition

The "Gold Rush" Buyer Map

The world model ecosystem spans hyperscalers and startups—all racing to solve the same data bottleneck. Shaga's buyer map categorizes these customers by their strategic importance, budget size, and specific data requirements.

Tier A: Strategic Hyperscalers

Biggest Budgets

Google DeepMind

Project

Genie 2/3, GameNGen

Data Need

High-fps gameplay with synchronized actions, camera-rich 3D scenes

Microsoft Research

Project

MineWorld, Muse/WHAM

Data Need

Minecraft-style frames+controls, minutes-long FPS trajectories

OpenAI

Project

Sora & Sora 2

Data Need

Interactive gameplay data — offered $500M for Medal last year

xAI (Elon Musk)

Project

World Models for gaming & simulation

Data Need

Interactive gameplay data with causal physics understanding

Meta AI

Project

V-JEPA 2

Data Need

Large-scale internet video + interaction data

Alibaba

Project

The Matrix

Data Need

AAA game footage + real-world video with player actions

Tier B: Frontier Startups

High Growth

Incomplete, new ones funded under the radar

World Labs

3D spatial world models

Odyssey

Photorealistic worlds

Decart

Interactive models

Skywork AI

Matrix-Game 2.0

Data Broker Outreach

We're engaging with established data brokers who already serve frontier AI labs. These partnerships provide immediate access to buyer networks while we build direct relationships.

Enterprise Brokers

  • Defined.ai - Multimodal datasets
  • Appen - Enterprise programs
  • Sama - Curated datasets
  • TELUS International - Scale programs

Specialized Brokers

  • iMerit - Video/robotics focus
  • Toloka - Custom collections
  • TransPerfect - Global reach

Data Marketplaces

  • AWS Data Exchange
  • Snowflake Marketplace
  • Datarade
  • Narrative I/O

Lighthouse Customer: Proof of Demand

Wayfarer Labs has signed as our first enterprise customer, validating real market demand for interactive gaming data in world model training.

This partnership proves labs are actively seeking this exact data type and willing to pay premium rates for quality interactive datasets.

Pricing Strategy: Price Maker → Price Weapon

Phase I: Price Maker

As the dominant supplier, Shaga sets market prices. With limited competition and high demand, pricing anchors at premium levels to maximize revenue capture.

Standard data (Core SKU) $2-5/hr
Premium/rush delivery $5-10/hr
Temporary exclusivity 3-10× premium

Extract maximum value while supply remains constrained and buyer urgency is high.

Phase II: Price Weapon

When competitors emerge, Shaga can weaponize its resale model. Because COGS are paid once and resold to N buyers, Shaga can price aggressively to starve competitors who lack multi-buyer depth.

The Resale Weapon:
COGS: $1.00/hr (paid once)
With 5 buyers: Sell at $0.20/hr each = break even
Competitor with 1 buyer: Needs $1.00/hr to break even

Result: Shaga can undercut competitors 5:1 and still maintain profitability through resale depth. Competitors without distributed buyer relationships get priced out.

Aggressive Defense Playbook

In contested market segments, Shaga can temporarily price at $0.20-0.50/hr while securing 4-8 buyers per hour. This pricing is below any competitor's break-even unless they also have deep resale networks—which takes years to build.

Strategic pricing becomes a moat: competitors without multi-buyer infrastructure cannot survive a price war, even if they match Shaga's production costs.

Competitive Model Generations Drive Exponential Demand

The Competitive Dynamic: Industry rumors suggest Genie 3 trained on ~1M hours. As labs race to outperform each other, each model generation requires 10-100× more data. Every competing lab needs this data to stay relevant.

Scaling Law Context: The GPT-3→GPT-5 Parallel

Language Models: GPT-3 (300B tokens) → GPT-5 (~15T tokens) = 50× data increase = transformative capability leap
World Models: Genie 3 (~1M hrs) → Genie 5 (~100M hrs) = 100× data increase = same competitive pressure

Gen3 Models (Now)

All labs competing: 6-8 Tier-A + 7 Startups
Labs competing: ~13-15
Data per model: ~1M hrs
Total Demand: 13-15M hrs

Gen4 Models (2025-26)

Tier-A labs + well-funded startups
Labs competing: ~8-11
Data per model: ~10M hrs
Total Demand: 80-110M hrs

Gen5 Models (2027+)

Only top 4 hyperscalers survive
Labs competing: ~4
Data per model: ~100M hrs
Total Demand: 400M hrs

Key Insight: Demand is driven by competitive pressure, not just model performance. Every lab needs Gen4 data to compete with Gen4 models. Labs that fall behind a generation lose market relevance. This creates sustained, exponential demand independent of any single lab's training schedule.

Supply Engine: Frictionless P2P Scale

The Critical Innovation: Nodes Don't Play

Every gaming PC in the world can become a data-producing node—without the owner playing.

Node owners simply leave their PC online. Shaga's cloud gaming infrastructure streams to remote gamers from different regions and time zones. The node owner collects $SHAG tokens while sleeping, working, or traveling. The remote gamer gets free cloud gaming. Labs get data.

Frictionless onboarding = exponential supply growth. No competitor can replicate this P2P model.

What if GPUs designed for gaming, were used for gaming? (shocking)

Millions of consumer gaming GPUs sit idle because they can't compete in AI DePIN networks. RTX 3060s, 3070s, 4060s, 4080s, 4090s—the entire consumer GPU stack from NVIDIA and AMD—are poorly suited for AI training workloads compared to datacenter GPUs (A100, H100, B200).

Consumer GPUs: Bad for AI Compute

  • • Lower FP16/INT8 throughput vs datacenter cards
  • • Limited VRAM (8-24GB vs 80-192GB for H100/B200)
  • • Poor performance/watt for training workloads
  • • Can't compete in AI DePIN networks (Render, Akash, io.net)

Result: Idle consumer GPUs earn minimal returns in AI networks

Consumer GPUs: Perfect for Cloud Gaming

  • • Designed for real-time rendering at 60-120+ FPS
  • • NVENC/AMD VCE hardware encoding built-in
  • • 8-16GB VRAM sufficient for AAA games at 1080p-1440p
  • • Power efficiency optimized for sustained gameplay

Result: Shaga monetizes GPU supply other networks can't use

The Supply Unlock

Competitive Advantage: While AI DePIN networks compete for scarce datacenter GPUs, Shaga taps into hundreds of millions of consumer gaming GPUs sitting idle worldwide. These cards can't earn meaningful returns in AI compute markets but are perfectly suited for high-quality cloud gaming and data generation.

Streamer-Led GTM: Supply + Distribution in One

Streamers are both suppliers AND go-to-market channels. A single partnership unlocks:

  • Supply: Streamer's PC becomes a node (passive income)
  • Distribution: Their audience onboards as nodes or gamers (viral loop)
  • Content: Data collection becomes monetizable content (dual revenue)
Example Impact:
Top streamer reach: 1M+ viewers
Community conversion: 0.5-2% → 5-20k nodes
UGC multiplier: 10-50× organic reach

Scale Physics: 100M Hour Capacity

10k
Nodes (target onboarding)
160k hrs/day
Production capacity @ 16hr/day
~21 months
To mint 100M hours

Scale is technically feasible. Constraints are economic (token incentives, node onboarding), not physical infrastructure.

What Convinced Wayfarer Labs: Unprecedented Data Diversity

1,800
Unique Games
Played by network participants
3,731
Active Players
Player diversity across skill levels
315
Avg Hours/Game
Consistent quality across titles

No other supplier offers this breadth + depth combination. Supports all PC games.

Go-to-Market: Demand & Supply Channels

Supply Acquisition: Data-Driven Emissions (100M PCs)

TAM & Regional Economics

268M PC gamers mapped by region with latency acceptance bands, ARPU ranges ($1.49-$36/mo), and infrastructure costs.

Node Economics

Operator payback periods, regional cost structures, and token emission schedules. Shows why nodes activate in premium markets.

5-Year Projections

Conservative/Base/Aggressive scenarios with subscriber growth, utilization targets, and revenue scaling by market.

View Shaga Explorer

Tier A: Strategic Hyperscalers (6-8 labs)

Offering

Core 720p60 @ 120Hz sync product. Gen4 model training volumes (10M+ hrs/yr).

Pricing

$2-5/hr (price maker). Premium pricing for rush delivery or title prioritization.

Motion

Direct BD led by founders. Wayfarer Labs as lighthouse customer ($22M/3yr).

Tier B: Frontier Startups (7+ labs)

Incomplete list, new ones funded under the radar. World Labs, Odyssey, Decart, Skywork AI, etc.

Offering

Same Core product. Gen3-Gen4 training volumes (1M+ hrs/yr).

Pricing

$2-3/hr volume pricing. Pay-as-you-go or quarterly contracts.

Motion

Direct founder BD + data broker partnerships for distribution.

Channel Partners: Brokers & Marketplaces (Low-Volume)

Target Channels

Enterprise brokers (Defined.ai, Appen, Sama), specialized brokers (iMerit, Toloka), data marketplaces (AWS Data Exchange, Snowflake).

Use Case

Handle low-volume purchases (10k-50k hrs/yr). Robotics labs, academic groups, or smaller buyers exploring world model data.

Strategy

Brokers provide distribution without Shaga's direct sales effort. High-volume buyers graduate to direct channel.

Defensible Advantages

1. Only Scaled Supplier of Full-Stack Training Data

Wayfarer Labs signed $22M/3yr because Shaga is the ONLY supplier with: pixels + synchronized controls (120Hz) + clean rights + train-ready format + proven diversity (1,800 games). Medal has clips but no controls. Google SIMA uses bots (expensive, low signal). Replay sites have controls but no pixels. Shaga solves the full stack.

Data quality differentiation + production readiness = switching costs once labs integrate Shaga pipelines into training workflows.

2. Impossible-to-Replicate Timing Window

2+ years of network operations. Chicken-and-egg (gamers need nodes, nodes need gamers) is SOLVED. Labs need data NOW for Gen3/Gen4 models—they can't wait 12-24 months for competitors to bootstrap networks. This timing window is worth billions if executed correctly.

Move fast: sign 3-5 more labs in 6 months. Become THE default supplier while competitors are still onboarding their first nodes.

3. DePIN Cost Structure (50-75% Cheaper)

Labs' alternatives all fail on economics: DIY (hire employees @ $50+/hr, unscalable). Medal: 1 PC = 1 gamer (linear, short clips). QA houses: $2-3/hr labor + build-to-order, no inventory. Shaga: $1/hr COGS + 1 node = many gamers + resale to N buyers.

Cost advantage widens with scale. At 10k nodes, per-hour cost goes DOWN (token rewards + network effects) while centralized costs stay flat.

Bottom Line: Only scaled supplier of bottleneck resource in the biggest AI race since LLMs, with 2-year head start and 50-75% cost advantage nobody can replicate.

Risk Assessment & Competitive Defense

Market Catalyst: Medal's Vertical Integration Creates Supply Gap

General Intuition raised $134M seed (Khosla Ventures, General Catalyst) using Medal.tv's gaming data, then vertically integrated. Medal is now a lab—not an open data supplier.

Impact: This removes 10M+ gamers from the open market. Every other lab (OpenAI, DeepMind, xAI, Microsoft, Anthropic) now needs alternative sources for interactive gaming data.

Our Position: We're the only scaled alternative with pixels + synchronized controls + multi-game breadth ready for immediate delivery. Labs that were evaluating Medal now have one option: us.

Competitive Landscape: Why Alternatives Can't Scale

Real competition comes from alternative data suppliers trying to serve the same 100M+ hour demand. Brokers (Appen, Scale AI, Defined.ai) are partners who provide distribution, not competitors building supply.

Telemetry/Replay-Only Platforms

Examples: ballchasing.com, OpenDota, HSReplay, PureSkill.gg, PUBG API, Overwolf Game Events

They have: Controls/telemetry for single games

They lack: Video frames. Controls without synchronized pixels = useless for world model training (can't learn visual prediction from historical control logs)

Risk Level: LOW - fundamentally wrong data type

IDM/LAM-Derived Actions (Model-Labeled)

Approach: Labs infer actions from frame pairs (inverse dynamics) or learn latent action mappings

Why it breaks: ~10% prediction error per step. Non-identifiable problem (many actions → same frame transition) creates biased labels that underpredict high-frequency corrections.

Result: Error compounds at scale → model collapse. Models learn "smooth" actions that fail in fast FPS dynamics.

Risk Level: MEDIUM - Labs try this, hit scaling limits, then buy Ground-Truth data (us)

QA/Playtesting Houses (Build-to-Order)

Examples: Keywords Studios, PlaytestCloud, custom capture programs

They have: Can produce custom datasets with right format (pixels + controls)

They lack: Linear cost structure ($2-5/hr labor), no resale economics, build-to-order bottleneck

Risk Level: HIGH - Most credible alternative format-wise

Our advantage: DePIN + $SHAG unlock exponential supply ($1/hr COGS) vs their linear hiring constraints

Cloud Gaming Platforms

Examples: GeForce NOW, Xbox Cloud, Luna

They have: Massive infrastructure, millions of users, publisher relationships

They lack: (1) Speed (12-18 mo approval/build cycles), (2) Brand risk tolerance (gamers wouldn't want gameplay sold)

Risk Level: MEDIUM - biggest long-term threat, but slow-moving and risk-averse

Our advantage: 18-24 month speed advantage + neutral third-party positioning (explicit opt-in + token rewards)

Shaga's Differentiation

  • Only supplier with breadth + pixels + synchronized controls + clean rights
  • Train-ready packaging: Data schema, loaders, QA pipelines, ShagaScore validation
  • Cross-title normalization at scale: 1,800 games, consistent quality
  • DePIN cost structure: $1/hr COGS vs $2-5/hr centralized alternatives
  • Resale economics: Produce once, sell to N buyers (margin scales with buyer depth)

IP & Publisher Relationships: Three-Layer Defense

Layer 1: DePIN Regulatory Arbitrage

  • • Decentralized protocol structure with token-based payments
  • • Individual node operators make autonomous data decisions
  • • Jurisdictional complexity across global node network
  • Result: Creates operational flexibility during rapid scaling phase

Layer 2: Publisher Partnership Model - "Infinite DLC" Strategy

Publishers can't build data infrastructure but want AI monetization. We convert potential IP risk into revenue partnerships:

  • • Publishers license titles to us, we handle data ops, they get 20-30% revenue share
  • • They receive fine-tuned world models for their games = new revenue stream (DLC, UGC, infinite content)
  • Target: Sign 3-5 major publishers (Riot, Epic, indies) in next 12 months with explicit licensing deals

Layer 3: Legal Framework

  • • $2-3M allocated for direct licensing deals and legal infrastructure
  • • Title whitelist of confirmed-safe games
  • • Portfolio approach: build publisher partnerships to legitimize business model
  • • Focus on publishers who benefit from AI ecosystem growth (indie devs, catalog monetization)

Big Tech Competition: Speed as Moat

Microsoft (Azure + Xbox), NVIDIA (GeForce NOW), Amazon (AWS + Luna) have resources but face constraints:

Our Advantages:

  • Speed: We're live with 2+ years of infrastructure. They need 12-18 months for approvals and builds.
  • Platform Risk: They face brand backlash if users discover gameplay is sold. We're neutral third-party with explicit opt-in.
  • Focus: They're distracted by $B-scale core businesses. We're 100% focused on this market.

Data Quality & Commoditization Defense

Mitigation Strategy:

  • Prove Quality Delta: Publish benchmarks showing training performance advantages
  • Build Switching Costs: Deep pipeline integration with labs' training infrastructure (6-12 months engineering work to replicate)
  • Exclusive Relationships: Custom data collection via publisher partnerships (priority access to new game launches)
  • Tooling Ecosystem: Train-ready loaders, quality validation, continuous monitoring → we become infrastructure, not just data

Operational Risks (LOW)

Wayfarer Labs Dependency

  • • Sign 2-3 additional lab partners for publisher lab-as-a-service
  • • Position as neutral data infrastructure layer that ANY lab can plug into

Privacy & Compliance

  • • Comprehensive consent flows, age verification (COPPA)
  • • On-device content redaction for sensitive data
  • • Geo-fenced storage (GDPR, CCPA compliance)

The 18-24 Month Window: First-Mover Becomes Default Infrastructure

Three converging tailwinds create our land-grab moment:

1. Big tech is 12-18 months behind

(approvals, builds, bureaucracy)

2. Legal clarity is 18-24 months away

(time to build publisher legitimacy)

3. Medal's exit leaves market undersupplied RIGHT NOW

(labs need data today)

Critical Mass = Irreversible Moat:

  • 5-8 major lab contracts → switching costs lock us in
  • 100M+ hours delivered → we ARE the training data standard
  • 3-5 publisher partnerships → IP risk converts to revenue
  • 10k+ nodes → supply becomes unreplicable

This is a land-grab moment. We're the only scaled supplier ready now. By the time big tech builds or legal challenges emerge, we'll be embedded in every major lab's training infrastructure.

The window is open. We're moving.

The Investment Opportunity

Market Timing: Three Converging Forces

Supply Shock

Medal (10M+ gamers) vertically integrated with General Intuition. Open market supply disappeared overnight. Every other lab needs alternative sources NOW.

Scaling Law Breakthrough

World models show power-law scaling like LLMs in 2020. Labs that secure data supply first dominate Gen4/Gen5 generations.

Speed Advantage

We're live with 2+ years of infrastructure. Big tech needs 12-18 months to build. This is a land-grab moment.

The Path: Data Supplier → Infrastructure Standard

Phase 1 (Now): Only scaled supplier post-Medal. 80%+ margins, 4-8× resale leverage.
Phase 2 (12-18mo): 5-8 major labs integrated. 100M+ hours = training data standard. Deep switching costs.
Phase 3 (24mo+): Infrastructure layer for world models. Labs build on our formats and schemas. Same compute network scales to edge inference.

TAM Evolution:

$500M data market → $5B infrastructure → $50B+ compute layer

Why We Win

Unreplicable Timing

2-year head start. Labs can't wait for competitors to bootstrap.

DePIN Economics

50-75% cost advantage. Every dollar generates $6-15 revenue through resale.

Three-Sided Moat

Gamer distribution + lab relationships + compute rails

The Opportunity

Unsupplied
The market
Proven
The economics
Open
The window

18 months to embed in every major lab's infrastructure. Too expensive for big tech to displace. Too late for startups to catch up.