Standardized Data Schemas

Strongly typed JSON and CSV structures under strict version control. We guarantee backward-compatible schema evolution, so there are no surprise pipeline breaks.
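One common reading of backward-compatible evolution is that a newer schema version may add fields but never removes or retypes existing ones. The sketch below checks that property for two field-to-type maps. The `is_backward_compatible` helper and the sample field maps are hypothetical illustrations, not part of the published schemas or tooling.

```python
def is_backward_compatible(old_fields: dict[str, str], new_fields: dict[str, str]) -> bool:
    """Return True if every field in the old schema survives with the same type."""
    for name, type_name in old_fields.items():
        if new_fields.get(name) != type_name:
            return False  # field was removed or its type changed
    return True


# Hypothetical field maps for successive versions of a schema.
base_version = {"doc_id": "string", "cohort_size": "integer"}
additive_version = {"doc_id": "string", "cohort_size": "integer", "p_value": "number"}
retyped_version = {"doc_id": "string", "cohort_size": "string"}  # breaking change

additive_ok = is_backward_compatible(base_version, additive_version)
retype_ok = is_backward_compatible(base_version, retyped_version)
```

Under this definition, adding `p_value` is a compatible change, while changing the type of `cohort_size` is not.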

v2.1 Clinical Trial Entity (Pharma)

Normalized output from global medical journals and adverse event reports, such as drug-induced liver injury (DILI). Entities are mapped to the MeSH ontology and the NCI Thesaurus.

{
  "doc_id": "PMC12834359",
  "source_type": "peer_reviewed_journal",
  "ingestion_timestamp": "2026-03-07T12:00:00Z",
  "clinical_entities": {
    "compound_mesh_id": "D000082",
    "compound_name": "Acetaminophen",
    "adverse_signals": [
      {
        "phenotype": "Drug-Induced Liver Injury (DILI)",
        "mesh_id": "D056486",
        "confidence_score": 0.982,
        "extracted_context": "Significant transaminase elevation observed..."
      }
    ]
  },
  "cohort_size": 245,
  "statistical_significance": true,
  "p_value": 0.003
}
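A consumer might read this record as follows. This is a minimal sketch that uses only field names from the example above; the `strong_signals` helper and its 0.9 confidence threshold are assumptions chosen for illustration, not a documented recommendation.

```python
import json

# Trimmed copy of the example record above.
record = json.loads("""
{
  "doc_id": "PMC12834359",
  "clinical_entities": {
    "compound_mesh_id": "D000082",
    "compound_name": "Acetaminophen",
    "adverse_signals": [
      {
        "phenotype": "Drug-Induced Liver Injury (DILI)",
        "mesh_id": "D056486",
        "confidence_score": 0.982
      }
    ]
  },
  "p_value": 0.003
}
""")


def strong_signals(rec: dict, min_confidence: float = 0.9) -> list[tuple[str, str]]:
    """Return (compound_name, adverse-signal MeSH ID) pairs at or above the threshold."""
    entities = rec["clinical_entities"]
    return [
        (entities["compound_name"], signal["mesh_id"])
        for signal in entities["adverse_signals"]
        if signal["confidence_score"] >= min_confidence
    ]


pairs = strong_signals(record)
```

Raising `min_confidence` above 0.982 would drop the single signal in this record.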

v3.0 Predictive Physics Vector (Sports)

Ultra-low-latency MLB pitch kinematics and calculated prop probabilities. The schema aligns with the Statcast coordinate system.

{
  "event_id": "mlb_pitch_8849201",
  "timestamp_ms": 1710004523000,
  "kinematics": {
    "release_speed_mph": 97.4,
    "spin_rate_rpm": 2450,
    "pfx_x": -0.85,
    "pfx_z": 1.42,
    "plate_x": 0.12,
    "plate_z": 2.85
  },
  "prop_model_deltas": {
    "strikeout_prob_shift": "+0.045",
    "contact_quality_est": "weak_grounder",
    "chase_rate_model": 0.31
  }
}
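Two fields in this record usually need conversion before use: `timestamp_ms` is an integer and `strikeout_prob_shift` is a signed string. The sketch below assumes `timestamp_ms` is Unix epoch milliseconds in UTC, which the field name suggests but the schema text does not state. The helper names are hypothetical.

```python
from datetime import datetime, timezone

# Fields copied from the example record above.
event = {
    "event_id": "mlb_pitch_8849201",
    "timestamp_ms": 1710004523000,
    "prop_model_deltas": {"strikeout_prob_shift": "+0.045"},
}


def event_time_utc(rec: dict) -> datetime:
    """Convert the epoch-millisecond timestamp to a timezone-aware UTC datetime."""
    # Assumption: timestamp_ms counts milliseconds since 1970-01-01T00:00:00Z.
    return datetime.fromtimestamp(rec["timestamp_ms"] / 1000, tz=timezone.utc)


def strikeout_shift(rec: dict) -> float:
    """Parse the signed string delta (for example "+0.045") into a float."""
    return float(rec["prop_model_deltas"]["strikeout_prob_shift"])


when = event_time_utc(event)
shift = strikeout_shift(event)
```

Python's `float` accepts a leading `+`, so the signed string parses without extra handling.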

v1.5 Verified Firmographic Contact (B2B)

Fully enriched professional profiles with SMTP-verified work email addresses and NAICS industry classification.

{
  "contact_id": "usr_9948a8b",
  "work_email": "jane.doe@targetcompany.com",
  "email_validation": {
    "status": "valid",
    "mx_found": true,
    "smtp_check": true,
    "catch_all": false
  },
  "firmographics": {
    "company_domain": "targetcompany.com",
    "estimated_revenue": "$50M-$100M",
    "industry_naics": "5112",
    "tech_stack": ["Salesforce", "Snowflake", "Python"]
  }
}
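The validation flags can be combined into a single deliverability check before a contact enters an outreach list. The rule below (status is `valid`, an MX record was found, the SMTP check passed, and the domain is not catch-all) is an assumed policy for illustration; `is_deliverable` and `domain_matches` are hypothetical helpers.

```python
# Fields copied from the example record above.
contact = {
    "work_email": "jane.doe@targetcompany.com",
    "email_validation": {
        "status": "valid",
        "mx_found": True,
        "smtp_check": True,
        "catch_all": False,
    },
    "firmographics": {"company_domain": "targetcompany.com"},
}


def is_deliverable(rec: dict) -> bool:
    """Assumed policy: require every positive flag and reject catch-all domains."""
    v = rec["email_validation"]
    return bool(
        v["status"] == "valid"
        and v["mx_found"]
        and v["smtp_check"]
        and not v["catch_all"]
    )


def domain_matches(rec: dict) -> bool:
    """Check that the email's domain equals the firmographic company domain."""
    email_domain = rec["work_email"].rpartition("@")[2].lower()
    return email_domain == rec["firmographics"]["company_domain"].lower()


deliverable = is_deliverable(contact)
same_domain = domain_matches(contact)
```

Rejecting catch-all domains is the conservative choice, since an SMTP check against such a domain accepts any mailbox name.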

v1.0 Alternative Finance Signal (Quant)

Macro and micro alternative-data signals derived from hiring trends, biotech trial catalysts, and supply-chain events, each correlated with equity movements.

{
  "signal_id": "fin_signal_00293",
  "ticker": "MRNA",
  "signal_date": "2026-03-07",
  "signal_type": "clinical_trial_catalyst",
  "direction": "bullish",
  "confidence": 0.78,
  "correlated_event": {
    "type": "phase_3_trial_initiation",
    "compound": "mRNA-1345",
    "expected_readout": "Q3 2026"
  }
}
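A downstream model might fold `direction` and `confidence` into one signed score and parse the date fields into structured values. This is a sketch under assumptions: only `"bullish"` appears in the example, so the `"bearish"` entry in the mapping is a guess, and `signed_score` and `parse_quarter` are hypothetical helpers.

```python
from datetime import date

# Record copied from the example above.
signal = {
    "signal_id": "fin_signal_00293",
    "ticker": "MRNA",
    "signal_date": "2026-03-07",
    "direction": "bullish",
    "confidence": 0.78,
    "correlated_event": {"expected_readout": "Q3 2026"},
}

# Assumption: "bearish" is the opposite label; the example only shows "bullish".
DIRECTION_SIGN = {"bullish": 1, "bearish": -1}


def signed_score(rec: dict) -> float:
    """Combine direction and confidence into a score in [-1, 1]."""
    return DIRECTION_SIGN[rec["direction"]] * rec["confidence"]


def parse_quarter(label: str) -> tuple[int, int]:
    """Split a label such as "Q3 2026" into (year, quarter)."""
    quarter, year = label.split()
    return int(year), int(quarter.lstrip("Q"))


score = signed_score(signal)
signal_day = date.fromisoformat(signal["signal_date"])
readout = parse_quarter(signal["correlated_event"]["expected_readout"])
```

An unknown `direction` value raises `KeyError` here, which surfaces unexpected labels instead of silently scoring them as zero.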