Valyu — Unified Data Search

Give AI agents access to 25+ specialized data sources — 40M+ academic papers, SEC filings, clinical trials, patents, real-time news, and prediction markets — through a single API.

Built for AI agents that need reliable, citable research data.

Integrates natively with Claude, OpenAI, LangChain, LlamaIndex, and the Vercel AI SDK. Get started with $10 in free credits at platform.valyu.ai.

What This Skill Does

Core Capabilities

  • Search API: Query across web, academic (40M+ papers via arXiv/PubMed), medical, financial, and proprietary sources with date range and relevance filters
  • Contents API: Clean markdown extraction from up to 10 URLs simultaneously; supports structured JSON schema extraction
  • Answer API: AI-synthesized responses with citations; fast mode for low latency, streaming support
  • DeepResearch API: Comprehensive research reports in fast (~5 min), standard (~10–20 min), and heavy (~90 min) tiers
  • 27 Recipes: Ready-to-use workflows covering search, extraction, answers, and research
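Because the Contents API accepts up to 10 URLs per request, a longer URL list has to be split into batches before extraction. The following is a minimal sketch of that batching step in plain Python. The helper name `batch_urls` is hypothetical and not part of the Valyu SDK, and the extraction call itself is deliberately left out.

```python
from typing import Iterator, List

MAX_URLS_PER_REQUEST = 10  # limit stated for the Contents API above


def batch_urls(urls: List[str], size: int = MAX_URLS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each at most `size` items long.

    Hypothetical helper for illustration; not part of the Valyu SDK.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(23)]
    for batch in batch_urls(urls):
        # Each batch would be sent as one Contents API request.
        print(len(batch))
```

Each yielded batch can then be passed to a single extraction request, keeping every call within the stated limit.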

Installation

BASH
npx skills add valyuAI/skills

Get your API key at platform.valyu.ai.

Quick Start

PYTHON
from valyu import Valyu

client = Valyu(api_key="your_api_key")

# Search academic papers
results = client.search(
    query="CRISPR gene editing cancer treatment efficacy",
    sources=["academic", "medical"],
    max_results=10
)

# Get a synthesized answer with citations
answer = client.answer(
    question="What are the latest developments in quantum computing?",
    mode="fast"
)

Data Sources

Academic & Scientific

  • arXiv: preprints across physics, math, CS, and biology (part of the 40M+ paper academic index)
  • PubMed: biomedical and life science literature
  • Patent databases: global patent search

Financial & Business

  • SEC EDGAR — filings, earnings, disclosures
  • Prediction markets — real-time probability data

Medical

  • ClinicalTrials.gov — active and completed trials
  • Medical literature databases

Web & News

  • Real-time news across major publications
  • General web search with relevance ranking

DeepResearch Tiers

Tier      Duration     Best For
Fast      ~5 min       Quick overviews, initial exploration
Standard  ~10–20 min   Comprehensive analysis with citations
Heavy     ~90 min      Exhaustive research reports
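One way to use these durations is to pick the most thorough tier that still fits a caller's time budget. The sketch below assumes the upper bound of each approximate duration and uses lowercase tier names as plain strings. The function `pick_tier` is hypothetical, not an SDK call, and actual run times may vary.

```python
from typing import Optional

# Approximate upper-bound durations in minutes, taken from the tier list above.
TIER_MINUTES = {"fast": 5, "standard": 20, "heavy": 90}


def pick_tier(budget_minutes: float) -> Optional[str]:
    """Return the most thorough tier whose expected duration fits the budget.

    Returns None when even the fast tier would exceed the budget.
    Hypothetical helper for illustration; not part of the Valyu SDK.
    """
    fitting = [tier for tier, minutes in TIER_MINUTES.items() if minutes <= budget_minutes]
    if not fitting:
        return None
    # The tier with the longest duration is treated as the most thorough.
    return max(fitting, key=TIER_MINUTES.get)


if __name__ == "__main__":
    for budget in (2, 5, 30, 120):
        print(budget, pick_tier(budget))
```

A caller with a 30-minute budget would get the standard tier under these assumptions, since heavy reports are expected to take about 90 minutes.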

Search Best Practices

Keep queries under 400 characters, specific, and focused on a single topic. Broad or multi-topic queries return lower-quality results.

Good queries:

  • "mRNA vaccine efficacy COVID-19 variants 2024"
  • "SEC Form 10-K Apple Inc fiscal year 2024"

Avoid:

  • "tell me everything about AI" (too broad)
  • Multiple unrelated topics in one query
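The length guideline can be checked mechanically before a query is sent. The sketch below covers only the empty-query and 400-character checks; whether a query is specific and single-topic still needs human or model judgment. The function name `query_problems` is hypothetical and not part of the Valyu SDK.

```python
from typing import List

MAX_QUERY_CHARS = 400  # guideline from the text above: stay under this length


def query_problems(query: str) -> List[str]:
    """Return guideline violations for a query; an empty list means it passes.

    Hypothetical helper for illustration; not part of the Valyu SDK.
    """
    problems: List[str] = []
    text = query.strip()
    if not text:
        problems.append("query is empty")
    if len(text) >= MAX_QUERY_CHARS:
        problems.append(
            f"query is {len(text)} characters; keep it under {MAX_QUERY_CHARS}"
        )
    return problems


if __name__ == "__main__":
    print(query_problems("mRNA vaccine efficacy COVID-19 variants 2024"))
    print(query_problems("x" * 450))
```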

Integrations

  • Anthropic Claude — native tool use support
  • OpenAI — function calling compatible
  • LangChain — Valyu retriever
  • LlamaIndex — data connector
  • Vercel AI SDK — streaming support
  • MCP Server — Model Context Protocol
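For readers wiring a search call into Claude by hand, a tool is described to the Anthropic Messages API as a dictionary with `name`, `description`, and a JSON Schema under `input_schema`. The sketch below shows what such a definition might look like. The tool name `valyu_search` and its parameters are assumptions that mirror the Quick Start example, not the definition shipped by the native integration.

```python
import json

# Hypothetical tool definition in the Anthropic tool-use format.
# The name and parameters are illustrative assumptions, not Valyu's own schema.
valyu_search_tool = {
    "name": "valyu_search",
    "description": (
        "Search academic, medical, financial, and web sources. "
        "Use specific, single-topic queries under 400 characters."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "maxLength": 399,
                "description": "Specific, single-topic search query.",
            },
            "max_results": {
                "type": "integer",
                "minimum": 1,
                "description": "Maximum number of results to return.",
            },
        },
        "required": ["query"],
    },
}

if __name__ == "__main__":
    # The definition must be JSON-serializable to be sent in an API request.
    print(json.dumps(valyu_search_tool, indent=2))
```

A definition like this would be passed in the `tools` list of a Messages API request, with the application executing the actual search when the model requests the tool.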