Meet CatchAll: Our recall-first web search API

CatchAll is a web search API that generates unique datasets that don’t exist anywhere else on the web. Built on NewsCatcher’s proprietary real-world event index, it delivers state-of-the-art recall—finding all relevant events, not just top results.

How it works

CatchAll processes queries through a multi-stage pipeline that analyzes over 50,000 web pages per job:

Analyze

Generates targeted search queries for NewsCatcher’s proprietary news index and creates validation rules and extraction patterns based on your input.

Fetch

Retrieves and processes 50,000+ articles from web sources to ensure comprehensive coverage.

Cluster

Groups related articles into distinct real-world events using the Leiden algorithm for community detection.

Validate

Applies generated validators to filter clusters, ensuring only relevant events that match your criteria proceed to extraction.

Extract

Transforms validated events into structured JSON records with dynamic schemas tailored to your query.

Processing typically takes 10-15 minutes per job. Poll the status endpoint every 30-60 seconds to track progress through each stage.
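The polling advice above can be sketched as a small loop. This is a minimal illustration, not the official client: fetch_status is a hypothetical callable that you supply (for example, a function that calls GET /catchAll/status/{job_id} and returns the parsed JSON), and the "failed" status value is an assumption. Only "enriching" and a completed state are mentioned on this page.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal states; "failed" is a guess, not confirmed by this page.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 30.0,
    timeout_s: float = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reports a terminal status or the timeout elapses.

    fetch_status is a caller-supplied function returning the status
    response as a dict with a "status" key (hypothetical shape).
    """
    waited = 0.0
    while True:
        response = fetch_status()
        if response.get("status") in TERMINAL_STATUSES:
            return response
        if waited >= timeout_s:
            raise TimeoutError(f"job still running after {waited:.0f} seconds")
        sleep(interval_s)  # 30-60 seconds is the suggested polling interval
        waited += interval_s
```

Passing sleep as a parameter keeps the loop easy to test without real delays; in production the default time.sleep is enough.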

Each job returns structured JSON records with dynamic schemas. Fields like company_name, deal_value, and acquisition_date are automatically generated based on your query. See Quickstart > Review response for a complete response example.

Key characteristics

Event-centric index

CatchAll searches NewsCatcher’s continuously updated index of 2+ billion web pages, optimized for finding real-world events (acquisitions, approvals, incidents) within a recent timeframe. The system excels at comprehensive event discovery, not static content retrieval.

Learn how to construct effective event queries in Write effective queries.

Dynamic schemas

Each job generates a unique response schema. Field names and structure in the enrichment object vary between jobs, even with identical inputs.

Guaranteed fields in every record:

record_id
record_title
enrichment object
citations array

Variable fields:

All fields inside enrichment (names, types, structure)

See Understanding dynamic schemas for integration patterns.
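One way to cope with variable enrichment fields is to rely only on the four guaranteed fields and read everything else defensively. The sketch below assumes a record is a plain dict shaped as described above; the wanted field names and the sample record are hypothetical, since actual enrichment fields differ per job.

```python
from typing import Any, Dict, List

# Fields this page lists as present in every record.
GUARANTEED_FIELDS = ("record_id", "record_title", "enrichment", "citations")


def flatten_record(record: Dict[str, Any], wanted: List[str]) -> Dict[str, Any]:
    """Return a flat row with guaranteed fields plus optional enrichment fields.

    Enrichment fields that are absent in this job's schema become None
    instead of raising KeyError.
    """
    missing = [name for name in GUARANTEED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"record is missing guaranteed fields: {missing}")

    enrichment = record["enrichment"] or {}
    row: Dict[str, Any] = {
        "record_id": record["record_id"],
        "record_title": record["record_title"],
        "citation_count": len(record["citations"]),
    }
    for name in wanted:
        row[name] = enrichment.get(name)  # tolerate schema drift between jobs
    return row
```

Calling flatten_record(record, ["company_name", "deal_value"]) yields the same column set for every job, with None wherever a run did not produce that field.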

Customizable extraction

Control what data gets extracted by providing custom validators and enrichments, or let the system generate them automatically based on your query.

Custom validators filter which events are relevant:

{
  "name": "is_acquisition",
  "description": "true if article describes an acquisition",
  "type": "boolean"
}

Custom enrichments define what data to extract:

{
  "name": "acquiring_company",
  "description": "Extract the acquiring company name",
  "type": "company"
}

The company enrichment type extracts structured data including name, alternative names, website candidates, people, and address.

Use the POST /catchAll/initialize endpoint to get suggested validators and enrichments before submitting your job.
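The validator and enrichment definitions above could be assembled into a request body along these lines. This is a sketch under assumptions: the top-level keys query, validators, and enrichments are guesses at the request shape and should be checked against the API Reference; only the name, description, and type keys of each definition come from the examples on this page.

```python
from typing import Any, Dict, List, Optional

# Keys shown in the validator and enrichment examples on this page.
REQUIRED_KEYS = {"name", "description", "type"}


def check_definitions(items: List[Dict[str, str]], kind: str) -> None:
    """Raise ValueError if any definition lacks name, description, or type."""
    for item in items:
        missing = REQUIRED_KEYS - set(item)
        if missing:
            label = item.get("name", "?")
            raise ValueError(f"{kind} {label!r} is missing keys: {sorted(missing)}")


def build_submit_payload(
    query: str,
    validators: Optional[List[Dict[str, str]]] = None,
    enrichments: Optional[List[Dict[str, str]]] = None,
) -> Dict[str, Any]:
    """Build a job request body (top-level key names are assumptions)."""
    payload: Dict[str, Any] = {"query": query}
    if validators:
        check_definitions(validators, "validator")
        payload["validators"] = list(validators)
    if enrichments:
        check_definitions(enrichments, "enrichment")
        payload["enrichments"] = list(enrichments)
    return payload
```

Omitting both lists leaves the payload with only the query, which matches the case where the system generates validators and enrichments automatically.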

Date range controls

Specify custom date ranges for article search, or let the system determine the optimal time window based on your query (default: 5 days).

Date ranges are validated against your plan’s allowed lookback period. If your requested dates exceed plan limits, the API returns a 400 error with specific guidance.

Use the POST /catchAll/initialize endpoint to preview date adjustments before submitting.
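A client-side pre-check can catch an out-of-range request before the API rejects it with a 400. The sketch below is illustrative only: lookback_days is a placeholder for your plan's limit, which this page does not state, and the server's validation remains authoritative.

```python
from datetime import date, timedelta


def fits_lookback(start: date, end: date, lookback_days: int, today: date) -> bool:
    """Return True if [start, end] lies within the last lookback_days days.

    lookback_days is a hypothetical stand-in for the plan's allowed
    lookback period; the real limit depends on your subscription.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    earliest_allowed = today - timedelta(days=lookback_days)
    return start >= earliest_allowed and end <= today
```

For example, with today set to 2025-01-10 and a 30-day lookback, a range of 2025-01-01 to 2025-01-05 fits, while the same range with a 5-day lookback does not.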

Non-deterministic processing

Identical queries can produce different results:

LLMs may generate different keywords, validators, and extractors
Different content sources may be retrieved
Field names and structure vary between runs
Record counts differ

Asynchronous operation

Each query creates a job that processes asynchronously. Use the returned job_id to poll the job status and retrieve results when completed. Processing typically takes 10-15 minutes.

Track detailed progress through the steps array in the status endpoint response.

Batch processing

Results become available progressively during the enriching stage as validation completes in batches. Check for status: "enriching" to retrieve partial results before job completion.

The progress_validated field tracks how many candidate clusters have been processed. This allows you to access early results while the job continues processing remaining batches.
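When pulling partial results repeatedly, the same records come back on each pull along with newer ones. A minimal sketch of handling that, assuming records are dicts carrying the guaranteed record_id field and that "completed" is the final status value (an assumption; "enriching" is the one named above):

```python
from typing import Any, Dict, List, Set


def can_pull_results(status_response: Dict[str, Any]) -> bool:
    """Results are retrievable once the job is enriching or finished."""
    return status_response.get("status") in {"enriching", "completed"}


def collect_new_records(
    records: List[Dict[str, Any]], seen_ids: Set[str]
) -> List[Dict[str, Any]]:
    """Return records not seen in earlier pulls and remember their IDs."""
    fresh = []
    for record in records:
        record_id = record["record_id"]  # guaranteed field in every record
        if record_id not in seen_ids:
            seen_ids.add(record_id)
            fresh.append(record)
    return fresh
```

Keep one seen_ids set per job and pass each pulled batch through collect_new_records, so downstream processing handles every record exactly once.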

Job continuation

Start with fewer records using the limit parameter for quick testing, then use POST /catchAll/continue to process more records without re-submitting the query.

Continue requests preserve all analysis, validation, and extraction logic from the original job.
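A continue request could be prepared as sketched below. The body keys job_id and new_limit are assumptions about the request shape, not confirmed by this page; the only grounded rule is that a continuation asks for a higher limit than the original job.

```python
from typing import Any, Dict


def build_continue_payload(job_id: str, previous_limit: int, new_limit: int) -> Dict[str, Any]:
    """Build a body for POST /catchAll/continue (key names are assumptions)."""
    if new_limit <= previous_limit:
        raise ValueError("new_limit must be higher than the limit of the original job")
    return {"job_id": job_id, "new_limit": new_limit}
```

A typical flow under these assumptions: submit with a limit of 10, inspect the records, then continue the same job with a limit of 100.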

Endpoints

Base URL: https://catchall.newscatcherapi.com

Jobs

Endpoint	Method	Description
`/catchAll/initialize`	`POST`	Get validator, enrichment, and date suggestions
`/catchAll/submit`	`POST`	Create a new job
`/catchAll/continue`	`POST`	Continue job with higher limit
`/catchAll/jobs/user`	`GET`	List all jobs for your API key
`/catchAll/status/{job_id}`	`GET`	Check job processing status
`/catchAll/pull/{job_id}`	`GET`	Retrieve job results

Track detailed progress using the steps array in the status endpoint response. See Job status > steps for details.

Monitors

Endpoint	Method	Description
`/catchAll/monitors/create`	`POST`	Create scheduled monitor
`/catchAll/monitors/{monitor_id}`	`PATCH`	Update monitor webhook
`/catchAll/monitors`	`GET`	List all monitors
`/catchAll/monitors/{monitor_id}/jobs`	`GET`	List jobs for a monitor
`/catchAll/monitors/pull/{monitor_id}`	`GET`	Get aggregated monitor results
`/catchAll/monitors/{monitor_id}/enable`	`POST`	Enable a monitor
`/catchAll/monitors/{monitor_id}/disable`	`POST`	Disable a monitor

Meta

Endpoint	Method	Description
`/health`	`GET`	Check API health status
`/version`	`GET`	Get API version info
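The endpoint paths in this section combine with the base URL to form full request URLs. A small helper, shown here as an illustrative sketch (endpoint_url is a hypothetical name, not part of any official library), fills in path parameters such as job_id and percent-encodes them:

```python
from urllib.parse import quote

BASE_URL = "https://catchall.newscatcherapi.com"


def endpoint_url(path_template: str, **params: str) -> str:
    """Fill {placeholders} in an endpoint path and prefix the base URL."""
    # Encode each value so characters like "/" or spaces cannot alter the path.
    encoded = {name: quote(str(value), safe="") for name, value in params.items()}
    return BASE_URL + path_template.format(**encoded)


status_url = endpoint_url("/catchAll/status/{job_id}", job_id="abc123")
health_url = endpoint_url("/health")
```

Here status_url is the address to poll for a job with the made-up ID abc123, and health_url is the address of the health check.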

Use cases

Market intelligence: Company earnings, M&A activity, product launches
Regulatory monitoring: Policy changes, government actions, compliance updates
Business development: Partnerships, funding rounds, market entries
Competitive analysis: Competitor activities and announcements
Research automation: Structured data extraction for analysis
News aggregation: Topic-specific news with structured output

What’s next

Quickstart

Make your first request and get results in minutes

Monitors

Automate recurring queries with scheduled execution

API Reference

Detailed endpoint documentation and parameters

Dynamic Schemas

Handle variable response structures in your integration

For technical support, contact us at support@newscatcherapi.com.
