This guide covers best practices for monitor configuration, webhook
implementation, and performance optimization.
Choose appropriate schedules
Match schedule frequency to your data needs and budget. Each scheduled run
creates a billable job.
Recommended frequencies
| Use case | Schedule | Rationale |
| --- | --- | --- |
| News monitoring | Every 6-12 hours | Balances freshness with cost |
| Regulatory updates | Daily | Regulations rarely change more frequently |
| Market intelligence | Daily or twice daily | Financial data updates during business hours |
| Real-time alerts | Hourly | Minimum recommended for time-sensitive use cases |
Avoid schedules more frequent than hourly unless necessary. High-frequency
monitors with broad queries often produce runs in which every result is a
duplicate (zero new records after deduplication), increasing costs without
adding value.
Schedule frequency vs. deduplication
More frequent schedules may result in more executions with zero new records:
Every hour: Higher likelihood of finding new events in each run
Every 15 minutes: Most runs may return zero records after deduplication
Every 5 minutes: Very likely to have consecutive runs with no new results
Recommendation: Start with hourly or less frequent schedules and adjust
based on actual data velocity. Remember that every execution is billable: an
every-5-minutes schedule creates 288 jobs per day, versus 24 for an hourly one.
Test schedules before production
Invalid schedules may be parsed as every-minute execution (* * * * *), leading
to unexpected costs and rate-limit errors.
Testing procedure
1. Create a test monitor. Use a short interval such as "every 5 minutes".
2. Wait for executions. Allow 10-15 minutes for 2-3 executions to complete.
3. Check execution times:

   curl "https://catchall.newscatcherapi.com/catchAll/monitors/{monitor_id}/jobs" \
     -H "x-api-key: YOUR_API_KEY"

4. Verify the cron expression. Check the cron_expression field in the results or webhook payload. For "every 5 minutes", expect */5 * * * * (a scripted check is sketched after these steps).
5. Create the production monitor. If the expression is correct, disable the test monitor and create your production monitor with the desired schedule. If it is incorrect, try a different schedule format.
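The check in steps 3-4 can be scripted. The sketch below calls the jobs endpoint shown above and compares the reported cron_expression against what you expect. Where cron_expression appears in the response is an assumption here; adjust the lookup to match the actual payload you receive.

import requests  # pip install requests

MONITOR_ID = "your_monitor_id"
API_KEY = "YOUR_API_KEY"
EXPECTED_CRON = "*/5 * * * *"  # expected for an "every 5 minutes" test monitor

response = requests.get(
    f"https://catchall.newscatcherapi.com/catchAll/monitors/{MONITOR_ID}/jobs",
    headers={"x-api-key": API_KEY},
    timeout=30,
)
response.raise_for_status()
data = response.json()

# Assumption: cron_expression is reported at the top level of the response;
# adjust if it is nested under the monitor or individual job entries.
actual_cron = data.get("cron_expression")
if actual_cron == EXPECTED_CRON:
    print("Schedule parsed as expected:", actual_cron)
else:
    print("Unexpected cron expression:", actual_cron)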
Valid schedule formats and cron expressions
Define schedules in natural language with an explicit timezone.
Time-based schedules (recommended):
"every day at 12 PM UTC"
"every Monday at 9 AM EST"
"every Friday at 5 PM GMT"
Interval-based schedules:
"every 6 hours"
"every 12 hours"
"every hour"
Invalid formats (avoid):
❌ "daily at noon"
❌ "twice per day"
❌ "every weekday"
Common cron patterns:

| Schedule | Cron expression | Meaning |
| --- | --- | --- |
| "every day at 12 PM UTC" | 0 12 * * * | Daily at noon UTC |
| "every 6 hours" | 0 */6 * * * | Every 6 hours |
| "every hour" | 0 * * * * | Top of every hour |
| "every Monday at 9 AM EST" | 0 9 * * 1 | Weekly on Monday at 9 AM |
| (unrecognized format) | * * * * * | Every minute; indicates a parsing error |
Verify reference job quality
Before creating a monitor, ensure your reference job produces high-quality
results.
Reference job quality checklist
- Record count: 10-500 records (adjust for your use case)
  - Too few (less than 10): Query may be too specific
  - Too many (more than 500): Query may be too broad
- No time-based validators: Check the validators array for time constraints
  - ❌ Avoid: event_in_last_hour, event_in_last_7_days, announcement_within_date_range
  - These indicate time-constrained queries that can fail on subsequent runs
- Clean extraction: Review the enrichment object structure
  - All important fields extracted
  - Field names are semantic and consistent
  - Data is accurate
- Quality citations: Verify sources are authoritative and relevant
  - Sources are credible
  - Publication dates are recent
  - Citations support the extracted data
❌ Time-based validators indicate problems. If your reference job contains
validators like event_in_last_hour or announcement_within_date_range, your
monitor can return zero records after the first execution. Solution: Create a new job with an open-ended query (no time constraints
like “this week”, “today”, or “last hour”), then create a monitor from that job.
Fix zero records issue →
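A quick way to run this checklist programmatically before promoting a job to a monitor is to inspect the job's results payload. The sketch below assumes you have already fetched the job results as JSON with fields named as described above (records, validators); the validators are treated as a list of names, so adjust the checks if your response uses a different shape.

def check_reference_job(job: dict) -> list[str]:
    """Flag common reference-job quality problems before creating a monitor."""
    issues = []

    # Record count: 10-500 is a reasonable range for most use cases
    count = len(job.get("records", []))
    if count < 10:
        issues.append(f"Only {count} records - query may be too specific")
    elif count > 500:
        issues.append(f"{count} records - query may be too broad")

    # Time-based validators cause zero-record runs after the first execution
    time_markers = ("_in_last_", "within_date_range", "_today", "_this_week")
    for validator in job.get("validators", []):
        # str() keeps the check working whether validators are names or objects
        if any(marker in str(validator) for marker in time_markers):
            issues.append(f"Time-based validator found: {validator}")

    return issues

# Example usage with a job payload you have already retrieved:
# problems = check_reference_job(job_payload)
# if problems:
#     print("\n".join(problems))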
Implement robust webhooks
Configure webhook endpoints to handle notifications reliably.
Endpoint requirements
Your webhook endpoint must:
Return a 2xx status code within 5 seconds.
Be publicly accessible (not localhost or private network).
Use HTTPS (not HTTP).
Handle POST requests with JSON body.
Quick implementation
Return 200 immediately and process asynchronously to avoid timeouts:
from flask import Flask, request, jsonify
import logging

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

@app.route('/catchall/webhook', methods=['POST'])
def handle_catchall_webhook():
    try:
        # Get payload
        payload = request.json
        logging.info(f"Received webhook: {payload['monitor_id']}")

        # Return 200 immediately - process async
        process_webhook_async(payload)
        return jsonify({"status": "received"}), 200
    except Exception as e:
        logging.error(f"Webhook error: {e}")
        # Return 200 even on error to avoid retries
        return jsonify({"status": "error"}), 200

def process_webhook_async(payload):
    """Queue for background processing"""
    monitor_id = payload['monitor_id']
    records_count = payload['records_count']

    if records_count > 0:
        # Your processing logic here
        save_records(payload['records'])
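To sanity-check the handler before pointing a monitor at it, you can post a payload shaped like the notification described above, assuming the Flask app is running locally on port 5000. The sample payload below is hypothetical and only mirrors the fields used in the example (monitor_id, latest_job_id, records_count, records); verify the exact shape against the notifications your monitors actually send. The production endpoint still needs to be public and served over HTTPS.

import requests  # pip install requests

# Hypothetical sample payload for local testing only
sample_payload = {
    "monitor_id": "mon_123",
    "latest_job_id": "job_456",
    "records_count": 1,
    "records": [{"title": "Example record"}],
}

response = requests.post(
    "http://localhost:5000/catchall/webhook",
    json=sample_payload,
    timeout=10,
)
print(response.status_code, response.json())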
Add retry logic with exponential backoff
Implement exponential backoff for webhook processing failures:

import time

def process_webhook_with_retry(payload, max_retries=3):
    """Process webhook with exponential backoff"""
    for attempt in range(max_retries):
        try:
            # Your processing logic
            process_records(payload['records'])
            return True
        except Exception as e:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s
                time.sleep(wait_time)
                continue
            else:
                # Log failure after all retries
                log_webhook_failure(payload, str(e))
                return False
Log all webhook events for debugging:

import json
from datetime import datetime

def log_webhook(payload, status):
    """Log webhook receipt and processing status"""
    log_entry = {
        "timestamp": datetime.utcnow().isoformat(),
        "monitor_id": payload['monitor_id'],
        "latest_job_id": payload['latest_job_id'],
        "records_count": payload['records_count'],
        "status": status
    }
    with open('webhook_log.jsonl', 'a') as f:
        f.write(json.dumps(log_entry) + '\n')
Query specificity
Balance query specificity with result volume:
Too broad (high volume, many duplicates):
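For illustration only (this query is a hypothetical placeholder, not taken from the reference docs), a query this generic will match far more events than a monitor can usefully deduplicate:

"query": "technology news"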
Too specific (low volume, may miss events):
"query" : "Series C funding rounds for AI companies in San Francisco over $50M"
Optimal (focused but flexible):
"query" : "AI company funding rounds" ,
"context" : "Focus on Series B and later, amounts over $10M"
Context usage
Use context to refine results without creating overly specific validators:
{
  "query": "Technology company acquisitions",
  "context": "Include deal size if available, focus on public companies"
}
This provides guidance to the LLM without generating restrictive validators.
Schema design
Design schemas that extract core fields consistently:
Good schema (flexible, semantic):
"schema" : "[ACQUIRER] acquired [TARGET] for [AMOUNT] on [DATE]"
Problematic schema (too specific):
"schema" : "[ACQUIRER] acquired [TARGET] in [CITY], [COUNTRY] for exactly [AMOUNT] USD on [SPECIFIC_DATE]"
See also