This guide covers best practices for monitor configuration, webhook implementation, and performance optimization.
Encountering issues? See Troubleshoot monitors for solutions to common problems.

Choose appropriate schedules

Match schedule frequency to your data needs and budget. Each scheduled run creates a billable job.
Use case | Schedule | Rationale
News monitoring | Every 6-12 hours | Balances freshness with cost
Regulatory updates | Daily | Regulations rarely change more frequently
Market intelligence | Daily or twice daily | Financial data updates during business hours
Real-time alerts | Hourly | Minimum recommended for time-sensitive use cases
Avoid schedules more frequent than hourly unless necessary. High-frequency monitors with broad queries often produce runs that return zero new records after deduplication, increasing costs without adding value.
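For reference, the cadences in the table above correspond to standard cron expressions like the ones below. The expression your schedule string is actually parsed into may differ, so always verify the cron_expression field as described under Test schedules before production.

every 6 hours   ->  0 */6 * * *
daily           ->  0 0 * * *
twice daily     ->  0 0,12 * * *
hourly          ->  0 * * * *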

Schedule frequency vs. deduplication

More frequent schedules may result in more executions with zero new records:
  • Every hour: Higher likelihood of finding new events in each run
  • Every 15 minutes: Most runs may return zero records after deduplication
  • Every 5 minutes: Very likely to have consecutive runs with no new results
Recommendation: Start with hourly or less frequent schedules and adjust based on actual data velocity.

Test schedules before production

Invalid schedules may be parsed as every-minute execution (* * * * *), leading to unexpected costs and rate limits.

Testing procedure

1. Create test monitor
   Create a monitor with a short interval: "every 5 minutes".
2. Wait for executions
   Wait 10-15 minutes for 2-3 executions to complete.
3. Check execution times
   List the monitor's jobs:
   curl "https://catchall.newscatcherapi.com/catchAll/monitors/{monitor_id}/jobs" \
     -H "x-api-key: YOUR_API_KEY"
   To compute the interval between consecutive runs programmatically, see the sketch after these steps.
4. Verify cron expression
   Check the cron_expression field in the results or webhook payload. For "every 5 minutes", expect */5 * * * *.
5. Create production monitor
   If the cron expression is correct, disable the test monitor and create your production monitor with the desired schedule. If it is incorrect, try a different schedule format.
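To check the gap between consecutive runs, a minimal sketch like the following can help. The response field names used here (jobs, created_at) are assumptions for illustration; adjust them to match what the jobs endpoint actually returns.

import requests
from datetime import datetime

API_KEY = "YOUR_API_KEY"
MONITOR_ID = "your-test-monitor-id"

response = requests.get(
    f"https://catchall.newscatcherapi.com/catchAll/monitors/{MONITOR_ID}/jobs",
    headers={"x-api-key": API_KEY},
    timeout=30,
)
response.raise_for_status()

# Assumed response shape: a "jobs" list with ISO 8601 "created_at" timestamps
jobs = response.json().get("jobs", [])
times = sorted(
    datetime.fromisoformat(job["created_at"].replace("Z", "+00:00"))
    for job in jobs
)

# Print the interval between consecutive executions
for earlier, later in zip(times, times[1:]):
    print(f"{later.isoformat()}  interval since previous run: {later - earlier}")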

Verify reference job quality

Before creating a monitor, ensure your reference job produces high-quality results.
  • Record count: 10-500 records (adjust for your use case)
    • Too few (less than 10): Query may be too specific
    • Too many (more than 500): Query may be too broad
  • No time-based validators: Check the validators array for time constraints
    • ❌ Avoid: event_in_last_hour, event_in_last_7_days, announcement_within_date_range
    • These indicate time-constrained queries that can fail on subsequent runs
  • Clean extraction: Review the enrichment object structure
    • All important fields extracted
    • Field names are semantic and consistent
    • Data is accurate
  • Quality citations: Verify sources are authoritative and relevant
    • Sources are credible
    • Publication dates are recent
    • Citations support the extracted data
Time-based validators indicate problems. If your reference job contains validators like event_in_last_hour or announcement_within_date_range, your monitor can return zero records after the first execution. Solution: Create a new job with an open-ended query (no time constraints like “this week”, “today”, or “last hour”), then create a monitor from that job. Fix zero records issue →
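As a quick programmatic check, you can scan a reference job's validators for time-bound terms before creating a monitor from it. A minimal sketch, assuming the job result exposes validators as a list of strings:

TIME_BOUND_MARKERS = ("last_hour", "last_7_days", "within_date_range", "today", "this_week")

def has_time_based_validators(job_result: dict) -> bool:
    """Return True if any validator name looks time-constrained."""
    validators = job_result.get("validators", [])  # assumed field name
    return any(
        marker in str(validator).lower()
        for validator in validators
        for marker in TIME_BOUND_MARKERS
    )

# Hypothetical job result for illustration
job_result = {"validators": ["event_in_last_hour", "company_is_public"]}
if has_time_based_validators(job_result):
    print("Reference job is time-constrained; create a new open-ended job first.")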

Implement robust webhooks

Configure webhook endpoints to handle notifications reliably.

Endpoint requirements

Your webhook endpoint must:
  1. Return a 2xx status code within 5 seconds.
  2. Be publicly accessible (not localhost or a private network).
  3. Use HTTPS (not HTTP).
  4. Handle POST requests with a JSON body.

Quick implementation

Return 200 immediately and process asynchronously to avoid timeouts:
from flask import Flask, request, jsonify
from threading import Thread
import logging

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

@app.route('/catchall/webhook', methods=['POST'])
def handle_catchall_webhook():
    try:
        # Get the notification payload
        payload = request.get_json()
        logging.info(f"Received webhook for monitor: {payload.get('monitor_id')}")

        # Return 200 immediately - hand processing off to a background thread
        Thread(target=process_webhook_async, args=(payload,), daemon=True).start()
        return jsonify({"status": "received"}), 200

    except Exception as e:
        logging.error(f"Webhook error: {e}")
        # Return 200 even on error to avoid retries
        return jsonify({"status": "error"}), 200

def process_webhook_async(payload):
    """Process the notification in the background."""
    monitor_id = payload.get('monitor_id')
    records_count = payload.get('records_count', 0)
    logging.info(f"Processing {records_count} records for monitor {monitor_id}")

    if records_count > 0:
        # Your processing logic here
        save_records(payload.get('records', []))

def save_records(records):
    """Persist records to your datastore (stub)."""
    logging.info(f"Saving {len(records)} records")
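To exercise the endpoint locally before pointing a monitor at it, you can post a minimal sample payload. This sketch assumes the Flask app above is running on its default local port; the fields shown (monitor_id, records_count, records) are the ones the handler reads, and real payloads contain additional fields.

import requests

# Minimal sample payload for local testing (illustrative values only)
sample_payload = {
    "monitor_id": "test-monitor-id",
    "records_count": 1,
    "records": [{"title": "Example record"}],
}

response = requests.post(
    "http://localhost:5000/catchall/webhook",
    json=sample_payload,
    timeout=5,
)
print(response.status_code, response.json())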
If webhooks aren’t firing, see Troubleshoot: Webhook not firing.

Optimize performance

Query specificity

Balance query specificity with result volume.
Too broad (high volume, many duplicates):
"query": "company news"
Too specific (low volume, may miss events):
"query": "Series C funding rounds for AI companies in San Francisco over $50M"
Optimal (focused but flexible):
"query": "AI company funding rounds",
"context": "Focus on Series B and later, amounts over $10M"

Context usage

Use context to refine results without creating overly specific validators:
{
  "query": "Technology company acquisitions",
  "context": "Include deal size if available, focus on public companies"
}
This provides guidance to the LLM without generating restrictive validators.

Schema design

Design schemas that extract core fields consistently.
Good schema (flexible, semantic):
"schema": "[ACQUIRER] acquired [TARGET] for [AMOUNT] on [DATE]"
Problematic schema (too specific):
"schema": "[ACQUIRER] acquired [TARGET] in [CITY], [COUNTRY] for exactly [AMOUNT] USD on [SPECIFIC_DATE]"

See also