How to Set Up Cron Job Monitoring in 3 Minutes
One curl command. That's all it takes to monitor any cron job.
Your backup script has been silently failing for 3 weeks. The database dump that runs at 2 AM every night hit a disk-full error on March 1st, wrote one line to stderr, and cron moved on. Nobody checked. Nobody got pinged. Three weeks of backups are gone, and you only found out because a developer tried to restore a staging database this morning. One curl command at the end of that script would have caught it on day one.
Most cron monitoring guides walk you through 30 minutes of setup: install an agent, configure YAML, learn a dashboard, set up alert channels. By the time you finish reading, you've lost the motivation to actually wire it up. CronSafe takes a different approach. You create a monitor, you get a URL, you add one line to your script. Three minutes, start to finish. No agent to install, no config files to maintain, no dashboard to learn before you get your first alert.
Here's the entire process.
Step 1: Create a monitor (30 seconds)
Sign up at CronSafe and create a new monitor. You'll give it a name (like `prod-db-backup`) and tell it the expected schedule. If your cron runs every hour, set the expected interval to 60 minutes. CronSafe gives you a unique ping URL:

```
https://ping.luxkern.com/cronsafe/abc123-def456
```

That URL is your monitor's heartbeat endpoint. Every time your job pings it, CronSafe records a successful run. If a ping doesn't arrive within the expected window, CronSafe fires an alert. The alert latency is 3 seconds from the moment the deadline passes — you'll know almost instantly.
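The expected interval simply mirrors the crontab schedule. For example, a monitor set to a 60-minute interval would pair with an hourly entry like this (hypothetical script path, shown only to illustrate the mapping):

```shell
# min hour day month weekday  command -- runs at the top of every hour
0 * * * * /opt/scripts/hourly-job.sh
```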
Step 2: Add one curl to your script (60 seconds)
Open your cron script and add a single line at the end:
```bash
#!/bin/bash
# /opt/scripts/nightly-backup.sh
set -euo pipefail

PING_URL="https://ping.luxkern.com/cronsafe/abc123-def456"

# Your existing backup logic
pg_dump -Fc production -f "/backups/production_$(date +%Y%m%d).dump"

# Signal success to CronSafe
curl -fsS --retry 3 --max-time 10 "${PING_URL}" > /dev/null
```

Let's break down the curl flags, because each one matters:
- `-f` (fail silently): Returns a non-zero exit code on HTTP errors. Without this, curl exits 0 even on a 500 response, and your script thinks the ping succeeded.
- `-s` (silent): Suppresses the progress bar. You don't want curl's transfer stats cluttering your cron logs.
- `-S` (show error): When combined with `-s`, still prints error messages on actual failures. Silent success, visible errors.
- `--retry 3`: Retries up to 3 times on transient network failures. A single dropped packet won't cause a false alarm.
- `--max-time 10`: Gives up after 10 seconds. If the ping endpoint is unreachable, your script won't hang forever.

That's it. Your cron job is now monitored. If `pg_dump` fails, `set -euo pipefail` causes the script to exit before reaching the curl line. CronSafe never receives the ping, the deadline expires, and you get an alert.

Step 3: Configure where alerts go (90 seconds)
In the CronSafe dashboard, set your alert channels. You get Slack, email, Discord, and webhook out of the box. Most teams send critical job failures to Slack and email simultaneously. The free tier gives you 20 monitors with unlimited alerts — enough for a full production environment.
Total setup time: 3 minutes. You now have monitoring that catches both failures (job ran but errored) and omissions (job didn't run at all, because the server was down or crontab was deleted).
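The omission check is a dead man's switch on ping timestamps. A minimal sketch of the logic (assumed behavior with a hypothetical grace period; CronSafe's actual scheduler runs server-side):

```python
from datetime import datetime, timedelta

def deadline_passed(last_ping, interval_min, grace_min, now):
    """Alert when no ping has arrived within the expected interval plus grace."""
    return now > last_ping + timedelta(minutes=interval_min + grace_min)

# Hourly job last seen at 10:00, with a 5-minute grace period
last = datetime(2024, 3, 1, 10, 0)
assert not deadline_passed(last, 60, 5, datetime(2024, 3, 1, 11, 4))  # still inside the window
assert deadline_passed(last, 60, 5, datetime(2024, 3, 1, 11, 6))      # deadline missed, alert
```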
Explicit failure pings: know *why* it broke
The basic setup above uses a dead man's switch pattern: silence means failure. That tells you *that* something broke, but not *why*. CronSafe supports explicit success and failure pings so you can report the exit code and capture error output:
```bash
#!/bin/bash
# /opt/scripts/nightly-backup-v2.sh
# No -e here: failures are handled explicitly in the || block below
set -uo pipefail

PING_URL="https://ping.luxkern.com/cronsafe/abc123-def456"
START=$(date +%s)

# Run the backup, capture stderr
ERROR_OUTPUT=$(pg_dump -Fc production \
  -f "/backups/production_$(date +%Y%m%d).dump" 2>&1) && {
  DURATION=$(( $(date +%s) - START ))
  # Success ping with duration metadata
  curl -fsS --retry 3 --max-time 10 \
    "${PING_URL}/success" \
    -d "{\"duration\": ${DURATION}, \"msg\": \"Backup completed\"}" \
    > /dev/null
} || {
  EXIT_CODE=$?
  DURATION=$(( $(date +%s) - START ))
  # Failure ping with exit code and error message
  # NB: raw interpolation breaks the JSON if stderr contains quotes or
  # newlines; escape the message (e.g. with jq) if your tool emits them
  curl -fsS --retry 3 --max-time 10 \
    "${PING_URL}/fail" \
    -d "{\"exit_code\": ${EXIT_CODE}, \"duration\": ${DURATION}, \"error\": \"${ERROR_OUTPUT:0:500}\"}" \
    > /dev/null
  exit ${EXIT_CODE}
}
```

When you ping `/fail`, CronSafe immediately fires an alert — no waiting for a missed deadline. The alert includes the exit code and the first 500 characters of stderr, so you can often diagnose the issue directly from the Slack notification without SSHing into the server.

Python: monitoring a scheduled task
If your job runs in Python instead of bash, the pattern is the same. Ping on success, report on failure:
```python
import subprocess
import time

import requests

PING_URL = "https://ping.luxkern.com/cronsafe/abc123-def456"

def run_monitored_job():
    start = time.time()
    try:
        # Your job logic
        result = subprocess.run(
            ["python", "/opt/scripts/etl_pipeline.py"],
            capture_output=True,
            text=True,
            timeout=3600,  # 1 hour max
            check=True,
        )
        duration = round(time.time() - start, 2)
        requests.post(
            f"{PING_URL}/success",
            # parse_row_count is your own helper for pulling stats from stdout
            json={"duration": duration, "rows_processed": parse_row_count(result.stdout)},
            timeout=10,
        )
    except subprocess.CalledProcessError as e:
        duration = round(time.time() - start, 2)
        requests.post(
            f"{PING_URL}/fail",
            json={
                "exit_code": e.returncode,
                "duration": duration,
                "stderr": e.stderr[:500],
            },
            timeout=10,
        )
        raise
    except subprocess.TimeoutExpired:
        requests.post(
            f"{PING_URL}/fail",
            json={"error": "Job exceeded 3600s timeout", "duration": 3600},
            timeout=10,
        )
        raise

if __name__ == "__main__":
    run_monitored_job()
```

This covers three failure modes: a non-zero exit and a timeout are reported explicitly with structured data, so the alert is actionable; any other uncaught exception never sends a ping at all, and the missed deadline catches it.
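If several Python jobs need this treatment, the try/except pattern can be factored into a small decorator. A sketch, not a CronSafe library: `monitored` is a hypothetical helper, and the HTTP callable (e.g. `requests.post`) is injected so the wrapper stays dependency-free and testable:

```python
import time
from functools import wraps

def monitored(ping_url, post):
    """Wrap a job so it pings <ping_url>/success or /fail around each run.

    `post` is the HTTP callable to use, e.g. requests.post.
    """
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                post(f"{ping_url}/fail",
                     json={"duration": round(time.time() - start, 2),
                           "error": str(exc)[:500]},
                     timeout=10)
                raise  # monitoring must never swallow the failure
            post(f"{ping_url}/success",
                 json={"duration": round(time.time() - start, 2)},
                 timeout=10)
            return result
        return wrapper
    return decorator
```

Applied as `@monitored(PING_URL, requests.post)` above each job's entry point, the per-job boilerplate disappears while the ping semantics stay identical.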
Node.js: monitoring a scheduled function
For Node.js applications using `node-cron` or a similar scheduler:

```javascript
const https = require("https");

const PING_URL = "https://ping.luxkern.com/cronsafe/abc123-def456";

async function pingCronSafe(status, data = {}) {
  const payload = JSON.stringify(data);
  const url = new URL(`${PING_URL}/${status}`);
  return new Promise((resolve) => {
    const req = https.request(
      url,
      { method: "POST", headers: { "Content-Type": "application/json" } },
      (res) => { res.resume(); resolve(); } // drain the response so the socket frees up
    );
    req.on("error", () => resolve()); // Don't let ping failures crash the job
    req.setTimeout(10000, () => req.destroy());
    req.write(payload);
    req.end();
  });
}

async function runDailyReport() {
  const start = Date.now();
  try {
    const rowCount = await generateReport(); // your job's own logic
    const duration = Math.round((Date.now() - start) / 1000);
    await pingCronSafe("success", { duration, rows: rowCount });
  } catch (err) {
    const duration = Math.round((Date.now() - start) / 1000);
    await pingCronSafe("fail", { duration, error: err.message.slice(0, 500) });
    throw err;
  }
}
```

The `req.on("error", () => resolve())` line is deliberate. If the CronSafe endpoint is unreachable, you don't want your actual job to crash. The monitoring layer should never interfere with the work it's monitoring. The missed ping will still trigger a deadline alert.

GitHub Actions: monitoring scheduled workflows
GitHub Actions scheduled workflows fail silently more often than most teams realize. GitHub can skip or delay scheduled runs when runners are under load, and failures in scheduled workflows don't generate notifications by default. Add a ping step:
```yaml
# .github/workflows/daily-sync.yml
name: Daily Data Sync

on:
  schedule:
    - cron: "0 3 * * *"  # 3 AM UTC daily

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run sync
        run: python scripts/sync_data.py
      - name: Ping CronSafe on success
        if: success()
        run: |
          curl -fsS --retry 3 --max-time 10 \
            "https://ping.luxkern.com/cronsafe/abc123-def456/success" \
            -d '{"workflow": "${{ github.workflow }}", "run_id": "${{ github.run_id }}"}' \
            > /dev/null
      - name: Ping CronSafe on failure
        if: failure()
        run: |
          curl -fsS --retry 3 --max-time 10 \
            "https://ping.luxkern.com/cronsafe/abc123-def456/fail" \
            -d '{"workflow": "${{ github.workflow }}", "run_id": "${{ github.run_id }}"}' \
            > /dev/null
```

The `if: success()` and `if: failure()` conditions ensure the correct endpoint gets pinged regardless of the outcome. The `run_id` in the payload gives you a direct link back to the Actions run for debugging.

What CronSafe actually checks
When your monitor doesn't receive a ping within the expected window, CronSafe does three things:

1. Fires the alert within 3 seconds of the deadline passing. No batching, no digests, no 5-minute polling intervals. Three seconds.
2. Records the gap in the run history. You can see exactly when runs started missing and correlate with deploy times or infrastructure changes.
3. Continues watching. If the next scheduled run succeeds, CronSafe sends a recovery notification so you know the issue resolved itself.

This covers the two failure modes that trip up most teams: jobs that run and fail (caught by explicit `/fail` pings) and jobs that don't run at all (caught by the missed deadline). Both modes are invisible to cron itself.

Common mistakes to avoid
Don't put the curl before your job logic. The ping should happen after successful completion, not at the start. A ping at the start tells you the job *started*, not that it *finished*.
Don't skip the retry flag. A single transient network error without `--retry` triggers a false-positive alert. Three retries eliminate virtually all false alarms from network blips.

Don't set the deadline too tight. If your job usually takes 10 minutes but occasionally takes 25, set the expected window to 35 minutes. CronSafe alerts on missed deadlines, so a tight window means false alarms when the job is just slow.
Don't monitor non-critical jobs at first. Start with the 3-5 jobs where a silent failure causes real damage: backups, billing runs, data syncs. Expand from there. With 20 free monitors on the CronSafe free tier, you have room to grow.
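For the deadline advice above, one simple rule of thumb (my heuristic, not official CronSafe guidance) is to size the window off the worst runtime you've observed plus a safety buffer:

```python
def expected_window_min(worst_runtime_min, buffer_pct=40):
    """Expected window = worst observed runtime plus a safety buffer."""
    return round(worst_runtime_min * (1 + buffer_pct / 100))

# The example above: usually 10 minutes, occasionally 25 -> use a 35-minute window
assert expected_window_min(25) == 35
```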
From zero to monitored in 3 minutes
You now have everything you need to monitor any cron job, in any language, on any platform. The pattern is always the same: run your job, ping on success, report on failure, let CronSafe handle the alerting.
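If you'd rather not touch the script at all, the same pattern works inline in the crontab entry itself (hypothetical script path; the `&&` ensures the ping fires only when the job exits 0):

```shell
# Ping CronSafe only when the backup exits successfully
0 2 * * * /opt/scripts/nightly-backup.sh && curl -fsS --retry 3 --max-time 10 "https://ping.luxkern.com/cronsafe/abc123-def456" > /dev/null
```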
If your cron jobs run on a schedule that matters — and if they didn't matter, they wouldn't be on a schedule — 3 minutes of setup with CronSafe is the cheapest insurance you'll add this year. For a deeper look at monitoring patterns and failure detection strategies, check out "how to monitor cron jobs" and "cron job failure alerts".
Set up your first monitor at luxkern.com/cronsafe. It takes less time than reading this sentence.