TraceCrowd beta
Live Public beta · sign up with a magic link

Get paged before
your users do.

Minute-resolution HTTP checks for the engineer on call. Smart retries keep the blips out of your pager. Keyword matching catches the 200s that aren’t really OK. Every incident writes its own timeline — ready for the postmortem.

No credit card · passwordless sign-in · < 60s to first check

Preview: app.tracecrowd.com/monitors/api-checkout
Sample monitor · Up · GET https://api.checkout.example.com/health · every 1m
Uptime 24h: 99.93% · Avg response: 184 ms · p95: 412 ms · p99: 690 ms
Last 24h: 8 min outage at 04:23 UTC · 1,437 checks
TraceCrowd itself: Beta, reachable now
Public uptime metrics: Coming soon
Check regions today: Germany · New York · Singapore · Sydney · Toronto
1-min checks from 5 regions
Cross-region consensus
Keyword body match
p95 threshold alerts
Maintenance windows
Incident audit log
API + bulk operations
Email · Slack · Telegram · webhook
The problem

You’ve been in this moment before.

The one where the dashboard says green, the inbox is quiet, and the complaints are piling up somewhere you’re not looking.

03:14

Silence.

Phone didn’t buzz. Users are already posting screenshots on X. Your monitoring vendor’s 5-minute interval meant the alert hadn’t even been raised yet.

GREEN

But broken.

Monitor says UP. Status code: 200. Body: a rendered error page that says “something went wrong.” Your uptime vendor never read the response — it only checked the code.

PAGED

To nowhere.

Alert fired. To an email alias that stopped being read in 2022. The runbook said “contact on-call”, but nobody’s owned the on-call rotation in months.

TraceCrowd makes all three of those stories harder to have.

check log — sample Preview

Illustrative: this is the shape of the log a workspace sees. Hosts are placeholders (.example.com).

Why TraceCrowd

Opinionated where it helps. Boring where it matters.

We stripped uptime monitoring down to the three things that actually matter, and said no to the rest.

Fast

Catch it in 60 seconds.

Most free tiers run checks every 5 minutes. By the time you’re paged, it’s been 6. We ship 1-minute checks by default — so you’re in the incident before your users start posting about it.

Smart

Fewer false alarms. No silent 200s.

Require 1–3 other regions to confirm a failure before a monitor flips to Down. We retry with backoff before we page you. Keyword matching on the response body means a 200 with an error page still trips the check.
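To make the consensus idea concrete, here is a minimal sketch of how a cross-region confirmation rule could be evaluated. It assumes each region reports a plain pass/fail result; the function name `should_flip_down` and the `confirmations_required` parameter are illustrative, not TraceCrowd's actual implementation.

```python
def should_flip_down(origin_region, results, confirmations_required=2):
    """Decide whether a failure seen in one region should mark the monitor Down.

    results: dict mapping region name -> True (check passed) or False (check failed).
    confirmations_required: how many OTHER regions must also fail (assumed range 1-3).
    """
    if not 1 <= confirmations_required <= 3:
        raise ValueError("confirmations_required must be between 1 and 3")
    if results.get(origin_region, True):
        return False  # the origin region did not actually fail
    # Count failing regions other than the one that raised the alarm.
    confirming = sum(
        1 for region, ok in results.items()
        if region != origin_region and not ok
    )
    return confirming >= confirmations_required
```

With this rule, a failure seen only from Germany stays quiet, while the same failure confirmed by New York and Singapore flips the monitor.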

Honest

Every incident writes its own timeline.

When it opened, who ack’d, which alerts landed in which channels, who clicked the email. Postmortem-ready out of the box — no spreadsheet archaeology.

Frequent asks

Answers, before you ask.

Is it actually free during beta?

Yes. No credit card, no trial timer, no features locked behind credits. When we move to paid, you’ll get a month’s notice and can export everything.

How is this different from Pingdom or UptimeRobot?

Minute-resolution in the free tier, flat per-workspace pricing, and a UI built for the person on call — not the person procuring. No transaction monitoring, no synthetic browsers, no AI root-cause. We do one thing.

What happens when a check fails?

We retry with backoff first, so a single timeout doesn’t wake anyone. Only sustained failures (beyond your configured fail threshold) open an incident and fire the routes you set — email, Slack, Telegram, or webhook. Recovery fires an all-clear automatically. A reopen window keeps incidents from flapping back open the moment you resolve them.
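The retry-then-threshold flow above can be sketched roughly as follows. This is an illustration under assumptions, not the product's code: the names `check_with_retries` and `FailureTracker`, the doubling backoff schedule, and the choice to open an incident once the threshold is reached are all hypothetical. The reopen window is left out to keep the sketch short.

```python
import time


def check_with_retries(probe, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call probe() up to retries + 1 times, doubling the wait between attempts."""
    for attempt in range(retries + 1):
        if probe():
            return True
        if attempt < retries:
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ... (assumed schedule)
    return False


class FailureTracker:
    """Opens an incident only after fail_threshold consecutive failed checks."""

    def __init__(self, fail_threshold=3):
        self.fail_threshold = fail_threshold
        self.consecutive_failures = 0
        self.incident_open = False

    def record(self, check_passed):
        """Return 'open', 'all-clear', or None for this check result."""
        if check_passed:
            self.consecutive_failures = 0
            if self.incident_open:
                self.incident_open = False
                return "all-clear"  # recovery notifies automatically
            return None
        self.consecutive_failures += 1
        if not self.incident_open and self.consecutive_failures >= self.fail_threshold:
            self.incident_open = True
            return "open"
        return None
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting.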

My monitor says UP but my users say it’s broken. What now?

Match a keyword against the response body, not just the status code. A server returning 200 with an error page, a maintenance screen, or a missing “OK” string will still trip the check. Configure the expected string per monitor and silent 200s stop being silent.
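A keyword check of this kind boils down to a few lines. The sketch below assumes any 2xx status counts as a success candidate and that the expected string is matched as a plain, case-sensitive substring; `evaluate_response` and its parameters are illustrative names.

```python
def evaluate_response(status_code, body, expected_keyword=None):
    """Return True when the check passes.

    A 2xx status alone is not enough once an expected keyword is configured:
    the keyword must also appear in the response body.
    """
    if not 200 <= status_code < 300:
        return False
    if expected_keyword is not None and expected_keyword not in body:
        return False
    return True
```

So a 200 whose body is a rendered "something went wrong" page fails as soon as the monitor expects a string such as "OK".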

What does an incident record actually contain?

Every lifecycle transition — opened, reopened, acknowledged, resolved — plus each alert delivery (which channel, which recipient, how fast), and email engagement (who opened it, who clicked through). It’s a timeline you can paste straight into a postmortem.
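As a rough picture of what such a record could look like as data, here is a small sketch. The class names, field names, and event kinds are assumptions made for illustration; the real schema may differ.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TimelineEvent:
    at: datetime
    kind: str          # e.g. "opened", "acknowledged", "alert_delivered", "resolved"
    detail: str = ""   # e.g. channel and recipient for a delivery


@dataclass
class Incident:
    monitor: str
    events: list = field(default_factory=list)

    def add(self, at, kind, detail=""):
        self.events.append(TimelineEvent(at, kind, detail))

    def timeline(self):
        """Render events in chronological order, one line each."""
        ordered = sorted(self.events, key=lambda e: e.at)
        return [f"{e.at:%H:%M:%S} UTC  {e.kind}  {e.detail}".rstrip() for e in ordered]
```

Sorting at render time means events can be recorded in whatever order the deliveries and acknowledgements arrive.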

Can I skip incidents during a planned deploy?

Yes. Declare a maintenance window covering a monitor (or the whole project) and the checker keeps running, but incidents won’t open and nobody gets paged for the duration. When the window ends, behaviour snaps back to normal.
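The suppression rule can be sketched like this, assuming windows are stored as (start, end) pairs of timezone-aware datetimes with an exclusive end. The function names are hypothetical.

```python
from datetime import datetime, timezone


def in_maintenance(now, windows):
    """True when now falls inside any (start, end) window; end is exclusive."""
    return any(start <= now < end for start, end in windows)


def should_open_incident(check_failed, now, windows):
    """Checks keep running during a window, but failures do not open incidents."""
    return check_failed and not in_maintenance(now, windows)
```

Because the end is exclusive, a failure at the exact moment the window closes is treated as a normal failure again.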

Is there an API I can script against?

Yes. Mint a project-scoped API token from the app, then call every monitor/incident endpoint with Authorization: Bearer <token>. Bulk create, pause, resume, delete, and patch endpoints each take an id array so managing 5,000 monitors is one request, not 5,000.
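For a sense of what a bulk call might look like, here is a sketch that only builds the request and does not send it. The `Authorization: Bearer <token>` header comes from the answer above; the base URL, the `/monitors/bulk/pause` path, and the `ids` field name are placeholders invented for this example, so check the API reference for the real ones.

```python
import json
import urllib.request

API_BASE = "https://api.tracecrowd.example"  # placeholder host, not the real API


def build_bulk_pause_request(token, monitor_ids):
    """Build (but do not send) one request that pauses many monitors at once."""
    payload = json.dumps({"ids": list(monitor_ids)}).encode("utf-8")  # assumed field name
    return urllib.request.Request(
        url=f"{API_BASE}/monitors/bulk/pause",  # hypothetical path
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Once the URL and field names match the real API, passing the returned object to `urllib.request.urlopen` would send it.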

Start watching your URLs now.

Takes a minute. Free during beta. Leave anytime — your data comes with you.

No credit card · cancel by closing the tab · free during beta