Live · Public beta · sign up with a magic link

Get paged before your users do.

Minute-resolution HTTP checks for the engineer on call. Smart retries keep the blips out of your pager. Keyword matching catches the 200s that aren’t really OK. Every incident writes its own timeline — ready for the postmortem.

Start monitoring free · See how it works

No credit card · passwordless sign-in · < 60s to first check

Preview (sample monitor): app.tracecrowd.com/monitors/api-checkout

Up · GET https://api.checkout.example.com/health · every 1m

Uptime 24h: 99.93% · Avg response: 184 ms · p95: 412 ms · p99: 690 ms

[Chart: 24-hour status timeline, from now back to 24h ago. Status: Up, with one 8 min outage at 04:23 UTC. 1,437 checks.]

TraceCrowd itself: Beta, reachable now

Public uptime metrics: Coming soon

Check regions today: Germany · New York · Singapore · Sydney · Toronto

1-min checks from 5 regions · Cross-region consensus · Keyword body match · p95 threshold alerts · Maintenance windows · Incident audit log · API + bulk operations · Email, Slack, Telegram, and webhook

The problem

You’ve been in this moment before.

The one where the dashboard says green, the inbox is quiet, and the complaints are piling up somewhere you’re not looking.

03:14

Silence.

Phone didn’t buzz. Users are already posting screenshots on X. Your monitoring vendor’s 5-minute interval meant the alert hadn’t even been raised yet.

GREEN

But broken.

Monitor says UP. Status code: 200. Body: a rendered error page that says “something went wrong.” Your uptime vendor never read the response — it only checked the code.

PAGED

To nowhere.

Alert fired. To an email alias that stopped being read in 2022. The runbook said “contact on-call”, but nobody’s owned the on-call rotation in months.

TraceCrowd makes all three of those stories harder to have.

check log — sample Preview

Illustrative: this is the shape of the log a workspace sees. Hosts are placeholders (.example.com).

Why TraceCrowd

Opinionated where it helps. Boring where it matters.

We stripped uptime monitoring down to the three things that actually matter, and said no to the rest.

Fast

Catch it in 60 seconds.

Most free tiers run checks every 5 minutes. By the time you’re paged, it’s been 6. We ship 1-minute checks by default — so you’re in the incident before your users start posting about it.

Smart

Fewer false alarms. No silent 200s.

Require 1–3 other regions to confirm before flipping a monitor Down. Retry-with-backoff before we page you. Keyword matching on the response body, so a 200 with an error page still trips the check.
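To make the cross-region confirmation concrete, here is a minimal sketch, assuming each region simply reports whether its check failed. The function name `is_down` and the `confirmations` parameter are illustrative, not TraceCrowd's actual implementation.

```python
def is_down(primary_failed: bool,
            other_regions_failed: list[bool],
            confirmations: int = 2) -> bool:
    """Flip a monitor Down only when the primary region failed AND at
    least `confirmations` other regions saw the failure too."""
    if not 1 <= confirmations <= 3:
        raise ValueError("confirmations must be between 1 and 3")
    if not primary_failed:
        return False
    # True counts as 1, so sum() is the number of confirming regions.
    return sum(other_regions_failed) >= confirmations
```

With `confirmations=2`, a failure seen by one other region stays Up; two or more flips it Down. One region's network hiccup never pages anyone on its own.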

Honest

Every incident writes its own timeline.

When it opened, who ack’d, which alerts landed in which channels, who clicked the email. Postmortem-ready out of the box — no spreadsheet archaeology.

Features

Minute-resolution checks, percentiles, incidents with a paper trail, and retries that keep blips out of your pager.

See what’s inside →

How it works

Four steps from landing page to “we’re watching it.” No passwords, no configs to edit, no Slack salesperson in your DMs.

Walk through it →

Pricing

Free during beta. When we launch, one flat price per workspace. Everyone’s in — no per-seat math.

See pricing →

Frequent asks

Answers, before you ask.

Is it actually free during beta?

Yes. No credit card, no trial timer, no credit-starved feature gates. When we move to paid, you’ll get a month’s notice and can export everything.

How is this different from Pingdom or UptimeRobot?

Minute-resolution in the free tier, flat per-workspace pricing, and a UI built for the person on call — not the person procuring. No transaction monitoring, no synthetic browsers, no AI root-cause. We do one thing.

What happens when a check fails?

We retry with backoff first, so a single timeout doesn’t wake anyone. Only sustained failures (beyond your configured fail threshold) open an incident and fire the routes you set — email, Slack, Telegram, or webhook. Recovery fires an all-clear automatically. A reopen window keeps incidents from flapping back open the moment you resolve them.
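Roughly, the retry and fail-threshold behaviour described above could look like the sketch below. It assumes exponential backoff and a count of consecutive failures; the names `backoff_delays` and `FailureTracker` and the default values are hypothetical, and the reopen window is left out to keep it short.

```python
from typing import List, Optional


def backoff_delays(retries: int, base_seconds: float = 2.0) -> List[float]:
    """Exponential backoff: the wait doubles before each retry."""
    return [base_seconds * (2 ** i) for i in range(retries)]


class FailureTracker:
    """Opens an incident only after `fail_threshold` consecutive failures."""

    def __init__(self, fail_threshold: int = 3) -> None:
        self.fail_threshold = fail_threshold
        self.consecutive_failures = 0
        self.incident_open = False

    def record(self, check_ok: bool) -> Optional[str]:
        """Return "opened", "resolved", or None when nothing changes."""
        if check_ok:
            self.consecutive_failures = 0
            if self.incident_open:
                self.incident_open = False
                return "resolved"  # recovery: this is where the all-clear fires
            return None
        self.consecutive_failures += 1
        if not self.incident_open and self.consecutive_failures >= self.fail_threshold:
            self.incident_open = True
            return "opened"  # sustained failure: this is where routes fire
        return None
```

A single failed check followed by a success returns `None` both times, so nobody is woken up. Only the third consecutive failure returns `"opened"`.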

My monitor says UP but my users say it’s broken. What now?

Match a keyword against the response body, not just the status code. A server returning 200 with an error page, a maintenance screen, or a missing “OK” string will still trip the check. Configure the expected string per monitor and silent 200s stop being silent.
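A minimal sketch of that idea, assuming a plain substring match on the response body. The function name `evaluate_check` is illustrative, and the real matching rules may differ.

```python
from typing import Optional


def evaluate_check(status_code: int, body: str,
                   expected_keyword: Optional[str] = None) -> bool:
    """Pass only if the status is 2xx AND the expected keyword
    (when one is configured) appears in the response body."""
    if not 200 <= status_code < 300:
        return False
    if expected_keyword is not None and expected_keyword not in body:
        return False
    return True
```

A 200 response whose body is an error page fails the check as soon as the expected string is missing, which is exactly the silent-200 case.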

What does an incident record actually contain?

Every lifecycle transition — opened, reopened, acknowledged, resolved — plus each alert delivery (which channel, which recipient, how fast), and email engagement (who opened it, who clicked through). It’s a timeline you can paste straight into a postmortem.

Can I skip incidents during a planned deploy?

Yes. Declare a maintenance window covering a monitor (or the whole project) and the checker keeps running, but incidents won’t open and nobody gets paged for the duration. When the window ends, behaviour snaps back to normal.

Is there an API I can script against?

Yes. Mint a project-scoped API token from the app, then call any monitor or incident endpoint with Authorization: Bearer <token>. Bulk create, pause, resume, delete, and patch endpoints each take an id array, so managing 5,000 monitors is one request, not 5,000.
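As a rough sketch of what a bulk call might look like: only the Authorization: Bearer header and the id array come from the answer above. The host, the URL path, the `ids` field name, and the integer ids are placeholders, so check the API docs for the real ones. The snippet builds the request without sending it.

```python
import json
import urllib.request
from typing import List


def build_bulk_request(base_url: str, action: str,
                       monitor_ids: List[int], token: str) -> urllib.request.Request:
    """Build (but do not send) one bulk request covering many monitors."""
    # Hypothetical payload shape: {"ids": [...]}.
    body = json.dumps({"ids": monitor_ids}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/monitors/bulk/{action}",  # hypothetical path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and token; nothing is sent over the network here.
req = build_bulk_request("https://api.tracecrowd.example", "pause",
                         [101, 102, 103], "YOUR_TOKEN")
```

Passing `req` to `urllib.request.urlopen` would send it. One request carries every id in the list, whether that is three monitors or 5,000.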

Read all FAQs

Start watching your URLs now.

Takes a minute. Free during beta. Leave anytime — your data comes with you.

Start free · See pricing

No credit card · cancel by closing the tab · free during beta
