DEV Community

SEN LLC

When does this cron run next? Give it its own 70 MB container

Every serious backend eventually needs to answer "when will this cron run next?" — for dashboards, deploy-freeze windows, scheduling-aware alerts, and jobs that need to know whether they were triggered on schedule or on demand. That logic does not belong in your main app. It's a pure, stateless computation — perfect for its own tiny service.

📦 GitHub: https://github.com/sen-ltd/cron-next-api

I built cron-next-api, a FastAPI service whose entire job is this:

GET /next?expr=*/5 * * * *&count=3&tz=Asia/Tokyo&after=2026-04-15T00:00:00Z

It returns the next N fire times in UTC and local time, plus a breakdown of what the parser actually understood. That's 95% of the use case. The other 5% is /prev, /describe, and /validate for dashboards that need to show "valid" / "invalid" badges next to a text input. The whole container is 71 MB.

This article is about why that's an entire deployable service, not a helper function buried in utils/scheduling.py.

The cron-in-app antipattern

Almost every backend I've worked on eventually grows a feature that needs cron arithmetic. Some real examples:

  • Deploy-freeze dashboards. "No deploys between Friday 18:00 JST and Monday 09:00 JST." You need to answer "is the next deploy window more than 2 hours away?" for a status page.
  • Trigger-source auditing. A nightly job ran at 03:17 instead of 03:00. Was it a retry, a manual trigger, or a scheduler misconfiguration? You can't tell unless you can compute "when was this supposed to run?"
  • User-facing schedule UIs. "Your report will be emailed next at: ..." The user entered a cron string two months ago; you have to render a concrete date.
  • Alerting. An alert that fires when a cron job hasn't run within 110% of its expected interval needs to know what that interval is.
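
The alerting case above boils down to one comparison once the expected interval is known. Here is a minimal sketch of that check, assuming the interval has already been computed as the gap between two consecutive fires; `is_overdue` and its parameters are illustrative names, not part of the service:

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_run: datetime, expected_interval: timedelta,
               now: datetime, tolerance: float = 0.10) -> bool:
    """True when more than interval * (1 + tolerance) has passed since last_run."""
    allowed = expected_interval * (1 + tolerance)
    return now - last_run > allowed


last_run = datetime(2026, 4, 15, 3, 0, tzinfo=timezone.utc)
interval = timedelta(hours=24)  # gap between two consecutive scheduled fires

# 26 hours later is inside the 10% grace period; 27 hours later is not.
print(is_overdue(last_run, interval, now=datetime(2026, 4, 16, 5, 0, tzinfo=timezone.utc)))
print(is_overdue(last_run, interval, now=datetime(2026, 4, 16, 6, 0, tzinfo=timezone.utc)))
```

Note that `now` is passed in rather than read from the clock, which keeps the check as easy to test as the cron arithmetic feeding it.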

What usually happens: someone runs pip install croniter, drops a helper into the main Django app, and moves on. Six months later, three different services have their own copy of the same helper, with subtly different timezone handling. One of them breaks during DST because someone wrote a naive datetime.utcnow() instead of building a timezone-aware datetime with zoneinfo.

The logic is:

  1. Stateless. Given (expression, after, tz, count), the answer is deterministic.
  2. Pure. No database, no caching is needed.
  3. Language-agnostic. Your Go service, your Rust worker, and your PHP admin panel all benefit.
  4. Bounded. The surface area is maybe 5 endpoints, ever.

That's the shape of a microservice. Not a 40-file package. A tiny service — start a container, forget about it, query over HTTP.

Design: borrow what's hard, write what's easy

The hard parts of cron are:

  • Parsing 5-field and 6-field expressions (including step values, ranges, and lists).
  • Iterating to "the next / previous fire time" correctly across DST transitions.
  • Understanding dow weirdness: is Sunday 0 or 7? (Both, depending on the implementation.)
  • Handling Quartz extensions (L, W, #) for the users who expect them.

Python has a mature library for all of this: croniter. It's the de facto choice, handles 5-field and 6-field, iterates forward and backward, and integrates cleanly with zoneinfo. I'd be crazy to reimplement any of that.

The easy part of cron is describing common expressions in English. And here I deliberately did not use cron-descriptor, for one reason: most cron expressions anyone actually writes fit a dozen shapes. Writing those by hand gave me something substantive to talk about, and kept the dependency count minimal.

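Here's a condensed version of the describer, covering nicknames, "every N minutes", and fixed daily or weekly times:

# src/cron_next_api/describe.py
from typing import Optional

_NICKNAMES = {
    "@hourly": "Every hour, at minute 0",
    "@daily": "Every day at 00:00",
    "@weekly": "Every Sunday at 00:00",
    "@monthly": "On the 1st of every month at 00:00",
    "@yearly": "On January 1st at 00:00",
}

# The two helpers below are not shown in the original excerpt; these are
# assumed minimal versions so the snippet runs on its own.
_WEEKDAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
                  "Thursday", "Friday", "Saturday"]

def _is_int(field: str) -> Optional[int]:
    """Return the field as an int if it is a plain ASCII number, else None."""
    return int(field) if field.isascii() and field.isdigit() else None

def describe(expression: str) -> Optional[str]:
    expr = expression.strip()
    if expr in _NICKNAMES:
        return _NICKNAMES[expr]
    parts = expr.split()
    if len(parts) != 5:
        return None  # this excerpt only covers 5-field expressions

    minute, hour, dom, month, dow = parts
    rest_is_wild = hour == "*" and dom == "*" and month == "*" and dow == "*"

    if minute == "*" and rest_is_wild:
        return "Every minute"

    # "*/N * * * *": every other field must be a wildcard too, otherwise
    # "*/5 * * 1 *" would be described as plain "Every 5 minutes".
    if minute.startswith("*/") and rest_is_wild:
        n = _is_int(minute[2:])  # None for junk like "*/abc" instead of raising
        if n == 1:
            return "Every minute"
        if n is not None and 2 <= n <= 59:
            return f"Every {n} minutes"

    m = _is_int(minute)
    h = _is_int(hour)
    if m is not None and h is not None and dom == "*" and month == "*":
        if dow == "*":
            return f"Every day at {h:02d}:{m:02d}"
        if dow == "1-5":
            return f"At {h:02d}:{m:02d} on weekdays"
        n = _is_int(dow)
        if n is not None and 0 <= n <= 7:
            # 0 and 7 both mean Sunday
            return f"At {h:02d}:{m:02d} on {_WEEKDAY_NAMES[n % 7]}"

    return None  # don't guess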

Two properties that matter:

  1. It returns None for anything complex. 0 0 L * * (last day of month)? None. 0 9 * * 1#2 (second Monday)? None. The API response has "covered": false and the client can decide to fall back to something else. The service never lies about what it understands.
  2. It handles the common cases well. Every deploy-time cron I've ever seen — */5 * * * *, 0 9 * * 1-5, @hourly — is in that list. That's good enough for 95% of production usage, and users actually prefer "I don't know" over "At 09:00 on Mon-Fri unless the day is between 1 and 31 and the month is between 1 and 12 but not on last day of month" from a full descriptor.

The README enumerates the coverage exactly. No surprises for callers.
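
On the client side, the "covered": false contract makes the fallback a one-liner. A small sketch of how a caller might use it; the "covered" key comes from the response described above, while the "description" key and the `schedule_label` helper are assumptions made for illustration:

```python
from typing import Any, Mapping


def schedule_label(expr: str, response: Mapping[str, Any]) -> str:
    """Prefer the service's description; fall back to the raw expression."""
    # "description" is an assumed field name, not confirmed by the article.
    if response.get("covered") and response.get("description"):
        return str(response["description"])
    return f"cron: {expr}"


print(schedule_label("0 9 * * 1-5", {"covered": True, "description": "At 09:00 on weekdays"}))
print(schedule_label("0 0 L * *", {"covered": False, "description": None}))
```

The second call falls back to showing the raw expression, which is the "I don't know" behaviour the describer is designed around.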

DST correctness is a business-critical detail

Cron's interaction with DST is where junior implementations fall apart. Consider a freeze window: "no deploys between Friday 18:00 and Monday 09:00, America/Los_Angeles." On the weekend that DST ends, 01:00 happens twice. On the weekend it begins, 02:00 doesn't happen at all.

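If your cron engine treats local wall-clock times naively, you'll hit one of two failures:

  • A double fire on fall-back weekend. A job scheduled for 01:30 local sees 01:30 twice, so a naive iterator returns both. (In America/Los_Angeles the repeated hour is 01:00 to 01:59; 02:00 itself occurs only once.)
  • A missing fire on spring-forward weekend. A job scheduled for 02:00 local has no 02:00 to run at, so the iterator silently skips it, and your dashboard's "next fire" display shows the wrong thing.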
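
Both failure modes can be checked with nothing but the standard library. The sketch below classifies a local wall-clock time as normal, ambiguous (it occurs twice), or missing (it never occurs), using the `fold` attribute from PEP 495. The `classify` helper is illustrative and is not how croniter or this service handles DST internally; the dates are the 2026 US transitions (8 March and 1 November):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LA = ZoneInfo("America/Los_Angeles")


def classify(naive: datetime, tz: ZoneInfo) -> str:
    """Classify a naive wall-clock time in tz as normal, ambiguous, or missing."""
    early = naive.replace(tzinfo=tz, fold=0)
    late = naive.replace(tzinfo=tz, fold=1)
    if early.utcoffset() == late.utcoffset():
        return "normal"
    # The offsets differ, so we are in a gap or an overlap. A time in a gap
    # does not survive a round trip through UTC; a time in an overlap does.
    roundtrip = early.astimezone(timezone.utc).astimezone(tz)
    if roundtrip.replace(tzinfo=None) != naive:
        return "missing"
    return "ambiguous"


print(classify(datetime(2026, 11, 1, 1, 30), LA))  # fall back: 01:30 happens twice
print(classify(datetime(2026, 11, 1, 2, 0), LA))   # 02:00 on that day happens once
print(classify(datetime(2026, 3, 8, 2, 30), LA))   # spring forward: 02:30 never happens
```

A dashboard that wants to warn about "this schedule lands in a DST gap" could run a check along these lines on each computed local fire time.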

The fix is to do the arithmetic in the target timezone, not UTC. The engine does this:

# src/cron_next_api/engine.py
from datetime import datetime, timezone

from croniter import croniter

# parse, _resolve_tz, and _to_tz are helpers from the same package, omitted here.

def iter_fires(
    expression: str,
    *,
    start: datetime,
    count: int,
    tz_name: str = "UTC",
    direction: str = "next",
) -> list[tuple[datetime, datetime]]:
    parse(expression)  # raises CronError on invalid input
    tz = _resolve_tz(tz_name)
    local_start = _to_tz(start, tz)  # iterate in the target zone, not UTC

    it = croniter(expression, start_time=local_start)
    method = it.get_next if direction == "next" else it.get_prev

    fires = []
    for _ in range(count):
        local_next = method(datetime)  # ask croniter for a datetime, not a float
        if local_next.tzinfo is None:
            local_next = local_next.replace(tzinfo=tz)
        utc_next = local_next.astimezone(timezone.utc)
        fires.append((utc_next, local_next))
    return fires

Three things to notice:

  1. start is always passed in. Nothing in the engine module calls datetime.now(). That's the single discipline that makes the whole thing testable. Every test just passes an anchor and asserts on the output. No freezegun, no monkeypatching.
  2. Iteration happens in the local timezone. The start_time passed to croniter is already converted to tz, so croniter's internal DST handling kicks in.
  3. Every fire is returned as (utc, local). Clients that want to display both don't have to round-trip through zoneinfo themselves.
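
To make the first and third points concrete without pulling in croniter, here is a deliberately tiny stand-in that only understands "every day at HH:MM". It is a sketch of the same shape (explicit start, iteration in the local zone, a (utc, local) pair as the result), not the service's engine, and `next_daily_fire` is a hypothetical name. It makes no attempt at DST edge cases:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def next_daily_fire(start: datetime, hour: int, minute: int,
                    tz_name: str = "UTC") -> tuple[datetime, datetime]:
    """Next daily HH:MM fire strictly after start, returned as (utc, local)."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    tz = ZoneInfo(tz_name)
    local_start = start.astimezone(tz)
    candidate = local_start.replace(hour=hour, minute=minute,
                                    second=0, microsecond=0, fold=0)
    if candidate <= local_start:
        candidate += timedelta(days=1)  # wall-clock arithmetic in the local zone
    return candidate.astimezone(timezone.utc), candidate


# The anchor is always supplied by the caller; nothing here reads the clock.
anchor = datetime(2026, 4, 15, 12, 0, tzinfo=timezone.utc)  # 21:00 in Tokyo
utc_fire, local_fire = next_daily_fire(anchor, 9, 0, "Asia/Tokyo")
print(utc_fire.isoformat(), local_fire.isoformat())
```

Because the function is pure, a test is just an anchor and an assertion, with no clock patching involved.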

DST tests in the suite:

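from datetime import datetime, timezone

from cron_next_api.engine import iter_fires  # import path inferred from the file layout

def test_next_dst_spring_forward_us_pacific():
    # US DST begins on 2026-03-08, so the first 02:00 after this anchor falls in the gap.
    anchor = datetime(2026, 3, 7, 12, 0, tzinfo=timezone.utc)
    fires = iter_fires("0 2 * * *", start=anchor, count=2, tz_name="America/Los_Angeles")
    locals_ = [local for _, local in fires]
    assert locals_[0] < locals_[1]
    assert all(local.tzinfo is not None for local in locals_)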

It's a smoke test, not a full correctness proof, but it catches the class of bugs where someone accidentally drops tzinfo somewhere in the pipeline.

The after parameter should be explicit

A design choice I waffled on: should after default to now() or be required?

Required feels correct (explicit is better than implicit), but in practice people always want the next fire from now for the simple case. Making it required means every quick-check curl needs a timestamp.

The compromise: after defaults to datetime.now(timezone.utc) in the HTTP layer, never in the engine. The engine itself refuses to read the clock. This means:

# Quick check: "when does this fire next?"
curl "localhost:8000/next?expr=0+9+*+*+1-5"

# Deterministic: "when did this fire last, from a specific anchor?"
curl "localhost:8000/prev?expr=0+9+*+*+1-5&before=2026-04-15T12:00:00Z"

...and every unit test passes an explicit anchor. No test depends on wall-clock time. No production code leaks datetime.now() into a pure function.
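
One way to keep that split honest is to resolve the anchor in a single small function at the HTTP edge. The sketch below is an assumption about how this could look, not the repository's code: `parse_iso` is a hypothetical stand-in for the article's parse_iso_datetime (whose implementation isn't shown), and treating naive timestamps as UTC is a choice made here for illustration.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def parse_iso(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' and naive values mean UTC."""
    text = value.strip()
    if text.endswith(("Z", "z")):
        # datetime.fromisoformat() only accepts 'Z' from Python 3.11 onward.
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)  # raises ValueError on bad input
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed


def resolve_anchor(
    after: Optional[str],
    clock: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
) -> datetime:
    """Use the caller's timestamp if given; otherwise read the clock, once, here."""
    return parse_iso(after) if after else clock()


print(resolve_anchor("2026-04-15T00:00:00Z").isoformat())
```

The `clock` parameter is only there so that even the edge layer can be tested with a fixed time; the engine never sees it.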

The route handler

# src/cron_next_api/main.py
@app.get("/next", response_model=NextResponse)
async def next_get(
    expr: str = Query(..., max_length=256),
    count: int = Query(1, ge=1, le=100),
    tz: str = Query("UTC", max_length=64),
    after: Optional[str] = Query(None, max_length=64),
):
    return _compute(expr, count, tz, after, direction="next")

def _compute(expr, count, tz, anchor, direction):
    try:
        parsed_fields = _parsed_fields(expr)
    except CronError as exc:
        return _error(422, "invalid_cron", str(exc), exc.position)

    if anchor:
        try:
            start = parse_iso_datetime(anchor)
        except ValueError as exc:
            return _error(422, "invalid_datetime", str(exc))
    else:
        start = datetime.now(timezone.utc)

    try:
        fires = iter_fires(expr, start=start, count=count,
                           tz_name=tz, direction=direction)
    except (CronError, TimezoneError) as exc:
        return _error(422, "invalid_cron" if isinstance(exc, CronError)
                          else "invalid_timezone", str(exc))
    return NextResponse(...)

Standard FastAPI. Pydantic models validate at the edge; the pure engine raises domain errors; the route handler translates them to HTTP. Nothing clever.

The POST /next variant takes the same arguments as a JSON body, because cron expressions that use # or ? need URL-encoding in GET and nobody wants to curl --data-urlencode for a five-character expression.
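
For callers that do stick with GET, the encoding problem is easy to see from Python. This sketch uses only `urllib.parse`; the `next_url` helper and the localhost base URL are illustrative. It shows what happens to an unencoded `#` and how `urlencode` avoids it:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def next_url(base: str, expr: str, **params: object) -> str:
    """Build a /next URL with the expression and extra parameters safely encoded."""
    query = urlencode({"expr": expr, **params})
    return f"{base}/next?{query}"


# Unencoded, everything after '#' is treated as a URL fragment and never
# becomes part of the query string.
naive = urlsplit("http://localhost:8000/next?expr=0 9 * * 1#2")
print(repr(naive.query), repr(naive.fragment))

url = next_url("http://localhost:8000", "0 9 * * 1#2", count=3, tz="Asia/Tokyo")
print(url)
print(parse_qs(urlsplit(url).query)["expr"])
```

The encoded form survives a round trip intact, at the cost of being unreadable in a terminal, which is the argument for the JSON body.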

Tradeoffs

Things this service does not do, and why:

  • Quartz extensions (L, W, #). croniter has partial support depending on version. The describer returns null for them. The parser forwards them to croniter and lets it decide.
  • Sub-second precision. The 6-field form uses whole seconds. If you're scheduling at millisecond granularity, cron is the wrong abstraction anyway.
  • 6-field seconds position. croniter's convention is minute hour dom month dow second (seconds at the end), not Quartz's second minute hour dom month dow. I picked croniter's convention because that's what actually runs in production Python codebases. Documented in the README.
  • Persistence. No "show me the last 100 fires" endpoint. If you need that, pair this with a log aggregator.
  • Authentication. It's stateless read-only compute. Put it on an internal network. If you need auth, put a reverse proxy in front.
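
If a caller hands you Quartz-ordered 6-field expressions, the field-order difference described above can be bridged before the request is sent. A minimal sketch, with `quartz_to_croniter_order` as a hypothetical helper: it only moves the seconds field from the front to the back, and does not translate Quartz-specific syntax such as `?` or Quartz's day-of-week numbering (where 1 is Sunday).

```python
def quartz_to_croniter_order(expr: str) -> str:
    """Move the leading seconds field of a 6-field expression to the end."""
    fields = expr.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    second, *rest = fields
    return " ".join([*rest, second])


# Quartz order: second minute hour dom month dow
print(quartz_to_croniter_order("30 0 9 * * 1-5"))
```

The result puts the seconds last, which is the layout the article says the service expects.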

Try it in 30 seconds

git clone https://github.com/sen-ltd/cron-next-api
cd cron-next-api
docker build -t cron-next-api .
docker run --rm -p 8000:8000 cron-next-api

Then:

# Every 5 minutes, next 3 fires in Tokyo time
curl "localhost:8000/next?expr=*/5+*+*+*+*&count=3&tz=Asia/Tokyo&after=2026-04-15T00:00:00Z" | jq

# "At 09:00 on weekdays"
curl "localhost:8000/describe?expr=0+9+*+*+1-5"

# Invalid expressions come back as 422 with a position
curl "localhost:8000/validate?expr=broken"

The whole image is 71 MB. 47 tests. One pyproject.toml dependency that does actual work: croniter.

The general principle

The real argument in this post isn't about cron. It's about where computations live.

When you find yourself copy-pasting the same three-line helper into four services — especially when that helper has no state and no dependencies on the host application — that's the shape of a microservice. Not because microservices are always right (they aren't), but because:

  • Fixing it once fixes it everywhere. The DST bug gets fixed in one place.
  • The boundary is enforced by HTTP. Nobody can sneak extra state in.
  • You can rewrite it in Rust later. The tests don't care.
  • Every language in your stack benefits. Not just the one the helper was written in.

Cron "next fire" is the smallest, cleanest example I could find of that pattern. 47 tests, 71 MB, one dependency, one purpose. Deserves its own container.

📦 https://github.com/sen-ltd/cron-next-api
