How MongoDB TTL Indexes Replaced Cron Jobs for Efficient Log Cleanup


System-level engineer building reliable backend systems with a focus on performance, correctness, and real-world constraints. I work across APIs, databases, networking, and infrastructure, enjoy understanding how systems behave under load and failure, and write to break down complex backend and distributed-systems concepts through practical, real-world learnings.

Every backend system that deals with logs eventually hits the same problem:

Data keeps growing, storage keeps increasing, and nobody wants to own cleanup.

In my case, it was DNS analytics logs in NexoralDNS.

High-volume.
Write-heavy.
Low long-term value.

And like most engineers, my first instinct was wrong.


The Default (Wrong) Thinking

When logs start piling up, the usual suggestions appear:

  • “Run a cron job every night”

  • “Delete old data weekly”

  • “Let DevOps handle it”

  • “We’ll fix it later”

I’ve done this before.
And every time, it ends the same way:

  • Cron fails silently

  • Cleanup lags

  • DB size explodes

  • Someone gets paged

This time, I refused to repeat that mistake.


The Real Requirement (Be Honest About It)

For NexoralDNS:

  • Analytics dashboards show only the last 24 hours

  • Free users don’t need data older than 7 days

  • Logs are not archival data

  • Accuracy matters more than history

So why was I even keeping old data?

That’s when I revisited something MongoDB already provides.


MongoDB TTL Indexes (The Feature People Ignore)

MongoDB supports TTL (Time-To-Live) indexes.

One call. No cron. No scheduler.

// Expire each document 7 days after its createdAt value
await DnsAnalyticsCol.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 60 * 60 * 24 * 7 } // 7 days, in seconds
);

That’s it.

MongoDB automatically deletes documents:

  • Based on a Date field

  • After the specified time

  • In the background

  • Without application involvement

This is not a workaround.
This is a first-class database feature.
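
To make the rule concrete, here is a small model of how expiry is decided: a document becomes eligible once its createdAt plus expireAfterSeconds is in the past. This is an illustrative sketch, not MongoDB's implementation. The helper names expiresAt and isEligibleForDeletion are made up for this example, and it ignores the case where the field holds an array of dates.

```javascript
// Illustrative model of the TTL rule, not MongoDB's actual code.
const RETENTION_SECONDS = 60 * 60 * 24 * 7; // 7 days, matching the index above

// Returns the moment a document becomes eligible for deletion,
// or null when createdAt is not a Date (such documents never expire).
function expiresAt(doc, expireAfterSeconds = RETENTION_SECONDS) {
  if (!(doc.createdAt instanceof Date)) return null;
  return new Date(doc.createdAt.getTime() + expireAfterSeconds * 1000);
}

// True once the expiry moment has passed relative to `now`.
function isEligibleForDeletion(doc, now = new Date(), expireAfterSeconds = RETENTION_SECONDS) {
  const deadline = expiresAt(doc, expireAfterSeconds);
  return deadline !== null && deadline.getTime() <= now.getTime();
}
```

Eligible does not mean deleted yet: the document is removed whenever the background task next gets to it.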


Why This Was the Correct Decision for NexoralDNS

1. Zero Operational Overhead

No cron jobs.
No Kubernetes jobs.
No external schedulers.

The database enforces retention by design.


2. Predictable Storage Growth

Logs stop growing without bound.
Disk usage levels off at roughly one retention window of data.
Costs stay under control.

This matters more than people admit.
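
A back-of-the-envelope way to see why: with a fixed retention window, the live data set settles at about write rate × document size × retention. The sketch below uses hypothetical numbers, not NexoralDNS measurements, and the estimate ignores indexes, compression, and TTL lag.

```javascript
// Rough steady-state size of a TTL-managed collection, in bytes.
// Assumes a constant write rate and a fixed average document size.
function estimateSteadyStateBytes({ docsPerSecond, avgDocBytes, retentionSeconds }) {
  return docsPerSecond * avgDocBytes * retentionSeconds;
}

// Hypothetical workload: 500 log documents per second, 300 bytes each, kept 7 days.
const estimatedBytes = estimateSteadyStateBytes({
  docsPerSecond: 500,
  avgDocBytes: 300,
  retentionSeconds: 60 * 60 * 24 * 7,
});

// 90,720,000,000 bytes, i.e. about 90.7 GB before compression.
console.log(`${(estimatedBytes / 1e9).toFixed(1)} GB`);
```

The useful part is not the exact figure but the shape: halve the retention window and the ceiling halves with it.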


3. Fewer Failure Modes

Cron jobs fail.
Scripts crash.
Deployments break schedules.

TTL indexes don’t care about your deployment pipeline.

They just work.


The Part Most Tutorials Don’t Tell You

TTL indexes are simple — but not magic.

Things you must know (or you’ll get burned):

1. Deletion Is Not Instant

MongoDB’s TTL monitor is a background task that runs roughly every 60 seconds, and on a busy server the actual deletes can lag behind that.

If you expect millisecond-accurate deletion, you’re misunderstanding the feature.
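
Because expired documents can linger until the monitor catches up, a query that needs a strict window should state that window itself instead of relying on TTL. A minimal sketch, assuming reads are bounded by createdAt; windowFilter is a hypothetical helper, not part of any driver.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Builds a query filter that only matches documents created within
// the last `windowMs` milliseconds, regardless of TTL cleanup progress.
function windowFilter(windowMs = DAY_MS, now = new Date()) {
  return { createdAt: { $gte: new Date(now.getTime() - windowMs) } };
}

// Usage with a connected collection (assumed, not shown here):
//   const rows = await DnsAnalyticsCol.find(windowFilter()).toArray();
```

TTL then controls storage, and the filter controls what a reader is allowed to see.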


2. createdAt MUST Be a Date

This will not work:

createdAt: 1710000000000 // ❌ timestamp number

This will work:

createdAt: new Date() // ✅

I’ve seen production systems fail silently because of this: no error is raised, the documents simply never expire.
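
One way to guard against this is to normalize the field at the write path and to audit for documents that slipped through. The sketch below is an assumption-laden example: toDate is a hypothetical helper that treats numbers as epoch milliseconds (epoch seconds would be misread), and nonDateFilter is a plain query filter that matches documents whose createdAt is missing or not a BSON date.

```javascript
// Converts a Date, an epoch-milliseconds number, or a parseable date string
// into a Date. Throws instead of letting a non-Date value reach the collection.
function toDate(value) {
  let date = null;
  if (value instanceof Date) {
    date = value;
  } else if (typeof value === "number" || typeof value === "string") {
    date = new Date(value);
  }
  if (date === null || Number.isNaN(date.getTime())) {
    throw new TypeError(`Cannot convert ${String(value)} to a Date`);
  }
  return date;
}

// Matches documents the TTL index will never expire:
// createdAt is absent or stored as something other than a date.
const nonDateFilter = { createdAt: { $not: { $type: "date" } } };

// Usage with a connected collection (assumed, not shown here):
//   await DnsAnalyticsCol.insertOne({ ...log, createdAt: toDate(log.createdAt) });
//   const stuck = await DnsAnalyticsCol.countDocuments(nonDateFilter);
```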


3. TTL Indexes Are Single-Field Only

No compound TTL indexes: MongoDB ignores expireAfterSeconds on a compound index.
No shortcuts.
Design accordingly.
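
If different documents need different lifetimes, one documented pattern is to store the expiry moment itself and index that field with expireAfterSeconds: 0, so each document expires at the time held in its own field. A sketch under stated assumptions: the expireAt field name, the withExpiry helper, and the "pro" plan with its 30 days are hypothetical; only the 7 days for free users comes from the requirements above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical per-plan retention, in days. Only "free: 7" is from this post.
const RETENTION_DAYS = new Map([
  ["free", 7],
  ["pro", 30],
]);

// Returns a copy of `doc` stamped with createdAt and a per-document expireAt.
function withExpiry(doc, plan, now = new Date()) {
  const days = RETENTION_DAYS.get(plan);
  if (days === undefined) throw new Error(`Unknown plan: ${plan}`);
  return {
    ...doc,
    createdAt: now,
    expireAt: new Date(now.getTime() + days * DAY_MS),
  };
}

// The matching index is still a single-field TTL index (connection assumed):
//   await DnsAnalyticsCol.createIndex({ expireAt: 1 }, { expireAfterSeconds: 0 });
```

The trade-off is that retention is now baked into each document at write time, so changing a plan's window only affects documents written afterwards.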


Why TTL Indexes Beat Cron Jobs (Every Time)

Cron-based cleanup:

  • Runs late

  • Can fail

  • Needs monitoring

  • Adds complexity

TTL indexes:

  • Are declarative

  • Are enforced at the DB level

  • Survive redeployments

  • Reduce mental overhead

If your data has an expiration date, the database should own it.
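
Owning retention in the database also means that changing the window is an index change, not a script change. As far as I know, re-running createIndex with a different expireAfterSeconds is rejected as an options conflict, and the supported route is the collMod command. A minimal sketch that only builds the command document; buildTtlUpdateCommand and the collection name "dns_analytics" are hypothetical.

```javascript
// Builds a collMod command that changes expireAfterSeconds on an
// existing single-field TTL index identified by its key pattern.
function buildTtlUpdateCommand(collectionName, field, expireAfterSeconds) {
  if (!Number.isInteger(expireAfterSeconds) || expireAfterSeconds < 0) {
    throw new RangeError("expireAfterSeconds must be a non-negative integer");
  }
  return {
    collMod: collectionName,
    index: {
      keyPattern: { [field]: 1 },
      expireAfterSeconds,
    },
  };
}

// Usage with a connected Db handle from the Node.js driver (assumed):
//   await db.command(buildTtlUpdateCommand("dns_analytics", "createdAt", 60 * 60 * 24 * 3));
```

Shortening the window makes older documents eligible immediately, so expect a burst of deletes on the next TTL passes.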


How This Fits NexoralDNS Architecturally

In NexoralDNS:

  • MongoDB is the source of truth

  • TTL enforces retention

  • Change Streams handle reactivity

  • The application stays stateless

Each layer does one job, cleanly.

That’s how systems stay maintainable.


Hard Lesson I Learned

If you are writing cleanup scripts for data that:

  • Is time-bound

  • Has no long-term value

  • Is high volume

You are probably solving the wrong problem.

MongoDB already solved it for you.


Final Thought (Blunt but Necessary)

Cron jobs are not architecture.
Cleanup scripts are not scalability.
And “we’ll delete it later” is technical debt.

TTL indexes are boring — and that’s exactly why they’re powerful.

Let the database do what it’s good at.
Save your engineering effort for problems that actually matter.
