Delayed job monitoring

Problem

Background jobs are essential to the functioning of OFN instances. They drive all sorts of features, such as sending emails, finalizing account invoices and refreshing the products cache.

If we want to troubleshoot the behaviour of the app in production, we need to know things like how many jobs failed on a particular date, how many jobs are being executed, etc. This is especially necessary when facing incidents of any kind: "why didn’t I receive the welcome email?" or "why am I still seeing an old cached version of the order cycle?"

Proposal

We can already get that information by digging through log/delayed_job.log in production, but if we ever want our decisions to be data-driven, an appropriate data visualization is crucial.
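To give an idea of what the manual route looks like today, here is a minimal sketch that counts failed jobs per day from the log. It assumes each line starts with an ISO 8601 timestamp and that failures contain the word FAILED, which is my understanding of delayed_job's default log output but is not checked against our production logs. The method name `failed_jobs_per_day` is made up for illustration.

```ruby
require "date"

# Assumed line shape (not verified against OFN's production logs):
#   2019-03-01T10:15:00+0000: [Worker(...)] Job SomeJob (id=1) FAILED (0 prior attempts) with ...
FAILED_LINE = /\A(\d{4}-\d{2}-\d{2})T\S+: .*\bFAILED\b/

# Hypothetical helper: takes any enumerable of log lines and returns
# a Hash mapping Date => number of FAILED lines logged on that day.
def failed_jobs_per_day(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = FAILED_LINE.match(line)
    counts[Date.iso8601(match[1])] += 1 if match
  end
  counts
end
```

Something like `failed_jobs_per_day(File.foreach("log/delayed_job.log"))` would then give the counts per day. Note that a job retried several times is counted once per failed attempt, so this is only a rough number.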

That is why we propose using https://github.com/ejschmitt/delayed_job_web as delayed_job’s repo suggests. This is a gem that needs to be added to the OFN and which gives us a UI like:


It provides counts of enqueued, failed, pending, etc. jobs and the ability to retry and cancel jobs, clear failed ones, etc. This is useful when facing problems in production (which always happen).
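For reference, the setup would look roughly like this, going by the gem's README. The `Rails.application.routes.draw` wrapper is an assumption about how our routes file is laid out, and the /delayed_job path is just the README's example.

```ruby
# Gemfile
gem "delayed_job_web"

# config/routes.rb
# The surrounding draw block is assumed; only the `match` line needs adding.
Rails.application.routes.draw do
  # Mounts the delayed_job_web UI at /delayed_job.
  # As written this route is open to anyone, so we would need to restrict it
  # to admins (for example with a routing constraint) before deploying.
  match "/delayed_job" => DelayedJobWeb, anchor: false, via: [:get, :post]
end
```

How we restrict access is an open question and would depend on how admin authentication works in OFN.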

This definitely increases the RAM usage of the app, but we do need something to help visualize this data.

Thoughts?


Any objection @oeoeaio @maikel @Matt-Yorkley?