Reports performance

Just briefly, the issue we have with reports is that they can use a large amount of memory: each web worker process (Ruby/Puma) that runs a report allocates a lot of memory and holds onto it for a long time. Because we have multiple worker processes, these allocations add up and can cause the server to run out of memory.

We have started on a fix for this, but hit a blocker.
So how should we proceed?

@maikel and I have discussed the options.

The current attempted solution is to “fork” a process from the Puma process to execute the report. But we have a problem:

We can’t test with system specs. We need to cover it somehow with automated tests.
Can we fix it? Unknown; our initial investigation hasn’t found an answer.
(just a guess: maybe a bug in Cuprite causes a callback on forking that crashes Chromium)

So we have gone back to ask: what’s the ideal solution? Let’s look at the pros/cons:

DB query timeouts

+

  • quick and easy to implement

-

  • DB query time is relatively quick, and doesn’t correlate with report runtime

It simply doesn’t help.

Forking

+

  • Quick to start up, so it can respond sooner
  • Able to provide report response directly if quick to run, or email later
  • Memory cleaned up every time
  • Relatively quick to implement… except there are some challenges…

-

  • Unable to run system specs
  • We can’t limit number of concurrent jobs, which could compound RAM usage.
  • Needs to allocate memory for every report (inefficient)
  • Forking is not commonly done in Rails, so there could be other challenges we haven’t found yet
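
To make the forking idea concrete, here is a minimal plain-Ruby sketch, not our actual implementation. The helper name `run_in_fork` and the sample report data are made up for illustration. The report runs in a child process and sends its result back over a pipe, so whatever memory it allocated is released when the child exits.

```ruby
# Hypothetical helper: run the given block in a forked child process and
# return its result to the parent. Not the real reports code.
def run_in_fork
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = Process.fork do
    reader.close
    result = yield
    writer.write(Marshal.dump(result))
    writer.close
    exit!(0) # leave immediately, skipping at_exit hooks inherited from the parent
  end

  writer.close
  payload = reader.read # read before waiting, so a large result can't fill the pipe
  reader.close

  _, status = Process.wait2(pid)
  raise "report process failed" unless status.success?

  Marshal.load(payload)
end

# Stand-in for a real report: builds some rows in the child process.
report = run_in_fork { (1..5).map { |n| { id: n, total: n * 10 } } }
```

Note that nothing here limits how many children run at once, which is the concurrency concern listed above.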

Sidekiq

+

  • Allows us to limit concurrent reports (limits RAM blowout)
  • Several options for concurrency limiting (process, threads, queues, capsules)
  • Able to provide report response directly if quick to run, or email later
  • We can place more specific memory limits on the report process

-

  • Need to implement an app-specific timeout, but it should be straightforward (build it into the reports framework)
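
One way the app-specific timeout could be built into the reports framework is a cooperative deadline that the report checks between batches of work. This is only a sketch under that assumption: `ReportTimeout`, `ReportDeadline` and `build_rows` are hypothetical names, and it is plain Ruby with no Sidekiq dependency.

```ruby
# Hypothetical error raised when a report runs past its time limit.
class ReportTimeout < StandardError; end

# Hypothetical deadline object that the report code polls while it works.
class ReportDeadline
  def initialize(seconds)
    @expires_at = now + seconds
  end

  # Raise if the time limit has passed; otherwise do nothing.
  def check!
    raise ReportTimeout, "report exceeded its time limit" if now > @expires_at
  end

  private

  # Monotonic clock, so system clock adjustments don't affect the limit.
  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Stand-in for report generation: checks the deadline before each batch.
def build_rows(batches, deadline)
  batches.flat_map do |batch|
    deadline.check!
    batch.map { |n| n * 2 }
  end
end

rows = build_rows([[1, 2], [3, 4]], ReportDeadline.new(5))
```

The report stops at a known point between batches rather than being interrupted mid-operation, at the cost of only noticing the timeout when it next checks.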

Summary

What we now know is that there is a big hurdle in the forking solution that we couldn’t have anticipated.
Sidekiq should provide a more definitive solution, and has more options.

So we will switch efforts to implementing reports to run in a Sidekiq process. This will look the same to the user (fast reports can be downloaded immediately like usual, and slower ones will be sent by email).
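
The "download now or email later" behaviour could be sketched roughly like this. It is an illustration only: a plain Ruby thread stands in for the Sidekiq job, and `deliver_report` and `wait_limit` are hypothetical names, not part of our codebase or of Sidekiq.

```ruby
# Hypothetical sketch: wait a short time for the report, then fall back to email.
# A thread stands in for the background job here.
def deliver_report(wait_limit:, &report)
  worker = Thread.new(&report)

  if worker.join(wait_limit) # returns nil if the worker is still running
    { delivery: :download, body: worker.value }
  else
    # The worker keeps running; the real job would email the result when done.
    { delivery: :email, body: nil }
  end
end

fast = deliver_report(wait_limit: 1) { "id,total\n1,10\n" }
slow = deliver_report(wait_limit: 0.05) { sleep 0.5; "id,total\n1,10\n" }
```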
