Sidekick – Using Node.js to run scheduled tasks for a service

Posted by David Sklar on August 8, 2011 – 10:04 AM

The Problem

We run PHP inside of Apache 2. This works great for servicing user requests, but that very request/response nature of the PHP setup makes it difficult to do things such as:

  • run PHP code after Apache starts up to initialize server state (such as populating APC cache with data or compiled code) before the server indicates it’s ready to handle real requests
  • run periodic or scheduled tasks inside the server, such as announcing the server to our service discovery system or refreshing local caches of remote information

The Solution

We run a little companion process (the “sidekick”) that is started up at the same time as httpd (and stopped when httpd is stopped). It reads a configuration file which tells it what tasks to execute on what schedule.

The configuration file specifies each task as a URL to execute on the server and the frequency of execution. For example:

"announce": {
  "url": "/xn/tasks/frobnicate",
  "frequency": 60
}

This tells sidekick to issue a GET to /xn/tasks/frobnicate on the local httpd every 60 seconds.
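
As a rough illustration of what "issue a GET on the local httpd" can look like with Node's http module, here is a minimal sketch. It is not the actual sidekick source: the helper names requestOptionsFor and runTask are hypothetical, and so is the assumption that the local httpd listens on 127.0.0.1 port 80.

```javascript
const http = require('http');

// Hypothetical helper: turn one task entry into options for http.get().
// The host and port of the local httpd are assumptions, not from the article.
function requestOptionsFor(task, host = '127.0.0.1', port = 80) {
  return { host, port, path: task.url };
}

// Issue the GET and hand the status code (or the error) to the callback.
function runTask(task, callback) {
  const req = http.get(requestOptionsFor(task), (res) => {
    res.resume(); // discard the body; only the status matters in this sketch
    res.on('end', () => callback(null, res.statusCode));
  });
  req.on('error', callback);
}

// The task entry from the example above, as it might look once parsed.
const tasks = {
  announce: { url: '/xn/tasks/frobnicate', frequency: 60 },
};
```

Calling runTask(tasks.announce, callback) would then perform one execution of the task; repeating it every 60 seconds is the scheduler's job.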

Why This Was Nice To Build With Node.js

Using Node.js for this “sidekick” companion process was useful for a few reasons:

  • Most importantly, the event-based nature of Node.js took care of the timing and scheduling aspects of the sidekick. Scheduling each task execution is a simple setTimeout() call. Avoiding two instances of a task running at once requires only checking a single object property. Long-running tasks don’t affect the scheduling of other tasks. I can mostly just fire off the tasks when I want them and rely on Node’s internal event loop to make the HTTP requests and invoke my callbacks when appropriate.
  • Executing the requests back to the main HTTP server was dead simple with Node’s http module. Making requests, handling responses, and dealing with errors are all straightforward. The only code I needed to write was my application-specific logic. I didn’t need to spend any time on boilerplate connection handling mechanics.
  • Node’s signal handling and message passing on process exit made it straightforward to take special actions before shutdown, such as removing a PID file and executing a special request against the main HTTP server.
  • Using JavaScript made the configuration file format trivial to specify (admittedly, this is not a property unique to JavaScript), and it will also make it easy to extend the config file to specify task logic itself. (More details on this in the “What’s next?” section below.)
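
The scheduling points in the first bullet can be sketched in a few lines. This is an illustration under assumptions, not the sidekick source: scheduleTask is a hypothetical name, runTask(task, done) is assumed to start the HTTP request and call done when it finishes, and waiting one full interval before the first run is a choice made for this sketch.

```javascript
// Hypothetical scheduler sketch. `runTask(task, done)` is assumed to start the
// task's HTTP request and call `done` when the response (or an error) arrives.
// `timers` exists only so the sketch can be exercised without real delays.
function scheduleTask(task, runTask, timers = { setTimeout }) {
  const intervalMs = task.frequency * 1000;

  function tick() {
    // Re-arm first, so a slow task never delays the next tick.
    timers.setTimeout(tick, intervalMs);

    // A single object property is enough to prevent overlapping runs.
    if (task.running) {
      return; // previous run still in flight; skip this round
    }
    task.running = true;
    runTask(task, () => {
      task.running = false;
    });
  }

  timers.setTimeout(tick, intervalMs);
}
```

Because each task re-arms its own timer and the request itself is asynchronous, a long-running task only causes its own ticks to be skipped; other tasks keep their schedule.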

Other Features

In addition to the functionality specified above, the configuration file supports the following options:

If a task’s frequency is set to initialize, then the task is treated specially as the “initialization” task. This means it gets run first at process startup. No timed task runs until the first run of the initialization task completes. The initialization task does not run on any regular schedule but can be re-executed by sending SIGUSR2 to the sidekick. We use this feature to prime PHP caches which are cleared when the main HTTP server is gracefully restarted. (Apache uses SIGUSR1 for this function but SIGUSR1 is claimed by Node to activate its debugger.)
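
One way to picture the initialization behavior is the sketch below. The function name start and the caller-supplied helpers runTask(task, done) and startTimedTask(task) are hypothetical; only the "initialize" frequency value and the use of SIGUSR2 come from the description above.

```javascript
// Hypothetical startup sketch. `runTask(task, done)` runs one task and calls
// `done` when it completes; `startTimedTask(task)` begins a task's schedule.
// `proc` defaults to the real process object and is a parameter for testing.
function start(tasks, runTask, startTimedTask, proc = process) {
  const all = Object.values(tasks);
  const init = all.find((t) => t.frequency === 'initialize');
  const timed = all.filter((t) => typeof t.frequency === 'number');
  let started = false;

  function releaseTimedTasks() {
    if (!started) {
      started = true;
      timed.forEach(startTimedTask);
    }
  }

  function runInit() {
    // Only the first completion releases the timed tasks; later runs
    // (triggered by SIGUSR2) just redo the initialization work.
    runTask(init, releaseTimedTasks);
  }

  if (init) {
    runInit();
    proc.on('SIGUSR2', runInit);
  } else {
    releaseTimedTasks();
  }
}
```

With this shape, sending SIGUSR2 to the sidekick's PID re-runs the initialization task without disturbing the schedules that are already running.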

If a task’s frequency is set to shutdown, then the task is treated specially as the “shutdown” task. This means it is only run on process shutdown. This is useful for cleaning up resources or making notifications. We use this to have the server remove itself from our service discovery system.
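
A shutdown task could be wired to Node's signal events roughly as follows. This is a sketch under assumptions: installShutdownHandler is a hypothetical name, and the choice of SIGTERM and SIGINT as the triggering signals is mine, since the article does not say which signals sidekick listens for.

```javascript
// Hypothetical shutdown sketch: run the shutdown task once, then exit.
// `runTask(task, done)` is a caller-supplied function; `proc` defaults to the
// real process object and is a parameter only so the sketch can be tested.
function installShutdownHandler(tasks, runTask, proc = process) {
  const shutdownTask = Object.values(tasks)
    .find((t) => t.frequency === 'shutdown');
  let shuttingDown = false;

  function onSignal() {
    if (shuttingDown) {
      return; // ignore repeated signals while the task is in flight
    }
    shuttingDown = true;
    if (!shutdownTask) {
      proc.exit(0);
      return;
    }
    // Exit only after the shutdown request has completed.
    runTask(shutdownTask, () => proc.exit(0));
  }

  // Which signals to handle is an assumption of this sketch.
  proc.on('SIGTERM', onSignal);
  proc.on('SIGINT', onSignal);
}
```

Other cleanup, such as removing a PID file, would go in the same handler before the exit call.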

A task can have a start-delay key which indicates the number of seconds to wait before kicking off the first run of the task. This is useful for staggering the execution of multiple tasks. Instead of having ten tasks wake up every 60 seconds together and fire off their requests, you could stagger them each by one or two seconds to spread out the load.
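
For example, three tasks with the same frequency can be spread out like this. The task names and URLs are made up for illustration, and the helpers firstRunDelayMs and startAll are hypothetical; treating a missing start-delay as zero is also an assumption.

```javascript
// Milliseconds to wait before a task's first run. Treating a missing
// start-delay as zero is an assumption of this sketch.
function firstRunDelayMs(task) {
  return (task['start-delay'] || 0) * 1000;
}

// Hypothetical task names and URLs, staggered two seconds apart so the three
// requests do not all hit the server in the same instant.
const staggeredTasks = {
  'refresh-a': { url: '/xn/tasks/refresh-a', frequency: 60 },
  'refresh-b': { url: '/xn/tasks/refresh-b', frequency: 60, 'start-delay': 2 },
  'refresh-c': { url: '/xn/tasks/refresh-c', frequency: 60, 'start-delay': 4 },
};

// Kick off each task after its own delay. `startTask` is caller-supplied;
// `timers` is a parameter only so the sketch can run without real waits.
function startAll(tasks, startTask, timers = { setTimeout }) {
  for (const task of Object.values(tasks)) {
    timers.setTimeout(() => startTask(task), firstRunDelayMs(task));
  }
}
```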

A task can have a dedicated-child key with a boolean value (defaults to false if this key is not present) indicating whether sidekick should spawn a separate process to execute the task in. This could be useful if the task response is large, might somehow crash Node, or if the task execution itself (once the “specify task logic as JavaScript in the config file” expansion described below is complete) is CPU intensive.
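
The following sketch shows one possible way to isolate a task in its own process using child_process.spawn. It is not how sidekick necessarily does it: wantsDedicatedChild and runInChild are hypothetical names, the host and port are assumed, and running an inline script with node -e is just a convenient way to keep the example in one file.

```javascript
const { spawn } = require('child_process');

// dedicated-child is a boolean and counts as false when the key is absent.
function wantsDedicatedChild(task) {
  return task['dedicated-child'] === true;
}

// Hypothetical: perform the task's GET in a throwaway Node process, so that a
// huge response or a crash cannot take the sidekick down with it.
// The host and port of the local httpd are assumptions.
function runInChild(task, callback, host = '127.0.0.1', port = 80) {
  const options = JSON.stringify({ host, port, path: task.url });
  const script =
    `require('http').get(${options}, (res) => {` +
    `  res.resume();` +
    `  res.on('end', () => process.exit(0));` +
    `}).on('error', () => process.exit(1));`;

  // process.execPath is the path of the currently running node binary.
  const child = spawn(process.execPath, ['-e', script], { stdio: 'ignore' });
  child.on('error', (err) => callback(err));        // the child could not start
  child.on('exit', (code) => callback(null, code)); // 0 means the GET completed
}
```

The scheduler would then pick between the in-process path and runInChild based on wantsDedicatedChild(task).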

What’s next?

Task Enhancements

Tasks are just GET requests. There are some use cases for which it would be nice to extend this to other methods and perhaps allow specification of other attributes of the request (body, headers).
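
To show what such an extension might look like, here is a speculative sketch. The method, headers, and body keys do not exist in sidekick's configuration today; they, the helper names buildRequest and send, and the assumed host and port are all hypothetical.

```javascript
const http = require('http');

// Hypothetical: `method`, `headers`, and `body` are NOT sidekick config keys
// today. A task without them falls back to a plain GET, as now.
function buildRequest(task, host = '127.0.0.1', port = 80) {
  const body = task.body === undefined ? null : String(task.body);
  const headers = Object.assign({}, task.headers);
  if (body !== null) {
    headers['Content-Length'] = Buffer.byteLength(body);
  }
  return {
    options: { host, port, path: task.url, method: task.method || 'GET', headers },
    body,
  };
}

// Send the request and hand the status code (or the error) to the callback.
function send(task, callback) {
  const { options, body } = buildRequest(task);
  const req = http.request(options, (res) => {
    res.resume(); // discard the response body
    res.on('end', () => callback(null, res.statusCode));
  });
  req.on('error', callback);
  if (body !== null) {
    req.write(body);
  }
  req.end();
}
```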

Having all tasks just be URL callbacks into the server works great for Apache and PHP because the PHP code behind the request can execute whatever we need it to. But this approach is not so useful when wrapping other servers without such easy programmability. (For example, we are exploring doing the same sidekick wrapping for Redis.) In that case, a useful extension to sidekick would be the ability to specify JavaScript code, not just URLs, to define a task to run. The tasks can then do anything that Node.js can do, such as talk to remote network services or do local filesystem cleanups.
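
One possible shape for this, purely as a sketch: if the config were loaded as a JavaScript module, a task could carry a function instead of a URL. The run key, the executorFor helper, and the example task below are all hypothetical and not part of sidekick.

```javascript
// Hypothetical: a `run` key holding a function is not part of sidekick's
// config format. `runUrlTask(task, done)` stands for the existing URL path.
function executorFor(task, runUrlTask) {
  if (typeof task.run === 'function') {
    return (done) => task.run(done);       // task defined as code
  }
  return (done) => runUrlTask(task, done); // task defined as a URL
}

// Example of a code-defined task; the body is a stand-in for real work such
// as talking to a remote service or cleaning up local files.
const exampleCodeTask = {
  frequency: 300,
  run(done) {
    done(null, 'cleaned');
  },
};
```

The scheduler would call executorFor(task, runUrlTask)(done) without needing to know which kind of task it has.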

Process Monitoring

If the sidekick dies, the main service should be killed too, and vice versa. We work around this right now with an external liveness checker that pokes the server every few seconds: if it finds that only the sidekick (or only the main service) is running, it kills the survivor and reports a failure to our monitoring system. Ideally the sidekick could take care of this itself. When it dies, it could kill the main server; it could also periodically check the health of the main server and kill itself if it finds the main server is dead.
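
The "check the health of the main server" half could start from something like the sketch below, which only tests whether a process with a given PID still exists. The function names are hypothetical, and where the main server's PID comes from (for example, a PID file) is left as an assumption.

```javascript
// Returns true if a process with this PID exists. Signal 0 delivers nothing;
// process.kill() only performs the existence and permission check.
function isAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}

// Hypothetical watcher: poll the main server's PID and call `onDead` once it
// disappears. How the PID is obtained is an assumption left to the caller.
function watchMainServer(pid, intervalMs, onDead) {
  const timer = setInterval(() => {
    if (!isAlive(pid)) {
      clearInterval(timer);
      onDead();
    }
  }, intervalMs);
  return timer;
}
```

A PID check only shows that the process exists, not that it is serving requests; a fuller health check might also issue a request to the server, much as the external liveness checker does today.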

Download the Code

You can download the sidekick at https://github.com/ning/sidekick.

David Sklar
I'm a Distinguished Engineer at Ning, where I've worked since March 2005. I spend my time bridging boundaries -- between programming languages, platform layers, and development teams. Aside from technical challenges, I enjoy eating, elephants, and entropy.
