I start with a simple Worker process. It’s a Symfony Command that is meant to run as multiple instances at once. You start each process with a unique ID as its argument, for example 1, 2, 3, and so on. The process runs in an infinite while-loop, the main loop of the script. A main loop like this should always include a sleep() call so that it doesn’t hog the CPU. For simple process termination, each process looks up its own shutdown key in Redis at the top of the main loop. This makes it possible to shut down a Worker process gently: the process gets the chance to finish its current work batch and then stops itself before the next batch cycle begins. This is useful for maintenance.
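A minimal sketch of such a main loop, not the actual Command. The function name runWorker(), the key name worker:<ID>:shutdown, and the FakeRedis class are assumptions for illustration. FakeRedis only exists so the snippet runs without a Redis server; a connected phpredis \Redis object offers exists() and del() under the same names. Deleting the key once it has been seen is an extra assumption, so that a restarted worker does not stop right away.

```php
<?php
declare(strict_types=1);

/**
 * In-memory stand-in (hypothetical) so the sketch runs without a Redis server.
 */
final class FakeRedis
{
    /** @var array<string, string> */
    private array $keys = [];

    public function set(string $key, string $value): bool
    {
        $this->keys[$key] = $value;
        return true;
    }

    public function exists(string $key): bool
    {
        return isset($this->keys[$key]);
    }

    public function del(string $key): int
    {
        $existed = isset($this->keys[$key]);
        unset($this->keys[$key]);
        return $existed ? 1 : 0;
    }
}

/**
 * Runs the main loop until the worker's shutdown key appears.
 * Returns the number of completed batch cycles.
 */
function runWorker(object $redis, int $workerId, callable $processBatch, int $sleepSeconds = 1): int
{
    $shutdownKey = "worker:{$workerId}:shutdown"; // hypothetical key name
    $cycles = 0;

    while (true) {
        // Checked at the top of the loop, so a running batch is never interrupted.
        if ($redis->exists($shutdownKey)) {
            $redis->del($shutdownKey); // assumption: clean up for the next start
            break;
        }

        $processBatch($workerId);
        $cycles++;

        sleep($sleepSeconds); // pause between cycles instead of spinning
    }

    return $cycles;
}
```

Under these assumptions, you would stop worker 7 gently by setting the key from outside, for example with SET worker:7:shutdown 1 in redis-cli. The loop then ends after the current batch.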
The queue in Redis is a simple List: I append IDs to the end (RPUSH) and fetch IDs from the beginning (LPOP). The IDs can reference Redis objects, MySQL rows, or anything else. The first thing I do is check whether the List operations are atomic, because it would be bad if two Worker processes picked up the same object ID. For this purpose I fill the List with 1,000,000 IDs and run many Worker processes concurrently. As it turns out, LPOP is (of course) atomic, like every other single Redis command, since Redis executes commands one at a time.
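One way to evaluate such a test is to let every worker log the IDs it popped and compare the logs afterwards. The sketch below shows that comparison. FakeList and findDuplicateIds() are hypothetical names, and FakeList is only an in-memory stand-in that mimics the rPush()/lPop() method names of phpredis, so it says nothing about real concurrency.

```php
<?php
declare(strict_types=1);

/**
 * In-memory stand-in (hypothetical) for a Redis List.
 */
final class FakeList
{
    /** @var array<string, list<string>> */
    private array $lists = [];

    /** Appends values to the end of the list and returns the new length. */
    public function rPush(string $key, string ...$values): int
    {
        foreach ($values as $value) {
            $this->lists[$key][] = $value;
        }
        return count($this->lists[$key] ?? []);
    }

    /**
     * Removes and returns the first element, or false if the list is empty.
     *
     * @return string|false
     */
    public function lPop(string $key)
    {
        if (empty($this->lists[$key])) {
            return false;
        }
        return array_shift($this->lists[$key]);
    }
}

/**
 * Takes the IDs popped by each worker and returns every ID
 * that was handed out more than once.
 *
 * @param array<int|string, list<string>> $poppedPerWorker
 * @return list<int|string>
 */
function findDuplicateIds(array $poppedPerWorker): array
{
    $all = array_merge(...array_values($poppedPerWorker));
    $counts = array_count_values($all);

    return array_keys(array_filter($counts, fn (int $n): bool => $n > 1));
}
```

If the result is an empty array after the real run, no ID was delivered to two workers.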
Each Worker process fetches a specific number of IDs at once. This number is the Batch Size. It can be a dynamic integer value or a predefined constant, and it defines how many objects each Worker process fetches in a single batch cycle. A good Batch Size is something like 2000. It’s not a good idea to load more than a few thousand objects into PHP at once, otherwise the PHP process becomes bloated. Long-running processes in particular grow over time. You will need to use unset() or restart the process from time to time to free memory. PHP does garbage collection, but in some cases you still need to dereference objects yourself. This, of course, doesn’t affect a standard PHP application served through an HTTP server, where memory is released at the end of each request. For a background process like a queue worker, it’s good practice to keep memory usage as small as possible.
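A rough sketch of one batch cycle with explicit cleanup. The names popBatch() and drainInBatches() and the BATCH_SIZE constant are made up for this example. The $pop callable is assumed to return the next ID, or false once the queue is empty, which is how lPop() in phpredis behaves, so with a real connection it could be something like fn () => $redis->lPop('queue:ids') with a key name of your choice.

```php
<?php
declare(strict_types=1);

const BATCH_SIZE = 2000; // the value suggested in the text

/**
 * Collects up to $batchSize IDs by calling $pop repeatedly.
 *
 * @return list<string>
 */
function popBatch(callable $pop, int $batchSize = BATCH_SIZE): array
{
    $ids = [];
    while (count($ids) < $batchSize) {
        $id = $pop();
        if ($id === false) {
            break; // queue is empty
        }
        $ids[] = $id;
    }
    return $ids;
}

/**
 * Processes the queue batch by batch until it is empty.
 * Returns the number of batch cycles that did any work.
 */
function drainInBatches(callable $pop, callable $handleBatch, int $batchSize = BATCH_SIZE): int
{
    $cycles = 0;
    while (true) {
        $ids = popBatch($pop, $batchSize);
        if ($ids === []) {
            break;
        }

        $handleBatch($ids);
        $cycles++;

        unset($ids);         // drop the reference before the next batch is loaded
        gc_collect_cycles(); // also collect objects that only reference each other
    }
    return $cycles;
}
```

Anything the batch handler keeps in properties or static caches stays in memory regardless, so those need the same treatment.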
Once a Worker process has fetched a batch of IDs from Redis, it fetches the corresponding rows from MySQL, so we get the full objects for the IDs. The objects could also be stored in Redis, in which case everything would be fetched faster. In my case MySQL stores the objects. Keep in mind that a MySQL connection used in a background process needs to ping the server regularly. This keeps the connection open.
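A possible shape for the MySQL part, assuming PDO as the database layer, which is not necessarily what the worker really uses. The function names, the id column, and the idea of using a cheap SELECT 1 in place of a dedicated ping call are all assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Builds a SELECT with one placeholder per ID.
 * $table must be a trusted, hard-coded name because it is not escaped.
 */
function buildSelectByIds(string $table, int $idCount): string
{
    if ($idCount < 1) {
        throw new InvalidArgumentException('At least one ID is required.');
    }
    $placeholders = implode(', ', array_fill(0, $idCount, '?'));

    return "SELECT * FROM {$table} WHERE id IN ({$placeholders})"; // "id" is a hypothetical column
}

/**
 * Loads all rows for one batch of IDs in a single query.
 *
 * @param list<string> $ids
 * @return list<array<string, mixed>>
 */
function fetchRowsByIds(PDO $pdo, string $table, array $ids): array
{
    if ($ids === []) {
        return [];
    }
    $statement = $pdo->prepare(buildSelectByIds($table, count($ids)));
    $statement->execute(array_values($ids));

    return $statement->fetchAll(PDO::FETCH_ASSOC);
}

/**
 * Sends a trivial query so the server sees activity on the connection
 * and does not close it for being idle.
 */
function keepAlive(PDO $pdo): void
{
    $pdo->query('SELECT 1');
}
```

In a worker, keepAlive() would be called in cycles where the queue was empty and no other query was sent.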
To use the queue for more than one kind of object, create a common base class. Each type that should be processed through the queue gets its own subclass of the common class. This makes it easy to change the basic behavior for all queues. Each queue gets its own namespace in the Redis keys. For example
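As a rough sketch of that structure, with invented names throughout: AbstractQueue, UserQueue, InvoiceQueue, and the queue:... key prefixes are only placeholders for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Common base class: behavior shared by all queues lives here.
 */
abstract class AbstractQueue
{
    /** Namespace that is prepended to every Redis key of this queue. */
    abstract protected function prefix(): string;

    /** Builds a namespaced Redis key, e.g. "queue:user:ids". */
    public function key(string $suffix): string
    {
        return $this->prefix() . ':' . $suffix;
    }

    /** Key of the Redis List that holds the waiting IDs. */
    public function listKey(): string
    {
        return $this->key('ids');
    }
}

final class UserQueue extends AbstractQueue
{
    protected function prefix(): string
    {
        return 'queue:user';
    }
}

final class InvoiceQueue extends AbstractQueue
{
    protected function prefix(): string
    {
        return 'queue:invoice';
    }
}
```

Because every key goes through key(), changing the naming scheme for all queues means touching only the base class.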
-  This article is from 2007 but it’s still accurate: Memory Leaks With Objects in PHP 5. You can test it with PHP 5.6.