Drupal/PHP: User Sessions and Garbage Collection

I recently had to troubleshoot a web site running on the LAMP platform.  I have deep knowledge of the Windows platform technologies, but only a passing knowledge of the technologies on which this particular site was built.

The specific technologies involved were PHP 5.2 (5.2.6), MySQL 5 (5.0.51a), Apache 2.2.9, and Debian Linux (lenny9).  The web site was based on Drupal 6 (6.19).  Everything was running on a single virtualized server.  The problem was that the database backups were taking so much time and consuming so many server resources that the web site was becoming unavailable for 10-20 minutes every night.

Following is a description of what I did to find and correct the problem.  This description will highlight some SQL queries that are useful for troubleshooting MySQL, as well as configuration settings that are key to management of Drupal user sessions and PHP garbage collection.

The original administrator for the site had set up a process to back up the MySQL database, compress the backup file into a ZIP, and move the ZIP file to a secondary network location.  As the site outages corresponded to the times when this backup process was running, it seemed clear that it was the cause of the problems.  The question was why.

The resources allocated to the virtual server, particularly RAM, were somewhat limited.  So an initial guess at the source of the problem was simply that the server was underpowered.

I examined the web server logs and confirmed that the web server was producing HTTP 503 errors during the period of time that the backups were occurring.  Further examination of the web server logs showed that the periods of time that the 503 errors were occurring each night had been gradually increasing over time.  In fact, the site had run for more than 18 months without a problem.  After that, 503 errors began to appear.  These errors had gone unnoticed for about three months.  Simple site monitors were in place to verify the availability of the web site, but those monitors only checked the site every 10 to 15 minutes.  Because of this, the initial site outages were rarely long enough to trigger the site monitors.  After about three months the outages became long enough to consistently trigger the site monitors, and forced my investigation into the source of the problem.

In all of this time the server resources were unchanged, so the theory that the site was underpowered was called into question by the slow-but-steady escalation of the problem.

On the other hand, data was constantly being added to the site, so I began to consider the possibility that the growth of the database was to blame.  It seemed feasible that the backups would take longer as the database grew, as would the corresponding site outages.

I ran the following SQL query on the MySQL server, which returned the size in MB of each database on the server.  It showed that the database for the web site was 5GB in size.

SELECT s.schema_name,  
   SUM(t.data_length + t.index_length)/1024/1024 total_size,  
  (SUM(t.data_length + t.index_length)-SUM(t.data_free))/1024/1024 data_used,
   SUM(data_free)/1024/1024 data_free
FROM INFORMATION_SCHEMA.SCHEMATA s INNER JOIN INFORMATION_SCHEMA.TABLES t
    ON s.schema_name = t.table_schema
GROUP BY s.schema_name
ORDER BY total_size DESC;

5GB seemed an excessively large size for that database, and provided a likely explanation for why the backup operations were taking so long.  To further investigate, I adjusted the SQL query to show the sizes of each individual table on the server.  The new query looked like this…

SELECT s.schema_name, table_name,  
  SUM(t.data_length + t.index_length)/1024/1024 total_size,  
(SUM(t.data_length + t.index_length)-SUM(t.data_free))/1024/1024 data_used, 
  SUM(data_free)/1024/1024 data_free
FROM INFORMATION_SCHEMA.SCHEMATA s INNER JOIN INFORMATION_SCHEMA.TABLES t
    ON s.schema_name = t.table_schema
GROUP BY s.schema_name, table_name
ORDER BY total_size DESC;

… which revealed that the "sessions" table in the database was 4GB.  Ah-ha!  Now the questions centered on the “sessions” table. What was it used for?  And what was a reasonable size for it?

A little research showed that the "sessions" table is a Drupal table that stores user session information for the web site.  If user sessions don’t expire, the “sessions” table grows indefinitely. 

One final SQL query, shown here, revealed that there was data in the “sessions” table that was more than two years old.  Notice the use of the FROM_UNIXTIME function to transform the timestamp values to a human-readable date format.

SELECT uid, sid, hostname,  
  FROM_UNIXTIME(timestamp, ‘%Y-%d-%m %H:%I:%S %X’) AS datecreated, 
  cache, session
from sessions
order by datecreated;

That gave me the explanation for why the problem became worse over time… user sessions were not expiring, which meant that the “sessions” table kept growing, and therefore the database backup and compress operations were taking longer and longer to complete.

So… the final question… why was this happening?  The blog post found at http://success.grownupgeek.com/index.php/2008/08/14/drupal-sessions-table/  held the answer.  It explains that  the settings.php file contains the Drupal settings that control the use of the “sessions” table.  The default settings are as follows:

ini_set(‘session.cache_expire’,     200000);
ini_set(‘session.cache_limiter’,    ‘none’);
ini_set(‘session.cookie_lifetime’,  2000000);
ini_set(‘session.gc_maxlifetime’,   200000);

200000 seconds sets a session length of more than 2 days. The blog post recommended settings for those values that were more like this (45 minute sessions):

ini_set(‘session.cache_expire’,     2700);
ini_set(‘session.cache_limiter’,    ‘none’);
ini_set(‘session.cookie_lifetime’,  0);
ini_set(‘session.gc_maxlifetime’,   2700);

I think that 45 minutes may be a bit short, but it does make sense to reduce the session length from the default settings (which are overlong, in my opinion).

I checked the configuration for the site (the file was located at /var/www/citebank.org/sites/default/settings.php) to see if the settings had been modified from the default. Nope; the site was still running at the default of more than 2 days per user session.  Still, even with the default settings there should not have been data more than two years old in the “sessions” table.

What was going on?  Going back to the blog post, I noticed a user comment that held the answer.  The comment reads “Debian/Ubuntu doesn’t have garbage collection set up by default in the apt-get version of PHP5”.  

Garbage collection is the process that cleans old data from the “sessions” table.  So, if garbage collection wasn’t configured, it would explain why the table contained such old data.  The site I was troubleshooting was running on PHP5 on Debian Linux.  apt-get  is a command line tool for managing software on Linux.  I did not know how (or if there even was a way) to find out if apt-get was used to install PHP, but it seemed likely.

The blog comment went on to describe some additional modifications to the settings.php file.  To ensure that garbage collection is configured, the following two lines need to exist (or be added) to the settings.php file:

ini_set(‘session.gc_probability’,  1);
ini_set(‘session.gc_divisor’,   100);

At this point, I was convinced that I had enough information to correct the problems.  I did have one concern, though.  Given that there was 4GB and millions of rows worth of data in the “sessions” table, I assumed that changing the garbage collection settings would cause Drupal to issue delete statements against all of the rows that were older than the configured length of a session.  That would be bad… millions of logged DELETE operations against a database happening all at once will bring the server to its knees.

I knew that a TRUNCATE TABLE statement issued against a SQL Server table was a quick (re: not logged) way of clearing out a table, but I wasn’t sure if that was also true on MySQL.  A little research showed that the version of MySQL on which the site was running would implement a TRUNCATE TABLE statement by actually dropping and recreating the “sessions” table… quick and easy.  And, no side effects other than a handful of active user sessions being prematurely dropped.  In this case, that was an acceptable side effect.

After having learned all of this, here are the steps that I took to correct the problem.

  1. Edit the settings.php file and modify the existing session variables: session.cache_expire, session.cache_limiter, session_cookie_lifetime, and session.gc_maxlifetime.
  2. Add the additional session variables to settings.php: session.gc_probability and session.gc_divisor.
  3. Issue a “TRUNCATE TABLE sessions” command against the MySQL database.
  4. Restart Apache.

That’s it.  A lot of pain caused by a few missing configuration settings.  Lesson learned:  to avoid unexpected problems, be sure to appropriately configure session lifetime and garbage collection when setting up a web site based on Drupal.

Advertisements

One Response to Drupal/PHP: User Sessions and Garbage Collection

  1. Anonymous says:

    The “Session expire” module is also helpful for addressing this sort of problem: http://drupal.org/project/session_expire

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: