This post is written for SysAdmins, IT Directors, and CTOs who either manage a Drupal 7 site or are thinking of bringing a Drupal 7 site into their portfolio of technologies. My goal is to demystify Drupal and help you understand how it works, how to best manage it, and the various tools and techniques to help ensure a good experience with Drupal.
Drupal is an open-source PHP framework, licensed under GPLv2, used to build websites and applications. It is modular, highly scalable, supported by a community of 1m+ users, and powers ~14% (23.8m) of all CMS-driven sites on the web. Some notable sites include whitehouse.gov, examiner.com, and economist.com.
Drupal can be used on all major server OSs, including *nix, Microsoft, Sun, FreeBSD, OS X Server, and even AmigaOS. Minimally, Drupal 7 requires PHP 5.2.5 (5.3.x recommended, see here), a web server (Apache 2.x recommended; it has also been reported to work on Apache 1.3, Nginx, and Microsoft IIS), and a database server. Fortunately, Drupal has a database abstraction layer that can conceptually connect to any database. MySQL (and PostgreSQL, to a lesser extent) tends to be the most used, tested, and supported in the Drupal community. There are drivers that support both Oracle and SQL Server/Azure (and other databases, including a few NoSQL options); however, these are "contrib" modules (more about that later) that may or may not support all your requirements. Drupal can be configured to work in a master/slave environment as well.
A Drupal site is composed of core Drupal, contrib modules, custom modules, and one or more themes. To help you better understand:
- Core Drupal: This is the set of modules included with every Drupal site, and it is what someone is referring to when they say "Version 7.26 of Drupal is installed on the site"
- Contrib modules: Contrib modules are pieces of functionality that are developed and contributed back to the Drupal community (see here). Examples include WYSIWYG editors, automated query writers, and drop-down menus; there are thousands. Every site will have some, often tens and sometimes 100+. See here for contrib module usage.
- Custom modules: Sometimes a contrib module does not have all the functionality required. Other times requirements are so obscure that there's nothing to do but start from scratch. And sometimes it's just easier to create something programmatically. Custom modules are custom code written by Drupal programmers.
In a Drupal installation, contrib modules, custom modules, and custom themes hopefully live in their own directories, usually under sites/*, sites/default/*, sites/all/*, or profiles/*. Drupal is flexible enough that they may be placed anywhere, so sometimes you simply need to grep to find where something is located. Sometimes a profile (also called a distribution) is used to set up a site. That is, when Drupal is initially installed, a "profile" can be used to enable modules, configure settings, and more. In this case, contrib modules and themes may live in /profiles/*.
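If you'd rather not grep blindly, one way to track something down is to search for its .info file, which every Drupal 7 module and theme ships with. This is only a sketch: the find_module helper and the /var/www/drupal path are names I made up for illustration, not anything Drupal provides.

```shell
#!/bin/sh
# find_module DRUPAL_ROOT NAME
# Print the directory of every module or theme called NAME under DRUPAL_ROOT.
# Drupal 7 identifies each module and theme by a NAME.info file.
find_module() {
  find "$1" -type f -name "$2.info" -exec dirname {} \;
}

# Example (hypothetical path):
#   find_module /var/www/drupal views
```

If the same name turns up in more than one place, that alone is worth raising with your developers.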
I include this section next because you should know this before proceeding.
Drupal saves just about everything in the database. Thus, when developing and supporting a Drupal site, a problem inevitably occurs: How do you make database changes to your local site and push those changes to production? The solution is a module (and paradigm) called Features. Features allows you to capture database changes, such as changes to views, contexts, blocks, content types, fields, variables, etc., in code. Some items cannot be captured this way; for those, a developer uses hook_update_N, which allows you to programmatically apply changes to a database. Together, Features and hook_update_N allow a developer to push any and all local changes to any environment. IF YOU ARE EVER WORKING WITH A DEVELOPER AND S/HE SAYS THEY CAN'T PUSH A CHANGE TO PRODUCTION WITHOUT MANUALLY MAKING CHANGES, YOU ARE WORKING WITH THE WRONG DEVELOPER. Drupal 100% supports a hands-free production environment.
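On the developer's side, capturing a change usually looks something like the sketch below: re-export the feature so the code matches what was configured in the UI, then commit it. The export_feature function, the site_config feature name, and its path are all hypothetical; drush fu is the features-update command that the Features module adds to Drush.

```shell
#!/bin/sh
# Override these to point at specific binaries if needed.
DRUSH=${DRUSH:-drush}
GIT=${GIT:-git}

# export_feature FEATURE_NAME FEATURE_DIR
# Re-export a feature from the local database into code, then commit it.
export_feature() {
  $DRUSH fu -y "$1" &&
  $GIT add "$2" &&
  $GIT commit -m "Update $1 feature export"
}

# Example (hypothetical feature and path):
#   export_feature site_config sites/all/modules/features/site_config
```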
I strongly encourage all sysadmins to integrate the Git version control system into their workflow. It's easy, free, supported, used by Drupal.org, and it works. It was created to support the Linux kernel, so you know there are some good developers on the project. I also recommend following Nvie's model for Git branching, which ties into your dev, staging, and production environments. Loosely summarized, you have the dev branch sitting on your dev environment and master (or a tag) sitting on QA and prod. When a Drupal developer makes a commit/merge/tag and you want to deploy to an environment, you:
- git pull the latest code (or tag) to the environment
- revert the database (i.e. features) to sync with code (e.g. drush fra -y)
- run any database updates (i.e. hook_update_N) (e.g. drush updb -y)
- clear the Drupal cache (e.g. drush cc all)
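The steps above can be strung together into a small script. Treat this as a minimal sketch under a few assumptions: the deploy function name and the example path are mine, it deploys a branch rather than a tag (for a tag you would fetch and check out instead of pulling), and git -C needs a reasonably recent Git.

```shell
#!/bin/sh
# Override these to point at specific binaries if needed.
GIT=${GIT:-git}
DRUSH=${DRUSH:-drush}

# deploy DRUPAL_ROOT BRANCH
# Runs the four steps in order and stops at the first one that fails.
deploy() {
  $GIT -C "$1" pull origin "$2" &&   # 1. pull the latest code
  $DRUSH --root="$1" fra -y &&       # 2. revert all features
  $DRUSH --root="$1" updb -y &&      # 3. run database updates
  $DRUSH --root="$1" cc all          # 4. clear all caches
}

# Example (hypothetical path):
#   deploy /var/www/example master
```

Chaining with && matters here: if the pull fails, you don't want Drush reverting features against stale code.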
At some point a developer may ask you to install Drush on the server, which is very straightforward. Drush is a command-line shell and Unix scripting interface for Drupal. Some of the things you can automate with Drush include:
- Enable/disable modules
- Clear cache
- Revert features (i.e. sync the database with code)
- Run database updates
- Download contrib modules
- So. Much. More.
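To make that list concrete, here is a rough mapping from each task to the Drush command I'd reach for on a Drupal 7 site. The site_task wrapper and its action names are invented for this post; the commands inside (en, dis, cc all, fra, updb, dl) are the usual Drush aliases, with fra coming from the Features module.

```shell
#!/bin/sh
# Override to point at a specific Drush binary if needed.
DRUSH=${DRUSH:-drush}

# site_task ACTION [ARGS...]
site_task() {
  action=$1
  [ $# -gt 0 ] && shift
  case $action in
    enable)   $DRUSH en -y "$@" ;;   # enable modules
    disable)  $DRUSH dis -y "$@" ;;  # disable modules
    cache)    $DRUSH cc all ;;       # clear all caches
    revert)   $DRUSH fra -y ;;       # revert all features
    updatedb) $DRUSH updb -y ;;      # run database updates
    download) $DRUSH dl "$@" ;;      # download contrib modules
    *) echo "usage: site_task enable|disable|cache|revert|updatedb|download [args]" >&2
       return 2 ;;
  esac
}

# Example:
#   site_task enable views
```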
Maintenance and Updating Drupal
Drupal has a status report, located at /admin/reports/status, that you will want to visit every now and then. If the dblog module is enabled (which I don't recommend, discussed below), you can also visit /admin/reports/dblog and see the latest errors reported on the site, including PHP warnings, 404s, and all other types of goodies. See /admin/reports for the full list of reports available.
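If you prefer the command line, Drush can show much of the same information. The sketch below assumes the rq (core-requirements) and ws (watchdog-show) commands and their --severity and --count options behave the way I remember from the Drush versions that support Drupal 7, so check drush help on your install. The site_report name is made up.

```shell
#!/bin/sh
# Override to point at a specific Drush binary if needed.
DRUSH=${DRUSH:-drush}

# site_report DRUPAL_ROOT
site_report() {
  # Status report entries at warning level or worse (1 = warning, 2 = error).
  $DRUSH --root="$1" rq --severity=1
  # The 20 most recent log entries; this only works while dblog is enabled.
  $DRUSH --root="$1" ws --count=20
}

# Example (hypothetical path):
#   site_report /var/www/example
```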
You may or may not need to update Drupal yourself, depending on your relationship with developers. Here's a summary of how you update core. A couple of quick notes:
- Some important files may be overwritten in the process, including .htaccess and any other files/directories in the root directory.
- Using Drush, you can automate updates by running drush up. See here for details, or google for more up-to-date instructions.
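Given the overwrite risk noted above, I'd wrap a core update in something like the following. It is a sketch, not a recipe: update_core and the backup file names are my own, and it only guards .htaccess, so extend it for any other root-level files you have customized.

```shell
#!/bin/sh
# Override to point at a specific Drush binary if needed.
DRUSH=${DRUSH:-drush}

# update_core DRUPAL_ROOT BACKUP_DIR
update_core() {
  mkdir -p "$2" &&
  # Keep a copy of .htaccess, since a core update may replace it.
  cp -p "$1/.htaccess" "$2/htaccess.bak" &&
  # Dump the database so the update can be rolled back.
  $DRUSH --root="$1" sql-dump --result-file="$2/pre-update.sql" &&
  # Update Drupal core only ("drupal" is the project name for core).
  $DRUSH --root="$1" up drupal -y &&
  if ! diff -q "$2/htaccess.bak" "$1/.htaccess" >/dev/null; then
    echo ".htaccess changed during the update; compare it with $2/htaccess.bak"
  fi
}

# Example (hypothetical paths):
#   update_core /var/www/example /var/backups/example
```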
Caching and Performance
Drupal is very "chatty" with its database; a user on this post says 300-400 queries per page view is normal. I don't disagree. This means page views are resource intensive unless you do some caching. This is very environment-specific so I'll just cover the basics:
- Make sure to enable CSS & JS aggregation and caching at /admin/config/development/performance
- Views can be cached as well; this is a decent post about it
- There is a module called dblog that logs each error/warning/404/etc to a table called watchdog. This is great for development but resource intensive for production environments. Instead, enable the syslog module and write all events to syslog. You can configure syslog at /admin/config/development/logging.
- For good measure, make sure on-screen error display is disabled as well (/admin/config/development/logging); you don't want errors printed to screen
- Finally, check out Varnish, APC, memcache, and a CDN of choice
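Most of the settings above can also be applied from the command line, which is handy when you want every environment configured the same way. This is a sketch assuming the standard Drupal 7 variable names (preprocess_css, preprocess_js, cache, error_level) and Drush's vset --exact option; confirm them with drush vget on your own site first. The tune_production name is hypothetical.

```shell
#!/bin/sh
# Override to point at a specific Drush binary if needed.
DRUSH=${DRUSH:-drush}

# tune_production DRUPAL_ROOT
tune_production() {
  $DRUSH --root="$1" vset --exact preprocess_css 1 &&  # aggregate CSS
  $DRUSH --root="$1" vset --exact preprocess_js 1 &&   # aggregate JS
  $DRUSH --root="$1" vset --exact cache 1 &&           # cache pages for anonymous users
  $DRUSH --root="$1" dis -y dblog &&                   # stop logging to the watchdog table
  $DRUSH --root="$1" en -y syslog &&                   # log to syslog instead
  $DRUSH --root="$1" vset --exact error_level 0        # 0 = show no errors on screen
}

# Example (hypothetical path):
#   tune_production /var/www/example
```

Varnish, APC, memcache, and a CDN are left out on purpose; they depend too much on your environment for a one-size-fits-all script.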
This is a very high-level overview of administering a Drupal 7 site. I strongly recommend working with an experienced sysadmin who can walk you through all these steps. While I can write about my general experiences, managing a Drupal site is typically a highly customized solution tailored to your experience and organization.