Getting phpBB to accept Django sessions

Posted on 10 December 2008 in Django, Programming, Python, Resolver Systems

phpBB is a fantastic bulletin board system. We use it at Resolver Systems for our forums, and it does a great job.

However, we're a Python shop, so we prefer to do our serious web development -- for example, the login system that allows our paying customers to download fully-featured unlocked versions of our software -- in Django.

We needed to have a single sign-on system for both parts of our website. Specifically, we wanted people to be able to log in using the Django authentication module, and then to be able to post on the forums without logging in again. This post is an overview of the code we used; I've had to extract it from various sources, so it might not be complete -- let me know in the comments if anything's missing.

I will be uploading something more polished to a Google Code project over the next few days. (UPDATE: I've now uploaded all of the code to the Google Code project, so if you want to use it, you should get it from there. However, the description below may still be of interest, and I'll keep it for historical reasons :-))

The PHP side

Let's consider the PHP code first. phpBB has "pluggable" authentication -- that is, you can provide a PHP module containing certain functions, and then in the admin UI tell it to use that module for authentication. These modules are stored in the subdirectory includes/auth, and the standard installation includes one called auth_apache.php, which allows people using Apache to use HTTP authentication. Our Django integration is based on this module, so it's worth going over the original code before we look at the modified version.

There are five functions in the module:

  1. autologin_apache. This is called when a user tries to do something that requires a login (and perhaps at other times too, I'm not sure). Its job is to check the current session, and determine if the state of that session is such that the user should be automatically logged in to phpBB. This is the core of the HTTP authentication: it checks that the current session relates to a user who is logged in HTTP-wise, and then tries to get a user of the same name from the phpBB user database. If there is such a user, then it returns their details. If there is not, then it creates one with a default profile. That last step can be counter-intuitive -- what kind of authentication system creates profiles for people it's never heard of before? -- but makes sense when you consider that it is only doing this for people who are actually already logged in using the other system that you're integrating with.
  2. user_row_apache. This function just generates a default phpBB profile given a username and a password.
  3. validate_session_apache. This function checks if the user passed to it matches the one who is logged in HTTP-wise.
  4. init_apache. This is used when you first switch to using Apache-based authentication, and is just a sanity check to make sure that you're logged in (HTTP-wise) as a user whose ID matches the user you're logged into phpBB as. This stops you from accidentally switching to using Apache-based authentication when you're not logged in using that kind of authentication and being logged out and unable to switch back again.
  5. login_apache. This is called by the phpBB login page to validate a user. To be perfectly honest, I'm not sure why it's included in the auth_apache module, because if it's correctly configured then anyone who performs an action that would require a login will be handled by autologin_apache, so they'd never see the login page. However, for completeness: this function checks that the current session relates to a user who is logged in HTTP-wise, and that the username used for the HTTP auth is the same as the one being used to log in. If all is OK, and the username does not identify an inactive user, then the function either returns the details of any existing user with the given username, or it creates a new one with a default profile.
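The autologin flow from item 1 can be summarised in a few lines. This is a rough Python sketch of the idea rather than phpBB's actual PHP; the users dictionary, the default_profile helper and the profile fields are all hypothetical stand-ins for the phpBB user table and user_row_apache.

```python
def default_profile(username):
    # Hypothetical stand-in for user_row_apache: build a minimal default profile.
    return {"username": username, "active": True, "group": "REGISTERED"}


def autologin(external_username, users):
    """Return the forum profile for a user already authenticated elsewhere.

    external_username is None when the other system has no logged-in user.
    users is a dict standing in for the phpBB user table.
    """
    if external_username is None:
        # Nobody is logged in on the other side, so there is nobody to log in.
        return None
    if external_username not in users:
        # First visit to the forums: create a profile on the fly.
        users[external_username] = default_profile(external_username)
    return users[external_username]


users = {"alice": {"username": "alice", "active": True, "group": "ADMINISTRATORS"}}
existing = autologin("alice", users)   # found in the table, returned unchanged
created = autologin("bob", users)      # not found, so a default profile is added
anonymous = autologin(None, users)     # no external login, no forum login
```

The important property is that a profile is only ever created for a name that the other system has already vouched for.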

From all that, it should be clear that in order to write a Django equivalent to this login system, all we need is a way of finding out, from PHP, which Django username is associated with the current session. The problem is, of course, that Django and PHP have entirely different session models, so we need to work out some way for them to communicate with each other.

We decided that cookies are the best way to do this. While the session objects may differ from PHP to Django, they both have access to the same set of cookies. Obviously, a trivial way for Django to pass the username to PHP would be to have a cookie that contained it. Almost equally obviously, this would be a terrible idea, because cookies are under the control of the user's browser, and you don't want people to be able to set the cookie to, say, "admin" using their browser's options, and then have admin rights on your forums.

However, Django and PHP both use cookies for their session management, and Django puts a session ID into the cookie sessionid. The value of that cookie is a primary key into the django_session database table, so from PHP you can get the cookie, get the session ID, and then get the Django session data. That's almost enough, but not quite. The problem is that the session data (which includes the ID of the logged-in user, and so leads to the user name) is pickled -- that is, it's encoded using Python's serialisation system. This cannot be decoded (as far as I know) from PHP. So, in order to get the user name from the Django session ID, we need to store the mapping from session IDs to users in the Django database, in a form that PHP can read. This requires a bit of Django coding, so let's look at that next.
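To make the problem concrete, here is a small sketch of the cookie-to-session lookup using only the Python standard library. The in-memory SQLite table is a simplified stand-in for django_session, and the encoding (pickle, then base64) is an assumption that leaves out the extra integrity data Django adds; the helper names are made up for this example.

```python
import base64
import pickle
import sqlite3
from http.cookies import SimpleCookie

# In-memory stand-in for the Django database (schema simplified).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE django_session (session_key TEXT PRIMARY KEY, session_data TEXT)"
)

# Assumed encoding: the session dict is pickled, then base64-encoded.
session = {"_auth_user_id": 42}
encoded = base64.b64encode(pickle.dumps(session)).decode("ascii")
db.execute("INSERT INTO django_session VALUES (?, ?)", ("abc123", encoded))


def session_key_from_header(cookie_header):
    """Extract the sessionid cookie from a raw Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("sessionid")
    return morsel.value if morsel is not None else None


def raw_session_data(db, session_key):
    """Return the stored (still encoded) session data, or None."""
    row = db.execute(
        "SELECT session_data FROM django_session WHERE session_key = ?",
        (session_key,),
    ).fetchone()
    return row[0] if row else None


key = session_key_from_header("sessionid=abc123; other=1")
raw = raw_session_data(db, key)
# Python can reverse the encoding; code without an unpickler only sees `raw`.
decoded = pickle.loads(base64.b64decode(raw))
```

Everything up to raw is easy to do from PHP. It is the last line that has no straightforward PHP equivalent, which is why the mapping needs to live in ordinary database columns instead.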

The Django side

The answers to this Stack Overflow question contain several good ideas on how to get a user ID from a session ID. The approach we took was to create a new Django model class (which would map to a table in the database) called SessionProfile. In the same way as you might create UserProfile objects to store information about specific users without having to change the built-in User class, this stores information about sessions without needing to change the Session class. The model code is simple:

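from django.contrib.auth.models import User
from django.contrib.sessions.models import Session
from django.db import models


class SessionProfile(models.Model):
    # One profile row per Django session.
    session = models.ForeignKey(Session, unique=True)
    # NULL while nobody is logged in on this session.
    user = models.ForeignKey(User, null=True)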

The next step, of course, is to populate it. As Peter Rowell suggested in the SO post, we decided to use Django middleware. This is ever-so-slightly more complex:

from django.contrib.auth.models import User
from django.contrib.sessions.models import Session

from resolver.usermanagement.models import SessionProfile


class SessionProfileMiddleware(object):

    def process_response(self, request, response):
        try:
            # For certain cases (I think 404) the session attribute
            # won't be set
            if hasattr(request, "session"):
                session = Session.objects.get(
                    pk=request.session.session_key
                )

                sessionProfile, _ = SessionProfile.objects.get_or_create(
                    session=session
                )

                userID = request.session.get("_auth_user_id")
                if userID:
                    user = User.objects.get(pk=int(userID))
                else:
                    user = None

                sessionProfile.user = user
                sessionProfile.save()

        except Session.DoesNotExist:
            # If there's no session associated with the current
            # request, there's nothing to do.
            pass

        return response
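The bookkeeping that process_response does can be traced without Django at all. The sketch below mirrors it with plain dictionaries; sync_session_profile and both dictionaries are hypothetical names for illustration, not part of the real middleware.

```python
def sync_session_profile(profiles, sessions, session_key):
    """Mirror of process_response's bookkeeping, using plain dicts.

    sessions: session_key -> session dict (stand-in for django_session)
    profiles: session_key -> user ID or None (stand-in for SessionProfile)
    """
    session = sessions.get(session_key)
    if session is None:
        # Equivalent of Session.DoesNotExist: nothing to record.
        return
    # get_or_create followed by save() collapses to one assignment here.
    profiles[session_key] = session.get("_auth_user_id")


sessions = {"abc123": {}}
profiles = {}

# Anonymous visit: the session exists but carries no user ID.
sync_session_profile(profiles, sessions, "abc123")
after_anonymous_visit = dict(profiles)

# After logging in, the session carries the user's ID.
sessions["abc123"]["_auth_user_id"] = 42
sync_session_profile(profiles, sessions, "abc123")
after_login = dict(profiles)

# A session key with no stored session leaves the mapping untouched.
sync_session_profile(profiles, sessions, "unknown")
```

Because the mapping is rewritten on every response, it follows the session: a row with no user for anonymous visitors, and the user's ID once they have logged in.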

To configure Django to run this, we put it in MIDDLEWARE_CLASSES, before the normal Django SessionMiddleware:

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'resolver.usermanagement.middleware.SessionProfileMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.transaction.TransactionMiddleware',
)
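The position in the list matters because Django calls process_request from the top of the list down, but process_response from the bottom up. A middleware listed before SessionMiddleware therefore handles the response after SessionMiddleware has, by which point the session row has been saved. The toy model below illustrates that ordering; Recorder and handle are made-up names, and this is a simplification of Django's real request handling.

```python
class Recorder:
    """Toy middleware that records when its hooks are called."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def process_request(self, request):
        self.log.append(("request", self.name))

    def process_response(self, request, response):
        self.log.append(("response", self.name))
        return response


def handle(middleware, request):
    """Simplified model of the order in which the hooks are called."""
    for mw in middleware:            # request phase: top to bottom
        mw.process_request(request)
    response = "response for " + request
    for mw in reversed(middleware):  # response phase: bottom to top
        response = mw.process_response(request, response)
    return response


log = []
stack = [
    Recorder("SessionProfileMiddleware", log),
    Recorder("SessionMiddleware", log),
]
result = handle(stack, "/forums/")
```

After this runs, the last entry in log is the response hook of SessionProfileMiddleware, which is the ordering the settings above rely on.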

All this together meant that we had a Django installation that maintained a table that mapped the session ID to the appropriate row in the auth_user table -- or to NULL if the session did not have a logged-in user. So now there was a way to write PHP code that could go from the sessionid cookie to the user name.
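The lookup that the PHP side needs is then a single join. Here it is sketched in Python against an in-memory SQLite database; the table name usermanagement_sessionprofile and its column names are assumptions based on Django's default naming, and the real PHP module may query things differently.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE usermanagement_sessionprofile (
        id INTEGER PRIMARY KEY,
        session_id TEXT UNIQUE NOT NULL,
        user_id INTEGER NULL REFERENCES auth_user (id)
    );
    INSERT INTO auth_user VALUES (1, 'alice');
    -- A logged-in session and an anonymous one.
    INSERT INTO usermanagement_sessionprofile VALUES (1, 'abc123', 1);
    INSERT INTO usermanagement_sessionprofile VALUES (2, 'def456', NULL);
    """
)


def username_for_session(db, session_key):
    """Return the username for a sessionid cookie value, or None."""
    row = db.execute(
        """
        SELECT u.username
        FROM usermanagement_sessionprofile AS sp
        JOIN auth_user AS u ON u.id = sp.user_id
        WHERE sp.session_id = ?
        """,
        (session_key,),
    ).fetchone()
    return row[0] if row else None


logged_in = username_for_session(db, "abc123")
anonymous = username_for_session(db, "def456")
unknown = username_for_session(db, "no-such-session")
```

An anonymous session and an unknown session ID both come back as None, so a forged or stale sessionid cookie simply looks like a user who is not logged in.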

The PHP side, continued

From the starting point of the auth_apache module, it was easy to create an auth_resolver module that accessed the Django tables appropriately. I won't go into how it works in detail -- instead, you can view the two files that make it up in the Google Code repository. They should be pretty self-explanatory in the light of the auth_apache description above. The files are:

Once we had uploaded these files, I made sure that we had a user called "administrator" on both the Django and the phpBB sides of our site, logged into both systems separately, went to the phpBB administration pages, and set the authentication to auth_resolver.

Now people only need to log in once on our site. There was just one more change to make; when a non-logged-in user tries to post on the forums, they are presented with a login form. With the setup I've described, this would not work, as it would try to log the user in using the login_resolver function from our authentication module, which is dependent on the user already being logged in to Django -- which they clearly are not as otherwise they would not have been presented with the login form! The solution was to change the page. The form that is shown to the user under these circumstances is defined by the forum's current style (skin); the default style puts it in styles/prosilver/template/login_body.html. Our current solution has been to simply replace this form with text that links to the Django login page. In the long run, we will streamline this. But that's a post for another day.

Any comments on the work so far are very much welcome!