Monthly Archives: December 2008

Money for spreadsheets

18 December 2008

We’ve produced a lot of interesting spreadsheets in-house at Resolver Systems — some of which I’ve blogged about here — but we’re really keen to see what everyone else is doing with Resolver One. So we’re running a competition: every month for the next five months, we’re asking people to send us interesting stuff that they’ve done with our product, and we’ll give $2,000 to the author of the best one. After five months we’ll give $15,000 to the author of “the best of the best”.

It should be interesting to see what people send us :-)

Getting phpBB to accept Django sessions

10 December 2008

phpBB is a fantastic bulletin board system. We use it at Resolver Systems for our forums, and it does a great job.

However, we’re a Python shop, so we prefer to do our serious web development — for example, the login system that allows our paying customers to download fully-featured unlocked versions of our software — in Django.

We needed to have a single sign-on system for both parts of our website. Specifically, we wanted people to be able to log in using the Django authentication module, and then to be able to post on the forums without logging in again. This post is an overview of the code we used; I’ve had to extract it from various sources, so it might not be complete. Let me know in the comments if anything’s missing. I will be uploading something more polished to a Google Code project over the next few days. [UPDATE: I’ve now uploaded all of the code to the Google Code project, so if you want to use it, you should get it from there. However, the description below may still be of interest, and I’ll keep it for historical reasons :-)]

The PHP side

Let’s consider the PHP code first. phpBB has “pluggable” authentication — that is, you can provide a PHP module containing certain functions, and then in the admin UI tell it to use that module for authentication. These modules are stored in the subdirectory includes/auth, and the standard installation includes one called auth_apache.php, which allows people using Apache to use HTTP authentication. Our Django integration is based on this module, so it’s worth going over the original code before we look at the modified version.

There are five functions in the module:

  1. autologin_apache. This is called when a user tries to do something that requires a login (and perhaps at other times too, I’m not sure). Its job is to check the current session, and determine if the state of that session is such that the user should be automatically logged in to phpBB. This is the core of the HTTP authentication: it checks that the current session relates to a user who is logged in HTTP-wise, and then tries to get a user of the same name from the phpBB user database. If there is such a user, then it returns their details. If there is not, then it creates one with a default profile. That last step can be counter-intuitive (what kind of authentication system creates profiles for people it’s never heard of before?) but makes sense when you consider that it is only doing this for people who are actually already logged in using the other system that you’re integrating with.
  2. user_row_apache. This function just generates a default phpBB profile given a username and a password.
  3. validate_session_apache. This function checks if the user passed to it matches the one who is logged in HTTP-wise.
  4. init_apache. This is used when you first switch to using Apache-based authentication, and is just a sanity check to make sure that you’re logged in (HTTP-wise) as a user whose ID matches the user you’re logged into phpBB as. This stops you from accidentally switching to using Apache-based authentication when you’re not logged in using that kind of authentication and being logged out and unable to switch back again.
  5. login_apache. This is called by the phpBB login page to validate a user. To be perfectly honest, I’m not sure why it’s included in the auth_apache module, because if it’s correctly configured then anyone who performs an action that would require a login will be handled by autologin_apache, so they’d never see the login page. However, for completeness: this function checks that the current session relates to a user who is logged in HTTP-wise, and that the username used for the HTTP auth is the same as the one being used to log in. If all is OK, and the username does not identify an inactive user, then the function either returns the details of any existing user with the given username, or it creates a new one with a default profile.
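
The autologin flow described in item 1 can be sketched in a few lines. This is only an illustration of the logic, written in Python rather than PHP; the function names, the dictionary standing in for the phpBB user table, and the fields in the default profile are all made up for the example and are not phpBB's real API.

```python
def default_profile(username):
    # Hypothetical stand-in for user_row_apache: builds a minimal default profile.
    return {"username": username, "active": True, "group": "registered"}


def autologin(external_username, forum_users):
    """Return the forum profile for whoever the external system says is logged in.

    external_username is None when nobody is logged in on the external system.
    forum_users is a plain dict standing in for the phpBB user table.
    """
    if external_username is None:
        return None  # not logged in elsewhere, so no automatic login
    if external_username not in forum_users:
        # First visit to the forum: create a profile on the fly.
        forum_users[external_username] = default_profile(external_username)
    return forum_users[external_username]


forum_users = {"alice": {"username": "alice", "active": True, "group": "admins"}}
print(autologin("alice", forum_users))  # existing profile is returned unchanged
print(autologin("bob", forum_users))    # a default profile is created for bob
print(autologin(None, forum_users))     # nobody is logged in externally
```

The important point is that a profile is only ever created for a username that the external system has already vouched for.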

From all that, it should be clear that in order to write a Django equivalent to this login system, all we need is a way of finding out, from PHP, which Django username is associated with the current session. The problem is, of course, that Django and PHP have entirely different session models, so we need to work out some way for them to communicate with each other.

We decided that cookies are the best way to do this. While the session objects may differ from PHP to Django, they both have access to the same set of cookies. Obviously, a trivial way for Django to pass the username to PHP would be to have a cookie that contained it. Almost equally obviously, this would be a terrible idea, because cookies are under the control of the user’s browser, and you don’t want people to be able to set the cookie to, say, “admin” using their browser’s options, and then have admin rights on your forums.

However, Django and PHP use cookies for their session management, and Django puts a session ID into the cookie sessionid. That cookie is a primary key into the django_session database table, so from PHP you can get the cookie, get the session ID, and then get the Django session data. That’s almost enough, but not quite. The problem is that the session data (which includes a User object, which has a user name) is pickled — that is, it’s encoded using Python’s serialisation system. This cannot be decoded (as far as I know) from PHP. So, in order to get the user name from the Django session ID, we need to store the mapping from session IDs to usernames in the Django database. This requires a bit of Django coding, so let’s look at that next.

The Django side

The answers to this Stack Overflow question contain several good ideas on how to get a user ID from a session ID. The approach we took was to create a new Django model class (which would map to a table in the database) called SessionProfile. In the same way as you might create UserProfile objects to store information about specific users without having to change the built-in User class, this stores information about sessions without needing to change the Session class. The model code is simple:

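from django.contrib.auth.models import User
from django.contrib.sessions.models import Session
from django.db import models


class SessionProfile(models.Model):
    session = models.ForeignKey(Session, unique=True)
    user = models.ForeignKey(User, null=True)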

The next step, of course, is to populate it. As Peter Rowell suggested in the SO post, we decided to use Django middleware. This is ever-so-slightly more complex:

from django.contrib.auth.models import User
from django.contrib.sessions.models import Session

from resolver.usermanagement.models import SessionProfile


class SessionProfileMiddleware(object):

    def process_response(self, request, response):
        try:
            # For certain cases (I think 404) the session attribute 
            # won't be set
            if hasattr(request, "session"):
                session = Session.objects.get(
                    pk=request.session.session_key
                )

                sessionProfile, _ = SessionProfile.objects.get_or_create(
                    session=session
                )

                userID = request.session.get("_auth_user_id")
                if userID:
                    user = User.objects.get(pk=int(userID))
                else:
                    user = None

                sessionProfile.user = user
                sessionProfile.save()

        except Session.DoesNotExist:
            # If there's no session associated with the current 
            # request, there's nothing to do.
            pass

        return response

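To configure Django to run this, we put it in MIDDLEWARE_CLASSES, before the normal Django SessionMiddleware. Django calls process_response hooks in reverse order, so this placement means our middleware runs after SessionMiddleware has saved the session: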

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'resolver.usermanagement.middleware.SessionProfileMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.transaction.TransactionMiddleware',
)
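
Why the position in the list matters can be shown with a small simulation. Django applies process_response hooks in the reverse of the configured order, so a class listed before SessionMiddleware runs after it on the way out. The two classes below are simplified stand-ins written for this illustration, not the real Django or Resolver middleware, and the request is just a dictionary.

```python
class FakeSessionMiddleware:
    """Stands in for SessionMiddleware: saves the session on the way out."""

    def process_response(self, request, response):
        request["saved_sessions"].add(request["session_key"])
        return response


class FakeSessionProfileMiddleware:
    """Stands in for SessionProfileMiddleware: needs the saved session row."""

    def process_response(self, request, response):
        # Record whether the session row existed when this hook ran.
        response["profile_written"] = (
            request["session_key"] in request["saved_sessions"]
        )
        return response


def run_response_phase(middleware, request, response):
    # Response hooks are applied in reverse order of the configured list.
    for mw in reversed(middleware):
        response = mw.process_response(request, response)
    return response


def make_request():
    return {"session_key": "abc123", "saved_sessions": set()}


ours_first = [FakeSessionProfileMiddleware(), FakeSessionMiddleware()]
ours_last = [FakeSessionMiddleware(), FakeSessionProfileMiddleware()]

print(run_response_phase(ours_first, make_request(), {}))
print(run_response_phase(ours_last, make_request(), {}))
```

With our class listed first, the session has already been saved by the time its hook runs. With the order swapped, the lookup happens too early and finds nothing for a brand-new session.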

All this together meant that we had a Django installation that maintained a table that mapped the session ID to the appropriate row in the auth_user table — or to NULL if the session did not have a logged-in user. So now there was a way to write PHP code that could go from the sessionid cookie to the user name.
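
The lookup that the PHP side has to perform is a single join. Here is a rough sketch of it in Python against an in-memory SQLite database, purely to show the shape of the query. The table name usermanagement_sessionprofile is an assumption based on Django's default naming for an app called usermanagement, the schema is cut down to the columns the query needs, and username_for_session is a name invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE django_session (session_key TEXT PRIMARY KEY);
    -- Assumed table name for the SessionProfile model.
    CREATE TABLE usermanagement_sessionprofile (
        id INTEGER PRIMARY KEY,
        session_id TEXT UNIQUE REFERENCES django_session (session_key),
        user_id INTEGER NULL REFERENCES auth_user (id)
    );
    INSERT INTO auth_user VALUES (1, 'alice');
    INSERT INTO django_session VALUES ('logged-in-key'), ('anonymous-key');
    INSERT INTO usermanagement_sessionprofile (session_id, user_id)
        VALUES ('logged-in-key', 1), ('anonymous-key', NULL);
""")


def username_for_session(conn, session_key):
    """Return the username for a sessionid cookie value, or None."""
    row = conn.execute(
        """
        SELECT u.username
        FROM usermanagement_sessionprofile AS sp
        JOIN auth_user AS u ON u.id = sp.user_id
        WHERE sp.session_id = ?
        """,
        (session_key,),
    ).fetchone()
    return row[0] if row else None


print(username_for_session(conn, "logged-in-key"))
print(username_for_session(conn, "anonymous-key"))
print(username_for_session(conn, "forged-key"))
```

Only the first call finds a user. An anonymous session and a made-up cookie value both come back as None, which is what makes this safer than trusting a username stored directly in a cookie. A production version would presumably also check the session's expiry date, which this sketch leaves out.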

The PHP side, continued

From the starting point of the auth_apache module, it was easy to create an auth_resolver module that accessed the Django tables appropriately. I won’t go into how it works in detail; instead, you can download the two files that make it up. They should be pretty self-explanatory in the light of the auth_apache description above.

  • auth_resolver.php, the phpBB authentication plugin.
  • ResolverUser.php, which provides a GetLoggedInUser function that is used by auth_resolver but is also useful elsewhere.

(If you want to use these, strip the “.txt” from the end of the filename and give them a “.php” extension.)

Once we had uploaded these files, I made sure that we had a user called “administrator” on both the Django and the phpBB sides of our site, logged into both systems separately, went to the phpBB administration pages, and set the authentication to auth_resolver.

Now people only need to log in once on our site. There was just one more change to make; when a non-logged-in user tries to post on the forums, they are presented with a login form. With the setup I’ve described, this would not work, as it would try to log the user in using the login_resolver function from our authentication module, which is dependent on the user already being logged in to Django — which they clearly are not as otherwise they would not have been presented with the login form! The solution was to change the page. The form that is shown to the user under these circumstances is defined by the forum’s current style (skin); the default style puts it in styles/prosilver/template/login_body.html. Our current solution has been to simply replace this form with text that links to the Django login page. In the long run, we will streamline this. But that’s a post for another day.

Any comments on the work so far are very much welcome!

Product management with Google AdWords

4 December 2008

You can’t rely on people’s response to your advertising to manage your product — but as one of many inputs, perhaps it could be valuable. Can part of the product management role be taken over by aggregating data from carefully-targeted Google AdWords campaigns?

There have been some interesting recent discussions on the topic of product management. Like most startups, Resolver Systems doesn’t have anyone with the job title “Product Manager”, but the role is filled, mostly by me and my co-founders. We look at the software, talk to clients and to potential clients, read spreadsheet blogs, and try to synthesize all of this together to work out where development of Resolver One should go over the next weeks, months, and years.

This works surprisingly well; we’ve produced something solid and reliable that clearly fills a real gap in the market. But the other day I was looking at the first results from a new Google AdWords campaign, and noticed something interesting: something that may well be standard practice for people who’ve used this kind of tool for longer than I have, but was a bit of a revelation for me.

The way we’d structured this campaign was to identify the ten things we thought were most interesting about Resolver One, and then to create an Ad Group inside AdWords for each. “Ad Group” is Google’s terminology for a set of advertisements that all share the same set of keywords (among other things). So, for example, we had an Ad Group to cover Resolver One’s programmability, with keywords like “programmable spreadsheet” and “code in spreadsheet”. When Google spotted these keywords in a search, it would know that it could present its user with our ad, which said something like “a new, easy-to-program spreadsheet – download the 14-day trial”.

These ten Ad Groups had been running for a day or so, and I checked out the numbers – and saw something interesting. The number of clicks each ad got often went against my intuition about the product. I would have thought that the ability to convert a spreadsheet to a program would be much more interesting than the fact that you can build spreadsheets that are better protected against layout changes. But the number of clicks says quite the opposite!

To put it another way – by having an Ad Group per feature, and then ranking the Ad Groups by the number of clicks they received, I was able to get an instant market survey telling me what people thought about our different features. For less than £50 (I’d not budgeted more for this phase of the advertising), over 300,000 people looked at pages including our ads, but more importantly 350 clicked through on a specific feature, “voting” for more work on that feature!
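
For a sense of scale, the figures above can be turned into a click-through rate and a cost per click. Both are bounds rather than exact values, because the spend is only given as "less than £50" and the impressions as "over 300,000". The variable names are just for this sketch.

```python
spend_gbp = 50.0        # "less than £50", so treat this as an upper bound
impressions = 300_000   # "over 300,000", so treat this as a lower bound
clicks = 350

# More impressions would only lower the rate, so this is a ceiling.
max_click_through_rate = clicks / impressions
# Less spend would only lower the cost, so this is a ceiling too.
max_cost_per_click = spend_gbp / clicks

print(f"click-through rate: at most {max_click_through_rate:.3%}")
print(f"cost per click: under £{max_cost_per_click:.3f}")
```

So each "vote" cost somewhere under 15p, and roughly one page view in a thousand turned into a click.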

I think this is a great new input to the product management process. Obviously building what people know they want is only part of creating something great; it’s just as important, if not more so, to build stuff they don’t yet know that they want, even if you then have to spend time and effort persuading them to try it out. But if you have ten ideas and want to know which is most popular, a £50 AdWords campaign can tell you an incredible amount very quickly.

So, the question is… if you were starting a new company tomorrow, would you think it ethical to start advertising before you started coding, just to see which features to focus on first?