Friday, February 6, 2009

Working Step 2: Mod_Wsgi and Idle MySQL Connections

In my last post, I listed five broad steps to get from here (Raise the Hammer running creakily in Classic ASP on a slightly dodgy IIS server) to there (Raise the Hammer running smoothly in Python and hanging out with the cool kids on a Linux server).

Step 1 is self-explanatory, but Step 2 has definitely been a slow progression (and a humbling experience, if ever there was a danger of me becoming a cocky programmer).

The process of getting the basic environment up and running has been by turns frustrating and exhilarating, as I slowly grow more acquainted with SSH, the MySQL command line, Apache, and so on.

Setting up Apache and Mod_Wsgi


The hosting provider automatically installs the Apache webserver and sets up mod_wsgi, but when I ran a very simple web.py application, it served the pages as text/plain rather than text/html.

This turned out to be an easy fix. I modified the httpd.conf file (the configuration file for Apache) to add the following line:

AddType text/html .py
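For comparison, the content type can also be declared by the application itself rather than by Apache. The following is a minimal sketch of a bare WSGI callable that does so; it is not the code from my app, and the name application and the HTML body are just placeholders. In web.py, as far as I can tell, the equivalent is a call to web.header('Content-Type', 'text/html') inside the handler.

```python
def application(environ, start_response):
    """Minimal WSGI app that labels its own response as HTML."""
    # Placeholder page body; a real app would render a template here.
    body = b"<html><body><h1>Hello</h1></body></html>"
    headers = [
        # Declaring the type here means the server does not have to guess it.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

With the header set in the response, the browser renders the page as HTML whatever type the server would otherwise assign to .py files.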

My next problem was trying to configure the site so that it served pages from http://domain.com/ rather than the relatively ugly http://domain.com/index.py/ (note the trailing slash). Also, I'm told that it's bad form to expose the underlying server technology in the URL.

A few websites suggested adding an Alias to httpd.conf, but this threw an internal server error. Then I figured out that Apache needed to have the mod_alias module added:

LoadModule alias_module modules/mod_alias.so

Then I could add the following lines:

Alias /static/ /path/to/webapp/htdocs/static/
Alias / /path/to/webapp/htdocs/index.py/

and voila! The files inside the /static/ folder (images, CSS and JavaScript) were served as is, and everything else was passed into the web.py application for handling.

MySQL, Why Did You Go Away


Once I had a web.py application running, I noticed that SQLAlchemy would intermittently pass along a seemingly bizarre error: "MySQL server has gone away". On refresh, the page would load fine. The error seemed to happen when no page had been viewed for several hours.

I found a helpful suggestion on this blog. Apparently the problem is that MySQL closes database connections that have been idle for eight hours (the default wait_timeout), so the pool ends up handing out a connection the server has already dropped. The solution, then, is to ensure that no pooled connection lives that long. The author proposed doing this by instructing SQLAlchemy to recycle any pooled connection older than 7,200 seconds (two hours).

So I tried it out:

import sqlalchemy as s
import quandy_config as c

# Replace any pooled connection older than two hours, well before
# MySQL's eight-hour idle timeout closes it on the server side.
engine = s.create_engine(
    c.DB_CONNECTION,
    pool_size=100,
    pool_recycle=7200,  # seconds
)

metadata = s.MetaData(bind=engine)

# define tables here

metadata.create_all()

and it seems to be working fine. I left the app to idle for twelve hours, and when I loaded it in my browser it fired up with no complaints.
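To make the recycling idea concrete, here is a rough sketch of what a recycle check amounts to, as I understand it: when a connection is requested, its age is compared against the limit, and an over-age connection is closed and replaced before use. This is not SQLAlchemy's actual implementation. The class RecyclingConnection and its parameters are made up for illustration, and sqlite3 stands in for MySQL so the sketch runs without a server.

```python
import sqlite3
import time


class RecyclingConnection:
    """Hand out a DB connection, replacing it once it is older than `recycle` seconds."""

    def __init__(self, connect, recycle=7200, clock=time.monotonic):
        self._connect = connect    # zero-argument function returning a new connection
        self._recycle = recycle    # maximum connection age, in seconds
        self._clock = clock        # injectable so the age check can be tested
        self._conn = None
        self._opened_at = None
        self.opened = 0            # how many connections have been opened so far

    def get(self):
        now = self._clock()
        too_old = self._conn is not None and now - self._opened_at > self._recycle
        if self._conn is None or too_old:
            if self._conn is not None:
                self._conn.close()  # discard the stale connection
            self._conn = self._connect()
            self._opened_at = now
            self.opened += 1
        return self._conn


# sqlite3 is only a stand-in for MySQL in this sketch.
pool = RecyclingConnection(lambda: sqlite3.connect(":memory:"), recycle=7200)
conn = pool.get()
print(conn.execute("SELECT 1").fetchone()[0])
```

The point of the design is that the check happens when a connection is handed out, so an app that sat idle overnight gets a fresh connection on its first request instead of one the server has already closed.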

Coming Soon: Windows guy learns SSH, or I'm PuTTY in your hands.

Tuesday, February 3, 2009

RTH Redesign in Five Steps

As Joel Spolsky warns, rewriting your code from scratch is not something to be undertaken lightly. So why am I doing it with the code for Raise the Hammer?

Step 1: admit you have a problem.

Five years ago, it made sense to develop a website in ASP and VBScript. It was easy to write, easy to deploy, widely available, and required minimal configuration (ASP/VBScript is tightly coupled with IIS). My computing background is DOS/Windows and we wanted to get the site up fairly quickly, so we went with what we already knew.

The original site didn't even have a database - all the articles, comments, user accounts, author profiles etc. were stored in text files and accessed using old-fashioned I/O. It's just as well, because the odd comment would have weird stuff in it, like anything' OR 'x'='x (little did the would-be malicious hackers know that the site was not sophisticated enough for their SQL injection attacks).

We bolted on a MySQL database, and later refactored the code to reduce code duplication and squeeze the maximum benefit out of VBScript's limited support for classes and objects. Still, it got to the point where fixing bugs and adding features had become something of a nightmare.

I had a steadily growing list of desirable features that were going to be painful to implement in VBS but would be trivially easy in a more robust language.

Meanwhile, other platforms and frameworks for creating websites grew steadily more robust, mature, and powerful. There was some noise about Models, Views and Controllers (MVC), and suddenly everyone was talking about Python and Ruby on Rails.

By 2009, VBScript was well past its sell-by date, and I had been doing enough other projects in Python to know what it feels like to program in a truly powerful, dynamic language.

Step 2: Choose a new platform.

As I mentioned, there's been an explosion of new web technologies over the past several years. When we started, the choice of a design platform was roughly limited to PHP (yeeech!) or VBScript (also yeeech, but arguably less so). Now designers have an embarrassment of options: ASP.NET (with a whole catalogue of languages to choose from), JavaServer Pages (JSP), ColdFusion, PHP (now with limited namespace support), Lisp, and just about every dynamic language.

On top of this, most languages also offer one or more software frameworks that ease application development by abstracting out common functions and providing libraries of helpful utilities. Some languages, like Ruby, are dominated by a single framework (Rails), while others, like Python, enjoy (or suffer) a gaggle of competing frameworks: Django, Plone, Pylons, TurboGears, Web.py, web2py, and Zope, among others.

Confounding matters further is the proliferation of templating languages with exotic, vaguely Asian-sounding names.

I ruled out .NET pretty early, because I've been burned by Microsoft enough times to know not to rely on their proprietary technologies. I also ruled out Java, because damn, that language has a lot of verbosity. PHP is hideous, ColdFusion is just bizarre, and I'm not smart enough for the Lisps.

I narrowed my search down to Ruby and Python. While the former has a more celebrated framework, the latter is older, has a much larger and more diverse developer community, and has a truly epic collection of mature libraries to extend the language in addition to the already robust standard library (to which Python programmers refer using the phrase "batteries included").

Created by Guido van Rossum (who maintains it as the Python Software Foundation's "Benevolent Dictator For Life"), Python is a free, open source, dynamic, flexible, object-oriented language that makes programming a joy. Python is a rich ecosystem in itself, having benefited from countless thousands of volunteer-hours.

As I mentioned, Python has lots of frameworks; but the more viable projects are slowly merging as each adopts the best practices of the others. Helping matters along is the Web Server Gateway Interface (WSGI), a standardized Python specification for web servers and application servers to communicate with web applications.
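To give a feel for how small the WSGI contract is, here is a minimal sketch showing both sides of it: an application (a callable that takes environ and start_response) and a stand-in for the server that calls it. The names hello_app and call_app are made up for illustration; a real server such as mod_wsgi plays the role of call_app.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A complete WSGI application: a callable taking environ and start_response."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]


def call_app(app, path):
    """Play the server's role: build environ, call the app, collect the response."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status
        seen["headers"] = headers

    body = b"".join(app(environ, start_response))
    return seen["status"], body


status, body = call_app(hello_app, "/about")
print(status, body.decode("utf-8"))
```

Because every framework that speaks WSGI exposes a callable with the same shape as hello_app, the same server setup can host any of them.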

Among Python frameworks, I shied away from the Big Three - Django, TurboGears and Zope - mainly because they all felt like overkill. Reading through the documentation, it felt like I had to learn an additional language (or four additional languages in the case of TurboGears, which takes a hard stand on the notion that you shouldn't waste time reinventing wheels) when what I really wanted to do was write Python.

Also, it seems to be a common observation that big web frameworks are well-suited to doing about 85% of what you want - but their sheer complexity means that the last 15% is at least as difficult and painful as the rest combined.

In the end, I decided to go with Web.py. Created by Aaron Swartz and delivered into the public domain, it's a simple, elegant and lightweight framework that's easy to learn and does what you need it to do, then politely gets out of the way.

For the database connection, I went with SQLAlchemy, which is so clever I can't distinguish it from magic. It lets you manage database data through Python objects so you don't need to dirty your hands writing SQL. It also abstracts away the differences between dialects of SQL so you can port seamlessly between SQL providers.

For parsing URLs and serving content, I wrote my own mini-framework, which I'm calling Quandy (as in, quick and dirty). It plays nicely with Web.py, which in turn plays nicely with WSGI.

(Did you notice something? After rejecting TurboGears because it bolts together several tools that I would need to learn, I proceeded to learn several tools and bolt them together. Still, I learned and bolted on my terms...)

The new site will be hosted at Webfaction.com, and served from an Apache webserver with the mod_wsgi module. I went with Webfaction on the generalized advice of the Python developer community - it's widely regarded as the best provider, short of a dedicated server or virtual private server.

This, of course, means that I've needed a crash course on the Unix command line, which 1337 h4x0rs have used for years to impress lowly DOS programmers with nerdy syntax references. The learning curve has been steep but satisfying, and when my seven-year-old PC finally kicks it, I'll feel more comfortable taking that leap of faith from Windows to Linux.

Step 3: Make a deployment schedule.

And by "deployment schedule", I mean a list of things to do, not a list of timelines on when to have them completed.

  1. Configure the new environment (itself a big change from the puerile point-and-pray simplicity of Internet Services Manager).
  2. Recreate existing site layout as an alpha in new platform. I'm a big fan of making one change at a time.
  3. Get reader feedback on possible changes to layout (e.g. main page more like HuffPo) and design changes. Roll out as a beta.
  4. Implement the following new features:
    • A real user profile management tool.
    • Threaded comments (needs comment form at reply button).
    • A user-generated events listing.
    • A Hamilton Wiki (if people seem interested in it).
    • Other user-requested features (I'm sure there are more, but I'm getting tired).
  5. Test, test, test - for bugs, for speed, for server load. Test some more.
  6. Deploy as main site.

Step 4: Start developing new application to ease ad hoc community organizing.

I hinted at this in a recent talk I gave as part of the Mohawk College Active Citizenship lecture series. I want to create a new application, coupled tightly or loosely to RTH, that allows people to organize around an issue and captures the efforts of people who only want to contribute a little bit. I think this is really missing from the community and would help to invigorate citizen activism and hold our leaders more accountable. More to come on this as it evolves.

And finally:

Step 5: Once the new tools are mature, open-source them so we can share them with other cities.

That's all I've got for now. Your feedback is welcome, and I'll post another entry as the project moves into step 3.