Database System
kirsle edited this page 5 years ago

The Rophako CMS is the latest in many, many rewrites of a content management system I developed for Kirsle.net over the years. It’s the first one written in Python; previously I always wrote my CMS’s in Perl (as either basic CGI or run with mod_fcgid).

And I’ve never liked dealing with relational databases like MySQL. The overhead of needing to install the database software and configure it, manage databases and user permissions, and when it comes to making backups of my website, having to dump a MySQL database is an added step that I wouldn’t like to do. These complaints also apply to some “NoSQL” databases such as MongoDB.

But, in general I like the idea of NoSQL, because designing hard, rigid schemas using C style data type limitations isn’t my idea of a fun time. And auto-incrementing primary keys? They drive my OCD up a wall. If I’m going to increment a primary key, I want it to be a manual, explicit process (I don’t want to have user ID 4 just because the first 3 users were for testing; I want to have user ID 1 because I’m the Admin!)

Rants aside, Rophako uses a custom hand-rolled database system. It’s one that I first started using a few Perl CMS’s ago, and with every CMS rewrite I’d make a script to migrate data from the old CMS (for example, see siikir-blog-migrate.py).

JsonDB

My custom database system, I call JsonDB. It’s basically flat JSON files saved to the hard disk, with file locking measures in place. There’s also a Redis caching server that sits between Rophako and the hard disk, so that the physical disk doesn’t need to be touched very often -- only when a document is being updated or when its Redis cache expires and it needs to be reloaded from the disk.

The “database tables” are really just files on the disk, in a normal directory hierarchy. In the Python code, you don’t refer to them as files (i.e. you never include the .json file extension). So for example, the JsonDB document users/by-id/1 would map to ./db/users/by-id/1.json as a file on disk.

The document URIs are hierarchical and that’s how the database is organized. Examples:

  • users/by-id/1 - the user data for user ID 1
  • users/by-name/admin - a short JSON document that maps the username admin to user ID 1 (for user lookups by name)
  • blog/entries/172 - the blog post with ID #172
  • blog/index - an index (high level summary) of all existing blog posts

For a simple personal homepage site, this database model is sufficient. If you need data to be available in a way that doesn’t fit the document-based structure (in what MongoDB would call a View), you make a new JSON document that pre-caches the data when it changes (i.e. blog/index is a high level cache of basic blog entry details, like their subject, creation date, author ID, etc. -- but not their contents, comment counts, avatars, or other details)

Example DB Structure

blog/
  entries/
    1.json
    2.json
  index.json
comments/
  threads/
    blog-1.json
    blog-2.json
  subscribers/
    blog-1.json
photos/
  index.json
users/
  by-id/
    1.json
  by-name/
    admin.json