Sunday, June 17, 2012

Bit of irony: Database Driven (Web) Applications

I've been doing web development since 1996 but really got serious about it in 1999 when I met Zope and Python. I did use other technologies over the years: Perl, Ruby, Eiffel, AWK (yes really), Java (oh the horror), and C/Objective-C (long before I ever owned an Apple product), Django, Rails, JRun, and even PHP (*blech*) among others. Most of the applications I built between 2000 and 2007 were in Python, usually via Zope and very often with Plone.

One thing that set Zope apart from most other web frameworks of its time is the persistence layer. While most other systems were using relational databases, Zope used an object database called... wait for it... Zope Object Database (ZODB). It took me a while to really understand how to make proper use of ZODB as I wasn't, at the time, really crazy about Object Oriented Programming (and almost 12 years later, I'm still not crazy about OOP. I use it. I understand it. I just don't get all the hype).

From 1997 through about 2007, when I built apps with systems other than Zope, persistence was often handled by the likes of mSQL, MySQL, BDB, GDBM, DJB's CDB, or SQLite. At the time I was just gobbling up any technologies I could find, in part because I wanted to be able to do things outside of Zope the way I could do them within the confines of Zope. The hope was that one of these systems could be leveraged to that end.

In 2007 I started mucking around with Django, Grok, Repoze, and Ruby on Rails. In 2008 I discovered web2py, which just seemed right to me, and about 90% of my web work has been in web2py ever since, much of it deployed to Google's App Engine. The rest was mostly Lua (Kepler, Sputnik, and LuCI), which I'd also discovered in 2008, with a bit of Rails and Django work thrown in here and there.

Odd and unfortunate....

From about 2004 through 2008 I was interviewing for various positions, and whenever I was called in for web development roles the conversations would invariably go something like this:
Interviewer: We've reviewed the code you wrote for zyx (some app that I'd built and that was running live on my servers somewhere). We only see the application code. Where's the database code?
me: It's right here...
Interviewer: That looks like Python [sometimes Perl or C]. Where is the SQL?
me: Well if you read the code you'll see that there is a call here to this function here which is a wrapper to retrieve... and here is a wrapper function to store... this here does...
Interviewer: But Databases use SQL for doing things.. where is the SQL? Even if you use wrapper functions those functions have to talk to the database. Where is the SQL that those functions use?
me: This particular app doesn't talk to a relational database so there's no SQL.
Interviewer: What do you mean it doesn't use a relational database.. either it's connected to a database or it's not. If it's connected to a database it has to use SQL....

At this point the conversation would generally devolve into a set of emphatic statements about the way applications had to work, and the end result would be that I didn't get the job. In a few cases I'd be told that they wanted someone with demonstrated knowledge of SQL and relational databases, to which I'd respond by sending them some of my code that makes extensive use of SQL and relational databases. In most cases I doubt they ever really looked at it, because instead of commenting on the quality of the work I'd presented, they'd question why I didn't use relational databases for some of my personal projects. I'd respond that they weren't really needed for those projects and that I didn't want the overhead of Oracle or MySQL. Usually I'd get: "Well, if you really understood databases you'd realize that the overhead of using one is far lower than maintaining your own home-grown database-substitute. You've learned enough SQL to cobble together some libraries but you obviously don't understand databases, otherwise you would have used them for project xyz."

A Revelation

It took me a while to really get what was going on. Despite many conversations like the above, it didn't really occur to me that I was doing anything wrong. In my mind, I knew about databases. I wasn't an expert but I knew what the hell a database was and I'd used several different types. They were 'wrong' because they thought database meant a client-server relational database management system like Oracle. To me database meant a system for storing and retrieving data; a system designed to facilitate the storage and retrieval of data; a repository of data; the set of data being stored and/or retrieved. It dawned on me that in some cases my answers to some questions probably sounded petulant, which might be ok when you're right and they know it, but it's never ok when you're wrong (to them).
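For a concrete picture of what "database code with no SQL" can look like, here is a minimal sketch using Python's standard-library shelve module, which persists Python objects in a dbm-style key/value file. The store and retrieve wrappers, the key naming, and the temporary path are all made up for illustration; this is not code from any of the apps I was interviewed about.

```python
import os
import shelve
import tempfile

# Hypothetical storage location; a real app would use a fixed path.
DB_PATH = os.path.join(tempfile.mkdtemp(), "guestbook")


def store(key, record):
    """Persist a Python object under a string key."""
    with shelve.open(DB_PATH) as db:
        db[key] = record


def retrieve(key, default=None):
    """Fetch a previously stored object, or default if the key is absent."""
    with shelve.open(DB_PATH) as db:
        return db.get(key, default)


# Illustrative data only.
store("entry:1", {"name": "Ada", "message": "Hello from 1999"})
entry = retrieve("entry:1")
print(entry["name"])
```

The data is stored and retrieved, it survives a restart, and there isn't a line of SQL anywhere. By my definition that's a database; by the interviewers' definition it wasn't.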

Take the classic 'glass half-full vs half-empty' question. Naturally (to me), unless it's in a perfect vacuum, the glass is always 100% full. Most people assume that the question is about the liquid, sand, or other 'non-air stuff' that's in the glass. I've come to understand that few people avoid making assumptions like that and that when they encounter someone who does (avoid making assumptions) they often assume the person to be intentionally provocative. Further, if you were to assert that the glass is completely full to someone who has never been taught about matter and gases, they would probably look at you like you're completely bonkers. In any case you (in this case: I) end up looking like a jackass if your answer is one of the correct wrong answers.

The people I'd been talking to had never seen anything referred to as a database that wasn't a relational database. They had developed an assumption that if it's called a database it has to 1) be relational, 2) use SQL, and 3) likely use the client-server networked model. For the record, all of that is rubbish, but it was very common rubbish to encounter from 1995 up through, say, 2010. This can be seen all over the interwebs: look at this article from 2006 and this question from 2008 on stackoverflow.com.

NoSQL (rdb) vs NoSQL (terribly named 'movement')

I mentioned that I built some web applications with AWK and Perl. One relational database system I used for quite a few quick and dirty hacks was rdb, and later NoSQL, which grew out of the 'Shell as a 4GL' doctrine. It was great for a Unix nerd like myself who wanted a simple and easy to understand system that I could fix on my own. Since web2py made it so easy to develop applications to be run on Google App Engine, I started learning and using Google BigTable around 2008 or 2009.
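The general idea behind the rdb/NoSQL style, tables kept as plain text and relational operators written as small filters that get chained together, can be roughly sketched in Python. Everything here is an assumption for illustration: the table contents are made up, the tab-separated layout with a single header row is a simplification rather than the exact rdb or NoSQL file format, and the real tools were shell commands joined with pipes, not Python functions.

```python
import io

# A made-up table; tab-separated with a header row (simplified layout,
# not the exact rdb or NoSQL file format).
TABLE = (
    "name\tlang\tyear\n"
    "zope\tpython\t1999\n"
    "kepler\tlua\t2008\n"
    "web2py\tpython\t2008\n"
)


def read_rows(stream):
    """Yield each data line as a dict keyed by the header's column names."""
    header = next(stream).rstrip("\n").split("\t")
    for line in stream:
        yield dict(zip(header, line.rstrip("\n").split("\t")))


def select(rows, column, value):
    """Keep only rows whose column equals value (like a WHERE clause)."""
    return (row for row in rows if row[column] == value)


def project(rows, *columns):
    """Keep only the named columns from each row."""
    return ({c: row[c] for c in columns} for row in rows)


# Chain the filters the way shell commands would be piped together.
rows = read_rows(io.StringIO(TABLE))
result = list(project(select(rows, "lang", "python"), "name", "year"))
print(result)
```

Each function takes rows in and hands rows out, which is what makes the pieces composable and easy to debug one stage at a time.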

In 2010 (as I recall it) NoSQL fever hit. Not the NoSQL I'd been familiar with for over a decade but something new. This NoSQL was (apparently) sparked by the white paper released by Google about their BigTable implementation and design. Systems like HBase, Hypertable, and Cassandra implement very similar designs and are a few among the NoSQL 'movement'.

The original NoSQL was a relational database system built specifically for POSIX/Unix systems using the native tools/facilities of the operating system. The new poser NoSQL has been rebranded at least three times as new(?) projects get adopted into the fold. Depending on who you asked, in 2010 the constellation of systems in the NoSQL universe included Riak, Redis, HTable and maybe CouchDB and some others, but specifically did not include BigTable and Dynamo (I guess because they were not Open Source run-on-your-own-system solutions) nor something like ZODB or Berkeley DB. Now the latter two are considered 'grand daddy' NoSQL solutions and are part of the flock, as it's been realized that much of what people are calling NoSQL is really just a new implementation of existing concepts.

The irony of it all...

Now I've noticed that hiring managers looking for (web) developers will ask about their database experience expecting to hear about a system that's non-relational. If a candidate only mentions systems like Oracle and MySQL, it's often assumed that they didn't get the memo that 'database' now includes NoSQL options.

Still more of the same

Even so, when I'm asked about my experience with Object Databases, NoSQL, or non-relational databases and I mention that I've been using them since the '90s, I'm usually told that they didn't exist back then. I've learned, though, to understand that many assume all of this stuff just popped up between 2008 and 2011. I'm getting there.. one day I'll have it mastered...maybe.
