Sunday, January 27, 2008 1:47 PM
As I dig into the future of databases, I have found some articles that I want to share with a wider readership. The basic premise of my search is "RDBMSs were developed over 25 years ago, and we haven't come up with something better since?! I gotta look into that", and so starts my education on all of the new stuff coming from those wacky data guys. ;)
Shards
One thing that I have learned about is scaling out versus scaling up: spreading the data across more machines rather than buying one bigger machine. I find it a very interesting concept, in large part because of what we as software developers can do to make it easier. What I am most excited about is the Hibernate.Shards API. How sweet is it going to be if I can hide the shards concept behind the Hibernate API? Very.
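To make "hiding the shards" a bit more concrete, here is a minimal sketch of the idea in Python. This is not the Hibernate.Shards API; the shard names, `shard_for`, and `ShardedStore` are all made up for illustration, and plain dictionaries stand in for the real databases. It only shows the basic trick: hash the key to pick a shard, and keep that decision behind an ordinary get/put interface.

```python
import hashlib

# Hypothetical shard names; a real setup would map these to database connections.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key, so the same key always maps to the same shard."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


class ShardedStore:
    """Hides the shard lookup behind a plain get/put interface."""

    def __init__(self, shards=SHARDS):
        self._shards = list(shards)
        # In-memory dictionaries stand in for the separate databases.
        self._data = {name: {} for name in self._shards}

    def put(self, key, value):
        self._data[shard_for(key, self._shards)][key] = value

    def get(self, key):
        return self._data[shard_for(key, self._shards)].get(key)


store = ShardedStore()
store.put("customer:42", {"name": "Ann"})
customer = store.get("customer:42")
```

The calling code never mentions a shard, which is the part I find appealing. One caveat of this simple modulo scheme is that changing the number of shards moves most keys to a different shard, so real systems tend to use something smarter.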
Reads:
http://highscalability.com/unorthodox-approach-database-design-coming-shard
http://highscalability.com/tags/shard
http://www.rgoarchitects.com/nblog/2007/08/21/TheRDBMSIsDead.aspx
Column Store Databases
OK, I am still getting my head around these bad boys, but the concept (I think) is that each column of a typical "row store" table is stored separately, so all the values for one column sit together. The benefit here is on reads: according to the literature (vendor and otherwise), they are very fast at reading, particularly when a query only touches a few columns. I first discovered this concept while reading about Google's BigTable. Very neat, if only I could figure out how to best use it.
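As far as I understand it, the difference in layout can be sketched like this. The table, the field names, and `to_columns` are invented for illustration, and a real column store does far more (compression, on-disk layout, and so on); this only shows why a single-column read touches less data.

```python
# A tiny "row store": each record keeps all of its fields together.
rows = [
    {"id": 1, "name": "Ann", "total": 20.0},
    {"id": 2, "name": "Bob", "total": 35.5},
    {"id": 3, "name": "Cid", "total": 12.25},
]


def to_columns(rows):
    """Pivot row records into one list of values per column."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: summing one field still walks every full record.
row_total = sum(row["total"] for row in rows)

# Column layout: the same sum reads only the "total" values.
col_total = sum(columns["total"])
```

Both sums give the same answer; the column version just never has to look at `id` or `name` to get there.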
Reads:
http://www.databasecolumn.com/2007/09/one-size-fits-all.html
http://209.85.163.132/papers/bigtable-osdi06.pdf
http://en.wikipedia.org/wiki/Column-oriented_DBMS
Denormalization
A big topic for larger data sets seems to be the responsible denormalization of data. This isn't really a new concept; we have been doing it for reporting purposes for quite a while, but it keeps coming back to me more and more often. One of the more interesting ideas was related by Mats Helander: storing an object in the db as an XML blob.
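Here is a rough sketch of what the XML blob idea could look like, using Python's built-in sqlite3 and ElementTree modules. The `orders` table, the order shape, and the two helper functions are my own assumptions for illustration, not something taken from Mats's post.

```python
import sqlite3
import xml.etree.ElementTree as ET


def order_to_xml(order):
    """Serialize a whole order, line items included, into one XML string."""
    root = ET.Element("order", id=str(order["id"]), customer=order["customer"])
    for item in order["items"]:
        ET.SubElement(root, "item", sku=item["sku"], qty=str(item["qty"]))
    return ET.tostring(root, encoding="unicode")


def xml_to_order(blob):
    """Rebuild the order dictionary from the stored XML string."""
    root = ET.fromstring(blob)
    return {
        "id": int(root.get("id")),
        "customer": root.get("customer"),
        "items": [
            {"sku": item.get("sku"), "qty": int(item.get("qty"))}
            for item in root.findall("item")
        ],
    }


# One table, one row per object; no separate line-item table and no joins.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, body TEXT)")

order = {
    "id": 7,
    "customer": "Ann",
    "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-9", "qty": 1}],
}
conn.execute(
    "INSERT INTO orders (id, body) VALUES (?, ?)",
    (order["id"], order_to_xml(order)),
)

blob = conn.execute(
    "SELECT body FROM orders WHERE id = ?", (order["id"],)
).fetchone()[0]
loaded = xml_to_order(blob)
```

The whole object comes back with a single-row read. The trade-off is that the database can no longer easily query or index what is inside the blob, which is where the "responsible" part of denormalization comes in.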
Reads:
http://www.matshelander.com/wordpress/?p=66
Object Oriented DB: http://www.db4o.com/
BASE vs ACID
I can't remember what got me started on this, but I am at the very beginning of my learning curve here.
Reads:
http://www.infoq.com/articles/pritchett-latency