At the time of writing, I have been working at a financial startup for roughly a year and a half. In that time I have taken a PHP prototype and iterated on it dozens of times to bring it to the more mature and scalable Rails application it is today. There have been many triumphs and failures on this journey, but one thing that has yet to fail in any way is the tried-and-true MongoDB.
When asked why we use MongoDB in a transactional system, I often go immediately on the defensive. I do not want to generalize in explaining my encounters, but I also want to paint an accurate picture of the conversations and criticism I have received. In many cases, the question of why I use MongoDB is not posed out of genuine curiosity by a well-versed developer. Often I receive a hybrid of statement and question, like “You use Mongo?!”, tinged with a subtle undertone of judgment toward NoSQL data stores in general. The person asking usually comes from a corporate-minded background, overseeing development or building .NET applications (sorry if that offends, but it’s the truth).
Given that financial data is transactional (see what I did there?), you would probably assume that choosing a database stack would center on transactions and stored procedures, and it is not a bad assumption. In fact, it is an assumption I would make if I were guessing another financial institution’s database software. The problem with this assumption is that it views the technology through a boilerplate business need without really contemplating the scale, growth, concurrency, or timeline of the business. How can you, based upon a simple introduction, question any technology choice without knowing the business intimately? Simply put, you cannot.
The case for MongoDB begins
In a startup or small business, it is not uncommon to iterate frequently on the technology to refine it and match the needs of the brick-and-mortar side of the business. Financial software is not exempt from these refinements, regardless of the age of the industry (especially for a financial startup that is a disruptor). Thus many of the ideals for database software begin to shift when you consider what these iterations will involve and how much time will be allocated to them. In my most recent case, the data has been in a transitional state for over a year. Our formulas, sources, and granularity are refined over time, and as a result the data set changes frequently with them. This immediately made us worry about the pain of Rails migrations. If we were going to make thousands of changes to our database schema over time, how would that impact our database? What if we needed to store shitty data for a period of time until we had a chance to refine it? These questions and more made us think outside of the ActiveRecord default and start looking elsewhere for solutions.
Our eyes turned to MongoDB. I was already something of a thought leader on MongoDB, but I remained unbiased in my assessment of whether it was a good move for the business in the short and long term. Since we did not fully understand our data and we knew this would be a pain point, the rigidity of ActiveRecord and an RDBMS would only worsen those pains. Thus MongoDB was our choice for its horizontal scalability and speed, leaving us with two big concerns: relationships and transactions.
Relationships with ActiveModel and Mongoid
Since MongoDB is a NoSQL data store, we have the best of both worlds in that we can store very dirty data, but we can also scrub that data over time and store the polished result in the same document. ActiveModel and the Mongoid gem were our answer. Together they create a feel and structure very similar to ActiveRecord, while preserving the advantages of native MongoDB such as aggregation. Mongoid leverages ActiveModel within Rails to architect a rigid database schema that is controlled, defined, and regulated by application code rather than a SQL layer. The queries use methods very similar (in many cases identical) to those ActiveRecord provides for an RDBMS. This provides the ability to be mostly database agnostic, while taking advantage of rigidity in Rails models and not caring whether the underlying data store is transactional, relational, or flat storage. Our app builds the relations, tracks the data, and performs with lightning speed regardless.
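As a rough illustration of a schema that lives in application code rather than in the database, here is a plain-Ruby sketch. It does not use Mongoid or ActiveModel: the `Document` base class, the `field` helper, and the `Quote` model are all hypothetical stand-ins, and the stored record is just a Hash, much as it would be in a document store. It also shows a dirty value and its polished counterpart living side by side in the same document.

```ruby
# Plain-Ruby sketch, no gems. The "schema" is declared in application code;
# the data itself is an ordinary Hash with no shape enforced by the store.
class Document
  # Field declarations are kept per subclass.
  def self.fields
    @fields ||= {}
  end

  # Hypothetical mini-DSL, loosely modeled on how ODMs declare fields.
  def self.field(name, type:, required: false)
    fields[name] = { type: type, required: required }
  end

  attr_reader :attributes, :errors

  def initialize(attributes = {})
    @attributes = attributes.transform_keys(&:to_sym)
    @errors = []
  end

  # Checks the Hash against the declared fields and collects error messages.
  def valid?
    @errors = self.class.fields.flat_map do |name, rules|
      value = @attributes[name]
      if value.nil?
        rules[:required] ? ["#{name} is required"] : []
      elsif value.is_a?(rules[:type])
        []
      else
        ["#{name} must be a #{rules[:type]}"]
      end
    end
    @errors.empty?
  end
end

class Quote < Document
  field :symbol,      type: String, required: true
  field :raw_price,   type: String   # dirty value, kept exactly as received
  field :price_cents, type: Integer  # polished value, added once we can parse it

  # Derives the polished value without discarding the raw one.
  def clean_price!
    digits = attributes[:raw_price].to_s.delete("^0-9.")
    attributes[:price_cents] = (Rational(digits) * 100).round unless digits.empty?
    self
  end
end

quote = Quote.new("symbol" => "ACME", "raw_price" => "$12.50 ")
quote.valid?        # passes: price_cents is optional until the data is cleaned
quote.clean_price!  # raw_price stays; price_cents is stored next to it
```

Adding a field here is a one-line change to the class, and older documents that lack it remain valid as long as the field is not marked required.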
Scalable and Speedy
MongoDB was designed to be horizontally scalable. The technology supports sharding and replication (replica sets) out of the box, at no additional cost. The database is also open source, ensuring that even at enterprise-level scale, you will not run into hidden license fees. Your approach may differ, but for most applications this means you can handle concurrency by load balancing across application nodes, which can in turn rely on separate replica sets. In practice, your application is likely to be the bottleneck, not the database. It changes your strategy, no doubt, but arguably for the better.
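To make the idea of horizontal scaling concrete, the sketch below routes documents to shards using a stable hash of a shard key. It is a deliberate simplification and not how MongoDB works internally: MongoDB splits the shard-key space into chunks and balances those across shards, whereas this uses a plain modulo. The `ShardRouter` class, the shard names, and the `account_id` key are all hypothetical.

```ruby
require "zlib"

# Hypothetical router: each "shard" is just an in-memory Array of documents.
class ShardRouter
  def initialize(shard_names)
    @shards = shard_names.to_h { |name| [name, []] }
  end

  # A stable hash of the key decides which shard owns the document,
  # so the same key always lands on the same shard.
  def shard_for(key)
    names = @shards.keys
    names[Zlib.crc32(key.to_s) % names.size]
  end

  def insert(doc, shard_key:)
    @shards[shard_for(doc.fetch(shard_key))] << doc
    self
  end

  # Number of documents held by each shard.
  def counts
    @shards.transform_values(&:size)
  end
end

router = ShardRouter.new(%w[shard-a shard-b shard-c])
1_000.times do |i|
  router.insert({ account_id: "acct-#{i}", balance_cents: i }, shard_key: :account_id)
end
router.counts  # documents spread across the three shards
```

One limitation of the modulo approach is that adding a shard remaps most keys, which is part of why real systems use chunk ranges or consistent hashing instead.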
What about Transactions?
Okay, so you’re still not convinced that the app layer can be transactional? Fine. Let’s talk about transactions. Specifically, let’s talk about ACID and atomic operations, because that’s the real concern when it comes to data integrity. Since MongoDB supports atomic operations at the document level, what you really need to consider is how you support data integrity across relationships, backups, and so on. Replication across nodes and ActiveModel validations will help you there. You can also consider how you architect your data in order to take advantage of document-based storage.
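One way to take advantage of document-level atomicity is to keep data that must change together inside a single document, for example an account balance and its ledger entries. The sketch below imitates that with an in-memory stand-in: `TinyCollection` and its `update_one` method are hypothetical, not a driver API. The `$inc` and `$push` keys mirror MongoDB's update operators of the same names, and the guard block plays the role a query filter would play in a real update.

```ruby
# In-memory stand-in for one collection. A real application would talk to
# MongoDB through a driver or ODM; every name here is hypothetical.
class TinyCollection
  def initialize
    @docs = {}
    @lock = Mutex.new
  end

  def insert(doc)
    @lock.synchronize { @docs[doc[:_id]] = doc }
  end

  # Returns a deep copy so callers cannot mutate stored state directly.
  def find(id)
    @lock.synchronize { Marshal.load(Marshal.dump(@docs[id])) }
  end

  # Applies every operator in `update` to ONE document as a single step,
  # and only if the optional guard block accepts its current state.
  def update_one(id, update)
    @lock.synchronize do
      doc = @docs[id]
      return false if doc.nil? || (block_given? && !yield(doc))

      update.fetch("$inc", {}).each { |field, by| doc[field] = doc.fetch(field, 0) + by }
      update.fetch("$push", {}).each { |field, item| (doc[field] ||= []) << item }
      true
    end
  end
end

accounts = TinyCollection.new
accounts.insert(_id: "acct-1", balance_cents: 1_000, entries: [])

# The balance change and the ledger entry travel in the same update,
# so a reader never sees one without the other.
withdrawal = {
  "$inc"  => { balance_cents: -400 },
  "$push" => { entries: { type: "withdrawal", amount_cents: 400 } }
}
has_funds = ->(doc) { doc[:balance_cents] >= 400 }

accounts.update_one("acct-1", withdrawal, &has_funds)  # applied
accounts.update_one("acct-1", withdrawal, &has_funds)  # applied
accounts.update_one("acct-1", withdrawal, &has_funds)  # rejected: only 200 left
```

The trade-off is that anything spanning two documents falls outside this guarantee, which is why the choice of what to embed matters so much.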
An RDBMS often offers stored procedures, audit trails, and migration patterns designed to cascade one data set to another. I knew that our data would be cleaned up over time, but I needed to ensure we did not lose the history or context of where that data came from. We were also worried about rollbacks in the event of data corruption. The Ruby ecosystem is full of gems that support these cascades for Rails models using Mongoid. Furthermore, I am leveraging gems to keep a separate database of all transactional history. It may not be 100% transactional like you would see in many RDBMS platforms, but we are damned close.
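The history-and-rollback idea can be sketched in a few lines of plain Ruby. This is not the API of any particular gem: `TrackedRecord` is a hypothetical class, and its `history` array stands in for the separate history database. Every change appends a snapshot with a reason, and a rollback is itself recorded as a new entry, so the trail is never rewritten.

```ruby
# Hypothetical append-only change history for a single record.
class TrackedRecord
  attr_reader :attributes, :history

  def initialize(attributes)
    @attributes = attributes.dup
    @history = []  # stand-in for a separate history store
    record("created")
  end

  # Merges the changes in and logs the resulting state with its reason.
  def update(changes, reason:)
    @attributes = @attributes.merge(changes)
    record(reason)
    self
  end

  # Restores an earlier snapshot and logs the rollback as a new version.
  def rollback_to(version)
    entry = @history.find { |e| e[:version] == version }
    raise ArgumentError, "unknown version #{version}" if entry.nil?

    @attributes = entry[:snapshot].dup
    record("rollback to v#{version}")
    self
  end

  private

  def record(reason)
    @history << { version: @history.size + 1, snapshot: @attributes.dup, reason: reason }
  end
end

rate = TrackedRecord.new(source: "feed-a", value: "3.5 %")   # v1: raw data
rate.update({ value: 3.5 }, reason: "parsed raw percentage") # v2: cleaned
rate.update({ value: 35.0 }, reason: "bad import")           # v3: corrupted
rate.rollback_to(2)                                          # v4: back to the cleaned state
```

Storing full snapshots keeps the sketch simple; a production setup would more likely store per-field diffs and write them to their own collection or database.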