More models, more problems
As of mid-2012, we had been accepting payments on lumosity.com for almost five years, all of them through a rather creaky, nasty, brittle pile of code that only a few of our engineers were brave enough to touch. We wanted to build a more flexible payment system that would let us implement all kinds of functionality we never could before. In our design and planning meetings, we realized that having everything we wanted would require new code, new models, and new schemas to store the underlying data. No problems here – we quickly built a system that could do everything we wanted.
Sounds great, right? Deploy away!
Of course, there was one small roadblock: we had millions of users already on the current system. We needed to seamlessly transition them from the old system to the new one without anyone noticing that anything had changed. Total transparency for the end user was paramount. This proved to be a tough problem given the long, complex account histories that many users had.
One of the strategies we ended up relying on to pull this off was running sanity checks on the data. That is, the expectation was that a snapshot of the data before and after migration would produce the same answers to the same questions.
Case study
Pretend you’re creating a schema to store a person’s medical record. There are many ways you can record a patient’s visits to their doctor. You might choose to store each visit as a separate entry in a visits table, with each diagnosis for that visit stored in a visit_diagnoses table. In this case, the visit is your central model.
Or, you may choose to take a longer view of their care and record each treatment given for a specific diagnosis in a diagnosis_treatments table. In this case, the treatments for a single diagnosis are more important than a single visit. No matter which way you choose, both models should be able to give you the same answers to questions like:
- Was Jane treated by Dr. Simpson on July 11th?
- Has John ever been diagnosed with measles?
- How many vaccinations was Stacy given in the past 5 years?
Your model undoubtedly already asks these questions. And if you’re migrating from one to the other, you are probably porting all those “questions” (in the form of methods) to your new system.
This means that by the time you’ve written your new models and are ready to migrate the data, you already have everything you need to check your migrated data’s correctness for free!
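To make the idea concrete, here is a minimal sketch of two differently shaped models answering the same questions. Everything in it is hypothetical and exists only for illustration: the class names `VisitBasedRecord` and `DiagnosisBasedRecord`, the methods `diagnosed_with?` and `treated_by?`, and the sample data (including the year on the date).

```ruby
require "date"

# Hypothetical visit-centric schema: each visit carries its own diagnoses.
Visit = Struct.new(:date, :doctor, :diagnoses)

class VisitBasedRecord
  def initialize(visits)
    @visits = visits
  end

  def diagnosed_with?(condition)
    @visits.any? { |visit| visit.diagnoses.include?(condition) }
  end

  def treated_by?(doctor, date)
    @visits.any? { |visit| visit.doctor == doctor && visit.date == date }
  end
end

# Hypothetical diagnosis-centric schema: each diagnosis carries its treatments.
Treatment = Struct.new(:date, :doctor)
Diagnosis = Struct.new(:condition, :treatments)

class DiagnosisBasedRecord
  def initialize(diagnoses)
    @diagnoses = diagnoses
  end

  def diagnosed_with?(condition)
    @diagnoses.any? { |diagnosis| diagnosis.condition == condition }
  end

  def treated_by?(doctor, date)
    @diagnoses.any? do |diagnosis|
      diagnosis.treatments.any? { |t| t.doctor == doctor && t.date == date }
    end
  end
end

july_11 = Date.new(2012, 7, 11) # made-up year, for illustration only

old_record = VisitBasedRecord.new([Visit.new(july_11, "Dr. Simpson", ["measles"])])
new_record = DiagnosisBasedRecord.new(
  [Diagnosis.new("measles", [Treatment.new(july_11, "Dr. Simpson")])]
)

questions = [
  [:diagnosed_with?, "measles"],
  [:diagnosed_with?, "mumps"],
  [:treated_by?, "Dr. Simpson", july_11]
]

# Ask both records every question and pair up the answers.
answers = questions.map do |question, *args|
  [old_record.public_send(question, *args), new_record.public_send(question, *args)]
end
```

Each pair in `answers` holds the old model's answer next to the new model's answer. If the two schemas describe the same history, the two values in every pair should match.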
What we did
By checking the values of your model before and after the migration, you can have increased confidence in the data that you’ve migrated. We chose to do this using a SanityCheck class, which looks something like this:
    class SanityCheck
      attr_reader :diff, :record

      # The list of methods we’re going to compare -- obviously, we can add
      # anything we want here, not just methods that are on the user class.
      Methods = [:was_treated_on?, :was_vaccinated_on?, :has_active_treatment?, :current_prescriptions] # etc.

      def initialize(record)
        @record = record
        @before_values = SanityCheck.values(record)
      end

      # Call every method in the list and collect the results in a hash.
      def self.values(record)
        Methods.map { |m| [m, record.send(m)] }.to_h
      end

      def check
        @after_values = SanityCheck.values(record)
        @diff = compute_diff(@before_values, @after_values)
      end

      private

      # Named compute_diff so it doesn't override the diff reader above.
      # Returns the entries of a whose values changed, plus entries only in b.
      def compute_diff(a, b)
        a.dup
          .delete_if { |k, v| b[k] == v }
          .merge!(b.dup.delete_if { |k, v| a.has_key?(k) })
      end
    end
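As a worked example of the comparison step, the following sketch applies the same logic to two plain hashes. The helper name `changed_entries` and the sample values are assumptions made for illustration; only the comparison rule itself comes from the class above.

```ruby
# Standalone version of the hash comparison, for illustration only.
# Keeps entries of `before` whose value differs in `after`,
# plus any entries that exist only in `after`.
def changed_entries(before, after)
  changed = before.reject { |key, value| after[key] == value }
  added   = after.reject { |key, _value| before.key?(key) }
  changed.merge(added)
end

# Hypothetical snapshots of one record before and after a migration.
before = { :has_active_treatment? => true, :current_prescriptions => ["amoxicillin"] }
after  = { :has_active_treatment? => true, :current_prescriptions => [] }

unchanged = changed_entries(before, before) # identical snapshots
mismatch  = changed_entries(before, after)  # one value differs
```

Identical snapshots produce an empty hash, which is what lets the caller treat "any entries at all" as a failure. Note that for a key present in both snapshots, the result holds the value from before the migration, so it is worth logging the after-values as well if you want to see both sides of a mismatch.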
To make use of this class, we instantiated it before we migrated the data, which populated the before values. Then we migrated the data and checked the after values. If there were any differences between the two, we logged an error and rolled back the transaction pending further review:
    # find_each loads users in batches instead of all at once
    User.find_each do |user|
      ActiveRecord::Base.transaction do
        sanity_check = SanityCheck.new(user)
        user.migrate!
        sanity_check.check

        if sanity_check.diff.any?
          # sanity check failed -- log an error, rollback the transaction, etc.
          log "Oh no! Something went wrong: #{sanity_check.diff}"
          raise ActiveRecord::Rollback
        else
          # woo hoo, success! Let’s indicate that this user was migrated
          # (update_attributes is called update in newer versions of Rails)
          user.update_attributes(:was_migrated => true)
        end
      end
    end
What about unit tests?
The important thing to note is that what we were doing with this pattern wasn’t testing the code (we already did that) – it was testing the data. And since we were doing it for every record in the system, using live, production values, it was the best source of data available.
It also helped us uncover gaps in our understanding of the model we were trying to migrate. In any long-running system, there’s bound to be an abundance of accumulated knowledge living inside your codebase and nowhere else (though on the flip side, there’s also a lot of obsolete functionality that you’ll never miss). Running these sanity checks helped us uncover those hidden assumptions long before the inevitable “Hey, I remember in the old system we used to be able to…” emails started coming in.
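One possible way to turn logged diffs into this kind of insight is to count which methods disagree most often, since a method that fails for many records usually points at a misunderstood rule rather than a one-off bad row. This is a minimal sketch with made-up data, not code from the migration itself; `failed_diffs` and its record ids are hypothetical.

```ruby
# Hypothetical diffs collected from failed sanity checks, keyed by record id.
failed_diffs = {
  101 => { :current_prescriptions => ["amoxicillin"] },
  102 => { :current_prescriptions => [], :has_active_treatment? => true },
  103 => { :was_vaccinated_on? => false }
}

# Count how many records disagreed on each method.
mismatch_counts = Hash.new(0)
failed_diffs.each_value do |diff|
  diff.each_key { |method_name| mismatch_counts[method_name] += 1 }
end

# Most frequently mismatched methods first.
ranked = mismatch_counts.sort_by { |_method_name, count| -count }
```

The method at the top of `ranked` is a good place to start reading the old code for behavior the new model does not yet reproduce.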
Rewrites aren’t easy, especially when trying to migrate an accumulated history throughout time and faithfully capture the state at each of those times – all while handling every edge case, bug, hack, and workaround that seemed like a great idea years ago. Sanity checking proved to be a great tool in helping us run our migrations without a hitch.