Protects the Innocent

Source on GitHub


This Ruby on Rails plugin allows you to easily generate a version of your application’s database that does not contain any proprietary data. It “protects the innocent”.

It’s built upon the Faker gem, which generates random names, email addresses, text and more.


Have you ever wanted to demo an existing, working application without revealing private data? Maybe you want to bring another developer on board, and you need to give them realistic seed data.

Why not just fake this data? The more complex your app gets, the harder it is to fake this with yaml files, or even factories. But there’s another reason: there’s no substitute for real-world data. Any records you create by hand will be based on your assumptions of how your app is used. Maybe you never thought to test how your view handles a cart with zero items, for instance.


WARNING! Running the rake task below WILL change the data in your database. It does not create a new, sanitized database. How could it? It is advised that you copy your database to a development version, and run the rake tasks there.

In a nutshell, you need to:

  • Add the should_protect_as_innocent helper to your unit test (if you’re using shoulda).
  • Add protection to your chosen models, choosing which fields should be protected.
  • Run the rake task that will actually protect all chosen model records.

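For the impatient, here’s a three-step example:

# test/unit/task_test.rb
# (superclass assumed: use whatever your test cases inherit from)
class TaskTest < ActiveSupport::TestCase
  should_protect_as_innocent :name=>:name, :email=>:email, :age=>:number, :description=>:text
end
# app/models/task.rb
class Task < ActiveRecord::Base
  protects_the_innocent :name=>:name, :email=>:email, :age=>:number, :description=>:text
end
# command line
rake protect:all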

For a more in-depth understanding, read on.


There is a test helper available if you use shoulda. Use the example above. It takes the same parameters as your call to protects_the_innocent in the model itself.

Add Protection to Your Chosen Models

To protect a model, call protects_the_innocent with a hash that maps each field name you want to protect to a data type. See the example above.

Here are the available data types:

data type       description                   example output
:company        company name                  Kuphal and Sons
:catch_phrase   company slogan                Mandatory multimedia migration
:bs             a short, made up sentence     iterate visionary methodologies
:name           a person’s full name          Domenic Bergnaum
:first_name     a person’s first name only    Domenic
:last_name      a person’s last name only     Bergnaum
:username       username                      marcus.bechtelar
:email          email address
:phone          phone number                  902-697-2898 x7579
:ip             IP address
:domain         domain (URL)
:address        street address                521 Jakayla Island
:city           city                          East Vallieberg
:state          state                         Delaware
:state_abbr     state abbreviation            MD
:zip            zip/postal code               32342-8723
:word           one latin-esque word          eum
                three latin-esque words       ab quia esse
:sentence       one latin-esque sentence      Praesentium impedit mollitia deleniti officiis cum numquam quasi aperiam.
:sentences      three latin-esque sentences   Numquam laboriosam placeat similique quis qui. Quasi voluptatum quis omnis unde. Ut quo voluptatem ut.
                one latin-esque paragraph     Molestiae aspernatur est ipsum dolores in suscipit. Recusandae eaque alias occaecati aut earum adipisci nostrum. Eligendi eum et doloremque. Delectus maxime est a nihil aperiam nemo alias qui.
:paragraphs     three latin-esque paragraphs  Eos et esse consequatur quod labore debitis dicta. Quis placeat minus enim natus. Provident nesciunt aut nostrum voluptate molestiae omnis. Minus nostrum quia velit corporis consectetur sed nulla.

                                              Quidem doloribus aut ut nam velit quos omnis. Illo labore consectetur culpa nihil quibusdam. Facilis et suscipit ipsam totam fugit.

                                              Ut et et distinctio voluptatum. Non temporibus quas velit delectus eligendi accusantium illo. Illo in sunt adipisci pariatur quis enim voluptatum omnis. Ullam possimus ut odio.
:nil            Ruby’s nil value              nil
:number         any number                    This is cool: it will return a random number, but close enough to the original to be believable in almost any circumstance.
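
The post doesn’t spell out how :number stays “close enough to the original”, so here is a minimal sketch of one way it could work. The helper name believable_number and the 20% spread are my own assumptions for illustration, not the plugin’s actual code:

```ruby
# Hypothetical helper, not part of the plugin: returns a random number
# within +/- spread (a fraction of the original) of the original value.
def believable_number(original, spread = 0.2, rng = Random.new)
  return original if original.nil?

  delta  = original.abs * spread
  low    = original - delta
  high   = original + delta
  fuzzed = low + rng.rand * (high - low)

  # Keep integers as integers so the value still fits an integer column.
  original.is_a?(Integer) ? fuzzed.round : fuzzed
end

age = believable_number(40)   # some integer between 32 and 48
```

A relative spread keeps small values small and large values large, which is probably what makes the result believable for things like ages or prices.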

There are also these special, custom data types. I’ll show you example input instead of example output, since they’re more complex:

data type           description                                   example usage
lambda{|object| …}  Any anonymous function you wish to pass in.   lambda{|x| x.user.display_name}
                    The parameter is the object being protected.
any static value    You can pass in any static value.             “turkey”
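
To make the three kinds of values concrete, here is a rough sketch of how a single data type argument might be turned into a replacement value. Everything named here (fake_value_for, GENERATORS, the Comment and User structs) is hypothetical and stands in for the plugin’s Faker-backed internals, which aren’t shown in this post:

```ruby
# Hypothetical stand-ins for the Faker-backed generators.
GENERATORS = {
  :first_name => lambda { %w[Domenic Marcus].sample },
  :nil        => lambda { nil }
}

# Resolve one data type argument into a replacement value.
def fake_value_for(spec, object)
  if spec.respond_to?(:call)
    spec.call(object)        # custom lambda: receives the record being protected
  elsif spec.is_a?(Symbol) && GENERATORS.key?(spec)
    GENERATORS[spec].call    # built-in data type
  else
    spec                     # anything else is used as a static value
  end
end

User    = Struct.new(:display_name)
Comment = Struct.new(:user)
comment = Comment.new(User.new("Ann"))

fake_value_for(lambda { |x| x.user.display_name }, comment)  # uses the record
fake_value_for("turkey", comment)                            # static value
fake_value_for(:first_name, comment)                         # built-in type
```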

Run the Rake Task

You can protect all the models at once, only the ones you include, or all but the ones you exclude, respectively:

# Assume you have the following models: Client, Task, CustomerProfile, and Comment
# protect all four models, alphabetically
rake protect:all
# protect only Client, Task, and CustomerProfile models in the order listed
rake protect:only MODELS=Client,Task,CustomerProfile
# protect all models except Comment, in alphabetical order
rake protect:except MODELS=Comment 

Models can be listed in CamelCase (CustomerProfile) or under_scored (customer_profile) form.
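
If it helps to see the selection rules side by side, here is a small sketch of how the three tasks could pick and order models. The helpers normalize_model_name and models_to_protect are hypothetical names of mine, and this is an assumption about the behavior described above rather than the plugin’s real rake code:

```ruby
# Hypothetical helpers; the plugin's actual rake internals are not shown here.

# Turn "customer_profile" or "CustomerProfile" into "CustomerProfile".
def normalize_model_name(name)
  name.strip.split("_").reject(&:empty?).map { |part| part[0].upcase + part[1..-1] }.join
end

# Pick the models for protect:all, protect:only and protect:except.
# Names in MODELS that match no known model are silently ignored here.
def models_to_protect(all_models, mode, models_env = nil)
  listed = models_env.to_s.split(",").map { |n| normalize_model_name(n) }
  case mode
  when :all    then all_models.sort
  when :only   then listed & all_models        # keeps the order given in MODELS
  when :except then (all_models - listed).sort
  else raise ArgumentError, "unknown mode: #{mode}"
  end
end

all = %w[Client Task CustomerProfile Comment]
models_to_protect(all, :all)
# => ["Client", "Comment", "CustomerProfile", "Task"]
models_to_protect(all, :only, "task,customer_profile,Client")
# => ["Task", "CustomerProfile", "Client"]
models_to_protect(all, :except, "comment")
# => ["Client", "CustomerProfile", "Task"]
```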

Advanced Usage

Setting the Order that Models are Protected

Sometimes the fake data in one model will depend on the fake data already created in another model. In this case, order is important. By default, models are protected alphabetically by name. You can create a short custom rake task to specify any order you want, to save future typing:

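# lib/tasks/custom_protection.rake

namespace :protect do
  desc "Protect models in the order I choose"
  task :sorted => :environment do
    ENV['MODELS'] = "task,client"
    # Hand off to the existing task, which reads MODELS in the order given
    Rake::Task['protect:only'].invoke
  end
end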

All we’re doing here is calling the protect:only task that already exists. We’re setting the MODELS environment variable to a comma-separated list of models, in the order we choose.

Setting the Order that Fields are Protected

Sometimes certain fields depend on others while a record is being protected. Let’s say, for whatever reason, you have first_name, last_name, and full_name attributes. The full name should obviously be a composite of the first and last names, but that will only work if we know full_name is determined last. On top of that, we need to pass a function into our protection call that combines first and last names into one:

protects_the_innocent [
  [:first_name, :first_name],
  [:last_name, :last_name],
  [:full_name, lambda{|x| "#{x.first_name} #{x.last_name}"}]
]

We’ve done two things here. First, we used a 2-dimensional array instead of a hash, because it preserves its order. protects_the_innocent doesn’t care, because it simply calls .first and .last on each set of values.

Second, we’ve passed in an anonymous function, with the parameter being the object currently undergoing protection. We use it to set the full name to a combination of first and last name.
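
Here is a self-contained sketch of why the order matters. It walks the pairs with .first and .last as described above, but protect_in_order, the FAKE table and the Person struct are hypothetical stand-ins I made up for illustration, not the plugin’s implementation:

```ruby
Person = Struct.new(:first_name, :last_name, :full_name)

# Hypothetical fixed stand-ins for the :first_name and :last_name generators.
FAKE = { :first_name => "Domenic", :last_name => "Bergnaum" }

# Apply each [field, spec] pair in order, writing the new value back
# before moving on, so later lambdas see the earlier replacements.
def protect_in_order(record, pairs)
  pairs.each do |pair|
    field, spec = pair.first, pair.last
    value = spec.respond_to?(:call) ? spec.call(record) : FAKE.fetch(spec)
    record[field] = value
  end
  record
end

person = Person.new("Real", "Customer", "Real Customer")
protect_in_order(person, [
  [:first_name, :first_name],
  [:last_name,  :last_name],
  [:full_name,  lambda { |x| "#{x.first_name} #{x.last_name}" }]
])
person.full_name  # => "Domenic Bergnaum"
```

If the :full_name pair came first, the lambda would run against the original first and last names and the real name would leak into the protected record.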

To-Do List

  • expanded syntax to allow greater control over generated content:

    • numbers:

      • min value
      • max value
      • list of acceptable values
    • words, sentences, and paragraphs:

      • specified number of items
      • maximum length of generated string
    • all string types:

      • flag for unique value
      • new value that is guaranteed to be different from original
