Custom rake tasks in Merb: Data Backup and Import

Posted by mikong on November 03, 2008

There are a lot of data import solutions in Rails, most of which depend on ActiveRecord. Since Merb supports ActiveRecord too, you can use those solutions in your Merb app. But I’m using DataMapper in my Merb app, so I had to look for another way.

This article shows how to create a simple rake task in a Merb + DataMapper project. It then talks about the Data Backup and Import rake tasks db:dump_data and db:load_data. I’ve added some notes for those with a Rails background.

A simple rake task

When I generated my Merb app, the lib folder wasn’t generated. It looked something like this:

sample_app
  |--> app
  |--> autotest
  |--> config
  |--> doc
  |--> gems
  |--> merb
  |--> public
  |--> spec
  |--> Rakefile
  `--> tasks

The folder structure seems to suggest that you write your custom rake tasks under the tasks folder of your Merb app, but this is not the case. Read the Rakefile and you can see these 3 important details:

  • Add your custom tasks named file_name.rake in /lib/tasks.
  • The Merb environment is initialized to the MERB_ENV value, or ‘rake’ environment if MERB_ENV is not set.
  • To start the runner environment, in case you need access to your application’s classes, there is a task called :merb_env.

So the location is just like in Rails, i.e. in the lib/tasks folder. Try creating a custom.rake file in the lib/tasks folder and add the following:

  desc "Print all classes that include DataMapper::Resource."
  task :print_dm_resources => :merb_env do
    DataMapper::Resource.descendants.each do |resource|
      puts resource
    end
  end

The rake task above depends on the :merb_env task in order to access the application’s models. This is just like a rake task in Rails that depends on the :environment task. To run the task:

$ MERB_ENV=development rake print_dm_resources

Data Backup and Import

The following rake tasks are based on the rake file from Tobias Lutke’s old blog post about migrating between databases. I’ve translated it to work with Merb + DataMapper:

namespace :db do

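  # Every model that includes DataMapper::Resource, minus the session store.
  # Uses the non-destructive reject: reject! would modify the shared
  # collection of descendants and returns nil when nothing is removed.
  def interesting_tables
    DataMapper::Resource.descendants.reject do |table|
      [Merb::DataMapperSessionStore].include?(table)
    end
  end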

  desc "Dump data from the current environment's DB."
  task :dump_data => :merb_env do
    dir = Merb.root_path("config/data/#{Merb.env}")
    FileUtils.mkdir_p(dir)
    FileUtils.chdir(dir)

    interesting_tables.each do |table|
      puts "Dumping #{table}..."

      File.open("#{table}.yml", 'w+') { |f| YAML.dump(table.all.collect(&:attributes), f) }
    end
  end

  desc "Load data (from config/data/<environment>) into the current environment's DB."
  task :load_data => :merb_env do
    dir = Merb.root_path("config/data/#{Merb.env}")
    FileUtils.mkdir_p(dir)
    FileUtils.chdir(dir)

    adapter = DataMapper.repository(:default).adapter

    interesting_tables.each do |table|
      table.transaction do |txn|
        puts "Loading #{table} data..."
        YAML.load_file("#{table}.yml").each do |fixture|
          adapter.execute("INSERT INTO #{table.name.pluralize.snake_case} (#{fixture.keys.join(",")}) VALUES (#{fixture.values.collect {|value| adapter.send(:quote_column_value, value)}.join(",")})")
        end
      end
    end
  end
end

Add the above to your custom rake file. To dump the data from your development environment, run the following:

$ MERB_ENV=development rake db:dump_data

This will create the folder config/data/development if it doesn’t exist yet, and generate a ModelName.yml file for each of your models that includes DataMapper::Resource. It is necessary to specify MERB_ENV, otherwise the environment will be initialized to the ‘rake’ environment and the folder config/data/rake will be created instead.
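
To get a feel for the file format, here is a minimal sketch of what the dump step writes and what the load step reads back. The rows are hypothetical stand-ins for table.all.collect(&:attributes), and only Ruby’s standard yaml library is used, so it runs outside Merb:

```ruby
require 'yaml'

# Hypothetical rows standing in for table.all.collect(&:attributes).
rows = [
  { :id => 1, :word => 'burger' },
  { :id => 2, :word => 'food' }
]

# What dump_data writes: a YAML sequence with one mapping per record.
yaml_text = YAML.dump(rows)
puts yaml_text

# What load_data reads back: an array of hashes, one per record.
loaded = YAML.load(yaml_text)
loaded.each do |fixture|
  puts "columns: #{fixture.keys.join(',')}  values: #{fixture.values.join(',')}"
end
```

Each entry has to be a hash of column names to values, because load_data calls keys and values on it. A file whose top level is a mapping rather than a sequence will not work with the task as written.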

Then, when you migrate your database to the latest version of your models, it is cleared of its data (a side effect of using DataMapper’s automigrate):

$ rake db:automigrate

After migrating, we can just reload the data using our other rake task:

$ MERB_ENV=development rake db:load_data

If you follow the sample process above, you may need to update the YAML files to reflect the changes introduced by the migration. If there’s a new model, you can just create a new YAML file for it.
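
One way to handle such changes is to edit the dumped rows with a small script before loading them. The sketch below renames a column in every row. The helper rename_column and the column names are hypothetical and not part of the rake tasks above:

```ruby
require 'yaml'

# Hypothetical helper: returns a copy of the rows with one key renamed.
# Rows that lack the old key are returned unchanged.
def rename_column(rows, from, to)
  rows.map do |row|
    updated = row.dup
    updated[to] = updated.delete(from) if updated.key?(from)
    updated
  end
end

# Hypothetical dumped rows; in practice these would come from
# YAML.load_file on one of the generated files.
rows = [
  { :id => 1, :word => 'burger' },
  { :id => 2, :word => 'food' }
]

migrated = rename_column(rows, :word, :term)
puts YAML.dump(migrated)
```

To apply this to a real file, load it with YAML.load_file, pass the result through the helper, and write it back with YAML.dump before running db:load_data.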

Here are some of the things you can do with the script:

  • change the path where the YAML files are stored
  • use a different file format for the data (quite a big change though)
  • edit interesting_tables method to exclude more models
  • clear tables before loading the data (this could be dangerous!)
  • create a task that depends on dump_data, automigrate, and load_data
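
The last item could look something like the sketch below. The task name db:refresh_data is made up, and the three tasks it depends on are replaced by stand-ins that only record that they ran, so the snippet can be run on its own as a plain Ruby script. In a real rake file you would keep only the refresh_data line, next to the real tasks:

```ruby
require 'rake'

# Makes task, namespace and desc available when run as a plain script.
extend Rake::DSL

ran = []

namespace :db do
  # Stand-ins for the real tasks, used here only to show the ordering.
  task(:dump_data)   { ran << 'db:dump_data' }
  task(:automigrate) { ran << 'db:automigrate' }
  task(:load_data)   { ran << 'db:load_data' }

  desc "Dump the data, automigrate, then load the data back."
  task :refresh_data => [:dump_data, :automigrate, :load_data]
end

# Prerequisites are invoked in the order they are listed.
Rake::Task['db:refresh_data'].invoke
puts ran.join(' -> ')
```

Since all three steps then run in a single rake process, they share the same MERB_ENV value.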

I didn’t clear the tables before loading the data because I prefer that the rake task throw an error when I’m running it on a populated database.

There are probably better ways to approach this problem. In my case though, I just needed a quick solution to reload my data after running automigrate.

Comments

  1. Uri Tue, 16 Dec 2008 11:39:13 UTC

    Thanks, this is very useful for me.
    BTW, should this work in slices too? Can’t see my custom tasks.

  2. mikong Tue, 16 Dec 2008 18:31:49 UTC

    @Uri: You might want to check out these pages in the merbivore wiki:

    http://wiki.merbivore.com/howto/slice

    There’s a part there in ‘Create table from model’ section where he noted that the slice didn’t have access to rake db:automigrate, and so he piped the ‘Datamapper.automigrate!’ command to the slice.

    http://wiki.merbivore.com/howto/slice/rake_tasks

    In this page, it’s shown that you need to add the dependencies in the Rakefile of your slice.

  3. Uri Tue, 16 Dec 2008 20:27:02 UTC

    Thanks!

  4. Philippe Rathé Sun, 01 Mar 2009 20:15:06 UTC

    Can you tell me what your script expect as yaml syntax?
    I get the following error:
    rake aborted!
    undefined method `keys' for ["reason1", {"name"=>"My own reason"}]:Array

    With this yaml:
    reason1:
      name: My own reason

    Thanks

  5. mikong Mon, 02 Mar 2009 16:36:59 UTC

    Hi Philippe,

    I suggest you try dumping some data so you can see the yaml expected. Here’s an example data dump on my side:


    ---
    - :word: burger
      :id: 1
    - :word: food
      :id: 2

  6. jc Sun, 22 Mar 2009 23:00:34 UTC

    Hrmmm.. the load task doesn’t work. It croaks on the created_at and updated_at columns. “Invalid time”

    Seems like YAML is a very odd choice for this. Why not just use sql in and sql out? The data format conversions make things very unstable.

  7. mikong Mon, 23 Mar 2009 02:06:39 UTC

    I have created_at and updated_at in several places in the project that I used this. I didn’t have any problems. Can you give more details about your environment? The Merb version and DM version you’re using, what your database is, and maybe how your updated_at/created_at data looks like in your YAML file?

    I like YAML because it has a nice format, and it’s quite easy to modify the data before loading it. I guess if I had your errors and couldn’t make YAML work, then I’d consider other solutions. Btw, it’s not exactly my solution. As mentioned above, I just translated Tobi’s work which I think is amazing coz it was an old article, and yet it just worked for my case.

    Anyway, I hope you could give me the details of your environment so I can take a look at the problem.

  8. jc Sun, 05 Apr 2009 08:50:59 UTC

    Looks like it’s a YAML bug. Tracked here:
    http://redmine.ruby-lang.org/issues/show/752

    It fails just loading the generated YAML file.
    irb(main):014:0> YAML.load_file "config/data/development/Job.yml"
    ArgumentError: time out of range
      from /opt/local/lib/ruby/1.8/yaml.rb:133:in `utc'
      from /opt/local/lib/ruby/1.8/yaml.rb:133:in `node_import'
      from /opt/local/lib/ruby/1.8/yaml.rb:133:in `load'
      from /opt/local/lib/ruby/1.8/yaml.rb:133:in `load'
      from /opt/local/lib/ruby/1.8/yaml.rb:144:in `load_file'

  9. jc Sun, 05 Apr 2009 08:51:41 UTC

    Here’s what the timestamps look like in the YML file

    :created_at: 2009-03-31T00:04:45-07:00
    :updated_at: 2009-03-31T00:04:45-07:00

  10. Brian Stolz Fri, 10 Jul 2009 16:36:30 UTC

    I was having issues with join tables which have names such as “roles_users” because of the double pluralization.

    To resolve this I changed:

    INSERT INTO #{table.name.pluralize.snake_case}

    to:

    INSERT INTO #{table.storage_name}

    This uses the proper table name instead.

  11. jc Sun, 27 Sep 2009 09:30:07 UTC

    This no longer works in DataMapper 0.10. quote_column_name no longer exists on adapter. Supposedly it’s in the connection now, but I can’t find it.

  12. jc Sun, 27 Sep 2009 10:13:16 UTC

    adapter.send(:with_connection) do |connection|
      adapter.execute("INSERT INTO #{model_class.name.pluralize.snake_case} (#{fixture.keys.collect{|key| adapter.send(:quote_name, key.to_s)}.join(",")}) VALUES (#{values.collect {|value| connection.quote_value(value)}.join(",")})")
    end
