Friday, 21 December 2012

Import data from mysql to mongo


To import data from mysql into mongodb we need to export data from mysql in a format mongo likes, this is either a tsv or a csv. Ideally you would like to use the following sql

select columns INTO OUTFILE '/path/to/csv'
   FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
   LINES TERMINATED BY '\n' from table [where clause]

What the above SQL does is exports the data (only data , no headers) into a file as provided in your path, make sure you have access to location where you are putting the file. If you ignore the path and just use the filename this will be stored in the location where mysql data files are stored.

To get the fields in the data file, we can use the following
   
echo "SELECT * FROM
;" | mysql -uXXX -pXXX > '/path/to/
.tsv'

Finally you want all the be imported into a mongo database, here since our file contains a header we specify the --headerline option. For a complete list of options available with mongoimport please look at the mongodb site (http://docs.mongodb.org/manual/reference/mongoimport/)

mongoimport -d -c -type tsv --headerline

The headerline option uses the first line for fieldnames, this is fine till you have no data in your mysql and still want to import, then you are back to using the first option on OUTFILE but you also need to have a header file generated which you can then use in mongo, lets see how we can do this.

If you run into problems because of an empty data set, an empty set means no column names! use the steps below:

echo "SELECT column_name FROM information_Schema.cloumns WHERE table_schema = DATABASE() AND table_name='';" | mysql -uXXX -pXXX > '/path/to/_headers.tsv'

Then when importing to mongo use the following:

mongoimport -d -c -type tsv --fieldFile /path/to/_header.tsv

No comments:

Post a Comment