Skip to main content

Use ProxySQL to obfuscate data for development.

Before I begin let me put this out as a protection against the pitch fork guys 🙂 I do not in anyway recommend using live data (albeit sanitised) into dev, we should really have test data and keep updating the test data as application grows and is changed.

But in case when your developers are supporting a really fast moving, highly configurable system there may be a need to use live backups to be regularly sanitised and applied to dev systems. The usual way of doing this is take a backup from production servers, transfer to dev/stage servers, restore to Dev DB and then run some sanitisation scripts.

This seemed to be a bit of a problem as we did not want to have a situation that the mysqldump of live database sits on pre prod servers or may be left there my mistake. Privacy is a big thing today and due to introduction of GDPR we now have a need to eliminate any chances of collecting and storing personal data where we do not need it. 

Enter ProxySQL:

I stumbled upon a presentation from Frederic Descamps and Rene Cannao at FOSDEM 2017 where they talk about the data anonymisation for developers this is exactly what we needed. I wanted to try that out.

I also found an article from the proxysql blog which the same authors show what they did to achieve the data obfuscation. What follows is a step by step guide on how i did this with docker.

First we need our actual database, this is going to represent our live database, for this i am using the official mysql-server container from and then we need some data; for the data i have used the same database as recommended in the blogpost.

I pull the docker image for Mysql 5.7, expose the 3306 port so I can access the database from my host machine.

docker run --name=mysql1 -p 3306:3306 -e MYSQL_ROOT_PASSWORD=newpassword -e MYSQL_ROOT_HOST=% -d mysql/mysql-server:5.7

In short this will create a mysql 5.7 container with the database running, setting the 'e' params we set the password and MYSQL_ROOT_HOST variable needs to be set as % or the IP address of your docker machine if you need to access this database from outside the container.

For more information on other options and how this works please refer to the, beware the content will be volatile as we are not mounting any shares or volumes, which should suffice for our tests.

Now i will load this test data which is from

mysql -u root -h -p < employees.sql

in this case is my docker host, as i am using docker toolbox for mac.

Now for create a proxy sql container.

docker run -d -p 6033:6033 nidash/proxysql:1.4.8

This will create a proxysql server and start it, exposing port 6033

Now proxysql uses 2 ports, 6032 for administering proxysql and 6033 where it acts as a proxy for the mysql servers, ie. all external users connect to the mysql server using port 6033.

docker exec -it mysql -u admin -padmin -h -P6032

After you enter the password, which my default is 'admin' you are presented with a mysql client prompt, beware this is not a true mysql command line interface but a special one for proxy sql administering, this is done so it is easier for DBAs to use a familiar interface.

There is an option to configure ProxySQL using a configuration file, when using with docker the above command can be modified to add a -v (volume mapping option to mount your config file over the default one at /etc/proxysql.cfg)

To allow us to connect to our mysql server through the proxysql proxy, we need a user created on mysql server that allows you to connect, in my case i have created a user as 'msandbox', we also need the actual user to be present on the server in our case we will be using the user 'muser' we need to give appropriate permissions to allow it to connect as well, for the proxy sql side of things we need to add the configuration in as follows:

Setup backend:

INSERT INTO mysql_servers(hostgroup_id,hostname,port) VALUES (1,'',3306);

Configure monitoring:

UPDATE global_variables SET variable_value='msandbox' WHERE variable_name='mysql-monitor_username';
UPDATE global_variables SET variable_value='msandbox' WHERE variable_name='mysql-monitor_password';

Load the variables to runtime:



Now, lets confirm proxysql can access the mysql server by looking at the monitoring log:

SELECT * FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC LIMIT 10;

should show something like
| hostname       | port | time_start_us    | connect_success_time_us | connect_error                                                       |
| | 3306 | 1526302545856645 | 1388                    | NULL                                                                |
| | 3306 | 1526302485856208 | 622                     | NULL                                                                |

which is good, and we are happy with the mysql backend configuration so we load this config into runtime and save to disk.


Lets create mysql users now:

INSERT INTO mysql_users(username,password,default_hostgroup) VALUES ('muser','muser',1);

and load users to runtime and update disk:

Now lets try a mysql dump (these instructions are not my own, these are excerpts from the proxysql blog, i have just reproduced this here for the sake of showing this working with my setup)

mysqldump --user=muser --password=muser -h -P6033 employees salaries --skip-extended-insert | less

notice the output shows insert statements as 

INSERT INTO `salaries` VALUES (10001,60117,'1986-06-26','1987-06-26');
INSERT INTO `salaries` VALUES (10001,62102,'1987-06-26','1988-06-25');
INSERT INTO `salaries` VALUES (10001,66074,'1988-06-25','1989-06-25');

Now lets create a rule to mask these salaries.

We will need to connect to our admin interface and add the rules:

INSERT INTO mysql_query_rules (rule_id,active,schemaname,match_pattern,replace_pattern) VALUES (1,1,'employees','^SELECT /\*!40001 SQL_NO_CACHE \*/ \* FROM `salaries`','SELECT SQL_NO_CACHE emp_no, 99999, from_date, to_date FROM salaries');

Load the rules to runtime

and save to disk:

Exit the admin interaface and run the mysqldump again.

mysqldump --user=muser --password=muser -h -P6033 employees salaries --skip-extended-insert | less

notice the salary is now masked and output now contains:

INSERT INTO `salaries` VALUES (10001,99999,'1986-06-26','1987-06-26');
INSERT INTO `salaries` VALUES (10001,99999,'1987-06-26','1988-06-25');
INSERT INTO `salaries` VALUES (10001,99999,'1988-06-25','1989-06-25');

Now we can improve on these and add more rules, but do not forget to load rules to runtime and to save for future save them to disk. There is also a handy script that allows you to create obfuscation rules for the columns you specify, this can be found at:


Popular posts from this blog

Create a drupal development environment using docker.

Currently at work i am using docker to streamline and speedup of development and deployment to testing and production, I am using symfony currently and our app is dockerised, i wanted to do the same for drupal based apps.

I searched for a official drupal docker repo and i did find this.

Unfortunately there was very terse explanation but pretty much worked, i found a few hiccups along the way so wanted to share what i did.

Install docker for your environment. Lots of details are available at explaing docker and installing docker for your environment.

The docker repo holds the drupal container at, following the instructions this is how to create the drupal container:

In order to use my code base we need to mount the volume with the -v directive as below, we do this in one command:

docker run -v /$HOME/work/drupal8:/var/www/html -p 8080:80 --name mydrupal -d drupal

this will create…

Use feeds module to import XML data in Drupal 7

Recently i had to setup a xml import process to import data from an external supplier on a regular basis, the filename and format was not changing and i could choose where to store the file. Following is how i set it up.

First we need to download the following modules

feeds admin ui

enable the above modules.

We will need a sample XML so this will be the one we will use for this demonstration:

<?xml version="1.0" encoding="utf-8" ?>
<!-- Sample for demonstration of processing XML feeds with feeds module -->

How do you expose your legacy database tables to views in drupal 6 Part 1

This is a 2 part post, that shows how I exposed a legacy table to views in drupal and used the incredible power of views to create many combinations static and conditional views. This was very helpful for my client.

I had a requirement that warranted exposing existing database table to views, so that views can contain data from this table.  To enable this to happen, the following steps need to be taken.
In a custom module you need to first define the views_api hook as following:
function candidate_views_api(){ $view = array(); $view = array('api' => 2, 'path' => drupal_get_path('module','candidate'). '/views'); return$view; }
candidate module was a custom module which I already had, so it made sense to add this hook here. 
The important bit here is telling the drupal sub-system where the views file will reside. 
Then create a file with your <modulename> in my case it was, this file should have two hooks, one …