Docker

Using an SSH tunnel that terminates on the host from within a Docker container

There are many use cases for SSH tunnels, as they allow accessing resources behind firewalls and other obstacles over a secure channel. Since more and more services are containerized, it makes sense to use SSH tunnels within the context of containers as well, especially for testing. Using an SSH tunnel inside a Docker container would normally require installing an SSH client and mounting keys. In many cases this is not possible without building a new Docker image which includes the client. As this is a cumbersome approach, an easy but insecure shortcut exists and is recommended in many tutorials and StackOverflow posts: the --net=host flag, which gives the container access to all ports of the host, including open SSH tunnels. For obvious reasons, this is dangerous.

A better approach is to bind the SSH tunnel to the bridge network of the Docker service. This bridge is available to all containers connected to the particular network and can therefore also forward specific ports. This technique gives much more fine-grained control over which containers may access the tunnel.

You can list the bridges with ifconfig.

br-b273916af970: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.18.0.1  netmask 255.255.0.0  broadcast 172.18.255.255
        ether dd:bb:aa:cc:bb  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 205  bytes 22043 (22.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br-c92ab5650a7a: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.19.0.1  netmask 255.255.0.0  broadcast 172.19.255.255
        ether aa:bb:cc  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether bb:aa:cc:aa:bb:  txqueuelen 0  (Ethernet)
        RX packets 3919  bytes 227485 (227.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3205  bytes 8586636 (8.5 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

You can find out the bridge a container uses with docker inspect «container». The default bridge is called docker0. You then need to enable packet forwarding to this bridge in iptables. Note that the change below is not persisted; you need to apply it again after a reboot or make it permanent.

sudo iptables -I INPUT 3 -i docker0 -j ACCEPT
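
One possible way to make the rule survive reboots (a sketch, assuming a Debian or Ubuntu host with the iptables-persistent package) is the following:

# Hypothetical persistence setup on Debian/Ubuntu; adapt to your distribution
sudo apt-get install iptables-persistent
# Save the currently active rules so they are restored at boot
sudo netfilter-persistent save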

After this step you can open an SSH tunnel on the host and also use it inside your container. This way you do not have to install SSH clients, keys etc. The trick is to bind the SSH connection to the right interface. Below you can see an example command which allows connecting to a MySQL database via a bastion host, a typical scenario for cloud services.

ssh -L 172.17.0.1:7002:mysqlcluster:3306 bastion.example.org

Then you can access the forwarded port within the Docker container on the same IP / interface, e.g. 172.17.0.1:7002. This way you can, for instance, use the Percona PMM container for monitoring your cluster also on your local machine, without having to deploy it and expose it via the Web.
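
Because the official mysql image already ships a client, a quick way to test the tunnel from inside a container could look like the following sketch (the database user is just a placeholder):

# Test the tunnel from a throwaway container on the default bridge; dbuser is a placeholder
docker run --rm -it mysql:5.6 mysql -h 172.17.0.1 -P 7002 -u dbuser -p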

Lightning fast integration tests with Docker, MySQL and tmpfs

Integration tests that involve database operations require tearing down and re-initializing a database multiple times. Although most developer machines offer an SSD and enough RAM, database initialization can still consume a considerable amount of time.

Docker allows defining volumes that are mounted directly into memory by using tmpfs. We can use this feature to speed up disk-heavy operations such as database imports by moving the data from the disk into memory.

The following example measures the time for writing 1GB worth of data to an SSD:

dd if=/dev/zero of=/tmp/output bs=1024k count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.22033 s, 484 MB/s


For comparison, the following steps create a RAM disk and write the data to memory.

$ sudo mount -o size=1G -t tmpfs none /tmp/tmpfs

$ dd if=/dev/zero of=/tmp/tmpfs/output bs=1024k count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.309017 s, 3.5 GB/s



As you can see, writing 1 GB to memory is about 7x faster. With the following docker run command, you can spin up a default MySQL container whose data directory resides in a tmpfs.

docker run -d \
   --rm \
   --name mysql-56-in-memory \
   -p 3307:3306 \
   --tmpfs /var/lib/mysql:rw,noexec,nosuid,size=1024m \
   -e MYSQL_ALLOW_EMPTY_PASSWORD=TRUE \
   -e MYSQL_DATABASE=dev \
   mysql:5.6



The arguments of docker run mean the following:

  * --rm: delete the container once it is stopped
  * --name: a name for the container
  * -p: map the host's port 3307 to port 3306 inside the container. This allows running multiple MySQL containers in parallel and connecting to them from the host via the specified port
  * --tmpfs: mounts the data directory of the container on a RAM disk. It should be writeable (rw). noexec prevents the execution of binaries, nosuid prevents set-user-ID bits from taking effect, and size specifies the space the tmpfs partition may occupy in memory. Adapt this to your use case. The minimum for MySQL is around 200 MB; add the space needed for your data, indices etc.
  * MYSQL\_ALLOW\_EMPTY_PASSWORD does what it implies
  * MYSQL_DATABASE defines the name of a database to be created

If you run this command you can connect to the container like this: _mysql -u root -h 127.0.0.1 -P 3307_

The container behaves like a normal MySQL database, except that the data is not persisted on a hard disk but only stored in ephemeral memory. If you stop the container, it will be removed by Docker, and if you reboot the machine the data will be gone. For obvious reasons this is only a good idea for test data that can be re-created at any time.

You can achieve the same also with Docker Compose if you would like to orchestrate multiple containers.

version: '3'
services:
    mysql-56-integration:
        container_name: mysql-56-in-memory
        restart: unless-stopped
        image: mysql:5.6
        environment:
            - MYSQL_ALLOW_EMPTY_PASSWORD='true'
            - MYSQL_HOST=''
        volumes:
            - data:/var/cache
        ports:
            - "3307:3306"

volumes:
    data:
        driver_opts:
            type: tmpfs
            device: tmpfs

Jupyter docker stacks with a custom user

Jupyter allows setting a custom user instead of **_jovyan_**, which is the default for all containers of the [Jupyter Docker Stack][1]. You need to change this user or its UID and GID in order to get the permissions right when you mount a volume from the host into the Jupyter container. The following steps are required:

  1. Create an unprivileged user and an associated group on the host. Here we call both the user and the group docker_worker
  2. Add your host user to the group. This gives you the permissions to modify and read the files also on the host. This is useful if your working directory on the host is under source code control (e.g. git)
  3. Launch the container with the correct settings that change the user inside the container

It is important to know that during the launch the container needs root privileges in order to change the settings in the mounted host volume and inside the container. After the permissions have been changed, the container switches to the new user and no longer runs with root privileges. Thus, make sure to secure your Docker service, as the permissions inside the container also apply to the host.

Prepare an unprivileged user on the host

1. sudo groupadd -g 1020 docker_worker
2. sudo useradd -s /bin/false -u 1010 -g 1020 docker_worker
3. Add your user to the group: sudo usermod -a -G docker_worker stefan


# Docker-compose Caveats

It is important to know that docker-compose supports either an array or a dictionary for environment variables ([docs][2]). In the example below we use the dictionary form, where YAML parses and strips the quotes around the values. If you use the array form and quote the values there, the quotes become part of the value and are passed along to the Jupyter start script. You would then see this error message:

/usr/local/bin/start-notebook.sh: ignoring /usr/local/bin/start-notebook.d/*
Set username to: docker_worker
Changing ownership of /home/docker_worker to 1010:1020
chown: invalid user: ''-R''
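
For illustration, the two notations could look like this (a sketch; with the dictionary form YAML strips the quotes, with the array form they become part of the value):

# Dictionary form: YAML parses the quotes, the value arrives as docker_worker
environment:
  NB_USER: 'docker_worker'

# Array form: the whole entry is one string, so the quotes are passed along verbatim
environment:
  - NB_USER='docker_worker'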

The docker-compose file

version: '2'
services:
    datascience-notebook:
        image: jupyter/base-notebook:latest
        volumes:
            - /tmp/jupyter_test_dir:/home/docker_worker/work            
        ports:
            - 8891:8888
        command: "start-notebook.sh"
        user: root
        environment:
          NB_USER: 'docker_worker'
          NB_UID: 1010
          NB_GID: 1020
          CHOWN_HOME: 'yes'
          CHOWN_HOME_OPTS: -R


Here you can see that we set the variables that cause the container to ditch jovyan in favor of docker_worker.

> NB_USER: 'docker_worker'  
> NB_UID: 1010  
> NB_GID: 1020  
> CHOWN_HOME: 'yes'  
> CHOWN_HOME_OPTS: -R

This facilitates easy version control of the working directory of Jupyter. I also added the snippet to my [GitHub Jupyter template][3].
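
To double-check the result, you can for example bring up the stack and look at the ownership of the mounted work directory on the host (a sketch, assuming the paths from the compose file above):

# Start the service defined in the compose file above
docker-compose up -d
# The mounted work directory should now be owned by UID 1010 / GID 1020 (docker_worker)
ls -ln /tmp/jupyter_test_dir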

<div class="twttr_buttons">
  <div class="twttr_twitter">
    <a href="http://twitter.com/share?text=Jupyter+docker+stacks+with+a+custom+user" class="twitter-share-button" data-via="" data-hashtags=""  data-size="default" data-url="https://blog.stefanproell.at/2018/08/08/jupyter-docker-stacks-with-a-custom-user/"  data-related="" target="_blank">Tweet</a>
  </div>
  
  <div class="twttr_followme">
    <a href="https://twitter.com/@stefanproell" class="twitter-follow-button" data-show-count="true" data-size="default"  data-show-screen-name="false"  target="_blank">Follow me</a>
  </div>
</div>

 [1]: https://github.com/jupyter/docker-stacks
 [2]: https://docs.docker.com/compose/compose-file/#environment
 [3]: https://github.com/stefanproell/jupyter-notebook-docker-compose/blob/master/README.md

Verifying Replication Consistency with Percona’s pt-table-checksum

Replication is an important concept for improving database performance and security. In this blog post, I would like to demonstrate how the consistency between a MySQL master and a slave can be verified. We will create two Docker containers, one for the master and one for the slave.

Installing the Percona Toolkit

The Percona Toolkit is a collection of useful utilities which can be obtained for free from the company's portal. The following commands install the prerequisites, download the package and finally install it.

sudo apt-get install -y wget libdbi-perl libdbd-mysql-perl libterm-readkey-perl libio-socket-ssl-perl
wget https://www.percona.com/downloads/percona-toolkit/3.0.4/binary/debian/xenial/x86_64/\
    percona-toolkit_3.0.4-1.xenial_amd64.deb
sudo dpkg -i percona-toolkit_3.0.4-1.xenial_amd64.deb 

Setting up a Test Environment with Docker

The following command creates and starts a docker container. Note that these are minimal examples and are not suitable for a serious environment.

docker run --name mysql_master -e MYSQL_ALLOW_EMPTY_PASSWORD=true -d mysql:5.6 --log-bin \
   --binlog-format=ROW --server-id=1


Get the IP address of the master container:

docker inspect mysql_master | grep IPAddress

"SecondaryIPAddresses": null,
"IPAddress": "172.17.0.2"

You can connect to this container like this and verify the server id:

stefan@Lenovo ~/Docker-Projects $ mysql -u root -h 172.17.0.2
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.35-log MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 1     |
+---------------+-------+
1 row in set (0,00 sec)

We repeat the command for the slave, but use a different server id and name:

docker run --name mysql_slave -e MYSQL_ALLOW_EMPTY_PASSWORD=true -d mysql:5.6 --server-id=2


For simplicity, we did not use Docker links, but will rather use IP addresses assigned by Docker directly.

## Replication Setup

First, we need to set up a user with replication privileges. This user will connect from the slave to the master.

On the host, interact with the master container

Get the IP address of the slave container

$ docker inspect mysql_slave | grep IPAddress
"SecondaryIPAddresses": null,
"IPAddress": "172.17.0.3",
"IPAddress": "172.17.0.3",

Log in to the MySQL console of the master

Grant permissions

GRANT REPLICATION SLAVE ON *.* TO replication@'172.17.0.3' IDENTIFIED BY 'SLAVE-SECRET';

Get the current binlog position

mysql> SHOW MASTER STATUS;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| mysqld-bin.000002 |      346 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0,00 sec)

Now log into the slave container and add the connection details for the master:

## Connect to the MySQL Slave instance
$ mysql -u root -h 172.17.0.3

### Setup the slave

mysql> CHANGE MASTER TO   
  MASTER_HOST='172.17.0.2',
  MASTER_PORT=3306,
  MASTER_USER='replication', 
  MASTER_PASSWORD='SLAVE-SECRET',
  MASTER_LOG_FILE='mysqld-bin.000002', 
  MASTER_LOG_POS=346;
Query OK, 0 rows affected, 2 warnings (0,05 sec)

### Start and check
mysql>   start slave;
Query OK, 0 rows affected (0,01 sec)

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.17.0.2
                  Master_User: percona
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysqld-bin.000002
          Read_Master_Log_Pos: 346
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 284
        Relay_Master_Log_File: mysqld-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Now our simple slave setup is running.

Get some test data

Let's download the Sakila test database and import it into the master. It will be replicated to the slave immediately.

wget http://downloads.mysql.com/docs/sakila-db.tar.gz
~/Docker-Projects $ tar xvfz sakila-db.tar.gz 

mysql -u root -h 172.17.0.2 < sakila-db/sakila-schema.sql 
mysql -u root -h 172.17.0.2 < sakila-db/sakila-data.sql


Verify that the data is on the slave as well:

mysql -u root -h 172.17.0.3 -e "USE sakila; SHOW TABLES;"
+----------------------------+
| Tables_in_sakila           |
+----------------------------+
| actor                      |
| actor_info                 |
| address                    |
| category                   |
| city                       |
| country                    |
| customer                   |
| ...                        |
| store                      |
+----------------------------+



After our setup is completed, we can proceed with Percona's pt-table-checksum.

# Percona pt-table-checksum

The Percona pt-table-checksum tool requires the connection information of the master and the slave in a specific format. This is called the DSN (data source name), which is a comma-separated string. We can store this information in a dedicated database called percona, in a table called DSN_Table. We create this table on the master. Note that the data gets replicated to the slave within the blink of an eye.

CREATE DATABASE percona;
USE percona;

CREATE TABLE DSN_Table (
  id int(11) NOT NULL AUTO_INCREMENT,
  dsn varchar(255) NOT NULL,
  PRIMARY KEY (id)
);



The next step involves granting permissions on the slave and the master!

GRANT REPLICATION SLAVE, PROCESS, SUPER, SELECT ON *.* TO 'percona'@'172.17.0.1' IDENTIFIED BY 'SECRET';
GRANT ALL PRIVILEGES ON percona.* TO 'percona'@'172.17.0.1';

The percona user is needed to run the script. Note that this time the IP address is that of the (Docker) host, which is 172.17.0.1 by default. In real world scenarios, this script would be run either on the master or on the slave directly.

Now we need to add the information about the slave to the table we just created. The Percona tool could also read this from the process list, but it is more reliable if we add the information ourselves. To do so, we insert a record which describes the slave DSN:

INSERT INTO percona.DSN_Table VALUES (1,'h=172.17.0.3,u=percona,p=SECRET,P=3306');


The pt-table-checksum tool then connects to the master instance and to the slave. It computes checksums of all databases and tables and compares the results. You can use the tool like this:

pt-table-checksum --replicate=percona.checksums --create-replicate-table --empty-replicate-table \
   --recursion-method=dsn=t=percona.DSN_Table -h 172.17.0.2 -P 3306 -u percona -pSECRET
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-10T10:13:11      0      0        0       1       0   0.020 mysql.columns_priv
09-10T10:13:11      0      0        3       1       0   0.016 mysql.db
09-10T10:13:11      0      0        0       1       0   0.024 mysql.event
09-10T10:13:11      0      0        0       1       0   0.014 mysql.func
09-10T10:13:11      0      0       40       1       0   0.026 mysql.help_category
09-10T10:13:11      0      0      614       1       0   0.023 mysql.help_keyword
09-10T10:13:11      0      0     1224       1       0   0.022 mysql.help_relation
09-10T10:13:12      0      0      585       1       0   0.266 mysql.help_topic
09-10T10:13:12      0      0        0       1       0   0.031 mysql.ndb_binlog_index
09-10T10:13:12      0      0        0       1       0   0.024 mysql.plugin
09-10T10:13:12      0      0        6       1       0   0.287 mysql.proc
09-10T10:13:12      0      0        0       1       0   0.031 mysql.procs_priv
09-10T10:13:12      0      1        2       1       0   0.020 mysql.proxies_priv
09-10T10:13:12      0      0        0       1       0   0.024 mysql.servers
09-10T10:13:12      0      0        0       1       0   0.017 mysql.tables_priv
09-10T10:13:12      0      0     1820       1       0   0.019 mysql.time_zone
09-10T10:13:12      0      0        0       1       0   0.015 mysql.time_zone_leap_second
09-10T10:13:12      0      0     1820       1       0   0.267 mysql.time_zone_name
09-10T10:13:13      0      0   122530       1       0   0.326 mysql.time_zone_transition
09-10T10:13:13      0      0     8843       1       0   0.289 mysql.time_zone_transition_type
09-10T10:13:13      0      1        4       1       0   0.031 mysql.user
09-10T10:13:13      0      0        1       1       0   0.018 percona.DSN_Table
09-10T10:13:13      0      0      200       1       0   0.028 sakila.actor
09-10T10:13:13      0      0      603       1       0   0.023 sakila.address
09-10T10:13:13      0      0       16       1       0   0.033 sakila.category
09-10T10:13:13      0      0      600       1       0   0.023 sakila.city
09-10T10:13:13      0      0      109       1       0   0.029 sakila.country
09-10T10:13:14      0      0      599       1       0   0.279 sakila.customer
09-10T10:13:14      0      0     1000       1       0   0.287 sakila.film
09-10T10:13:14      0      0     5462       1       0   0.299 sakila.film_actor
09-10T10:13:14      0      0     1000       1       0   0.027 sakila.film_category
09-10T10:13:14      0      0     1000       1       0   0.032 sakila.film_text
09-10T10:13:14      0      0     4581       1       0   0.276 sakila.inventory
09-10T10:13:15      0      0        6       1       0   0.030 sakila.language
09-10T10:13:15      0      0    16049       1       0   0.303 sakila.payment
09-10T10:13:15      0      0    16044       1       0   0.310 sakila.rental
09-10T10:13:15      0      0        2       1       0   0.029 sakila.staff
09-10T10:13:15      0      0        2       1       0   0.020 sakila.store



The result shows a difference in the MySQL internal tables for permissions. This is obviously not what we are interested in, as permissions are individual to a host. So we exclude the MySQL internal database and also the percona database. Also, in order to test that the tool works, we delete the last five category assignments from the table with mysql -u root -h 172.17.0.3 -e "DELETE FROM sakila.film_category WHERE film_id > 995;" and update a row in the city table with

mysql -u root -h 172.17.0.3 -e "update sakila.city SET city='Innsbruck' WHERE city_id=590;"

Now execute the command again:

pt-table-checksum --replicate=percona.checksums --create-replicate-table --empty-replicate-table \
   --recursion-method=dsn=t=percona.DSN_Table --ignore-databases mysql,percona -h 172.17.0.2 -P 3306 -u percona -pSECRET
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-10T10:46:33      0      0      200       1       0   0.017 sakila.actor
09-10T10:46:34      0      0      603       1       0   0.282 sakila.address
09-10T10:46:34      0      0       16       1       0   0.034 sakila.category
09-10T10:46:34      0      1      600       1       0   0.269 sakila.city
09-10T10:46:34      0      0      109       1       0   0.028 sakila.country
09-10T10:46:34      0      0      599       1       0   0.285 sakila.customer
09-10T10:46:35      0      0     1000       1       0   0.297 sakila.film
09-10T10:46:35      0      0     5462       1       0   0.294 sakila.film_actor
09-10T10:46:35      0      1     1000       1       0   0.025 sakila.film_category
09-10T10:46:35      0      0     1000       1       0   0.031 sakila.film_text
09-10T10:46:35      0      0     4581       1       0   0.287 sakila.inventory
09-10T10:46:35      0      0        6       1       0   0.035 sakila.language
09-10T10:46:36      0      0    16049       1       0   0.312 sakila.payment
09-10T10:46:36      0      0    16044       1       0   0.320 sakila.rental
09-10T10:46:36      0      0        2       1       0   0.030 sakila.staff
09-10T10:46:36      0      0        2       1       0   0.027 sakila.store

You can see that there is a difference in the tables sakila.city and sakila.film_category. The tool does not report the actual number of differences, but rather the number of differing chunks. To get the actual differences, we need to use a different tool, which utilises the checksum table that the previous step created.
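
Since pt-table-checksum stores its results in the percona.checksums table, you can also look at the differing chunks directly on the slave. A sketch of such a query, assuming the default layout of the checksum table:

# Run against the slave: list chunks whose checksum or row count differs from the master
mysql -u root -h 172.17.0.3 -e "SELECT db, tbl, chunk, this_cnt, master_cnt FROM percona.checksums WHERE this_crc <> master_crc OR this_cnt <> master_cnt;"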

Show the differences with pt-table-sync

The pt-table-sync tool is the counterpart to the pt-table-checksum utility. It can print or even replay the SQL statements that would bring the slave back in sync with the master. We run a dry-run first, as the tool is potentially dangerous.

pt-table-sync --dry-run  --replicate=percona.checksums --sync-to-master h=172.17.0.3 -P 3306 \
   -u percona -pSECRET --ignore-databases mysql,percona
# NOTE: --dry-run does not show if data needs to be synced because it
#       does not access, compare or sync data.  --dry-run only shows
#       the work that would be done.
# Syncing via replication P=3306,h=172.17.0.3,p=...,u=percona in dry-run mode, without accessing or comparing data
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      0      0 Chunk     08:57:51 08:57:51 0    sakila.city
#      0       0      0      0 Nibble    08:57:51 08:57:51 0    sakila.film_category

With --dry-run, you only see affected tables, but not the actual data, because the tool does not really access the tables in question. Use --print additionally or instead of --dry-run to get a list:

pt-table-sync --print --replicate=percona.checksums --sync-to-master h=172.17.0.3 -P 3306 \
  -u percona -pSECRET --ignore-databases mysql,percona
REPLACE INTO `sakila`.`city`(`city_id`, `city`, `country_id`, `last_update`) VALUES \
   ('590', 'Yuncheng', '23', '2006-02-15 04:45:25') 
  \ /*percona-toolkit src_db:sakila src_tbl:city  ...
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ... 
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('997',... 
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('998', '11 ...
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('999', '3', ...
REPLACE INTO `sakila`.`film_category`(`film_id`, `category_id`, `last_update`) VALUES ('1000', '5', ... 

The output shows how we can change the city name back from Innsbruck to Yuncheng and also provides the REPLACE statements to get the deleted records back. When we replace --print with --execute, the data gets written to the master and replicated to the slave. To allow this, we need to set the permissions on the master:

GRANT INSERT, UPDATE, DELETE ON sakila.* TO 'percona'@'172.17.0.1';
pt-table-sync --execute  --replicate=percona.checksums --check-child-tables \ 
  --sync-to-master h=172.17.0.3 -P 3306 -u percona -pSECRET --ignore-databases mysql,percona
REPLACE statements on sakila.city can adversely affect child table `sakila`.`address` 
   because it has an ON UPDATE CASCADE foreign key constraint. 
   See --[no]check-child-tables in the documentation for more information. 
   --check-child-tables error  while doing sakila.city on 172.17.0.3

This error indicates that updating the city table has consequences, because child tables reference it via a foreign key. In this example, we are bold and ignore this warning. This is absolutely not recommended for real world scenarios.

pt-table-sync --execute  --replicate=percona.checksums --no-check-child-tables \
   --no-foreign-key-checks --sync-to-master h=172.17.0.3 -P 3306 -u percona -pSECRET \ 
   --ignore-databases mysql,percona

The option --no-check-child-tables skips the child table check and --no-foreign-key-checks disables foreign key checks.

Run the checksum command again to verify that the data has been restored:

pt-table-checksum --replicate=percona.checksums --create-replicate-table --empty-replicate-table \ 
   --recursion-method=dsn=t=percona.DSN_Table --ignore-databases mysql,percona 
   -h 172.17.0.2 -P 3306 -u percona -pSECRET

            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
09-10T11:24:42      0      0      200       1       0   0.268 sakila.actor
09-10T11:24:42      0      0      603       1       0   0.033 sakila.address
09-10T11:24:42      0      0       16       1       0   0.029 sakila.category
09-10T11:24:42      0      0      600       1       0   0.275 sakila.city
09-10T11:24:42      0      0      109       1       0   0.023 sakila.country
09-10T11:24:43      0      0      599       1       0   0.282 sakila.customer
09-10T11:24:43      0      0     1000       1       0   0.046 sakila.film
09-10T11:24:43      0      0     5462       1       0   0.284 sakila.film_actor
09-10T11:24:43      0      0     1000       1       0   0.036 sakila.film_category
09-10T11:24:43      0      0     1000       1       0   0.279 sakila.film_text
09-10T11:24:44      0      0     4581       1       0   0.292 sakila.inventory
09-10T11:24:44      0      0        6       1       0   0.031 sakila.language
09-10T11:24:44      0      0    16049       1       0   0.309 sakila.payment
09-10T11:24:44      0      0    16044       1       0   0.325 sakila.rental
09-10T11:24:44      0      0        2       1       0   0.029 sakila.staff
09-10T11:24:44      0      0        2       1       0   0.028 sakila.store

0 DIFFS, we are done!

Persistent Data in a MySQL Docker Container

Running MySQL in Docker

In a recent article on Docker in this blog, we presented some basics for dealing with data in containers. This article will present another popular application for Docker: MySQL containers. Running MySQL instances in Docker allows isolating database infrastructure with ease.

Connecting to the Standard MySQL Container

The description of the MySQL Docker image provides a lot of useful information on how to launch and connect to a MySQL container. The first step is to create a standard MySQL container from the latest available image.

sudo docker run \
   --name=mysql-instance \
   -e MYSQL_ROOT_PASSWORD=secret \
   -p 3307:3306 \
   -d \
   mysql:latest

This creates a MySQL container where the root password is set to secret. As the host is already running its own MySQL instance (which has nothing to do with this Docker example), the standard port 3306 is already taken. Thus we publish port 3307 on the host system and forward it to the standard port 3306 inside the container.
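
You can verify the mapping with docker port, for example:

# Show which host port is mapped to the container's MySQL port (should print 0.0.0.0:3307)
sudo docker port mysql-instance 3306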

Connect from the Host

We can then connect from the command line like this:

mysql -uroot -psecret -h 127.0.0.1 -P3307

We could also provide the hostname localhost for connecting to the container, but as the MySQL client by default assumes that a localhost connection goes via a socket, this would not work. Thus, when using the hostname localhost, we need to specify the TCP protocol so that the client connects via the network interface.

mysql -uroot -psecret -h localhost --protocol TCP -P3307

Connect from other Containers

Connecting from a different container to the MySQL container is pretty straightforward. Docker allows linking two containers and then using the exposed ports between them. The following command creates a new Ubuntu container and links it to the MySQL container.

sudo docker run -it --name ubuntu-container --link mysql-instance:mysql-link ubuntu:16.10 bash

After this command, you are in the terminal of the Ubuntu container. We then need to install the MySQL client for testing:

# Fetch the package list
root@7a44b3e7b088:/# apt-get update
# Install the client
root@7a44b3e7b088:/# apt-get install mysql-client
# Show environment variables
root@7a44b3e7b088:/# env

The last command gives you a list of environment variables, among which is the IP address and port of the MySQL container.

MYSQL_LINK_NAME=/ubuntu-container/mysql-link
HOSTNAME=7a44b3e7b088
TERM=xterm
MYSQL_LINK_ENV_MYSQL_VERSION=5.7.14-1debian8
MYSQL_LINK_PORT=tcp://172.17.0.2:3306
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYSQL_LINK_PORT_3306_TCP_ADDR=172.17.0.2
MYSQL_LINK_PORT_3306_TCP=tcp://172.17.0.2:3306
PWD=/
MYSQL_LINK_PORT_3306_TCP_PORT=3306
SHLVL=1
HOME=/root
MYSQL_LINK_ENV_MYSQL_MAJOR=5.7
MYSQL_LINK_PORT_3306_TCP_PROTO=tcp
MYSQL_LINK_ENV_GOSU_VERSION=1.7
MYSQL_LINK_ENV_MYSQL_ROOT_PASSWORD=secret
_=/usr/bin/env

You can then connect either manually or by using the variables:

mysql -uroot -psecret -h 172.17.0.2
mysql -uroot -p$MYSQL_LINK_ENV_MYSQL_ROOT_PASSWORD -h $MYSQL_LINK_PORT_3306_TCP_ADDR -P $MYSQL_LINK_PORT_3306_TCP_PORT

If you only require a MySQL client inside a container, simply use the MySQL image from docker. Batteries included!
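
A minimal sketch of that idea, reusing the link mechanism from above:

# Use the client bundled with the official MySQL image instead of installing one in Ubuntu
sudo docker run -it --rm --link mysql-instance:mysql-link mysql:latest \
   mysql -h mysql-link -uroot -psecret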

Persistent Docker Containers

Docker Fundamentals

Docker has become a very popular tool for orchestrating services. Docker is much more lightweight than virtual machines; for instance, containers do not require a boot process. Docker follows the philosophy that one container serves only one process, so in contrast to virtual machines, which often bundle several services together, Docker is built for running single services per container. If you come from the world of virtual machines, Docker can be a bit confusing in the beginning, because it uses its own terminology. A good point to start is, as always, the documentation, and there are plenty of great tutorials out there.

Images and Containers

Docker images serve as templates for containers. As images and containers both have hexadecimal ids, they are easy to confuse. The following example shows step by step how to create a new container based on the Debian image and how to open shell access.

# Create a new docker container based on the debian image
sudo docker create -t --name debian-test debian:stable 
# Start the container
sudo docker start  debian-test
# Check if the container is running
sudo docker ps -a
# Execute bash to get an interactive shell
sudo docker exec -i -t debian-test bash

A shorter variant of creating and launching a new container is listed below. The run command creates a new container and starts it automatically. Be aware that this creates a new container every time, so assigning a container name helps with not confusing the image with the container. The command run is in particular tricky, as you would expect it to run (i.e. launch) a container only. In fact, it creates a new one and starts it.

sudo docker run -it --name debian-test debian:stable bash

Important Commands

The following listing shows the most important commands:

# Show container status
sudo docker ps -a
# List available images
sudo docker images 
# Start or stop a container
sudo docker start CONTAINERNAME
sudo docker stop CONTAINERNAME
# Delete a container
sudo docker rm CONTAINERNAME

You can of course create your own images, which will not be discussed in this blog post. It is just important to know that you can't move containers from your host to some other machine directly. You would need to commit the changes made to the container into a new image and create a new container based on that image. Please be aware that this does not include data stored in mounted volumes! You need to manually export any such data and files from the original container and import them into the new container again. This is another trap worth noting. You can, however, also add data to the image, if the data is available on the host at the time of image creation. Details on data in containers can be found here.
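
For illustration, a rough sketch of that workflow (the config file path is just an example):

# Persist the current state of the container as a new image (volume data is NOT included)
sudo docker commit debian-test my-debian-snapshot
# Copy files you still need out of the old container before removing it
sudo docker cp debian-test:/etc/example.conf ./example.conf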

Persisting Data Across Containers

The way Docker persists data takes some getting used to in the beginning, especially as it is easy to confuse images with containers. Remember that Docker images serve only as templates. So when you issue the command sudo docker run ..., this actually creates a container from an image first and then starts it. Whenever you issue this command again, you will end up with a new container which does not share any data with the previously created container.
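
You can see this by writing a file in one run and looking for it in a second run, for example:

# First run: create a file inside a fresh container
sudo docker run --rm debian:stable bash -c 'touch /tmp/hello && ls /tmp'
# Second run: a brand-new container is created, the file from the first run is gone
sudo docker run --rm debian:stable ls /tmp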

Docker 1.9 introduced named volumes, which allow creating dedicated data volumes that can be used by several containers. Named volumes can be used for persisting data. The following listing shows how to create a data volume and mount it in a container.

# Create a data volume
sudo docker volume create --name data-volume-test
# List all volumes
sudo docker volume ls
# Delete the container
sudo docker rm debian-test
# Create a new container, now with the data volume 
sudo docker create -v data-volume-test:/test-data -t --name debian-test debian:stable
# Start the container
sudo docker start debian-test
# Get the shell
sudo docker exec -i -t debian-test bash

After we log into the shell, we can see the data volume we mounted at the directory /test-data:

root@d4ac8c89437f:/# ls -la
total 76
drwxr-xr-x  28 root root 4096 Aug  3 13:11 .
drwxr-xr-x  28 root root 4096 Aug  3 13:11 ..
-rwxr-xr-x   1 root root    0 Aug  3 13:10 .dockerenv
drwxr-xr-x   2 root root 4096 Jul 27 20:03 bin
drwxr-xr-x   2 root root 4096 May 30 04:18 boot
drwxr-xr-x   5 root root  380 Aug  3 13:11 dev
drwxr-xr-x  41 root root 4096 Aug  3 13:10 etc
drwxr-xr-x   2 root root 4096 May 30 04:18 home
drwxr-xr-x   9 root root 4096 Nov 27  2014 lib
drwxr-xr-x   2 root root 4096 Jul 27 20:02 lib64
drwxr-xr-x   2 root root 4096 Jul 27 20:02 media
drwxr-xr-x   2 root root 4096 Jul 27 20:02 mnt
drwxr-xr-x   2 root root 4096 Jul 27 20:02 opt
dr-xr-xr-x 267 root root    0 Aug  3 13:11 proc
drwx------   2 root root 4096 Jul 27 20:02 root
drwxr-xr-x   3 root root 4096 Jul 27 20:02 run
drwxr-xr-x   2 root root 4096 Jul 27 20:03 sbin
drwxr-xr-x   2 root root 4096 Jul 27 20:02 srv
dr-xr-xr-x  13 root root    0 Aug  3 13:11 sys
drwxr-xr-x   2 root root 4096 Aug  3 08:26 test-data
drwxrwxrwt   2 root root 4096 Jul 27 20:03 tmp
drwxr-xr-x  10 root root 4096 Jul 27 20:02 usr
drwxr-xr-x  11 root root 4096 Jul 27 20:02 var


We can navigate into that folder and create a 100 MB data file with random data.

root@d4ac8c89437f:~# cd /test-data/
root@d4ac8c89437f:/test-data# dd if=/dev/urandom of=100M.dat bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 6.69175 s, 15.7 MB/s
root@d4ac8c89437f:/test-data# du -h .
101M	.



When we exit the container, we can see the file in the host file system here:

stefan@stefan-desktop:~$ sudo ls -l /var/lib/docker/volumes/data-volume-test/_data
total 102400
-rw-r--r-- 1 root root 104857600 Aug  3 15:17 100M.dat

We can use this volume transparently in the container, but it does not depend on the container itself. So whenever we have to delete the container or want to use the data with a different container, this solution works perfectly. The following command shows how we mount the same volume in an Ubuntu container and execute the ls command to show the content of the directory.

stefan@stefan-desktop:~$ sudo docker run -it -v data-volume-test:/test-data-from-debian --name ubuntu-test ubuntu:16.10 ls -l /test-data-from-debian
total 102400
-rw-r--r-- 1 root root 104857600 Aug  3 13:17 100M.dat

You can display a lot of useful information about a container with the inspect command. It also shows the data volume and where it is mounted.

sudo docker inspect ubuntu-test

...
        "Mounts": [
            {
                "Name": "data-volume-test",
                "Source": "/var/lib/docker/volumes/data-volume-test/_data",
                "Destination": "/test-data-from-debian",
                "Driver": "local",
                "Mode": "z",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
...


We delete the ubuntu container and create a new one. We then start the container, open a bash session and write some test data into the directory.

stefan@stefan-desktop:~$ sudo docker create -v data-volume-test:/test-data-ubuntu -t --name ubuntu-test ubuntu:16.10
f3893d368e11a32fee9b20079c64494603fc532128179f0c08d10321c8c7a166
stefan@stefan-desktop:~$ sudo docker start ubuntu-test
ubuntu-test
stefan@stefan-desktop:~$ sudo docker exec -it ubuntu-test bash
root@f3893d368e11:/# cd /test-data-ubuntu/
root@f3893d368e11:/test-data-ubuntu# ls
100M.dat
root@f3893d368e11:/test-data-ubuntu# touch ubuntu-writes-a-file.txt



When we check the Debian container, we can immediately see the written file, as the volume is transparently mounted.

stefan@stefan-desktop:~$ sudo docker exec -i -t debian-test ls -l /test-data
total 102400
-rw-r--r-- 1 root root 104857600 Aug  3 13:17 100M.dat
-rw-r--r-- 1 root root         0 Aug  3 13:42 ubuntu-writes-a-file.txt

Please be aware that the Docker volume is just a regular folder on the file system. Writing to the same file from both containers can lead to data corruption. Also remember that you can read and write the volume files directly from the host system.

Backups and Migration

Backing up data is also an important aspect when you use named data volumes as shown above. Currently, there is no way of moving Docker containers or volumes natively to a different host. The intention of Docker is to make the creation and destruction  of containers very cheap and easy. So you should not get too attached to your containers, because you can re-create them very fast. This of course is not true for the data stored in volumes. So you need to take care of your data yourself, for instance by creating automated backups like this sudo tar cvfz Backup-data-volume-test.tar.gz /var/lib/docker/volumes/data-volume-test and re-store the data when needed in a new volume. How to backup volumes using a container is described here.