mysqldump is a tool to create dump of existing mysql/mariadb database(s). The generated dump usually contains SQL statements but also can be generated in csv, xml format. The dump is commonly used for backup purpose or to transfer db schema & contents to different server.
Recently I have started working on an old projects that has some bash scripts to maintain backups every night. By monitoring, I noticed that the website stays unresponsive for few minutes while the DB backup script runs. Being curious I decided to find the culprit. (spoilers: it was me years ago, who wrote the script, so obviously I am the villain)
First, I tried running the script and monitored CPU/memory usage. None of it looked too bad to cease the site performance. Maybe inside the script, there is certain command that is responsible for it? I started reading the script contents and found the suspect.
mysqldump --defaults-extra-file=/credential.my.cnf --all-databases | gzip > backup.sql.gz
Note: --defaults-extra-file flag takes path to a file that contains mysql user credentials. This is because we do not want to pass our mysql password via command parameters, and you shouldn't too.
What does this line do? First, mysqldump connects to mysql server and generates dump of all databases. The dump response is piped to gzip. Gzip accepts any line of contents whenever mysqldump generates and compresses it. Then the compressed content is written to a file "backup.sql.gz".
Anyway, this is the line which making our site unresponsive. The whole dumping and gzip is taking around five minutes to complete. I found that our tables are being locked during that time. Aha, so that's why our site struggles because it needs to access DB but can't because of locked state.
So, what's the solution?
I started digging through mysql/mariaDB documentation for mysqldump and its' flags. This doc recommends to use both --single-transaction --quick flags when dumping large databases. Let's try that.
First, let's see how much time does our current dump command takes:
time mysqldump --defaults-extra-file=/credential.my.cnf --all-databases real 1m02.28s
Notice that I've added time before mysqldump. time is a command that measures execution time of any command and prints it. So as we can see, current mysqldump took around 1 minute to finish.
Now lets run with --single-transaction --quick
time mysqldump --defaults-extra-file=/credential.my.cnf --all-databases --single-transaction --quick real 1m00.10s
This time, mysqldump took 1 minute too. Not much improvement on speed. But I noticed that our site wasn't down this time. Why so? Let's talk about those two flags:
After adding those two flags, I ran the full script again. Dump was successful and our site was up too.. YAY.
But I found that mysqldump is still running for 5 minutes and that made me uncomfortable even though it doesn't seem to pose any threat.
time mysqldump --defaults-extra-file=/credential.my.cnf --all-databases --single-transaction --quick | gzip > backup.sql.gz real 5m39.84s
Why the whole command takes 5 minutes while mysqldump alone takes only 1 minute? Notice that we are piping mysqldump result into gzip and then writing to a backup.sql.gz file.
This StackOverflow answer, pointed the problem with piping to gzip. When we are piping mysqldump into gzip, what happens is, whenever mysqldump generates a single line of content, gzip accepts that and compresses that and then write to the file. The whole dumping and compressing runs in parallel.
Since our issue is related to mysqldump, I would like to finish that command as soon as possible, without waiting for complex process of gzip.
So, instead I told mysqldump to write to a plain .sql file and then ran gzip separately.
time mysqldump --defaults-extra-file=/credential.my.cnf --all-databases --single-transaction --quick > backup.sql && gzip backup.sql real 1m1.228s
This way, mysqldump took only 1 minute to generate .sql file and it exits. Then bash launches gzip to compress that .sql file. That && operator ensures that gzip runs only if mysqldump completes successfully. Gzip is still running for around 5 minutes but it doesn't create any lag for mysqldump anymore.
And that concludes todays debugging session to identify mysqldump performance issue and solution to that. I'll report back if I find anymore info on this issue. Cheers.