
Become a ClusterControl DBA - Deploying your Databases and Clusters


Many of our users speak highly of our product ClusterControl, especially how easy it is to install the software package. Installing new software is one thing, but using it properly is another.

We are all impatient to test new software, and would rather toy around with an exciting new application than read the documentation up front. That is a bit unfortunate, as you may miss the most important features or discover ways of doing things yourself instead of reading up on how to do things the easy way.

This new blog series will cover all the basic operations of ClusterControl for MySQL, MongoDB & PostgreSQL, with examples explaining how to perform them, how to make the most of your setup, and a deep dive per subject to save you time.

These are the topics we'll cover in this series:

  • Deploying the first clusters
  • Adding your existing infrastructure
  • Performance and health monitoring
  • Making your components HA
  • Workflow management
  • Safeguarding your data
  • Protecting your data
  • In-depth use case

In today’s post we cover installing ClusterControl and deploying your first clusters. 

Preparations

In this series we will make use of a set of Vagrant boxes, but you can use your own infrastructure if you like. In case you do want to test it with Vagrant, we made an example setup available in the following GitHub repository:
https://github.com/severalnines/vagrant

Clone the repo to your own machine:

git clone git@github.com:severalnines/vagrant.git

The topology of the vagrant nodes is as follows:

  • vm1: clustercontrol
  • vm2: database node1
  • vm3: database node2
  • vm4: database node3

Obviously you can easily add additional nodes if you like, by changing the following line in the Vagrantfile:

4.times do |n|

The Vagrantfile is configured to automatically install ClusterControl on the first node and forward the ClusterControl user interface to port 8080 on the host that runs Vagrant. So if your host’s IP address is 192.168.1.10, you will find the ClusterControl UI here: http://192.168.1.10:8080/clustercontrol/
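
Bringing up the whole environment is then a matter of starting the boxes (a minimal sketch, assuming a standard Vagrant workflow with VirtualBox or another provider installed):

$ cd vagrant
$ vagrant up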

Installing ClusterControl

You can skip this section if you chose to use the Vagrantfile and got the automatic installation for free. Installing ClusterControl is easy and will take less than five minutes of your time.

With the package installation all you have to do is issue the following three commands on the ClusterControl node to get it installed:

$ wget http://www.severalnines.com/downloads/cmon/install-cc
$ chmod +x install-cc
$ ./install-cc   # as root or sudo user

That’s it: it can’t get easier than this. If the installation script did not encounter any issues, ClusterControl has been installed and is up and running. You can now log into ClusterControl at the following URL:
http://192.168.1.210/clustercontrol
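
If you want a quick sanity check from the command line before opening the UI, you can look at the controller service and its log (assuming the cmon service name and default log location set up by the installer; these may differ on your system):

$ sudo service cmon status
$ tail -n 20 /var/log/cmon.log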

After creating an administrator account and logging in, you will be prompted to add your first cluster.

Deploy a Galera cluster

If you have installed ClusterControl via the package installation, or when there are no clusters defined in ClusterControl yet, you will be prompted to create a new database server/cluster or add an existing (i.e., already deployed) server or cluster:

[Screenshot: add new cluster or node dialogue]

In this case we are going to deploy a Galera cluster and this only requires one screen to fill in:

[Screenshot: add Galera cluster dialogue]

To allow ClusterControl to install the Galera nodes, we use the root user that was granted SSH access by the Vagrant bootstrap scripts. In case you chose to use your own infrastructure, you must enter a user here that is allowed to do passwordless SSH to the nodes that ClusterControl is going to control, as shown below.
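
Setting up such passwordless SSH access boils down to generating a key pair on the ClusterControl host and copying the public key to each database node (a minimal sketch; the vm2-vm4 hostnames are placeholders for your own nodes):

$ ssh-keygen -t rsa    # press Enter to accept the defaults
$ ssh-copy-id root@vm2
$ ssh-copy-id root@vm3
$ ssh-copy-id root@vm4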

Also make sure you disable AppArmor/SELinux. See here why.

After you have filled in all the details and clicked Deploy, a job will be spawned to build the new cluster. The nice thing is that you can keep track of the progress of this job by clicking on the spinning circle between the messages and settings icons in the top menu bar:

[Screenshot: job progress indicator in the top menu bar]

Clicking on this icon will open a popup that keeps you updated on the progress of your job.

[Screenshot: add Galera cluster job progress (1)]

[Screenshot: add Galera cluster job progress (2)]

Once the job has finished, you have just created your first cluster. The cluster overview should look like this:

[Screenshot: cluster overview]

In the nodes tab, you can perform almost any operation you would normally want to do on a cluster, and more. The query monitor gives you a good overview of both running and top queries. The performance tab helps you keep a close eye on the performance of your cluster and also features the advisors that help you act proactively on trends in the data. The backups tab enables you to easily schedule backups that are stored either on the DB nodes or on the controller host, and the manage tab enables you to expand your cluster or make it highly available for your applications through a load balancer.

All this functionality will be covered in later blog posts in this series.

Deploy a MySQL replication set

A new feature in ClusterControl 1.2.11 is that you can not only add slaves to existing clusters/nodes but you can also create new masters. In order to create a new replication set, the first step would be creating a new MySQL master:

[Screenshot: add MySQL master dialogue]

After the master has been created, you can deploy a MySQL slave via the “Add Node” option in the cluster list:

[Screenshot: add MySQL slave dialogue]

Keep in mind that adding a slave to a master requires the master’s configuration to be stored in the ClusterControl repository. This happens automatically, but it takes a minute for the configuration to be imported and stored. After adding the slave node, ClusterControl will provision the slave with a copy of the data from its master using Xtrabackup. Depending on the size of your data, this may take a while.
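
Once the slave has been provisioned, you can verify on the slave itself that replication is running, using standard MySQL commands (both replication threads should report Yes):

$ mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'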

Deploy a PostgreSQL replication set

Creating a PostgreSQL cluster requires one extra step compared to creating a Galera cluster, as it is divided into first adding a standalone PostgreSQL server and then adding a slave. This two-step approach lets you decide which server will become the master and which one becomes the slave.

A side note is that the supported PostgreSQL version is 9.x and higher. Make sure the correct version gets installed by adding the correct repositories from PostgreSQL: http://www.postgresql.org/download/linux/
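
For example, on Ubuntu you could add the PostgreSQL Global Development Group (PGDG) repository like this before deploying (a sketch following the instructions on the page linked above; the exact steps differ per distribution):

$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
$ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
$ sudo apt-get update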

First we create a master by deploying a standalone PostgreSQL server:

[Screenshot: deploy a standalone PostgreSQL server]

After deploying, the first node will become available in the cluster list as a single node instance. 

You could open the cluster overview and then add the slave, but the cluster list also gives you the option to immediately add a replication slave to this cluster:

[Screenshot: adding a PostgreSQL replication slave from the cluster list]

And adding a slave is as simple as selecting the master and filling in the FQDN for the new slave:

[Screenshot: add PostgreSQL slave dialogue]
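
To double-check from the master that the new slave is streaming, you can query the pg_stat_replication view (standard PostgreSQL 9.1+; run as the postgres user):

$ psql -c 'SELECT client_addr, state, sync_state FROM pg_stat_replication;'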

The PostgreSQL cluster overview gives you a good insight into your cluster:

[Screenshot: PostgreSQL cluster overview]

Just like with the Galera and MySQL cluster overviews, you can find all the necessary tabs and functions here: the query monitor, performance and backups tabs enable you to perform the necessary operations.

Deploy a MongoDB replicaSet

Deploying a new MongoDB replicaSet is similar to PostgreSQL. First we create a new master node:

[Screenshot: add MongoDB master dialogue]

After installing the master we can add a slave to the replicaSet using the same dropdown from the cluster overview:

[Screenshot: add node dropdown in the cluster overview]

Keep in mind that you need to select the saved Mongo template here to start replicating from a replicaSet; in this case, select the mongod.conf.shardsvr configuration.

[Screenshot: add MongoDB slave dialogue]

After adding the slave to the MongoDB replicaSet, a job will be spawned. Once this job has finished, it will take a short while before MongoDB adds the node to the cluster and it becomes visible in the cluster overview.

[Screenshot: MongoDB cluster overview]
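
You can follow this from the MongoDB side as well by checking the replica set status in the mongo shell (standard MongoDB commands; the new member should eventually reach the SECONDARY state):

$ mongo --eval 'printjson(rs.status())'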

Similar to the PostgreSQL, Galera and MySQL cluster overviews, you can find all the necessary tabs and functions here: the query monitor, performance and backups tabs enable you to perform the necessary operations.

Final thoughts

With these three examples we have shown you how easy it is to set up new clusters for MySQL, MongoDB and PostgreSQL from scratch, in only a couple of minutes. The beauty of this Vagrant setup is that you can take the environment down and spawn it again just as easily as you created it. Impress your colleagues with how easily you can set up a working environment and convince them to use it as their own test or devops environment.

Of course it would be equally interesting to add existing hosts and clusters into ClusterControl and that’s what we'll cover next time.



Become a ClusterControl DBA: Adding Existing Databases and Clusters


In our previous blog post we covered the deployment of four types of clustering/replication: MySQL Galera, MySQL master-slave replication, PostgreSQL replication set and MongoDB replica set. This should enable you to create new clusters with great ease, but what if you already have 20 replication setups deployed and wish to manage them with ClusterControl?

This blog post will cover adding existing infrastructure components for these four types of clustering/replication to ClusterControl and how to have ClusterControl manage them.

Adding an existing Galera cluster to ClusterControl

Adding an existing Galera cluster to ClusterControl requires a MySQL user with the proper grants, and an SSH user that is able to log in (without password) from the ClusterControl node to your existing databases and clusters.
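
If you still need to create such a MySQL user, it could look like the following (a hypothetical example - the 'cmon' user name, password and ClusterControl IP address 10.0.0.10 are placeholders; consult the ClusterControl documentation for the exact grants required):

mysql> GRANT ALL PRIVILEGES ON *.* TO 'cmon'@'10.0.0.10' IDENTIFIED BY 'cmonP4ss' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;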
 
Install ClusterControl on a separate VM. Once it is up, open the dialogue for adding an existing cluster. All you have to do is add one of the Galera nodes and ClusterControl will figure out the rest:

[Screenshot: add existing Galera cluster dialogue]

Behind the scenes, ClusterControl will then connect to this host, detect all the necessary details for the full cluster and register the cluster in the overview.

Adding an existing MySQL master-slave to ClusterControl

Adding an existing MySQL master-slave topology requires a bit more work than adding a Galera cluster. Whereas ClusterControl is able to extract the necessary information by itself for Galera, in the case of master-slave you need to specify every host within the replication setup.

[Screenshot: add existing MySQL master-slave dialogue]

After this, ClusterControl will connect to every host, see if they are part of the same topology and register them as part of one cluster (or server group) in the GUI.

Adding an existing PostgreSQL replication set to ClusterControl

Similar to adding the MySQL master-slave above, the PostgreSQL replication set also requires you to fill in all hosts within the same replication set.

[Screenshot: add existing PostgreSQL replication set dialogue]

After this, ClusterControl will connect to every host, see if they are part of the same topology and register them as part of the same group. 

Adding an existing MongoDB replica set to ClusterControl

Adding an existing MongoDB replica set is just as easy as adding a Galera cluster: only one of the hosts in the replica set needs to be specified, with its credentials, and ClusterControl will automatically discover the other nodes in the replica set.

[Screenshot: add existing MongoDB replica set dialogue]

Expanding your existing infrastructure

After adding the existing databases and clusters, they have now become manageable via ClusterControl, and thus we can scale out our clusters.

For MySQL, MongoDB and PostgreSQL replication sets, this can easily be achieved in the same way we showed in our previous blog post: simply add a node and ClusterControl will take care of the rest.

[Screenshot: adding a replication slave]

For Galera, there is a bit more choice. The most obvious option is to add a (Galera) node to the cluster by simply choosing “add node” in the cluster list or cluster overview. Expanding your Galera cluster this way should happen in increments of two, to ensure your cluster can always retain a majority during a split-brain situation.

Alternatively, you could add a replication slave and thus create an asynchronous slave to your synchronous cluster, which looks like this:

[Diagram: an asynchronous replication slave attached to a synchronous Galera cluster]

Adding a slave node blindly under one of the Galera nodes can be dangerous: if this node goes down, the slave won’t receive updates from its master anymore. We blogged about this paradigm earlier and you can read how to solve it in this blog post.

Final thoughts

We showed you how easy it is to add existing databases and clusters to ClusterControl: you can literally add clusters within minutes. So nothing should hold you back from using ClusterControl to manage your existing infrastructure. If you have a large infrastructure, the addition of ClusterControl will give you more overview, and save time in troubleshooting and maintaining your clusters.

Now the challenge is how to leverage ClusterControl to keep track of key performance indicators, show the global health of your clusters and proactively alert you in time when something is predicted to happen. And that’s the subject we'll cover next time.


Become a MySQL DBA blog series - Understanding the MySQL Error Log

$
0
0

We have yet to see software that runs perfectly, without any issues. MySQL is no exception. It’s not the software’s fault - we need to be clear about that. We use MySQL in different places, on different hardware and within different environments. It’s also highly configurable. All those features make it a great product, but they come at a price - sometimes some settings won’t work correctly under certain conditions. It is also pretty easy to make simple human mistakes, like typos, in the MySQL configuration. Luckily, MySQL provides us with the means to understand what is wrong through the error log. In this blog, we’ll see how to read the information in the error log.

This is the fifteenth installment in the ‘Become a MySQL DBA’ blog series.

Error log on MySQL - clean startup and shutdown

MySQL’s error log may contain a lot of information about different issues you may encounter. Let’s start with checking what a ‘clean’ start of MySQL looks like. It will make it easier to find any anomalies later.
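
If you are not sure where the error log of a given instance is located, MySQL itself can tell you (a standard variable; an empty value means MySQL writes to the console or to a location chosen by your init scripts):

$ mysql -e "SHOW GLOBAL VARIABLES LIKE 'log_error';"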

First, all plugins (and storage engines, which work as plugins in MySQL 5.6) are initialized. If something is not right, you’ll see errors at this stage.

2015-10-26 19:35:20 13762 [Note] Plugin 'FEDERATED' is disabled.

Next we can see a significant part related to InnoDB initialization.

2015-10-26 19:35:20 13762 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-26 19:35:20 13762 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-26 19:35:20 13762 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-26 19:35:20 13762 [Note] InnoDB: Memory barrier is not used
2015-10-26 19:35:20 13762 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-26 19:35:20 13762 [Note] InnoDB: Using Linux native AIO
2015-10-26 19:35:20 13762 [Note] InnoDB: Using CPU crc32 instructions
2015-10-26 19:35:20 13762 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-26 19:35:21 13762 [Note] InnoDB: Completed initialization of buffer pool
2015-10-26 19:35:21 13762 [Note] InnoDB: Highest supported file format is Barracuda.
2015-10-26 19:35:21 13762 [Note] InnoDB: 128 rollback segment(s) are active.
2015-10-26 19:35:21 13762 [Note] InnoDB: Waiting for purge to start
2015-10-26 19:35:21 13762 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.26-74.0 started; log sequence number 710963181

In the next phase, authentication plugins are initialized (if they are configured correctly).

2015-10-26 19:35:21 13762 [Note] RSA private key file not found: /var/lib/mysql//private_key.pem. Some authentication plugins will not work.
2015-10-26 19:35:21 13762 [Note] RSA public key file not found: /var/lib/mysql//public_key.pem. Some authentication plugins will not work.

At the end you’ll see information about binding MySQL to the configured IP and port. The event scheduler is also initialized. Finally, you’ll see the ‘ready for connections’ message indicating that MySQL started correctly.

2015-10-26 19:35:21 13762 [Note] Server hostname (bind-address): '*'; port: 33306
2015-10-26 19:35:21 13762 [Note] IPv6 is available.
2015-10-26 19:35:21 13762 [Note]   - '::' resolves to '::';
2015-10-26 19:35:21 13762 [Note] Server socket created on IP: '::'.
2015-10-26 19:35:21 13762 [Warning] 'proxies_priv' entry '@ root@ip-172-30-4-23' ignored in --skip-name-resolve mode.
2015-10-26 19:35:21 13762 [Note] Event Scheduler: Loaded 2 events
2015-10-26 19:35:21 13762 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.26-74.0'  socket: '/var/run/mysqld/mysqld.sock'  port: 33306  Percona Server (GPL), Release 74.0, Revision 32f8dfd

What does the shutdown process look like? Let’s follow it step by step:

2015-10-26 19:35:13 12955 [Note] /usr/sbin/mysqld: Normal shutdown

This line is very important. MySQL can be stopped in many ways: a user can use the init scripts, it’s possible to use ‘mysqladmin’ to execute the SHUTDOWN command, and it’s also possible to send a SIGTERM to MySQL to initiate a shutdown of the database. At times you’ll be investigating why a MySQL instance stopped - this line is always an indication that someone (or something) triggered a clean shutdown. No crash happened, and MySQL wasn’t killed either.

2015-10-26 19:35:13 12955 [Note] Giving 12 client threads a chance to die gracefully
2015-10-26 19:35:13 12955 [Note] Event Scheduler: Purging the queue. 2 events
2015-10-26 19:35:13 12955 [Note] Shutting down slave threads
2015-10-26 19:35:15 12955 [Note] Forcefully disconnecting 6 remaining clients
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 37  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 53  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 38  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 39  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 44  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 47  user: 'cmon'

In the above, MySQL is closing the remaining connections - it allows them to finish on their own, but if they do not close, they will be forcefully terminated.

2015-10-26 19:35:15 12955 [Note] Binlog end
2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'partition'
2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'ARCHIVE'
...

Here MySQL is shutting down all plugins that were enabled in its configuration. There are many of them, so we removed some output for the sake of readability.

2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'InnoDB'
2015-10-26 19:35:15 12955 [Note] InnoDB: FTS optimize thread exiting.
2015-10-26 19:35:15 12955 [Note] InnoDB: Starting shutdown...
2015-10-26 19:35:17 12955 [Note] InnoDB: Shutdown completed; log sequence number 710963181

One of them is InnoDB - this step may take a while, as a clean InnoDB shutdown can take some time on a busy server, even with InnoDB fast shutdown enabled (which is the default setting).

2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'MyISAM'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'MRG_MYISAM'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'CSV'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'MEMORY'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'sha256_password'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'mysql_old_password'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'mysql_native_password'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'binlog'
2015-10-26 19:35:17 12955 [Note] /usr/sbin/mysqld: Shutdown complete

The shutdown process concludes with ‘Shutdown complete’ message.

Configuration errors

Let’s start with a basic issue: a configuration error. It’s a really common problem. Sometimes we make a typo in a variable name while editing my.cnf. MySQL parses its configuration files when it starts, and if something is wrong, it will refuse to run. Let’s take a look at some examples:

2015-10-27 11:20:05 18858 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27 11:20:05 18858 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:20:05 18858 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:20:05 18858 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:20:05 18858 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:20:05 18858 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:20:05 18858 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:20:05 18858 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:20:05 18858 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-27 11:20:05 18858 [Note] InnoDB: Completed initialization of buffer pool
2015-10-27 11:20:05 18858 [Note] InnoDB: Highest supported file format is Barracuda.
2015-10-27 11:20:05 18858 [Note] InnoDB: 128 rollback segment(s) are active.
2015-10-27 11:20:05 18858 [Note] InnoDB: Waiting for purge to start
2015-10-27 11:20:06 18858 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.26-74.0 started; log sequence number 773268083
2015-10-27 11:20:06 18858 [ERROR] /usr/sbin/mysqld: unknown variable '--max-connections=512'
2015-10-27 11:20:06 18858 [ERROR] Aborting

As you can see from the above, we have a clear copy-paste error with regards to the max_connections variable. MySQL doesn’t accept the ‘--’ prefix in my.cnf, so if you are copying settings from ‘ps’ output or from the MySQL documentation, please keep this in mind. Many things can go wrong, but the important point is that the ‘unknown variable’ error always points you to my.cnf and to some kind of mistake which MySQL can’t understand.
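
For reference, this is how the same setting should look in my.cnf - without the ‘--’ prefix (dashes and underscores are interchangeable in option names):

[mysqld]
max_connections=512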

Another problem can be related to misconfigurations. Let’s say we over-allocated memory on our server. Here’s what you might see:

2015-10-27 11:24:41 31325 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27 11:24:41 31325 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:24:41 31325 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:24:41 31325 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:24:41 31325 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:24:41 31325 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:24:41 31325 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:24:41 31325 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:24:41 31325 [Note] InnoDB: Initializing buffer pool, size = 512.0G
InnoDB: mmap(70330089472 bytes) failed; errno 12
2015-10-27 11:24:41 31325 [ERROR] InnoDB: Cannot allocate memory for the buffer pool
2015-10-27 11:24:41 31325 [ERROR] Plugin 'InnoDB' init function returned error.
2015-10-27 11:24:41 31325 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2015-10-27 11:24:41 31325 [ERROR] Unknown/unsupported storage engine: InnoDB
2015-10-27 11:24:41 31325 [ERROR] Aborting

As you can clearly see, we’ve tried to allocate 512G of memory, which was not possible on this host. InnoDB is required for MySQL to start, therefore an error in initializing InnoDB prevented MySQL from starting.
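
A quick sanity check before changing innodb_buffer_pool_size is to compare it with the memory actually available on the host (standard Linux tooling; the config file location may differ on your distribution):

$ free -m
$ grep -i innodb_buffer_pool_size /etc/my.cnf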

Permission errors

Below are three examples of startup failures related to permission errors.

/usr/sbin/mysqld: File './binlog.index' not found (Errcode: 13 - Permission denied)
2015-10-27 11:31:40 11469 [ERROR] Aborting

This one is pretty self-explanatory. As stated, MySQL couldn’t find the index file for its binary logs due to a ‘permission denied’ error. This is a hard stop for MySQL - it has to be able to read and write its binary logs.
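
A common cause is files created or modified as root, for example after a manual restore. Verifying and fixing the ownership of the data directory usually resolves this class of errors (a sketch assuming the default /var/lib/mysql datadir and the mysql system user; adjust to your setup):

$ ls -la /var/lib/mysql | head
$ sudo chown -R mysql:mysql /var/lib/mysql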

2015-10-27 11:32:46 13601 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27 11:32:46 13601 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:32:46 13601 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:32:46 13601 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:32:46 13601 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:32:46 13601 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:32:46 13601 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:32:46 13601 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:32:46 13601 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-27 11:32:46 13601 [Note] InnoDB: Completed initialization of buffer pool
2015-10-27 11:32:46 13601 [ERROR] InnoDB: ./ibdata1 can't be opened in read-write mode
2015-10-27 11:32:46 13601 [ERROR] InnoDB: The system tablespace must be writable!
2015-10-27 11:32:46 13601 [ERROR] Plugin 'InnoDB' init function returned error.
2015-10-27 11:32:46 13601 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2015-10-27 11:32:46 13601 [ERROR] Unknown/unsupported storage engine: InnoDB
2015-10-27 11:32:46 13601 [ERROR] Aborting

The above is another permission issue, this time related to the ibdata1 file - the shared system tablespace which InnoDB uses to store its internal data (the InnoDB dictionary and, by default, undo logs). Even if you use innodb_file_per_table, which is the default setting in MySQL 5.6, InnoDB still requires this file so it can initialize.

2015-10-27 11:35:21 17826 [Note] Plugin 'FEDERATED' is disabled.
/usr/sbin/mysqld: Can't find file: './mysql/plugin.frm' (errno: 13 - Permission denied)
2015-10-27 11:35:21 17826 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
2015-10-27 11:35:21 17826 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:35:21 17826 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:35:21 17826 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:35:21 17826 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:35:21 17826 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:35:21 17826 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:35:21 17826 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:35:21 17826 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-27 11:35:21 17826 [Note] InnoDB: Completed initialization of buffer pool
2015-10-27 11:35:21 17826 [Note] InnoDB: Highest supported file format is Barracuda.
2015-10-27 11:35:21 7fa02e97d780  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Could not find a valid tablespace file for 'mysql/innodb_index_stats'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Tablespace open failed for '"mysql"."innodb_index_stats"', ignored.
2015-10-27 11:35:21 7fa02e97d780  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Could not find a valid tablespace file for 'mysql/innodb_table_stats'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Tablespace open failed for '"mysql"."innodb_table_stats"', ignored.
2015-10-27 11:35:21 7fa02e97d780  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.

Here we have an example of permission issues related to the system schema (the ‘mysql’ schema). As you can see, InnoDB reported that it does not have the access rights to open tables in this schema. Again, this is a serious problem and a hard stop for MySQL.

Out of memory errors

Another set of frequent issues is related to MySQL servers running out of memory. This can manifest itself in many different ways, but the most common is this one:

Killed
151027 12:18:58 mysqld_safe Number of processes running now: 0

mysqld_safe is an ‘angel’ process which monitors mysqld and restarts it when it dies. This is true for MySQL but not for recent Galera nodes - in that case it won’t restart MySQL, and you’ll see the following sequence:

Killed
151027 12:18:58 mysqld_safe Number of processes running now: 0
151027 12:18:58 mysqld_safe WSREP: not restarting wsrep node automatically
151027 12:18:58 mysqld_safe mysqld from pid file /var/lib/mysql/mysql.pid ended

Another way a memory issue may show up is through the following messages in the error log:

InnoDB: mmap(3145728 bytes) failed; errno 12
InnoDB: mmap(3145728 bytes) failed; errno 12

What’s important to keep in mind is that error codes can easily be checked using the ‘perror’ utility. In this case, it makes it perfectly clear what exactly happened:

$ perror 12
OS error code  12:  Cannot allocate memory

If you suspect a memory problem - we’d even say, if you are encountering any unexpected MySQL crashes - you can also run dmesg to get some more data. If you see output like the below, you can be certain that the problem was caused by the OOM Killer terminating MySQL:

[ 4165.802544] Out of memory: Kill process 8143 (mysqld) score 938 or sacrifice child
[ 4165.808329] Killed process 8143 (mysqld) total-vm:5101492kB, anon-rss:3789388kB, file-rss:0kB
[ 4166.226410] init: mysql main process (8143) killed by KILL signal
[ 4166.226437] init: mysql main process ended, respawning
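
You don’t have to scroll through the whole buffer - filtering dmesg for OOM killer activity is usually enough (a simple filter; the exact message wording varies between kernel versions):

$ dmesg | grep -i -E 'out of memory|oom|killed process'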

 

InnoDB crashes

InnoDB, most of the time, is a very solid and durable storage engine. Unfortunately, under some circumstances, data may become corrupted. Usually it’s related to either a misconfiguration (for example, disabling the double-write buffer) or some kind of faulty hardware (failing memory modules, an unstable I/O layer). In those cases it can happen that InnoDB will crash. Crashes can also be triggered by InnoDB bugs. It’s not common, but it happens from time to time. No matter the reason for a crash, the error log usually looks similar. Let’s go over a live example.

2015-10-13 15:08:06 7f53f0658700  InnoDB: Assertion failure in thread 139998492198656 in file btr0pcur.cc line 447
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == buf_block_get_page_no(btr_pcur_get_block(cursor))

InnoDB code is full of sanity checks - assertions are very frequent and, if one of them fails, InnoDB intentionally crashes MySQL. At the beginning you’ll see information about the exact location of the assertion which failed. This gives you an idea of what exactly may have happened - the MySQL source code is available on the internet and it’s pretty easy to download the code related to your particular version. It may sound scary but it’s not - the MySQL code is properly documented and it’s not a problem to figure out at least what kind of activity triggered your issue.

InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.

InnoDB logging tries to be as helpful as possible - as you can see, there’s information about the InnoDB forced recovery process. Under some circumstances it may allow you to start MySQL and dump the InnoDB data. It’s used as “almost a last resort” - one step further and you’ll end up digging through InnoDB files with a hex editor.
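
Forcing recovery boils down to starting MySQL with the innodb_force_recovery option set in my.cnf (a sketch - start at 1 and increase the value only step by step; according to the MySQL documentation, values of 4 and above can permanently corrupt data, so back up your files first):

[mysqld]
innodb_force_recovery = 1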

13:08:06 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=67108864
read_buffer_size=131072
max_used_connections=36
max_threads=514
thread_count=27
connection_count=26
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 269997 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

In the next step you’ll see some of the status counters printed. This is supposed to give you some idea about the configuration - maybe you configured MySQL to use too many resources? Once you go over this data, we reach the backtrace. This piece of information guides you through the stack and shows you what kind of calls were being executed when the crash happened. Let’s check what exactly happened in this particular case.

Thread pointer: 0xa985960
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f53f0657d40 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8cd37c]
/usr/sbin/mysqld(handle_fatal_signal+0x461)[0x6555a1]
/lib64/libpthread.so.0(+0xf710)[0x7f540f978710]
/lib64/libc.so.6(gsignal+0x35)[0x7f540dfd4625]
/lib64/libc.so.6(abort+0x175)[0x7f540dfd5e05]
/usr/sbin/mysqld[0xa0277b]
/usr/sbin/mysqld[0x567f88]
/usr/sbin/mysqld[0x99afde]
/usr/sbin/mysqld[0x8eb926]
/usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x5d)[0x59a6ed]
/usr/sbin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x20)[0x800cd0]
/usr/sbin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP10SQL_I_ListI8st_orderEyy+0xc88)[0x8128c8]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1bc5)[0x6d6d15]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x5a8)[0x6dc228]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x108c)[0x6dda4c]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x162)[0x6ab002]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x6ab0f0]
/usr/sbin/mysqld(pfs_spawn_thread+0x143)[0xaddd73]
/lib64/libpthread.so.0(+0x79d1)[0x7f540f9709d1]
/lib64/libc.so.6(clone+0x6d)[0x7f540e08a8fd]

We can see that the problem was triggered when MySQL was executing a DELETE command. Records were read sequentially and a full table scan was performed (handler read_rnd_next). Combined with the information from the beginning:

InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == buf_block_get_page_no(btr_pcur_get_block(cursor))

we can tell the crash happened when InnoDB was moving to the next page of the scan (the assertion was triggered in the btr_pcur_move_to_next_page method).

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (a991d40): DELETE FROM cmon_stats WHERE cid = 2 AND ts < 1444482484
Connection ID (thread ID): 5
Status: NOT_KILLED

Finally, we can see (sometimes - it’s not always printed) which thread triggered the assertion. In our case we can confirm that it was indeed a DELETE query which caused some kind of problem.

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
151013 15:08:06 mysqld_safe Number of processes running now: 0

As we said, MySQL is usually pretty verbose when it comes to the error log. Sometimes, though, things go so wrong that MySQL can’t really collect much information. It still gives a pretty clear indication that something went very wrong (signal 11). Please take a look at the following real-life example:

22:31:40 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=10
max_threads=202
thread_count=9
connection_count=7
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 113388 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x1fb7050
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Segmentation fault (core dumped)
150119 23:48:08 mysqld_safe Number of processes running now: 0

As you can see, we have clear information that MySQL crashed. There’s no additional information, though. Segmentation faults can also be logged in the kernel’s diagnostic messages. In this particular case, dmesg returned:

[370591.200799] mysqld[5462]: segfault at 51 ip 00007fe624e3e688 sp 00007fe4f4099c30 error 4 in libgcc_s.so.1[7fe624e2f000+16000]

It’s not much, as you can see, but such information can prove to be very useful if you decide to file a bug against the software provider, be it Oracle, MariaDB, Percona or Codership. It can also be helpful in finding relevant bugs already reported.

It’s not always possible to fix a problem based only on the information written in the error log. It’s definitely a good place to start debugging problems in your database, though. If you are hitting some kind of very common, simple misconfiguration, the information you find in the error log should be enough to pinpoint the cause and find a solution. If we are talking about serious MySQL crashes, things are definitely different - most likely you won’t be able to fix the issue on your own (unless you are a developer familiar with MySQL code, that is). On the other hand, this data may be more than enough to identify the culprit and locate existing bug reports which cover it. Such bug reports contain lots of discussion and tips or workarounds - it’s likely that you’ll be able to implement one of them. Even the information that a given bug was fixed in version x.y.z is useful - you’ll be able to verify it and, if it does solve your problem, plan for an upgrade. In the worst case scenario, you should have enough data to create a bug report on your own.

In the next post in the series, we are going to take a guided tour through the error log (and other logs) you may find in your Galera cluster. We’ll cover what kind of issues you may encounter and how to solve them.


s9s Tools and Resources: 'Become a MySQL DBA' series, ClusterControl 1.2.11 release, and more!


Check Out Our Latest Technical Resources for MySQL, MariaDB, PostgreSQL and MongoDB

This blog is packed with all the latest resources and tools we’ve recently published! Please do check it out and let us know if you have any comments or feedback.

Product Announcements & Resources


ClusterControl 1.2.11 Release
Last month, we were pleased to announce our best release yet for Postgres users as well as to introduce key new features for our MySQL/MariaDB users, such as support for MaxScale, an open-source, database-centric proxy that works with MySQL, MariaDB and Percona Server.

View all the related resources here.

Support for MariaDB’s MaxScale
You can get started straight away with MaxScale by deploying and managing it with ClusterControl 1.2.11. This How-To blog shows you how, step-by-step.


Customer Case Studies

From small businesses to Fortune 500 companies, customers have chosen Severalnines to deploy and manage MySQL, MongoDB and PostgreSQL.  

View our Customer page to discover companies like yours who have found success with ClusterControl.


Technical Webinar - Replays

As you know, we run a monthly technical webinar cycle; the latest replays, all part of our ‘Become a MySQL DBA’ series, are available to watch at your own leisure.

View all our replays here

ClusterControl Blogs

We’ve started a new series of blogs focusing on how to use ClusterControl. Do check them out!

View all ClusterControl blogs here.


 

The MySQL DBA Blog Series

We’re on the 15th installment of our popular ‘Become a MySQL DBA’ series and you can view all of these blogs here.

View all the ‘Become a MySQL DBA’ blogs here

We trust these resources are useful. If you have any questions on them or on related topics, please do contact us!


ClusterControl Tips & Tricks: wtmp Log Rotation Settings for Sudo User


Requires ClusterControl. Applies to all supported database clusters. Applies to all supported operating systems (RHEL/CentOS/Debian/Ubuntu).

ClusterControl requires a super-privileged SSH user to provision database nodes. If you are running as a non-root user, the corresponding user must be able to execute sudo commands, with or without a sudo password. Unfortunately, this can generate another issue: performing a remote command with “sudo” requires an interactive session (tty). We will explain this in detail in the next sections.

What’s up with sudo?

By default, most of the RHEL flavors have the following configured under /etc/sudoers:

Defaults requiretty

When an interactive session (tty) is required, each time the sudo user SSHes into the box with the -t flag (force pseudo-tty allocation), entries will be created in /var/log/wtmp for the creation and destruction of terminals, or the assignment and release of terminals. These logs only record interactive sessions. If you don’t specify -t, you will see the following error:

sudo: sorry, you must have a tty to run sudo

The root user does not require an interactive session when running remote SSH commands; in that case entries only appear in /var/log/secure or /var/log/auth.log, depending on the system configuration. Different distributions have different defaults in this regard. SSH does not write an entry into wtmp if it is a non-interactive session.

To check the content of wtmp, we use the following command:

$ last -f /var/log/wtmp
ec2-user pts/0        ip-10-0-0-79.ap- Wed Oct 28 11:16 - 11:16  (00:00)
ec2-user pts/0        ip-10-0-0-79.ap- Wed Oct 28 11:16 - 11:16  (00:00)
ec2-user pts/0        ip-10-0-0-79.ap- Wed Oct 28 11:16 - 11:16  (00:00)
...

On Debian/Ubuntu systems, the sudo user does not need to acquire a tty, as “requiretty” is not configured by default. However, ClusterControl defaults to appending the -t flag if it detects that the SSH user is a non-root user. Since ClusterControl performs all monitoring and management tasks as this user, you may notice that /var/log/wtmp grows rapidly, as shown in the following section.

Log rotation for wtmp

As an example, take note of the following default configuration for wtmp in RHEL 7.1, inside /etc/logrotate.conf:

/var/log/wtmp {
    monthly
    create 0664 root utmp
    minsize 1M
    rotate 1
}

By running the following commands on one of the database nodes managed by ClusterControl, we can see how fast /var/log/wtmp grows every minute:

[user@server ~]$ a=$(du -b /var/log/wtmp | cut -f1) && sleep 60 && b=$(du -b /var/log/wtmp | cut -f1) && c=$(expr $b - $a ) && echo $c
89088

From the above result, ClusterControl causes the log file to grow by about 89 KB per minute, which equals roughly 128 MB per day. If the above logrotate configuration is used (monthly rotation), /var/log/wtmp alone may consume up to 3.97 GB of disk space! If the partition where this file resides (usually the “/” partition) is small - it’s common to have a smaller “/” partition, especially on a cloud instance - there is a real risk that you fill up the disk space on that partition in less than one month.

Workaround

The workaround is to adjust the log rotation of wtmp. This is applicable to all operating systems mentioned at the beginning of this post. If you are affected by this, you have to change the log rotation behaviour so the file does not grow more than expected. The following is what we recommend:

/var/log/wtmp {
     size 100M
     create 0664 root utmp
     rotate 3
     compress
}

The above settings specify that the maximum size of wtmp should be 100 MB, that we should keep the 3 most recent (compressed) rotations, and that older ones should be removed.

Logrotate is run from cron (via /etc/cron.daily/logrotate). It is not a daemon, so there is no need to reload its configuration; the next time cron executes logrotate, it will pick up the new config file automatically.
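
You can verify the new configuration without waiting for the daily cron run, as logrotate has a debug (dry run) mode and a force flag (standard logrotate options):

$ sudo logrotate -d /etc/logrotate.conf    # dry run, shows what would be done
$ sudo logrotate -f /etc/logrotate.conf    # force an immediate rotation with the new rules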

Happy Clustering!

PS.: To get started with ClusterControl, click here!


Webinar Replay & Slides: Deploying MongoDB, MySQL, PostgreSQL & MariaDB’s MaxScale in 40min


Live demo recording of ClusterControl 1.2.11 and its new features 

Thanks to everyone who joined us for our recent live webinar on ‘ClusterControl 1.2.11 - new release’, led by Art van Scheppingen, Senior Support Engineer at Severalnines. The replay and slides of the webinar are now available to watch and read online via the links below.

During this live webinar, Art demonstrated the latest ClusterControl release and its new features such as support for MariaDB’s MaxScale and related MySQL updates; this is also the best ClusterControl release for PostgreSQL yet.

In fact, during a live demo, Art not only introduced the new features, but also proceeded to deploy MongoDB, PostgreSQL and MySQL clusters and replicated setups, as well as MariaDB’s MaxScale proxy, all in one session and on one ClusterControl instance - within 40 minutes! Impressive stuff, which can be viewed again in this recording!

Watch the replay

Read the slides

 


TOPICS COVERED

  • For PostgreSQL
    • Deployment and Management of Postgres Replicated Setups
    • Customisable dashboards
    • Database performance charts for nodes
    • Enablement of ClusterControl DevStudio
  • Support for MaxScale
    • Deployment and management of MaxScale load balancer
  • For MySQL
    • Add Existing HAProxy and Keepalived
    • Deployment of MySQL Replication setups
    • Improvements in charting of metrics
    • Revamped Configuration Management
    • New Database Logs Page
    • Revamped MySQL User Management

SPEAKER

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and database expert with over 15 years’ experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision over the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.


For further blogs on ClusterControl visit: http://www.severalnines.com/blog-categories/clustercontrol

To view all our webinar replays visit: http://www.severalnines.com/webinars-replay


Webinar: Become a MySQL DBA - Performing live database upgrades in Replication and Galera setups - Tuesday November 24th


Join us on November 24th for this new webinar on performing live database upgrades for MySQL Replication & Galera led by Krzysztof Książek, Senior Support Engineer at Severalnines. This is part of our ongoing ‘Become a MySQL DBA’ series. 

Database vendors typically release patches with bug and security fixes on a monthly basis - so why should we care? After all, upgrading a system is extra work and might incur downtime. Unfortunately, the news is full of reports of security breaches and hacked systems, so unless security is of no concern to you, you will want to have the most current security fixes on your systems. Major versions are rarer, and usually harder (and riskier) to upgrade to, but they might bring along important features that make the upgrade worth the effort.

In this webinar we will cover one of the most basic, but essential tasks of the DBA: minor and major database upgrades in production environments.

DATE, TIME & REGISTRATION

Europe/MEA/APAC
Tuesday, November 24th at 09:00 GMT / 10:00 CET (Germany, France, Sweden)
Register Now

North America/LatAm
Tuesday, November 24th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

AGENDA

  • What types of upgrades are there? 
  • How do I best prepare for the upgrades?
  • Best practices for: 
    • Minor version upgrades - MySQL & Galera
    • Major version upgrades - MySQL & Galera

SPEAKER

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.

To view all the blogs of the ‘Become a MySQL DBA’ series visit: http://www.severalnines.com/blog-categories/db-ops



Become a MySQL DBA blog series - Galera Cluster diagnostic logs


In a previous post, we discussed the error log as a source of diagnostic information for MySQL. We’ve shown examples of problems where the error log can help you diagnose issues with MySQL. This post focuses on the Galera Cluster error logs, so if you are currently using Galera, read on.

In a Galera cluster, whether you use MariaDB Cluster, Percona XtraDB Cluster or the Codership build, the MySQL error log is even more important. It gives you the same information as MySQL would, but you’ll also find in it information about Galera internals and replication. This data is crucial to understanding the state of your cluster and to identifying any issues which may impact the cluster’s ability to operate. In this post, we’ll try to make the Galera error log easier to understand.

This is the sixteenth installment in the ‘Become a MySQL DBA’ blog series.

Clean Galera startup

Let’s start by looking at what a clean startup of a Galera node looks like.

151028 14:28:50 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
151028 14:28:50 mysqld_safe Skipping wsrep-recover for 98ed75de-7c05-11e5-9743-de4abc22bd11:11152 pair
151028 14:28:50 mysqld_safe Assigning 98ed75de-7c05-11e5-9743-de4abc22bd11:11152 to wsrep_start_position

What we can see here is Galera trying to determine the sequence number of the last writeset this node applied. It is found in the grastate.dat file. In this particular case, the shutdown was clean, so all the needed data was found in that file, making the wsrep-recovery process (i.e., the process of calculating the last writeset based on the data applied to the database) unnecessary. Galera sets wsrep_start_position to the last applied writeset.
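
For illustration, the grastate.dat of a cleanly stopped node looks roughly like this (values taken from the log above; the exact header and version number may differ between Galera releases):

# GALERA saved state
version: 2.1
uuid:    98ed75de-7c05-11e5-9743-de4abc22bd11
seqno:   11152
cert_index: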

2015-10-28 14:28:50 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-10-28 14:28:50 0 [Note] /usr/sbin/mysqld (mysqld 5.6.26-74.0-56) starting as process 553 ...
2015-10-28 14:28:50 553 [Note] WSREP: Read nil XID from storage engines, skipping position init
2015-10-28 14:28:50 553 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
2015-10-28 14:28:50 553 [Note] WSREP: wsrep_load(): Galera 3.12.2(rf3e626d) by Codership Oy <info@codership.com> loaded successfully.
2015-10-28 14:28:50 553 [Note] WSREP: CRC-32C: using hardware acceleration.
2015-10-28 14:28:50 553 [Note] WSREP: Found saved state: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152

Next we can see the WSREP library being loaded and, again, information about the last state of the cluster before shutdown.

2015-10-28 14:28:50 553 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 172.30.4.156; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum =

This line covers all configuration related to Galera.

2015-10-28 14:28:50 553 [Note] WSREP: Service thread queue flushed.
2015-10-28 14:28:50 553 [Note] WSREP: Assign initial position for certification: 11152, protocol version: -1
2015-10-28 14:28:50 553 [Note] WSREP: wsrep_sst_grab()
2015-10-28 14:28:50 553 [Note] WSREP: Start replication
2015-10-28 14:28:50 553 [Note] WSREP: Setting initial position to 98ed75de-7c05-11e5-9743-de4abc22bd11:11152
2015-10-28 14:28:50 553 [Note] WSREP: protonet asio version 0
2015-10-28 14:28:50 553 [Note] WSREP: Using CRC-32C for message checksums.
2015-10-28 14:28:50 553 [Note] WSREP: backend: asio

The initialization process continues; WSREP replication has been started.

2015-10-28 14:28:50 553 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2015-10-28 14:28:50 553 [Note] WSREP: restore pc from disk failed

The two lines above require a detailed comment. The file which is missing here, gvwstate.dat, contains information about the cluster state as a given node sees it. You can find in it the UUID of the node and of the other members of the cluster which are in the PRIMARY state (i.e., which are part of the cluster). In case of an unexpected crash, this file allows the node to understand the state of the cluster before it crashed. In case of a clean shutdown, the gvwstate.dat file is removed - that is why it cannot be found in our example.

2015-10-28 14:28:50 553 [Note] WSREP: GMCast version 0
2015-10-28 14:28:50 553 [Note] WSREP: (3506d410, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2015-10-28 14:28:50 553 [Note] WSREP: (3506d410, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2015-10-28 14:28:50 553 [Note] WSREP: EVS version 0
2015-10-28 14:28:50 553 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '172.30.4.156:4567,172.30.4.191:4567,172.30.4.220:4567'

In this section, you will see the initial communication within the cluster - the node is trying to connect to the other members listed in the wsrep_cluster_address variable. For Galera it’s enough to get in touch with one active member of the cluster - even if node “A” has contact only with node “B” but not “C”, “D” and “E”, node “B” will work as a relay and pass through the needed communication.

2015-10-28 14:28:50 553 [Warning] WSREP: (3506d410, 'tcp://0.0.0.0:4567') address 'tcp://172.30.4.156:4567' points to own listening address, blacklisting
2015-10-28 14:28:50 553 [Note] WSREP: (3506d410, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2015-10-28 14:28:50 553 [Note] WSREP: declaring 6221477d at tcp://172.30.4.191:4567 stable
2015-10-28 14:28:50 553 [Note] WSREP: declaring 78b22e70 at tcp://172.30.4.220:4567 stable

Here we can see the results of the communication exchange. At first, the node blacklists its own IP address. Then, as a result of the network exchange, the remaining two nodes of the cluster are found and declared ‘stable’. Please note the IDs used here - those node IDs are also used in other places, but here you can easily link them with the relevant IP addresses. A node ID can also be located in the gvwstate.dat file (it’s the first segment of ‘my_uuid’) or by running “SHOW GLOBAL STATUS LIKE ‘wsrep_gcomm_uuid’”. Please keep in mind that a UUID may change, for example, after a crash.

2015-10-28 14:28:50 553 [Note] WSREP: Node 6221477d state prim
2015-10-28 14:28:50 553 [Note] WSREP: view(view_id(PRIM,3506d410,5) memb {
    3506d410,0
    6221477d,0
    78b22e70,0
} joined {
} left {
} partitioned {
})
2015-10-28 14:28:50 553 [Note] WSREP: save pc into disk
2015-10-28 14:28:50 553 [Note] WSREP: gcomm: connected

As a result, the node managed to join the cluster - you can see that all IDs are listed as members and that the cluster is in 'Primary' state. There are no nodes in 'joined' state, and the cluster was not partitioned.

2015-10-28 14:28:50 553 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2015-10-28 14:28:50 553 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2015-10-28 14:28:50 553 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2015-10-28 14:28:50 553 [Note] WSREP: Waiting for SST to complete.
2015-10-28 14:28:50 553 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
2015-10-28 14:28:50 553 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 355385f7-7d80-11e5-bb70-17166cef2a4d
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: sent state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: got state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d from 0 (172.30.4.156)
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: got state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d from 1 (172.30.4.191)
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: got state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d from 2 (172.30.4.220)
2015-10-28 14:28:50 553 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 4,
    members    = 3/3 (joined/total),
    act_id     = 11152,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 98ed75de-7c05-11e5-9743-de4abc22bd11

Cluster members exchange information about their status, and as a result we see a report on the quorum calculation - the cluster is in 'Primary' state with all three nodes joined.

2015-10-28 14:28:50 553 [Note] WSREP: Flow-control interval: [28, 28]
2015-10-28 14:28:50 553 [Note] WSREP: Restored state OPEN -> JOINED (11152)
2015-10-28 14:28:50 553 [Note] WSREP: New cluster view: global state: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152, view# 5: Primary, number of nodes: 3, my index: 0, protocol version 3
2015-10-28 14:28:50 553 [Note] WSREP: SST complete, seqno: 11152
2015-10-28 14:28:50 553 [Note] WSREP: Member 0.0 (172.30.4.156) synced with group.
2015-10-28 14:28:50 553 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 11152)

Those few lines above are a bit misleading - no SST was performed, as the node was already in sync with the cluster. In short, at this very moment, the cluster is up and our node is synced with the other cluster members.

2015-10-28 14:28:50 553 [Warning] option 'innodb-buffer-pool-instances': signed value -1 adjusted to 0
2015-10-28 14:28:50 553 [Note] Plugin 'FEDERATED' is disabled.
2015-10-28 14:28:50 553 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-28 14:28:50 553 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-28 14:28:50 553 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins

...

2015-10-28 14:28:52 553 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.26-74.0-56'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release rel74.0, Revision 624ef81, WSREP version 25.12, wsrep_25.12

Finally, the MySQL startup process is coming to its end - MySQL is up and running, ready to handle connections.

As you can see, the error log contains a lot of information during a clean start. We can see how a node sees the cluster, we can see information about quorum calculation, and we can see the IDs and IP addresses of the other nodes in the cluster. This gives us great insight into the cluster state - which nodes were up, and whether there were any connectivity issues. The information about the Galera configuration is also very useful - my.cnf may have been changed in the meantime, and so may the runtime configuration. Yet we can still check how exactly Galera was configured at the time of the startup. It may come in handy when dealing with misconfigurations.
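To compare those startup-time settings with what is currently in effect, you can also dump the runtime provider configuration straight from MySQL:

mysql> SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options'\G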

Galera node - IST process

Ok, so we covered a regular startup - when a node is up to date with the cluster. What about when a node is not up to date? There are two options - if the donor has all the needed writesets stored in its gcache, Incremental State Transfer (IST) can be performed. If not, State Snapshot Transfer (SST) will be executed and it involves copying all data from the donor to the joining node. This time, we are looking at the node joining/synchronization using IST. To make it easier to read, we are going to remove some of the log entries which didn’t change compared to the clean startup we discussed earlier.
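As a side note, whether IST is possible at all depends on the donor's gcache size - it determines how far behind a node can fall and still avoid a full SST. A minimal sketch of enlarging it in my.cnf (keep in mind that wsrep_provider_options holds all provider options, so append to any existing value rather than replacing it; the change requires a node restart):

[mysqld]
wsrep_provider_options="gcache.size=1G"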

2015-10-28 16:36:50 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-10-28 16:36:50 0 [Note] /usr/sbin/mysqld (mysqld 5.6.26-74.0-56) starting as process 10144 ...

...

2015-10-28 16:36:50 10144 [Note] WSREP: Found saved state: 98ed75de-7c05-11e5-9743-de4abc22bd11:-1

Ok, MySQL started and, after initialization, Galera located the saved state with '-1' as the writeset sequence number. It means that the node didn't shut down cleanly and the correct state wasn't written to the grastate.dat file.
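For reference, a grastate.dat left behind by an unclean shutdown would look more or less like this (the version line may differ between Galera releases):

# GALERA saved state
version: 2.4
uuid:    98ed75de-7c05-11e5-9743-de4abc22bd11
seqno:   -1
cert_index: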

...

2015-10-28 16:36:50 10144 [Note] WSREP: Start replication
2015-10-28 16:36:50 10144 [Note] WSREP: Setting initial position to 98ed75de-7c05-11e5-9743-de4abc22bd11:11152

Even though the grastate.dat file didn't contain a sequence number, Galera can calculate it on its own and set up the initial position accordingly.

...

2015-10-28 16:36:50 10144 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '172.30.4.156:4567,172.30.4.191:4567,172.30.4.220:4567'
2015-10-28 16:36:50 10144 [Warning] WSREP: (16ca37d4, 'tcp://0.0.0.0:4567') address 'tcp://172.30.4.191:4567' points to own listening address, blacklisting
2015-10-28 16:36:50 10144 [Note] WSREP: (16ca37d4, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2015-10-28 16:36:51 10144 [Note] WSREP: declaring 3506d410 at tcp://172.30.4.156:4567 stable
2015-10-28 16:36:51 10144 [Note] WSREP: declaring 78b22e70 at tcp://172.30.4.220:4567 stable
2015-10-28 16:36:51 10144 [Note] WSREP: Node 3506d410 state prim
2015-10-28 16:36:51 10144 [Note] WSREP: view(view_id(PRIM,16ca37d4,8) memb {
    16ca37d4,0
    3506d410,0
    78b22e70,0
} joined {
} left {
} partitioned {
})

Similar to the clean startup, the Galera nodes exchanged some communication and decided on the state of the cluster. In this case, all three nodes were up and the cluster was formed.

2015-10-28 16:36:51 10144 [Note] WSREP: save pc into disk
2015-10-28 16:36:51 10144 [Note] WSREP: Waiting for SST to complete.
2015-10-28 16:36:51 10144 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
2015-10-28 16:36:51 10144 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 17633a80-7d92-11e5-8bdc-f6091df00f00
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: sent state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: got state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00 from 0 (172.30.4.191)
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: got state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00 from 1 (172.30.4.156)
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: got state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00 from 2 (172.30.4.220)
2015-10-28 16:36:51 10144 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 6,
    members    = 2/3 (joined/total),
    act_id     = 31382,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 98ed75de-7c05-11e5-9743-de4abc22bd11

Again, a communication exchange happens and the quorum calculation is performed. This time you can see that three members were detected but only two of them joined. This is something we expect - one node is not synced with the cluster.

2015-10-28 16:36:51 10144 [Note] WSREP: Flow-control interval: [28, 28]
2015-10-28 16:36:51 10144 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 31382)
2015-10-28 16:36:51 10144 [Note] WSREP: State transfer required:
    Group state: 98ed75de-7c05-11e5-9743-de4abc22bd11:31382
    Local state: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152

State transfer was deemed necessary because the local node was not up to date with the applied writesets.

2015-10-28 16:36:51 10144 [Note] WSREP: New cluster view: global state: 98ed75de-7c05-11e5-9743-de4abc22bd11:31382, view# 7: Primary, number of nodes: 3, my index: 0, protocol version 3
2015-10-28 16:36:51 10144 [Warning] WSREP: Gap in state sequence. Need state transfer.
2015-10-28 16:36:51 10144 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.30.4.191' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '10144''''
WSREP_SST: [INFO] Streaming with xbstream (20151028 16:36:52.039)
WSREP_SST: [INFO] Using socat as streamer (20151028 16:36:52.041)

State transfer is performed using xtrabackup; this is true for both IST and SST (if xtrabackup is configured as the SST method). Both the xbstream and socat binaries are required for IST (and SST) to execute. It may happen that socat is not installed on the system; in such a case, the node won't be able to join the cluster.

WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20151028 16:36:52.132)
2015-10-28 16:36:52 10144 [Note] WSREP: Prepared SST request: xtrabackup-v2|172.30.4.191:4444/xtrabackup_sst//1
2015-10-28 16:36:52 10144 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2015-10-28 16:36:52 10144 [Note] WSREP: REPL Protocols: 7 (3, 2)
2015-10-28 16:36:52 10144 [Note] WSREP: Service thread queue flushed.
2015-10-28 16:36:52 10144 [Note] WSREP: Assign initial position for certification: 31382, protocol version: 3
2015-10-28 16:36:52 10144 [Note] WSREP: Service thread queue flushed.
2015-10-28 16:36:52 10144 [Note] WSREP: Prepared IST receiver, listening at: tcp://172.30.4.191:4568
2015-10-28 16:36:52 10144 [Note] WSREP: Member 0.0 (172.30.4.191) requested state transfer from '*any*'. Selected 1.0 (172.30.4.156)(SYNCED) as donor.
2015-10-28 16:36:52 10144 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 31389)

This stage covers the state transfer - as you can see, a donor is selected. In our case, it is 172.30.4.156. This is pretty important information, especially when SST is executed - the donor runs xtrabackup to move the data and logs the progress in the innobackup.backup.log file in the MySQL data directory. Different network issues may also cause problems with state transfer - it's good to check which node was the donor, even if only to check if there are any errors on the donor's side.
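A quick way to inspect the donor's side (assuming the default data directory) is:

$ tail -n 100 /var/lib/mysql/innobackup.backup.log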

2015-10-28 16:36:52 10144 [Note] WSREP: Requesting state transfer: success, donor: 1
2015-10-28 16:36:53 10144 [Note] WSREP: 1.0 (172.30.4.156): State transfer to 0.0 (172.30.4.191) complete.
2015-10-28 16:36:53 10144 [Note] WSREP: Member 1.0 (172.30.4.156) synced with group.
WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST (20151028 16:36:53.159)
WSREP_SST: [INFO] Galera co-ords from recovery: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152 (20151028 16:36:53.162)
WSREP_SST: [INFO] Total time on joiner: 0 seconds (20151028 16:36:53.164)
WSREP_SST: [INFO] Removing the sst_in_progress file (20151028 16:36:53.166)
2015-10-28 16:36:53 10144 [Note] WSREP: SST complete, seqno: 11152
2015-10-28 16:36:53 10144 [Warning] option 'innodb-buffer-pool-instances': signed value -1 adjusted to 0
2015-10-28 16:36:53 10144 [Note] Plugin 'FEDERATED' is disabled.
2015-10-28 16:36:53 10144 [Note] InnoDB: Using atomics to ref count buffer pool pages

...

2015-10-28 16:36:55 10144 [Note] WSREP: Signalling provider to continue.
2015-10-28 16:36:55 10144 [Note] WSREP: Initialized wsrep sidno 1
2015-10-28 16:36:55 10144 [Note] WSREP: SST received: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152
2015-10-28 16:36:55 10144 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.26-74.0-56'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release rel74.0, Revision 624ef81, WSREP version 25.12, wsrep_25.12
2015-10-28 16:36:55 10144 [Note] WSREP: Receiving IST: 20230 writesets, seqnos 11152-31382
2015-10-28 16:40:18 10144 [Note] WSREP: IST received: 98ed75de-7c05-11e5-9743-de4abc22bd11:31382
2015-10-28 16:40:18 10144 [Note] WSREP: 0.0 (172.30.4.191): State transfer from 1.0 (172.30.4.156) complete.

In this section, among others, we can see IST being performed: 20,230 writesets were transferred from the donor to the joining node.

2015-10-28 16:40:18 10144 [Note] WSREP: Shifting JOINER -> JOINED (TO: 34851)

At this point, the node has applied all writesets sent by the donor and switched to the 'Joined' state. It's not yet synced, because there are remaining writesets (committed by the cluster while IST was performed) to be applied. This is actually something very important to keep in mind - while a node is a 'Joiner', it applies writesets as fast as possible, without any interruption to the cluster. Once a node switches to the 'Joined' state, it may send flow control messages. This is particularly visible when there are lots of writesets to be sent via IST. Let's say it took 2h to apply them on the joining node. This means it still has to apply 2h worth of traffic - and it can send flow control messages in the meantime. Given that the 'Joined' node is significantly lagging behind (2h), it will definitely send flow control - for the duration of the time needed to apply those 2h worth of traffic, the cluster will run with degraded performance.
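You can watch this happening from any node by checking the flow control counters - wsrep_flow_control_paused, in particular, tells you what fraction of time replication was paused:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';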

2015-10-28 16:42:30 10144 [Note] WSREP: Member 0.0 (172.30.4.191) synced with group.
2015-10-28 16:42:30 10144 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 36402)
2015-10-28 16:42:31 10144 [Note] WSREP: Synchronized with group, ready for connections
2015-10-28 16:42:31 10144 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

Finally, the node caught up on the writeset replication and synced with the cluster.

Galera SST process

In the previous part, we covered the IST process from the error log's point of view. One more important and frequent process is left - State Snapshot Transfer (SST).

151102 15:39:10 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
151102 15:39:10 mysqld_safe Skipping wsrep-recover for 98ed75de-7c05-11e5-9743-de4abc22bd11:71174 pair
151102 15:39:10 mysqld_safe Assigning 98ed75de-7c05-11e5-9743-de4abc22bd11:71174 to wsrep_start_position

...

2015-11-02 15:39:11 1890 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 10,
    members    = 2/3 (joined/total),
    act_id     = 97896,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 98ed75de-7c05-11e5-9743-de4abc22bd11
2015-11-02 15:39:11 1890 [Note] WSREP: Flow-control interval: [28, 28]
2015-11-02 15:39:11 1890 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 97896)
2015-11-02 15:39:11 1890 [Note] WSREP: State transfer required:
    Group state: 98ed75de-7c05-11e5-9743-de4abc22bd11:97896
    Local state: 98ed75de-7c05-11e5-9743-de4abc22bd11:71174
2015-11-02 15:39:11 1890 [Note] WSREP: New cluster view: global state: 98ed75de-7c05-11e5-9743-de4abc22bd11:97896, view# 11: Primary, number of nodes: 3, my index: 2, protocol version 3
2015-11-02 15:39:11 1890 [Warning] WSREP: Gap in state sequence. Need state transfer.
2015-11-02 15:39:11 1890 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.30.4.191' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '1890''''
WSREP_SST: [INFO] Streaming with xbstream (20151102 15:39:11.812)
WSREP_SST: [INFO] Using socat as streamer (20151102 15:39:11.814)
WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20151102 15:39:11.908)
2015-11-02 15:39:12 1890 [Note] WSREP: Prepared SST request: xtrabackup-v2|172.30.4.191:4444/xtrabackup_sst//1
2015-11-02 15:39:12 1890 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2015-11-02 15:39:12 1890 [Note] WSREP: REPL Protocols: 7 (3, 2)
2015-11-02 15:39:12 1890 [Note] WSREP: Service thread queue flushed.
2015-11-02 15:39:12 1890 [Note] WSREP: Assign initial position for certification: 97896, protocol version: 3
2015-11-02 15:39:12 1890 [Note] WSREP: Service thread queue flushed.
2015-11-02 15:39:12 1890 [Note] WSREP: Prepared IST receiver, listening at: tcp://172.30.4.191:4568
2015-11-02 15:39:12 1890 [Note] WSREP: Member 2.0 (172.30.4.191) requested state transfer from '*any*'. Selected 0.0 (172.30.4.220)(SYNCED) as donor.
2015-11-02 15:39:12 1890 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 98007)
2015-11-02 15:39:12 1890 [Note] WSREP: Requesting state transfer: success, donor: 0

Up to this line, everything looks exactly like with Incremental State Transfer (IST). There was a gap in the sequence between the joining node and the rest of the cluster. SST was deemed necessary because the required writeset (71174) wasn't available in the donor's gcache.

WSREP_SST: [INFO] Proceeding with SST (20151102 15:39:13.070)
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20151102 15:39:13.071)
WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20151102 15:39:13.093)
removed ‘/var/lib/mysql/DB187/t1.ibd’
removed ‘/var/lib/mysql/DB187/t2.ibd’
removed ‘/var/lib/mysql/DB187/t3.ibd’

SST basically copies data from the donor to the joining node. For this to work, the current contents of the data directory have to be removed - the SST process takes care of this, and you can follow the progress in the error log, as seen above.

...

removed directory: ‘/var/lib/mysql/DBx236’
WSREP_SST: [INFO] Waiting for SST streaming to complete! (20151102 15:39:30.349)
2015-11-02 15:42:32 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000000 of size 134217728 bytes
2015-11-02 15:48:56 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000001 of size 134217728 bytes
2015-11-02 15:54:59 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000002 of size 134217728 bytes
2015-11-02 16:00:45 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000003 of size 134217728 bytes
2015-11-02 16:06:21 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000004 of size 134217728 bytes
2015-11-02 16:09:32 1890 [Note] WSREP: 0.0 (172.30.4.220): State transfer to 2.0 (172.30.4.191) complete.
2015-11-02 16:09:32 1890 [Note] WSREP: Member 0.0 (172.30.4.220) synced with group.

The SST process takes some time, depending on how long it takes to copy the data. For the duration of the SST, the joining node caches writesets in gcache and, if the in-memory gcache is not large enough, in on-disk gcache page files. Please note that in the last two lines we can see information about the donor finishing the SST and getting back in sync with the rest of the cluster.

WSREP_SST: [INFO] Preparing the backup at /var/lib/mysql//.sst (20151102 16:09:32.187)
WSREP_SST: [INFO] Evaluating innobackupex --no-version-check  --apply-log $rebuildcmd ${DATA} &>${DATA}/innobackup.prepare.log (20151102 16:09:32.188)
2015-11-02 16:12:39 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000005 of size 134217728 bytes
2015-11-02 16:20:52 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000006 of size 134217728 bytes

In this case, the joiner uses xtrabackup as the SST method, therefore the 'prepare' phase has to be completed.

WSREP_SST: [INFO] Moving the backup to /var/lib/mysql/ (20151102 16:26:01.062)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf  --defaults-group=mysqld --no-version-check   --move-back --force-non-empty-directories ${DATA} &>${DATA}/innobackup.move.log (20151102 16:26:01.064)
WSREP_SST: [INFO] Move successful, removing /var/lib/mysql//.sst (20151102 16:26:14.600)

The last phase is to copy the files back from the temporary directory to the MySQL data directory. The rest of the process is similar to what we discussed earlier. The node initializes all plugins and InnoDB, and MySQL gets ready to handle connections. Once that's done, it applies the remaining writesets stored in gcache.

2015-11-02 16:27:06 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000000
2015-11-02 16:27:08 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000001
2015-11-02 16:27:11 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000002
2015-11-02 16:27:15 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000003
2015-11-02 16:28:11 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000004
2015-11-02 16:28:56 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000005
2015-11-02 16:29:17 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000006
2015-11-02 16:29:24 1890 [Note] WSREP: Member 2.0 (172.30.4.191) synced with group.
2015-11-02 16:29:24 1890 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 214214)
2015-11-02 16:29:24 1890 [Note] WSREP: Synchronized with group, ready for connections
2015-11-02 16:29:24 1890 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

After all writesets have been applied, the on-disk gcache files are deleted and the node finally syncs up with the cluster.

Xtrabackup logs for SST

The SST process (if you use xtrabackup as the SST method), as you may have noticed, consists of three stages - streaming data to the joiner, applying logs on the joiner, and moving data from the temporary directory to the final location in the MySQL data directory. To have a complete view of the process, we need to mention some additional log files related to those three stages of the SST. You can find them in the MySQL data directory or in the temporary directory Galera uses to move the data to the joiner.

The first of them (in order of appearance) is innobackup.backup.log - a file created on the donor host which contains all the logs from the xtrabackup process executed to stream the data from donor to joiner. This file can be very useful in dealing with any type of SST issue. In the next blog post, we will show how you can use it to identify the problem.

Another file, located either in the MySQL data directory (e.g., /var/lib/mysql) or in the .sst temporary subdirectory (e.g., /var/lib/mysql/.sst), is innobackup.prepare.log. This file contains logs related to the 'apply log' phase, in which xtrabackup applies transaction logs to the tablespaces (ibd files) copied from the donor. It may happen that the InnoDB data is corrupted and, even though it was possible to stream the data from donor to joiner, it's not possible to apply transactions from the redo log. Such errors will be logged in this file.

The last one is innobackup.move.log - it contains logs of the 'move' stage, in which xtrabackup moves files from the temporary (.sst) directory to the final location in the MySQL data directory.
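When an SST fails, a quick scan of these files usually points at the culprit. For example (paths assume the default data directory; the prepare log may instead live in the .sst subdirectory):

$ grep -i error /var/lib/mysql/innobackup.backup.log    # on the donor
$ grep -i error /var/lib/mysql/innobackup.prepare.log   # on the joiner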

In the next post, we’ll go through real life examples of errors which may impact your Galera Cluster. We will cover a wide range of issues that you can experience in production, and how to deal with them. For that, it is important to know what exactly you should expect in the logs from a ‘normal’ system state - now that you know what normal looks like, it will be much easier to understand what’s not normal :-)



Become a ClusterControl DBA: performance monitoring and health


The blog series for MySQL, MongoDB & PostgreSQL administrators

In the previous two blog posts we covered both deploying the four types of clustering/replication (MySQL/Galera, MySQL Replication, MongoDB & PostgreSQL) and managing/monitoring your existing databases and clusters. So, after reading these first two blog posts, you were able to add your 20 existing replication setups to ClusterControl, expand them and additionally deploy two new Galera clusters while doing a ton of other things. Or maybe you deployed MongoDB and/or PostgreSQL systems. So now, how do you keep them healthy?

That’s exactly what this blog post is about: how to leverage ClusterControl’s performance monitoring and advisors functionality to keep your MySQL, MongoDB and/or PostgreSQL databases and clusters healthy. So how is this done in ClusterControl?

The cluster list

The most important information can already be found in the cluster list: as long as there are no alarms and no hosts are shown to be down, everything is functioning fine. An alarm is raised if a certain condition is met, e.g. a host is swapping, and brings the issue you should investigate to your attention. That means alarms are raised not only during an outage, but also to allow you to proactively manage your databases.

If you logged into ClusterControl and saw a cluster listing like this, you would definitely have something to investigate: one node is down in the Galera cluster, for example, and every cluster has various alarms.

severalnines-blogpost-cluster-list-node-down-alarms.png

Once you click on one of the alarms, you will go to a detailed page on all alarms of the cluster. The alarm details will explain the issue and in most cases also advise the action to resolve the issue.

You can set up your own alarms by creating custom expressions, but that has been deprecated in favor of our new Developer Studio, which allows you to write custom JavaScript and execute it as Advisors. We will get back to this topic later in this post.

The cluster overview - Dashboards

When opening up the cluster overview, we can immediately see the most important performance metrics for the cluster in the tabs. This overview may differ per cluster type as, for instance, Galera has different performance metrics to watch than traditional MySQL, Postgres or MongoDB.

severalnines-blogpost-cluster-overview-performance.png

Both the default overview and the pre-selected tabs are customizable. By clicking on Overview > Dash Settings you are given a dialogue that allows you to define the dashboard.

severalnines-blogpost-cluster-overview-add-dashboard.png

By pressing the plus sign you can add and define your own metrics to graph on the dashboard. In our case we will define a new dashboard featuring the Galera-specific receive and send queues:

severalnines-blogpost-cluster-overview-add-dashboard2.png

This new dashboard should give us good insight into the average queue length of our Galera cluster.

Once you have pressed save, the new dashboard will become available for this cluster:

severalnines-blogpost-cluster-overview-new-dashboard-added.png

Similarly you can do this for PostgreSQL as well by combining the checkpoints with the number of commits:

severalnines-blogpost-performance-overview-pgsql-add-metric.png

severalnines-blogpost-performance-overview-pgsql-add-metric2.png

severalnines-blogpost-performance-overview-pgsql2.png

So as you can see, it is relatively easy to customize your own (default) dashboard.

Cluster overview - Query Monitor

The Query Monitor tab is available for both MySQL and PostgreSQL based setups and consists of three dashboards: Top Queries, Running Queries and Query Histogram.

In the Running Queries dashboard, you will find all current queries that are running. This is basically the equivalent of SHOW PROCESSLIST in ClusterControl.

Top Queries and Query Histogram both rely on the input of the slow query log. To prevent ClusterControl from being too intrusive and the slow query log from growing too large, ClusterControl samples the slow query log by turning it on and off. This capture loop is by default set to 1 second, and long_query_time is set to 0.5 seconds. If you wish to change these settings for your cluster, you can do so via Settings -> Query Monitor.

Top Queries will, as the name says, show the top queries that were sampled. You can sort them on various columns: for instance, frequency, average execution time or total execution time.

severalnines-blogpost-top-queries-overview.png

You can get more details about a query by selecting it; this will present the query execution plan (if available) and optimization hints/advisories. If necessary, you can also have the details of a query emailed to you by clicking the “email query” button.

The Query Histogram is similar to Top Queries, but additionally allows you to filter the queries per host and compare them over time.

Cluster overview - Operations

Similar to the PostgreSQL and MySQL systems, MongoDB clusters have an Operations overview, which resembles Running Queries. This overview is similar to issuing the db.currentOp() command within MongoDB.

severalnines-blogpost-mongodb-current-ops.png

Cluster overview - Performance

MySQL / Galera

The performance tab is probably the best place to find the overall performance and health of your clusters. For MySQL and Galera it consists of an Overview page, the Advisors, status/variables overviews, the Schema Analyzer and the Transaction log.

The Overview page will give you a graph overview of the most important metrics in your cluster. This is, obviously, different per cluster type. Eight metrics have been set by default, but you can easily set your own - up to 20 graphs if needed.

severalnines-blogpost-define-graphs.png

The Advisors section is one of the key features of ClusterControl: Advisors are scripted checks that can be run on demand. An advisor can evaluate almost any fact known about the host and/or cluster, give its opinion on their health, and even advise on how to resolve issues or improve your hosts!

severalnines-blogpost-mysql-advisors.png

The best part is yet to come: you can create your own checks in the Developer Studio (Cluster -> Manage -> Developer Studio), run them on a regular interval and use them again in the Advisors section. We blogged about this new feature earlier this year.

We will skip the status/variables overviews of MySQL and Galera, as these are useful for reference but not for this blog post: it is good enough that you know they are here. It is also worth mentioning that the Status Time Machine can help you track specific status variables and see how they change over time.

Now suppose your database is growing and you want to know how fast it grew in the past week. You can actually keep track of the growth of both data and index sizes right from within ClusterControl.

And next to the total growth on disk it can also report back the top 25 largest schemas.

Another important feature is the Schema Analyzer within ClusterControl.

ClusterControl will analyze your schemas and look for redundant indexes, MyISAM tables and tables without a primary key. Of course, it is entirely up to you whether to keep a table without a primary key - some application might have created it this way - but at least it is great to get the advice here for free. The Schema Analyzer even constructs the necessary ALTER statement to fix the problem.
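For instance, for a hypothetical table shop.orders lacking a primary key, the suggested fix would be along these lines (the table and column names here are made up for illustration):

ALTER TABLE shop.orders ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;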

PostgreSQL

For PostgreSQL the Advisors, DB Status and DB Variables can be found here.

severalnines-blogpost-postgresql-advisors.png

MongoDB

For MongoDB, the Mongo Stats and Performance overview can be found under the Performance tab. The Mongo Stats is an overview of the output of mongostat, and the Performance overview gives a good graphical overview of the Mongo opcounters:

severalnines-blogpost-mongodb-performance.png

Final thoughts

We showed you how to keep your eyeballs on the most important monitoring and health checking features of ClusterControl. Obviously this is only the beginning of the journey, as we will soon start another blog series about the Developer Studio capabilities and how you can make the most of your own checks. Also keep in mind that our support for MongoDB and PostgreSQL is not as extensive as our MySQL toolset, but we are continuously improving on this.

You may ask yourself why we have skipped over the performance monitoring and health checks of HAProxy and MaxScale. We did that deliberately, as the blog series has so far covered only the deployment of clusters and not the deployment of HA components. So that’s the subject we'll cover next time.


nginx as Database Load Balancer for MySQL or MariaDB Galera Cluster


Nginx is well-known for its ability to act as a reverse-proxy with a small memory footprint. It usually sits in the front-end web tier to redirect connections to available backend services, provided these pass some health checks. Using a reverse-proxy is common when you are running a critical application or service that requires high availability. It also distributes the load equally among the backend services.

Recently, nginx 1.9 introduced support for TCP load balancing - similar to what HAProxy is capable of. The one major drawback is that it does not support advanced backend health checks. These are required when running MySQL Galera Cluster, as we’ll explain in the next section. Note that this limitation is removed in the paid-only edition called NGINX Plus.

In this blog post, we are going to play around with nginx as a reverse-proxy for MySQL Galera Cluster services to achieve higher availability. We had a Galera cluster up and running, deployed using ClusterControl on CentOS 7.1. We are going to install nginx on a fresh new host, as illustrated in the following diagram:

Backend Health Checks

With the existence of synchronously-replicated clusters like Galera or NDB, it has become quite popular to use a TCP reverse-proxy as load balancer. All MySQL nodes are treated as equals, since one can read from or write to any node. No read-write splitting is required, as you would need with MySQL master-slave replication. Nginx is not database-aware, so some additional steps are required to configure the health checks for Galera Cluster backends so that they return something understandable.

If you are running HAProxy, a health check script on each MySQL server in the load balancing set should be able to return an HTTP response status. For example, if the backend MySQL server is healthy, the script will return a simple HTTP 200 OK status code. Otherwise, the script will return 503 Service Unavailable. HAProxy can then update its routing table to exclude the problematic backend servers from the load balancing set and redirect the incoming connections only to the available servers. This is well explained in this webinar on HAProxy. Unfortunately, this health check setup relies on xinetd to daemonize and listen on a custom port (9200), which is not configurable in nginx yet.
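For reference, such a health check is typically wired up through xinetd with a definition along the lines of the sketch below - the server path depends on where the check script lives on your system, and the mysqlchk service also has to be mapped to port 9200 in /etc/services:

# /etc/xinetd.d/mysqlchk
service mysqlchk
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 9200
    wait            = no
    user            = nobody
    server          = /usr/bin/clustercheck
    log_on_failure  += USERID
    only_from       = 0.0.0.0/0
    per_source      = UNLIMITED
}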

The following flowchart illustrates the process to report the health of a Galera node for multi-master setup:

At the time of this writing, NGINX Plus (the paid release of nginx) also supports advanced backend health checks, but it does not support a custom backend monitoring port.

Using clustercheck-iptables

To overcome this limitation, we’ve created a health check script called clustercheck-iptables. It is a background script that checks the availability of a Galera node and adds a redirection port using iptables if the Galera node is healthy (instead of returning an HTTP response). This allows other TCP load balancers with limited health check capabilities to monitor the backend Galera nodes correctly. Other than HAProxy, you can now use your favorite reverse proxy like nginx (>1.9), IPVS, keepalived, piranha, distributor, balance or pen to load balance requests across Galera nodes.

So how does it work? The script performs a health check every second on the Galera node it runs on. If the node is healthy (wsrep_local_state_comment=Synced and read_only=OFF, or wsrep_local_state_comment=Donor and wsrep_sst_method=xtrabackup/xtrabackup-v2), a port redirection will be set up using iptables (by default, port 3308 redirects to 3306) with the following command:

$ iptables -t nat -A PREROUTING -s 0.0.0.0/0 -p tcp --dport 3308 -j REDIRECT --to-ports 3306

Otherwise, the above rule will be removed from the iptables PREROUTING chain. On the load balancer, define the designated redirection port (3308) instead of the MySQL port. If the backend node is "unhealthy", port 3308 will be unreachable because the corresponding iptables rule has been removed on the database node, and the load balancer will then exclude it from the load balancing set.

Let’s install the script and see how it works in practice.

1. On the database servers, run the following commands to install the script:

$ git clone https://github.com/ashraf-s9s/clustercheck-iptables
$ cp clustercheck-iptables/mysqlchk_iptables /usr/local/sbin

2. By default, the script will use a MySQL user called “mysqlchk_user” with password “mysqlchk_password”. We need to ensure this MySQL user exists with the corresponding password before the script is able to perform health checks. Run the following DDL statements on one of the DB nodes (Galera should replicate the statement to the other nodes):

mysql> GRANT PROCESS ON *.* TO 'mysqlchk_user'@'localhost' IDENTIFIED BY 'mysqlchk_password';
mysql> FLUSH PRIVILEGES;

** If you would like to run as different user/password, specify -u and/or -p argument in the command line. See examples on the Github page.

3. This script requires iptables to be running. In this example, we ran it on CentOS 7, which comes with firewalld by default. We have to install iptables-services beforehand:

$ yum install -y iptables-services
$ systemctl enable iptables
$ systemctl start iptables

Then, setup basic rules for MySQL Galera Cluster so iptables won’t affect the database communication:

$ iptables -I INPUT -m tcp -p tcp --dport 3306 -j ACCEPT
$ iptables -I INPUT -m tcp -p tcp --dport 3308 -j ACCEPT
$ iptables -I INPUT -m tcp -p tcp --dport 4444 -j ACCEPT
$ iptables -I INPUT -m tcp -p tcp --dport 4567:4568 -j ACCEPT
$ service iptables save
$ service iptables restart

4. Once the basic rules are added, verify them with the following command:

$ iptables -L -n

5. Test mysqlchk_iptables:

$ mysqlchk_iptables -t
Detected variables/status:
wsrep_local_state: 4
wsrep_sst_method: xtrabackup-v2
read_only: OFF

[11-11-15 08:33:49.257478192] [INFO] Galera Cluster Node is synced.

6. Looks good. Now we can daemonize the health check script:

$ mysqlchk_iptables -d
/usr/local/sbin/mysqlchk_iptables started with PID 66566.

7. Our PREROUTING rules will look something like this:

$ iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
REDIRECT   tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:3308 redir ports 3306

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination

8. Finally, add the health check command into /etc/rc.local so it starts automatically on boot:

echo '/usr/local/sbin/mysqlchk_iptables -d'>> /etc/rc.local

On some distributions, rc.local must be executable for it to run scripts on boot. Ensure this with:

$ chmod +x /etc/rc.local

Repeat the above steps (except step #2) for the remaining DB nodes. Now, we have configured our backend health checks correctly. Before setting up our MySQL load balancer in the next section, verify from the application side that you can connect to MySQL through port 3308 on each node.
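A minimal connectivity test against the redirected port could look like this (the MySQL user here is a placeholder; 192.168.55.201 is one of our DB nodes):

$ mysql -h 192.168.55.201 -P 3308 -u myuser -p -e "SELECT @@hostname"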

Setting Up nginx as MySQL Load Balancer

1. On the load balancer node, install the required packages:

$ yum -y install pcre-devel zlib-devel

2. Install nginx 1.9 from source with TCP proxy module (--with-stream):

$ wget http://nginx.org/download/nginx-1.9.6.tar.gz
$ tar -xzf nginx-1.9.6.tar.gz
$ cd nginx-1.9.6
$ ./configure --with-stream
$ make && make install

3. Add the following lines into nginx configuration file located at /usr/local/nginx/conf/nginx.conf:

stream {
    upstream stream_backend {
        zone tcp_servers 64k;
        server 192.168.55.201:3308;
        server 192.168.55.202:3308;
        server 192.168.55.203:3308;
    }

    server {
        listen 3307;
        proxy_pass stream_backend;
        proxy_connect_timeout 1s;
    }
}

4. Start nginx:

$ /usr/local/nginx/sbin/nginx

5. Verify that nginx is listening on port 3307, as we have defined. MySQL connections will come in via this port on the load balancer node and be redirected to an available backend on port 3308. The respective DB node then redirects the connection to port 3306, where MySQL is listening:

$ netstat -tulpn | grep 3307
tcp        0      0 0.0.0.0:3307            0.0.0.0:*               LISTEN      5348/nginx: master

Great. We have now set up our nginx instance as MySQL Galera Cluster load balancer. Let’s test it out!

Testing

Let’s perform some tests to verify that our Galera cluster is correctly load balanced. We performed various exercises to see nginx and the Galera cluster work in action. We performed the following actions consecutively:

  1. Set read_only=ON on g1.local, then back to read_only=OFF.
  2. Kill the mysql service on g1.local and force an SST at startup.
  3. Kill the other two database nodes so g1.local becomes non-primary.
  4. Bootstrap g1.local from non-primary state.
  5. Rejoin the other 2 nodes back to the cluster.

The screencast below contains several terminal outputs, explained as follows:

  • Terminal 1 (top left): iptables PREROUTING chain output
  • Terminal 2 (top right): MySQL error log for g1.local
  • Terminal 3 (middle left): Application output when connecting to nginx load balancer. It reports date, hostname, wsrep_last_committed and wsrep_local_state_comment
  • Terminal 4 (middle right): Output of /var/log/mysqlchk_iptables
  • Terminal 5 (bottom left): Output of read_only and wsrep_sst_method on g1.local
  • Terminal 6 (bottom right): Action console

The following asciinema recording shows the result:

 

Summary

Galera node health checks by a TCP load balancer used to be limited to HAProxy, due to its ability to use a custom port for backend health checks. With this script, it’s now possible for any TCP load balancer/reverse-proxy to monitor Galera nodes correctly.

You are welcome to fork, pull request and extend the capabilities of this script.


Become a ClusterControl DBA: Adding Existing Databases and clusters


In our previous blog post we covered the deployment of four types of clustering/replication: MySQL Galera, MySQL master-slave replication, PostgreSQL replication set and MongoDB replication set. This should enable you to create new clusters with great ease, but what if you already have 20 replication setups deployed and wish to manage them with ClusterControl?

This blog post will cover adding existing infrastructure components for these four types of clustering/replication to ClusterControl and how to have ClusterControl manage them.

Adding an existing Galera cluster to ClusterControl

Adding an existing Galera cluster to ClusterControl requires a MySQL user with the proper grants and an SSH user that is able to log in (without password) from the ClusterControl node to your existing databases and clusters.
 
Install ClusterControl on a separate VM. Once it is up, open the dialogue for adding an existing cluster. All you have to do is add one of the Galera nodes and ClusterControl will figure out the rest:

severalnines-blogpost-add-existing-galera-cluster.png

After this, ClusterControl will connect to the host behind the scenes, detect all the necessary details for the full cluster and register the cluster in the overview.

Adding an existing MySQL master-slave to ClusterControl

Adding an existing MySQL master-slave topology requires a bit more work than adding a Galera cluster. While ClusterControl is able to extract the necessary information by itself for Galera, in the case of master-slave you need to specify every host within the replication setup.

severalnines-blogpost-add-existing-mysql-master-slave.png

After this, ClusterControl will connect to every host, see if they are part of the same topology and register them as part of one cluster (or server group) in the GUI.

Adding an existing PostgreSQL replication set to ClusterControl

Similar to adding the MySQL master-slave setup above, a PostgreSQL replication set also requires you to fill in all hosts within the same replication set.

severalnines-blogpost-add-existing-postgresql-replication-set.png

After this, ClusterControl will connect to every host, see if they are part of the same topology and register them as part of the same group. 

Adding an existing MongoDB replica set to ClusterControl

Adding an existing MongoDB replica set is just as easy as Galera: just one of the hosts in the replica set needs to be specified with its credentials and ClusterControl will automatically discover the other nodes in the replica set.

severalnines-blogpost-add-existing-mongodb-replica-set.png

Expanding your existing infrastructure

After adding the existing databases and clusters, they now have become manageable via ClusterControl and thus we can scale out our clusters.

For MySQL, MongoDB and PostgreSQL replication sets, this can easily be achieved in the same way we showed in our previous blog post: simply add a node and ClusterControl will take care of the rest.

severalnines-blogpost1-adding-postgresql-slave1.png

For Galera, there is a bit more choice. The most obvious option is to add a (Galera) node to the cluster by simply choosing “add node” in the cluster list or cluster overview. Expanding your Galera cluster this way should happen in increments of two, to ensure your cluster can always retain a majority during a split-brain situation.

Alternatively, you could add a replication slave, thus creating an asynchronous slave of your synchronous cluster, which looks like this:

magtid_arch_full.png

Adding a slave node blindly under one of the Galera nodes can be dangerous: if that node goes down, the slave won’t receive updates from its master anymore. We blogged about this paradigm earlier and you can read how to solve this in this blog post.

Final thoughts

We showed you how easy it is to add existing databases and clusters to ClusterControl - you can literally add clusters within minutes. So nothing should hold you back from using ClusterControl to manage your existing infrastructure. If you have a large infrastructure, the addition of ClusterControl will give you a better overview and save time in troubleshooting and maintaining your clusters.

Now the challenge is how to leverage ClusterControl to keep track of key performance indicators, show the global health of your clusters and proactively alert you in time when something is predicted to happen. And that’s the subject we'll cover next time.



Become a MySQL DBA blog series - Understanding the MySQL Error Log


We have yet to see software that runs perfectly, without any issues. MySQL is no exception. It’s not the software’s fault - we need to be clear about that. We use MySQL in different places, on different hardware and within different environments. It’s also highly configurable. All those features make it a great product, but they come with a price - sometimes some settings won’t work correctly under certain conditions. It is also pretty easy to make simple human mistakes, like typos in the MySQL configuration. Luckily, MySQL provides us with the means to understand what is wrong, through the error log. In this blog, we’ll see how to read the information in the error log.

This is the fifteenth installment in the ‘Become a MySQL DBA’ blog series.

Error log on MySQL - clean startup and shutdown

MySQL’s error log may contain a lot of information about the different issues you may encounter. Let’s start by checking what a ‘clean’ start of MySQL looks like - it will make it easier to spot any anomalies later.

At first, all plugins (and storage engines, which work as plugins in MySQL 5.6) are initiated. If something is not right, you’ll see errors at this stage.

2015-10-26 19:35:20 13762 [Note] Plugin 'FEDERATED' is disabled.

Next we can see a significant part related to InnoDB initialization.

2015-10-26 19:35:20 13762 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-26 19:35:20 13762 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-26 19:35:20 13762 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-26 19:35:20 13762 [Note] InnoDB: Memory barrier is not used
2015-10-26 19:35:20 13762 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-26 19:35:20 13762 [Note] InnoDB: Using Linux native AIO
2015-10-26 19:35:20 13762 [Note] InnoDB: Using CPU crc32 instructions
2015-10-26 19:35:20 13762 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-26 19:35:21 13762 [Note] InnoDB: Completed initialization of buffer pool
2015-10-26 19:35:21 13762 [Note] InnoDB: Highest supported file format is Barracuda.
2015-10-26 19:35:21 13762 [Note] InnoDB: 128 rollback segment(s) are active.
2015-10-26 19:35:21 13762 [Note] InnoDB: Waiting for purge to start
2015-10-26 19:35:21 13762 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.26-74.0 started; log sequence number 710963181

In the next phase, authentication plugins are initiated (if they are configured correctly).

2015-10-26 19:35:21 13762 [Note] RSA private key file not found: /var/lib/mysql//private_key.pem. Some authentication plugins will not work.
2015-10-26 19:35:21 13762 [Note] RSA public key file not found: /var/lib/mysql//public_key.pem. Some authentication plugins will not work.

At the end you’ll see information about MySQL binding to the configured IP address and port. The event scheduler is also initialized. Finally, you’ll see the ‘ready for connections’ message, indicating that MySQL started correctly.

2015-10-26 19:35:21 13762 [Note] Server hostname (bind-address): '*'; port: 33306
2015-10-26 19:35:21 13762 [Note] IPv6 is available.
2015-10-26 19:35:21 13762 [Note]   - '::' resolves to '::';
2015-10-26 19:35:21 13762 [Note] Server socket created on IP: '::'.
2015-10-26 19:35:21 13762 [Warning] 'proxies_priv' entry '@ root@ip-172-30-4-23' ignored in --skip-name-resolve mode.
2015-10-26 19:35:21 13762 [Note] Event Scheduler: Loaded 2 events
2015-10-26 19:35:21 13762 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.26-74.0'  socket: '/var/run/mysqld/mysqld.sock'  port: 33306  Percona Server (GPL), Release 74.0, Revision 32f8dfd

What does the shutdown process look like? Let’s follow it step by step:

2015-10-26 19:35:13 12955 [Note] /usr/sbin/mysqld: Normal shutdown

This line is very important - MySQL can be stopped in many ways: a user can use the init scripts, ‘mysqladmin’ can be used to execute the SHUTDOWN command, and it’s also possible to send a SIGTERM to MySQL, initiating a shutdown of the database. At times you’ll be investigating why a MySQL instance stopped - this line is always an indication that someone (or something) triggered a clean shutdown. No crash happened, and MySQL wasn’t killed either.

2015-10-26 19:35:13 12955 [Note] Giving 12 client threads a chance to die gracefully
2015-10-26 19:35:13 12955 [Note] Event Scheduler: Purging the queue. 2 events
2015-10-26 19:35:13 12955 [Note] Shutting down slave threads
2015-10-26 19:35:15 12955 [Note] Forcefully disconnecting 6 remaining clients
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 37  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 53  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 38  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 39  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 44  user: 'cmon'
2015-10-26 19:35:15 12955 [Warning] /usr/sbin/mysqld: Forcing close of thread 47  user: 'cmon'

In the above, MySQL is closing the remaining connections - it allows them to finish on their own, but if they do not close, they will be forcefully terminated.

2015-10-26 19:35:15 12955 [Note] Binlog end
2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'partition'
2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'ARCHIVE'
...

Here MySQL is shutting down all plugins that were enabled in its configuration. There are many of them, so we removed some of the output for better visibility.

2015-10-26 19:35:15 12955 [Note] Shutting down plugin 'InnoDB'
2015-10-26 19:35:15 12955 [Note] InnoDB: FTS optimize thread exiting.
2015-10-26 19:35:15 12955 [Note] InnoDB: Starting shutdown...
2015-10-26 19:35:17 12955 [Note] InnoDB: Shutdown completed; log sequence number 710963181

One of them is InnoDB - this step may take a while, as a clean InnoDB shutdown can take some time on a busy server, even with InnoDB fast shutdown enabled (the default setting).

2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'MyISAM'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'MRG_MYISAM'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'CSV'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'MEMORY'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'sha256_password'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'mysql_old_password'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'mysql_native_password'
2015-10-26 19:35:17 12955 [Note] Shutting down plugin 'binlog'
2015-10-26 19:35:17 12955 [Note] /usr/sbin/mysqld: Shutdown complete

The shutdown process concludes with ‘Shutdown complete’ message.

Configuration errors

Let’s start with a basic issue - a configuration error. It’s a really common problem. Sometimes we make a typo in a variable name while editing my.cnf. MySQL parses its configuration files when it starts; if something is wrong, it will refuse to run. Let’s take a look at some examples:

2015-10-27 11:20:05 18858 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27 11:20:05 18858 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:20:05 18858 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:20:05 18858 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:20:05 18858 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:20:05 18858 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:20:05 18858 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:20:05 18858 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:20:05 18858 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-27 11:20:05 18858 [Note] InnoDB: Completed initialization of buffer pool
2015-10-27 11:20:05 18858 [Note] InnoDB: Highest supported file format is Barracuda.
2015-10-27 11:20:05 18858 [Note] InnoDB: 128 rollback segment(s) are active.
2015-10-27 11:20:05 18858 [Note] InnoDB: Waiting for purge to start
2015-10-27 11:20:06 18858 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.26-74.0 started; log sequence number 773268083
2015-10-27 11:20:06 18858 [ERROR] /usr/sbin/mysqld: unknown variable '--max-connections=512'
2015-10-27 11:20:06 18858 [ERROR] Aborting

As you can see from the above, we have a clear copy-paste error regarding the max_connections variable. MySQL doesn’t accept the ‘--’ prefix in my.cnf, so if you are copying settings from the ‘ps’ output or from the MySQL documentation, please keep this in mind. Many things can go wrong, but the important part is that the ‘unknown variable’ error always points you to my.cnf and to some kind of mistake that MySQL can’t understand.
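One way to catch such typos before restarting the server is to let mysqld parse the configuration without actually starting up - with a broken my.cnf it will print the same ‘unknown variable’ error and exit (exact behavior may vary slightly between versions):

$ /usr/sbin/mysqld --help --verbose > /dev/null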

Another problem can be related to misconfigurations. Let’s say we over-allocated memory on our server. Here’s what you might see:

2015-10-27 11:24:41 31325 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27 11:24:41 31325 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:24:41 31325 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:24:41 31325 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:24:41 31325 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:24:41 31325 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:24:41 31325 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:24:41 31325 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:24:41 31325 [Note] InnoDB: Initializing buffer pool, size = 512.0G
InnoDB: mmap(70330089472 bytes) failed; errno 12
2015-10-27 11:24:41 31325 [ERROR] InnoDB: Cannot allocate memory for the buffer pool
2015-10-27 11:24:41 31325 [ERROR] Plugin 'InnoDB' init function returned error.
2015-10-27 11:24:41 31325 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2015-10-27 11:24:41 31325 [ERROR] Unknown/unsupported storage engine: InnoDB
2015-10-27 11:24:41 31325 [ERROR] Aborting

As you can clearly see, we’ve tried to allocate 512G of memory, which was not possible on this host. InnoDB is required for MySQL to start, so the failure to initialize InnoDB prevented MySQL from starting.
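
A quick, hedged pre-flight check before (re)starting MySQL - compare the configured buffer pool with the memory actually available (paths assume a typical Linux setup with configuration under /etc/mysql):

$ grep -R innodb_buffer_pool_size /etc/mysql/
$ free -m    # the configured pool must fit comfortably in available RAM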

Permission errors

Below are three examples of startup failures caused by permission errors.

/usr/sbin/mysqld: File './binlog.index' not found (Errcode: 13 - Permission denied)
2015-10-27 11:31:40 11469 [ERROR] Aborting

This one is pretty self-explanatory. As stated, MySQL couldn’t open the index file for its binary logs due to a ‘permission denied’ error. This is a hard stop for MySQL: it has to be able to read from and write to its binary logs.
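
The usual fix is ownership rather than missing files - a hedged sketch, assuming the default data directory and the mysql system user:

$ ls -la /var/lib/mysql/binlog.index     # check the current owner and mode
$ sudo chown -R mysql:mysql /var/lib/mysql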

2015-10-27 11:32:46 13601 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27 11:32:46 13601 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:32:46 13601 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:32:46 13601 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:32:46 13601 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:32:46 13601 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:32:46 13601 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:32:46 13601 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:32:46 13601 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-27 11:32:46 13601 [Note] InnoDB: Completed initialization of buffer pool
2015-10-27 11:32:46 13601 [ERROR] InnoDB: ./ibdata1 can't be opened in read-write mode
2015-10-27 11:32:46 13601 [ERROR] InnoDB: The system tablespace must be writable!
2015-10-27 11:32:46 13601 [ERROR] Plugin 'InnoDB' init function returned error.
2015-10-27 11:32:46 13601 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2015-10-27 11:32:46 13601 [ERROR] Unknown/unsupported storage engine: InnoDB
2015-10-27 11:32:46 13601 [ERROR] Aborting

The above is another permission issue; this time the problem is related to the ibdata1 file - the shared system tablespace used by InnoDB to store its internal data (the InnoDB dictionary and, by default, undo logs). Even if you use innodb_file_per_table, which is the default setting in MySQL 5.6, InnoDB still requires this file so it can initialize.

2015-10-27 11:35:21 17826 [Note] Plugin 'FEDERATED' is disabled.
/usr/sbin/mysqld: Can't find file: './mysql/plugin.frm' (errno: 13 - Permission denied)
2015-10-27 11:35:21 17826 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
2015-10-27 11:35:21 17826 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-27 11:35:21 17826 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-27 11:35:21 17826 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-10-27 11:35:21 17826 [Note] InnoDB: Memory barrier is not used
2015-10-27 11:35:21 17826 [Note] InnoDB: Compressed tables use zlib 1.2.8
2015-10-27 11:35:21 17826 [Note] InnoDB: Using Linux native AIO
2015-10-27 11:35:21 17826 [Note] InnoDB: Using CPU crc32 instructions
2015-10-27 11:35:21 17826 [Note] InnoDB: Initializing buffer pool, size = 512.0M
2015-10-27 11:35:21 17826 [Note] InnoDB: Completed initialization of buffer pool
2015-10-27 11:35:21 17826 [Note] InnoDB: Highest supported file format is Barracuda.
2015-10-27 11:35:21 7fa02e97d780  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Could not find a valid tablespace file for 'mysql/innodb_index_stats'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Tablespace open failed for '"mysql"."innodb_index_stats"', ignored.
2015-10-27 11:35:21 7fa02e97d780  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Could not find a valid tablespace file for 'mysql/innodb_table_stats'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2015-10-27 11:35:21 17826 [ERROR] InnoDB: Tablespace open failed for '"mysql"."innodb_table_stats"', ignored.
2015-10-27 11:35:21 7fa02e97d780  InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to
InnoDB: the directory.

Here we have an example of permission issues related to the system schema (the ‘mysql’ schema). As you can see, InnoDB reported that it does not have the access rights to open tables in this schema. Again, this is a serious problem and a hard stop for MySQL.
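
To pinpoint exactly where access breaks, it may help to walk the whole path - a hedged example using namei from util-linux, assuming the default data directory:

$ namei -l /var/lib/mysql/mysql/plugin.frm    # prints owner and mode of every path component
$ sudo chown -R mysql:mysql /var/lib/mysql/mysql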

Out of memory errors

Another set of frequent issues is related to MySQL servers running out of memory. It can manifest itself in many different ways, but the most popular is this one:

Killed
151027 12:18:58 mysqld_safe Number of processes running now: 0

mysqld_safe is an ‘angel’ process which monitors mysqld and restarts it when it dies. That’s true for MySQL, but not for recent Galera nodes - in that case it won’t restart MySQL, and you’ll see the following sequence:

Killed
151027 12:18:58 mysqld_safe Number of processes running now: 0
151027 12:18:58 mysqld_safe WSREP: not restarting wsrep node automatically
151027 12:18:58 mysqld_safe mysqld from pid file /var/lib/mysql/mysql.pid ended

Another way a memory issue may show up is through the following messages in the error log:

InnoDB: mmap(3145728 bytes) failed; errno 12
InnoDB: mmap(3145728 bytes) failed; errno 12

What’s important to keep in mind is that error codes can easily be checked using the ‘perror’ utility. In this case, it makes perfectly clear what happened:

$ perror 12
OS error code  12:  Cannot allocate memory

If you suspect a memory problem - or, we’d even say, whenever you encounter unexpected MySQL crashes - you can also run dmesg to get more data. If you see output like below, you can be certain that the problem was caused by the OOM Killer terminating MySQL:

[ 4165.802544] Out of memory: Kill process 8143 (mysqld) score 938 or sacrifice child
[ 4165.808329] Killed process 8143 (mysqld) total-vm:5101492kB, anon-rss:3789388kB, file-rss:0kB
[ 4166.226410] init: mysql main process (8143) killed by KILL signal
[ 4166.226437] init: mysql main process ended, respawning
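
If the dmesg ring buffer has already rotated, the same evidence usually survives in the syslog - a hedged one-liner covering both places:

$ dmesg | grep -iE 'out of memory|killed process'
$ grep -i oom /var/log/messages /var/log/syslog 2>/dev/null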


InnoDB crashes

InnoDB, most of the time, is a very solid and durable storage engine. Unfortunately, under some circumstances, data may become corrupted. Usually it’s related either to misconfiguration (for example, disabling the doublewrite buffer) or to faulty hardware (bad memory modules, an unstable I/O layer). In those cases InnoDB may crash. Crashes can also be triggered by InnoDB bugs - it’s not common, but it happens from time to time. No matter the reason for a crash, the error log usually looks similar. Let’s go over a live example.

2015-10-13 15:08:06 7f53f0658700  InnoDB: Assertion failure in thread 139998492198656 in file btr0pcur.cc line 447
InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == buf_block_get_page_no(btr_pcur_get_block(cursor))

InnoDB code is full of sanity checks - assertions are very frequent and, if one of them fails, InnoDB intentionally crashes MySQL. At the beginning you’ll see information about the exact location of the assertion which failed. This gives you information about what may have happened - MySQL source code is available on the internet and it’s pretty easy to download the code related to your particular version. It may sound scary but it’s not - MySQL code is properly documented and it’s not a problem to figure out at least what kind of activity triggers your issue.

InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.

InnoDB logging tries to be as helpful as possible - as you can see, there’s information about the InnoDB recovery process. Under some circumstances it may allow you to start MySQL and dump the InnoDB data. It’s “almost a last resort” - one step further and you’ll end up digging through InnoDB files with a hex editor.
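
A hedged sketch of forced recovery - start at the lowest level and raise it only if MySQL still won’t start; per the MySQL documentation, values of 4 or greater can permanently corrupt data files, so treat this strictly as a dump-and-rebuild path:

[mysqld]
# temporary! remove as soon as the data has been dumped
innodb_force_recovery = 1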

13:08:06 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=67108864
read_buffer_size=131072
max_used_connections=36
max_threads=514
thread_count=27
connection_count=26
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 269997 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

In the next step you’ll see some of the status counters printed. They are supposed to give you some idea about the configuration - maybe you configured MySQL to use too many resources? Once you go over this data, we reach the backtrace. This piece of information guides you through the stack and shows what kind of calls were being executed when the crash happened. Let’s check what exactly happened in this particular case.

Thread pointer: 0xa985960
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f53f0657d40 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2c)[0x8cd37c]
/usr/sbin/mysqld(handle_fatal_signal+0x461)[0x6555a1]
/lib64/libpthread.so.0(+0xf710)[0x7f540f978710]
/lib64/libc.so.6(gsignal+0x35)[0x7f540dfd4625]
/lib64/libc.so.6(abort+0x175)[0x7f540dfd5e05]
/usr/sbin/mysqld[0xa0277b]
/usr/sbin/mysqld[0x567f88]
/usr/sbin/mysqld[0x99afde]
/usr/sbin/mysqld[0x8eb926]
/usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x5d)[0x59a6ed]
/usr/sbin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x20)[0x800cd0]
/usr/sbin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP10SQL_I_ListI8st_orderEyy+0xc88)[0x8128c8]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1bc5)[0x6d6d15]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x5a8)[0x6dc228]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x108c)[0x6dda4c]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x162)[0x6ab002]
/usr/sbin/mysqld(handle_one_connection+0x40)[0x6ab0f0]
/usr/sbin/mysqld(pfs_spawn_thread+0x143)[0xaddd73]
/lib64/libpthread.so.0(+0x79d1)[0x7f540f9709d1]
/lib64/libc.so.6(clone+0x6d)[0x7f540e08a8fd]

We can see that the problem was triggered while MySQL was executing a DELETE command. Records were read sequentially and a full table scan was performed (the ha_rnd_next handler call). Combined with the information from the beginning:

InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == buf_block_get_page_no(btr_pcur_get_block(cursor))

we can tell the crash happened while InnoDB was scanning through index pages (the assertion was triggered in the btr_pcur_move_to_next_page function).

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (a991d40): DELETE FROM cmon_stats WHERE cid = 2 AND ts < 1444482484
Connection ID (thread ID): 5
Status: NOT_KILLED

Finally, we can see (though it’s not always printed) which query triggered the assertion. In our case we can confirm that it was indeed a DELETE query which caused the problem.

You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
151013 15:08:06 mysqld_safe Number of processes running now: 0

As we said, MySQL is usually pretty verbose when it comes to the error log. Sometimes, though, things go so wrong that MySQL can’t really collect much information. It still gives a pretty clear indication that something went very wrong (signal 11). Please take a look at the following real-life example:

22:31:40 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=33554432
read_buffer_size=131072
max_used_connections=10
max_threads=202
thread_count=9
connection_count=7
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 113388 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x1fb7050
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Segmentation fault (core dumped)
150119 23:48:08 mysqld_safe Number of processes running now: 0

As you can see, we have clear information that MySQL crashed, but no additional detail. Segmentation faults can also be logged in the kernel’s diagnostic messages. In this particular case, dmesg returned:

[370591.200799] mysqld[5462]: segfault at 51 ip 00007fe624e3e688 sp 00007fe4f4099c30 error 4 in libgcc_s.so.1[7fe624e2f000+16000]

It’s not much, as you can see, but such information can prove very useful if you decide to file a bug against the software provider, be it Oracle, MariaDB, Percona or Codership. It can also be helpful in finding relevant bugs that have already been reported.

It’s not always possible to fix the problem based only on the information written in the error log. It’s definitely a good place to start debugging problems in your database, though. If you are hitting some kind of very common, simple misconfiguration, the information you find in the error log should be enough to pinpoint a cause and find a solution. If we are talking about serious MySQL crashes, things are definitely different - most likely you won’t be able to fix the issue on your own (unless you are a developer familiar with MySQL code, that is). On the other hand, this data may be more than enough to identify the culprit and locate existing bug reports which cover it. Such bug reports contain lots of discussion and tips or workarounds - it’s likely that you’ll be able to implement one of them. Even the information that a given bug was fixed in version x.y.z is useful - you’ll be able to verify it and, if it does solve your problem, plan for an upgrade. In the worst case scenario, you should have enough data to create a bug report on your own.

In the next post in the series, we are going to take a guided tour through the error log (and other logs) you may find in your Galera cluster. We’ll cover what kind of issues you may encounter and how to solve them.


s9s Tools and Resources: 'Become a MySQL DBA' series, ClusterControl 1.2.11 release, and more!


Check Out Our Latest Technical Resources for MySQL, MariaDB, PostgreSQL and MongoDB

This blog is packed with all the latest resources and tools we’ve recently published! Please do check it out and let us know if you have any comments or feedback.

Product Announcements & Resources

1211_banner.png

ClusterControl 1.2.11 Release
Last month, we were pleased to announce our best release yet for Postgres users, as well as to introduce key new features for our MySQL/MariaDB users, such as support for MaxScale, an open-source, database-centric proxy that works with MySQL, MariaDB and Percona Server.

View all the related resources here:

Support for MariaDB’s MaxScale
You can get started straight away with MaxScale by deploying and managing it with ClusterControl 1.2.11. This How-To blog shows you how, step-by-step.

mariadb_maxscale.jpg

Customer Case Studies

From small businesses to Fortune 500 companies, customers have chosen Severalnines to deploy and manage MySQL, MongoDB and PostgreSQL.  

View our Customer page to discover companies like yours who have found success with ClusterControl.

Screen Shot 2015-11-04 at 09.44.52.png

Technical Webinar - Replays

As you know, we run a monthly technical webinar cycle; the latest replays, all part of our ‘Become a MySQL DBA’ series, can be watched at your leisure.

View all our replays here

ClusterControl Blogs

We’ve started a new series of blogs focusing on how to use ClusterControl. Do check them out!

View all ClusterControl blogs here.

cc_deploy_multiple.png


The MySQL DBA Blog Series

We’re on the 15th installment of our popular ‘Become a MySQL DBA’ series.

View all the ‘Become a MySQL DBA’ blogs here

We trust these resources are useful. If you have any questions on them or on related topics, please do contact us!


ClusterControl Tips & Tricks: wtmp Log Rotation Settings for Sudo User


Requires ClusterControl. Applies to all supported database clusters. Applies to all supported operating systems (RHEL/CentOS/Debian/Ubuntu).

ClusterControl requires a super-privileged SSH user to provision database nodes. If you are running as a non-root user, the corresponding user must be able to execute sudo commands, with or without a sudo password. Unfortunately, this can lead to another issue: performing a remote command with “sudo” requires an interactive session (tty). We will explain this in detail in the next sections.

What’s up with sudo?

By default, most of the RHEL flavors have the following configured under /etc/sudoers:

Defaults requiretty

When an interactive session (tty) is required, each time the sudo user SSHes into the box with the -t flag (forcing pseudo-tty allocation), entries are created in /var/log/wtmp for the creation and destruction of terminals, or the assignment and release of terminals. These logs only record interactive sessions. If you don’t specify -t, you will see the following error:

sudo: sorry, you must have a tty to run sudo

The root user does not require an interactive session when running remote SSH commands; its entries only appear in /var/log/secure or /var/log/auth.log, depending on the system configuration. Different distributions have different defaults in this regard. SSH does not write to wtmp for non-interactive sessions.

To check the content of wtmp, we use the following command:

$ last -f /var/log/wtmp
ec2-user pts/0        ip-10-0-0-79.ap- Wed Oct 28 11:16 - 11:16  (00:00)
ec2-user pts/0        ip-10-0-0-79.ap- Wed Oct 28 11:16 - 11:16  (00:00)
ec2-user pts/0        ip-10-0-0-79.ap- Wed Oct 28 11:16 - 11:16  (00:00)
...

On Debian/Ubuntu systems, the sudo user does not need to acquire a tty, as “requiretty” is not configured by default. However, ClusterControl appends the -t flag by default if it detects that the SSH user is a non-root user. Since ClusterControl performs all monitoring and management tasks as this user, you may notice that /var/log/wtmp grows rapidly, as shown in the following section.
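
You can check whether requiretty applies on a given node - a hedged example, with an optional per-user exemption on RHEL-based systems (assuming ‘ec2-user’ is the SSH user ClusterControl connects as):

$ sudo grep -n requiretty /etc/sudoers
# optionally, via visudo, exempt just that user from requiretty:
Defaults:ec2-user !requiretty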

Log rotation for wtmp

Example: Take note of the following default configuration of wtmp in RHEL 7.1 inside /etc/logrotate.conf:

/var/log/wtmp {
    monthly
    create 0664 root utmp
    minsize 1M
    rotate 1
}

By running the following commands on one of the database nodes managed by ClusterControl, we can see how fast /var/log/wtmp grows every minute:

[user@server ~]$ a=$(du -b /var/log/wtmp | cut -f1) && sleep 60 && b=$(du -b /var/log/wtmp | cut -f1) && c=$(expr $b - $a ) && echo $c
89088

From the above result, ClusterControl causes the log file to grow by about 89 KB per minute, which equals roughly 128 MB per day. If the mentioned logrotate configuration is used (monthly rotation), /var/log/wtmp alone may consume around 3.97 GB of disk space! If the partition where this file resides (usually the “/” partition) is small - and it’s common for the “/” partition to be small, especially on a cloud instance - there is a real risk of filling up the disk space on that partition in less than one month.

Workaround

The workaround is to adjust the log rotation of wtmp. This is applicable to all operating systems mentioned at the beginning of this post. If you are affected by this, change the log rotation behaviour so the file does not grow more than expected. The following is what we recommend:

/var/log/wtmp {
     size 100M
     create 0664 root utmp
     rotate 3
     compress
}

The above settings specify that the maximum size of wtmp should be 100 MB, and that we should keep the 3 most recent (compressed) rotations and remove older ones.

Logrotate runs from cron (via /etc/cron.daily/logrotate). It is not a daemon, so there is no need to reload its configuration; the next time cron executes logrotate, it will pick up the new config file automatically.
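
If you don’t want to wait for the daily cron run to verify the new settings, a hedged pair of commands (both flags are standard logrotate options):

$ sudo logrotate -d /etc/logrotate.conf    # debug/dry run - shows what would happen
$ sudo logrotate -f /etc/logrotate.conf    # force a rotation right now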

Happy Clustering!

PS.: To get started with ClusterControl, click here!


Webinar Replay & Slides: Deploying MongoDB, MySQL, PostgreSQL & MariaDB’s MaxScale in 40min


Live demo recording of ClusterControl 1.2.11 and its new features 

Thanks to everyone who joined us for our recent live webinar on ‘ClusterControl 1.2.11 - new release’, led by Art van Scheppingen, Senior Support Engineer at Severalnines. The replay and slides of the webinar are now available to watch and read online via the links below.

During this live webinar, Art demonstrated the latest ClusterControl release and its new features such as support for MariaDB’s MaxScale and related MySQL updates; this is also the best ClusterControl release for PostgreSQL yet.

In fact, during a live demo, Art not only introduced the new features, but also deployed MongoDB, PostgreSQL and MySQL clusters and replicated setups, as well as MariaDB’s MaxScale proxy - all in one session and one ClusterControl instance, within 40 minutes! Impressive stuff, which can be viewed again in this recording.

Watch the replay

Read the slides


1211_banner.png

TOPICS COVERED

  • For PostgreSQL
    • Deployment and Management of Postgres Replicated Setups
    • Customisable dashboards
    • Database performance charts for nodes
    • Enablement of ClusterControl DevStudio
  • Support for MaxScale
    • Deployment and management of MaxScale load balancer
  • For MySQL
    • Add Existing HAProxy and Keepalived
    • Deployment of MySQL Replication setups
    • Improvements in charting of metrics
    • Revamped Configuration Management
    • New Database Logs Page
    • Revamped MySQL User Management

SPEAKER

Art van Scheppingen is a Senior Support Engineer at Severalnines. He’s a pragmatic MySQL and Database expert with over 15 years experience in web development. He previously worked at Spil Games as Head of Database Engineering, where he kept a broad vision upon the whole database environment: from MySQL to Couchbase, Vertica to Hadoop and from Sphinx Search to SOLR. He regularly presents his work and projects at various conferences (Percona Live, FOSDEM) and related meetups.

cc_deploy_multiple.png

For further blogs on ClusterControl visit: http://www.severalnines.com/blog-categories/clustercontrol

To view all our webinar replays visit: http://www.severalnines.com/webinars-replay



Webinar: Become a MySQL DBA - Performing live database upgrades in Replication and Galera setups - Tuesday November 24th


Join us on November 24th for this new webinar on performing live database upgrades for MySQL Replication & Galera led by Krzysztof Książek, Senior Support Engineer at Severalnines. This is part of our ongoing ‘Become a MySQL DBA’ series. 

Database vendors typically release patches with bug and security fixes on a monthly basis - so why should we care? After all, upgrading a system is extra work and might incur downtime. Unfortunately, the news is full of reports of security breaches and hacked systems, so unless security is of no concern to you, you will want to have the most current security fixes on your systems. Major versions are rarer, and usually harder (and riskier) to upgrade to. But they might bring along important features that make the upgrade worth the effort.

In this webinar we will cover one of the most basic, but essential tasks of the DBA: minor and major database upgrades in production environments.

DATE, TIME & REGISTRATION

Europe/MEA/APAC
Tuesday, November 24th at 09:00 GMT / 10:00 CET (Germany, France, Sweden)
Register Now

North America/LatAm
Tuesday, November 24th at 09:00 Pacific Time (US) / 12:00 Eastern Time (US)
Register Now

AGENDA

  • What types of upgrades are there? 
  • How do I best prepare for the upgrades?
  • Best practices for: 
    • Minor version upgrades - MySQL & Galera
    • Major version upgrades - MySQL & Galera

SPEAKER

Krzysztof Książek, Senior Support Engineer at Severalnines, is a MySQL DBA with experience managing complex database environments for companies like Zendesk, Chegg, Pinterest and Flipboard. This webinar builds upon recent blog posts and related webinar series by Krzysztof on how to become a MySQL DBA.

To view all the blogs of the ‘Become a MySQL DBA’ series visit: http://www.severalnines.com/blog-categories/db-ops

up56_logo.png


Become a MySQL DBA blog series - Galera Cluster diagnostic logs


In a previous post, we discussed the error log as a source of diagnostic information for MySQL. We’ve shown examples of problems where the error log can help you diagnose issues with MySQL. This post focuses on the Galera Cluster error logs, so if you are currently using Galera, read on.

In a Galera cluster, whether you use MariaDB Cluster, Percona XtraDB Cluster or the Codership build, the MySQL error log is even more important. It gives you the same information as MySQL would, but you’ll also find in it information about Galera internals and replication. This data is crucial to understand the state of your cluster and to identify any issues which may impact the cluster’s ability to operate. In this post, we’ll try to make the Galera error log easier to understand.

This is the sixteenth installment in the ‘Become a MySQL DBA’ blog series.

Clean Galera startup

Let’s start by looking at what a clean startup of a Galera node looks like.

151028 14:28:50 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
151028 14:28:50 mysqld_safe Skipping wsrep-recover for 98ed75de-7c05-11e5-9743-de4abc22bd11:11152 pair
151028 14:28:50 mysqld_safe Assigning 98ed75de-7c05-11e5-9743-de4abc22bd11:11152 to wsrep_start_position

What we can see here is Galera trying to determine the sequence number of the last writeset this node applied. It is found in the grastate.dat file. In this particular case the shutdown was clean, so all the needed data was found in that file, making the wsrep-recovery process (i.e., calculating the last writeset based on the data applied to the database) unnecessary. Galera sets the wsrep_start_position to the last applied writeset.
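
For reference, the contents of grastate.dat after a clean shutdown will look roughly like this (uuid and seqno taken from the log above; the exact version line may differ per Galera build):

# GALERA saved state
version: 2.4
uuid:    98ed75de-7c05-11e5-9743-de4abc22bd11
seqno:   11152
cert_index: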

2015-10-28 14:28:50 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-10-28 14:28:50 0 [Note] /usr/sbin/mysqld (mysqld 5.6.26-74.0-56) starting as process 553 ...
2015-10-28 14:28:50 553 [Note] WSREP: Read nil XID from storage engines, skipping position init
2015-10-28 14:28:50 553 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
2015-10-28 14:28:50 553 [Note] WSREP: wsrep_load(): Galera 3.12.2(rf3e626d) by Codership Oy <info@codership.com> loaded successfully.
2015-10-28 14:28:50 553 [Note] WSREP: CRC-32C: using hardware acceleration.
2015-10-28 14:28:50 553 [Note] WSREP: Found saved state: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152

Next we can see the WSREP library being loaded and, again, information about the last state of the cluster before shutdown.

2015-10-28 14:28:50 553 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 172.30.4.156; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum =

This line covers all configuration related to Galera.

2015-10-28 14:28:50 553 [Note] WSREP: Service thread queue flushed.
2015-10-28 14:28:50 553 [Note] WSREP: Assign initial position for certification: 11152, protocol version: -1
2015-10-28 14:28:50 553 [Note] WSREP: wsrep_sst_grab()
2015-10-28 14:28:50 553 [Note] WSREP: Start replication
2015-10-28 14:28:50 553 [Note] WSREP: Setting initial position to 98ed75de-7c05-11e5-9743-de4abc22bd11:11152
2015-10-28 14:28:50 553 [Note] WSREP: protonet asio version 0
2015-10-28 14:28:50 553 [Note] WSREP: Using CRC-32C for message checksums.
2015-10-28 14:28:50 553 [Note] WSREP: backend: asio

The initialization process continues; WSREP replication has been started.

2015-10-28 14:28:50 553 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2015-10-28 14:28:50 553 [Note] WSREP: restore pc from disk failed

Those two lines above require a detailed comment. The file which is missing here, gvwstate.dat, contains information about the cluster state as the given node sees it. You can find in it the UUID of the node and of the other members of the cluster which are in the PRIMARY state (i.e., which are part of the cluster). In case of an unexpected crash, this file allows the node to understand the state of the cluster before it crashed. In case of a clean shutdown, the gvwstate.dat file is removed - that is why it cannot be found in our example.

2015-10-28 14:28:50 553 [Note] WSREP: GMCast version 0
2015-10-28 14:28:50 553 [Note] WSREP: (3506d410, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2015-10-28 14:28:50 553 [Note] WSREP: (3506d410, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2015-10-28 14:28:50 553 [Note] WSREP: EVS version 0
2015-10-28 14:28:50 553 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '172.30.4.156:4567,172.30.4.191:4567,172.30.4.220:4567'

In this section, you will see the initial communication within the cluster - the node is trying to connect to the other members listed in the wsrep_cluster_address variable. For Galera it’s enough to get in touch with one active member of the cluster - even if node “A” has contact only with node “B” but not with “C”, “D” and “E”, node “B” will work as a relay and pass through the needed communication.

2015-10-28 14:28:50 553 [Warning] WSREP: (3506d410, 'tcp://0.0.0.0:4567') address 'tcp://172.30.4.156:4567' points to own listening address, blacklisting
2015-10-28 14:28:50 553 [Note] WSREP: (3506d410, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2015-10-28 14:28:50 553 [Note] WSREP: declaring 6221477d at tcp://172.30.4.191:4567 stable
2015-10-28 14:28:50 553 [Note] WSREP: declaring 78b22e70 at tcp://172.30.4.220:4567 stable

Here we can see the results of the communication exchange. At first, the node blacklists its own IP address. Then, as a result of the network exchange, the remaining two nodes of the cluster are found and declared ‘stable’. Please note the IDs used here - those node IDs appear in other places too, but here you can easily link them with the relevant IP addresses. A node ID can also be located in the gvwstate.dat file (it’s the first segment of ‘my_uuid’) or by running “SHOW GLOBAL STATUS LIKE ‘wsrep_gcomm_uuid’”. Please keep in mind that a UUID may change, for example, after a crash.

2015-10-28 14:28:50 553 [Note] WSREP: Node 6221477d state prim
2015-10-28 14:28:50 553 [Note] WSREP: view(view_id(PRIM,3506d410,5) memb {
    3506d410,0
    6221477d,0
    78b22e70,0
} joined {
} left {
} partitioned {
})
2015-10-28 14:28:50 553 [Note] WSREP: save pc into disk
2015-10-28 14:28:50 553 [Note] WSREP: gcomm: connected

As a result, the node managed to join the cluster - you can see that all three IDs are listed as members and the cluster is in the ‘Primary’ state. The ‘joined’, ‘left’ and ‘partitioned’ lists are empty: no nodes were in transition and the cluster was not partitioned.

2015-10-28 14:28:50 553 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2015-10-28 14:28:50 553 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2015-10-28 14:28:50 553 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2015-10-28 14:28:50 553 [Note] WSREP: Waiting for SST to complete.
2015-10-28 14:28:50 553 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
2015-10-28 14:28:50 553 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 355385f7-7d80-11e5-bb70-17166cef2a4d
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: sent state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: got state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d from 0 (172.30.4.156)
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: got state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d from 1 (172.30.4.191)
2015-10-28 14:28:50 553 [Note] WSREP: STATE EXCHANGE: got state msg: 355385f7-7d80-11e5-bb70-17166cef2a4d from 2 (172.30.4.220)
2015-10-28 14:28:50 553 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 4,
    members    = 3/3 (joined/total),
    act_id     = 11152,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 98ed75de-7c05-11e5-9743-de4abc22bd11

Cluster members exchange information about their status, and as a result we see a report on the quorum calculation - the cluster is in the ‘Primary’ state with all three nodes joined.

2015-10-28 14:28:50 553 [Note] WSREP: Flow-control interval: [28, 28]
2015-10-28 14:28:50 553 [Note] WSREP: Restored state OPEN -> JOINED (11152)
2015-10-28 14:28:50 553 [Note] WSREP: New cluster view: global state: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152, view# 5: Primary, number of nodes: 3, my index: 0, protocol version 3
2015-10-28 14:28:50 553 [Note] WSREP: SST complete, seqno: 11152
2015-10-28 14:28:50 553 [Note] WSREP: Member 0.0 (172.30.4.156) synced with group.
2015-10-28 14:28:50 553 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 11152)

Those few lines above are a bit misleading - there was no SST performed, as the node was already in sync with the cluster. In short, at this very moment, the cluster is up and our node is synced with the other cluster members.

2015-10-28 14:28:50 553 [Warning] option 'innodb-buffer-pool-instances': signed value -1 adjusted to 0
2015-10-28 14:28:50 553 [Note] Plugin 'FEDERATED' is disabled.
2015-10-28 14:28:50 553 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-10-28 14:28:50 553 [Note] InnoDB: The InnoDB memory heap is disabled
2015-10-28 14:28:50 553 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins

...

2015-10-28 14:28:52 553 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.26-74.0-56'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release rel74.0, Revision 624ef81, WSREP version 25.12, wsrep_25.12

Finally, the MySQL startup process comes to its end - MySQL is up and running, ready to handle connections.

As you can see, during a clean start we find a lot of information in the error log. We can see how a node sees the cluster, we can see the quorum calculation, and we can see the IDs and IP addresses of the other nodes in the cluster. This gives us great insight into the cluster state - which nodes were up, and whether there were any connectivity issues. The information about the Galera configuration is also very useful - my.cnf may have been changed in the meantime, and the same goes for the runtime configuration. Yet we can still check how exactly Galera was configured at the time of the startup. It may come in handy when dealing with misconfigurations.

Galera node - IST process

Ok, so we covered a regular startup - when a node is up to date with the cluster. What about when a node is not up to date? There are two options. If the donor has all the needed writesets stored in its gcache, Incremental State Transfer (IST) can be performed. If not, State Snapshot Transfer (SST) will be executed, which involves copying all data from the donor to the joining node; see the configuration sketch below for the main knob that decides between the two. This time, we are looking at a node joining/synchronizing using IST. To make it easier to read, we are going to remove some of the log entries which didn’t change compared to the clean startup we discussed earlier.
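
Whether IST is feasible depends mainly on how many writesets the donor still holds in its gcache. A hedged my.cnf sketch for enlarging it (the 2G value is an arbitrary example; the cache lives under the data directory, so it costs disk space):

[mysqld]
# a larger gcache widens the IST window after node downtime
wsrep_provider_options="gcache.size=2G"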

2015-10-28 16:36:50 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-10-28 16:36:50 0 [Note] /usr/sbin/mysqld (mysqld 5.6.26-74.0-56) starting as process 10144 ...

...

2015-10-28 16:36:50 10144 [Note] WSREP: Found saved state: 98ed75de-7c05-11e5-9743-de4abc22bd11:-1

Ok, MySQL started and, after initialization, Galera located the saved state with ‘-1’ as the writeset sequence number. It means that the node didn’t shut down cleanly and the correct state wasn’t written to the grastate.dat file.

...

2015-10-28 16:36:50 10144 [Note] WSREP: Start replication
2015-10-28 16:36:50 10144 [Note] WSREP: Setting initial position to 98ed75de-7c05-11e5-9743-de4abc22bd11:11152

Even though the grastate.dat file didn’t contain a sequence number, Galera can calculate it on its own and set up the initial position accordingly.

...

2015-10-28 16:36:50 10144 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '172.30.4.156:4567,172.30.4.191:4567,172.30.4.220:4567'
2015-10-28 16:36:50 10144 [Warning] WSREP: (16ca37d4, 'tcp://0.0.0.0:4567') address 'tcp://172.30.4.191:4567' points to own listening address, blacklisting
2015-10-28 16:36:50 10144 [Note] WSREP: (16ca37d4, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2015-10-28 16:36:51 10144 [Note] WSREP: declaring 3506d410 at tcp://172.30.4.156:4567 stable
2015-10-28 16:36:51 10144 [Note] WSREP: declaring 78b22e70 at tcp://172.30.4.220:4567 stable
2015-10-28 16:36:51 10144 [Note] WSREP: Node 3506d410 state prim
2015-10-28 16:36:51 10144 [Note] WSREP: view(view_id(PRIM,16ca37d4,8) memb {
    16ca37d4,0
    3506d410,0
    78b22e70,0
} joined {
} left {
} partitioned {
})

Similar to the clean startup, the Galera nodes exchanged messages and decided on the state of the cluster. In this case, all three nodes were up and the cluster was formed.

2015-10-28 16:36:51 10144 [Note] WSREP: save pc into disk
2015-10-28 16:36:51 10144 [Note] WSREP: Waiting for SST to complete.
2015-10-28 16:36:51 10144 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
2015-10-28 16:36:51 10144 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 17633a80-7d92-11e5-8bdc-f6091df00f00
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: sent state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: got state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00 from 0 (172.30.4.191)
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: got state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00 from 1 (172.30.4.156)
2015-10-28 16:36:51 10144 [Note] WSREP: STATE EXCHANGE: got state msg: 17633a80-7d92-11e5-8bdc-f6091df00f00 from 2 (172.30.4.220)
2015-10-28 16:36:51 10144 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 6,
    members    = 2/3 (joined/total),
    act_id     = 31382,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 98ed75de-7c05-11e5-9743-de4abc22bd11

Again, a communication exchange happens and the quorum calculation is performed. This time you can see that three members were detected and only two of them joined. This is something we expect - one node is not synced with the cluster.

2015-10-28 16:36:51 10144 [Note] WSREP: Flow-control interval: [28, 28]
2015-10-28 16:36:51 10144 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 31382)
2015-10-28 16:36:51 10144 [Note] WSREP: State transfer required:
    Group state: 98ed75de-7c05-11e5-9743-de4abc22bd11:31382
    Local state: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152

State transfer was deemed necessary because the local node is not up to date with the applied writesets.

2015-10-28 16:36:51 10144 [Note] WSREP: New cluster view: global state: 98ed75de-7c05-11e5-9743-de4abc22bd11:31382, view# 7: Primary, number of nodes: 3, my index: 0, protocol version 3
2015-10-28 16:36:51 10144 [Warning] WSREP: Gap in state sequence. Need state transfer.
2015-10-28 16:36:51 10144 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.30.4.191' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '10144''''
WSREP_SST: [INFO] Streaming with xbstream (20151028 16:36:52.039)
WSREP_SST: [INFO] Using socat as streamer (20151028 16:36:52.041)

State transfer is performed using xtrabackup; this is true for both IST and SST (when xtrabackup is configured as the SST method). Both the xbstream and socat binaries are required for IST (and SST) to execute. It may happen that socat is not installed on the system - in such a case, the node won’t be able to join the cluster.
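
A quick, hedged check that the required binaries are in place (the install command assumes RHEL/CentOS; use apt-get on Debian/Ubuntu):

$ which socat xbstream
$ sudo yum install -y socat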

WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20151028 16:36:52.132)
2015-10-28 16:36:52 10144 [Note] WSREP: Prepared SST request: xtrabackup-v2|172.30.4.191:4444/xtrabackup_sst//1
2015-10-28 16:36:52 10144 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2015-10-28 16:36:52 10144 [Note] WSREP: REPL Protocols: 7 (3, 2)
2015-10-28 16:36:52 10144 [Note] WSREP: Service thread queue flushed.
2015-10-28 16:36:52 10144 [Note] WSREP: Assign initial position for certification: 31382, protocol version: 3
2015-10-28 16:36:52 10144 [Note] WSREP: Service thread queue flushed.
2015-10-28 16:36:52 10144 [Note] WSREP: Prepared IST receiver, listening at: tcp://172.30.4.191:4568
2015-10-28 16:36:52 10144 [Note] WSREP: Member 0.0 (172.30.4.191) requested state transfer from '*any*'. Selected 1.0 (172.30.4.156)(SYNCED) as donor.
2015-10-28 16:36:52 10144 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 31389)

This stage covers state transfer - as you can see, a donor is selected. In our case, it is 172.30.4.156. This is pretty important information, especially when SST is executed - the donor runs xtrabackup to move the data, and it logs the progress in the innobackup.backup.log file in the MySQL data directory. Network issues may also cause problems with state transfer - it’s good to check which node was the donor, even if only to verify whether there are any errors on the donor’s side.

2015-10-28 16:36:52 10144 [Note] WSREP: Requesting state transfer: success, donor: 1
2015-10-28 16:36:53 10144 [Note] WSREP: 1.0 (172.30.4.156): State transfer to 0.0 (172.30.4.191) complete.
2015-10-28 16:36:53 10144 [Note] WSREP: Member 1.0 (172.30.4.156) synced with group.
WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST (20151028 16:36:53.159)
WSREP_SST: [INFO] Galera co-ords from recovery: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152 (20151028 16:36:53.162)
WSREP_SST: [INFO] Total time on joiner: 0 seconds (20151028 16:36:53.164)
WSREP_SST: [INFO] Removing the sst_in_progress file (20151028 16:36:53.166)
2015-10-28 16:36:53 10144 [Note] WSREP: SST complete, seqno: 11152
2015-10-28 16:36:53 10144 [Warning] option 'innodb-buffer-pool-instances': signed value -1 adjusted to 0
2015-10-28 16:36:53 10144 [Note] Plugin 'FEDERATED' is disabled.
2015-10-28 16:36:53 10144 [Note] InnoDB: Using atomics to ref count buffer pool pages

...

2015-10-28 16:36:55 10144 [Note] WSREP: Signalling provider to continue.
2015-10-28 16:36:55 10144 [Note] WSREP: Initialized wsrep sidno 1
2015-10-28 16:36:55 10144 [Note] WSREP: SST received: 98ed75de-7c05-11e5-9743-de4abc22bd11:11152
2015-10-28 16:36:55 10144 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.26-74.0-56'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release rel74.0, Revision 624ef81, WSREP version 25.12, wsrep_25.12
2015-10-28 16:36:55 10144 [Note] WSREP: Receiving IST: 20230 writesets, seqnos 11152-31382
2015-10-28 16:40:18 10144 [Note] WSREP: IST received: 98ed75de-7c05-11e5-9743-de4abc22bd11:31382
2015-10-28 16:40:18 10144 [Note] WSREP: 0.0 (172.30.4.191): State transfer from 1.0 (172.30.4.156) complete.

In this section, among other things, we can see IST being performed: about 20k writesets were transferred from the donor to the joining node.

2015-10-28 16:40:18 10144 [Note] WSREP: Shifting JOINER -> JOINED (TO: 34851)

At this point, the node has applied all writesets sent by the donor and switched to the ‘Joined’ state. It’s not yet synced, because there are remaining writesets (applied by the cluster while IST was performed) still to be applied. This is actually very important to keep in mind - while a node is a ‘Joiner’, it applies writesets as fast as possible, without any interruption to the cluster. Once a node switches to the ‘Joined’ state, it may send flow control messages. This is particularly visible when there are lots of writesets to be sent via IST. Let’s say it took 2h to apply them on the joining node. This means it still has to apply 2h worth of traffic - and it can send flow control messages in the meantime. Given that the ‘Joined’ node is significantly lagging behind (2h), it will definitely send flow control - for the duration of the time needed to apply those 2h worth of traffic, the cluster will run with degraded performance. You can measure the impact as shown below.
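
A hedged way to gauge how much flow control is throttling the cluster - wsrep_flow_control_paused reports the fraction of time replication was paused since the last FLUSH STATUS:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';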

2015-10-28 16:42:30 10144 [Note] WSREP: Member 0.0 (172.30.4.191) synced with group.
2015-10-28 16:42:30 10144 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 36402)
2015-10-28 16:42:31 10144 [Note] WSREP: Synchronized with group, ready for connections
2015-10-28 16:42:31 10144 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

Finally, the node caught up on the writeset replication and synced with the cluster.

Galera SST process

In the previous part, we covered IST process from the error log point of view. One more important and frequent process is left - State Snapshot Transfer (SST).

151102 15:39:10 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
151102 15:39:10 mysqld_safe Skipping wsrep-recover for 98ed75de-7c05-11e5-9743-de4abc22bd11:71174 pair
151102 15:39:10 mysqld_safe Assigning 98ed75de-7c05-11e5-9743-de4abc22bd11:71174 to wsrep_start_position

...

2015-11-02 15:39:11 1890 [Note] WSREP: Quorum results:
    version    = 3,
    component  = PRIMARY,
    conf_id    = 10,
    members    = 2/3 (joined/total),
    act_id     = 97896,
    last_appl. = -1,
    protocols  = 0/7/3 (gcs/repl/appl),
    group UUID = 98ed75de-7c05-11e5-9743-de4abc22bd11
2015-11-02 15:39:11 1890 [Note] WSREP: Flow-control interval: [28, 28]
2015-11-02 15:39:11 1890 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 97896)
2015-11-02 15:39:11 1890 [Note] WSREP: State transfer required:
    Group state: 98ed75de-7c05-11e5-9743-de4abc22bd11:97896
    Local state: 98ed75de-7c05-11e5-9743-de4abc22bd11:71174
2015-11-02 15:39:11 1890 [Note] WSREP: New cluster view: global state: 98ed75de-7c05-11e5-9743-de4abc22bd11:97896, view# 11: Primary, number of nodes: 3, my index: 2, protocol version 3
2015-11-02 15:39:11 1890 [Warning] WSREP: Gap in state sequence. Need state transfer.
2015-11-02 15:39:11 1890 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '172.30.4.191' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '1890''''
WSREP_SST: [INFO] Streaming with xbstream (20151102 15:39:11.812)
WSREP_SST: [INFO] Using socat as streamer (20151102 15:39:11.814)
WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20151102 15:39:11.908)
2015-11-02 15:39:12 1890 [Note] WSREP: Prepared SST request: xtrabackup-v2|172.30.4.191:4444/xtrabackup_sst//1
2015-11-02 15:39:12 1890 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2015-11-02 15:39:12 1890 [Note] WSREP: REPL Protocols: 7 (3, 2)
2015-11-02 15:39:12 1890 [Note] WSREP: Service thread queue flushed.
2015-11-02 15:39:12 1890 [Note] WSREP: Assign initial position for certification: 97896, protocol version: 3
2015-11-02 15:39:12 1890 [Note] WSREP: Service thread queue flushed.
2015-11-02 15:39:12 1890 [Note] WSREP: Prepared IST receiver, listening at: tcp://172.30.4.191:4568
2015-11-02 15:39:12 1890 [Note] WSREP: Member 2.0 (172.30.4.191) requested state transfer from '*any*'. Selected 0.0 (172.30.4.220)(SYNCED) as donor.
2015-11-02 15:39:12 1890 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 98007)
2015-11-02 15:39:12 1890 [Note] WSREP: Requesting state transfer: success, donor: 0

Up to this line, everything looks exactly as it did with Incremental State Transfer (IST): there was a gap in the sequence between the joining node and the rest of the cluster. This time, however, SST was deemed necessary because the required writeset (71174) wasn’t available in the donor’s gcache.

WSREP_SST: [INFO] Proceeding with SST (20151102 15:39:13.070)
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20151102 15:39:13.071)
WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20151102 15:39:13.093)
removed ‘/var/lib/mysql/DB187/t1.ibd’
removed ‘/var/lib/mysql/DB187/t2.ibd’
removed ‘/var/lib/mysql/DB187/t3.ibd’

SST basically copies data from a donor to the joining node. For this to work, the current contents of the data directory have to be removed - the SST process takes care of this, and you can follow the progress in the error log, as seen above.

...

removed directory: ‘/var/lib/mysql/DBx236’
WSREP_SST: [INFO] Waiting for SST streaming to complete! (20151102 15:39:30.349)
2015-11-02 15:42:32 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000000 of size 134217728 bytes
2015-11-02 15:48:56 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000001 of size 134217728 bytes
2015-11-02 15:54:59 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000002 of size 134217728 bytes
2015-11-02 16:00:45 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000003 of size 134217728 bytes
2015-11-02 16:06:21 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000004 of size 134217728 bytes
2015-11-02 16:09:32 1890 [Note] WSREP: 0.0 (172.30.4.220): State transfer to 2.0 (172.30.4.191) complete.
2015-11-02 16:09:32 1890 [Note] WSREP: Member 0.0 (172.30.4.220) synced with group.

The SST process takes a while, depending on the time needed to copy the data. For the duration of the SST, the joining node caches writesets in gcache and, if the in-memory gcache is not large enough, on-disk gcache pages are used. Please note that in the last two lines we can see the donor finishing the SST and getting back in sync with the rest of the cluster.

WSREP_SST: [INFO] Preparing the backup at /var/lib/mysql//.sst (20151102 16:09:32.187)
WSREP_SST: [INFO] Evaluating innobackupex --no-version-check  --apply-log $rebuildcmd ${DATA} &>${DATA}/innobackup.prepare.log (20151102 16:09:32.188)
2015-11-02 16:12:39 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000005 of size 134217728 bytes
2015-11-02 16:20:52 1890 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000006 of size 134217728 bytes

In this case, the joiner uses xtrabackup as the SST method, therefore the ‘prepare’ phase has to be completed.

WSREP_SST: [INFO] Moving the backup to /var/lib/mysql/ (20151102 16:26:01.062)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf  --defaults-group=mysqld --no-version-check   --move-back --force-non-empty-directories ${DATA} &>${DATA}/innobackup.move.log (20151102 16:26:01.064)
WSREP_SST: [INFO] Move successful, removing /var/lib/mysql//.sst (20151102 16:26:14.600)

The last phase is to move the files from the temporary directory back to the MySQL data directory. The rest of the process is similar to what we discussed earlier: the node initializes all plugins and InnoDB, and MySQL gets ready to handle connections. Once that’s done, it applies the remaining writesets stored in gcache.

2015-11-02 16:27:06 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000000
2015-11-02 16:27:08 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000001
2015-11-02 16:27:11 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000002
2015-11-02 16:27:15 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000003
2015-11-02 16:28:11 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000004
2015-11-02 16:28:56 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000005
2015-11-02 16:29:17 1890 [Note] WSREP: Deleted page /var/lib/mysql/gcache.page.000006
2015-11-02 16:29:24 1890 [Note] WSREP: Member 2.0 (172.30.4.191) synced with group.
2015-11-02 16:29:24 1890 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 214214)
2015-11-02 16:29:24 1890 [Note] WSREP: Synchronized with group, ready for connections
2015-11-02 16:29:24 1890 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.

After all writesets have been applied, the on-disk gcache pages are deleted and the node finally syncs up with the cluster.

Xtrabackup logs for SST

The SST process (if you use xtrabackup as the SST method), as you may have noticed, consists of three stages - streaming data to the joiner, applying logs on the joiner, and moving data from the temporary directory to the final location in the MySQL data directory. To get a complete view of the process, we need to mention some additional log files related to those three stages of the SST. You can find them in the MySQL data directory or in the temporary directory Galera uses to move the data to the joiner.

The first of them (in order of appearance) is innobackup.backup.log - a file created on the donor host which contains all the logs from the xtrabackup process executed to stream the data from the donor to the joiner. This file can be very useful when dealing with any type of SST issue. In the next blog post, we will show how you can use it to identify the problem.
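
A quick, hedged first pass over that file on the donor (the path assumes the default data directory):

$ grep -iE 'error|fatal' /var/lib/mysql/innobackup.backup.log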

Another file, located either in the MySQL data directory (e.g., /var/lib/mysql) or in the .sst temporary subdirectory (e.g., /var/lib/mysql/.sst), is innobackup.prepare.log. This file contains logs related to the ‘apply log’ phase, in which xtrabackup applies transaction logs to the tablespaces (ibd files) copied from the donor. It may happen that the InnoDB data is corrupted and, even though it was possible to stream the data from the donor to the joiner, it’s not possible to apply the transactions from the redo log. Such errors will be logged in this file.

The last one is innobackup.move.log - it contains the logs of the ‘move’ stage, in which xtrabackup moves files from the temporary (.sst) directory to the final location in the MySQL data directory.

In the next post, we’ll go through real-life examples of errors which may impact your Galera cluster. We will cover a wide range of issues that you can experience in production, and how to deal with them. For that, it is important to know what exactly to expect in the logs of a ‘normal’ system state - now that you know what normal looks like, it will be much easier to understand what’s not normal :-)


Become a ClusterControl DBA: performance monitoring and health


The blog series for MySQL, MongoDB & PostgreSQL administrators

In the previous two blog posts we covered deploying the four types of clustering/replication (MySQL/Galera, MySQL Replication, MongoDB & PostgreSQL) and managing/monitoring your existing databases and clusters. After reading the first two blog posts, you were able to add your 20 existing replication setups to ClusterControl, expand them and deploy two new Galera clusters, while doing a ton of other things. Or maybe you deployed MongoDB and/or PostgreSQL systems. So now, how do you keep them healthy?

That’s exactly what this blog post is about: how to leverage ClusterControl’s performance monitoring and advisors functionality to keep your MySQL, MongoDB and/or PostgreSQL databases and clusters healthy. So how is this done in ClusterControl?

The cluster list

The most important information can already be found in the cluster list: as long as there are no alarms and no hosts are shown to be down, everything is functioning fine. An alarm is raised if a certain condition is met, e.g. a host is swapping, and brings the issue you should investigate to your attention. That means alarms are not only raised during an outage; they also allow you to proactively manage your databases.

Suppose you log into ClusterControl and see a cluster listing like this - you will definitely have something to investigate: for example, one node is down in the Galera cluster and every cluster has various alarms.

severalnines-blogpost-cluster-list-node-down-alarms.png

Once you click on one of the alarms, you will be taken to a detailed page listing all alarms of the cluster. The alarm details explain the issue and in most cases also advise on the action to resolve it.

You can set up your own alarms by creating custom expressions, but that has been deprecated in favor of our new Developer Studio, which allows you to write custom JavaScript and execute it as Advisors. We will get back to this topic later in this post.

The cluster overview - Dashboards

When opening up the cluster overview, we can immediately see the most important performance metrics for the cluster in the tabs. This overview may differ per cluster type as, for instance, Galera has different performance metrics to watch than traditional MySQL, Postgres or MongoDB.

severalnines-blogpost-cluster-overview-performance.png

Both the default overview and the pre-selected tabs are customizable. By clicking on Overview > Dash Settings you are presented with a dialog that allows you to define the dashboard.

severalnines-blogpost-cluster-overview-add-dashboard.png

By pressing the plus sign you can add and define your own metrics to graph on the dashboard. In our case we will define a new dashboard featuring the Galera-specific receive and send queues:

severalnines-blogpost-cluster-overview-add-dashboard2.png

This new dashboard should give us good insight into the average queue length of our Galera cluster.

Once you have pressed save, the new dashboard will become available for this cluster:

severalnines-blogpost-cluster-overview-new-dashboard-added.png

Similarly you can do this for PostgreSQL as well by combining the checkpoints with the number of commits:

severalnines-blogpost-performance-overview-pgsql-add-metric.png

severalnines-blogpost-performance-overview-pgsql-add-metric2.png

severalnines-blogpost-performance-overview-pgsql2.png

So as you can see, it is relatively easy to customize your own (default) dashboard.

Cluster overview - Query Monitor

The Query Monitor tab is available for both MySQL and PostgreSQL based setups and consists of three dashboards: Top Queries, Running Queries and Query Histogram.

In the Running Queries dashboard, you will find all queries that are currently running. This is basically ClusterControl’s equivalent of SHOW PROCESSLIST.
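
The manual equivalent, run on one of the database nodes, would be:

$ mysql -e "SHOW FULL PROCESSLIST"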

Top Queries and Query Histogram both rely on input from the slow query log. To prevent ClusterControl from being too intrusive and the slow query log from growing too large, ClusterControl samples the slow query log by turning it on and off again. By default, each capture window in this loop lasts 1 second and long_query_time is set to 0.5 seconds. If you wish to change these settings for your cluster, you can do so via Settings -> Query Monitor.
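
Under the hood this sampling is comparable to briefly toggling the slow query log yourself. A minimal sketch of such a loop (our illustration of the mechanism, not ClusterControl’s actual implementation; it assumes client credentials are configured):

# enable the slow query log with a 0.5s threshold, capture for 1 second, then disable it again
$ mysql -e "SET GLOBAL long_query_time = 0.5; SET GLOBAL slow_query_log = 'ON'"
$ sleep 1
$ mysql -e "SET GLOBAL slow_query_log = 'OFF'"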

Top Queries will, as the name suggests, show the top queries that were sampled. You can sort them on various columns: for instance frequency, average execution time or total execution time.

severalnines-blogpost-top-queries-overview.png

You can get more details about a query by selecting it; this will present the query execution plan (if available) and optimization hints/advisories. If necessary, you can also have the details emailed to you by clicking the “email query” button.

The Query Histogram is similar to Top Queries, but additionally allows you to filter the queries per host and compare them over time.

Cluster overview - Operations

Similar to the Query Monitor for PostgreSQL and MySQL systems, MongoDB clusters have an Operations overview, which resembles the Running Queries dashboard. This overview is equivalent to issuing the db.currentOp() command within MongoDB.
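
You can run the same check by hand from the shell (assuming a locally running mongod):

$ mongo --eval "printjson(db.currentOp())"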

severalnines-blogpost-mongodb-current-ops.png

Cluster overview - Performance

MySQL / Galera

The performance tab is probably the best place to find the overall performance and health of your clusters. For MySQL and Galera it consists of an Overview page, the Advisors, status/variables overviews, the Schema Analyzer and the Transaction log.

The Overview page will give you a graph overview of the most important metrics in your cluster. This is, obviously, different per cluster type. Eight metrics have been set by default, but you can easily set your own - up to 20 graphs if needed.

severalnines-blogpost-define-graphs.png

Advisors are one of the key features of ClusterControl: they are scripted checks that can be run on demand. An advisor can evaluate almost any fact known about the host and/or cluster, give its opinion on their health, and even advise on how to resolve issues or improve your hosts!

severalnines-blogpost-mysql-advisors.png

The best part is yet to come: you can create your own checks in the Developer Studio (Cluster -> Manage -> Developer Studio), run them on a regular interval and use them again in the Advisors section. We blogged about this new feature earlier this year.

We will skip the status/variables overview of MySQL and Galera, as it is useful for reference but not for this blog post: it is good enough that you know it is there. It is also good to mention that the Status Time Machine can help you track specific status variables and see how they change over time.

Now suppose your database is growing and you want to know how fast it grew in the past week. You can actually keep track of the growth of both data and index sizes right from within ClusterControl:

Next to the total growth on disk, it can also report the top 25 largest schemas.

Another important feature is the Schema Analyzer within ClusterControl.

ClusterControl will analyze your schemas and look for redundant indexes, MyISAM tables and tables without a primary key. It is of course entirely up to you whether to keep a table without a primary key, as some application may rely on it being created this way, but at least you get this advice here for free. The Schema Analyzer even constructs the necessary ALTER statement to fix the problem.
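
If you want to reproduce two of these checks by hand, the information_schema holds the necessary metadata. The queries below are our own sketches, not necessarily what ClusterControl executes internally:

# list MyISAM tables outside the system schemas
$ mysql -e "SELECT table_schema, table_name FROM information_schema.tables WHERE engine = 'MyISAM' AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')"
# list tables without a primary key
$ mysql -e "SELECT table_schema, table_name FROM information_schema.tables WHERE table_type = 'BASE TABLE' AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema') AND (table_schema, table_name) NOT IN (SELECT table_schema, table_name FROM information_schema.statistics WHERE index_name = 'PRIMARY')"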

PostgreSQL

For PostgreSQL the Advisors, DB Status and DB Variables can be found here.

severalnines-blogpost-postgresql-advisors.png

MongoDB

For MongoDB, the Mongo Stats and performance overview can be found under the Performance tab. The Mongo Stats is an overview of the output of mongostat, and the Performance overview gives a good graphical overview of the Mongo opcounters:

severalnines-blogpost-mongodb-performance.png

Final thoughts

We showed you how to keep your eyeballs on the most important monitoring and health checking features of ClusterControl. Obviously this is only the beginning of the journey, as we will soon start another blog series about the Developer Studio capabilities and how you can make the most of your own checks. Also keep in mind that our support for MongoDB and PostgreSQL is not as extensive as our MySQL toolset, but we are continuously improving on this.

You may ask yourself why we have skipped over the performance monitoring and health checks of HAProxy and MaxScale. We did that deliberately, as the blog series has so far covered only the deployment of clusters and not the deployment of HA components. So that’s the subject we'll cover next time.


nginx as Database Load Balancer for MySQL or MariaDB Galera Cluster


Nginx is well-known for its ability to act as a reverse-proxy with a small memory footprint. It usually sits in the front-end web tier, redirecting connections to available backend services, provided they pass their health checks. Using a reverse-proxy is common when you are running a critical application or service that requires high availability. It also distributes the load equally among the backend services.

Recently, nginx 1.9 introduced support for TCP load balancing - similar to what HAProxy is capable of. The one major drawback is that it does not support advanced backend health checks. These are required when running MySQL Galera Cluster, as we’ll explain in the next section. Note that this limitation is removed in the paid edition, NGINX Plus.

In this blog post, we are going to play around with nginx as a reverse-proxy for MySQL Galera Cluster services to achieve higher availability. We had a Galera cluster up and running, deployed using ClusterControl on CentOS 7.1. We are going to install nginx on a fresh new host, as illustrated in the following diagram:

Backend Health Checks

With the existence of synchronously-replicated clusters like Galera or NDB, it has become quite popular to use a TCP reverse-proxy as load balancer. All MySQL nodes are treated equally, as one can read from or write to any node. No read-write splitting is required, as you would need with MySQL master-slave replication. Nginx is not database-aware, so some additional steps are required to configure health checks for the Galera Cluster backends, so that they return something nginx can understand.

If you are running HAProxy, a healthcheck script on each MySQL server in the load balancing set returns an HTTP response status. For example, if the backend MySQL server is healthy, the script returns a simple HTTP 200 OK status code; otherwise, it returns 503 Service unavailable. HAProxy can then update its routing table to exclude the problematic backend servers from the load balancing set and redirect the incoming connections only to the available servers. This is well explained in this webinar on HAProxy. Unfortunately, that health check relies on xinetd to daemonize and listen on a custom port (9200), and such a custom monitoring port is not configurable in nginx yet.
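
If you have such a healthcheck deployed (for example via ClusterControl’s HAProxy setup), you can query it directly with curl; the IP is one of our database nodes and the response body is illustrative, as it depends on the script in use:

$ curl -i http://192.168.55.201:9200/
HTTP/1.1 200 OK
Content-Type: text/plain

Percona XtraDB Cluster Node is synced.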

The following flowchart illustrates the process to report the health of a Galera node for multi-master setup:

At the time of this writing, NGINX Plus (the paid release of nginx) also supports advanced backend health checks, but it does not support a custom backend monitoring port.

Using clustercheck-iptables

To overcome this limitation, we’ve created a healthcheck script called clustercheck-iptables. It is a background script that checks the availability of a Galera node and adds a redirection port using iptables if the node is healthy (instead of returning an HTTP response). This allows other TCP load balancers with limited health check capabilities to monitor the backend Galera nodes correctly. Other than HAProxy, you can now use your favorite reverse proxy like nginx (>1.9), IPVS, keepalived, piranha, distributor, balance or pen to load balance requests across Galera nodes.

So how does it work? The script performs a health check every second on each Galera node. If the node is healthy (wsrep_cluster_state_comment=Synced and read_only=OFF, or wsrep_cluster_state_comment=Donor and wsrep_sst_method=xtrabackup/xtrabackup-v2), a port redirection will be set up using iptables (default: 3308 redirects to 3306) with the following command:

$ iptables -t nat -A PREROUTING -s 0.0.0.0/0 -p tcp --dport 3308 -j REDIRECT --to-ports 3306

Otherwise, the above rule is removed from the iptables PREROUTING chain. On the load balancer, define the designated redirection port (3308) instead. If the backend node is "unhealthy", port 3308 will be unreachable because the corresponding iptables rule has been removed on the database node, and the load balancer will then exclude it from the load balancing set.
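
Removing the rule is the mirror image of adding it; with iptables, deleting a rule takes the same specification with -D instead of -A. Presumably the script does the equivalent of:

$ iptables -t nat -D PREROUTING -s 0.0.0.0/0 -p tcp --dport 3308 -j REDIRECT --to-ports 3306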

Let’s install the script and see how it works in practice.

1. On the database servers, run the following commands to install the script:

$ git clone https://github.com/ashraf-s9s/clustercheck-iptables
$ cp clustercheck-iptables/mysqlchk_iptables /usr/local/sbin

2. By default, the script will use a MySQL user called “mysqlchk_user” with password “mysqlchk_password”. We need to ensure this MySQL user exists with the corresponding password before the script is able to perform health checks. Run the following DDL statements on one of the DB nodes (Galera should replicate the statement to the other nodes):

mysql> GRANT PROCESS ON *.* TO 'mysqlchk_user'@'localhost' IDENTIFIED BY 'mysqlchk_password';
mysql> FLUSH PRIVILEGES;

** If you would like to run with a different user/password, specify the -u and/or -p arguments on the command line. See the examples on the Github page.

3. This script requires iptables to be running. In this example, we are running CentOS 7, which comes with firewalld by default. We have to install iptables-services beforehand:

$ yum install -y iptables-services
$ systemctl enable iptables
$ systemctl start iptables

Then, set up basic rules for MySQL Galera Cluster so iptables won’t affect the database communication:

$ iptables -I INPUT -m tcp -p tcp --dport 3306 -j ACCEPT
$ iptables -I INPUT -m tcp -p tcp --dport 3308 -j ACCEPT
$ iptables -I INPUT -m tcp -p tcp --dport 4444 -j ACCEPT
$ iptables -I INPUT -m tcp -p tcp --dport 4567:4568 -j ACCEPT
$ service iptables save
$ service iptables restart

4. Once the basic rules are added, verify them with the following command:

$ iptables -L -n

5. Test mysqlchk_iptables:

$ mysqlchk_iptables -t
Detected variables/status:
wsrep_local_state: 4
wsrep_sst_method: xtrabackup-v2
read_only: OFF

[11-11-15 08:33:49.257478192] [INFO] Galera Cluster Node is synced.

6. Looks good. Now we can daemonize the health check script:

$ mysqlchk_iptables -d
/usr/local/sbin/mysqlchk_iptables started with PID 66566.

7. Our PREROUTING rules will look something like this:

$ iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
REDIRECT   tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:3308 redir ports 3306

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination

8. Finally, add the health check command into /etc/rc.local so it starts automatically on boot:

echo '/usr/local/sbin/mysqlchk_iptables -d' >> /etc/rc.local

In some distributions, rc.local may lack the execute permission it needs to run scripts on boot. Make sure it is executable:

$ chmod +x /etc/rc.local

From the application side, verify that you can connect to MySQL through port 3308. Repeat the above steps (except step #2) for the remaining DB nodes. Now, we have configured our backend health checks correctly. Let’s set up our MySQL load balancer as described in the next section.

Setting Up nginx as MySQL Load Balancer

1. On the load balancer node, install the required packages:

$ yum -y install pcre-devel zlib-devel

2. Install nginx 1.9 from source with TCP proxy module (--with-stream):

$ wget http://nginx.org/download/nginx-1.9.6.tar.gz
$ tar -xzf nginx-1.9.6.tar.gz
$ cd nginx-1.9.6
$ ./configure --with-stream
$ make && make install

3. Add the following lines to the nginx configuration file located at /usr/local/nginx/conf/nginx.conf:

stream {
    upstream stream_backend {
        zone tcp_servers 64k;
        server 192.168.55.201:3308;
        server 192.168.55.202:3308;
        server 192.168.55.203:3308;
    }

    server {
        listen 3307;
        proxy_pass stream_backend;
        proxy_connect_timeout 1s;
    }
}

4. Start nginx:

$ /usr/local/nginx/sbin/nginx

5. Verify that nginx is listening on port 3307, as we have defined. MySQL connections come in via this port on this node and are then forwarded to an available backend on port 3308. The respective DB node then redirects the connection to port 3306, where MySQL is listening:

$ netstat -tulpn | grep 3307
tcp        0      0 0.0.0.0:3307            0.0.0.0:*               LISTEN      5348/nginx: master
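
To confirm end-to-end connectivity, connect through nginx and check which backend answered. The load balancer IP and credentials below are placeholders for your own setup:

$ mysql -h 192.168.55.100 -P 3307 -u myuser -p -e "SELECT @@hostname"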

Great. We have now set up our nginx instance as MySQL Galera Cluster load balancer. Let’s test it out!

Testing

Let’s perform some tests to verify that our Galera cluster is correctly load balanced. We performed various exercises to see nginx and the Galera cluster work in action, executing the following actions consecutively:

  1. Set read_only=ON and then read_only=OFF on g1.local.
  2. Kill the mysql service on g1.local and force an SST on startup.
  3. Kill the other two database nodes so g1.local becomes non-primary.
  4. Bootstrap g1.local from the non-primary state.
  5. Rejoin the other two nodes back to the cluster.

The screencast below contains several terminal outputs, which are explained as follows:

  • Terminal 1 (top left): iptables PREROUTING chain output
  • Terminal 2 (top right): MySQL error log for g1.local
  • Terminal 3 (middle left): Application output when connecting to nginx load balancer. It reports date, hostname, wsrep_last_committed and wsrep_local_state_comment
  • Terminal 4 (middle right): Output of /var/log/mysqlchk_iptables
  • Terminal 5 (bottom left): Output of read_only and wsrep_sst_method on g1.local
  • Terminal 6 (bottom right): Action console

The following asciinema recording shows the result:

Summary

Galera node health checks by a TCP load balancer used to be limited to HAProxy, thanks to its ability to use a custom port for backend health checks. With this script, it’s now possible for any TCP load balancer/reverse-proxy to monitor Galera nodes correctly.

You are welcome to fork, pull request and extend the capabilities of this script.


ClusterControl Tips & Tricks for MySQL: Max Open Files


Requires ClusterControl 1.2.11 or later. Applies to MySQL single instances, replication setups and Galera clusters.

You have created a large database with thousands of tables (> 5000 in MySQL 5.6). Then you want to create a backup using xtrabackup, or, if it is a Galera cluster, you have to recover a Galera node using wsrep_sst_method=xtrabackup[-v2].

Unfortunately it fails, and the following is emitted in the Job Log messages:

xtrabackup: Generating a list of tablespaces
2015-11-03 19:36:02 7fdef130a780  InnoDB: Operating system error number 24 in a file operation.
InnoDB: Error number 24 means 'Too many open files'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html
InnoDB: Error: could not open single-table tablespace file ./DB75/t69.ibd

In this case you can simply increase the open_files_limit of the MySQL server(s) by going to Manage > Configurations and clicking on Change Parameter.

We want to make the change on all MySQL hosts and add open_files_limit to the ‘MYSQLD’ group. We then have to type ‘open_files_limit’ into the “Parameter” field, since the parameter has not yet been set in the my.cnf file of the selected servers. Then we simply set the New Value to something appropriate. We have > 80000 tables in the database, so we’ll set the new value to 100000.
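
Once the change has been applied, it corresponds to a plain entry in each server’s my.cnf, which you can spot-check from the shell. The config path assumes CentOS’s default location and the output is illustrative:

$ grep open_files_limit /etc/my.cnf
open_files_limit=100000
$ mysql -e "SHOW GLOBAL VARIABLES LIKE 'open_files_limit'"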

Next, press Proceed, and then you will be presented with the Config Change Log dialog:

The parameter change was successful, and the next step is to restart the MySQL servers as indicated in the dialogs. Please note that many operating systems impose an upper limit on the number of open files a process may have.
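
To see which limit the running mysqld process actually received from the operating system, you can check its process limits; a quick sanity check (the output shown is illustrative):

$ grep 'open files' /proc/$(pidof mysqld)/limits
Max open files            100000               100000               files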

PS.: To get started with ClusterControl, click here!

