08/12/06
The updated source of this document is available at:
Source: http://durrett.net/MySQL_master_master_administration.html
Overview
For a production service you pretty much need at least N+1 on any critical server. Perhaps the most critical role is your database. Here are some tips to help configure MySQL for backups and fail over There are many approaches to this, so look around for what suits your needs.
The basic setup requires two servers that use Master-Master replication (each server considers the other it master). This means all writes to the master are relayed to the slave, so the slave will have an identical copy of the database. A master can have many slaves, but any slave can only have one master. A server can be both a master and a slave. You can use more that two servers, but I will stick with two to keep the examples simple.
Getting a Snapshot
Replication is pretty simple to setup but for some reason difficult to get right. One of the most annoying parts about setting up replication is getting a clean snapshot of the server... if you are starting with an empty database you save yourself this hassle. If not, you need an exact copy of the data, which can be done in a few ways but to get an exact copy pretty much every method will lock the database, which makes this a bad option in a production environment.
A pretty simple way to get a snapshot is with mysqldump. Make sure you include the "--master-data" options as this will add important replication info to the output. The following command should get a clean snapshot of your databases:
# mysqldump --add-locks --create-options --disable-keys --extended-insert --master-data --quick --lock-tables -A > snapshot.sql
If you are lucky enough to have all of the binlogs from the time you initially setup your database you can skip the snapshot process (see "Starting Replication", below).
MySQL Configuration
I am going to use "Server A" and "Server B" for my examples – these refer to the MySQL instance, most likely running on separate pieces of hardware (if not, I am not sure why you are reading this).
Of great importance, make sure Server A and Server B have different server ids in the my.cnf file. The server id is used to identify where a query originated and prevents replication from being an infinite loop (i.e. if, via replication, Server A gets a query that originated from its server id, it discards the update rather than executing the query).
So, Server A has a my.cnf like this:
server-id = 10
and Server B has a my.cnf like this:
server-id = 20
Other settings for the my.cnf file on both servers:
# The next two lines ignore replication queries for the mysql and
# test databases... you probably want this is you want to manage permissions
# on each server separately (I will identify the importance of this in the
# fail over section, later
replicate-ignore-db = mysql
replicate-ignore-db = test
# Make sure you have binlogs. Otherwise, when restoring, you will lose
# all data added since the backup snapshot
log-bin = /var/log/mysql/mysql-bin.log
# Normally slaves do not add the queries they receive via replication
# to their binary logs. If you want to restore from backups you will want
# these updates in the binlogs.
log-slave-updates
Make sure you restart your mysql instances after you change the my.cnf settings.
Loading the Database
For loading from a snapshot, I will assume that Server A is the source and Server B is the new server you are setting up. If you are not loading from a snapshot, jump ahead to "Starting Replication"
Take your snapshot from earlier and load it on Server B. Assuming Server B is a new install of MySQL with only default databases, this should work (assuming the root password is blank):
# mysql ["less than" symbol my blog keeps destroying] snapshot.sql
Once this completes Server B should be an exact copy of Server A at the time the snapshot was taken.
Starting Replication
Next you need to start replication. Once replication is started, all queries written to a master will be sent to the slaves, ideally keeping an exact copy.
Before a slave can replicate from a master, you need to grant permissions on the master. The following would grant replication to the mysql user 'rep' from any machine (please take the appropriate password / firewall precautions before you do this). As we will be running with a master-master setup, this needs to be done on both Server A and Server B:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'rep'@'%';
Next you need to tell each slave how to connect to the master. If you took a snapshot, the information you need is in the snapshot.sql file. In the first 50 or so lines you should see a statement like:
--
-- Position to start replication or point-in-time recovery from
--
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000321', MASTER_LOG_POS=8675309;
If you don't see this, you probably did not use the '--master-data' option when doing a mysqldump (bummer).
If you did not use a snapshot because you have all of the binlogs from the initial setup of the other database, you should assume that MASTER_LOG_FILE is the name of the oldest binlog on your system and MASTER_LOG_POS=4.
With that information in hand, you can now tell the slave how to connect to the master. So, on Server B, issue the following command:
mysql> CHANGE MASTER TO MASTER_HOST='[hostname or IP of Server A]', MASTER_USER='rep', MASTER_PASSWORD='[hopefully you set a password]', MASTER_LOG_FILE='[MASTER_LOG_FILE]', MASTER_LOG_POS=[MASTER_LOG_POS];
So, using my example snapshot.sql from above (and assuming Server A has an IP address of 192.168.99.10) this would be:
mysql> CHANGE MASTER TO MASTER_HOST='192.168.99.10', MASTER_USER='rep', MASTER_PASSWORD='sekrit', MASTER_LOG_FILE='mysql-bin.000321', MASTER_LOG_POS=8675309;
You can double check your work like this:
mysql> SHOW SLAVE STATUS \G
Once all of this is working, you can start the replication on the slave with the following command:
mysql> START SLAVE;
After that, do another:
mysql> SHOW SLAVE STATUS \G
If all went well you should see "Slave_IO_Running: Yes", "Slave_SQL_Running: Yes" and "Seconds_Behind_Master:" should be some numeric value, ideally "0". If all did not go well, it is beyond the scope of this document.
At this point Server B is a slave of Server A. Important: do not execute any queries on Server B that will result in an update. Now we want to make Server A a slave of Server B.
On Server B, execute the following:
mysql> SHOW MASTER STATUS;
You will use these values to setup replication on Server A. If you execute this more than once you may see the values are updating – this is fine so long as you are not executing update queries on Server B yet – the updating values reflect updates to Server A being replicated to Server B.
On Server A, execute the following:
mysql> CHANGE MASTER TO MASTER_HOST='[hostname or IP of Server B]', MASTER_USER='rep', MASTER_PASSWORD='[hopefully you set a password]', MASTER_LOG_FILE='[MASTER_LOG_FILE]', MASTER_LOG_POS=[MASTER_LOG_POS];
So assuming Server B has an IP address of 192.168.99.20, this might look like:
mysql> CHANGE MASTER TO MASTER_HOST='192.168.99.20', MASTER_USER='rep', MASTER_PASSWORD='sekrit', MASTER_LOG_FILE='mysql-bin.0000011', MASTER_LOG_POS=94703;
Again on Server A, start the slave, confirm it is working and pat yourself on the back:
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS \G
Congratulations, you now have Master/Master replication working.
Using Your Databases
So you may think, cool... I now have twice the capacity on my database. Not really. For one, all writes done to one database also have to be done to the other. You may be able to take advantage of more read capacity, but if your application requires the read capacity of both servers, you don't have N+1 redundancy – when one server dies all reads will go to the remaining server, probably overloading it. But neither of these are the really big problem...
When using replication in MySQL, writes to multiple servers are not necessarily written in the order they were intended. For example, lets say you do an insert on Server A and, at the same time, you do an insert on Server B. If the insert was done to a table that has an auto_increment primary key, it is possible that both inserts will get the same id because they were not aware of the other insert at the time they were written. You can write your application to avoid this, but if you are starting with most off-the-shelf or open source solutions, they probably are not written to deal with this situation.
Once you have duplicate keys your replication will fail because the slave will not be able to perform the insert it received from the master. At that point your replication is inconsistent... very bad. For this reason, you probably never want to be writing to both servers at the same time.
So instead I recommend that you have one server setup as your "primary" and one setup as your "standby". Don't assign these names to your servers because they will switch during a fail over.. think of "primary" and "standby" as a pointer to the server. All reads and writes are sent to the primary server and your standby is used for getting clean backups without blocking your application and can become the primary when the primary server fails. I like to enforce this restriction with MySQL permissions... the application only has write permission to the primary server and read (or no) permission to the standby.
Backups
Backups are simple at this point – simply use the mysqldump example from above on your standby server. You can also stop the server and copy the binary database data – you are on your own though. During this backups your standby database will not update so it will fall behind the primary database (you can see how many seconds behind it is with the "SHOW SLAVE STATUS" command). This leaves you a little more vulnerable to failure during backups – if this is too risky, you can setup a third slave that is only used for backups.
Note: when using the "--master-data" command in mysqldump, the output is the pointer to the master data for the machine performing the dump, not for that machine's master! So if your standby server is Server B and you do a mysqldump on it, to restore from that snapshot you point to Server B as the master, not Server A. This is also why it is important to have "log-slave-updates" enabled in the my.cnf file – the master data from the backup snapshot needs to point at binlogs on Server B. If you are not logging slave updates, these binlogs will be empty because all writes on Server B are from replication.
So to recap, in order to restore this database with no data loss, you need both the snapshot and binlogs from the server you use for backups (your standby). If you just have your snapshot, you will lose all updates that occurred between the snapshot and the failure. When I update this document I will add some tricks to get around this.
Fail over
Fail over is relatively simple – make your application stop writing to the primary server and start writing to the standby server (and now think of "standby" as "primary" and "primary" as "standby". In practice there are a few precautions you may want to take:
*
If the primary database is accessible, disable write access to it to that you avoid the duplicate key issues. In many failure cases you will have to skip this step.
*
Before starting to write on the standby, make sure that it is '0' seconds behind the master. Otherwise you risk duplicate key issues.
*
If you have automated backup scripts, you may want to point them at the new standby... otherwise they will lock your new primary database when they kick-in.
Recovery
Recovery works pretty much exactly like the initial setup of the servers – load the snapshot and start replication. There is one huge gotcha that I don't usually see identified... If you need to do a full restore of what was previously your primary server you are likely to that you get errors in replication... this is because when replicating from the restore it is ignoring the updates it performed prior to its failure. Here is an example scenario:
- A snapshot is taken from standby Server B at 6:00 AM
- At 11:30 AM, primary Server A fails and we redirect application to Server B
- Server A is re-installed from scratch with the 6:00 AM Server B snapshot
- Server A is told to replicate from Server B starting at the log position of the snapshot
- Since all binlogs from 6:00 AM until 11:30 AM originated from Server A, it ignores these statements and actually starts replication at 11:30 AM, when Server B was the origin of the updates.
The quick fix (hack) is giving Server A a different server id until it catches-up to Server B and then it can get its old server id back.
06/27/06
How To Setup MogileFS
MogileFS looks like a great alternative to the not-so-great NFS. The documentation was a little sparse so I took the time to help others get started with MogileFS. The source (and updated) document is available at http://durrett.net/mogilefs_setup.html
Overview
This document explains how to setup and configure a basic MogileFS installation. It is intended for the novice systems administrator and should enable anybody with the skills to install a basic Linux distro to get MogileFS up and running on it.
I am going to assume you have three roles for this setup. A machine should be able to handle more than one role. I will use hosts mogiledb.yourdomain.com, mogiletracker.yourdomain.com and mogilestorage.yourdomain.com – rename these to meet your needs.
Getting MogileFS
If you look around you can find MogileFS directly on the Danga website, in an abandoned CVS repository and in the current SVN repository. If you pull from the website you will not have admin tools to make setup easier. If you pull from CVS you will have a version from October, 2004. If you pull from SVN you will have a shiny new version that works best with this documentation.
Using svn (http://subversion.tigris.org/), checkout the latest and greatest version (the example below pulls from the trunk).
$ mkdir mogilefs-src
$ cd mogilefs-src
$ svn checkout http://code.sixapart.com/svn/mogilefs/trunk
Installation
Creating a Database
Setting up a MySQL server is beyond the scope of this document – there are packages available for most distributions, find one that suits your needs. Once you have a MySQL server up and running on host mogiledb.yourdomain.com, create a table and user for MogileFS. Some libraries don't play well with new MySQL passwords – if you use these, set the password using the “OLD_PASSWORD” function. Make sure you change the password to something better than the example.
# mysql
mysql> CREATE DATABASE mogilefs;
mysql> GRANT ALL ON mogilefs.* TO 'mogile'@'%';
mysql> SET PASSWORD FOR 'mogile'@'%' = OLD_PASSWORD( 'sekrit' );
mysql> FLUSH PRIVILEGES;
mysql> quit
You will also need to create the schema – that is covered later in this document.
Setting up the Trackers and Storage Servers
You need to obtain and install the MogileFS-specific perl modules and their dependencies before MogileFS will install successfully. Hopefully you pulled the MogileFS source from the SVN repository – if so, the server components will be located in the trunk/server directory. You probably need to install dependent modules first (see below) but once all dependencies have been met, you can install Mogile with the following commands (starting in the top of the SVN directory you pulled):
# cd trunk/server/
# perl Makefile.PL
# make
# make test
# make install
If you get any errors during this process it will probably be errors telling you that a dependent module is missing. If during the 'make test' step you get the error, "t/00-startup....DBI connect('mysql','root',...) failed: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) at t/lib/mogtestlib.pl line 16" it can probably be ignored – it should not be necessary for MySQL to be running on any host other than mogiledb.yourdomain.com.
You probably want to install some helpful utilities on each tracker or storage server as well (these will be needed for later configuration). These are located in the trunk/utils directory and can be installed with the following commands (starting in the top of the SVN directory you pulled):
# cd trunk/utils/
# perl Makefile.PL
# make
# make test
# make install
You also want the API – the utilities will require this. These are located in the trunk/api/perl directory and can be installed with the following commands (starting in the top of the SVN directory you pulled):
# cd trunk/api/perl
# perl Makefile.PL
# make
# make test
# make install
Installing Perl Module Dependencies
As for Perl modules, you need (at least) Perlbal and Danga-Socket and any dependencies. If you are not sure how to locate the dependent modules, consider using CPAN (http://cpan.org/) as it can install a module and all of the dependencies. If you already have a reasonable Perl installation you probably have most of the non-MogileFS modules already.
The basic process for installing most perl modules is:
extract the module:
# tar -xvzf Linux-AIO-1.9.tar.gz
Create the makefile:
# cd Linux-AIO-1.9
# perl Makefile.PL
Build the code:
# make
Test the code (some modules may not have this step but it is okay to run anyway):
# make test
Install it:
# make install
If you see any errors stating that a module dependency is missing, obtain and install the missing module first and try again.
Setup
Database Configuration
The database is empty and will need a schema applied. The ' trunk/server' directory has a utility named 'mogdbsetup' to make this process simple. By default it assumes the database is located on localhost so if you are running it from a different host you will need to provide the host name on the command line.
# ./mogdbsetup --dbhost=mogiledb.yourdomain.com --dbname=mogilefs --dbuser=mogile --dbpassword=sekrit
Again, make sure you replace the host and password so that they match you database configuration from above.
The mogdbsetup utility does not specify a table type by default so your tables will match the defaults for your database. In many cases this will mean that you end up with MyISAM tables. If you prefer InnoDB tables you will either need to make sure your database defaults to InnoDB or you can manually convert the tables (both of these are outside of the scope of this document but there are plenty of examples out there).
Tracker Configuration
On each tracker server (mogiletracker.yourdomain.com), create a configuration file at /etc/mogilefs/mogilefsd.conf with the following:
db_dsn DBI:mysql:mogilefs:mogiledb.yourdomain.com
db_user mogile
db_pass sekrit
conf_port 6001
listener_jobs 5
db_dsn points to your database instance. If you are running the database on the same machine as the storage server you can omit ":mogiledb.yourdomain.com: and it will use the local machine. db_user and db_pass should match the user and password you configured when setting up your database.
The program 'mogilefsd' will not run as root so you will need to run this as a non-root user. To create a user for this, enter the following command and follow the prompts to create the "mogile" user:
# adduser mogile
In order to use the tools to setup the storage servers you will need to have the trackers running. Refer to "Starting Trackers", below.
Storage Server Configuration
On each storage server, create the storage directory (make sure it has access permissions for the user you will use to run mogstored):
# mkdir /var/mogdata
Configure it:
On each storage server, create a configuration file at /etc/mogilefs/mogstored.conf with the following:
httplisten=0.0.0.0:7500
mgmtlisten=0.0.0.0:7501
docroot=/var/mogdata
Use 'mogadm' to add each storage server to the database. This requires that the trackers are already running so if you have not already started them, refer to "Starting Trackers", below. You need to supply the Perl lib path which has the 'MogileFS.pm' perl module installed – this was installed if you installed the API in the "Setting up the Trackers and Storage Servers" section above. The following example would add the host mogilestorage.yourdomain.com as a storage server, assuming that mogilestorage.yourdomain.com had an IP address of 192.168.42.3 (listening on port 7500) and your tracker had an IP address of 192.168.42.1 (listening on port 6001):
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001 host add mogilestorage --ip=192.168.42.3 --port=7500 --status=alive
You can confirm that your host(s) were added with the following command;
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001 host list
You also need to add devices for each storage host. If you don't provide a device id mogadm is supposed to do it for you... in practice I find it dies with an error, so you will need to manually add a unique device id after the host:
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001 device add mogilestorage 1
Finally, add the correctly-name device (folder) to each storage host. I have been unable to get the tools to handle this well, so I am probably doing something wrong. As a workaround, I used the modadm device list command to see what device names were assigned and then I added the folders to my storage hosts. Run the following command:
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001 device list
It will list each host and the device name followed by its status and storage available. Here is example output:
mogilestorage [1]: alive
used(G) free(G) total(G)
dev1: alive 0.892 67.772 68.664
This means "mogilestorage" has a host id of "1" and it has one device named "dev1" on it and that device is in the "alive" state (your other statistics will probably be zeros). Using the example output above, you would simply create the directory on mogilestorage.yourdomain.com:
# mkdir -p /vag/mogdata/dev1
Finally, confirm your devices are configured:
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001 device list
Running MogileFS
Starting Storage Servers
Start each storage server (mogilestorage.yourdomain.com) by running the following command as root:
# mogstored --daemon
Starting Trackers
Trackers will not run as root so you will need to run them as another user. If you created the "mogile" user when seetingup the trackers, the following commands will work (assumes you start logged in to mogiletracker.yourdomain.com as root):
# su mogile
$ mogilefsd -c /etc/mogilefs/mogilefsd.conf --daemon
$ exit
You can confirm that the trackers are running with the following command:
# ps aux | grep mogilefsd
If you don't get a list of running processes the trackers are not running.
Try It
Do a Quick Sanity Test
The 'mogadm' tool can be used to make sure your trackers are functioning. You need to supply the Perl lib path which has the 'MogileFS.pm' perl module installed – this was installed if you installed the API in the "Setting up the Trackers and Storage Servers" section above. The following example would check all mogile components using the trackers at IP address 192.168.42.1 and 192.168.42.2, both listening on port 6001:
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001,192.168.42.2:6001 check
Try it with Real Data
Create a domain:
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001,192.168.42.2:6001 domain add testdomain
Add a class to the domain:
# mogadm --lib=/usr/local/share/perl/5.8.4 --trackers=192.168.42.1:6001,192.168.42.2:6001 class add testdomain testclass
Troubleshooting
When starting the storage daemon I get "ERROR: Directory not found for service mogstored":
You did not create a storage directory or you are starting the mogstored as a user that does not have access to the directory.
Problems Connecting:
Make sure your firewall is open. Using the examples in this document, port 7500 and 7501 needed on storage servers, 6001 on trackers.
While testing I get “MogileFS::Backend: couldn't connect to mogilefsd backend at /usr/local/share/perl/5.8.4/MogileFS.pm line 56”:
Make sure your tracker connects to the database:
# su mogile
$ mogilefsd -c /etc/mogilefs/mogilefsd.conf
I get a "REQUEST FAILURE" on "Checking devices..." when doing a mogadm check:
Confirm the devices (folders) exist in the /var/mogdata directory and that the use running the mogstored process has full permissions to these directories. If the device does not exist, add it – it will take a few seconds for mogadm check to reflect the fixed directory. For example, if mogilestorage.yourdomain.com had the device "dev1" on it, you would add the directory:
# mkdir -p /vag/mogdata/dev1
05/24/06
Whenever I try to install Debian 3.1 on an AMD64 system using the XFS file system I get failures in the installer. Hopefully the following workaround will help others that have run into the same problem. I was using a 4 CPU Opteron system, 32 Gigs of RAM and Megaraid controller.
Get the netinst CD image from http://www.debian.org/ports/amd64/
Boot CD (Debian Sarge AMD64 3.1 R0A Net Install)
Select Language
Select Country
Select Keymap
Select "Ok" Ethernet selection (if you are like me you need to debug this failure and realize somebody unplugged the switch you are connected to)
Enter hostname
Enter domain
Partition Disks:
Create XFS partition at /
Format the partition
The dialog "Installing the Debian Base System" appears and begins installation
Install LILO to Master Boot Record (you may get lucky)
The dialog "Installing Lilo" appears and will probably fail (if it is successful, skip the rest of my workaround)
The dialog "LILO installation failed" appears with 'Running "/sbin/lilo" failed with error code "1".'
Click "continue" twice - you should go to "Debian Installer Main Menu"
Press Alt-F2 to get to another console (and press enter to activate the console)
Type:
chroot /target
vi /etc/lilo.conf
At line 106 you should see the following:
label=Linux
read-only
add the following line below these
initrd=/initrd.img
So it should now look like this:
label=Linux
read-only
initrd=/initrd.img
quit vi
Create symlinks with the following commands:
ln -s /boot/vmlinuz /vmlinuz
ln -s /boot/initrd.img /initrd.img
write your boot sector using liloconfig - type:
liloconfig
When asked "Install a boot block using your current LILO configuration?" I went with "Yes"
type "exit" to get our of the chroot
type "exit" to get our of the console
Press Alt-F1 to get back to the installer
Select "Continue without boot loader"
Select "Finish the installation" if it does not advace automatically
Reboot
Your system should reboot as the kernel should load into the second part of the installation.
Configure as normal
12/29/05
Finally, blogs are working again... grumble.
09/11/05
I am always fascinated when I see an avatar snapshot that has a real photo. Here is the short list of avatar snapshot photos I came across this morning:
http://avatars.imvu.com/Guest_sxciboo
http://avatars.imvu.com/Kell02
http://avatars.imvu.com/Guest_MapTheUniverse
http://avatars.imvu.com/Guest_ProaSailor
http://avatars.imvu.com/Guest_xFadinxAwayx
http://avatars.imvu.com/Guest_Malchihiro
http://avatars.imvu.com/Guest_AndrewAnimationCom
http://avatars.imvu.com/Guest_Enyaron
Okay, this one is probably not the actual person ![]()
