Infrastructure/DB Replication Plan

Revision as of 03:23, 13 August 2013 by Kevin (Talk | contribs)


Databases are currently a single point of failure in our infrastructure. We'd like to come up with a setup that lets us reboot a db server without downtime. We have mostly postgres databases, plus a couple of things (the wiki and paste) on mysql.

Owners: Toshio (abadger1999), Kevin (nirik)
Ticket: Infra ticket 2718


Existing setup overview

We currently have the following database servers:


  • db02.stg - staging server that runs both postgresql and mysql and hosts all databases for staging
  • db01 - primary postgresql database
  • db04 - koji postgresql database
  • db05 - mysql database (currently paste and mediawiki)
  • db-fas01 - postgresql database for fas only

Features we're looking for

Must have

These are the reasons we want db replication. Anything less than this would be unacceptable.

  • Switchover
    • We want to be able to reboot a db server: a sysadmin manually specifies that db1 is going away and db2 should take over.
  • Very short downtime
    • less than 5 minutes on a switchover/failover event
  • No loss of data. Once the db says data is committed, there must be copies on other boxes.
  • Performance must meet our current demands, but only our current demands.
    • If we need to service 100 fas commits per second but the current (unreplicated) service could theoretically handle 1000 commits, the replication solution only needs to handle 100 commits, not 1000.

Really really want

If a solution has these and its competition doesn't, chances are we're going to go with that solution.

  • Auto failover
    • db1 stops responding; db2 automatically takes over.
  • No downtime (as long as one db node is up)

Won't lose sleep over

May I have a pony too?

  • load balancing (reads or writes)
    • Currently we don't have load issues
  • replication to other data centers

Postgres technologies to explore

synchronous streaming replication + repmgr

The core replication is in postgres 9.x. repmgr adds command line scripts and automation to make switchover and failover much easier to manage.
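As a rough sketch of the workflow repmgr provides (hostnames, paths, and the repmgr.conf location are illustrative; exact flags vary between repmgr versions):

```shell
# On the standby: clone the primary and register it (db01 as primary here)
repmgr -D /var/lib/pgsql/data -d repmgr -U repmgr standby clone db01
repmgr -f /etc/repmgr.conf standby register

# Planned reboot of the primary: promote the standby to take over
repmgr -f /etc/repmgr.conf standby promote

# Any remaining standbys then follow the new primary
repmgr -f /etc/repmgr.conf standby follow
```

This covers the manual switchover case above; repmgr's monitoring daemon can also drive automatic failover.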

Note: synchronous replication gives us the no-data-loss guarantee. However, it is hard to mix that with high availability, since a transaction that writes data can't commit until a synchronous standby replies that it has received the data too. Reading the synchronous replication docs, this can be mitigated by having multiple slaves: postgres only waits for the first available standby listed in its configuration, not all of them, and falls back to the next listed standby if that one goes away. So with two slaves, the system can keep functioning if one of them is stopped.
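A minimal sketch of the primary's configuration for this, per the postgres 9.2 docs (the standby names are illustrative):

```ini
# postgresql.conf on the primary (sketch; names are illustrative)
wal_level = hot_standby
max_wal_senders = 3
synchronous_commit = on
synchronous_standby_names = 'db01-standby1, db01-standby2'
# Only the first connected standby on this list acts as the synchronous
# standby; if it drops out, the next one takes over, so commits keep
# completing as long as at least one listed standby is up.
```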

Note that PostgreSQL 9.2 is available in the Red Hat Software Collections channel for RHEL.

repmgr is currently not packaged in Fedora/EPEL.

repmgr seems to require postgresql compiled with dynamic libraries that the Fedora version doesn't provide.


pgpool-II

PGPool-II is very flexible. We'll have to pick out the set of features that gives us what we want and test it heavily to see how it works.

PGPool is a proxy-type solution: clients talk to pgpool, which in turn sends writes to all backend databases. The tricky part is re-adding a database once it has dropped out. You must rsync the data over and replay the missed transactions before re-adding it to the pgpool collection.
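A sketch of the relevant pieces of pgpool.conf (hostnames and the recovery script name are illustrative; the recovery script itself is something we'd have to write):

```ini
# pgpool.conf (sketch; hostnames illustrative)
replication_mode = on            # pgpool sends writes to every backend
backend_hostname0 = 'db01'
backend_port0 = 5432
backend_hostname1 = 'db01-standby'
backend_port1 = 5432

# Online recovery: how a dropped node gets resynced before re-attach.
# basebackup.sh is a site-provided script (e.g. rsync of the data dir).
recovery_user = 'postgres'
recovery_1st_stage_command = 'basebackup.sh'
```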

Shared storage?

We could look at some kind of shared storage plus failover (pacemaker, etc.). This would mean some short downtime when switching, but might be simpler than the above solutions. Possible options: iscsi from netapp, glusterfs, drbd.
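A rough sketch of the pacemaker side, assuming drbd-backed storage (device, mount point, IP, and resource names are all illustrative; pcs syntax varies by version):

```shell
# Floating IP, the shared filesystem, and a postgres instance
pcs resource create db-ip ocf:heartbeat:IPaddr2 ip=10.5.126.60 cidr_netmask=24
pcs resource create db-fs ocf:heartbeat:Filesystem device=/dev/drbd0 \
    directory=/var/lib/pgsql fstype=ext4
pcs resource create db-pg ocf:heartbeat:pgsql

# A group starts its members in order on the same node and fails them
# over together, which gives the switchover behavior described above.
pcs resource group add db-group db-ip db-fs db-pg
```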

Misc other items

  • Should we split the databases out per application? More instances to manage, but it spreads downtime around.
  • We should migrate completely off mysql so we only have postgresql to worry about.
  • Can we gain anything from stopping writes during reboots/outages? i.e., serve read-only traffic from replicas while the primary is rebooting?
  • Could read-only replicas in other datacenters help us any?
  • Staging, or cloud instances, should be a good place to test ideas.