From Fedora Project Wiki

Revision as of 18:33, 29 April 2013 by Toshio (talk | contribs) (Add ticket link, owners, etc)

Databases are currently a single point of failure in our infrastructure. We'd like a setup that lets us reboot a db server without downtime. Most of our databases run on PostgreSQL; one service (the wiki) runs on MySQL.

Owners: Toshio (abadger1999), Seth (skvidal), Kevin (nirik)
Ticket: Infra ticket 2718

Features we're looking for

Must have

These are the reasons we want db replication. Anything less than this would be unacceptable.

  • Switchover
    • We want to be able to reboot a db server: a sysadmin manually specifies that db1 is going away and db2 should take over
  • Very short downtime
    • Less than 5 minutes for a switchover or failover event
  • No loss of data. Once the db reports data as committed, copies must already exist on other boxes
  • Performance must meet our current demands, but only our current demands.
    • If we need to service 100 FAS commits per second and the current (unreplicated) service could theoretically handle 1000, the replicated solution only needs to handle 100 commits per second, not 1000.
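
One way to check the downtime requirement during a test switchover is to probe the database at regular intervals and measure how long it stayed unreachable. The sketch below is only an illustration: the sample format, the function names, and the idea of probing from a client are assumptions, not part of any existing Fedora tooling. Only the 5-minute limit comes from the list above.

```python
from typing import Iterable, List, Tuple

# "Less than 5 minutes" from the must-have list above.
MAX_DOWNTIME_SECONDS = 5 * 60


def outage_windows(samples: Iterable[Tuple[float, bool]]) -> List[float]:
    """Return the length in seconds of each outage seen in the samples.

    Each sample is (timestamp_seconds, db_responded). This format is a
    hypothetical one chosen for the sketch. An outage starts at the first
    failed probe and ends at the next successful probe. An outage that is
    still open at the last sample is reported with its length so far.
    """
    windows: List[float] = []
    down_since = None
    last_timestamp = None
    for timestamp, ok in samples:
        last_timestamp = timestamp
        if not ok and down_since is None:
            down_since = timestamp
        elif ok and down_since is not None:
            windows.append(timestamp - down_since)
            down_since = None
    if down_since is not None:
        windows.append(last_timestamp - down_since)
    return windows


def meets_downtime_requirement(samples: Iterable[Tuple[float, bool]]) -> bool:
    """True if every observed outage is shorter than the 5-minute limit."""
    return all(w < MAX_DOWNTIME_SECONDS for w in outage_windows(samples))
```

For example, probes that fail at t=10 and t=20 and succeed again at t=130 give a single 120-second outage, which passes. The measurement is only as fine as the probe interval, so a real test would want probes spaced much closer than the limit being checked.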

Really really want

If a solution has these and its competition doesn't, chances are we're going to go with that solution.

  • Auto failover
    • db1 stops responding; db2 automatically takes over.
  • No downtime (as long as one db node is up)
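
The auto failover behaviour described above can be sketched as a small selection rule: keep using the current node while it responds, otherwise move to the first node that does. This is a minimal sketch under stated assumptions. The function name, the node names, and the `is_responding` callback are hypothetical, and the sketch deliberately ignores the hard parts (fencing the old primary, avoiding split-brain, and confirming that the new node holds all committed data) that a real replication solution has to handle.

```python
from typing import Callable, Optional, Sequence


def pick_active_node(
    nodes: Sequence[str],
    is_responding: Callable[[str], bool],
    current: Optional[str] = None,
) -> Optional[str]:
    """Choose which db node clients should talk to.

    Keeps the current node if it still responds, so a healthy primary is
    never swapped out needlessly. Otherwise returns the first responding
    node in the given order, or None if every node is down.
    """
    if current is not None and is_responding(current):
        return current
    for node in nodes:
        if is_responding(node):
            return node
    return None
```

With `nodes = ["db1", "db2"]`, the rule returns db2 once db1 stops responding, and returns `None` only when no node is up, which matches the "no downtime as long as one db node is up" goal.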

Won't lose sleep over

May I have a pony too?

  • Load balancing (reads or writes)
    • Currently we don't have load issues
  • Replication to other data centers