Setup MariaDB Cross Data Center Disaster Recovery (DC-DR)
New to KubeDB? Please start here.
This guide walks through deploying a DC-DR enabled distributed MariaDB across two
Member data centers (DCs) plus one Arbiter DC, and verifying that exactly one DC
is writable. Read the
DC-DR Overview
first for the architecture and the concepts referenced below (the primary-dc
Lease, the marker fence, role labeling, and the cross-DC asynchronous link).
Before you begin
DC-DR builds directly on the distributed MariaDB substrate. Complete the following from the Distributed MariaDB Overview before you start here:
- An OCM hub with the three participating spoke clusters joined and accepted. In this guide they are dc-a, dc-b, and dc-c. The OCM spoke cluster name is the DC name and must match the PlacementPolicy clusterName exactly.
- The OCM WorkConfiguration patch (RawFeedbackJsonString) applied on every spoke.
- KubeSlice installed, a project and SliceConfig covering all three clusters, and CoreDNS forwarding *.slice.local on every cluster.
- The KubeDB operator installed on the hub with --set petset.features.ocm.enabled=true.
In addition, DC-DR requires the cross-DC failover authority:
- The dr-controlplane three-site etcd quorum running behind the OCM control plane, with one etcd member in each of dc-a, dc-b, and dc-c (the Arbiter DC contributes its vote here).
- The per-DC dr-controlplane agent running on each spoke, projecting the primary-dc marker ConfigMap into the dc-failover namespace.
- The KubeDB operator started with the DC-DR flags so its hub orchestrator watches the Lease: --dc-dr-enabled, --dc-dr-coord-kubeconfig, and --dc-dr-local-dc.
Note: The dr-controlplane agent needs write access to ConfigMaps in each spoke’s dc-failover namespace, and the MariaDB coordinator needs read access to that ConfigMap from the database namespace. These RBAC rules ship with the DC-DR Helm values.
Step 1: Define the DC-DR PlacementPolicy
The PlacementPolicy is what turns a plain distributed MariaDB into a DC-DR cluster. Two things matter here:
- clusterSpreadConstraint.failoverPolicy with mode: TwoDC and trigger.scope: Global. This declares the layout of two Member DCs plus one Arbiter DC, and that a single primary-dc Lease decides the writable DC for the whole cluster.
- A role on each distributionRule. The two data centers are role: Member (each becomes a self-contained Galera cluster), and the third is role: Arbiter with an empty replicaIndices (no MariaDB data, only the dr-controlplane etcd vote).
Create placement-policy.yaml:
apiVersion: apps.k8s.appscode.com/v1
kind: PlacementPolicy
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  name: distributed-mariadb-dcdr
spec:
  clusterSpreadConstraint:
    slice:
      projectNamespace: kubeslice-demo-distributed-mariadb
      sliceName: demo-slice
    failoverPolicy:
      mode: TwoDC
      trigger:
        scope: Global
    distributionRules:
      - clusterName: dc-a
        role: Member
        storageClassName: local-path # optional; omit to use the cluster default
        replicaIndices:
          - 0
          - 1
          - 2
      - clusterName: dc-b
        role: Member
        storageClassName: local-path # optional; omit to use the cluster default
        replicaIndices:
          - 3
          - 4
          - 5
      - clusterName: dc-c
        role: Arbiter
        replicaIndices: []
  nodeSpreadConstraint:
    maxSkew: 1
    whenUnsatisfiable: ScheduleAnyway
  zoneSpreadConstraint:
    maxSkew: 1
    whenUnsatisfiable: ScheduleAnyway
Note: Each Member DC’s replicaIndices set becomes one independent Galera cluster with its own gcomm peer set and its own quorum. Use an odd count per Member DC (3 here) so each local Galera cluster keeps odd quorum without a per-DC garbd. A Member DC with an even local node count gets its own intra-DC garbd automatically.
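The odd and even cases in the note above can be sanity-checked with a little shell arithmetic before writing the policy. This is only a sketch: galera_quorum and needs_garbd are hypothetical helper names, not part of KubeDB, and the quorum formula assumes plain majority voting with equal node weights.

```shell
# Hypothetical helpers (not part of KubeDB) for reasoning about the size of
# one Member DC's local Galera cluster. Assumes equal-weight majority voting.

# Prints the number of votes needed for a strict majority among $1 voters.
galera_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit 0) when the local node count $1 is even, which is the case
# where the note above says an intra-DC garbd is added automatically.
needs_garbd() {
  [ $(( $1 % 2 )) -eq 0 ]
}
```

For the layout in this guide, galera_quorum 3 prints 2 and needs_garbd 3 fails, so each Member DC tolerates the loss of one local node without a garbd.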
Apply the policy on the hub:
$ kubectl apply -f placement-policy.yaml --context dc-a --kubeconfig $HOME/.kube/config
Step 2: Create the DC-DR MariaDB
Create the demo namespace if it does not exist:
$ kubectl create namespace demo
Define the distributed MariaDB and reference the PlacementPolicy. The interim
annotation dr.kubedb.com/enabled: "true" enables the DC-DR behavior (this is
transitioning to the PlacementPolicy failoverPolicy as the single source of
truth). Create mariadb.yaml:
apiVersion: kubedb.com/v1
kind: MariaDB
metadata:
  name: mariadb-dcdr
  namespace: demo
  annotations:
    dr.kubedb.com/enabled: "true"
spec:
  distributed: true
  deletionPolicy: WipeOut
  replicas: 6
  storage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 500Mi
  storageType: Durable
  version: 12.1.2
  podTemplate:
    spec:
      podPlacementPolicy:
        name: distributed-mariadb-dcdr
spec.replicas: 6 is partitioned across the Member DCs by the PlacementPolicy:
3 nodes in dc-a and 3 in dc-b. The Arbiter DC (dc-c) carries no MariaDB
data.
Apply the resource on the hub:
$ kubectl apply -f mariadb.yaml --context dc-a --kubeconfig $HOME/.kube/config
The operator expands this one CR into one Galera cluster per Member DC, each with
its own governing ServiceExport, and configures the standby DC’s node 0 as a GTID
asynchronous replica of the active DC’s primary ServiceExport. The DC that first
acquires the primary-dc Lease bootstraps writable; the other Member DC seeds
from it and follows.
Step 3: Verify exactly one writable DC
1. Check which DC holds the Lease
The active DC is whichever spoke holds the primary-dc Lease. Inspect the
projected marker ConfigMap on each spoke:
$ kubectl get configmap primary-dc -n dc-failover -o yaml --context dc-a
$ kubectl get configmap primary-dc -n dc-failover -o yaml --context dc-b
The data.activeDC value names the active DC and is the same on every spoke. In
this example assume it is dc-a.
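Comparing the marker by eye works for two spokes; the comparison can also be scripted. The sketch below assumes the marker is readable with the same kubectl contexts used above. all_agree and active_dc_on are hypothetical helper names introduced for illustration.

```shell
# Hypothetical helper: succeeds only when every argument is the same
# non-empty value, i.e. when all spokes report the same active DC.
all_agree() {
  [ "$#" -gt 0 ] && [ -n "$1" ] || return 1
  first=$1
  for value in "$@"; do
    [ "$value" = "$first" ] || return 1
  done
}

# Reads data.activeDC from the projected marker ConfigMap on one spoke.
# $1 is the kubectl context name of that spoke.
active_dc_on() {
  kubectl get configmap primary-dc -n dc-failover \
    -o jsonpath='{.data.activeDC}' --context "$1"
}

# Example usage (requires access to both spokes):
#   all_agree "$(active_dc_on dc-a)" "$(active_dc_on dc-b)" && echo "markers agree"
```

An empty value is treated as disagreement, so a spoke whose marker has not been projected yet fails the check instead of passing silently.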
2. Confirm the DR status on the CR
$ kubectl get mariadb mariadb-dcdr -n demo -o jsonpath='{.status.disasterRecovery}' --context dc-a | jq
Output (abridged):
{
  "activeDC": "dc-a",
  "phase": "Steady",
  "dataCenters": [
    { "clusterName": "dc-a", "role": "Member", "writable": true, "healthy": true },
    { "clusterName": "dc-b", "role": "Member", "writable": false, "healthy": true, "lagBytes": 0 },
    { "clusterName": "dc-c", "role": "Arbiter", "healthy": true }
  ]
}
Exactly one dataCenters[] entry has writable: true.
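The single-writer check can be automated by counting the writable entries in that status. The following is a minimal sketch rather than an official check: count_writable is a hypothetical helper, and it uses a plain-text match on the status JSON (assuming the field is rendered as "writable": true, with or without the space) so that it needs no tools beyond grep.

```shell
# Hypothetical helper: reads the DR status JSON on stdin and prints how many
# dataCenters[] entries are marked writable. A text match is used instead of
# a JSON parser to keep the sketch dependency-free.
count_writable() {
  grep -o -E '"writable"[[:space:]]*:[[:space:]]*true' | wc -l | tr -d ' '
}

# Example usage (requires cluster access); the expected result is 1:
#   kubectl get mariadb mariadb-dcdr -n demo \
#     -o jsonpath='{.status.disasterRecovery}' --context dc-a | count_writable
```

Any result other than 1 means the status does not show a single writable DC and is worth investigating before sending traffic.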
3. Confirm role labels resolve only to the active DC
Only the active DC’s nodes carry kubedb.com/role: Primary; the standby DC’s
nodes are standby:
# Active DC nodes are Primary
$ kubectl get pods -n demo -l 'kubedb.com/role=Primary' --context dc-a
# Standby DC nodes are standby
$ kubectl get pods -n demo -l 'kubedb.com/role=standby' --context dc-b
Because the single <db> primary Service resolves only to the Primary labeled
nodes, every client write lands on the active DC.
4. Confirm the standby DC is read only and following
Connect to the standby DC’s node 0 and confirm the fence and the asynchronous replica:
$ kubectl exec -it -n demo pod/mariadb-dcdr-3 --context dc-b -- bash
mariadb -uroot -p"$MYSQL_ROOT_PASSWORD"
SHOW VARIABLES LIKE 'super_read_only';
SHOW SLAVE STATUS\G
super_read_only is ON, and SHOW SLAVE STATUS shows the GTID asynchronous
replica streaming from the active DC’s primary endpoint
(mariadb-dcdr.demo.svc.slice.local) with both threads running and a small
Seconds_Behind_Master.
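The same replica check can be scripted against the vertical output of SHOW SLAVE STATUS. This sketch assumes the standard Slave_IO_Running, Slave_SQL_Running, and Seconds_Behind_Master fields; replica_healthy is a hypothetical helper, and the 30 second default is an arbitrary example rather than a KubeDB threshold.

```shell
# Hypothetical helper: reads SHOW SLAVE STATUS\G output on stdin and succeeds
# only when both replication threads report Yes and Seconds_Behind_Master is
# a number no larger than max_lag ($1, default 30 seconds, an example value).
replica_healthy() {
  awk -v max_lag="${1:-30}" '
    $1 == "Slave_IO_Running:"      { io  = $2 }
    $1 == "Slave_SQL_Running:"     { sql = $2 }
    $1 == "Seconds_Behind_Master:" { lag = $2 }
    END {
      ok = (io == "Yes" && sql == "Yes" && lag ~ /^[0-9]+$/ && lag + 0 <= max_lag + 0)
      exit (ok ? 0 : 1)
    }'
}

# Example usage from inside the standby pod:
#   mariadb -uroot -p"$MYSQL_ROOT_PASSWORD" -e 'SHOW SLAVE STATUS\G' | replica_healthy 10
```

A Seconds_Behind_Master of NULL, which the server reports when replication is not running, fails the numeric test and is therefore treated as unhealthy.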
5. Confirm writes are refused on the standby DC
A direct write attempt against the standby DC is rejected by the fence:
CREATE DATABASE should_fail;
-- ERROR 1290 (HY000): The MariaDB server is running with the --super-read-only option
This confirms the fail-closed guarantee: only the Lease holder accepts writes.
Triggering a planned switchover
To move the active DC on purpose (for example to drain a DC for maintenance) with zero data loss, set the switchover annotation on the CR. The hub quiesces writes on the current active DC, waits for the target’s GTID to catch up, then moves the Lease:
$ kubectl annotate mariadb mariadb-dcdr -n demo \
dr.kubedb.com/switchover-to=dc-b --overwrite --context dc-a
Watch the DR status transition through FailingOver back to Steady with
activeDC: dc-b:
$ kubectl get mariadb mariadb-dcdr -n demo \
-o jsonpath='{.status.disasterRecovery.phase} {.status.disasterRecovery.activeDC}{"\n"}' \
--context dc-a --watch
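For automation, the watch can be replaced by a bounded polling loop that returns once the switchover has settled. This is a sketch under assumptions: switchover_done and wait_for_switchover are hypothetical helpers, the loop reuses the phase and activeDC fields printed above, and the budget of 60 attempts at 5 second intervals is an arbitrary example.

```shell
# Hypothetical helper: $1 is a "<phase> <activeDC>" line as printed by the
# jsonpath above, $2 is the target DC. Succeeds once the phase is Steady and
# the active DC is the target.
switchover_done() {
  [ "$1" = "Steady $2" ]
}

# Polls the CR until the switchover to DC $1 settles or the attempts run out.
# Returns 0 on success and 1 on timeout.
wait_for_switchover() {
  target=$1
  attempts=60
  while [ "$attempts" -gt 0 ]; do
    line=$(kubectl get mariadb mariadb-dcdr -n demo --context dc-a \
      -o jsonpath='{.status.disasterRecovery.phase} {.status.disasterRecovery.activeDC}')
    switchover_done "$line" "$target" && return 0
    attempts=$((attempts - 1))
    sleep 5
  done
  return 1
}

# Example usage (requires cluster access):
#   wait_for_switchover dc-b && echo "dc-b is now the active DC"
```

Requiring both the Steady phase and the new activeDC avoids declaring success while the status still shows FailingOver.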
Cleanup
$ kubectl delete mariadb mariadb-dcdr -n demo --context dc-a
$ kubectl delete placementpolicy distributed-mariadb-dcdr --context dc-a
Note: Per-DC PlacementPolicies and ServiceExports created by the operator are cleaned up with the MariaDB. The Arbiter DC’s dr-controlplane etcd member is part of the control plane, not the database, and is not removed by deleting the MariaDB.
Next Steps
- Review the DC-DR Overview for failover, failback, and the lag guard semantics.
- See the Distributed MariaDB Overview for the OCM, KubeSlice, and operator install that DC-DR depends on.































