Physical backup and restore of MongoDB with PBM

December 12, 2025

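A previous article covered installing PBM and getting started with it in detail, and concluded that it can greatly improve backup efficiency for large data volumes. Based on how our MongoDB clusters are actually deployed in the field, this article records the detailed steps for physical backup and restore with PBM, so that field teams can use it as a direct reference.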

I. Make a physical backup

Once pbm is configured, taking a physical backup is simple:
pbm backup --type=physical

II. Restore from a physical backup

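1. Before restoring, first confirm which backups currently exist (the listing below is in the format printed by pbm list):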

Backup snapshots:
  2025-12-10T08:22:10Z <physical> [restore_to_time: 2025-12-10T08:22:12Z]
  2025-12-10T08:33:06Z <logical> [restore_to_time: 2025-12-10T08:53:18Z]
  2025-12-12T06:43:30Z <physical> [restore_to_time: 2025-12-12T06:43:32Z]
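
Picking the backup name by eye is error-prone when the list is long. As a minimal sketch, assuming the listing keeps the line format shown above (name, then the type in angle brackets), the newest physical backup could be extracted like this. The function name latest_physical_backup is hypothetical and not part of PBM.

```shell
#!/usr/bin/env bash
# Read a backup listing (format as shown above) on stdin and print the name
# of the newest physical backup. Names are ISO-8601 UTC timestamps, so a
# plain string comparison orders them chronologically.
latest_physical_backup() {
  awk '
    $2 == "<physical>" && $1 > name { name = $1 }   # keep the largest name seen
    END { if (name != "") print name; else exit 1 } # non-zero exit if none found
  '
}
```

A possible use is: pbm list | latest_physical_backup. The result should still be checked by a person before it is passed to pbm restore.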

2. Restoring a sharded cluster

Note: a restore automatically wipes the current data. Proceed with great caution!

1). Stop the balancer

mongos> sh.stopBalancer()

2). Shut down all mongos instances to block client access

killall mongos

3). Disable PITR

pbm config --set pitr.enabled=false
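
Steps 1) to 3) above could be wrapped in one function so they always run in the same order. This is only a sketch under assumptions: MONGO_SHELL (default mongosh), MONGOS_URI (default mongodb://localhost:27017) and DRY_RUN are hypothetical variables introduced here, and a real cluster will most likely need credentials in the URI. With DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/usr/bin/env bash
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# pre_restore: the three preparation steps, in the order used in this article.
pre_restore() {
  # Assumptions: shell binary and mongos address are not given in the article.
  local shell_bin="${MONGO_SHELL:-mongosh}"
  local mongos_uri="${MONGOS_URI:-mongodb://localhost:27017}"

  # 1) stop the balancer through a mongos; abort if this fails
  run "$shell_bin" "$mongos_uri" --quiet --eval 'sh.stopBalancer()' || return 1

  # 2) stop mongos on THIS host only; repeat on every host that runs mongos
  run killall mongos || echo "no local mongos process was stopped" >&2

  # 3) disable point-in-time recovery before restoring
  run pbm config --set pitr.enabled=false
}
```

Running DRY_RUN=1 pre_restore first shows exactly what would be executed, which is a cheap safeguard before a destructive restore.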

4). Full restore

pbm restore "2025-12-12T06:43:30Z"

[mongod@node1:1 /u01/nfs/ynzybak_mongodb_pbm]$ pbm restore "2025-12-12T06:43:30Z"
Starting restore 2025-12-12T07:17:14.062699954Z from '2025-12-12T06:43:30Z'...............................................Restore of the snapshot from '2025-12-12T06:43:30Z' has started.
Check restore status with: pbm describe-restore 2025-12-12T07:17:14.062699954Z -c </path/to/pbm.conf.yaml>
No other pbm command is available while the restore is running!

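As the output above shows, do not run any other pbm operations while the restore is in progress, to avoid unexpected results!
During the restore, the status can only be checked as follows: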
pbm describe-restore 2025-12-12T07:17:14.062699954Z -c /u01/mongodb/conf/pbm_config.yaml

[mongod@node1:1 /u01/nfs/ynzybak_mongodb_pbm]$ pbm describe-restore 2025-12-12T07:17:14.062699954Z -c /u01/mongodb/conf/pbm_config.yaml
name: "2025-12-12T07:17:14.062699954Z"
opid: ""
backup: ""
type: physical
status: running
last_transition_time: "2025-12-12T07:18:00Z"
replsets:
- name: configs
  status: done
  last_transition_time: "2025-12-12T07:22:29Z"
  nodes:
  - name: node1:21000
    status: done
    last_transition_time: "2025-12-12T07:22:24Z"
  - name: node2:21000
    status: done
    last_transition_time: "2025-12-12T07:22:24Z"
  - name: node3:21000
    status: done
    last_transition_time: "2025-12-12T07:22:23Z"
- name: shard1
  status: down
  last_transition_time: "2025-12-12T07:19:06Z"
  nodes:
  - name: node1:27001
    status: done
    last_transition_time: "2025-12-12T07:22:23Z"
  - name: node2:27001
    status: running
    last_transition_time: "2025-12-12T07:17:49Z"
  - name: node3:27001
    status: running
    last_transition_time: "2025-12-12T07:17:49Z"

As the output above shows, some nodes are in the running state, which means the restore is still in progress.
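
With many shards the status output gets long, so it can help to print only the nodes that are not finished yet. The sketch below assumes the indentation shown in the output above (node entries start with two spaces and a dash, node status lines with four spaces). The function name unfinished_nodes is hypothetical.

```shell
#!/usr/bin/env bash
# Read describe-restore output on stdin and print "<node> <status>" for every
# node whose status is not "done". Relies on the indentation shown above.
unfinished_nodes() {
  awk '
    /^  - name: /   { node = $3 }                         # node entry in a replica set
    /^    status: / { if ($2 != "done") print node, $2 }  # node-level status line
  '
}
```

For the output above, piping it through unfinished_nodes would list node2:27001 and node3:27001 as running. Empty output means every node reports done.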
Once the restore has finished, the status looks like this:

[mongod@node1:1 /u01/nfs/ynzybak_mongodb_pbm]$ pbm describe-restore 2025-12-12T07:17:14.062699954Z -c /u01/mongodb/conf/pbm_config.yaml
name: "2025-12-12T07:17:14.062699954Z"
opid: 693bc17a3c10278f23378457
backup: "2025-12-12T06:43:30Z"
type: physical
status: done
last_transition_time: "2025-12-12T07:23:08Z"
replsets:
- name: configs
  status: done
  last_transition_time: "2025-12-12T07:22:29Z"
  nodes:
  - name: node1:21000
    status: done
    last_transition_time: "2025-12-12T07:22:24Z"
  - name: node2:21000
    status: done
    last_transition_time: "2025-12-12T07:22:24Z"
  - name: node3:21000
    status: done
    last_transition_time: "2025-12-12T07:22:23Z"
- name: shard1
  status: done
  last_transition_time: "2025-12-12T07:23:03Z"
  nodes:
  - name: node1:27001
    status: done
    last_transition_time: "2025-12-12T07:22:23Z"
  - name: node2:27001
    status: done
    last_transition_time: "2025-12-12T07:22:39Z"
  - name: node3:27001
    status: done
    last_transition_time: "2025-12-12T07:22:32Z"

As shown, every replica set and every node is in the done state, which means the restore is complete.
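
Instead of re-running the status command by hand, the check can be polled until the top-level status becomes done. This is a sketch, not a definitive implementation: the function names are hypothetical, only the running and done states appear in this article, and treating a top-level status of error as failure is an assumption. There is no timeout, so interrupt it manually if needed.

```shell
#!/usr/bin/env bash
# Print the top-level status (the unindented "status:" line) from
# describe-restore output read on stdin.
restore_status() {
  awk '/^status: / { print $2; exit }'
}

# wait_for_restore <restore-name> <pbm-config-file> [poll-interval-seconds]
wait_for_restore() {
  local name="$1" conf="$2" interval="${3:-30}" status
  while :; do
    status=$(pbm describe-restore "$name" -c "$conf" | restore_status)
    case "$status" in
      done)  echo "restore $name finished"; return 0 ;;
      error) echo "restore $name failed" >&2; return 1 ;;  # assumed failure state
      *)     sleep "$interval" ;;                          # still in progress
    esac
  done
}
```

For the restore in this article the call would be: wait_for_restore 2025-12-12T07:17:14.062699954Z /u01/mongodb/conf/pbm_config.yaml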

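5). PITR restore

A point-in-time restore is run as follows (the timestamp is only an example):
pbm restore --time="2023-02-22T08:30:00"
This was not executed this time, so it is skipped here.

6). Post-restore steps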

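After the restore completes, perform the following steps.
---Remove the contents of the datadir on any arbiter nodes

The current field cluster architecture does not use arbiter nodes, so this step does not apply.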

---Restart all mongod nodes

mongod -f ${MongoDir}/conf/config.conf
mongod -f ${MongoDir}/conf/shard1.conf

---Restart all pbm-agents

pbm-agent --mongodb-uri "mongodb://pbmuser:passwd@localhost:21000/?authSource=admin" > /u01/mongodb/pbm-log/pbm-agent.$(hostname -s).21000.log 2>&1 &
pbm-agent --mongodb-uri "mongodb://pbmuser:passwd@localhost:27001/?authSource=admin" > /u01/mongodb/pbm-log/pbm-agent.$(hostname -s).27001.log 2>&1 &
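
When a host runs several mongod instances, the agent start-up can be looped over the ports instead of repeating the command. The sketch below mirrors the two commands above. PBM_USER, PBM_PASS, AGENT_HOST and DRY_RUN are hypothetical variables introduced for illustration, and start_pbm_agents is not a PBM command.

```shell
#!/usr/bin/env bash
# start_pbm_agents <log-dir> <port>...
# Start one pbm-agent per local mongod port, each with its own log file.
start_pbm_agents() {
  local log_dir="$1" host port uri log
  shift
  # AGENT_HOST overrides the short hostname used in the log file name.
  host="${AGENT_HOST:-$(hostname -s)}"
  for port in "$@"; do
    uri="mongodb://${PBM_USER:?set PBM_USER}:${PBM_PASS:?set PBM_PASS}@localhost:${port}/?authSource=admin"
    log="${log_dir}/pbm-agent.${host}.${port}.log"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: show the port and log path without starting anything.
      echo "pbm-agent port=${port} log=${log}"
    else
      pbm-agent --mongodb-uri "$uri" > "$log" 2>&1 &
    fi
  done
}
```

For the layout in this article the call would be: start_pbm_agents /u01/mongodb/pbm-log 21000 27001, run on each node after its mongod processes are up.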

---Run the following command to resync the backup list with the storage:

Refresh the backup list information:
pbm config --force-resync -w

---Start the balancer and start mongos nodes.

mongos -f ${MongoDir}/conf/mongos.conf
mongos> sh.startBalancer()

---Re-enable PITR
pbm config --set pitr.enabled=true

liking
