Postgresql-rman
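- An online tool; the target database must be running in archive mode.
- Supports online full backup, incremental backup, and archive (WAL) backup.
- An incremental backup is based on an existing full database backup.
- rman itself follows the pg_start_backup(), copy, pg_stop_backup() backup pattern; the copy step is a plain file copy… cp/fwrite;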
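- pg_start_backup()
  - text (label): a user-defined label for the backup, i.e. the name under which the backup dump file will be stored.
  - boolean (fast): whether to run pg_start_backup as quickly as possible. This forces an immediate checkpoint, which causes a spike in I/O and slows down any concurrently running queries.
- pg_stop_backup()
  - boolean (wait_for_archive): if false, pg_stop_backup returns as soon as the backup is complete, without waiting for WAL to be archived.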
Overall rman architecture
Default configuration parameters:
- PGDATA
- BACKUP_PATH
- ARCLOG_PATH
pg_rman init
pg_rman show
pg_rman config --list
pg_rman backup -b full
-b inc [incremental]
-b arch [archive]
pg_rman restore
[New feature] pg_rman blockrecover --datafile tablespaceOid/databaseOid/relfilenode --block 0
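The datafile argument of blockrecover names a relation file as tablespaceOid/databaseOid/relfilenode. A minimal sketch of splitting such a string, assuming all three parts are unsigned decimal integers. `DatafileId` and `parse_datafile` are hypothetical names for illustration, not functions from the pg_rman source, and the OID values in the usage note are made up.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the three path components. */
typedef struct
{
    unsigned int tablespace_oid;
    unsigned int database_oid;
    unsigned int relfilenode;
} DatafileId;

/*
 * Parse "tablespaceOid/databaseOid/relfilenode".
 * The trailing %c only matches if something follows the third number,
 * so exactly 3 conversions means a well-formed argument.
 * Note: %u is lenient (it accepts leading whitespace and a sign);
 * a stricter parser would use strtoul and check each field.
 */
static bool
parse_datafile(const char *arg, DatafileId *out)
{
    char trailing;
    int  n = sscanf(arg, "%u/%u/%u%c",
                    &out->tablespace_oid,
                    &out->database_oid,
                    &out->relfilenode,
                    &trailing);

    return n == 3;
}
```

For example, `parse_datafile("1663/16384/16385", &id)` succeeds and fills the three fields, while `"1663/16384"` is rejected because one component is missing.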
Backup policy
- Recovery window: the number of days to keep. Default: 7.
- Number of backups: retention by redundancy. Default: 1.
Code layout:
.
├── backup.c
├── blockrecover.c
├── catalog.c
├── COPYRIGHT
├── data.c
├── delete.c
├── dir.c
├── docs
├── expected
├── idxpagehdr.h
├── init.c
├── Makefile
├── parray.c
├── parray.h
├── pg_rman.c
├── pg_rman.h
├── pgsql_src
├── pgut
├── README.md
├── restore.c
├── script
├── show.c
├── sql
├── util.c
├── validate.c
└── xlog.c
pg_rman: a brief look at the source
Reading the code
* +----------------+---------------------------------+
* | PageHeaderData | linp1 linp2 linp3 ...           |
* +-----------+----+---------------------------------+
* | ... linpN |                                      |
* +-----------+--------------------------------------+
* |           ^ pd_lower                             |
* |                                                  |
* |             v pd_upper                           |
* +-------------+------------------------------------+
* |             | tupleN ...                         |
* +-------------+------------------+-----------------+
* |       ... tuple3 tuple2 tuple1 | "special space" |
* +--------------------------------+-----------------+
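When data is flushed, it is persisted. The pd_lsn field in a database page header records the end position, in the WAL file, of the REDO record generated by the most recent change to that page.
If the WAL flush LSN is greater than or equal to this pd_lsn, the change to the page is durable. In other words, every modification changes the block, including its LSN.
So by comparing the global LSN recorded when the first backup started with the page LSN of the data about to be backed up, you can tell whether the page has been modified since then.
A modified page is backed up and an unmodified page is skipped, which gives block-level incremental backup of the database.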
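The page-level decision described above can be sketched in a few lines of C. This is a standalone illustration, not the pg_rman implementation: it assumes the default 8192-byte block size, assumes pd_lsn sits at the very start of the page as two native-endian 32-bit halves (xlogid, then xrecoff), and assumes a page with a zero LSN should be copied to be safe. `page_lsn` and `page_needs_backup` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192   /* assumed default PostgreSQL block size */

/* Read pd_lsn from the first 8 bytes of a raw page image. */
static uint64_t
page_lsn(const unsigned char *page)
{
    uint32_t xlogid;   /* high 32 bits of the LSN */
    uint32_t xrecoff;  /* low 32 bits of the LSN */

    memcpy(&xlogid, page, sizeof(xlogid));
    memcpy(&xrecoff, page + sizeof(xlogid), sizeof(xrecoff));
    return ((uint64_t) xlogid << 32) | xrecoff;
}

/*
 * Decide whether a page belongs in an incremental backup.
 * prev_start_lsn is the start LSN of the backup being built upon.
 * Assumption: a zero LSN means "unknown", so the page is copied.
 */
static bool
page_needs_backup(const unsigned char *page, uint64_t prev_start_lsn)
{
    uint64_t lsn = page_lsn(page);

    return lsn == 0 || lsn >= prev_start_lsn;
}
```

A caller would read a data file BLCKSZ bytes at a time and write out only the pages for which `page_needs_backup` returns true, together with their block numbers so that restore can put them back in place.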
Code related to incremental backup:
pgBackupGetPath(prev_backup, prev_file_txt, lengthof(prev_file_txt),
DATABASE_FILE_LIST);
prev_files = dir_read_file_list(pgdata, prev_file_txt);
/*
* Do backup only pages having larger LSN than previous backup.
*/
lsn = &prev_backup->start_lsn;
xlogid = (uint32) (*lsn >> 32);
xrecoff = (uint32) *lsn;
elog(DEBUG, _("backup only the page updated after LSN(%X/%08X)"),
xlogid, xrecoff);
/* Construct the directory for this backup within BACKUP_PATH. */
pgBackupGetPath(&current, path, lengthof(path), DATABASE_DIR);
/* Save the files listed above. */
backup_files(pgdata, path, files, prev_files, lsn, current.compress_data, NULL);
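The snippet above splits a 64-bit LSN into its high and low 32-bit halves before logging it as `%X/%08X`. A minimal standalone sketch of the same split, using snprintf instead of elog; `format_lsn` is a hypothetical helper and the LSN value in the usage note is made up.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit LSN as "XLOGID/XRECOFF", mirroring the
 * "%X/%08X" format string used in the debug message.
 * Returns the value returned by snprintf.
 */
static int
format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
    unsigned int xlogid  = (unsigned int) (lsn >> 32);         /* high half */
    unsigned int xrecoff = (unsigned int) (lsn & 0xFFFFFFFFu); /* low half  */

    return snprintf(buf, buflen, "%X/%08X", xlogid, xrecoff);
}
```

With `lsn = 0x000000010A0000B8`, the buffer holds `1/0A0000B8`.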
[New] Block recovery code:
for (loop = 0; loop <= brc.base_index; loop++)
{
    backup = (pgBackup *) parray_get(backups, loop);
    /* don't use incomplete nor different timeline backup */
    if (backup->status != BACKUP_STATUS_OK || backup->tli != base_backup->tli)
        continue;
    if(-1 == brc.lastBackupIndex && HAVE_ARCLOG(backup) && brc.last_needed_index >= loop)
    {
        restore_archive_logs(backup,true);
    }
    /* use database backup only */
    if (BACKUP_MODE_INCREMENTAL > backup->backup_mode || brc.last_needed_index < loop)
        continue;
    elog(DEBUG, "found backup BK_KEY: \"%d\" can be used ",backup->backup_id);
    recoverBackup(backup,loop);
    => [[
        for(loop = 0; loop < brc.rbNum; loop++)
        {
            /*If this block has find a page,skip it*/
            if(brc.pageArray[loop])
            {
                elog(DEBUG,"block \'%u\' has find it's page,skip.",brc.recoverBlock[loop]);
                continue;
            }
            page = findPageInBackup(backup, brc.recoverBlock[loop]);
            if(page)
            {
                brc.pageArray[loop] = page;
                if(-1 == brc.lastBackupIndex)
                {
                    brc.lastBackupIndex = backupindex;
                    elog(DEBUG,"Find last backup can be used:BK_KEY \'%d\'",backup->backup_id);
                }
            }
        }
    ]]
}
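The search pattern in the loop above is "first hit wins": backups are visited in list order, and a block that already has a page is skipped in later backups. A simplified sketch of that pattern follows. All types and names here (`BackupSketch`, `backup_has_block`, `locate_blocks`) are hypothetical stand-ins; the real code works on pgBackup entries and page images, and also handles archive logs, status, and timeline checks that are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a backup: just the block numbers it holds. */
typedef struct
{
    int             backup_id;
    const unsigned *blocks;
    size_t          nblocks;
} BackupSketch;

static bool
backup_has_block(const BackupSketch *backup, unsigned blkno)
{
    for (size_t i = 0; i < backup->nblocks; i++)
        if (backup->blocks[i] == blkno)
            return true;
    return false;
}

/*
 * For each wanted block, record the index of the first backup (in list
 * order) that holds it, or -1 if none does. Returns how many were found.
 */
static size_t
locate_blocks(const BackupSketch *backups, size_t nbackups,
              const unsigned *wanted, size_t nwanted, int *found_in)
{
    size_t found = 0;

    for (size_t i = 0; i < nwanted; i++)
        found_in[i] = -1;

    for (size_t b = 0; b < nbackups; b++)
    {
        for (size_t i = 0; i < nwanted; i++)
        {
            if (found_in[i] != -1)
                continue;       /* already satisfied by an earlier backup */
            if (backup_has_block(&backups[b], wanted[i]))
            {
                found_in[i] = (int) b;
                found++;
            }
        }
    }
    return found;
}
```

Any entry still at -1 after the scan corresponds to a block that no usable backup contains, which is the situation described in the first issue below the code.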
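Issues:
- If the filenode file is arbitrarily enlarged so that its size is no longer a multiple of 8192, it is treated as having one extra page, and that page is incomplete. PostgreSQL does not enable checksum verification by default, so PG reports an invalid block number, and a blockrecover run cannot recover the block, because the filenode never contained a valid copy of that page.
- If page data is arbitrarily modified, query results are sometimes incomplete, i.e. the number of rows returned does not match the number inserted. PG itself cannot raise a proper data-corruption warning in this case. Enable checksums to verify.
Checksum failure warning: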
WARNING: 01000: page verification failed, calculated checksum 11654 but expected 8293
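- Determine the table's tuple count
- Determine the table's page count
Make sure checksums are enabled so that page data can be verified. This does not, however, effectively address the issues above.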
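The first issue comes down to whether a file's size is a whole number of blocks. A minimal sketch of that arithmetic, assuming the default 8192-byte block size; `FileShape`, `describe_file_size`, and `block_is_complete` are hypothetical helpers, not part of pg_rman or PostgreSQL.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLCKSZ 8192u   /* assumed default PostgreSQL block size */

/* How a file of a given size divides into blocks. */
typedef struct
{
    uint64_t full_blocks;     /* number of complete BLCKSZ-byte blocks */
    uint32_t trailing_bytes;  /* leftover bytes forming a partial block */
} FileShape;

static FileShape
describe_file_size(uint64_t size_bytes)
{
    FileShape shape;

    shape.full_blocks = size_bytes / BLCKSZ;
    shape.trailing_bytes = (uint32_t) (size_bytes % BLCKSZ);
    return shape;
}

/* A block number is only usable if the whole block fits in the file. */
static bool
block_is_complete(uint64_t size_bytes, uint64_t blkno)
{
    return blkno < size_bytes / BLCKSZ;
}
```

A file of 3 * 8192 + 100 bytes has three complete blocks (0 to 2) and 100 trailing bytes; block 3 exists only as a fragment, so there is no valid page to compare against or restore into.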