HBASE-29891: Multi-table continuous incremental backup is failing bec… #7891
Open
kgeisz wants to merge 3 commits into apache:HBASE-28957_rebased from
Conversation
Commit: HBASE-29891: Multi-table continuous incremental backup is failing because output directory already exists
Change-Id: I710cc8d0d87a299b7782a19d93f28bf6283c2436
Force-pushed from c4a70b9 to 8ed360e
kgeisz (Contributor, Author):
@vinayakphegde Here is the fix for HBASE-29891
kgeisz (Contributor, Author):
I successfully created a multi-table continuous incremental backup in the hbase-docker container setup. I was able to take the backup when each table had 1,000 rows. After the incremental backup, I added 1,000 more rows to each table, and then I did a point-in-time restore and verified the target tables had just 1,000 rows instead of 2,000.
anmolnar (Contributor) reviewed Mar 11, 2026:
Patch looks good to me. Just a nitpick.
Have you considered adding unit tests for the WALPlayer changes?
hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
Change-Id: Ia3eebdfc8c2061a512bc5a448da9f79e09d57759
Change-Id: I58691d0d0de91b102ee6774a213dadf5b6207929
Force-pushed from 19c1531 to de3b79f
kgeisz (Contributor, Author):
@anmolnar, I have added some unit tests that cover the changes I made to the WALPlayer. I also added a default value for …
https://issues.apache.org/jira/browse/HBASE-29891
Key Changes
- The bulk load output directory is now per-table. Before: `backupRoot/.tmp/backup_X` -> After: `backupRoot/.tmp/backup_X/namespace/table`
- `walToHFiles()` in `IncrementalTableBackupClient.java` now sets `hbase.mapreduce.use.multi.table.hfileoutputformat` to `false` when configuring `WALPlayer`
- The `hbase.mapreduce.use.multi.table.hfileoutputformat` config is also set to `false` when replaying WALs for continuous backups
- Changed `WALPlayer` so it does not always use a multi-table HFile output format (regardless of the value of `hbase.mapreduce.use.multi.table.hfileoutputformat`)

Background
This pull request fixes an issue where running an incremental backup on multiple tables at once results in a failure. When continuous backup is enabled, an incremental backup first converts the WALs to HFiles. These HFiles are output to a `.tmp/backup_X` directory (where `X` is the backup ID). This is known as the "bulk load output directory". Afterwards, a `distcp` is performed to copy the temporary backup directory to the actual backup directory.

Here is an example file system after the WALs-to-HFiles conversion and before the `distcp`. The `distcp` is supposed to copy the contents of `backupRoot/.tmp/backup_INCR02` into `backupRoot/backup_INCR02`:

Incremental backups convert WALs to HFiles one table at a time, even if a backup set contains more than one table. When WALs are converted to HFiles, the `WALPlayer` runs and a MapReduce job is performed. The HFiles are sent to a newly created `backupRoot/.tmp/backup_X` directory. The MR job for the first table runs without any issues. The problem occurs during the second MR job: `backupRoot/.tmp/backup_X` now already exists, which causes the MR job to fail with something like:

Solution
Summary
This fix changes the bulk load output directory for continuous incremental backups. Since the `WALPlayer` is run individually for each table, each WALs-to-HFiles conversion can be sent to a directory for that specific table. An example bulk load output directory for `table1` in the `default` namespace would be `backupRoot/.tmp/backup_X/default/table1`. Then, `table2` would get its own bulk load output directory, and so on.

Issues while working on the fix
Getting the proper bulk load output and getting the `distcp` to run successfully took more effort than expected. Changing the bulk load output directory for each table was simple. The real challenge was getting the HFiles to be output in the proper directory structure. Since `backupRoot/.tmp/backup_X/namespace/table` is already the output directory, we only want the HFiles' `columnFamily` directory to be placed inside `table`. We don't want the typical `namespace/table/columnFamily` output structure.

- With the bulk load output directory for `table1` set to `backupRoot/.tmp/backup_X/default/table1`, the HFiles would instead be output to `backupRoot/.tmp/backup_X/default/table1/default/table1`, where the namespace and table name directories are repeated. This caused the `.tmp` directory structure to look like the following after running `WALPlayer`:
- Changing the `distcp` to just copy the deeper `default/table` directories resulted in a failure from `distcp` due to conflicting source directory names. This works if there is only one table in each namespace, but does not work if a namespace has multiple tables. This is because the `distcp` looks as follows:
- Resulting in an error like:
- The `distcp` command can have multiple source directories, but only one destination directory:
- Using `-update` in the `distcp` command did not get the desired result either.
- Using `IncrementalTableBackupClient.getBulkOutputDirForTable()` to create the bulk load directory caused similar issues. The only difference is the "doubling up" of the directories had a `data` directory in between, like: `backupRoot/.tmp/backup_X/default/table/data/default/table`

Potential Workaround
A workaround for the issues mentioned above would be to run the `distcp` for each namespace. Then, the source directories would be unique table names, and they could all have the same destination directory (the namespace dir). However, this means a different `distcp` would need to be performed for each namespace in the backup set.

The Actual Solution
We want the WALs-to-HFiles conversion to output to something like this in `.tmp`:

In order to get rid of the "double `namespace/tableName`" directory structure, we have to change how the HFiles are output. We want to keep our bulk load output directory as `backupRoot/.tmp/backup_X/namespace/table` and have just the `cf` column family directory sent there, not `namespace/table/cf`.

This is done by setting the `hbase.mapreduce.use.multi.table.hfileoutputformat` config key to `false` for continuous incremental backups. The problem here is `WALPlayer.java` was always using `MultiTableHFileOutputFormat`, which implicitly sets `hbase.mapreduce.use.multi.table.hfileoutputformat` to `true`. That's why some changes were made to the logic in `WALPlayer.java`.

Also, this `hfileoutputformat` config key needs to be `false` when replaying the WALs during a restore. Otherwise, a failure occurs like the following:
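The two pieces of the fix described above — the per-table bulk load output directory and the config-driven choice of output format — can be sketched end to end. This is a minimal, self-contained illustration, not the actual patch: the class and method names are hypothetical, plain strings stand in for the real output format classes (`MultiTableHFileOutputFormat`, and presumably `HFileOutputFormat2` for the single-table case), and only the config key string and the directory layout come from this description.

```java
// Hypothetical sketch of the fixed behavior; not the real HBase code.
public class ContinuousIncrementalBackupSketch {

  // Config key quoted from the PR description.
  static final String MULTI_TABLE_KEY =
      "hbase.mapreduce.use.multi.table.hfileoutputformat";

  // Per-table bulk load output directory:
  // backupRoot/.tmp/backup_X/namespace/table
  static String bulkOutputDir(String backupRoot, String backupId,
                              String namespace, String table) {
    return backupRoot + "/.tmp/" + backupId + "/" + namespace + "/" + table;
  }

  // After the fix, WALPlayer picks its output format from the config key
  // instead of unconditionally using the multi-table format.
  static String outputFormatFor(boolean useMultiTableFormat) {
    return useMultiTableFormat
        ? "MultiTableHFileOutputFormat"  // old unconditional behavior
        : "HFileOutputFormat2";          // single-table: HFiles land as <dir>/<cf>
  }

  public static void main(String[] args) {
    // Continuous incremental backups set the key to false and give each
    // table its own output directory, so the second table's MR job no
    // longer collides with the first table's output directory.
    System.out.println(
        bulkOutputDir("backupRoot", "backup_INCR02", "default", "table1"));
    // prints backupRoot/.tmp/backup_INCR02/default/table1
    System.out.println(outputFormatFor(false));
    // prints HFileOutputFormat2
  }
}
```

With each table writing to its own `namespace/table` subdirectory and the multi-table format disabled, only the column family directory is created under the bulk output directory, which is exactly the layout the `distcp` step expects.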