He3DB handles log archiving differently from PostgreSQL: He3DB stores its WAL logs in TiKV, whereas PostgreSQL keeps them in the local pg_wal directory. For this purpose, He3DB introduces a new archiving tool, hbr-raw, which archives the WAL logs in TiKV to S3 and restores WAL logs from S3 back into TiKV. The code lives at: gitee.com/he3db/he3pg...
To try it out, you can register a China Mobile Cloud account and purchase its object storage service (EOS). Alternatively, you can set up MinIO locally. MinIO is an object storage solution that provides an Amazon S3-compatible API and supports all core S3 features. It is designed to be deployed anywhere: public or private clouds, bare-metal infrastructure, orchestrated environments, and edge infrastructure, and it offers SDKs for several languages, including Java, Python, and Go. See the documentation: MinIO Object Storage for Linux.
Design Considerations
One of He3DB's core design principles is low cost. Storage accounts for a large share of the cost of running a database, and there are two general ways to reduce it: compress the data to shrink its footprint, or tier hot and cold data and move the cold data onto cheaper storage. He3DB takes the latter approach: the bottom tier uses S3 for cold data, keeping storage costs as low as possible. Databases generally offer a log archiving feature, mainly for backups; with low cost in mind, we designed the hbr-raw tool to archive logs to S3 and restore them from S3.
Using the hbr-raw Tool
hbr-raw is implemented in Go. Usage:
shell
postgres@slpc:~/works/gitee/he3pg/hbr-raw$ ./hbr-raw --help
Welcome to use hbr for He3DB backup&restore
Usage:
hbr [flags]
hbr [command]
Available Commands:
archive Archive He3DB Xlog KV
help Help about any command
restore Restore He3DB
scan Archive He3DB Xlog KV
version Show Version
Flags:
--access_key string S3 Access Key
--archive_start_file string start key of archive[included] (default "000000010000000000000001")
--archive_start_lsn string start lsn of archive[included] (default "0000000000000000")
--archive_start_time_line string start time line of archive[included] (default "0000000000000001")
--bucket string S3 bucket
--concurrency int concurrency (default 100)
--endpoint string S3 endpoint
-h, --help help for hbr
--name string Backup name
--pd string Tikv placement driver (default "http://127.0.0.1:2379")
--region string S3 region
--secret_key string S3 Secret Key
Use "hbr [command] --help" for more information about a command.
To archive WAL logs from TiKV to S3:
TiKV's PD address defaults to PDADDR="127.0.0.1:2379"; if your deployment differs, pass the corresponding TiKV option.
- Scenario 1: full log backup (single timeline; cross-timeline backup is not yet supported)
./hbr-raw archive --access_key "*" --secret_key "*" --region suzhou3 --endpoint "eos.fenhu-1.cmecloud.cn" --bucket he3db --archive_start_time_line "0000000000000001" --name walbak1
- Scenario 2: incremental log backup (single timeline; cross-timeline backup is not yet supported), e.g. starting from lsn 0/00756F89 (inclusive)
./hbr-raw archive --access_key "*" --secret_key "*" --region suzhou3 --endpoint "eos.fenhu-1.cmecloud.cn" --bucket he3db --archive_start_time_line "0000000000000001" --name walbak3 --archive_start_lsn 0000000000756F89
To restore WAL logs from S3 back into TiKV:
./hbr-raw restore --access_key "*" --secret_key "*" --region suzhou3 --endpoint "eos.fenhu-1.cmecloud.cn" --bucket he3db --archive_start_time_line "0000000000000001" --name walbak1
S3
For an introduction to S3, see the documentation: Amazon Simple Storage Service. We won't go into detail here. To write logs to S3 we need to call its API, and since our tool is implemented in Go, we use the AWS SDK for Go.
hbr-raw Source Code Analysis
hbr-raw is a command-line tool implemented in Go.
go
package main

import (
	"hbr-raw/cmd"
)

func main() {
	cmd.Execute()
}
The tool is built with Cobra.
go
package cmd

import (
	"sync"

	"github.com/spf13/cobra"
)

var rootCmd = &cobra.Command{
	Use:   "hbr",
	Short: "He3DB backup&restore",
	Long:  "Welcome to use hbr for He3DB backup&restore",
	Run:   runRoot,
}

var wg sync.WaitGroup
var concurrency int

func init() {
	rootCmd.PersistentFlags().String("access_key", "", "S3 Access Key")
	rootCmd.PersistentFlags().String("secret_key", "", "S3 Secret Key")
	rootCmd.PersistentFlags().String("endpoint", "", "S3 endpoint")
	rootCmd.PersistentFlags().String("region", "", "S3 region")
	rootCmd.PersistentFlags().String("bucket", "", "S3 bucket")
	rootCmd.PersistentFlags().String("pd", "http://127.0.0.1:2379", "Tikv placement driver")
	rootCmd.PersistentFlags().String("name", "", "Backup name")
	rootCmd.PersistentFlags().String("archive_start_file", "000000010000000000000001", "start key of archive[included]")
	rootCmd.PersistentFlags().String("archive_start_time_line", "0000000000000001", "start time line of archive[included]")
	rootCmd.PersistentFlags().String("archive_start_lsn", "0000000000000000", "start lsn of archive[included]")
	rootCmd.PersistentFlags().IntVar(&concurrency, "concurrency", 100, "concurrency")
}

func Execute() {
	if err := rootCmd.Execute(); err != nil {
		panic(err)
	}
}
Archiving to S3: Implementation
Archiving logs from TiKV to S3 requires accessing TiKV through its Go client API; see Go Client: Interact with TiKV using Go. The logs fetched from TiKV are then written to S3 through the Go S3 API; see the documentation: docs.aws.amazon.com/sdk-for-go/... The logic is straightforward:
go
package cmd

import (
	"bytes"
	"fmt"
	"os"
	"strconv"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/pingcap/tidb/config"
	"github.com/pingcap/tidb/store/tikv"
	"github.com/spf13/cobra"
)

var archiveCmd = &cobra.Command{ // cobra.Command is a struct representing one command
	Use:   "archive",                                   // command name
	Short: "Archive He3DB Xlog KV",                     // short description of the command
	Long:  "Welcome to use hbr for He3DB xlog archive", // full description of the command
	Run:   runArchive,                                  // function invoked when the command runs
}

func init() {
	rootCmd.AddCommand(archiveCmd) // register the archive command
}

func runArchive(cmd *cobra.Command, args []string) {
	var sem = make(chan bool, concurrency)
	archiveStart := time.Now()
	access_key, _ := cmd.Flags().GetString("access_key") // S3 access key
	secret_key, _ := cmd.Flags().GetString("secret_key") // S3 secret key
	endpoint, _ := cmd.Flags().GetString("endpoint")     // S3 endpoint
	region, _ := cmd.Flags().GetString("region")         // S3 region
	bucket, _ := cmd.Flags().GetString("bucket")         // S3 bucket name
	pd, _ := cmd.Flags().GetString("pd")                 // TiKV PD address
	backup_name, _ := cmd.Flags().GetString("name")
	archive_start_time_line, _ := cmd.Flags().GetString("archive_start_time_line")
	archive_start_lsn, _ := cmd.Flags().GetString("archive_start_lsn")
	if access_key == "" || secret_key == "" || endpoint == "" || region == "" || bucket == "" || pd == "" || backup_name == "" || archive_start_time_line == "" || archive_start_lsn == "" {
		fmt.Printf("PARAMETER ERROR!\n")
		return
	}
	client, err := tikv.NewRawKVClient([]string{pd}, config.Security{}) // TiKV Go client, used to access TiKV
	if err != nil {
		fmt.Printf("Connect Tikv Error!\n%v\n", err)
		return
	}
	// The session the S3 Uploader will use
	sess, err := session.NewSession(&aws.Config{
		Region:           aws.String(region),
		Endpoint:         aws.String(endpoint),
		Credentials:      credentials.NewStaticCredentials(access_key, secret_key, ""),
		S3ForcePathStyle: aws.Bool(true),
	})
	if err != nil {
		fmt.Printf("Connect S3 Error!\n%v\n", err)
		return
	}
	s3_client := s3.New(sess) // creates a new S3 client bound to the session
	var filename string = ""
	wlCount := 0
	// archive wal kv
	fmt.Printf("archive wal kv!\n")
	for id := 0; id < 8; id++ {
		//06000000000000000100000000000000070000000000000000
		// the key embeds an id field, so cross-timeline backup is not supported yet
		retStartString := fmt.Sprintf("06%s000000000000000%d%s", archive_start_time_line, id, archive_start_lsn)
		//retEndString := fmt.Sprintf("06ffffffffffffffff000000000000000%dffffffffffffffff", id)
		retEndString := fmt.Sprintf("06%s000000000000000%dffffffffffffffff", archive_start_time_line, id)
		retStart := make([]byte, 25)
		retEnd := make([]byte, 25)
		index := 0
		for i := 0; i < len(retStartString); i += 2 {
			value, _ := strconv.ParseUint(retStartString[i:i+2], 16, 8)
			retStart[index] = byte(0xff & value)
			value, _ = strconv.ParseUint(retEndString[i:i+2], 16, 8)
			retEnd[index] = byte(0xff & value)
			index++
		}
		fmt.Printf("%x\n", retStart)
		fmt.Printf("%x\n", retEnd)
		limit := 10240
		for {
			keys, values, _ := client.Scan(retStart, retEnd, limit) // read logs from TiKV
			for k := range keys {
				fmt.Printf("%x\n", keys[k])
				filename = fmt.Sprintf("%x", keys[k])
				wg.Add(1)
				sem <- true
				go s3PutKV(s3_client, bucket, backup_name, filename, values[k], sem) // store the log in S3 via the S3 API
				if bytes.Compare(retStart, keys[k]) < 0 {
					retStart = keys[k]
				}
				wlCount++
			}
			if len(keys) < limit {
				break
			}
			wlCount-- // the next scan starts from the last key again (inclusive), so discount the duplicate
		}
	}
	wg.Wait()
	client.Close()
	fmt.Printf("wal kv count:%v\n", wlCount)
	fmt.Println("backup time:", time.Since(archiveStart))
}

// store a key/value pair in S3
func s3PutKV(s3_client *s3.S3, bucket string, backup_name string, filename string, v []byte, sem chan bool) {
	defer wg.Done()
	defer func() {
		<-sem
	}()
	_, err := s3_client.PutObject(&s3.PutObjectInput{ // adds an object to a bucket
		Bucket: aws.String(bucket),                       // which bucket to store in
		Key:    aws.String(backup_name + "/" + filename), // key layout: backup_name + "/" + filename
		Body:   bytes.NewReader(v),
	})
	if err != nil {
		fmt.Printf("S3 PutObject Error!\n%v\n", err)
		os.Exit(1)
	}
	//fmt.Printf("S3 PutObject!\n")
}
Restoring from S3 to TiKV
We archived the WAL logs from TiKV to S3; to use them, we need to restore them from S3 back into TiKV. Why not restore to a local disk? Because, unlike PG, He3DB stores its WAL logs in TiKV and reads them from TiKV for replay and other operations. The implementation mirrors the archive logic above.
go
package cmd

import (
	"fmt"
	"io/ioutil"
	"os"
	"strconv"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/pingcap/tidb/config"
	"github.com/pingcap/tidb/store/tikv"
	"github.com/spf13/cobra"
)

var restoreCmd = &cobra.Command{
	Use:   "restore",
	Short: "Restore He3DB",
	Long:  "Welcome to use hbr for He3DB restore",
	Run:   runRestore,
}

func init() {
	rootCmd.AddCommand(restoreCmd)
}

func runRestore(cmd *cobra.Command, args []string) {
	var sem = make(chan bool, concurrency)
	restoreStart := time.Now()
	access_key, _ := cmd.Flags().GetString("access_key") // parameters needed to access S3
	secret_key, _ := cmd.Flags().GetString("secret_key")
	endpoint, _ := cmd.Flags().GetString("endpoint")
	region, _ := cmd.Flags().GetString("region")
	bucket, _ := cmd.Flags().GetString("bucket")
	pd, _ := cmd.Flags().GetString("pd")
	backup_name, _ := cmd.Flags().GetString("name")
	if access_key == "" || secret_key == "" || endpoint == "" || region == "" || bucket == "" || pd == "" || backup_name == "" {
		fmt.Printf("PARAMETER ERROR!\n")
		return
	}
	client, err := tikv.NewRawKVClient([]string{pd}, config.Security{}) // access TiKV
	if err != nil {
		fmt.Printf("Connect Tikv Error!\n%v\n", err)
		return
	}
	sess, err := session.NewSession(&aws.Config{
		Region:           aws.String(region),
		Endpoint:         aws.String(endpoint),
		Credentials:      credentials.NewStaticCredentials(access_key, secret_key, ""),
		S3ForcePathStyle: aws.Bool(true),
	})
	if err != nil {
		fmt.Printf("Connect S3 Error!\n%v\n", err)
		return
	}
	s3_client := s3.New(sess)
	count := 0
	input := &s3.ListObjectsInput{
		Bucket: aws.String(bucket),
		Prefix: aws.String(backup_name),
	}
	for {
		resp, err := s3_client.ListObjects(input)
		if err != nil {
			fmt.Printf("S3 ListObjects Error!\n%v\n", err)
			return
		}
		for _, keys := range resp.Contents {
			wg.Add(1)
			sem <- true
			go s3RestoreKVRaw(s3_client, bucket, backup_name, keys, client, sem) // fetch the log from S3 and write it into TiKV
			count++
		}
		if resp.NextMarker == nil {
			fmt.Printf("Done!\n")
			break
		}
		input.Marker = resp.NextMarker
	}
	wg.Wait()
	fmt.Printf("N:%v\n", count)
	fmt.Println("restore time:", time.Since(restoreStart))
}

// restore one log object from S3 into TiKV
func s3RestoreKVRaw(s3_client *s3.S3, bucket string, backup_name string, keys *s3.Object, client *tikv.RawKVClient, sem chan bool) {
	defer wg.Done()
	defer func() {
		<-sem
	}()
	out, err := s3_client.GetObject(&s3.GetObjectInput{ // fetch the log from S3
		Bucket: aws.String(bucket),
		Key:    aws.String(*keys.Key),
	})
	if err != nil {
		fmt.Printf("S3 GetObject Error!\n%v\n", err)
		os.Exit(1)
	}
	defer out.Body.Close()
	data, err := ioutil.ReadAll(out.Body)
	if err != nil {
		fmt.Printf("out.Body.Read!\n%v\n", err)
		os.Exit(1)
	}
	fmt.Printf("filename:%s\n", (*keys.Key)[len(backup_name)+1:])
	ret := make([]byte, (len(*keys.Key)-len(backup_name)-1)/2)
	index := 0
	for i := len(backup_name) + 1; i < len(*keys.Key); i += 2 {
		value, _ := strconv.ParseUint((*keys.Key)[i:i+2], 16, 8)
		ret[index] = byte(0xff & value)
		index++
	}
	if err := client.Put(ret, data); err != nil { // write to TiKV
		fmt.Printf("Tikv Set Error!\n%v\n", err)
		os.Exit(1)
	}
}
Other Notes
He3DB is under active development; in the future the WAL logs may no longer be stored in TiKV, so this tool may change accordingly.
References:
What is Amazon S3?
Cobra documentation (Chinese)
Cobra documentation