xixi_kv

package module
v0.0.0-...-2fb3ca1 Latest
Published: Mar 2, 2025 License: Apache-2.0 Imports: 18 Imported by: 0

README



xixi-kv is a concurrency-safe key-value storage engine based on the Bitcask model. It offers low read/write latency, high throughput, and the ability to store more data than fits in memory.
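
To make the Bitcask idea concrete, here is a minimal in-memory sketch: every write is appended to a log, and an in-memory key directory remembers where the latest value for each key lives. It is not xixi-kv's implementation; the names miniCask, entryPos, and errNotFound are hypothetical, and the record layout is an assumption chosen for brevity.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// entryPos records where a value lives inside the append-only log.
type entryPos struct {
	offset int
	size   int
}

// miniCask is a toy illustration of the Bitcask layout (hypothetical type):
// an append-only log plus a key directory that points into it.
type miniCask struct {
	log    bytes.Buffer        // stands in for the active data file
	keydir map[string]entryPos // in-memory index: key -> value position
}

var errNotFound = errors.New("key not found")

func newMiniCask() *miniCask {
	return &miniCask{keydir: make(map[string]entryPos)}
}

// Put appends a length-prefixed record and points the index at the new value.
func (c *miniCask) Put(key, value []byte) {
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(key)))
	binary.LittleEndian.PutUint32(hdr[4:8], uint32(len(value)))
	c.log.Write(hdr[:])
	c.log.Write(key)
	offset := c.log.Len() // the value starts right after header and key
	c.log.Write(value)
	c.keydir[string(key)] = entryPos{offset: offset, size: len(value)}
}

// Get does one index lookup and one positioned read from the log.
func (c *miniCask) Get(key []byte) ([]byte, error) {
	pos, ok := c.keydir[string(key)]
	if !ok {
		return nil, errNotFound
	}
	out := make([]byte, pos.size)
	copy(out, c.log.Bytes()[pos.offset:pos.offset+pos.size])
	return out, nil
}

// Delete only drops the key from the index in this sketch. The stale bytes
// stay in the log; a real engine would also append a tombstone record.
func (c *miniCask) Delete(key []byte) {
	delete(c.keydir, string(key))
}

func main() {
	c := newMiniCask()
	c.Put([]byte("name"), []byte("v1"))
	c.Put([]byte("name"), []byte("v2")) // the first record is now garbage
	v, _ := c.Get([]byte("name"))
	fmt.Printf("%s (log holds %d bytes)\n", v, c.log.Len())
}
```

Because only positions are held in memory, the values themselves can exceed available RAM, and overwritten records accumulate in the log until a merge reclaims them.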

Features

For more features and usage information, please refer to: issues

  • Supports concurrent-safe sharded index implementation with multiple underlying index configuration options, including B-tree, skip list, and map.
  • Provides high-performance batch processing with no maximum operation limit, guaranteeing atomicity, durability, and consistency.
  • Supports both standard file I/O and memory-mapped file (MMap) implementations with corresponding configuration options, suitable for different data file capacity scenarios.
  • Offers database-level iterator functionality with customizable iterator configuration options, allowing users to flexibly control data traversal methods.
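
The sharded index mentioned in the first bullet can be pictured with the sketch below. It is an illustration under assumptions, not the project's code: keys are hashed with FNV-1a to choose a shard, each shard is a plain map guarded by its own lock, and the names shardedIndex, shard, and pick are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// shard is one independently locked slice of the index.
type shard struct {
	mu sync.RWMutex
	m  map[string]int64 // key -> hypothetical record offset
}

// shardedIndex spreads keys over several shards so that writers touching
// different shards do not contend on the same lock.
type shardedIndex struct {
	shards []*shard
}

func newShardedIndex(n int) *shardedIndex {
	idx := &shardedIndex{shards: make([]*shard, n)}
	for i := range idx.shards {
		idx.shards[i] = &shard{m: make(map[string]int64)}
	}
	return idx
}

// pick hashes the key with FNV-1a and maps it onto a shard.
func (s *shardedIndex) pick(key []byte) *shard {
	h := fnv.New32a()
	h.Write(key)
	return s.shards[h.Sum32()%uint32(len(s.shards))]
}

// Put takes the write lock of a single shard only.
func (s *shardedIndex) Put(key []byte, offset int64) {
	sh := s.pick(key)
	sh.mu.Lock()
	sh.m[string(key)] = offset
	sh.mu.Unlock()
}

// Get takes the read lock of a single shard only.
func (s *shardedIndex) Get(key []byte) (int64, bool) {
	sh := s.pick(key)
	sh.mu.RLock()
	defer sh.mu.RUnlock()
	off, ok := sh.m[string(key)]
	return off, ok
}

func main() {
	idx := newShardedIndex(16)
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			idx.Put([]byte(fmt.Sprintf("key-%d", i)), int64(i))
		}(i)
	}
	wg.Wait()
	off, ok := idx.Get([]byte("key-42"))
	fmt.Println(off, ok)
}
```

A larger shard count lowers lock contention under concurrent writes at the cost of a little more memory per index.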

Quick Start

For a complete example, see: main.go

Installation

Install Go and run the go get command:

go get -u github.com/XiXi-2024/xixi-kv

Opening the Database

The core object of xixi-kv is DB, which provides default configuration options via DefaultOptions. To open or create a database, use the Open function:

package main

import (
	"log"

	kv "github.com/XiXi-2024/xixi-kv"
)

func main() {
	db, err := kv.Open(kv.DefaultOptions)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	// ...
}

Basic Operations

// key and value are []byte.

// Insert
err = db.Put(key, value)

// Retrieve
val, err := db.Get(key)

// Delete
err = db.Delete(key)

Advanced Configuration

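xixi-kv offers various configuration options that can be adjusted to your requirements. The Options type declares IndexType as index.IndexType and FileIOType as fio.FileIOType, and DefaultOptions sets them to index.HashMap and fio.StandardFIO, so the index and I/O constants below may need to be qualified with the index and fio subpackages rather than kv: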

opts := kv.Options{
    DirPath:               "/path/to/data",   // Data directory
    DataFileSize:          256 * 1024 * 1024, // Data file size limit
    SyncStrategy:          kv.Threshold,      // Synchronization strategy
    BytesPerSync:          8 * 1024 * 1024,   // Bytes written before synchronization
    IndexType:             kv.BPTree,         // Index type
    ShardNum:              16,                // Number of index shards
    FileIOType:            kv.MemoryMap,      // I/O type
    DataFileMergeRatio:    0.5,               // Merge trigger ratio
    EnableBackgroundMerge: true,              // Enable background merging
}

Benchmark Tests

For complete testing details, see: db_test.go

Environment

goos: darwin
goarch: arm64
cpu: Apple M1

os.File

Interface   QPS (Single Thread)   QPS (Multi-Thread)
Put         444279                376621
Get         1002304               2370576
Delete      1297602               1006764

mmap

Interface   QPS (Single Thread)   QPS (Multi-Thread)
Put         1004450               1000326
Get         2210830               8933606
Delete      6813520               4509661

Notes

When running on Windows, explicitly close every open DB instance and file before deleting data files; otherwise you may encounter the following error:

The process cannot access the file because it is being used by another process.

Running on macOS or Linux is recommended. When testing on Windows, delete the generated files manually.

Contribution

As the project continues to grow, I recognize the limitations of individual effort. There are still many issues to resolve and challenging features to implement. If you're interested in this project, I warmly welcome your issues and pull requests, and I'll respond promptly!

Documentation

Constants

This section is empty.

Variables

var (
	ErrKeyIsEmpty             = errors.New("the key is empty")
	ErrIndexUpdateFailed      = errors.New("failed to update index")
	ErrKeyNotFound            = errors.New("key not found in database")
	ErrDataFileNotFound       = errors.New("datafile file is not found")
	ErrDataDirectoryCorrupted = errors.New("the database directory maybe corrupted")
	ErrDBClosed               = errors.New("the database is closed")
	ErrBatchCommitted         = errors.New("the batch is committed")
	ErrBatchRollbacked        = errors.New("the batch is rollbacked")
	ErrMergeIsProgress        = errors.New("merge is in progress, try again later")
	ErrDatabaseIsUsing        = errors.New("the database directory is used by another process")
	ErrMergeRatioUnreached    = errors.New("the merge ratio do not reach the option")
	ErrNoEnoughSpaceForMerge  = errors.New("no enough disk space for merge")
)
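
Since these are sentinel error values, callers can distinguish them with errors.Is. The sketch below shows the pattern without depending on the package: errKeyNotFound is a local stand-in for ErrKeyNotFound, the getter interface copies the documented signature of (*DB).Get, and mapStore and getOrDefault are hypothetical helpers introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errKeyNotFound is a local stand-in for the package's ErrKeyNotFound.
var errKeyNotFound = errors.New("key not found in database")

// getter matches the documented signature of (*DB).Get.
type getter interface {
	Get(key []byte) ([]byte, error)
}

// mapStore is a hypothetical in-memory getter used only for this sketch.
type mapStore map[string][]byte

func (m mapStore) Get(key []byte) ([]byte, error) {
	v, ok := m[string(key)]
	if !ok {
		// Wrapping with %w keeps the sentinel visible to errors.Is.
		return nil, fmt.Errorf("get %q: %w", key, errKeyNotFound)
	}
	return v, nil
}

// getOrDefault treats a missing key as a normal outcome and returns def,
// while passing any other error through to the caller.
func getOrDefault(g getter, key, def []byte) ([]byte, error) {
	v, err := g.Get(key)
	if errors.Is(err, errKeyNotFound) {
		return def, nil
	}
	return v, err
}

func main() {
	s := mapStore{"lang": []byte("go")}
	v, err := getOrDefault(s, []byte("theme"), []byte("light"))
	fmt.Printf("%s %v\n", v, err)
}
```

errors.Is matches whether the sentinel is returned directly or wrapped, so the same check works in either case.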
var DefaultBatchOptions = BatchOptions{
	Sync: false,
}

DefaultBatchOptions is the default set of batch (transaction) options, provided for use in tests.

var DefaultIteratorOptions = IteratorOptions{
	Prefix:  nil,
	Reverse: false,
}

DefaultIteratorOptions is the default set of iterator options, provided for use in tests.

var DefaultOptions = Options{
	DirPath:               os.TempDir(),
	DataFileSize:          512 * 1024 * 1024,
	SyncStrategy:          No,
	BytesPerSync:          1024 * 1024,
	EnableBackgroundMerge: false,
	IndexType:             index.HashMap,
	ShardNum:              16,
	FileIOType:            fio.StandardFIO,
	DataFileMergeRatio:    0.5,
}

DefaultOptions is the default set of options, provided for use by the example programs.

Functions

This section is empty.

Types

type Batch

type Batch struct {
	// contains filtered or unexported fields
}

Batch is the client for batch operations.

func (*Batch) Commit

func (b *Batch) Commit() error

func (*Batch) Delete

func (b *Batch) Delete(key []byte) error

func (*Batch) Get

func (b *Batch) Get(key []byte) ([]byte, error)

func (*Batch) Put

func (b *Batch) Put(key []byte, value []byte) error
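
One common way to obtain all-or-nothing behaviour from an API shaped like this is to stage writes inside the batch and apply them under a single lock at commit time. The sketch below illustrates that idea only; it is not taken from xixi-kv, and the store and batch types are hypothetical. The second Commit returns an error whose message mirrors ErrBatchCommitted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errCommitted mirrors the message of the documented ErrBatchCommitted.
var errCommitted = errors.New("the batch is committed")

// store is a stand-in for the database: a mutex-guarded map.
type store struct {
	mu   sync.Mutex
	data map[string][]byte
}

// batch stages writes locally; a nil value marks a pending delete.
type batch struct {
	st        *store
	pending   map[string][]byte
	committed bool
}

func (s *store) newBatch() *batch {
	return &batch{st: s, pending: make(map[string][]byte)}
}

// Put copies the value so later changes by the caller cannot leak in.
func (b *batch) Put(key, value []byte) {
	b.pending[string(key)] = append([]byte{}, value...)
}

func (b *batch) Delete(key []byte) {
	b.pending[string(key)] = nil
}

// Commit applies every staged change while holding the store lock, so
// users of the store see either none of the batch or all of it.
func (b *batch) Commit() error {
	if b.committed {
		return errCommitted
	}
	b.st.mu.Lock()
	defer b.st.mu.Unlock()
	for k, v := range b.pending {
		if v == nil {
			delete(b.st.data, k)
		} else {
			b.st.data[k] = v
		}
	}
	b.committed = true
	return nil
}

func main() {
	st := &store{data: map[string][]byte{"old": []byte("x")}}
	b := st.newBatch()
	b.Put([]byte("a"), []byte("1"))
	b.Delete([]byte("old"))
	fmt.Println("keys before commit:", len(st.data))
	fmt.Println("commit:", b.Commit())
	fmt.Println("second commit:", b.Commit())
}
```

A disk-backed engine would additionally need to make the staged records durable as a unit, which this in-memory sketch does not attempt.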

type BatchOptions

type BatchOptions struct {
	Sync bool // whether to persist to disk immediately when the batch is flushed
}

BatchOptions holds the configuration for batch operations.

type DB

type DB struct {
	// contains filtered or unexported fields
}

DB is the client for the Bitcask storage engine.

func Open

func Open(options Options) (*DB, error)

Open initializes the client.

func (*DB) Backup

func (db *DB) Backup(dir string) error

Backup backs up the database.

func (*DB) Close

func (db *DB) Close() error

Close closes the database.

func (*DB) Delete

func (db *DB) Delete(key []byte) error

Delete removes the data for the given key.

func (*DB) Fold

func (db *DB) Fold(fn func(key []byte, value []byte) bool) error

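Fold applies a user-defined function to every item in the database. Changes made to the items are not written back to the database.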

func (*DB) Get

func (db *DB) Get(key []byte) ([]byte, error)

Get reads the data for the given key.

func (*DB) ListKeys

func (db *DB) ListKeys() [][]byte

ListKeys returns all keys in the database.

func (*DB) Merge

func (db *DB) Merge() error

Merge runs the merge process immediately.

func (*DB) NewBatch

func (db *DB) NewBatch(options BatchOptions) *Batch

func (*DB) NewIterator

func (db *DB) NewIterator(opts IteratorOptions) *Iterator

func (*DB) Put

func (db *DB) Put(key []byte, value []byte) error

Put adds an element.

func (*DB) Stat

func (db *DB) Stat() *Stat

Stat returns the database statistics as of the current moment.

func (*DB) Sync

func (db *DB) Sync() error

Sync persists data to disk.

type Iterator

type Iterator struct {
	// contains filtered or unexported fields
}

Iterator is the database-level, user-facing iterator.

func (*Iterator) Close

func (it *Iterator) Close()

Close closes the iterator and releases its resources.

func (*Iterator) Key

func (it *Iterator) Key() []byte

Key returns the key at the current position.

func (*Iterator) Next

func (it *Iterator) Next()

Next advances to the next element that satisfies the conditions.

func (*Iterator) Rewind

func (it *Iterator) Rewind()

Rewind resets the iterator to its starting point.

func (*Iterator) Seek

func (it *Iterator) Seek(key []byte)

Seek moves to the first key greater than or equal to (or, when iterating in reverse, less than or equal to) the specified key.

func (*Iterator) Valid

func (it *Iterator) Valid() bool

Valid is used to determine whether the traversal has finished.

func (*Iterator) Value

func (it *Iterator) Value() ([]byte, error)

Value returns the actual value for the key at the current position.

type IteratorOptions

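type IteratorOptions struct {
	// Prefix filters keys by prefix; empty by default.
	Prefix []byte
	// Reverse selects descending traversal; false by default.
	Reverse bool
}

IteratorOptions holds the configuration for the index iterator.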
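
The effect of Prefix and Reverse, together with the Rewind/Valid/Next/Key/Seek method set, can be illustrated on a sorted snapshot of keys. This is a sketch under assumptions, not the library's iterator: keyIter and iterOptions are hypothetical, and it assumes Valid returns true while the iterator is still positioned on an element.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// iterOptions mirrors the two documented IteratorOptions fields.
type iterOptions struct {
	Prefix  []byte
	Reverse bool
}

// keyIter walks a sorted snapshot of keys (a simplification for illustration).
type keyIter struct {
	keys [][]byte
	pos  int
	opts iterOptions
}

func newKeyIter(keys [][]byte, opts iterOptions) *keyIter {
	sorted := append([][]byte{}, keys...)
	sort.Slice(sorted, func(i, j int) bool {
		c := bytes.Compare(sorted[i], sorted[j])
		if opts.Reverse {
			return c > 0
		}
		return c < 0
	})
	it := &keyIter{keys: sorted, opts: opts}
	it.Rewind()
	return it
}

// skip advances past keys that do not carry the configured prefix.
func (it *keyIter) skip() {
	for it.pos < len(it.keys) && !bytes.HasPrefix(it.keys[it.pos], it.opts.Prefix) {
		it.pos++
	}
}

func (it *keyIter) Rewind()     { it.pos = 0; it.skip() }
func (it *keyIter) Valid() bool { return it.pos < len(it.keys) }
func (it *keyIter) Next()       { it.pos++; it.skip() }
func (it *keyIter) Key() []byte { return it.keys[it.pos] }

// Seek jumps to the first key >= target, or <= target when reversed.
func (it *keyIter) Seek(target []byte) {
	it.pos = sort.Search(len(it.keys), func(i int) bool {
		c := bytes.Compare(it.keys[i], target)
		if it.opts.Reverse {
			return c <= 0
		}
		return c >= 0
	})
	it.skip()
}

// collect drains the iterator from its current position.
func collect(it *keyIter) []string {
	var out []string
	for ; it.Valid(); it.Next() {
		out = append(out, string(it.Key()))
	}
	return out
}

func main() {
	keys := [][]byte{[]byte("user:2"), []byte("cfg:a"), []byte("user:1"), []byte("user:3")}
	it := newKeyIter(keys, iterOptions{Prefix: []byte("user:"), Reverse: true})
	fmt.Println(collect(it))
}
```

With the real API the loop would have the same shape, starting from Rewind and calling Next while Valid holds, followed by Close to release resources.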

type Options

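type Options struct {
	DirPath               string          // data directory
	DataFileSize          int64           // maximum size of a data file, in bytes
	SyncStrategy          SyncStrategy    // persistence strategy
	BytesPerSync          uint            // threshold of newly written bytes
	IndexType             index.IndexType // index type
	FileIOType            fio.FileIOType  // file I/O type
	EnableBackgroundMerge bool            // whether to enable scheduled background merge
	DataFileMergeRatio    float32         // ratio of invalid data at which merge runs
	ShardNum              int             // number of index shards
}

Options holds the user-facing configuration.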

type Stat

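type Stat struct {
	KeyNum          int   // current number of keys
	DataFileNum     int   // current number of data files
	ReclaimableSize int64 // amount of data a merge can currently reclaim, in bytes
	DiskSize        int64 // disk space used by the data directory
}

Stat holds real-time statistics.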
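
The statistics above relate naturally to the DataFileMergeRatio option. The sketch below shows one plausible way such a threshold could be evaluated. How xixi-kv actually computes the ratio is not stated here, so the choice of DiskSize as the denominator is an assumption, and stat and shouldMerge are hypothetical names.

```go
package main

import "fmt"

// stat mirrors two of the documented Stat fields.
type stat struct {
	ReclaimableSize int64 // bytes a merge could reclaim
	DiskSize        int64 // bytes used by the data directory
}

// shouldMerge reports whether the reclaimable share of the data directory
// has reached ratio. Using DiskSize as the denominator is an assumption.
func shouldMerge(s stat, ratio float32) bool {
	if s.DiskSize <= 0 {
		return false // nothing on disk, nothing to merge
	}
	return float32(s.ReclaimableSize)/float32(s.DiskSize) >= ratio
}

func main() {
	s := stat{ReclaimableSize: 600, DiskSize: 1000}
	fmt.Println(shouldMerge(s, 0.5))
}
```

A check of this kind would explain an error such as ErrMergeRatioUnreached being returned when too little of the data is stale.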

type SyncStrategy

type SyncStrategy byte
const (
	No SyncStrategy = iota // left to the operating system

	Always // persist immediately

	Threshold // persist once newly written data reaches the threshold
)

Directories

Path             Synopsis
cmd              command
examples/batch   command
examples/db      command
