kvimd

package module
v0.2.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 6, 2019 License: MIT Imports: 15 Imported by: 0

README

kvimd

Documentation Build Status

kvimd - KV IM-mutable D-atabase

A (fast) disk K/V store for immutable data

Requirements

  • Key / Value is immutable (we assume for a key, the value is unique and always the same) (kvimd == reverse lookup database)
  • We don't care about disk space, we care about speed
  • Random access is as cheap as continuous access (disk == SSD/NVMe)
  • Key size is constant

File structure

For a given root path of /kvimd_db/:

  • /kvimd_db/db#.hashdisk is a disk hashmap mapping key -> (valuesDisk file id, offset in file)
  • /kvimd_db/db#.valuesdisk is the file containing the values. (Seeking with offset, you get back a value)
db#.hashdisk

It is a non-sparse file where all values are encoded as follow:

  • On write, we ask the DB to reserve us space of len(value) + size of the varint to encode the value
  • The data is written as length_as_varint + data. We use uint32 for this file (so file max of 4Gb) so the varint can be up to 5 bytes
db#.valuesdisk

It is a sparse disk file that is mmapped. A cell is of size len(key) + 4 + 4 (4 for uint32 which is the valuesDisk file id + 4 for uint32 which is the offset in that file) This imposes a limitation on kvimd that the database will not hold more than 4Gb*4Gb = 1<<60 = 1<<42 exabytes

Currently use linear probing, in the future we might want to implement RobinHood hashing to increase the load factor to 0.9 or 0.95

Improvements:

HashDisk

  • Use robin hood hashing instead of linear probling
  • Use type casting / whatever instead of bytes.Equal to find zero-value slice (5.81 ns/op vs 2.27 ns/op)

ValuesDisk

  • Do a dichotomy to know what offset to restart on (or read length). This is bc if we crash loop, we will create A LOT of (large) files
  • Add test for Load()
  • Optional value compression

Main DB

  • Check that if key size is given at DB creation and not const it's fine (benchmark)
  • Add test for rotate()
  • There is a log of recent entries (for replay)
  • Possibility to snapshot / lock the database (then everything is appended to log instead)

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrDBClosed    = errors.New("database is already closed")
	ErrFileTooBig  = errors.New("file size is too big (max 4Gb)")
	ErrInvalidKey  = errors.New("key is not valid")
	ErrKeyNotFound = errors.New("key was not found in database")
	ErrNoSpace     = errors.New("no space left in database") // What you usually want to do here is create a new file
	ErrCorrupted   = errors.New("database seems corrupted")
)

Define public errors

Functions

This section is empty.

Types

type DB

type DB struct {
	RootPath string
	// contains filtered or unexported fields
}

DB is a kvimd database. It uses uint32 in a lot of places so this means: each hashmap file is max 4Gb; you can store max 4Gb*4Gb/workers values (a lot)

func NewDB

func NewDB(root string, fileSize uint32) (*DB, error)

NewDB returns a new kvimd database

func (*DB) Close

func (d *DB) Close() error

Close the database, flushing all pending operations to disk. It is not safe to call any Read or Write after a Close

func (*DB) Read

func (d *DB) Read(key []byte) ([]byte, error)

Read a value for a given key from the database. If error is nil then value is returned Return ErrKeyNotFound if key doesn't exist. Return any non-nil error on other errors

func (*DB) Write

func (d *DB) Write(key, value []byte) error

Write a value for a given key in the database. If write succeed, returned error is nil Value might not be persisted directly to disk.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL