calypso

Version: v0.1.0
Published: May 18, 2026 License: MIT Imports: 21 Imported by: 0

README


A key-value storage engine in Go implementing a log-structured Bitcask design with an Adaptive Radix Tree index.


What is CalypsoDB?

While standard Bitcask (from the Riak design paper) offers fast sequential writes and single disk-seek reads, it has distinct design trade-offs, such as in-memory key directories that scale linearly with the key count. CalypsoDB introduces adaptations designed for practical, single-node key-value storage use:

  • Adaptive Radix Tree (ART) Index: Standard Bitcask implementations typically maintain an in-memory hash map (keydir). This consumes substantial memory and prevents ordered scans. CalypsoDB employs a memory-efficient Adaptive Radix Tree (ART) (github.com/plar/go-adaptive-radix-tree). This enables ordered prefix scans and range queries (Range, Scan, SiftRange).
  • Time-to-Live (TTL) Support: Provides support for setting an expiry duration on key-value pairs (PutWithTTL). CalypsoDB manages a separate thread-safe TTL index and runs background compaction to identify and clean up expired records.
  • Redis Protocol Compatibility (calypsod): Exposes a Redis-compatible TCP interface built on github.com/tidwall/redcon. You can query and manage CalypsoDB using standard Redis clients (redis-cli, or standard Go/Python/Node Redis libraries) out of the box.
  • Web UI Dashboard & REST Server (apid): Integrates an HTTP REST API server paired with a browser-based template dashboard to browse, insert, update, and delete key-value pairs.
  • Double-Buffered Compaction (Merge): Implements a background merge and compaction process using a double-buffered temporary database pattern. This design reclaims disk space from stale or deleted records without blocking active reads or incoming writes.
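
To make the log-structured idea behind these features concrete, here is a minimal sketch of an append-only store with an in-memory index. It is not CalypsoDB's implementation: `miniLog`, `entryPos`, and `errNotFound` are hypothetical names, a byte slice stands in for the active datafile, and a plain map stands in for the Adaptive Radix Tree.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel used only by this sketch.
var errNotFound = errors.New("key not found")

// entryPos records where a value lives inside the log.
type entryPos struct {
	offset, size int
}

// miniLog is a toy append-only store: a byte slice stands in for the
// active datafile and a map stands in for the in-memory index.
type miniLog struct {
	data  []byte
	index map[string]entryPos
}

func newMiniLog() *miniLog {
	return &miniLog{index: make(map[string]entryPos)}
}

// put appends the value to the end of the log and points the index at it.
// Older versions of the key remain in the log as stale bytes.
func (l *miniLog) put(key string, value []byte) {
	l.index[key] = entryPos{offset: len(l.data), size: len(value)}
	l.data = append(l.data, value...)
}

// get performs one index lookup followed by one positioned read.
func (l *miniLog) get(key string) ([]byte, error) {
	pos, ok := l.index[key]
	if !ok {
		return nil, errNotFound
	}
	return l.data[pos.offset : pos.offset+pos.size], nil
}

func main() {
	l := newMiniLog()
	l.put("name", []byte("old"))
	l.put("name", []byte("CalypsoDB"))
	v, _ := l.get("name")
	fmt.Printf("%s (log holds %d bytes, including stale ones)\n", v, len(l.data))
}
```

The stale bytes left behind by the overwritten value are what a merge pass later reclaims.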

Project Status

CalypsoDB is currently experimental and under active development. APIs and storage formats may change between releases.


Architectural Modules

CalypsoDB is structured into modular sub-systems:

graph TD
    Client[Client / CLI / Web] -->|HTTP / Redis Protocol / Go API| Calypso[Calypso Core Engine]
    Calypso -->|In-Memory Indexing| ART[Adaptive Radix Tree Index]
    Calypso -->|Append-Only| WAL[Active Datafile / WAL]
    Calypso -->|Compaction| Compactor[Double-Buffered Compactor]
    Compactor -->|Reclaim Space| Disk[Compacted Datafiles on Disk]
  • Octopus: The internal engine core responsible for entry codecs, metadata handling, options configuration, versioning, and abstract index definitions.
  • Dolphin: The visual administration frontend. A server dashboard serving direct HTML templates for managing key-value stores.
  • Seagull: The gateway services housing the main binaries:
    • calypso - The standalone CLI management tool.
    • calypsod - The Redis-protocol-compatible daemon.
    • apid - The combined REST API and template web interface.

Building and Installation

Prerequisites

  • Go 1.25+ installed.
  • A standard Unix build environment (supporting make).

Build via Makefile

Run the default target to clean, generate assets, and build all binaries:

make

This will produce three statically linked binaries inside the ./bin/ directory:

  • bin/calypso - CLI Tool
  • bin/calypsod - Redis Protocol Daemon
  • bin/apid - REST API & Web Dashboard Server

Build via Go Compiler

If you don't have make installed, compile the packages manually using:

go build -o bin/calypso ./seagull/calypso
go build -o bin/calypsod ./seagull/calypsod
go build -o bin/apid ./seagull/apid

Usage Guide

1. Embedded Go Library API

Integrate CalypsoDB directly into your Go application:

package main

import (
	"fmt"
	"log"
	"time"

	calypso "github.com/calypsodb/calypso"
)

func main() {
	// Open or create a database
	db, err := calypso.Open("./my_db")
	if err != nil {
		log.Fatalf("Failed to open database: %v", err)
	}
	defer db.Close()

	// 1. Standard Put and Get
	if err := db.Put([]byte("name"), []byte("CalypsoDB")); err != nil {
		log.Fatalf("Put failed: %v", err)
	}
	val, err := db.Get([]byte("name"))
	if err != nil {
		log.Fatalf("Get failed: %v", err)
	}
	fmt.Printf("Get 'name': %s\n", val) // CalypsoDB

	// 2. Put with Time-to-Live (TTL)
	err = db.PutWithTTL([]byte("session_token"), []byte("xyz123"), 5*time.Second)
	if err != nil {
		log.Fatalf("PutWithTTL failed: %v", err)
	}

	// 3. Prefix Scan (powered by Radix Tree)
	err = db.Scan([]byte("user:"), func(key []byte) error {
		val, err := db.Get(key)
		if err != nil {
			return err // stops the scan and surfaces the error
		}
		fmt.Printf("Key: %s, Value: %s\n", key, val)
		return nil
	})
	if err != nil {
		log.Fatalf("Scan failed: %v", err)
	}
}

2. The calypso Command-Line Tool

Interact with your database data files directly through the command line:

# Set a key-value pair
./bin/calypso put ./my_db "greeting" "Hello Universe"

# Get the value
./bin/calypso get ./my_db "greeting"

# Delete a key
./bin/calypso del ./my_db "greeting"

# Get database statistics
./bin/calypso stats ./my_db

3. Redis Protocol Compatibility (calypsod)

Start the Redis-compatible TCP server daemon:

# Start the daemon on port 6379, writing to ./my_db
./bin/calypsod --bind :6379 --dbpath ./my_db

Query the database using standard Redis clients:

# Connect using redis-cli
redis-cli -p 6379

127.0.0.1:6379> PING
PONG
127.0.0.1:6379> SET user:1 "Alice"
OK
127.0.0.1:6379> GET user:1
"Alice"
127.0.0.1:6379> KEYS
1) "user:1"
127.0.0.1:6379> EXISTS user:1
(integer) 1
127.0.0.1:6379> DEL user:1
(integer) 1
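
For readers curious about what a client such as redis-cli sends over the wire, the sketch below encodes a command in RESP, the Redis serialization protocol, as an array of bulk strings. `encodeRESP` is a hypothetical helper written for illustration; it is not part of CalypsoDB or redcon.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeRESP renders a command as a RESP array of bulk strings, the
// request format Redis clients send over TCP.
func encodeRESP(args ...string) string {
	var sb strings.Builder
	// Array header: number of elements in the command.
	fmt.Fprintf(&sb, "*%d\r\n", len(args))
	for _, a := range args {
		// Each element is a bulk string: byte length, then the payload.
		fmt.Fprintf(&sb, "$%d\r\n%s\r\n", len(a), a)
	}
	return sb.String()
}

func main() {
	fmt.Printf("%q\n", encodeRESP("SET", "user:1", "Alice"))
}
```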

4. REST API & Web Dashboard Daemon (apid)

Start the API and Web administration dashboard server:

# Start the server (default port 7777)
./bin/apid --dbpath ./my_db
  • REST API endpoints:
    • GET http://localhost:7777/get/{key} - Retrieve key value.
    • POST http://localhost:7777/set/{key}/{value} - Write key-value pair.
  • Web Dashboard:
    • Visit http://localhost:7777/browse in your browser to view, search, insert, and manage records visually.
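
The two endpoints listed above can also be called from Go. The following is a rough sketch that only builds the requests; `getRequest` and `setRequest` are hypothetical helpers, and it is an assumption that apid decodes percent-escaped path segments. Response formats are not documented here, so the sketch does not parse them.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// getRequest builds GET {base}/get/{key}, escaping the key so it is safe
// to use as a single path segment.
func getRequest(base, key string) (*http.Request, error) {
	return http.NewRequest(http.MethodGet, base+"/get/"+url.PathEscape(key), nil)
}

// setRequest builds POST {base}/set/{key}/{value} with both parts escaped.
func setRequest(base, key, value string) (*http.Request, error) {
	u := base + "/set/" + url.PathEscape(key) + "/" + url.PathEscape(value)
	return http.NewRequest(http.MethodPost, u, nil)
}

func main() {
	req, err := setRequest("http://localhost:7777", "greeting", "Hello Universe")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	// Pass req to http.DefaultClient.Do to send it to a running apid.
	fmt.Println(req.Method, req.URL.String())
}
```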

Benchmarks

To evaluate the engine's performance relative to alternative designs (such as B-trees, SQLite, or alternative Bitcask implementations), you can execute the included benchmark suite:

# Execute all benchmark suites
go test -v ./test_bench/...

Preliminary Observations

Because Bitcask uses an append-only log-structured design, write performance is bound primarily by sequential disk I/O (subject to chosen fsync configuration options), while reads guarantee a maximum of one disk seek for values not cached in memory. Radix tree operations in memory execute in $O(k)$ time, where $k$ is key length.

Actual performance metrics depend heavily on underlying hardware (SSD vs. HDD), value size distributions, concurrency levels, and whether fsync is called synchronously on every write or deferred periodically. Run the benchmarks locally to generate metrics specific to your target environment.


License

CalypsoDB is open-source software licensed under the MIT License.

Documentation

Overview

Package calypso implements a key-value store based on the log-structured Bitcask design: append-only datafiles on disk plus an in-memory index of keys.

Constants

const (
	// DefaultDirFileModeBeforeUmask is the default os.FileMode used when creating directories
	DefaultDirFileModeBeforeUmask = os.FileMode(0700)

	// DefaultFileFileModeBeforeUmask is the default os.FileMode used when creating files
	DefaultFileFileModeBeforeUmask = os.FileMode(0600)

	// DefaultMaxDatafileSize is the default maximum datafile size in bytes
	DefaultMaxDatafileSize = 1 << 20 // 1MB

	// DefaultMaxKeySize is the default maximum key size in bytes
	DefaultMaxKeySize = uint32(64) // 64 bytes

	// DefaultMaxValueSize is the default maximum value size in bytes
	DefaultMaxValueSize = uint64(1 << 16) // 64KB

	// DefaultSync is the default file synchronization action
	DefaultSync = false

	// CurrentDBVersion is the current database version
	CurrentDBVersion = uint32(1)
)

Variables

var (
	// ErrKeyNotFound is the error returned when a key is not found
	ErrKeyNotFound = errors.New("error: key not found")

	// ErrKeyTooLarge is the error returned for a key that exceeds the
	// maximum allowed key size (configured with WithMaxKeySize).
	ErrKeyTooLarge = errors.New("error: key too large")

	// ErrKeyExpired is the error returned when a key is queried which has
	// already expired (due to ttl)
	ErrKeyExpired = errors.New("error: key expired")

	// ErrEmptyKey is the error returned for a value with an empty key.
	ErrEmptyKey = errors.New("error: empty key")

	// ErrValueTooLarge is the error returned for a value that exceeds the
	// maximum allowed value size (configured with WithMaxValueSize).
	ErrValueTooLarge = errors.New("error: value too large")

	// ErrChecksumFailed is the error returned if a key/value retrieved does
	// not match its CRC checksum
	ErrChecksumFailed = errors.New("error: checksum failed")

	// ErrDatabaseLocked is the error returned if the database is locked
	// (typically opened by another process)
	ErrDatabaseLocked = errors.New("error: database locked")

	ErrInvalidRange   = errors.New("error: invalid range")
	ErrInvalidVersion = errors.New("error: invalid db version")

	// ErrMergeInProgress is the error returned if Merge is called while a
	// merge is already in progress
	ErrMergeInProgress = errors.New("error: merge already in progress")
)

Functions

This section is empty.

Types

type Bitcask

type Bitcask struct {
	// contains filtered or unexported fields
}

Bitcask represents an on-disk, append-only log of key/value entries together with an in-memory index of keys, following the Bitcask paper and its use in the Riak database.

func Open

func Open(path string, options ...Option) (*Bitcask, error)

Open opens the database at the given path. Behavior can be configured by passing options created with the `WithXXX` functions.

func (*Bitcask) Backup

func (b *Bitcask) Backup(path string) error

Backup copies the database directory to the given path, creating the path if it does not exist.

func (*Bitcask) Close

func (b *Bitcask) Close() error

Close closes the database and removes the lock. It is important to call Close() as this is the only way to clean up the lock held by the open database.

func (*Bitcask) Delete

func (b *Bitcask) Delete(key []byte) error

Delete deletes the named key.

func (*Bitcask) DeleteAll

func (b *Bitcask) DeleteAll() (err error)

DeleteAll deletes all the keys. If an I/O error occurs the error is returned.

func (*Bitcask) Fold

func (b *Bitcask) Fold(f func(key []byte) error) (err error)

Fold iterates over all keys in the database calling the function `f` for each key. If the function returns an error, no further keys are processed and the error is returned.

func (*Bitcask) Get

func (b *Bitcask) Get(key []byte) ([]byte, error)

Get fetches the value for the given key.

func (*Bitcask) Has

func (b *Bitcask) Has(key []byte) bool

Has returns true if the key exists in the database, false otherwise.

func (*Bitcask) Keys

func (b *Bitcask) Keys() chan []byte

Keys returns all keys in the database as a channel of keys

func (*Bitcask) Len

func (b *Bitcask) Len() int

Len returns the total number of keys in the database

func (*Bitcask) Merge

func (b *Bitcask) Merge() error

Merge merges all datafiles in the database. Old keys are squashed and deleted keys are removed. Duplicate key/value pairs are also removed. Call this function periodically to reclaim disk space.
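
The effect of a merge can be pictured with a small in-memory sketch. This is not the package's merge code: `record` and `compact` are hypothetical, a slice stands in for the datafiles, and a nil value is used here as the deletion marker.

```go
package main

import "fmt"

// record is one log entry; a nil value marks a deletion (tombstone).
type record struct {
	key   string
	value []byte
}

// compact keeps only the newest record for each key and drops keys whose
// newest record is a tombstone.
func compact(entries []record) []record {
	latest := make(map[string]int) // key -> index of its newest record
	for i, r := range entries {
		latest[r.key] = i
	}
	var out []record
	for i, r := range entries {
		if latest[r.key] == i && r.value != nil {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	entries := []record{
		{"a", []byte("1")},
		{"b", []byte("2")},
		{"a", []byte("3")}, // supersedes the first "a"
		{"b", nil},         // deletes "b"
	}
	for _, r := range compact(entries) {
		fmt.Printf("%s=%s\n", r.key, r.value)
	}
}
```

Four records shrink to one live record, which is the space a merge reclaims.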

func (*Bitcask) Put

func (b *Bitcask) Put(key, value []byte) error

Put stores the key and value in the database.

func (*Bitcask) PutWithTTL

func (b *Bitcask) PutWithTTL(key, value []byte, ttl time.Duration) error

PutWithTTL stores the key and value in the database with the given TTL

func (*Bitcask) Range

func (b *Bitcask) Range(start, end []byte, f func(key []byte) error) (err error)

Range performs a range scan over the keys between the start key and end key, calling the function `f` for each key found. If the function returns an error, no further keys are processed and the first error is returned.

func (*Bitcask) Reclaimable

func (b *Bitcask) Reclaimable() int64

Reclaimable returns space that can be reclaimed

func (*Bitcask) Reopen

func (b *Bitcask) Reopen() error

Reopen closes and reopens the database

func (*Bitcask) RunGC

func (b *Bitcask) RunGC() error

RunGC deletes all expired keys

func (*Bitcask) Scan

func (b *Bitcask) Scan(prefix []byte, f func(key []byte) error) (err error)

Scan performs a prefix scan of keys matching the given prefix and calling the function `f` with the keys found. If the function returns an error no further keys are processed and the first error is returned.

func (*Bitcask) Sift

func (b *Bitcask) Sift(f func(key []byte) (bool, error)) (err error)

Sift iterates over all keys in the database calling the function `f` for each key. If the KV pair is expired or the function returns true, that key is deleted from the database. If the function returns an error on any key, no further keys are processed, no keys are deleted, and the first error is returned.

func (*Bitcask) SiftRange

func (b *Bitcask) SiftRange(start, end []byte, f func(key []byte) (bool, error)) (err error)

SiftRange performs a range scan over the keys between the start key and end key, calling the function `f` for each key found. If the KV pair is expired or the function returns true, that key is deleted from the database. If the function returns an error on any key, no further keys are processed, no keys are deleted, and the first error is returned.

func (*Bitcask) SiftScan

func (b *Bitcask) SiftScan(prefix []byte, f func(key []byte) (bool, error)) (err error)

SiftScan iterates over all keys in the database beginning with the given prefix, calling the function `f` for each key. If the KV pair is expired or the function returns true, that key is deleted from the database. If the function returns an error on any key, no further keys are processed, no keys are deleted, and the first error is returned.

func (*Bitcask) Stats

func (b *Bitcask) Stats() (stats Stats, err error)

Stats returns statistics about the database including the number of data files, keys and overall size on disk of the data

func (*Bitcask) Sync

func (b *Bitcask) Sync() error

Sync flushes all buffers to disk ensuring all data is written

type Option

type Option func(*config.Config) error

Option is a function that takes a config struct and modifies it
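
This is the functional options pattern. As a minimal sketch of how such options compose, the example below uses a hypothetical local `config` struct in place of the package's internal `config.Config`, whose fields are not shown in this documentation; `option`, `withMaxKeySize`, `withSync`, and `newConfig` are likewise illustrative names.

```go
package main

import (
	"errors"
	"fmt"
)

// config is a hypothetical stand-in for the package's internal config type.
type config struct {
	maxKeySize uint32
	sync       bool
}

// option mutates a config and may reject invalid settings.
type option func(*config) error

func withMaxKeySize(size uint32) option {
	return func(c *config) error {
		if size == 0 {
			return errors.New("max key size must be positive")
		}
		c.maxKeySize = size
		return nil
	}
}

func withSync(sync bool) option {
	return func(c *config) error {
		c.sync = sync
		return nil
	}
}

// newConfig starts from defaults, then applies the options in order.
func newConfig(opts ...option) (*config, error) {
	c := &config{maxKeySize: 64, sync: false}
	for _, opt := range opts {
		if err := opt(c); err != nil {
			return nil, err
		}
	}
	return c, nil
}

func main() {
	c, err := newConfig(withMaxKeySize(128), withSync(true))
	fmt.Println(c.maxKeySize, c.sync, err)
}
```

Callers that pass no options get the defaults, and new settings can be added later without changing the constructor's signature.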

func WithAutoRecovery

func WithAutoRecovery(enabled bool) Option

WithAutoRecovery enables or disables automatic recovery of datafiles and recreation of index files. IMPORTANT: Enable this option only after making a proper backup of all existing datafiles.

func WithDirFileModeBeforeUmask

func WithDirFileModeBeforeUmask(mode os.FileMode) Option

WithDirFileModeBeforeUmask sets the FileMode used for each new directory created.

func WithFileFileModeBeforeUmask

func WithFileFileModeBeforeUmask(mode os.FileMode) Option

WithFileFileModeBeforeUmask sets the FileMode used for each new file created.

func WithMaxDatafileSize

func WithMaxDatafileSize(size int) Option

WithMaxDatafileSize sets the maximum datafile size option

func WithMaxKeySize

func WithMaxKeySize(size uint32) Option

WithMaxKeySize sets the maximum key size option

func WithMaxValueSize

func WithMaxValueSize(size uint64) Option

WithMaxValueSize sets the maximum value size option

func WithSync

func WithSync(sync bool) Option

WithSync causes Sync() to be called on every key/value written, increasing durability and safety at the expense of performance.

type Stats

type Stats struct {
	Datafiles int
	Keys      int
	Size      int64
}

Stats is a struct returned by Stats() on an open Bitcask instance

Directories

benchmarks
  • bolt_bench (command)
  • cl_bench (command)
  • faker (command)
  • million_cl (command)
  • prol_bench (command)
  • sqlite3_bench (command)
seagull
  • apid (command)
  • calypso (command)
  • calypsod (command)
