streamhash

package module

v0.0.0-...-9c240c3 Latest Latest Go to latest Published: Jun 3, 2026 License: Apache-2.0 Imports: 19 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/tamirms/streamhash

Links

Open Source Insights

README ¶

StreamHash

A Go library for building and querying Minimal Perfect Hash Function (MPHF) indexes over billions of keys, using bounded RAM and streaming construction.

An MPHF maps N keys to N consecutive integers [0, N) with no collisions. This enables compact, read-only lookup tables where every key has a unique position — O(1) lookups without storing the keys themselves.

Features

Streaming construction — build indexes over 1B+ keys with ~1–75 MB of RAM, regardless of dataset size
Sorted and unsorted input — pre-sorted keys build with no temp disk; unsorted keys use per-writer temp files
Parallel builds — block-independent construction scales near-linearly with worker count
Concurrent writers — unsorted mode supports multiple concurrent writers for parallel key ingestion
Two algorithms — Bijection (compact, low RAM) and PTRHash (fast queries)
Optional payloads — store 1–8 byte fixed-size values alongside keys
Optional fingerprints — detect non-member keys with configurable false-positive rates
Single-file output — one mmap'd file for queries, no external dependencies

Performance

Index size: Bijection produces the most compact indexes (~2.5 bits/key MPHF overhead). PTRHash is slightly larger (~2.7 bits/key) but offers significantly faster queries.

Query latency: PTRHash queries are O(1) via direct pilot lookup, roughly 20× faster than Bijection's O(128) checkpoint-based decoding. Query latency is the same regardless of build mode. Each query requires one metadata read from disk; payload mode adds a second read.

Build throughput: Both algorithms build at similar speeds. Sorted input is faster than unsorted at higher worker counts because blocks are streamed directly to workers without an intermediate partition-read phase. Unsorted builds match sorted throughput at 1 worker and scale well with more workers but plateau below sorted peak throughput. Adding more workers via WithWorkers(n) scales build throughput near-linearly up to the number of blocks.

Memory: Sorted builds use very little heap (single-digit MB at 1 worker). Unsorted builds use more due to per-writer flush buffers (~12 MB per writer) and the read-phase pipeline, but remain bounded regardless of dataset size. With AddKeys(N, fn), the AddKey phase parallelizes across N concurrent writers.

Run go run ./cmd/bench on your hardware for concrete numbers (see Benchmarking).

Installation

go get github.com/tamirms/streamhash

Requires Go 1.26+.

Usage

Building an index (sorted input)

Keys must be at least 16 bytes and uniformly distributed. Use PreHash for non-uniform keys (strings, integers, etc.).

builder, err := streamhash.NewSortedBuilder(ctx, "index.idx", totalKeys,
    streamhash.WithPayload(4),
    streamhash.WithWorkers(4),
)
if err != nil {
    log.Fatal(err)
}
defer builder.Close()

for _, key := range sortedKeys {
    if err := builder.AddKey(key, payload); err != nil {
        log.Fatal(err)
    }
}
if err := builder.Finish(); err != nil {
    log.Fatal(err)
}

Building an index (unsorted input)

builder, err := streamhash.NewUnsortedBuilder(ctx, "index.idx", totalKeys, "/tmp/myindex",
    streamhash.WithWorkers(4),
)
if err != nil {
    log.Fatal(err)
}
defer builder.Close()

for _, key := range keys {
    if err := builder.AddKey(key, payload); err != nil {
        log.Fatal(err)
    }
}
if err := builder.Finish(); err != nil {
    log.Fatal(err)
}

Building with concurrent writers (unsorted input)

builder, err := streamhash.NewUnsortedBuilder(ctx, "index.idx", totalKeys, "/tmp/myindex",
    streamhash.WithWorkers(16),
)
if err != nil {
    log.Fatal(err)
}
defer builder.Close()

// AddKeys calls Finish internally.
if err := builder.AddKeys(8, func(writerID int, addKey func([]byte, uint64) error) error {
    for key, payload := range myPartition(writerID) {
        if err := addKey(key, payload); err != nil {
            return err
        }
    }
    return nil
}); err != nil {
    log.Fatal(err)
}

Querying

idx, err := streamhash.Open("index.idx")
if err != nil {
    log.Fatal(err)
}
defer idx.Close()

// MPHF mode: get the rank (0-based index) for a key
rank, err := idx.QueryRank(key)

// Payload mode: get the stored payload for a key
pi, err := idx.WithPayload()
rank, payload, err := pi.QueryPayload(key)

Pre-hashing non-uniform keys

If your keys are not already uniformly random (e.g., strings, sequential integers, UUIDs), pre-hash them before building and querying:

// Pre-hash and sort
hashedKeys := make([][]byte, len(keys))
for i, key := range keys {
    hashedKeys[i] = streamhash.PreHash(key)
}
sort.Slice(hashedKeys, func(i, j int) bool {
    return bytes.Compare(hashedKeys[i], hashedKeys[j]) < 0
})

// Build
builder, err := streamhash.NewSortedBuilder(ctx, "index.idx", uint64(len(hashedKeys)))
if err != nil {
    log.Fatal(err)
}
defer builder.Close()
for _, hk := range hashedKeys {
    if err := builder.AddKey(hk, 0); err != nil {
        log.Fatal(err)
    }
}
if err := builder.Finish(); err != nil {
    log.Fatal(err)
}

// Query: pre-hash the lookup key
rank, err := idx.QueryRank(streamhash.PreHash(originalKey))

Build Options

Option	Description	Default
`WithWorkers(n)`	Parallel build workers	1
`WithPayload(sizeBytes)`	Payload size in bytes (0-8)	0 (MPHF only)
`WithFingerprint(sizeBytes)`	Fingerprint size in bytes (0-4)	0 (disabled)
`WithAlgorithm(algo)`	`AlgoBijection` or `AlgoPTRHash`	`AlgoBijection`
`WithGlobalSeed(seed)`	Hash seed (change on build failure)	fixed default
`WithMetadata(data)`	Arbitrary metadata stored in the file	none

Choosing an Algorithm

Bijection (default) — best for most use cases. Smallest indexes and lowest RAM usage. Query decoding is O(128) via checkpoint-based Elias-Fano/Golomb-Rice decoding.

PTRHash — best when query speed is critical. O(1) queries via direct pilot byte lookup, at the cost of slightly larger indexes and more RAM during construction.

Design

StreamHash partitions keys into fixed-size blocks by prefix, routes each key to its block, and delegates MPHF construction to a pluggable algorithm. This block-partitioning architecture enables:

Bounded RAM: each block is solved independently, using only O(block) memory
Parallelism: workers solve blocks concurrently while a coordinator sequences output
Locality: all metadata for a block is contiguous on disk — one read per query

The framework handles key routing, file layout, parallel coordination, and payload/fingerprint storage. Algorithms only need to implement a build-time solver and a query-time decoder.

See streamhash-spec.md for the full technical specification.

Benchmarking

# MPHF-only mode (core hash function performance)
go run ./cmd/bench -keys 10000000 -payload 0 -fp 0 -algo bijection

# With payload and fingerprint
go run ./cmd/bench -keys 10000000 -payload 4 -fp 1 -algo ptrhash -workers 4

License

Apache License 2.0 — see LICENSE.

Documentation ¶

Rendered for

Overview ¶

Package streamhash implements a Minimal Perfect Hash Function (MPHF) library with streaming build support and bounded RAM usage.

StreamHash is designed for building large-scale indexes (1B+ keys) efficiently. See the README for performance characteristics.

Basic Usage ¶

Building an index (sorted input):

builder, err := streamhash.NewSortedBuilder(ctx, "index.idx", totalKeys)
if err != nil {
    log.Fatal(err)
}
defer builder.Close()
for key, payload := range sortedKeys {
    if err := builder.AddKey(key, payload); err != nil {
        log.Fatal(err)
    }
}
if err := builder.Finish(); err != nil {
    log.Fatal(err)
}

Building an index (unsorted input):

builder, err := streamhash.NewUnsortedBuilder(ctx, "index.idx", totalKeys, "/tmp")
if err != nil {
    log.Fatal(err)
}
defer builder.Close()
for key, payload := range keys {
    if err := builder.AddKey(key, payload); err != nil {
        log.Fatal(err)
    }
}
if err := builder.Finish(); err != nil {
    log.Fatal(err)
}

Querying an index:

idx, err := streamhash.Open("index.idx")
if err != nil {
    log.Fatal(err)
}
defer idx.Close()

rank, err := idx.QueryRank(streamhash.PreHash([]byte("mykey")))
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Key rank: %d\n", rank)

Package Structure ¶

The implementation is organized as follows:

Public API: builder.go (NewSortedBuilder), builder_unsorted.go (NewUnsortedBuilder), index.go (Open, QueryRank, PayloadIndex)
Configuration: builder_options.go (BuildOption, With* functions)
Serialization: header.go (header, footer, ramIndexEntry), index_writer.go
Key routing & hashing: key.go (fingerprint extraction, payload packing), prehash.go (PreHash); block routing is FastRange32 over the big-endian prefix, inline in builder.go/index.go
Algorithm dispatch: algorithm.go (blockBuilder/blockDecoder interfaces, factory functions)
Block algorithms: internal/bijection/ (EF/GR), internal/ptrhash/ (Cuckoo)
Platform: platform_*.go (OS-specific: fallocate)

Index ¶

Constants
Variables
func PreHash(key []byte) []byte
func PreHashInPlace(dst []byte, key []byte)
type Algorithm
- func (a Algorithm) String() string
type BuildOption
- func WithAlgorithm(algo Algorithm) BuildOption
- func WithFingerprint(sizeBytes int) BuildOption
- func WithGlobalSeed(seed uint64) BuildOption
- func WithMetadata(data []byte) BuildOption
- func WithPayload(sizeBytes int) BuildOption
- func WithWorkers(n int) BuildOption
type Index
- func Open(path string) (*Index, error)
- func OpenBytes(data []byte) (*Index, error)
- func OpenFile(f *os.File) (*Index, error)
- func (idx *Index) Close() error
- func (idx *Index) NumBlocks() uint32
- func (idx *Index) NumKeys() uint64
- func (idx *Index) PayloadSize() int
- func (idx *Index) QueryRank(key []byte) (uint64, error)
- func (idx *Index) Stats() *Stats
- func (idx *Index) UserMetadata() []byte
- func (idx *Index) Verify() error
- func (idx *Index) WithPayload() (*PayloadIndex, error)
type PayloadIndex
- func OpenPayload(path string) (*PayloadIndex, error)
- func OpenPayloadBytes(data []byte) (*PayloadIndex, error)
- func OpenPayloadFile(f *os.File) (*PayloadIndex, error)
- func (pi *PayloadIndex) QueryPayload(key []byte) (rank uint64, payload uint64, err error)
type SortedBuilder
- func NewSortedBuilder(ctx context.Context, output string, totalKeys uint64, opts ...BuildOption) (*SortedBuilder, error)
- func (sb *SortedBuilder) AddKey(key []byte, payload uint64) error
- func (sb *SortedBuilder) Close() error
- func (sb *SortedBuilder) Finish() error
type Stats
- func GetStats(path string) (*Stats, error)
type UnsortedBuilder
- func NewUnsortedBuilder(ctx context.Context, output string, totalKeys uint64, tempDir string, ...) (*UnsortedBuilder, error)
- func (ub *UnsortedBuilder) AddKey(key []byte, payload uint64) error
- func (ub *UnsortedBuilder) AddKeys(numWriters int, ...) error
- func (ub *UnsortedBuilder) Close() error
- func (ub *UnsortedBuilder) Finish() error

Constants ¶

View Source

const MinKeySize = 16

MinKeySize is the minimum key length required for routing and hash computation. Keys must have at least 16 bytes to extract k0 (bytes 0-7) and k1 (bytes 8-15).

Variables ¶

View Source

var (
	ErrBuilderClosed    = sherr.ErrBuilderClosed
	ErrEmptyIndex       = sherr.ErrEmptyIndex
	ErrTooManyKeys      = sherr.ErrTooManyKeys
	ErrKeyTooShort      = sherr.ErrKeyTooShort
	ErrKeyTooLong       = sherr.ErrKeyTooLong
	ErrPayloadOverflow  = sherr.ErrPayloadOverflow
	ErrDuplicateKey     = sherr.ErrDuplicateKey
	ErrUnsortedInput    = sherr.ErrUnsortedInput
	ErrKeyCountMismatch = sherr.ErrKeyCountMismatch
)

Build errors.

View Source

var (
	ErrPayloadTooLarge             = sherr.ErrPayloadTooLarge
	ErrFingerprintTooLarge         = sherr.ErrFingerprintTooLarge
	ErrSplitBucketSeedSearchFailed = sherr.ErrSplitBucketSeedSearchFailed
	ErrIndistinguishableHashes     = sherr.ErrIndistinguishableHashes
)

Construction errors.

View Source

var (
	ErrInvalidMagic   = sherr.ErrInvalidMagic
	ErrInvalidVersion = sherr.ErrInvalidVersion
	ErrChecksumFailed = sherr.ErrChecksumFailed
	ErrTruncatedFile  = sherr.ErrTruncatedFile
	ErrCorruptedIndex = sherr.ErrCorruptedIndex
)

Index errors.

View Source

var (
	ErrIndexClosed = sherr.ErrIndexClosed
	ErrNoPayload   = sherr.ErrNoPayload
	ErrNotFound    = sherr.ErrNotFound
)

Query errors.

Functions ¶

func PreHash ¶

func PreHash(key []byte) []byte

PreHash applies xxHash3-128 to a key, returning 16 bytes.

Use this function when your keys are not uniformly distributed (e.g., strings, URLs, sequential integers, JSON). Pre-hashing transforms arbitrary input into uniformly random 128-bit values required by the MPHF construction.

Usage ¶

For unsorted builds, prehash keys before building:

hashedKeys := make([][]byte, len(keys))
for i, key := range keys {
    hashedKeys[i] = streamhash.PreHash(key)
}

For sorted builds (Builder), you must sort by the HASHED key, not the original key. The original keys are discarded; only hashed keys are indexed:

// 1. Prehash all keys (original keys are no longer needed)
hashedKeys := make([][]byte, len(keys))
for i, key := range keys {
    hashedKeys[i] = streamhash.PreHash(key)
}

// 2. Sort by hashed key bytes
sort.Slice(hashedKeys, func(i, j int) bool {
    return bytes.Compare(hashedKeys[i], hashedKeys[j]) < 0
})

// 3. Build using Builder
builder, _ := streamhash.NewSortedBuilder(ctx, path, uint64(len(hashedKeys)))
for _, hk := range hashedKeys {
    builder.AddKey(hk, 0)
}
err := builder.Finish()

When to use ¶

Strings, URLs, file paths: highly non-uniform prefix distribution
Sequential integers: all keys share common high bits
UUIDs with common prefixes: e.g., all start with same version nibble
Any key where the first 16 bytes are not uniformly random

When NOT to use ¶

Keys are already random 128-bit values (e.g., random UUIDs, crypto hashes)
Keys are already uniformly distributed across their prefix bits

Querying: If you prehash keys during build, you must also prehash during query:

rank, err := idx.QueryRank(streamhash.PreHash(originalKey))

func PreHashInPlace ¶

func PreHashInPlace(dst []byte, key []byte)

PreHashInPlace applies xxHash3-128 to a key, writing the result to dst. dst must be at least 16 bytes. This avoids allocation when processing many keys in a loop.

Types ¶

type Algorithm ¶

type Algorithm uint16

Algorithm identifies the MPHF algorithm used for block construction. This is stored in the file header.

const (
	// AlgoBijection uses EF/GR encoding with O(128) query.
	AlgoBijection Algorithm = 0

	// AlgoPTRHash uses PTRHash-style Cuckoo with 8-bit pilots.
	AlgoPTRHash Algorithm = 1
)

func (Algorithm) String ¶

func (a Algorithm) String() string

String returns the algorithm name.

type BuildOption ¶

type BuildOption func(*buildConfig)

BuildOption is a functional option for configuring builds.

func WithAlgorithm ¶

func WithAlgorithm(algo Algorithm) BuildOption

WithAlgorithm sets the block construction algorithm. Default is AlgoBijection.

func WithFingerprint ¶

func WithFingerprint(sizeBytes int) BuildOption

WithFingerprint enables fingerprint verification. Size must be 0-4 bytes. 0 means no fingerprints (the default). With fingerprints enabled, QueryRank and QueryPayload return ErrNotFound for non-member keys.

func WithGlobalSeed ¶

func WithGlobalSeed(seed uint64) BuildOption

WithGlobalSeed sets the global hash seed.

func WithMetadata ¶

func WithMetadata(data []byte) BuildOption

WithMetadata sets the variable-length user metadata. The metadata is copied, so the caller can reuse the slice after this call.

func WithPayload ¶

func WithPayload(sizeBytes int) BuildOption

WithPayload configures payload storage. Size must be 0-8 bytes. 0 means no payload (MPHF-only mode, the default).

func WithWorkers ¶

func WithWorkers(n int) BuildOption

WithWorkers sets the number of parallel workers for block building during Finish.

type Index ¶

type Index struct {
	// contains filtered or unexported fields
}

Index is a read-only StreamHash index for querying.

Thread Safety: - QueryRank, PayloadIndex.QueryPayload, and other read methods are safe for concurrent use - Close is NOT safe to call concurrently with queries - Close must only be called after all queries have completed - After Close returns, no methods may be called on the Index

func Open ¶

func Open(path string) (*Index, error)

Open opens a StreamHash index file for querying. It opens the file, memory-maps it, and closes the file descriptor.

func OpenBytes ¶

func OpenBytes(data []byte) (*Index, error)

OpenBytes creates a StreamHash index from an in-memory byte slice. No file is opened or memory-mapped; Close is a no-op. The caller must ensure data is not modified while the Index is in use.

func OpenFile ¶

func OpenFile(f *os.File) (*Index, error)

OpenFile opens a StreamHash index by memory-mapping the given file. The caller is responsible for closing f. Per POSIX mmap(2), f may be closed immediately after OpenFile returns.

func (*Index) Close ¶

func (idx *Index) Close() error

Close closes the index and releases resources.

func (*Index) NumBlocks ¶

func (idx *Index) NumBlocks() uint32

NumBlocks returns the number of blocks in the index.

func (*Index) NumKeys ¶

func (idx *Index) NumKeys() uint64

NumKeys returns the total number of keys in the index.

func (*Index) PayloadSize ¶

func (idx *Index) PayloadSize() int

PayloadSize returns the payload size per key.

func (*Index) QueryRank ¶

func (idx *Index) QueryRank(key []byte) (uint64, error)

QueryRank returns the rank (0-based index) for a key. This is the core MPHF operation. Returns ErrKeyTooShort if key is less than 16 bytes.

func (*Index) Stats ¶

func (idx *Index) Stats() *Stats

Stats returns statistics for the index.

func (*Index) UserMetadata ¶

func (idx *Index) UserMetadata() []byte

UserMetadata returns the variable-length user-defined metadata. The returned slice is backed by the memory-mapped file data.

func (*Index) Verify ¶

func (idx *Index) Verify() error

Verify checks the integrity of the entire index. For separated layout, this verifies: 1. PayloadRegionHash (hash-of-hashes: H(H(b0) || H(b1) || ...)) 2. MetadataRegionHash (streaming hash of metadata region)

The footer (last 32 bytes) is decoded on each Verify call rather than at Open time, so Open() only touches the contiguous prefix and avoids a scattered page fault.

The hash-of-hashes approach matches the streaming hash computation during build, where workers compute per-block payload hashes that are folded in order.

func (*Index) WithPayload ¶

func (idx *Index) WithPayload() (*PayloadIndex, error)

WithPayload returns a PayloadIndex if the index has payload data. Returns an error if the index was built without WithPayload.

type PayloadIndex ¶

type PayloadIndex struct {
	*Index
}

PayloadIndex extends Index with payload query capability. Obtained via OpenPayload, OpenPayloadFile, OpenPayloadBytes, or Index.WithPayload().

func OpenPayload ¶

func OpenPayload(path string) (*PayloadIndex, error)

OpenPayload opens a StreamHash index that has payload data. Returns an error if the index was built without WithPayload.

func OpenPayloadBytes ¶

func OpenPayloadBytes(data []byte) (*PayloadIndex, error)

OpenPayloadBytes creates a payload-capable index from an in-memory byte slice. Returns an error if the index was built without WithPayload.

func OpenPayloadFile ¶

func OpenPayloadFile(f *os.File) (*PayloadIndex, error)

OpenPayloadFile opens a payload-capable index from an open file. Returns an error if the index was built without WithPayload.

func (*PayloadIndex) QueryPayload ¶

func (pi *PayloadIndex) QueryPayload(key []byte) (rank uint64, payload uint64, err error)

QueryPayload returns the rank and payload for a key. Returns ErrKeyTooShort if key is less than 16 bytes.

type SortedBuilder ¶

type SortedBuilder struct {
	// contains filtered or unexported fields
}

SortedBuilder builds an index from keys that arrive in block-sorted order.

Usage:

builder, err := streamhash.NewSortedBuilder(ctx, "index.idx", totalKeys, opts...)
if err != nil { return err }
defer builder.Close()

for key, payload := range sortedData {
    if err := builder.AddKey(key, payload); err != nil { return err }
}
return builder.Finish()

func NewSortedBuilder ¶

func NewSortedBuilder(ctx context.Context, output string, totalKeys uint64, opts ...BuildOption) (*SortedBuilder, error)

NewSortedBuilder creates a builder for sorted input. Keys must be added in block-sorted order via AddKey. Use WithWorkers(N) to parallelize block building during Finish.

func (*SortedBuilder) AddKey ¶

func (sb *SortedBuilder) AddKey(key []byte, payload uint64) error

AddKey adds a key-payload pair. Keys must be in block-sorted order.

func (*SortedBuilder) Close ¶

func (sb *SortedBuilder) Close() error

Close aborts the build and cleans up resources. Safe to call after Finish.

func (*SortedBuilder) Finish ¶

func (sb *SortedBuilder) Finish() error

Finish completes the index and writes it to disk.

type Stats ¶

type Stats struct {
	NumKeys         uint64
	NumBlocks       uint32
	BitsPerKey      float64
	PayloadSize     int
	FingerprintSize int // bytes per key (0 if disabled)
	FileSize        int64
	OverheadBPK     float64 // MPHF overhead bits per key (excludes payload + fingerprint)
	Algorithm       Algorithm
}

Stats holds index statistics.

func GetStats ¶

func GetStats(path string) (*Stats, error)

GetStats returns statistics for an index file.

type UnsortedBuilder ¶

type UnsortedBuilder struct {
	// contains filtered or unexported fields
}

UnsortedBuilder builds an index from keys in arbitrary order. Keys are buffered to per-writer temp files, partitioned, and built during Finish.

Two ingestion modes (mutually exclusive):

AddKey: single-threaded sequential addition, then call Finish
AddKeys: concurrent multi-writer callback, calls Finish internally

Usage (single-threaded):

builder, _ := streamhash.NewUnsortedBuilder(ctx, path, totalKeys, tempDir, opts...)
defer builder.Close()
for key, payload := range data {
    builder.AddKey(key, payload)
}
return builder.Finish()

Usage (concurrent):

builder, _ := streamhash.NewUnsortedBuilder(ctx, path, totalKeys, tempDir, opts...)
defer builder.Close()
return builder.AddKeys(8, func(writerID int, addKey func([]byte, uint64) error) error {
    for key, payload := range myPartition(writerID) {
        if err := addKey(key, payload); err != nil { return err }
    }
    return nil
})

func NewUnsortedBuilder ¶

func NewUnsortedBuilder(ctx context.Context, output string, totalKeys uint64, tempDir string, opts ...BuildOption) (*UnsortedBuilder, error)

NewUnsortedBuilder creates a builder for unsorted input. Keys can be added in any order via AddKey or AddKeys.

tempDir specifies where partition files are created. Pass "" to use the system default (os.TempDir). The directory must exist and be on a local filesystem (ext4, xfs, btrfs). NFS is not supported. tmpfs works but is not recommended at scale since it stores data in RAM/swap.

Use WithWorkers(N) to parallelize block building during Finish.

func (*UnsortedBuilder) AddKey ¶

func (ub *UnsortedBuilder) AddKey(key []byte, payload uint64) error

AddKey adds a key-payload pair. Keys can be in any order. Cannot be used after AddKeys has been called.

func (*UnsortedBuilder) AddKeys ¶

func (ub *UnsortedBuilder) AddKeys(numWriters int, fn func(writerID int, addKey func(key []byte, payload uint64) error) error) error

AddKeys ingests keys in parallel using numWriters concurrent writers. The callback fn is invoked once per writer in its own goroutine. Each invocation receives a writerID (0-based) and an addKey function. AddKeys calls Finish internally — do not call Finish separately.

Cannot be used after AddKey has been called.

Example:

err := builder.AddKeys(8, func(writerID int, addKey func([]byte, uint64) error) error {
    for key, payload := range myPartition(writerID) {
        if err := addKey(key, payload); err != nil { return err }
    }
    return nil
})

func (*UnsortedBuilder) Close ¶

func (ub *UnsortedBuilder) Close() error

Close aborts the build and cleans up resources. Safe to call after Finish.

func (*UnsortedBuilder) Finish ¶

func (ub *UnsortedBuilder) Finish() error

Finish completes the index. Called automatically by AddKeys. Only call this directly when using AddKey (not AddKeys).

Source Files ¶

View all Source files

Directories ¶

Path	Synopsis
cmd
bench command Bench is a benchmarking and data generation tool for StreamHash.	Bench is a benchmarking and data generation tool for StreamHash.
internal
bijection Package bijection implements the Bijection algorithm for StreamHash.	Package bijection implements the Bijection algorithm for StreamHash.
bits Package bits provides low-level bit manipulation primitives.	Package bits provides low-level bit manipulation primitives.
encoding Package encoding provides serialization utilities for fingerprints and payloads.	Package encoding provides serialization utilities for fingerprints and payloads.
ptrhash Package ptrhash implements the PTRHash algorithm for StreamHash.	Package ptrhash implements the PTRHash algorithm for StreamHash.
sherr Package sherr defines error sentinels shared across the streamhash library.	Package sherr defines error sentinels shared across the streamhash library.

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL