sentex

package module
v0.1.0 Latest
Published: Apr 18, 2026 License: MIT Imports: 19 Imported by: 0

README

go-sentex

Pure Go sentence embeddings. Zero CGo, zero external system dependencies, one go get.

go-sentex wraps the sentence-transformers/all-MiniLM-L6-v2 transformer behind a tiny API: give it a string, get back a 384-dimensional unit-norm vector suitable for semantic search, RAG, clustering, or deduplication.

Why it exists

Every other sentence-embedding option in the Go ecosystem either requires CGo (ONNX Runtime, fastembed-go, all-minilm-l6-v2-go) or a Python sidecar, or settles for word-vector approximations. go-sentex fills the gap: a single dependency that builds with CGO_ENABLED=0, cross-compiles cleanly, and needs no C toolchain on the host.

Install

go get github.com/edgetools/go-sentex

No system libraries. No apt install. Builds with CGO_ENABLED=0.

Quick start

package main

import (
	"fmt"
	"log"

	"github.com/edgetools/go-sentex"
)

func main() {
	model, err := sentex.LoadModel()
	if err != nil {
		log.Fatal(err)
	}

	vec, err := model.Embed("deployment strategy")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(vec), "dims, first value:", vec[0])

	vecs, err := model.EmbedBatch([]string{"text one", "text two"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(vecs), "vectors of", model.Dimensions(), "dims")
}

LoadModel loads model.onnx (~86MB) and tokenizer.json (~700KB) from the local HuggingFace Hub cache. If the files are already there (for example, because you have used this model before from Python's sentence-transformers), no download happens. Otherwise they are fetched from HuggingFace Hub once and reused by every subsequent LoadModel call.

API

  • sentex.LoadModel() (*Model, error) — load (and download if needed).
  • model.Embed(text string) ([]float32, error) — one vector, length 384.
  • model.EmbedBatch(texts []string) ([][]float32, error) — one vector per input, preserving order. Empty strings yield all-zero vectors.
  • model.Dimensions() int — always 384.

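Output vectors for non-empty input are L2-normalized, so cosine similarity between them reduces to a dot product. The all-zero vectors returned for empty strings are the one exception: they have norm 0, not 1.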
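As a minimal sketch of that reduction: for two unit-norm vectors the cosine similarity is just the sum of element-wise products, so no division by the norms is needed. The dot helper below is illustrative and not part of go-sentex, and the two short hand-written vectors stand in for real 384-dimensional embeddings.

```go
package main

import "fmt"

// dot returns the dot product of a and b.
// Precondition: len(a) == len(b). For unit-norm inputs the result
// equals the cosine similarity of the two vectors.
// (Illustrative helper, not part of the go-sentex API.)
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

func main() {
	// Hand-written unit vectors standing in for real embeddings.
	a := []float32{1, 0}
	b := []float32{0.6, 0.8}

	fmt.Println("self similarity:", dot(a, a))
	fmt.Println("cross similarity:", dot(a, b))
}
```

With real embeddings, the arguments would be the slices returned by model.Embed or model.EmbedBatch.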

Model

Model          sentence-transformers/all-MiniLM-L6-v2
Format         ONNX (full precision)
Output         384-dimensional []float32, unit norm
Max input      256 tokens (longer inputs truncated)
Download size  ~87MB, only if not already in the HF cache

Cache location

Model files are stored in the standard HuggingFace Hub cache layout, so if you've already pulled this model with Python's sentence-transformers or huggingface_hub, go-sentex picks it up with zero download.

HF_HOME is respected if set. Otherwise the base directory comes from os.UserCacheDir:

OS       Default path
Linux    ~/.cache/huggingface
macOS    ~/Library/Caches/huggingface
Windows  %LocalAppData%\huggingface

Model files land under <cache>/hub/models--sentence-transformers--all-MiniLM-L6-v2/.
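The lookup order described above can be sketched roughly as follows. This is an approximation of the documented behavior, not the library's actual code: hubModelDir is a hypothetical helper, and it takes the environment and cache-dir lookups as parameters only so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// hubModelDir resolves the directory where the model files are expected.
// HF_HOME wins if set; otherwise the base is <user cache dir>/huggingface.
// (Hypothetical helper for illustration, not part of go-sentex.)
func hubModelDir(getenv func(string) string, userCacheDir func() (string, error)) (string, error) {
	base := getenv("HF_HOME")
	if base == "" {
		cache, err := userCacheDir()
		if err != nil {
			return "", err
		}
		base = filepath.Join(cache, "huggingface")
	}
	return filepath.Join(base, "hub", "models--sentence-transformers--all-MiniLM-L6-v2"), nil
}

func main() {
	dir, err := hubModelDir(os.Getenv, os.UserCacheDir)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("expected model directory:", dir)
}
```

Running this on a given machine is a quick way to see where to look when checking whether the cache is already populated.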

Requirements

  • Go 1.25 or newer (see go.mod).
  • Network access the first time the model is fetched into the HF cache. No network is needed once the cache is populated (including when another tool like Python's sentence-transformers populated it).
  • Works with CGO_ENABLED=0.

Limitations

  • Inputs longer than 256 tokens are truncated by the tokenizer.
  • The SimpleGo inference backend is ~5× slower than XLA, which is irrelevant for typical query-scale embedding but worth knowing if you need to embed millions of documents in-process.
  • If the HF cache is cold, the first LoadModel call downloads ~87MB over the network before returning.
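One common way to work around the truncation limit is to split long text into smaller pieces before embedding and pass the pieces to EmbedBatch. The sketch below is an assumption-laden illustration rather than anything go-sentex provides: chunkWords is a hypothetical helper, and it counts whitespace-separated words as a rough proxy for tokens. WordPiece tokenization usually produces more tokens than words, so the word limit should sit well below 256.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkWords splits text into consecutive chunks of at most maxWords
// whitespace-separated words. Word count only approximates token count,
// so pick maxWords conservatively.
// (Hypothetical helper for illustration, not part of go-sentex.)
func chunkWords(text string, maxWords int) []string {
	if maxWords <= 0 {
		return nil
	}
	words := strings.Fields(text)
	var chunks []string
	for start := 0; start < len(words); start += maxWords {
		end := start + maxWords
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
	}
	return chunks
}

func main() {
	for i, chunk := range chunkWords("one two three four five", 2) {
		fmt.Println(i, chunk)
	}
}
```

How the per-chunk vectors are combined afterwards (for example, indexing each chunk separately) is an application-level choice that this README does not prescribe.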

Architecture

See DESIGN.md for the inference pipeline, concurrency model, cache layout, and choice of underlying libraries.

License

MIT. See LICENSE.

Documentation

Overview

Package sentex provides pure-Go sentence embeddings via the all-MiniLM-L6-v2 transformer model. Zero CGo, zero external system dependencies.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Model

type Model struct {
	// contains filtered or unexported fields
}

Model holds the loaded sentence-embedding model. The internal fields are unexported; callers interact exclusively through the public methods.

func LoadModel

func LoadModel() (*Model, error)

LoadModel returns a ready-to-use Model. On the first call, the model weights are downloaded to the local HuggingFace Hub cache if they are not already present. Progress lines are written to os.Stderr during the download; nothing is printed once the cache is warm.
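Because a cold-cache load can block on a download, a program may want to run the loader at most once and reuse the result. Below is a minimal sketch of that pattern using sync.Once. The lazy type is hypothetical and not part of go-sentex, and a stand-in loader replaces sentex.LoadModel so the example runs without the model. Whether a *Model may be shared across goroutines is not stated in this section, so that should be checked against DESIGN.md before relying on it.

```go
package main

import (
	"fmt"
	"sync"
)

// lazy runs an expensive loader at most once and caches its result.
// (Hypothetical wrapper for illustration, not part of go-sentex.)
type lazy[T any] struct {
	once sync.Once
	load func() (T, error)
	val  T
	err  error
}

// get returns the cached value, invoking the loader on first use only.
func (l *lazy[T]) get() (T, error) {
	l.once.Do(func() { l.val, l.err = l.load() })
	return l.val, l.err
}

func main() {
	calls := 0
	// Stand-in loader; a real program would call sentex.LoadModel here.
	model := &lazy[string]{load: func() (string, error) {
		calls++
		return "model", nil
	}}

	model.get()
	model.get()
	fmt.Println("loader ran", calls, "time(s)") // prints "loader ran 1 time(s)"
}
```

Note that this wrapper also caches a failed load, so a program that wants to retry after a network error would need a different strategy.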

func (*Model) Dimensions

func (m *Model) Dimensions() int

Dimensions returns the length of every embedding vector produced by this model. For all-MiniLM-L6-v2 that is always 384.

func (*Model) Embed

func (m *Model) Embed(text string) ([]float32, error)

Embed returns a 384-dimensional unit-norm embedding for the given text. An empty string returns a length-384 all-zero vector with a nil error.

func (*Model) EmbedBatch

func (m *Model) EmbedBatch(texts []string) ([][]float32, error)

EmbedBatch returns one embedding per element of texts, in the same order. Empty strings yield all-zero length-384 vectors; non-empty entries yield L2-normalized length-384 vectors. A nil or empty slice returns [][]float32{} with a nil error.
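Since empty inputs come back as all-zero vectors rather than errors, downstream code may want to detect and skip them. A zero vector scores exactly 0 against any query, which would rank it above genuinely dissimilar texts with negative scores. The following sketch shows one way to do that; isZero and bestMatch are hypothetical helpers, not part of go-sentex, and the short vectors stand in for EmbedBatch output.

```go
package main

import "fmt"

// isZero reports whether every component of v is 0, which is how
// EmbedBatch represents an empty input string.
func isZero(v []float32) bool {
	for _, x := range v {
		if x != 0 {
			return false
		}
	}
	return true
}

// dot returns the dot product of two equal-length vectors.
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// bestMatch returns the index of the candidate most similar to query,
// skipping all-zero and wrong-length candidates. It returns -1 if no
// candidate is usable.
// (Hypothetical helper for illustration, not part of go-sentex.)
func bestMatch(query []float32, candidates [][]float32) int {
	best, bestScore := -1, float32(0)
	for i, c := range candidates {
		if len(c) != len(query) || isZero(c) {
			continue
		}
		if s := dot(query, c); best == -1 || s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	query := []float32{1, 0}
	candidates := [][]float32{
		{0, 0},     // what an empty string would produce
		{0, 1},     // orthogonal to the query
		{0.8, 0.6}, // closest to the query
	}
	fmt.Println("best match index:", bestMatch(query, candidates)) // prints "best match index: 2"
}
```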
