sentex

Version: v0.1.0 (latest) | Published: Apr 18, 2026 | License: MIT | Imports: 19 | Imported by: 0

Repository

github.com/edgetools/go-sentex

README

go-sentex

Pure Go sentence embeddings. Zero CGo, zero external system dependencies, one go get.

go-sentex wraps the sentence-transformers/all-MiniLM-L6-v2 transformer behind a tiny API: give it a string, get back a 384-dimensional unit-norm vector suitable for semantic search, RAG, clustering, or deduplication.

Why it exists

Every other sentence-embedding option in the Go ecosystem either requires CGo (ONNX Runtime, fastembed-go, all-minilm-l6-v2-go) or a Python sidecar, or settles for word-vector approximations. go-sentex fills the gap: a single dependency that builds with CGO_ENABLED=0, cross-compiles cleanly, and needs no C toolchain on the host.

Install

go get github.com/edgetools/go-sentex

No system libraries. No apt install. Builds with CGO_ENABLED=0.

Quick start

package main

import (
	"fmt"
	"log"

	"github.com/edgetools/go-sentex"
)

func main() {
	model, err := sentex.LoadModel()
	if err != nil {
		log.Fatal(err)
	}

	vec, err := model.Embed("deployment strategy")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(vec), "dims, first value:", vec[0])

	vecs, err := model.EmbedBatch([]string{"text one", "text two"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(vecs), "vectors of", model.Dimensions(), "dims")
}

LoadModel loads model.onnx (~86MB) and tokenizer.json (~700KB) from the local HuggingFace Hub cache. If the files are already there — for example because you've used this model before from Python's sentence-transformers — no download happens. Otherwise they are fetched from HuggingFace Hub once and reused on every subsequent call.

API

sentex.LoadModel() (*Model, error) — load (and download if needed).
model.Embed(text string) ([]float32, error) — one vector, length 384.
model.EmbedBatch(texts []string) ([][]float32, error) — one vector per input, preserving order. Empty strings yield all-zero vectors.
model.Dimensions() int — always 384.

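Output vectors for non-empty inputs are L2-normalized, so cosine similarity reduces to a dot product. The all-zero vectors returned for empty strings are the exception: they are not unit norm, and their dot product with any other vector is 0.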
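
As a minimal sketch of comparing two unit-norm embeddings with a plain dot product: the helper name Dot is hypothetical and not part of the go-sentex API, and the two short vectors are hand-written stand-ins for real 384-dimensional output from model.Embed.

```go
package main

import "fmt"

// Dot is a hypothetical helper (not part of go-sentex). It returns the dot
// product of two equal-length vectors. When both inputs are L2-normalized,
// this value equals their cosine similarity.
func Dot(a, b []float32) (float32, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("length mismatch: %d vs %d", len(a), len(b))
	}
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum, nil
}

func main() {
	// Hand-written unit vectors standing in for model.Embed output.
	a := []float32{0.6, 0.8}
	b := []float32{1, 0}

	sim, err := Dot(a, b)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.2f\n", sim) // prints 0.60
}
```

With real embeddings the same function applies unchanged; only the vector length differs.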

Model


Property	Value
Model	`sentence-transformers/all-MiniLM-L6-v2`
Format	ONNX (full precision)
Output	384-dimensional `[]float32`, unit norm
Max input	256 tokens (longer inputs truncated)
Download size	~87MB, only if not already in the HF cache

Cache location

Model files are stored in the standard HuggingFace Hub cache layout, so if you've already pulled this model with Python's sentence-transformers or huggingface_hub, go-sentex picks it up with zero download.

HF_HOME is respected if set. Otherwise the base directory comes from os.UserCacheDir:

OS	Default path
Linux	`~/.cache/huggingface`
macOS	`~/Library/Caches/huggingface`
Windows	`%LocalAppData%\huggingface`

Model files land under <cache>/hub/models--sentence-transformers--all-MiniLM-L6-v2/.

Requirements

Go 1.25 or newer (see go.mod).
Network access the first time the model is fetched into the HF cache. No network is needed once the cache is populated (including when another tool like Python's sentence-transformers populated it).
Works with CGO_ENABLED=0.

Limitations

Inputs longer than 256 tokens are truncated by the tokenizer.
The SimpleGo inference backend is roughly 5× slower than XLA. That rarely matters for query-scale embedding, but it is worth knowing if you need to embed millions of documents in-process.
If the HF cache is cold, the first LoadModel call downloads ~87MB over the network before returning.

Architecture

See DESIGN.md for the inference pipeline, concurrency model, cache layout, and choice of underlying libraries.

License

MIT. See LICENSE.

Documentation

Overview

Package sentex provides pure-Go sentence embeddings via the all-MiniLM-L6-v2 transformer model. Zero CGo, zero external system dependencies.

Types

type Model

type Model struct {
	// contains filtered or unexported fields
}

Model holds the loaded sentence-embedding model. The internal fields are unexported; callers interact exclusively through the public methods.

func LoadModel

func LoadModel() (*Model, error)

LoadModel returns a ready-to-use Model. On the first call the model files are downloaded to the local HuggingFace Hub cache if they are not already present. Progress lines are written to os.Stderr during the download; nothing is printed once the cache is warm.

func (*Model) Dimensions

func (m *Model) Dimensions() int

Dimensions returns the length of every embedding vector produced by this model. For all-MiniLM-L6-v2 that is always 384.

func (*Model) Embed

func (m *Model) Embed(text string) ([]float32, error)

Embed returns a 384-dimensional unit-norm embedding for the given text. An empty string returns a length-384 all-zero vector with a nil error.

func (*Model) EmbedBatch

func (m *Model) EmbedBatch(texts []string) ([][]float32, error)

EmbedBatch returns one embedding per element of texts, in the same order. Empty strings yield all-zero length-384 vectors; non-empty entries yield L2-normalized length-384 vectors. A nil or empty slice returns [][]float32{} with a nil error.
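
Because empty strings come back as all-zero vectors rather than unit-norm ones, callers may want to skip them before running similarity comparisons. A minimal sketch, assuming the caller already holds the [][]float32 returned by EmbedBatch; isZero and nonZeroIndexes are hypothetical helpers, not part of the go-sentex API, and the short vectors are hand-written stand-ins.

```go
package main

import "fmt"

// isZero reports whether every component of v is exactly zero, which is how
// the documentation says embeddings of empty strings are represented.
func isZero(v []float32) bool {
	for _, x := range v {
		if x != 0 {
			return false
		}
	}
	return true
}

// nonZeroIndexes returns the positions of vectors that are not all-zero,
// preserving input order so results can be mapped back to the source texts.
func nonZeroIndexes(vecs [][]float32) []int {
	keep := make([]int, 0, len(vecs))
	for i, v := range vecs {
		if !isZero(v) {
			keep = append(keep, i)
		}
	}
	return keep
}

func main() {
	// Stand-ins for EmbedBatch output; the middle entry mimics an empty string.
	vecs := [][]float32{{0.6, 0.8}, {0, 0}, {1, 0}}
	fmt.Println(nonZeroIndexes(vecs)) // prints [0 2]
}
```

Returning indexes instead of a filtered slice keeps the one-to-one correspondence with the original texts slice.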
