The shim package provides a compatibility layer that allows the helium XML toolkit to be used as a drop-in replacement for Go's standard encoding/xml library. It implements the common Marshal, Unmarshal, MarshalIndent, NewEncoder, and NewDecoder APIs, while leveraging the helium parser and DOM engine for XML processing.
The primary goal of the shim is to enable existing Go projects to migrate seamlessly to helium for improved XML spec compliance and better compatibility with libxml2-style parsing, all without rewriting existing XML marshaling/unmarshaling code. It respects Go's standard xml struct tags such as xml:"name,attr" and correctly maps structs to and from the helium internal XML tree model.
- Marshal, Unmarshal, MarshalIndent, NewEncoder, and NewDecoder, matching the Go encoding/xml standard library pattern.
- Conversion of helium parse errors to encoding/xml.SyntaxError, for compatibility with error handling written for the standard library.
- [...] helium.

Sources: shim/shim.go14-18 shim/compat_errors.go13-20
The shim package builds a bridge between the helium DOM/parser and Go's reflection-based XML marshaling/unmarshaling mechanism.
Unmarshal flow:

1. The caller invokes Unmarshal(data []byte, v interface{}) shim/shim.go30
2. The shim creates a helium.Parser via helium.NewParser() and parses the XML into a helium.Document DOM tree shim/shim.go36-40
3. The resulting tree is mapped onto the target value by reflection, guided by xml struct tags shim/unmarshal.go100

This approach relies on the helium XML parser and DOM, which provide full namespace support.

Marshal flow:

1. The caller invokes Marshal(v interface{}) ([]byte, error) shim/shim.go22
2. The shim builds a helium.Document and populates helium.Element and helium.Attribute nodes accordingly shim/marshal.go30
3. The document is serialized with helium.Write() or the stream package's writer shim/marshal.go30

This produces XML output consistent with libxml2-style serialization.
Sources: shim/shim.go22-40 shim/compat_errors.go13-20
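
The reflection step of the unmarshal flow can be illustrated with a small, self-contained sketch. The `node` type and `decodeInto` function below are hypothetical stand-ins invented for this example; they are not helium's DOM types or the shim's actual code, and the sketch handles only exported string fields with `name` and `name,attr` tags.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// node is a hypothetical, minimal stand-in for a DOM element.
// It is NOT helium's element type; it only illustrates the idea.
type node struct {
	Name     string
	Attrs    map[string]string
	Text     string
	Children []*node
}

// book is the target struct, annotated with standard xml struct tags.
type book struct {
	ID    string `xml:"id,attr"`
	Title string `xml:"title"`
}

// decodeInto copies attributes and child-element text from n into the
// string fields of the struct that v points to, guided by xml struct tags.
func decodeInto(n *node, v any) error {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Pointer || rv.IsNil() || rv.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("decodeInto: want non-nil pointer to struct, got %T", v)
	}
	rv = rv.Elem()
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		field := rt.Field(i)
		if !field.IsExported() || field.Type.Kind() != reflect.String {
			continue // this sketch only handles exported string fields
		}
		tag, ok := field.Tag.Lookup("xml")
		if !ok {
			continue
		}
		name, opts, _ := strings.Cut(tag, ",")
		if name == "" {
			name = field.Name
		}
		if opts == "attr" {
			// Tag of the form `xml:"name,attr"`: read from the attribute map.
			if val, found := n.Attrs[name]; found {
				rv.Field(i).SetString(val)
			}
			continue
		}
		// Plain tag: take the text of the first child element with that name.
		for _, child := range n.Children {
			if child.Name == name {
				rv.Field(i).SetString(child.Text)
				break
			}
		}
	}
	return nil
}

func main() {
	tree := &node{
		Name:     "book",
		Attrs:    map[string]string{"id": "42"},
		Children: []*node{{Name: "title", Text: "Go"}},
	}
	var b book
	if err := decodeInto(tree, &b); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", b)
}
```

A real implementation would also need to handle nested structs, slices, namespaces, and non-string field types, which this sketch deliberately omits.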
| Function | Description |
|---|---|
| Marshal(v interface{}) ([]byte, error) | Marshals a Go struct to XML using helium serialization. |
| MarshalIndent(v interface{}, prefix, indent string) ([]byte, error) | Like Marshal but with indentation formatting. |
| Unmarshal(data []byte, v interface{}) error | Parses XML bytes into a Go struct using helium parser and reflection. |
| NewEncoder(w io.Writer) *Encoder | Returns a new Encoder writing to w. |
| NewDecoder(r io.Reader) *Decoder | Returns a new Decoder reading from r. |
Sources: shim/shim.go22-50
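
Because these functions mirror encoding/xml, code written against the standard library should in principle need only an import change to use the shim. The sketch below is written against encoding/xml so that it runs as-is; the shim's import path is not given on this page, so swapping it in is left as an assumption marked in the comment. The `Book` type and `roundTrip` helper are illustrative names.

```go
package main

import (
	// Standard library import. Per the description above, the shim is
	// intended to be substituted here; its import path is not shown on
	// this page, so it is not spelled out.
	"encoding/xml"
	"fmt"
)

// Book uses ordinary xml struct tags, including the "name,attr" form.
type Book struct {
	XMLName xml.Name `xml:"book"`
	ID      string   `xml:"id,attr"`
	Title   string   `xml:"title"`
}

// roundTrip marshals a Book to XML and unmarshals it back again.
func roundTrip(in Book) (Book, []byte, error) {
	data, err := xml.Marshal(in)
	if err != nil {
		return Book{}, nil, err
	}
	var out Book
	if err := xml.Unmarshal(data, &out); err != nil {
		return Book{}, nil, err
	}
	return out, data, nil
}

func main() {
	out, data, err := roundTrip(Book{ID: "42", Title: "Go"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(data))
	fmt.Println(out.ID, out.Title)
}
```

A round-trip check like this is a reasonable first test when migrating, since it exercises both directions of the struct mapping with the same tags.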
The shim's compat_errors.go implements a conversion function, convertParseError(), which maps helium.ErrParseError to encoding/xml.SyntaxError. Callers that depend on the standard library's error types and messages therefore continue to work.
The conversion includes:
| helium ErrParseError context | encoding/xml.SyntaxError equivalent message |
|---|---|
| "invalid name start char" | "expected element name after <" |
| Namespace related errors | Mapped to equivalent standard library phrasing |
This error adaptation layer maintains the encoding/xml error interface contract for client code.
Sources: shim/compat_errors.go13-20
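
The kind of caller code this layer is meant to keep working can be sketched as follows. The example uses encoding/xml directly so that it runs on its own; whether the shim produces the same line numbers and messages for a given input is not verified here. The `syntaxErrorLine` helper is an illustrative name.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
)

// syntaxErrorLine unmarshals data and, if the failure is an
// *xml.SyntaxError, reports the line number recorded in that error.
func syntaxErrorLine(data []byte) (line int, ok bool) {
	var v struct {
		B string `xml:"b"`
	}
	err := xml.Unmarshal(data, &v)
	var se *xml.SyntaxError
	if errors.As(err, &se) {
		return se.Line, true
	}
	return 0, false
}

func main() {
	// <b> is closed by </a>, which is not well-formed.
	if line, ok := syntaxErrorLine([]byte("<a><b></a>")); ok {
		fmt.Println("syntax error on line", line)
	}
	if _, ok := syntaxErrorLine([]byte("<a><b>x</b></a>")); !ok {
		fmt.Println("well-formed input: no syntax error")
	}
}
```

Matching on the error type with errors.As, rather than on message text, is the less fragile choice when the parser underneath may change.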
Despite striving for full compatibility, there are several intentional limitations where helium's shim does not support certain features of Go's native encoding/xml, primarily due to design and spec compliance goals:
No Strict=false Mode
The standard library supports a permissive Strict flag to tolerate some malformed XML. helium is strict and fully conforming due to its libxml2 lineage and does not permit disabling strict parsing.
The helium.NewParser() options do not include a way to relax this helium/parser.go10-20.
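
To see what existing code may be relying on, the following sketch shows the standard library's behavior with Strict set to false. It runs against encoding/xml only; per the limitation above, the lenient branch has no equivalent in the shim. The `decodeText` helper is an illustrative name.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// decodeText decodes a single element and returns its character data,
// using the standard library decoder in strict or lenient mode.
func decodeText(input string, strict bool) (string, error) {
	d := xml.NewDecoder(strings.NewReader(input))
	d.Strict = strict
	var p struct {
		Text string `xml:",chardata"`
	}
	if err := d.Decode(&p); err != nil {
		return "", err
	}
	return p.Text, nil
}

func main() {
	// &eacute; is an HTML entity, not one of XML's predefined entities.
	const input = `<p>caf&eacute;</p>`

	_, err := decodeText(input, true)
	fmt.Println("strict mode failed:", err != nil)

	_, err = decodeText(input, false)
	fmt.Println("lenient mode failed:", err != nil)
}
```

Code that sets Strict to false in order to accept input like this would need its inputs cleaned up, or a different parsing path, before moving to the shim.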
No HTMLAutoClose Option
The shim does not auto-close HTML tags. Instead, HTML is handled by a separate, dedicated html package with its own parser html/html.go10. This matches libxml2's separation of concerns.
Stricter Entity Limits and Security Guards
The underlying helium parser includes safeguards against entity amplification (to prevent XML denial-of-service attacks) via BlockXXE() helium/parser.go10-20. These limits may be stricter than encoding/xml's defaults unless the user explicitly relaxes them through parser options.
Attribute Ordering Differences
Serialized output of XML attributes may differ in order, especially around xmlns namespace declarations, which are emitted before regular attributes in helium to align with libxml2 style.
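
Tests that compare serialized XML byte for byte are the most likely to be affected by this. One possible workaround, sketched here with encoding/xml only, is to compare the root element's attributes as a set rather than as text. The `rootAttrs` and `sameRootAttrs` helpers are illustrative names and are not part of the shim.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"reflect"
	"strings"
)

// rootAttrs returns the attributes of the first start element in doc,
// keyed by "namespace local-name", so that ordering does not matter.
func rootAttrs(doc string) (map[string]string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := d.Token()
		if err != nil {
			return nil, err // includes io.EOF when no element is found
		}
		if se, ok := tok.(xml.StartElement); ok {
			attrs := make(map[string]string, len(se.Attr))
			for _, a := range se.Attr {
				attrs[a.Name.Space+" "+a.Name.Local] = a.Value
			}
			return attrs, nil
		}
	}
}

// sameRootAttrs reports whether two documents have the same root
// attributes regardless of the order in which they were written.
func sameRootAttrs(a, b string) (bool, error) {
	attrsA, err := rootAttrs(a)
	if err != nil {
		return false, err
	}
	attrsB, err := rootAttrs(b)
	if err != nil {
		return false, err
	}
	return reflect.DeepEqual(attrsA, attrsB), nil
}

func main() {
	same, err := sameRootAttrs(
		`<a id="1" xmlns:p="urn:x" p:k="v"/>`,
		`<a xmlns:p="urn:x" id="1" p:k="v"/>`,
	)
	fmt.Println(same, err)
}
```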
Approximate InputOffset Reporting
The InputOffset reported when an error occurs may not match standard encoding/xml byte for byte. It reflects helium's internal parse offsets and is only approximate at the line/column level.
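
For reference, this is how a caller typically reads the offset from the standard library decoder after a failure. The sketch uses encoding/xml only; it assumes nothing about the value the shim would report for the same input. The `offsetAtStop` helper is an illustrative name.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// offsetAtStop reads tokens until the input ends or an error occurs and
// returns the decoder's input offset at that point. A nil error means
// the whole document was consumed.
func offsetAtStop(doc string) (int64, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	for {
		_, err := d.Token()
		if err == io.EOF {
			return d.InputOffset(), nil
		}
		if err != nil {
			return d.InputOffset(), err
		}
	}
}

func main() {
	off, err := offsetAtStop("<a><b></a>")
	fmt.Printf("stopped at byte offset %d, error: %v\n", off, err)
}
```

Code that uses this value only for human-readable diagnostics should be unaffected; code that slices the input at the reported offset deserves a closer look during migration.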
Sources: helium/parser.go10-20 html/html.go10
The diagram below demonstrates the step-by-step internal data processing when unmarshaling XML data using the shim package:
Sources: shim/shim.go30-40 shim/unmarshal.go100
This shim package enables incrementally adopting the robust, libxml2-compatible helium XML toolkit in existing Go projects with minimal code changes, providing improved parsing correctness and extensibility while preserving the familiar standard encoding/xml API contract. It is especially suited for projects requiring strict XML conformance or seamless interop with libxml2-driven workflows.