This page documents the handling of document attachments during conversion between GOBL and UBL formats. Attachments represent supplementary documents such as timesheets, specifications, or supporting files that accompany an invoice.
The gobl.ubl library supports two distinct types of attachments: external reference attachments and embedded binary attachments. These types are handled differently due to structural differences between GOBL and UBL data models.
For information about other document references such as order references and contract references, see Ordering References.
UBL 2.1 supports two methods of including attachments within an invoice document:
| Attachment Type | UBL Element | GOBL Support | Conversion Method |
|---|---|---|---|
| External Reference | cac:ExternalReference | Full support via org.Attachment | Automatic bidirectional conversion |
| Embedded Binary | cbc:EmbeddedDocumentBinaryObject | Manual extraction/addition required | Separate API calls |
External reference attachments contain a URI pointing to a document location, along with optional metadata such as MIME type, hash digest, and filename. These are fully supported in GOBL's org.Attachment structure and convert automatically in both directions.
Embedded binary attachments contain the actual document data encoded as base64 within the XML. Since GOBL does not support embedded binary data directly in its invoice model, these must be extracted and added using dedicated functions.
Sources: attachments_parse.go12-29 attachments.go10-38 invoice_parse.go154
The UBL Attachment structure can contain either an ExternalReference or an EmbeddedDocumentBinaryObject, but not both. These are stored within AdditionalDocumentReference elements on the invoice.
Sources: attachments.go9-38
The BinaryAttachment struct provides a simplified interface for working with embedded binary attachments:
This structure is used by both Invoice.AddBinaryAttachment() and Invoice.ExtractBinaryAttachments() to provide a consistent API that abstracts away the base64 encoding details.
Sources: attachments_parse.go12-29
External reference attachments from GOBL's org.Attachment structure are automatically converted during the standard invoice conversion process:
The addAttachments() function iterates through GOBL attachments and creates corresponding UBL Reference structures with ExternalReference elements.
Sources: attachments.go40-75
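The mapping described above can be sketched as follows. This is a minimal illustration, not the library's code: `goblAttachment`, `externalReference`, `documentReference`, and `toExternalReferences` are hypothetical stand-ins for the GOBL and UBL types, their field names are assumptions, and skipping entries without a URL is an assumption as well.

```go
package main

import "fmt"

// goblAttachment is a hypothetical, simplified stand-in for GOBL's
// org.Attachment. Field names are assumptions for illustration.
type goblAttachment struct {
	Key         string
	Name        string // filename
	Description string
	URL         string
	MIME        string
}

// externalReference and documentReference are hypothetical stand-ins
// for the UBL structures that carry an external reference.
type externalReference struct {
	URI      string
	MimeCode string
	FileName string
}

type documentReference struct {
	ID          string
	Description string
	External    *externalReference
}

// toExternalReferences builds one reference per attachment that has a URL.
// Attachments without a URL are skipped (an assumption of this sketch).
func toExternalReferences(atts []goblAttachment) []documentReference {
	refs := make([]documentReference, 0, len(atts))
	for _, a := range atts {
		if a.URL == "" {
			continue
		}
		refs = append(refs, documentReference{
			ID:          a.Key,
			Description: a.Description,
			External: &externalReference{
				URI:      a.URL,
				MimeCode: a.MIME,
				FileName: a.Name,
			},
		})
	}
	return refs
}

func main() {
	refs := toExternalReferences([]goblAttachment{
		{Key: "timesheet", Name: "timesheet.pdf", URL: "https://example.com/timesheet.pdf", MIME: "application/pdf"},
		{Key: "no-url"},
	})
	for _, r := range refs {
		fmt.Println(r.ID, r.External.URI)
	}
}
```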
Binary attachments must be added explicitly using the AddBinaryAttachment() method on the UBL Invoice object:
The method automatically performs base64 encoding of the binary data and creates the appropriate UBL structures. This allows applications to embed documents like PDFs directly within the UBL XML output.
Sources: attachments.go77-119 attachments_test.go32-72
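As a rough sketch of what adding an embedded attachment involves, the example below encodes raw bytes and appends them to a list of document references. All types here (`binaryAttachment`, `embeddedObject`, `documentReference`, `invoice`) and the `addBinary` method are hypothetical and self-contained. They are not the signature of the real AddBinaryAttachment() method, and the empty-data check is an assumption.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// binaryAttachment is a hypothetical input type holding raw bytes.
type binaryAttachment struct {
	ID          string
	Description string
	MimeCode    string
	Filename    string
	Data        []byte
}

// embeddedObject is a hypothetical stand-in for the element that
// carries the base64 text plus its metadata.
type embeddedObject struct {
	MimeCode string
	Filename string
	Value    string // base64-encoded content
}

type documentReference struct {
	ID          string
	Description string
	Embedded    *embeddedObject
}

type invoice struct {
	AdditionalDocumentReference []documentReference
}

// addBinary encodes the raw data and appends a new reference.
// Rejecting empty data is an assumption of this sketch.
func (inv *invoice) addBinary(att binaryAttachment) error {
	if len(att.Data) == 0 {
		return errors.New("attachment has no data")
	}
	inv.AdditionalDocumentReference = append(inv.AdditionalDocumentReference, documentReference{
		ID:          att.ID,
		Description: att.Description,
		Embedded: &embeddedObject{
			MimeCode: att.MimeCode,
			Filename: att.Filename,
			Value:    base64.StdEncoding.EncodeToString(att.Data),
		},
	})
	return nil
}

func main() {
	inv := &invoice{}
	err := inv.addBinary(binaryAttachment{
		ID:       "doc-1",
		MimeCode: "text/plain",
		Filename: "hello.txt",
		Data:     []byte("hello"),
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(inv.AdditionalDocumentReference[0].Embedded.Value)
}
```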
External reference attachments are automatically extracted during UBL parsing and added to the GOBL invoice's Attachments field:
The goblAddAttachments() function processes only external reference attachments. Binary attachments are intentionally skipped during this phase and must be extracted separately.
Sources: attachments_parse.go34-78 invoice_parse.go154
Binary attachments must be extracted using the ExtractBinaryAttachments() method:
This function returns a slice of BinaryAttachment structures containing the decoded binary data.
Sources: attachments_parse.go80-130 attachments_parse_test.go35-62
Binary data encoding and decoding follows this process:
The encoding process is straightforward, using Go's standard encoding/base64 package. The decoding process includes an additional whitespace removal step because XML formatting can introduce newlines and spaces within the base64 content.
The whitespace removal regex pattern matches one or more whitespace characters and replaces them with an empty string:
Sources: attachments.go92 attachments_parse.go94-96
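The encode and decode steps can be illustrated with the standard library alone. The helper names `encode` and `decode` are hypothetical, and the `\s+` pattern is written from the description above rather than copied from the source. Go's decoder already ignores `\r` and `\n`, but it rejects spaces and tabs, which is why stripping all whitespace first is useful for indented XML content.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"regexp"
)

// whitespace matches one or more whitespace characters.
var whitespace = regexp.MustCompile(`\s+`)

// encode returns the standard base64 text for the given bytes.
func encode(data []byte) string {
	return base64.StdEncoding.EncodeToString(data)
}

// decode strips all whitespace from the content and then decodes it.
func decode(content string) ([]byte, error) {
	compact := whitespace.ReplaceAllString(content, "")
	return base64.StdEncoding.DecodeString(compact)
}

func main() {
	// Base64 text as it might appear after XML pretty-printing.
	wrapped := "aGVsbG8g\n    d29ybGQ=\n"
	data, err := decode(wrapped)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%s\n", data)
	fmt.Println(encode(data))
}
```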
When converting a GOBL invoice that contains org.Attachment entries:
The attachments are automatically included in the output UBL XML as external references.
Sources: attachments.go40-75 test/data/convert/invoice-attachments.json277-285
For adding binary content directly into UBL:
This pattern is used when you need to embed documents like PDFs, images, or other binary files directly in the UBL XML.
Sources: attachments.go77-119 attachments_test.go32-72
When parsing UBL that contains both external references and binary attachments:
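This two-step approach is necessary because GOBL's org.Attachment model holds external references only: embedded binary data has no place in the converted GOBL invoice, so it has to be read from the UBL document in a separate step.
Sources: attachments_parse.go34-53 attachments_parse.go80-130 invoice_parse.go30-54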
Attachments are stored in the UBL AdditionalDocumentReference array, which also contains other document references. The system distinguishes between attachment references and other types (like ordering references) based on the presence of an Attachment element:
| Document Type | Stored In | Identifying Field |
|---|---|---|
| External Attachment | AdditionalDocumentReference | Attachment.ExternalReference != nil |
| Binary Attachment | AdditionalDocumentReference | Attachment.EmbeddedDocumentBinaryObject != nil |
| Ordering Reference (Type 130) | AdditionalDocumentReference | DocumentTypeCode == "130" |
| Other References | AdditionalDocumentReference | No Attachment element |
Sources: attachments_parse.go34-53 ordering_parse.go87-105
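The table can be read as a small classification routine. The sketch below uses hypothetical local types that borrow the field names from the table. The order of the checks (attachment first, then document type code) is an assumption, not something the source states.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the UBL structures.
type externalRef struct{ URI string }

type embeddedObj struct{ Value string }

type attachment struct {
	ExternalReference            *externalRef
	EmbeddedDocumentBinaryObject *embeddedObj
}

type reference struct {
	ID               string
	DocumentTypeCode string
	Attachment       *attachment
}

// classify applies the identifying fields from the table, in order.
func classify(r reference) string {
	switch {
	case r.Attachment != nil && r.Attachment.ExternalReference != nil:
		return "external attachment"
	case r.Attachment != nil && r.Attachment.EmbeddedDocumentBinaryObject != nil:
		return "binary attachment"
	case r.DocumentTypeCode == "130":
		return "ordering reference"
	default:
		return "other reference"
	}
}

func main() {
	refs := []reference{
		{ID: "A", Attachment: &attachment{ExternalReference: &externalRef{URI: "https://example.com/a.pdf"}}},
		{ID: "B", Attachment: &attachment{EmbeddedDocumentBinaryObject: &embeddedObj{Value: "aGVsbG8="}}},
		{ID: "C", DocumentTypeCode: "130"},
		{ID: "D"},
	}
	for _, r := range refs {
		fmt.Println(r.ID, classify(r))
	}
}
```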
The parsing process includes special handling for malformed XML characters. The cleanString() utility function removes Unicode replacement characters (U+FFFD) that may appear in badly-encoded XML documents:
This ensures that attachment descriptions and filenames parse correctly even when the source UBL contains encoding issues.
Sources: attachments_parse.go74-75 utils.go14-16
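Removing U+FFFD from a string needs only the standard library. The function below is a minimal sketch under a hypothetical name, `stripReplacementChars`; the library's actual cleanString() may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// stripReplacementChars removes every Unicode replacement character
// (U+FFFD) from s and leaves all other characters untouched.
func stripReplacementChars(s string) string {
	return strings.ReplaceAll(s, "\uFFFD", "")
}

func main() {
	// A filename as it might look after a badly-encoded byte was
	// replaced during XML decoding.
	fmt.Println(stripReplacementChars("time\uFFFDsheet.pdf"))
}
```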
The test data includes an example invoice with an external reference attachment. In GOBL format:
This converts to UBL as:
Sources: test/data/convert/invoice-attachments.json277-285 attachments_test.go13-30