Documentation
¶
Index ¶
- Variables
- type Cardinality
- type CustomField
- type FieldType
- type Opt
- type Option
- type Schema
- func (s *Schema[T]) Append(value T)
- func (s *Schema[T]) AppendWithCustom(value T, c ...any) error
- func (s *Schema[T]) Clone(mem memory.Allocator) (schema *Schema[T], err error)
- func (s *Schema[T]) FieldNames() []string
- func (s *Schema[T]) NewNormalizerRecord() arrow.Record
- func (s *Schema[T]) NewRecord() arrow.Record
- func (s *Schema[T]) NormalizerBuilder() *array.RecordBuilder
- func (s *Schema[T]) Parquet() *schema.Schema
- func (s *Schema[T]) Proto(r arrow.Record, rows []int) []T
- func (s *Schema[T]) ReadParquet(ctx context.Context, r parquet.ReaderAtSeeker, columns []int) (arrow.Record, error)
- func (s *Schema[T]) Release()
- func (s *Schema[T]) Schema() *arrow.Schema
- func (s *Schema[T]) WriteParquet(w io.Writer) error
- func (s *Schema[T]) WriteParquetRecords(w io.Writer, records ...arrow.Record) error
Constants ¶
This section is empty.
Variables ¶
var ErrMxDepth = errors.New("max depth reached, either the message is deeply nested or a circular dependency was introduced")
var ErrPathNotFound = errors.New("path not found")
Functions ¶
This section is empty.
Types ¶
type Cardinality ¶ added in v0.5.1
type Cardinality protoreflect.Cardinality
Cardinality determines whether a field is optional, required, or repeated.
const ( Optional Cardinality = 1 // appears zero or one times Required Cardinality = 2 // appears exactly one time; invalid with Proto3 Repeated Cardinality = 3 // appears zero or more times )
Constants as defined by the google.protobuf.Cardinality enumeration.
func (*Cardinality) Get ¶ added in v0.5.1
func (c *Cardinality) Get() protoreflect.Cardinality
type CustomField ¶ added in v0.5.1
type CustomField struct {
// Name must not conflict with existing proto.Message field names.
Name string `toml:"name"`
// Supported types:
// BOOL bool
// BYTES []byte
// STRING string
// INT64 int64
// FLOAT64 float64
Type FieldType `toml:"type"`
// FieldCardinality is a type alias of protoreflect.Cardinality.
// Cardinality determines whether a field is optional, required, or repeated.
// const (
// Optional Cardinality = 1 // appears zero or one times
// Required Cardinality = 2 // appears exactly one time; invalid with Proto3
// Repeated Cardinality = 3 // appears zero or more times
// )
// Constants as defined by the google.protobuf.Cardinality enumeration.
FieldCardinality Cardinality `toml:"field_cardinality"`
// IsPacked reports whether repeated primitive numeric kinds should be
// serialized using a packed encoding.
// If true, then it implies Cardinality is Repeated.
IsPacked bool `toml:"is_packed"`
}
type Option ¶ added in v0.5.0
type Option func(config)
func WithCustomFields ¶ added in v0.5.1
func WithCustomFields(c []CustomField) Option
WithCustomFields adds user-defined fields to the message schema which can be populated with AppendWithCustom().
func WithNormalizer ¶ added in v0.5.0
WithNormalizer configures the scalars to add to a flat Arrow Record suitable for efficient aggregation. Fields should be specified by their path (field names separated by a period ie. 'field1.field2.field3'). The Arrow field types of the selected fields will be used to build the new schema. If coaslescing data between multiple fields of the same type, specify only one of the paths. List fields should have an index to retrieve specified, otherwise defaults to all elements; ranges are not yet implemented. Current functionality is limited to valitating the fields/aliases match in `New()“, and `NormalizerBuilder()` returning an `*arrow.RecordBuilder` to be used externally to append data, and NewNormalizerRecord() to get an `arrow.Record` from the normalizer RecordBuilder. Future development may include Append methods that accept protopath operations to normalize protobuf messages in-flight internally to the package. failOnRangeError indicates whether to fail on a list[start:end] where end > len(list). TODO
type Schema ¶
func New ¶
New returns a new bufarrow.Schema. Options include WithNormalizer and WithCustomFields. WithNormalizer creates a separate Arrow record whilst WithCustomFields expands the schema of the proto.Message used as the type parameter.
func (*Schema[T]) Append ¶
func (s *Schema[T]) Append(value T)
Append appends protobuf value to the schema builder. This method is not safe for concurrent use.
func (*Schema[T]) AppendWithCustom ¶ added in v0.5.1
AppendWithCustom appends protobuf value and custom field values to the schema builder. This method is not safe for concurrent use. The number of custom field values must match the number of custom fields. Supported types:
bool []byte string int64 float64
func (*Schema[T]) Clone ¶ added in v0.4.0
Clone returns an identical bufarrow.Schema. Use in concurrency scenarios as Schema methods are not concurrency safe.
func (*Schema[T]) FieldNames ¶ added in v0.3.0
FieldNames returns top-level field names
func (*Schema[T]) NewNormalizerRecord ¶ added in v0.5.0
NewNormalizerRecord returns buffered builder value as an arrow.Record. The builder is reset and can be reused to build new records.
func (*Schema[T]) NewRecord ¶
NewRecord returns buffered builder value as an arrow.Record. The builder is reset and can be reused to build new records.
func (*Schema[T]) NormalizerBuilder ¶ added in v0.5.0
func (s *Schema[T]) NormalizerBuilder() *array.RecordBuilder
NormalizerBuilder returns the Normalizer's Arrow array.RecordBuilder, to be used to append normalized data.
func (*Schema[T]) ReadParquet ¶
func (s *Schema[T]) ReadParquet(ctx context.Context, r parquet.ReaderAtSeeker, columns []int) (arrow.Record, error)
ReadParquet specified columns from parquet source r and returns an Arrow record. The returned record must be released by the caller.
func (*Schema[T]) Release ¶
func (s *Schema[T]) Release()
Release releases the reference on the message builder
func (*Schema[T]) WriteParquet ¶
WriteParquet writes Parquet to an io.Writer