D4S.Indexer.Domain
1.0.21
- .NET CLI: dotnet add package D4S.Indexer.Domain --version 1.0.21
- Package Manager: NuGet\Install-Package D4S.Indexer.Domain -Version 1.0.21
- PackageReference: <PackageReference Include="D4S.Indexer.Domain" Version="1.0.21" />
- Central Package Management (Directory.Packages.props): <PackageVersion Include="D4S.Indexer.Domain" Version="1.0.21" />
- Central Package Management (project file): <PackageReference Include="D4S.Indexer.Domain" />
- Paket CLI: paket add D4S.Indexer.Domain --version 1.0.21
- Script & Interactive: #r "nuget: D4S.Indexer.Domain, 1.0.21"
- File-based apps: #:package D4S.Indexer.Domain@1.0.21
- Cake Addin: #addin nuget:?package=D4S.Indexer.Domain&version=1.0.21
- Cake Tool: #tool nuget:?package=D4S.Indexer.Domain&version=1.0.21
D4S.Indexer
Document indexing library for Azure AI Search: extracts text, generates vector embeddings, and uploads searchable chunks.
Quick start
var indexer = IndexerBuilder.Create("my-index")
    .WithAzureSearch(searchEndpoint, searchKey)
    .WithAzureOpenAI(aoaiEndpoint, aoaiKey, embeddingDeployment, embeddingDimensions)
    .WithLocalFiles("./documents")
    .WithFileMetadataFields()
    .Build();
var result = await indexer.IndexAsync();
See src/Rag/samples/ for working examples (local files, SharePoint, OCR, agentic retrieval).
Architecture
- D4S.Indexer.Domain: entities, abstractions (interfaces)
- D4S.Indexer.Application: orchestration (DocumentIndexerService, DocumentExtractor)
- D4S.Indexer.Infrastructure: Azure implementations, builder, processors, sources
| Interface | Purpose |
|---|---|
| IDocumentSource | Enumerates documents from a data source |
| IDocumentProcessor | Extracts text/metadata from a document |
| IEmbeddingService | Generates vector embeddings |
| ISearchIndexService | Manages the index (CRUD on chunks) |
| ITextChunker | Splits text into chunks |
| IOcrService / IKeywordExtractor | OCR for scans / AI keyword extraction |
Built-in sources: local filesystem, multi-site SharePoint (PnP Core). Built-in processors: PDF, DOCX, XLSX, PPTX, TXT/Markdown.
Indexing modes
- Full (default): all documents are fetched from every source; documents missing from the source list are deleted from the index.
- Delta (.WithDeltaMode()): only changed, new, or deleted documents are provided; deletion is driven by DocumentMetadata.DeletedDate (set it and pass null for GetContentAsync). No implicit cleanup.
Both modes compare LastModifiedDate against the index to skip unchanged documents.
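The difference between the two modes can be sketched as a planning step that sorts documents into "index", "skip", and "delete". This is a minimal illustration, not the library's implementation: SourceDoc, IndexPlan, and IndexPlanner are hypothetical names, and the rule that a document is skipped when the indexed date is equal to or newer than the source date is an assumption about how the LastModifiedDate comparison works.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; they are not part of D4S.Indexer.
public sealed record SourceDoc(
    string Id,
    DateTimeOffset LastModifiedDate,
    DateTimeOffset? DeletedDate = null);

public sealed record IndexPlan(
    IReadOnlyList<string> ToIndex,
    IReadOnlyList<string> ToSkip,
    IReadOnlyList<string> ToDelete);

public static class IndexPlanner
{
    // indexed maps a document id to the LastModifiedDate currently stored in the index.
    public static IndexPlan Plan(
        IEnumerable<SourceDoc> source,
        IReadOnlyDictionary<string, DateTimeOffset> indexed,
        bool deltaMode)
    {
        var toIndex = new List<string>();
        var toSkip = new List<string>();
        var toDelete = new List<string>();
        var seen = new HashSet<string>();

        foreach (var doc in source)
        {
            seen.Add(doc.Id);

            if (deltaMode && doc.DeletedDate is not null)
            {
                // Delta mode: deletion is explicit, signalled by DeletedDate.
                toDelete.Add(doc.Id);
            }
            else if (indexed.TryGetValue(doc.Id, out var stored)
                     && stored >= doc.LastModifiedDate)
            {
                // Both modes: the indexed copy is up to date (assumed comparison).
                toSkip.Add(doc.Id);
            }
            else
            {
                // New or changed document.
                toIndex.Add(doc.Id);
            }
        }

        if (!deltaMode)
        {
            // Full mode: indexed documents that the sources no longer list are removed.
            toDelete.AddRange(indexed.Keys.Where(id => !seen.Contains(id)));
        }

        return new IndexPlan(toIndex, toSkip, toDelete);
    }
}
```

Calling IndexPlanner.Plan with deltaMode set to false removes every indexed id that the source did not list. With deltaMode set to true, an unlisted id is left untouched and only documents carrying a DeletedDate are removed.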
Builder options
IndexerBuilder.Create("index-name")
    // Required
    .WithAzureSearch(endpoint, apiKey)
    .WithAzureOpenAI(endpoint, apiKey, deployment, dimensions)
    // Sources (at least one)
    .WithLocalFiles("./docs") // or: opts => { opts.Path = …; opts.FileExtensions = […]; }
    .WithSharePointMultiSite(spOptions, contextFactory)
    .WithCustomDocumentSource<T>(serviceProvider, serviceKey)
    // Optional
    .WithDeltaMode()
    .WithFileMetadataFields()
    .WithChunkSize(maxSize: 1000, overlap: 200)
    .WithBatchSize(50)
    .WithKeywordExtraction(gptDeployment, maxKeywords: 10)
    .WithAzureDocumentIntelligence(endpoint, apiKey) // OCR
    .WithCustomDocumentProcessor<T>(serviceProvider, serviceKey)
    .ContinueOnError(true)
    .Filter(meta => meta.Extension == ".pdf")
    .ConfigureMetadata(meta => meta with { CustomFields = … })
    .AddCustomField("Status", CustomFieldType.String, filterable: true)
    .AddIndexFieldsFromAttributes<MyModel>()
    .OnProgress(p => Console.WriteLine(p.Phase))
    .WithLogging()
    .Build();
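To make the maxSize and overlap parameters of WithChunkSize concrete, here is a minimal sketch of fixed-size chunking with overlap. OverlapChunker is a hypothetical helper, not the library's ITextChunker implementation. It assumes sizes are counted in characters and ignores word or sentence boundaries, which the real chunker may handle differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of D4S.Indexer.
public static class OverlapChunker
{
    // Splits text into chunks of at most maxSize characters, where each chunk
    // repeats the last `overlap` characters of the previous one.
    public static List<string> Split(string text, int maxSize, int overlap)
    {
        if (maxSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxSize));
        if (overlap < 0 || overlap >= maxSize)
            throw new ArgumentOutOfRangeException(nameof(overlap));

        var chunks = new List<string>();
        int step = maxSize - overlap; // how far the window advances each time

        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(maxSize, text.Length - start);
            chunks.Add(text.Substring(start, length));

            // Stop once a chunk has reached the end of the text.
            if (start + length >= text.Length) break;
        }

        return chunks;
    }
}
```

Under these assumptions, a 2,500-character text split with maxSize 1000 and overlap 200 yields three chunks starting at offsets 0, 800, and 1600.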
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net10.0: No dependencies.
NuGet packages (2)
Showing the top 2 NuGet packages that depend on D4S.Indexer.Domain:
| Package | Downloads |
|---|---|
| D4S.Indexer.Application: Application services and configuration for D4S Indexer. | |
| D4S.Indexer: D4S document indexer for Azure AI Search and RAG workflows. | |
GitHub repositories
This package is not used by any popular GitHub repositories.