Enumerations
The VDS.Common.Collections.Enumerations namespace provides a range of enumerables and enumerators offering advanced enumeration functionality that may be useful in more complex applications.
The EnumerableExtensions class provides extension methods that make these easily accessible in a fluent fashion. The following table lists the extension methods with a brief description of each; the later sections of this page discuss each in more detail.
| Extension Method | Functionality |
|---|---|
| LongTake() | Takes a number of items from another enumerable |
| LongSkip() | Skips a number of items from another enumerable |
| AddIfMissing() | Adds an item if it is missing from the other enumerable |
| AddIfEmpty() | Yields an item if the other enumerable is empty |
| Reduced() | Does streaming distinctness by eliminating adjacent duplicates |
| Top() and TopDistinct() | Yields the top N (optionally distinct) items according to some ordering |
| Bottom() and BottomDistinct() | Yields the bottom N (optionally distinct) items according to some ordering |
| ProbabilisticDistinct() | Does streaming distinctness using [BloomFilters](Bloom Filters) |
LongTake() and LongSkip() are fairly self-explanatory: they are functionally equivalent to the standard .Net Take() and Skip(), except that they allow taking/skipping more records than the int count accepted by the standard methods permits.
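The idea can be sketched as follows. This is a minimal illustration, not the library's implementation: the names `LongTakeSketch` and `LongSkipSketch` are hypothetical, and the use of a `long` count is an assumption based on the method names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the VDS.Common implementation.
public static class LongEnumerableSketch
{
    // Yields at most `count` items; the count is a long (assumed) rather than an int.
    public static IEnumerable<T> LongTakeSketch<T>(this IEnumerable<T> source, long count)
    {
        if (count <= 0) yield break;
        long taken = 0;
        foreach (T item in source)
        {
            yield return item;
            if (++taken >= count) yield break;
        }
    }

    // Discards the first `count` items and yields the rest.
    public static IEnumerable<T> LongSkipSketch<T>(this IEnumerable<T> source, long count)
    {
        long skipped = 0;
        foreach (T item in source)
        {
            if (skipped < count) { skipped++; continue; }
            yield return item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        IEnumerable<int> numbers = Enumerable.Range(1, 10);
        // Skip 1,2,3 and then take the next four items: prints 4,5,6,7
        Console.WriteLine(string.Join(",", numbers.LongSkipSketch(3L).LongTakeSketch(4L)));
    }
}
```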
AddIfMissing() is essentially a conditional Concat(): it first enumerates the inner enumerable and, if the extra item was not seen (based on the configured IEqualityComparer<T>), it then yields the extra item.
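A minimal sketch of this behaviour is shown below. The name `AddIfMissingSketch` and its parameter list are hypothetical and are not taken from the library; the sketch only illustrates the "enumerate, then append if unseen" logic described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the VDS.Common implementation.
public static class AddIfMissingSketchExtensions
{
    public static IEnumerable<T> AddIfMissingSketch<T>(
        this IEnumerable<T> source, T extra, IEqualityComparer<T> comparer)
    {
        bool seen = false;
        foreach (T item in source)
        {
            // Remember whether the extra item has already appeared.
            if (!seen && comparer.Equals(item, extra)) seen = true;
            yield return item;
        }
        // Only append the extra item if it never appeared.
        if (!seen) yield return extra;
    }
}

public static class Program
{
    public static void Main()
    {
        string[] letters = { "a", "b" };
        IEqualityComparer<string> ignoreCase = StringComparer.OrdinalIgnoreCase;
        // "B" equals "b" under the comparer, so nothing is added: prints a,b
        Console.WriteLine(string.Join(",", letters.AddIfMissingSketch("B", ignoreCase)));
        // "c" was never seen, so it is appended: prints a,b,c
        Console.WriteLine(string.Join(",", letters.AddIfMissingSketch("c", ignoreCase)));
    }
}
```

Note that the extra item can only be yielded once the inner enumerable has been fully enumerated, so it always appears last.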
AddIfEmpty() is also essentially a conditional Concat(): it first tries to enumerate the inner enumerable and, if there are any items, yields only those. If the inner enumerable is empty, it instead yields the extra item.
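The logic can be sketched as below. `AddIfEmptySketch` is a hypothetical name used for illustration only; the behaviour resembles the standard `DefaultIfEmpty(defaultValue)` LINQ method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the VDS.Common implementation.
public static class AddIfEmptySketchExtensions
{
    public static IEnumerable<T> AddIfEmptySketch<T>(this IEnumerable<T> source, T extra)
    {
        bool any = false;
        foreach (T item in source)
        {
            any = true;
            yield return item;
        }
        // The inner enumerable produced nothing, so fall back to the extra item.
        if (!any) yield return extra;
    }
}

public static class Program
{
    public static void Main()
    {
        // Non-empty input is passed through unchanged: prints 1,2
        Console.WriteLine(string.Join(",", new[] { 1, 2 }.AddIfEmptySketch(99)));
        // Empty input yields only the extra item: prints 99
        Console.WriteLine(string.Join(",", Enumerable.Empty<int>().AddIfEmptySketch(99)));
    }
}
```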
Reduced() provides a form of streaming distinctness that has very low memory costs (it needs to store only the last seen item). It eliminates adjacent duplicates in the inner enumerable by tracking the last seen item and discarding any subsequent items that are equal to it (based on the configured IEqualityComparer<T>).
This is most useful and effective when you know the enumerable you are operating over is ordered, since equal items are then adjacent and every duplicate is eliminated.
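The following sketch illustrates adjacent-duplicate elimination and why ordering matters. The name `ReducedSketch` is hypothetical and the code is an illustration of the technique rather than the library's own implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the VDS.Common implementation.
public static class ReducedSketchExtensions
{
    public static IEnumerable<T> ReducedSketch<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        bool first = true;
        T last = default(T);
        foreach (T item in source)
        {
            // Yield an item only if it differs from the one seen immediately before it.
            if (first || !comparer.Equals(last, item))
            {
                first = false;
                last = item;
                yield return item;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        IEqualityComparer<int> comparer = EqualityComparer<int>.Default;

        // Unordered input: the trailing 1s are not adjacent to the leading 1s,
        // so a 1 is yielded twice. Prints 1,2,3,1
        int[] unordered = { 1, 1, 2, 2, 2, 3, 1, 1 };
        Console.WriteLine(string.Join(",", unordered.ReducedSketch(comparer)));

        // Ordered input: all duplicates are adjacent, so the result is fully distinct.
        // Prints 1,2,3
        Console.WriteLine(string.Join(",", unordered.OrderBy(x => x).ReducedSketch(comparer)));
    }
}
```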
ProbabilisticDistinct() provides a form of streaming distinctness that has slightly higher memory costs but does not need to store any items directly, since it instead uses a Bloom Filter to determine which items it has already seen.
The effectiveness of this depends on the configuration provided, both in terms of how good a hash function you use and the size of the Bloom Filter you choose.
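To make the trade-off concrete, here is a minimal sketch of Bloom-filter-based distinctness. Everything in it is an assumption made for illustration: the `TinyBloomFilter` type, the `ProbabilisticDistinctSketch` name, the `bitCount` and `hashCount` parameters, and the double-hashing scheme are not taken from the library. Because a Bloom filter can report false positives, an undersized filter or a poor hash function may cause some genuinely new items to be treated as duplicates and dropped.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, deliberately tiny Bloom filter; not the VDS.Common one.
public sealed class TinyBloomFilter<T>
{
    private readonly BitArray _bits;
    private readonly int _hashCount;
    private readonly IEqualityComparer<T> _comparer;

    public TinyBloomFilter(int bitCount, int hashCount, IEqualityComparer<T> comparer)
    {
        _bits = new BitArray(bitCount);
        _hashCount = hashCount;
        _comparer = comparer;
    }

    // Records the item and returns true if it may have been seen before.
    // A false result means the item has definitely not been seen.
    public bool CheckAndAdd(T item)
    {
        unchecked
        {
            uint h1 = (uint)_comparer.GetHashCode(item);
            uint h2 = (h1 * 0x9E3779B1u) | 1u; // derived second hash, forced odd
            bool maybeSeen = true;
            for (int i = 0; i < _hashCount; i++)
            {
                int index = (int)((h1 + (uint)i * h2) % (uint)_bits.Length);
                if (!_bits[index])
                {
                    maybeSeen = false;
                    _bits[index] = true;
                }
            }
            return maybeSeen;
        }
    }
}

public static class ProbabilisticDistinctSketchExtensions
{
    public static IEnumerable<T> ProbabilisticDistinctSketch<T>(
        this IEnumerable<T> source, int bitCount, int hashCount, IEqualityComparer<T> comparer)
    {
        var filter = new TinyBloomFilter<T>(bitCount, hashCount, comparer);
        foreach (T item in source)
        {
            // Only items the filter has definitely not seen are yielded.
            if (!filter.CheckAndAdd(item)) yield return item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        int[] data = { 1, 2, 2, 3, 1, 3 };
        // With 1024 bits and 3 hashes this small input has no collisions: prints 1,2,3
        Console.WriteLine(string.Join(",",
            data.ProbabilisticDistinctSketch(1024, 3, EqualityComparer<int>.Default)));
    }
}
```

Unlike adjacent-duplicate elimination, this approach does not require ordered input; the memory cost is fixed by the filter size rather than by the number of items.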
Top(), Bottom() and their distinct variants provide extremely memory-efficient alternatives to a standard .OrderBy().Take() or .OrderByDescending().Take(), because they only ever have to store at most N+1 items, whereas the standard approach typically has to materialise and sort all the data in memory. As with the standard OrderBy() and OrderByDescending(), sorting is configurable based on a given IComparer<T>.
These work by keeping a sorted list of the top N items, adding items to it and removing items whenever N is exceeded. The distinct variants store only distinct items, meaning that they also implement distinctness very efficiently, though distinctness here is based upon the IComparer<T>.
When not doing distinctness, duplicate items are not stored directly; rather, the first instance of each item is stored along with a count of the number of times it has occurred. When enumerating, the same instance may therefore be yielded multiple times, so these methods should only be used if the caller is happy to receive the same instance multiple times. This usually works best if the items are either immutable or if the caller is able to copy the items.
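The bounded, counted storage described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the name `TopSketch` is hypothetical, "top" is assumed to mean the items that sort first under the comparer (matching `.OrderBy().Take()`), and a `SortedList<T, int>` stands in for whatever sorted structure the library actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the VDS.Common implementation.
public static class TopSketchExtensions
{
    public static IEnumerable<T> TopSketch<T>(
        this IEnumerable<T> source, int n, IComparer<T> comparer)
    {
        if (n <= 0) yield break;

        // Maps the first instance of each item to its occurrence count.
        var kept = new SortedList<T, int>(comparer);
        long total = 0; // number of items represented, counting duplicates

        foreach (T item in source)
        {
            if (kept.TryGetValue(item, out int count))
                kept[item] = count + 1; // the original key instance is retained
            else
                kept.Add(item, 1);
            total++;

            // If N is exceeded, drop one occurrence of the item that sorts last.
            if (total > n)
            {
                int lastIndex = kept.Count - 1;
                T lastKey = kept.Keys[lastIndex];
                int lastCount = kept.Values[lastIndex];
                if (lastCount > 1) kept[lastKey] = lastCount - 1;
                else kept.RemoveAt(lastIndex);
                total--;
            }
        }

        // Duplicates are reproduced by yielding the same stored instance repeatedly.
        foreach (KeyValuePair<T, int> entry in kept)
            for (int i = 0; i < entry.Value; i++)
                yield return entry.Key;
    }
}

public static class Program
{
    public static void Main()
    {
        int[] data = { 5, 1, 4, 1, 3, 2 };
        // Same result as data.OrderBy(x => x).Take(3): prints 1,1,2
        Console.WriteLine(string.Join(",", data.TopSketch(3, Comparer<int>.Default)));
    }
}
```

A bottom-N sketch would follow the same shape with the comparer reversed, and a distinct variant would skip incrementing the count for items that are already stored.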