HashSet<Custom>
.NET HashSets
As discussed here:
A HashSet holds a set of objects, but in a way that allows you to easily and quickly determine whether an object is already in the set or not. It does so by internally managing an array and storing the object using an index which is calculated from the hashcode of the object.
The only catch of HashSet is that there is no access by indices. To access elements you can either use an enumerator or use the built-in function to convert the HashSet into a List and iterate through that.
In other words, a HashSet is designed to contain only distinct (unique) objects. In .NET, a HashSet implements several useful methods, such as .Contains(), which returns whether the HashSet contains the specified object, and .IntersectWith(), which modifies the HashSet in place so that it keeps only the elements that are also present in the specified collection. Membership tests like these are typically much faster with a HashSet than with a List, because a lookup hashes the object instead of scanning every element. HashSets are also useful when iteratively retrieving results and discarding duplicate values.
The official .NET HashSet documentation can be found here.
Using HashSets in PowerShell
I use .NET primarily in the context of PowerShell. In PowerShell, a HashSet may be created and used as follows:
C:\Users\nwm> $hs = [System.Collections.Generic.HashSet[int]]::new()
C:\Users\nwm> $hs.Add(1)
True
C:\Users\nwm> $hs.Add(2)
True
C:\Users\nwm> $hs.Add(2)
False
C:\Users\nwm> $hs.Contains(1)
True
C:\Users\nwm> $hs.Contains(3)
False
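The set operations and the lack of an indexer mentioned earlier can be sketched in the same way. This is a minimal illustration rather than a tour of the API, and the sample values are made up for the example:

```powershell
# Build a set from an existing collection; duplicates collapse on construction.
$hs = [System.Collections.Generic.HashSet[int]]::new([int[]](1, 2, 2, 3, 4))
$countBefore = $hs.Count            # 4

# IntersectWith returns nothing; it trims $hs in place to the shared elements.
$hs.IntersectWith([int[]](2, 4, 6))
$countAfter = $hs.Count             # 2 (only 2 and 4 remain)

# There is no indexer, so enumerate the set directly...
foreach ($item in $hs) { "Item: $item" }

# ...or copy it into a List when positional access is needed.
$list = [System.Collections.Generic.List[int]]::new($hs)
$firstItem = $list[0]
```

Enumeration order of a HashSet is not guaranteed, so code should not rely on which element ends up at a given position in the list.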
HashSets with Custom Classes
I have previously found myself storing the results of an iterative process in a custom class and needing to discard duplicate values. I sought to use a HashSet for this, but struggled to work out which methods my class needed to implement for the HashSet to detect duplicates. As an example of a non-functional approach:
C:\Users\nwm> class CustomClass { [string]$Name; [int]$Num }
C:\Users\nwm> $hs = [System.Collections.Generic.HashSet[CustomClass]]::new()
C:\Users\nwm> $hs.Add([CustomClass]@{ Name = "Joe"; Num = 2 })
True
C:\Users\nwm> $hs.Add([CustomClass]@{ Name = "Joe"; Num = 2 })
True
C:\Users\nwm> $hs.Count
2
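As you can see, the HashSet fails to recognize the second element as a duplicate, which is the very thing a HashSet is meant to do. The two instances hold the same values, but they are separate objects, and by default the HashSet compares class instances by reference rather than by value.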
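To see that object identity, not property values, decides the outcome here, a small sketch helps. PlainClass is a hypothetical stand-in for the class above, and the variable names are only for illustration:

```powershell
class PlainClass { [string]$Name; [int]$Num }

$a = [PlainClass]@{ Name = "Joe"; Num = 2 }
$b = [PlainClass]@{ Name = "Joe"; Num = 2 }

# Same property values, but two distinct objects in memory.
$sameObject = [object]::ReferenceEquals($a, $b)   # False

$hs = [System.Collections.Generic.HashSet[PlainClass]]::new()
$addA      = $hs.Add($a)   # True:  first time this object is seen
$addAAgain = $hs.Add($a)   # False: the very same object is a duplicate
$addB      = $hs.Add($b)   # True:  a different object, despite equal values
```

Adding the same instance twice is rejected, while adding a second instance with identical values is accepted.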
Many StackOverflow posts later, I discovered which methods a HashSet calls on its members to detect duplicates: .GetHashCode() and .Equals(). Two objects are treated as duplicates only if they return the same hash code and .Equals() returns true for them. With that knowledge, we can implement the expected functionality for our class:
C:\Users\nwm> class CustomClass {
>> [string]$Name
>> [int]$Num
>> [bool]Equals($x) {
>> if ( $x -is [CustomClass] ) {
>> # -ceq compares case-sensitively, consistent with String.GetHashCode() below
>> return ( ( $this.Name -ceq $x.Name ) -and ( $this.Num -eq $x.Num ) )
>> } else { return $false }
>> }
>> [int]GetHashCode() { return "$($this.Name)$($this.Num.ToString())".GetHashCode() }
>> }
C:\Users\nwm> $hs = [System.Collections.Generic.HashSet[CustomClass]]::new()
C:\Users\nwm> $hs.Add([CustomClass]@{ Name = "Joe"; Num = 2 })
True
C:\Users\nwm> $hs.Add([CustomClass]@{ Name = "Joe"; Num = 2 })
False
C:\Users\nwm> $hs.Count
1
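Returning to the original use case, the sketch below discards duplicates while iterating over results. The Record class and the sample data are hypothetical. One detail is easy to miss: PowerShell's -eq ignores case when comparing strings, whereas String.GetHashCode() does not, so this version compares names with -ceq to keep Equals() and GetHashCode() in agreement. It also puts a separator between the fields so that, for example, ("Joe1", 2) and ("Joe", 12) do not build the same hash input.

```powershell
class Record {
    [string]$Name
    [int]$Num

    [bool] Equals($other) {
        if ($other -isnot [Record]) { return $false }
        # Case-sensitive comparison, matching the case-sensitive hash below.
        return ($this.Name -ceq $other.Name) -and ($this.Num -eq $other.Num)
    }

    [int] GetHashCode() {
        # Objects that are equal must produce the same hash code.
        return "$($this.Name)|$($this.Num)".GetHashCode()
    }
}

# Hypothetical results from an iterative process, including one repeat.
$rawResults = @(
    [Record]@{ Name = "Joe"; Num = 2 }
    [Record]@{ Name = "Ann"; Num = 5 }
    [Record]@{ Name = "Joe"; Num = 2 }
)

$seen = [System.Collections.Generic.HashSet[Record]]::new()

# Add() returns $true only for elements not already in the set,
# so it doubles as the duplicate check.
$unique = @(foreach ($r in $rawResults) {
    if ($seen.Add($r)) { $r }
})

$uniqueCount = $unique.Count   # 2
```

Because the hash code is derived from Name and Num, changing either property after an object has been added can leave the set unable to find it again, so it is safest to treat stored objects as read-only.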