Ah, the beauty of LINQ. I previously did a post on How to use Lambda Expressions, and if you aren’t sure how to use them, now is the time to check it out. I have decided to have a good look at the methods available to IEnumerable collections, and first on the list is the Distinct method.
The Boring Standard Method:
The Distinct method returns the entries of a collection with any duplicates removed, without you needing to do much work, the same as the DISTINCT keyword in T-SQL. Here is the basic, standard use, which is really not difficult to understand.
string[] names = new string[] { "Peter", "Paul", "Mary", "Peter", "Paul", "Mary", "Janet" };
foreach (string _name in names.Distinct())
{
    Console.WriteLine(_name);
}
So like I said, easy enough: we have a string array of names, and by using the .Distinct method we remove any duplicates from the result set that is returned. The names array itself is left untouched. Now comes the real fun!
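One thing worth knowing before moving on: for strings, the default comparison is case-sensitive, so "Peter" and "peter" count as two different names. Below is a minimal sketch of the overload that takes a comparer, using the built-in StringComparer.OrdinalIgnoreCase. The class name DistinctNamesDemo and the helper DistinctIgnoringCase are made up for this example and are not part of the original post.

```csharp
using System;
using System.Linq;

public static class DistinctNamesDemo
{
    // Hypothetical helper: returns the distinct names, ignoring case.
    public static string[] DistinctIgnoringCase(string[] names)
    {
        // StringComparer.OrdinalIgnoreCase is a built-in IEqualityComparer<string>.
        return names.Distinct(StringComparer.OrdinalIgnoreCase).ToArray();
    }

    public static void Main()
    {
        string[] names = { "Peter", "peter", "Paul", "PAUL", "Mary" };

        // Default comparison is case-sensitive: all five entries differ.
        Console.WriteLine(names.Distinct().Count());            // prints 5

        // Case-insensitive comparison collapses "Peter"/"peter" and "Paul"/"PAUL".
        Console.WriteLine(DistinctIgnoringCase(names).Length);  // prints 3
    }
}
```

This is the same mechanism the custom class example relies on: Distinct simply asks a comparer whether two items are equal.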
Custom Distinct Implementation:
Since the advent of generics, most of us have been creating strongly typed lists of data. A plain example would be a list of our custom Users class.
public class Users
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int AwesomeScore { get; set; }
}
// Create and populate new list of Users
List<Users> failBoyUsers = new List<Users>();
failBoyUsers.Add(new Users { AwesomeScore = 7, Name = "FailBoy", ID = 1 });
failBoyUsers.Add(new Users { AwesomeScore = 10, Name = "Rathlan", ID = 2 });
failBoyUsers.Add(new Users { AwesomeScore = 10, Name = "Rathlan", ID = 2 });
failBoyUsers.Add(new Users { AwesomeScore = 7, Name = "Joe", ID = 3 });
failBoyUsers.Add(new Users { AwesomeScore = 7, Name = "Joe", ID = 3 });
failBoyUsers.Add(new Users { AwesomeScore = 7, Name = "Joe", ID = 3 });
failBoyUsers.Add(new Users { AwesomeScore = 1, Name = "newbie", ID = 4 });
This gives us a strongly typed list of our Users class. If we now call the .Distinct method on the collection, it won’t do what we want! That’s because Users does not override Equals and GetHashCode, so the default comparer only checks whether two entries are the very same object. Every entry in our list is a separate object, so all the items are returned, duplicates included. The Distinct method has an overload that accepts a parameter, and this object must implement the IEqualityComparer<T> interface. So we declare a new class and implement IEqualityComparer<Users> as follows:
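// Requires: using System; using System.Collections.Generic;
public class AwesomeComparer : IEqualityComparer<Users>
{
    public bool Equals(Users x, Users y)
    {
        // The same instance (or two nulls) counts as equal.
        if (ReferenceEquals(x, y))
            return true;
        if (x == null || y == null)
            return false;
        // Name is compared case-insensitively (and null-safely);
        // AwesomeScore and ID must match exactly.
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase) &&
               x.AwesomeScore == y.AwesomeScore &&
               x.ID == y.ID;
    }

    public int GetHashCode(Users obj)
    {
        // Hash on ID only: any two objects that Equals treats as equal
        // share an ID, so they always get the same hash code.
        return obj == null ? 0 : obj.ID.GetHashCode();
    }
}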
So what we do in the Equals method is define what we feel a duplicate of our custom class looks like. We also implement the GetHashCode method to return the hash of the “primary field”. This will usually be the primary key that your class is based on. Whatever you pick, two objects that Equals calls equal must return the same hash code, otherwise Distinct will never even compare them. Then we call the Distinct method with our comparer, as follows:
AwesomeComparer awesomeComparer = new AwesomeComparer();
foreach (Users usr in failBoyUsers.Distinct(awesomeComparer))
{
    Console.WriteLine(usr.ID.ToString() + " - " + usr.Name + " - " + usr.AwesomeScore.ToString());
}
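To see the difference the comparer makes, here is a small self-contained sketch that counts the results with and without one. It reuses the Users class and the sample data from this post. The comparer name UserComparerSketch, the DistinctUsersDemo class and the BuildUsers helper are made up for this example, and hashing on ID is an assumption on my part (any field that Equals also checks would do).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Users
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int AwesomeScore { get; set; }
}

// Hypothetical comparer for this sketch: two Users are duplicates when
// ID and AwesomeScore match and Name matches ignoring case.
public class UserComparerSketch : IEqualityComparer<Users>
{
    public bool Equals(Users x, Users y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID &&
               x.AwesomeScore == y.AwesomeScore &&
               string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase);
    }

    // Equal objects always share an ID, so they always share a hash code.
    public int GetHashCode(Users obj)
    {
        return obj == null ? 0 : obj.ID.GetHashCode();
    }
}

public static class DistinctUsersDemo
{
    // Same seven entries as the list in the post.
    public static List<Users> BuildUsers()
    {
        return new List<Users>
        {
            new Users { AwesomeScore = 7, Name = "FailBoy", ID = 1 },
            new Users { AwesomeScore = 10, Name = "Rathlan", ID = 2 },
            new Users { AwesomeScore = 10, Name = "Rathlan", ID = 2 },
            new Users { AwesomeScore = 7, Name = "Joe", ID = 3 },
            new Users { AwesomeScore = 7, Name = "Joe", ID = 3 },
            new Users { AwesomeScore = 7, Name = "Joe", ID = 3 },
            new Users { AwesomeScore = 1, Name = "newbie", ID = 4 }
        };
    }

    public static void Main()
    {
        List<Users> users = BuildUsers();

        // No comparer: every object is its own reference, so nothing is removed.
        Console.WriteLine(users.Distinct().Count());                          // prints 7

        // With the comparer: the repeated Rathlan and Joe entries collapse.
        Console.WriteLine(users.Distinct(new UserComparerSketch()).Count());  // prints 4
    }
}
```

If you own the class, overriding Equals and GetHashCode on Users itself would make the parameterless Distinct behave the same way. A separate comparer is handy when you can't change the class or want different rules in different places.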
And that will return only distinct values from our custom class! I hope this helps someone at some stage, as I never knew about this myself and stumbled onto it. Drop me a comment if you think I suck or if by some margin of luck I managed to produce some bug-free code. I am, after all, FailBoy for a reason.