Comparing Two large collections (A and B) and removing duplicate records from Collection B when several data points match in each row.

rickjh123

New Member
I have two collections, Collections "A" and "B". Both collections contain the exact same number of columns (11) and their headers are the same names and data types. All our text except one which is a number type. The values in each row may be different in both collections. In other words Collection A may have the same records as B, but not necessarily, and Collection B may have the same records as A, but not necessarily. I currently have a nested loop where I take the first row of collection a and compare to all rows one at a time in collection b, and if i find a match (using values from 3 of the columns) in B, I delete that row from collection B. If the row in collection A does not match any row in collection b, it simply moves on to the next row in collection A and continues. The final result is a unique collection of records in collection B, which is what I want. (I do not want to delete any records in collection A.) This works fine for 40-50 records but does not work when I get into the hundreds of records.

I cannot use Excel macros to complete this task - it must be done in BP. If possible I'd like to use VB script in a code stage but whatever is the quickest solution outside of excel would be great.

I created this but it's not doing what I want, nor is it complete (again I need to compare values from three columns and all must be the same in all three columns for that row to be considered a 'match' between collection a and b)

Dim aRowData As String

Dim bRowData As String

Collection_Out_3 = new DataTable()
Collection_Out_4 = new DataTable()

Collection_Out_3.Columns.Add(“ACCOUNT”, GetType(String))
Collection_Out_4.Columns.Add(“ACCOUNT”, GetType(String))


For Each aRow as DataRow In Collection_In_1.Rows

aRowData = CStr(aRow(“ACCOUNT"))

For Each bRow As DataRow In Collection_In_2.Rows

bRowData = CStr(bRow(“ACCOUNT”))

If aRowData = bRowData Then

Collection_Out_3.Rows.Add(new Object() {bRowData})

End If
Next

Collection_Out_4.Rows.Add(new Object() {aRowData})
Next


Would appreciate some help!! Thanks so much
 
Top