Write a large collection into an Excel file

#1
Hello, I have a collection of 20,000 rows and I can't write them to an Excel file. I don't know how to do it by code, or if there is some other way. Thank you.
 

VJR

Well-Known Member
Staff member
#2
Hi armando.rueda,

Have you used the 'Write Collection' action of the Excel VBO?
If you are facing transfer issues due to the large volume of data, you can split the collection.
This can be done by using the 'Copy Rows' action of Collection Manipulation to transfer chunks of data one by one into another collection, and then writing that collection to Excel each time.
 
#4
Collections: These are a very important part of VBA. If you have used the language for any length of time then you will have used Collections. The most common ones are the Workbooks, Worksheets, Range and Cells collections.
 
#5
That was exactly what I did, but writing the 20K rows is very slow. Is there any other way?
I really appreciate your response, thank you, sir.
 

VJR

Well-Known Member
Staff member
#7
Can you give some details, like:
i. how many rows did you split the collection into?
ii. how long does it take to write that many rows (as in i.) to Excel?
iii. how long does it take to write the entire 20K rows in spite of splitting?
iv. how many columns do you have in the collection?
 
#8
Yes, sure:
i. I divided the collection into 10 sub-collections, each of a little more than 2K rows.
ii. Each sub-collection takes about 5 minutes (it's too much).
iii. Almost 50 minutes in total, but it almost always fails before it finishes; the client freezes and no longer responds.
iv. From A to AR, 44 columns.
 

VJR

Well-Known Member
Staff member
#9
Hi armando.rueda,

I created a collection with 20K rows and 44 columns and ran it to write to Excel. I faced the same issues as you - it took too long and I had to terminate BP because I had to get back to more research.
So this appears to be by design, since it is happening on both of our machines (mine is even slower on a virtual machine).
So then I tried splitting the collection's columns into two sets of 22 each, to see whether the number of columns was causing the problem.
That produced the same issue of taking too long before I terminated the process.
This implied that it is the number of rows causing the issue.
So I ran the process to write a chunk of 1K records to Excel, this time with all 44 columns. This took 4 mins 12 secs.
Then I ran it with a chunk of 500 records, which took 1 min 03 secs, and this one was comparatively quick, without any system issues.
Now, using the same chunk logic of 500 records per 'Write Collection', I wrote 2K records, which took 4 mins 30 secs. Extrapolating from that, the timings would be roughly:

4 mins 30 secs for 2K rows (in chunks of 500)
9 mins 00 secs for 4K
13 mins 30 secs for 6K
18 mins 00 secs for 8K
22 mins 30 secs for 10K
27 mins 00 secs for 12K
31 mins 30 secs for 14K
36 mins 00 secs for 16K
40 mins 30 secs for 18K
45 mins 00 secs for 20K, which is close to what you encountered, i.e. 50 mins for the 20K records.

Also, I noticed that even though Blue Prism "appeared" to be frozen, it was actually running the process in the background: I was able to scroll Excel and literally see the records being written row by row.
So based on the above observations, a chunk of 500 records looks good from a system-stability and speed point of view. You could also increase it to 1K, or stick to your original approach, since the total time is almost the same.

One thing to note is that you do not have to create a separate collection for each chunk. You can maintain a Counter data item and use the 'Copy Rows' action to copy 500 rows per iteration into another collection, then write that collection to Excel, maintaining the position from which to write in the sheet, which is basically the number of rows already transferred + 1.
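The counter logic above can be sketched in plain Python (a hypothetical illustration of the chunking idea, not Blue Prism code; the function name and the 500-row chunk size are just the values discussed in this thread):

```python
# Hypothetical sketch of the "Copy Rows" + counter logic described above,
# in plain Python rather than Blue Prism actions. Each yielded start_row is
# the 1-based Excel row at which to write the chunk: rows transferred + 1.

def chunk_rows(rows, chunk_size=500):
    """Yield (start_row, chunk) pairs for writing a large collection in chunks."""
    transferred = 0
    while transferred < len(rows):
        chunk = rows[transferred:transferred + chunk_size]
        start_row = transferred + 1  # position to write in Excel
        yield start_row, chunk
        transferred += len(chunk)

# Example: a 2,000-row, 44-column collection in chunks of 500 starts
# writing at rows 1, 501, 1001 and 1501.
rows = [["value"] * 44 for _ in range(2000)]
starts = [start for start, _ in chunk_rows(rows)]
print(starts)  # [1, 501, 1001, 1501]
```

Each chunk would then be passed to the Excel VBO's 'Write Collection' action with the corresponding start cell, so only one small extra collection is held at a time instead of ten pre-built sub-collections.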
 
#10
I am really impressed with your research; thank you very much. I hope Blue Prism can solve this problem soon, since handling large volumes of data in Excel is a real need.
 