RPA Technical Story: Our production BOT system went down

In this technical story, we will be answering the below questions and would understand the problem first before before jumping on the solution

What was the issue ?
  • .     In the middle of BOT execution, we saw that system went down and later got disconnected from the server


  • We could not take remote into this BOT runner system and therefore were forced to involve client IT server maintenance team.

  • Server maintenance team performed a restart on the said system and remote connection was restored
How this issue occurred ?
  • BOT was executing VbScript on an bulky excel file (rows > 100k) extracted from an ERP system
  • VbScript/Excel calculation caused CPU usage to reach 100% and maxing out of the system resources
  • To prevent further damages, system went down and disconnected itself from server

Note: We were unaware about the impact of high CPU usage on the BOT execution and on the system. Thanks to our client’s server maintenance team for bringing this learning to us

What factors contributed to this problem ?
Bulky Excel File
  • Bulky excel file often cause performance issues such as slowness, high CPU usage, difficulty in opening and at times not responding
Excel Data Crunching
  • Operations such as filtering/deleting records
  • Adding of new columns, formulas, text to column
  • Performing calculations
Excel Calculation
  • Excel tries to calculate the values based on formulas every time any cell is changed
  • Calculation is done for each cell in the entire documents
  • Calculation calls for multi-threading feature of excel that depends on no. of logical processors
What went wrong with solution design ?
  • Missed out on considering performance issues with bulky excel files
  • Limited skill set restricted to only Automation tool/VbScript/Macros usage
  • Missed volume testing during our development and user acceptance testing phase
  • Unaware about impact of high CPU usage on BOT execution and the system itself
What solution we came up with ?
Let the database handle the situation.....

Importing bulky excel file into Ms Access
Excel spreadsheet containing more than 100k records was imported into MS Access and was stored as database table. We can import any or all of the worksheets in an excel workbook in a single shot. Access wont change any data being exported from excel workbook.

Perform Data Manipulation
Once we exported the spreadsheet into Access as a database table, we performed data operations such as filtering, deleting, sorting, text to column and calculations with the help of Access queries

Export Result
Post performing all the data crunching and calculation with the help of Access queries the resultant data was exported back from Access in the form of excel sheet. This exported excel worksheet was then being fed to the BOT for its processing.

Why this solution ?
Availability
  • MS Access is available with all MS Office professional suites and higher end. Mostly all organization are using professional end only
  • Access is economical as compared to other database system such as SQL server, Oracle database etc.
Data Management
  • Data management (especially large data) is efficient in MS Access as compared to excel. Since it is a relational database and not a flat file database unlike excel. It offers much more capabilities in terms of storage, retrieval, control and integrity of data.
Database Queries
  • Most fascinating part of database is the SQL queries, a query can give you an answer to a simple question, perform calculations, combine data from different tables, add, change, or delete data from a database. We can use expressions in a query to create a calculated fields similar to using formula in excel
Easy to export
  • MS Access allows the result set i.e. data retrieved post query execution to be exported into different formats such as excel, text, XML, ODBC etc.
Speed of execution
  • Queries are much powerful and executes in few seconds, however speed of execution depends upon the database size, execution plan etc. These queries either perform a full table scan (simple queries) or an index scan (complex queries including joins) as per the structure of the queries.
Technical aspects of this solution
As a prerequisite you would need Microsoft Visual Studio IDE and MS Access along with scripting knowledge on C#

Class library creation
We created a C# class library with methods to connect with MS Access, Export excel into MS Access and execute queries at run time.

User Defined Functions
Created user defined VBA functions within MS Access to achieve a specific functionality that was not possible with Access queries

Compile class into DLL
Custom class library was compiled into a DLL file

Consuming DLL
In our case, we used Automation Anywhere tool and therefore we used MetaBot feature to consume this DLL file in our automation

What was the impact ?
Earlier with VbScript and excel, it took almost ~15 to 20 mins of execution and at times halted the BOT execution and caused 100% CPU and choked the entire system. But with this new approach it took merely ~ 30 seconds to churn out the result. This new approach turned out to be system as well as CPU friendly.

Comments

  1. Thanks for sharing information on RPA , Keep posting more information ,

    ReplyDelete

Post a Comment

Popular posts from this blog

Handling Azure Data Classification in MS Excel/Word/PDFs