Hello all. Just wanted to start a general discussion about what counts as an acceptable failure rate. We have a process that connects to a web site and then runs several high-level operations inside that web site. On occasion, primarily during report generation, the report runs long, something unexpected happens, and that step of the operation fails despite the redundancies built into the process. The overall process still completes successfully; we just occasionally end up missing a report. The user can repeat the failed step and generate the report themselves, but they find this an annoyance. As I measure the failure rate, I am curious what others consider a successful automation run rate and what level of failure is expected.
In general this just feels like an uphill battle, as we often don't have the same control you would have when building a C#/.NET application, since we are relying on applications like web sites to respond to commands in roughly the same amount of time on every run.
Just looking for ideas as we continue to refine what acceptable SLAs should be.