Tag Archives: Bug

Checklist for Troubleshooting Web Applications on the Internet

6 Mar

One of the hardest parts of troubleshooting production issues is identifying the root cause. When the application is deployed on the open internet, accessed from a variety of browsers, and reached through a large number of network hops, this becomes quite a challenging task. So in this post I will walk you through a high-level list of things to check on the client side in order to identify the root cause of the issue before proceeding to a review of the code base.

Browser Setting

  • What is the proxy setting of the browser? Does it access the application through a proxy, or does it connect directly to the internet?
  • How is the browser configured? Does the user have administrative rights, or only basic rights, over the browser settings?
  • Is the browser configured to show user-friendly error messages? If yes, can we reproduce the issue with those friendly messages turned off, so that the browser displays the actual message the application is throwing? (Please note that if your application does not do proper error handling, there is a risk that you are displaying nasty code to the users, e.g. the famous yellow screens of .NET.)
  • Is the browser configured to run in standards mode or compatibility mode? This point applies to IE. (Ajax and UI issues are most often related to compatibility mode.)
  • If the issue is related to a certificate, checking whether the relevant certificate is present in the trusted store often helps.
  • If your application uses pop-up windows to display information, checking whether any add-on or browser setting is blocking pop-ups also helps. Some browser add-ons silently block pop-ups without giving the user any indication.
  • Is the browser running with its default settings, that is, the factory defaults? One of the easiest ways to troubleshoot issues is to reset the browser to its default settings.

Client Computer Settings

  • Is the client computer behind a firewall? If yes, verifying that it is correctly configured saves a lot of time.
  • Checking the antivirus software installed on the client machine also helps. When your application uses special characters, there is a chance that a badly configured antivirus might filter out or block the incoming responses.
  • Check the hardware and software configuration of the user’s machine. If your application does a lot of heavy lifting on the client side, it often helps to tell users the minimum configuration that needs to be met.

Network Infrastructure and configuration

  • Is the network correctly configured? Pinging in both directions (client to server and server to client) often helps to identify network issues.
  • Is the client able to resolve the application host name correctly?
  • Checking how many hops the user needs to make to connect to the server also helps in improving the user experience. The general rule of thumb I often use is: the more hops between the user and the server, the higher the response time the user is going to get.
  • Is there any load balancer or firewall between the server and the client? If yes, checking that they are correctly configured also helps.
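
As a rough illustration of the host name check above, the following C sketch resolves a host name to its first IPv4 address with the POSIX getaddrinfo call. The function name resolve_ipv4 is made up for this example, and the sketch assumes a POSIX system such as Linux or macOS.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose getaddrinfo under strict C modes */
#endif
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: resolve `host` to its first IPv4 address.
 * Writes the dotted-quad text into ip_out; returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *host, char *ip_out, size_t ip_len)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    const struct sockaddr_in *addr;
    int ok;

    assert(host != NULL && ip_out != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, to keep the sketch short */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0 || res == NULL)
        return -1;                    /* the name could not be resolved */

    addr = (const struct sockaddr_in *)res->ai_addr;
    ok = inet_ntop(AF_INET, &addr->sin_addr, ip_out, (socklen_t)ip_len) != NULL;
    freeaddrinfo(res);
    return ok ? 0 : -1;
}
```

Running such a check on the affected client and comparing the returned address with the one the server team expects quickly shows whether the client is being sent to the wrong machine.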

User access or Login Issues

  • Is the user supplying the right credentials? If yes, checking whether the server is validating the credentials correctly also helps.
  • Is identity and access validation done by the application or by a third-party component such as SiteMinder? If by a third party, checking the third-party component in isolation often helps.
  • Does the user have the appropriate access level for the resources? If yes, further troubleshooting is required; otherwise it is a waste of time.
  • If the client is a browser, disabling the friendly error message setting in the browser will, in my experience, reduce the time to identify the issue by almost 50%, since no extra debugging tools are required, unless it is a case of missing or stolen headers.

Tips to Reproduce Performance Issues

23 May

Whenever a performance tester logs an incident for a performance issue, the first question the development team asks is “how do we reproduce the incident?”, and they often refuse to agree that any performance issue exists in the application, for the simple reason that the incident highlighted is often not reproducible in their environment.

There is a norm in the software development industry that if a logged incident cannot be reproduced by the tester or by the person highlighting it, then it cannot be resolved, for the simple reason that not enough information is available to understand the issue.

I believe that performance incidents are hard to reproduce and do not often occur in functional or manual test environments, so it becomes really hard to say what actually happened to the application under load. However, most load testing tools provide some features that can help performance engineers reproduce the defect and find out what inputs were given to the application at the point that triggered the issue under load.

In order to reproduce incidents, performance engineers need to understand the functionality of the application along with its technical details. If this information is not available, they need to ask the relevant stakeholders for it before logging an incident. This really helps everyone involved stay on the same page. So in this post I will highlight some of the features LoadRunner provides which, when used effectively, can help in reproducing incidents.

LoadRunner provides a rich set of features for reproducing incidents that often cannot be reproduced manually. Below are some of the settings:

  • Enable Snapshot on Error: I believe this feature was introduced in the 8.x versions of LoadRunner. Whenever an error occurs, it takes a snapshot of the page and saves it in the Vuser logs. However, it needs to be enabled in the run-time settings of the script. A snapshot or screenshot is often what the development team requires to believe that an error has indeed occurred. So I suggest enabling this feature if you believe that you have performance issues in the application. However, please keep in mind that enabling it also consumes the load generator’s (LG’s) resources.
  • Logging Functions: LoadRunner provides a rich set of functions that can be used to log messages to a file. However, I prefer to use the output message function (lr_output_message) and disable logging completely. That way I don’t need to have complete logging enabled, and I still see all the information I want to see. I also suggest logging all the correlated values along with the user-defined parameters using the output message function, so that under load one can really find out what values were given, what values were captured from the server response, and what values were not captured. Once we have the data from the output message function, we can reuse the same data and try to reproduce the issue manually.
  • Extended logging: LoadRunner also provides a feature with which we can see the client requests made and the server responses received. There might be cases where we would like to see what request the client sent that triggered the error in the application under load; in such cases extended logging can be enabled. However, please note that it impacts response time and consumes a lot of load generator resources. Extended logging in LoadRunner helps us see what user-defined data parameters were used, what response was sent by the server, and an extended trace of the function calls made by LoadRunner. In short, it shows you the complete trace of what flows over the wire for the user. However, this requires the performance engineer to have sufficient knowledge to read and interpret the data captured.
  • Iteration Number: LoadRunner provides a feature with which users can log the iteration number. Under load, each user runs many iterations and uses many data points from the user-defined data files, so it becomes really confusing to know what data was used in which iteration while reproducing the error. So I suggest logging the iteration number along with the parameters used in the script, either at the beginning of the action block or wherever the requirement dictates. The iteration number, along with the logged information, when correlated with the snapshot on error, helps in most cases to arrive at a clear understanding of the issue found.
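
To illustrate the iteration-number tip, here is a minimal sketch in plain C of building one log line that ties a parameter value to the Vuser and iteration that used it. format_iteration_log is a hypothetical helper, not a LoadRunner function; in a VuGen script the finished line would be handed to lr_output_message, and how the script obtains the Vuser ID and iteration number is left as an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write "vuser=<id> iteration=<n> <name>=<value>" into buf.
 * Returns the number of characters written, or -1 if the line did not fit. */
int format_iteration_log(char *buf, size_t n, int vuser_id, int iteration,
                         const char *param_name, const char *param_value)
{
    int written;

    assert(buf != NULL && param_name != NULL && param_value != NULL);

    written = snprintf(buf, n, "vuser=%d iteration=%d %s=%s",
                       vuser_id, iteration, param_name, param_value);

    /* snprintf reports the length it needed; anything >= n was truncated. */
    if (written < 0 || (size_t)written >= n)
        return -1;
    return written;
}
```

With one such line per parameter at the start of the action block, a failing snapshot can be matched back to the exact data row that produced it.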

However, there are some incidents where, in spite of having all the information, one might not be able to reproduce the incident. For such cases, I suggest that you isolate the scenario and run it with the relevant stakeholders monitoring at their respective ends.

Is this bug in LoadRunner 9.52 with lr_end_timer ?- No Please

29 Apr

I have a business process which needs to be scripted with LoadRunner, and the business process looks something like the following.

  1. The user logs in to the web application.
  2. Then he adds some details in the next screen, which comes right after login.
  3. After adding the details, the user clicks on the search button.
  4. After clicking on the search button, results are displayed in the data grid.
  5. A maximum of 10 records per page are displayed in the grid.
  6. Then the user clicks on the first record, and details about that record are displayed in a new browser window.
  7. In this new browser window, there is little or some transaction information.
  8. The user needs to find out how many transaction records exist. It’s more like viewing details.
  9. If there are fewer than 500 transaction records, the response time for those records needs to be displayed without recording a transaction for it.
  10. If there are more than 500 transaction records, a transaction marker needs to be inserted and the response time measured.

Well, here is some more information about this application: it is a classic enterprise ASP.NET web application. There is not much technical complexity involved in scripting it with LoadRunner, nor is the business process very complicated.

However, this application has a lot of data dependency, and one needs to have the right data set in place. Since this application is part of a large program and belongs to one of the top-tier clients, getting the right data set for it is a hard job. Of course there is a data team which gives us data in the millions, but again they don’t assure us that the data is valid or good.

So basically for point 4, for some user IDs you might get “no records found” or tens of records. Similarly, for point 8, we can have 2 records or 3000 records. However, we need to ensure that we report the transaction response time only for those transactions which have more than 500 records, as per point 10. Looks simple, right? But for some reason, LoadRunner is not allowing me to do the job.
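
The decision rule described above can be sketched in plain C as follows. The enum, the constant and the function name are invented for this illustration and are not part of the script; the text leaves the case of exactly 500 records open, so this sketch treats it as log-only, in line with a strict greater-than-500 comparison.

```c
#include <assert.h>

/* Hypothetical names, used only to illustrate the rule. */
enum order_action { EXIT_ITERATION, LOG_ONLY, RECORD_TRANSACTION };

#define TRANSACTION_THRESHOLD 500

/* Decide what the script should do for one order, given how many search
 * results came back (point 4) and how many transaction records the order
 * has (point 8). */
enum order_action classify_order(int search_results, int transaction_records)
{
    assert(search_results >= 0 && transaction_records >= 0);

    if (search_results == 0)
        return EXIT_ITERATION;        /* nothing to click on */
    if (transaction_records > TRANSACTION_THRESHOLD)
        return RECORD_TRANSACTION;    /* point 10: report a transaction */
    return LOG_ONLY;                  /* point 9: only log the response time */
}
```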

So here is my script for doing this:

long double time_getorderdetails;
merc_timer_handle_t timer;

merc_timer_handle_t lr_start_timer();
double lr_end_timer(merc_timer_handle_t timer_handle);

timer = lr_start_timer();

// I have still more code on top of this; however, those are web calls irrelevant to the problem.

web_reg_save_param("OrderDetails",
                "LB=left\">",
                "RB=<",
                "Ord=All",
                "RelFrameId=1",
                "Search=Body",
                "IgnoreRedirections=Yes",
                LAST);

// This if block exits the iteration in case 0 search results are found for point 3.
if (atoi(lr_eval_string("{orderDetails_count}")) == 0) {
    lr_output_message("--->the OrderDetails Count is Zero");
    lr_exit(LR_EXIT_ITERATION_AND_CONTINUE, LR_FAIL);
}

for (m = 1; m <= atoi(lr_eval_string("{orderDetails_count}")); m++) {

// If search results are displayed, then I click sequentially on each
// record displayed in the search with this for loop. The order details count
// is captured earlier with web_reg_save_param.

itoa(m, i, 10);
lr_save_string(i, "random");
paramvalue = lr_eval_string(lr_eval_string("{orderDetails_{random}}"));
lr_save_string(paramvalue, "random_param");
lr_output_message("The value of the random_param #%s", lr_eval_string("{random_param}"));

// Text check to ensure that when I click the record, I am viewing transactions.
web_reg_find("Text=** END OF ACTIVITY **", LAST);

// I capture all the transactions with web_reg_save_param.
web_reg_save_param("TransactionCheck",
                "LB=left\">",
                "RB=<",
                "Ord=All",
                "RelFrameId=1",
                "Search=Body",
                "IgnoreRedirections=Yes",
                LAST);

// This is the URL which lets you view transaction details as mentioned in points 8, 9 and 10.
web_url("OrderDetails.aspx",
                "URL=http://xxx{random_param}",
                "TargetFrame=",
                "Resource=0",
                "RecContentType=text/html",
                "Referer=",
                "Snapshot=t13.inf",
                "Mode=HTML",
                LAST);

time_getorderdetails = lr_end_timer(timer);
lr_output_message("Transaction Duration for order details call is %0.01f", time_getorderdetails);

lr_set_debug_message(LR_MSG_CLASS_EXTENDED_LOG, LR_SWITCH_ON);

if (atoi(lr_eval_string("{TransactionCheck_count}")) > 500)
{
    lr_output_message("The value of the random_param #%s", lr_eval_string("{random_param}"));
    lr_output_message("The number of transaction were GREATER than 500 #%0.01f", lr_eval_string("{TransactionCheck_count}"));

    // I set this transaction to pass in case the transaction records are more than 500.
    lr_set_transaction("ClickClient_GetTransactions_Details", time_getorderdetails, LR_PASS);
    lr_output_message("Transaction Duration for transactions more than 500 is %0.01f", time_getorderdetails);
}
else
{
    lr_output_message("The number of transactions were LESS than 500 #%s", lr_eval_string("{TransactionCheck_count}"));
    lr_output_message("The value of the random_param #%s", lr_eval_string("{random_param}"));
    lr_output_message("Transaction Duration for transaction less than 500 were %1f", time_getorderdetails);
}

lr_set_debug_message(LR_MSG_CLASS_EXTENDED_LOG, LR_SWITCH_OFF);

So I have the above code, which works perfectly when I don’t use lr_start_timer and lr_end_timer. But when I use these two functions together with lr_set_transaction, I get the messages below in the VuGen logs:

First time loop value is Transaction Duration for order details call is 1.346211

Second time loop value is Transaction Duration for order details call is 2.582253

Third time loop value is Transaction Duration for order details call is 1304018620.981333

Fourth time loop value is Transaction Duration for  more than 500 is 1304018623.736548

All of the above values come from the for loop which I built to check all the orders sequentially in the code above. If you look at the values for the 1st and 2nd iterations, they are in seconds, as they should be for lr_start_timer and lr_end_timer. If you look at the values for the 3rd and 4th iterations, they are indeed times, but they are times measured since 1st Jan 1970.
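
One quick way to check the “seconds since 1970” reading is to convert the odd value into a calendar date. The plain C sketch below breaks an epoch value into UTC fields and formats it; the helper names are made up for this example and are not LoadRunner functions. Fed the third-loop value 1304018620.981333, it gives 2011-04-28 19:23:40 UTC, which looks like a wall-clock timestamp rather than a duration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: break a "seconds since 1 Jan 1970" value into UTC
 * calendar fields. Returns 0 on success, -1 if the value cannot be converted.
 * Note: gmtime uses a shared static buffer, so this is not thread-safe. */
int epoch_to_utc(double seconds, struct tm *out)
{
    time_t whole = (time_t)seconds;   /* drop the fractional part */
    struct tm *utc;

    assert(out != NULL);

    utc = gmtime(&whole);
    if (utc == NULL)
        return -1;
    *out = *utc;                      /* copy out of the static buffer */
    return 0;
}

/* Hypothetical helper: format the same value as "YYYY-MM-DD HH:MM:SS". */
int format_epoch_utc(double seconds, char *buf, size_t n)
{
    struct tm utc;

    if (epoch_to_utc(seconds, &utc) != 0)
        return -1;
    return strftime(buf, n, "%Y-%m-%d %H:%M:%S", &utc) > 0 ? 0 : -1;
}
```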

Yes, I haven’t added anything fancy to the code; it’s all there in the help files which come bundled with the LoadRunner installation.

I have the time_getorderdetails variable declared as a long double data type, so I believed that the value returned would have some precision, but the precision LoadRunner was giving me was driving me nuts.

Also, after a couple more minutes of test runs in VuGen and PC (Performance Center), I get the errors below:

Error: C interpreter run time error: TransactionCheck.c (372):  Error -- memory violation : Exception ACCESS_VIOLATION received.

TransactionCheck.c(372): Notify: CCI trace: TransactionCheck.c(372): lr_end_timer(0x024e4cf8 "").

TransactionCheck.c(372): Notify: CCI trace: Compiled_code(0): TransactionCheck()

Well, it was quite disappointing to learn that although LoadRunner has made so many improvements and added so many protocols, it still does not have a function which can suspend a transaction at run time and keep that event out of the test reports or test results file. The flexibility to create a transaction without markers in the first place is a need which I feel should be addressed in this ajaxified world. Other than lr_set_transaction, none of the functions available in LoadRunner gives you an option to create a transaction without calling lr_start_transaction first, and there isn’t any easy option to set timers and read their values. I tried, but for some reason got a response time measured since 1/1/1970, which I am sure my stakeholders are not interested in. Debugging this code for the business process isn’t a hard job, but when we get a perfect response time twice and thereafter a response time in Unix style, it complicates matters further, as I need to add some more C code and then do some comparisons, etc. For various reasons, I believe performance testing scripts should be lean in code.

With this, I strongly believe this could be a major bug in the lr_start_timer and lr_end_timer functions.

Hope the code and other details aren’t hard to read.

Well, the version of LoadRunner used was 9.52, and the PC version was the same.

Latest update as of 30th April.

Well, the issue has been resolved, and the problem was that I was dealing with unreachable code. The code given above is the right code and it does everything it is supposed to do. However, there is one silly error which makes the code unreachable. I will not add the corrected version here and would request readers to find the silly error themselves. With this, I am editing the title of the post. My apologies to the LoadRunner team for calling this a bug.