Overview and practical use cases of an open source tracing tool for Java apps: timeouts

Hi!

In this article, I would like to share a short story of how Glowroot, alongside other monitoring, helped me see how long timeouts can decrease customer satisfaction.

A few times I heard that users on one installation had problems opening Jira, yet monitoring showed everything was fine: the response time and TTFB metrics in Grafana looked quite good.

This is where the open source tracing tool Glowroot helps.

After reviewing the exceptions, I set a filter on CrowdRestException to understand how often it occurs. As a result, I saw that the execution time was 10 seconds.

After clicking on a trace, we can easily find the problem: a connection timeout on a connection to one of the LDAP servers, which is stuck in the authentication process.

After changing the read timeout to 3 seconds and excluding the failing AD server from the AD pool, everything works well.
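To make the two timeouts more concrete, here is a minimal sketch of how connect and read timeouts can be set on a plain JNDI LDAP connection in Java. This is not Jira's or Crowd's code, where the values are set in the user directory configuration; it only illustrates what such a setting means. The class name LdapTimeoutEnv, the host ad1.example.com, and the 3 second connect timeout are my assumptions; only the 3 second read timeout comes from the story above.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper: builds a JNDI environment with explicit LDAP timeouts.
class LdapTimeoutEnv {
    static Hashtable<String, String> build(String ldapUrl, int connectTimeoutMs, int readTimeoutMs) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, ldapUrl);
        // How long to wait while establishing the connection (milliseconds).
        env.put("com.sun.jndi.ldap.connect.timeout", String.valueOf(connectTimeoutMs));
        // How long to wait for a response to an LDAP operation (milliseconds).
        env.put("com.sun.jndi.ldap.read.timeout", String.valueOf(readTimeoutMs));
        return env;
    }

    public static void main(String[] args) {
        // ad1.example.com is a placeholder host, not a server from this story.
        Hashtable<String, String> env = build("ldap://ad1.example.com:389", 3000, 3000);
        System.out.println(env);
    }
}
```

The returned environment can be passed to new InitialDirContext(env). Without these properties the call can wait much longer on a server that accepts the connection but never answers.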

I hope this short story shows why the timeout for waiting on a response has to be chosen correctly: it should be just long enough to decide to close the connection and retry after a random sleep. Also, sometimes we must think about UX :)
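The "retry after a random sleep" idea can be sketched in a few lines of Java. This is only an illustration under my own assumptions, not how Jira or Crowd retry internally: the class RetryWithRandomSleep, its parameters, and the sleep bounds are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: retries a failing call, sleeping a random time between attempts.
class RetryWithRandomSleep {
    static <T> T call(Callable<T> action, int maxAttempts, long minSleepMs, long maxSleepMs)
            throws Exception {
        if (maxAttempts < 1 || minSleepMs < 0 || maxSleepMs < minSleepMs) {
            throw new IllegalArgumentException("invalid retry settings");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry when the thread itself was interrupted.
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // A random sleep keeps many clients from retrying at the same moment.
                    long sleepMs = ThreadLocalRandom.current().nextLong(minSleepMs, maxSleepMs + 1);
                    Thread.sleep(sleepMs);
                }
            }
        }
        // All attempts failed: surface the last error to the caller.
        throw last;
    }
}
```

The action passed in should have its own short timeout, for example the 3 second read timeout from this story. Otherwise each attempt can still hang for a long time and the user keeps waiting.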

Hope it’s helpful. 

Cheers,

Gonchik 
