How to detect the cause of CPU spikes in a Tableau cluster on Linux

 Hi!

Today I would like to share how I track down the cause of CPU spikes.

This morning we had a few CPU spikes on our Tableau nodes, so let's start with the load average and then look at the kernel log:

uptime

10:19:23 up 116 days, 12:13, 1 user, load average: 0.36, 0.81, 2.57
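The three numbers are the 1-, 5- and 15-minute load averages. Here the 15-minute value (2.57) is well above the 1-minute value (0.36), which suggests the spike had already passed by the time the command was run. A load average only means something relative to the number of CPU cores, so a minimal sketch for normalising it could look like the following. The helper name load_per_core is hypothetical, and the snippet assumes a Linux host with /proc/loadavg and getconf available.

```shell
# load_per_core LOAD CORES: print the load average divided by the core count.
# (Hypothetical helper, not part of any Tableau or system tooling.)
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Number of online CPU cores on this host.
cores=$(getconf _NPROCESSORS_ONLN)

# /proc/loadavg starts with the 1-, 5- and 15-minute load averages.
if [ -r /proc/loadavg ]; then
  read -r one five fifteen _ < /proc/loadavg
  echo "cores=$cores"
  echo "1m per core:  $(load_per_core "$one" "$cores")"
  echo "5m per core:  $(load_per_core "$five" "$cores")"
  echo "15m per core: $(load_per_core "$fifteen" "$cores")"
fi
```

As a rough rule of thumb, values that stay above 1.00 per core mean more runnable (or I/O-blocked) tasks than the machine can serve at once.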

dmesg --human | tail -n 50

[Dec30 16:15] audit: type=1327 audit(1703952955.024:85698311): proctitle="/var/opt/tableau/tableau_server/data/tabsvc/services/backgrounder_0.20231.23.0806.1229/bin/run-backgrounder"

[ +0.631996] audit: type=1327 audit(1703952955.024:85698311): proctitle="/var/opt/tableau/tableau_server/data/tabsvc/services/backgrounder_0.20231.23.0806.1229/bin/run-backgrounder"

[ +0.002027] audit: type=1300 audit(1703952955.029:85698312): arch=c000003e syscall=42 success=yes exit=0 a0=47 a1=7f5bb5d4b5c0 a2=10 a3=6f8da items=0 ppid=91808 pid=67952 auid=4294967295 uid=991 gid=1087000453 euid=991 suid=991 fsuid=991 egid=1087000453 sgid=1087000453 fsgid=1087000453 tty=(none) ses=4294967295 comm="pool-29-thread-" exe="/opt/tableau/tableau_server/packages/bin.20231.23.0806.1229/run-backgrounder" key="network_connect_4"

[ +0.003330] audit: type=1306 audit(1703952955.029:85698312): saddr=020001850A8102DE0000000000000000

[ +0.001054] audit: type=1327 audit(1703952955.029:85698312): proctitle="/var/opt/tableau/tableau_server/data/tabsvc/services/backgrounder_0.20231.23.0806.1229/bin/run-backgrounder"

[ +0.001971] audit: type=1300 audit(1703952955.037:85698313): arch=c000003e syscall=42 success=yes exit=0 a0=51 a1=7f5bb5d4b5c0 a2=10 a3=6f8da items=0 ppid=91808 pid=67952 auid=4294967295 uid=991 gid=1087000453 euid=991 suid=991 fsuid=991 egid=1087000453 sgid=1087000453 fsgid=1087000453 tty=(none) ses=4294967295 comm="pool-29-thread-" exe="/opt/tableau/tableau_server/packages/bin.20231.23.0806.1229/run-backgrounder" key="network_connect_4"

[ +0.003145] audit: type=1306 audit(1703952955.037:85698313): saddr=020001850A8102DE0000000000000000

[ +0.000914] audit: type=1327 audit(1703952955.037:85698313): proctitle="/var/opt/tableau/tableau_server/data/tabsvc/services/backgrounder_0.20231.23.0806.1229/bin/run-backgrounder"

[ +0.001850] audit: type=1300 audit(1703952955.042:85698314): arch=c000003e syscall=42 success=yes exit=0 a0=4b a1=7fb19785c840 a2=10 a3=cdace items=0 ppid=91808 pid=110383 auid=4294967295 uid=991 gid=1087000453 euid=991 suid=991 fsuid=991 egid=1087000453 sgid=1087000453 fsgid=1087000453 tty=(none) ses=4294967295 comm="pool-29-thread-" exe="/opt/tableau/tableau_server/packages/bin.20231.23.0806.1229/run-backgrounder" key="network_connect_4"

[ +0.002892] audit: type=1306 audit(1703952955.042:85698314): saddr=020001850A8102DE0000000000000000

[Jan 2 15:15] kauditd_printk_skb: 250 callbacks suppressed

[ +0.000002] audit: type=1327 audit(1704208542.451:87759651): proctitle="/var/opt/tableau/tableau_server/data/tabsvc/services/backgrounder_1.20231.23.0806.1229/bin/run-backgrounder"
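A note on reading these records: arch=c000003e is x86_64, and on that architecture syscall=42 is connect(), so every type=1300 line above is the backgrounder opening a network connection. The saddr field of the matching type=1306 record is the raw socket address in hex. The sketch below decodes it for IPv4 only. The function names decode_saddr and hex_at are hypothetical helpers written for this post, and the layout assumed is the x86_64 sockaddr_in (2-byte family, 2-byte port in network order, 4-byte address). If the audit userspace tools are installed, ausearch with the -i (interpret) option should give the same information.

```shell
# hex_at STRING RANGE: print the characters of STRING in the given cut(1) range.
hex_at() {
  printf '%s\n' "$1" | cut -c"$2"
}

# decode_saddr HEX: decode an IPv4 audit saddr value into "ip:port".
# Runs in a subshell so its variables do not leak into the caller.
decode_saddr() (
  hex=$1
  family=$(hex_at "$hex" 1-4)
  # AF_INET is family 2, stored little-endian on x86_64, hence "0200".
  if [ "$family" != "0200" ]; then
    echo "unsupported address family: $family" >&2
    exit 1
  fi
  port=$((0x$(hex_at "$hex" 5-8)))
  a=$((0x$(hex_at "$hex" 9-10)))
  b=$((0x$(hex_at "$hex" 11-12)))
  c=$((0x$(hex_at "$hex" 13-14)))
  d=$((0x$(hex_at "$hex" 15-16)))
  echo "$a.$b.$c.$d:$port"
)

# The value that repeats in the dmesg output above.
decode_saddr 020001850A8102DE0000000000000000
```

For the value in the log this gives 10.129.2.222:389. Port 389 is the standard LDAP port, so these particular records look like the backgrounder talking to a directory server rather than a direct sign of the CPU spike.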

cd /var/opt/tableau/tableau_server/data/tabsvc/services/backgrounder_0.20231.23.0806.1229

cd crashdumps/stacktraces

grep -i terminated _opt_tableau_tableau_server_packages_hyper.20231.23.0806.1229_hyperd.50889.6.1704225028.txt

Program terminated with signal 6, Aborted.

Signal 6 is SIGABRT. It is commonly raised by libc and other libraries to abort the program when a critical error occurs. For example, glibc raises SIGABRT when it detects a double free or other heap corruption.

https://faculty.cs.niu.edu/~hutchins/csci480/signals.htm
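To check whether the abort was a one-off or a pattern, the same grep can be run across every stack trace in the directory and the results tallied. This is only a sketch: summarize_terminations is a hypothetical helper, and it assumes the stack traces are the .txt files in the crashdumps/stacktraces directory shown above.

```shell
# summarize_terminations DIR: count each distinct "terminated with signal"
# line found in DIR/*.txt, most frequent first.
summarize_terminations() {
  grep -hi 'terminated with signal' "$1"/*.txt 2>/dev/null \
    | sort | uniq -c | sort -rn
}

# Translate a signal number into its name.
kill -l 6   # prints ABRT

# Usage, from inside crashdumps/stacktraces:
#   summarize_terminations .
```

A directory dominated by signal 6 from hyperd points to a recurring abort rather than a single bad extract.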

In our case, we traced signal 6 back to an out-of-memory (OOM) condition during the hyperd parking operation. I therefore strongly advise examining the volume of data you are ingesting: fetching data through the backgrounder and storing it in hyperd is a likely place for the problem to arise.
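One way to look for supporting evidence of memory pressure is to search the kernel log for OOM-killer activity. The filter below is a minimal sketch with a hypothetical function name, find_oom_events. The patterns are the usual kernel phrases, but the exact wording varies between kernel versions, so treat them as an assumption.

```shell
# find_oom_events: read kernel log text on stdin and keep only the lines
# that mention the OOM killer. Returns success even when nothing matches.
find_oom_events() {
  grep -iE 'out of memory|oom-killer|oom_kill' || true
}

# Usage on a live host (reading the kernel log may require root):
#   dmesg --human | find_oom_events
#   free -m    # current memory and swap usage, for context
```

Keep in mind that the kernel OOM killer terminates its victim with SIGKILL, so an empty result does not rule out memory pressure: a process that aborts itself after a failed allocation exits with signal 6 instead, which matches the stack trace above.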
