# Sending CycleCloud service JVM heap usage to Log Analytics

# Prepare for monitoring CycleCloud service JVM heap usage

The CycleCloud service is written in Java, and a Java runtime environment is installed automatically with CycleCloud. Heap usage of the running JVM can be read with the jstat command, which ships with the JDK development package rather than with the runtime. We therefore need to install a few packages before we can send CycleCloud's JVM heap usage information to Log Analytics.

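  1. Install the required packages
  • Install the java-1.8.0-openjdk-devel package (provides jstat):
```bash
sudo yum install java-1.8.0-openjdk-devel
```
  • Install Azure SDK related packages for Python:
```bash
sudo pip3 install azure-identity
sudo pip3 install azure-keyvault-secrets
sudo pip3 install azure-monitor-query
sudo pip3 install azure-core
```
  • Install the other Python packages that the collection script imports, if they are not already present:
```bash
sudo pip3 install psutil requests
```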
  2. Prepare related services and permissions
  • Enable Managed Identity for CycleCloud
  • Obtain the Workspace ID and primary key of the Log Analytics Workspace and store them in Key Vault

| Scope | Account | Account Type | Role or Permission | Description |
|---|---|---|---|---|
| Key Vault | CycleCloud VM Managed Identity | Managed Identity | Log Analytics Secrets Get | Obtain Log Analytics Workspace ID and Primary Key |

# Set up collection of CycleCloud server JVM heap usage for Log Analytics

  1. Log in to the CycleCloud server and confirm that heap usage information can be obtained with the jstat command:
```bash
jstat -gcutil <pid>
```


  2. Log in to CycleCloud and create a custom script directory:
```bash
mkdir ~/scripts
```
  3. Create a Python file to collect JVM heap usage:
  • Create the Python file:
```bash
vi ~/scripts/jsvc_heap_usage.py
```
  • Copy the following content into jsvc_heap_usage.py and modify the settings accordingly:
```python
import psutil
import os
import json
import time
import datetime
import requests
import base64
import hmac
import hashlib
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient, LogsQueryStatus
from azure.core.exceptions import HttpResponseError

# Settings: adjust these for your environment.
TIMEOUT = 300  # seconds to keep retrying credential creation
key_vault_url = "https://<YourKeyVault>.vault.azure.net/"
workspace_id_kv_secret = "la-workspace-id"
workspace_key_kv_secret = "la-workspace-key"
table_name = 'CycleCloudGCLog'  # Log Analytics appends "_CL" to this name


def get_default_credential():
    """Create a DefaultAzureCredential, retrying with exponential backoff until TIMEOUT."""
    print("Getting DefaultAzureCredential...")
    INITIAL_WAIT_TIME = 1
    MAX_WAIT_TIME = 60
    WAIT_TIME_MULTIPLIER = 2
    wait_time = INITIAL_WAIT_TIME
    start_time = time.monotonic()
    while True:
        try:
            credential = DefaultAzureCredential()
            return credential
        except Exception:
            time.sleep(wait_time)
            wait_time *= WAIT_TIME_MULTIPLIER
            wait_time = min(wait_time, MAX_WAIT_TIME)
            elapsed_time = time.monotonic() - start_time
            if elapsed_time >= TIMEOUT:
                return None


CREDENTIAL = get_default_credential()


def access_secret_from_key_vault(secret_name):
    """Return the secret value, or None if it cannot be read."""
    print("Accessing secret from Key Vault...")
    try:
        client = SecretClient(vault_url=key_vault_url, credential=CREDENTIAL)
        retrieved_secret = client.get_secret(secret_name)
        return retrieved_secret.value
    except Exception as err:
        print("Failed to read secret '{}': {}".format(secret_name, err))
        return None


def build_signature(customer_id, shared_key, date, content_length, method, content_type, resource):
    """Build the SharedKey Authorization header (HMAC-SHA256 over the string to sign)."""
    print("Building signature...")
    x_headers = 'x-ms-date:' + date
    string_to_hash = method + "\n" + str(content_length) + "\n" + content_type + "\n" + x_headers + "\n" + resource
    bytes_to_hash = bytes(string_to_hash, 'UTF-8')
    decoded_key = base64.b64decode(shared_key)
    encoded_hash = base64.b64encode(hmac.new(decoded_key, bytes_to_hash, digestmod=hashlib.sha256).digest()).decode('utf-8')
    authorization = "SharedKey {}:{}".format(customer_id, encoded_hash)
    return authorization


def post_data(customer_id, shared_key, body, log_type):
    """POST one JSON record to the Log Analytics HTTP Data Collector endpoint."""
    print("Posting data to Log Analytics endpoint...")
    method = 'POST'
    content_type = 'application/json'
    resource = '/api/logs'
    rfc1123date = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
    content_length = len(body)
    signature = build_signature(customer_id, shared_key, rfc1123date, content_length, method, content_type, resource)
    uri = 'https://' + customer_id + '.ods.opinsights.azure.com' + resource + '?api-version=2016-04-01'
    headers = {
        'content-type': content_type,
        'Authorization': signature,
        'Log-Type': log_type,
        'x-ms-date': rfc1123date
    }
    response = requests.post(uri, data=body, headers=headers)
    print("Response status code: {}".format(response.status_code))
    return response.content


def send_to_log_analytic(data):
    azure_log_customer_id = access_secret_from_key_vault(workspace_id_kv_secret)
    azure_log_shared_key = access_secret_from_key_vault(workspace_key_kv_secret)
    # Do not post with missing credentials.
    if azure_log_customer_id is None or azure_log_shared_key is None:
        print("Workspace ID or key is unavailable; skipping upload.")
        return None
    result = post_data(azure_log_customer_id, azure_log_shared_key, data, table_name)
    print("Data posted to Log Analytics.")
    return result


def get_username(pid):
    try:
        return psutil.Process(pid).username()
    except psutil.NoSuchProcess:
        return "Unknown"


def parse_jstat_output(jstat_output, pid):
    """Prefix the jstat header and value lines with Timestamp and PID columns."""
    print("Parsing jstat output...")
    header = "Timestamp\tPID\t" + jstat_output.splitlines()[0]
    # '*' stands in for the space so the timestamp survives whitespace splitting.
    timestamp = datetime.datetime.utcnow().strftime('%Y-%m-%d*%H:%M:%S')
    result = f"{timestamp}\t{pid}\t{jstat_output.splitlines()[1]}"
    return header + '\n' + result


def convert_to_json(jstat_output):
    print("Converting to JSON...")
    lines = jstat_output.strip().split('\n')
    headers = lines[0].split()
    values = lines[1].split()
    timestamp_value = values[0].replace('*', ' ')
    data_dict = {"Timestamp": timestamp_value, "PID": values[1]}
    for i in range(2, len(headers)):
        data_dict[headers[i]] = float(values[i])
    json_data = json.dumps(data_dict, indent=4)
    return json_data


def find_jsvc_pids():
    print("Finding jsvc.exec with 'cycle_server' username process PIDs...")
    jsvc_pids = []
    for process in psutil.process_iter(['pid', 'cmdline']):
        # cmdline can be None when the process details are not accessible.
        cmdline = process.info['cmdline'] or []
        if 'jsvc.exec' in cmdline:
            if get_username(process.info['pid']) == 'cycle_server':
                jsvc_pids.append(process.info['pid'])
    return jsvc_pids


if __name__ == '__main__':
    jsvc_pids = find_jsvc_pids()
    if jsvc_pids:
        for pid in jsvc_pids:
            data = os.popen('sudo jstat -gc ' + str(pid)).read()
            # jstat prints a header line and a value line; skip the PID otherwise.
            if len(data.splitlines()) < 2:
                print("jstat returned no data for PID {}.".format(pid))
                continue
            jstat_data = parse_jstat_output(data, pid)
            jstat_json = convert_to_json(jstat_data)
            send_to_log_analytic(jstat_json)
    else:
        print("No jsvc.exec process found.")
```
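To check the signing step in isolation, the sketch below reproduces the string-to-sign and HMAC-SHA256 computation from `build_signature` using only the standard library. The function name `sign_request` and the workspace ID, key, and date are made-up placeholders for illustration, not real credentials; in the script the real values come from Key Vault.

```python
import base64
import hashlib
import hmac


def sign_request(workspace_id, shared_key_b64, rfc1123_date, content_length):
    """Return the SharedKey Authorization header value for a POST to /api/logs."""
    # Same field order as build_signature in the script above.
    string_to_sign = "\n".join([
        "POST",
        str(content_length),
        "application/json",
        "x-ms-date:" + rfc1123_date,
        "/api/logs",
    ])
    key = base64.b64decode(shared_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return "SharedKey {}:{}".format(workspace_id, base64.b64encode(digest).decode("utf-8"))


# Placeholder inputs for illustration only (hypothetical, not real credentials).
DUMMY_WORKSPACE_ID = "00000000-0000-0000-0000-000000000000"
DUMMY_KEY = base64.b64encode(b"not-a-real-key").decode("utf-8")
DUMMY_DATE = "Wed, 26 Jul 2023 08:51:00 GMT"

body = '{"PID": "1234"}'
header = sign_request(DUMMY_WORKSPACE_ID, DUMMY_KEY, DUMMY_DATE, len(body.encode("utf-8")))
print(header)
```

The signature changes whenever the date or the body length changes, so the `x-ms-date` header sent with the request must be the same string that was signed.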
  4. Set up a cron job to run every minute:
  • Edit the crontab file:
```bash
crontab -e
```
  • Add the following line to the crontab file:
```bash
* * * * * /usr/bin/python3 ~/scripts/jsvc_heap_usage.py
```
  • Verify the cron job is set up correctly:
```bash
crontab -l
```
  • Check the cron job is running:
```bash
sudo grep CRON /var/log/cron
```
  • Verify the data is sent to Log Analytics:
    A. Open Log Analytics and confirm that the custom table has been created.

    B. Open the custom table to view the data:

```kusto
CycleCloudGCLog_CL
| where todatetime(Timestamp_s) >= todatetime('2023-07-26 08:51:00')
| summarize
    avg(S0C_d),
    avg(S1C_d),
    avg(S0U_d),
    avg(S1U_d),
    avg(EC_d),
    avg(EU_d),
    avg(OC_d),
    avg(OU_d),
    avg(MC_d),
    avg(MU_d)
    by bin(todatetime(Timestamp_s), 15m)
```
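The column names in this query follow from how the HTTP Data Collector API stores custom logs: the `Log-Type` value gets a `_CL` suffix, and each JSON property gets a type suffix such as `_s` for strings and `_d` for numbers. The sketch below mimics that mapping for a record shaped like the one the script sends. `expected_columns` is a hypothetical helper, the PID is a made-up value, and the sketch ignores the cases where the service infers datetime (`_t`) or GUID (`_g`) columns from string values.

```python
import json


def expected_columns(log_type, record_json):
    """Map a JSON record to the table and column names Log Analytics is expected to create."""
    record = json.loads(record_json)
    columns = {}
    for name, value in record.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            suffix = "_b"
        elif isinstance(value, (int, float)):
            suffix = "_d"
        else:
            suffix = "_s"
        columns[name] = name + suffix
    return log_type + "_CL", columns


# Record shaped like the script's payload; the PID is illustrative.
sample = json.dumps({
    "Timestamp": "2023-07-26 08:51:00",
    "PID": "1234",
    "S0C": 56320.0,
    "EU": 147712.5,
})
table, columns = expected_columns("CycleCloudGCLog", sample)
print(table)
print(columns)
```

Because the script sends the timestamp as a plain string, it lands in `Timestamp_s`, which is why the query wraps it in `todatetime()`.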


    C. Create a chart to visualize the data:

```kusto
CycleCloudGCLog_CL
| where todatetime(Timestamp_s) >= todatetime('2023-07-26 08:51:00')
| summarize
    avg(S0C_d),
    avg(S1C_d),
    avg(S0U_d),
    avg(S1U_d),
    avg(EC_d),
    avg(EU_d),
    avg(OC_d),
    avg(OU_d),
    avg(MC_d),
    avg(MU_d)
    by bin(todatetime(Timestamp_s), 15m)
| project
    Timestamp_s,
    S0_Usage=avg_S0U_d / avg_S0C_d * 100,
    S1_Usage=avg_S1U_d / avg_S1C_d * 100,
    Eden_Usage=avg_EU_d / avg_EC_d * 100,
    Old_Usage=avg_OU_d / avg_OC_d * 100,
    Meta_Usage=avg_MU_d / avg_MC_d * 100
| render timechart
```
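For readers less familiar with KQL, the same calculation can be sketched in Python: bucket the samples into 15-minute windows, average the used and capacity values per window, then divide. The three samples are hypothetical values chosen for illustration, and only the Old Gen columns are shown.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bin_time(ts, width=timedelta(minutes=15)):
    """Round a datetime down to the start of its window, similar to KQL bin()."""
    epoch = datetime(1970, 1, 1)
    return epoch + ((ts - epoch) // width) * width


def old_gen_usage(samples):
    """Return {window_start: Old Gen usage in percent, from per-window averages}."""
    buckets = defaultdict(list)
    for timestamp, ou, oc in samples:
        ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        buckets[bin_time(ts)].append((ou, oc))
    usage = {}
    for window, rows in sorted(buckets.items()):
        avg_ou = sum(row[0] for row in rows) / len(rows)
        avg_oc = sum(row[1] for row in rows) / len(rows)
        usage[window] = avg_ou / avg_oc * 100
    return usage


# Hypothetical samples: (Timestamp, OU in KB, OC in KB)
samples = [
    ("2023-07-26 08:51:00", 100000.0, 200000.0),
    ("2023-07-26 08:52:00", 120000.0, 200000.0),
    ("2023-07-26 09:01:00", 150000.0, 200000.0),
]
result = old_gen_usage(samples)
for window, pct in result.items():
    print(window, round(pct, 1))
```

The first two samples fall into the window starting at 08:45 and the third into the window starting at 09:00, so the chart would show two points for this data.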


# Jstat output description

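| Field | Description |
|---|---|
| S0C | Capacity of the first Survivor space (KB). |
| S1C | Capacity of the second Survivor space (KB). |
| S0U | Used size of the first Survivor space (KB). |
| S1U | Used size of the second Survivor space (KB). |
| EC | Capacity of the Eden space (KB). |
| EU | Used size of the Eden space (KB). |
| OC | Capacity of the old generation (Old Gen) (KB). |
| OU | Used size of the old generation (Old Gen) (KB). |
| MC | Capacity of the Metaspace (KB). |
| MU | Used size of the Metaspace (KB). |
| CCSC | Capacity of the Compressed Class Space (KB). |
| CCSU | Used size of the Compressed Class Space (KB). |
| YGC | Number of young generation garbage collections. |
| YGCT | Time spent in young generation garbage collection (seconds). |
| FGC | Number of full (old generation) garbage collections. |
| FGCT | Time spent in full (old generation) garbage collection (seconds). |
| GCT | Total time spent in garbage collection (seconds). |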
  • jstat -gc <pid> output example:
```
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
56320.0 80384.0 55883.3 0.0 1595904.0 147712.5 244224.0 165986.0 157612.0 126611.8 16044.0 13589.9 32 1.444 5 1.827 3.271
```
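As a quick worked example, the sketch below parses the sample output above and computes the same usage percentages that the chart query derives. `parse_gc` and `usage_percent` are illustrative helper names and are not part of the collection script.

```python
SAMPLE = """\
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
56320.0 80384.0 55883.3 0.0 1595904.0 147712.5 244224.0 165986.0 157612.0 126611.8 16044.0 13589.9 32 1.444 5 1.827 3.271
"""


def parse_gc(output):
    """Turn the header line and value line of `jstat -gc` into a dict of floats."""
    header, values = output.strip().splitlines()[:2]
    return dict(zip(header.split(), map(float, values.split())))


def usage_percent(gc):
    """Compute used/capacity in percent for each memory area."""
    pairs = {
        "S0": ("S0U", "S0C"),
        "S1": ("S1U", "S1C"),
        "Eden": ("EU", "EC"),
        "Old": ("OU", "OC"),
        "Meta": ("MU", "MC"),
    }
    # Guard against a zero capacity to avoid division by zero.
    return {name: (gc[used] / gc[cap] * 100 if gc[cap] else 0.0)
            for name, (used, cap) in pairs.items()}


gc = parse_gc(SAMPLE)
usage = usage_percent(gc)
for name, pct in usage.items():
    print(f"{name}: {pct:.1f}%")
```

For this sample the first Survivor space is nearly full while Eden is under 10 percent used, which is typical shortly after a young generation collection.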