Introduction
In the previous article, we explored PSI (Pressure Stall Information) and how to monitor system-wide PSI information. This article delves into how to monitor PSI information for a single container.
PSI and cgroupv2
On systems that have migrated to the cgroup2 filesystem, you can track pressure stall information for each cgroup. Under the cgroupfs mount point, each cgroup's directory contains the files cpu.pressure, memory.pressure, and io.pressure.
You can use the following command to check the PSI information for a specific cgroup. This example queries the cpu.pressure of a cgroup named cg1:
cat /sys/fs/cgroup/cg1/cpu.pressure
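Each pressure file holds lines of the form "some avg10=0.00 avg60=0.00 avg300=0.00 total=0", where the avg fields are percentages over 10, 60, and 300 second windows and total is the cumulative stall time in microseconds. As a minimal sketch, the Python snippet below parses that text into a dictionary. The function name parse_pressure and the sample values are made up for illustration and are not part of any library.

```python
def parse_pressure(text: str) -> dict:
    """Parse the contents of a PSI file such as cpu.pressure.

    Returns a mapping like {"some": {"avg10": 0.0, ..., "total": 0}, ...}.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        kind, pairs = fields[0], fields[1:]  # kind is "some" or "full"
        stats = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            # "total" is an integer number of microseconds; the averages are percentages.
            stats[key] = int(value) if key == "total" else float(value)
        result[kind] = stats
    return result


# Hypothetical sample contents, made up for illustration.
SAMPLE = (
    "some avg10=0.12 avg60=0.05 avg300=0.01 total=201\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)

if __name__ == "__main__":
    print(parse_pressure(SAMPLE))
```

To read a real cgroup, pass the file contents instead of the sample, for example the text read from /sys/fs/cgroup/cg1/cpu.pressure, assuming the cg1 cgroup from the command above exists.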
PSI and runc
The upcoming runc 1.2.0 release will support retrieving pressure stall information from container cgroups. The following command can be used to fetch this data:
runc --root <container_root> events --stats <container_id>
Here, <container_root> is the directory where the container state is stored. For example, with Docker it might be /var/run/docker/runtime-runc/moby/, while with containerd it might be /var/run/containerd/runc.
After running the command, the output will appear in the following JSON format:
{
  "type": "stats",
  "id": "9eef3a09b21e11a6c54823ecdbe7b71a204d439acfeb7392a97e60a4baf64a74",
  "data": {
    "cpu": {
      "usage": {
        ...
      },
      "throttling": {},
      "psi": {
        "some": {
          "avg10": 0,
          "avg60": 0,
          "avg300": 0,
          "total": 201
        },
        "full": {
          "avg10": 0,
          "avg60": 0,
          "avg300": 0,
          "total": 201
        }
      }
    },
    "cpuset": {
      ...
    },
    "memory": {
      "usage": {
        ...
      },
      "swap": {
        ...
      },
      "kernel": {
        ...
      },
      "kernelTCP": {
        ...
      },
      "raw": {
        ...
      },
      "psi": {
        "some": {
          "avg10": 0,
          "avg60": 0,
          "avg300": 0,
          "total": 0
        },
        "full": {
          "avg10": 0,
          "avg60": 0,
          "avg300": 0,
          "total": 0
        }
      }
    },
    "pids": {
      ...
    },
    "blkio": {
      "psi": {
        "some": {
          "avg10": 0,
          "avg60": 0,
          "avg300": 0,
          "total": 0
        },
        "full": {
          "avg10": 0,
          "avg60": 0,
          "avg300": 0,
          "total": 0
        }
      }
    },
    "hugetlb": {},
    "intel_rdt": {},
    "network_interfaces": null
  }
}
The psi objects under cpu, memory, and blkio hold the PSI metrics for CPU, memory, and I/O respectively.
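To illustrate how a monitoring script might consume this output, here is a minimal Python sketch that pulls the psi objects out of a stats event. It assumes the JSON has the shape shown above; the function name extract_psi, the helper _psi, and the trimmed-down sample event are hypothetical and exist only for this example.

```python
import json

# Sections of the stats event that carry a "psi" object in the output above.
RESOURCES = ("cpu", "memory", "blkio")


def extract_psi(event_json: str) -> dict:
    """Return {resource: psi_object} for each resource that reports PSI."""
    event = json.loads(event_json)
    data = event.get("data") or {}
    psi = {}
    for name in RESOURCES:
        section = data.get(name) or {}
        if "psi" in section:
            psi[name] = section["psi"]
    return psi


def _psi(total: int) -> dict:
    """Build a made-up psi object with zero averages (illustration only)."""
    window = {"avg10": 0, "avg60": 0, "avg300": 0, "total": total}
    return {"some": dict(window), "full": dict(window)}


# Trimmed-down sample event; the id and values are invented.
SAMPLE_EVENT = json.dumps({
    "type": "stats",
    "id": "example",
    "data": {
        "cpu": {"throttling": {}, "psi": _psi(201)},
        "memory": {"psi": _psi(0)},
        "blkio": {"psi": _psi(0)},
        "hugetlb": {},
    },
})

if __name__ == "__main__":
    for resource, psi in extract_psi(SAMPLE_EVENT).items():
        print(resource, "some.total =", psi["some"]["total"])
```

In practice the input string would be the standard output of the runc events --stats command rather than a hand-built sample.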
Prometheus support
Currently, cAdvisor is waiting for the official release of runc 1.2.0 before it can provide support. Related details can be found in this PR. Once support is complete, you can collect PSI data via cAdvisor integrated with Prometheus.
In addition, other tools are available, such as psi_exporter, developed by Cloudflare, and cgroups-exporter, created by Mosquito.
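To give a rough idea of what any such exporter has to produce, the Python sketch below renders per-container PSI totals in the Prometheus text exposition format. It is not how cAdvisor, psi_exporter, or cgroups-exporter are implemented: the metric name container_psi_stall_microseconds_total, the label names, and the function render_psi_metrics are all assumptions made for this example, and label values are assumed not to need escaping.

```python
# Hypothetical metric name chosen for this sketch; real exporters use their own names.
METRIC = "container_psi_stall_microseconds_total"


def render_psi_metrics(container_id: str, psi_by_resource: dict) -> str:
    """Render cumulative PSI stall times as Prometheus text exposition lines.

    psi_by_resource maps a resource name (e.g. "cpu") to a psi object with
    optional "some" and "full" entries, each containing a "total" field.
    """
    lines = [
        f"# HELP {METRIC} Cumulative PSI stall time in microseconds.",
        f"# TYPE {METRIC} counter",
    ]
    for resource in sorted(psi_by_resource):
        for kind in ("some", "full"):
            stats = psi_by_resource[resource].get(kind)
            if stats is None:
                continue  # this resource does not report the given kind
            labels = f'container="{container_id}",resource="{resource}",kind="{kind}"'
            lines.append(f"{METRIC}{{{labels}}} {stats['total']}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up input shaped like the psi objects in the runc output.
    sample = {"cpu": {"some": {"total": 201}, "full": {"total": 201}}}
    print(render_psi_metrics("example", sample), end="")
```

Only the cumulative total is exported here because Prometheus can derive rates over any window from a counter; the avg10, avg60, and avg300 fields could be exposed as gauges in the same way.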
Summary
Monitoring PSI data from containers is crucial for understanding and optimizing resource management in containerized environments. As tools like runc and cAdvisor add support, these metrics become easier to collect per container, enabling more efficient resource allocation and helping keep system performance high.