The size and complexity of supercomputer systems and their power and cooling facilities have continuously increased, posing additional challenges for long-term, stable operation. Supercomputers are shared computational resources and usually run different computational workloads at different locations (space) and at different times (time). A better understanding of a supercomputer system's heat generation and cooling behavior is therefore highly desirable on the facility operation side for decision making and optimization planning. In this work, we present a dimensionality reduction-based visual analytics method for time-series log data from a supercomputer system and its facility, which captures characteristic spatio-temporal features and behaviors during operation.
Funding: JSPS KAKENHI [20H04194, 21H04903]