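Background: To keep the large-scale GPU intelligent computing cluster (hereafter "the cluster") running stably, efficiently, and securely, we need to build a comprehensive operations system that is automated, visual, and intelligent.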
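Phase 1 goal: Achieve "visibility" and the ability to receive critical alerts, going from nothing to a working baseline: centralized monitoring of the performance metrics, log data, and alerts of the machines in the GPU cluster.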
Tentative project name: ARGUS (AI Reliable and GPU Unified Supervision). In Greek mythology, Argus is the hundred-eyed giant, a symbol of vigilance and guardianship.
Project documentation: [Tencent Docs] GPU Cluster Operations System
https://docs.qq.com/doc/DQUxDdmhIZ1dpeERk