Monday, May 21, 2018

AI for IT Infrastructure Management



AI is being used today for IT Infrastructure management. IT infrastructure generates lots of telemetry data from sensors & software that can be used to observe and automate. As IT infrastructure grows in size and complexity, standard monitoring tools does not work well. That's when we need AI tools to manage IT infrastructure.

Like in any classical AI system, IT infrastructure management systems also has 5 standard steps:

1. Observe: 
Typical IT systems collect billions of data sets from thousands of sensors, collecting data every 4-5 minutes. I/O pattern data is also collected in parallel and parsed for analysis. 

2. Learn:
Telemetry data from each device is modeled along with its global connections, and system learns each device & application  stable, active states, and learns unstable states. Abnormal behavior is identified by learning from I/O patterns & configurations of each device and application.

3. Predict: 
AI engines learn to predict an issue based on pattern-matching algorithms. Even application performance can be modeled and predicted based on historical workload patterns and configurations

4. Recommend: 
Based on predictive analytics, recommendations are be developed based on expert systems. Recommendations are based on what constitutes an ideal environment, or what is needed to improve the current condition

5. Automate: 
IT automation is done via Run Book Automation tools – which runs on behalf of IT Administrators, and all details of event & automation results are entered into an IT Ticketing system

No comments: