Adil Khan 10 months ago
AdiKhanOfficial #FYP Ideas

Hardware failure prediction in large datacenters

Project Title

Hardware failure prediction in large datacenters

Project Area of Specialization

Artificial Intelligence

Project Summary

The main idea behind this project is to predict trends in hard disk failures in data centers using machine learning. We propose a machine learning model that predicts failures in server hardware, making the data center network more reliable and fault tolerant. Our prediction algorithm will improve stability and predictability in large data centers. The aim is to use different algorithms and techniques that maximize the accuracy of the prediction.

Project Objectives

The major goal of the project is to predict hard disk failures in data centers using machine learning with acceptable accuracy, in order to increase the efficiency and decrease the downtime of data centers. The aim is to identify patterns of failure in a data center environment and to use different algorithms and techniques to predict them.

Project Implementation Method

Time-Series transformation: Transform the data into time series.

We transform the Backblaze hard drive dataset into time series so that each drive can be analyzed over the entire period it was operational within the dataset.
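
As a minimal sketch of this step, assume the daily snapshot rows have already been read into dictionaries. The column names `date`, `serial_number`, `failure` and `smart_5_raw` follow the published Backblaze CSV layout; the sample values and the function name `to_time_series` are hypothetical and only for illustration.

```python
from collections import defaultdict


def to_time_series(rows):
    """Group daily snapshot rows into one date-ordered series per drive."""
    series = defaultdict(list)
    for row in rows:
        series[row["serial_number"]].append(row)
    for records in series.values():
        # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain strings.
        records.sort(key=lambda r: r["date"])
    return dict(series)


# Hypothetical sample rows, deliberately out of order.
rows = [
    {"date": "2019-01-03", "serial_number": "A", "smart_5_raw": 8, "failure": 1},
    {"date": "2019-01-01", "serial_number": "B", "smart_5_raw": 0, "failure": 0},
    {"date": "2019-01-01", "serial_number": "A", "smart_5_raw": 0, "failure": 0},
    {"date": "2019-01-02", "serial_number": "A", "smart_5_raw": 3, "failure": 0},
]

per_drive = to_time_series(rows)
for serial, records in sorted(per_drive.items()):
    print(serial, [(r["date"], r["smart_5_raw"]) for r in records])
```

Each drive then carries its own ordered history, which is the form the later steps operate on.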

Change point detection: Detect change points in the time series to identify the subset of SMART parameters indicative of disk failure.

We select the SMART indicators that are indicative of disk failure. For this we will employ a Bayesian structural time series model for change point detection. The SMART indicators that exhibit such a change point before a disk is replaced are then selected as inputs to our predictive model.
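
To illustrate what a change point in a SMART indicator looks like, the sketch below uses a much simpler stand-in than the Bayesian structural time series model named above: a least-squares detector for a single shift in the mean. The function name, the `min_size` parameter and the sample values are assumptions made for this example only.

```python
def mean_shift_change_point(values, min_size=2):
    """Find the best single split of `values` into two constant-mean segments.

    Returns (index, gain): `index` is the first position of the second
    segment and `gain` is the reduction in squared error achieved by the
    split. `index` is None when no split reduces the error.
    """
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    total = sse(values)
    best_index, best_gain = None, 0.0
    for i in range(min_size, len(values) - min_size + 1):
        gain = total - sse(values[:i]) - sse(values[i:])
        if gain > best_gain:
            best_index, best_gain = i, gain
    return best_index, best_gain


# Hypothetical raw values of one SMART indicator for one drive.
indicator = [0, 0, 1, 0, 0, 10, 11, 10, 12]
index, gain = mean_shift_change_point(indicator)
print(index, round(gain, 2))
```

An indicator whose series shows a large gain shortly before replacement would be a candidate input for the predictive model; a flat indicator yields no split at all.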

Time-Series compression: Exponential smoothing for compact time series representation.

We transform the time series into a compact but highly informative representation. We use a window to split the raw data into segments, and we aggregate each segment to a single value using exponential smoothing over that time window.
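
The compression step can be sketched as follows. The window length, the smoothing factor `alpha` and the function names are assumptions chosen for illustration; the recurrence used is standard simple exponential smoothing, s_t = alpha * x_t + (1 - alpha) * s_(t-1), with the first value of each segment as the starting point.

```python
def exp_smooth(segment, alpha=0.5):
    """Reduce one segment to the final value of its exponentially smoothed series."""
    smoothed = segment[0]
    for x in segment[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed


def compress(values, window=7, alpha=0.5):
    """Split `values` into consecutive windows and keep one smoothed value per window."""
    return [
        exp_smooth(values[i:i + window], alpha)
        for i in range(0, len(values), window)
    ]


# Hypothetical daily readings; the last window is shorter than the others.
readings = [0, 0, 4, 8, 8, 8, 2]
print(compress(readings, window=3, alpha=0.5))
```

Because later readings receive more weight inside each window, a rise at the end of a window is reflected more strongly than one at its start.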

Failure backtracking: Mark the days before the actual failure.

To predict a failure beforehand, with sufficient time to take the necessary measures and carry out failure management tasks, we have to mark the dataset with indicators for the days that show symptoms of failure well before the actual failure. The number of days to be marked is determined by analyzing the change points in the time series data.
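
A minimal sketch of the backtracking step, assuming each drive's daily failure flags are already in date order. The `lookback` parameter stands for the number of days derived from the change point analysis; its value here and the function name are hypothetical.

```python
def backtrack_labels(failure_flags, lookback):
    """Mark the failure day and the `lookback` days before it as positive."""
    labels = [0] * len(failure_flags)
    for day, failed in enumerate(failure_flags):
        if failed:
            start = max(0, day - lookback)
            for j in range(start, day + 1):
                labels[j] = 1
    return labels


# One drive observed for six days that fails on the last day.
flags = [0, 0, 0, 0, 0, 1]
print(backtrack_labels(flags, lookback=2))
```

A drive that never fails keeps all-zero labels, so healthy drives pass through unchanged.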

Informed down-sampling: Informed down-sampling of the dataset via clustering to address the high class imbalance.

Since there are many more healthy drives than failed drives (a ratio of about 100:2 annually), the model must be very precise when making a positive prediction to provide actual value. Because only a small subset of the disks is replaced over time, our training data will exhibit a strong class imbalance. To address this, we will perform informed down-sampling of the healthy disks, such that we select only the most representative data points.
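
One possible reading of "informed down-sampling via clustering" is sketched below: cluster the healthy data points with k-means and keep only the real point closest to each cluster centre. The choice of k-means, the deterministic initialisation, the function names and the two-dimensional sample points are all assumptions for illustration, not the project's confirmed method.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic start (evenly spaced input points)."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[c]
            for c, cl in enumerate(clusters)
        ]
    return centroids


def downsample_healthy(points, k):
    """Keep one representative healthy point per cluster (the one nearest its centroid)."""
    return [min(points, key=lambda p: math.dist(p, c)) for c in kmeans(points, k)]


# Hypothetical feature vectors of healthy drives forming two groups.
healthy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(downsample_healthy(healthy, k=2))
```

In this form two centroids can share the same nearest point, so a real pipeline would likely deduplicate the result or keep several points per cluster.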

Classification: Build a predictive model to distinguish the healthy disks from those likely to fail.

Lastly, we fit a powerful non-linear classification model (a variant of decision trees or a neural network) to provide high-quality predictions for future (test/unseen) data.
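
To make the classification step concrete, the sketch below fits a decision stump, which is a decision tree with a single split. It is far weaker than the model the project intends to use and serves only to show the fit/predict pattern. The feature values, labels and function names are hypothetical.

```python
def fit_stump(X, y):
    """Find the (feature, threshold) pair with the fewest training errors.

    The rule is: predict 1 (likely to fail) when row[feature] > threshold.
    """
    best_feature, best_threshold, best_errors = None, None, len(y) + 1
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            errors = sum((row[f] > t) != bool(label) for row, label in zip(X, y))
            if errors < best_errors:
                best_feature, best_threshold, best_errors = f, t, errors
    return best_feature, best_threshold


def predict(stump, row):
    """Apply the learned threshold rule to one feature row."""
    feature, threshold = stump
    return int(row[feature] > threshold)


# Hypothetical rows: (smoothed reallocated-sector count, smoothed temperature).
X = [(0, 30), (1, 35), (0, 31), (20, 33), (40, 29)]
y = [0, 0, 0, 1, 1]

stump = fit_stump(X, y)
print(stump, predict(stump, (25, 30)), predict(stump, (0, 40)))
```

A full decision tree repeats this split search recursively on each side of the threshold, and ensembles of such trees are one common "variant of decision trees".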

Benefits of the Project

The project will deliver a predictive model for hard drive failures in data centers. It aims to minimize the effect of disk failures and to allow more efficient scheduled maintenance in place of inefficient reactive repair (repair after a disk fails or a fault is detected), decreasing unplanned maintenance downtime and unavailability. Predicting a potential failure will provide ample time to schedule maintenance and to manage resources efficiently by transferring the load from an unhealthy drive to healthy ones. Depending on the availability of data, the scope of the project may be expanded to other hardware components that affect the efficiency and availability of data centers.

Technical Details of Final Deliverable

The final deliverable will be a research report and a research paper prepared according to IEEE standards and guidelines. These two documents will contain all the results obtained from the model, along with comparisons of the results of different models. The final working model will be able to predict failures of hard drives in a data center with satisfactory accuracy. The model will also aim to minimize false positives, which can be costly: if a healthy drive is classified as unhealthy, it will be removed even though it is still in working order.
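
The cost of false positives described above can be tracked with two standard metrics, precision and false positive rate. The sketch below computes both from true and predicted labels; the label lists and function names are hypothetical and are not results from the project.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means 'will fail'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


def precision(y_true, y_pred):
    """Share of drives flagged as failing that really failed."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if tp + fp else 0.0


def false_positive_rate(y_true, y_pred):
    """Share of healthy drives that were wrongly flagged as failing."""
    _, fp, _, tn = confusion_counts(y_true, y_pred)
    return fp / (fp + tn) if fp + tn else 0.0


# Hypothetical labels for six drives.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(precision(y_true, y_pred), false_positive_rate(y_true, y_pred))
```

With heavily imbalanced data, plain accuracy can look high even when most flagged drives are healthy, which is why these two metrics are the more informative ones here.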

Final Deliverable of the Project

Software System

Type of Industry

IT

Technologies

Artificial Intelligence (AI), Others

Sustainable Development Goals

Decent Work and Economic Growth; Industry, Innovation and Infrastructure

Required Resources

Item Name           | Type          | No. of Units | Per Unit Cost (in Rs) | Total (in Rs)
Nvidia GTX 1070 Ti  | Equipment     | 1            | 70000                 | 70000
Liquid Cooler       | Miscellaneous | 1            | 10000                 | 10000
Total (in Rs)       |               |              |                       | 80000