Robust Data Extraction Model for real time Application
2025-06-28 16:34:51 - Adil Khan
Project Area of Specialization: Software Engineering
Project Summary
Several industries use log servers to maintain logs as part of their daily routine work. Internet Service Providers (ISPs) in particular save a huge flow of logs in files that hold data organized by date, time, and IP address. These IP addresses map to specific data in the database server. This project falls within the domain of big data.
The purpose is to design a system that deals with data in real time. The system has to take data from any source, in any format, and search, analyze, store, and visualize that data in real time. In the current scenario, a traditional log server takes a considerable amount of time to search for a specific log. The task is to reduce the search time and integrate the server with interactive APIs to automate the search process, and also to visualize data in real time to monitor performance and security.
Project Objectives
The objectives of this project are as follows:
- Make an API
- Reduce the search time to less than 5 seconds
- Visualize data in real time
- Increase performance
The project is implemented in the following steps:
1. Parse the information by reading the log files (stream reader).
2. Merge all the files into one file.
3. Apply column names and convert the merged file into CSV format.
4. Apply a sorting algorithm to the CSV file and save the sorted output to a separate file.
5. A trigger (a small piece of code) is central to this project; it continuously checks for changes in the system. In particular, it checks whether the updated CSV file and the sorted file differ from each other, and if they do, it updates the sorted file.
6. The API (a basic need) works like a waiter: it waits for a user request. When a user wants to search the data, the API passes the sorted file to the search routine according to the request and shows the result.
7. Visualization shows the real-time flow of data in the form of graphs and charts.
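The parsing, merging, sorting, trigger, and search steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the log layout (`date time ip message`, space-separated), the column names, and every function name are hypothetical. The trigger here compares a SHA-256 digest of the CSV content rather than comparing the two files directly, and the search uses binary search over the sorted rows, which the proposal does not specify.

```python
import bisect
import csv
import hashlib
import io

# Assumed log layout (not from the proposal): "date time ip message".
COLUMNS = ["date", "time", "ip", "message"]

def parse_lines(lines):
    """Split raw log lines into rows; skip lines that are blank or too short."""
    rows = []
    for line in lines:
        parts = line.strip().split(" ", 3)
        if len(parts) == 4:
            rows.append(parts)
    return rows

def merge_and_sort(sources):
    """Merge rows from several log sources and sort them by IP, date, time."""
    rows = [row for lines in sources for row in parse_lines(lines)]
    # IPs are compared as plain strings, which is enough for exact-match lookup.
    rows.sort(key=lambda r: (r[2], r[0], r[1]))
    return rows

def to_csv(rows):
    """Render rows as CSV text with a header line of column names."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buf.getvalue()

def digest(text):
    """Fingerprint of the CSV content, used by the trigger check below."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def needs_refresh(current_csv, last_digest):
    """Trigger check: True when the CSV differs from what was last sorted."""
    return digest(current_csv) != last_digest

def build_index(sorted_rows):
    """Extract the sorted IP column once so each search does not rebuild it."""
    return [r[2] for r in sorted_rows]

def search_ip(sorted_rows, keys, ip):
    """Binary search over the sorted IP column; returns all rows for that IP."""
    lo = bisect.bisect_left(keys, ip)
    hi = bisect.bisect_right(keys, ip)
    return sorted_rows[lo:hi]
```

Because the rows are kept sorted by IP, each lookup costs two binary searches instead of a scan over the whole file, which is the kind of saving the search-time objective depends on.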
Benefits of the Project
This project is important for industry, especially the telecommunication industry. A huge number of logs are generated every day, containing information such as IP addresses and other details. Whenever a specific log has to be found, traditional servers take a lot of time in the search process. Our project will therefore help to reduce the search time as well as visualize the data.
Technical Details of Final Deliverable
The final deliverable is an API-based system that helps to search and visualize log data in real time.
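As a rough idea of what the search side of such an API could look like, here is a minimal sketch using only Python's standard library. The `/search` endpoint, the `ip` query parameter, the `LOG_INDEX` structure, and the sample record are all assumptions made for illustration; the proposal does not define the API's routes or response format.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical in-memory index: IP address -> list of log records.
# The record below is made-up sample data, not real log output.
LOG_INDEX = {
    "10.0.0.7": [{"date": "2025-01-01", "time": "12:00:00", "message": "login ok"}],
}

def lookup(path, index):
    """Return (status, payload) for a request path such as /search?ip=10.0.0.7."""
    url = urlparse(path)
    if url.path != "/search":
        return 404, {"error": "unknown endpoint"}
    ips = parse_qs(url.query).get("ip")
    if not ips:
        return 400, {"error": "missing 'ip' parameter"}
    return 200, {"ip": ips[0], "records": index.get(ips[0], [])}

class SearchHandler(BaseHTTPRequestHandler):
    """Waits for GET requests and answers each one with a JSON document."""

    def do_GET(self):
        status, payload = lookup(self.path, LOG_INDEX)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Run the API on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), SearchHandler).serve_forever()
```

Calling `serve()` and then requesting `http://127.0.0.1:8000/search?ip=10.0.0.7` would return the records stored for that address. Keeping the lookup in a plain function, separate from the handler, lets the search logic be tested without starting a server.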
Final Deliverable of the Project: Software System
Core Industry: IT
Other Industries: Telecommunication
Core Technology: Big Data
Other Technologies: Others
Sustainable Development Goals: Industry, Innovation and Infrastructure
Required Resources

| Item Name | Type | No. of Units | Per Unit Cost (in Rs) | Total (in Rs) |
|---|---|---|---|---|
| Dell core i3 | Equipment | 1 | 60000 | 60000 |
| Total (in Rs) | | | | 60000 |