Evaluation of Data Lake and Apache Spark Technologies for Urban Infrastructure Planning and Management

Authors

  • G. Bektemisova International Information Technology University, Kazakhstan
  • S. Kalnazar International Information Technology University, Kazakhstan

DOI:

https://doi.org/10.51301/ce.2024.i1.05

Keywords:

urban infrastructure, Data Lake, Apache Spark, big data analytics, scalability, automation, data quality, smart city planning

Abstract

This study evaluates the use of Data Lake technology and Apache Spark in the context of urban infrastructure management. By analyzing their capabilities for handling structured, semi-structured, and unstructured datasets, the research highlights their potential to optimize data processing workflows. The system was deployed on Yandex Cloud, leveraging distributed computing and horizontal scalability to achieve efficient data storage, real-time analytics, and fault tolerance. Automation pipelines and quality assurance mechanisms were implemented to streamline data ingestion, transformation, and validation processes. The findings demonstrate significant improvements in data processing efficiency, scalability, and resource optimization, offering a robust framework for enhancing smart city infrastructure planning and evaluation.

Published

2024-03-31

How to Cite

Bektemisova, G., & Kalnazar, S. (2024). Evaluation of Data Lake and Apache Spark Technologies for Urban Infrastructure Planning and Management. Computing & Engineering, 2(1), 25–31. https://doi.org/10.51301/ce.2024.i1.05

Section

Automation, Robotics, and Intelligent Systems