Evaluation of Data Lake and Apache Spark Technologies for Urban Infrastructure Planning and Management
DOI:
https://doi.org/10.51301/ce.2024.i1.05
Keywords:
urban infrastructure, Data Lake, Apache Spark, big data analytics, scalability, automation, data quality, smart city planning
Abstract
This study evaluates the use of Data Lake technology and Apache Spark in the context of urban infrastructure management. By analyzing their capabilities for handling structured, semi-structured, and unstructured datasets, the research highlights their potential to optimize data processing workflows. The system was deployed on Yandex Cloud, leveraging distributed computing and horizontal scalability to achieve efficient data storage, real-time analytics, and fault tolerance. Automation pipelines and quality assurance mechanisms were implemented to streamline data ingestion, transformation, and validation processes. The findings demonstrate significant improvements in data processing efficiency, scalability, and resource optimization, offering a robust framework for enhancing smart city infrastructure planning and evaluation.
License
Copyright (c) 2024 Computing & Engineering

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
