Empirical Analysis of Programmable ETL Tools

Year : 2019

Publisher : Springer Verlag (service@springer.de)

Source Title : Communications in Computer and Information Science

Document Type :

Abstract

ETL (Extract, Transform, Load) is the widely used standard process for creating and maintaining a Data Warehouse (DW). ETL is the most resource-, cost- and time-demanding process in DW implementation and maintenance. Nowadays, many Graphical User Interface (GUI) based solutions are available to facilitate ETL processes. Despite the high popularity of GUI-based tools, this approach still has some downsides. This paper focuses on an alternative approach to ETL development: hand coding. In some contexts, it is appropriate to custom-develop ETL code, which can be cheaper, faster and more maintainable. Some well-known code-based open source ETL tools (Pygrametl, Petl, Scriptella, R_etl) developed by the academic world are studied in this article, and their architecture and implementation details are addressed. The aim of this paper is to present a comparative evaluation of these code-based ETL tools, not to claim that code-based ETL is superior to the GUI-based approach. The choice between the code-based and GUI-based approaches depends on the particular requirements, data strategy and infrastructure of an organization.