
Perfect-bound paperback
Modern extract, transform, and load (ETL) pipelines
for data engineering have favored the Python language for its broad
range of uses and a large assortment of tools, applications, and open
source components. With its simplicity and extensive library support,
Python has emerged as the undisputed choice for data processing.
In this book, you’ll walk through the end-to-end process of ETL data
pipeline development, starting with an introduction to the fundamentals
of data pipelines and establishing a Python development environment to
create pipelines. Once you’ve explored the ETL pipeline design
principles and the ETL development process, you’ll be equipped to design
custom ETL pipelines. Next, you’ll get to grips with the steps in the
ETL process, which involves extracting valuable data; transforming it
through cleaning, manipulation, and data integrity checks; and
ultimately loading the processed data into storage
systems. You’ll also review several ETL modules in Python, comparing
their pros and cons for building data pipelines, and use cloud
tools, such as AWS, to create scalable data pipelines. Lastly, you’ll
learn about the concept of test-driven development for ETL pipelines to
ensure safe deployments.
By the end of this book, you’ll have worked through several hands-on
examples of building high-performance ETL pipelines and developing
robust, scalable, and resilient environments using Python.
Category: Computers - Organization and Data Processing
Year: 2023
Edition: 1
Publisher: Packt Publishing
Language: English
Pages: 246