
Perfect-bound book, Paperback
Category: Business & Economics
Year: 2019
Language: English
Pages: 264 / 274
The book describes the emergence of big data technologies and the role
of Spark in the entire big data stack. It compares Spark and Hadoop and
identifies the shortcomings of Hadoop that have been overcome by Spark.
The book focuses mainly on the in-depth architecture of Spark and on
Spark RDDs: how the RDD complements the immutable nature of big data and
supports it with lazy evaluation, caching, and type inference. It also
addresses advanced topics in Spark, starting with the
basics of Scala and the core Spark framework, and exploring Spark
DataFrames, machine learning using MLlib, graph analytics using GraphX,
and real-time processing with Apache Kafka, AWS Kinesis, and Azure Event
Hubs. It then goes on to investigate Spark using PySpark and R. Focusing
on the current big data stack, the book examines the interaction with
current big data tools, with Spark being the core processing layer for
all types of data.
The book is intended for data engineers and
scientists working on massive datasets and big data technologies in the
cloud. In addition to industry professionals, it is helpful for aspiring
data processing professionals and students working in big data
processing and cloud computing environments.