
Book: Exploring Data Science with R and the Tidyverse: A Concise Introduction (perfect-bound paperback)
Category: Computers - Algorithms and Data Structures
Year: 2023
Language: English
Pages: 492
This book introduces the reader to data science using R and the
tidyverse. No prior knowledge of college-level programming or
mathematics (e.g., calculus or statistics) is required. The book is
self-contained, so readers can begin building data science workflows
immediately without consulting extensive external resources for
onboarding. The contents are aimed at undergraduate students but are
equally applicable to students at the graduate level and beyond. The
book develops concepts through many real-world examples to motivate the
reader.
Upon completion of the text, the reader will be able to:
Gain proficiency in R programming
Load and manipulate data frames, and "tidy" them using tidyverse tools
Conduct statistical analyses and draw meaningful inferences from them
Perform modeling from numerical and textual data
Generate data visualizations (numerical and spatial) using ggplot2 and understand what is being represented
An accompanying R package, "edsdata", contains the synthetic and real
datasets used in the textbook and is meant for further practice. An
exercise set is also available, designed to be compatible with
automated grading tools for instructor use.
As you develop familiarity with processing data, you learn to build
intuition about the data at hand by glancing at its values.
Unfortunately, there is only so much you can learn by glancing at
values, and the limitation becomes substantial when the data is large.
Visualization is a powerful tool in such cases. In this chapter we
introduce another key member of the tidyverse, the ggplot2 package, for
visualization. R provides many facilities for creating visualizations.
The most sophisticated of them, and perhaps the most elegant, is
ggplot2. In this section we show how to generate visualizations with
ggplot2.