Last updated 7/2020
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 567.32 MB | Duration: 5h 21m
Learn by doing - solve real-world data analysis problems using the most popular R packages
What you'll learn
Extract, transform, and load data from heterogeneous sources
Understand how R can be used to solve probability and statistics problems
Use simple R instructions to quickly organize and manipulate large datasets
Predict user purchase behavior by adopting a classification approach
Implement data mining techniques to discover items that are frequently purchased together
Group similar text documents by using various clustering methods
Requirements
You are expected to know the basics of R programming. You should have R installed on your system, and your system should be connected to the Internet. That's all, really!
Description
If you are looking for that one course that includes everything about data analysis with R, this is it. Let's get on this data analysis journey together.
This course is a blend of text, videos, code examples, and assessments, which together make your learning journey all the more exciting and truly rewarding. It includes sections that form a sequential flow of concepts, covering a focused learning path presented in a modular manner. This helps you learn a range of topics at your own speed and move towards your goal of solving data analysis problems with R.
The R language is a powerful open-source functional programming language. R is becoming the go-to tool for data scientists and analysts. Its growing popularity is due to its open-source nature and extensive development community. R is increasingly being used by experienced data science professionals instead of Python, and it will remain the top choice for data scientists in 2017. Big companies continue to use R for their data science needs, and this course will prepare you for when these opportunities come your way.
This course has been prepared using extensive research and curation skills. Each section adds to the skills learned and helps us to achieve mastery of data analysis. Every section is modular and can be used as a standalone resource.
This course has been designed to cover the topics a data scientist is likely to need, and it does so in a step-by-step and practical manner. It presents practical solutions to data analysis problems using R and also adds an introduction to machine learning.
We will start off by learning how to prepare, process, and perform sophisticated ETL on heterogeneous data sources with R packages. An example of data manipulation will be provided, illustrating how to use the "dplyr" and "data.table" packages to efficiently process larger data structures. We will then see how R can be applied to probability and statistics problems and look at R instructions to quickly organize and manipulate large datasets. Next, we will learn to predict user purchase behavior by adopting a classification approach and implement data mining techniques to discover items that are frequently purchased together. We will then offer insight into time series analysis on financial data. Finally, there will be detailed information on the hot topic of machine learning, including data classification, regression, clustering, association rule mining, and dimension reduction.
This course has been authored by some of the best in their fields:
Yu-Wei, Chiu (David Chiu)
Yu-Wei, Chiu (David Chiu) is the founder of LargitData, a start-up company that mainly focuses on providing big data and machine learning products. He specializes in using Spark and Hadoop to process big data and in applying data mining techniques for data analysis. Yu-Wei is also a professional lecturer who has delivered lectures on big data and machine learning in R and Python and has given tech talks at a variety of conferences.
Selva Prabhakaran
Selva Prabhakaran is a data scientist with a large e-commerce organization. In his 7 years of experience in data science, he has tackled complex real-world data science problems and delivered production-grade solutions for top multinational companies.
Tony Fischetti
Tony Fischetti is a data scientist at College Factual, where he gets to use R every day to build personalized rankings and recommender systems.
Viswa Viswanathan
Viswa Viswanathan is an associate professor of Computing and Decision Sciences at the Stillman School of Business at Seton Hall University. In addition to teaching at the university, Viswa has conducted training programs for industry professionals. He has written several peer-reviewed research publications in journals such as Operations Research, IEEE Software, Computers and Industrial Engineering, and International Journal of Artificial Intelligence in Education.
Shanthi Viswanathan
Shanthi Viswanathan is an experienced technologist who, as a consultant, has helped several large organizations, such as Canon, Cisco, Celgene, Amway, Time Warner Cable, and GE, among others, in areas such as data architecture and analytics, master data management, service-oriented architecture, business process management, and modeling.
Romeo Kienzler
Romeo Kienzler is the Chief Data Scientist of the IBM Watson IoT Division and works as an Advisory Architect, helping clients worldwide solve their data analysis problems. His current research focus is on cloud-scale data mining using open-source technologies including R, Apache Spark, SystemML, Apache Flink, and DeepLearning4J.
This course is a blend of text, videos, and assessments, all packaged together keeping your journey in mind. It combines some of the best that Packt has to offer in one complete package. It includes content from the following Packt products:
R for Data Science Cookbook by Yu-Wei, Chiu (David Chiu)
R for Data Science Solutions [video] by Yu-Wei, Chiu (David Chiu)
Mastering R Programming [video] by Selva Prabhakaran
Data Analysis with R by Tony Fischetti
R Data Analysis Cookbook by Viswa Viswanathan and Shanthi Viswanathan
Learning Data Mining with R [video] by Romeo Kienzler
Overview
Section 1: Data Extracting, Transforming, and Loading
Lecture 1 About the course
Lecture 2 Downloading open data
Lecture 3 Reading and writing CSV files
Lecture 4 Scanning text files
Lecture 5 Working with Excel files
Lecture 6 Reading data from databases
Lecture 7 Scraping web data
Lecture 8 Accessing Facebook data
Lecture 9 Working with Twitter
Section 2: Data Preprocessing and Preparation
Lecture 10 Renaming the data variable
Lecture 11 Converting data types
Lecture 12 Working with the date format
Lecture 13 Adding new records
Lecture 14 Filtering data
Lecture 15 Dropping data
Lecture 16 Merging and sorting data
Lecture 17 Reshaping data
Lecture 18 Detecting missing data
Lecture 19 Imputing missing data
Section 3: Data Manipulation
Lecture 20 Enhancing a data.frame with a data.table
Lecture 21 Managing data with a data.table
Lecture 22 Performing fast aggregation with a data.table
Lecture 23 Merging large datasets with a data.table
Lecture 24 Subsetting and slicing data with dplyr
Lecture 25 Sampling data with dplyr
Lecture 26 Selecting columns with dplyr
Lecture 27 Chaining operations in dplyr
Lecture 28 Arranging rows with dplyr
Lecture 29 Eliminating duplicated rows with dplyr
Lecture 30 Adding new columns with dplyr
Lecture 31 Summarizing data with dplyr
Lecture 32 Merging data with dplyr
Section 4: Simulation from Probability Distributions
Lecture 33 Generating random samples
Lecture 34 Understanding uniform distributions
Lecture 35 Generating binomial random variates
Lecture 36 Generating Poisson random variates
Lecture 37 Sampling from a normal distribution
Lecture 38 Sampling from a chi-squared distribution
Lecture 39 Understanding Student's t-distribution
Lecture 40 Sampling from a dataset
Lecture 41 Simulating the stochastic process
Section 5: Statistical Inference in R
Lecture 42 Getting confidence intervals
Lecture 43 Performing Z-tests
Lecture 44 Performing Student's t-tests
Lecture 45 Conducting exact binomial tests
Lecture 46 Performing Kolmogorov-Smirnov tests
Lecture 47 Working with Pearson's chi-squared tests
Lecture 48 Understanding the Wilcoxon Rank Sum and Signed Rank tests
Lecture 49 Conducting one-way ANOVA
Lecture 50 Performing two-way ANOVA
Section 6: Rule and Pattern Mining with R
Lecture 51 Transforming data into transactions
Lecture 52 Displaying transactions and associations
Lecture 53 Mining associations with the Apriori rule
Lecture 54 Pruning redundant rules
Lecture 55 Visualizing association rules
Lecture 56 Mining frequent itemsets with Eclat
Lecture 57 Creating transactions with temporal information
Lecture 58 Mining frequent sequential patterns with cSPADE
Section 7: Time Series Mining with R
Lecture 59 Creating time series data
Lecture 60 Plotting a time series object
Lecture 61 Decomposing a time series
Lecture 62 Smoothing a time series
Lecture 63 Forecasting a time series
Lecture 64 Selecting an ARIMA model
Lecture 65 Creating an ARIMA model
Lecture 66 Forecasting with an ARIMA model
Lecture 67 Predicting stock prices with an ARIMA model
Section 8: Text Analytics In-depth
Lecture 68 Scraping web pages and processing texts
Lecture 69 Corpus, TDM, TF-IDF, and word cloud
Lecture 70 Cosine similarity and Latent Semantic Analysis
Lecture 71 Extracting topics with Latent Dirichlet Allocation
Lecture 72 Sentiment scoring with tidytext and Syuzhet
Lecture 73 Classifying texts with RTextTools
Section 9: Sources of Data
Lecture 74 Relational databases
Lecture 75 Using JSON
Lecture 76 XML
Lecture 77 Other data formats
Lecture 78 Online repositories
Section 10: Let's Do A Project: Social Network Analysis
Lecture 79 Downloading social network data using public APIs
Lecture 80 Creating adjacency matrices and edge lists
Lecture 81 Plotting social network data
Lecture 82 Computing important network metrics
Section 11: Supervised Machine Learning
Lecture 83 Fitting a linear regression model with lm
Lecture 84 Summarizing linear model fits
Lecture 85 Using linear regression to predict unknown values
Lecture 86 Measuring the performance of the regression model
Lecture 87 Performing a multiple regression analysis
Lecture 88 Selecting the best-fitted regression model with stepwise regression
Lecture 89 Applying the Gaussian model for generalized linear regression
Lecture 90 Performing a logistic regression analysis
Lecture 91 Building a classification model with recursive partitioning trees
Lecture 92 Visualizing a recursive partitioning tree
Lecture 93 Measuring model performance with a confusion matrix
Lecture 94 Measuring prediction performance using ROCR
Section 12: Unsupervised Machine Learning
Lecture 95 Clustering data with hierarchical clustering
Lecture 96 Cutting tree into clusters
Lecture 97 Clustering data with the k-means method
Lecture 98 Clustering data with the density-based method
Lecture 99 Extracting silhouette information from clustering
Lecture 100 Comparing clustering methods
Lecture 101 Recognizing digits using the density-based clustering method
Lecture 102 Grouping similar text documents with k-means clustering methods
Lecture 103 Performing dimension reduction with Principal Component Analysis (PCA)
Lecture 104 Determining the number of principal components using a scree plot
Lecture 105 Determining the number of principal components using the Kaiser method
Lecture 106 Visualizing multivariate data using biplot
Section 13: Extra Goodies: Cognitive Computing and Artificial Intelligence
Lecture 107 Introduction to neural networks and deep learning
Lecture 108 Using the H2O deep learning framework
Lecture 109 Real-time cloud based IoT sensor data analysis
This course is useful whether you are a hobbyist, an analyst, an aspiring or professional data scientist, or even someone learning data analysis for the first time. Those who are already familiar with the basics of R but want to learn to efficiently analyze real-world data problems will also find this course a match for their needs.
Homepage
https://www.udemy.com/course/r-complete-data-analysis-solutions/