Analytics with R
Self-paced Online Course
About R
A programming language for statistical computing, R is one of the most widely used software environments for computational statistics, data science and visualisation. Millions of analysts and data scientists use R for problems ranging from quantitative finance and computational biology to market research and behavioural studies. Although R is free software, recent surveys show that its adoption is fast outpacing that of legacy, proprietary data analysis software, which continues to lag behind it in features and functionality.
Who is this course for?
This is a foundation course meant to introduce you to R, as well as to impart fundamental data analysis skills. If this is your first attempt to learn R, or you would like to apply the analytical techniques of R to your core area of work, this course is for you. It will give you the skills you need to analyse your own data with the most fundamental R techniques.
Why this course?
Compared to other programming languages, learning R presents a unique challenge. It is not sufficient to learn to code in R. It is more important to learn the theory behind the various techniques available in R, where and how to apply them, and how to interpret the results that they produce. This course builds a foundation for R by teaching:
• The theories and concepts behind the techniques.
• The way these techniques need to be applied in R.
• How to interpret and draw conclusions from the outcomes.
How is this course taught?
We believe that to be practical and useful, learning techniques need to closely follow the steps that are likely to be used while actually working. The course is therefore taught through a unique simulated interface, where you will be taught the details of R commands, how they are executed from the R user interface, and finally how to interpret the results that R produces. Keep R running in another window, follow the steps as the training demonstrates them, and in no time you will learn how to start using R on your own.
Coverage
Introduction to R
• Background and Resources
History behind R and online resources for R.
• Installing R
Installing R on Windows.
• R Console
R window to edit and execute R commands.
• Commands and Syntax
R Commands and R Syntax.
• Packages and Libraries
Install and load a package in R.
• Help in R
Getting help about R commands.
• Workspace in R
Saving and loading a workspace file in R.
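To give a flavour of the commands this module touches on, here is a minimal sketch. It is an illustration under assumptions, not course material: the package name `ggplot2` and the temporary file `ws_file` are hypothetical examples.

```r
# Load a package that ships with R but is not attached by default
library(tools)
ext <- file_ext("survey.csv")   # a function provided by the tools package
print(ext)

# Installing an add-on package needs an internet connection, so it is
# shown commented out; "ggplot2" is only an example name.
# install.packages("ggplot2")

# Getting help on a command (commented out because it opens a help viewer)
# help("mean")   # equivalent to ?mean

# Save an object to a workspace file, remove it, and load it back
x <- c(2, 4, 6)
ws_file <- tempfile(fileext = ".RData")   # hypothetical file location
save(x, file = ws_file)
rm(x)
load(ws_file)
print(x)
```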
Frequencies
• Frequencies
Frequencies, frequency table and their graphical presentation, relative frequency, frequency curve.
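As a rough illustration of the ideas in this module, the sketch below builds a frequency table and relative frequencies with base R. The `colours` data are invented for the example.

```r
# Invented sample data for illustration only
colours <- c("red", "blue", "red", "green", "blue", "red")

freq <- table(colours)         # frequency table (counts per category)
rel_freq <- prop.table(freq)   # relative frequencies (proportions)

print(freq)
print(rel_freq)

# Graphical presentation of the frequency table
barplot(freq, main = "Frequency of colours", ylab = "Count")
```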
Comparing Populations
• Test of Hypothesis
Concept of hypothesis testing. Null hypothesis and alternative hypothesis.
• Cross Tabulations
Contingency tables and their use. Chi-square test. Fisher’s exact test.
• One Sample t test
Concept, Assumptions, Hypothesis, Verification of assumptions, Performing the test and interpretation of results.
• Independent Samples t test
Concept, Type, Assumptions, Hypothesis, Verification of assumptions, Performing the test and interpretation of results.
• Paired Samples t test
Concept, Assumptions, Hypothesis, Verification of assumptions, Performing the test and interpretation of results.
• One way ANOVA
Concept, assumptions, hypothesis, verification of assumptions. Model fit, hypothesis testing. Post hoc tests: Fisher’s LSD, Tukey’s HSD.
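The tests listed in this module map onto a handful of base R functions. The following is a minimal sketch, assuming R's built-in `mtcars`, `sleep` and `PlantGrowth` datasets as stand-ins for your own data; the hypothesised mean of 20 is arbitrary.

```r
# One sample t test: is the mean fuel efficiency equal to 20 mpg?
one_sample <- t.test(mtcars$mpg, mu = 20)

# Independent samples t test: mpg by transmission type (am is 0 or 1)
two_sample <- t.test(mpg ~ am, data = mtcars)

# Paired samples t test: the same 10 subjects measured under two drugs
paired <- with(sleep, t.test(extra[group == 1], extra[group == 2],
                             paired = TRUE))

# Cross tabulation and Fisher's exact test on the resulting table
tab <- table(cylinders = mtcars$cyl, transmission = mtcars$am)
fisher <- fisher.test(tab)

# One way ANOVA followed by Tukey's HSD post hoc test
fit <- aov(weight ~ group, data = PlantGrowth)
print(summary(fit))
tukey <- TukeyHSD(fit)
print(tukey)
```

Each test object stores its p-value in `$p.value`, which is the number to compare against your chosen significance level.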
Data Structures
• Introduction to Data Structures
Why data structures. Types of data structures in R.
• Vectors
Types of Vectors and their creation procedures. Assigning created Vector to an object. Basic vector operations. Operations between vectors.
• Matrices
Creating a matrix. Extracting elements, rows or columns from a matrix. Combining two matrices. Basic matrix operations.
• Arrays
Creating an Array. Finding type and dimension of Array.
• Lists
Creating a List. Extracting a specific component from a list. Extracting a component from a sublist.
• Factors
Creating a factor. Unordered and ordered factors.
• Dataframes
Creating a Dataframe. Examining different parts of a dataframe. Editing and saving a dataframe.
• Importing and Exporting data
Import from and export to CSV, SPSS, text file and Excel.
• Data types
Numerical, nominal and ordinal data types. Modifying data types.
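For orientation, here is a short sketch that creates one of each data structure named in this module. All object names and values are invented for illustration, and only the CSV round trip is shown because SPSS and Excel files need add-on packages.

```r
v <- c(1.5, 2.5, 4)                          # numeric vector
m <- matrix(1:6, nrow = 2)                   # 2 x 3 matrix, filled by column
a <- array(1:24, dim = c(2, 3, 4))           # three-dimensional array
lst <- list(name = "R", inner = list(n = 3)) # list containing a sublist
f <- factor(c("low", "high", "low"),
            levels = c("low", "high"), ordered = TRUE)  # ordered factor
df <- data.frame(id = 1:3, score = v, level = f)        # data frame

second_row <- m[2, ]        # extract a row from a matrix
n_value <- lst$inner$n      # extract a component from a sublist
str(df)                     # examine the structure of the data frame

# Export the data frame to CSV and import it again
csv_file <- tempfile(fileext = ".csv")   # hypothetical file location
write.csv(df, csv_file, row.names = FALSE)
df2 <- read.csv(csv_file)
print(df2)
```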
Descriptive Statistics
• Measures of Central Tendency
Mean, Median and Mode.
• Measures of Positions
Quartiles, Deciles, Percentiles and Quantiles.
• Measures of Dispersion
Range, Median, Absolute deviation about median. Variance and Standard deviation.
• Measures of Distribution
Skewness and Kurtosis.
• Box and Whisker Plot
Box Plot and its parts, Using Box Plots to compare distribution.
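Most of the measures in this module are one-line calls in base R. The sketch below uses an invented sample `x`. Base R has no built-in mode or skewness function, so the two lines computing them are assumptions of this example (a table-based mode and a simple moment-based skewness, which is only one of several definitions in use).

```r
x <- c(2, 4, 4, 5, 7, 9, 11)   # invented sample

# Central tendency
x_mean <- mean(x)                                     # 6
x_median <- median(x)                                 # 5
x_mode <- as.numeric(names(which.max(table(x))))      # most frequent value

# Position
quartiles <- quantile(x, probs = c(0.25, 0.5, 0.75))

# Dispersion
x_range <- range(x)
x_var <- var(x)                  # sample variance, 10 for this sample
x_sd <- sd(x)
x_mad <- mad(x, constant = 1)    # median absolute deviation about the median

# Shape: moment-based skewness (an assumption of this sketch)
x_skew <- mean((x - x_mean)^3) / mean((x - x_mean)^2)^1.5

# Box and whisker plot
boxplot(x, horizontal = TRUE, main = "Box plot of x")
```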
Graphical Analysis
• Creating a Simple Graph
Using plot() command.
• Modifying the Points and Lines of a Graph
Using type, pch, font, cex, lty, lwd, col arguments in plot() command.
• Modifying Title and Subtitle of a Graph
Using main, sub, col.main, col.sub, cex.main, cex.sub, font.main, font.sub arguments in plot() command.
• Modifying Axes of a Graph
Using xlab, ylab, col.lab, cex.lab, font.lab, xlim, ylim, col.axis, cex.axis, font.axis arguments and axis() command.
• Adding Additional Elements to a Graph
Using points(), text(), abline(), curve() commands.
• Adding Legend on a Graph
Using legend() command.
• Special Graphs
Using pie(), barplot(), hist() commands.
• Multiple Plots
Using mfrow or mfcol arguments in par() command and layout() command.
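As an illustration of how the arguments listed in this module fit together, the sketch below draws two plots side by side with base R graphics. The data, colours and labels are arbitrary choices made for the example.

```r
x <- 1:10
y <- x^2

# Two plots side by side; keep the old settings so they can be restored
old_par <- par(mfrow = c(1, 2))

# Points and lines, title and subtitle, axis labels and limits
plot(x, y, type = "b", pch = 19, lty = 2, lwd = 2, col = "blue", cex = 1.2,
     main = "Squares", sub = "y = x^2", col.main = "darkred", font.main = 2,
     xlab = "x", ylab = "y", xlim = c(0, 10), ylim = c(0, 100))

# Additional elements on the existing plot
abline(h = 50, lty = 3)                 # horizontal reference line
points(5, 25, pch = 4, col = "red")     # highlight a single point
text(5, 35, "x = 5")                    # label near that point
legend("topleft", legend = "y = x^2", col = "blue", lty = 2, pch = 19)

# A special graph in the second panel
hist(y, main = "Histogram of y", xlab = "y")

par(old_par)   # restore the single-plot layout
```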
Relationship Between Variables
• Correlation
Concept, measures of correlation and corresponding tests: Pearson's r, Spearman's rho.
• Simple Linear Regression
Definition, assumptions, hypothesis. Model fit, verification of assumptions, hypothesis testing. Prediction.
• Multiple Linear Regression
Definition, assumptions, hypothesis testing. Model fit: manual and automatic.
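The following minimal sketch shows how these techniques might look in base R, assuming the built-in `cars` and `mtcars` datasets. Using `step()` for the automatic model fit is an assumption of this example; the module may use a different selection procedure.

```r
# Correlation tests between speed and stopping distance
pearson <- cor.test(cars$speed, cars$dist, method = "pearson")
spearman <- cor.test(cars$speed, cars$dist, method = "spearman",
                     exact = FALSE)   # exact p-values are unavailable with ties

# Simple linear regression and a prediction for a new speed value
fit <- lm(dist ~ speed, data = cars)
print(summary(fit))
pred <- predict(fit, newdata = data.frame(speed = 15))

# Multiple linear regression: manual choice of predictors
manual <- lm(mpg ~ wt + hp, data = mtcars)

# Automatic choice: stepwise selection by AIC from a larger model
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
auto <- step(full, trace = 0)
print(formula(auto))
```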
Time Series Analysis
• Time Series Analysis
Time series data and their graphical representation. Time index. Decomposition of time series data. Simple exponential smoothing, Holt’s linear trend model, Winters’ seasonal method.
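A rough sketch of these methods with base R's `HoltWinters()` function, assuming the built-in monthly `AirPassengers` series as example data. The multiplicative seasonal form is an arbitrary choice for this series, not something the course prescribes.

```r
ap <- AirPassengers            # monthly time series, 1949 to 1960
idx <- time(ap)                # time index of the series

# Decomposition into trend, seasonal and random components
parts <- decompose(ap, type = "multiplicative")

# Simple exponential smoothing: no trend, no seasonal component
ses <- HoltWinters(ap, beta = FALSE, gamma = FALSE)

# Holt's linear trend model: trend but no seasonal component
holt <- HoltWinters(ap, gamma = FALSE)

# Winters' seasonal method: level, trend and seasonal component
hw <- HoltWinters(ap, seasonal = "multiplicative")

# Forecast the next 12 months from the seasonal model
fc <- predict(hw, n.ahead = 12)
print(fc)
```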