In this course, you will learn analysis techniques that allow you to uncover statistical relationships and patterns in your data. It focuses on three classical methods of multivariate statistics: regression, cluster and factor analysis.
Linear regression analysis lets you model relationships and quantify the influence of various factors on a specific target variable. What influence does the weather have on my sales figures? Which distribution channels are most successful?
Cluster analysis can reveal hidden similarities between observations. The goal is to identify groups within the sample that are internally homogeneous and, at the same time, clearly distinguishable from the other groups. A classic application is customer segmentation.
Factor analysis is useful whenever information from several measured variables needs to be condensed. This form of statistical information compression is used in various fields: in psychology and the social sciences for measuring abstract constructs such as "civil courage", and in engineering for image or signal processing.
“Multivariate statistics with R” is a comprehensive, application-oriented introduction to these three methods, with a focus on applying them in R. It is aimed at participants who already have basic knowledge of R and statistics.
Table of contents:
- Cluster analysis
  - Basic concepts of cluster analysis
  - Similarity and distance measures
  - Comparison and application of different algorithms
- Regression analysis
  - Introduction to linear regression analysis
  - Model coefficients, significance tests, model quality
  - Graphical and statistical verification of model prerequisites
  - Consideration of non-linear effects and interaction effects
  - Automated modeling using stepwise regression
- Factor and principal component analysis (PCA)
  - Introduction to procedures
  - Process of factor analysis
  - Data inspection, determination of the number of factors, rotation
  - Factor loadings, communalities and reproduced variance
—
Recommended course length: Two days