Linear and Logistic Regression
Statistics is the basis for the reliable evaluation of data. However, real data is never perfect, so a suitable model must be found to describe it. The main objective of this competency is to understand how data can be analyzed in practice in Data Science using linear and logistic regression.
Linear methods for regression & classification: Students can model and test linear relationships between continuous and categorical variables using simple linear and logistic regression, for example by means of residual analysis. They are familiar with the least squares method and the maximum likelihood method.
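As an illustration of the least squares part of this objective, the following minimal sketch fits a simple linear regression and computes its residuals using only the Python standard library. The function name `fit_line` and the data values are made up for illustration and are not part of the course material.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and sum of cross-deviations of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Illustrative data only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)

# Residuals: observed value minus fitted value; with an intercept they sum to ~0
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(intercept, slope, residuals)
```

In a residual analysis, these residuals would typically be plotted against the fitted values to look for patterns that a straight line does not capture.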
Hypothesis testing: Students understand parametric and nonparametric hypothesis tests in regression problems, can apply them according to the distribution of the data, and can assess their reliability in terms of Type I and Type II errors (errors of the first and second kind).
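One possible nonparametric test in a regression setting is a permutation test for the slope. The sketch below is an assumption about what such a test could look like, not a method prescribed by the course: the names `slope` and `permutation_p_value`, the data, and the significance level of 0.05 are all chosen for illustration.

```python
import random

def slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for H0: no linear relationship."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        # Shuffling y breaks any association with x, as H0 assumes
        rng.shuffle(y_perm)
        if abs(slope(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive
    return (hits + 1) / (n_perm + 1)

# Illustrative data only
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1, 6.8, 8.2]

p = permutation_p_value(x, y)
alpha = 0.05  # accepted probability of a Type I error (assumed value)
reject_null = p < alpha
print(p, reject_null)
```

Here `alpha` bounds the Type I error rate. The Type II error rate depends on the true slope and the sample size and is not estimated by this sketch.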
Confidence intervals: Students know different distributions, especially the normal distribution and the t-distribution. They can determine confidence intervals and state the confidence level with which an interval covers a parameter. Using a t-test, they can determine whether the mean values of different samples differ significantly.
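The following sketch shows, under stated assumptions, how a t-based confidence interval for a mean and a two-sample t statistic might be computed. The helper names and the data are hypothetical. Because the standard library has no t-distribution, the critical value 2.262 (two-sided 95% level, 9 degrees of freedom) is taken from a t-table and passed in by hand. The two-sample statistic uses Welch's form, which does not assume equal variances.

```python
from math import sqrt
from statistics import mean, stdev, variance

def mean_confidence_interval(sample, t_crit):
    """Confidence interval for the mean: mean +/- t_crit * standard error."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))
    return m - t_crit * se, m + t_crit * se

def welch_t_statistic(a, b):
    """Welch's t statistic for the difference of two sample means."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Illustrative data only
sample_a = [4.8, 5.1, 5.0, 4.9, 5.2, 5.3, 4.7, 5.0, 5.1, 4.9]
sample_b = [5.4, 5.6, 5.5, 5.3, 5.7, 5.5, 5.4, 5.6, 5.5, 5.5]

# t_crit from a t-table: 95% two-sided, n - 1 = 9 degrees of freedom
low, high = mean_confidence_interval(sample_a, t_crit=2.262)
t_stat = welch_t_statistic(sample_a, sample_b)
print(low, high, t_stat)
```

To turn `t_stat` into a decision, it would be compared against a critical value of the t-distribution with the appropriate degrees of freedom, which a statistics library can supply.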
Statistics: Terms such as mean, standard deviation, and functions should already be known.
Exploratory Data Analysis, Data Wrangling, Probability Modelling, Foundation in Linear Algebra, Foundation in Calculus.