Machine learning algorithms are described as learning a target function (f) that best maps input variables (X) to an output variable (Y): Y = f(X)
Overview

Linear Regression
Algorithm
Y = a * X + b
- Y – Dependent variable
- a – Slope
- X – Independent variable
- b – Intercept
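To make the slope and intercept concrete, here is a minimal sketch of how a and b can be estimated by ordinary least squares for a single input variable. The function name `fit_line` and the sample numbers are made up for illustration; they are not from a real dataset.

```python
def fit_line(xs, ys):
    """Estimate slope a and intercept b of Y = a * X + b by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    # Intercept: the fitted line passes through the point of means
    b = mean_y - a * mean_x
    return a, b


# Hypothetical points that lie exactly on Y = 2 * X + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
a, b = fit_line(xs, ys)
print(f"slope a = {a:.2f}, intercept b = {b:.2f}")
```

Because the sample points sit exactly on a line, the fit recovers a slope of 2 and an intercept of 1. With real, noisy data the line would only approximate the points.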
Type
Supervised learning – regression.
Example
Predict weight based on height.
When To Use
To predict a numeric value from training data.
To estimate a value based on a continuous (numeric) variable.
Its main strength is speed rather than accuracy, so it suits cases where a quick estimate matters more than a precise one.
Limitations
- Only applies to linear relationships, e.g. the relationship between income and age is curved, so a straight line fits it poorly
- Sensitive to outliers
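The sensitivity to outliers can be illustrated with a small sketch. It fits a least-squares line twice, once on clean data and once with a single value replaced by an outlier. The helper `fit_line` and all numbers are hypothetical and chosen only to make the effect easy to see.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data lying exactly on Y = 2 * X
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
clean_ys = [2.0, 4.0, 6.0, 8.0, 10.0]
# Same data, but the last value is replaced by an outlier
outlier_ys = [2.0, 4.0, 6.0, 8.0, 30.0]

clean_slope, clean_intercept = fit_line(xs, clean_ys)
outlier_slope, outlier_intercept = fit_line(xs, outlier_ys)

print(f"clean:   slope = {clean_slope:.1f}, intercept = {clean_intercept:.1f}")
print(f"outlier: slope = {outlier_slope:.1f}, intercept = {outlier_intercept:.1f}")
```

In this toy case a single outlier moves the slope from 2 to 6 and the intercept from 0 to -8, because squared errors give large deviations a disproportionate weight.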
Business Use Cases
- Predicting sales
- Analysing influence of marketing, pricing and promotions on sales
- Assessing risk
R Code
# Load train and test datasets
# Identify feature and response variable(s); values must be numeric
x_train <- input_variables_values_training_datasets
y_train <- target_variables_values_training_datasets
x_test <- input_variables_values_test_datasets
x <- cbind(x_train, y_train)

# Train the model using the training sets and check score
linear <- lm(y_train ~ ., data = x)
summary(linear)

# Predict output (x_test needs the same column names as x_train)
predicted <- predict(linear, x_test)
Python Code
# Import library
# Import other necessary libraries like pandas, numpy...
from sklearn import linear_model

# Load train and test datasets
# Identify feature and response variable(s); values must be numeric NumPy arrays
x_train = input_variables_values_training_datasets
y_train = target_variables_values_training_datasets
x_test = input_variables_values_test_datasets

# Create linear regression object
linear = linear_model.LinearRegression()

# Train the model using the training sets and check score
linear.fit(x_train, y_train)
linear.score(x_train, y_train)

# Equation coefficient and intercept
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

# Predict output
predicted = linear.predict(x_test)
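For a regression model, scikit-learn's `score` method returns the coefficient of determination (R²). A minimal sketch of that calculation in plain Python follows. The helper name `r_squared` and the numbers are made up for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left over after the model's predictions
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical actual values and model predictions
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]
score = r_squared(actual, predicted)
print(f"R^2 = {score:.2f}")
```

A value of 1.0 means the predictions match the actual values exactly. Scoring on held-out test data rather than the training data gives a more honest picture of how the model will perform on new inputs.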
Further Reading
- Linear Regression and Correlation: A Beginner’s Guide – Scott Hartshorn
- Machine Learning For Hackers – Conway
- Data Jujitsu – The Art of Turning Data Into Product – Patil
- Numsense! Data Science For The Layman. No Math Added – Annalyn Ng
Notes on other algorithms coming soon…
- k-Nearest Neighbour (kNN)
- k-Means Clustering
- Logistic Regression
- Naive Bayes
- Decision Trees
- Random Forest
- Neural Network
- Support Vector Machine
- Principal Component Analysis
Thanks to this article on the Analytics Vidhya site for getting me started on understanding these algorithms, and to the following for helping me build on that knowledge:
- 10 Machine Learning Algorithms Every Data Scientist Should Know
- A Tour of the Top 10 Algorithms for Machine Learning Newbies
- A Beginner’s Guide to Neural Networks – Part 1
- Comparing supervised learning algorithms
- Limitations of Linear Regression
- Popular Applications of Linear Regression
- 5 Applications of Regression Analysis in Business
- Machine Learning Algorithms and Business Use Cases
- Quick Guide to Boosting Algorithms
- Principal Component Analysis
