Machine learning algorithms are described as learning a target function (f) that best maps input variables (X) to an output variable (Y): Y = f(X)

Overview


Linear Regression


Algorithm

Y = a * X + b

Y – Dependent Variable
a – Slope
X – Independent variable
b – Intercept
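
The slope and intercept are usually estimated by ordinary least squares. The following is a minimal sketch of that calculation, assuming NumPy is available; the function name `fit_line` and the sample numbers are made up for illustration and are not from the original code source.

```python
import numpy as np

def fit_line(x, y):
    """Return (a, b) minimising the squared error of y ≈ a * x + b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x
    a = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    b = y_mean - a * x_mean
    return a, b

# Illustrative data lying exactly on Y = 2 * X + 1
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

Because the sample points lie exactly on a line, the fit recovers the slope and intercept used to generate them.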

Type

Supervised learning – regression.

Example

Predict weight based on height.
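
As a rough sketch of this example, the snippet below fits a line to a handful of height and weight pairs with scikit-learn. The measurements are hypothetical values invented for illustration, not real data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical measurements, invented for illustration only
heights_cm = np.array([[150], [160], [170], [180], [190]])  # one row per person
weights_kg = np.array([52, 58, 66, 74, 82])

model = LinearRegression().fit(heights_cm, weights_kg)

# Predict the weight of someone 175 cm tall
predicted_weight = model.predict(np.array([[175]]))[0]
print(round(model.coef_[0], 2), round(predicted_weight, 1))
```

The fitted slope reads as the expected change in weight (kg) for each extra centimetre of height in this toy dataset.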

When To Use

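To predict a numeric value from training data.

To estimate a value based on a continuous (numeric) variable.

Its main strength is speed and simplicity rather than accuracy: it is quick to train and easy to interpret, but more flexible models often predict better.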

Limitations

Only captures linear relationships; for example, the relationship between income and age is curved, so a straight line fits it poorly
Sensitive to outliers
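
Both limitations can be seen on small synthetic datasets. The sketch below assumes NumPy and uses `np.polyfit` with degree 1 as the straight-line fit; all numbers are made up for illustration.

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 3.0 * x + 2.0  # points lying exactly on a line

# Fit a degree-1 polynomial (a straight line); polyfit returns [slope, intercept]
slope_clean, intercept_clean = np.polyfit(x, y, 1)

# Corrupt a single observation and refit: one outlier drags the whole line
y_outlier = y.copy()
y_outlier[-1] += 100.0
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, 1)
print(slope_clean, slope_outlier)

# A curved (quadratic) relationship on a symmetric range: the best straight
# line is nearly flat, so it misses the pattern entirely
x_sym = np.linspace(-3, 3, 21)
slope_curve, intercept_curve = np.polyfit(x_sym, x_sym ** 2, 1)
print(slope_curve)
```

Plotting the data and the residuals before fitting is a simple way to spot either problem.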

Business Use Cases

Predicting sales
Analysing influence of marketing, pricing and promotions on sales
Assessing risk
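
To illustrate the second use case, a regression with several inputs gives one coefficient per driver, which can be read as that driver's estimated influence on sales. This is a sketch on synthetic data: the variable names, the "true" coefficients and the noise level are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200

# Synthetic, made-up drivers of sales
marketing_spend = rng.uniform(0, 100, n)
price = rng.uniform(5, 15, n)
promotion = rng.integers(0, 2, n)  # 1 if a promotion ran, else 0

# Assumed relationship, used only to generate the example data
sales = (50 + 2.0 * marketing_spend - 8.0 * price
         + 30.0 * promotion + rng.normal(0, 5, n))

X = np.column_stack([marketing_spend, price, promotion])
model = LinearRegression().fit(X, sales)

# Each coefficient estimates the change in sales per unit change in that input
for name, coef in zip(["marketing_spend", "price", "promotion"], model.coef_):
    print(f"{name}: {coef:.2f}")
```

On real data the coefficients describe association rather than proven cause, so they are best treated as a starting point for analysis.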

R Code

# Load the training and test datasets
# Identify the feature and response variable(s); values must be numeric
x_train <- input_variables_values_training_datasets
y_train <- target_variables_values_training_datasets
x_test <- input_variables_values_test_datasets
# Combine the features and the response into one data frame for lm()
x <- data.frame(x_train, y_train = y_train)
# Train the model on the training set and check the fit
linear <- lm(y_train ~ ., data = x)
summary(linear)
# Predict output; x_test must be a data frame with the same column names as x_train
predicted <- predict(linear, newdata = x_test)


Python Code

# Import libraries
# Import other necessary libraries such as pandas and numpy as needed
from sklearn import linear_model
# Load the training and test datasets
# Identify the feature and response variable(s); values must be numeric NumPy arrays
x_train = input_variables_values_training_datasets
y_train = target_variables_values_training_datasets
x_test = input_variables_values_test_datasets
# Create the linear regression object
linear = linear_model.LinearRegression()
# Train the model on the training set and check the score (R squared)
linear.fit(x_train, y_train)
print('R squared: \n', linear.score(x_train, y_train))
# Equation coefficient and intercept
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)
# Predict output
predicted = linear.predict(x_test)
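
The template above uses placeholder names for the datasets. For a version that runs as is, here is a sketch that generates synthetic data, holds out a test set with scikit-learn's `train_test_split`, and scores the model on data it has not seen. The data-generating line and the split size are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: a noisy line with slope 4 and intercept 3 (assumed values)
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 4.0 * X[:, 0] + 3.0 + rng.normal(0, 1.0, size=100)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

linear = LinearRegression().fit(X_train, y_train)

# R squared on the held-out rows is a fairer check than the training score
test_r2 = linear.score(X_test, y_test)
print(linear.coef_[0], linear.intercept_, test_r2)
```

Scoring on held-out data guards against an over-optimistic view of how well the model will predict new cases.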


Notes on other algorithms coming soon…

k-Nearest Neighbour (kNN)
k-Means Clustering
Logistic Regression
Naive Bayes
Decision Trees
Random Forest
Neural Network
Support Vector Machine
Principal Component Analysis

Thanks to this article on the Analytics Vidhya site for getting me started on understanding these algorithms, and to the following for helping me build on that knowledge:

Fox Visual Insights

Learning data analytics and visualisation
