
Index


Supervised Learning


What is Supervised Learning?

Supervised learning (SL) is a machine learning paradigm for problems where the available data consists of labelled examples, meaning that each data point contains features (covariates) and an associated label.

The goal of supervised learning algorithms is to learn a function that maps feature vectors (inputs) to labels (outputs), based on example input-output pairs. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal).

A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.

An optimal scenario will allow for the algorithm to correctly determine the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias). This statistical quality of an algorithm is measured through the so-called generalization error.


Train-test Split of data
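The split itself can be sketched in plain Python. The 80/20 ratio, the fixed shuffle seed, and the toy data below are illustrative assumptions, not part of the notes above:

```python
import random

def train_test_split(data, test_ratio=0.2, seed=0):
    """Shuffle the data and split it into train and test sets."""
    rng = random.Random(seed)
    shuffled = data[:]           # copy, so the original order is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]   # train, test

data = [(x, 2 * x + 1) for x in range(10)]  # toy (feature, label) pairs
train, test = train_test_split(data)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting matters when the data are ordered (e.g. sorted by label); otherwise the test set would not be representative of the training set.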

Classification and Regression

Regression


Fitting the Data


Let's try fitting the data with polynomial regression. We'll use the MATLAB polyfit function to get the coefficients.

We will see later how this relates to our linear regression procedure.

The fit equations are:

$$\large{\color{Purple} \begin{cases} linear & y = w_1x+w_0 \\ quadratic & y = w_2x^2+w_1x+w_0 \\ cubic & y = w_3x^3+w_2x^2+w_1x+w_0 \\ \end{cases} } $$
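A polynomial fit like MATLAB's polyfit can be sketched in plain Python via the normal equations. The data below are made up so that the quadratic fit recovers the true coefficients; note this sketch returns coefficients lowest power first ($w_0, w_1, \dots$), matching the equations above, whereas MATLAB's polyfit returns them highest power first:

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations,
    solved with plain Gaussian elimination.

    Returns [w_0, w_1, ..., w_degree], lowest power first.
    """
    n = degree + 1
    # Normal equations A w = b with A[j][k] = sum_i x_i^(j+k),
    # b[j] = sum_i y_i * x_i^j.
    A = [[sum(x ** (j + k) for x in xs) for k in range(n)] for j in range(n)]
    b = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for row in range(n - 1, -1, -1):
        w[row] = (b[row] - sum(A[row][k] * w[k] for k in range(row + 1, n))) / A[row][row]
    return w

# Toy data generated exactly from y = 3x^2 + 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3 * x ** 2 + 2 * x + 1 for x in xs]
w = polyfit(xs, ys, degree=2)  # [w0, w1, w2] ≈ [1, 2, 3]
```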

Example:

  • Supervised learning is where you have an input variable ($x$) and an output variable ($y$), and you use a mapping function from the input to the output.
  • It is called supervised because the process of an algorithm training from the dataset can be thought of as a teacher supervising the learning process.

Data point


We start with the case where there is a single input and a single output.

  • $\large{\color{Purple} (x^{(i)}, y^{(i)})} : i^{th} \textit{ example of (input, output) set.}$
  • $\large{\color{Purple}m}$ : Number of examples or data points.
  • We assume that there are 51 pairs of data points.

Hypothesis

For the general univariate linear regression problem, we now introduce our model hypothesis, the linear model:

$$ \large{\color{Purple} \hat{y}= h(x)=w_{0}+ w_{1}x} $$

  • $\large{\color{Purple} w_0, w_1}$ are parameters, and there are infinitely many possibilities. Which do we choose?
  • For this we define a cost function $\large{\color{Purple} J=\frac{1}{2m} \sum_{i}(y^{(i)}- \hat{y}^{(i)})^2}$
    • Notice that no line is going to fit all of the data perfectly, so the difference between an observed point $\large{\color{Purple}y}$ and the corresponding point on the regression line $\large{\color{Purple}\hat{y}}$ is the loss.
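Evaluating this cost for any candidate pair $(w_0, w_1)$ is a direct translation of the formula; the toy data below are an assumption for illustration:

```python
def cost(w0, w1, xs, ys):
    """Least-mean-squares cost J = (1/2m) * sum_i (y_i - yhat_i)^2
    for the hypothesis yhat = w0 + w1*x."""
    m = len(xs)
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]           # exactly y = 2x
print(cost(0.0, 2.0, xs, ys))  # perfect fit: J = 0.0
print(cost(0.0, 1.0, xs, ys))  # worse parameters give a larger J
```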

How do we find the optimal $\large{\color{Purple} w}$?

  • So we say that the optimal $\large{\color{Purple} w}$ (notice that this has now become an optimization problem) is the one which minimizes this cost function.
  • The $\large{\color{Purple} w}$ that we get at the end of the process is called the least-squares coefficient.
  • This fit is called the least-squares fit, and the cost function is called least mean squares (LMS).
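For the univariate case, minimizing $J$ has a well-known closed form: $w_1 = \sum_i (x_i-\bar{x})(y_i-\bar{y}) / \sum_i (x_i-\bar{x})^2$ and $w_0 = \bar{y} - w_1\bar{x}$. A minimal sketch, with made-up noisy data:

```python
def least_squares(xs, ys):
    """Closed-form minimizer of J for the model yhat = w0 + w1*x."""
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    w1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
         / sum((x - x_bar) ** 2 for x in xs)
    w0 = y_bar - w1 * x_bar
    return w0, w1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.1, 6.9]      # roughly y = 2x + 1 with small noise
w0, w1 = least_squares(xs, ys) # w0 ≈ 1.06, w1 ≈ 1.96
```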

Measuring the Fit


Mean Square Error:

$$\large{\color{Purple} J=\frac{1}{2m} \sum_{i}(y^{(i)}- \hat{y}^{(i)})^2}$$

So this is one measure of how good the fit is. However, it is sometimes not enough on its own: we may get a large value of $J$ without knowing whether the fit is good or not.

Variance

$$\large{\color{Purple}\sum^{m}_{i=1} (y_i-\bar{y})^2} {\color{Cyan}\textrm{(SST) Sum Square Total}}$$

What does total variance mean? Before we even had a model there was some amount of variation in the data, and this term actually calculates the total amount of variance in the data before we even had a model.

Amount of variance present in the data.

Error

$$\large{\color{Purple}\sum^{m}_{i=1} (y_i-\hat{y_i})^2} {\color{Cyan}\textrm{(SSE) Sum Square Error}}$$

Here $y_i$ is the ground truth and $\hat{y}_i$ is the prediction (the hypothesis or model output).

Variance in Prediction

$$\large{\color{Purple}\sum^{m}_{i=1} (\hat{y_i} - \bar{y})^2} {\color{Cyan}\textrm{(SSR) Sum Square Regression}}$$

Amount of variance captured by the model.

$\large R^2$ Error

$$\large{\color{Purple}\mathbf{R^2} = \frac{\textrm{SSR}}{\textrm{SST}} = \frac{\textit{Amount of variance captured by the Model}}{\textit{Amount of variance present in Data}} }$$

Typically we would like a single number that lies between 0 and 1, so we normalize: we non-dimensionalize SSR with respect to the denominator SST, so that the result falls between 0 and 1.

$$\large{\color{Purple} \mathbf{R^2 \in \Bigl[0, 1 \Bigl] } } \normalsize{\color{Cyan} \begin{cases} 0 &= Very\ Bad\ fit \\ 1 &= Very\ Good \ fit \end{cases}} $$

Conclusion

$$ \large{\color{Purple} \begin{align*} \because & \textbf{SST} = \textbf{SSE + SSR} \\ \Rightarrow & \textbf{SSR} = \textbf{SST-SSE} \\ \Rightarrow & \mathbf{R^2} = \mathbf{1- \frac{SSE}{SST}} \end{align*} } $$
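All of these quantities can be checked numerically. The sketch below fits the line by the closed-form least-squares formulas and computes SST, SSE, SSR, and $R^2$; the data are made up for illustration. For a least-squares fit with an intercept, the identity SST = SSE + SSR holds exactly:

```python
def r_squared(xs, ys):
    """Fit yhat = w0 + w1*x by least squares, then compute
    SST, SSE, SSR, and R^2 = 1 - SSE/SST."""
    m = len(xs)
    x_bar, y_bar = sum(xs) / m, sum(ys) / m
    w1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
         / sum((x - x_bar) ** 2 for x in xs)
    w0 = y_bar - w1 * x_bar
    yhat = [w0 + w1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)              # variance in the data
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))  # unexplained error
    ssr = sum((yh - y_bar) ** 2 for yh in yhat)          # variance captured by model
    return sst, sse, ssr, 1 - sse / sst

sst, sse, ssr, r2 = r_squared([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.1, 6.9])
# sst == sse + ssr (up to floating-point error), and r2 is close to 1
# because the data are nearly linear.
```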

Supervised Learning Categories - Based on Types


Machine Learning Categories - Based on Outlier Handling
