What is Supervised Learning?

If you are learning about Machine Learning, you have probably come across terms like “Supervised Learning”, “Unsupervised Learning”, and “Reinforcement Learning”. In this article, we are going to explore Supervised Learning. It is a broad concept with a lot to dive into, so this tutorial focuses on helping you understand what Supervised Learning is and giving you a starting point for exploring it further.

What is Machine Learning?

First of all, let’s quickly review what Machine Learning is, since it is the umbrella term here. In machine learning, the focus is on enabling machines to learn from data and make predictions or decisions without being explicitly programmed to do so.

There are many definitions of Machine Learning. A well-known one comes from Tom Mitchell: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

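This definition may seem difficult at first, but it is simple once you break it down. There is some task T that the computer performs; if its performance on that task, measured by P, improves as it gains experience E, the computer is said to learn. For a spam filter, for example, T is classifying emails as spam or not spam, P is the fraction of emails classified correctly, and E is a collection of emails that have already been labeled.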
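To make the roles of T, P, and E concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the task (labeling a number as "small" or "big"), the invented data, the helper names `predict` and `accuracy`, and the choice of a simple nearest-example rule as the learner.

```python
# Task T: label a number as "small" (< 5) or "big" (>= 5).
# Experience E: the labeled examples the program has seen so far.
# Performance P: accuracy on a fixed set of test numbers.

def predict(examples, x):
    """Return the label of the seen example whose number is closest to x."""
    _, nearest_label = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest_label

def accuracy(examples, test_set):
    """Fraction of test numbers that receive the correct label."""
    correct = sum(predict(examples, x) == label for x, label in test_set)
    return correct / len(test_set)

# Hypothetical labeled examples, invented for this sketch.
experience = [(0, "small"), (7, "big"), (4, "small"), (5, "big")]
test_set = [(x, "big" if x >= 5 else "small") for x in range(10)]

# Measure P after seeing the first 1, 2, 3, and 4 examples.
scores = [accuracy(experience[:n], test_set) for n in range(1, len(experience) + 1)]
for n, p in enumerate(scores, start=1):
    print(f"E = {n} example(s): P = {p:.1f}")
```

With this particular data, the accuracy goes 0.5, 0.9, 0.9, 1.0 as more examples are seen. Performance P on task T improves with experience E, which is what the definition means by "learning".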

But right now, we are more interested in understanding Supervised learning, so let’s now focus on that.

What is Supervised learning?

To understand supervised learning, you first need to understand what the term “Supervised” means. It means being observed, watched, or guided by a supervisor.

In Supervised learning, machines are trained on labeled data, which simply means that every training example comes with a label (the expected output). For example, let’s say that we need to train a machine to tell whether a photo shows a cat or a dog. During training, we show it many photos of cats and dogs, each time saying “This is a cat” or “This is a dog”, so the machine receives both the data and the correct output.

In this setting, supervised learning algorithms aim to find a relation between the input variables (features) and the output variable (target), so that the model can predict the output for inputs it has not seen before. If this confuses you, don’t worry: it becomes easier to digest as you explore Supervised learning and its algorithms.
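The idea of learning a relation between features and a target can be sketched in a few lines of Python. This is only an illustration under loud assumptions: real image classifiers work on pixels, while here a single made-up feature (body weight in kg) stands in for the photo, and the names `rule` and `fit_threshold` are hypothetical helpers rather than part of any library.

```python
# Labeled training data: each item is (feature, label).
# The feature (body weight in kg) and every value are hypothetical.
training_data = [
    (3.5, "cat"), (4.0, "cat"), (4.5, "cat"), (5.5, "cat"),
    (9.0, "dog"), (14.0, "dog"), (22.0, "dog"), (30.0, "dog"),
]

def rule(weight, threshold):
    """Predict "dog" at or above the threshold weight, "cat" below it."""
    return "dog" if weight >= threshold else "cat"

def fit_threshold(data):
    """Try the midpoint between each pair of neighboring weights and keep
    the one that makes the fewest mistakes on the labeled data."""
    weights = sorted(w for w, _ in data)
    candidates = [(a + b) / 2 for a, b in zip(weights, weights[1:])]

    def mistakes(threshold):
        return sum(rule(w, threshold) != label for w, label in data)

    return min(candidates, key=mistakes)

threshold = fit_threshold(training_data)
print(threshold)               # 7.25
print(rule(4.2, threshold))    # cat
print(rule(18.0, threshold))   # dog
```

The labels are what make this "supervised": the threshold is chosen by comparing the rule's predictions against the known answers.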

Supervised Learning vs Unsupervised Learning

Supervised Learning and Unsupervised Learning are two of the most frequently heard terms in Machine Learning. While reading about Supervised learning, it helps to have a brief idea of Unsupervised learning, so that things are easier when you explore it later.

In supervised learning, machines are trained on labeled data, while in Unsupervised learning, machines are trained on unlabeled data, and the goal is to find hidden patterns in that data, such as groups of similar items. We will not go deeper into unsupervised learning right now, since it is a whole different universe.

How Does Supervised Learning Work?

Now let’s quickly understand how Supervised learning works. We will avoid jargon and keep the explanation simple. As an example, consider a small dataset in which the features describe weather conditions and the target is whether a person will play tennis or not (this is actually a famous dataset).

So, let’s say we have the following columns: Outlook, Temperature, Humidity, Wind, and “Will play Tennis” (Yes/No).

Here, the “Will play Tennis” column is the target variable, the thing that we need to predict, and the other four columns are the features. The model is trained on one part of the data (called the training data) and then tested on a separate part that it has not seen (called the test data), to check how well it performs.
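The train-then-test workflow described above can be sketched as follows. The rows follow the commonly reproduced version of this dataset, but treat the values as illustrative. The learner is a deliberately simple assumption (it remembers the most common answer for each Humidity value), the choice of Humidity is arbitrary, and `fit_one_feature_rule` and `predict` are hypothetical helper names.

```python
from collections import Counter, defaultdict

COLUMNS = ("Outlook", "Temperature", "Humidity", "Wind", "Will play Tennis")
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

def fit_one_feature_rule(rows, feature_index):
    """For each value of one feature, remember the most common target
    seen in training, plus an overall default for unseen values."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[feature_index]][row[-1]] += 1
    rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    default = Counter(row[-1] for row in rows).most_common(1)[0][0]
    return rule, default

def predict(model, row, feature_index):
    """Look up the remembered answer for this row's feature value."""
    rule, default = model
    return rule.get(row[feature_index], default)

# Fixed split: the first 10 rows are training data, the last 4 are test data.
train_rows, test_rows = ROWS[:10], ROWS[10:]
humidity = COLUMNS.index("Humidity")

model = fit_one_feature_rule(train_rows, humidity)
correct = sum(predict(model, row, humidity) == row[-1] for row in test_rows)
test_accuracy = correct / len(test_rows)

print(model[0])        # {'High': 'No', 'Normal': 'Yes'}
print(test_accuracy)   # 0.75
```

The model never sees the test rows during training, so the test accuracy is an honest (if very rough, with only four rows) estimate of how it behaves on new data.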

We will not go into deeper detail here, because that would require understanding specific supervised learning algorithms, which are out of scope for this article. You can read and explore more about them afterwards.

The basic thing to understand is that in supervised learning, the machine is trained on labeled data, which means we provide the data along with the expected output for training.

Types of Supervised Learning Problems

Now that we have a brief idea of what supervised learning is and how it works, let’s have a quick look at the types of Supervised learning problems. Supervised learning can be divided into two types of problems: Classification problems and Regression problems.

Classification Problems

Classification algorithms are used when the output variable (target variable) is categorical. In other words, classification algorithms are used when we need to classify data into discrete categories (like Yes / No, Male / Female, Spam / Not Spam, etc.).

Here are some use cases where you might need classification algorithms –

  • Spam Filtering
  • Sentiment Analysis
  • Image Classification
  • Fraud Detection

Classification problems can be binary, with exactly two classes (such as Spam / Not Spam), or multi-class, with more than two classes (such as recognizing which of several animals appears in an image).
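As a rough illustration of a multi-class classifier, here is a small K-Nearest Neighbors sketch in plain Python. The three classes, the two numeric features, and all of the data points are hypothetical, and `knn_predict` is a helper written for this example rather than a library function.

```python
from collections import Counter
from math import dist

# Hypothetical labeled points: (feature_1, feature_2) -> class label.
training_data = [
    ((1.0, 1.0), "cat"), ((1.5, 2.0), "cat"), ((2.0, 1.0), "cat"),
    ((6.0, 6.0), "dog"), ((7.0, 5.5), "dog"), ((6.5, 7.0), "dog"),
    ((1.0, 8.0), "bird"), ((2.0, 9.0), "bird"), ((1.5, 7.5), "bird"),
]

def knn_predict(data, point, k=3):
    """Let the k training points closest to `point` vote on its class."""
    neighbors = sorted(data, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

print(knn_predict(training_data, (1.2, 1.4)))  # cat
print(knn_predict(training_data, (6.4, 6.1)))  # dog
print(knn_predict(training_data, (1.4, 8.2)))  # bird
```

The same function handles a binary problem without changes: with only two distinct labels in the training data, the vote is simply between two classes.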

Regression Problems

Classification applies when the target variable is discrete; when the target variable is continuous, the problem is a Regression problem. For example, predicting a person’s salary or a stock price is a regression problem, because the output is a number rather than a category.

Here are some of the use cases, where you might need regression algorithms –

  • House Price Prediction
  • Stock Price Forecasting
  • Salary Prediction
  • Insurance Premium Estimation

In short, when the output variable is continuous, we say that the problem is a regression problem, and we use a regression algorithm to solve it.
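For a feel of what a regression algorithm does, here is a minimal sketch of simple linear regression (ordinary least squares with one feature) in plain Python. The salary figures are invented and deliberately lie on a perfect straight line so that the arithmetic is easy to check by hand; real data would be noisy. `fit_line` is a hypothetical helper name.

```python
# Hypothetical data: years of experience -> salary in thousands.
# The values follow salary = 30 + 5 * years exactly, to keep the math checkable.
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(years, salary)
predicted = slope * 6 + intercept  # predict the salary for 6 years of experience

print(slope, intercept, predicted)  # 5.0 30.0 60.0
```

Unlike a classifier, the model outputs a number on a continuous scale, so it can produce values (such as 60.0 here) that never appeared in the training data.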

Here are some of the algorithms for Classification and Regression problems –

Classification Algorithms      | Regression Algorithms
Logistic Regression            | Linear Regression
SVM (Support Vector Machines)  | Ridge Regression
Decision Trees                 | Lasso Regression
Random Forest                  | Polynomial Regression
K-Nearest Neighbors            | SVR (Support Vector Regression)
Naive Bayes                    | Decision Tree Regression

The names above are some of the algorithms used for classification and regression problems. There are many more, each suited to different kinds of problems. However, we are not diving into particular algorithms in this article, because that is out of scope.

Advantages of Supervised Learning

As you already know, in Supervised Learning we have labeled data on which we can train a model. The model is then checked for accuracy, and if it performs well, we can use it further.

Because the labels define exactly what the model should output, we have a clear idea of both the problem and the expected outcome. Along with that, many real-world problems can be solved using Supervised Learning algorithms.

So, here are some of the advantages of supervised learning.

  • Greatly beneficial for Classification and Regression problems.
  • The classes or output values are known in advance, so we know exactly what the model is expected to produce.
  • Can be applied to many real-world problems.

Disadvantages of Supervised Learning

Supervised learning is suitable for a particular group of problems, namely classification and regression problems. It also has some disadvantages that we need to be familiar with.

  • May not be suitable for some very complex tasks.
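  • The quality of the final model depends on the training data. If the data is scarce, noisy, or unrepresentative, the accuracy of the model suffers.
  • Supervised learning requires output labels, and labeling large amounts of data can be costly and time-consuming.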

Conclusion

Here, we have covered the basics of Supervised learning, one of the most important and fundamental concepts in Machine learning. There are two main kinds of problems, Classification and Regression, and several algorithms are available for each. While this tutorial focused on Supervised Learning as a concept, you can go on to explore the individual algorithms.

FAQs about Supervised Learning

Q: What is Machine Learning?

Ans: Machine learning is the technology that enables machines to learn from data and make predictions or decisions without being explicitly programmed.

Q: What is Supervised Learning?

Ans: Supervised learning is a type of Machine Learning in which machines are trained on labeled data (data together with the expected output).

Q: What are the types of Supervised Learning?

Ans: Supervised learning is mainly categorized into Classification and Regression problems.