This blog aims to explain common Machine Learning terminology in plain language, using a question-and-answer format. Each question builds on the one before it, so the answers are easiest to follow when read in order. It is written for beginners and assumes no technical background. Without further ado, let’s start.
What is Machine Learning?
Machine Learning is a subset of Artificial Intelligence (AI) that allows a machine to learn from data without being explicitly programmed. Here, the machine can be either software or hardware.
For Free, Demo classes Call: 8605110150
Registration Link:Click Here!
What does ‘without being programmed explicitly’ mean?
To understand this, we first need to understand what Traditional Programming is.
What is Traditional Programming?
In Traditional Programming, a person writes a computer program in a programming language such as C, C++, Java, or Python. The purpose of the program is to give the machine instructions for achieving a particular objective. The objective might be to develop an application or piece of software that performs a particular task, such as:
- Google Docs
- Weather forecast
- E-commerce website
- Trading System
- Bidding System
- Management Systems like
  - Logistic Management System
  - Admission Management System and many more…
Traditional Programming vs Machine Learning
The diagram below illustrates the difference. In Traditional Programming, we give the computer the data and the program (the rules that operate on the data), and it gives us the output. In Machine Learning, we instead give the computer the data and the output, and the computer figures out the program (the rules that operate on the data). This is what we mean by ‘without being programmed explicitly’.
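The same contrast can also be sketched in a few lines of Python. This is only a toy illustration: the function names and the tiny data set are made up for this example. In the first half the rule "multiply by 2" is written by hand; in the second half the computer is given only inputs and outputs and estimates the multiplier itself using a least-squares formula.

```python
# Traditional programming: a person writes the rule explicitly.
def double_by_rule(x):
    return 2 * x

# Machine learning: we only supply example inputs and their outputs
# (hypothetical toy data) and let the computer work out the rule.
inputs = [1, 2, 3, 4]
outputs = [2, 4, 6, 8]

def learn_multiplier(xs, ys):
    # Least-squares estimate of w in the model y = w * x.
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

learned_w = learn_multiplier(inputs, outputs)

def double_by_learning(x):
    # Uses the rule that was learned from the data, not written by hand.
    return learned_w * x

print(double_by_rule(5), double_by_learning(5))
```

Real problems are rarely this clean, but the idea is the same: the rule comes out of the data instead of being typed in by a programmer.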
What is data?
Data is recorded information. In Machine Learning, data is commonly organized in tabular form, i.e. in rows and columns. Each column is a variable, and each row is an observation of those variables. To make this concrete, let’s look at a dataset.
Country | Age | Salary | Purchased |
France | 44 | 72000 | No |
Spain | 27 | 48000 | Yes |
Germany | 30 | 54000 | No |
Spain | 38 | 61000 | No |
Germany | 40 | 61000 | Yes |
France | 35 | 63000 | Yes |
Spain | 38 | 52000 | No |
France | 48 | 79000 | Yes |
Germany | 37 | 83000 | No |
France | 37 | 67000 | Yes |
The dataset above is in tabular format, with rows and columns. The columns Country, Age, Salary, and Purchased are the variables, and each row is an observation. The first observation reads as follows: the person is from France, is 44 years old, has a salary of 72000, and has not purchased the product. The remaining observations are read the same way. In total, we have 10 observations.
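If you want to see how such a table might look inside a program, here is a minimal Python sketch. The variable names (`columns`, `rows`, `dataset`) are chosen for this example only; the values are copied from the table above.

```python
# Column names (the variables) from the table above.
columns = ["Country", "Age", "Salary", "Purchased"]

# Each tuple is one row (one observation).
rows = [
    ("France", 44, 72000, "No"),
    ("Spain", 27, 48000, "Yes"),
    ("Germany", 30, 54000, "No"),
    ("Spain", 38, 61000, "No"),
    ("Germany", 40, 61000, "Yes"),
    ("France", 35, 63000, "Yes"),
    ("Spain", 38, 52000, "No"),
    ("France", 48, 79000, "Yes"),
    ("Germany", 37, 83000, "No"),
    ("France", 37, 67000, "Yes"),
]

# Pair every value with its column name so each observation is easy to read.
dataset = [dict(zip(columns, row)) for row in rows]

first = dataset[0]
print(len(dataset))  # number of observations
print(first)         # the first observation
```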
What are Independent Variables?
Independent variables are the inputs: the variables we use to explain or predict an outcome. In the above dataset, the columns Country, Age and Salary are the independent variables. The goal is to find the relationship between the independent variables and the dependent variable.
What is the Dependent Variable?
The dependent variable is the output: the variable whose value we want to explain or predict. In the above dataset, the column Purchased is the dependent variable. The goal is to find the relationship between the independent variables and the dependent variable, i.e. how a change in an independent variable affects the dependent variable.
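In code, this split is often written as two separate collections. The sketch below assumes the common (but not universal) naming convention of `X` for the independent variables and `y` for the dependent variable; the rows are the ones from the table above.

```python
# Rows from the table above: (Country, Age, Salary, Purchased).
rows = [
    ("France", 44, 72000, "No"),
    ("Spain", 27, 48000, "Yes"),
    ("Germany", 30, 54000, "No"),
    ("Spain", 38, 61000, "No"),
    ("Germany", 40, 61000, "Yes"),
    ("France", 35, 63000, "Yes"),
    ("Spain", 38, 52000, "No"),
    ("France", 48, 79000, "Yes"),
    ("Germany", 37, 83000, "No"),
    ("France", 37, 67000, "Yes"),
]

# Independent variables: the first three columns (Country, Age, Salary).
X = [row[:3] for row in rows]

# Dependent variable: the last column (Purchased).
y = [row[3] for row in rows]

print(X[0], "->", y[0])
```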
What are the two main types of Machine Learning?
- Supervised Learning
- Unsupervised Learning
What is Supervised Learning?
Suppose we are given a dataset like the one above, which has both independent and dependent variables, i.e. we are provided with both the inputs and the outputs. The goal is to find the relationship between the independent variables and the dependent variable. This type of learning is called Supervised Learning.
What is Unsupervised Learning?
Suppose we are given a dataset that has only independent variables and no dependent variable, i.e. we are provided with only the inputs and no outputs. The goal is to find similarities among the observations and group, or cluster, them. This type of learning is called Unsupervised Learning.
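As a rough illustration of grouping without any outputs, the sketch below clusters the Salary values from the earlier table into two groups. It uses a simple one-dimensional version of k-means, which is one possible clustering technique chosen for this example and is not named in this blog. The function name `kmeans_1d` and the choice of starting centers (the smallest and largest salary) are assumptions made for the sketch.

```python
# Salary column from the table above; no "Purchased" output is used.
salaries = [72000, 48000, 54000, 61000, 61000, 63000, 52000, 79000, 83000, 67000]

def kmeans_1d(values, start_centers, max_iter=100):
    centers = list(start_centers)
    groups = [[] for _ in centers]
    for _ in range(max_iter):
        # Step 1: put every value into the group of its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Step 2: move each center to the average of its group.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break  # nothing changed, so the grouping is stable
        centers = new_centers
    return centers, groups

centers, groups = kmeans_1d(salaries, [min(salaries), max(salaries)])
print("lower-salary group:", sorted(groups[0]))
print("higher-salary group:", sorted(groups[1]))
```

Nobody told the program which salaries are "low" or "high"; it formed the two groups purely from how close the values are to each other.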
What is Regression in Machine Learning?
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the ‘outcome variable’) and one or more independent variables (often called ‘predictors’, ‘covariates’, or ‘features’). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion, for example the smallest sum of squared differences between the line and the data (least squares). In supervised learning, we use regression when the dependent variable is continuous. Regression algorithms are used to predict continuous values such as price, salary, age, etc.
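To make the idea of fitting a line concrete, here is a small sketch of simple linear regression using the least-squares formulas. Purely for illustration, it treats Age as the single independent variable and Salary as the continuous value to predict; that pairing is an assumption of this example (in the dataset above, Salary is listed as an independent variable), and ten rows are far too few for a reliable model.

```python
# Age and Salary columns from the table above.
ages = [44, 27, 30, 38, 40, 35, 38, 48, 37, 37]
salaries = [72000, 48000, 54000, 61000, 61000, 63000, 52000, 79000, 83000, 67000]

def fit_line(xs, ys):
    # Ordinary least squares for the line y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(ages, salaries)

def predict_salary(age):
    # Read the predicted salary off the fitted line.
    return slope * age + intercept

print(round(predict_salary(42)))
```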
What is Classification in Machine Learning?
In statistical modeling, classification is the task of assigning an observation to one of a set of discrete categories (classes), based on one or more independent variables (often called ‘predictors’, ‘covariates’, or ‘features’). A common classification method is logistic regression, which models the probability that an observation belongs to a class as a function of a linear combination of the independent variables, and then assigns the class according to that probability. In supervised learning, we use classification when the dependent variable is discrete. Classification algorithms are used to predict discrete values such as Male or Female, True or False, Spam or Not Spam, etc.
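Here is a minimal sketch of logistic regression written from scratch, trained with plain gradient descent. The data is hypothetical and invented for this example (a count of suspicious words per email and whether the email was spam); the function names, learning rate, and number of steps are also illustrative choices, not recommended settings.

```python
import math

# Hypothetical toy data, not from this blog:
# number of suspicious words in an email, and 1 = Spam, 0 = Not Spam.
suspicious_words = [0, 1, 2, 3, 4, 5, 6, 7]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

def sigmoid(z):
    # Squashes any number into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    # Gradient descent on the average log-loss for p = sigmoid(w * x + b).
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = fit_logistic(suspicious_words, is_spam)

def classify(x):
    # Turn the predicted probability into a discrete class.
    return "Spam" if sigmoid(w * x + b) >= 0.5 else "Not Spam"

print(classify(0), "|", classify(7))
```

Note that the output is a category (Spam or Not Spam), not a number on a continuous scale, which is what separates classification from regression.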
What are the various steps involved in a Machine Learning project?
The following are the various steps involved in a Machine Learning project:
- Understand the business problem.
- Explore the data and become familiar with it.
- Preprocess the data for modeling by detecting outliers, treating missing values, encoding the categorical (text) variables, etc.
- After data preparation, run the model, analyze the results, and tweak the approach. This step is repeated until the best possible outcome is achieved.
- Validate the model using a new data set.
- Deploy the model and monitor its results to analyze its performance over time.
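The preprocessing step above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: one Salary value is deliberately set to `None` to imitate a missing value (the original table has no missing values), the missing value is filled with the average of the known salaries, and the text column Country is turned into 0/1 columns (often called one-hot encoding). Other strategies exist for both tasks.

```python
# Three rows based on the earlier table. The Salary of the second row is
# set to None on purpose, as a hypothetical missing value.
rows = [
    {"Country": "France", "Age": 44, "Salary": 72000},
    {"Country": "Spain", "Age": 27, "Salary": None},
    {"Country": "Germany", "Age": 30, "Salary": 54000},
]

# Treat missing values: replace a missing Salary with the mean of known ones.
known = [r["Salary"] for r in rows if r["Salary"] is not None]
mean_salary = sum(known) / len(known)

# Encode the categorical (text) variable: one 0/1 column per country.
countries = sorted({r["Country"] for r in rows})

def encode(row):
    one_hot = [1 if row["Country"] == c else 0 for c in countries]
    salary = row["Salary"] if row["Salary"] is not None else mean_salary
    return one_hot + [row["Age"], salary]

encoded = [encode(r) for r in rows]

print(countries)
for e in encoded:
    print(e)
```

After this step every row contains only numbers, which is the form most Machine Learning algorithms expect.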
Author:
Newton, Titus