Machine Learning: Data Preprocessing
Data preprocessing is the process of preparing raw data so that it is suitable for machine learning models. It is the most important step. Are you interested in Machine Learning: Data Preprocessing, one of the hottest fields in technology today? Our Machine Learning Course in Pune is a way to dive deep into this field and gain hands-on experience with cutting-edge tools and technologies.
Data preprocessing involves the following steps:
- Data Import
- Data Exploration
- Data Wrangling
- Data Manipulation
- Encoding Categorical Data
- Splitting Data Set
- Feature Scaling
For free demo classes, call: 7507414653
Registration Link: Click Here!
Methods of Machine Learning: Data Preprocessing
- Data Import
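Before working with a dataset, the first step is to load it into a pandas DataFrame.
Code :
import pandas
df = pandas.read_csv("filepath") : to load a .csv file
df = pandas.read_excel("filepath") : to load an .xlsx file
df = pandas.read_json("filepath") : to load a .json file
df = pandas.read_sql(query, connection) : to load the result of an SQL query; query is the SQL string and connection is an open database connection (read_sql does not take a file path)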
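The loading step can be sketched without any files on disk. This is a minimal illustration, not a recommended project layout: the CSV text, the column names, and the in-memory SQLite table named people are all made up for the example.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "name,age,city\nAsha,28,Pune\nRavi,35,Mumbai\nMeera,41,Pune\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # type pandas inferred for each column

# read_sql needs a query and a connection, not a file path.
# An in-memory SQLite database (assumed here for illustration) keeps this self-contained.
conn = sqlite3.connect(":memory:")
df.to_sql("people", conn, index=False)
df_sql = pd.read_sql("SELECT * FROM people WHERE city = 'Pune'", conn)
conn.close()

print(df_sql)
```

With a real file, the io.StringIO(...) argument would simply be replaced by the path to the file.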
- Data Exploration Techniques
- Dimensionality Check : df.shape
- Type of Dataset : type(df)
- Slicing and Indexing : df.iloc[ : , : ]
- Mean : df.mean()
- Median : df.median()
- Mode : df.mode()
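- Identifying Unique Elements : df["column"].unique() (unique() works on a single column, not on the whole DataFrame)
- Value Extraction : df.values (an attribute, so it is written without parentheses)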
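The exploration techniques listed above can be tried on a small DataFrame. The data below is invented for illustration, and numeric_only=True is passed because the example mixes text and numeric columns.

```python
import pandas as pd

# Hypothetical dataset with one text column and two numeric columns.
df = pd.DataFrame({
    "city": ["Pune", "Mumbai", "Pune", "Delhi"],
    "age": [28, 35, 41, 35],
    "salary": [50000.0, 62000.0, 58000.0, 62000.0],
})

print(df.shape)                      # dimensionality: (rows, columns)
print(type(df))                      # type of the dataset object
print(df.iloc[0:2, 1:3])             # first two rows, columns "age" and "salary"
print(df.mean(numeric_only=True))    # mean of each numeric column
print(df.median(numeric_only=True))  # median of each numeric column
print(df["age"].mode())              # most frequent value(s) in one column
print(df["city"].unique())           # distinct values, in order of first appearance
print(df.values)                     # underlying data as a NumPy array
```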
- Data Wrangling
Data wrangling cleans the raw data so that a more accurate model can be developed. It deals with:
- Missing values and missing value treatment
- Inconsistent data
- Noisy data
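A short sketch of two of these wrangling tasks, missing value treatment and inconsistent data, is given here. The dataset is hypothetical, and filling with the median and dropping rows are only two of several possible treatments; handling noisy data is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical data with inconsistent spellings and missing entries.
df = pd.DataFrame({
    "city": ["Pune", " pune", "Mumbai", "MUMBAI", None],
    "age": [28, np.nan, 41, 35, 30],
})

# Count the missing values in each column.
missing_counts = df.isnull().sum()
print(missing_counts)

# Inconsistent data: strip stray spaces and unify capitalisation.
df["city"] = df["city"].str.strip().str.title()

# Missing value treatment: fill the numeric column with its median,
# and drop rows where the category is still unknown.
df["age"] = df["age"].fillna(df["age"].median())
clean = df.dropna(subset=["city"]).reset_index(drop=True)

print(clean)
```

Whether to fill or drop depends on how much data is missing and on the column, so the choices above should be read as one option rather than a rule.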
- Data Manipulation
A DataFrame is a two-dimensional data structure on which the following functions can be applied.
- Returns the first n rows (5 by default) : df.head(n)
- Returns the last n rows (5 by default) : df.tail(n)
- Returns the actual data as a NumPy array : df.values
- Returns the data grouped by one or more columns : df.groupby("column")
- Concatenation combines two or more data structures : pandas.concat([df1, df2])
- Merging performs database-style joins on DataFrames : pandas.merge(df1, df2) or df1.merge(df2)
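The manipulation functions above can be combined as in the following sketch. The three small tables and their column names are assumptions made for the example.

```python
import pandas as pd

# Hypothetical tables used only for this illustration.
employees = pd.DataFrame({
    "emp_id": [1, 2, 3, 4],
    "dept": ["HR", "IT", "IT", "HR"],
    "salary": [40000, 60000, 70000, 50000],
})
new_hires = pd.DataFrame({"emp_id": [5], "dept": ["IT"], "salary": [65000]})
locations = pd.DataFrame({"dept": ["HR", "IT"], "city": ["Pune", "Mumbai"]})

print(employees.head(2))  # first two rows
print(employees.tail(2))  # last two rows

# Concatenate: stack the two employee tables on top of each other.
all_emps = pd.concat([employees, new_hires], ignore_index=True)

# Group: average salary for each department.
avg_salary = all_emps.groupby("dept")["salary"].mean()

# Merge: database-style left join on the shared "dept" column.
merged = all_emps.merge(locations, on="dept", how="left")

print(avg_salary)
print(merged)
```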
- Encoding Categorical Data
Machine learning models work on mathematics and numbers, so a categorical variable in the dataset can cause trouble while building the model unless it is first converted to numbers.
To do this, we apply label encoding, for example with scikit-learn's LabelEncoder(), or one-hot encoding.
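Both encodings can be sketched on a single hypothetical column. LabelEncoder comes from scikit-learn; for the one-hot step this example uses pandas.get_dummies, which is an assumption on our part since the text does not name a specific tool.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical categorical column.
df = pd.DataFrame({"city": ["Pune", "Mumbai", "Delhi", "Pune"]})

# Label encoding: each category becomes an integer code.
# The codes follow the sorted order of the category names.
encoder = LabelEncoder()
labels = encoder.fit_transform(df["city"])
print(list(encoder.classes_))
print(labels)

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["city"], prefix="city")
print(one_hot)
```

Integer codes imply an order between categories (here Delhi < Mumbai < Pune), which some models may treat as meaningful. One-hot encoding avoids that at the cost of extra columns.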
- Splitting Data Set
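In machine learning data preprocessing, we divide our dataset into training and testing datasets. The model is fitted on the training set and then evaluated on the testing set, which it has not seen during training.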
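A minimal sketch of the split, assuming scikit-learn's train_test_split is used. The arrays, the 80/20 ratio, and the random_state value are illustrative choices, not requirements.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with 2 features each, and a binary target.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# Hold out 20% of the rows for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```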
- Feature Scaling
It is a technique to standardize the independent variables of the dataset so that they fall within a specific range.
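Two common ways to scale features are sketched here with scikit-learn: StandardScaler (zero mean, unit variance) and MinMaxScaler (values mapped to the range 0 to 1). The text does not say which scaler is intended, so both are shown as assumptions, on made-up age and salary values.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical features: age and salary, which sit on very different scales.
X_train = np.array([[20.0, 30000.0], [30.0, 50000.0], [40.0, 70000.0]])
X_test = np.array([[25.0, 40000.0]])

# Standardization: z = (x - mean) / standard deviation, per column.
standard = StandardScaler()
X_train_std = standard.fit_transform(X_train)  # learn mean and std from training data
X_test_std = standard.transform(X_test)        # reuse the training statistics

# Min-max scaling: (x - min) / (max - min), per column.
minmax = MinMaxScaler()
X_train_mm = minmax.fit_transform(X_train)
X_test_mm = minmax.transform(X_test)

print(X_train_std)
print(X_train_mm)
print(X_test_mm)
```

The scaler is fitted on the training set only and then applied to the testing set, so that no information from the test data leaks into preprocessing.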
Author:-
Aniket Kulkarni
Call the Trainer and Book your free demo Class for Machine Learning now!!!
© Copyright 2021 | SevenMentor Pvt Ltd.