Regularization in Machine Learning
- Regularization in machine learning is a set of techniques used to prevent overfitting and improve the generalization capability of a model. Overfitting occurs when a model learns to fit the training data too closely, capturing noise and irrelevant patterns that do not generalize well to unseen data.
- Regularization techniques add a penalty term to the loss function during training, encouraging the model to learn simpler patterns that are more likely to generalize to unseen data.
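- In general form, the regularized loss is: Total Loss = Training Loss + alpha × Penalty(weights), where the hyperparameter alpha controls the penalty's strength. The L1 and L2 techniques demonstrated later in this post differ only in the form of the penalty term.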
# Import all libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

# Load the CSV file
df = pd.read_csv("C:/Users/Administrator/Desktop/SevenMentor All Data/Data Sci/csv files/50_Startups.csv")
df.head()
df.describe()
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   RND     50 non-null     float64
 1   ADMIN   50 non-null     float64
 2   MKT     50 non-null     float64
 3   STATE   50 non-null     object
 4   PROFIT  50 non-null     float64
dtypes: float64(4), object(1)
memory usage: 2.1+ KB
1. Handling Missing Data:
- Removing rows or columns with missing values.
- Imputing missing values with statistical measures such as mean, median, or mode.
df.isna().sum()
RND       0
ADMIN     0
MKT       0
STATE     0
PROFIT    0
dtype: int64
# replacer is a course-provided helper (module.py) that fills in missing values
from module import replacer
replacer(df)
df.isna().sum()
RND       0
ADMIN     0
MKT       0
STATE     0
PROFIT    0
dtype: int64
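The module.py used above is course-specific and not shown in this post; a minimal sketch of what such a replacer might look like, assuming it imputes the mean for numeric columns and the mode for categorical ones:

import pandas as pd

def replacer(df: pd.DataFrame) -> None:
    # Hypothetical reconstruction of the course helper; the real one may differ.
    # Fill numeric columns with the column mean, other columns with the most frequent value.
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].mean())
        else:
            df[col] = df[col].fillna(df[col].mode()[0])

Here the dataset has no missing values, so replacer(df) leaves it unchanged.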
# MLR: y = m1*x1 + m2*x2 + ... + mn*xn + c
X = df.drop(labels = ["PROFIT", "ADMIN"], axis = 1)
y = df['PROFIT']
How to convert different scales to the same scale: Feature Scaling
- Feature scaling is a preprocessing technique used in machine learning to scale numerical features.
- Standardization: Scaling features to have zero mean and unit variance.
- Normalization: Scaling features to a range, typically between 0 and 1.
Note: These two methods work only on continuous data, not on categorical data.
- Min-Max Scaling (MinMaxScaler): rescales the feature values to a fixed range, typically between 0 and 1.
- Standardization (StandardScaler): scales the feature values to have a mean of 0 and a standard deviation of 1.
from module import standardize
X1 = standardize(X)
X1.head()
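Like replacer, standardize comes from the course's module.py. A minimal sketch, assuming it applies the z-score formula z = (x - mean) / std to the numeric columns via scikit-learn's StandardScaler:

import pandas as pd
from sklearn.preprocessing import StandardScaler

def standardize(X: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical stand-in for the course helper; non-numeric columns pass through.
    X = X.copy()
    num_cols = X.select_dtypes(include="number").columns
    X[num_cols] = StandardScaler().fit_transform(X[num_cols])  # zero mean, unit variance
    return X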
- Handling Data with Categorical Predictors
- Feature Encoding:
1. One-Hot Encoding: Converting categorical variables into binary vectors.
2. Label Encoding: Assigning unique numerical labels to categorical variables.
from module import preprocessing
Xnew = preprocessing(X)
Xnew
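preprocessing is again a helper from the course's module.py. Since xtrain later has 5 columns (2 numeric predictors plus 3 STATE dummies), it plausibly one-hot encodes STATE and scales the numeric columns; a minimal sketch under that assumption:

import pandas as pd

def preprocessing(X: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical reconstruction: standardize numeric columns (sketch above),
    # then one-hot encode the remaining categorical columns.
    return pd.get_dummies(standardize(X))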
Selection of Predictors:
- We have multiple predictors, so we must decide which ones to use. Common approaches:
- A] Filter Methods:
- Correlation, using corr()
- Univariate feature selection, using the F-test (see the sketch after the correlation output below)
- Information gain
- B] Wrapper Methods:
- RFE (Recursive Feature Elimination)
- Forward Selection
- Backward Elimination
- C] Embedded Methods:
- L1 = Lasso Regression
- L2 = Ridge Regression
A] Filter Method: correlation using corr()
df[['RND','ADMIN','MKT','PROFIT']].corr()
df.corr()["PROFIT"].sort_values()
ADMIN     0.200717
MKT       0.747766
RND       0.972900
PROFIT    1.000000
Name: PROFIT, dtype: float64
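The list above also mentions univariate feature selection with the F-test. A minimal sketch using scikit-learn's SelectKBest (k=2 here is just for illustration):

from sklearn.feature_selection import SelectKBest, f_regression

# Score each numeric predictor against PROFIT with the univariate F-test
selector = SelectKBest(score_func=f_regression, k=2)
selector.fit(df[["RND", "ADMIN", "MKT"]], df["PROFIT"])
print(dict(zip(["RND", "ADMIN", "MKT"], selector.scores_.round(2))))
print("Selected:", list(selector.get_feature_names_out()))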
Data Splitting:
- Splitting the dataset into training, validation, and testing sets to evaluate model performance. This ensures that the model's performance is evaluated on unseen data, providing a more accurate estimate of its generalization ability.
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(Xnew, y, test_size=0.3, random_state = 21)
xtrain.shape
(35, 5)
ytrain.shape
(35,)
ytest.shape
(15,)
xtest.shape
(15, 5)
Creating Model – MLR
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model
LinearRegression()
# Fit the model on the training data
model.fit(xtrain, ytrain)
LinearRegression()
# Model evaluation (for training data)
tr_pred = model.predict(xtrain)
tr_pred
array([ 99541.43793014, 193824.02859299, 127217.28935749, 104021.7701829 , 46763.42945949, 74391.00963875, 46170.4110297 , 45735.30729762, 60245.41595031, 101072.73057937, 87378.21313726, 190230.1955865 , 115901.05799171, 111378.24961112, 96604.2875743 , 173847.74587568, 111936.99237991, 116698.71044661, 69395.03522222, 155759.12682996, 131190.31460665, 163229.52238728, 131070.12375685, 82178.81858721, 116672.81163072, 68518.77303015, 77933.82829341, 75880.09276874, 152302.07433551, 136575.50889571, 89772.82048354, 89571.46081784, 173559.43523538, 145605.43845103, 153902.43204595])
# Model evaluation (for testing data)
ts_pred = model.predict(xtest)
ts_pred
array([162612.89222766, 63517.77097241, 58892.55493395, 101220.75716895, 151818.06212535, 184112.08847695, 112558.37533507, 96814.92164915, 130822.96014655, 44978.65129127, 99767.71264259, 118723.52449124, 113188.35251712, 133919.31898842, 117576.74202844])
# Finding the error term (Mean Absolute Error)
from sklearn.metrics import mean_absolute_error
training_error = mean_absolute_error(ytrain, tr_pred)
print("Training Data Error :- ", round(training_error, 2))
testing_error = mean_absolute_error(ytest, ts_pred)
print("Testing Data Error :- ", round(testing_error, 2))
Training Data Error :-  6833.19
Testing Data Error :-  6589.64
from sklearn.metrics import r2_score
r2_score(ytest, ts_pred)
0.9487046887109204
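For a quick overfitting check, the same R² can also be computed on the training data; a large train-test gap would indicate overfitting, which is exactly the problem regularization addresses:

from sklearn.metrics import r2_score

# Compare R^2 on training vs. testing data
print("Train R2:", round(r2_score(ytrain, tr_pred), 4))
print("Test  R2:", round(r2_score(ytest, ts_pred), 4))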
Types of Regularization Techniques
- L1 Regularization (Lasso):
This method adds a penalty term proportional to the absolute value of the coefficients of the features, forcing some of them to become exactly zero. It performs feature selection and helps in building simpler models.
from sklearn.linear_model import Lasso
ls = Lasso()
ls.fit(xtrain, ytrain)
ls.score(xtest, ytest)
0.9487241498248087
ls = Lasso(alpha=1.5)
ls.fit(xtrain, ytrain)
ls.score(xtest, ytest)
0.9487337873216631
ls = Lasso(alpha=1.6)
ls.fit(xtrain, ytrain)
ls.score(xtest, ytest)
0.9487358000606894
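Lasso minimizes MSE + alpha × Σ|wi|, so larger alpha values push more coefficients exactly to zero. Instead of trying alphas one at a time as above, scikit-learn's LassoCV can search a grid with cross-validation; a minimal sketch (the grid is illustrative):

import numpy as np
from sklearn.linear_model import LassoCV

# Cross-validated search over candidate alpha values
ls_cv = LassoCV(alphas=np.linspace(0.1, 5.0, 50), cv=5)
ls_cv.fit(xtrain, ytrain)
print("Best alpha :", round(ls_cv.alpha_, 2))
print("Test score :", round(ls_cv.score(xtest, ytest), 6))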
- L2 Regularization (Ridge):
L2 regularization adds a penalty term proportional to the square of the coefficients of the features. It encourages smaller weights and prevents them from becoming too large, effectively reducing the complexity of the model.
from sklearn.linear_model import Ridge
rd = Ridge()
rd.fit(xtrain, ytrain)
rd.score(xtest, ytest)
0.9471602080552655
rd = Ridge(alpha=1.3)
rd.fit(xtrain, ytrain)
rd.score(xtest, ytest)
0.9465049351737375
rd = Ridge(alpha=1.4)
rd.fit(xtrain, ytrain)
rd.score(xtest, ytest)
0.9462725550444244
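Ridge minimizes MSE + alpha × Σwi², shrinking coefficients toward zero without eliminating them. RidgeCV automates the alpha search with an efficient built-in cross-validation; a minimal sketch (the grid is illustrative):

import numpy as np
from sklearn.linear_model import RidgeCV

# Cross-validated search over candidate alpha values
rd_cv = RidgeCV(alphas=np.linspace(0.1, 5.0, 50))
rd_cv.fit(xtrain, ytrain)
print("Best alpha :", round(rd_cv.alpha_, 2))
print("Test score :", round(rd_cv.score(xtest, ytest), 6))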
Author: Sagar Gade