CHI-SQUARE TEST For Machine Learning and Data Analytics

  • By Sagar Gade
  • October 6, 2023
  • Machine Learning

 ▪ The chi-square test is one of the most widely used tests for an association between two categorical variables (features).

  • The chi-square test is based on observed data and expected data.

  • It measures the difference between the observed and expected data.

 


 

 Chi-Square Test for Feature Selection in Machine Learning 

 Steps to perform the Chi-Square Test For Machine Learning and Data Analytics: 

 1. Define Hypothesis.

 2. Build a Contingency table.

 3. Find the expected values.

 4. Calculate the Chi-Square statistic.

 5. Accept or Reject the Null Hypothesis.

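 1. Define Hypothesis

 Null hypothesis (H0): there is no association between the two variables, that is, they are independent.

 Alternate hypothesis (H1): there is an association between the two variables.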

 2. Contingency table

 A table showing the distribution of one variable in rows and another in columns. It is used to study the relation between two variables.

 AirBags \ Type        Compact  Large  Midsize  Small  Sporty  Van
 Driver & Passenger          2      3        6      0       3    0
 Driver only                 9      6       11      5       8    3
 None                        5      0        4     16       2    6
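Steps 3 and 4 from the list above (expected values and the chi-square statistic) can be sketched directly from this table. The snippet below is a minimal illustration in plain Python, not necessarily the code used for the demo in this article. It assumes the standard formulas: the expected count of a cell is its row total times its column total divided by the grand total, and the statistic is the sum of (observed - expected)^2 / expected over all cells. The variable names are illustrative.

```python
# Observed counts from the contingency table (rows: AirBags, columns: Type).
observed = [
    [2, 3, 6, 0, 3, 0],    # Driver & Passenger
    [9, 6, 11, 5, 8, 3],   # Driver only
    [5, 0, 4, 16, 2, 6],   # None
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: E_ij = row_total_i * col_total_j / N
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of (O - E)^2 / E
chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (number of rows - 1) * (number of columns - 1)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print(chi2_stat, dof)
```

For a 3 x 6 table the degrees of freedom come out as (3 - 1) * (6 - 1) = 10.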

 


 Accept or Reject the Null Hypothesis 

 If the p-value is less than the chosen significance level (0.05), we reject the NULL hypothesis that there is no association between the variables and accept the alternate hypothesis. Otherwise, we fail to reject the NULL hypothesis.
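The decision rule can be sketched as follows. This is a minimal illustration that takes the chi-square statistic and degrees of freedom reported in this article's output as given. The helper `chi2_sf_even_dof` is a hypothetical name introduced here; it uses a closed form for the upper-tail probability that only holds when the degrees of freedom are even (as they are here, with 10). For the general case a statistics library would normally be used instead.

```python
import math

def chi2_sf_even_dof(x, dof):
    """Upper-tail probability P(X >= x) for a chi-square variable with even dof.

    For dof = 2m this equals exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form only holds for positive even dof")
    half = x / 2.0
    term, total = 1.0, 1.0          # i = 0 term
    for i in range(1, dof // 2):    # i = 1 .. m-1
        term *= half / i
        total += term
    return math.exp(-half) * total

chi2_stat = 31.496973760366618  # statistic reported in the article's output
dof = 10                        # degrees of freedom reported in the output
alpha = 0.05                    # significance level used in the article

p_value = chi2_sf_even_dof(chi2_stat, dof)
reject_null = p_value < alpha   # True means: reject the NULL hypothesis

print(p_value, reject_null)
```

A p-value below `alpha` leads to rejecting the NULL hypothesis of no association; a p-value at or above it means the data do not give enough evidence to reject it.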

 DEMO 

 Output

 (31.496973760366618,
  0.0004854823787767891,
  10,
  array([[2.51685393, 1.41573034, 3.30337079, 3.30337079, 2.04494382, 1.41573034],
         [7.5505618 , 4.24719101, 9.91011236, 9.91011236, 6.13483146, 4.24719101],
         [5.93258427, 3.33707865, 7.78651685, 7.78651685, 4.82022472, 3.33707865]]))

 CONCLUSION

 From the output above, 31.49 is the chi-square statistic, 0.00048 is the p-value, 10 is the degrees of freedom, and the array holds the expected counts for each cell. As the p-value is less than 0.05, we reject the NULL hypothesis.

 The variables ‘AirBags’ and ‘Type’ are therefore not independent of each other.

 
