cover photo

BLOG · 5/9/2025

Level 3 Temp report

Level 3 Temp report
This Article is yet to be approved by a Coordinator.

Task 1: Decision Tree Based on ID3 Algorithm:

A decision tree is a tree-like structure used to make decisions or predictions based on given conditions . It is a machine learning algorithm that splits the target variable into different sub groups of the target variable. It starts at the parent node and a sequence of splits based in hierarchial order of impact on the target. It is a big if else tree. It used Greedy Top Down approach to select which variable to split at. It need Entropy and Information Gain to make these decisions. Both the Entropy and the IG focus on the purity and impurity of a node , based on values from 0 to 1 where 1 being the max. The decision tree can make prediction on both the numerical and categorical values with mean value of target at each values for numeric and the modal score for the categorical values. Disadvantages-> The decision tree is bound to Overfit. So we use a method called Pruning to prevent it. THere are 2 types of Pruning.
1) Pre-Prunning:- Setting the limit on the tree with the hyper-parameters like max_depth, min_sample leaf, etc..
2) Post-Pruning :- it is the process where the model is made to train fully and then getting rid of unnecessary branches or sub-trees. In this task I built a simple decision tree on a simple dataset. Sample Image
Here is a sample code Sample_code
Here is the Full code . Github repo

Task 2: Naïve Bayesian Classifier :

Naive Bayes classifier is a simple, probabilistic machine learning algorithm based on Bayes' theorem. It assumes that features are independent given the class label, which simplifies calculations. It's commonly used for text classification, spam detection, and sentiment analysis due to its efficiency and effectiveness with large datasets. The algorithm calculates the probability of a data point belonging to each class and assigns it to the class with the highest probability. The Naive Bayes works on the Binary and Multiclass Classification with small dataset. but fails on data which is highly correleated and highly imbalanced. In this task I used Multinomial classification to predict the category of the given text on which particular class it belongs to. Here is a sample image Sample Image 1
Sample Image 2
Here is the Github Repo

UVCE,
K. R Circle,
Bengaluru 01