1 / 3 / 2024
Task 1: Decision Tree based on ID3 Algorithm
Built a decision tree that predicts whether a student will buy a computer based on the given conditions, using the ID3 algorithm written from scratch.
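The core of ID3 is choosing, at each node, the attribute with the highest information gain. The following is a minimal sketch of that step, not the code used in the task; the attribute names and the four-row dataset are hypothetical and only serve to illustrate the calculation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 splits on the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: does the person buy a computer?
rows = [
    {"student": "yes", "income": "high"},
    {"student": "yes", "income": "low"},
    {"student": "no", "income": "high"},
    {"student": "no", "income": "low"},
]
labels = ["yes", "yes", "no", "no"]
chosen = best_attribute(rows, labels, ["student", "income"])
```

In this toy data the "student" attribute separates the classes perfectly, so it is selected as the root split. A full ID3 implementation would repeat the same choice recursively on each subset.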
Task 2: Naive Bayesian Classifier
Built a Naive Bayesian classifier from scratch that predicts whether an email is spam. It identifies keywords in emails, calculates the probability of each keyword appearing in a spam message, and classifies the test data according to these probabilities.
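The keyword-probability idea can be sketched as follows. This is an illustrative version under stated assumptions, not the task's actual code: the training messages are made up, the labels "spam" and "ham" are hypothetical names, and Laplace (add-one) smoothing is an assumption added so that unseen word and class combinations do not produce zero probabilities.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) pairs, label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the label with the highest log-posterior score."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace-smoothed likelihood P(word | label).
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training data.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training)
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.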
Task 3: Exploratory Data Analysis
Conducted exploratory data analysis on a dataset of Airbnb listings in New York using tables, graphs, and charts. Also used linear regression to predict the rents of the houses and presented the predictions in a bar graph.
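For reference, a single-feature linear regression reduces to a closed-form least-squares fit. The sketch below assumes one hypothetical predictor (number of bedrooms) and invented prices; the features actually used in the task are not stated here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: bedrooms vs. nightly price.
bedrooms = [1, 2, 3, 4]
prices = [80.0, 120.0, 160.0, 200.0]
slope, intercept = fit_line(bedrooms, prices)

# Predicted price for a five-bedroom listing.
predicted = slope * 5 + intercept
```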
Task 4: Ensemble Techniques
Trained several machine learning models, including decision trees, random forests, and logistic regression, on different features of the Titanic dataset, then applied stacking to combine them. Stacking improved the overall accuracy by ~6%.
Task 5: Random Forest, XGBoost
Implemented the random forest algorithm to predict the price of a stock and compared it with the actual price on the day.
Used the XGBoost algorithm to improve the stock prediction, adding features such as the simple moving average and relative strength index. Used grid search to choose the best hyperparameters for the model and applied them.
Nvidia stock prediction using XGBoost
Tesla stock prediction using XGBoost
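The two technical indicators mentioned in this task can be computed as shown below. This is a sketch under assumptions: the closing prices are invented, and the RSI here uses plain averages of gains and losses over the last `period` changes rather than Wilder's smoothing, which many charting tools use instead.

```python
def simple_moving_average(prices, window):
    """SMA for every position that has a full window of prices behind it."""
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


def relative_strength_index(prices, period):
    """RSI over the last `period` price changes, using plain averages."""
    changes = [b - a for a, b in zip(prices, prices[1:])][-period:]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Hypothetical closing prices.
closes = [10.0, 11.0, 12.0, 11.0, 13.0]
sma3 = simple_moving_average(closes, 3)
rsi4 = relative_strength_index(closes, 4)
```

Values like these would typically be added as extra columns next to the raw price before training the model.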
Task 6: Hyperparameter Tuning
Used grid search and randomized search to find the best hyperparameters for a random forest that predicts phone prices from the features in the dataset.
Hypertuned Parameters for Phone Price Prediction
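The difference between the two tuning strategies can be sketched without any machine learning library. In the example below, `toy_score` is a hypothetical stand-in for a cross-validated model score, and the parameter names and values are illustrative assumptions rather than the grid used in the task.

```python
import itertools
import random


def grid_search(param_grid, score_fn):
    """Evaluate every combination of parameter values."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(param_grid, score_fn, n_iter, seed=0):
    """Evaluate only n_iter randomly sampled combinations."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in param_grid.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical score that peaks at n_estimators=200, max_depth=8."""
    return (
        -abs(params["n_estimators"] - 200) / 1000
        - abs(params["max_depth"] - 8) / 100
    )


grid = {"n_estimators": [50, 100, 200, 400], "max_depth": [4, 8, 16]}
grid_best, grid_score = grid_search(grid, toy_score)
rand_best, rand_score = random_search(grid, toy_score, n_iter=5)
```

Grid search is exhaustive and always finds the best combination in the grid; random search trades that guarantee for a fixed, smaller evaluation budget.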
Task 7: Image Classification using KMeans Clustering
Built a model that classifies handwritten digits from the MNIST dataset using the KNeighbors classifier algorithm, with an accuracy of 98.64%.
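A k-nearest-neighbors classifier of the kind named above assigns each query the majority label among its closest training points. The sketch below uses hypothetical 2-D points for brevity; with MNIST each point would instead be a vector of pixel values.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D training data with two classes.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]

near_origin = knn_predict(points, labels, (0.5, 0.5))
near_far_cluster = knn_predict(points, labels, (5.5, 5.5))
```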
Task 8: Support Vector Machines
Built a support vector machine that finds a separating boundary (hyperplane) between data points classified as breast cancer and data points that are not, using the breast cancer dataset.
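For a linear SVM, the separating boundary is the set of points where w·x + b = 0, and training penalizes points that fall on the wrong side of the margin through the hinge loss. The sketch below only evaluates a given hyperplane; the weights, bias, and points are hypothetical, and the regularization term of the full SVM objective is omitted.

```python
def decision_value(weights, bias, x):
    """Signed score w.x + b; its sign gives the predicted side."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    """Predict +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(weights, bias, x) >= 0 else -1


def hinge_loss(weights, bias, points, labels):
    """Mean hinge loss; labels must be +1 or -1."""
    losses = [
        max(0.0, 1.0 - y * decision_value(weights, bias, x))
        for x, y in zip(points, labels)
    ]
    return sum(losses) / len(losses)


# Hypothetical hyperplane x0 + x1 - 3 = 0 and four labeled points.
weights, bias = [1.0, 1.0], -3.0
points = [(0.0, 0.0), (1.5, 1.0), (3.0, 2.0), (4.0, 4.0)]
labels = [-1, -1, 1, 1]

predictions = [classify(weights, bias, p) for p in points]
loss = hinge_loss(weights, bias, points, labels)
```

Here every point is classified correctly, but the second point lies inside the margin, so it still contributes a nonzero loss.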
Task 9: Anomaly Detection
Used the Isolation Forest algorithm and Local Outlier Factor on the Iris dataset to isolate data points that differ from the rest of the dataset (anomalies). The anomalies are plotted in red.
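To illustrate how Local Outlier Factor scores a point, the sketch below compares each point's local density with that of its neighbors. It is a simplified version under assumptions: the five 2-D points are hypothetical, ties in the k-th neighbor distance are broken by index, and duplicate points (which would give a zero reachability sum) are not handled.

```python
import math


def k_nearest(points, i, k):
    """(distance, index) pairs of the k nearest neighbors of point i."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    return dists[:k]


def local_outlier_factor(points, k):
    """LOF score per point; values well above 1 suggest an outlier."""
    n = len(points)
    neighbors = [k_nearest(points, i, k) for i in range(n)]
    k_distance = [nb[-1][0] for nb in neighbors]

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average neighbor density divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical data: four clustered points and one far-away point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = local_outlier_factor(points, k=2)
outlier_index = max(range(len(points)), key=lambda i: scores[i])
```

Points inside the cluster score close to 1, while the isolated point scores far higher, which is the kind of point that would be plotted in red.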