
COURSEWORK

Varsha's AI-ML-001 course work. Lv 1

Varsha Rao (AUTHOR, ACTIVE)
This Report is yet to be approved by a Coordinator.

Level 2 AIML

2 / 10 / 2024



BASICS: [image]

TASK 1: Linear and logistic regression

I learnt the fundamentals of linear and logistic regression. For linear regression, I built a house price prediction model using the California housing dataset available on Google Colab. For logistic regression, I built a simple flower classification model using the Iris dataset.
Links for the same:
Linear regression: https://colab.research.google.com/drive/1rKDBJ0Oj2G7Sg0fpVQ-M1FHU9wu5yH0-?usp=sharing
Logistic regression: https://colab.research.google.com/drive/1E3dRJmTjWGQjcoUYnNZv-51hlCLSRr6S?usp=sharing

TASK 2: Matplotlib and Data Visualisation

I learnt how to use Matplotlib and the different ways of visualising data.
Link for the Colab file: https://colab.research.google.com/drive/1gdP5k9ucxwNkMBjVJsMCl6jVDV-TogsI?usp=sharing

TASK 3: Numpy

This was an easy task, where we had to use NumPy functions to do the following: generate an array by repeating a small array across each dimension, and generate an array of element indexes such that the array elements appear in ascending order.
Link for the same: https://colab.research.google.com/drive/1A2w2v7Mf6s83SSikP9GAdU2QhOonEOsg?usp=sharing
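A minimal sketch of the two operations, assuming `np.tile` for the repetition and `np.argsort` for the sorting indexes. The arrays `small` and `values` are made up for illustration and are not taken from the linked notebook.

```python
import numpy as np

# Hypothetical input arrays, chosen only for illustration.
small = np.array([[1, 2], [3, 4]])
values = np.array([10, 52, 62, 16, 16, 54, 453])

# np.tile repeats the whole array: 2 times along the rows, 3 times along the columns.
tiled = np.tile(small, (2, 3))

# np.argsort returns the indexes that would put the elements in ascending order.
# kind="stable" keeps equal elements in their original relative order.
order = np.argsort(values, kind="stable")

# Indexing with those indexes gives the sorted array.
sorted_values = values[order]

print(tiled)
print(order)
print(sorted_values)
```

Indexing the original array with the result of `np.argsort` is a quick way to check that the indexes really do produce ascending order.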

TASK 4: Metrics and Performance Evaluation

I studied the different metrics used to evaluate how well and how efficiently a model performs. I implemented these in the two models from Task 1, in the same Colab files, so the links are the same.
Regression: [5 images]

Classification: [image]
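The report does not say which metrics were used, so the sketch below assumes a common set: MAE, MSE, RMSE and R² for regression, and accuracy, precision and recall for binary classification. The function names and sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total variation
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": 1 - ss_res / ss_tot}


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall, treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical predictions, only to show the calls.
reg = regression_metrics([3, 5, 7, 9], [2, 5, 8, 9])
clf = classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(reg)
print(clf)
```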

TASK 5: Linear and Logistic Regression - Coding the model from SCRATCH

I studied the maths and the concepts behind both linear and logistic regression. In both cases, I used my own custom datasets. Honestly, I think the models that I made can be improved a lot, because the accuracy was far below my expectation and the error was huge. I'll try working on them again. [2 images]
Sigmoid Function: [image]

Links for the same:
Linear: https://colab.research.google.com/drive/1yDXRIGhCQn4JbYYyRTMT3ILt_u6y2AS1?usp=sharing
Logistic: https://colab.research.google.com/drive/1K_zMW9W82lJ2uCcRTAnJfZPdheWFog4l?usp=sharing
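For reference, here is one way both models can be written from scratch with plain gradient descent. This is a sketch under assumptions, not the code from the linked notebooks: the function names, learning rates, epoch counts and toy datasets are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_linear(X, y, lr=0.05, epochs=2000):
    """Fit y = X @ w + b by gradient descent on the mean squared error."""
    n_samples, n_features = X.shape
    w, b = np.zeros(n_features), 0.0
    for _ in range(epochs):
        error = X @ w + b - y
        w -= lr * (2.0 / n_samples) * (X.T @ error)
        b -= lr * (2.0 / n_samples) * error.sum()
    return w, b


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit P(y=1) = sigmoid(X @ w + b) by gradient descent on the log loss."""
    n_samples, n_features = X.shape
    w, b = np.zeros(n_features), 0.0
    for _ in range(epochs):
        error = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ error) / n_samples
        b -= lr * error.mean()
    return w, b


# Hypothetical toy data: a noiseless line y = 2x + 1.
X_lin = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_lin = 2.0 * X_lin[:, 0] + 1.0
w_lin, b_lin = fit_linear(X_lin, y_lin)

# Hypothetical toy data: class 0 for small x, class 1 for large x.
X_log = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y_log = np.array([0, 0, 0, 1, 1, 1])
w_log, b_log = fit_logistic(X_log, y_log)
predicted = (sigmoid(X_log @ w_log + b_log) >= 0.5).astype(int)

print(w_lin, b_lin)
print(predicted)
```

When a from-scratch model gives large errors, the learning rate, the number of epochs and feature scaling are usually worth checking first, since unscaled features can make gradient descent converge very slowly or diverge.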

TASK 6: K Nearest Neighbours

I learnt about the K Nearest Neighbours (KNN) method, which can be used for both regression and classification tasks. The model works by memorising the data points from the training dataset and then making predictions by computing the Euclidean distance between a new point from the testing dataset (the query point) and the training data points. The model is called a lazy learner because it does no actual learning during training; all the work happens at prediction time, when computing the distance to every training point takes a long time and increases the computational cost. A hyperparameter k is also chosen, which represents the number of data points influencing the final predicted value. In classification tasks, after finding the k nearest neighbours, we take a majority vote among their classes, and the class that appears most often is assigned to the query point. In regression tasks, we calculate the average of the target values of the k neighbours to predict the target value for the new data point.
Here is the link to the code: https://colab.research.google.com/drive/1wvtn4XtZHQFgEqfK--S2LaU2KwAFeub5?usp=sharing
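The procedure described above can be sketched in a few lines of plain Python. This is an illustrative version, not the code from the linked notebook; the function names and the toy points are hypothetical.

```python
import math
from collections import Counter


def k_nearest(train_X, query, k):
    """Return the indexes of the k training points closest to the query point."""
    distances = [(math.dist(x, query), i) for i, x in enumerate(train_X)]
    distances.sort()  # ascending Euclidean distance
    return [i for _, i in distances[:k]]


def knn_classify(train_X, train_y, query, k=3):
    """Predict a class by majority vote among the k nearest neighbours."""
    votes = Counter(train_y[i] for i in k_nearest(train_X, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_X, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest neighbours."""
    neighbours = k_nearest(train_X, query, k)
    return sum(train_y[i] for i in neighbours) / len(neighbours)


# Hypothetical toy data: two well-separated clusters.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]
targets = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

print(knn_classify(points, labels, (1.5, 1.5)))
print(knn_classify(points, labels, (6.5, 6.5)))
print(knn_regress(points, targets, (1.5, 1.5)))
```

Note that there is no training step at all: every prediction scans the full training set, which is why the cost grows with the size of the dataset.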

TASK 7: Understanding neural networks and LLMs

https://docs.google.com/document/d/1iGiEkRdj0vjVgjosTHzCcy-geG_Gt8O16U_dx506MWg/edit?usp=sharing

TASK 8: Mathematics behind machine learning

I learnt about curve fitting, which is the process of finding the curve or function that best fits the data, that is, the one with the least error. I also learnt how the best-fit line is chosen via the least squares method, along with the maths behind it. [2 images]
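As a small worked example of the least squares method for a straight line y = m·x + c, the slope is m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and the intercept is c = ȳ − m·x̄. The sketch below applies these formulas; the function name and the data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimising the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noisy points lying roughly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```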

About the Fourier transform, I now know that it is used in various real-life applications, such as sound engineering, image recognition and analysis, quantum mechanics, etc. Any complex signal or wave can be broken down into multiple simpler sine and cosine waves. This is done with the Fourier transform: given an amplitude vs time graph, it produces a graph of amplitude vs frequency. [image] I generated a simple sine wave in the time domain in MATLAB, and then converted it to its frequency components using the Fourier transform. [image]
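The original experiment was done in MATLAB; the sketch below shows the same idea in Python with NumPy's FFT routines instead. The sample rate, duration and signal frequency are assumed values chosen for illustration.

```python
import numpy as np

# Assumed signal parameters (not taken from the original MATLAB script).
sample_rate = 1000      # samples per second
n_samples = 1000        # one second of signal
signal_freq = 50.0      # frequency of the sine wave in Hz

# Time-domain signal: a single sine wave of amplitude 1.
t = np.arange(n_samples) / sample_rate
signal = np.sin(2 * np.pi * signal_freq * t)

# Frequency-domain view: rfft handles real input and returns non-negative frequencies.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(n_samples, d=1 / sample_rate)

# Scale the magnitudes so that a unit-amplitude sine gives a peak close to 1.
amplitude = 2 * np.abs(spectrum) / n_samples

peak_freq = freqs[np.argmax(amplitude)]
print(peak_freq, amplitude.max())
```

The amplitude vs frequency array has a single sharp peak at the sine wave's frequency, which is the "broken down" view described above.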

TASK 9:

Plotly is a Python library for visualising data, like Matplotlib. It can be used by businesses to build creative dashboards and to make educational visual representations and animations.
https://colab.research.google.com/drive/1wQhi-Ydf0PRGbE2MHzYTO6fJ8A6Yk4pt?usp=sharing

TASK 10:

Decision trees can again be used for both regression and classification tasks. I built a decision tree model to predict heart attacks. [image]
Link to the code: https://colab.research.google.com/drive/1vEQDTv95AZXmY8-Q8alyx-_tZQHCYRnB?usp=sharing

TASK 11:

SVM, or Support Vector Machine, is an ML model used for both regression and classification tasks. Depending on whether the data is linearly or non-linearly separable, we use the appropriate method. [2 images]

FOR NON-LINEARLY SEPARABLE DATA

[image]

FOR NON-LINEAR SEPARATING BOUNDARIES

[image] I also discovered how SVM can be used for regression. [image]

Link to the Colab file with the code for breast cancer prediction: https://colab.research.google.com/drive/1sMpEvHlPd8nxP2koB9lxq8sd9qz4UENu?usp=sharing
For the above task, I used this dataset (from the Kaggle notebook given in the resources): https://www.kaggle.com/datasets/merishnasuwal/breast-cancer-prediction-dataset

UVCE,
K. R Circle,
Bengaluru 01