2 / 10 / 2024
BASICS:
I learnt about the fundamentals of linear and logistic regression. For linear regression, I built a house-price prediction model using the California housing dataset available on Google Colab. For logistic regression, I built a simple flower-classification model using the iris dataset. Links for the same:
Linear regression: https://colab.research.google.com/drive/1rKDBJ0Oj2G7Sg0fpVQ-M1FHU9wu5yH0-?usp=sharing
Logistic regression: https://colab.research.google.com/drive/1E3dRJmTjWGQjcoUYnNZv-51hlCLSRr6S?usp=sharing
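A minimal sketch of the two models described above (not the exact linked notebooks; it assumes scikit-learn and uses its built-in loaders in place of the Colab sample CSV):

from sklearn.datasets import fetch_california_housing, load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Linear regression: California housing prices
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
lin = LinearRegression().fit(X_train, y_train)
print("Linear regression R^2:", lin.score(X_test, y_test))

# Logistic regression: iris flower classification
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
log = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Logistic regression accuracy:", log.score(X_test, y_test))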
I understood the usage of Matplotlib and the different ways of visualising data. Link for the Colab file: https://colab.research.google.com/drive/1gdP5k9ucxwNkMBjVJsMCl6jVDV-TogsI?usp=sharing
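A small example of the kinds of Matplotlib plots covered (a sketch with made-up data, not the notebook itself):

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].plot(x, np.sin(x))                                 # line plot
axes[0].set_title("Line plot")
axes[1].scatter(np.random.rand(50), np.random.rand(50))    # scatter plot
axes[1].set_title("Scatter plot")
axes[2].hist(np.random.randn(500), bins=20)                # histogram
axes[2].set_title("Histogram")
plt.tight_layout()
plt.show()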
This was an easy task, where we had to use numpy funtions to do perform this: Generate an array by repeating a small array across each dimension and generate an array with element indexes such that the array elements appear in ascending order. Link for the same: https://colab.research.google.com/drive/1A2w2v7Mf6s83SSikP9GAdU2QhOonEOsg?usp=sharing
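One way to do both parts, assuming np.tile and np.argsort are the intended functions (the linked notebook may use a different approach):

import numpy as np

small = np.array([[1, 2], [3, 4]])
tiled = np.tile(small, (2, 3))   # repeat 2x along rows, 3x along columns
print(tiled)

a = np.array([30, 10, 20])
order = np.argsort(a)            # indexes that put the elements in ascending order
print(order)      # [1 2 0]
print(a[order])   # [10 20 30]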
I studied the different metrics used to evaluate how well and how efficiently a model is performing, for both regression and classification. I included the implementation of these in the first task's two models themselves (same Colab file, so same links).
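A quick sketch of computing common evaluation metrics with scikit-learn, on toy predictions (the exact metrics used in the notebooks may differ):

import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error, r2_score,
                             accuracy_score, precision_score, recall_score, f1_score)

# Regression metrics
y_true_reg = np.array([3.0, 5.0, 2.5])
y_pred_reg = np.array([2.8, 5.4, 2.0])
print("MAE:", mean_absolute_error(y_true_reg, y_pred_reg))
print("MSE:", mean_squared_error(y_true_reg, y_pred_reg))
print("R^2:", r2_score(y_true_reg, y_pred_reg))

# Classification metrics
y_true_clf = [0, 1, 1, 0, 1]
y_pred_clf = [0, 1, 0, 0, 1]
print("Accuracy :", accuracy_score(y_true_clf, y_pred_clf))
print("Precision:", precision_score(y_true_clf, y_pred_clf))
print("Recall   :", recall_score(y_true_clf, y_pred_clf))
print("F1       :", f1_score(y_true_clf, y_pred_clf))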
I studied all the maths and the concepts behind both logistic and linear regression. In both cases, I used my own custom datasets. Honestly, I think the models that I made can be improved a lot, because the accuracy was far below my expectation and the error was huge. I'll try working on it again. Sigmoid function: sigma(z) = 1 / (1 + e^(-z))
Links for the same: Linear: https://colab.research.google.com/drive/1yDXRIGhCQn4JbYYyRTMT3ILt_u6y2AS1?usp=sharing Logistic: https://colab.research.google.com/drive/1K_zMW9W82lJ2uCcRTAnJfZPdheWFog4l?usp=sharing
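A minimal from-scratch sketch of logistic regression with the sigmoid and batch gradient descent, using a tiny placeholder dataset since the custom datasets from the notebooks aren't reproduced here:

import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

# Toy 1-D dataset (placeholder for the custom dataset in the notebook)
X = np.array([0.5, 1.5, 2.0, 3.0, 3.5, 4.5])
y = np.array([0, 0, 0, 1, 1, 1])

w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):                 # batch gradient descent on the log-loss
    p = sigmoid(w * X + b)
    grad_w = np.mean((p - y) * X)     # d(loss)/dw
    grad_b = np.mean(p - y)           # d(loss)/db
    w -= lr * grad_w
    b -= lr * grad_b

print("Predicted probabilities:", sigmoid(w * X + b).round(2))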
I learnt about the KNN method, which can be used for both regression and classification tasks. The model works by memorising the data points from the training dataset and then making predictions by computing the Euclidean distance between the new point from the test dataset (the query point) and the training points. The model is called a lazy learner because it does no actual learning up front; computing the distances at prediction time takes a long time and increases the computational cost. A hyperparameter k is chosen, which represents the number of data points influencing the final predicted value. In classification tasks, after finding the k nearest neighbours, we take a majority vote among their classes, and the class that appears most often is assigned to the data point. In regression tasks, we calculate the average of the target values of the k neighbours to predict the target value for the new data point. Here is the link to the code: https://colab.research.google.com/drive/1wvtn4XtZHQFgEqfK--S2LaU2KwAFeub5?usp=sharing
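A small from-scratch sketch of the idea (Euclidean distances, majority vote for classification, mean of neighbours for regression); the linked notebook may use scikit-learn instead:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3, task="classification"):
    dists = np.linalg.norm(X_train - x_query, axis=1)   # Euclidean distance to every training point
    nearest = np.argsort(dists)[:k]                     # indexes of the k nearest neighbours
    if task == "classification":
        return Counter(y_train[nearest]).most_common(1)[0][0]   # majority vote
    return y_train[nearest].mean()                               # average target for regression

X_train = np.array([[1, 1], [2, 2], [8, 8], [9, 9]])
y_class = np.array([0, 0, 1, 1])
y_reg   = np.array([1.0, 1.2, 8.5, 9.1])

print(knn_predict(X_train, y_class, np.array([1.5, 1.5]), k=3))                    # -> 0
print(knn_predict(X_train, y_reg, np.array([8.5, 8.5]), k=3, task="regression"))   # mean of 3 nearest targets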
https://docs.google.com/document/d/1iGiEkRdj0vjVgjosTHzCcy-geG_Gt8O16U_dx506MWg/edit?usp=sharing
I learnt about curve fitting, which is the process of finding the best-fitting curve or function for a set of data points so as to minimise the error. I also learnt how the best-fit line is chosen via the least-squares method, along with the maths behind it.
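A short sketch of a least-squares straight-line fit on noisy synthetic data, first via the closed-form formulas and then via NumPy's polynomial fitting:

import numpy as np

# Noisy data roughly following y = 2x + 1
x = np.linspace(0, 5, 20)
y = 2 * x + 1 + np.random.normal(0, 0.5, size=x.shape)

# Least-squares closed form for a line:
# slope = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2), intercept = y_mean - slope * x_mean
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()
print("least-squares fit: y = %.2f x + %.2f" % (slope, intercept))

# Same fit using NumPy
print(np.polyfit(x, y, deg=1))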
About the Fourier transform, I now know that it is used in various real-life applications, like sound engineering, image recognition and analysis, quantum mechanics, etc. Any complex signal or wave can be broken down into multiple simpler sine and cosine waves. This is done with the help of the Fourier transform: given an amplitude-vs-time signal, it produces the corresponding amplitude-vs-frequency representation. I generated a simple sine wave in the time domain in MATLAB and then converted it to its frequency components using the Fourier transform.
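An equivalent sketch in Python/NumPy of what was done in MATLAB (a 50 Hz sine wave is an assumption for illustration):

import numpy as np
import matplotlib.pyplot as plt

fs = 1000                              # sampling frequency (Hz)
t = np.arange(0, 1, 1 / fs)            # 1 second of samples
signal = np.sin(2 * np.pi * 50 * t)    # 50 Hz sine wave in the time domain

spectrum = np.fft.fft(signal)                   # Fourier transform
freqs = np.fft.fftfreq(len(signal), d=1 / fs)   # frequency bin for each FFT coefficient
half = len(signal) // 2                         # keep only the positive frequencies

plt.plot(freqs[:half], np.abs(spectrum)[:half] / half)   # amplitude vs frequency, peak at 50 Hz
plt.xlabel("Frequency (Hz)")
plt.ylabel("Amplitude")
plt.show()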
Plotly is a Python graphing library (similar in purpose to Matplotlib, but interactive) for visualising data. It can be used by businesses for creating interactive dashboards and for making educational visual representations and animations. https://colab.research.google.com/drive/1wQhi-Ydf0PRGbE2MHzYTO6fJ8A6Yk4pt?usp=sharing
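A tiny Plotly Express example using its bundled iris sample data (just an illustration, not the content of the linked notebook):

import plotly.express as px

df = px.data.iris()                        # built-in sample dataset in Plotly Express
fig = px.scatter(df, x="sepal_width", y="sepal_length",
                 color="species", title="Iris measurements")
fig.show()                                 # renders an interactive chart in Colab/Jupyter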
Decision trees are again used for both regression and classification tasks. I built a model using decision trees to predict heart attacks. Link to the code: https://colab.research.google.com/drive/1vEQDTv95AZXmY8-Q8alyx-_tZQHCYRnB?usp=sharing
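A minimal decision-tree classification sketch; since the heart-attack dataset isn't reproduced here, a synthetic dataset stands in for it:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the heart-attack dataset used in the linked notebook
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=4, random_state=0)   # limit depth to reduce overfitting
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))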
SVM, or Support Vector Machine, is an ML model used for both regression and classification tasks. Depending on whether the data is linearly separable or not, we use the appropriate kernel (for example, a linear kernel for linearly separable data and a non-linear kernel such as RBF otherwise).
I also discovered how SVM can be used for regression (Support Vector Regression).
Link for the Colab file with code for breast cancer prediction: https://colab.research.google.com/drive/1sMpEvHlPd8nxP2koB9lxq8sd9qz4UENu?usp=sharing For the above task, I used the dataset from the Kaggle notebook given in the resource: https://www.kaggle.com/datasets/merishnasuwal/breast-cancer-prediction-dataset
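A minimal SVM classification sketch; scikit-learn's built-in breast cancer dataset is used here as a stand-in for the Kaggle CSV, and the RBF kernel and C value are illustrative choices:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Feature scaling matters for SVMs; the RBF kernel handles non-linear boundaries
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))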