
COURSEWORK

vrishank's AI-ML-001 coursework. Lv 2

vrishank aryan

5 / 6 / 2025


Task 1: Linear and Logistic Regression

Linear Regression:

Used a housing dataset to predict home prices based on features like number of rooms, area, location, and more. I implemented it using scikit-learn’s LinearRegression and experimented with different combinations of features to see how it impacted the accuracy. It was a nice way to understand how regression actually works behind the scenes.
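
A minimal sketch of that workflow (the file name and columns here are placeholders, not the exact dataset I used):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# "housing.csv" and the column names below are illustrative placeholders
df = pd.read_csv("housing.csv")
X = df[["rooms", "area"]]    # swap in different feature combinations here
y = df["price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
print(model.score(X_test, y_test))   # R^2 on held-out data
```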

Blog!

Logistic Regression

Worked on classifying Iris flowers into their respective species using measurements like sepal length, sepal width, petal length, and petal width. I used scikit-learn’s LogisticRegression for this and got a good grasp of how classification models behave with clean, structured data. Also helped me understand how decision boundaries are formed in low-dimensional spaces.
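
A minimal sketch of this one, using the Iris dataset that ships with scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)   # 4 features: sepal/petal length and width
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = LogisticRegression(max_iter=200)   # raise max_iter so the solver converges
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))         # accuracy on held-out flowers
```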

Blog!

Task 2: Matplotlib and Data Visualization

This task was all about getting comfy with Matplotlib — the go-to library for plotting in Python. I explored a bunch of chart types like line, scatter, bar (all variations), box, violin, contour, heatmap, 3D plots and more. Along the way, I learned how to set axis labels and limits, use subplots, add legends, and save the visuals as PNGs. But I didn’t stop at the usual stuff. I had some fun with it — made goofy plots like drawing a pizza slice or even a butterfly using just Matplotlib. Honestly, it made learning way more fun and helped me understand how flexible plotting can be.
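
A tiny illustrative example touching a few of those pieces (toy data, not one of my actual plots):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))   # two subplots side by side
ax1.plot(x, np.sin(x), label="sin(x)")                 # line plot with a legend
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.set_xlim(0, 2 * np.pi)                             # axis limits
ax1.legend()

ax2.scatter(np.random.rand(50), np.random.rand(50))    # scatter plot
ax2.set_title("random scatter")

fig.savefig("plots.png")                               # save the figure as a PNG
```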

Task 3: NumPy

This one was all about getting hands-on with NumPy basics. I worked on generating arrays by repeating smaller arrays across dimensions — kind of like tiling patterns. I also created arrays where each element basically represented its own index, arranged in ascending order.

Felt a bit like playing with data Lego blocks — figuring out how to shape, repeat, and organize arrays neatly using just NumPy tools. Super satisfying once you get the hang of the slicing and broadcasting magic!
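
A quick sketch of both tricks on toy arrays:

```python
import numpy as np

block = np.array([[1, 2],
                  [3, 4]])
tiled = np.tile(block, (2, 3))     # repeat the 2x2 block into a 4x6 "tiling pattern"
print(tiled)

idx = np.arange(12).reshape(3, 4)  # each element is its own (flattened) index, ascending
print(idx)
```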

Task 4: Metrics and Performance Evaluation

For this task, I explored how to evaluate models properly — both for regression and classification problems. I worked with metrics like accuracy, precision, recall, F1-score for classification, and MAE, MSE, RMSE, and R² for regression.

It was cool to see how different metrics highlight different aspects of a model’s performance. This task helped me understand how to actually judge whether a model is doing well beyond just looking at the output.
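
A minimal sketch of computing those metrics with scikit-learn, on made-up labels and predictions just to show the calls:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, mean_absolute_error, mean_squared_error,
                             r2_score)

# classification metrics on toy labels
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print(accuracy_score(y_true, y_pred), precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), f1_score(y_true, y_pred))

# regression metrics on toy values
y_true_r = np.array([3.0, 5.0, 2.5])
y_pred_r = np.array([2.8, 5.4, 2.2])
mse = mean_squared_error(y_true_r, y_pred_r)
print(mean_absolute_error(y_true_r, y_pred_r),  # MAE
      mse,                                      # MSE
      np.sqrt(mse),                             # RMSE
      r2_score(y_true_r, y_pred_r))             # R^2
```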

I have given the metrics for all the models I trained in this report; make sure to check the other notebooks to see the metric evaluation.

Task 5: Linear and Logistic Regression (Coded from scratch)

Linear Regression

Linear Regression coded from scratch using torch tensors
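
Roughly the shape of it, as an illustrative sketch on toy data (not my exact notebook code): gradient descent on MSE with autograd, no nn.Module or built-in optimizers.

```python
import torch

# toy data: y = 3x + 2 plus a little noise
X = torch.randn(100, 1)
y = 3 * X + 2 + 0.1 * torch.randn(100, 1)

w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

for _ in range(500):
    y_pred = X * w + b                   # forward pass: broadcasted linear model
    loss = ((y_pred - y) ** 2).mean()    # MSE loss
    loss.backward()                      # autograd fills w.grad and b.grad
    with torch.no_grad():                # plain gradient-descent step
        w -= 0.1 * w.grad
        b -= 0.1 * b.grad
        w.grad.zero_()
        b.grad.zero_()

print(w.item(), b.item())                # should land near 3 and 2
```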

Logistic Regression

Logistic Regression coded from scratch using torch tensors
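
Same idea with a sigmoid and binary cross-entropy, again as an illustrative sketch on toy data:

```python
import torch

X = torch.randn(200, 2)
y = (X[:, 0] + X[:, 1] > 0).float()      # linearly separable toy labels

w = torch.zeros(2, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

for _ in range(500):
    p = torch.sigmoid(X @ w + b)         # predicted probability of class 1
    eps = 1e-7                           # keeps log() away from zero
    loss = -(y * torch.log(p + eps) + (1 - y) * torch.log(1 - p + eps)).mean()
    loss.backward()
    with torch.no_grad():
        w -= 0.5 * w.grad
        b -= 0.5 * b.grad
        w.grad.zero_()
        b.grad.zero_()

acc = ((p > 0.5).float() == y).float().mean()
print(acc.item())                        # training accuracy on the toy data
```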

Task 6: K-Nearest Neighbor Algorithm

For this one, I first built the KNN algorithm from scratch — manually calculating distances, finding the nearest neighbors, and figuring out the most common label. Once that was working, I moved on to using scikit-learn's built-in KNeighborsClassifier to compare results.

Doing it from scratch really helped me understand what’s going on under the hood, and using the library version showed how easy it is once you get the concept. It’s simple, intuitive, and surprisingly powerful for small datasets.
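
A minimal sketch of both halves (toy setup, not my exact code):

```python
import numpy as np
from collections import Counter
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

def knn_predict(X_train, y_train, x, k=3):
    """Predict one point's label by majority vote among its k nearest neighbours."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to every training point
    nearest = np.argsort(dists)[:k]              # indices of the k closest
    return Counter(y_train[nearest]).most_common(1)[0][0]

X, y = load_iris(return_X_y=True)
print(knn_predict(X[:-1], y[:-1], X[-1]))        # hold out the last flower, predict it

clf = KNeighborsClassifier(n_neighbors=3).fit(X[:-1], y[:-1])
print(clf.predict(X[-1:]))                       # the library version should agree
```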

Coded from scratch

Using scikit-learn's built-in model

Task 7: An elementary step towards understanding Neural Networks

Building Neural Networks from Scratch

I started off by building a basic neural network from the ground up — no PyTorch, no TensorFlow, just pure math. I implemented forward propagation, activation functions, and even hand-coded backpropagation. It gave me a solid grip on how the internals of a neural net actually work. After spending hours on matrix ops and gradients, I finally saw it learn. That felt wildly rewarding.
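
To give a flavour of it, here is a minimal sketch of the same idea: one hidden layer, sigmoid activations, hand-coded backprop, learning XOR as a sanity check (illustrative, not my actual network):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# XOR: the classic "needs a hidden layer" problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden (4 units)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output

for _ in range(10000):
    # forward propagation
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backpropagation of the squared error through both layers
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h
    b1 -= 0.5 * d_h.sum(axis=0)

print(out.round(2).ravel())   # should end up close to [0, 1, 1, 0]
```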

I have written a huge blog explaining the mathematics behind all this; you can find it on my website, linked below:

Neural Networks Blog

Building GPT from scratch

The second part? I implemented a mini version of GPT — a transformer-based language model trained on Shakespearean text. It’s decoder-only, character-level, and can generate text in full Shakespeare mode. This was easily the most intense and exciting project I’ve done. Training it and watching it spit out lines like “thou art a villain” made everything click — attention mechanisms, embeddings, autoregression, all of it.
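
To give a taste of the core mechanism, here is a minimal sketch of one head of masked (causal) self-attention in PyTorch; the sizes are made up for illustration, not my actual model's:

```python
import torch
import torch.nn.functional as F

B, T, C = 4, 8, 32            # batch, sequence length, embedding dim (made-up sizes)
head_size = 16
x = torch.randn(B, T, C)      # stand-in for token embeddings

key = torch.nn.Linear(C, head_size, bias=False)
query = torch.nn.Linear(C, head_size, bias=False)
value = torch.nn.Linear(C, head_size, bias=False)

k, q, v = key(x), query(x), value(x)
wei = q @ k.transpose(-2, -1) * head_size**-0.5   # scaled dot-product scores
mask = torch.tril(torch.ones(T, T))               # causal mask: no peeking at future tokens
wei = wei.masked_fill(mask == 0, float("-inf"))
wei = F.softmax(wei, dim=-1)                      # attention weights per position
out = wei @ v                                     # (B, T, head_size)
print(out.shape)
```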

I have written a massive blog on this too, explaining everything in great detail; it is linked below:

GPT-2 blog

I would have linked this code below too, but the model took me hours to train and I don't have a local GPU, so I trained it on a Colab notebook. The notebook is linked below:

GPT-2 code

Task 8: Math behind Machine Learning

This task was all about going back to the roots — understanding the math that powers everything we do in ML. I explored curve fitting by modeling a simple function on Desmos, just to get an intuitive feel for how functions behave and how models try to approximate them.

Even though it was just a basic exercise, it helped me see the "why" behind the code — especially how loss functions work and why minimizing error matters. Felt good to slow down and actually understand the backbone of machine learning for a change.
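
The same intuition in a few lines of code: fit y ≈ mx + c by sliding m and c downhill on the mean squared error (toy numbers, just for illustration):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])   # roughly y = 2x + 1

m, c, lr = 0.0, 0.0, 0.02
for _ in range(2000):
    err = (m * x + c) - y                 # residuals of the current line
    m -= lr * 2 * (err * x).mean()        # dMSE/dm
    c -= lr * 2 * err.mean()              # dMSE/dc

print(m, c)                               # should settle near 2 and 1
```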


Task 9: Plotly

For this task, I explored data visualization using Plotly, which honestly felt like an upgrade from Matplotlib and Seaborn. It’s interactive, dynamic, and super fun to use — perfect for exploring datasets in a more hands-on way.

I used it for exploratory data analysis (EDA), creating visuals that I could actually hover over, zoom into, and play around with. It made spotting patterns and trends so much easier (and cooler). Definitely a step up in making my visualizations not just informative, but actually engaging.
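
A minimal sketch of the kind of interactive plot I mean, using Plotly Express and its built-in Iris sample data:

```python
import plotly.express as px

df = px.data.iris()                     # sample dataset bundled with Plotly
fig = px.scatter(df, x="sepal_width", y="sepal_length",
                 color="species", hover_data=["petal_length"])
fig.show()                              # interactive: hover, zoom, pan
```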

Task 10: Decision Trees

This task was my intro to decision trees — one of the most visual and intuitive ML algorithms out there. I learned how decision trees basically work like a flowchart of “if-else” questions, splitting the data step-by-step until it lands on a prediction.

I tried out both classification and regression trees, and it was super cool to see how something as simple as asking the right questions in the right order can lead to solid predictions. It’s like building a decision-making brain — one split at a time.
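
A minimal sketch of both kinds of tree with scikit-learn (toy data, not my actual notebooks):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# classification: Iris again
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)   # shallow tree = readable flowchart
print(clf.score(X, y))

# regression: fit a noisy sine wave
Xr = np.linspace(0, 6, 100).reshape(-1, 1)
yr = np.sin(Xr).ravel() + 0.1 * np.random.randn(100)
reg = DecisionTreeRegressor(max_depth=4).fit(Xr, yr)
print(reg.predict([[3.0]]))
```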

Binary Classification Trees

Blog Post

Regression Trees

Blog Post

Task 11: SVMs

For this task, I explored SVMs — a powerful supervised learning algorithm for separating two classes. It works by finding the hyperplane that best separates the data while maximizing the margin between the classes.

I experimented with linear SVMs and got a feel for how regularization and margin trade-offs work. Once I saw how it handles even tricky datasets with clean decision boundaries, I totally got why it’s such a go-to tool in ML.
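
A minimal sketch of a linear SVM in scikit-learn, sweeping C to see the regularization trade-off (toy blobs, not my actual dataset; small C means a wider, softer margin):

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, random_state=42)  # two separable classes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for C in (0.01, 1, 100):
    clf = SVC(kernel="linear", C=C).fit(X_train, y_train)     # C controls the margin trade-off
    print(C, clf.score(X_test, y_test))
```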
