COURSEWORK

Moksha's AI-ML-001 course work. Lv 3

Author: Moksha (active)
This Report is yet to be approved by a Coordinator.

Level 1 AIML Report

31 / 3 / 2024


Task 1: Linear and Logistic Regression - HelloWorld for AIML

The objective of this task is to implement linear and logistic regression models using Python's scikit-learn library.

1. Linear Regression - Predict the price of a home from multiple variables using linear regression.

Task Steps :

  1. Import Libraries.
  2. Load the California housing data from sklearn.datasets.
  3. Transform the dataset into a data frame.
  4. Initialize the linear regression model.
  5. Split the data into training and testing data.
  6. Train the model with our training data.
  7. Print the predictions on our test data.
  8. Check the model performance by calculating the mean squared error.

Code link
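The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code behind the link: the helper name `fit_linear_regression`, the 80/20 split, and `random_state=42` are all illustrative choices that do not come from the report.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def fit_linear_regression(frame: pd.DataFrame, target: str, seed: int = 42):
    """Fit a linear model on a data frame and return (model, predictions, mse).

    `frame` holds the features plus the target column named by `target`.
    The helper name, split ratio, and seed are assumptions for illustration.
    """
    X = frame.drop(columns=[target])
    y = frame[target]

    # Step 4: initialize the model.
    model = LinearRegression()

    # Step 5: hold out 20% of the rows for testing (assumed ratio).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Steps 6-8: train, predict on the test rows, and score with MSE.
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    mse = mean_squared_error(y_test, predictions)
    return model, predictions, mse


# Usage with the dataset named in the steps (downloads the data on first call):
# from sklearn.datasets import fetch_california_housing
# frame = fetch_california_housing(as_frame=True).frame
# model, predictions, mse = fit_linear_regression(frame, "MedHouseVal")
# print(predictions[:5], mse)
```

The dataset call is left commented out because `fetch_california_housing` needs a network connection the first time it runs; the helper itself works on any data frame with a numeric target column.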

2. Logistic Regression - Train a model to distinguish between different species of the Iris flower based on sepal length, sepal width, petal length and petal width.

Task Steps :

  1. Load Iris Dataset.
  2. Split Data into Train and Test Sets.
  3. Create and Train Logistic Regression Model.
  4. Make Predictions.
  5. Evaluate Model Performance.

Code link
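A minimal sketch of these five steps, assuming an 80/20 stratified split, `random_state=42`, and `max_iter=1000` (none of these values come from the report, and the linked code may differ):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Step 1: load the Iris data (4 features, 3 species).
X, y = load_iris(return_X_y=True)

# Step 2: split into train and test sets (assumed 80/20, stratified by species).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 3: create and train the model; max_iter is raised so the solver converges.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Step 4: predict species for the held-out flowers.
y_pred = clf.predict(X_test)

# Step 5: evaluate with plain accuracy.
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.3f}")
```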

Task 2: Matplotlib and Data Visualisation

  1. Environment Setup: Ensure Python and required libraries are installed.
  2. Library Import: Import Matplotlib, Seaborn and Pandas.
  3. Prepare Sample Data: Create or load data for demonstration.
  4. Set Axes Label and Limits: Use Matplotlib to set labels and limits for the axes.
  5. Create Multiple Plots: Utilize Matplotlib's subplot() function to create a grid of subplots.
  6. Add Legend: Use Matplotlib's legend() function to explain plot elements.
  7. Save Plot as PNG: Use Matplotlib's savefig() function to save the plot as a PNG file.
  8. Explore Plot Types: Experiment with various plot types like line, scatter, bar etc.
  9. Execute and Visualize: Run the code and visualize the plots.

Matplotlib and Data Visualisation
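Steps 4 to 7 might look like the sketch below. The sample data, the file name `sample_plots.png`, and the use of `plt.subplots()` (the object-oriented sibling of `subplot()`) are assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Sample data (assumed): two trigonometric curves and some random points.
x = np.linspace(0, 2 * np.pi, 100)
rng = np.random.default_rng(0)

# One row, two columns of subplots.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: line plot with axis labels, limits, and a legend.
axes[0].plot(x, np.sin(x), label="sin(x)")
axes[0].plot(x, np.cos(x), label="cos(x)")
axes[0].set_xlabel("x (radians)")
axes[0].set_ylabel("value")
axes[0].set_xlim(0, 2 * np.pi)
axes[0].set_ylim(-1.2, 1.2)
axes[0].legend()

# Right panel: scatter plot of random points.
axes[1].scatter(rng.normal(size=50), rng.normal(size=50), label="random points")
axes[1].set_xlabel("feature 1")
axes[1].set_ylabel("feature 2")
axes[1].legend()

fig.tight_layout()
fig.savefig("sample_plots.png")  # the .png extension selects the PNG format
```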

Task 3: Numpy

NumPy is a Python library for working with arrays, matrices, and linear algebra.

import numpy as np

Generate an array by repeating a small array across each dimension

small_array = np.array([[1, 2], [3, 4]])
repeated_array = np.tile(small_array, (3, 2))

Generate an array with element indexes such that the array elements appear in ascending order

index_array = np.argsort(repeated_array, axis=None)  # flat indexes that put the elements in ascending order

Print the arrays

print("Small Array:")
print(small_array)
print("\nArray by Repeating Small Array:")
print(repeated_array)
print("\nArray with Element Indexes:")
print(index_array)

Link to the cell : link


Task 4: Metrics and Performance Evaluation


Task 5: Linear and Logistic Regression - Coding the model from SCRATCH

The objective of this task is to gain a deeper understanding of linear and logistic regression by implementing the algorithm from scratch.

Linear Regression - Linear regression is one of the most basic and commonly used types of predictive analysis. It predicts the value of a dependent variable from the value of an independent variable.

The simplest regression equation is:

 y = m*x + b 

where,
y = estimated dependent value.
b = intercept or constant.
m = regression coefficient or slope.
x = value of the independent variable.
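One way to estimate m and b from scratch is gradient descent on the mean squared error. The sketch below is only an illustration: the function name `fit_line`, the learning rate, the epoch count, and the synthetic data are all assumptions, not the report's implementation.

```python
import numpy as np


def fit_line(x, y, lr=0.05, epochs=2000):
    """Estimate slope m and intercept b of y = m*x + b by gradient descent on MSE."""
    m, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        error = (m * x + b) - y                 # prediction minus target
        m -= lr * (2.0 / n) * np.dot(error, x)  # d(MSE)/dm
        b -= lr * (2.0 / n) * error.sum()       # d(MSE)/db
    return m, b


# Synthetic, noise-free data generated from y = 2x + 1 (assumed for the demo).
x = np.linspace(0, 5, 50)
y = 2.0 * x + 1.0

m, b = fit_line(x, y)
print(f"m = {m:.4f}, b = {b:.4f}")
```

On data that follows a line exactly, the estimates should approach the true slope and intercept; with real data the learning rate usually needs tuning, and scaling the feature first helps convergence.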

Logistic Regression - Logistic regression is a supervised machine learning algorithm used for classification tasks, where the goal is to predict the probability that an instance belongs to a given class.
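A from-scratch version can be sketched with a sigmoid and gradient descent on the log loss. The names `fit_logistic` and `predict_proba`, the hyperparameters, and the two synthetic clusters are assumptions chosen for illustration; the linked implementation may be organised differently.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Binary logistic regression trained by batch gradient descent on log loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)                # predicted probability of class 1
        w -= lr * (X.T @ (p - y)) / n_samples  # gradient of log loss w.r.t. w
        b -= lr * np.mean(p - y)               # gradient of log loss w.r.t. b
    return w, b


def predict_proba(X, w, b):
    return sigmoid(X @ w + b)


# Two synthetic 2-D clusters (assumed data): class 0 near (-2, -2), class 1 near (2, 2).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, size=(50, 2)), rng.normal(2, 1, size=(50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = fit_logistic(X, y)
accuracy = np.mean((predict_proba(X, w, b) >= 0.5) == y)
print(f"Training accuracy: {accuracy:.2f}")
```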

Implementation


Task 6: K-Nearest Neighbor Algorithm

K-Nearest Neighbors (KNN) is a supervised learning algorithm used for both classification and regression tasks. It classifies a data point by assigning it the majority class among its k nearest neighbors in feature space, where k is the number of neighbors considered.
The objective of this task is to compare a custom KNN implementation with scikit-learn's KNN implementation across multiple datasets, measuring accuracy and other relevant metrics.
Implementation
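The majority-vote idea can be sketched in a few lines of NumPy. The function name `knn_predict`, the Euclidean distance, and the toy points are assumptions for illustration only, not the custom implementation referenced above.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, X_query, k=3):
    """Classify each query point by majority vote among its k nearest training points."""
    predictions = []
    for q in X_query:
        distances = np.linalg.norm(X_train - q, axis=1)  # Euclidean distance to every training point
        nearest = np.argsort(distances)[:k]               # indexes of the k closest points
        votes = Counter(y_train[nearest].tolist())
        predictions.append(votes.most_common(1)[0][0])    # most frequent label wins
    return np.array(predictions)


# Toy data (assumed): three points near the origin are class 0, three near (5, 5) are class 1.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_query = np.array([[0.05, 0.1], [5.0, 5.1]])

predicted = knn_predict(X_train, y_train, X_query, k=3)
print(predicted)
```

For the comparison described in the task, the same arrays could be passed to scikit-learn's `KNeighborsClassifier(n_neighbors=3)` and the two sets of predictions scored with the same metric.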


Task 7: An elementary step towards understanding Neural Networks

  • Neural Networks mimic the structure and functioning of the human brain. They consist of layers of artificial neurons, each performing specific computations.
    This task aims to understand Neural Networks, including types like Convolutional Neural Networks (CNN), Artificial Neural Networks (ANN) and Recurrent Neural Networks (RNN).
  • Large Language Models and Building GPT-4:
    Large Language Models (LLMs) are sophisticated AI systems designed to understand and generate human-like text based on vast amounts of training data. They utilize advanced deep learning techniques, particularly Transformer architectures, to process and generate text with remarkable fluency and coherence.
    blogpost
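To make the idea of layers of artificial neurons concrete, here is a minimal forward pass through a two-layer network in NumPy. The layer sizes, the ReLU and softmax activations, and the random weights are assumptions for illustration; the network is untrained, so its outputs carry no meaning beyond showing the computation.

```python
import numpy as np


def relu(z):
    """Pass positive values through and clip negatives to zero."""
    return np.maximum(0.0, z)


def softmax(z):
    """Turn each row of scores into probabilities that sum to 1."""
    shifted = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Assumed architecture: 4 inputs -> 8 hidden neurons -> 3 output classes.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)


def forward(x):
    """Each layer computes a weighted sum of its inputs, then applies an activation."""
    hidden = relu(x @ W1 + b1)
    return softmax(hidden @ W2 + b2)


batch = rng.normal(size=(5, 4))  # 5 made-up samples with 4 features each
probs = forward(batch)
print(probs.shape)
```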

Task 8: Mathematics behind machine learning

  • Curve-Fitting - Curve fitting is a fundamental concept in machine learning and data analysis, where we find a mathematical function that best fits a given set of data points.
    Curve fitting using Desmos
  • Fourier Transform -
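The curve-fitting idea can also be tried in code. The sketch below fits a quadratic with `np.polyfit`, which performs a least-squares polynomial fit; the choice of a degree-2 polynomial, the true coefficients, and the noise level are all assumptions made for the demo, separate from the Desmos exercise.

```python
import numpy as np

# Synthetic data (assumed): a noisy parabola y = 0.5x^2 - x + 2.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40)
y = 0.5 * x**2 - 1.0 * x + 2.0 + rng.normal(scale=0.1, size=x.size)

# Least-squares fit of a degree-2 polynomial; coefficients come back highest power first.
coeffs = np.polyfit(x, y, deg=2)

# Evaluate the fitted curve and measure how far it is from the data.
fitted = np.polyval(coeffs, x)
rmse = np.sqrt(np.mean((fitted - y) ** 2))

print("coefficients (a, b, c):", np.round(coeffs, 3))
print("RMSE:", round(float(rmse), 3))
```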

Task 9: Data Visualization for Exploratory Data Analysis

  • Plotly is a powerful data visualization library that offers a wide range of tools for creating interactive and dynamic plots. It provides support for various types of plots, including scatter plots, line plots, bar plots, histograms, heatmaps, 3D plots, and more.
  • Exploratory Data Analysis (EDA) is an essential step in the data analysis process that involves summarizing the main characteristics of a dataset, often with visual methods. In this report, I performed EDA on the Iris dataset using Plotly.
    Data Visualization using Plotly
    Google colab

Task 10: An introduction to Decision Trees

Decision Trees are a powerful supervised learning algorithm used for classification tasks. They provide a visual representation of decision-making processes where each internal node represents a "decision" based on a feature, each branch represents the outcome of that decision, and each leaf node represents the final decision or outcome.
Decision Trees
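The node, branch, and leaf structure described above can be inspected directly with scikit-learn. This is a sketch under assumptions: the report does not say which dataset was used, so Iris stands in here, and `max_depth=3`, the 80/20 split, and the seed are illustrative values.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Iris is an assumed stand-in dataset for this sketch.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

# A shallow tree keeps the printed rules short enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)
tree_accuracy = tree.score(X_test, y_test)

# Text view of the tree: each indented line is a decision on one feature,
# and each "class:" line is a leaf.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
print(f"Test accuracy: {tree_accuracy:.3f}")
```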


Task 11: Exploration of a Real world application of Machine Learning

This case study examines how Spotify's Music Recommendation System utilizes advanced machine learning algorithms and mathematical constructs to deliver personalized music experiences to its users.
Case Study


UVCE,
K. R Circle,
Bengaluru 01