
COURSE

AI-ML-001

4 Levels · 11 Months

An introductory course on Artificial Intelligence and Machine Learning.




Level 1


Generic Tasks

TASK 1: 3D Printing

Understand how a 3D printer works and check out the online resources. Understand what an STL file is, then learn to slice it (using the Ultimaker or Creality slicer). Go through the SOPs for the 3D printer. Learn about bed temperature, infill density and other printer settings. Finally, get an STL file from the internet, slice it and queue it for printing.

Resources:

Introduction to 3d printer

PLA settings

Types of 3D printing

(Note: this task is to be done under coordinator supervision.)

TASK 2: API

What is an API? Learn how an API works and where it is used. Using any API of your choice, build a user interface (web app, mobile app, etc.) where you can make calls and display the necessary information. An example weather app, built with the OpenWeather API, is given below.

Example
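As a rough sketch of what "making a call and displaying the information" involves, the snippet below builds a request URL and pulls a few fields out of a JSON response. It assumes the OpenWeather current-weather endpoint and its `name`, `main.temp` and `weather[0].description` fields; the helper names, the placeholder API key and the sample payload are illustrative only, so check the API's own documentation before relying on them.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for OpenWeather's current-weather API; verify against its docs.
BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_url(city, api_key):
    """Return the request URL for a city, asking for metric units."""
    query = urlencode({"q": city, "appid": api_key, "units": "metric"})
    return f"{BASE_URL}?{query}"


def summarize(payload):
    """Turn a decoded JSON response into one display line."""
    city = payload["name"]
    temp = payload["main"]["temp"]
    description = payload["weather"][0]["description"]
    return f"{city}: {temp:.1f} C, {description}"


# Hand-written sample payload (not real data), shaped like the assumed response.
sample = json.loads(
    '{"name": "Bengaluru", "main": {"temp": 24.5},'
    ' "weather": [{"description": "scattered clouds"}]}'
)
print(build_url("Bengaluru", "YOUR_API_KEY"))
print(summarize(sample))
```

In a real app the payload would come from an HTTP request to the built URL, and the summary line would be rendered in the user interface instead of printed.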

TASK 3: Working with Github

Familiarize yourself with GitHub's integrated workflows (GitHub Actions), issues, and pull requests with this task. A Git repository is given below; check it out and perform the tasks stated in its README file.

Check this link for more info: https://github.com/UVCE-Marvel/git-task

TASK 4: Get familiar with the command line on Ubuntu and do the following subtasks:

● Create a folder named test.

● cd into that folder.

● Create a blank file without using any text editor.

● List the files in that folder.

● Create 2600 folders in this folder, each named like M90 or B56 (a letter followed by a number).

● Concatenate two text files containing any random text and display the result on the terminal.

https://ubuntu.com/tutorials/command-line-for-beginners#1-overview

Task 5 : Build Your Own Brain - Linear Regression from Scratch

Dive into the core of machine learning by implementing Linear Regression from scratch, and compare its performance with the scikit-learn implementation. Use the California Housing dataset to evaluate your model on real-world data.

Your Task:

  • Implement linear regression manually (without using ML libraries for training).
  • Understand and apply gradient descent to minimize error.
  • Compare your custom model’s performance against sklearn.linear_model.LinearRegression.
  • Analyze your results with:
    • A graph showing the line of best fit and the data points.
    • Performance metrics: MSE, MAE, R² for both the custom and scikit-learn models.
    • A brief comparison between the two models.

Download Dataset

Learn Linear Regression:

  1. Understanding :

  2. Coding the linear regression algorithm from scratch:

Expected Outcomes:

  • Grasp how gradient descent optimizes weights in linear regression.
  • Understand the importance of feature scaling.
  • Know how to evaluate regression models using standard metrics.
  • Be able to appreciate the convenience and performance of inbuilt ML libraries.

Precautions:

  • Always normalize or standardize features before training your scratch model, especially if you’re using gradient descent.
  • Be cautious with your learning rate: too small and the model learns slowly, too large and it may diverge.
  • Initialize weights and bias properly (small random values or zeros).
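The points above (feature scaling, learning rate, initialization) can be seen together in a minimal gradient descent sketch. It assumes NumPy is the tool of choice and uses small synthetic data rather than the California Housing dataset; the function names and hyperparameter values are illustrative, not a prescribed solution.

```python
import numpy as np


def standardize(X):
    """Scale each feature column to zero mean and unit variance."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    return (X - mean) / std, mean, std


def fit_linear_regression(X, y, learning_rate=0.1, epochs=1000):
    """Fit y = X @ w + b by batch gradient descent on the mean squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)  # zero initialization
    b = 0.0
    for _ in range(epochs):
        error = X @ w + b - y
        # Gradients of MSE = mean(error**2) with respect to w and b.
        grad_w = (2.0 / n_samples) * (X.T @ error)
        grad_b = 2.0 * error.mean()
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic data (an assumption for the demo): y = 3*x + 4 exactly.
X_raw = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X_raw[:, 0] + 4.0
X, _, _ = standardize(X_raw)
w, b = fit_linear_regression(X, y)
print("MSE:", mse(y, X @ w + b))
```

Because the features are standardized, the learned bias settles at the mean of the targets and the weight absorbs the feature's original scale; the same scaling has to be applied to any data the model later predicts on.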

Task 6 : The Matrix Puzzle — Decode with NumPy & Reveal the Image

Get hands-on with NumPy and Matplotlib by solving a visual puzzle. You’ll be given a scrambled matrix, and your mission is to decode it into a hidden image using NumPy operations and visualization techniques.

Your Task:

  • Download the scrambled matrix from the link provided.
  • Use your knowledge of NumPy to manipulate, reshape, and reorient the matrix.
  • Reveal the secret image by plotting it using matplotlib.pyplot.imshow().
  • Scrambled Matrix: Download Here
  • NumPy Learning Doc: Explore Here

Learn NUMPY:

Learn Matplotlib:

Decode the Matrix using these clues and Visualize it :

  • "Try reshaping the encoded array into a square—how many elements are there?"
  • "The structure may be upright, but the data might be sideways. Look at its orientation."
  • "Sometimes the end is actually the beginning."

Expected Outcomes:

  • Gain confidence with NumPy operations like reshaping, slicing, flipping, and transposing.
  • Learn to visualize 2D arrays using Matplotlib.
  • Sharpen your debugging and puzzle-solving skills in a fun context.

Precautions:

  • Check the shape of the array before applying imshow() - wrong dimensions will throw errors.
  • Ensure that your reshaped matrix has the correct number of elements (it's likely a square!).
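To get a feel for the operations the clues point at, here is a small sketch on a made-up 4x4 array. It is not the puzzle's solution: the scrambling used here is an assumption chosen for the demo, and the real matrix may need a different order or combination of steps.

```python
import numpy as np

# A small made-up "image": 4x4 values, not the actual puzzle data.
original = np.arange(16).reshape(4, 4)

# Scramble it for the demo: transpose, flatten, then reverse the order.
scrambled = original.T.flatten()[::-1]

# Check that the element count is a perfect square before reshaping.
side = int(np.sqrt(scrambled.size))
assert side * side == scrambled.size

# Decode by undoing each scrambling step in reverse order.
decoded = scrambled[::-1].reshape(side, side).T

print(decoded.shape)
print(np.array_equal(decoded, original))
```

A decoded 2D array like this can then be passed to `matplotlib.pyplot.imshow()` to display it.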

TASK 7: Create a Portfolio Webpage

Create a website to showcase your portfolio - about yourself, interests, projects, social media profiles and more. It has to be responsive and also pushed to the git repository. CSS can be of your choice and any framework can be used.

TASK 8: Writing Resource Article using Markdown

Markdown is an easy-to-use markup language that adds formatting elements (headings, bulleted lists, URLs) to plain text without the use of a formal text editor or HTML tags. Markdown is device agnostic and displays the writing format consistently across device types. Write a technical resource article on a topic of your choice and post it on the MARVEL website. Refer to the linked article for further details.

Link

TASK 9: Tinkercad

Create a tinkercad account, get familiar with the application, understand the example circuits given and simulate a simple circuit using an ultrasonic sensor to estimate the distance between an obstacle and the sensor. Display the results on the serial monitor.

Create a radar system utilising an ultrasonic sensor and a servo motor to detect objects within a certain range. The ultrasonic sensor emits sound waves and measures the time taken for them to bounce back, while the servo motor rotates the sensor to cover a wider area, providing a simple yet effective detection mechanism.

Resource: https://youtu.be/NwmcNCvUcDc?si=x2LAYMFiqs1SzLfI

Task Outcome: an introduction to

  • Tinkercad
  • Working of the ultrasonic sensor and servo motor
  • Radar technology

Precautions/Safety Measures: none
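The time-of-flight idea in this task comes down to one line of arithmetic: distance is echo time multiplied by the speed of sound, halved because the pulse travels out and back. The sketch below shows that arithmetic in Python purely for illustration (the Tinkercad circuit itself is programmed through the Arduino environment); the speed-of-sound constant, the function names and the sample echo times are assumptions.

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # about 343 m/s in air at room temperature (assumed)


def echo_to_distance_cm(echo_duration_us):
    """Convert a round-trip echo time in microseconds to a one-way distance in cm."""
    # The pulse travels to the obstacle and back, so halve the round trip.
    return echo_duration_us * SPEED_OF_SOUND_CM_PER_US / 2


def sweep(readings_us, max_range_cm):
    """Return (servo angle, distance) pairs for echoes that fall within range."""
    detections = []
    for angle, duration in sorted(readings_us.items()):
        distance = echo_to_distance_cm(duration)
        if distance <= max_range_cm:
            detections.append((angle, round(distance, 1)))
    return detections


# Made-up echo times (microseconds) for three servo angles (degrees).
print(sweep({0: 1166.0, 90: 583.0, 180: 23324.0}, max_range_cm=100))
```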

TASK 10: Speed Control of DC Motor

Explore basic techniques for controlling DC motors and understand how to control them using the L298N motor driver and an Arduino board. Using an UNO and an L298N H-bridge motor driver, control the speed of a 5V BO motor. Try simulating this on Tinkercad, then perform it on the hardware and record videos of you doing the same.

Reference

TASK 11: LED Toggle Using ESP32

Learn how an ESP32 works and create a standalone web server on it that controls an LED connected to the ESP32's GPIOs. Use the Arduino IDE to write and upload the program, and learn to configure the IDE to upload code to an ESP32.

Reference

TASK 12: Soldering Prerequisites

(Soldering is to be done in presence of a coordinator)

Learn about the soldering equipment present in our lab: the solder, the soldering iron, the soldering wick, flux, etc. Learn to use them and perform basic soldering on a perf board (for example, an LED circuit) in the presence of a coordinator, and document the same.

Reference

TASK 13:

Design a 555 astable multivibrator with a 60% duty cycle, rig up the circuit on a breadboard and, using the probes, observe the output of your circuit on the DSO.

Resources:

Circuit
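For choosing component values, the usual timing relations for the 555 astable can be worked through numerically. The sketch below assumes the standard two-resistor configuration without a diode across R2 (the capacitor charges through R1 + R2 and discharges through R2); the component values are illustrative, not a prescribed design.

```python
import math


def astable_555(r1_ohm, r2_ohm, c_farad):
    """Return (frequency in Hz, duty cycle) for the assumed 555 astable circuit."""
    t_high = math.log(2) * (r1_ohm + r2_ohm) * c_farad  # charging through R1 + R2
    t_low = math.log(2) * r2_ohm * c_farad              # discharging through R2
    period = t_high + t_low
    return 1.0 / period, t_high / period


# Duty = (R1 + R2) / (R1 + 2*R2) = 0.6 rearranges to R2 = 2 * R1.
r1 = 10_000.0
frequency, duty = astable_555(r1, 2 * r1, 10e-9)
print(f"f = {frequency:.0f} Hz, duty = {duty:.2f}")
```

Any pair of resistors with R2 = 2 * R1 gives the 60% duty cycle in this configuration; the capacitor then sets the frequency.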

TASK 14: Karnaugh Maps and Deriving the logic circuit

Description: Consider the 4 cases formed by the door being locked or open and the key being pressed or not pressed. Derive the Karnaugh map and build a burglar alarm using simple logic circuits. The buzzer sounds or the LED blinks when certain conditions are met; you can use push buttons for the door and key.

(Tip: use logic gates, use k-maps to figure out the working conditions.)
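Before drawing the K-map it can help to enumerate the truth table. The sketch below does that for one hypothetical alarm rule (alarm when the door is open and the key is not pressed); the task does not state the rule, so treat this condition as an assumption and substitute your own.

```python
from itertools import product


def alarm(door_open, key_pressed):
    """Assumed rule: sound the alarm when the door opens without the key."""
    return door_open and not key_pressed


# Enumerate all four input cases; each row is one cell of the 2-variable K-map.
truth_table = [
    (door, key, alarm(door, key)) for door, key in product([False, True], repeat=2)
]
for door, key, out in truth_table:
    print(f"door_open={int(door)} key_pressed={int(key)} -> alarm={int(out)}")

# Minterms (cells equal to 1) that the K-map grouping has to cover.
minterms = [(int(d), int(k)) for d, k, out in truth_table if out]
print(minterms)
```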

TASK 15: Active Participation:

Take part in any technical event, inter or intra college and submit the issued certificate of participation.

Enroll for a MOOC and complete the course.

TASK 16: Datasheets report writing:

Topics:

  1. MQ135 gas sensor
  2. L293D motor driver

Task Description: Study the datasheet of any one of the above and write a report on it. For the L293D, describe the ICs used, PWM, the H-bridge, etc. For the MQ135, specify the calibrations for different gases and the Freundlich adsorption isotherm graph.

Task 17: Introduction to VR

Familiarise yourself with what Virtual Reality is. Make a detailed study of the difference between VR and AR. Mention the trends in the space and the technology stack being developed. Write about Indian companies in this space. Make the report detailed. Using generative AI to generate this study can lead to disqualification.


TASK 18: Sad servers - "Like LeetCode for Linux"

Sadservers is an excellent ground to test your Linux troubleshooting skills. Here is a troubleshooting scenario: Command Line Murders. Troubleshoot and Make Sad Servers Happy!

Command line murder
Linux commands

Task 19: Make a Web app

Using Express, create a resource library website where you can browse the available resource articles, books, etc. and also manage your account.
Reference

Domain Specific Tasks

Task 20 : Notebook Ninja – Getting Started with Jupyter

Familiarize yourself with Jupyter Notebook as a tool for both coding and communication. This task is designed to build confidence in writing clean, readable, and well-structured notebooks using both code and Markdown.

Your Task:

Complete the following challenges inside a single Jupyter notebook.

Quest 1: Markdown & Presentation Skills

  • Use Markdown to structure and beautify your notebook:
    • Add a title and section headers (#, ##)
    • Create a bullet list of 3 things you want to learn
    • Format text using bold, italics, and insert a horizontal line (---)
    • Embed an image
    • Include a code snippet using triple backticks

Quest 2: Python Coding & Visualization

  • Demonstrate Python coding and Jupyter features:
    • Declare two variables, do a calculation, and print the result
    • Plot a simple graph using Matplotlib (e.g., line plot)
    • Wrap it all into a mini-report with an intro and a closing summary

Expected Outcomes:

  • Know how to structure a professional, readable Jupyter notebook.
  • Understand Markdown basics for data storytelling.
  • Build confidence in combining code, comments, and visualizations.
  • Develop the habit of documenting your thought process clearly.

Precautions:

  • Don’t mix up code and markdown cells; switch cell types appropriately.
  • Don’t clutter; use whitespace and headers to maintain readability.
  • Keep your notebook clean by removing unused cells or redundant outputs.

Learn to Install and use Jupyter:

Task 21: Watch & Reflect – Intro to Machine Learning

Understand foundational ML concepts and data preparation techniques by watching two beginner-friendly videos and writing an article.

Watch the following videos:

  1. A Gentle Introduction to Machine Learning by StatQuest
  2. How is Data Prepared for Machine Learning? by AltexSoft

Your Task:

Write a resource article (300–500 words) summarizing the contents of the two videos and your understanding of them, and upload it to the resource articles page of the MARVEL website.


Level 2


System Requirements:

  • Jupyter Notebook (installed or set up)
  • Python IDE (VS Code or PyCharm)
  • Python libraries installed (numpy, pandas, matplotlib, scikit-learn, seaborn, joblib)

TASK 1 : MATLAB ML Onramp Course

Gain hands-on experience with Machine Learning fundamentals using MATLAB. This task is designed to introduce you to practical ML workflows through interactive and guided lessons in the MATLAB Machine Learning Onramp course.

Your Task:

Enroll in and complete the MATLAB Machine Learning Onramp course.

Expected Outcomes:

  • Understand the end-to-end flow of a machine learning project.
  • Gain exposure to supervised learning techniques in a new tool.
  • Learn data handling, model training, validation, and performance assessment.
  • Be able to compare MATLAB’s workflow with Python-based tools like scikit-learn.

Precautions:

Ensure a stable internet connection, as your progress might be lost if the page resets.

TASK 2 : Kaggle Crafter - Build & Publish Your Own Dataset

Learn the essentials of data curation, documentation, and publishing by creating and sharing your own dataset on Kaggle. This task will help you understand what makes a dataset usable, discoverable, and valuable to the data science community.

Your Task:

  • Create a dataset of your choice. It can be real or synthetic (fake), but should be cleanly organized.
  • Upload the dataset to Kaggle with proper metadata and formatting.
  • Your dataset should meet the following usability criteria (total score ≥ 8.5):

Usability Factors:

1. Completeness

Add a subtitle, tags, description, and a cover image

2. Credibility

  • Mention source/provenance (or clarify it’s synthetic)
  • Add a public notebook demonstrating use (if applicable)
  • State the update frequency clearly

3. Compatibility

  • Choose a proper license (like CC0 or CC BY)
  • Ensure it’s in a compatible format (.csv, .xlsx, .json, etc.)
  • Write clear descriptions for the file and columns

Expected Outcomes:

  • Understand what makes a dataset useful, trustworthy, and user-friendly
  • Practice data storytelling through documentation and presentation
  • Build visibility on Kaggle with a well-structured contribution
  • Gain confidence in preparing and sharing data professionally

Precautions:

  • Avoid sensitive, personal, or plagiarized data
  • Keep your dataset clean and minimal

Learn how to create Fake Datasets in Python:

Learn how to upload Dataset to Kaggle :

Article
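As one possible starting point for a synthetic dataset, the sketch below generates a few made-up records with Python's standard library and serializes them as CSV. The column names, value ranges and helper names are all invented for illustration; a real submission would need a schema that fits its own topic.

```python
import csv
import io
import random


def make_fake_students(n_rows, seed=0):
    """Generate made-up student records; every value is synthetic."""
    rng = random.Random(seed)  # fixed seed so the dataset is reproducible
    branches = ["CSE", "ECE", "ME"]
    rows = []
    for student_id in range(1, n_rows + 1):
        rows.append({
            "student_id": student_id,
            "branch": rng.choice(branches),
            "hours_studied": round(rng.uniform(0, 40), 1),
            "score": rng.randint(35, 100),
        })
    return rows


def to_csv(rows):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(make_fake_students(5))
print(csv_text)
```

Seeding the generator keeps the file reproducible, which makes it easier to state provenance clearly in the dataset description.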

TASK 3 : Data Detox - Data Cleaning using Pandas

Learn how to preprocess and clean raw, messy datasets using Pandas for better machine learning outcomes.

Your Task:

  1. Load the dataset and explore the types of issues present.
  2. Handle missing values by either dropping or imputing them.
  3. Fix inconsistencies in text or categorical columns (e.g., case mismatches, typos).
  4. Format columns correctly (e.g., dates as datetime, numbers as int/float).
  5. Remove duplicate rows, if any.
  6. Save the cleaned dataset as a new CSV.
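The steps above can be sketched on a tiny hand-made table. This is only an illustration under assumed column names and cleaning choices (median imputation, an "Unknown" label); the provided dataset will have its own columns and will need its own decisions.

```python
import pandas as pd

# Tiny made-up table with the kinds of problems listed above.
raw = pd.DataFrame({
    "name": [" Asha", "ravi ", "RAVI", "Meena", "Meena"],
    "city": ["Bengaluru", "bengaluru", "BENGALURU", None, None],
    "joined": ["2024-01-05", "2024-02-10", "2024-02-10", "not a date", "not a date"],
    "age": ["21", "22", "22", None, None],
})

df = raw.copy()  # keep the raw data untouched as a backup

# Fix inconsistent text: strip whitespace and normalize case.
df["name"] = df["name"].str.strip().str.title()
df["city"] = df["city"].str.strip().str.title()

# Convert types; unparseable values become NaT / NaN instead of raising.
df["joined"] = pd.to_datetime(df["joined"], format="%Y-%m-%d", errors="coerce")
df["age"] = pd.to_numeric(df["age"], errors="coerce")

# Handle missing values: impute age with the median, label unknown cities.
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna("Unknown")

# Remove duplicate rows that the normalization above has exposed.
df = df.drop_duplicates().reset_index(drop=True)

# Serialize to CSV text; pass a file path to to_csv to write a file instead.
cleaned_csv = df.to_csv(index=False)
print(df)
```

Note that the duplicates only become detectable after the text columns are normalized, which is why deduplication comes last here.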

Expected Outcomes:

  • Understand real-world data cleaning challenges.
  • Gain hands-on experience with Pandas data cleaning methods.
  • Learn to prepare data for analysis or modeling.

Precautions:

  • Always inspect the data before applying changes.
  • Don’t blindly drop nulls; understand their significance.
  • Keep a backup of raw data before applying transformations.

Download Dataset

Learn Pandas:
Learn to clean dataset using Pandas:

TASK 4 : Anomaly Detection

G-Flix Inc. suspects a breach, but not from the outside. Your job as a Data Forensics Officer is to detect unusual patterns in user activity logs using anomaly detection techniques. The twist? You don’t know what the anomaly looks like; it’s hidden in plain sight.

Your Task:

  • Load and explore the provided dataset.
  • Identify normal behavior trends using visualizations.
  • Apply at least two anomaly detection techniques:
    • Statistical (e.g., Z-score, IQR)
    • Unsupervised ML (e.g., Isolation Forest, DBSCAN)
  • Compare flagged anomalies from each method.
  • Prepare a final report with your top 5 suspects and evidence.
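The two statistical techniques named above can be sketched with the standard library alone. The data below is made up (it is not the G-Flix log), and the thresholds of 3 standard deviations and 1.5 times the IQR are common conventions rather than requirements of the task.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Indices whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Indices outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < low or v > high]


# Made-up daily login counts for one user; the last value is the planted oddity.
logins = [10, 12, 11, 9, 10, 11, 10, 12, 9, 10, 11, 10, 9, 12, 10, 11, 10, 9, 11, 10, 100]

by_zscore = set(zscore_outliers(logins))
by_iqr = set(iqr_outliers(logins))
print("flagged by both:", sorted(by_zscore & by_iqr))
print("flagged by only one method:", sorted(by_zscore ^ by_iqr))
```

Comparing the two index sets is a simple way to start the "compare flagged anomalies" step; points flagged by only one method are good candidates for a closer look.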

Expected Outcomes:

  • Understand real-world applications of anomaly detection.
  • Gain hands-on experience with unsupervised ML methods.
  • Learn how to differentiate between outliers and genuine anomalies.
  • Build effective visualizations for behavior profiling.
  • Develop investigative storytelling and reporting skills.

Precautions:

  • Scale your data before applying distance-based algorithms.
  • Don’t assume every outlier is an anomaly: context is key.
  • Use multiple features to justify suspicious behavior.
  • Validate anomalies through both visual and algorithmic evidence.

Download Dataset

Understand the concept and implementation:
Learn to implement Different Anomaly Detection Algorithms :

TASK 5 : Logistic Regression from Scratch

Understand binary classification through hands-on experience by building a logistic regression model from scratch and comparing it with a standard library implementation. The chosen use-case: predicting heart disease.

Your Task:

  • Implement Logistic Regression from Scratch
  • Implement Logistic Regression Using scikit-learn
  • Compare Models
  • Use metrics: accuracy, precision, recall, F1-score
  • Discuss:
    • Performance differences
    • Training time
    • Implementation difficulty and interpretability

Download Dataset

Expected Outcomes :

  • Master the inner mechanics of logistic regression
  • Practice matrix operations and gradient descent
  • Learn how scikit-learn abstracts complexity
  • Build confidence in choosing the right level of abstraction for ML tasks

Precautions

  • Watch out for issues like vanishing gradients and poor convergence.
  • Ensure your dataset is normalized.
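A minimal from-scratch sketch, assuming a single already-standardized feature and plain batch gradient descent on the log-loss. The toy data, learning rate and epoch count are illustrative; the heart disease dataset has many features and would need the vectorized equivalent.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def fit_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Single-feature logistic regression trained by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log-loss: average of (prediction - label) * input.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


def predict(xs, w, b, threshold=0.5):
    """Label each input 1 when its predicted probability reaches the threshold."""
    return [1 if sigmoid(w * x + b) >= threshold else 0 for x in xs]


# Toy, already-standardized feature (not the heart disease dataset).
xs = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
print(predict(xs, w, b))
```

The branch inside `sigmoid` avoids overflow in `math.exp` for large negative inputs, which is one of the convergence-related issues worth watching for.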

Understand Logistic Regression:
Implement Logistic Regression From Scratch:

TASK 6 : Battle-Test Your Model - Support Vector Machines

Understand and implement Support Vector Machines (SVM) using scikit-learn, then stress-test your model by injecting noise into the data to observe how its performance deteriorates. Use the Red Wine Quality dataset.

Your Task:

  • Implement SVM Using scikit-learn
  • Noise Robustness Experiment:
    • Gradually add Gaussian/random noise to the dataset:
      • Begin with small noise levels (e.g., ±1%)
      • Increase progressively (e.g., ±5%, ±10%, etc.)
  • At each level:
    • Retrain the model
    • Evaluate and log the metrics
    • Identify the breakdown point - the level of noise where the model starts to fail.
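One way to implement the noise step is sketched below, assuming NumPy arrays and interpreting a noise level as a fraction of each feature's standard deviation (the task leaves the exact definition open). Random stand-in data is used instead of the wine dataset, and the SVM training itself is left as a comment.

```python
import numpy as np


def add_gaussian_noise(X, noise_level, rng):
    """Return a copy of X with zero-mean Gaussian noise added to every feature.

    The noise standard deviation is `noise_level` times each column's own
    standard deviation, so 0.05 means roughly 5% noise on that scale.
    """
    scale = noise_level * X.std(axis=0)
    return X + rng.normal(loc=0.0, scale=1.0, size=X.shape) * scale


rng = np.random.default_rng(seed=0)
# Random stand-in features and labels (not the Red Wine Quality data).
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

for level in [0.0, 0.01, 0.05, 0.10, 0.50]:
    X_noisy = add_gaussian_noise(X, level, rng)  # labels y are left untouched
    drift = float(np.abs(X_noisy - X).mean())
    # In the real experiment, retrain and evaluate the SVM on X_noisy here.
    print(f"noise level {level:.2f}: mean absolute change {drift:.4f}")
```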

Visualize the Results

  • Create a line plot of performance metrics vs. noise level
  • Highlight the threshold where model performance drops sharply

Download Dataset

Expected Outcomes:

  • Be able to apply SVMs to real-world datasets
  • Understand how robust your model is to data corruption
  • Gain insight into hyperparameter tuning, model evaluation, and noise handling

Precautions:

  • Check for and handle missing or duplicate data.
  • Add noise only to features, not labels.
  • Gradually increase noise in small steps (e.g., std dev: 0.01 → 0.5).
  • Avoid complex kernels with small datasets.
Understand SVMs:
Implement SVMs:

TASK 7 : Fairness Meets Functionality

Use the Utrecht Fairness Recruitment Dataset from Kaggle, which contains anonymized recruitment data including age, gender, education, experience, and whether a candidate was hired.

Investigate potential biases in the model by analyzing its predictions across demographic groups such as gender and age. Use fairness metrics like demographic parity and equal opportunity to measure disparities. Discuss any unfair discrimination found and explore possible reasons behind it.

Your Task:

  • Build a Decision Tree from Scratch using ID3 Algorithm
  • Evaluate Model Performance
  • Use performance metrics:
    • Accuracy
    • Precision
    • Recall
    • F1-score
  • Analyze feature importance:
    • Which features were most influential in the tree?
    • Do these make intuitive or ethical sense?
  • Conduct a Fairness Analysis
    • Slice the data by demographic groups:
      • Gender (Male/Female/Other)
      • Age brackets (e.g., <25, 25–35, 35+)

Discuss:

  • Demographic Parity: Are hiring decisions independent of gender or age?
  • Equal Opportunity: Are qualified candidates from all groups equally likely to be hired?
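Both questions can be turned into numbers. A common reading, assumed here, is that demographic parity compares the rate of positive predictions per group, and equal opportunity compares the true positive rate per group. The groups, labels and predictions below are made up for illustration.

```python
from collections import defaultdict


def selection_rates(groups, y_pred):
    """Share of positive predictions per group (compared for demographic parity)."""
    counts, positives = defaultdict(int), defaultdict(int)
    for group, pred in zip(groups, y_pred):
        counts[group] += 1
        positives[group] += pred
    return {g: positives[g] / counts[g] for g in counts}


def true_positive_rates(groups, y_true, y_pred):
    """Share of truly positive cases predicted positive, per group (equal opportunity)."""
    actual, hits = defaultdict(int), defaultdict(int)
    for group, truth, pred in zip(groups, y_true, y_pred):
        if truth == 1:
            actual[group] += 1
            hits[group] += pred
    return {g: hits[g] / actual[g] for g in actual}


# Made-up labels and predictions for two groups, for illustration only.
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

rates = selection_rates(groups, y_pred)
tprs = true_positive_rates(groups, y_true, y_pred)
print("selection rate gap:", abs(rates["F"] - rates["M"]))
print("true positive rate gap:", abs(tprs["F"] - tprs["M"]))
```

A gap near zero suggests the criterion is roughly satisfied for that pair of groups; how large a gap counts as unfair is a judgment the report should argue for.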

Expected Outcomes:

  • Understand decision tree fundamentals and implementation
  • Learn how model decisions are formed
  • Gain awareness of bias detection techniques in machine learning
  • Explore responsible AI practices and ethical model deployment

Precautions:

  • Limit tree depth and set minimum samples per split.
  • Handle missing demographic info appropriately.
  • Clearly define demographic groups (e.g., gender, age brackets).
  • Watch for proxy variables causing indirect bias.
Understanding Decision Trees
Understand ID3 :
Implement ID3 :

TASK 8 : KNN with Ablation Study

In this task, you'll build a K-Nearest Neighbors (KNN) classifier using the Breast Cancer Wisconsin dataset. The goal is to not only train and evaluate the classifier, but also to conduct a feature ablation study to determine which features are most important for accurate classification. By removing one feature at a time and observing the effect on model performance, you'll identify which features significantly contribute to the model’s prediction.

Your Task:

  • Preprocess Data:
    • Drop id, encode diagnosis (M = 1, B = 0), and normalize features.
  • Train KNN Model:
    • Use KNeighborsClassifier (e.g., k = 5), a train-test split, and evaluate.
  • Feature Ablation:
    • Remove one feature at a time, retrain, and record metrics (accuracy, precision, recall, F1-score).
  • Analyze Impact:
    • Identify features whose removal drops performance the most.
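The ablation loop itself is short. The sketch below uses a small hand-written KNN and a made-up two-feature dataset (one informative feature, one noisy one) so that it runs without any libraries; in the task, `KNeighborsClassifier` and the real dataset take their place, and all four metrics would be recorded rather than accuracy alone.

```python
from collections import Counter


def knn_predict(train_X, train_y, point, k=3):
    """Majority label among the k training points closest to `point`."""
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, point)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    predictions = [knn_predict(train_X, train_y, p, k) for p in test_X]
    return sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)


def drop_feature(rows, index):
    """Copy of `rows` with the feature at `index` removed."""
    return [row[:index] + row[index + 1:] for row in rows]


# Made-up data, already scaled to [0, 1]: feature 0 separates the classes.
feature_names = ["informative", "noise"]
train_X = [(0.0, 0.9), (0.1, 0.1), (0.2, 0.5), (0.8, 0.2), (0.9, 0.8), (1.0, 0.4)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(0.05, 0.7), (0.15, 0.3), (0.85, 0.6), (0.95, 0.1)]
test_y = [0, 0, 1, 1]

baseline = accuracy(train_X, train_y, test_X, test_y)
drops = {}
for i, name in enumerate(feature_names):
    score = accuracy(drop_feature(train_X, i), train_y, drop_feature(test_X, i), test_y)
    drops[name] = baseline - score  # larger drop = more important feature
print("baseline:", baseline)
print("accuracy drop per removed feature:", drops)
```

Keeping k and the train-test split fixed across all runs is what makes the drops comparable.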

Expected Outcomes

  • Understand how KNN works and why feature scaling is essential.
  • Gain hands-on experience in evaluating model performance.
  • Learn how removing features affects model behavior and accuracy.
  • Discover which features are most informative in medical datasets.
  • Appreciate the value of feature selection and ablation studies in ML pipelines.

Precautions

  • Normalize data before KNN.
  • Use consistent k-value for fair comparison.
  • Compare all 4 metrics, not just accuracy.
  • Ensure only one feature is removed at a time.
Understand KNN :
Implementation:

TASK 9 : Evaluation Metrics - Pick the Best Performer!

You will receive 5 pretrained ML models saved as .pkl files. Your goal is to evaluate and compare them using a test dataset and identify the best-performing model.

Your Task:

  • Load the test dataset using pandas.
  • Load each pickle file using joblib.
  • Use the model to predict on the test set.
  • Evaluate:
    • Classification: accuracy, precision, recall, F1-score.
    • Regression: MSE, RMSE, R².
  • Compare all scores and conclude which model performs best, with reasons.
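To make the metric definitions concrete, here is a plain-Python sketch of both families. It works on lists of labels and predictions, which stand in for the output of a loaded model; the sample values are made up, and in practice the equivalent scikit-learn metric functions would normally be used.

```python
import math


def classification_report(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}


def regression_report(y_true, y_pred):
    """MSE, RMSE and R^2 for numeric targets."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1 - ss_res / ss_tot}


# Made-up labels and predictions, standing in for a loaded model's output.
clf_scores = classification_report([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
reg_scores = regression_report([3.0, 5.0, 7.0], [2.0, 5.0, 8.0])
print(clf_scores)
print(reg_scores)
```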

Expected Outcomes:

  • Learn how to load and evaluate saved models.
  • Understand different evaluation metrics and their significance.
  • Develop critical analysis by comparing models based on real performance.
  • Gain experience in handling multiple model types efficiently.
  • Learn to make informed decisions on model selection.

Precautions:

  • Ensure consistent preprocessing (scaling, encoding) between training and testing.
  • Check if model type matches the dataset (classification vs regression).
  • Handle exceptions if a model fails to load or predict.
  • Verify test data shape matches model input requirements.

Download Pickle Files

Install Joblib

Learn how to use Joblib for machine learning:
Evaluation Metrics:

Level 3


Task 1 - Decision Tree based ID3 Algorithm

  1. Understanding Basic Terminology
  2. Understand ID3
  3. Implement ID3 for
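The core of ID3 is choosing, at each node, the feature with the highest information gain. The sketch below shows only that selection step for categorical features, on a tiny made-up dataset; building the full tree means applying it recursively to each subset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction from splitting `rows` on a categorical `feature`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_feature(rows, labels, features):
    """Pick the feature with the highest information gain for this node."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Tiny made-up dataset for illustration.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
print(best_feature(rows, labels, ["outlook", "windy"]))
```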

Task 2 - Naive Bayesian Classifier

  1. Understand Naive Bayesian Classifier, watch it in action using sklearn
  2. Implement Naive Bayesian Classifier for text classification and other applicable datasets

Task 3 - Ensemble techniques

  1. What are ensemble techniques
  2. Apply the ensemble techniques on the Titanic Dataset

Task 4 - Random Forest, GBM and Xgboost

  1. Random forest
    1. Understand
    2. Implement
  2. GBM
    1. Understand
    2. Implement
  3. Xgboost
    1. Understand
    2. Implement

Task 5 - Hyperparameter Tuning

  1. Understanding
  2. Pick a suitable problem (and dataset) and train a model to fit the problem.
  3. Tune the hyperparameters of the model to increase accuracy

Task 6 : Image Classification using KMeans Clustering

Image classification is an important area in real-world applications of machine learning. K-means clustering is a simple algorithm that partitions the data into ‘k’ clusters: it repeatedly assigns each point to its nearest centroid and moves each centroid to the average of its assigned points, which reduces the distance between points and their centroids.

Resources:

  1. Understanding K Means Clustering:
  2. Classify a given set of images into a given number of categories using KMeans Clustering using MNIST dataset
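A bare-bones sketch of the k-means loop (Lloyd's algorithm) on a few made-up 2D points. The initial centroids are passed in explicitly to keep the example deterministic, which is an assumption of this sketch; for MNIST, each point would be a flattened image vector, and since clustering is unsupervised the resulting clusters still have to be mapped to digit labels.

```python
import math


def kmeans(points, centroids, max_iters=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: the centroids stopped moving
        centroids = new_centroids
    return labels, centroids


# Made-up points forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```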

Task 7: Anomaly Detection

Anomaly detection is a way to detect erroneous data points in a stream, by looking at statistical differences. Anomaly detection can be done through unsupervised or supervised learning methods.

Resources:

Task 8: Generative AI Task Using GAN

Develop a generative adversarial network (GAN) model to generate realistic images of a specific category, such as faces, animals, or landscapes. Customize the GAN architecture and train it on a dataset relevant to the chosen category to produce high-quality and diverse synthetic images.

Resources:

Task Outcomes:

  1. Implementation and training of a GAN model tailored to a specific image category.
  2. Generating diverse and realistic synthetic images using the trained GAN.
  3. Demonstrating understanding of GAN architecture and its applications in generative tasks.

Task 9: PDF Query Using LangChain

Utilize LangChain, a framework for building applications on top of language models, to extract relevant information from PDF documents based on user queries. Develop a system that can interpret user queries, process PDF documents, and retrieve relevant sections or excerpts using language understanding techniques.

Resources:

  • LangChain Documentation: Link
  • PDF Parsing with Python: Link
  • Natural Language Understanding (NLU): Link

Task Outcomes:

  1. Development of a PDF query system using LangChain.
  2. Implementation of PDF parsing and text extraction functionality.
  3. Integration of natural language processing techniques for query interpretation.
  4. Testing and validation of the system with various PDF documents and queries.
  5. Documentation of system architecture, functionality, and usage guidelines.

Task 10: Table Analysis Using PaddleOCR

Employ PaddleOCR, an Optical Character Recognition (OCR) toolkit, to extract and analyze tabular data from images or scanned documents. Develop a pipeline that can accurately detect tables, extract data, and perform analysis such as statistical computations or data visualization.

Resource Links:

  • PaddleOCR Documentation: Link
  • Tabular Data Extraction: Link
  • Data Analysis with Python: Link

Task Outcomes:

  1. Implementation of a table detection and extraction pipeline using PaddleOCR.
  2. Development of algorithms for tabular data analysis, including statistical computations.
  3. Integration of data visualization techniques to represent extracted data.
  4. Evaluation of pipeline accuracy and performance on various image datasets.
  5. Documentation of the process, including code, methodologies, and results.

Level 4


Task 1: Create an end-to-end mini project using the skills you have learnt.

Here's an outline of the steps we'll follow:

  • Set up Environment: Install necessary libraries and dependencies.
  • Collect Data: Obtain a dataset of recipes and their ingredients.
  • Train Model: Train a simple machine learning model to recommend recipes based on input ingredients.
  • Deploy: Deploy the chatbot web interface to a platform like Heroku.

UVCE,
K. R Circle,
Bengaluru 01