2 / 1 / 2024
Use scikit-learn’s linear_model.LinearRegression()
Here's a brief explanation of both linear and logistic regression:
Linear regression is a statistical method used to model the relationship between a dependent variable (target) and one or more independent variables (features) by fitting a linear equation to observed data. The basic form of a linear regression equation with one independent variable is:
y = mx + b
Linear regression aims to find the best-fitting line that minimizes the difference between the observed values and the values predicted by the model. It is commonly used for predicting continuous outcomes, such as predicting house prices based on square footage, number of bedrooms, etc.
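A minimal sketch of this idea (the data here is made up): fit a line to noisy points generated from known m and b, and check that scikit-learn recovers them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 3x + 4 with a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 2, size=(100, 1))
y = 4 + 3 * X[:, 0] + rng.normal(0, 0.1, size=100)

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)  # near 3 and 4
```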
[here](https://colab.research.google.com/drive/1JQQ0DanRt-UakMghQyESqg-RoddcA1nu?usp=sharing)
Use scikit-learn’s linear_model.LogisticRegression()
Logistic regression is a statistical method used for binary classification tasks, where the target variable has only two possible outcomes (e.g., yes/no, 1/0). However, it can also be extended to handle multi-class classification problems (e.g., distinguishing between different species of flowers, as in the Iris dataset).
Instead of fitting a straight line to the data, logistic regression fits an S-shaped logistic function (sigmoid function) to estimate the probability that a given input belongs to a certain class. The logistic function maps any real-valued number into a value between 0 and 1, making it suitable for classification tasks.
The logistic regression equation can be represented as follows:
P(y = 1 | x) = 1 / (1 + e^(-(mx + b)))
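A quick numerical check (illustrative values only) that the sigmoid squashes any real number into the interval (0, 1):

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real z to a probability in (0, 1)
    return 1 / (1 + np.exp(-z))

for z in (-10, 0, 10):
    print(z, sigmoid(z))  # large negative -> near 0; zero -> 0.5; large positive -> near 1
```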
Logistic regression models are trained to learn the optimal weights and bias that maximize the likelihood of the observed data belonging to their respective classes. It is commonly used in various fields such as healthcare (predicting disease occurrence), marketing (customer churn prediction), and finance (credit risk assessment).
[here](https://colab.research.google.com/drive/1b5XWFzaAi7tAK0iSsLklZxOAwsLpnphl?usp=sharing)
1. Import Libraries
2. Set Axes Label and Limits
3. Create a Figure with Multiple Plots using Subplot
4. Add a Legend to the Plot
5. Save Your Plot as PNG
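The five steps above can be sketched in one script (the data and the filename are placeholders):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs headless
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

# Step 3: a figure with multiple plots via subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Step 2: axes labels and limits
ax1.plot(x, np.sin(x), label="sin(x)")
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.set_xlim(0, 2 * np.pi)
ax1.set_ylim(-1.2, 1.2)

ax2.plot(x, np.cos(x), label="cos(x)")
ax2.set_xlabel("x")
ax2.set_ylabel("y")

# Step 4: add legends
ax1.legend()
ax2.legend()

# Step 5: save as PNG (placeholder filename)
fig.savefig("my_plot.png", dpi=150)
```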
[here](https://github.com/HemaShenoy/marvel)
LINK: https://colab.research.google.com/drive/1FTyIpK4r-2wqp9-qnCN9sZuHm3Gup3bo?usp=drive_link
import numpy as np
# Small array
small_array = np.array([[1, 2], [3, 4]])
# Repeat the small array along each dimension
repeated_array = np.tile(small_array, (3, 2))
print("Small Array:")
print(small_array)
print("\nArray by Repeating Small Array:")
print(repeated_array)
import numpy as np
# Specify the shape of the desired array
array_shape = (4, 5, 3)
# Generate an array with element indexes in ascending order
index_array = np.arange(np.prod(array_shape)).reshape(array_shape)
print("Array with Element Indexes:")
print(index_array)
LINK: https://colab.research.google.com/drive/101UNvWYXa9dZ5lMrXQCD_Iybag6oBE2P?usp=sharing
Regression models predict continuous numerical values, and scikit-learn provides various algorithms like Linear Regression, Decision Trees, Random Forests, and Support Vector Machines (SVM). Before diving into metrics, let's grasp some key concepts:
Metrics help measure how well a regression model fits the data. Key metrics include Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R² (the coefficient of determination). For example, MAE is the average absolute difference between true and predicted values:
from sklearn.metrics import mean_absolute_error
true_values = [2.5, 3.7, 1.8, 4.0, 5.2]
predicted_values = [2.1, 3.9, 1.7, 3.8, 5.0]
mae = mean_absolute_error(true_values, predicted_values)
print("Mean Absolute Error:", mae)
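The same toy arrays can be scored with the other common metrics (MSE, RMSE, and R²), again using scikit-learn:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

true_values = [2.5, 3.7, 1.8, 4.0, 5.2]
predicted_values = [2.1, 3.9, 1.7, 3.8, 5.0]

mse = mean_squared_error(true_values, predicted_values)   # average squared error
rmse = np.sqrt(mse)                                       # back in the target's units
r2 = r2_score(true_values, predicted_values)              # 1.0 means a perfect fit
print("MSE:", mse, "RMSE:", rmse, "R^2:", r2)
```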
Understanding these metrics is crucial for assessing regression model performance. In a practical example using scikit-learn, we applied metrics to evaluate a Linear Regression model on house prices. This process helps gauge the model's accuracy and effectiveness in predicting continuous values.
Linear regression is a linear approach to modeling the relationship between a dependent variable and one or more independent variables.
import numpy as np
import matplotlib.pyplot as plt
# Generate linear-like data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
# Your Linear Regression Implementation
# ...
# Scikit-learn Linear Regression
from sklearn.linear_model import LinearRegression
sklearn_lr = LinearRegression()
sklearn_lr.fit(X, y)
# Plot the data and the linear regression line
plt.scatter(X, y, label='Data')
plt.plot(X, sklearn_lr.predict(X), color='red', label='Scikit-learn Linear Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression')
plt.legend()
plt.show()
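One way to fill in the "# Your Linear Regression Implementation" placeholder above is the normal equation, theta = (XᵀX)⁻¹Xᵀy; this sketch regenerates the same toy data so it runs on its own, and the result should match scikit-learn's fit.

```python
import numpy as np

# Same toy data as above
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Prepend a column of ones so theta[0] is the intercept and theta[1] the slope
X_b = np.c_[np.ones((100, 1)), X]
theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y
print("Intercept, slope:", theta.ravel())  # should land near 4 and 3
```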
Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome.
import numpy as np
import matplotlib.pyplot as plt
# Generate logistic-like data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = (X > 1).astype(int).ravel()
# Your Logistic Regression Implementation
# ...
# Scikit-learn Logistic Regression
from sklearn.linear_model import LogisticRegression
sklearn_logreg = LogisticRegression()
sklearn_logreg.fit(X, y)
# Plot the data and the logistic regression curve
plt.scatter(X, y, label='Data')
X_test = np.linspace(0, 2, 300).reshape(-1, 1)
plt.plot(X_test, sklearn_logreg.predict_proba(X_test)[:, 1], color='red', label='Scikit-learn Logistic Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Logistic Regression with Complicated Data')
plt.legend()
plt.show()
In this example, we generate a dataset with a logistic-like relationship.
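Similarly, the "# Your Logistic Regression Implementation" placeholder could be filled with a minimal gradient-descent sketch on the log-loss (the toy data is regenerated so the snippet stands alone; the learning rate and iteration count are arbitrary choices):

```python
import numpy as np

# Same toy data as above: label is 1 when X > 1
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = (X > 1).astype(int).ravel()

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Gradient descent on the log-loss for a single weight w and bias b
w, b = 0.0, 0.0
lr = 0.5
for _ in range(2000):
    p = sigmoid(w * X.ravel() + b)        # predicted probabilities
    w -= lr * np.mean((p - y) * X.ravel())
    b -= lr * np.mean(p - y)

preds = (sigmoid(w * X.ravel() + b) >= 0.5).astype(int)
print("Training accuracy:", (preds == y).mean())
```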
Link: https://colab.research.google.com/drive/1rR699_0qcK9ZsA_xHymq1XMS4zrYmzmS?usp=sharing
Understand the K-Nearest Neighbour algorithm and implement it, first with a built-in interface and then from scratch. Compare the results of both on the indicated datasets. References:
Implement KNN using scikit-learn’s neighbors.KNeighborsClassifier for multiple suitable datasets
Understanding the algorithm
Implement KNN from scratch. Compare results with scikit-learn’s built-in method for different datasets.
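A sketch of that comparison, using the Iris dataset as one suitable choice; the from-scratch version is a minimal Euclidean-distance majority vote, not the only possible implementation:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Built-in KNN
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
sk_preds = knn.predict(X_test)

# From-scratch KNN: majority vote among the k nearest training points
def knn_predict(X_train, y_train, X_test, k=5):
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
        nearest = y_train[np.argsort(dists)[:k]]      # labels of the k closest
        preds.append(np.bincount(nearest).argmax())   # most common label wins
    return np.array(preds)

my_preds = knn_predict(X_train, y_train, X_test)
print("Agreement with scikit-learn:", (my_preds == sk_preds).mean())
```

Small disagreements can appear when distance ties are broken differently, but the two versions should agree on nearly every test point.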
[here](https://github.com/HemaShenoy/marvel)
A blog about your understanding of neural networks and types like CNN, ANN, etc.
Decoding Convolutional Neural Networks: Insights into CNNs and ANNs
Building GPT-4: A Comprehensive Guide to Creating Advanced Large Language Models
Curve Fitting - Model a curve fit for a simple function of your choice on Desmos.
1. Enter the Quadratic Function: In the input bar, enter the quadratic function f(x) = 2x^2 + 3x − 5.
2. Add Noisy Data Points: To simulate real-world data with noise, add some random data points around the quadratic function.
3. Fit a Quadratic Curve: After adding the data points, click the wrench icon in the upper right corner of the table to adjust settings. Under "Regression Type," choose "Quadratic"; Desmos will automatically fit the best quadratic curve to your data points.
4. Visualize the Fit: You will see the original function f(x) = 2x^2 + 3x − 5 plotted in blue and the curve fitted to the noisy data points in red.
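The same fit can be reproduced outside Desmos; here is a sketch with NumPy's polynomial fit (the noise level is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 50)
y_true = 2 * x**2 + 3 * x - 5                 # f(x) = 2x^2 + 3x - 5
y_noisy = y_true + rng.normal(0, 2, x.size)   # noisy "data points"

# Fit a degree-2 polynomial, analogous to Desmos's quadratic regression
coeffs = np.polyfit(x, y_noisy, 2)
print("Recovered coefficients:", coeffs)  # close to [2, 3, -5]
```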
DESMOS LINKS:
Fourier Transforms - Fourier transforms are perhaps the most important function approximators used today. Model a Fourier transform for a function of your choice in MATLAB.
f(t) = 1, if 0 ≤ t < T/2; f(t) = −1, if T/2 ≤ t < T, where T is the period of the square wave.
Create a signal that consists of multiple sinusoidal components at different frequencies and perform its Fourier transform to analyze its frequency content.
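The same analysis MATLAB's fft performs can be sketched in Python with NumPy's FFT (the sampling rate and component frequencies here are arbitrary choices):

```python
import numpy as np

fs = 1000                       # sampling rate in Hz (arbitrary choice)
t = np.arange(0, 1, 1 / fs)     # one second of samples

# Signal with sinusoidal components at 50 Hz and 120 Hz
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

# Magnitude spectrum of the real-valued signal
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, 1 / fs)

# The two largest peaks should sit exactly at the component frequencies
peaks = sorted(freqs[np.argsort(spectrum)[-2:]])
print("Dominant frequencies:", peaks)  # [50.0, 120.0]
```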
Use Plotly for data visualization. It is an advanced visualization library, more dynamic than the commonly used Matplotlib or Seaborn.
[example 1](https://colab.research.google.com/drive/1U1f2q9sCEGC789wq46MnK4KoGX6Vajm3?usp=sharing)
[example 2](https://colab.research.google.com/drive/1y9QlUlGFXtbg5GX-NcGv7WFk41GXlABH?usp=sharing)
A decision tree is a supervised learning algorithm that can be used for regression or classification tasks. It organizes conditional statements into a hierarchy so that, for a given event, you obtain the likelihoods of the possible outcomes.
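A minimal sketch with scikit-learn (Iris is used here as a stand-in dataset; depth and split are arbitrary choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each internal node is a conditional test on one feature;
# each leaf holds the predicted class (or class probabilities)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))
```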
[here](https://github.com/HemaShenoy/marvel)