Rahuldrabit/Machine-Learning-Project-Track

🤖 Machine Learning Project Track

A curated collection of Machine Learning and Deep Learning projects completed while studying through Coursera and at CUET (Chittagong University of Engineering & Technology), ranging from classical ML algorithms to advanced neural network architectures.



🗂 Projects Overview

| # | Project | Domain | Algorithms / Techniques | Language |
|---|---------|--------|-------------------------|----------|
| 1 | KAN on MNIST | Deep Learning | Kolmogorov-Arnold Networks, B-splines | Python |
| 2 | ML in C++ | Classical ML | Neural Network from scratch | C++ |
| 3 | Graph NN Quark-Gluon | Physics / GNN | GCN, EdgeConv, GAT | Python |
| 4 | Leukemia Detection | Medical AI | CNN, DenseNet, ResNet | Python |
| 5 | Number Classification | Computer Vision | Deep Learning (MNIST) | Python |
| 6 | Iris Clustering | Unsupervised ML | K-Means, DBSCAN, Correlation | Python |
| 7 | Car Price Prediction | Regression | Linear, RandomForest, Polynomial | Python |
| 8 | Titanic Prediction | Classification | RandomForest, AdaBoost, DecisionTree | Python |
| 9 | Naive Bayes | Classification | Gaussian/Multinomial Naive Bayes | Python |
| 10 | Plant Seedling Recognition | Computer Vision | CNN, Transfer Learning | Python |
| 11 | House Price Prediction | Regression | Regression Models | Python |

1. 🧠 Kolmogorov-Arnold Network on MNIST

Repository: Kolmogorov-Arnold-Network_MNIST

Overview

This project implements a Kolmogorov-Arnold Network (KAN) — a novel neural network architecture that uses learnable B-spline activation functions on edges rather than fixed activations on nodes (unlike traditional MLPs). The model is applied to the well-known MNIST handwritten digit classification benchmark.
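To make "learnable B-spline activation functions on edges" concrete, here is a minimal pure-Python sketch of one such edge function. It is not the repository's `BSplineBasis` or `KANLayer` code: the function names, the cubic degree, and the uniform knot vector over [-1, 1] are all assumptions chosen for illustration. Only the count of 16 bases comes from this README.

```python
def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function of the given degree at x (Cox-de Boor recursion)."""
    # Degree 0: indicator of the half-open knot interval [t_i, t_{i+1}).
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        nxt = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = (knots[i + d + 1] - x) / right_den * basis[i + 1] if right_den > 0 else 0.0
            nxt.append(left + right)
        basis = nxt
    return basis


def edge_activation(x, coeffs, knots, degree=3):
    """One edge's learnable 1-D function: phi(x) = sum_i c_i * B_i(x)."""
    return sum(c * b for c, b in zip(coeffs, bspline_basis(x, knots, degree)))


# Assumed setup: 16 cubic bases on a uniform knot vector extended past [-1, 1].
NUM_BASES, DEGREE = 16, 3
step = 2.0 / (NUM_BASES - DEGREE)
knots = [-1.0 + (i - DEGREE) * step for i in range(NUM_BASES + DEGREE + 1)]
```

In a KAN layer the coefficients `coeffs` would be the trainable parameters, one set per input-output edge. With every coefficient set to 1 the function returns 1 anywhere inside [-1, 1), because B-spline bases sum to one there.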

Key Features

  • 🔬 Custom BSplineBasis and KANLayer modules built from scratch in PyTorch
  • 🏗️ MNISTKAN architecture with dimensionality reduction + two KAN layers
  • 📊 Full training pipeline with checkpointing, loss/accuracy curves, confusion matrix, and basis function visualization
  • 🎯 Achieves 90.95% test accuracy on MNIST in just 15 epochs

Architecture Highlights

| Component | Details |
|-----------|---------|
| Model | MNISTKAN (B-Spline KAN layers) |
| Optimizer | Adam (lr = 0.001) |
| Batch Size | 128 |
| Epochs | 15 |
| Hidden Dim | 64 |
| B-spline Bases | 16 |

Results

| Metric | Value |
|--------|-------|
| Final Test Accuracy | 90.95% |
| Final Test Loss | 0.3104 |

Tech Stack

Python · PyTorch · torchvision · matplotlib · numpy · tqdm


2. ⚙️ Machine Learning in C++

Repository: MachineLearning_cPlusPlus

Overview

This project builds a Machine Learning framework from scratch using C++, without relying on high-level ML libraries. It implements a neural network capable of reading and processing the MNIST dataset (raw binary IDX format) entirely in C++. This project demonstrates a deep understanding of the mathematical foundations underlying neural networks.

Project Structure

MachineLearning_cPlusPlus/
├── include/
│   ├── data.hpp            # Data point structure
│   └── data_handler.hpp    # MNIST binary file reader
├── src/
│   ├── data.cc             # Data implementation
│   └── data_handler.cc     # IDX file parsing logic
├── MakeFile                # Build configuration
└── main                    # Compiled executable

Key Features

  • 📦 Raw MNIST binary (IDX) file parsing in pure C++
  • 🔧 Custom DataHandler class for train/test/validation splitting
  • 🧮 Neural network logic built entirely without ML libraries
  • 🛠️ Makefile-based build system
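For readers unfamiliar with the IDX layout that the C++ `DataHandler` reads, the sketch below shows the header structure. It is written in Python purely to illustrate the byte layout and is not a port of the repository's parser; the function name `parse_idx` is hypothetical. An IDX file starts with two zero bytes, a data-type byte (0x08 for unsigned bytes, as used by MNIST), and a dimension-count byte, followed by one big-endian 32-bit size per dimension.

```python
import struct


def parse_idx(raw: bytes):
    """Parse an IDX buffer holding unsigned-byte data; return (dims, flat_values)."""
    # Magic number: two zero bytes, a type code, and the number of dimensions.
    zero, dtype_code, ndim = struct.unpack_from(">HBB", raw, 0)
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Each dimension size is a big-endian unsigned 32-bit integer.
    dims = struct.unpack_from(">" + "I" * ndim, raw, 4)
    count = 1
    for d in dims:
        count *= d
    offset = 4 + 4 * ndim
    values = list(raw[offset:offset + count])
    if len(values) != count:
        raise ValueError("file is shorter than its header claims")
    return dims, values
```

For the MNIST image files the dimensions are (number of images, rows, columns); for the label files there is a single dimension.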

Tech Stack

C++ · Standard Library · Makefile · MNIST IDX format


3. 🔬 Graph Neural Networks for Quark-Gluon Classification

Repository: Graph_NN_Pythia8_Quark-Gluon

Overview

This project applies Graph Neural Networks (GNNs) to high-energy physics: classifying quark jets vs. gluon jets using data generated from the Pythia8 Monte Carlo event generator. Jets are represented as graphs where particles are nodes and spatial relationships form edges — a natural fit for GNN-based classification.
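One common way to turn a jet into a graph is to connect each particle to its nearest neighbours in the (eta, phi) plane. The README does not say how the repository builds its edges, so the sketch below is an assumption: the k-nearest-neighbour rule, the function name `knn_edges`, and the toy particle values are all illustrative.

```python
import math


def knn_edges(particles, k=3):
    """Directed edges (i, j): j is one of the k nearest neighbours of i in the (eta, phi) plane."""
    def delta_r(a, b):
        d_eta = a[0] - b[0]
        # Wrap the azimuthal difference into [-pi, pi) so phi = 3.1 and phi = -3.1 count as close.
        d_phi = (a[1] - b[1] + math.pi) % (2 * math.pi) - math.pi
        return math.hypot(d_eta, d_phi)

    edges = []
    for i, p in enumerate(particles):
        others = sorted((j for j in range(len(particles)) if j != i),
                        key=lambda j: delta_r(p, particles[j]))
        edges.extend((i, j) for j in others[:k])
    return edges


# Toy jet: each particle is (eta, phi, pT); the values are made up.
jet = [(0.00, 0.10, 50.0), (0.05, 0.12, 20.0), (0.40, -0.30, 5.0),
       (0.02, 3.10, 1.0), (0.03, -3.10, 2.0)]
edges = knn_edges(jet, k=1)
```

The resulting edge list is the kind of structure a PyTorch Geometric model consumes as an edge index, with the per-particle features (here eta, phi, pT) as node attributes.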

Models Implemented

| Model | Test Accuracy | AUC Score |
|-------|---------------|-----------|
| GCN (Graph Convolutional Network) | 73.0% | 0.797 |
| EdgeConv ✅ Best | 75.5% | 0.821 |
| GAT (Graph Attention Network) | 67.2% | 0.730 |

Best Model: EdgeConv with AUC = 0.821

Key Features

  • ⚛️ Particle physics jet data processed as graph structures
  • 🏆 Three GNN architectures compared: GCN, EdgeConv, GAT
  • 📈 Training/validation curves and confusion matrices
  • 🗃️ Modular code architecture: separate models, utils, and data processing

Tech Stack

Python · PyTorch Geometric · Pythia8 · NumPy · Matplotlib


4. 🩺 Leukemia Cell Detection

Repository: LeukemiaCellDetection

Overview

This project addresses a critical medical AI challenge: detecting Leukemia from microscopic blood smear images. It implements and compares multiple deep learning approaches — from training CNNs from scratch with data augmentation to leveraging powerful transfer learning with DenseNet and ResNet architectures.

Dataset

Six leukemia and blood cell classes:

| Class | Description |
|-------|-------------|
| ALL | Acute Lymphoblastic Leukemia |
| AML | Acute Myeloid Leukemia |
| CLL | Chronic Lymphocytic Leukemia |
| CML | Chronic Myeloid Leukemia |
| MM | Multiple Myeloma |
| Healthy | Normal (non-cancerous) cells |

Data Split: 60% Train · 20% Validation · 20% Test
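A 60/20/20 split can be sketched as below. This is a plain shuffled split with a fixed seed; whether the repository stratifies by class or uses a library helper is not stated here, so treat the function name `split_60_20_20` and the seed as assumptions.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle a copy of items and cut it into 60% train, 20% validation, 20% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding surprises in the cut points.
    n_train = len(shuffled) * 60 // 100
    n_val = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

With six classes of possibly unequal size, a stratified split (applying the same cut within each class) would keep class proportions equal across the three sets.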

Models Implemented

  • CNN with Data Augmentation — Random flipping, rotation, and zooming for better generalization
  • DenseNet121 Transfer Learning — Pre-trained feature extractor with a custom classification head (with and without fully connected layers)
  • ResNet50 Transfer Learning — Pre-trained backbone with a custom classification head
  • Fine-tuning — Partial unfreezing of base model layers with a lower learning rate

Tech Stack

Python · TensorFlow 2.x · Keras · NumPy · Matplotlib · Seaborn · Scikit-learn


5. 🔢 Number Classification (MNIST)

Repository: NumberClassification

Overview

This project tackles the classic MNIST handwritten digit classification problem, serving as a foundational deep learning project. The Jupyter notebook (ComputerProjectFirstProject.ipynb) walks through the complete pipeline: data loading, model building, training, and evaluation of neural network models on the 70,000-sample MNIST dataset.

Key Features

  • 📒 Interactive Jupyter Notebook workflow
  • 🔍 Exploratory data analysis of digit images
  • 🧠 Neural network model trained on 60,000 training images
  • 📊 Evaluation on 10,000 test images with accuracy metrics

Tech Stack

Python · Jupyter Notebook · TensorFlow / Keras · NumPy · Matplotlib


6. 🌸 Iris Unsupervised Clustering

Repository: Iris_Unsupervised_KMeans_DBScan_Correlation

Overview

This project applies unsupervised machine learning to the famous Iris flower dataset to discover natural groupings among flower species without using labels. It compares clustering algorithms and analyzes feature correlations to understand the underlying data structure.

Techniques Applied

| Technique | Purpose |
|-----------|---------|
| K-Means Clustering | Partition-based clustering to find K natural groups |
| DBSCAN | Density-based clustering — handles noise and non-spherical clusters |
| Correlation Analysis | Feature correlation heatmaps to understand relationships |

Key Features

  • 📊 Elbow method for optimal K selection in K-Means
  • 🔍 DBSCAN with epsilon and min-samples tuning
  • 🌡️ Correlation heatmaps showing feature interdependencies
  • 📈 2D/3D cluster visualizations with PCA dimensionality reduction
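The elbow method listed above can be illustrated with a dependency-free sketch: run K-Means for several values of K and look for the point where the within-cluster sum of squares (inertia) stops dropping sharply. The project itself uses Scikit-learn; the code below is a hand-written Lloyd's algorithm with a deterministic farthest-point initialisation, and both the function name and the toy data are assumptions (these are not the Iris measurements).

```python
def kmeans_inertia(points, k, iters=20):
    """Run Lloyd's algorithm and return the within-cluster sum of squared distances."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Deterministic farthest-point initialisation (an assumption; k-means++ is the usual choice).
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centres)))

    for _ in range(iters):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centres[i]))
            clusters[nearest].append(p)
        # Move each centre to the mean of its members; keep it in place if the cluster is empty.
        centres = [tuple(sum(col) / len(c) for col in zip(*c)) if c else centres[i]
                   for i, c in enumerate(clusters)]

    return sum(min(sq_dist(p, c) for c in centres) for p in points)


# Toy data with three obvious groups (made-up values).
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
        (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
        (9.0, 1.0), (9.2, 1.1), (8.9, 0.9)]
inertias = {k: kmeans_inertia(data, k) for k in range(1, 6)}
```

On this toy data the inertia falls steeply up to K = 3 and only marginally afterwards, which is the "elbow" one looks for when plotting inertia against K.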

Tech Stack

Python · Scikit-learn · Pandas · Matplotlib · Seaborn · NumPy


7. 🚗 Car Price Prediction

Repository: CarPricePrediction_Linear_RandomForest_Regressor_PolynomialFeatures

Overview

This project builds and compares multiple regression models to predict used car prices based on features like make, model, year, mileage, engine specs, and more. It explores progressively more complex models to improve prediction accuracy.

Models Compared

| Model | Description |
|-------|-------------|
| Linear Regression | Baseline regression with one-hot encoding |
| Polynomial Features + Linear Regression | Captures non-linear relationships in car pricing |
| Random Forest Regressor | Ensemble method for robust, non-linear regression |

Key Features

  • 🔧 Feature engineering with Polynomial Features to capture non-linear relationships
  • 🌲 Random Forest with hyperparameter tuning
  • 📉 R² score, MAE, and RMSE comparison across models
  • 📊 Feature importance visualization from Random Forest
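To show what a polynomial feature expansion does before a linear model is fitted, here is a small stand-alone sketch. The project relies on Scikit-learn for this step; the helper `polynomial_features` below is a hypothetical re-implementation for illustration only.

```python
from itertools import combinations_with_replacement


def polynomial_features(row, degree=2):
    """Expand one feature row into all monomials up to the given degree, bias term first."""
    expanded = []
    for d in range(degree + 1):
        # Each combination of feature indices (with repetition) is one monomial of degree d.
        for combo in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for idx in combo:
                term *= row[idx]
            expanded.append(term)
    return expanded
```

For two features a and b at degree 2 this yields [1, a, b, a², ab, b²]. A linear regression fitted on these columns can then model curvature and interactions (for example, price depending on the product of age and mileage) while remaining linear in its coefficients.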

Tech Stack

Python · Scikit-learn · Pandas · Matplotlib · NumPy


8. 🚢 Titanic Survival Prediction

Repository: Titanic_dataset_RandomForest_AdaBoost_DecisionTree

Overview

This project solves the classic Titanic survival prediction problem — one of the most well-known introductory ML datasets. The goal is to predict whether a passenger survived the Titanic disaster based on features like age, sex, ticket class, and family size. Three ensemble and tree-based classifiers are implemented and compared.

Models Implemented

| Model | Description |
|-------|-------------|
| Decision Tree | Simple, interpretable tree-based classifier |
| Random Forest | Ensemble of decision trees with bagging |
| AdaBoost | Boosting algorithm that sequentially corrects errors |

Key Features

  • 🧹 Data preprocessing: missing value imputation, feature encoding
  • 📊 Feature importance analysis
  • 🔄 Cross-validation for robust performance estimation
  • 📈 Accuracy, precision, recall, F1-score comparison across models
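The four comparison metrics named above follow directly from the confusion counts of a binary classifier. The sketch below computes them by hand, with 1 standing for "survived"; the function name `binary_report` and the example labels are assumptions, and the project itself presumably uses Scikit-learn's metric functions.

```python
def binary_report(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Reporting precision and recall alongside accuracy matters on Titanic-style data because the classes are imbalanced: a model can score well on accuracy while still missing many survivors.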

Tech Stack

Python · Scikit-learn · Pandas · Matplotlib · Seaborn


9. 📬 Naive Bayes Classifiers

Repository: Naive_Bayes_classifiers

Overview

This project provides a comprehensive implementation and exploration of Naive Bayes classification — a family of probabilistic classifiers based on Bayes' theorem with a "naive" assumption of feature independence. The project includes both a Jupyter Notebook for interactive exploration and a standalone Python script.

Variants Explored

  • Gaussian Naive Bayes — For continuous features (assumes Gaussian distribution)
  • Multinomial Naive Bayes — For discrete count features (e.g., text classification)
  • Bernoulli Naive Bayes — For binary/boolean features
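As an illustration of the Gaussian variant, the sketch below fits a per-class prior, mean, and variance for each feature, then predicts the class with the highest log-posterior under the feature-independence assumption. It is not the code in `naive_bayes_classifier.py`; the function names, the variance-smoothing constant `eps`, and the toy data are assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate a prior plus per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # eps keeps the variance strictly positive for constant features.
        variances = [sum((v - m) ** 2 for v in col) / n + eps
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Pick the class maximising log prior + sum of per-feature Gaussian log-likelihoods."""
    def log_posterior(params):
        prior, means, variances = params
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))
```

Summing log-likelihoods over features is exactly where the "naive" independence assumption enters: the joint likelihood is treated as a product of one-dimensional Gaussians.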

Key Features

  • 📐 Mathematical derivation and intuition behind Bayes' theorem
  • 📊 Comparison of Naive Bayes variants on multiple datasets
  • 🎯 Accuracy, confusion matrix, and classification reports
  • 🐍 Reusable Python module (naive_bayes_classifier.py)

Tech Stack

Python · Scikit-learn · NumPy · Pandas · Matplotlib


10. 🌱 Plant Seedling Recognition

Repository: Plant_Seedling_Recognition

Overview

This is a Computer Vision project focused on identifying and classifying different plant seedling species from images. Accurate plant identification at the seedling stage has important applications in precision agriculture, helping farmers distinguish crops from weeds early in the growing season. It was submitted as the project for an online computer vision course.

Key Features

  • 🌿 Multi-class image classification of plant seedling species
  • 🔄 Data preprocessing and augmentation pipeline
  • 🏗️ CNN-based architecture for feature extraction
  • 📊 Training visualization and per-class accuracy analysis

Tech Stack

Python · TensorFlow / Keras · NumPy · Matplotlib · Jupyter Notebook


11. 🏠 House Price Prediction

Repository: House_Price_Prediction

Overview

This project predicts residential house prices based on various structural and locational features of properties. It was submitted as the project for an online ML course and demonstrates a complete machine learning pipeline, from exploratory data analysis (EDA) and feature engineering to model training and evaluation.

Key Features

  • 📊 In-depth Exploratory Data Analysis (EDA) with visualizations
  • 🔧 Feature engineering: handling missing values, encoding categorical variables, feature scaling
  • 🤖 Regression model training and hyperparameter tuning
  • 📉 Performance evaluation with MAE, RMSE, and R² metrics
  • 🗺️ Geographic/spatial feature analysis (if location data is available)
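The three evaluation metrics listed above can be computed by hand as follows. This is a minimal sketch with a hypothetical function name and made-up prices; the project itself presumably calls Scikit-learn's metric functions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # R^2 compares the model's squared error with that of always predicting the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2
```

MAE and RMSE are in the same units as the target (currency here), with RMSE penalising large misses more heavily, while R² is unitless and measures the share of variance explained.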

Tech Stack

Python · Scikit-learn · Pandas · NumPy · Matplotlib · Seaborn · Jupyter Notebook


🛠 Tech Stack Summary

| Category | Technologies |
|----------|--------------|
| Languages | Python 3.x, C++ |
| Deep Learning | PyTorch, TensorFlow 2.x, Keras |
| ML Libraries | Scikit-learn, PyTorch Geometric |
| Data Processing | NumPy, Pandas |
| Visualization | Matplotlib, Seaborn |
| Environments | Jupyter Notebook, Python Scripts |
| Build Tools | Makefile (C++) |
| Physics Simulation | Pythia8 (Monte Carlo) |

👤 Author

Rahul Drabit Chowdhury

Machine Learning Enthusiast | Physics Researcher | Software Developer


Projects completed while studying through Coursera and at CUET (Chittagong University of Engineering & Technology)

⭐ Star this repo if you find it helpful!
