A curated collection of Machine Learning and Deep Learning projects completed through Coursera courses and coursework at CUET (Chittagong University of Engineering & Technology), spanning classical ML algorithms to advanced neural network architectures.
- Projects Overview
- 1. Kolmogorov-Arnold Network on MNIST
- 2. Machine Learning in C++
- 3. Graph Neural Networks for Quark-Gluon Classification
- 4. Leukemia Cell Detection
- 5. Number Classification (MNIST)
- 6. Iris Unsupervised Clustering
- 7. Car Price Prediction
- 8. Titanic Survival Prediction
- 9. Naive Bayes Classifiers
- 10. Plant Seedling Recognition
- 11. House Price Prediction
- Tech Stack Summary
| # | Project | Domain | Algorithms / Techniques | Language |
|---|---|---|---|---|
| 1 | KAN on MNIST | Deep Learning | Kolmogorov-Arnold Networks, B-splines | Python |
| 2 | ML in C++ | Classical ML | Neural Network from scratch | C++ |
| 3 | Graph NN Quark-Gluon | Physics / GNN | GCN, EdgeConv, GAT | Python |
| 4 | Leukemia Detection | Medical AI | CNN, DenseNet, ResNet | Python |
| 5 | Number Classification | Computer Vision | Deep Learning (MNIST) | Python |
| 6 | Iris Clustering | Unsupervised ML | K-Means, DBSCAN, Correlation | Python |
| 7 | Car Price Prediction | Regression | Linear, RandomForest, Polynomial | Python |
| 8 | Titanic Prediction | Classification | RandomForest, AdaBoost, DecisionTree | Python |
| 9 | Naive Bayes | Classification | Gaussian/Multinomial Naive Bayes | Python |
| 10 | Plant Seedling Recognition | Computer Vision | CNN, Transfer Learning | Python |
| 11 | House Price Prediction | Regression | Regression Models | Python |
Repository: Kolmogorov-Arnold-Network_MNIST
This project implements a Kolmogorov-Arnold Network (KAN) — a novel neural network architecture that uses learnable B-spline activation functions on edges rather than fixed activations on nodes (unlike traditional MLPs). The model is applied to the well-known MNIST handwritten digit classification benchmark.
- 🔬 Custom `BSplineBasis` and `KANLayer` modules built from scratch in PyTorch
- 🏗️ `MNISTKAN` architecture with dimensionality reduction plus two KAN layers
- 📊 Full training pipeline with checkpointing, loss/accuracy curves, confusion matrix, and basis function visualization
- 🎯 Achieves 90.95% test accuracy on MNIST in just 15 epochs
| Component | Details |
|---|---|
| Model | MNISTKAN (B-Spline KAN layers) |
| Optimizer | Adam (lr = 0.001) |
| Batch Size | 128 |
| Epochs | 15 |
| Hidden Dim | 64 |
| B-spline Bases | 16 |
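The B-spline bases at the heart of each KAN layer can be evaluated with the Cox-de Boor recursion. A minimal NumPy sketch follows; the repo's `BSplineBasis` is a PyTorch module, so the names and knot layout here are illustrative assumptions, not the project's exact code:

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions of the given degree at the
    points x via the Cox-de Boor recursion. Returns (len(x), n_bases)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # degree 0: indicator functions of the knot intervals
    B = np.array([(x >= t[i]) & (x < t[i + 1])
                  for i in range(len(t) - 1)], dtype=float).T
    for k in range(1, degree + 1):
        n_bases = len(t) - k - 1
        nxt = np.zeros((len(x), n_bases))
        for i in range(n_bases):
            dl, dr = t[i + k] - t[i], t[i + k + 1] - t[i + 1]
            if dl > 0:   # left recursion term
                nxt[:, i] += (x - t[i]) / dl * B[:, i]
            if dr > 0:   # right recursion term
                nxt[:, i] += (t[i + k + 1] - x) / dr * B[:, i + 1]
        B = nxt
    return B

knots = np.linspace(0, 1, 14)                     # uniform knot vector
basis = bspline_basis(np.linspace(0.2, 0.8, 50), knots, degree=2)
```

A KAN edge then applies a learned linear combination of such bases (16 per edge in this model) as its activation, instead of a fixed nonlinearity on the node.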
Results:

| Metric | Value |
|---|---|
| Final Test Accuracy | 90.95% |
| Final Test Loss | 0.3104 |
Python · PyTorch · torchvision · matplotlib · numpy · tqdm
Repository: MachineLearning_cPlusPlus
This project builds a Machine Learning framework from scratch using C++, without relying on high-level ML libraries. It implements a neural network capable of reading and processing the MNIST dataset (raw binary IDX format) entirely in C++. This project demonstrates a deep understanding of the mathematical foundations underlying neural networks.
MachineLearning_cPlusPlus/
├── include/
│ ├── data.hpp # Data point structure
│ └── data_handler.hpp # MNIST binary file reader
├── src/
│ ├── data.cc # Data implementation
│ └── data_handler.cc # IDX file parsing logic
├── MakeFile # Build configuration
└── main # Compiled executable
- 📦 Raw MNIST binary (IDX) file parsing in pure C++
- 🔧 Custom `DataHandler` class for train/test/validation splitting
- 🧮 Neural network logic built entirely without ML libraries
- 🛠️ Makefile-based build system
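The IDX format parsed by the C++ `data_handler` has a simple big-endian layout: a 4-byte magic number (two zero bytes, a dtype code, the dimension count) followed by one uint32 per dimension. A Python sketch of the same header parse, for illustration only since the repo does this with raw byte reads in C++:

```python
import struct

def parse_idx_header(buf):
    """Parse an IDX header: two zero bytes, a dtype code, the number of
    dimensions, then one big-endian uint32 per dimension."""
    dtype_code, n_dims = buf[2], buf[3]
    dims = struct.unpack(f">{n_dims}I", buf[4:4 + 4 * n_dims])
    return dtype_code, dims

# Header of the MNIST training-image file: magic 0x00000803,
# then 60000 images of 28x28 pixels (dtype code 0x08 = unsigned byte).
header = bytes.fromhex("00000803") + struct.pack(">3I", 60000, 28, 28)
dtype_code, dims = parse_idx_header(header)
```

The pixel bytes follow the header row-major, which is why the reader can slice images out at fixed 784-byte strides.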
C++ · Standard Library · Makefile · MNIST IDX format
Repository: Graph_NN_Pythia8_Quark-Gluon
This project applies Graph Neural Networks (GNNs) to high-energy physics: classifying quark jets vs. gluon jets using data generated from the Pythia8 Monte Carlo event generator. Jets are represented as graphs where particles are nodes and spatial relationships form edges — a natural fit for GNN-based classification.
| Model | Test Accuracy | AUC Score |
|---|---|---|
| GCN (Graph Convolutional Network) | 73.0% | 0.797 |
| EdgeConv ✅ Best | 75.5% | 0.821 |
| GAT (Graph Attention Network) | 67.2% | 0.730 |
Best Model: EdgeConv with AUC = 0.821
- ⚛️ Particle physics jet data processed as graph structures
- 🏆 Three GNN architectures compared: GCN, EdgeConv, GAT
- 📈 Training/validation curves and confusion matrices
- 🗃️ Modular code architecture: separate models, utils, and data processing
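As a sketch of the "particles as nodes, spatial relationships as edges" idea, one common jet-graph construction links each particle to its k nearest neighbours in the (eta, phi) plane. This is an illustrative assumption, not necessarily the exact edge scheme used in this repo:

```python
import numpy as np

def jet_to_graph(eta, phi, k=3):
    """Connect each particle to its k nearest neighbours by delta-R in
    the (eta, phi) plane; returns a (2, n_edges) COO edge index of the
    kind PyTorch Geometric consumes."""
    eta, phi = np.asarray(eta, float), np.asarray(phi, float)
    dphi = np.abs(phi[:, None] - phi[None, :])
    dphi = np.minimum(dphi, 2 * np.pi - dphi)          # phi wraps around
    dR = np.hypot(eta[:, None] - eta[None, :], dphi)   # pairwise delta-R
    np.fill_diagonal(dR, np.inf)                       # forbid self-loops
    nbrs = np.argsort(dR, axis=1)[:, :k]               # k nearest per node
    src = np.repeat(np.arange(len(eta)), k)
    return np.stack([src, nbrs.ravel()])

edge_index = jet_to_graph([0.1, 0.2, 0.5, 1.0], [0.0, 0.1, 0.2, 3.0], k=2)
```

EdgeConv is a natural fit for such graphs because it learns from feature differences along exactly these neighbour edges.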
Python · PyTorch Geometric · Pythia8 · NumPy · Matplotlib
Repository: LeukemiaCellDetection
This project addresses a critical medical AI challenge: detecting Leukemia from microscopic blood smear images. It implements and compares multiple deep learning approaches — from training CNNs from scratch with data augmentation to leveraging powerful transfer learning with DenseNet and ResNet architectures.
Six leukemia and blood cell classes:
| Class | Description |
|---|---|
| ALL | Acute Lymphoblastic Leukemia |
| AML | Acute Myeloid Leukemia |
| CLL | Chronic Lymphocytic Leukemia |
| CML | Chronic Myeloid Leukemia |
| MM | Multiple Myeloma |
| Healthy | Normal (non-cancerous) cells |
Data Split: 60% Train · 20% Validation · 20% Test
- CNN with Data Augmentation — Random flipping, rotation, and zooming for better generalization
- DenseNet121 Transfer Learning — Pre-trained feature extractor with a custom classification head (with and without additional fully connected layers)
- ResNet50 Transfer Learning — Pre-trained backbone with a custom classification head
- Fine-tuning — Partial unfreezing of base model layers with a lower learning rate
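The 60/20/20 split above can be sketched at the index level as follows (an unstratified illustration; the project may stratify by class):

```python
import numpy as np

def split_indices(n, seed=0):
    """Shuffle sample indices and split them 60/20/20 into
    train/validation/test partitions."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train_idx, val_idx, test_idx = split_indices(1000)
```

Splitting by index rather than copying image arrays keeps memory use flat even for large image datasets.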
Python · TensorFlow 2.x · Keras · NumPy · Matplotlib · Seaborn · Scikit-learn
Repository: NumberClassification
This project tackles the classic MNIST handwritten digit classification problem, serving as a foundational deep learning project. The Jupyter notebook (`ComputerProjectFirstProject.ipynb`) walks through the complete pipeline: data loading, model building, training, and evaluation of neural network models on the 70,000-sample MNIST dataset.
- 📒 Interactive Jupyter Notebook workflow
- 🔍 Exploratory data analysis of digit images
- 🧠 Neural network model trained on 60,000 training images
- 📊 Evaluation on 10,000 test images with accuracy metrics
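The data-loading step typically ends with preprocessing of this shape; exact details in the notebook may differ, so treat this as a sketch:

```python
import numpy as np

def preprocess(images, labels, n_classes=10):
    """Typical MNIST preprocessing: flatten 28x28 images, scale pixel
    values to [0, 1], and one-hot encode the digit labels."""
    X = images.reshape(len(images), -1).astype(np.float32) / 255.0
    Y = np.eye(n_classes, dtype=np.float32)[labels]
    return X, Y

# dummy batch standing in for real MNIST data
X, Y = preprocess(np.random.randint(0, 256, (32, 28, 28)),
                  np.random.randint(0, 10, 32))
```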
Python · Jupyter Notebook · TensorFlow / Keras · NumPy · Matplotlib
Repository: Iris_Unsupervised_KMeans_DBScan_Correlation
This project applies unsupervised machine learning to the famous Iris flower dataset to discover natural groupings among flower species without using labels. It compares clustering algorithms and analyzes feature correlations to understand the underlying data structure.
| Technique | Purpose |
|---|---|
| K-Means Clustering | Partition-based clustering to find K natural groups |
| DBSCAN | Density-based clustering — handles noise and non-spherical clusters |
| Correlation Analysis | Feature correlation heatmaps to understand relationships |
- 📊 Elbow method for optimal K selection in K-Means
- 🔍 DBSCAN with epsilon and min-samples tuning
- 🌡️ Correlation heatmaps showing feature interdependencies
- 📈 2D/3D cluster visualizations with PCA dimensionality reduction
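The elbow method compares K-Means inertia across candidate k values. A minimal from-scratch sketch of the idea (the project itself uses Scikit-learn's `KMeans`):

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Minimal Lloyd's K-Means returning labels and inertia, the
    quantity plotted against k in the elbow method."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(n_iter):
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):                      # move centers to cluster means
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return labels, float(((X - centers[labels]) ** 2).sum())

# two well-separated 4-point blobs: the elbow appears at k = 2
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
inertias = {k: kmeans(X, k)[1] for k in (1, 2)}
```

The sharp inertia drop from k=1 to k=2, followed by only marginal gains, is exactly the "elbow" used to pick K.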
Python · Scikit-learn · Pandas · Matplotlib · Seaborn · NumPy
Repository: CarPricePrediction_Linear_RandomForest_Regressor_PolynomialFeatures
This project builds and compares multiple regression models to predict used car prices based on features like make, model, year, mileage, engine specs, and more. It explores progressively more complex models to improve prediction accuracy.
| Model | Description |
|---|---|
| Linear Regression | Baseline regression with one-hot encoding |
| Polynomial Features + Linear Regression | Captures non-linear relationships in car pricing |
| Random Forest Regressor | Ensemble method for robust, non-linear regression |
- 🔧 Feature engineering with Polynomial Features to capture non-linear relationships
- 🌲 Random Forest with hyperparameter tuning
- 📉 R² score, MAE, and RMSE comparison across models
- 📊 Feature importance visualization from Random Forest
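The gain from Polynomial Features shows up even on a toy example. The quadratic mileage-price relationship below is assumed purely for illustration; the project uses Scikit-learn's `PolynomialFeatures` on real car data:

```python
import numpy as np

rng = np.random.default_rng(0)
mileage = rng.uniform(0, 200, 100)       # thousands of km (synthetic)
price = 30 - 0.2 * mileage + 0.0005 * mileage**2 + rng.normal(0, 0.5, 100)

def fit_r2(X, y):
    """Least-squares fit; returns the R^2 score on the training data."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

linear = np.column_stack([np.ones(100), mileage])
poly = np.column_stack([np.ones(100), mileage, mileage**2])  # degree-2 features
r2_lin, r2_poly = fit_r2(linear, price), fit_r2(poly, price)
```

Because the linear design matrix is nested inside the polynomial one, the polynomial fit can never score worse on the training data; the question the project answers is how much it helps on held-out data.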
Python · Scikit-learn · Pandas · Matplotlib · NumPy
Repository: Titanic_dataset_RandomForest_AdaBoost_DecisionTree
This project solves the classic Titanic survival prediction problem — one of the most well-known introductory ML datasets. The goal is to predict whether a passenger survived the Titanic disaster based on features like age, sex, ticket class, and family size. Three ensemble and tree-based classifiers are implemented and compared.
| Model | Description |
|---|---|
| Decision Tree | Simple, interpretable tree-based classifier |
| Random Forest | Ensemble of decision trees with bagging |
| AdaBoost | Boosting algorithm that sequentially corrects errors |
- 🧹 Data preprocessing: missing value imputation, feature encoding
- 📊 Feature importance analysis
- 🔄 Cross-validation for robust performance estimation
- 📈 Accuracy, precision, recall, F1-score comparison across models
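The preprocessing step above can be sketched as follows; the column choices and encodings are illustrative, not the repo's exact pipeline:

```python
import numpy as np

def preprocess(age, sex, pclass):
    """Titanic-style preprocessing: impute missing ages with the median,
    binary-encode sex, one-hot encode passenger class."""
    age = np.asarray(age, dtype=float)
    age = np.where(np.isnan(age), np.nanmedian(age), age)   # median imputation
    sex_enc = (np.asarray(sex) == "female").astype(float)   # binary encoding
    pclass_1h = np.eye(3)[np.asarray(pclass) - 1]           # one-hot, classes 1-3
    return np.column_stack([age, sex_enc, pclass_1h])

X = preprocess([22, np.nan, 38], ["male", "female", "female"], [3, 1, 1])
```

The resulting numeric matrix is what the Decision Tree, Random Forest, and AdaBoost models consume.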
Python · Scikit-learn · Pandas · Matplotlib · Seaborn
Repository: Naive_Bayes_classifiers
This project provides a comprehensive implementation and exploration of Naive Bayes classification — a family of probabilistic classifiers based on Bayes' theorem with a "naive" assumption of feature independence. The project includes both a Jupyter Notebook for interactive exploration and a standalone Python script.
- Gaussian Naive Bayes — For continuous features (assumes Gaussian distribution)
- Multinomial Naive Bayes — For discrete count features (e.g., text classification)
- Bernoulli Naive Bayes — For binary/boolean features
- 📐 Mathematical derivation and intuition behind Bayes' theorem
- 📊 Comparison of Naive Bayes variants on multiple datasets
- 🎯 Accuracy, confusion matrix, and classification reports
- 🐍 Reusable Python module (`naive_bayes_classifier.py`)
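Gaussian Naive Bayes reduces to per-class feature means, variances, and log-priors combined via Bayes' theorem under the independence assumption. A from-scratch sketch of that idea; the repo's `naive_bayes_classifier.py` may differ in details:

```python
import numpy as np

class GaussianNB:
    """Gaussian Naive Bayes from first principles."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(0) for c in self.classes])
        self.var = np.array([X[y == c].var(0) + 1e-9 for c in self.classes])
        self.logprior = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # log P(c | x) ∝ log P(c) + sum_j log N(x_j; mu_cj, var_cj)
        ll = -0.5 * (np.log(2 * np.pi * self.var[None]) +
                     (X[:, None, :] - self.mu[None]) ** 2 / self.var[None]).sum(-1)
        return self.classes[(ll + self.logprior).argmax(1)]

X = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [5.1, 4.8]])
y = np.array([0, 0, 1, 1])
pred = GaussianNB().fit(X, y).predict(np.array([[1.1, 1.0], [5.0, 4.9]]))
```

Working in log-space avoids the numerical underflow that multiplying many small likelihoods would otherwise cause.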
Python · Scikit-learn · NumPy · Pandas · Matplotlib
Repository: Plant_Seedling_Recognition
This is a Computer Vision project focused on identifying and classifying different plant seedling species from images. Accurate plant identification at the seedling stage has important applications in precision agriculture — helping farmers distinguish crops from weeds early in the growing season. The project was submitted as an online CV course project.
- 🌿 Multi-class image classification of plant seedling species
- 🔄 Data preprocessing and augmentation pipeline
- 🏗️ CNN-based architecture for feature extraction
- 📊 Training visualization and per-class accuracy analysis
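Label-preserving augmentation of the kind listed above can be sketched in NumPy; the project likely uses Keras preprocessing utilities, so this is purely illustrative:

```python
import numpy as np

def augment(img, rng):
    """Random horizontal flip plus a random 90-degree rotation, the kind
    of label-preserving transform used to expand a small image dataset."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                    # horizontal flip
    return np.rot90(img, k=rng.integers(4))   # rotate by 0/90/180/270 degrees

rng = np.random.default_rng(0)
batch = [augment(np.arange(16).reshape(4, 4), rng) for _ in range(8)]
```

For seedlings such transforms are safe because species identity is invariant to orientation, unlike, say, digit images where flips change the label.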
Python · TensorFlow / Keras · NumPy · Matplotlib · Jupyter Notebook
Repository: House_Price_Prediction
This project predicts residential house prices based on various structural and locational features of properties. It was submitted as an online ML course project and demonstrates a complete machine learning pipeline — from exploratory data analysis (EDA) and feature engineering to model training and evaluation.
- 📊 In-depth Exploratory Data Analysis (EDA) with visualizations
- 🔧 Feature engineering: handling missing values, encoding categorical variables, feature scaling
- 🤖 Regression model training and hyperparameter tuning
- 📉 Performance evaluation with MAE, RMSE, and R² metrics
- 🗺️ Geographic/spatial feature analysis (where location data is available)
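The three evaluation metrics used across the regression projects here are quick to compute by hand; a small reference implementation:

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Compute MAE, RMSE, and R^2 for a set of predictions."""
    err = y_true - y_pred
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    r2 = 1 - (err ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()
    return mae, rmse, r2

# toy sanity check on three prices (values are illustrative)
mae, rmse, r2 = regression_report(np.array([200.0, 300.0, 400.0]),
                                  np.array([210.0, 290.0, 410.0]))
```

MAE and RMSE share the target's units while R^2 is scale-free, which is why reporting all three gives a fuller picture than any one alone.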
Python · Scikit-learn · Pandas · NumPy · Matplotlib · Seaborn · Jupyter Notebook
| Category | Technologies |
|---|---|
| Languages | Python 3.x, C++ |
| Deep Learning | PyTorch, TensorFlow 2.x, Keras |
| ML Libraries | Scikit-learn, PyTorch Geometric |
| Data Processing | NumPy, Pandas |
| Visualization | Matplotlib, Seaborn |
| Environments | Jupyter Notebook, Python Scripts |
| Build Tools | Makefile (C++) |
| Physics Simulation | Pythia8 (Monte Carlo) |
Projects completed while learning through Coursera and at CUET (Chittagong University of Engineering & Technology).
⭐ Star this repo if you find it helpful!