Machine learning operations (MLOps) is a set of practices that:
- Automate and simplify machine learning (ML) workflows and deployments
- Bring DevOps discipline to building, shipping, and running ML models
- Improve reliability, reproducibility, and productivity across the ML lifecycle
A typical data science lifecycle runs through these stages:
- Understand the problem and use case - Define business objectives and identify the specific problem to solve with data science.
- Exploratory Data Analysis (EDA) - Understand data patterns, distributions, relationships, and anomalies.
- Data pre-processing - Clean and prepare data: handle null values, treat outliers (IQR, box plots, Q-Q plots), and standardize features.
- Feature engineering - Create new meaningful features from existing data to improve model performance.
- Feature selection - Identify and select the most relevant features that contribute to the predictive power of the model.
- Model training and hyperparameter tuning - Train machine learning models and optimize their parameters for best performance.
- Model evaluation - Assess model performance using appropriate metrics to ensure it meets business requirements.
- App building/UI - Develop user interface and application to make the model accessible to end users.
- Deploy - Deploy the model to production environment where it can serve real-world predictions.
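The core modeling stages above (pre-processing, feature selection, training with hyperparameter tuning, evaluation) can be sketched end-to-end with scikit-learn. The dataset and hyperparameter grid below are illustrative placeholders, not prescriptions:

```python
# Minimal sketch of the lifecycle's modeling stages using scikit-learn.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for real data: a synthetic binary-classification dataset
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Pre-processing -> feature selection -> model, chained in one pipeline
pipe = Pipeline([
    ("scale", StandardScaler()),               # standardization
    ("select", SelectKBest(f_classif, k=10)),  # feature selection
    ("model", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning via cross-validated grid search
search = GridSearchCV(pipe, {"model__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Model evaluation on held-out data
test_accuracy = search.score(X_test, y_test)
print(f"best C={search.best_params_['model__C']}, accuracy={test_accuracy:.3f}")
```

Keeping every step inside one `Pipeline` object is what later makes the whole workflow versionable and reproducible.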
Without MLOps, ML projects commonly run into these problems:
- Low coding standards - Missing OOP concepts, modular coding, logging, and exception handling
- No data management - No systematic data ingestion or artifact management
- No versioning - Code, data, and model versions are not tracked
- No reproducible pipelines - Lack of reproducible data pipelines and experiment tracking
- No CI/CD - Missing continuous integration and continuous deployment practices
- No production scalability & monitoring - Missing tools like Kubernetes, Prometheus, and Grafana
- Cross-team friction - Communication and coordination gaps between data science, engineering, and operations teams
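To make the data-versioning problem concrete, here is a minimal sketch of the core idea behind tools like DVC: identify each dataset snapshot by a content hash and record it next to the code. File names and the manifest format are illustrative, not DVC's actual format:

```python
# Minimal sketch of content-hash data versioning (the idea behind DVC).
import hashlib
import json
from pathlib import Path

def data_version(path: Path) -> str:
    """Return a short content hash that uniquely identifies this file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]

# Hypothetical dataset file; in practice this is your real training data
data_file = Path("train.csv")
data_file.write_text("id,label\n1,0\n2,1\n")

# Record the data version alongside code commits so experiments are reproducible
manifest = {"path": str(data_file), "version": data_version(data_file)}
Path("data.manifest.json").write_text(json.dumps(manifest, indent=2))
print(manifest["version"])
```

If the data changes, the hash changes, so a model can always be traced back to the exact dataset it was trained on.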
The ML lifecycle differs from the traditional software lifecycle in several ways:

| Aspect | Software Development Lifecycle (SDLC) | Data Science Lifecycle (ML) |
|---|---|---|
| Primary Goal | Build reliable, maintainable software products | Build accurate, generalizable ML models |
| Output | Software application/system | Trained ML model with predictions |
| Testing | Unit testing, integration testing, QA testing | Data validation, model validation, cross-validation |
| Versioning | Code versioning (Git) | Code, data, and model versioning required |
| Requirements & Outputs | Fixed requirements, deterministic outputs | Evolving requirements, probabilistic outputs |
| Monitoring | Application performance, errors, uptime | Model performance, data drift, prediction accuracy |
| Reproducibility | Easier to reproduce with same code | Harder due to randomness and data variability |
| CI/CD | Well-established practices | Emerging best practices in MLOps |
| Key Challenge | Feature completeness and bug-free code | Model accuracy and handling data/concept drift |
| Maintenance | Bug fixes, feature updates | Model retraining, data pipeline updates |
| Stakeholders | Developers, QA, DevOps | Data scientists, ML engineers, DevOps engineers |
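The "data drift" monitoring difference in the table can be sketched with a two-sample Kolmogorov-Smirnov test comparing a production feature sample against its training-time distribution. The distributions and alert threshold below are assumptions for illustration:

```python
# Minimal sketch of data-drift detection for one numeric feature.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=1000)  # training-time distribution
live_feature = rng.normal(loc=0.8, scale=1.0, size=1000)   # shifted production sample

# KS test: low p-value means the two samples likely come from different distributions
stat, p_value = ks_2samp(train_feature, live_feature)
drift_detected = p_value < 0.01  # alert threshold (assumption)
print(f"KS statistic={stat:.3f}, drift_detected={drift_detected}")
```

In production, checks like this run on a schedule (e.g. via Prometheus exporters or a monitoring job) and trigger retraining when drift is detected.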
MLOps addresses these problems with the following practices and tools:
- Code Standards - OOP concepts, modular coding, a logging module for better debugging, and management of artifacts, components, and pipelines
- Code Versioning - Git & GitHub (Bitbucket, GitLab)
- Data/Model Versioning - Maintaining data pipelines and experimentation using DVC, MLflow (Neptune, Seldon, Kubeflow, ZenML)
- CI/CD Tools - GitHub Actions, CircleCI, TravisCI
- Containerization - Docker and Docker Hub for reproducible, portable runtime environments
- Scalability & Monitoring - Kubernetes, Prometheus, Grafana
- Cloud Services - AWS Services (IAM User, ECR, S3, EC2, etc.) or all-in-one platforms (AWS SageMaker, Google Vertex AI, Azure ML)
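The code-standards item above (modular components, logging, exception handling) can be sketched as a small pipeline component. The class and method names are illustrative, not from any specific framework:

```python
# Minimal sketch of a modular pipeline component with logging and exception handling.
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("ingestion")

class DataIngestion:
    """One modular pipeline component: reads raw records and returns clean rows."""

    def run(self, records: list[dict]) -> list[dict]:
        logger.info("Starting ingestion of %d records", len(records))
        clean = []
        for record in records:
            try:
                clean.append({"id": int(record["id"]), "value": float(record["value"])})
            except (KeyError, ValueError) as exc:
                # Log and skip bad rows instead of crashing the whole pipeline
                logger.warning("Skipping bad record %r: %s", record, exc)
        logger.info("Ingestion finished: %d/%d records kept", len(clean), len(records))
        return clean

rows = DataIngestion().run([{"id": "1", "value": "3.5"}, {"id": "x", "value": "oops"}])
print(rows)
```

Each lifecycle stage (ingestion, transformation, training, evaluation) becomes its own component like this, which is what makes pipelines testable, loggable, and reusable across projects.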

