I'm Osama Mustafa, a Data Analyst at Turing, currently working on fine-tuning and evaluating multimodal LLMs. My expertise spans building end-to-end data pipelines, real-time streaming systems, and cloud-scale data lakehouses. I am passionate about scalable data architectures and am actively seeking opportunities in data engineering.
Pinned
-
Enterprise_Retail_Data_Lakehouse
Batch retail data lakehouse on Databricks: Delta Live Tables (bronze → silver → gold), Unity Catalog, synthetic data generator, and an executive analytics dashboard.
Python
-
Realtime-Transaction-CDC-Pipeline
Event-driven pipeline streaming financial transactions end-to-end: API Gateway → DynamoDB → CDC via Streams → Kinesis → S3 Data Lake → Athena SQL Analytics
Python
-
Booking-Event-Driven-Data-Pipeline
Event-driven AWS pipeline: S3 → CloudTrail → EventBridge → Step Functions → Glue ETL (PySpark) → Redshift, with data quality validation, circuit breaker, and SNS alerts
Python
-
Realtime-Profile-Ingestion-Pipeline
Real-time data pipeline that ingests user profiles from a REST API, streams them through Kafka, processes them with Spark Structured Streaming, and persists them to Cassandra, orchestrated by Airflow and fu…
Python
-
Sentiment-Driven-Stock-Analysis
This project predicts stock market performance using sentiment analysis of news headlines. The sentiment is visualized using a treemap.
