This project implements an automated stock trading system using two Deep Reinforcement Learning algorithms, DQN and PPO. Historical Apple stock data is cleaned, processed, and enriched with technical indicators including moving averages, RSI, MACD, and Bollinger Bands. A custom trading environment simulates trading on this data, so the RL agents learn buy, sell, and hold strategies through interaction and reward feedback. The project also covers data visualization, feature engineering, normalization, cumulative return analysis, and prediction of future price movements, showing how reinforcement learning can be applied to algorithmic trading.
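To make the interaction loop concrete, here is a minimal sketch of what a buy/sell/hold environment with reward feedback might look like. It is not the project's actual environment: the class name `TradingEnv`, the one-share position limit, the absence of fees, the `(price, shares)` state, and the reward (change in portfolio value per step) are all assumptions chosen for illustration.

```python
# Hypothetical action encoding; the project's own encoding may differ.
HOLD, BUY, SELL = 0, 1, 2


class TradingEnv:
    """Toy single-asset environment: holds at most one share, no fees (assumed)."""

    def __init__(self, prices, initial_cash=1000.0):
        self.prices = list(prices)
        self.initial_cash = initial_cash
        self.reset()

    def reset(self):
        """Start a new episode at the first price with no position."""
        self.t = 0
        self.cash = self.initial_cash
        self.shares = 0
        return self._state()

    def _state(self):
        # Assumed state: current price and current position.
        # A fuller state would also carry the technical indicators.
        return (self.prices[self.t], self.shares)

    def _value(self):
        """Portfolio value marked at the current price."""
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action):
        """Apply an action at the current price, then advance one time step."""
        price = self.prices[self.t]
        before = self._value()
        if action == BUY and self.shares == 0 and self.cash >= price:
            self.shares, self.cash = 1, self.cash - price
        elif action == SELL and self.shares == 1:
            self.shares, self.cash = 0, self.cash + price
        # Any other case (HOLD or an invalid trade) leaves the portfolio unchanged.
        self.t += 1
        reward = self._value() - before          # change in portfolio value
        done = self.t == len(self.prices) - 1    # last price reached
        return self._state(), reward, done
```

An agent would call `reset()` once per episode and then `step(action)` until `done` is true, using the returned reward as its learning signal.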
apple_stock_data.csv
1. DQN (Deep Q-Network): value-based RL, ε-greedy exploration, two-layer neural network, experience replay memory.
2. PPO (Proximal Policy Optimization): policy-gradient method, actor–critic architecture, clipped objective function, stable policy updates.
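Three of the building blocks named above are small enough to sketch without a deep learning library: the experience replay memory and ε-greedy action selection used by DQN, and the per-sample clipped surrogate term used by PPO. The names `ReplayMemory`, `epsilon_greedy`, and `ppo_clip_term`, the buffer layout, and the default clip range of 0.2 are assumptions for illustration, not the project's actual code; the neural networks themselves are omitted.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def ppo_clip_term(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate for one sample.

    ratio is pi_new(a|s) / pi_old(a|s). The result is
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A), which is to be maximized.
    """
    clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped * advantage)
```

The clipping is what keeps policy updates stable: once the probability ratio moves outside `[1 - clip_eps, 1 + clip_eps]` in the direction that would increase the objective, the term stops growing, so there is no incentive to push the policy further in a single update.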
MA5, MA20, MA50, RSI, MACD + Signal Line, Bollinger Bands, Momentum, Volume Moving Averages, Price Variance, Daily Returns.
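As a rough guide to how several of these indicators are computed, the sketch below implements them in plain Python on a list of closing prices. The function names, default windows (20-day Bollinger Bands with 2 standard deviations, 14-day RSI, 12/26/9 MACD), the simple-average RSI variant, and the use of population standard deviation are conventional choices assumed here; the project itself may use a data-frame library and different parameters.

```python
from statistics import mean, pstdev


def sma(values, window):
    """Simple moving average; None until a full window is available."""
    return [None if i + 1 < window else mean(values[i + 1 - window:i + 1])
            for i in range(len(values))]


def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1), seeded at values[0]."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def daily_returns(closes):
    """Fractional change from one close to the next (one element shorter than closes)."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def rsi(closes, period=14):
    """RSI from simple averages of gains and losses (not Wilder's smoothing)."""
    changes = [closes[i] - closes[i - 1] for i in range(1, len(closes))]
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        window = changes[i - period:i]           # the last `period` changes up to close i
        gain = sum(c for c in window if c > 0) / period
        loss = -sum(c for c in window if c < 0) / period
        out[i] = 100.0 if loss == 0 else 100.0 - 100.0 / (1.0 + gain / loss)
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """MACD line (fast EMA minus slow EMA) and its signal line (EMA of the MACD line)."""
    line = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    return line, ema(line, signal)


def bollinger(closes, window=20, k=2.0):
    """(lower, middle, upper) bands; population standard deviation is assumed."""
    bands = []
    for i in range(len(closes)):
        if i + 1 < window:
            bands.append(None)
            continue
        w = closes[i + 1 - window:i + 1]
        mid, sd = mean(w), pstdev(w)
        bands.append((mid - k * sd, mid, mid + k * sd))
    return bands
```

Under these definitions MA5, MA20, and MA50 would be `sma(closes, 5)`, `sma(closes, 20)`, and `sma(closes, 50)`, and the volume moving averages would apply the same `sma` to the volume column.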
Closing price over time, OHLC combination plot, Volume plot, RSI curve, Daily return distribution, Agent action plot (Red = Buy, Blue = Sell, Green = Hold), Cumulative returns across episodes.
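The cumulative return curve in the last plot can be defined in more than one way. A minimal sketch of one common definition, compounding per-step returns, is shown below; the function name `cumulative_returns` and the choice of compounding (rather than summing rewards per episode) are assumptions, not necessarily what the project plots.

```python
def cumulative_returns(returns):
    """Compound a sequence of per-step fractional returns into a running total return."""
    growth, out = 1.0, []
    for r in returns:
        growth *= 1.0 + r          # value of 1 unit invested at the start
        out.append(growth - 1.0)   # total return so far
    return out
```

For example, a 10% gain followed by a 10% loss compounds to a total return of about -1%, not 0%.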