A code judge that checks not only your output but also whether you understand your solution. Submit Python code and it runs in an isolated Docker container, where it is evaluated against multiple test cases. When a submission is accepted, Gemini generates follow-up questions specific to your code and evaluates your answers.
- User submits Python code via the React frontend
- Rails saves the submission and enqueues a Sidekiq job — returns 202 immediately
- Sidekiq worker runs the code inside an isolated Docker container with strict resource limits
- Output is compared against all test cases — partial results tracked (e.g. 3/4 passed)
- Frontend polls until status is terminal
- On accepted: Gemini generates 3 short follow-up questions specific to the user's code
- User has 3 minutes to answer — Gemini evaluates leniently and explains each result
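The comparison step in the flow above can be sketched as follows. This is a minimal illustration in Python, not the project's Rails/Sidekiq code: `judge`, `normalize`, and `JudgeResult` are hypothetical names, the sandbox is replaced by an injected `run` callable, and ignoring trailing whitespace when comparing output is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class JudgeResult:
    passed: int
    total: int

    @property
    def status(self) -> str:
        # Accepted only when every test case passed (and there is at least one).
        if self.total > 0 and self.passed == self.total:
            return "accepted"
        return "wrong_answer"

    @property
    def summary(self) -> str:
        # Partial result in the "3/4" form used above.
        return f"{self.passed}/{self.total}"


def normalize(text: str) -> str:
    # Assumption: trailing spaces and trailing blank lines do not matter.
    lines = [line.rstrip() for line in text.splitlines()]
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines)


def judge(run: Callable[[str], str], cases: List[Tuple[str, str]]) -> JudgeResult:
    # `run` takes a test case's stdin and returns the program's stdout.
    # Each case is an (stdin, expected_stdout) pair.
    passed = sum(
        1 for stdin, expected in cases
        if normalize(run(stdin)) == normalize(expected)
    )
    return JudgeResult(passed=passed, total=len(cases))
```

Passing the runner in as a callable keeps the comparison logic testable without Docker. In a real worker, `run` would wrap the sandboxed execution.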
- accepted: All test cases passed
- wrong_answer: Output mismatch on at least one test case
- runtime_error: Non-zero exit code
- compile_error: Python SyntaxError detected
- time_limit_exceeded: Execution exceeded 10 seconds
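One way the outcome of a single run could be mapped onto these statuses is sketched below. The function name `classify_run`, the order of the checks, and detecting a syntax error by looking for `SyntaxError` in stderr are all assumptions made for illustration. The actual worker may decide differently.

```python
from typing import Optional

TERMINAL_STATUSES = {
    "accepted",
    "wrong_answer",
    "runtime_error",
    "compile_error",
    "time_limit_exceeded",
}


def classify_run(
    exit_code: Optional[int],
    stderr: str,
    timed_out: bool,
    output_matches: bool,
) -> str:
    # A timeout wins over everything else: the process never finished,
    # so exit_code may be None.
    if timed_out:
        return "time_limit_exceeded"
    if exit_code != 0:
        # Assumption: a SyntaxError in the traceback marks a compile error.
        # Code that raises SyntaxError at runtime (e.g. via eval) would
        # also land here under this heuristic.
        if "SyntaxError" in stderr:
            return "compile_error"
        return "runtime_error"
    return "accepted" if output_matches else "wrong_answer"
```

A polling client can stop as soon as the reported status is in `TERMINAL_STATUSES`.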
- API: Ruby on Rails 7.2 (API mode)
- Background jobs: Sidekiq + Redis
- Database: PostgreSQL
- Execution sandbox: Docker (python:3.11-alpine)
- AI follow-up: Google Gemini 2.5 Flash
- Frontend: React + Vite
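To make the sandbox entry above more concrete, here is a rough sketch of how a `docker run` invocation with resource limits might be assembled. The flags are standard Docker CLI options, but the specific limits (128 MB memory, half a CPU, 64 processes), the `/code` mount point, and the `main.py` filename are hypothetical values chosen for illustration. Only the image name and the 10-second limit come from this README.

```python
import subprocess
from typing import List


def docker_run_argv(
    code_dir: str,
    image: str = "python:3.11-alpine",
    memory: str = "128m",      # assumed limit
    cpus: str = "0.5",         # assumed limit
    pids_limit: int = 64,      # assumed limit
) -> List[str]:
    return [
        "docker", "run", "--rm", "-i",
        "--network", "none",             # no network access from user code
        "--memory", memory,              # cap RAM
        "--cpus", cpus,                  # cap CPU share
        "--pids-limit", str(pids_limit), # guard against fork bombs
        "--read-only",                   # read-only root filesystem
        "-v", f"{code_dir}:/code:ro",    # mount the submission read-only
        image,
        "python", "/code/main.py",       # hypothetical entry file
    ]


def run_in_sandbox(code_dir: str, stdin: str, timeout_s: float = 10.0):
    # Raises subprocess.TimeoutExpired when the time limit is hit.
    # Note: the timeout kills the docker client process; a production
    # worker would likely also need to stop the container itself.
    return subprocess.run(
        docker_run_argv(code_dir),
        input=stdin,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```

Building the argument list as a separate function makes the limits easy to inspect and test without a Docker daemon.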
Prerequisites: Ruby 3.2.2, Rails 7.2, PostgreSQL, Redis, Docker
git clone https://github.com/Alokxk/CodeBench.git
cd CodeBench
bundle install
rails db:create db:migrate db:seed
docker pull python:3.11-alpine
Create .env in the project root:
GEMINI_API_KEY=your_key_here
SIDEKIQ_WEB_PASSWORD=any_password_you_want
# Generate the secret with: ruby -e "require 'securerandom'; puts SecureRandom.hex(32)"
SIDEKIQ_WEB_SECRET=paste_generated_value_here
Get a free Gemini API key at aistudio.google.com.
# Terminal 1
rails server
# Terminal 2
bundle exec sidekiq -C config/sidekiq.yml
# Terminal 3
cd frontend && npm install && npm run dev
Open http://localhost:5173
See ARCHITECTURE.md for engineering decisions.