
DeepCamera — Open-Source AI Camera Skills Platform

DeepCamera's open-source skills give your cameras AI — VLM scene analysis, object detection, person re-identification, all running locally with models like Qwen, DeepSeek, SmolVLM, and LLaVA. Built on proven facial recognition, RE-ID, fall detection, and CCTV/NVR surveillance monitoring, the skill catalog extends these machine learning capabilities with modern AI. All inference runs locally for maximum privacy.



🛡️ Introducing SharpAI Aegis — Desktop App for DeepCamera

Use DeepCamera's AI skills through a desktop app with LLM-powered setup, agent chat, and smart alerts — connected to your mobile via Discord / Telegram / Slack.

SharpAI Aegis is the desktop companion for DeepCamera. It uses an LLM to automatically set up your environment, configure camera skills, and manage the full AI pipeline — no manual Docker or CLI required. It also adds an intelligent agent layer: persistent memory, agentic chat with your cameras, AI video generation, voice (TTS), and conversational messaging via Discord / Telegram / Slack.

📦 Download SharpAI Aegis →

Run Local VLMs from HuggingFace — Even on Mac Mini 8GB

SharpAI Aegis — Browse and run local VLM models for AI camera video analysis

Download and run SmolVLM2, Qwen-VL, LLaVA, MiniCPM-V locally. Your AI security camera agent sees through these eyes.

Chat with Your AI Camera Agent

SharpAI Aegis — LLM-powered agentic security camera chat

"Who was at the door?" — Your agent searches footage, reasons about what happened, and answers with timestamps and clips.


🗺️ Roadmap

  • Skill architecture — pluggable SKILL.md interface for all capabilities
  • Skill Store UI — browse, install, and configure skills from Aegis
  • AI/LLM-assisted skill installation — community-contributed skills installed and configured via AI agent
  • GPU / NPU / CPU (AIPC) aware installation — auto-detect hardware, install matching frameworks, convert models to optimal format
  • Hardware environment layer — shared env_config.py for auto-detection + model optimization across NVIDIA, AMD, Apple Silicon, Intel, and CPU
  • Skill development — 18 skills across 9 categories, actively expanding with community contributions

🧩 Skill Catalog

Each skill is a self-contained module with its own model, parameters, and communication protocol. See the Skill Development Guide and Platform Parameters to build your own.
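
The `SKILL.md` interface is only named above; as a purely hypothetical illustration (the field names and layout here are assumptions, not the actual spec), a skill manifest might look like:

```markdown
<!-- Hypothetical SKILL.md sketch; fields are illustrative, not the real interface -->
# depth-estimation

- **Category:** Transformation
- **Model:** Depth Anything v2
- **Parameters:** input RTSP stream, output depth-map overlay
- **Protocol:** JSONL over stdout
```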

| Category | Skill | What It Does | Status |
|---|---|---|---|
| Detection | `yolo-detection-2026` | Real-time 80+ class detection — auto-accelerated via TensorRT / CoreML / OpenVINO / ONNX | |
| Analysis | `home-security-benchmark` | 143-test evaluation suite for LLM & VLM security performance | |
| Analysis | `smarthome-bench` | Video anomaly detection benchmark — 105 clips across 7 smart home categories | |
| Analysis | `homesafe-bench` | Indoor safety hazard detection — 40 tests across 5 categories | |
| Analysis | `sam2-segmentation` | Click-to-segment with pixel-perfect masks | 📐 |
| Transformation | `depth-estimation` | Monocular depth maps with Depth Anything v2 | 📐 |
| Annotation | `dataset-annotation` | AI-assisted labeling → COCO export | 📐 |
| Camera Providers | `eufy` · `reolink` · `tapo` | Direct camera integrations via RTSP | 📐 |
| Streaming | `go2rtc-cameras` | RTSP → WebRTC live view | 📐 |
| Channels | `matrix` · `line` · `signal` | Messaging channels for Clawdbot agent | 📐 |
| Automation | `mqtt` · `webhook` · `ha-trigger` | Event-driven automation triggers | 📐 |
| Integrations | `homeassistant-bridge` | HA cameras in ↔ detection results out | 📐 |

✅ Ready · 🧪 Testing · 📐 Planned

Registry: All skills are indexed in skills.json for programmatic discovery.
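
The registry schema isn't documented here, so as a sketch under an assumed layout (the `skills` / `category` field names are illustrative, not the actual `skills.json` format), programmatic discovery could look like:

```python
import json

# Hypothetical skills.json content; field names are assumptions.
SKILLS_JSON = """
{
  "skills": [
    {"name": "yolo-detection-2026", "category": "Detection", "status": "ready"},
    {"name": "sam2-segmentation", "category": "Analysis", "status": "planned"},
    {"name": "depth-estimation", "category": "Transformation", "status": "planned"}
  ]
}
"""

def skills_by_category(registry_text: str, category: str) -> list[str]:
    """Return the names of all registered skills in the given category."""
    registry = json.loads(registry_text)
    return [s["name"] for s in registry["skills"] if s["category"] == category]
```

A caller could then filter the catalog, e.g. `skills_by_category(SKILLS_JSON, "Detection")` yields `["yolo-detection-2026"]` for this sample registry.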

🚀 Getting Started with SharpAI Aegis

The easiest way to run DeepCamera's AI skills. Aegis connects everything — cameras, models, skills, and you.

  • 📷 Connect cameras in seconds — add RTSP/ONVIF cameras, webcams, or iPhone cameras for a quick test
  • 🤖 Built-in local LLM & VLM — llama-server included, no separate setup needed
  • 📦 One-click skill deployment — install skills from the catalog with AI-assisted troubleshooting
  • 🔽 One-click HuggingFace downloads — browse and run Qwen, DeepSeek, SmolVLM, LLaVA, MiniCPM-V
  • 📊 Find the best VLM for your machine — benchmark models on your own hardware with HomeSec-Bench
  • 💬 Talk to your guard — via Telegram, Discord, or Slack. Ask what happened, tell it what to watch for, get AI-reasoned answers with footage.

🎯 YOLO 2026 — Real-Time Object Detection

State-of-the-art detection running locally on any hardware, fully integrated as a DeepCamera skill.

YOLO26 Models

YOLO26 (Jan 2026) eliminates NMS and DFL for cleaner exports and lower latency. Pick the size that fits your hardware:

| Model | Params | Latency (optimized) | Use Case |
|---|---|---|---|
| `yolo26n` (nano) | 2.6M | ~2ms | Edge devices, real-time on CPU |
| `yolo26s` (small) | 11.2M | ~5ms | Balanced speed & accuracy |
| `yolo26m` (medium) | 25.4M | ~12ms | Accuracy-focused |
| `yolo26l` (large) | 52.3M | ~25ms | Maximum detection quality |
All models detect 80+ COCO classes: people, vehicles, animals, everyday objects.
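
One way to act on the size table above is to pick the largest variant that fits a per-frame latency budget. A minimal sketch, using the approximate latency figures from the table (these are the table's optimized numbers, not measurements):

```python
# (name, params in millions, approx. optimized latency in ms) -- from the table above
YOLO26_MODELS = [
    ("yolo26n", 2.6, 2.0),
    ("yolo26s", 11.2, 5.0),
    ("yolo26m", 25.4, 12.0),
    ("yolo26l", 52.3, 25.0),
]

def pick_model(budget_ms: float) -> str:
    """Largest YOLO26 variant whose optimized latency fits the budget."""
    fitting = [name for name, _, ms in YOLO26_MODELS if ms <= budget_ms]
    if not fitting:
        raise ValueError(f"no model fits a {budget_ms} ms budget")
    return fitting[-1]  # list is ordered smallest to largest
```

For example, a 10 ms budget selects `yolo26s`; anything under 2 ms has no fit and raises.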

Hardware Acceleration

The shared env_config.py auto-detects your GPU and converts the model to the fastest native format — zero manual setup:

| Your Hardware | Optimized Format | Runtime | Speedup vs PyTorch |
|---|---|---|---|
| NVIDIA GPU (RTX, Jetson) | TensorRT `.engine` | CUDA | 3–5x |
| Apple Silicon (M1–M4) | CoreML `.mlpackage` | ANE + GPU | ~2x |
| Intel (CPU, iGPU, NPU) | OpenVINO IR `.xml` | OpenVINO | 2–3x |
| AMD GPU (RX, MI) | ONNX Runtime | ROCm | 1.5–2x |
| Any CPU | ONNX Runtime | CPU | ~1.5x |
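
The selection logic the table describes can be sketched as a simple mapping. This is a minimal illustration, not the real `env_config.py` (which probes the hardware at runtime); the key names here are assumptions:

```python
# Hardware class -> (runtime, exported model extension), per the table above.
EXPORT_FORMATS = {
    "nvidia": ("TensorRT", ".engine"),
    "apple": ("CoreML", ".mlpackage"),
    "intel": ("OpenVINO IR", ".xml"),
    "amd": ("ONNX Runtime (ROCm)", ".onnx"),
    "cpu": ("ONNX Runtime (CPU)", ".onnx"),
}

def pick_export_format(hardware: str) -> tuple[str, str]:
    """Map a detected hardware class to its fastest native model format,
    falling back to the CPU ONNX path for anything unrecognized."""
    return EXPORT_FORMATS.get(hardware, EXPORT_FORMATS["cpu"])
```

The fallback mirrors the "Any CPU" row: unknown hardware still gets a working, if slower, ONNX path.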

Aegis Skill Integration

Detection runs as a parallel pipeline alongside VLM analysis — never blocks your AI agent:

```
Camera → Frame Governor → detect.py (JSONL) → Aegis IPC → Live Overlay
                5 FPS           ↓
                          perf_stats (p50/p95/p99 latency)
```

  • 🖱️ Click to setup — one button in Aegis installs everything, no terminal needed
  • 🤖 AI-driven environment config — autonomous agent detects your GPU, installs the right framework (CUDA/ROCm/CoreML/OpenVINO), converts models, and verifies the setup
  • 📺 Live bounding boxes — detection results rendered as overlays on RTSP camera streams
  • 📊 Built-in performance profiling — aggregate latency stats (p50/p95/p99) emitted every 50 frames
  • Auto start — set auto_start: true to begin detecting when Aegis launches
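
The JSONL-over-IPC hand-off in the pipeline above can be sketched as follows. The event field names are hypothetical (the skill's actual wire format is not shown here), and the percentile helper is a simple nearest-rank sketch of the kind of p50/p95/p99 aggregation described:

```python
import json

# Hypothetical detect.py output: one JSON object per processed frame (JSONL).
JSONL_STREAM = "\n".join([
    json.dumps({"frame": 1, "boxes": [{"cls": "person", "conf": 0.91}], "ms": 2.1}),
    json.dumps({"frame": 2, "boxes": [], "ms": 1.9}),
    json.dumps({"frame": 3, "boxes": [{"cls": "dog", "conf": 0.78}], "ms": 2.4}),
])

def perf_stats(jsonl_text: str) -> dict:
    """Aggregate per-frame latency into nearest-rank p50/p95/p99."""
    latencies = sorted(json.loads(line)["ms"] for line in jsonl_text.splitlines())

    def pct(p: float) -> float:
        idx = min(len(latencies) - 1, int(p / 100 * len(latencies)))
        return latencies[idx]

    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}
```

JSONL keeps the consumer simple: each line is a complete event, so the overlay renderer and the stats aggregator can read the same stream independently.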

📖 Full Skill Documentation →

📊 HomeSec-Bench — How Secure Is Your Local AI?

HomeSec-Bench is a 143-test security benchmark that measures how well your local AI performs as a security guard. It tests what matters: Can it detect a person in fog? Classify a break-in vs. a delivery? Resist prompt injection? Route alerts correctly at 3 AM?

Run it on your own hardware to know exactly where your setup stands.

| Area | Tests | What's at Stake |
|---|---|---|
| Scene Understanding | 35 | Person detection in fog, rain, night IR, sun glare |
| Security Classification | 12 | Telling a break-in from a raccoon |
| Tool Use & Reasoning | 16 | Correct tool calls with accurate parameters |
| Prompt Injection Resistance | 4 | Adversarial attacks that try to disable your guard |
| Privacy Compliance | 3 | PII leak prevention, illegal surveillance refusal |
| Alert Routing | 5 | Right message, right channel, right time |

Results: Local vs. Cloud vs. Hybrid

HomeSec-Bench benchmark results — local Qwen 4B vs cloud GPT-5.2 vs hybrid

Running on an M1 Mac Mini with 8GB RAM: local Qwen3.5-4B scores 39/54 (72%), cloud GPT-5.2 scores 46/48 (96%), and the hybrid config reaches 53/54 (98%). All 35 VLM test images are AI-generated (no real footage), keeping the benchmark fully privacy-compliant.

📄 Read the Paper · 🔬 Run It Yourself · 📋 Test Scenarios


📦 More Applications

Legacy Applications (SharpAI-Hub CLI)

These applications use the sharpai-cli Docker-based workflow. For the modern experience, use SharpAI Aegis.

| Application | CLI Command | Platforms |
|---|---|---|
| Person Recognition (ReID) | `sharpai-cli yolov7_reid start` | Jetson/Windows/Linux/macOS |
| Person Detector | `sharpai-cli yolov7_person_detector start` | Jetson/Windows/Linux/macOS |
| Facial Recognition | `sharpai-cli deepcamera start` | Jetson/Windows/Linux/macOS |
| Local Facial Recognition | `sharpai-cli local_deepcamera start` | Windows/Linux/macOS |
| Screen Monitor | `sharpai-cli screen_monitor start` | Windows/Linux/macOS |
| Parking Monitor | `sharpai-cli yoloparking start` | Jetson AGX |
| Fall Detection | `sharpai-cli falldetection start` | Jetson AGX |

📖 Detailed setup guides →

Tested Devices

  • Edge: Jetson Nano, Xavier AGX, Raspberry Pi 4/8GB
  • Desktop: macOS, Windows 11, Ubuntu 20.04
  • MCU: ESP32 CAM, ESP32-S3-Eye

Tested Cameras

  • RTSP: DaHua, Lorex, Amcrest
  • Cloud: Blink, Nest (via Home Assistant)
  • Mobile: IP Camera Lite (iOS)

🏗️ Architecture

architecture

Complete Feature List →

🤝 Support & Community