Widget2Code Logo

# 🎨 Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs (CVPR 2026 Highlight)

Widget2Code is a baseline framework that strengthens both perceptual understanding and system-level generation for transforming visual widgets into UI code. It leverages advanced vision-language models to automatically generate production-ready WidgetDSL from screenshots, featuring icon retrieval across 57,000+ icons, layout analysis, and component recognition and generation. This repository provides the implementation and tools needed to generate high-fidelity widget code.


## 🔥🔥🔥 News

  • 🌟 Apr 9, 2026: Selected as a CVPR 2026 Highlight
  • 📦 Mar 31, 2026: Evaluation toolkit published on PyPI
  • 🎉 Feb 21, 2026: Accepted to CVPR 2026
  • 📦 Dec 22, 2025: Benchmark dataset uploaded to Hugging Face
  • 📄 Dec 22, 2025: Paper uploaded to arXiv
  • 🚀 Dec 16, 2025: Released the complete Widget2Code framework, including inference code, an interactive playground, batch-processing scripts, and evaluation tools

## 🎥 Demo

*Demo video: playground.mp4*

## 📖 Overview

Widget2Code is a baseline framework that strengthens both perceptual understanding and system-level generation for transforming visual widgets into UI code.

## 🏗️ Architecture

Widget2Code employs a multi-stage generation pipeline:

### Generation Pipeline

  1. Image Preprocessing: Resolution normalization, format conversion, and quality analysis
  2. Layout Detection: Multi-stage layout analysis with intelligent retry mechanism for robust component positioning
  3. Icon Retrieval: FAISS-based similarity search across 57,000+ icons with dual-encoder (text + image) matching
  4. Chart Recognition: Specialized detection and classification for 8 chart types using vision models
  5. Color Extraction: Advanced palette and gradient analysis with perceptual color matching
  6. DSL Generation: LLM-based structured output generation with domain-specific prompts
  7. Validation: Schema validation, constraint checking, and error correction
  8. Compilation: DSL to React JSX/HTML transformation with optimization
  9. Rendering: The compiled code is rendered to PNG in a headless browser
Widget2Code Architecture
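The stages above can be sketched as a simple sequential pipeline. The sketch below is illustrative only: every function name and the state-dict layout are hypothetical, and the heavy stages (LLM calls, FAISS retrieval, compilation) are stubbed out rather than implemented:

```python
# A minimal, illustrative sketch of the multi-stage pipeline described above.
# All function names here are hypothetical; the repository's real modules
# and signatures may differ.

def preprocess(state):          # 1. normalize resolution / format (stubbed)
    state["image"] = "normalized-image"
    return state

def detect_layout(state):       # 2. locate components, with retry logic in the real system
    state["layout"] = ["header", "chart", "footer"]
    return state

def retrieve_icons(state):      # 3. nearest-neighbour search over the icon index (stubbed)
    state["icons"] = {"settings": "icon_0042"}
    return state

def generate_dsl(state):        # 6. structured WidgetDSL from an LLM call (stubbed)
    state["dsl"] = {"root": state["layout"]}
    return state

def compile_dsl(state):         # 8. DSL -> JSX/HTML transformation (stubbed)
    state["jsx"] = "<Widget>" + ",".join(state["layout"]) + "</Widget>"
    return state

STAGES = [preprocess, detect_layout, retrieve_icons, generate_dsl, compile_dsl]

def run_pipeline(image_path):
    state = {"input": image_path}
    for stage in STAGES:        # each stage enriches the shared state dict
        state = stage(state)
    return state

result = run_pipeline("mockup.png")
print(result["jsx"])            # <Widget>header,chart,footer</Widget>
```

The real pipeline also performs chart recognition, color extraction, validation, and rendering; the point here is only the stage-by-stage data flow.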

## 🛠️ Dependencies and Installation

### Quick Install

**One-Command Setup:**

```bash
./scripts/setup/install.sh
```

This installs all dependencies, including Node.js packages and an isolated Python environment.

## ⚙️ Configuration

Create a `.env` file with your API credentials and the ground-truth directory:

```bash
cp .env.example .env
# Edit .env and configure:
# - API credentials
# - GT_DIR: Path to ground truth directory for evaluation (e.g., ./data/widget2code-benchmark/test)
```
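A hypothetical `.env` might look like the following. `GT_DIR` is the only variable this README names; the credential keys below are placeholders, so check `.env.example` for the actual names:

```ini
# API credentials (placeholder names; see .env.example for the real keys)
API_KEY=your-api-key
API_BASE_URL=https://api.example.com/v1

# Ground truth directory for evaluation
GT_DIR=./data/widget2code-benchmark/test
```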

## 🚀 Quick Start

### Step 1: Start API Service

```bash
# Start API backend (required for batch processing)
npm run api
```

### Step 2: Generate Widgets (Batch)

```bash
# Batch generation with 5 concurrent workers
./scripts/generation/generate-batch.sh ./mockups ./output 5

# Force regenerate all images
./scripts/generation/generate-batch.sh ./mockups ./output 5 --force
```

### Step 3: Render Widgets (Batch)

```bash
# Batch rendering with 5 concurrent workers
./scripts/rendering/render-batch.sh ./output 5

# Force rerender all widgets
./scripts/rendering/render-batch.sh ./output 5 --force
```

### Step 4: Evaluate Results

```bash
# Evaluate generated widgets against ground truth
# If GT_DIR is set in .env, the -g flag is optional
./scripts/evaluation/run_evaluation.sh ./output

# Or specify the ground truth directory explicitly
./scripts/evaluation/run_evaluation.sh ./output -g ./data/widget2code-benchmark/test

# Use GPU and more workers for faster evaluation
./scripts/evaluation/run_evaluation.sh ./output -g ./data/widget2code-benchmark/test --cuda -w 16
```

### Interactive Playground (Optional)

```bash
# Start interactive playground
npm run playground
```

## 📊 Benchmarks & Evaluation

### Performance Comparison

Widget2Code achieves state-of-the-art performance across multiple quality dimensions including layout accuracy, legibility, style preservation, perceptual similarity, and geometric precision.
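As a rough illustration of what a low-level similarity score looks like, the toy metric below compares two same-sized grayscale images pixel by pixel. This is not the evaluation toolkit's actual metric (which covers layout, legibility, style, perceptual, and geometric dimensions), only a minimal sketch:

```python
# Toy pixel-level similarity between two same-sized grayscale "images"
# (nested lists of 0-255 ints). Illustrative only; the real evaluation
# toolkit uses richer perceptual and geometric metrics.

def pixel_similarity(img_a, img_b):
    """Return 1.0 for identical images, approaching 0.0 as they diverge."""
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("images must have the same dimensions")
    # mean absolute error, normalized to the 0-255 intensity range
    mae = sum(abs(a - b) for a, b in zip(flat_a, flat_b)) / len(flat_a)
    return 1.0 - mae / 255.0

rendered = [[255, 0], [128, 64]]
truth    = [[255, 0], [128, 64]]
print(pixel_similarity(rendered, truth))  # 1.0
```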

Benchmark Results

### Evaluated Methods

Widget2Code has been evaluated against 13 methods (12 baselines plus Widget2Code itself):

  1. Seed1.6-Thinking
  2. Gemini2.5-Pro
  3. GPT-4o
  4. Qwen3-VL
  5. Qwen3-VL-235b
  6. Design2Code
  7. DCGen
  8. LatCoder
  9. UICopilot
  10. WebSight-VLM-8B
  11. ScreenCoder
  12. UI-UG
  13. Widget2Code

### Download Benchmarks

Download the Widget2Code Benchmark Dataset to the `./data/` folder.

After downloading, set `GT_DIR=./data/widget2code-benchmark/test` in your `.env` file, or pass the `-g` flag when running evaluation scripts. The test split (`./data/widget2code-benchmark/test`) is used as ground truth for evaluation.
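Before running the evaluation, it can be useful to confirm that every generated output has a ground-truth counterpart. The stdlib-only check below assumes outputs and ground-truth files match by filename stem, which may differ from the benchmark's actual layout:

```python
# Check that each rendered output PNG has a same-named ground-truth file.
# Assumes outputs and ground truth pair up by filename stem; adjust the
# extension and matching rule to the benchmark's real directory layout.
from pathlib import Path

def missing_ground_truth(output_dir, gt_dir, ext=".png"):
    """Return output stems that have no matching file in the ground-truth dir."""
    out_stems = {p.stem for p in Path(output_dir).glob(f"*{ext}")}
    gt_stems = {p.stem for p in Path(gt_dir).glob(f"*{ext}")}
    return sorted(out_stems - gt_stems)

# Example usage:
# missing = missing_ground_truth("./output", "./data/widget2code-benchmark/test")
# if missing:
#     print("No ground truth for:", missing)
```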

Benchmark Results: All Methods Results (465MB). Download evaluation results for all 13 evaluated methods from Google Drive.

To use the benchmark results:

```bash
# Install gdown (if not already installed)
pip install gdown

# Download using gdown (465MB)
gdown --fuzzy "https://drive.google.com/file/d/1LAYReu4fUES1IE0qM7h-zNGvyUgYnqwz/view?usp=sharing"

# If the download fails, manually download from the link above

# Extract to the project root directory
unzip benchmarks_backup_20251216.zip

# Run evaluation on all benchmarks (using the test split as ground truth)
./scripts/evaluation/run_all_benchmarks.sh -g ./data/widget2code-benchmark/test --cuda -w 16
```

## 📚 Citation

If you find Widget2Code useful for your research or projects, please cite our work:

```bibtex
@article{widget2code2025,
  title={Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs},
  author={Houston H. Zhang and Tao Zhang and Baoze Lin and Yuanqi Xue and Yincheng Zhu and Huan Liu and Li Gu and Linfeng Ye and Ziqiang Wang and Xinxin Zuo and Yang Wang and Yuanhao Yu and Zhixiang Chi},
  journal={arXiv preprint},
  year={2025}
}
```