
Safactory

中文   |   English

A next-generation agent infrastructure that integrates evaluation and training, supporting agent evaluation, trajectory collection, and reinforcement learning training across multiple types of environments including OS, Android, Minecraft, embodied AI, QA, data processing, and scientific discovery. It is the first to validate a trustworthy scaling law for agents, achieving improved safety capabilities without an alignment tax.

Quick Start | Demo | Environments | RL Training | Custom Environments | Configuration | Data | Report



✨ Why Safactory


Safactory is an agent sandbox for teams that need one pipeline for evaluation, data generation, and RL training. It provides a common environment interface, concurrent rollout management, OpenAI-compatible model access, trajectory persistence, and a Buffer Server bridge for Slime / GRPO training.

| Need | Safactory provides |
| --- | --- |
| Evaluate agents | Run LLM or VLM agents against realistic interactive environments and collect rewards. |
| Build trajectory data | Persist messages, actions, observations, rewards, and environment state to SQLite. |
| Train with RL | Stream rollout trajectories into Slime through the built-in Buffer Server. |
| Add new environments | Access new environments through standard interfaces. |

Core features:

  • Multi-domain environments: OS, Android, Minecraft, RoboTrustBench, Embodied ALFRED, QA, DABStep, DiscoveryWorld, DeepEyes, Geo3K-VL, and Math500.
  • High-concurrency rollouts through environment pools and async workers.
  • OpenAI-compatible model integration for vLLM, SGLang, hosted APIs, and local proxies.
  • Local single-machine mode and remote RayJob-backed cluster mode.
  • Optional experience extraction and prompt-time experience injection.
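Any backend that speaks the OpenAI chat-completions protocol can serve the agent. As a minimal sketch of what such a request body looks like (the model name, endpoint, and message contents below are placeholders, not Safactory defaults):

```python
import json

# Minimal chat-completions request body understood by OpenAI-compatible
# servers such as vLLM or SGLang. All values here are placeholders.
payload = {
    "model": "YOUR_MODEL",
    "messages": [
        {"role": "system", "content": "You are an agent controlling a desktop OS."},
        {"role": "user", "content": "Observation: a terminal window is open."},
    ],
    "temperature": 0.7,
}

# The runner would POST this JSON to http://YOUR_LLM_HOST/v1/chat/completions.
body = json.dumps(payload)
```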

🎬 Demo

demo.1.mp4

Click to play the full demo.

🚀 Quick Start

Install

git clone https://github.com/AI45Lab/Safactory.git
cd Safactory
pip install -r requirements.txt

Some environments have extra runtime dependencies. See Supported Environments before running Docker, emulator, VM, or simulator-backed tasks.

Evaluate a model

# Flag reference:
#   --env-config    evaluation environment (OS / Android / Minecraft, etc.)
#   --llm-base-url  model service address
#   --llm-api-key   API key
#   --llm-model     model name
#   --pool-size     number of concurrent agent instances
python launcher.py \
  --env-config env/osgym/os_config.yaml \
  --llm-base-url http://YOUR_LLM_HOST/v1 \
  --llm-api-key YOUR_API_KEY \
  --llm-model YOUR_MODEL \
  --pool-size 500

This starts the runner, loads the selected environment configuration, schedules tasks, calls the model endpoint, and writes step-level records to SQLite.
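To give a feel for what step-level recording means, here is a simplified sketch using an in-memory SQLite database. The table layout is an assumption for illustration only; the actual schema is documented in the Data Manager guide.

```python
import json
import sqlite3

# Simplified stand-in for step-level trajectory storage. The real
# Safactory schema is documented in the Data Manager guide.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE steps (
        rollout_id TEXT,
        step INTEGER,
        action TEXT,
        observation TEXT,
        reward REAL
    )
""")

def record_step(rollout_id, step, action, observation, reward):
    """Persist one agent step as a row (action serialized as JSON)."""
    conn.execute(
        "INSERT INTO steps VALUES (?, ?, ?, ?, ?)",
        (rollout_id, step, json.dumps(action), observation, reward),
    )
    conn.commit()

record_step("run-1", 0, {"type": "click", "x": 10, "y": 20}, "desktop visible", 0.0)
record_step("run-1", 1, {"type": "type", "text": "ls"}, "terminal output", 1.0)

total = conn.execute(
    "SELECT SUM(reward) FROM steps WHERE rollout_id = ?", ("run-1",)
).fetchone()[0]
```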

Collect trajectory data

Every rollout is recorded automatically. The default CLI database path is sqlite://env_trajs.db; override it with --db-path:

python launcher.py \
  --env-config env/osgym/os_config.yaml \
  --db-path sqlite://runs/os_eval.db \
  --llm-base-url http://YOUR_LLM_HOST/v1 \
  --llm-api-key YOUR_API_KEY \
  --llm-model YOUR_MODEL

See Data Manager for schema details and query examples.

Train with RL

Safactory integrates with Slime through a Buffer Server:

# Terminal 1: Slime training process
cd rl
./run_slime_generator_vl.sh

# Terminal 2: Safactory Buffer Server and rollout runner
cd rl
./run_buffer_server.sh

Full instructions are in RL Training.
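The Buffer Server decouples rollout generation from the training loop: the runner pushes finished trajectories and the trainer pulls batches. The following is an illustrative in-process producer-consumer sketch of that pattern, not the actual Buffer Server API (which is an HTTP service bridging into Slime):

```python
import queue
import threading

# Illustrative producer-consumer buffer; the real Buffer Server is an
# HTTP service that streams Safactory rollouts into Slime.
buffer = queue.Queue(maxsize=64)

def rollout_worker(n):
    """Stand-in rollout runner: pushes n finished trajectories."""
    for i in range(n):
        trajectory = {"id": i, "steps": [], "reward": float(i % 2)}  # steps omitted
        buffer.put(trajectory)

def pull_batch(size):
    """Stand-in trainer side: blocks until a full batch is available."""
    return [buffer.get() for _ in range(size)]

t = threading.Thread(target=rollout_worker, args=(8,))
t.start()
batch = pull_batch(8)
t.join()
```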

📦 Datasets

Safactory can generate reusable trajectory datasets. The public OS trajectory release is available on Hugging Face.

Safactory-generated data also supports safe agent training. In this experiment, SATraj-Agent-8B is obtained by fine-tuning Qwen3-vl-8B on SATraj-OS, then evaluated on OS-Harm for safety and OSWorld for task ability. The model reduces average unsafe behavior from 31.33% to 3.33% while improving the OSWorld Total score from 14.40% to 22.16%, showing that safety can improve without an alignment tax.

Safety columns are measured on OS-Harm; ability columns on OSWorld (higher is better).

| Model | Avg. Unsafe ↓ | Misuse Unsafe ↓ | Misuse Completed ↓ | Injection Unsafe ↓ | Injection Completed ↑ | Misbehavior Unsafe ↓ | Misbehavior Completed ↑ | Total | Chrome | GIMP | OS | VS Code |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen3.5-397B | 32.00% | 62.00% | 8.00% | 16.00% | 40.00% | 18.00% | 6.00% | 62.20% | - | - | - | - |
| Qwen3vl-8b | 31.33% | 69.33% | 22.67% | 10.00% | 14.00% | 14.67% | 4.00% | 14.40% | 28.26% | 15.38% | 25.00% | 21.74% |
| SAModel-OS-8B | 3.33% | 0.00% | 0.00% | 8.00% | 54.00% | 2.00% | 10.00% | 22.16% | 34.78% | 42.31% | 29.17% | 56.52% |

📚 Documentation

| Guide | What it covers |
| --- | --- |
| Configuration | CLI flags, manager YAML, and environment YAML format. |
| Supported Environments | Environment registry names, prerequisites, and setup links. |
| Data Manager | SQLite schema, storage behavior, and query examples. |
| RL Training | Slime integration, Buffer Server setup, and RL variables. |
| Custom Environment | Minimal BaseEnv implementation and registration flow. |
| Experience Extraction and Injection | Reusing historical trajectories as prompt-time experience. |
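As an illustration of the shape a minimal environment takes, here is a toy reset/step class. The method names and return types are assumptions for the sketch; the authoritative BaseEnv contract and registration flow are in the Custom Environment guide.

```python
class EchoEnv:
    """Toy environment with a reset/step interface. The real BaseEnv
    contract is specified in the Custom Environment guide."""

    def __init__(self, target="ls"):
        self.target = target
        self.done = False

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.done = False
        return "A shell prompt is visible."

    def step(self, action):
        """Apply one action; reward 1.0 when it matches the target command."""
        reward = 1.0 if action.strip() == self.target else 0.0
        self.done = True  # single-step episode for simplicity
        return f"you ran: {action}", reward, self.done

env = EchoEnv()
obs = env.reset()
obs, reward, done = env.step("ls")
```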

🏗️ Architecture

Safactory architecture

At a high level, launcher.py loads environment YAML files, starts or connects to environment services, sends observations to an OpenAI-compatible model endpoint, records every interaction through the data manager, and optionally forwards completed rollouts to RL training.
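That control flow can be sketched as a single rollout loop. The model and the environment below are stubs for illustration; real runs call an OpenAI-compatible endpoint and a registered environment service.

```python
def fake_model(observation):
    """Stub for the OpenAI-compatible model call."""
    return "noop" if "done" in observation else "ls"

def run_rollout(env_step, max_steps=5):
    """Drive one episode: observe, call the model, act, record each step."""
    records = []
    observation = "start"
    for step in range(max_steps):
        action = fake_model(observation)
        observation, reward, done = env_step(action)
        records.append({"step": step, "action": action, "reward": reward})
        if done:
            break
    return records

# Toy environment step: the episode ends after 3 actions.
state = {"n": 0}
def env_step(action):
    state["n"] += 1
    done = state["n"] >= 3
    return ("done" if done else "running"), (1.0 if done else 0.0), done

records = run_rollout(env_step)
```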

🤝 Contributing

Contributions are welcome for new environments, bug fixes, documentation improvements, and reproducible examples.

  1. Fork the repository.
  2. Add or update an environment under env/<name>/.
  3. Include a YAML config and a short README for environment-specific dependencies.
  4. Run a local smoke test with launcher.py.
  5. Open a pull request with the setup notes and expected behavior.

📝 Citation

If Safactory or Safactory-generated datasets are useful in your work, cite the repository and the specific dataset or report you used.

@misc{chen2026safactoryscalableagenticinfrastructure,
      title={Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence}, 
      author={Shanghai AI Lab},
      year={2026},
      eprint={2605.06230},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.06230}, 
}
