4 changes: 4 additions & 0 deletions .gitignore
@@ -34,3 +34,7 @@ tests/data_*.h5
tests/data_*/
tests/tmp.*
tests/.coverage

# local dev artifact
uv.lock
.venv/
144 changes: 144 additions & 0 deletions skills/dpdata-driver/SKILL.md
@@ -0,0 +1,144 @@
---
name: dpdata-driver
description: Use dpdata Python Driver plugins to label systems (energies/forces/virials) via System.predict(), list available drivers, and build Driver objects (ase/deepmd/gaussian/sqm/hybrid). Use when working with dpdata Python API (not CLI) and you need driver-based energy/force prediction, plugin registration keys, or examples of using dpdata with ASE calculators or DeePMD models.
---

# dpdata-driver

Use dpdata “driver plugins” to **label** a `dpdata.System` (predict energies/forces/virials) and obtain a `dpdata.LabeledSystem`.

## Key idea

- A **Driver** converts an unlabeled `System` into a `LabeledSystem` by computing:
  - `energies` (required)
  - `forces` (optional but common)
  - `virials` (optional)

In dpdata, this is exposed as:

- `System.predict(*args, driver="dp", **kwargs) -> LabeledSystem`

`driver` can be:

- a **string key** (plugin name), e.g. `"ase"`, `"dp"`, `"gaussian"`
- a **Driver object**, e.g. `Driver.get_driver("ase")(...)`

## List supported driver keys (runtime)

When unsure what drivers exist in *this* dpdata version/env, query them at runtime:

```python
from dpdata.driver import Driver

print(sorted(Driver.get_drivers().keys()))
```

In the current repo state, keys include:

- `ase`
- `dp` / `deepmd` / `deepmd-kit`
- `gaussian`
- `sqm`
- `hybrid`

(Exact set depends on dpdata version and installed extras.)
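
Because the registered keys vary between environments, a script can pick the first key that is actually available instead of hard-coding one. The sketch below is a hypothetical helper (`pick_driver_key` is an illustrative name, not part of dpdata); it only compares strings, so it works with any collection of keys, such as the ones returned by `Driver.get_drivers()`.

```python
from collections.abc import Iterable


def pick_driver_key(preferred: Iterable[str], available: Iterable[str]) -> str:
    """Return the first key in `preferred` that also appears in `available`.

    Hypothetical helper, not part of dpdata.
    """
    preferred = list(preferred)
    available_set = set(available)
    for key in preferred:
        if key in available_set:
            return key
    raise LookupError(
        f"none of {preferred} found; available: {sorted(available_set)}"
    )


# Hard-coded key set for illustration; in practice pass Driver.get_drivers().keys().
print(pick_driver_key(["dp", "ase"], {"ase", "gaussian", "hybrid"}))  # ase
```

The returned key can then be passed as `driver=...` to `System.predict()`, together with whatever keyword arguments that driver expects.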

## Minimal workflow

```python
import dpdata
from dpdata.system import System

sys = System("input.xyz", fmt="xyz")
ls = sys.predict(driver="ase", calculator=...) # returns dpdata.LabeledSystem
```

### Verify you got a labeled system

```python
assert "energies" in ls.data
# optional:
# assert "forces" in ls.data
# assert "virials" in ls.data
```
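
If the same check is needed in several places, it can be wrapped in a small function. This is a minimal sketch, assuming only that the labeled data behaves like a mapping with the keys named above; `summarize_labels` is a hypothetical helper, not a dpdata API.

```python
def summarize_labels(data: dict) -> dict:
    """Report which label entries a data mapping contains.

    Hypothetical helper; `data` is expected to look like `LabeledSystem.data`.
    Raises KeyError when the required `energies` entry is missing.
    """
    if "energies" not in data:
        raise KeyError("no 'energies' entry: the system does not look labeled")
    return {key: key in data for key in ("energies", "forces", "virials")}


# Plain dict standing in for ls.data; real values would be arrays.
print(summarize_labels({"energies": [0.0], "forces": [[[0.0, 0.0, 0.0]]]}))
# {'energies': True, 'forces': True, 'virials': False}
```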

## Example: use the ASE driver with an ASE calculator (runnable)

This is the easiest *fully runnable* example because it doesn’t require external QM software.

Dependencies (recommended): declare script dependencies with uv inline metadata, then run with `uv run`.

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "dpdata",
# "numpy",
# "ase",
# ]
# ///
```

Script:

```python
import numpy as np
from ase.calculators.emt import EMT
from dpdata.system import System

# write a tiny H2 molecule (XYZ layout: atom count, comment line, one line per atom)
with open("tmp.xyz", "w") as f:
    f.write("2\n\nH 0 0 0\nH 0 0 0.74\n")

sys = System("tmp.xyz", fmt="xyz")
ls = sys.predict(driver="ase", calculator=EMT())

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does ASE EMT calculator support hydrogen molecules or is it only for metals?

💡 Result:

ASE’s built-in EMT calculator is primarily meant for a small set of fcc metals (Al, Cu, Ag, Au, Ni, Pd, Pt), but it does include parameters for H (and also C, N, O). Those non-metal parameters are explicitly marked as “not for any serious use”, i.e., you can run an H₂ molecule with EMT, but the results are not considered reliable. [1], [2]


Replace ASE's EMT calculator with a more suitable option for H₂ demonstration.

ASE's EMT calculator includes parameters for hydrogen, but these are explicitly marked as "not for any serious use." Using EMT with an H₂ molecule produces unreliable results unsuitable for documentation examples. Consider using LennardJones or another general-purpose calculator that reliably handles all atomic species.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/dpdata-driver/SKILL.md` around lines 86 - 93, The example uses ASE's
EMT calculator (calculator=EMT()) for an H2 demo which is unreliable; replace
EMT with a general-purpose calculator such as ASE's LennardJones to produce
stable, documented results. Update the snippet that constructs the System and
calls sys.predict(driver="ase", calculator=EMT()) so it instead passes
calculator=LennardJones() (or another suitable cross-species calculator) and
ensure any necessary imports (e.g., from ase.calculators.lennardjones import
LennardJones) are added and referenced where EMT and calculator=EMT() currently
appear.


print("energies", np.array(ls.data["energies"]))
print("forces shape", np.array(ls.data["forces"]).shape)
if "virials" in ls.data:
    print("virials shape", np.array(ls.data["virials"]).shape)
else:
    print("virials: <not provided by this driver/calculator>")
```

## Example: pass a Driver object instead of a string

```python
from ase.calculators.emt import EMT
from dpdata.driver import Driver
from dpdata.system import System

sys = System("tmp.xyz", fmt="xyz")
ase_driver = Driver.get_driver("ase")(calculator=EMT())
ls = sys.predict(driver=ase_driver)
```

## Hybrid driver

Use `driver="hybrid"` to sum energies/forces/virials from multiple drivers.

The `HybridDriver` accepts `drivers=[ ... ]` where each item is either:

- a `Driver` instance
- a dict like `{"type": "sqm", ...}` (type is the driver key)

Example (structure only; may require external executables):

```python
from dpdata.driver import Driver

hyb = Driver.get_driver("hybrid")(
    drivers=[
        {"type": "sqm", "qm_theory": "DFTB3"},
        {"type": "dp", "dp": "frozen_model.pb"},
    ]
)
# ls = sys.predict(driver=hyb)
```
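
To make "sum" concrete, the arithmetic can be sketched with plain nested lists. This is a toy stand-in for illustration, not dpdata's `HybridDriver` code: `add_nested` and `sum_labels` are hypothetical names, and the rule that an entry is summed only when both results provide it is an assumption of this sketch.

```python
def add_nested(a, b):
    """Element-wise sum of two equally shaped nested lists of numbers."""
    if isinstance(a, (int, float)):
        return a + b
    if len(a) != len(b):
        raise ValueError("shape mismatch between driver results")
    return [add_nested(x, y) for x, y in zip(a, b)]


def sum_labels(results: list) -> dict:
    """Sum label entries across per-driver results (toy stand-in).

    Assumption of this sketch: an entry is summed only when both the running
    total and the next result contain it.
    """
    if not results:
        raise ValueError("need at least one driver result")
    total = dict(results[0])  # shallow copy; add_nested builds new lists
    for result in results[1:]:
        for key in ("energies", "forces", "virials"):
            if key in total and key in result:
                total[key] = add_nested(total[key], result[key])
    return total


# One frame, two atoms; made-up numbers chosen to be exact in floating point.
a = {"energies": [-1.0], "forces": [[[0.0, 0.0, 0.5], [0.0, 0.0, -0.5]]]}
b = {"energies": [-0.5], "forces": [[[0.0, 0.0, 0.25], [0.0, 0.0, -0.25]]]}
print(sum_labels([a, b]))
# {'energies': [-1.5], 'forces': [[[0.0, 0.0, 0.75], [0.0, 0.0, -0.75]]]}
```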

## Notes / gotchas

- Many drivers require extra dependencies or external programs:
  - `dp` requires `deepmd-kit` + a model file
  - `gaussian` requires Gaussian and a valid executable (default `g16`)
  - `sqm` requires AmberTools `sqm`
- If you just need file format conversion, use the existing **dpdata CLI** skill instead.
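
The external requirements listed above can be checked before calling `predict()`. The sketch below uses only the standard library; `check_driver_prerequisites` is a hypothetical helper. It assumes the `deepmd-kit` package is importable as `deepmd` and that the executables are on `PATH` under the names `g16` and `sqm`. It does not check for a model file or a working license.

```python
import importlib.util
import shutil


def check_driver_prerequisites() -> dict:
    """Map driver keys to whether their external prerequisite was found.

    Hypothetical helper. Only checks that a module or executable can be
    located, not that it actually runs.
    """
    return {
        # deepmd-kit installs the importable module `deepmd`
        "dp": importlib.util.find_spec("deepmd") is not None,
        # default Gaussian executable name, per the note above
        "gaussian": shutil.which("g16") is not None,
        # AmberTools semi-empirical program
        "sqm": shutil.which("sqm") is not None,
    }


for key, found in check_driver_prerequisites().items():
    print(f"{key}: {'found' if found else 'missing'}")
```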