One-way sync tool that pulls files from an Azure Blob Storage container to your local machine. Only downloads new or changed files — safe to run repeatedly.
- Python 3.10+
- An Azure Storage account with at least one blob container
First, create and activate a virtual environment:
# Windows
python -m venv .venv
.venv\Scripts\activate
# macOS/Linux
python -m venv .venv
source .venv/bin/activate

Then install the dependencies:

pip install -r requirements.txt

You can configure the tool three ways (in order of precedence): command-line flags, a config file, or environment variables.
python sync.py \
--connection-string "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net" \
--container my-container \
--local-dir ./output

To use a config file instead, copy the example config:

cp config.example.json config.json

Edit config.json with your values:
{
"connection_string": "DefaultEndpointsProtocol=https;AccountName=YOUR_ACCOUNT;AccountKey=YOUR_KEY;EndpointSuffix=core.windows.net",
"container_name": "my-container",
"local_dir": "./synced-files",
"prefix": "",
"delete_orphaned": false
}

Then run with:

python sync.py --config config.json

Alternatively, set the connection string as an environment variable:

export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;..."
python sync.py --container my-container

Note: CLI flags override config file values, which override env vars.
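
The precedence rule can be sketched as a layered merge. This is a minimal illustration, not the tool's actual code: `resolve_settings` and the dictionary keys are hypothetical names, and it assumes that a CLI flag the user did not pass shows up as `None`.

```python
import os


def resolve_settings(cli_args, file_config, environ=None):
    """Merge settings: CLI flags beat config-file values, which beat env vars.

    Hypothetical helper for illustration; sync.py may be organized differently.
    """
    environ = os.environ if environ is None else environ
    settings = {}

    # Lowest precedence: the environment variable for the connection string.
    env_conn = environ.get("AZURE_STORAGE_CONNECTION_STRING")
    if env_conn:
        settings["connection_string"] = env_conn

    # Middle precedence: values read from the JSON config file.
    settings.update({k: v for k, v in file_config.items() if v is not None})

    # Highest precedence: CLI flags that were actually given (None = not given).
    settings.update({k: v for k, v in cli_args.items() if v is not None})
    return settings


merged = resolve_settings(
    cli_args={"container_name": "cli-container", "local_dir": None},
    file_config={"container_name": "file-container", "local_dir": "./synced-files"},
    environ={"AZURE_STORAGE_CONNECTION_STRING": "from-env"},
)
print(merged)
```

Here the container name comes from the CLI, the local directory from the config file, and the connection string from the environment, because no higher layer supplied one.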
# Basic sync using a config file
python sync.py --config config.json
# Sync only blobs under a specific path
python sync.py -c config.json --prefix "images/2025/"
# Sync and delete local files that were removed from Azure
python sync.py -c config.json --delete-orphaned
# Verbose output for debugging
python sync.py -c config.json -v
# Override output directory
python sync.py -c config.json --local-dir /data/backup

| Flag | Short | Default | Description |
|---|---|---|---|
| `--config` | `-c` | (none) | Path to JSON config file |
| `--connection-string` | | (none) | Azure Storage connection string |
| `--container` | | (none) | Blob container name |
| `--local-dir` | | `./synced-files` | Local directory to sync into |
| `--prefix` | | `""` | Only sync blobs matching this prefix |
| `--delete-orphaned` | | `false` | Remove local files deleted from Azure |
| `--verbose` | `-v` | `false` | Enable debug-level logging |
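
For readers who want to see how the flags above map onto a parser, here is a rough `argparse` sketch. It mirrors the flag names, short forms, and defaults listed in the table, but `build_parser` is a hypothetical function and the real `sync.py` may define its options differently (for example, leaving defaults unset so that config-file values can fill them in).

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        prog="sync.py",
        description="One-way sync from an Azure Blob Storage container to a local directory.",
    )
    parser.add_argument("-c", "--config", help="Path to JSON config file")
    parser.add_argument("--connection-string", help="Azure Storage connection string")
    parser.add_argument("--container", help="Blob container name")
    parser.add_argument("--local-dir", default="./synced-files",
                        help="Local directory to sync into")
    parser.add_argument("--prefix", default="",
                        help="Only sync blobs matching this prefix")
    parser.add_argument("--delete-orphaned", action="store_true",
                        help="Remove local files deleted from Azure")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable debug-level logging")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-c", "config.json", "--prefix", "images/2025/"])
print(args.config, args.prefix, args.local_dir, args.delete_orphaned)
```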
- List: queries all blobs in the container (filtered by `--prefix` if set)
- Compare: checks each blob's `etag` and `last_modified` against a local `.sync_manifest.json`
- Download: only pulls blobs that are new or changed
- Clean up (optional): with `--delete-orphaned`, removes local files that no longer exist in the container
- Save manifest: writes `.sync_manifest.json` so the next run knows what's already synced
The manifest is stored inside your --local-dir. Deleting it will cause a full re-download on the next run.
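
The compare step can be sketched as follows. The manifest layout used here (blob name mapped to its `etag` and `last_modified`, stored as strings) is an assumption made for illustration, and `plan_sync` is a hypothetical name; the real manifest format may differ.

```python
def plan_sync(remote, manifest):
    """Decide which blobs to download and which local files are orphaned.

    Both arguments map a blob name to {"etag": ..., "last_modified": ...}.
    This layout is assumed for the sketch, not taken from the tool.
    """
    to_download = []
    for name, props in remote.items():
        recorded = manifest.get(name)
        # New blob: nothing recorded yet. Changed blob: etag or timestamp differs.
        if (
            recorded is None
            or recorded.get("etag") != props["etag"]
            or recorded.get("last_modified") != props["last_modified"]
        ):
            to_download.append(name)

    # Entries in the manifest that no longer exist in the container.
    orphaned = [name for name in manifest if name not in remote]
    return sorted(to_download), sorted(orphaned)


remote = {
    "a.txt": {"etag": "0x1", "last_modified": "2025-01-01T00:00:00Z"},
    "b.txt": {"etag": "0x3", "last_modified": "2025-02-01T00:00:00Z"},
    "c.txt": {"etag": "0x4", "last_modified": "2025-02-02T00:00:00Z"},
}
manifest = {
    "a.txt": {"etag": "0x1", "last_modified": "2025-01-01T00:00:00Z"},
    "b.txt": {"etag": "0x2", "last_modified": "2025-01-15T00:00:00Z"},
    "d.txt": {"etag": "0x9", "last_modified": "2024-12-01T00:00:00Z"},
}
to_download, orphaned = plan_sync(remote, manifest)
print(to_download, orphaned)
```

With an empty manifest every remote blob lands in `to_download`, which matches the full re-download that follows deleting the manifest file.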
pip install pytest
pytest test_sync.py -v

All Azure SDK calls are mocked, so no real Azure connection is needed.
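
As an illustration of what a mocked test can look like, the sketch below replaces the container client with a `MagicMock`. The helper `list_remote` is hypothetical and is not taken from `test_sync.py`; the only SDK detail relied on is that the container client's `list_blobs` method accepts a `name_starts_with` argument and yields objects with `name` and `etag` attributes.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def list_remote(container_client, prefix=""):
    """Hypothetical helper: map blob name -> etag for blobs under a prefix."""
    blobs = container_client.list_blobs(name_starts_with=prefix or None)
    return {blob.name: blob.etag for blob in blobs}


def test_list_remote_uses_prefix():
    # The mock stands in for the Azure container client; no network is used.
    client = MagicMock()
    client.list_blobs.return_value = [
        SimpleNamespace(name="images/2025/a.png", etag="0x1"),
        SimpleNamespace(name="images/2025/b.png", etag="0x2"),
    ]

    result = list_remote(client, prefix="images/2025/")

    client.list_blobs.assert_called_once_with(name_starts_with="images/2025/")
    assert result == {"images/2025/a.png": "0x1", "images/2025/b.png": "0x2"}
```

`SimpleNamespace` is used for the fake blobs because `MagicMock(name=...)` sets the mock's own display name rather than a `name` attribute.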
- Go to the Azure Portal
- Navigate to your Storage Account → Access keys
- Click Show next to Key 1 and copy the Connection string
Security: Never commit `config.json` to source control. It's in `.gitignore` by default.