55 commits
9e92fbc  remove model source center and reformat (liellnima, Oct 22, 2024)
7e08575  move selected scenario mip files to docs (liellnima, Oct 26, 2024)
68dabf5  update download configs for project, and ensemble members (liellnima, Oct 26, 2024)
114eb10  remove unused esm_constants (liellnima, Oct 26, 2024)
47f0c78  add new constant files for each esgf project type (liellnima, Oct 26, 2024)
aa89ff6  remove get_selected_scenario as it is too restricting (liellnima, Oct 26, 2024)
a61dc2a  remove restricting funcs, extend to broader model set, extend to broa… (liellnima, Oct 26, 2024)
b868033  move constants into constant classes, and collect them in a dict in e… (liellnima, Nov 19, 2024)
bb7b8f1  update configs: move project id to the top (liellnima, Nov 19, 2024)
5a0c38f  update download_from_config func with new constant and config handlin… (liellnima, Nov 19, 2024)
ad5e0b0  Add base structure for abstract downloader and implementations (f-PLT, Jan 10, 2025)
aa0e451  Refactor ESGF constants and project constants (f-PLT, Jan 28, 2025)
6a76fa9  Add first base structure of Config classes (f-PLT, Jan 28, 2025)
23b0bea  Integrate config class for Input4mips (f-PLT, Jan 28, 2025)
770c003  Implement config classes (f-PLT, Feb 26, 2025)
0ea3aae  Update tests (f-PLT, Feb 26, 2025)
9df2456  Refactor CMIP6Downloader for multiple models (f-PLT, Feb 26, 2025)
df32336  Cleanup of downloader.py file (f-PLT, Feb 26, 2025)
1f2ff66  Cleanup of downloader.py file (f-PLT, Feb 26, 2025)
7789c1b  Update all download config files (f-PLT, Feb 27, 2025)
ecf8a41  Add download example (f-PLT, Feb 27, 2025)
260952d  Update download_from_config_file() to use existing functions for each… (f-PLT, Feb 27, 2025)
1017f14  Fix Pylint errors (f-PLT, Feb 27, 2025)
445115f  fix typo (liellnima, Feb 27, 2025)
51614eb  update minimal usecase config and add ocean configs for future (liellnima, Feb 27, 2025)
30e3969  add ocean constants for future use cases, can be ignored rn (liellnima, Feb 27, 2025)
2c341dc  Update with new QA tools and new Makefile version (f-PLT, May 23, 2025)
73b56a2  Ruff fix lint + formatting (f-PLT, May 23, 2025)
4f0283b  Update and fix failing test (f-PLT, May 23, 2025)
0ceafa0  Refactor input4mips constants for safety (f-PLT, May 23, 2025)
3ed950c  Handle pylint warnings (f-PLT, May 23, 2025)
99561a3  Update github actions (f-PLT, May 24, 2025)
bf53e46  Formatting for pyproject.toml (f-PLT, May 26, 2025)
665d77b  Refactor downloader constants (f-PLT, May 28, 2025)
6717393  Refactor downloader_config from Abstract to base inheritance (f-PLT, May 28, 2025)
ee52825  Update .pre-commit-config.yaml (f-PLT, Jun 1, 2025)
cf437bd  Update pyproject.toml (f-PLT, Jun 3, 2025)
fe35c4f  Save progress - Prototype url search (f-PLT, Jun 17, 2025)
05e98cc  Create constraints classes (f-PLT, Feb 5, 2026)
29c1fca  Remove pytest xfail for test_downloader_model_params (f-PLT, Feb 5, 2026)
a5a8b9e  Implement new search client (f-PLT, Feb 5, 2026)
37ccda1  Use new client in utils.py (f-PLT, Feb 16, 2026)
0041eb4  feat(esgpull): Add esgpull dependency and update constraints for mult… (f-PLT, Mar 4, 2026)
88b5349  feat(esgpull): Implement isolated_esgpull_context for safe execution (f-PLT, Mar 4, 2026)
5424300  feat(esgpull): Implement EsgpullDownloader and robust query search co… (f-PLT, Mar 4, 2026)
4178374  test(esgpull): Add integration test for EsgpullDownloader search (f-PLT, Mar 4, 2026)
722001a  feat(esgpull): Add async execution and download integration (f-PLT, Mar 4, 2026)
6ed11ca  test(esgpull): Add verification and testing for esgpull implementation (f-PLT, Mar 4, 2026)
4d4b4bc  Add esgpull downloader (f-PLT, Mar 24, 2026)
8dc75b4  Cleanup unused dedicated esgpull downloader (f-PLT, Mar 24, 2026)
e09705a  Remove push from CI triggers (f-PLT, Mar 24, 2026)
d15f462  Refactor esgpull utils (f-PLT, Mar 24, 2026)
c2c8363  Merge branch 'main' into 12-generalize-downloader (f-PLT, Mar 24, 2026)
7099073  Update Makefile for bugfix (f-PLT, Mar 24, 2026)
04ef0e5  Update Makefile for bugfix (f-PLT, Mar 24, 2026)
6 changes: 6 additions & 0 deletions .make/CHANGES_MAKEFILE.md
@@ -6,6 +6,12 @@ ______________________________________________________________________
 
 <!-- (New changes here in list form) -->
 
+## [1.3.1](https://github.com/RolnickLab/lab-advanced-template/tree/makefile-1.3.1) (2026-03-24)
+
+______________________________________________________________________
+
+- Fix issue where the `ENV_COMMAND_TOOL` variable is not what was expected with `conda` environments
+
 ## [1.3.0](https://github.com/RolnickLab/lab-advanced-template/tree/makefile-1.3.0) (2026-02-12)
 
 ______________________________________________________________________
2 changes: 1 addition & 1 deletion .make/Makefile
@@ -10,7 +10,7 @@
 # files to include.
 ########################################################################################
 PROJECT_PATH := $(dir $(abspath $(firstword $(MAKEFILE_LIST))))
-MAKEFILE_VERSION := 1.3.0
+MAKEFILE_VERSION := 1.3.1
 BUMP_TOOL := bump-my-version
 BUMP_CONFIG_FILE := $(PROJECT_PATH).bumpversion.toml
 SHELL := /usr/bin/env bash
2 changes: 1 addition & 1 deletion .make/base.make
@@ -18,7 +18,7 @@ PROJECT_PATH := $(dir $(abspath $(firstword $(MAKEFILE_LIST))))
 MAKEFILE_NAME := $(word $(words $(MAKEFILE_LIST)),$(MAKEFILE_LIST))
 SHELL := /usr/bin/env bash
 BUMP_TOOL := bump-my-version
-MAKEFILE_VERSION := 1.3.0
+MAKEFILE_VERSION := 1.3.1
 DOCKER_COMPOSE ?= docker compose
 AUTO_INSTALL ?=
7 changes: 5 additions & 2 deletions .make/poetry.make
@@ -14,14 +14,17 @@ ifeq ($(DEFAULT_INSTALL_ENV),venv)
 POETRY_COMMAND_WITH_PROJECT_ENV := source $(VENV_ACTIVATE) && $(POETRY_COMMAND_WITH_PROJECT_ENV)
 else ifeq ($(DEFAULT_INSTALL_ENV),poetry)
 POETRY_COMMAND_WITH_PROJECT_ENV := $(POETRY_COMMAND_WITH_PROJECT_ENV)
-else ifeq ($(DEFAULT_INSTALL_ENV),conda)
-POETRY_COMMAND_WITH_PROJECT_ENV := $(CONDA_ENV_TOOL) run -n $(CONDA_ENVIRONMENT) $(POETRY_COMMAND_WITH_PROJECT_ENV)
 endif
 
 # Do not rename these unless you also rename across all other make files in .make/
 ENV_COMMAND_TOOL := $(POETRY_COMMAND_WITH_PROJECT_ENV) run
 ENV_INSTALL_TOOL := $(POETRY_COMMAND_WITH_PROJECT_ENV) install
 
+ifeq ($(DEFAULT_INSTALL_ENV),conda)
+ENV_COMMAND_TOOL := $(CONDA_ENV_TOOL) run -n $(CONDA_ENVIRONMENT)
+ENV_INSTALL_TOOL := $(ENV_COMMAND_TOOL) $(POETRY_COMMAND_WITH_PROJECT_ENV) install
+endif
+
 
 ## -- Poetry targets ------------------------------------------------------------------------------------------------ ##
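The practical effect of this change for conda users can be sketched as follows (the environment name `my-project` is illustrative, not from the PR):

```make
# With DEFAULT_INSTALL_ENV=conda and CONDA_ENVIRONMENT=my-project, the
# variables now resolve roughly to:
#   ENV_COMMAND_TOOL -> $(CONDA_ENV_TOOL) run -n my-project
#   ENV_INSTALL_TOOL -> $(CONDA_ENV_TOOL) run -n my-project poetry install
# i.e. commands run through `conda run` directly instead of being routed
# through the poetry command, which is the 1.3.1 fix described in the
# CHANGES_MAKEFILE.md entry above.
```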
14 changes: 12 additions & 2 deletions .pre-commit-config.yaml
@@ -1,4 +1,4 @@
-exclude: "^docs/|/migrations/"
+exclude: ^docs/|/migrations/|Makefile*
 default_stages: [commit]
 
 repos:
@@ -17,8 +17,18 @@ repos:
       - id: check-added-large-files
         args: ["--maxkb=5000"]
 
+  - repo: https://github.com/PyCQA/autoflake
+    rev: v2.3.1
+    hooks:
+      - id: autoflake
+
+  - repo: https://github.com/hhatto/autopep8
+    rev: v2.3.2
+    hooks:
+      - id: autopep8
+
   - repo: https://github.com/psf/black
-    rev: 23.12.1
+    rev: 24.4.2
     hooks:
       - id: black
1 change: 1 addition & 0 deletions climateset/download/__init__.py
from .downloader import download_from_config_file  # noqa: F401
7 changes: 7 additions & 0 deletions climateset/download/abstract_downloader.py
from abc import ABC, abstractmethod


class AbstractDownloader(ABC):
    @abstractmethod
    def download(self):
        pass
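Any concrete downloader must override `download()`; instantiating the base class directly raises `TypeError`. A minimal sketch of a subclass (the `ListDownloader` class is hypothetical, for illustration only — the PR's real implementations such as `CMIP6Downloader` take configs and constants instead of a URL list):

```python
from abc import ABC, abstractmethod


class AbstractDownloader(ABC):
    @abstractmethod
    def download(self):
        pass


# Hypothetical subclass for illustration; not part of the PR.
class ListDownloader(AbstractDownloader):
    def __init__(self, urls):
        self.urls = urls

    def download(self):
        # A real downloader would fetch each URL; here we only echo them back.
        return list(self.urls)
```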
164 changes: 164 additions & 0 deletions climateset/download/client.py
from typing import Any, List, Optional

from pyesgf.search import SearchConnection
from pyesgf.search.context import DatasetSearchContext

from climateset.download.constants import NODE_LINK_URLS
from climateset.download.constraints import BaseSearchConstraints
from climateset.utils import create_logger

LOGGER = create_logger(__name__)


class SearchClient:
    """
    Client for performing searches against ESGF nodes with failover support.

    Acts as a factory for SearchSession objects.
    """

    def __init__(self, node_urls: List[str] | None = None, distrib: bool = True):
        self.node_urls = node_urls if node_urls is not None else NODE_LINK_URLS
        self.distrib = distrib
        self.logger = LOGGER

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass

    def new_session(self) -> "SearchSession":
        """Start a new search session."""
        return SearchSession(self.node_urls, self.distrib, self.logger)


class SearchSession:
    """
    Stateful session for building iterative search queries.

    Handles node failover by replaying applied constraints.
    """

    def __init__(self, node_urls: List[str], distrib: bool, logger):
        self.node_urls = node_urls
        self.distrib = distrib
        self.logger = logger

        # History of constraints applied to this session
        self._constraints_history: List[BaseSearchConstraints] = []

        # State for the currently active connection
        self._current_node_index = 0
        self._connection: Optional[SearchConnection] = None
        self._context: Optional[DatasetSearchContext] = None

        # Initialize connection logic
        self._ensure_connection()

    def _ensure_connection(self):
        """
        Ensure a valid connection/context exists.

        If not, attempts to connect to available nodes. Once connected, replays history.
        """
        if self._context is not None:
            return

        while self._current_node_index < len(self.node_urls):
            url = self.node_urls[self._current_node_index]
            try:
                self.logger.info(f"Connecting to ESGF node: {url}")
                self._connection = SearchConnection(url=url, distrib=self.distrib)

                # Create fresh context
                ctx = self._connection.new_context()

                # Replay constraints
                for constraints in self._constraints_history:
                    params = constraints.to_esgf_params()
                    if params:
                        ctx = ctx.constrain(**params)

                self._context = ctx
                return
            except Exception as e:  # pylint: disable=broad-exception-caught
                self.logger.warning(f"Failed to connect to {url}: {e}")
                self._current_node_index += 1
                self._connection = None
                self._context = None

        raise ConnectionError(f"Could not connect to any ESGF node. Tried: {self.node_urls}")

    def _rotate_node(self):
        """Force rotation to the next node (e.g. after a search failure)."""
        self.logger.info("Rotating to next ESGF node...")
        self._current_node_index += 1
        self._connection = None
        self._context = None
        self._ensure_connection()

    def constrain(self, constraints: BaseSearchConstraints) -> "SearchSession":
        """Apply a new set of constraints to the session."""
        self._constraints_history.append(constraints)

        # If we have an active context, apply immediately.
        # If not (e.g. all nodes down), _ensure_connection will handle it next time.
        if self._context:
            params = constraints.to_esgf_params()
            if params:
                try:
                    self._context = self._context.constrain(**params)
                except Exception as e:  # pylint: disable=broad-exception-caught
                    self.logger.warning(f"Error applying constraints on current node: {e}")
                    self._rotate_node()
        else:
            # Try to establish a connection if we were disconnected
            try:
                self._ensure_connection()
            except ConnectionError:
                pass  # Delay the error until an actual search/facet request

        return self

    def get_available_facets(self, facet_name: str) -> List[str]:
        """
        Get available counts/values for a specific facet.

        Retries on other nodes if the current one fails.
        """
        max_attempts = len(self.node_urls)
        attempts = 0

        while attempts < max_attempts:
            try:
                self._ensure_connection()
                if facet_name in self._context.facet_counts:
                    return list(self._context.facet_counts[facet_name].keys())
                return []
            except Exception as e:  # pylint: disable=broad-exception-caught
                self.logger.warning(f"Error fetching facets from {self.node_urls[self._current_node_index]}: {e}")
                self._rotate_node()
                attempts += 1

        return []

    def search(self) -> List[Any]:
        """
        Execute the search using the applied constraints.

        Retries on other nodes if the current one fails.
        """
        max_attempts = len(self.node_urls)
        attempts = 0

        while attempts < max_attempts:
            try:
                self._ensure_connection()
                return self._context.search()
            except Exception as e:  # pylint: disable=broad-exception-caught
                self.logger.warning(f"Search failed on {self.node_urls[self._current_node_index]}: {e}")
                self._rotate_node()
                attempts += 1

        raise ConnectionError("Search failed on all available nodes.")
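The intended use is a fluent chain along the lines of `SearchClient().new_session().constrain(c).search()`, with node rotation happening behind the scenes. Exercising the real class requires a live ESGF node, so the following is a self-contained sketch of the same failover idea; `FlakyNode` and `search_with_failover` are illustrative stand-ins, not part of the PR, and no network I/O happens:

```python
class FlakyNode:
    """Stand-in for an ESGF node; a real node would answer search queries."""

    def __init__(self, name, healthy):
        self.name = name
        self.healthy = healthy

    def search(self, params):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return [f"dataset matching {sorted(params)}"]


def search_with_failover(nodes, params):
    # Mirrors SearchSession.search(): try nodes in order, rotate on failure,
    # and only give up once every node has been attempted.
    errors = []
    for node in nodes:
        try:
            return node.search(params)
        except ConnectionError as exc:
            errors.append(str(exc))
    raise ConnectionError(f"Search failed on all available nodes: {errors}")


# The first node fails, so the search transparently succeeds on the second.
results = search_with_failover(
    [FlakyNode("node-a", healthy=False), FlakyNode("node-b", healthy=True)],
    {"variable_id": "tas"},
)
```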