[ENH] V1 → V2 API Migration - studies by rohansen856 · Pull Request #1610 · openml/openml-python

rohansen856 · 2026-01-08T20:47:22Z

Metadata

Reference Issue: [ENH] V1 → V2 API Migration - studies #1594 (towards [ENH] V1 → V2 API Migration #1575)
New Tests Added: No
Documentation Updated: No

Details

Stacked PR; depends on #1576

This PR adds Studies v2 migration.

A question:
Because of the pre-commit hook, I could not give the function 6 arguments, so I worked around it like this:
openml\_api\resources\studies.py (lines 10-15)

        limit = kwargs.get("limit")
        offset = kwargs.get("offset")
        status = kwargs.get("status")
        main_entity_type = kwargs.get("main_entity_type")
        uploader = kwargs.get("uploader")
        benchmark_suite = kwargs.get("benchmark_suite")

I would like to confirm if this approach is correct or not. Raising a draft PR for now.
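
For comparison, here is a minimal sketch of the trade-off behind this question. It uses hypothetical stand-in functions (`list_with_kwargs` and `list_explicit`), not the PR's actual method. Reading filters through `kwargs.get` satisfies the argument-count lint, but a misspelled filter name is silently ignored, whereas an explicit keyword-only signature rejects it.

```python
from __future__ import annotations

from typing import Any


def list_with_kwargs(**kwargs: Any) -> dict[str, Any]:
    # Hypothetical stand-in for the workaround: only known keys are read,
    # so unknown or misspelled keys are silently dropped.
    names = ("limit", "offset", "status")
    return {name: kwargs.get(name) for name in names}


def list_explicit(
    *,
    limit: int | None = None,
    offset: int | None = None,
    status: str | None = None,
) -> dict[str, Any]:
    # Hypothetical stand-in with an explicit keyword-only signature.
    return {"limit": limit, "offset": offset, "status": status}


# A typo ("ofset") goes unnoticed with **kwargs ...
silent = list_with_kwargs(limit=5, ofset=10)

# ... but raises TypeError with the explicit signature.
try:
    list_explicit(limit=5, ofset=10)  # type: ignore[call-arg]
    typo_rejected = False
except TypeError:
    typo_rejected = True
```

The explicit form also keeps per-parameter type hints visible to mypy and to IDE completion, which the `kwargs.get` form gives up.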

codecov-commenter · 2026-01-08T20:53:56Z

Codecov Report

❌ Patch coverage is 23.91304% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.64%. Comparing base (e653ef6) to head (03ea718).

Files with missing lines	Patch %	Lines
openml/_api/resources/study.py	23.25%	33 Missing ⚠️
openml/study/functions.py	33.33%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1610      +/-   ##
==========================================
- Coverage   54.67%   54.64%   -0.04%     
==========================================
  Files          63       63              
  Lines        5108     5124      +16     
==========================================
+ Hits         2793     2800       +7     
- Misses       2315     2324       +9

rohansen856 · 2026-01-13T07:12:28Z

Using noqa instead of the kwargs workaround, following this example from openml\testing.py:

    def _check_fold_timing_evaluations(  # noqa: PLR0913
        self,
        fold_evaluations: dict[str, dict[int, dict[int, float]]],
        num_repeats: int,
        num_folds: int,
        *,
        max_time_allowed: float = 60000.0,
        task_type: TaskType = TaskType.SUPERVISED_CLASSIFICATION,
        check_scores: bool = True,
    ) -> None:

Final function signature:

    def list(  # noqa: PLR0913
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
        main_entity_type: str | None = None,
        uploader: list[int] | None = None,
        benchmark_suite: int | None = None,
    ) -> Any:

geetu040

Good work. Just use the listing as suggested in #1575 (comment) which is already similar to what you have done.

rohansen856 · 2026-01-15T08:35:54Z

@geetu040 I reviewed the specific changes needed and have a slight doubt about the pandas implementation.
So, as I understand it, I need to use a pandas DataFrame instead of Any in openml\_api\resources\base.py, like this:

class StudiesAPI(ResourceAPI, ABC):
    @abstractmethod
    def list(  # noqa: PLR0913
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
        main_entity_type: str | None = None,
        uploader: list[int] | None = None,
        benchmark_suite: int | None = None,
    ) -> pd.DataFrame: ...

Similarly, I have to change the return value in openml\_api\resources\studies.py from this: return response.text
to this:

        xml_string = response.text

        # Parse XML and convert to DataFrame
        study_dict = xmltodict.parse(xml_string, force_list=("oml:study",))

        # Minimalistic check if the XML is useful
        assert isinstance(study_dict["oml:study_list"]["oml:study"], list), type(
            study_dict["oml:study_list"],
        )
        assert (
            study_dict["oml:study_list"]["@xmlns:oml"] == "http://openml.org/openml"
        ), study_dict["oml:study_list"]["@xmlns:oml"]

        studies = {}
        for study_ in study_dict["oml:study_list"]["oml:study"]:
            # maps from xml name to a tuple of (dict name, casting fn)
            expected_fields = {
                "oml:id": ("id", int),
                "oml:alias": ("alias", str),
                "oml:main_entity_type": ("main_entity_type", str),
                "oml:benchmark_suite": ("benchmark_suite", int),
                "oml:name": ("name", str),
                "oml:status": ("status", str),
                "oml:creation_date": ("creation_date", str),
                "oml:creator": ("creator", int),
            }
            study_id = int(study_["oml:id"])
            current_study = {}
            for oml_field_name, (real_field_name, cast_fn) in expected_fields.items():
                if oml_field_name in study_:
                    current_study[real_field_name] = cast_fn(study_[oml_field_name])
            current_study["id"] = int(current_study["id"])
            studies[study_id] = current_study

        return pd.DataFrame.from_dict(studies, orient="index")

A total of 3 files would be affected: openml\_api\resources\base.py, openml\_api\resources\studies.py and openml\study\functions.py

Can you please confirm my approach? After that I will update the PR.

geetu040 · 2026-01-15T08:55:24Z

@rohansen856 yes sounds right

rohansen856 · 2026-01-15T10:42:12Z

Updated! Ready for review.

geetu040

Almost fine. Just completely remove _list_studies as well, and replace _list_studies with api_context.backend.studies.list as the argument to partial in list_studies. I hope I did not confuse you; just search for the exact method names in the code. Let me know if I am not clear enough.
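
A minimal sketch of the pattern this comment describes: the backend's `list` method is handed straight to `functools.partial`, with no private wrapper in between. `FakeStudiesAPI` and `list_all` are hypothetical stand-ins written for this illustration; they are not the library's actual backend class or pagination helper.

```python
from functools import partial
from typing import Any, Callable, Dict, List, Optional

Rows = List[Dict[str, Any]]


class FakeStudiesAPI:
    """Hypothetical stand-in for the backend's studies resource."""

    def list(self, limit: int, offset: int, status: Optional[str] = None) -> Rows:
        # Pretend the server holds study ids 1..7, all with the requested status.
        ids = range(1 + offset, min(offset + limit, 7) + 1)
        return [{"id": i, "status": status} for i in ids]


def list_all(listing_call: Callable[..., Rows], batch_size: int) -> Rows:
    # Hypothetical pagination helper: it only supplies limit/offset and
    # relies on the caller to have bound the remaining filters.
    results: Rows = []
    offset = 0
    while True:
        batch = listing_call(limit=batch_size, offset=offset)
        results.extend(batch)
        if len(batch) < batch_size:
            return results
        offset += batch_size


backend = FakeStudiesAPI()
# Bind the filters once; the bound method itself is the listing call.
listing_call = partial(backend.list, status="active")
studies = list_all(listing_call, batch_size=3)
```

Because the bound method already accepts `limit` and `offset`, a wrapper such as `_list_studies` would only forward arguments, which is presumably why the review asks for its removal.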

rohansen856 · 2026-01-16T09:45:32Z

Oh, definitely! I probably missed that in openml\study\functions.py; I will push the change with the next commit.

geetu040

update with #1576 (comment)

geetu040

I think it's not synced with the base PR #1576; that's why the tests are failing.

geetu040 · 2026-03-23T12:41:47Z

Please add "Fixes #1594" to the description.

geetu040

The base PR is merged now, please sync with main.

geetu040

Nicely done overall. I am a bit skeptical about the overuse of mocking, since it adds a lot of hardcoded response content, but let's see if Pieter thinks otherwise when he reviews.

geetu040 · 2026-03-25T16:26:32Z

openml/_api/resources/base/resources.py

+        main_entity_type: str | None = None,
+        uploader: list[int] | None = None,
+        benchmark_suite: int | None = None,
+    ) -> pd.DataFrame:


Use abstractmethod, and I'd say remove the docstring and simply use the placeholder ... in place of NotImplementedError.
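
A minimal sketch of what this suggestion changes in practice, using simplified hypothetical classes (`StudiesBase`, `StudiesV1`, `Incomplete`) rather than the PR's real ones. With `@abstractmethod` and a `...` body, a subclass that forgets to implement `list` fails at instantiation time, instead of failing only when the method is eventually called, as it would with a body that raises `NotImplementedError`.

```python
from abc import ABC, abstractmethod


class StudiesBase(ABC):
    # Hypothetical, simplified stand-in for the resource base class.
    @abstractmethod
    def list(self, limit=None, offset=None): ...


class StudiesV1(StudiesBase):
    def list(self, limit=None, offset=None):
        return []


class Incomplete(StudiesBase):
    pass  # forgets to implement list()


try:
    Incomplete()
    instantiated = True
except TypeError:
    # ABC refuses to instantiate a class that still has abstract methods.
    instantiated = False
```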

openml/_api/resources/study.py

geetu040 · 2026-03-25T16:26:40Z

tests/test_api/test_study.py

+
+        result = study_v1.delete(study_id)
+
+        assert result is True


Suggested change

-        assert result is True
+        assert result

geetu040

Also fix the failing tests; you probably just need to fix the fixture. See the other test files for reference.

geetu040

Thanks for the changes, looks good now.

@PGijsbers please review/merge.

PGijsbers

Minor test changes required, confirmation from @geetu040 requested on the intent of the added functions.

PGijsbers · 2026-03-27T09:01:55Z

tests/test_api/test_study.py

+        mock_request.return_value._content = mock_response.encode("utf-8")
+
+        studies_df = study_v1.list(limit=5, offset=0)
+


Suggested change

+        mock_request.assert_called_once()

Make sure that we check the mocks are used so that we're testing what we think we are testing.
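
A minimal sketch of why this assertion matters, using a hypothetical `Client` wrapper and an assumed `study/list` path rather than the PR's actual test fixtures. If the code under test never reached the mocked request (for example, because it answered from a cache or the wrong object was patched), a test that only inspects the result could still pass.

```python
from unittest import mock


class Client:
    """Hypothetical HTTP client wrapper, standing in for the real one."""

    def __init__(self, session):
        self.session = session

    def list_studies(self):
        response = self.session.request("GET", "study/list")
        return response.text


session = mock.Mock()
session.request.return_value.text = "<oml:study_list/>"

result = Client(session).list_studies()

# Fails loudly if the mocked request was never made or was made more than once.
session.request.assert_called_once()
```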

PGijsbers · 2026-03-27T09:07:03Z

tests/test_api/test_study.py

+
+        page1_ids = set(page1["id"])
+        page2_ids = set(page2["id"])
+        assert page1_ids.isdisjoint(page2_ids)


I think the real test here is that we make the right request, e.g., that the pagination parameters are set correctly in the URL. We should assert that. The parsed responses are not really under test, and are dictated by the mocked responses anyway.
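
A minimal sketch of the kind of assertion described here, assuming a hypothetical `list_studies` helper and an assumed URL layout (`study/list/limit/<n>/offset/<m>`); the real request builder and path format may differ. The check targets the outgoing request, which is the part the code under test controls.

```python
from unittest import mock


def list_studies(session, limit, offset):
    # Hypothetical request builder; the real URL layout may differ.
    url = f"study/list/limit/{limit}/offset/{offset}"
    return session.request("GET", url)


session = mock.Mock()
list_studies(session, limit=5, offset=10)

# Assert on what was sent, not on what the canned response parsed into.
session.request.assert_called_once_with("GET", "study/list/limit/5/offset/10")

# The recorded call can also be unpacked for finer-grained checks.
args, kwargs = session.request.call_args
method, url = args
```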

PGijsbers · 2026-03-27T11:28:00Z

tests/test_api/test_study.py

+            f"</oml:upload_study>\n"
+        ).encode("utf-8")
+
+        published_id = study_v1.publish("study", files=study_files)


note (no action required): as discussed in the call today, the publish method is going to be used internally through OpenMLBase.publish. Users will not be expected to pass files manually to the function.
Similar work will be done for tag/untag so that the interface for those operations doesn't change.
This is listed as follow up work to be handled in a separate PR.

Just leaving the comment here to clarify the intent and double check my understanding is correct.

@geetu040

geetu040 mentioned this pull request Jan 9, 2026

[ENH] V1 → V2 API Migration #1575

Open

18 tasks

chore: fixed the args limit in function using noqa

88077a7

Signed-off-by: rohansen856 <rohansen856@gmail.com>

rohansen856 marked this pull request as ready for review January 13, 2026 07:21

geetu040 suggested changes Jan 13, 2026

satvshr and others added 3 commits January 14, 2026 16:55

merge main

1dbc780

Merge branch 'main' into studies-migration

13acf35

[pre-commit.ci] auto fixes from pre-commit.com hooks

e02e05b

for more information, see https://pre-commit.ci

geetu040 and others added 3 commits January 15, 2026 14:51

undo changes in tasks/functions.py

4c75e16

Merge branch 'main' into migration

5762185

chore: updated the list function acc to reviews

9170edc

Signed-off-by: rohansen856 <rohansen856@gmail.com>

satvshr added 4 commits January 15, 2026 21:36

made requested changes

021a1e1

Merge branch 'main' into issue1564

4c4a12c

made requested changes

1d91220

Merge branch 'issue1564' of https://github.com/satvshr/openml-python …

3e26ace

…into issue1564

geetu040 suggested changes Jan 15, 2026

satvshr added 4 commits January 15, 2026 21:56

fixed bugs

0060b2e

fixed bugs

65ba66b

fixed bugs

317c6e9

fixed bugs

503ab82

rohansen856 and others added 3 commits January 16, 2026 15:15

chore: removed _list_studies and implemented api_context for studies …

8c980c9

…list Signed-off-by: rohansen856 <rohansen856@gmail.com>

Merge branch 'main' into issue1564

fd7ea2b

bug fixing

fa3cd40

geetu040 assigned rohansen856 Jan 19, 2026

Merge branch 'main' into migration

7e9bc1f

Merge branch 'main' into migration

8de99b7

geetu040 suggested changes Mar 13, 2026

geetu040 and others added 11 commits March 16, 2026 20:12

create enum ServerMode

7d61107

update config for ServerMode

1ecbbba

update tests for ServerMode

65472ed

Merge branch 'main' into studies-migration

a0a3b61

[pre-commit.ci] auto fixes from pre-commit.com hooks

33858a7

for more information, see https://pre-commit.ci

chore: fixed pre commit errors

a704bb0

Signed-off-by: rohansen856 <rohansen856@gmail.com>

chore: fixed mypy errors

3470cb5

Signed-off-by: rohansen856 <rohansen856@gmail.com>

udpate apikey in _TEST_SERVERS_LOCAL

44b48b5

CI trigger

62c3d3d

chore: fixed missing header in OpenMLConfigManager

c93b97f

Signed-off-by: rohansen856 <rohansen856@gmail.com>

chore: fixed server key issue in test

972987b

Signed-off-by: rohansen856 <rohansen856@gmail.com>

geetu040 suggested changes Mar 23, 2026

geetu040 added 2 commits March 23, 2026 14:42

fix: remove duplicate server name in cache path

04bc83b

identified in https://github.com/openml/openml-python/actions/runs/23148612114/job/67243650521?pr=1609

test: remove check for ":" since windows CI expects it

f926092

identified in https://github.com/openml/openml-python/actions/runs/23430963986/job/68156944423?pr=1576

Merge branch 'migration' of https://github.com/geetu040/openml-python …

083194b

…into studies-migration

geetu040 suggested changes Mar 24, 2026

Merge branch 'main' into studies-migration (resolve conflicts)

c224532

geetu040 suggested changes Mar 25, 2026

rohansen856 added 4 commits March 26, 2026 13:18

chore: added missing argument in studies api test

3953fdf

Signed-off-by: rohansen856 <rohansen856@gmail.com>

chore: updated resource base to enable tagging for studies

c2d8487

Signed-off-by: rohansen856 <rohansen856@gmail.com>

chore: updated acc to reviews

edf8524

Signed-off-by: rohansen856 <rohansen856@gmail.com>

Merge branch 'main' into studies-migration

ca2cdc5

geetu040 approved these changes Mar 26, 2026

PGijsbers requested changes Mar 27, 2026

rohansen856 added 2 commits March 30, 2026 15:13

chore: updated tests acc to review

6f30a45

Signed-off-by: rohansen856 <rohansen856@gmail.com>

Merge branch 'studies-migration' of https://github.com/rohansen856/op…

03ea718

…enml-python into studies-migration

11 participants
