
build(deps-dev): bump datasets from 4.7.0 to 4.8.2#955

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/uv/datasets-4.8.2

@dependabot dependabot bot commented on behalf of github Mar 18, 2026

Bumps datasets from 4.7.0 to 4.8.2.

Release notes

Sourced from datasets's releases.

4.8.2

What's Changed

Full Changelog: huggingface/datasets@4.8.1...4.8.2

4.8.1

What's Changed

Full Changelog: huggingface/datasets@4.8.0...4.8.1

4.8.0

Dataset Features

  • Read (and write) from HF Storage Buckets: load raw data, process and save to Dataset Repos by @​lhoestq in huggingface/datasets#8064

    from datasets import load_dataset
    # load raw data from a Storage Bucket on HF
    ds = load_dataset("buckets/username/data-bucket", data_files=["*.jsonl"])
    # or manually, using hf:// paths
    ds = load_dataset("json", data_files=["hf://buckets/username/data-bucket/*.jsonl"])
    # process, filter
    ds = ds.map(...).filter(...)
    # publish the AI-ready dataset
    ds.push_to_hub("username/my-dataset-ready-for-training")

    This also fixes a segfault in multiprocessed push_to_hub on macOS (it now uses spawn instead of fork), and it bumps the dill and multiprocess versions to support Python 3.14.
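
    The fork-to-spawn change mentioned above can be illustrated with the standard library alone. This is a minimal sketch, not the code from huggingface/datasets#8064; `shard_length` and `run_with_spawn` are hypothetical names, and the worker only stands in for real per-shard work.

```python
import multiprocessing as mp


def shard_length(shard):
    # Hypothetical worker: stands in for per-shard work such as an upload.
    return len(shard)


def run_with_spawn(shards, num_proc=2):
    # Ask for an explicit "spawn" context instead of the platform default.
    # Spawned workers start a fresh interpreter rather than inheriting the
    # parent's memory the way forked workers do.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=num_proc) as pool:
        return pool.map(shard_length, shards)
```

    With spawn, workers re-import the main module, so `run_with_spawn(...)` should be called from under an `if __name__ == "__main__":` guard in a script.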

  • Datasets streaming iterable packaged improvements and fixes by @​Michael-RDev in huggingface/datasets#8068

    • added max_shard_size to IterableDataset.push_to_hub (but this requires iterating twice to know the full dataset size - improvements are welcome)
    • more arrow-native iterable operations for IterableDataset
    • better support of glob patterns in archives, e.g. zip://*.jsonl::hf://datasets/username/dataset-name/data.zip
    • fixes for to_pandas, videofolder, load_dataset_builder kwargs
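
    The archive pattern in the list above chains two locations with `::`, with the glob inside the archive on the left and the archive's location on the right. A rough sketch of how such a string breaks down, using plain string handling only; `split_chained_url` is a hypothetical helper and not part of datasets or fsspec.

```python
def split_chained_url(url):
    # Split a "::"-chained URL into (protocol, path) pairs, outermost first.
    layers = []
    for part in url.split("::"):
        # A part without "://" is kept as a bare protocol with an empty path.
        protocol, _, path = part.partition("://")
        layers.append((protocol, path))
    return layers


url = "zip://*.jsonl::hf://datasets/username/dataset-name/data.zip"
for protocol, path in split_chained_url(url):
    print(protocol, path)
```

    Here the first pair is the glob applied inside the zip file and the second pair is where the zip file itself lives.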

What's Changed

New Contributors

Full Changelog: huggingface/datasets@4.7.0...4.8.0

Commits

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [datasets](https://github.com/huggingface/datasets) from 4.7.0 to 4.8.2.
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](huggingface/datasets@4.7.0...4.8.2)

---
updated-dependencies:
- dependency-name: datasets
  dependency-version: 4.8.2
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
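
The `update-type: version-update:semver-minor` field above follows from comparing the two version strings. A minimal sketch of that classification, assuming plain `MAJOR.MINOR.PATCH` versions; `classify_bump` is a hypothetical helper and not Dependabot's implementation.

```python
def classify_bump(old, new):
    # Assumes plain "MAJOR.MINOR.PATCH" strings with integer components.
    old_parts = tuple(int(p) for p in old.split("."))
    new_parts = tuple(int(p) for p in new.split("."))
    for name, a, b in zip(("major", "minor", "patch"), old_parts, new_parts):
        if a != b:
            # The first component that differs determines the bump type.
            return f"version-update:semver-{name}"
    return "no-change"


print(classify_bump("4.7.0", "4.8.2"))  # version-update:semver-minor
```

For this PR the minor component is the first to differ (7 to 8), so the patch change from 0 to 2 does not affect the label.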
dependabot bot added the dependencies (Pull requests that update a dependency file) and python:uv (Pull requests that update python:uv code) labels on Mar 18, 2026