[Ruby] Add pure Ruby Apache Arrow reader/writer #48132

@hiroyuki-sato

Describe the enhancement requested

Apache Arrow's Ruby language support is implemented as a binding for Apache Arrow C++/GLib. Bindings generally require a C++ compiler and are more difficult to install than libraries implemented in pure Ruby.

The difficulty of installing a binding may be worthwhile if you use the high-speed data processing features provided by Apache Arrow C++, but it may not be worth it if, for example, you simply want to support Apache Arrow output in your application.

By providing a pure Ruby library whose scope is limited to reading and writing Apache Arrow data, we hope to increase the number of Ruby applications and libraries that support Apache Arrow. This is similar to the approach taken by nanoarrow, which is provided separately from Apache Arrow C++.

This pure Ruby library does not replace the current binding. If you require the high-speed data processing features provided by Apache Arrow C++, we recommend continuing to use the current binding. On the other hand, this pure Ruby library is an option for use cases where simply reading and writing Apache Arrow data is sufficient.
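
As a rough illustration of how little is needed for basic format handling, here is a minimal sketch in plain Ruby (standard library only) that checks the framing of an Apache Arrow IPC file: the leading and trailing `ARROW1` magic and the little-endian int32 footer size stored before the trailing magic. The helper name `arrow_file_footer_size` is hypothetical and is not part of any existing or proposed API; decoding the footer itself (a FlatBuffers message) is out of scope here.

```ruby
require "stringio"

# Magic bytes that open and close an Arrow IPC file.
ARROW_MAGIC = "ARROW1".b

# Hypothetical helper (not an existing API): returns the footer size in
# bytes if `io` is framed like an Arrow IPC file, otherwise nil.
def arrow_file_footer_size(io)
  io.seek(0, IO::SEEK_END)
  size = io.pos
  # Smallest possible framing: magic (6) + padding (2) +
  # footer size (4) + magic (6) = 18 bytes.
  return nil if size < 18

  io.seek(0)
  return nil unless io.read(6) == ARROW_MAGIC

  # The last 10 bytes are <int32 footer size><"ARROW1">.
  io.seek(size - 10)
  tail = io.read(10)
  return nil unless tail.byteslice(4, 6) == ARROW_MAGIC

  tail.unpack1("l<") # little-endian signed 32-bit integer
end
```

With a real file this could be called as `File.open(path, "rb") { |f| arrow_file_footer_size(f) }`, where `path` is whatever Arrow file the application has at hand. No compiler or native library is involved, which is the installation advantage described above.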

In addition, the pure Ruby version reduces installation size because it does not depend on Apache Arrow C++, GLib, GObject Introspection, and so on.
For example, Apache Arrow C++, GLib, and GObject Introspection together use approximately 260MB; if you only read and write Arrow data, this space is not needed.

Ultimately, we expect to see an increase in the number of Apache Arrow users.

Component(s)

Ruby
