feat: add more scalar array functions to `bigframes.bigquery` by tswast · Pull Request #17213 · googleapis/google-cloud-python

tswast · 2026-05-20T20:11:30Z

🦕

gemini-code-assist

Code Review

This pull request introduces a comprehensive set of BigQuery array operations into the global namespace, moving existing functions and adding new ones like array_concat and generate_array. The generator script was updated to handle single-argument tuples, and new unit tests were implemented. Review feedback identifies a syntax error in the generated code where single-element tuples lack trailing commas, leading to potential runtime errors. Additionally, the reviewer suggests consolidating repetitive imports for better maintainability and correcting type mismatches and argument passing in the unit tests to ensure they properly validate the new operations.
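The single-element tuple problem mentioned in the summary can be reproduced in plain Python. The sketch below is only an illustration: `render_args_tuple` is a hypothetical helper, not the generator script from this PR.

```python
def render_args_tuple(arg_exprs):
    """Render source text for a tuple literal holding the given expressions.

    Hypothetical helper for illustration; not the generator from this PR.
    """
    if len(arg_exprs) == 1:
        # A one-element tuple needs a trailing comma. Without it, Python
        # parses "(x)" as a parenthesized expression, not a tuple.
        return f"({arg_exprs[0]},)"
    return "(" + ", ".join(arg_exprs) + ")"


# Without the comma, the parentheses only group the expression.
not_a_tuple = eval("('a')")

# With the helper, one- and two-element cases both evaluate to tuples.
one_tuple = eval(render_args_tuple(["'a'"]))
two_tuple = eval(render_args_tuple(["'a'", "'b'"]))
```

Here `not_a_tuple` is the string `'a'`, so code that later iterates over it as an `args` tuple would walk over characters instead of argument specs.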

gemini-code-assist · 2026-05-20T20:20:04Z

+    result = array.flatten(
+        cast(bpd.Series, scalar_types_df["string_col"]),
+        cast(bpd.Series, scalar_types_df["string_col"]),
+    ).to_frame()
+
+    snapshot.assert_match(result.sql.rstrip() + "\n", "out.sql")


In test_flatten, the depth argument is passed positionally, but the YAML definition specifies it as keyword_only: true. Passing string_col for depth is also a type mismatch, since depth expects an integer (i64). Passing an integer by keyword keeps the test aligned with the expected fail-fast behavior for unsupported types.

Suggested change

-    result = array.flatten(
-        cast(bpd.Series, scalar_types_df["string_col"]),
-        cast(bpd.Series, scalar_types_df["string_col"]),
-    ).to_frame()
-
-    snapshot.assert_match(result.sql.rstrip() + "\n", "out.sql")
+def test_flatten(scalar_types_df: bpd.DataFrame, snapshot):
+    result = array.flatten(
+        cast(bpd.Series, scalar_types_df["string_col"]),
+        depth=1,
+    ).to_frame()
+
+    snapshot.assert_match(result.sql.rstrip() + "\n", "out.sql")

References

When a function receives parameters of an unsupported type, it should raise an error (e.g., ProgrammingError) instead of silently returning empty values. This ensures fail-fast behavior and prevents potential issues with missing parameter values in database operations.
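To make the keyword-only point in this comment concrete, here is a minimal sketch of how a keyword-only parameter behaves in Python. The `flatten` function below is a hypothetical stand-in with an assumed signature, not the generated `bigframes.bigquery` function.

```python
def flatten(values, *, depth=None):
    """Hypothetical stand-in with a keyword-only `depth`; not the bigframes API."""
    # The bare `*` in the signature forces `depth` to be passed by name.
    return ("FLATTEN", values, depth)


# Passing depth by keyword works.
ok = flatten("col", depth=1)

# Passing it positionally fails at call time with a TypeError.
try:
    flatten("col", 1)
    positional_error = None
except TypeError as exc:
    positional_error = exc
```

If the generated function follows the `keyword_only: true` YAML setting in the same way, the original two-positional-argument test call would be rejected by Python itself.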

gemini-code-assist · 2026-05-20T20:20:05Z

+def test_generate_array(scalar_types_df: bpd.DataFrame, snapshot):
+    result = array.generate_array(
+        cast(bpd.Series, scalar_types_df["string_col"]),
+        cast(bpd.Series, scalar_types_df["string_col"]),
+        cast(bpd.Series, scalar_types_df["string_col"]),
+    ).to_frame()
+
+    snapshot.assert_match(result.sql.rstrip() + "\n", "out.sql")


In test_generate_array, string_col is used for the start, end, and step expressions. GENERATE_ARRAY in BigQuery only supports numeric types (INT64, FLOAT64, etc.). The test should use appropriate numeric columns to align with the expected fail-fast behavior for unsupported types.

Suggested change

-def test_generate_array(scalar_types_df: bpd.DataFrame, snapshot):
-    result = array.generate_array(
-        cast(bpd.Series, scalar_types_df["string_col"]),
-        cast(bpd.Series, scalar_types_df["string_col"]),
-        cast(bpd.Series, scalar_types_df["string_col"]),
-    ).to_frame()
-
-    snapshot.assert_match(result.sql.rstrip() + "\n", "out.sql")
+def test_generate_array(scalar_types_df: bpd.DataFrame, snapshot):
+    result = array.generate_array(
+        cast(bpd.Series, scalar_types_df["int64_col"]),
+        cast(bpd.Series, scalar_types_df["int64_col"]),
+        cast(bpd.Series, scalar_types_df["int64_col"]),
+    ).to_frame()
+
+    snapshot.assert_match(result.sql.rstrip() + "\n", "out.sql")

References

When a function receives parameters of an unsupported type, it should raise an error (e.g., ProgrammingError) instead of silently returning empty values. This ensures fail-fast behavior and prevents potential issues with missing parameter values in database operations.
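The fail-fast rule referenced here could look roughly like the sketch below. It is an assumption-laden illustration: `generate_array_sql`, the `(expression, type)` pairs, and the exact set of accepted types are all hypothetical, and nothing in the thread shows that the PR validates types this way.

```python
# Assumed set of numeric SQL types; the review comment names INT64 and FLOAT64.
NUMERIC_SQL_TYPES = frozenset({"INT64", "FLOAT64", "NUMERIC", "BIGNUMERIC"})


def generate_array_sql(start, end, step):
    """Build a GENERATE_ARRAY call from (sql_expression, sql_type) pairs.

    Hypothetical helper: raises instead of emitting SQL for non-numeric input.
    """
    for label, (_, sql_type) in (("start", start), ("end", end), ("step", step)):
        if sql_type not in NUMERIC_SQL_TYPES:
            raise TypeError(f"{label} must be numeric, got {sql_type}")
    return f"GENERATE_ARRAY({start[0]}, {end[0]}, {step[0]})"


int_col = ("`int64_col`", "INT64")
str_col = ("`string_col`", "STRING")

# Numeric columns produce SQL.
sql = generate_array_sql(int_col, int_col, int_col)

# String columns are rejected before any SQL is produced.
try:
    generate_array_sql(str_col, str_col, str_col)
    string_error = None
except TypeError as exc:
    string_error = str(exc)
```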

tswast · 2026-05-20T20:29:17Z


 class Flatten(Func):
-    pass
+    arg_types = {"this": True, "depth": False}


I wonder why it made this change?

Jetski says: "Enhanced SQLGlot (third_party/bigframes_vendored/sqlglot/expressions.py): Fixed a ValueError during test_flatten by updating the vendored Flatten class definitions. It now explicitly supports the optional depth parameter (arg_types = {"this": True, "depth": False}), matching BigQuery's actual FLATTEN function signature."
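For readers unfamiliar with the `arg_types` convention in the diff above, the sketch below mimics the idea with a simplified stand-in: each key is an allowed argument name and the boolean marks whether it is required. This is not sqlglot's code, and where the real ValueError was raised is not shown in the thread, so the validation logic here is an assumption.

```python
class Func:
    """Simplified stand-in for an expression base class; not sqlglot itself."""

    # Maps argument name -> whether the argument is required.
    arg_types = {"this": True}

    def __init__(self, **args):
        unexpected = sorted(set(args) - set(self.arg_types))
        if unexpected:
            raise ValueError(f"Unexpected args for {type(self).__name__}: {unexpected}")
        missing = [k for k, required in self.arg_types.items() if required and k not in args]
        if missing:
            raise ValueError(f"Missing args for {type(self).__name__}: {missing}")
        self.args = args


class FlattenBefore(Func):
    # Inherits arg_types, so only "this" is accepted.
    pass


class FlattenAfter(Func):
    # "depth" is accepted but optional.
    arg_types = {"this": True, "depth": False}


with_depth = FlattenAfter(this="col", depth=1)
without_depth = FlattenAfter(this="col")

try:
    FlattenBefore(this="col", depth=1)
    before_error = None
except ValueError as exc:
    before_error = str(exc)
```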

This reverts commit 37644c6.

TrevorBergeron · 2026-05-20T21:29:31Z

+_ARRAY_CONCAT_OP = googlesql.GoogleSqlScalarOp(
+    "ARRAY_CONCAT",
+    args=(googlesql.ArgSpec(), googlesql.ArgSpec()),
+    signature=lambda *args: None,
+)


Could we place the GoogleSqlScalarOp defs somewhere public/central so that we can also use them in the compiler? My intention with the GoogleSqlScalarOp is that it can be the single definition of a GoogleSQL function for both internal and user-facing purposes.
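One way to picture the "single definition" idea is a shared registry that both the user-facing wrapper and the compiler read from. The sketch below uses hypothetical names throughout (`ScalarOpDef`, `OPS`, `compile_call`); it does not reproduce the real `GoogleSqlScalarOp` or `ArgSpec` classes, and the two-argument arity for ARRAY_CONCAT simply mirrors the diff above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarOpDef:
    """Hypothetical central definition of a SQL scalar function."""

    name: str
    arity: int


# Single source of truth, importable by both the public API and the compiler.
OPS = {
    op.name: op
    for op in (
        ScalarOpDef("ARRAY_CONCAT", 2),
        ScalarOpDef("GENERATE_ARRAY", 3),
    )
}


def compile_call(op_name, *arg_sql):
    """Compiler side: emit SQL text using the shared definition."""
    op = OPS[op_name]
    if len(arg_sql) != op.arity:
        raise TypeError(f"{op.name} expects {op.arity} args, got {len(arg_sql)}")
    return f"{op.name}({', '.join(arg_sql)})"


def array_concat(left, right):
    """User-facing side: a thin wrapper over the same definition."""
    return compile_call("ARRAY_CONCAT", left, right)


concat_sql = array_concat("`a`", "`b`")
```

With this layout, adding a function means adding one `ScalarOpDef` entry rather than defining the op separately for each caller.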

TrevorBergeron · 2026-05-20T21:30:39Z

+SELECT
+  `rowindex`,
+  ARRAY_CONCAT(`string_col`,   `string_col`) AS `0`
+FROM `bigframes-dev`.`sqlglot_test`.`scalar_types` AS `bft_0`


These start to be a bit silly now that they all exercise the same exact emitter code eh?

feat: add more scalar array functions to bigframes.bigquery

be5602a

tswast requested review from a team as code owners May 20, 2026 20:11

tswast requested review from TrevorBergeron and removed request for a team May 20, 2026 20:11

Comment thread packages/bigframes/bigframes/bigquery/__init__.py

gemini-code-assist Bot reviewed May 20, 2026

fix generator for global functions

ac68eb3

tswast added 6 commits May 20, 2026 20:38

fix uv run

e85acc1

use active venv

27ae53d

attempt to tighthen typing

37644c6

Revert "attempt to tighthen typing"

90874c8

This reverts commit 37644c6.

loosen types

e2d597d

add some casts

4171618

TrevorBergeron reviewed May 20, 2026

oops

fa66d9a

tswast wants to merge 9 commits into main from tswast-bbq-array

2 participants