[FLINK-39401] Extend raw format to support line-delimiter option #27897
featzhang wants to merge 1 commit into apache:master
@featzhang Thanks for this contribution. The commit message in the PR should include the Jira ticket id.
8205f0c to 3161013
Thanks for the thorough review @spuru9 and @rmetzger! Here's what was addressed in this update:
3161013 to bb33242
Summary
This PR extends the `raw` format to support a new optional `raw.line-delimiter` config option.

When `raw.line-delimiter` is set: the message bytes are decoded using `raw.charset`, split by the delimiter (`String.split(Pattern.quote(delimiter), -1)`), and one `RowData` is emitted per segment via `deserialize(byte[], Collector<T>)`.

When `raw.line-delimiter` is not set, all existing behavior is preserved exactly (backward compatible).

Example SQL
Changes
- `RawFormatOptions`: `LINE_DELIMITER` `ConfigOption<String>` with no default value
- `RawFormatFactory`: `optionalOptions()`
- `RawFormatDeserializationSchema`: `deserialize(byte[], Collector)` to split by delimiter when set; add `lineDelimiter` field to `equals`/`hashCode`
- `RawFormatSerializationSchema`: `lineDelimiter` field to `equals`/`hashCode`
- `RawFormatFactoryTest`: `testLineDelimiterOption()`
- `RawFormatLineDelimiterTest`

Test Plan
- `RawFormatLineDelimiterTest` (9 tests), including:
  - `\n` delimiter → 3 rows
  - `||` → 3 rows
  - `\n` delimiter → correct splitting
  - `\n` → appends `\n`
  - `||` → appends `||`
- `RawFormatFactoryTest.testLineDelimiterOption()`: verifies factory produces schemas with correct delimiter
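
For reviewers who want to check the semantics outside Flink, here is a minimal plain-Java sketch of the splitting and appending behavior described in the Summary and Test Plan. It is not the code in this PR. The class `RawLineDelimiterSketch` and its methods are hypothetical stand-ins, segments are returned as a `List<String>` instead of being emitted as `RowData` through a `Collector`, and a `null` delimiter is assumed to model the "option not set" case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not one of the Flink classes touched by this PR. */
final class RawLineDelimiterSketch {

    /**
     * Decodes the payload with the given charset and splits it on the literal delimiter.
     * Pattern.quote keeps regex metacharacters such as '|' from being interpreted,
     * and the -1 limit keeps trailing empty segments.
     */
    static List<String> split(byte[] message, Charset charset, String delimiter) {
        String decoded = new String(message, charset);
        if (delimiter == null) {
            // Assumption: a null delimiter models "option not set", one record per message.
            return Collections.singletonList(decoded);
        }
        return Arrays.asList(decoded.split(Pattern.quote(delimiter), -1));
    }

    /** Encodes the value and, when a delimiter is set, appends it after the value. */
    static byte[] serialize(String value, Charset charset, String delimiter) {
        String out = (delimiter == null) ? value : value + delimiter;
        return out.getBytes(charset);
    }

    public static void main(String[] args) {
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(split("a\nb\nc".getBytes(utf8), utf8, "\n").size());
        System.out.println(split("a||b||c".getBytes(utf8), utf8, "||").size());
        System.out.println(new String(serialize("a", utf8, "||"), utf8));
    }
}
```

One consequence of the `-1` limit worth noting: a payload that ends with the delimiter yields a trailing empty segment in this sketch, so `"a\nb\n"` produces three segments, the last of which is empty.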