MDEV-39412: parse error reading tabs in ranges #4994

Closed
bsrikanth-mariadb wants to merge 50 commits into bb-12.3-MDEV-38805-dev-sprint-work-2 from 12.3-MDEV-39412-parse-error-reading-tabs-in-ranges

Conversation

Contributor

@bsrikanth-mariadb bsrikanth-mariadb commented Apr 27, 2026

Note:
When reading from information_schema.optimizer_context, one level of unescaping is already done (i.e. \\t becomes \t, and \\\\t becomes \\t).

With respect to the MDEV, there are two problems:

1. When reading from the SQL script file, the JSON parser cannot parse the range value in json_read_value() from json_lib.c:
   "ranges": [
   "(b\t\t\t\t\t\t) <= (b) <= (b???????)"
   ],
   The cause is mainly the \t\t sequences, and the result is a warning.
   It also stops loading the context into memory.
   Since a new table is created with empty data and without context, we get "Impossible WHERE noticed after reading const tables".

2. read_string() from sql_json_lib.cc makes an unescaping call while parsing the context. With this, \\t became \t. However, print_range() from opt_range.cc already escapes the values. The value "b\t\t\t" was in fact produced as "b\\t\\t\\t". Later, we try to compare the range values from the query with those from the context.

   A mismatch is found here because in one case the value is escaped, and in the other case the escaping has been removed.
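
The mismatch in problem 2 can be reproduced in isolation. The following is a minimal sketch, not the server code: escape_tabs() and unescape_tabs() are hypothetical stand-ins for the escaping done by print_range() and the unescaping done in read_string(), and they handle only tabs and backslashes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the escaping applied by print_range():
// a tab becomes the two characters '\' 't', a backslash is doubled.
std::string escape_tabs(const std::string &in)
{
  std::string out;
  for (char c : in)
  {
    if (c == '\t')
      out += "\\t";
    else if (c == '\\')
      out += "\\\\";
    else
      out += c;
  }
  return out;
}

// Hypothetical stand-in for the unescaping that read_string() performed.
std::string unescape_tabs(const std::string &in)
{
  std::string out;
  for (std::size_t i = 0; i < in.size(); i++)
  {
    if (in[i] == '\\' && i + 1 < in.size())
    {
      char next = in[++i];
      out += (next == 't') ? '\t' : next;
    }
    else
      out += in[i];
  }
  return out;
}

// Range text as the query side produces it: escaped once.
std::string range_from_query(const std::string &raw)
{
  return escape_tabs(raw);
}

// Range text on the context side after the extra unescaping on read.
std::string range_from_context_with_unescape(const std::string &raw)
{
  return unescape_tabs(escape_tabs(raw));
}

// The two sides disagree for any value that contains a tab.
void check_mismatch()
{
  assert(range_from_query("b\t\t\t") !=
         range_from_context_with_unescape("b\t\t\t"));
}
```

If the read side skips unescape_tabs(), both sides hold the same escaped text and the comparison succeeds.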

Solutions

For Problem 1: escape the range values.
This should be done while dumping range values into the context.

For Problem 2: remove the unescaping call in read_string().
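
As an illustration of the escaping in Problem 1, here is a rough sketch of what escaping at dump time could look like. escape_for_json_sketch() and has_raw_control_chars() are hypothetical helpers written for this example, not functions from json_lib.c, and the patch itself may do this differently. The sketch assumes a raw byte string as input and escapes only what JSON requires inside a string: the double quote, the backslash, and control characters below 0x20.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: escape a raw value so it can be placed between
// JSON double quotes.
std::string escape_for_json_sketch(const std::string &in)
{
  std::string out;
  for (unsigned char c : in)
  {
    switch (c)
    {
    case '"':  out += "\\\""; break;
    case '\\': out += "\\\\"; break;
    case '\t': out += "\\t";  break;
    case '\n': out += "\\n";  break;
    case '\r': out += "\\r";  break;
    default:
      if (c < 0x20)
      {
        // Remaining control characters use the \uXXXX form.
        char buf[7];
        std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
        out += buf;
      }
      else
        out += static_cast<char>(c);
    }
  }
  return out;
}

// Simplified check: a JSON string must not contain raw control characters.
bool has_raw_control_chars(const std::string &s)
{
  for (unsigned char c : s)
    if (c < 0x20)
      return true;
  return false;
}

// A range value with raw tabs is not valid inside a JSON string,
// but its escaped form is.
void check_escaped_range()
{
  const std::string raw = "(b\t\t) <= (b)";
  assert(has_raw_control_chars(raw));
  assert(!has_raw_control_chars(escape_for_json_sketch(raw)));
}
```

The point of the sketch is only that a value containing raw tabs has to be escaped before it is written into the "ranges" array, so that the JSON parser can read it back.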

…LAIN in the replay.

The last command is set "optimizer_replay_context=null"...
This is needed to be able to run replay for many queries in a row.
Need to discuss whether we should ship it like this.
If the previous command was EXPLAIN, it would use its context.
If the previous command was not an EXPLAIN, nothing would be printed.
It seems to get mtr into a state where the replay server is unusable.
A whole class of errors was not printed to the test output.
Usage:
--disable_replay next_query <arbitrary reason text>

If the next query after this is an EXPLAIN, its processing will be disabled.
Disable replay use for the whole file.
…utput.

I don't think we can print warnings, we would get spurious test failures.
…error

Disable the testcase with explanation.
…>stats.records' failed

The problem was caused by this scenario:
- The same table can be used multiple times in the query.
- A SELECT may be resolved without ever entering make_join_statistics()
  and set_statistics_for_table(). This will happen if the SELECT is
  handled via opt_sum_query, for example.
- Then, Context Capture code can take this TABLE object and save its
  uninitialized statistics into the Optimizer Context.
- When we then attempt to use uninitialized statistics in the other
  SELECT that is handled in the regular way, we get a failure.

The fix:
- Do not save/restore table->used_stat_records. Do save/restore
  table->file->stats.records and let the SQL layer copy it to
  used_stat_records (or use EITS data).
Handle it by saving the MIN/MAX rows into Optimizer Context.
Like we do it for const tables.
Instead of REPLACE INTO, use
SET STATEMENT sql_mode={remove STRICT_...TABLES} REPLACE INTO.
Make Optimizer Context include
  SET character_set_client=...;
  SET NAMES ... COLLATE ...;
Read_container_value -> Read_array.
Read_list_of_context -> Read_array_into_list
@spetrunia
Member

Please rebase over bb-12.3-MDEV-39368-test-replay branch which now has unittest/sql/json_reader-t.cc; then add a unit test for reading escaped JSON data.

Note:
When reading from information_schema.optimizer_context, one level of
unescaping is already done (i.e. \\t becomes \t, and \\\\t becomes \\t).

With respect to the MDEV, there are two problems:

1.
When reading from the SQL script file, the JSON parser cannot parse
the range value in json_read_value() from json_lib.c:
"ranges": [
            "(b\t\t\t\t\t\t) <= (b) <= (b???????)"
          ],
The cause is mainly the \t\t sequences, and the result is a warning.
It also stops loading the context into memory.
Since a new table is created with empty data and without context,
we get "Impossible WHERE noticed after reading const tables".

2.
read_string() from sql_json_lib.cc makes an unescaping call
while parsing the context. With this, \\t became \t.
However, print_range() from opt_range.cc already escapes the values.
The value "b\t\t\t" was in fact produced as "b\\t\\t\\t".
Later, we try to compare the range values from the query with those
from the context.

A mismatch is found here because in one case the value is escaped,
and in the other case the escaping has been removed.

Solutions
=========
For Problem 1: escape the range values.
This should be done while dumping range values into the context.

For Problem 2: remove the unescaping call in read_string().
@bsrikanth-mariadb bsrikanth-mariadb force-pushed the 12.3-MDEV-39412-parse-error-reading-tabs-in-ranges branch from 503361c to b100fbf Compare May 6, 2026 17:01
@bsrikanth-mariadb bsrikanth-mariadb marked this pull request as draft May 6, 2026 17:05
@bsrikanth-mariadb
Contributor Author

Closing this; I created a new PR, #5049, against branch bb-12.3-MDEV-39368-test-replay instead.
