Describe the bug
XML parsing fails on valid nested numeric array input because GEOS pre-validates XML attribute values with generated std::regex patterns. For valid 2D real-array input such as sourceCoordinates="{ { 1575, 2400, 2900 } }", Apple libc++ throws:
std::regex_error:
The complexity of an attempted match against a regular expression exceeded a pre-set level.
The exception is thrown before the typed array parser ever sees the value. The input appears valid; the failure is in the regex-based pre-validation.
There is also a secondary diagnostic issue: the user-facing GEOS error message is blank:
***** Unknown
***** Rank
***** Message :
To Reproduce
Steps to reproduce the behavior:
- Build GEOS on macOS/aarch64 with Apple clang/libc++.
- Run the wave propagation smoke problem:
bin/geosx -i ../inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml
- Observe the blank GEOS error after the first solver is added:
Opened XML file: .../inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml
Solvers: adding AcousticSEM acousticSolver
***** Unknown
***** Rank
***** Message :
Using LLDB with a C++ exception breakpoint shows the real first exception:
std::regex_error:
The complexity of an attempted match against a regular expression exceeded a pre-set level.
Relevant stack trace:
xmlWrapper::validateString
xmlWrapper::stringToInputVariable<Array<double,2>>
xmlWrapper::readAttributeAsType<real64_array2d>
Wrapper<real64_array2d>::processInputFile
Group::processInputFile
ProblemManager::parseXMLDocument
The failing XML attribute is:
sourceCoordinates="{ { 1575, 2400, 2900 } }"
in:
inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml
Expected behavior
Valid 2D real-array XML input should parse successfully. GEOS should not reject valid input because the regex engine exceeds its implementation-specific complexity limit.
If the input is actually malformed, GEOS should report a useful XML parsing error that includes the node, attribute, value, and expected format.
Platform (please complete the following information):
Machine: local macOS/aarch64 machine
Compiler: Apple clang 17.0.0
MPI: Open MPI 5.0.9
GEOS Version: develop
Additional context
The immediate failure occurs in:
src/coreComponents/dataRepository/xmlWrapper.cpp
inside:
xmlWrapper::validateString(...)
at the std::regex_match(...) call.
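One way to keep a regex-engine failure from being mistaken for malformed input is to catch std::regex_error around the match and rethrow it with the attribute context attached. The following is a minimal sketch under stated assumptions: `matchOrExplain` and its parameters are hypothetical and do not reflect the actual signature of xmlWrapper::validateString, and the pattern is compiled inside the function only to keep the example self-contained.

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical wrapper, not GEOS code: returns whether `value` matches
// `pattern`, and converts any std::regex_error (for example the
// error_complexity case from this report) into an error that names the
// attribute and value being validated.
inline bool matchOrExplain(std::string const& attribute,
                           std::string const& value,
                           std::string const& pattern) {
  try {
    std::regex const re(pattern);
    return std::regex_match(value, re);
  } catch (std::regex_error const& e) {
    throw std::runtime_error("regex engine failure while validating attribute '" +
                             attribute + "' with value '" + value + "': " + e.what());
  }
}

// Usage check: an ordinary match and an ordinary mismatch are unaffected.
inline void checkMatchOrExplain() {
  assert(matchOrExplain("count", "42", "[0-9]+"));
  assert(!matchOrExplain("count", "x", "[0-9]+"));
}
```

With a wrapper of this kind, a false return still means "value does not match the expected format", while an exception means the validator itself failed.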
The array regex is generated by:
src/coreComponents/codingUtilities/RTTypes.cpp
in constructArrayRegex(...).
Suggested fix: avoid using std::regex as the authoritative runtime validator for scalar and especially array input. The typed parser should be authoritative. For arrays, use LvArray::input::stringToArray or a small purpose-built linear parser to validate braces, commas, dimensions, and scalar tokens. Keep the rtTypes format descriptions for diagnostics/schema/docs, but avoid matching nested numeric arrays with large generated regexes at runtime.
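The "small purpose-built linear parser" option could look roughly like the following. This is a minimal sketch, not GEOS code: the `sketch` namespace, `Cursor`, `isValidRealArray`, and the `depth` parameter are all hypothetical. It also makes assumptions that GEOS may not share: scalar tokens are checked with std::strtod (which accepts forms such as `inf` or hex floats), empty lists are accepted, and row lengths are not checked for consistency.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <string_view>

namespace sketch {  // hypothetical namespace, not part of GEOS

// Read position over the attribute text; every helper only moves forward.
struct Cursor {
  std::string_view s;
  std::size_t pos = 0;

  void skipSpace() {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
  }
  // Consume `c` if it is the next non-space character.
  bool eat(char c) {
    skipSpace();
    if (pos < s.size() && s[pos] == c) { ++pos; return true; }
    return false;
  }
};

// Accept one scalar token: everything up to the next delimiter or space,
// which std::strtod must consume completely.
inline bool parseScalar(Cursor& c) {
  c.skipSpace();
  std::size_t const start = c.pos;
  while (c.pos < c.s.size() && c.s[c.pos] != ',' && c.s[c.pos] != '{' &&
         c.s[c.pos] != '}' && !std::isspace(static_cast<unsigned char>(c.s[c.pos]))) {
    ++c.pos;
  }
  if (c.pos == start) return false;  // empty token, e.g. "1, , 2"
  std::string const token(c.s.substr(start, c.pos - start));
  char* end = nullptr;
  std::strtod(token.c_str(), &end);
  return end == token.c_str() + token.size();
}

// depth == 0 expects a scalar; otherwise "{ item, item, ... }" of depth - 1.
inline bool parseLevel(Cursor& c, int depth) {
  if (depth == 0) return parseScalar(c);
  if (!c.eat('{')) return false;
  if (c.eat('}')) return true;  // assumption: an empty list is valid
  do {
    if (!parseLevel(c, depth - 1)) return false;
  } while (c.eat(','));
  return c.eat('}');
}

// True if `text` is a brace-nested list of reals with exactly `depth` levels.
inline bool isValidRealArray(std::string_view text, int depth) {
  Cursor c{text};
  if (!parseLevel(c, depth)) return false;
  c.skipSpace();
  return c.pos == text.size();  // reject trailing characters
}

}  // namespace sketch

// Usage check with the attribute value from this report.
inline void checkReportedValue() {
  assert(sketch::isValidRealArray("{ { 1575, 2400, 2900 } }", 2));
  assert(!sketch::isValidRealArray("{ { 1575, 2400, 2900 }", 2));
}
```

Each character is scanned a bounded number of times, so the cost grows linearly with the attribute length and there is no backtracking for the engine to give up on.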
There is also a secondary error-reporting bug: xmlWrapper::processInputException() throws an InputError directly, but the top-level geos::Exception catch flushes the global diagnostic object instead of reporting e.what(). Because nothing was recorded in that diagnostic object, the output is the blank ***** Unknown message, which obscures the real regex failure.
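A possible shape for the reporting fix is to fall back to e.what() whenever the recorded diagnostic is empty. The sketch below is illustrative only: `InputError` here is a local stand-in type, and `formatTopLevelError` and its `recordedDiagnostic` parameter are hypothetical names that do not correspond to the actual GEOS diagnostic object or its API.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Local stand-in for the GEOS exception type (assumption, for illustration).
struct InputError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical formatter: prefer the recorded diagnostic text, but never
// print an empty message while the exception itself carries one.
inline std::string formatTopLevelError(std::string const& recordedDiagnostic,
                                       std::exception const& e) {
  std::string const message =
      recordedDiagnostic.empty() ? std::string(e.what()) : recordedDiagnostic;
  return "***** Message : " + message;
}

// Usage check: with nothing recorded, the exception text is reported.
inline void checkFormatTopLevelError() {
  InputError const err("regex complexity exceeded");
  assert(formatTopLevelError("", err) ==
         "***** Message : regex complexity exceeded");
}
```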
NOTE: This error was diagnosed using Codex.