Regex error on macOS  #4053

@rrsettgast

Describe the bug
XML parsing fails on valid nested numeric array input because GEOS pre-validates XML attribute values with generated std::regex patterns. For valid 2D real-array input such as sourceCoordinates="{ { 1575, 2400, 2900 } }", Apple libc++ throws:

std::regex_error:
The complexity of an attempted match against a regular expression exceeded a pre-set level.

This happens before the actual typed array parser gets to parse the value. The input appears valid; the failure is in regex-based pre-validation.
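To illustrate the distinction between "the input is invalid" and "the regex engine gave up", here is a minimal sketch of a wrapper around std::regex_match. The names PreValidation and preValidate are hypothetical and not part of GEOS; the only library facts assumed are that std::regex_error exposes code() and that std::regex_constants defines error_complexity and error_stack.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch; not part of GEOS.
enum class PreValidation { Match, NoMatch, EngineLimit };

// Runs std::regex_match, but reports an engine resource limit
// (error_complexity or error_stack) as "inconclusive" instead of
// letting it look like a rejection of the input.
PreValidation preValidate( std::string const & value, std::regex const & pattern )
{
  try
  {
    return std::regex_match( value, pattern ) ? PreValidation::Match
                                              : PreValidation::NoMatch;
  }
  catch( std::regex_error const & e )
  {
    if( e.code() == std::regex_constants::error_complexity ||
        e.code() == std::regex_constants::error_stack )
    {
      return PreValidation::EngineLimit;
    }
    throw; // any other regex error is a genuine problem; rethrow it
  }
}
```

With something like this, an EngineLimit result could fall through to the typed parser rather than aborting input processing. Whether that is the right policy for GEOS is an open question.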

There is also a secondary diagnostic issue: the user-facing GEOS error message is blank:

***** Unknown
***** Rank
***** Message :

To Reproduce
Steps to reproduce the behavior:

  1. Build GEOS on macOS/aarch64 with Apple clang/libc++.
  2. Run the wave propagation smoke problem:
bin/geosx -i ../inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml
  3. Observe the blank GEOS error after the first solver is added:
Opened XML file: .../inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml
Solvers: adding AcousticSEM acousticSolver
***** Unknown
***** Rank
***** Message :

Using LLDB with a C++ exception breakpoint shows the real first exception:

std::regex_error:
The complexity of an attempted match against a regular expression exceeded a pre-set level.

Relevant stack trace:

xmlWrapper::validateString
xmlWrapper::stringToInputVariable<Array<double,2>>
xmlWrapper::readAttributeAsType<real64_array2d>
Wrapper<real64_array2d>::processInputFile
Group::processInputFile
ProblemManager::parseXMLDocument

The failing XML attribute is:

sourceCoordinates="{ { 1575, 2400, 2900 } }"
in:

inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml

Expected behavior
Valid 2D real-array XML input should parse successfully. GEOS should not reject valid input because the regex engine exceeds its implementation-specific complexity limit.

If the input is actually malformed, GEOS should report a useful XML parsing error that includes the node, attribute, value, and expected format.
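As a rough sketch of what such a message could carry, the helper below assembles the four pieces of information listed above. The function name formatAttributeError and the message layout are assumptions made for illustration; they are not an existing GEOS API or format.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper; the message layout is an assumption, not GEOS's format.
std::string formatAttributeError( std::string const & node,
                                  std::string const & attribute,
                                  std::string const & value,
                                  std::string const & expected )
{
  std::ostringstream os;
  os << "XML parsing error in node <" << node << ">, attribute '" << attribute
     << "': value \"" << value << "\" does not match the expected format: "
     << expected;
  return os.str();
}
```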

Platform (please complete the following information):
Machine: local macOS/aarch64 machine
Compiler: Apple clang 17.0.0
MPI: Open MPI 5.0.9
GEOS Version: develop

Additional context
The immediate failure occurs in:

src/coreComponents/dataRepository/xmlWrapper.cpp
inside:

xmlWrapper::validateString(...)
at the std::regex_match(...) call.

The array regex is generated by:

src/coreComponents/codingUtilities/RTTypes.cpp
in constructArrayRegex(...).

Suggested fix: avoid using std::regex as the authoritative runtime validator for scalar and especially array input. The typed parser should be authoritative. For arrays, use LvArray::input::stringToArray or a small purpose-built linear parser to validate braces, commas, dimensions, and scalar tokens. Keep the rtTypes format descriptions for diagnostics/schema/docs, but avoid matching nested numeric arrays with large generated regexes at runtime.
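A minimal sketch of what a purpose-built linear validator could look like is below. It is not the LvArray::input::stringToArray implementation and does not use any GEOS or LvArray API. All names in the sketch namespace are hypothetical. It assumes scalar tokens are whatever std::strtod accepts (which includes inf, nan, and hex floats, and depends on the C locale), that an empty list "{}" is allowed, and it does not check that rows have equal length.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

namespace sketch // hypothetical names, for illustration only
{

// Advances pos past any whitespace.
inline void skipSpace( std::string const & s, std::size_t & pos )
{
  while( pos < s.size() && std::isspace( static_cast< unsigned char >( s[pos] ) ) )
  {
    ++pos;
  }
}

// Consumes one real-valued token via std::strtod (which also skips
// leading whitespace); returns false if no number is found at pos.
inline bool parseScalar( std::string const & s, std::size_t & pos )
{
  char const * const begin = s.c_str() + pos;
  char * end = nullptr;
  std::strtod( begin, &end );
  if( end == begin )
  {
    return false;
  }
  pos += static_cast< std::size_t >( end - begin );
  return true;
}

// Validates "{ item, item, ... }". Items are scalars when depth == 1
// and nested lists of depth - 1 otherwise. Each character is visited
// once, so the cost is linear in the input length.
inline bool parseLevel( std::string const & s, std::size_t & pos, int depth )
{
  skipSpace( s, pos );
  if( pos >= s.size() || s[pos] != '{' ) { return false; }
  ++pos;
  skipSpace( s, pos );
  if( pos < s.size() && s[pos] == '}' ) { ++pos; return true; } // empty list
  while( true )
  {
    bool const ok = ( depth == 1 ) ? parseScalar( s, pos )
                                   : parseLevel( s, pos, depth - 1 );
    if( !ok ) { return false; }
    skipSpace( s, pos );
    if( pos >= s.size() ) { return false; }          // missing closing brace
    if( s[pos] == '}' ) { ++pos; return true; }
    if( s[pos] != ',' ) { return false; }            // items must be comma-separated
    ++pos;
  }
}

// True if value is a brace-delimited real array nested exactly ndim deep,
// with nothing but whitespace after the outermost closing brace.
inline bool isValidRealArray( std::string const & value, int ndim )
{
  if( ndim < 1 ) { return false; }
  std::size_t pos = 0;
  if( !parseLevel( value, pos, ndim ) ) { return false; }
  skipSpace( value, pos );
  return pos == value.size();
}

} // namespace sketch
```

For the failing attribute, sketch::isValidRealArray( "{ { 1575, 2400, 2900 } }", 2 ) returns true, while the same string checked with ndim = 1 or with a doubled comma is rejected. There is no backtracking, so there is no complexity limit to exceed.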

There is also a secondary error-reporting bug: xmlWrapper::processInputException() throws a direct InputError, but the top-level geos::Exception catch flushes the global diagnostic object instead of reporting e.what(). In this case that produces the blank ***** Unknown message, obscuring the real regex failure.
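One possible shape for the reporting fix is sketched below: fall back to e.what() whenever the shared diagnostic text is empty. The InputError type here is a stand-in derived from std::runtime_error, and topLevelMessage is a hypothetical helper. Neither reflects the actual GEOS class hierarchy or catch block, which would need to be checked against the source.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the GEOS exception type; only what() matters here.
struct InputError : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Builds the "Message" line of the top-level report. If the shared
// diagnostic text is empty, use the exception's own what() so the
// original cause is never dropped.
std::string topLevelMessage( std::exception const & e,
                             std::string const & globalDiagnostic )
{
  std::string const text = globalDiagnostic.empty() ? std::string( e.what() )
                                                    : globalDiagnostic;
  return "***** Message : " + text;
}
```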

NOTE: This error was diagnosed using codex.

Labels: type: bug, type: new
