## Tokenizer
- [x] add pass to check input files are valid as utf-8
  - https://github.com/Rust-GCC/gccrs/issues/2309
  - https://github.com/Rust-GCC/gccrs/pull/2374
- [x] add util function to check XID_Start/XID_Continue to libcpp
  - XID_Start, XID_Continue
    - https://www.unicode.org/Public/14.0.0/ucd/DerivedCoreProperties.txt
  - https://github.com/Rust-GCC/gccrs/pull/2284
- [x] identifiers
  - https://github.com/Rust-GCC/gccrs/pull/2284
  - https://github.com/Rust-GCC/gccrs/pull/2338
- [x] lifetime label
  - https://github.com/Rust-GCC/gccrs/pull/2284
  - https://github.com/Rust-GCC/gccrs/pull/2338
- [x] raw identifiers
  - https://github.com/Rust-GCC/gccrs/issues/2309
  - https://github.com/Rust-GCC/gccrs/pull/2338
- [x] string literal
- [x] char literal
- [x] Unicode escape `\uxxxx`
- loop label
  - not implemented yet
- [x] parse whitespaces
  - https://doc.rust-lang.org/reference/whitespace.html?highlight=whitespaces#whitespace
  - https://github.com/Rust-GCC/gccrs/pull/2307
  - https://github.com/Rust-GCC/gccrs/pull/2339
- [x] normalize identifiers (including labels) to NFC
  - normalize function: https://github.com/Rust-GCC/gccrs/issues/2379
  - use it during tokenization: https://github.com/Rust-GCC/gccrs/pull/2489

## Parser
- [x] check for `crate_name` attributes
  - https://doc.rust-lang.org/reference/crates-and-source-files.html#the-crate_name-attribute
  - Unicode Alphabetic
    - https://www.unicode.org/reports/tr44/#Alphabetic
    - https://www.unicode.org/Public/14.0.0/ucd/DerivedCoreProperties.txt
  - Unicode Numeric (search with `Nd;` (decimal digits), `Nl;` (letter-like numeric), or `No;` (other numeric))
    - https://doc.rust-lang.org/std/primitive.char.html#method.is_numeric
  - `is_numeric` and `is_alphabetic` functions are added via
    - https://github.com/Rust-GCC/gccrs/pull/2425
  - Modify checker
    - https://github.com/Rust-GCC/gccrs/pull/2463

## Backend
- [x] punycode (v0 mangling)
  - https://github.com/Rust-GCC/gccrs/pull/2533
  - Related to https://github.com/Rust-GCC/gccrs/issues/305

## Others
- [ ] linter
  - https://rust-lang.github.io/rfcs/2457-non-ascii-idents.html
- [x] Fix `#[no_mangle]`
  - https://github.com/Rust-GCC/gccrs/issues/2548
- [x] Fix legacy name mangling
  - https://github.com/Rust-GCC/gccrs/issues/2545

## TODOs
need more tests!
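As one possible starting point for such tests, the `crate_name` rule from the Rust Reference (the name must not be empty and may contain only Unicode alphanumeric or `_` characters) can be sketched in Rust. The helper name `is_valid_crate_name` is hypothetical and is not part of gccrs, whose checker lives in the compiler itself; the sketch only relies on the standard library's `char::is_alphanumeric`, which combines the Alphabetic property with the `Nd`, `Nl`, and `No` categories.

```rust
/// Hypothetical helper (not a gccrs function): checks whether `name`
/// would be accepted as a `crate_name` value under the Reference's rule.
fn is_valid_crate_name(name: &str) -> bool {
    // The name must be non-empty, and every character must be either
    // Unicode alphanumeric (alphabetic or numeric) or an underscore.
    !name.is_empty() && name.chars().all(|c| c.is_alphanumeric() || c == '_')
}
```

Names such as `my_crate` or `日本語` pass this check, while an empty string or `my-crate` does not.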