[IN] Global register allocator by matgla · Pull Request #5 · matgla/tinycc

matgla · 2026-02-07T15:49:52Z

and a lot of other things + benchmarks and tests

…ot, not fully verified, but works for tests

added benchmark

Copilot

Pull request overview

This PR significantly expands ARM Thumb (ARMv8-M) backend infrastructure and documentation while removing unused ARM64 stubs, aiming to support a global register allocator and related optimizations, benchmarks, and test workflows.

Changes:

Added ARM Thumb backend components (scratch register management, call-site tracking, ABI classification, arch/FPU configs).
Added extensive design/validation documentation and benchmark analysis notes for IR/codegen optimizations.
Added container + CI scaffolding and repo tooling metadata (Dockerfile, workflows, submodules, editor configs); removed ARM64 dummy assembler/linker code.

Reviewed changes

Copilot reviewed 48 out of 536 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
asm_token_fix_plan.md	Documents migration plan for new ASM token layout in Thumb assembler
arm64-link.c	Removes AArch64 linker/relocation implementation from this branch
arm64-asm.c	Removes ARM64 dummy assembler stub
arm-tok.h	Removes legacy ARM token definitions header
arm-thumb-scratch.c	Adds scratch register allocation/save/restore helpers for Thumb codegen
arm-thumb-defs.h	Introduces Thumb target defs, ABI enums, generator state structs, suffix parsing types
arm-thumb-callsite.c	Adds call-site table + IR scan utilities for ABI call layout reconstruction
arch/fpu/arm/fpv5-sp-d16.h	Declares FPv5-SP-D16 floating-point config
arch/fpu/arm/fpv5-sp-d16.c	Defines FPv5-SP-D16 floating-point capabilities/config
arch/armv8m.c	Adds ARMv8-M ArchitectureConfig defaults
arch/arm_aapcs.c	Adds AAPCS-like argument classification + call layout capacity helpers
TCC_GCC_CODEGEN_COMPARISON.md	Documents codegen gaps vs GCC and targeted optimizations
TAL_TOKSTR_OPTIMIZATION.md	Documents plan to reduce peak memory by changing tok_str allocation strategy
REFACTORING_SUMMARY.md	Summarizes “architecture independence” refactor and new machine/opt modules
POSTINC_EMBEDDED_DEREF_PLAN.md	Documents plan for post-inc fusion when deref is embedded in expressions
OPTIMIZATION_VALIDATION_REPORT.md	Records which optimization phases are implemented/validated
OPTIMIZATION_QUICK_WIN_1_FP_CACHE.md	Documents FP-offset cache implementation and tests
OPTIMIZATION_PLAN_V2.md	Updated optimization plan and analysis (LR usage, indexed loads, post-inc, etc.)
OPTIMIZATION_FP_CACHE_IMPROVEMENT_PLAN.md	Documents FP cache “register-agnostic” approach and limitations
MLA_WITH_DEREFERENCES_PLAN.md	Documents plan to enable MLA fusion when MUL operands are dereferences
LICM_IMPLEMENTATION_STATUS.md	Status/report on LICM implementation and known bugs
LICM_IMPLEMENTATION_PLAN.md	Detailed LICM design plan
LAZY_SECTION_LOADING_ALL_PLAN.md	Notes reverted attempt to defer-load all sections and failure reasons
LAZY_SECTION_LOADING.md	Documents lazy section loading design/implementation
FP_CACHE_IR_LEVEL_PLAN.md	Proposes IR-level FP offset caching as a cleaner approach
Dockerfile	Adds Ubuntu-based build/test container with ARM toolchain + QEMU
BUBBLE_SORT_COMPARISON.md	Detailed bubble-sort codegen diff analysis and optimization priorities
AGENTS.md	Adds project agent/guide documentation, structure, build/test notes
.vscode/tasks.json	Adds QEMU run/stop tasks
.vscode/settings.json	Adds terminal auto-approve configuration for tooling
.vscode/launch.json	Adds debug configs (tcc under gdb, QEMU gdb attach)
.vscode/c_cpp_properties.json	Adds IntelliSense settings/defines for ARM Thumb target
.gitmodules	Adds new submodules (newlib, pico-sdk, mibench)
.github/workflows/docker-build.yml	Adds workflow to build/push container image to GHCR
.github/workflows/ci.yml	Adds CI workflow running tests inside GHCR container
.github/prompts/plan-aliasAnalysisStoreLoadOptimizations.prompt.md	Adds prompt doc for alias-analysis/store-load optimization plan
.dockerignore	Adds docker ignore rules for repo build artifacts and tooling dirs
.clang-format	Updates formatting rules (including brace style + column limit)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-07T21:36:09Z

arch/fpu/arm/fpv5-sp-d16.h

+
+#pragma once
+
+#define USING_GLOBALS


Defining USING_GLOBALS in a header is very likely to cause multiple-definition / ODR-style linker errors because every translation unit including this header will pull in global definitions from tcc.h. Move #define USING_GLOBALS into exactly one .c file (typically the implementation file) or remove it entirely from the header and include only what’s needed (e.g., forward-declare FloatingPointConfig or include a smaller header that doesn’t depend on USING_GLOBALS).

Suggested change

#define USING_GLOBALS

Copilot · 2026-02-07T21:36:09Z

arch/arm_aapcs.c

+  if (layout->capacity < needed)
+  {
+    int new_cap = layout->capacity ? layout->capacity : 8;
+    while (new_cap < needed)
+      new_cap *= 2;
+    // layout->locs = (TCCAbiArgLoc *)tcc_realloc(layout->locs, sizeof(TCCAbiArgLoc) * new_cap);
+    /* Zero new tail for determinism. */
+    // memset(&layout->locs[layout->capacity], 0, sizeof(TCCAbiArgLoc) * (new_cap - layout->capacity));
+    layout->capacity = new_cap;
+  }


layout->capacity is increased without reallocating the backing arrays (the tcc_realloc/memset calls are commented out). This can desynchronize capacity from actual allocated sizes and can also cause tcc_abi_call_layout_ensure_capacity() to incorrectly skip reallocation later, leading to out-of-bounds writes. Replace this block with a call to tcc_abi_call_layout_ensure_capacity(layout, needed) (or fully re-enable correct realloc/zero-init here) and keep capacity consistent with the real allocations.

Suggested change

if (layout->capacity < needed)

{

int new_cap = layout->capacity ? layout->capacity : 8;

while (new_cap < needed)

new_cap *= 2;

// layout->locs = (TCCAbiArgLoc *)tcc_realloc(layout->locs, sizeof(TCCAbiArgLoc) * new_cap);

/* Zero new tail for determinism. */

// memset(&layout->locs[layout->capacity], 0, sizeof(TCCAbiArgLoc) * (new_cap - layout->capacity));

layout->capacity = new_cap;

}

tcc_abi_call_layout_ensure_capacity(layout, needed);

Copilot · 2026-02-07T21:36:10Z

arm-thumb-scratch.c

+    for (int i = 0; i < count; ++i)
+    {
+      if (regs_to_save & (1u << result.regs[i]))
+      {
+        if (scratch_push_count < 128)
+          scratch_push_stack[scratch_push_count++] = result.regs[i];
+        else
+          tcc_error("compiler_error: scratch register push stack overflow (>128 pushes without restore)");
+      }
+    }


When you emit a single th_push(regs_to_save) with a register mask, the architectural stack layout is in register-number order, but scratch_push_stack records registers in allocation order (result.regs[]). If allocation order differs from ascending register number order, later POP validation/deferral and the “auto restore” path can pop registers in an order that doesn’t match what’s actually on the stack, restoring wrong values. A robust fix is to track pushes as masks on a push-mask stack (LIFO of uint32_t masks) instead of per-register, or record the saved registers in the exact order they are stored on the stack (ascending register number) for each mask push.

Copilot · 2026-02-07T21:36:10Z

arm-thumb-scratch.c

+
+    if (can_restore_all && check_count > 0)
+    {
+      fprintf(stderr, "[SCRATCH] Popping registers (mask=0x%x) in single instruction\n", allocs->saved_mask);


These fprintf(stderr, ...) calls are unconditional (not behind #ifdef ARM_THUMB_DEBUG_SCRATCH) and will spam stderr in normal builds/tests. Wrap them in the same debug macro used elsewhere in this file (or route through a project logging macro that can be compiled out).

Copilot · 2026-02-07T21:36:10Z

arm-thumb-scratch.c

+    }
+    else
+    {
+      fprintf(stderr, "[SCRATCH] WARNING: restore_scratch_regs out of order; deferring POP\n");


These fprintf(stderr, ...) calls are unconditional (not behind #ifdef ARM_THUMB_DEBUG_SCRATCH) and will spam stderr in normal builds/tests. Wrap them in the same debug macro used elsewhere in this file (or route through a project logging macro that can be compiled out).

Copilot · 2026-02-07T21:36:11Z

.vscode/settings.json

+        "make": true,
+        "heaptrack_print": true


Committing editor settings that auto-approve terminal commands can create a footgun for contributors (commands may run without explicit confirmation). Consider removing this from the repo, scoping it to a local/user settings example, or setting these to false by default with documentation on how to opt-in locally.

Suggested change

"make": true,

"heaptrack_print": true

"make": false,

"heaptrack_print": false

Copilot · 2026-02-07T21:36:11Z

.clang-format

 BasedOnStyle: LLVM
+ColumnLimit: 120
+AllowShortFunctionsOnASingleLine: None
+BreakBeforeBraces: Allman


BreakBeforeBraces: Allman conflicts with the brace style documented in AGENTS.md (K&R style). This will cause churn and inconsistent formatting expectations. Align the documented style and .clang-format settings (either update the docs to Allman or adjust .clang-format to match the documented K&R approach).

Suggested change

BreakBeforeBraces: Allman

BreakBeforeBraces: Attach

matgla added 30 commits December 6, 2025 09:59

legacy code removed

e2c962b

working on ir code generation

9285e60

wip

c241cc0

wip

5c80815

wip

44c8398

fixed bug in local variable assignment

dd6006f

working on ir

dc961db

more examples

6a67a66

wip

1f7c870

hello world example is running fine

914d1ee

trying to add qemu

fdeeebb

tinycc produces working hello world

ecf710e

trying to add test

069f966

working on tests

235649c

working on tests

6dd0d74

working on if support

f80724d

first basic tests are working, added linker script support with copil…

f809bde

…ot, not fully verified, but works for tests

more tests are working

d312a61

added a lot of fixes, array tests is still failing

b8eb7cd

more fixes, arrays to be fixed

218bb07

fixed stubs lines processing, fixed 05_array test

e2b013e

fixed switch/case

1578535

fixed while and do/while

60a2c20

fixed all tests except pointers

e52b1bd

pointers partially fixed, but global structs access is wrong

4d9566b

literal pool access optimization, part one

f74d459

fixed struct test case, data is now correctly handled in linker

8f5777f

added more tests

b1c9042

fixed precedence tests

038e09d

added more tests for logical operations

4c6f113

matgla added 25 commits January 27, 2026 19:34

all tests are passing before full removal

955e58e

more optimization arrived

90b7cc3

added benchmark

fixed bug with returns

f05c1fb

benchmarking works

eb42383

fixes for benchmarks

1c6a73e

plan for next optimizations

8e93934

refactoring of tccir

5089e4b

tccir split

d65c736

updated plan

72aecf0

added mla

cd490bf

added test for mla

242f034

tests are working

ac14a95

workflow

22d614f

mob

4acfdb5

changed newlib

aa88d6e

fixes for ci

06941b9

wip

bfc9929

changed workflow to use container

e71587c

ci fix

3ee5558

permission change

1ce7886

aa

4dffbf6

m

66ee46f

some fixes

2d48b45

wip

adfad8a

compilation on macos

f72ddd2

matgla requested a review from Copilot February 7, 2026 21:33

Copilot AI reviewed Feb 7, 2026

View reviewed changes

safety fix

649bfdd

matgla merged commit d1dae41 into mob Feb 7, 2026
1 check passed

matgla deleted the otherArchRemoval branch February 7, 2026 21:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[IN] Global register allocator#5

[IN] Global register allocator#5
matgla merged 142 commits intomobfrom
otherArchRemoval

matgla commented Feb 7, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 7, 2026

Uh oh!

Copilot AI Feb 7, 2026

Uh oh!

Copilot AI Feb 7, 2026

Uh oh!

Copilot AI Feb 7, 2026

Uh oh!

Copilot AI Feb 7, 2026

Uh oh!

Copilot AI Feb 7, 2026

Uh oh!

Copilot AI Feb 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

matgla commented Feb 7, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant