Formal support for linking rlibs using a non-Rust linker

I'm working on a major existing C++ project which hopes to dip its toes into the Rusty waters. We:
  
- Use a non-Cargo build system with static dependency rules (and 40000+ targets)
- Sometimes build a single big binary; sometimes lots of shared objects, unit test executables, etc. - each containing various parts of our dependency tree.
- Perform final linking using an existing C++ toolchain (based on LLVM 11 as it happens)
- Want to have a few Rust components scattered throughout a very deep dependency tree, which may eventually roll up into one or multiple binaries

We can't:
- Switch from our existing linker to `rustc` for final linking. C++ is the boss in our codebase; we're not ready to make the commitment to put Rust in charge of our final linking.
- Create a Rust `staticlib` for each of our Rust components. This works if we're using Rust in only one place. For any binary containing several Rust components, there would be binary bloat and potentially violations of the one-definition-rule, by duplication of the Rust stdlib and any diamond dependencies.
- Create a single Rust `staticlib` containing all our Rust components, then link that into every binary. That monster static library would depend on many C++ symbols, which wouldn't be present in some circumstances.

We can either:
1. Create a Rust `staticlib` for each of our _output_ binaries, using `rustc` and an auto-generated `.rs` file containing lots of `extern crate` statements. Or,
2. Pass the `rlib` for each Rust component directly into the final C++ linking procedure.
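
For illustration, the auto-generated `.rs` file in option 1 could be produced by something like the sketch below. The function name and the crate names are hypothetical; the only thing taken from the description above is that the file consists of `extern crate` statements, one per Rust component in the binary.

```python
import re

# Hypothetical helper: build the source of the auto-generated staticlib root.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def generate_umbrella_rs(crate_names):
    """Return Rust source containing one `extern crate` line per component."""
    # `extern crate` needs an identifier, so map '-' to '_' and de-duplicate.
    normalized = sorted({name.replace("-", "_") for name in crate_names})
    lines = ["// Auto-generated: do not edit."]
    for rust_name in normalized:
        if not _IDENT.match(rust_name):
            raise ValueError(f"not a valid crate identifier: {rust_name!r}")
        lines.append(f"extern crate {rust_name};")
    return "\n".join(lines) + "\n"


# Example with made-up component names.
print(generate_umbrella_rs(["component_b", "component-a"]))
```

The output would then be compiled with `rustc --crate-type=staticlib`, once per output binary.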

The first approach is officially supported, but is hard because:
- We need to create a Rust `staticlib` as part of our _C++_ tool invocations. This is awkward in our build system. Our C++ targets don't keep track of Rust compiler flags (`--target`, etc.) and in general it just feels weird to be doing Rust stuff in C++ targets.
- Specifically, every single one of our C++ link targets needs to run a Python wrapper script which decides whether to invoke `rustc` to make a `staticlib`. For most of our targets (especially unit test targets) there will be no `rlib`s in the dependency tree, so it will be a no-op. But the presence of this wrapper script will make Rust adoption appear intrusive, and of course will have some small actual performance cost.
- For those link targets which _do_ include Rust code, we'll delay invocation of the main linker whilst we build a Rust static library.
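
A minimal sketch of the decision such a wrapper script has to make, under some loudly stated assumptions: the function names are hypothetical, each `rlib` is assumed to be named `lib<crate>.rlib`, and transitive dependencies (which a real build would also have to expose to `rustc`, e.g. via `-L` search paths) are ignored. This is not our actual script.

```python
import os


def crate_name_from_rlib(path):
    """Assumes the file is named lib<crate>.rlib (an assumption of this sketch)."""
    base = os.path.basename(path)
    if not (base.startswith("lib") and base.endswith(".rlib")):
        raise ValueError(f"unexpected rlib name: {path!r}")
    return base[len("lib"):-len(".rlib")]


def rustc_staticlib_command(link_inputs, umbrella_rs, output_lib, target=None):
    """Return a rustc argv, or None when the link target contains no Rust."""
    rlibs = [p for p in link_inputs if p.endswith(".rlib")]
    if not rlibs:
        return None  # the common case: a pure C++ target, so a no-op
    cmd = ["rustc", "--crate-type=staticlib", "-o", output_lib]
    if target:
        # The C++ target has to learn this flag from somewhere.
        cmd += ["--target", target]
    for path in rlibs:
        cmd += ["--extern", f"{crate_name_from_rlib(path)}={path}"]
    cmd.append(umbrella_rs)
    return cmd


# Example with made-up paths.
print(rustc_staticlib_command(["obj/foo.o", "out/libbar.rlib"],
                              "umbrella.rs", "librust_bits.a"))
```

Even in the no-op case, the build still pays for starting the wrapper on every link target.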

The second approach is not officially supported. An `rlib` is an internal implementation format within Rust, and its only client is `rustc`. It is naughty to pass them directly into our own linker command line.

But it does, currently, work. It makes our build process much simpler and makes use of Rust less disruptive.

Because external toolchains are not expected to consume `rlib`s, some magic is required:
- The final C++ linker needs to pull in all the Rust stdlib `rlib`s, which would be easy apart from the fact they contain the symbol metadata hash in their names.
- We need to remap `__rust_alloc` to `__rdl_alloc` etc.
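
To make the two pieces of magic concrete, here is a rough sketch. It assumes the stdlib files are named `lib<crate>-<hex hash>.rlib`, and that the allocator remapping is done with the linker's `--defsym` option; that is only one possible way to do it and not necessarily what any particular project does. The symbol list reflects the names I know of today and may change between Rust versions. All helper names are hypothetical.

```python
import os
import re

# Assumed naming scheme: lib<crate>-<hex hash>.rlib
_STD_RLIB = re.compile(r"^lib(?P<crate>[A-Za-z0-9_]+)-[0-9a-f]+\.rlib$")


def std_rlib_crate_name(filename):
    """Return the crate name for a hashed rlib filename, or None if no match."""
    match = _STD_RLIB.match(filename)
    return match.group("crate") if match else None


def find_std_rlibs(target_libdir):
    """Map crate name -> path for every hashed rlib found in target_libdir."""
    found = {}
    for entry in sorted(os.listdir(target_libdir)):
        crate = std_rlib_crate_name(entry)
        if crate is not None:
            found[crate] = os.path.join(target_libdir, entry)
    return found


# Assumption: these four symbols are the ones that need remapping.
ALLOCATOR_SYMBOLS = ("alloc", "dealloc", "realloc", "alloc_zeroed")


def allocator_defsym_flags():
    """Compiler-driver flags aliasing __rust_* to the default __rdl_* symbols."""
    return [f"-Wl,--defsym=__rust_{s}=__rdl_{s}" for s in ALLOCATOR_SYMBOLS]


print(allocator_defsym_flags())
```

The directory passed to `find_std_rlibs` would come from `rustc --print target-libdir`.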

But obviously the bigger concern is that this is not a supported model, and Rust is free to break the `rlib` format at any moment.

Is there any appetite for making this a supported model for those with mixed C/C++/Rust codebases?

I'm assuming the answer may be 'no' because it would tie Rust's hands for future `rlib` format changes. But just in case: how's about the following steps?

1. [The Linkage section of the Rust reference](https://doc.rust-lang.org/reference/linkage.html) is enhanced to list the two _current_ strategies for linking C++ and Rust. Either:
   - Use `rustc` as the final linker; or
   - Build a Rust `staticlib` or `cdylib` then pass that to your existing final linker
   (I think this would be worth explicitly explaining anyway, so unless anyone objects, I may raise a PR)
2. A new `rustc --print stdrlibs` (or similar) which will output the names of all the standard library rlibs (not just their directory, which is already possible with `target-libdir`)
3. Some kind of new `rustc` option which generates a `rust-dynamic-symbols.o` file (or similar) containing the codegen which is otherwise done by `rustc` at final link-time (e.g. symbols to call `__rdl_alloc` from `__rust_alloc`, etc.)
4. The Linkage section of the reference is enhanced to list this as a third supported workflow. (You can use whatever linker you want, but make sure you link to `rust-dynamic-symbols.o` and everything output by `rustc --print stdrlibs`)
5. Somehow, we add some tests to ensure this workflow doesn't break.
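
As a sketch of what a build system might do under this proposed workflow: everything here is hypothetical, since neither `rustc --print stdrlibs` nor `rust-dynamic-symbols.o` exists today. The function only assembles the Rust-related part of a link line from inputs that those two proposed features would provide. Wrapping the archives in `--start-group`/`--end-group` is my assumption for coping with circular references between them.

```python
def final_link_args(objects, component_rlibs, std_rlibs, dynamic_symbols_obj):
    """Assemble linker-driver arguments for a mixed C++/Rust binary.

    std_rlibs stands in for the output of the proposed (not yet existing)
    `rustc --print stdrlibs`; dynamic_symbols_obj stands in for the proposed
    `rust-dynamic-symbols.o`.
    """
    args = list(objects)
    args.append(dynamic_symbols_obj)
    # Assumption: group the archives so the linker rescans them until
    # no new undefined symbols can be resolved.
    args.append("-Wl,--start-group")
    args.extend(component_rlibs)
    args.extend(std_rlibs)
    args.append("-Wl,--end-group")
    return args


# Example with made-up file names.
print(final_link_args(
    ["main.o"],
    ["libcomponent_a.rlib"],
    ["libstd-0123abcd.rlib", "libcore-4567ef01.rlib"],
    "rust-dynamic-symbols.o",
))
```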

A few related issues:
* #64191 wants to split the compile and link phases of rustc. This discussion has spawned from there.
* @dtolnay's marvellous https://github.com/dtolnay/cxx is not quite as optimal as it could be, because users can't use `-Wl,--start-group`, `-Wl,--end-group` on the linker line. (Per https://github.com/rust-lang/rust/issues/64191#issuecomment-629418541)
* The difficulties of using the `staticlib`-per-C++-target model happen to be magnified by #73047.

@japaric @alexcrichton @retep998 @dtolnay I believe this may be the sort of thing you may wish to comment upon! I'm sure you'll come up with reasons why this is even harder than I already think. Thanks very much in advance.

Formal support for linking rlibs using a non-Rust linker #73632
