I'm working on a major existing C++ project which hopes to dip its toes into the Rusty waters. We:
- Use a non-Cargo build system with static dependency rules (and 40000+ targets)
- Sometimes build a single big binary; sometimes lots of shared objects, unit test executables, etc. - each containing various parts of our dependency tree.
- Perform final linking using an existing C++ toolchain (based on LLVM 11 as it happens)
- Want to have a few Rust components scattered throughout a very deep dependency tree, which may eventually roll up into one or multiple binaries
We can't:
- Switch from our existing linker to `rustc` for final linking. C++ is the boss in our codebase; we're not ready to make the commitment to put Rust in charge of our final linking.
- Create a Rust `staticlib` for each of our Rust components. This works if we're using Rust in only one place. For any binary containing several Rust components, there would be binary bloat and potentially violations of the one-definition rule, by duplication of the Rust stdlib and any diamond dependencies.
- Create a single Rust `staticlib` containing all our Rust components, then link that into every binary. That monster static library would depend on many C++ symbols, which wouldn't be present in some circumstances.
We can either:
- Create a Rust `staticlib` for each of our output binaries, using `rustc` and an auto-generated `.rs` file containing lots of `extern crate` statements. Or,
- Pass the `rlib` for each Rust component directly into the final C++ linking procedure.
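To make the first option concrete, here is a minimal sketch of how a build step might generate that `.rs` file. The function name and the crate names in the usage note are hypothetical, chosen only for illustration; a real build system would feed in whichever Rust components a given binary depends on.

```python
def generate_umbrella_source(crate_names):
    """Return the text of a .rs file that pulls in each crate via `extern crate`.

    `crate_names` is assumed to be the set of Rust components reachable from
    one output binary, as computed by the build system.
    """
    lines = ["// Auto-generated: do not edit."]
    for name in sorted(set(crate_names)):
        # Rust identifiers cannot contain hyphens, so normalise to underscores.
        lines.append(f"extern crate {name.replace('-', '_')};")
    return "\n".join(lines) + "\n"
```

For example, `generate_umbrella_source(["net_parser", "image-decoder"])` yields a file with one `extern crate` line per component, sorted so that the output is stable between builds.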
The first approach is officially supported, but is hard because:
- We need to create a Rust `staticlib` as part of our C++ tool invocations. This is awkward in our build system. Our C++ targets don't keep track of Rust compiler flags (`--target`, etc.) and in general it just feels weird to be doing Rust stuff in C++ targets.
- Specifically, every single one of our C++ link targets needs to invoke a Python wrapper script that decides whether to invoke `rustc` to make a `staticlib`. For most of our targets (especially unit test targets) there will be no `rlib`s in their dependency tree, so it will be a no-op. But the presence of this wrapper script will make Rust adoption appear intrusive, and of course will have some small actual performance cost.
- For those link targets which do include Rust code, we'll delay invocation of the main linker whilst we build a Rust static library.
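The wrapper-script decision described above might look roughly like the following sketch. Everything here is an assumption for illustration: the function names, the `lib<crate>.rlib` naming convention, and the `rust_umbrella` file names are hypothetical, and a real script would also need to handle transitive dependencies (for example via `-L dependency=...`). Only the `rustc` flags `--crate-type`, `--target`, `--extern` and `-o` are real.

```python
import posixpath


def crate_name_from_rlib(path):
    """Derive a crate name from a path like 'rust/libfoo.rlib' (assumed naming)."""
    base = posixpath.basename(path)
    if not (base.startswith("lib") and base.endswith(".rlib")):
        raise ValueError(f"not an rlib path: {path}")
    return base[len("lib"):-len(".rlib")]


def plan_staticlib_step(link_inputs, target_triple, out_dir):
    """Return a rustc command line (list of args), or None if there is no Rust."""
    rlibs = [p for p in link_inputs if p.endswith(".rlib")]
    if not rlibs:
        # The common case for most link targets: nothing to do.
        return None
    cmd = [
        "rustc",
        "--crate-type=staticlib",
        "--target", target_triple,
        "-o", posixpath.join(out_dir, "librust_umbrella.a"),
    ]
    for path in rlibs:
        cmd += ["--extern", f"{crate_name_from_rlib(path)}={path}"]
    # The generated source file containing the `extern crate` statements.
    cmd.append(posixpath.join(out_dir, "rust_umbrella.rs"))
    return cmd
```

Even in the no-op case the script still has to start and scan its inputs for every link target, which is the small but real cost mentioned above.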
The second approach is not officially supported. An `rlib` is an internal implementation format within Rust, and its only client is `rustc`. It is naughty to pass `rlib`s directly into our own linker command line.
But it does, currently, work. It makes our build process much simpler and makes use of Rust less disruptive.
Because external toolchains are not expected to consume rlibs, some magic is required:
- The final C++ linker needs to pull in all the Rust stdlib `rlib`s, which would be easy apart from the fact they contain the symbol metadata hash in their names.
- We need to remap `__rust_alloc` to `__rdl_alloc` etc.
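As an illustration of the first point, a build script can work around the hashed file names by pattern-matching a directory listing. This is only a sketch: the function name is hypothetical, and it assumes the stdlib files are named `lib<crate>-<hex hash>.rlib`, which is an observation about current toolchains rather than a documented guarantee.

```python
import re

# Assumed naming scheme: lib<crate>-<hex hash>.rlib
_RLIB_RE = re.compile(r"^lib(?P<crate>\w+)-(?P<hash>[0-9a-f]+)\.rlib$")


def index_std_rlibs(filenames):
    """Map crate name -> file name for every hashed rlib in a directory listing."""
    found = {}
    for name in filenames:
        match = _RLIB_RE.match(name)
        if match:
            found[match.group("crate")] = name
    return found
```

The listing would typically come from `os.listdir()` on the directory printed by `rustc --print target-libdir`. The fragility of this kind of guesswork is part of the motivation for the proposal below.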
But obviously the bigger concern is that this is not a supported model, and Rust is free to break the rlib format at any moment.
Is there any appetite for making this a supported model for those with mixed C/C++/Rust codebases?
I'm assuming the answer may be 'no' because it would tie Rust's hands for future `rlib` format changes. But just in case: how about the following steps?
- The Linkage section of the Rust reference is enhanced to list the two current strategies for linking C++ and Rust. Either:
  - Use `rustc` as the final linker; or
  - Build a Rust `staticlib` or `cdylib` then pass that to your existing final linker
(I think this would be worth explicitly explaining anyway, so unless anyone objects, I may raise a PR)
- A new `rustc --print stdrlibs` (or similar) which will output the names of all the standard library rlibs (not just their directory, which is already possible with `target-libdir`)
- Some kind of new `rustc` option which generates a `rust-dynamic-symbols.o` file (or similar) containing the codegen which is otherwise done by `rustc` at final link-time (e.g. symbols to call `__rdl_alloc` from `__rust_alloc`, etc.)
- The Linkage section of the Rust reference is enhanced to list this as a third supported workflow. (You can use whatever linker you want, but make sure you link to `rust-dynamic-symbols.o` and everything output by `rustc --print stdrlibs`.)
- Somehow, we add some tests to ensure this workflow doesn't break.
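If these steps were adopted, a build system could assemble its final link line along the following lines. This is a sketch of the proposed workflow, not of anything that exists today: `rustc --print stdrlibs` and `rust-dynamic-symbols.o` are the hypothetical additions described above, and the function name and parameters are made up for illustration. The function takes the stdlib rlib list as an argument instead of invoking the not-yet-existing flag.

```python
def build_link_command(linker, objects, component_rlibs, std_rlibs, shim_object, output):
    """Assemble a final C++ link line that also consumes Rust rlibs.

    `std_rlibs` stands in for the output of the proposed `rustc --print stdrlibs`
    and `shim_object` for the proposed `rust-dynamic-symbols.o`.
    """
    cmd = [linker, "-o", output, *objects]
    rust_inputs = [*component_rlibs, *std_rlibs]
    if rust_inputs:
        # Wrap the rlibs in a group so the linker resolves references between
        # them regardless of the order in which they are listed.
        cmd += [shim_object, "-Wl,--start-group", *rust_inputs, "-Wl,--end-group"]
    return cmd
```

Targets with no Rust in their dependency tree get an unchanged link line, which keeps Rust adoption non-intrusive for the majority of targets.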
A few related issues:
- "Add support for splitting linker invocation to a second execution of `rustc`" (#64191) wants to split the compile and link phases of `rustc`. This discussion has spawned from there.
- Using `-Wl,--start-group`, `-Wl,--end-group` on the linker line. (Per "Add support for splitting linker invocation to a second execution of `rustc`" #64191 (comment))
- The difficulties of the `staticlib`-per-C++-target model happen to be magnified by "rlibs retain reference to proc-macro dependencies - possibly unnecessary?" (#73047)

@japaric @alexcrichton @retep998 @dtolnay I believe this may be the sort of thing you may wish to comment upon! I'm sure you'll come up with reasons why this is even harder than I already think. Thanks very much in advance.