Use `load`+`store` instead of `memcpy` for small integer arrays

I was inspired by #98892 to see whether, rather than making `mem::swap` do something smart in the library, we could update MIR assignments like `*_1 = *_2` to do something smarter than `memcpy` for types small enough that doing it inline is going to be better than a `memcpy` call in assembly anyway. After all, special code may help `mem::swap`, but if the "obvious" MIR can just result in the correct thing, that helps everything -- other code like `mem::replace`, people doing it manually, and just passing around by value in general -- and it makes MIR inlining happier, since it doesn't need to deal with all the complicated library code if it just sees a couple of assignments.

LLVM will turn the short, known-length `memcpy`s into direct instructions in the backend, but that's too late for it to be able to remove `alloca`s. In general, replacing `memcpy`s with typed instructions in the middle-end -- even for `memcpy.inline`, where it knows it won't be a function call -- is hard [due to poison propagation issues](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/memcpy.20vs.20load-store.20for.20MIR.20assignments/near/360376712). So because we know more about the type invariants -- these are typed copies -- rustc can emit something more specific, allowing LLVM to `mem2reg` away the `alloca`s in some situations.

#52051 previously did something like this in the library for `mem::swap`, but it regressed when MIR inlining was enabled (https://github.com/rust-lang/rust/commit/cbbf06b0cd39dc93033568f1e65f5363cbbdebcd), so this has been suboptimal on stable for ≈5 releases now.
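To make the "typed copy" distinction concrete, here is a minimal sketch in Rust. The function names `typed_copy` and `byte_copy` are hypothetical and do not come from this PR; the first is the kind of source that produces an assignment like `*_1 = *_2`, while the second spells out the untyped 6-byte copy that a `memcpy` performs. Both leave the same bytes in `dst`; the difference is only in how much the compiler is told about what is being copied.

```rust
use std::mem::size_of;
use std::ptr;

/// Typed copy: a plain assignment through the references.
/// The compiler knows the value being copied is a `[u16; 3]`.
pub fn typed_copy(dst: &mut [u16; 3], src: &[u16; 3]) {
    *dst = *src;
}

/// Untyped copy: moves `size_of::<[u16; 3]>()` raw bytes,
/// which is what a `memcpy` of the same length does.
pub fn byte_copy(dst: &mut [u16; 3], src: &[u16; 3]) {
    // SAFETY: both references are valid for the full array, and a
    // `&mut` reference cannot overlap a live shared reference.
    unsafe {
        ptr::copy_nonoverlapping(
            src.as_ptr().cast::<u8>(),
            dst.as_mut_ptr().cast::<u8>(),
            size_of::<[u16; 3]>(),
        );
    }
}
```
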

The code in this PR is narrowly targeted at just integer arrays in LLVM, but works via a new method on the [`LayoutTypeMethods`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.LayoutTypeMethods.html) trait, so specific backends based on cg_ssa can enable this for more situations over time, as we find them. I don't want to try to bite off too much in this PR, though. (Transparent newtypes and simple things like the 3×usize `String` would be obvious candidates for a follow-up.)

Codegen demonstrations: <https://llvm.godbolt.org/z/fK8hT9aqv>

Before:
```llvm
define void @swap_rgb48_old(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #1 {
  %a.i = alloca [3 x i16], align 2
  call void @llvm.lifetime.start.p0(i64 6, ptr nonnull %a.i)
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %a.i, ptr noundef nonnull align 2 dereferenceable(6) %x, i64 6, i1 false)
  tail call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %x, ptr noundef nonnull align 2 dereferenceable(6) %y, i64 6, i1 false)
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %y, ptr noundef nonnull align 2 dereferenceable(6) %a.i, i64 6, i1 false)
  call void @llvm.lifetime.end.p0(i64 6, ptr nonnull %a.i)
  ret void
}
```
Note it going to stack:
```nasm
swap_rgb48_old:                         # @swap_rgb48_old
        movzx   eax, word ptr [rdi + 4]
        mov     word ptr [rsp - 4], ax
        mov     eax, dword ptr [rdi]
        mov     dword ptr [rsp - 8], eax
        movzx   eax, word ptr [rsi + 4]
        mov     word ptr [rdi + 4], ax
        mov     eax, dword ptr [rsi]
        mov     dword ptr [rdi], eax
        movzx   eax, word ptr [rsp - 4]
        mov     word ptr [rsi + 4], ax
        mov     eax, dword ptr [rsp - 8]
        mov     dword ptr [rsi], eax
        ret
```

Now:
```llvm
define void @swap_rgb48(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #0 {
start:
  %0 = load <3 x i16>, ptr %x, align 2
  %1 = load <3 x i16>, ptr %y, align 2
  store <3 x i16> %1, ptr %x, align 2
  store <3 x i16> %0, ptr %y, align 2
  ret void
}
```
This still lowers to `dword`+`word` operations, but has no stack traffic:
```nasm
swap_rgb48:                             # @swap_rgb48
        mov     eax, dword ptr [rdi]
        movzx   ecx, word ptr [rdi + 4]
        movzx   edx, word ptr [rsi + 4]
        mov     r8d, dword ptr [rsi]
        mov     dword ptr [rdi], r8d
        mov     word ptr [rdi + 4], dx
        mov     word ptr [rsi + 4], cx
        mov     dword ptr [rsi], eax
        ret
```

And as a demonstration that this isn't just `mem::swap`, here is a `mem::replace` on a small array (replace hasn't used swap since #83022). It used to be `memcpy`s in LLVM, and the IR changes to
```llvm
define void @replace_short_array(ptr noalias nocapture noundef sret([3 x i32]) dereferenceable(12) %0, ptr noalias noundef align 4 dereferenceable(12) %r, ptr noalias nocapture noundef readonly dereferenceable(12) %v) unnamed_addr #0 {
start:
  %1 = load <3 x i32>, ptr %r, align 4
  store <3 x i32> %1, ptr %0, align 4
  %2 = load <3 x i32>, ptr %v, align 4
  store <3 x i32> %2, ptr %r, align 4
  ret void
}
```
but that still lowers to reasonable `dword`+`qword` instructions:
```nasm
replace_short_array:                    # @replace_short_array
        mov     rax, rdi
        mov     rcx, qword ptr [rsi]
        mov     edi, dword ptr [rsi + 8]
        mov     dword ptr [rax + 8], edi
        mov     qword ptr [rax], rcx
        mov     rcx, qword ptr [rdx]
        mov     edx, dword ptr [rdx + 8]
        mov     dword ptr [rsi + 8], edx
        mov     qword ptr [rsi], rcx
        ret
```
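
For readers who don't open the godbolt link, the following is a sketch of Rust source with signatures matching the IR above. It is an assumption reconstructed from the parameter types (`[3 x i16]` behind `noalias` pointers, an `sret([3 x i32])` return), not a copy of the linked code; `Rgb48` and `swap_rgb48_manually` are hypothetical names added for illustration.

```rust
use std::mem;

/// Hypothetical alias: three 16-bit channels, 6 bytes, align 2.
pub type Rgb48 = [u16; 3];

/// Swap through the library helper.
pub fn swap_rgb48(x: &mut Rgb48, y: &mut Rgb48) {
    mem::swap(x, y);
}

/// The same swap written by hand as three typed assignments,
/// the "people doing it manually" case from the description.
pub fn swap_rgb48_manually(x: &mut Rgb48, y: &mut Rgb48) {
    let temp = *x;
    *x = *y;
    *y = temp;
}

/// Store `v` into `*r` and return the previous contents.
pub fn replace_short_array(r: &mut [u32; 3], v: [u32; 3]) -> [u32; 3] {
    mem::replace(r, v)
}
```

Compiling something like this with optimizations and `--emit=llvm-ir` is one way to check whether a given toolchain produces `memcpy` calls or typed `load`/`store` pairs for these functions.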
The Rust Programming Language
This is the main source code repository for Rust. It contains the compiler, standard library, and documentation.
Note: this README is for users rather than contributors. If you wish to contribute to the compiler, you should read CONTRIBUTING.md instead.
Quick Start
Read "Installation" from The Book.
Installing from Source
The Rust build system uses a Python script called x.py to build the compiler;
the script manages the bootstrapping process. It lives at the root of the project.
It also uses a file named config.toml to determine various configuration settings for the build.
You can see a full list of options in config.example.toml.
The x.py command can be run directly on most Unix systems in the following
format:
./x.py <subcommand> [flags]
This is how the documentation and examples assume you are running x.py.
Some alternative ways are:
# On a Unix shell if you don't have the necessary `python3` command
./x <subcommand> [flags]
# On the Windows Command Prompt (if .py files are configured to run Python)
x.py <subcommand> [flags]
# You can also run Python yourself, e.g.:
python x.py <subcommand> [flags]
More information about x.py can be found by running it with the --help flag
or reading the rustc dev guide.
Dependencies
Make sure you have installed the dependencies:
- python3 or 2.7
- git
- A C compiler (when building for the host, cc is enough; cross-compiling may need additional compilers)
- curl (not needed on Windows)
- pkg-config if you are compiling on Linux and targeting Linux
- libiconv (already included with glibc on Debian-based distros)
To build Cargo, you'll also need OpenSSL (libssl-dev or openssl-devel on
most Unix distros).
If building LLVM from source, you'll need additional tools:
- g++, clang++, or MSVC with versions listed on LLVM's documentation
- ninja, or GNU make 3.81 or later (Ninja is recommended, especially on Windows)
- cmake 3.13.4 or later
- libstdc++-static may be required on some Linux distributions such as Fedora and Ubuntu
On tier 1 or tier 2 with host tools platforms, you can also choose to download
LLVM by setting llvm.download-ci-llvm = true.
Otherwise, you'll need LLVM installed and llvm-config in your path.
See the rustc-dev-guide for more info.
Building on a Unix-like system
Build steps
- Clone the source with git:

  git clone https://github.com/rust-lang/rust.git
  cd rust

- Configure the build settings:

  ./configure

  If you plan to use x.py install to create an installation, it is recommended that you set the prefix value in the [install] section to a directory:

  ./configure --set install.prefix=<path>

- Build and install:

  ./x.py build && ./x.py install

  When complete, ./x.py install will place several programs into $PREFIX/bin: rustc, the Rust compiler, and rustdoc, the API-documentation tool. By default, it will also include Cargo, Rust's package manager. You can disable this behavior by passing --set build.extended=false to ./configure.
Configure and Make
This project provides a configure script and makefile (the latter of which just invokes x.py).
./configure is the recommended way to programmatically generate a config.toml. make is not
recommended (we suggest using x.py directly), but it is supported and we try not to break it
unnecessarily.
./configure
make && sudo make install
configure generates a config.toml which can also be used with normal x.py invocations.
Building on Windows
On Windows, we suggest using winget to install dependencies by running the following in a terminal:
winget install -e Python.Python.3
winget install -e Kitware.CMake
winget install -e Git.Git
Then edit your system's PATH variable and add: C:\Program Files\CMake\bin.
See this guide on editing the system PATH from the Java documentation.
There are two prominent ABIs in use on Windows: the native (MSVC) ABI used by Visual Studio and the GNU ABI used by the GCC toolchain. Which version of Rust you need depends largely on what C/C++ libraries you want to interoperate with. Use the MSVC build of Rust to interop with software produced by Visual Studio and the GNU build to interop with GNU software built using the MinGW/MSYS2 toolchain.
MinGW
MSYS2 can be used to easily build Rust on Windows:
- Download the latest MSYS2 installer and go through the installer.

- Run mingw32_shell.bat or mingw64_shell.bat from the MSYS2 installation directory (e.g. C:\msys64), depending on whether you want 32-bit or 64-bit Rust. (As of the latest version of MSYS2 you have to run msys2_shell.cmd -mingw32 or msys2_shell.cmd -mingw64 from the command line instead.)

- From this terminal, install the required tools:

  # Update package mirrors (may be needed if you have a fresh install of MSYS2)
  pacman -Sy pacman-mirrors

  # Install build tools needed for Rust. If you're building a 32-bit compiler,
  # then replace "x86_64" below with "i686". If you've already got Git, Python,
  # or CMake installed and in PATH you can remove them from this list.
  # Note that it is important that you do **not** use the 'python2', 'cmake',
  # and 'ninja' packages from the 'msys2' subsystem.
  # The build has historically been known to fail with these packages.
  pacman -S git \
              make \
              diffutils \
              tar \
              mingw-w64-x86_64-python \
              mingw-w64-x86_64-cmake \
              mingw-w64-x86_64-gcc \
              mingw-w64-x86_64-ninja

- Navigate to Rust's source code (or clone it), then build it:

  python x.py setup user && python x.py build && python x.py install
MSVC
MSVC builds of Rust additionally require an installation of Visual Studio 2017
(or later) so rustc can use its linker. The simplest way is to get
Visual Studio and check the "C++ build tools" and "Windows 10 SDK" workloads.
(If you're installing CMake yourself, be careful that "C++ CMake tools for Windows" doesn't get included under "Individual components".)
With these dependencies installed, you can build the compiler in a cmd.exe
shell with:
python x.py setup user
python x.py build
Right now, building Rust only works with some known versions of Visual Studio. If you have a more recent version installed and the build system doesn't recognize it, you may need to force rustbuild to use an older version. This can be done by manually calling the appropriate vcvars file before running the bootstrap.
CALL "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat"
python x.py build
Specifying an ABI
Each specific ABI can also be used from either environment (for example, using the GNU ABI in PowerShell) by using an explicit build triple. The available Windows build triples are:
- GNU ABI (using GCC)
  - i686-pc-windows-gnu
  - x86_64-pc-windows-gnu
- The MSVC ABI
  - i686-pc-windows-msvc
  - x86_64-pc-windows-msvc
The build triple can be specified by either specifying --build=<triple> when
invoking x.py commands, or by creating a config.toml file (as described in
Building on a Unix-like system), and passing --set build.build=<triple> to ./configure.
Building Documentation
If you'd like to build the documentation, it's almost the same:
./x.py doc
The generated documentation will appear under doc in the build directory for
the ABI used. That is, if the ABI was x86_64-pc-windows-msvc, the directory
will be build\x86_64-pc-windows-msvc\doc.
Notes
Since the Rust compiler is written in Rust, it must be built by a precompiled "snapshot" version of itself (made in an earlier stage of development). As such, source builds require an Internet connection to fetch snapshots, and an OS that can execute the available snapshot binaries.
See https://doc.rust-lang.org/nightly/rustc/platform-support.html for a list of supported platforms. Only "host tools" platforms have a pre-compiled snapshot binary available; to compile for a platform without host tools you must cross-compile.
You may find that other platforms work, but these are our officially supported build environments that are most likely to work.
Getting Help
See https://www.rust-lang.org/community for a list of chat platforms and forums.
Contributing
See CONTRIBUTING.md.
License
Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.
See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.
Trademark
The Rust Foundation owns and protects the Rust and Cargo trademarks and logos (the "Rust Trademarks").
If you want to use these names or brands, please read the media guide.
Third-party logos may be subject to third-party copyrights and trademarks. See Licenses for details.