Commit Graph

72 Commits

Author SHA1 Message Date
Vadim Petrochenkov 5bf23f64cc libcore: Add iter::from_generator which is like iter::from_fn, but for coroutines instead of functions 2022-05-27 01:51:31 +03:00
bors 8c4fc9d9a4 Auto merge of #94598 - scottmcm:prefix-free-hasher-methods, r=Amanieu
Add a dedicated length-prefixing method to `Hasher`

This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths

This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.

Fixes #94026

r? rust-lang/libs

---

The core of this change is the following two new methods on `Hasher`:

```rust
pub trait Hasher {
    /// Writes a length prefix into this hasher, as part of being prefix-free.
    ///
    /// If you're implementing [`Hash`] for a custom collection, call this before
    /// writing its contents to this `Hasher`.  That way
    /// `(collection![1, 2, 3], collection![4, 5])` and
    /// `(collection![1, 2], collection![3, 4, 5])` will provide different
    /// sequences of values to the `Hasher`
    ///
    /// The `impl<T> Hash for [T]` includes a call to this method, so if you're
    /// hashing a slice (or array or vector) via its `Hash::hash` method,
    /// you should **not** call this yourself.
    ///
    /// This method is only for providing domain separation.  If you want to
    /// hash a `usize` that represents part of the *data*, then it's important
    /// that you pass it to [`Hasher::write_usize`] instead of to this method.
    ///
    /// # Examples
    ///
    /// ```
    /// #![feature(hasher_prefixfree_extras)]
    /// # // Stubs to make the `impl` below pass the compiler
    /// # struct MyCollection<T>(Option<T>);
    /// # impl<T> MyCollection<T> {
    /// #     fn len(&self) -> usize { todo!() }
    /// # }
    /// # impl<'a, T> IntoIterator for &'a MyCollection<T> {
    /// #     type Item = T;
    /// #     type IntoIter = std::iter::Empty<T>;
    /// #     fn into_iter(self) -> Self::IntoIter { todo!() }
    /// # }
    ///
    /// use std::hash::{Hash, Hasher};
    /// impl<T: Hash> Hash for MyCollection<T> {
    ///     fn hash<H: Hasher>(&self, state: &mut H) {
    ///         state.write_length_prefix(self.len());
    ///         for elt in self {
    ///             elt.hash(state);
    ///         }
    ///     }
    /// }
    /// ```
    ///
    /// # Note to Implementers
    ///
    /// If you've decided that your `Hasher` is willing to be susceptible to
    /// Hash-DoS attacks, then you might consider skipping hashing some or all
    /// of the `len` provided in the name of increased performance.
    #[inline]
    #[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
    fn write_length_prefix(&mut self, len: usize) {
        self.write_usize(len);
    }

    /// Writes a single `str` into this hasher.
    ///
    /// If you're implementing [`Hash`], you generally do not need to call this,
    /// as the `impl Hash for str` does, so you can just use that.
    ///
    /// This includes the domain separator for prefix-freedom, so you should
    /// **not** call `Self::write_length_prefix` before calling this.
    ///
    /// # Note to Implementers
    ///
    /// The default implementation of this method includes a call to
    /// [`Self::write_length_prefix`], so if your implementation of `Hasher`
    /// doesn't care about prefix-freedom and you've thus overridden
    /// that method to do nothing, there's no need to override this one.
    ///
    /// This method is available to be overridden separately from the others
    /// as `str` being UTF-8 means that it never contains `0xFF` bytes, which
    /// can be used to provide prefix-freedom cheaper than hashing a length.
    ///
    /// For example, if your `Hasher` works byte-by-byte (perhaps by accumulating
    /// them into a buffer), then you can hash the bytes of the `str` followed
    /// by a single `0xFF` byte.
    ///
    /// If your `Hasher` works in chunks, you can also do this by being careful
    /// about how you pad partial chunks.  If the chunks are padded with `0x00`
    /// bytes then just hashing an extra `0xFF` byte doesn't necessarily
    /// provide prefix-freedom, as `"ab"` and `"ab\u{0}"` would likely hash
    /// the same sequence of chunks.  But if you pad with `0xFF` bytes instead,
    /// ensuring at least one padding byte, then it can often provide
    /// prefix-freedom cheaper than hashing the length would.
    #[inline]
    #[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
    fn write_str(&mut self, s: &str) {
        self.write_length_prefix(s.len());
        self.write(s.as_bytes());
    }
}
```

With updates to the `Hash` implementations for slices and containers to call `write_length_prefix` instead of `write_usize`.

`write_str` defaults to using `write_length_prefix` since, as was pointed out in the issue, the `write_u8(0xFF)` approach is insufficient for hashers that work in chunks, as those would hash `"a\u{0}"` and `"a"` to the same thing.  But since `SipHash` works byte-wise (there's an internal buffer to accumulate bytes until a full chunk is available) it overrides `write_str` to continue to use the add-non-UTF-8-byte approach.

---

Compatibility:

Because the default implementation of `write_length_prefix` calls `write_usize`, the changed hash implementation for slices will do the same thing the old one did on existing `Hasher`s.
2022-05-06 09:43:57 +00:00
Scott McMurray 98054377ee Add a dedicated length-prefixing method to Hasher
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths

This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
2022-05-06 00:03:38 -07:00
Josh Triplett 0fc5c524f5 Stabilize bool::then_some 2022-05-04 13:22:08 +02:00
bors 563ef23529 Auto merge of #95899 - petrochenkov:modchild2, r=cjgillot
rustc_metadata: Do not encode unnecessary module children

This should remove the syntax context shift and the special case for `ExternCrate` in decoder in https://github.com/rust-lang/rust/pull/95880.

This PR also shifts some work from decoding to encoding, which is typically useful for performance (but probably not much in this case).
r? `@cjgillot`
2022-04-16 22:04:10 +00:00
Ralf Jung e30d6d9096 make unaligned_references lint deny-by-default 2022-04-14 21:16:42 -04:00
Vadim Petrochenkov 233fa659f4 rustc_metadata: Do not encode unnecessary module children 2022-04-13 01:35:27 +03:00
Tomasz Miąsko 725c11ef3c Add SmallStr 2022-03-04 16:57:34 +01:00
Mark Rousskov 22c3a71de1 Switch bootstrap cfgs 2022-02-25 08:00:52 -05:00
Nicholas Nethercote 36b495f3cf Introduce ChunkedBitSet and use it for some dataflow analyses.
This reduces peak memory usage significantly for some programs with very
large functions, such as:
- `keccak`, `unicode_normalization`, and `match-stress-enum`, from
  the `rustc-perf` benchmark suite;
- `http-0.2.6` from crates.io.

The new type is used in the analyses where the bitsets can get huge
(e.g. 10s of thousands of bits): `MaybeInitializedPlaces`,
`MaybeUninitializedPlaces`, and `EverInitializedPlaces`.

Some refactoring was required in `rustc_mir_dataflow`. All existing
analysis domains are either `BitSet` or a trivial wrapper around
`BitSet`, and access in a few places is done via `Borrow<BitSet>` or
`BorrowMut<BitSet>`. Now that some of these domains are `ClusterBitSet`,
that no longer works. So this commit replaces the `Borrow`/`BorrowMut`
usage with a new trait `BitSetExt` containing the needed bitset
operations. The impls just forward these to the underlying bitset type.
This required fiddling with trait bounds in a few places.

The commit also:
- Moves `static_assert_size` from `rustc_data_structures` to
  `rustc_index` so it can be used in the latter; the former now
  re-exports it so existing users are unaffected.
- Factors out some common "clear excess bits in the final word"
  functionality in `bit_set.rs`.
- Uses `fill` in a few places instead of loops.
2022-02-23 10:18:49 +11:00
est31 2ef8af6619 Adopt let else in more places 2022-02-19 17:27:43 +01:00
Nicholas Nethercote 0c2ebbd412 Rename PtrKey as Interned and improve it.
In particular, there's now more protection against incorrect usage,
because you can only create one via `Interned::new_unchecked`, which
makes it more obvious that you must be careful.

There are also some tests.
2022-02-15 15:50:29 +11:00
lcnr a1a30f7548 add a rustc::query_stability lint 2022-02-01 10:15:59 +01:00
Alan Egerton acd39ff0fe Make IdFunctor::try_map_id panic-safe 2021-12-07 11:11:23 +00:00
Scott McMurray 308fd59f42 Stop enabling in_band_lifetimes in rustc_data_structures
There's a conversation in the tracking issue about possibly unaccepting `in_band_lifetimes`, but it's used heavily in the compiler, and thus there'd need to be a bunch of PRs like this if that were to happen.

So here's one to see how much of an impact it has.

(Oh, and I removed `nll` while I was here too, since it didn't seem needed.  Let me know if I should put that back.)
2021-12-05 20:17:35 -08:00
Alan Egerton db7295fa96 Remove no-longer used IdFunctor::map_id 2021-12-02 15:45:40 +00:00
Alan Egerton 9f714ef035 Delegate from map_id to try_map_id 2021-11-27 16:55:21 +00:00
Mark Rousskov 3215eeb99f Revert "Add rustc lint, warning when iterating over hashmaps" 2021-10-28 11:01:42 -04:00
bors 235d9853d8 Auto merge of #90042 - pietroalbini:1.56-master, r=Mark-Simulacrum
Bump bootstrap compiler to 1.57

Fixes https://github.com/rust-lang/rust/issues/90152

r? `@Mark-Simulacrum`
2021-10-25 11:31:47 +00:00
Pietro Albini b63ab8005a update cfg(bootstrap) 2021-10-23 21:55:57 -04:00
lcnr 00e5abe9b6 allow potential_query_instability everywhere 2021-10-15 10:58:18 +02:00
Jubilee 9866b090f4 Rollup merge of #89508 - jhpratt:stabilize-const_panic, r=joshtriplett
Stabilize `const_panic`

Closes #51999

FCP completed in #89006

```@rustbot``` label +A-const-eval +A-const-fn +T-lang

cc ```@oli-obk``` for review (not `r?`'ing as not on lang team)
2021-10-04 13:58:17 -07:00
Jacob Pratt bce8621983 Stabilize const_panic 2021-10-04 02:33:33 -04:00
bjorn3 83ddedf170 Remove various unused feature gates 2021-10-02 19:09:18 +02:00
Maybe Waffle 71e2eacc7b Stabilize Iterator::map_while 2021-09-17 19:42:46 +03:00
Vadim Petrochenkov 294510e1bb rustc: Remove local variable IDs from Exports
Local variables can never be exported.
2021-09-10 23:41:48 +03:00
Mark Rousskov b4e7649d6d Bump stage0 compiler to 1.56 2021-09-08 20:51:05 -04:00
Santiago Pastorino 5bff8429a0 Use type_alias_impl_trait instead of min in compiler and lib 2021-07-27 12:27:08 -03:00
Yuki Okushi 6761826b1b Sort features alphabetically 2021-07-23 18:08:26 +09:00
Yuki Okushi 8d00be9980 Use map_while instead of take_while + map 2021-07-23 18:04:28 +09:00
Oli Scherer d693a98f4e Fix VecMap::iter_mut
It used to allow you to mutate the key, even though that can invalidate the map by creating duplicate keys.
2021-07-22 11:20:29 +00:00
bors 0a8629bff6 Auto merge of #85885 - bjorn3:remove_box_region, r=cjgillot
Don't use a generator for BoxedResolver

The generator is non-trivial and requires unsafe code anyway. Using regular unsafe code without a generator is much easier to follow.

Based on #85810 as it touches rustc_interface too.
2021-06-11 16:11:20 +00:00
bjorn3 99e112d282 Inline the rest of box_region 2021-06-08 19:24:16 +02:00
Santiago Pastorino aa7024b0c7 Add VecMap to rustc_data_structures 2021-06-07 19:03:51 -03:00
bjorn3 312f964478 Remove unused feature gates 2021-05-31 13:55:43 +02:00
bjorn3 b4ed7114bd Remove unnecessary unboxed_closures feature usage
It has been possible to clone closures for a while now
2021-05-31 12:13:47 +02:00
bjorn3 8331dbe6d0 Add an Mmap wrapper to rustc_data_structures
This wrapper implements StableAddress and falls back to directly reading
the file on wasm32
2021-03-30 18:57:03 +02:00
Mara Bos 81932be5e7 Revert "Revert stabilizing integer::BITS." 2021-03-24 22:34:36 +01:00
Dylan DPC cabe97272d Rollup merge of #82057 - upsuper-forks:cstr, r=davidtwco,wesleywiser
Replace const_cstr with cstr crate

This PR replaces the `const_cstr` macro inside `rustc_data_structures` with `cstr` macro from [cstr](https://crates.io/crates/cstr) crate.

The two macros basically serve the same purpose, which is to generate `&'static CStr` from a string literal. `cstr` is better because it validates the literal at compile time, while the existing `const_cstr` does it at runtime when `debug_assertions` is enabled. In addition, the value `cstr` generates can be used in constant context (which is seemingly not needed anywhere currently, though).
2021-02-27 02:34:21 +01:00
Joshua Nelson 3733275854 Update the bootstrap compiler
Note this does not change `core::derive` since it was merged after the
beta bump.
2021-02-20 17:19:30 -05:00
Xidorn Quan 38e4233a32 Replace const_cstr with cstr crate 2021-02-14 09:45:35 +11:00
Mara Bos 89882388d9 Revert stabilizing integer::BITS. 2021-02-03 22:23:58 +01:00
Ashley Mannix 8940a2652e stabilize int_bits_const 2021-01-31 21:50:47 +10:00
Mark Rousskov fe031180d0 Bump bootstrap compiler to 1.50 beta 2020-12-30 09:27:19 -05:00
Bastian Kauschke 06cc9c26da stabilize min_const_generics 2020-12-26 18:24:10 +01:00
Camelid 810324d1f3 Rename optin_builtin_traits to auto_traits
They were originally called "opt-in, built-in traits" (OIBITs), but
people realized that the name was too confusing and a mouthful, and so
they were renamed to just "auto traits". The feature flag's name wasn't
updated, though, so that's what this PR does.

There are some other spots in the compiler that still refer to OIBITs,
but I don't think changing those now is worth it since they are internal
and not particularly relevant to this PR.

Also see <https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/opt-in.2C.20built-in.20traits.20(auto.20traits).20feature.20name>.
2020-11-23 14:14:06 -08:00
Tyson Nottingham 142932ab19 Set unaligned_references lint to deny in rustc_data_structures
To detect misuse of private packed field in `PackedFingerprint`.
2020-11-20 01:13:15 -08:00
Bastian Kauschke 2bf93bd852 compiler: fold by value 2020-11-16 22:34:57 +01:00
Jonas Schievink ae1916b3b4 Rollup merge of #79058 - dtolnay:likelymacro, r=Mark-Simulacrum
Move likely/unlikely argument outside of invisible unsafe block

The previous `likely!`/`unlikely!` macros were unsound because it permits the caller's expr to contain arbitrary unsafe code.

```rust
pub fn huh() -> bool {
    likely!(std::ptr::read(&() as *const () as *const bool))
}
```

**Before:** compiles cleanly.
**After:**

```console
error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
   |
70 |     likely!(std::ptr::read(&() as *const () as *const bool))
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
   |
   = note: consult the function's documentation for information on how to avoid undefined behavior
```
2020-11-15 13:40:03 +01:00
David Tolnay afb817054c Move likely/unlikely argument outside of invisible unsafe block
The previous `likely!`/`unlikely!` macros were unsound because it
permits the caller's expr to contain arbitrary unsafe code.

    pub fn huh() -> bool {
        likely!(std::ptr::read(&() as *const () as *const bool))
    }

Before: compiles cleanly.
After:

    error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
       |
    70 |     likely!(std::ptr::read(&() as *const () as *const bool))
       |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
       |
       = note: consult the function's documentation for information on how to avoid undefined behavior
2020-11-14 14:03:57 -08:00