Commit Graph

593 Commits

Author SHA1 Message Date
Aleksey Kladov e63fe150bf move unescape module to rustc_lexer 2019-07-21 16:46:11 +03:00
Mazdak Farrokhzad 952ee77871 Rollup merge of #62329 - matklad:no-peeking, r=petrochenkov
Remove support for 1-token lookahead from the lexer

`StringReader` maintained `peek_token` and `peek_span_src_raw` for look ahead.

`peek_token` was used only by rustdoc syntax coloring. After moving peeking logic into highlighter, I was able to remove `peek_token` from the lexer. I tried to use `iter::Peekable`, but that wasn't as pretty as I hoped, due to buffered fatal errors. So I went with hand-rolled peeking.

After that I've noticed that the only peeking behavior left was for raw tokens to test tt jointness. I've rewritten it in terms of trivia tokens, and not just spans.

After that it became possible to simplify the awkward constructor of the lexer, which could return `Err` if the first peeked token contained error.
2019-07-06 02:38:01 +02:00
Aleksey Kladov 601bad86b2 cleanup lexer constructors 2019-07-04 09:08:45 +03:00
Mazdak Farrokhzad bb7fbb99a2 Add separate 'async_closure' feature gate. 2019-07-03 23:59:36 +02:00
Mazdak Farrokhzad eb4f54a58d let_chains: Move feature gating to pre-expansion. 2019-06-23 01:29:29 +02:00
bors 3f511ade5b Auto merge of #60669 - c410-f3r:attrs-fn, r=petrochenkov
Allow attributes in formal function parameters

Implements https://github.com/rust-lang/rust/issues/60406.

This is my first contribution to the compiler and since this is a large and complex project, I am not fully aware of the consequences of the changes I have made.

**TODO**

- [x] Forbid some built-in attributes.
- [x] Expand cfg/cfg_attr
2019-06-12 07:38:01 +00:00
Mazdak Farrokhzad 9f22708ced Rollup merge of #61654 - Electron-libre:use_slice_patterns_in_rustc, r=oli-obk,Centril
use pattern matching for slices destructuring

refs #61542

Use slices pattern where it seems to make sense .
2019-06-12 04:22:50 +02:00
Caio 1eaaf440d5 Allow attributes in formal function parameters 2019-06-09 07:58:40 -03:00
Vadim Petrochenkov 25b05147b3 syntax: Remove Deref impl from Token 2019-06-08 22:38:23 +03:00
Cedric 0a4504d400 fix libsyntax test 2019-06-08 20:43:24 +02:00
Cedric 4c242a948c cast vec to slices 2019-06-08 16:21:15 +02:00
Cedric dd442a1fcf use default binding mode in match clauses 2019-06-08 13:29:43 +02:00
Cedric 4123b5d796 fix bad style for structs 2019-06-08 12:18:13 +02:00
Cedric ad91a8e59a improve style 2019-06-08 11:38:15 +02:00
Cedric 5fb099dc78 use pattern matching for slices destructuring 2019-06-08 10:49:46 +02:00
Vadim Petrochenkov 3da094319c parser: self.span -> self.token.span 2019-06-07 13:51:23 +03:00
Vadim Petrochenkov ff40e37b98 Some code cleanup and tidy/test fixes 2019-06-06 14:04:02 +03:00
Vadim Petrochenkov 67ce3f4589 syntax: Switch function parameter order in TokenTree::token 2019-06-06 14:04:02 +03:00
Vadim Petrochenkov f745e5f9b6 syntax: Remove duplicate span from token::Ident 2019-06-06 14:04:02 +03:00
Vadim Petrochenkov aa6fba98ae syntax: Use Token in Parser 2019-06-06 14:04:02 +03:00
Vadim Petrochenkov e0127dbf81 syntax: Use Token in TokenTree::Token 2019-06-06 14:03:15 +03:00
Vadim Petrochenkov 99b27d749c syntax: Rename Token into TokenKind 2019-06-06 14:03:14 +03:00
Vadim Petrochenkov eac3846b65 Always use token kinds through token module rather than Token type 2019-06-06 14:01:57 +03:00
Esteban Küber a2f853a691 Fix rebase 2019-05-24 11:50:21 -07:00
Esteban Küber 5c5fa775e5 review comments 2019-05-24 11:50:21 -07:00
Esteban Küber 24160171e4 Tweak macro parse errors when reaching EOF during macro call parse
- Add detail on origin of current parser when reaching EOF and stop
  saying "found <eof>" and point at the end of macro calls
- Handle empty `cfg_attr` attribute
- Reword empty `derive` attribute error
2019-05-24 11:49:33 -07:00
Mazdak Farrokhzad 3c9ac30dd9 Rollup merge of #60995 - topecongiro:parser-from-stream-and-base-dir, r=michaelwoerister
Add stream_to_parser_with_base_dir

This PR adds `stream_to_parser_with_base_dir`, which creates a parser from a token stream and a base directory.

Context: I would like to parse `cfg_if!` macro and get a list of modules defined inside it from rustfmt so that rustfmt can format those modules (cc https://github.com/rust-lang/rustfmt/issues/3253). To do so, I need to create a parser from `TokenStream` and set the directory of `Parser` to the same directory as the parent directory of a file which contains `cfg_if!` invocation. AFAIK there is no way to achieve this, and hence this PR.

Alternatively, I could change the visibility of `Parser.directory` from `crate` to `pub` so that the value can be modified after initializing a parser. I don't have a preference over either approach (or others, as long as it works).
2019-05-22 03:47:38 +02:00
John Kåre Alsaker a1f2dceaeb Move edition outside the hygiene lock and avoid accessing it 2019-05-21 18:17:05 +02:00
topecongiro 1f1a9176e7 Fix tidy: remove a trailing whitespace 2019-05-21 23:17:59 +09:00
topecongiro b07dbe1d44 Add doc comment 2019-05-21 22:57:34 +09:00
topecongiro e186d3f3e0 Add stream_to_parser_with_base_dir 2019-05-21 13:18:20 +09:00
bors 024c25dc79 Auto merge of #60763 - matklad:tt-parser, r=petrochenkov
Move token tree related lexer state to a separate struct

Just a types-based refactoring.

We only used a bunch of fields when tokenizing into a token tree, so let's move them out of the base lexer
2019-05-16 04:15:12 +00:00
Aleksey Kladov b91e0a3786 move span and token to tt reader 2019-05-13 14:54:34 +03:00
Aleksey Kladov d29f0d23c3 Move token tree related lexer state to a separate struct
We only used a bunch of fields when tokenizing into a token tree,
so let's move them out of the base lexer
2019-05-13 14:54:34 +03:00
Nicholas Nethercote 999c1fc281 Remove the equality operation between Symbol and strings.
And also the equality between `Path` and strings, because `Path` is made
up of `Symbol`s.
2019-05-13 09:31:30 +10:00
Nicholas Nethercote fb084a48e2 Pass a Symbol to check_name, emit_feature_err, and related functions. 2019-05-13 09:29:22 +10:00
Vadim Petrochenkov 3f064cae3d Move literal parsing code into a separate file
Remove some dead code
2019-05-11 16:03:16 +03:00
Vadim Petrochenkov 8739668438 Simplify conversions between tokens and semantic literals 2019-05-11 14:24:21 +03:00
Vadim Petrochenkov f2834a403a Keep the original token in ast::Lit 2019-05-11 14:24:21 +03:00
Mazdak Farrokhzad 39edc68c69 Rollup merge of #60188 - estebank:recover-block, r=varkor
Identify when a stmt could have been parsed as an expr

There are some expressions that can be parsed as a statement without
a trailing semicolon depending on the context, which can lead to
confusing errors due to the same looking code being accepted in some
places and not others. Identify these cases and suggest enclosing in
parenthesis making the parse non-ambiguous without changing the
accepted grammar.

Fix #54186, cc #54482, fix #59975, fix #47287.
2019-05-09 23:56:09 +02:00
Esteban Küber 54430ad53a review comments: fix typo and add comments 2019-05-06 16:00:21 -07:00
bors 46d0ca00ad Auto merge of #60261 - matklad:one-escape, r=petrochenkov
introduce unescape module

A WIP PR to gauge early feedback

Currently, we deal with escape sequences twice: once when we [lex](https://github.com/rust-lang/rust/blob/112f7e9ac564e2cfcfc13d599c8376a219fde1bc/src/libsyntax/parse/lexer/mod.rs#L928-L1065) a string, and a second time when we [unescape](https://github.com/rust-lang/rust/blob/112f7e9ac564e2cfcfc13d599c8376a219fde1bc/src/libsyntax/parse/mod.rs#L313-L366) literals. Note that we also produce different sets of diagnostics in these two cases.

This PR aims to remove this duplication, by introducing a new `unescape` module as a single source of truth for character escaping rules.

I think this would be a useful cleanup by itself, but I also need this for https://github.com/rust-lang/rust/pull/59706.

In the current state, the PR has `unescape` module which fully (modulo bugs) deals with string and char literals. I am quite happy about the state of this module

What this PR doesn't have yet are:
* [x] handling of byte and byte string literals (should be simple to add)
* [x] good diagnostics
* [x] actual removal of code from lexer (giant `scan_char_or_byte` should go away completely)
* [x] performance check
* [x] general cleanup of the new code

Diagnostics will be the most labor-consuming bit here, but they are mostly a question of just correctly adjusting spans to sub-tokens. The current setup for diagnostics is that `unescape` produces a plain old `enum` with various problems, and they are rendered into `Handler` separately. This bit is not actually required (it is possible to just pass the `Handler` in), but I like the separation between diagnostics and logic this approach imposes, and such separation should again be useful for #59706

cc @eddyb , @petrochenkov
2019-05-06 00:16:16 +00:00
Esteban Küber f6a4b5270a Deduplicate needed parentheses suggestion code 2019-05-02 16:13:28 -07:00
Aleksey Kladov 1835cbeb65 don't amplify errors in format! with bad literals 2019-05-02 21:01:02 +03:00
Aleksey Kladov bfa5f27847 introduce unescape module
Currently, we deal with escape sequences twice: once when we lex a
string, and a second time when we unescape literals. This PR aims to
remove this duplication, by introducing a new `unescape` mode as a
single source of truth for character escaping rules
2019-05-02 15:31:57 +03:00
Mazdak Farrokhzad 01ce87ad14 Rollup merge of #60348 - agnxy:refactor-parser, r=petrochenkov
move some functions from parser.rs to diagostics.rs

Starting with a few functions mentioned in https://github.com/rust-lang/rust/issues/60015#issuecomment-484259773. We might refactor parser.rs further in subsequent changes.
r? @petrochenkov
2019-05-02 01:09:25 +02:00
Andrew Xu d3fff6cda7 move some functions from parser.rs to diagostics.rs
parser.rs is too big. Some functions only for error reporting and error
recovery are being moved to diagostics.rs.
2019-05-01 19:54:48 +08:00
Esteban Küber f007e6f442 Identify when a stmt could have been parsed as an expr
There are some expressions that can be parsed as a statement without
a trailing semicolon depending on the context, which can lead to
confusing errors due to the same looking code being accepted in some
places and not others. Identify these cases and suggest enclosing in
parenthesis making the parse non-ambiguous without changing the
accepted grammar.
2019-04-29 14:07:02 -07:00
Aleksey Kladov b83ea7f917 reduce visibility 2019-04-23 00:51:38 +03:00
topecongiro 2309625941 Expose new_sub_parser_from_file
This function is useful when external tools like rustfmt want to parse
internal files without parsing a whole crate.
2019-03-09 21:59:54 +09:00