Backport TRPL, reference, and grammar.

Rather than port each individual change to these files, for the release,
I just waited to do it all at the end, in this commit. Since individual
comits made it to master, everyone should get proper credit in the
main tree, and AUTHORS includes those whose changes are here.
This commit is contained in:
Steve Klabnik
2015-05-13 02:48:36 -04:00
parent 6e2106d376
commit 665aed059a
62 changed files with 6715 additions and 2137 deletions
+117 -93
View File
@@ -5,8 +5,7 @@
This document is the primary reference for the Rust programming language grammar. It
provides only one kind of material:
- Chapters that formally define the language grammar and, for each
construct.
- Chapters that formally define the language grammar.
This document does not serve as an introduction to the language. Background
familiarity with the language is assumed. A separate [guide] is available to
@@ -97,12 +96,16 @@ explicit codepoint lists. [^inputformat]
## Special Unicode Productions
The following productions in the Rust grammar are defined in terms of Unicode
properties: `ident`, `non_null`, `non_star`, `non_eol`, `non_slash_or_star`,
`non_single_quote` and `non_double_quote`.
properties: `ident`, `non_null`, `non_eol`, `non_single_quote` and
`non_double_quote`.
### Identifiers
The `ident` production is any nonempty Unicode string of the following form:
The `ident` production is any nonempty Unicode[^non_ascii_idents] string of
the following form:
[^non_ascii_idents]: Non-ASCII characters in identifiers are currently feature
gated. This is expected to improve soon.
- The first character has property `XID_start`
- The remaining characters have property `XID_continue`
@@ -119,8 +122,6 @@ Some productions are defined by exclusion of particular Unicode characters:
- `non_null` is any single Unicode character aside from `U+0000` (null)
- `non_eol` is `non_null` restricted to exclude `U+000A` (`'\n'`)
- `non_star` is `non_null` restricted to exclude `U+002A` (`*`)
- `non_slash_or_star` is `non_null` restricted to exclude `U+002F` (`/`) and `U+002A` (`*`)
- `non_single_quote` is `non_null` restricted to exclude `U+0027` (`'`)
- `non_double_quote` is `non_null` restricted to exclude `U+0022` (`"`)
@@ -153,19 +154,19 @@ token : simple_token | ident | literal | symbol | whitespace token ;
<p id="keyword-table-marker"></p>
| | | | | |
|----------|----------|----------|----------|--------|
| abstract | alignof | as | become | box |
| break | const | continue | crate | do |
| else | enum | extern | false | final |
| fn | for | if | impl | in |
| let | loop | match | mod | move |
| mut | offsetof | once | override | priv |
| proc | pub | pure | ref | return |
| sizeof | static | self | struct | super |
| true | trait | type | typeof | unsafe |
| unsized | use | virtual | where | while |
| yield | | | | |
| | | | | |
|----------|----------|----------|----------|---------|
| abstract | alignof | as | become | box |
| break | const | continue | crate | do |
| else | enum | extern | false | final |
| fn | for | if | impl | in |
| let | loop | macro | match | mod |
| move | mut | offsetof | override | priv |
| proc | pub | pure | ref | return |
| Self | self | sizeof | static | struct |
| super | trait | true | type | typeof |
| unsafe | unsized | use | virtual | where |
| while | yield | | | |
Each of these keywords has special meaning in its grammar, and all of them are
@@ -175,9 +176,15 @@ excluded from the `ident` rule.
```antlr
lit_suffix : ident;
literal : [ string_lit | char_lit | byte_string_lit | byte_lit | num_lit ] lit_suffix ?;
literal : [ string_lit | char_lit | byte_string_lit | byte_lit | num_lit | bool_lit ] lit_suffix ?;
```
The optional `lit_suffix` production is only used for certain numeric literals,
but is reserved for future extension. That is, the above gives the lexical
grammar, but a Rust parser will reject everything but the 12 special cases
mentioned in [Number literals](reference.html#number-literals) in the
reference.
#### Character and string literals
```antlr
@@ -237,14 +244,16 @@ dec_lit : [ dec_digit | '_' ] + ;
#### Boolean literals
**FIXME:** write grammar
```antlr
bool_lit : [ "true" | "false" ] ;
```
The two values of the boolean type are written `true` and `false`.
### Symbols
```antlr
symbol : "::" "->"
symbol : "::" | "->"
| '#' | '[' | ']' | '(' | ')' | '{' | '}'
| ',' | ';' ;
```
@@ -295,8 +304,8 @@ transcriber : '(' transcriber * ')' | '[' transcriber * ']'
## Items
```antlr
item : mod_item | fn_item | type_item | struct_item | enum_item
| static_item | trait_item | impl_item | extern_block ;
item : vis ? mod_item | fn_item | type_item | struct_item | enum_item
| const_item | static_item | trait_item | impl_item | extern_block ;
```
### Type Parameters
@@ -313,27 +322,27 @@ mod : [ view_item | item ] * ;
#### View items
```antlr
view_item : extern_crate_decl | use_decl ;
view_item : extern_crate_decl | use_decl ';' ;
```
##### Extern crate declarations
```antlr
extern_crate_decl : "extern" "crate" crate_name
crate_name: ident | ( string_lit as ident )
crate_name: ident | ( ident "as" ident )
```
##### Use declarations
```antlr
use_decl : "pub" ? "use" [ path "as" ident
| path_glob ] ;
use_decl : vis ? "use" [ path "as" ident
| path_glob ] ;
path_glob : ident [ "::" [ path_glob
| '*' ] ] ?
| '{' path_item [ ',' path_item ] * '}' ;
path_item : ident | "mod" ;
path_item : ident | "self" ;
```
### Functions
@@ -368,6 +377,10 @@ path_item : ident | "mod" ;
**FIXME:** grammar?
### Enumerations
**FIXME:** grammar?
### Constant items
```antlr
@@ -401,16 +414,17 @@ extern_block : [ foreign_fn ] * ;
## Visibility and Privacy
**FIXME:** grammar?
```antlr
vis : "pub" ;
```
### Re-exporting and Visibility
**FIXME:** grammar?
See [Use declarations](#use-declarations).
## Attributes
```antlr
attribute : "#!" ? '[' meta_item ']' ;
attribute : '#' '!' ? '[' meta_item ']' ;
meta_item : ident [ '=' literal
| '(' meta_seq ')' ] ? ;
meta_seq : meta_item [ ',' meta_seq ] ? ;
@@ -420,28 +434,21 @@ meta_seq : meta_item [ ',' meta_seq ] ? ;
## Statements
**FIXME:** grammar?
```antlr
stmt : decl_stmt | expr_stmt ;
```
### Declaration statements
**FIXME:** grammar?
A _declaration statement_ is one that introduces one or more *names* into the
enclosing statement block. The declared names may denote new slots or new
items.
```antlr
decl_stmt : item | let_decl ;
```
#### Item declarations
**FIXME:** grammar?
See [Items](#items).
An _item declaration statement_ has a syntactic form identical to an
[item](#items) declaration within a module. Declaring an item &mdash; a
function, enumeration, structure, type, static, trait, implementation or module
&mdash; locally within a statement block is simply a way of restricting its
scope to a narrow region containing all of its uses; it is otherwise identical
in meaning to declaring the item outside the statement block.
#### Slot declarations
#### Variable declarations
```antlr
let_decl : "let" pat [':' type ] ? [ init ] ? ';' ;
@@ -450,11 +457,21 @@ init : [ '=' ] expr ;
### Expression statements
**FIXME:** grammar?
```antlr
expr_stmt : expr ';' ;
```
## Expressions
**FIXME:** grammar?
```antlr
expr : literal | path | tuple_expr | unit_expr | struct_expr
| block_expr | method_call_expr | field_expr | array_expr
| idx_expr | range_expr | unop_expr | binop_expr
| paren_expr | call_expr | lambda_expr | while_expr
| loop_expr | break_expr | continue_expr | for_expr
| if_expr | match_expr | if_let_expr | while_let_expr
| return_expr ;
```
#### Lvalues, rvalues and temporaries
@@ -466,19 +483,23 @@ init : [ '=' ] expr ;
### Literal expressions
**FIXME:** grammar?
See [Literals](#literals).
### Path expressions
**FIXME:** grammar?
See [Paths](#paths).
### Tuple expressions
**FIXME:** grammar?
```antlr
tuple_expr : '(' [ expr [ ',' expr ] * | expr ',' ] ? ')' ;
```
### Unit expressions
**FIXME:** grammar?
```antlr
unit_expr : "()" ;
```
### Structure expressions
@@ -494,8 +515,7 @@ struct_expr : expr_path '{' ident ':' expr
### Block expressions
```antlr
block_expr : '{' [ view_item ] *
[ stmt ';' | item ] *
block_expr : '{' [ stmt ';' | item ] *
[ expr ] '}' ;
```
@@ -516,7 +536,7 @@ field_expr : expr '.' ident ;
```antlr
array_expr : '[' "mut" ? array_elems? ']' ;
array_elems : [expr [',' expr]*] | [expr ',' ".." expr] ;
array_elems : [expr [',' expr]*] | [expr ';' expr] ;
```
### Index expressions
@@ -525,68 +545,72 @@ array_elems : [expr [',' expr]*] | [expr ',' ".." expr] ;
idx_expr : expr '[' expr ']' ;
```
### Range expressions
```antlr
range_expr : expr ".." expr |
expr ".." |
".." expr |
".." ;
```
### Unary operator expressions
**FIXME:** grammar?
```antlr
unop_expr : unop expr ;
unop : '-' | '*' | '!' ;
```
### Binary operator expressions
```antlr
binop_expr : expr binop expr ;
binop_expr : expr binop expr | type_cast_expr
| assignment_expr | compound_assignment_expr ;
binop : arith_op | bitwise_op | lazy_bool_op | comp_op
```
#### Arithmetic operators
**FIXME:** grammar?
```antlr
arith_op : '+' | '-' | '*' | '/' | '%' ;
```
#### Bitwise operators
**FIXME:** grammar?
```antlr
bitwise_op : '&' | '|' | '^' | "<<" | ">>" ;
```
#### Lazy boolean operators
**FIXME:** grammar?
```antlr
lazy_bool_op : "&&" | "||" ;
```
#### Comparison operators
**FIXME:** grammar?
```antlr
comp_op : "==" | "!=" | '<' | '>' | "<=" | ">=" ;
```
#### Type cast expressions
**FIXME:** grammar?
```antlr
type_cast_expr : value "as" type ;
```
#### Assignment expressions
**FIXME:** grammar?
```antlr
assignment_expr : expr '=' expr ;
```
#### Compound assignment expressions
**FIXME:** grammar?
#### Operator precedence
The precedence of Rust binary operators is ordered as follows, going from
strong to weak:
```text
* / %
as
+ -
<< >>
&
^
|
< > <= >=
== !=
&&
||
=
```antlr
compound_assignment_expr : expr [ arith_op | bitwise_op ] '=' expr ;
```
Operators at the same precedence level are evaluated left-to-right. [Unary
operators](#unary-operator-expressions) have the same precedence level and it
is stronger than any of the binary operators'.
### Grouped expressions
```antlr
@@ -611,7 +635,7 @@ lambda_expr : '|' ident_list '|' expr ;
### While loops
```antlr
while_expr : "while" no_struct_literal_expr '{' block '}' ;
while_expr : [ lifetime ':' ] "while" no_struct_literal_expr '{' block '}' ;
```
### Infinite loops
@@ -635,7 +659,7 @@ continue_expr : "continue" [ lifetime ];
### For expressions
```antlr
for_expr : "for" pat "in" no_struct_literal_expr '{' block '}' ;
for_expr : [ lifetime ':' ] "for" pat "in" no_struct_literal_expr '{' block '}' ;
```
### If expressions
@@ -763,7 +787,7 @@ bound := path | lifetime
### Memory ownership
### Memory slots
### Variables
### Boxes
+518 -822
View File
@@ -29,60 +29,29 @@ You may also be interested in the [grammar].
# Notation
Rust's grammar is defined over Unicode codepoints, each conventionally denoted
`U+XXXX`, for 4 or more hexadecimal digits `X`. _Most_ of Rust's grammar is
confined to the ASCII range of Unicode, and is described in this document by a
dialect of Extended Backus-Naur Form (EBNF), specifically a dialect of EBNF
supported by common automated LL(k) parsing tools such as `llgen`, rather than
the dialect given in ISO 14977. The dialect can be defined self-referentially
as follows:
```{.ebnf .notation}
grammar : rule + ;
rule : nonterminal ':' productionrule ';' ;
productionrule : production [ '|' production ] * ;
production : term * ;
term : element repeats ;
element : LITERAL | IDENTIFIER | '[' productionrule ']' ;
repeats : [ '*' | '+' ] NUMBER ? | NUMBER ? | '?' ;
```
Where:
- Whitespace in the grammar is ignored.
- Square brackets are used to group rules.
- `LITERAL` is a single printable ASCII character, or an escaped hexadecimal
ASCII code of the form `\xQQ`, in single quotes, denoting the corresponding
Unicode codepoint `U+00QQ`.
- `IDENTIFIER` is a nonempty string of ASCII letters and underscores.
- The `repeat` forms apply to the adjacent `element`, and are as follows:
- `?` means zero or one repetition
- `*` means zero or more repetitions
- `+` means one or more repetitions
- NUMBER trailing a repeat symbol gives a maximum repetition count
- NUMBER on its own gives an exact repetition count
This EBNF dialect should hopefully be familiar to many readers.
## Unicode productions
A few productions in Rust's grammar permit Unicode codepoints outside the ASCII
range. We define these productions in terms of character properties specified
in the Unicode standard, rather than in terms of ASCII-range codepoints. The
section [Special Unicode Productions](#special-unicode-productions) lists these
productions.
A few productions in Rust's grammar permit Unicode code points outside the
ASCII range. We define these productions in terms of character properties
specified in the Unicode standard, rather than in terms of ASCII-range code
points. The grammar has a [Special Unicode Productions][unicodeproductions]
section that lists these productions.
[unicodeproductions]: grammar.html#special-unicode-productions
## String table productions
Some rules in the grammar &mdash; notably [unary
operators](#unary-operator-expressions), [binary
operators](#binary-operator-expressions), and [keywords](#keywords) &mdash; are
operators](#binary-operator-expressions), and [keywords][keywords] &mdash; are
given in a simplified form: as a listing of a table of unquoted, printable
whitespace-separated strings. These cases form a subset of the rules regarding
the [token](#tokens) rule, and are assumed to be the result of a
lexical-analysis phase feeding the parser, driven by a DFA, operating over the
disjunction of all such string table entries.
[keywords]: grammar.html#keywords
When such a string enclosed in double-quotes (`"`) occurs inside the grammar,
it is an implicit reference to a single member of such a string table
production. See [tokens](#tokens) for more information.
@@ -91,80 +60,59 @@ production. See [tokens](#tokens) for more information.
## Input format
Rust input is interpreted as a sequence of Unicode codepoints encoded in UTF-8.
Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8.
Most Rust grammar rules are defined in terms of printable ASCII-range
codepoints, but a small number are defined in terms of Unicode properties or
explicit codepoint lists. [^inputformat]
code points, but a small number are defined in terms of Unicode properties or
explicit code point lists. [^inputformat]
[^inputformat]: Substitute definitions for the special Unicode productions are
provided to the grammar verifier, restricted to ASCII range, when verifying the
grammar in this document.
## Special Unicode Productions
## Identifiers
The following productions in the Rust grammar are defined in terms of Unicode
properties: `ident`, `non_null`, `non_star`, `non_eol`, `non_slash_or_star`,
`non_single_quote` and `non_double_quote`.
An identifier is any nonempty Unicode[^non_ascii_idents] string of the following form:
### Identifiers
The `ident` production is any nonempty Unicode string of the following form:
[^non_ascii_idents]: Non-ASCII characters in identifiers are currently feature
gated. This is expected to improve soon.
- The first character has property `XID_start`
- The remaining characters have property `XID_continue`
that does _not_ occur in the set of [keywords](#keywords).
that does _not_ occur in the set of [keywords][keywords].
> **Note**: `XID_start` and `XID_continue` as character properties cover the
> character ranges used to form the more familiar C and Java language-family
> identifiers.
### Delimiter-restricted productions
Some productions are defined by exclusion of particular Unicode characters:
- `non_null` is any single Unicode character aside from `U+0000` (null)
- `non_eol` is `non_null` restricted to exclude `U+000A` (`'\n'`)
- `non_star` is `non_null` restricted to exclude `U+002A` (`*`)
- `non_slash_or_star` is `non_null` restricted to exclude `U+002F` (`/`) and `U+002A` (`*`)
- `non_single_quote` is `non_null` restricted to exclude `U+0027` (`'`)
- `non_double_quote` is `non_null` restricted to exclude `U+0022` (`"`)
## Comments
```{.ebnf .gram}
comment : block_comment | line_comment ;
block_comment : "/*" block_comment_body * "*/" ;
block_comment_body : [block_comment | character] * ;
line_comment : "//" non_eol * ;
```
Comments in Rust code follow the general C++ style of line and block-comment
forms. Nested block comments are supported.
Comments in Rust code follow the general C++ style of line (`//`) and
block (`/* ... */`) comment forms. Nested block comments are supported.
Line comments beginning with exactly _three_ slashes (`///`), and block
comments beginning with exactly one repeated asterisk in the block-open
sequence (`/**`), are interpreted as a special syntax for `doc`
[attributes](#attributes). That is, they are equivalent to writing
`#[doc="..."]` around the body of the comment (this includes the comment
characters themselves, ie `/// Foo` turns into `#[doc="/// Foo"]`).
`#[doc="..."]` around the body of the comment, i.e., `/// Foo` turns into
`#[doc="Foo"]`.
`//!` comments apply to the parent of the comment, rather than the item that
follows. `//!` comments are usually used to display information on the crate
index page.
Line comments beginning with `//!` and block comments beginning with `/*!` are
doc comments that apply to the parent of the comment, rather than the item
that follows. That is, they are equivalent to writing `#![doc="..."]` around
the body of the comment. `//!` comments are usually used to document
modules that occupy a source file.
Non-doc comments are interpreted as a form of whitespace.
## Whitespace
```{.ebnf .gram}
whitespace_char : '\x20' | '\x09' | '\x0a' | '\x0d' ;
whitespace : [ whitespace_char | comment ] + ;
```
Whitespace is any non-empty string containing only the following characters:
The `whitespace_char` production is any nonempty Unicode string consisting of
any of the following Unicode characters: `U+0020` (space, `' '`), `U+0009`
(tab, `'\t'`), `U+000A` (LF, `'\n'`), `U+000D` (CR, `'\r'`).
- `U+0020` (space, `' '`)
- `U+0009` (tab, `'\t'`)
- `U+000A` (LF, `'\n'`)
- `U+000D` (CR, `'\r'`)
Rust is a "free-form" language, meaning that all forms of whitespace serve only
to separate _tokens_ in the grammar, and have no semantic significance.
@@ -174,40 +122,11 @@ with any other legal whitespace element, such as a single space character.
## Tokens
```{.ebnf .gram}
simple_token : keyword | unop | binop ;
token : simple_token | ident | literal | symbol | whitespace token ;
```
Tokens are primitive productions in the grammar defined by regular
(non-recursive) languages. "Simple" tokens are given in [string table
production](#string-table-productions) form, and occur in the rest of the
grammar as double-quoted strings. Other tokens have exact rules given.
### Keywords
<p id="keyword-table-marker"></p>
| | | | | |
|----------|----------|----------|----------|---------|
| abstract | alignof | as | become | box |
| break | const | continue | crate | do |
| else | enum | extern | false | final |
| fn | for | if | impl | in |
| let | loop | macro | match | mod |
| move | mut | offsetof | override | priv |
| pub | pure | ref | return | sizeof |
| static | self | struct | super | true |
| trait | type | typeof | unsafe | unsized |
| use | virtual | where | while | yield |
Each of these keywords has special meaning in its grammar, and all of them are
excluded from the `ident` rule.
Note that some of these keywords are reserved, and do not currently do
anything.
### Literals
A literal is an expression consisting of a single token, rather than a sequence
@@ -215,16 +134,6 @@ of tokens, that immediately and directly denotes the value it evaluates to,
rather than referring to it by name or some other evaluation rule. A literal is
a form of constant expression, so is evaluated (primarily) at compile time.
```{.ebnf .gram}
lit_suffix : ident;
literal : [ string_lit | char_lit | byte_string_lit | byte_lit | num_lit ] lit_suffix ?;
```
The optional suffix is only used for certain numeric literals, but is
reserved for future extension, that is, the above gives the lexical
grammar, but a Rust parser will reject everything but the 12 special
cases mentioned in [Number literals](#number-literals) below.
#### Examples
##### Characters and strings
@@ -268,36 +177,10 @@ cases mentioned in [Number literals](#number-literals) below.
##### Suffixes
| Integer | Floating-point |
|---------|----------------|
| `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `is` (`isize`), `us` (`usize`) | `f32`, `f64` |
| `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `isize`, `usize` | `f32`, `f64` |
#### Character and string literals
```{.ebnf .gram}
char_lit : '\x27' char_body '\x27' ;
string_lit : '"' string_body * '"' | 'r' raw_string ;
char_body : non_single_quote
| '\x5c' [ '\x27' | common_escape | unicode_escape ] ;
string_body : non_double_quote
| '\x5c' [ '\x22' | common_escape | unicode_escape ] ;
raw_string : '"' raw_string_body '"' | '#' raw_string '#' ;
common_escape : '\x5c'
| 'n' | 'r' | 't' | '0'
| 'x' hex_digit 2
unicode_escape : 'u' '{' hex_digit+ 6 '}';
hex_digit : 'a' | 'b' | 'c' | 'd' | 'e' | 'f'
| 'A' | 'B' | 'C' | 'D' | 'E' | 'F'
| dec_digit ;
oct_digit : '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' ;
dec_digit : '0' | nonzero_dec ;
nonzero_dec: '1' | '2' | '3' | '4'
| '5' | '6' | '7' | '8' | '9' ;
```
##### Character literals
A _character literal_ is a single Unicode character enclosed within two
@@ -308,13 +191,13 @@ which must be _escaped_ by a preceding `U+005C` character (`\`).
A _string literal_ is a sequence of any Unicode characters enclosed within two
`U+0022` (double-quote) characters, with the exception of `U+0022` itself,
which must be _escaped_ by a preceding `U+005C` character (`\`), or a _raw
string literal_.
which must be _escaped_ by a preceding `U+005C` character (`\`).
A multi-line string literal may be defined by terminating each line with a
`U+005C` character (`\`) immediately before the newline. This causes the
`U+005C` character, the newline, and all whitespace at the beginning of the
next line to be ignored.
Line-break characters are allowed in string literals. Normally they represent
themselves (i.e. no translation), but as a special exception, when a `U+005C`
character (`\`) occurs immediately before the newline, the `U+005C` character,
the newline, and all whitespace at the beginning of the next line are ignored.
Thus `a` and `b` are equal:
```rust
let a = "foobar";
@@ -330,14 +213,14 @@ Some additional _escapes_ are available in either character or non-raw string
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
following forms:
* An _8-bit codepoint escape_ escape starts with `U+0078` (`x`) and is
followed by exactly two _hex digits_. It denotes the Unicode codepoint
* An _8-bit code point escape_ starts with `U+0078` (`x`) and is
followed by exactly two _hex digits_. It denotes the Unicode code point
equal to the provided hex value.
* A _24-bit codepoint escape_ starts with `U+0075` (`u`) and is followed
* A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed
by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D`
(`}`). It denotes the Unicode codepoint equal to the provided hex value.
(`}`). It denotes the Unicode code point equal to the provided hex value.
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
(`r`), or `U+0074` (`t`), denoting the unicode values `U+000A` (LF),
(`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF),
`U+000D` (CR) or `U+0009` (HT) respectively.
* The _backslash escape_ is the character `U+005C` (`\`) which must be
escaped in order to denote *itself*.
@@ -346,11 +229,10 @@ following forms:
Raw string literals do not process any escapes. They start with the character
`U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a
`U+0022` (double-quote) character. The _raw string body_ is not defined in the
EBNF grammar above: it can contain any sequence of Unicode characters and is
terminated only by another `U+0022` (double-quote) character, followed by the
same number of `U+0023` (`#`) characters that preceded the opening `U+0022`
(double-quote) character.
`U+0022` (double-quote) character. The _raw string body_ can contain any sequence
of Unicode characters and is terminated only by another `U+0022` (double-quote)
character, followed by the same number of `U+0023` (`#`) characters that preceded
the opening `U+0022` (double-quote) character.
All Unicode characters contained in the raw string body represent themselves,
the characters `U+0022` (double-quote) (except when followed by at least as
@@ -372,26 +254,14 @@ r##"foo #"# bar"##; // foo #"# bar
#### Byte and byte string literals
```{.ebnf .gram}
byte_lit : "b\x27" byte_body '\x27' ;
byte_string_lit : "b\x22" string_body * '\x22' | "br" raw_byte_string ;
byte_body : ascii_non_single_quote
| '\x5c' [ '\x27' | common_escape ] ;
byte_string_body : ascii_non_double_quote
| '\x5c' [ '\x22' | common_escape ] ;
raw_byte_string : '"' raw_byte_string_body '"' | '#' raw_byte_string '#' ;
```
##### Byte literals
A _byte literal_ is a single ASCII character (in the `U+0000` to `U+007F`
range) enclosed within two `U+0027` (single-quote) characters, with the
exception of `U+0027` itself, which must be _escaped_ by a preceding U+005C
character (`\`), or a single _escape_. It is equivalent to a `u8` unsigned
8-bit integer _number literal_.
range) or a single _escape_ preceded by the characters `U+0062` (`b`) and
`U+0027` (single-quote), and followed by the character `U+0027`. If the character
`U+0027` is present within the literal, it must be _escaped_ by a preceding
`U+005C` (`\`) character. It is equivalent to a `u8` unsigned 8-bit integer
_number literal_.
##### Byte string literals
@@ -400,14 +270,14 @@ preceded by the characters `U+0062` (`b`) and `U+0022` (double-quote), and
followed by the character `U+0022`. If the character `U+0022` is present within
the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
Alternatively, a byte string literal can be a _raw byte string literal_, defined
below. A byte string literal is equivalent to a `&'static [u8]` borrowed array
below. A byte string literal of length `n` is equivalent to a `&'static [u8; n]` borrowed fixed-sized array
of unsigned 8-bit integers.
Some additional _escapes_ are available in either byte or non-raw byte string
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
following forms:
* An _byte escape_ escape starts with `U+0078` (`x`) and is
* A _byte escape_ escape starts with `U+0078` (`x`) and is
followed by exactly two _hex digits_. It denotes the byte
equal to the provided hex value.
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
@@ -421,11 +291,10 @@ following forms:
Raw byte string literals do not process any escapes. They start with the
character `U+0062` (`b`), followed by `U+0072` (`r`), followed by zero or more
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
_raw string body_ is not defined in the EBNF grammar above: it can contain any
sequence of ASCII characters and is terminated only by another `U+0022`
(double-quote) character, followed by the same number of `U+0023` (`#`)
characters that preceded the opening `U+0022` (double-quote) character. A raw
byte string literal can not contain any non-ASCII byte.
_raw string body_ can contain any sequence of ASCII characters and is terminated
only by another `U+0022` (double-quote) character, followed by the same number of
`U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote)
character. A raw byte string literal can not contain any non-ASCII byte.
All characters contained in the raw string body represent their ASCII encoding,
the characters `U+0022` (double-quote) (except when followed by at least as
@@ -447,19 +316,6 @@ b"\\x52"; br"\x52"; // \x52
#### Number literals
```{.ebnf .gram}
num_lit : nonzero_dec [ dec_digit | '_' ] * float_suffix ?
| '0' [ [ dec_digit | '_' ] * float_suffix ?
| 'b' [ '1' | '0' | '_' ] +
| 'o' [ oct_digit | '_' ] +
| 'x' [ hex_digit | '_' ] + ] ;
float_suffix : [ exponent | '.' dec_lit exponent ? ] ? ;
exponent : ['E' | 'e'] ['-' | '+' ] ? dec_lit ;
dec_lit : [ dec_digit | '_' ] + ;
```
A _number literal_ is either an _integer literal_ or a _floating-point
literal_. The grammar for recognizing the two kinds of literals is mixed.
@@ -509,11 +365,19 @@ A _floating-point literal_ has one of two forms:
optionally followed by another decimal literal, with an optional _exponent_.
* A single _decimal literal_ followed by an _exponent_.
By default, a floating-point literal has a generic type, and, like integer
literals, the type must be uniquely determined from the context. There are two valid
Like integer literals, a floating-point literal may be followed by a
suffix, so long as the pre-suffix part does not end with `U+002E` (`.`).
The suffix forcibly sets the type of the literal. There are two valid
_floating-point suffixes_, `f32` and `f64` (the 32-bit and 64-bit floating point
types), which explicitly determine the type of the literal.
The type of an _unsuffixed_ floating-point literal is determined by type
inference. If a floating-point type can be _uniquely_ determined from the
surrounding program context, the unsuffixed floating-point literal has that type.
If the program context underconstrains the type, it defaults to double-precision `f64`;
if the program context overconstrains the type, it is considered a static type
error.
Examples of floating-point literals of various forms:
```
@@ -537,34 +401,18 @@ The two values of the boolean type are written `true` and `false`.
### Symbols
```{.ebnf .gram}
symbol : "::" | "->"
| '#' | '[' | ']' | '(' | ')' | '{' | '}'
| ',' | ';' ;
```
Symbols are a general class of printable [token](#tokens) that play structural
roles in a variety of grammar productions. They are catalogued here for
completeness as the set of remaining miscellaneous printable tokens that do not
otherwise appear as [unary operators](#unary-operator-expressions), [binary
operators](#binary-operator-expressions), or [keywords](#keywords).
operators](#binary-operator-expressions), or [keywords][keywords].
## Paths
```{.ebnf .gram}
expr_path : [ "::" ] ident [ "::" expr_path_tail ] + ;
expr_path_tail : '<' type_expr [ ',' type_expr ] + '>'
| expr_path ;
type_path : ident [ type_path_tail ] + ;
type_path_tail : '<' type_expr [ ',' type_expr ] + '>'
| "::" type_path ;
```
A _path_ is a sequence of one or more path components _logically_ separated by
a namespace qualifier (`::`). If a path consists of only one component, it may
refer to either an [item](#items) or a [slot](#memory-slots) in a local control
refer to either an [item](#items) or a [variable](#variables) in a local control
scope. If a path has multiple components, it refers to an item.
Every item has a _canonical path_ within its crate, but the path naming an item
@@ -578,12 +426,12 @@ x;
x::y::z;
```
Path components are usually [identifiers](#identifiers), but the trailing
component of a path may be an angle-bracket-enclosed list of type arguments. In
[expression](#expressions) context, the type argument list is given after a
final (`::`) namespace qualifier in order to disambiguate it from a relational
expression involving the less-than symbol (`<`). In type expression context,
the final namespace qualifier is omitted.
Path components are usually [identifiers](#identifiers), but they may
also include angle-bracket-enclosed lists of type arguments. In
[expression](#expressions) context, the type argument list is given
after a `::` namespace qualifier in order to disambiguate it from a
relational expression involving the less-than symbol (`<`). In type
expression context, the final namespace qualifier is omitted.
Two examples of paths with type arguments:
@@ -649,27 +497,15 @@ names, and invoked through a consistent syntax: `some_extension!(...)`.
Users of `rustc` can define new syntax extensions in two ways:
* [Compiler plugins][plugin] can include arbitrary
Rust code that manipulates syntax trees at compile time.
* [Compiler plugins][plugin] can include arbitrary Rust code that
manipulates syntax trees at compile time. Note that the interface
for compiler plugins is considered highly unstable.
* [Macros](book/macros.html) define new syntax in a higher-level,
declarative way.
## Macros
```{.ebnf .gram}
expr_macro_rules : "macro_rules" '!' ident '(' macro_rule * ')' ;
macro_rule : '(' matcher * ')' "=>" '(' transcriber * ')' ';' ;
matcher : '(' matcher * ')' | '[' matcher * ']'
| '{' matcher * '}' | '$' ident ':' ident
| '$' '(' matcher * ')' sep_token? [ '*' | '+' ]
| non_special_token ;
transcriber : '(' transcriber * ')' | '[' transcriber * ']'
| '{' transcriber * '}' | '$' ident
| '$' '(' transcriber * ')' sep_token? [ '*' | '+' ]
| non_special_token ;
```
`macro_rules` allows users to define syntax extension in a declarative way. We
call such extensions "macros by example" or simply "macros" — to be distinguished
from the "procedural macros" defined in [compiler plugins][plugin].
@@ -697,9 +533,9 @@ in macro rules). In the transcriber, the designator is already known, and so
only the name of a matched nonterminal comes after the dollar sign.
In both the matcher and transcriber, the Kleene star-like operator indicates
repetition. The Kleene star operator consists of `$` and parens, optionally
repetition. The Kleene star operator consists of `$` and parentheses, optionally
followed by a separator token, followed by `*` or `+`. `*` means zero or more
repetitions, `+` means at least one repetition. The parens are not matched or
repetitions, `+` means at least one repetition. The parentheses are not matched or
transcribed. On the matcher side, a name is bound to _all_ of the names it
matches, in a structure that mimics the structure of the repetition encountered
on a successful match. The job of the transcriber is to sort that structure
@@ -716,7 +552,7 @@ _name_ s that occur in its body. At the "current layer", they all must repeat
the same number of times, so ` ( $( $i:ident ),* ; $( $j:ident ),* ) => ( $(
($i,$j) ),* )` is valid if given the argument `(a,b,c ; d,e,f)`, but not
`(a,b,c ; d,e)`. The repetition walks through the choices at that layer in
lockstep, so the former input transcribes to `( (a,d), (b,e), (c,f) )`.
lockstep, so the former input transcribes to `(a,d), (b,e), (c,f)`.
Nested repetitions are allowed.
@@ -725,27 +561,40 @@ Nested repetitions are allowed.
The parser used by the macro system is reasonably powerful, but the parsing of
Rust syntax is restricted in two ways:
1. The parser will always parse as much as possible. If it attempts to match
`$i:expr [ , ]` against `8 [ , ]`, it will attempt to parse `i` as an array
index operation and fail. Adding a separator can solve this problem.
1. Macro definitions are required to include suitable separators after parsing
expressions and other bits of the Rust grammar. This implies that
a macro definition like `$i:expr [ , ]` is not legal, because `[` could be part
of an expression. A macro definition like `$i:expr,` or `$i:expr;` would be legal,
however, because `,` and `;` are legal separators. See [RFC 550] for more information.
2. The parser must have eliminated all ambiguity by the time it reaches a `$`
_name_ `:` _designator_. This requirement most often affects name-designator
pairs when they occur at the beginning of, or immediately after, a `$(...)*`;
requiring a distinctive token in front can solve the problem.
[RFC 550]: https://github.com/rust-lang/rfcs/blob/master/text/0550-macro-future-proofing.md
# Crates and source files
Rust is a *compiled* language. Its semantics obey a *phase distinction*
between compile-time and run-time. Those semantic rules that have a *static
interpretation* govern the success or failure of compilation. We refer to
these rules as "static semantics". Semantic rules called "dynamic semantics"
govern the behavior of programs at run-time. A program that fails to compile
due to violation of a compile-time rule has no defined dynamic semantics; the
compiler should halt with an error report, and produce no executable artifact.
Although Rust, like any other language, can be implemented by an interpreter as
well as a compiler, the only existing implementation is a compiler &mdash;
from now on referred to as *the* Rust compiler &mdash; and the language has
always been designed to be compiled. For these reasons, this section assumes a
compiler.
Rust's semantics obey a *phase distinction* between compile-time and
run-time.[^phase-distinction] Those semantic rules that have a *static
interpretation* govern the success or failure of compilation. Those semantics
that have a *dynamic interpretation* govern the behavior of the program at
run-time.
[^phase-distinction]: This distinction would also exist in an interpreter.
Static checks like syntactic analysis, type checking, and lints should
happen before the program is executed regardless of when it is executed.
The compilation model centers on artifacts called _crates_. Each compilation
processes a single crate in source form, and if successful, produces a single
crate in binary form: either an executable or a library.[^cratesourcefile]
crate in binary form: either an executable or some sort of
library.[^cratesourcefile]
[^cratesourcefile]: A crate is somewhat analogous to an *assembly* in the
ECMA-335 CLI model, a *library* in the SML/NJ Compilation Manager, a *unit*
@@ -766,21 +615,25 @@ extension `.rs`.
A Rust source file describes a module, the name and location of which &mdash;
in the module tree of the current crate &mdash; are defined from outside the
source file: either by an explicit `mod_item` in a referencing source file, or
by the name of the crate itself.
by the name of the crate itself. Every source file is a module, but not every
module needs its own source file: [module definitions](#modules) can be nested
within one file.
Each source file contains a sequence of zero or more `item` definitions, and
may optionally begin with any number of `attributes` that apply to the
containing module. Attributes on the anonymous crate module define important
metadata that influences the behavior of the compiler.
may optionally begin with any number of [attributes](#items-and-attributes)
that apply to the containing module, most of which influence the behavior of
the compiler. The anonymous crate module can have additional attributes that
apply to the crate as a whole.
```no_run
// Crate name
// Specify the crate name.
#![crate_name = "projx"]
// Specify the output type
// Specify the type of output artifact.
#![crate_type = "lib"]
// Turn on a warning
// Turn on a warning.
// This can be done in any module, not just the anonymous crate module.
#![warn(non_camel_case_types)]
```
@@ -795,12 +648,6 @@ Crates contain [items](#items), each of which may have some number of
## Items
```{.ebnf .gram}
item : extern_crate_decl | use_decl | mod_item | fn_item | type_item
| struct_item | enum_item | static_item | trait_item | impl_item
| extern_block ;
```
An _item_ is a component of a crate. Items are organized within a crate by a
nested set of [modules](#modules). Every crate has a single "outermost"
anonymous module; all further items within the crate have [paths](#paths)
@@ -815,9 +662,10 @@ There are several kinds of item:
* [`use` declarations](#use-declarations)
* [modules](#modules)
* [functions](#functions)
* [type definitions](#type-definitions)
* [type definitions](grammar.html#type-definitions)
* [structures](#structures)
* [enumerations](#enumerations)
* [constant items](#constant-items)
* [static items](#static-items)
* [traits](#traits)
* [implementations](#implementations)
@@ -834,24 +682,20 @@ which sub-item declarations may appear.
### Type Parameters
All items except modules may be *parameterized* by type. Type parameters are
given as a comma-separated list of identifiers enclosed in angle brackets
(`<...>`), after the name of the item and before its definition. The type
parameters of an item are considered "part of the name", not part of the type
of the item. A referencing [path](#paths) must (in principle) provide type
arguments as a list of comma-separated types enclosed within angle brackets, in
order to refer to the type-parameterized item. In practice, the type-inference
system can usually infer such argument types from context. There are no
general type-parametric types, only type-parametric items. That is, Rust has
no notion of type abstraction: there are no first-class "forall" types.
All items except modules, constants and statics may be *parameterized* by type.
Type parameters are given as a comma-separated list of identifiers enclosed in
angle brackets (`<...>`), after the name of the item and before its definition.
The type parameters of an item are considered "part of the name", not part of
the type of the item. A referencing [path](#paths) must (in principle) provide
type arguments as a list of comma-separated types enclosed within angle
brackets, in order to refer to the type-parameterized item. In practice, the
type-inference system can usually infer such argument types from context. There
are no general type-parametric types, only type-parametric items. That is, Rust
has no notion of type abstraction: there are no higher-ranked (or "forall") types
abstracted over other types, though higher-ranked types do exist for lifetimes.
### Modules
```{.ebnf .gram}
mod_item : "mod" ident ( ';' | '{' mod '}' );
mod : item * ;
```
A module is a container for zero or more [items](#items).
A _module item_ is a module, surrounded in braces, named, and prefixed with the
@@ -894,6 +738,7 @@ mod vec;
mod thread {
// Load the `local_data` module from `thread/local_data.rs`
// or `thread/local_data/mod.rs`.
mod local_data;
}
```
@@ -910,12 +755,7 @@ mod thread {
}
```
##### Extern crate declarations
```{.ebnf .gram}
extern_crate_decl : "extern" "crate" crate_name
crate_name: ident | ( string_lit "as" ident )
```
#### Extern crate declarations
An _`extern crate` declaration_ specifies a dependency on an external crate.
The external crate is then bound into the declaring scope as the `ident`
@@ -924,11 +764,10 @@ provided in the `extern_crate_decl`.
The external crate is resolved to a specific `soname` at compile time, and a
runtime linkage requirement to that `soname` is passed to the linker for
loading at runtime. The `soname` is resolved at compile time by scanning the
compiler's library path and matching the optional `crateid` provided as a
string literal against the `crateid` attributes that were declared on the
external crate when it was compiled. If no `crateid` is provided, a default
`name` attribute is assumed, equal to the `ident` given in the
`extern_crate_decl`.
compiler's library path and matching the optional `crateid` provided against
the `crateid` attributes that were declared on the external crate when it was
compiled. If no `crateid` is provided, a default `name` attribute is assumed,
equal to the `ident` given in the `extern_crate_decl`.
Three examples of `extern crate` declarations:
@@ -940,23 +779,12 @@ extern crate std; // equivalent to: extern crate std as std;
extern crate std as ruststd; // linking to 'std' under another name
```
##### Use declarations
```{.ebnf .gram}
use_decl : "pub" ? "use" [ path "as" ident
| path_glob ] ;
path_glob : ident [ "::" [ path_glob
| '*' ] ] ?
| '{' path_item [ ',' path_item ] * '}' ;
path_item : ident | "self" ;
```
#### Use declarations
A _use declaration_ creates one or more local name bindings synonymous with
some other [path](#paths). Usually a `use` declaration is used to shorten the
path required to refer to a module item. These declarations may appear at the
top of [modules](#modules) and [blocks](#blocks).
top of [modules](#modules) and [blocks](grammar.html#block-expressions).
> **Note**: Unlike in many languages,
> `use` declarations in Rust do *not* declare linkage dependency with external crates.
@@ -975,8 +803,7 @@ Use declarations support a number of convenient shortcuts:
An example of `use` declarations:
```
# #![feature(core)]
```rust
use std::option::Option::{Some, None};
use std::collections::hash_map::{self, HashMap};
@@ -1027,22 +854,23 @@ module declarations should be at the crate root if direct usage of the declared
modules within `use` items is desired. It is also possible to use `self` and
`super` at the beginning of a `use` item to refer to the current and direct
parent modules respectively. All rules regarding accessing declared modules in
`use` declarations applies to both module declarations and `extern crate`
`use` declarations apply to both module declarations and `extern crate`
declarations.
An example of what will and will not work for `use` items:
```
# #![feature(core)]
# #![allow(unused_imports)]
use foo::core::iter; // good: foo is at the root of the crate
use foo::baz::foobaz; // good: foo is at the root of the crate
mod foo {
extern crate core;
use foo::core::iter; // good: foo is at crate root
// use core::iter; // bad: core is not at the crate root
mod example {
pub mod iter {}
}
use foo::example::iter; // good: foo is at crate root
// use example::iter; // bad: core is not at the crate root
use self::baz::foobaz; // good: self refers to module 'foo'
use foo::bar::foobar; // good: foo is at crate root
@@ -1064,9 +892,9 @@ fn main() {}
A _function item_ defines a sequence of [statements](#statements) and an
optional final [expression](#expressions), along with a name and a set of
parameters. Functions are declared with the keyword `fn`. Functions declare a
set of *input* [*slots*](#memory-slots) as parameters, through which the caller
passes arguments into the function, and an *output* [*slot*](#memory-slots)
through which the function passes results back to the caller.
set of *input* [*variables*](#variables) as parameters, through which the caller
passes arguments into the function, and the *output* [*type*](#types)
of the value the function will return to its caller on completion.
A function may also be copied into a first-class *value*, in which case the
value has the corresponding [*function type*](#function-types), and can be used
@@ -1101,40 +929,31 @@ signature. Each type parameter must be explicitly declared, in an
angle-bracket-enclosed, comma-separated list following the function name.
```{.ignore}
fn iter<T>(seq: &[T], f: |T|) {
for elt in seq.iter() { f(elt); }
fn iter<T, F>(seq: &[T], f: F) where T: Copy, F: Fn(T) {
for elt in seq { f(*elt); }
}
fn map<T, U>(seq: &[T], f: |T| -> U) -> Vec<U> {
fn map<T, U, F>(seq: &[T], f: F) -> Vec<U> where T: Copy, U: Copy, F: Fn(T) -> U {
let mut acc = vec![];
for elt in seq.iter() { acc.push(f(elt)); }
for elt in seq { acc.push(f(*elt)); }
acc
}
```
Inside the function signature and body, the name of the type parameter can be
used as a type name.
used as a type name. [Trait](#traits) bounds can be specified for type parameters
to allow methods with that trait to be called on values of that type. This is
specified using the `where` syntax, as in the above example.
When a generic function is referenced, its type is instantiated based on the
context of the reference. For example, calling the `iter` function defined
above on `[1, 2]` will instantiate type parameter `T` with `i32`, and require
the closure parameter to have type `fn(i32)`.
the closure parameter to have type `Fn(i32)`.
The type parameters can also be explicitly supplied in a trailing
[path](#paths) component after the function name. This might be necessary if
there is not sufficient context to determine the type parameters. For example,
`mem::size_of::<u32>() == 4`.
Since a parameter type is opaque to the generic function, the set of operations
that can be performed on it is limited. Values of parameter type can only be
moved, not copied.
```
fn id<T>(x: T) -> T { x }
```
Similarly, [trait](#traits) bounds can be specified for type parameters to
allow methods with that trait to be called on values of that type.
#### Unsafety
Unsafe operations are those that potentially violate the memory-safety
@@ -1192,7 +1011,8 @@ the guarantee that these issues are never caused by safe code.
* `&mut` and `&` follow LLVMs scoped [noalias] model, except if the `&T`
contains an `UnsafeCell<U>`. Unsafe code must not violate these aliasing
guarantees.
* Mutating an immutable value/reference without `UnsafeCell<U>`
* Mutating non-mutable data (that is, data reached through a shared reference or
data owned by a `let` binding), unless that data is contained within an `UnsafeCell<U>`.
* Invoking undefined behavior via compiler intrinsics:
* Indexing outside of the bounds of an object with `std::ptr::offset`
(`offset` intrinsic), with
@@ -1211,9 +1031,9 @@ the guarantee that these issues are never caused by safe code.
[noalias]: http://llvm.org/docs/LangRef.html#noalias
##### Behaviour not considered unsafe
##### Behavior not considered unsafe
This is a list of behaviour not considered *unsafe* in Rust terms, but that may
This is a list of behavior not considered *unsafe* in Rust terms, but that may
be undesired.
* Deadlocks
@@ -1222,14 +1042,18 @@ be undesired.
* Exiting without calling destructors
* Sending signals
* Accessing/modifying the file system
* Unsigned integer overflow (well-defined as wrapping)
* Signed integer overflow (well-defined as twos complement representation
wrapping)
* Integer overflow
- Overflow is considered "unexpected" behavior and is always user-error,
unless the `wrapping` primitives are used. In non-optimized builds, the compiler
will insert debug checks that panic on overflow, but in optimized builds overflow
instead results in wrapped values. See [RFC 560] for the rationale and more details.
[RFC 560]: https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md
#### Diverging functions
A special kind of function can be declared with a `!` character where the
output slot type would normally be. For example:
output type would normally be. For example:
```
fn my_err(s: &str) -> ! {
@@ -1302,18 +1126,11 @@ contiguous stack segments like C.
A _type alias_ defines a new name for an existing [type](#types). Type
aliases are declared with the keyword `type`. Every value has a single,
specific type; the type-specified aspects of a value include:
specific type, but may implement several different traits, or be compatible with
several different type constraints.
* Whether the value is composed of sub-values or is indivisible.
* Whether the value represents textual or numerical information.
* Whether the value represents integral or floating-point information.
* The sequence of memory operations required to access the value.
* The [kind](#type-kinds) of the type.
For example, the type `(u8, u8)` defines the set of immutable values that are
composite pairs, each containing two unsigned 8-bit integers accessed by
pattern-matching and laid out in memory with the `x` component preceding the
`y` component:
For example, the following defines the type `Point` as a synonym for the type
`(u8, u8)`, the type of pairs of unsigned 8 bit integers:
```
type Point = (u8, u8);
@@ -1343,9 +1160,7 @@ let px: i32 = match p { Point(x, _) => x };
```
A _unit-like struct_ is a structure without any fields, defined by leaving off
the list of fields entirely. Such types will have a single value, just like
the [unit value `()`](#unit-and-boolean-literals) of the unit type. For
example:
the list of fields entirely. Such types will have a single value. For example:
```
struct Cookie;
@@ -1377,9 +1192,7 @@ a = Animal::Cat;
Enumeration constructors can have either named or unnamed fields:
```
# #![feature(struct_variant)]
# fn main() {
```rust
enum Animal {
Dog (String, f64),
Cat { name: String, weight: f64 }
@@ -1387,7 +1200,6 @@ enum Animal {
let mut a: Animal = Animal::Dog("Cocoa".to_string(), 37.2);
a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
# }
```
In this example, `Cat` is a _struct-like enum variant_,
@@ -1416,10 +1228,6 @@ it were `Bar(i32)`, this is disallowed.
### Constant items
```{.ebnf .gram}
const_item : "const" ident ':' type '=' expr ';' ;
```
A *constant item* is a named _constant value_ which is not associated with a
specific memory location in the program. Constants are essentially inlined
wherever they are used, meaning that they are copied directly into the relevant
@@ -1456,10 +1264,6 @@ const BITS_N_STRINGS: BitsNStrings<'static> = BitsNStrings {
### Static items
```{.ebnf .gram}
static_item : "static" ident ':' type '=' expr ';' ;
```
A *static item* is similar to a *constant*, except that it represents a precise
memory location in the program. A static is never "inlined" at the usage site,
and all references to it refer to the same memory location. Static items have
@@ -1518,11 +1322,26 @@ type of the value is not required to ascribe to `Sync`.
### Traits
A _trait_ describes a set of method types.
A _trait_ describes an abstract interface that types can
implement. This interface consists of associated items, which come in
three varieties:
Traits can include default implementations of methods, written in terms of some
unknown [`self` type](#self-types); the `self` type may either be completely
unspecified, or constrained by some other trait.
- functions
- constants
- types
Associated functions whose first parameter is named `self` are called
methods and may be invoked using `.` notation (e.g., `x.foo()`).
All traits define an implicit type parameter `Self` that refers to
"the type that is implementing this interface". Traits may also
contain additional type parameters. These type parameters (including
`Self`) may be constrained by other traits and so forth as usual.
Trait bounds on `Self` are considered "supertraits". These are
required to be acyclic. Supertraits are somewhat different from other
constraints in that they affect what methods are available in the
vtable when the trait is used as a [trait object](#trait-objects).
Traits are implemented for specific types through separate
[implementations](#implementations).
@@ -1567,15 +1386,18 @@ fn draw_twice<T: Shape>(surface: Surface, sh: T) {
}
```
Traits also define an [object type](#object-types) with the same name as the
trait. Values of this type are created by [casting](#type-cast-expressions)
pointer values (pointing to a type for which an implementation of the given
trait is in scope) to pointers to the trait name, used as a type.
Traits also define an [trait object](#trait-objects) with the same
name as the trait. Values of this type are created by coercing from a
pointer of some specific type to a pointer of trait type. For example,
`&T` could be coerced to `&Shape` if `T: Shape` holds (and similarly
for `Box<T>`). This coercion can either be implicit or
[explicit](#type-cast-expressions). Here is an example of an explicit
coercion:
```
# trait Shape { fn dummy(&self) { } }
# impl Shape for i32 { }
# let mycircle = 0i32;
trait Shape { }
impl Shape for i32 { }
let mycircle = 0i32;
let myshape: Box<Shape> = Box::new(mycircle) as Box<Shape>;
```
@@ -1714,11 +1536,6 @@ impl Seq<bool> for u32 {
### External blocks
```{.ebnf .gram}
extern_block_item : "extern" '{' extern_block '}' ;
extern_block : [ foreign_fn ] * ;
```
External blocks form the basis for Rust's foreign function interface.
Declarations in an external block describe symbols in external, non-Rust
libraries.
@@ -1727,17 +1544,6 @@ Functions within external blocks are declared in the same way as other Rust
functions, with the exception that they may not have a body and are instead
terminated by a semicolon.
```
# #![feature(libc)]
extern crate libc;
use libc::{c_char, FILE};
extern {
fn fopen(filename: *const c_char, mode: *const c_char) -> *mut FILE;
}
# fn main() {}
```
Functions within external blocks may be called by Rust code, just like
functions defined in Rust. The Rust compiler automatically translates between
the Rust ABI and the foreign ABI.
@@ -1748,7 +1554,7 @@ By default external blocks assume that the library they are calling uses the
standard C "cdecl" ABI. Other ABIs may be specified using an `abi` string, as
shown here:
```{.ignore}
```ignore
// Interface to the Windows API
extern "stdcall" { }
```
@@ -1783,8 +1589,7 @@ warnings are generated, or otherwise "you used a private item of another module
and weren't allowed to."
By default, everything in Rust is *private*, with one exception. Enum variants
in a `pub` enum are also public by default. You are allowed to alter this
default visibility with the `priv` keyword. When an item is declared as `pub`,
in a `pub` enum are also public by default. When an item is declared as `pub`,
it can be thought of as being accessible to the outside world. For example:
```
@@ -1929,13 +1734,6 @@ the namespace hierarchy as it normally would.
## Attributes
```{.ebnf .gram}
attribute : '#' '!' ? '[' meta_item ']' ;
meta_item : ident [ '=' literal
| '(' meta_seq ')' ] ? ;
meta_seq : meta_item [ ',' meta_seq ] ? ;
```
Any item declaration may have an _attribute_ applied to it. Attributes in Rust
are modeled on Attributes in ECMA-335, with the syntax coming from ECMA-334
(C#). An attribute is a general, free-form metadatum that is interpreted
@@ -1981,7 +1779,7 @@ type int8_t = i8;
### Crate-only attributes
- `crate_name` - specify the this crate's crate name.
- `crate_name` - specify the crate's crate name.
- `crate_type` - see [linkage](#linkage).
- `feature` - see [compiler features](#compiler-features).
- `no_builtins` - disable optimizing certain code patterns to invocations of
@@ -2101,13 +1899,12 @@ macro scope.
lower to the target's SIMD instructions, if any; the `simd` feature gate
is necessary to use this attribute.
- `static_assert` - on statics whose type is `bool`, terminates compilation
with an error if it is not initialized to `true`.
- `unsafe_destructor` - allow implementations of the "drop" language item
where the type it is implemented for does not implement the "send" language
item; the `unsafe_destructor` feature gate is needed to use this attribute
with an error if it is not initialized to `true`. To use this, the `static_assert`
feature gate must be enabled.
- `unsafe_no_drop_flag` - on structs, remove the flag that prevents
destructors from being run twice. Destructors might be run multiple times on
the same object with this attribute.
the same object with this attribute. To use this, the `unsafe_no_drop_flag` feature
gate must be enabled.
- `doc` - Doc comments such as `/// foo` are equivalent to `#[doc = "foo"]`.
- `rustc_on_unimplemented` - Write a custom note to be shown along with the error
when the trait is found to be unimplemented on a type.
@@ -2125,8 +1922,8 @@ release builds.
There are two kinds of configuration options, one that is either defined or not
(`#[cfg(foo)]`), and the other that contains a string that can be checked
against (`#[cfg(bar = "baz")]` (currently only compiler-defined configuration
options can have the latter form).
against (`#[cfg(bar = "baz")]`). Currently, only compiler-defined configuration
options can have the latter form.
```
// The function is only included in the build when compiling for OSX
@@ -2169,7 +1966,7 @@ The following configurations must be defined by the implementation:
`"unix"` or `"windows"`. The value of this configuration option is defined
as a configuration itself, like `unix` or `windows`.
* `target_os = "..."`. Operating system of the target, examples include
`"win32"`, `"macos"`, `"linux"`, `"android"`, `"freebsd"`, `"dragonfly"`,
`"windows"`, `"macos"`, `"ios"`, `"linux"`, `"android"`, `"freebsd"`, `"dragonfly"`,
`"bitrig"` or `"openbsd"`.
* `target_pointer_width = "..."`. Target pointer width in bits. This is set
to `"32"` for targets with 32-bit pointers, and likewise set to `"64"` for
@@ -2177,6 +1974,14 @@ The following configurations must be defined by the implementation:
* `unix`. See `target_family`.
* `windows`. See `target_family`.
You can also set another attribute based on a `cfg` variable with `cfg_attr`:
```rust,ignore
#[cfg_attr(a, b)]
```
Will be the same as `#[b]` if `a` is set by `cfg`, and nothing otherwise.
### Lint check attributes
A lint check names a potentially undesirable coding pattern, such as
@@ -2194,7 +1999,7 @@ For any lint check `C`:
The lint checks supported by the compiler can be found via `rustc -W help`,
along with their default settings. [Compiler
plugins](book/plugins.html#lint-plugins) can provide additional lint checks.
plugins](book/compiler-plugins.html#lint-plugins) can provide additional lint checks.
```{.ignore}
mod m1 {
@@ -2256,7 +2061,7 @@ makes it possible to declare these operations. For example, the `str` module
in the Rust standard library defines the string equality function:
```{.ignore}
#[lang="str_eq"]
#[lang = "str_eq"]
pub fn eq_slice(a: &str, b: &str) -> bool {
// details elided
}
@@ -2266,25 +2071,21 @@ The name `str_eq` has a special meaning to the Rust compiler, and the presence
of this definition means that it will use this definition when generating calls
to the string equality function.
A complete list of the built-in language items will be added in the future.
The set of language items is currently considered unstable. A complete
list of the built-in language items will be added in the future.
### Inline attributes
The inline attribute is used to suggest to the compiler to perform an inline
expansion and place a copy of the function or static in the caller rather than
generating code to call the function or access the static where it is defined.
The inline attribute suggests that the compiler should place a copy of
the function or static in the caller, rather than generating code to
call the function or access the static where it is defined.
The compiler automatically inlines functions based on internal heuristics.
Incorrectly inlining functions can actually making the program slower, so it
Incorrectly inlining functions can actually make the program slower, so it
should be used with care.
Immutable statics are always considered inlineable unless marked with
`#[inline(never)]`. It is undefined whether two different inlineable statics
have the same memory address. In other words, the compiler is free to collapse
duplicate inlineable statics together.
`#[inline]` and `#[inline(always)]` always causes the function to be serialized
into crate metadata to allow cross-crate inlining.
`#[inline]` and `#[inline(always)]` always cause the function to be serialized
into the crate metadata to allow cross-crate inlining.
There are three different types of inline attributes:
@@ -2355,7 +2156,10 @@ The currently implemented features of the reference compiler are:
semantics are likely to change, so this macro usage must be opted
into.
* `associated_types` - Allows type aliases in traits. Experimental.
* `associated_consts` - Allows constants to be defined in `impl` and `trait`
blocks, so that they can be associated with a type or
trait in a similar manner to methods and associated
types.
* `box_patterns` - Allows `box` patterns, the exact semantics of which
is subject to change.
@@ -2368,7 +2172,7 @@ The currently implemented features of the reference compiler are:
removed entirely for something more wholesome.
* `custom_attribute` - Allows the usage of attributes unknown to the compiler
so that new attributes can be added in a bacwards compatible
so that new attributes can be added in a backwards compatible
manner (RFC 572).
* `custom_derive` - Allows the use of `#[derive(Foo,Bar)]` as sugar for
@@ -2397,7 +2201,7 @@ The currently implemented features of the reference compiler are:
nasty hack that will certainly be removed.
* `main` - Allows use of the `#[main]` attribute, which changes the entry point
into a Rust program. This capabiilty is subject to change.
into a Rust program. This capability is subject to change.
* `macro_reexport` - Allows macros to be re-exported from one crate after being imported
from another. This feature was originally designed with the sole
@@ -2444,7 +2248,9 @@ The currently implemented features of the reference compiler are:
* `simd_ffi` - Allows use of SIMD vectors in signatures for foreign functions.
The SIMD interface is subject to change.
* `staged_api` - Allows usage of stability markers and `#![staged_api]` in a crate
* `staged_api` - Allows usage of stability markers and `#![staged_api]` in a
crate. Stability markers are also attributes: `#[stable]`,
`#[unstable]`, and `#[deprecated]` are the three levels.
* `static_assert` - The `#[static_assert]` functionality is experimental and
unstable. The attribute can be attached to a `static` of
@@ -2453,7 +2259,7 @@ The currently implemented features of the reference compiler are:
is unintuitive and suboptimal.
* `start` - Allows use of the `#[start]` attribute, which changes the entry point
into a Rust program. This capabiilty, especially the signature for the
into a Rust program. This capability, especially the signature for the
annotated function, is subject to change.
* `struct_inherit` - Allows using struct inheritance, which is barely
@@ -2479,10 +2285,6 @@ The currently implemented features of the reference compiler are:
* `unboxed_closures` - Rust's new closure design, which is currently a work in
progress feature with many known bugs.
* `unsafe_destructor` - Allows use of the `#[unsafe_destructor]` attribute,
which is considered wildly unsafe and will be
obsoleted by language improvements.
* `unsafe_no_drop_flag` - Allows use of the `#[unsafe_no_drop_flag]` attribute,
which removes hidden flag added to a type that
implements the `Drop` trait. The design for the
@@ -2507,7 +2309,7 @@ The currently implemented features of the reference compiler are:
terms of encapsulation).
If a feature is promoted to a language feature, then all existing programs will
start to receive compilation warnings about #[feature] directives which enabled
start to receive compilation warnings about `#![feature]` directives which enabled
the new feature (because the directive is no longer necessary). However, if a
feature is decided to be removed from the language, errors will be issued (if
there isn't a parser error first). The directive in this case is no longer
@@ -2541,7 +2343,7 @@ statements](#expression-statements).
### Declaration statements
A _declaration statement_ is one that introduces one or more *names* into the
enclosing statement block. The declared names may denote new slots or new
enclosing statement block. The declared names may denote new variables or new
items.
#### Item declarations
@@ -2556,18 +2358,13 @@ in meaning to declaring the item outside the statement block.
> **Note**: there is no implicit capture of the function's dynamic environment when
> declaring a function-local item.
#### Slot declarations
#### Variable declarations
```{.ebnf .gram}
let_decl : "let" pat [':' type ] ? [ init ] ? ';' ;
init : [ '=' ] expr ;
```
A _slot declaration_ introduces a new set of slots, given by a pattern. The
A _variable declaration_ introduces a new set of variable, given by a pattern. The
pattern may be followed by a type annotation, and/or an initializer expression.
When no type annotation is given, the compiler will infer the type, or signal
an error if insufficient type information is available for definite inference.
Any slots introduced by a slot declaration are visible from the point of
Any variables introduced by a variable declaration are visible from the point of
declaration until the end of the enclosing block scope.
### Expression statements
@@ -2607,22 +2404,58 @@ expressions](#index-expressions) (`expr[expr]`), and [field
references](#field-expressions) (`expr.f`). All other expressions are rvalues.
The left operand of an [assignment](#assignment-expressions) or
[compound-assignment](#compound-assignment-expressions) expression is an lvalue
context, as is the single operand of a unary
[borrow](#unary-operator-expressions). All other expression contexts are
rvalue contexts.
[compound-assignment](#compound-assignment-expressions) expression is
an lvalue context, as is the single operand of a unary
[borrow](#unary-operator-expressions). The discriminant or subject of
a [match expression](#match-expressions) may be an lvalue context, if
ref bindings are made, but is otherwise an rvalue context. All other
expression contexts are rvalue contexts.
When an lvalue is evaluated in an _lvalue context_, it denotes a memory
location; when evaluated in an _rvalue context_, it denotes the value held _in_
that memory location.
When an rvalue is used in an lvalue context, a temporary un-named lvalue is
created and used instead. A temporary's lifetime equals the largest lifetime
of any reference that points to it.
##### Temporary lifetimes
When an rvalue is used in an lvalue context, a temporary un-named
lvalue is created and used instead. The lifetime of temporary values
is typically the innermost enclosing statement; the tail expression of
a block is considered part of the statement that encloses the block.
When a temporary rvalue is being created that is assigned into a `let`
declaration, however, the temporary is created with the lifetime of
the enclosing block instead, as using the enclosing statement (the
`let` declaration) would be a guaranteed error (since a pointer to the
temporary would be stored into a variable, but the temporary would be
freed before the variable could be used). The compiler uses simple
syntactic rules to decide which values are being assigned into a `let`
binding, and therefore deserve a longer temporary lifetime.
Here are some examples:
- `let x = foo(&temp())`. The expression `temp()` is an rvalue. As it
is being borrowed, a temporary is created which will be freed after
the innermost enclosing statement (the `let` declaration, in this case).
- `let x = temp().foo()`. This is the same as the previous example,
except that the value of `temp()` is being borrowed via autoref on a
method-call. Here we are assuming that `foo()` is an `&self` method
defined in some trait, say `Foo`. In other words, the expression
`temp().foo()` is equivalent to `Foo::foo(&temp())`.
- `let x = &temp()`. Here, the same temporary is being assigned into
`x`, rather than being passed as a parameter, and hence the
temporary's lifetime is considered to be the enclosing block.
- `let x = SomeStruct { foo: &temp() }`. As in the previous case, the
temporary is assigned into a struct which is then assigned into a
binding, and hence it is given the lifetime of the enclosing block.
- `let x = [ &temp() ]`. As in the previous case, the
temporary is assigned into an array which is then assigned into a
binding, and hence it is given the lifetime of the enclosing block.
- `let ref x = temp()`. In this case, the temporary is created using a ref binding,
but the result is the same: the lifetime is extended to the enclosing block.
#### Moved and copied types
When a [local variable](#memory-slots) is used as an
When a [local variable](#variables) is used as an
[rvalue](#lvalues,-rvalues-and-temporaries) the variable will either be moved
or copied, depending on its type. All values whose type implements `Copy` are
copied, all others are moved.
@@ -2651,27 +2484,20 @@ Tuples are written by enclosing zero or more comma-separated expressions in
parentheses. They are used to create [tuple-typed](#tuple-types) values.
```{.tuple}
(0,);
(0.0, 4.5);
("a", 4us, true);
("a", 4usize, true);
```
### Unit expressions
You can disambiguate a single-element tuple from a value in parentheses with a
comma:
The expression `()` denotes the _unit value_, the only value of the type with
the same name.
```
(0,); // single-element tuple
(0); // zero in parentheses
```
### Structure expressions
```{.ebnf .gram}
struct_expr : expr_path '{' ident ':' expr
[ ',' ident ':' expr ] *
[ ".." expr ] '}' |
expr_path '(' expr
[ ',' expr ] * ')' |
expr_path ;
```
There are several forms of structure expressions. A _structure expression_
consists of the [path](#paths) of a [structure item](#structures), followed by
a brace-enclosed list of one or more comma-separated name-value pairs,
@@ -2722,11 +2548,6 @@ Point3d {y: 0, z: 10, .. base};
### Block expressions
```{.ebnf .gram}
block_expr : '{' [ stmt ';' | item ] *
[ expr ] '}' ;
```
A _block expression_ is similar to a module in terms of the declarations that
are possible. Each block conceptually introduces a new namespace scope. Use
items can bring new names into scopes and declared items are in scope for only
@@ -2749,22 +2570,14 @@ assert_eq!(5, x);
### Method-call expressions
```{.ebnf .gram}
method_call_expr : expr '.' ident paren_expr_list ;
```
A _method call_ consists of an expression followed by a single dot, an
identifier, and a parenthesized expression-list. Method calls are resolved to
methods on specific traits, either statically dispatching to a method if the
exact `self`-type of the left-hand-side is known, or dynamically dispatching if
the left-hand-side expression is an indirect [object type](#object-types).
the left-hand-side expression is an indirect [trait object](#trait-objects).
### Field expressions
```{.ebnf .gram}
field_expr : expr '.' ident ;
```
A _field expression_ consists of an expression followed by a single dot and an
identifier, when not immediately followed by a parenthesized expression-list
(the latter is a [method call expression](#method-call-expressions)). A field
@@ -2780,17 +2593,13 @@ A field access is an [lvalue](#lvalues,-rvalues-and-temporaries) referring to
the value of that field. When the type providing the field inherits mutability,
it can be [assigned](#assignment-expressions) to.
Also, if the type of the expression to the left of the dot is a pointer, it is
automatically dereferenced to make the field access possible.
Also, if the type of the expression to the left of the dot is a
pointer, it is automatically dereferenced as many times as necessary
to make the field access possible. In cases of ambiguity, we prefer
fewer autoderefs to more.
### Array expressions
```{.ebnf .gram}
array_expr : '[' "mut" ? array_elems? ']' ;
array_elems : [expr [',' expr]*] | [expr ';' expr] ;
```
An [array](#array,-and-slice-types) _expression_ is written by enclosing zero
or more comma-separated expressions of uniform type in square brackets.
@@ -2807,27 +2616,55 @@ constant expression that can be evaluated at compile time, such as a
### Index expressions
```{.ebnf .gram}
idx_expr : expr '[' expr ']' ;
```
[Array](#array,-and-slice-types)-typed expressions can be indexed by
writing a square-bracket-enclosed expression (the index) after them. When the
array is mutable, the resulting [lvalue](#lvalues,-rvalues-and-temporaries) can
be assigned to.
Indices are zero-based, and may be of any integral type. Vector access is
bounds-checked at run-time. When the check fails, it will put the thread in a
_panicked state_.
bounds-checked at compile-time for constant arrays being accessed with a constant index value.
Otherwise a check will be performed at run-time that will put the thread in a _panicked state_ if it fails.
```{should-fail}
([1, 2, 3, 4])[0];
(["a", "b"])[10]; // panics
let x = (["a", "b"])[10]; // compiler error: const index-expr is out of bounds
let n = 10;
let y = (["a", "b"])[n]; // panics
let arr = ["a", "b"];
arr[10]; // panics
```
Also, if the type of the expression to the left of the brackets is a
pointer, it is automatically dereferenced as many times as necessary
to make the indexing possible. In cases of ambiguity, we prefer fewer
autoderefs to more.
### Range expressions
The `..` operator will construct an object of one of the `std::ops::Range` variants.
```
1..2; // std::ops::Range
3..; // std::ops::RangeFrom
..4; // std::ops::RangeTo
..; // std::ops::RangeFull
```
The following expressions are equivalent.
```
let x = std::ops::Range {start: 0, end: 10};
let y = 0..10;
assert_eq!(x,y);
```
### Unary operator expressions
Rust defines three unary operators. They are all written as prefix operators,
Rust defines the following unary operators. They are all written as prefix operators,
before the expression they apply to.
* `-`
@@ -2841,18 +2678,23 @@ before the expression they apply to.
implemented by the type and required for an outer expression that will or
could mutate the dereference), and produces the result of dereferencing the
`&` or `&mut` borrowed pointer returned from the overload method.
* `!`
: Logical negation. On the boolean type, this flips between `true` and
`false`. On integer types, this inverts the individual bits in the
two's complement representation of the value.
* `&` and `&mut`
: Borrowing. When applied to an lvalue, these operators produce a
reference (pointer) to the lvalue. The lvalue is also placed into
a borrowed state for the duration of the reference. For a shared
borrow (`&`), this implies that the lvalue may not be mutated, but
it may be read or shared again. For a mutable borrow (`&mut`), the
lvalue may not be accessed in any way until the borrow expires.
If the `&` or `&mut` operators are applied to an rvalue, a
temporary value is created; the lifetime of this temporary value
is defined by [syntactic rules](#temporary-lifetimes).
### Binary operator expressions
```{.ebnf .gram}
binop_expr : expr binop expr ;
```
Binary operators expressions are given in terms of [operator
precedence](#operator-precedence).
@@ -2959,6 +2801,13 @@ fn avg(v: &[f64]) -> f64 {
}
```
Some of the conversions which can be done through the `as` operator
can also be done implicitly at various points in the program, such as
argument passing and assignment to a `let` binding with an explicit
type. Implicit conversions are limited to "harmless" conversions that
do not lose information and which have minimal or no risk of
surprising side-effects on the dynamic execution semantics.
#### Assignment expressions
An _assignment expression_ consists of an
@@ -3013,10 +2862,6 @@ An expression enclosed in parentheses evaluates to the result of the enclosed
expression. Parentheses can be used to explicitly specify evaluation order
within an expression.
```{.ebnf .gram}
paren_expr : '(' expr ')' ;
```
An example of a parenthesized expression:
```
@@ -3026,16 +2871,9 @@ let x: i32 = (2 + 3) * 4;
### Call expressions
```{.ebnf .gram}
expr_list : [ expr [ ',' expr ]* ] ? ;
paren_expr_list : '(' expr_list ')' ;
call_expr : expr paren_expr_list ;
```
A _call expression_ invokes a function, providing zero or more input slots and
an optional reference slot to serve as the function's output, bound to the
`lval` on the right hand side of the call. If the function eventually returns,
then the expression completes.
A _call expression_ invokes a function, providing zero or more input variables
and an optional location to move the function's output into. If the function
eventually returns, then the expression completes.
Some examples of call expressions:
@@ -3048,11 +2886,6 @@ let pi: Result<f32, _> = "3.14".parse();
### Lambda expressions
```{.ebnf .gram}
ident_list : [ ident [ ',' ident ]* ] ? ;
lambda_expr : '|' ident_list '|' expr ;
```
A _lambda expression_ (sometimes called an "anonymous function expression")
defines a function and denotes it as a value, in a single expression. A lambda
expression is a pipe-symbol-delimited (`|`) list of identifiers followed by an
@@ -3092,11 +2925,39 @@ fn ten_times<F>(f: F) where F: Fn(i32) {
ten_times(|j| println!("hello, {}", j));
```
### While loops
### Infinite loops
```{.ebnf .gram}
while_expr : [ lifetime ':' ] "while" no_struct_literal_expr '{' block '}' ;
```
A `loop` expression denotes an infinite loop.
A `loop` expression may optionally have a _label_. The label is written as
a lifetime preceding the loop expression, as in `'foo: loop{ }`. If a
label is present, then labeled `break` and `continue` expressions nested
within this loop may exit out of this loop or return control to its head.
See [Break expressions](#break-expressions) and [Continue
expressions](#continue-expressions).
### Break expressions
A `break` expression has an optional _label_. If the label is absent, then
executing a `break` expression immediately terminates the innermost loop
enclosing it. It is only permitted in the body of a loop. If the label is
present, then `break 'foo` terminates the loop with label `'foo`, which need not
be the innermost label enclosing the `break` expression, but must enclose it.
### Continue expressions
A `continue` expression has an optional _label_. If the label is absent, then
executing a `continue` expression immediately terminates the current iteration
of the innermost loop enclosing it, returning control to the loop *head*. In
the case of a `while` loop, the head is the conditional expression controlling
the loop. In the case of a `for` loop, the head is the call-expression
controlling the loop. If the label is present, then `continue 'foo` returns
control to the head of the loop with label `'foo`, which need not be the
innermost label enclosing the `break` expression, but must enclose it.
A `continue` expression is only permitted in the body of a loop.
### While loops
A `while` loop begins by evaluating the boolean loop conditional expression.
If the loop conditional expression evaluates to `true`, the loop body block
@@ -3114,71 +2975,29 @@ while i < 10 {
}
```
### Infinite loops
A `loop` expression denotes an infinite loop.
```{.ebnf .gram}
loop_expr : [ lifetime ':' ] "loop" '{' block '}';
```
A `loop` expression may optionally have a _label_. If a label is present, then
labeled `break` and `continue` expressions nested within this loop may exit out
of this loop or return control to its head. See [Break
expressions](#break-expressions) and [Continue
expressions](#continue-expressions).
### Break expressions
```{.ebnf .gram}
break_expr : "break" [ lifetime ];
```
A `break` expression has an optional _label_. If the label is absent, then
executing a `break` expression immediately terminates the innermost loop
enclosing it. It is only permitted in the body of a loop. If the label is
present, then `break foo` terminates the loop with label `foo`, which need not
be the innermost label enclosing the `break` expression, but must enclose it.
### Continue expressions
```{.ebnf .gram}
continue_expr : "continue" [ lifetime ];
```
A `continue` expression has an optional _label_. If the label is absent, then
executing a `continue` expression immediately terminates the current iteration
of the innermost loop enclosing it, returning control to the loop *head*. In
the case of a `while` loop, the head is the conditional expression controlling
the loop. In the case of a `for` loop, the head is the call-expression
controlling the loop. If the label is present, then `continue foo` returns
control to the head of the loop with label `foo`, which need not be the
innermost label enclosing the `break` expression, but must enclose it.
A `continue` expression is only permitted in the body of a loop.
Like `loop` expressions, `while` loops can be controlled with `break` or
`continue`, and may optionally have a _label_. See [infinite
loops](#infinite-loops), [break expressions](#break-expressions), and
[continue expressions](#continue-expressions) for more information.
### For expressions
```{.ebnf .gram}
for_expr : [ lifetime ':' ] "for" pat "in" no_struct_literal_expr '{' block '}' ;
```
A `for` expression is a syntactic construct for looping over elements provided
by an implementation of `std::iter::Iterator`.
by an implementation of `std::iter::IntoIterator`.
An example of a for loop over the contents of an array:
```
# type Foo = i32;
# fn bar(f: Foo) { }
# fn bar(f: &Foo) { }
# let a = 0;
# let b = 0;
# let c = 0;
let v: &[Foo] = &[a, b, c];
for e in v.iter() {
bar(*e);
for e in v {
bar(e);
}
```
@@ -3191,16 +3010,13 @@ for i in 0..256 {
}
```
Like `loop` expressions, `for` loops can be controlled with `break` or
`continue`, and may optionally have a _label_. See [infinite
loops](#infinite-loops), [break expressions](#break-expressions), and
[continue expressions](#continue-expressions) for more information.
### If expressions
```{.ebnf .gram}
if_expr : "if" no_struct_literal_expr '{' block '}'
else_tail ? ;
else_tail : "else" [ if_expr | if_let_expr
| '{' block '}' ] ;
```
An `if` expression is a conditional branch in program control. The form of an
`if` expression is a condition expression, followed by a consequent block, any
number of `else if` conditions and blocks, and an optional trailing `else`
@@ -3213,14 +3029,6 @@ if` condition is evaluated. If all `if` and `else if` conditions evaluate to
### Match expressions
```{.ebnf .gram}
match_expr : "match" no_struct_literal_expr '{' match_arm * '}' ;
match_arm : attribute * match_pat "=>" [ expr "," | '{' block '}' ] ;
match_pat : pat [ '|' pat ] * [ "if" expr ] ? ;
```
A `match` expression branches on a *pattern*. The exact form of matching that
occurs depends on the pattern. Patterns consist of some combination of
literals, destructured arrays or enum constructors, structures and tuples,
@@ -3231,55 +3039,7 @@ expression.
In a pattern whose head expression has an `enum` type, a placeholder (`_`)
stands for a *single* data field, whereas a wildcard `..` stands for *all* the
fields of a particular variant. For example:
```
#![feature(box_patterns)]
#![feature(box_syntax)]
enum List<X> { Nil, Cons(X, Box<List<X>>) }
fn main() {
let x: List<i32> = List::Cons(10, box List::Cons(11, box List::Nil));
match x {
List::Cons(_, box List::Nil) => panic!("singleton list"),
List::Cons(..) => return,
List::Nil => panic!("empty list")
}
}
```
The first pattern matches lists constructed by applying `Cons` to any head
value, and a tail value of `box Nil`. The second pattern matches _any_ list
constructed with `Cons`, ignoring the values of its arguments. The difference
between `_` and `..` is that the pattern `C(_)` is only type-correct if `C` has
exactly one argument, while the pattern `C(..)` is type-correct for any enum
variant `C`, regardless of how many arguments `C` has.
Used inside an array pattern, `..` stands for any number of elements, when the
`advanced_slice_patterns` feature gate is turned on. This wildcard can be used
at most once for a given array, which implies that it cannot be used to
specifically match elements that are at an unknown distance from both ends of a
array, like `[.., 42, ..]`. If preceded by a variable name, it will bind the
corresponding slice to the variable. Example:
```
# #![feature(advanced_slice_patterns, slice_patterns)]
fn is_symmetric(list: &[u32]) -> bool {
match list {
[] | [_] => true,
[x, inside.., y] if x == y => is_symmetric(inside),
_ => false
}
}
fn main() {
let sym = &[0, 1, 4, 2, 4, 1, 0];
let not_sym = &[0, 1, 7, 2, 4, 1, 0];
assert!(is_symmetric(sym));
assert!(!is_symmetric(not_sym));
}
```
fields of a particular variant.
A `match` behaves differently depending on whether or not the head expression
is an [lvalue or an rvalue](#lvalues,-rvalues-and-temporaries). If the head
@@ -3298,30 +3058,15 @@ the inside of the match.
An example of a `match` expression:
```
#![feature(box_patterns)]
#![feature(box_syntax)]
# fn process_pair(a: i32, b: i32) { }
# fn process_ten() { }
let x = 1;
enum List<X> { Nil, Cons(X, Box<List<X>>) }
fn main() {
let x: List<i32> = List::Cons(10, box List::Cons(11, box List::Nil));
match x {
List::Cons(a, box List::Cons(b, _)) => {
process_pair(a, b);
}
List::Cons(10, _) => {
process_ten();
}
List::Nil => {
return;
}
_ => {
panic!();
}
}
match x {
1 => println!("one"),
2 => println!("two"),
3 => println!("three"),
4 => println!("four"),
5 => println!("five"),
_ => println!("something else"),
}
```
@@ -3334,28 +3079,12 @@ Subpatterns can also be bound to variables by the use of the syntax `variable @
subpattern`. For example:
```
#![feature(box_patterns)]
#![feature(box_syntax)]
let x = 1;
enum List { Nil, Cons(u32, Box<List>) }
fn is_sorted(list: &List) -> bool {
match *list {
List::Nil | List::Cons(_, box List::Nil) => true,
List::Cons(x, ref r @ box List::Cons(_, _)) => {
match *r {
box List::Cons(y, _) => (x <= y) && is_sorted(&**r),
_ => panic!()
}
}
}
match x {
e @ 1 ... 5 => println!("got a range element {}", e),
_ => println!("anything"),
}
fn main() {
let a = List::Cons(6, box List::Cons(7, box List::Cons(42, box List::Nil)));
assert!(is_sorted(&a));
}
```
Patterns can also dereference pointers by using the `&`, `&mut` and `box`
@@ -3416,22 +3145,26 @@ let message = match maybe_digit {
### If let expressions
```{.ebnf .gram}
if_let_expr : "if" "let" pat '=' expr '{' block '}'
else_tail ? ;
else_tail : "else" [ if_expr | if_let_expr | '{' block '}' ] ;
```
An `if let` expression is semantically identical to an `if` expression but in place
of a condition expression it expects a refutable let statement. If the value of the
expression on the right hand side of the let statement matches the pattern, the corresponding
block will execute, otherwise flow proceeds to the first `else` block that follows.
### While let loops
```{.ebnf .gram}
while_let_expr : "while" "let" pat '=' expr '{' block '}' ;
```
let dish = ("Ham", "Eggs");
// this body will be skipped because the pattern is refuted
if let ("Bacon", b) = dish {
println!("Bacon is served with {}", b);
}
// this body will execute
if let ("Ham", b) = dish {
println!("Ham is served with {}", b);
}
```
### While let loops
A `while let` loop is semantically identical to a `while` loop but in place of a
condition expression it expects a refutable let statement. If the value of the
@@ -3441,14 +3174,10 @@ Otherwise, the while expression completes.
### Return expressions
```{.ebnf .gram}
return_expr : "return" expr ? ;
```
Return expressions are denoted with the keyword `return`. Evaluating a `return`
expression moves its argument into the output slot of the current function,
destroys the current function activation frame, and transfers control to the
caller frame.
expression moves its argument into the designated output location for the
current function call, destroys the current function activation frame, and
transfers control to the caller frame.
An example of a `return` expression:
@@ -3465,7 +3194,7 @@ fn max(a: i32, b: i32) -> i32 {
## Types
Every slot, item and value in a Rust program has a type. The _type_ of a
Every variable, item and value in a Rust program has a type. The _type_ of a
*value* defines the interpretation of the memory holding it.
Built-in types and type-constructors are tightly integrated into the language,
@@ -3476,17 +3205,10 @@ User-defined types have limited capabilities.
The primitive types are the following:
* The "unit" type `()`, having the single "unit" value `()` (occasionally called
"nil"). [^unittype]
* The boolean type `bool` with values `true` and `false`.
* The machine types.
* The machine-dependent integer and floating-point types.
[^unittype]: The "unit" value `()` is *not* a sentinel "null pointer" value for
reference slots; the "unit" type is the implicit return type from functions
otherwise lacking a return type, and can be used in other contexts (such as
message-sending or type-parametric code) as a zero-size type.]
#### Machine types
The machine types are the following:
@@ -3525,9 +3247,9 @@ is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to
UTF-32 string.
A value of type `str` is a Unicode string, represented as an array of 8-bit
unsigned bytes holding a sequence of UTF-8 codepoints. Since `str` is of
unsigned bytes holding a sequence of UTF-8 code points. Since `str` is of
unknown size, it is not a _first-class_ type, but can only be instantiated
through a pointer type, such as `&str` or `String`.
through a pointer type, such as `&str`.
### Tuple types
@@ -3583,7 +3305,7 @@ to an array or slice is always bounds-checked.
A `struct` *type* is a heterogeneous product of other types, called the
*fields* of the type.[^structtype]
[^structtype]: `struct` types are analogous `struct` types in C,
[^structtype]: `struct` types are analogous to `struct` types in C,
the *record* types of the ML family,
or the *structure* types of the Lisp family.
@@ -3597,7 +3319,7 @@ a corresponding struct *expression*; the resulting `struct` value will always
have the same memory layout.
The fields of a `struct` may be qualified by [visibility
modifiers](#re-exporting-and-visibility), to allow access to data in a
modifiers](#visibility-and-privacy), to allow access to data in a
structure outside a module.
A _tuple struct_ type is just like a structure type, except that the fields are
@@ -3637,7 +3359,7 @@ constructor or `struct` field may refer, directly or indirectly, to the
enclosing `enum` or `struct` type itself. Such recursion has restrictions:
* Recursive types must include a nominal type in the recursion
(not mere [type definitions](#type-definitions),
(not mere [type definitions](grammar.html#type-definitions),
or other structural types such as [arrays](#array,-and-slice-types) or [tuples](#tuple-types)).
* A recursive `enum` item must have at least one non-recursive constructor
(in order to give the recursion a basis case).
@@ -3665,18 +3387,18 @@ varieties of pointer in Rust:
* References (`&`)
: These point to memory _owned by some other value_.
A reference type is written `&type` for some lifetime-variable `f`,
or just `&'a type` when you need an explicit lifetime.
A reference type is written `&type`,
or `&'a type` when you need to specify an explicit lifetime.
Copying a reference is a "shallow" operation:
it involves only copying the pointer itself.
Releasing a reference typically has no effect on the value it points to,
with the exception of temporary values, which are released when the last
reference to them is released.
Releasing a reference has no effect on the value it points to,
but a reference of a temporary value will keep it alive during the scope
of the reference itself.
* Raw pointers (`*`)
: Raw pointers are pointers without safety or liveness guarantees.
Raw pointers are written as `*const T` or `*mut T`,
for example `*const int` means a raw pointer to an integer.
for example `*const i32` means a raw pointer to a 32-bit integer.
Copying or dropping a raw pointer has no effect on the lifecycle of any
other value. Dereferencing a raw pointer or converting it to any other
pointer type is an [`unsafe` operation](#unsafe-functions).
@@ -3707,58 +3429,62 @@ let bo: Binop = add;
x = bo(5,7);
```
#### Function types for specific items
Internally to the compiler, there are also function types that are specific to a particular
function item. In the following snippet, for example, the internal types of the functions
`foo` and `bar` are different, despite the fact that they have the same signature:
```
fn foo() { }
fn bar() { }
```
The types of `foo` and `bar` can both be implicitly coerced to the fn
pointer type `fn()`. There is currently no syntax for unique fn types,
though the compiler will emit a type like `fn() {foo}` in error
messages to indicate "the unique fn type for the function `foo`".
### Closure types
```{.ebnf .notation}
closure_type := [ 'unsafe' ] [ '<' lifetime-list '>' ] '|' arg-list '|'
[ ':' bound-list ] [ '->' type ]
lifetime-list := lifetime | lifetime ',' lifetime-list
arg-list := ident ':' type | ident ':' type ',' arg-list
bound-list := bound | bound '+' bound-list
bound := path | lifetime
```
A [lambda expression](#lambda-expressions) produces a closure value with
a unique, anonymous type that cannot be written out.
The type of a closure mapping an input of type `A` to an output of type `B` is
`|A| -> B`. A closure with no arguments or return values has type `||`.
Depending on the requirements of the closure, its type implements one or
more of the closure traits:
An example of creating and calling a closure:
* `FnOnce`
: The closure can be called once. A closure called as `FnOnce`
can move out values from its environment.
```rust
let captured_var = 10;
* `FnMut`
: The closure can be called multiple times as mutable. A closure called as
`FnMut` can mutate values from its environment. `FnMut` implies
`FnOnce`.
let closure_no_args = || println!("captured_var={}", captured_var);
* `Fn`
: The closure can be called multiple times through a shared reference.
A closure called as `Fn` can neither move out from nor mutate values
from its environment. `Fn` implies `FnMut` and `FnOnce`.
let closure_args = |arg: i32| -> i32 {
println!("captured_var={}, arg={}", captured_var, arg);
arg // Note lack of semicolon after 'arg'
};
fn call_closure<F: Fn(), G: Fn(i32) -> i32>(c1: F, c2: G) {
c1();
c2(2);
}
call_closure(closure_no_args, closure_args);
```
### Object types
### Trait objects
Every trait item (see [traits](#traits)) defines a type with the same name as
the trait. This type is called the _object type_ of the trait. Object types
the trait. This type is called the _trait object_ of the trait. Trait objects
permit "late binding" of methods, dispatched using _virtual method tables_
("vtables"). Whereas most calls to trait methods are "early bound" (statically
resolved) to specific implementations at compile time, a call to a method on an
object type is only resolved to a vtable entry at compile time. The actual
trait objects is only resolved to a vtable entry at compile time. The actual
implementation for each vtable entry can vary on an object-by-object basis.
Given a pointer-typed expression `E` of type `&T` or `Box<T>`, where `T`
implements trait `R`, casting `E` to the corresponding pointer type `&R` or
`Box<R>` results in a value of the _object type_ `R`. This result is
`Box<R>` results in a value of the _trait object_ `R`. This result is
represented as a pair of pointers: the vtable pointer for the `T`
implementation of `R`, and the pointer value of `E`.
An example of an object type:
An example of a trait object:
```
trait Printable {
@@ -3778,7 +3504,7 @@ fn main() {
}
```
In this example, the trait `Printable` occurs as an object type in both the
In this example, the trait `Printable` occurs as a trait object in both the
type signature of `print`, and the cast expression in `main`.
### Type parameters
@@ -3787,24 +3513,25 @@ Within the body of an item that has type parameter declarations, the names of
its type parameters are types:
```ignore
fn map<A: Clone, B: Clone>(f: |A| -> B, xs: &[A]) -> Vec<B> {
fn to_vec<A: Clone>(xs: &[A]) -> Vec<A> {
if xs.is_empty() {
return vec![];
}
let first: B = f(xs[0].clone());
let mut rest: Vec<B> = map(f, xs.slice(1, xs.len()));
let first: A = xs[0].clone();
let mut rest: Vec<A> = to_vec(&xs[1..]);
rest.insert(0, first);
return rest;
rest
}
```
Here, `first` has type `B`, referring to `map`'s `B` type parameter; and `rest`
has type `Vec<B>`, a vector type with element type `B`.
Here, `first` has type `A`, referring to `to_vec`'s `A` type parameter; and `rest`
has type `Vec<A>`, a vector with element type `A`.
### Self types
The special type `self` has a meaning within methods inside an impl item. It
refers to the type of the implicit `self` argument. For example, in:
The special type `Self` has a meaning within traits and impls. In a trait definition, it refers
to an implicit type parameter representing the "implementing" type. In an impl,
it is an alias for the implementing type. For example, in:
```
trait Printable {
@@ -3818,21 +3545,24 @@ impl Printable for String {
}
```
`self` refers to the value of type `String` that is the receiver for a call to
the method `make_string`.
The notation `&self` is a shorthand for `self: &Self`. In this case,
in the impl, `Self` refers to the value of type `String` that is the
receiver for a call to the method `make_string`.
# The `Copy` trait
# Special traits
Rust has a special trait, `Copy`, which when implemented changes the semantics
of a value. Values whose type implements `Copy` are copied rather than moved
upon assignment.
Several traits define special evaluation behavior.
# The `Sized` trait
## The `Copy` trait
`Sized` is a special trait which indicates that the size of this type is known
at compile-time.
The `Copy` trait changes the semantics of a type implementing it. Values whose
type implements `Copy` are copied rather than moved upon assignment.
# The `Drop` trait
## The `Sized` trait
The `Sized` trait indicates that the size of this type is known at compile-time.
## The `Drop` trait
The `Drop` trait provides a destructor, to be run whenever a value of this type
is to be destroyed.
@@ -3840,10 +3570,12 @@ is to be destroyed.
# Memory model
A Rust program's memory consists of a static set of *items* and a *heap*.
Immutable portions of the heap may be shared between threads, mutable portions
may not.
Immutable portions of the heap may be safely shared between threads, mutable
portions may not be safely shared, but several mechanisms for effectively-safe
sharing of mutable values, built on unsafe code but enforcing a safe locking
discipline, exist in the standard library.
Allocations in the stack consist of *slots*, and allocations in the heap
Allocations in the stack consist of *variables*, and allocations in the heap
consist of *boxes*.
### Memory allocation and lifetime
@@ -3862,10 +3594,11 @@ in the heap, heap allocations may outlive the frame they are allocated within.
When a stack frame is exited, its local allocations are all released, and its
references to boxes are dropped.
### Memory slots
### Variables
A _slot_ is a component of a stack frame, either a function parameter, a
[temporary](#lvalues,-rvalues-and-temporaries), or a local variable.
A _variable_ is a component of a stack frame, either a named function parameter,
an anonymous [temporary](#lvalues,-rvalues-and-temporaries), or a named local
variable.
A _local variable_ (or *stack-local* allocation) holds a value directly,
allocated within the stack's memory. The value is a part of the stack frame.
@@ -3878,7 +3611,7 @@ Box<i32>, y: Box<i32>)` declare one mutable variable `x` and one immutable
variable `y`).
Methods that take either `self` or `Box<Self>` can optionally place them in a
mutable slot by prefixing them with `mut` (similar to regular arguments):
mutable variable by prefixing them with `mut` (similar to regular arguments):
```
trait Changer {
@@ -3893,44 +3626,7 @@ state. Subsequent statements within a function may or may not initialize the
local variables. Local variables can be used only after they have been
initialized; this is enforced by the compiler.
# Runtime services, linkage and debugging
The Rust _runtime_ is a relatively compact collection of Rust code that
provides fundamental services and datatypes to all Rust threads at run-time. It
is smaller and simpler than many modern language runtimes. It is tightly
integrated into the language's execution model of memory, threads, communication
and logging.
### Memory allocation
The runtime memory-management system is based on a _service-provider
interface_, through which the runtime requests blocks of memory from its
environment and releases them back to its environment when they are no longer
needed. The default implementation of the service-provider interface consists
of the C runtime functions `malloc` and `free`.
The runtime memory-management system, in turn, supplies Rust threads with
facilities for allocating releasing stacks, as well as allocating and freeing
heap data.
### Built in types
The runtime provides C and Rust code to assist with various built-in types,
such as arrays, strings, and the low level communication system (ports,
channels, threads).
Support for other built-in types such as simple types, tuples and enums is
open-coded by the Rust compiler.
### Thread scheduling and communication
The runtime provides code to manage inter-thread communication. This includes
the system of thread-lifecycle state transitions depending on the contents of
queues, as well as code to copy values between queues and their recipients and
to serialize values for transmission over operating-system inter-process
communication facilities.
### Linkage
# Linkage
The Rust compiler supports various methods to link crates together both
statically and dynamically. This section will explore the various methods to
@@ -4068,4 +3764,4 @@ that have since been removed):
pattern syntax
[ffi]: book/ffi.html
[plugin]: book/plugins.html
[plugin]: book/compiler-plugins.html
+11 -4
View File
@@ -8,7 +8,7 @@ good at: embedding in other languages, programs with specific space and time
requirements, and writing low-level code, like device drivers and operating
systems. It improves on current languages targeting this space by having a
number of compile-time safety checks that produce no runtime overhead, while
eliminating all data races. Rust also aims to achieve zero-cost abstrations
eliminating all data races. Rust also aims to achieve zero-cost abstractions
even though some of these abstractions feel like those of a high-level
language. Even then, Rust still allows precise control like a low-level
language would.
@@ -24,6 +24,7 @@ is the first. After this:
* [Syntax and Semantics][ss] - Each bit of Rust, broken down into small chunks.
* [Nightly Rust][nr] - Cutting-edge features that arent in stable builds yet.
* [Glossary][gl] - A reference of terms used in the book.
* [Academic Research][ar] - Literature that influenced Rust.
[gs]: getting-started.html
[lr]: learn-rust.html
@@ -31,6 +32,7 @@ is the first. After this:
[ss]: syntax-and-semantics.html
[nr]: nightly-rust.html
[gl]: glossary.html
[ar]: academic-research.html
After reading this introduction, youll want to dive into either Learn Rust
or Syntax and Semantics, depending on your preference: Learn Rust if you
@@ -38,6 +40,11 @@ want to dive in with a project, or Syntax and Semantics if you prefer to
start small, and learn a single concept thoroughly before moving onto the next.
Copious cross-linking connects these parts together.
### Contributing
The source files from which this book is generated can be found on Github:
[github.com/rust-lang/rust/tree/master/src/doc/trpl](https://github.com/rust-lang/rust/tree/master/src/doc/trpl)
## A brief introduction to Rust
Is Rust a language you might be interested in? Lets examine a few small code
@@ -125,7 +132,7 @@ vector. When we try to compile this program, we get an error:
```text
error: cannot borrow `x` as mutable because it is also borrowed as immutable
x.push(4);
x.push("foo");
^
note: previous borrow of `x` occurs here; the immutable borrow prevents
subsequent moves or mutable borrows of `x` until the borrow ends
@@ -165,7 +172,7 @@ fn main() {
Rust has [move semantics][move] by default, so if we want to make a copy of some
data, we call the `clone()` method. In this example, `y` is no longer a reference
to the vector stored in `x`, but a copy of its first element, `"hello"`. Now
to the vector stored in `x`, but a copy of its first element, `"Hello"`. Now
that we dont have a reference, our `push()` works just fine.
[move]: move-semantics.html
@@ -188,5 +195,5 @@ fn main() {
We created an inner scope with an additional set of curly braces. `y` will go out of
scope before we call `push()`, and so were all good.
This concept of ownership isnt just good for preventing danging pointers, but an
This concept of ownership isnt just good for preventing dangling pointers, but an
entire set of related problems, like iterator invalidation, concurrency, and more.
+15 -12
View File
@@ -5,16 +5,20 @@
* [Hello, world!](hello-world.md)
* [Hello, Cargo!](hello-cargo.md)
* [Learn Rust](learn-rust.md)
* [Guessing Game](guessing-game.md)
* [Dining Philosophers](dining-philosophers.md)
* [Rust inside other languages](rust-inside-other-languages.md)
* [Effective Rust](effective-rust.md)
* [The Stack and the Heap](the-stack-and-the-heap.md)
* [Debug and Display](debug-and-display.md)
* [Testing](testing.md)
* [Conditional Compilation](conditional-compilation.md)
* [Documentation](documentation.md)
* [Iterators](iterators.md)
* [Concurrency](concurrency.md)
* [Error Handling](error-handling.md)
* [FFI](ffi.md)
* [Deref coercions](deref-coercions.md)
* [Borrow and AsRef](borrow-and-asref.md)
* [Release Channels](release-channels.md)
* [Syntax and Semantics](syntax-and-semantics.md)
* [Variable Bindings](variable-bindings.md)
* [Functions](functions.md)
@@ -27,34 +31,32 @@
* [References and Borrowing](references-and-borrowing.md)
* [Lifetimes](lifetimes.md)
* [Mutability](mutability.md)
* [Move semantics](move-semantics.md)
* [Structs](structs.md)
* [Enums](enums.md)
* [Match](match.md)
* [Patterns](patterns.md)
* [Structs](structs.md)
* [Method Syntax](method-syntax.md)
* [Drop](drop.md)
* [Vectors](vectors.md)
* [Strings](strings.md)
* [Traits](traits.md)
* [Operators and Overloading](operators-and-overloading.md)
* [Generics](generics.md)
* [Traits](traits.md)
* [Drop](drop.md)
* [if let](if-let.md)
* [Trait Objects](trait-objects.md)
* [Closures](closures.md)
* [Universal Function Call Syntax](ufcs.md)
* [Crates and Modules](crates-and-modules.md)
* [`static`](static.md)
* [`const`](const.md)
* [Tuple Structs](tuple-structs.md)
* [`const` and `static`](const-and-static.md)
* [Attributes](attributes.md)
* [Conditional Compilation](conditional-compilation.md)
* [`type` aliases](type-aliases.md)
* [Casting between types](casting-between-types.md)
* [Associated Types](associated-types.md)
* [Unsized Types](unsized-types.md)
* [Operators and Overloading](operators-and-overloading.md)
* [Deref coercions](deref-coercions.md)
* [Macros](macros.md)
* [`unsafe` Code](unsafe-code.md)
* [Raw Pointers](raw-pointers.md)
* [`unsafe`](unsafe.md)
* [Nightly Rust](nightly-rust.md)
* [Compiler Plugins](compiler-plugins.md)
* [Inline Assembly](inline-assembly.md)
@@ -65,5 +67,6 @@
* [Benchmark Tests](benchmark-tests.md)
* [Box Syntax and Patterns](box-syntax-and-patterns.md)
* [Slice Patterns](slice-patterns.md)
* [Associated Constants](associated-constants.md)
* [Glossary](glossary.md)
* [Academic Research](academic-research.md)
+79
View File
@@ -0,0 +1,79 @@
% Associated Constants
With the `associated_consts` feature, you can define constants like this:
```rust
#![feature(associated_consts)]
trait Foo {
const ID: i32;
}
impl Foo for i32 {
const ID: i32 = 1;
}
fn main() {
assert_eq!(1, i32::ID);
}
```
Any implementor of `Foo` will have to define `ID`. Without the definition:
```rust,ignore
#![feature(associated_consts)]
trait Foo {
const ID: i32;
}
impl Foo for i32 {
}
```
gives
```text
error: not all trait items implemented, missing: `ID` [E0046]
impl Foo for i32 {
}
```
A default value can be implemented as well:
```rust
#![feature(associated_consts)]
trait Foo {
const ID: i32 = 1;
}
impl Foo for i32 {
}
impl Foo for i64 {
const ID: i32 = 5;
}
fn main() {
assert_eq!(1, i32::ID);
assert_eq!(5, i64::ID);
}
```
As you can see, when implementing `Foo`, you can leave it unimplemented, as
with `i32`. It will then use the default value. But, as in `i64`, we can also
add our own definition.
Associated constants dont have to be associated with a trait. An `impl` block
for a `struct` works fine too:
```rust
#![feature(associated_consts)]
struct Foo;
impl Foo {
pub const FOO: u32 = 3;
}
```
+8 -8
View File
@@ -1,8 +1,8 @@
% Associated Types
Associated types are a powerful part of Rust's type system. They're related to
the idea of a 'type family', in other words, grouping multiple types together. That
description is a bit abstract, so let's dive right into an example. If you want
Associated types are a powerful part of Rusts type system. Theyre related to
the idea of a type family, in other words, grouping multiple types together. That
description is a bit abstract, so lets dive right into an example. If you want
to write a `Graph` trait, you have two types to be generic over: the node type
and the edge type. So you might write a trait, `Graph<N, E>`, that looks like
this:
@@ -48,11 +48,11 @@ fn distance<G: Graph>(graph: &G, start: &G::N, end: &G::N) -> uint { ... }
No need to deal with the `E`dge type here!
Let's go over all this in more detail.
Lets go over all this in more detail.
## Defining associated types
Let's build that `Graph` trait. Here's the definition:
Lets build that `Graph` trait. Heres the definition:
```rust
trait Graph {
@@ -86,7 +86,7 @@ trait Graph {
## Implementing associated types
Just like any trait, traits that use associated types use the `impl` keyword to
provide implementations. Here's a simple implementation of Graph:
provide implementations. Heres a simple implementation of Graph:
```rust
# trait Graph {
@@ -118,13 +118,13 @@ impl Graph for MyGraph {
This silly implementation always returns `true` and an empty `Vec<Edge>`, but it
gives you an idea of how to implement this kind of thing. We first need three
`struct`s, one for the graph, one for the node, and one for the edge. If it made
more sense to use a different type, that would work as well, we're just going to
more sense to use a different type, that would work as well, were just going to
use `struct`s for all three here.
Next is the `impl` line, which is just like implementing any other trait.
From here, we use `=` to define our associated types. The name the trait uses
goes on the left of the `=`, and the concrete type we're `impl`ementing this
goes on the left of the `=`, and the concrete type were `impl`ementing this
for goes on the right. Finally, we use the concrete types in our function
declarations.
+68 -1
View File
@@ -1,3 +1,70 @@
% Attributes
Coming Soon!
Declarations can be annotated with attributes in Rust. They look like this:
```rust
#[test]
# fn foo() {}
```
or like this:
```rust
# mod foo {
#![test]
# }
```
The difference between the two is the `!`, which changes what the attribute
applies to:
```rust,ignore
#[foo]
struct Foo;
mod bar {
#![bar]
}
```
The `#[foo]` attribute applies to the next item, which is the `struct`
declaration. The `#![bar]` attribute applies to the item enclosing it, which is
the `mod` declaration. Otherwise, theyre the same. Both change the meaning of
the item theyre attached to somehow.
For example, consider a function like this:
```rust
#[test]
fn check() {
assert_eq!(2, 1 + 1);
}
```
It is marked with `#[test]`. This means its special: when you run
[tests][tests], this function will execute. When you compile as usual, it wont
even be included. This function is now a test function.
[tests]: testing.html
Attributes may also have additional data:
```rust
#[inline(always)]
fn super_fast_fn() {
# }
```
Or even keys and values:
```rust
#[cfg(target_os = "macos")]
mod macos_only {
# }
```
Rust attributes are used for a number of different things. There is a full list
of attributes [in the reference][reference]. Currently, you are not allowed to
create your own attributes, the Rust compiler defines them.
[reference]: ../reference.html#attributes
+1 -1
View File
@@ -13,7 +13,7 @@ pub fn add_two(a: i32) -> i32 {
}
#[cfg(test)]
mod test {
mod tests {
use super::*;
use test::Bencher;
+93
View File
@@ -0,0 +1,93 @@
% Borrow and AsRef
The [`Borrow`][borrow] and [`AsRef`][asref] traits are very similar, but
different. Heres a quick refresher on what these two traits mean.
[borrow]: ../std/borrow/trait.Borrow.html
[asref]: ../std/convert/trait.AsRef.html
# Borrow
The `Borrow` trait is used when youre writing a datastructure, and you want to
use either an owned or borrowed type as synonymous for some purpose.
For example, [`HashMap`][hashmap] has a [`get` method][get] which uses `Borrow`:
```rust,ignore
fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V>
where K: Borrow<Q>,
Q: Hash + Eq
```
[hashmap]: ../std/collections/struct.HashMap.html
[get]: ../std/collections/struct.HashMap.html#method.get
This signature is pretty complicated. The `K` parameter is what were interested
in here. It refers to a parameter of the `HashMap` itself:
```rust,ignore
struct HashMap<K, V, S = RandomState> {
```
The `K` parameter is the type of _key_ the `HashMap` uses. So, looking at
the signature of `get()` again, we can use `get()` when the key implements
`Borrow<Q>`. That way, we can make a `HashMap` which uses `String` keys,
but use `&str`s when were searching:
```rust
use std::collections::HashMap;
let mut map = HashMap::new();
map.insert("Foo".to_string(), 42);
assert_eq!(map.get("Foo"), Some(&42));
```
This is because the standard library has `impl Borrow<str> for String`.
For most types, when you want to take an owned or borrowed type, a `&T` is
enough. But one area where `Borrow` is effective is when theres more than one
kind of borrowed value. Slices are an area where this is especially true: you
can have both an `&[T]` or a `&mut [T]`. If we wanted to accept both of these
types, `Borrow` is up for it:
```
use std::borrow::Borrow;
use std::fmt::Display;
fn foo<T: Borrow<i32> + Display>(a: T) {
println!("a is borrowed: {}", a);
}
let mut i = 5;
foo(&i);
foo(&mut i);
```
This will print out `a is borrowed: 5` twice.
# AsRef
The `AsRef` trait is a conversion trait. Its used for converting some value to
a reference in generic code. Like this:
```rust
let s = "Hello".to_string();
fn foo<T: AsRef<str>>(s: T) {
let slice = s.as_ref();
}
```
# Which should I use?
We can see how theyre kind of the same: they both deal with owned and borrowed
versions of some type. However, theyre a bit different.
Choose `Borrow` when you want to abstract over different kinds of borrowing, or
when youre building a datastructure that treats owned and borrowed values in
equivalent ways, such as hashing and comparison.
Choose `AsRef` when you want to convert something to a reference directly, and
youre writing generic code.
+87 -1
View File
@@ -1,3 +1,89 @@
% Casting Between Types
Coming Soon
Rust, with its focus on safety, provides two different ways of casting
different types between each other. The first, `as`, is for safe casts.
In contrast, `transmute` allows for arbitrary casting, and is one of the
most dangerous features of Rust!
# `as`
The `as` keyword does basic casting:
```rust
let x: i32 = 5;
let y = x as i64;
```
It only allows certain kinds of casting, however:
```rust,ignore
let a = [0u8, 0u8, 0u8, 0u8];
let b = a as u32; // four eights makes 32
```
This errors with:
```text
error: non-scalar cast: `[u8; 4]` as `u32`
let b = a as u32; // four eights makes 32
^~~~~~~~
```
Its a non-scalar cast because we have multiple values here: the four
elements of the array. These kinds of casts are very dangerous, because they
make assumptions about the way that multiple underlying structures are
implemented. For this, we need something more dangerous.
# `transmute`
The `transmute` function is provided by a [compiler intrinsic][intrinsics], and
what it does is very simple, but very scary. It tells Rust to treat a value of
one type as though it were another type. It does this regardless of the
typechecking system, and just completely trusts you.
[intrinsics]: intrinsics.html
In our previous example, we know that an array of four `u8`s represents a `u32`
properly, and so we want to do the cast. Using `transmute` instead of `as`,
Rust lets us:
```rust
use std::mem;
unsafe {
let a = [0u8, 0u8, 0u8, 0u8];
let b = mem::transmute::<[u8; 4], u32>(a);
}
```
We have to wrap the operation in an `unsafe` block for this to compile
successfully. Technically, only the `mem::transmute` call itself needs to be in
the block, but it's nice in this case to enclose everything related, so you
know where to look. In this case, the details about `a` are also important, and
so they're in the block. You'll see code in either style, sometimes the context
is too far away, and wrapping all of the code in `unsafe` isn't a great idea.
While `transmute` does very little checking, it will at least make sure that
the types are the same size. This errors:
```rust,ignore
use std::mem;
unsafe {
let a = [0u8, 0u8, 0u8, 0u8];
let b = mem::transmute::<[u8; 4], u64>(a);
}
```
with:
```text
error: transmute called on types with different sizes: [u8; 4] (32 bits) to u64
(64 bits)
```
Other than that, you're on your own!
+30 -30
View File
@@ -1,9 +1,9 @@
% Closures
Rust not only has named functions, but anonymous functions as well. Anonymous
functions that have an associated environment are called 'closures', because they
functions that have an associated environment are called closures, because they
close over an environment. Rust has a really great implementation of them, as
we'll see.
well see.
# Syntax
@@ -15,7 +15,7 @@ let plus_one = |x: i32| x + 1;
assert_eq!(2, plus_one(1));
```
We create a binding, `plus_one`, and assign it to a closure. The closure's
We create a binding, `plus_one`, and assign it to a closure. The closures
arguments go between the pipes (`|`), and the body is an expression, in this
case, `x + 1`. Remember that `{ }` is an expression, so we can have multi-line
closures too:
@@ -33,7 +33,7 @@ let plus_two = |x| {
assert_eq!(4, plus_two(2));
```
You'll notice a few things about closures that are a bit different than regular
Youll notice a few things about closures that are a bit different than regular
functions defined with `fn`. The first of which is that we did not need to
annotate the types of arguments the closure takes or the values it returns. We
can:
@@ -44,13 +44,13 @@ let plus_one = |x: i32| -> i32 { x + 1 };
assert_eq!(2, plus_one(1));
```
But we don't have to. Why is this? Basically, it was chosen for ergonomic reasons.
But we dont have to. Why is this? Basically, it was chosen for ergonomic reasons.
While specifying the full type for named functions is helpful with things like
documentation and type inference, the types of closures are rarely documented
since theyre anonymous, and they dont cause the kinds of error-at-a-distance
that inferring named function types can.
The second is that the syntax is similar, but a bit different. I've added spaces
The second is that the syntax is similar, but a bit different. Ive added spaces
here to make them look a little closer:
```rust
@@ -59,11 +59,11 @@ let plus_one_v2 = |x: i32 | -> i32 { x + 1 };
let plus_one_v3 = |x: i32 | x + 1 ;
```
Small differences, but they're similar in ways.
Small differences, but theyre similar in ways.
# Closures and their environment
Closures are called such because they 'close over their environment.' It
Closures are called such because they close over their environment. It
looks like this:
```rust
@@ -105,7 +105,7 @@ fn main() {
^
```
A verbose yet helpful error message! As it says, we can't take a mutable borrow
A verbose yet helpful error message! As it says, we cant take a mutable borrow
on `num` because the closure is already borrowing it. If we let the closure go
out of scope, we can:
@@ -140,7 +140,7 @@ let takes_nums = || nums;
```
`Vec<T>` has ownership over its contents, and therefore, when we refer to it
in our closure, we have to take ownership of `nums`. It's the same as if we'd
in our closure, we have to take ownership of `nums`. Its the same as if wed
passed `nums` to a function that took ownership of it.
## `move` closures
@@ -156,7 +156,7 @@ let owns_num = move |x: i32| x + num;
Now, even though the keyword is `move`, the variables follow normal move semantics.
In this case, `5` implements `Copy`, and so `owns_num` takes ownership of a copy
of `num`. So what's the difference?
of `num`. So whats the difference?
```rust
let mut num = 5;
@@ -171,11 +171,11 @@ assert_eq!(10, num);
```
So in this case, our closure took a mutable reference to `num`, and then when
we called `add_num`, it mutated the underlying value, as we'd expect. We also
we called `add_num`, it mutated the underlying value, as wed expect. We also
needed to declare `add_num` as `mut` too, because were mutating its
environment.
If we change to a `move` closure, it's different:
If we change to a `move` closure, its different:
```rust
let mut num = 5;
@@ -203,8 +203,8 @@ you tons of control over what your code does, and closures are no different.
# Closure implementation
Rust's implementation of closures is a bit different than other languages. They
are effectively syntax sugar for traits. You'll want to make sure to have read
Rusts implementation of closures is a bit different than other languages. They
are effectively syntax sugar for traits. Youll want to make sure to have read
the [traits chapter][traits] before this one, as well as the chapter on [trait
objects][trait-objects].
@@ -237,9 +237,9 @@ pub trait FnOnce<Args> {
# }
```
You'll notice a few differences between these traits, but a big one is `self`:
Youll notice a few differences between these traits, but a big one is `self`:
`Fn` takes `&self`, `FnMut` takes `&mut self`, and `FnOnce` takes `self`. This
covers all three kinds of `self` via the usual method call syntax. But we've
covers all three kinds of `self` via the usual method call syntax. But weve
split them up into three traits, rather than having a single one. This gives us
a large amount of control over what kind of closures we can take.
@@ -253,7 +253,7 @@ Now that we know that closures are traits, we already know how to accept and
return closures: just like any other trait!
This also means that we can choose static vs dynamic dispatch as well. First,
let's write a function which takes something callable, calls it, and returns
lets write a function which takes something callable, calls it, and returns
the result:
```rust
@@ -271,7 +271,7 @@ assert_eq!(3, answer);
We pass our closure, `|x| x + 2`, to `call_with_one`. It just does what it
suggests: it calls the closure, giving it `1` as an argument.
Let's examine the signature of `call_with_one` in more depth:
Lets examine the signature of `call_with_one` in more depth:
```rust
fn call_with_one<F>(some_closure: F) -> i32
@@ -280,7 +280,7 @@ fn call_with_one<F>(some_closure: F) -> i32
```
We take one parameter, and it has the type `F`. We also return a `i32`. This part
isn't interesting. The next part is:
isnt interesting. The next part is:
```rust
# fn call_with_one<F>(some_closure: F) -> i32
@@ -292,9 +292,9 @@ Because `Fn` is a trait, we can bound our generic with it. In this case, our clo
takes a `i32` as an argument and returns an `i32`, and so the generic bound we use
is `Fn(i32) -> i32`.
There's one other key point here: because we're bounding a generic with a
trait, this will get monomorphized, and therefore, we'll be doing static
dispatch into the closure. That's pretty neat. In many langauges, closures are
Theres one other key point here: because were bounding a generic with a
trait, this will get monomorphized, and therefore, well be doing static
dispatch into the closure. Thats pretty neat. In many languages, closures are
inherently heap allocated, and will always involve dynamic dispatch. In Rust,
we can stack allocate our closure environment, and statically dispatch the
call. This happens quite often with iterators and their adapters, which often
@@ -320,7 +320,7 @@ to our closure when we pass it to `call_with_one`, so we use `&||`.
Its very common for functional-style code to return closures in various
situations. If you try to return a closure, you may run into an error. At
first, it may seem strange, but we'll figure it out. Here's how you'd probably
first, it may seem strange, but well figure it out. Heres how youd probably
try to return a closure from a function:
```rust,ignore
@@ -361,7 +361,7 @@ In order to return something from a function, Rust needs to know what
size the return type is. But since `Fn` is a trait, it could be various
things of various sizes: many different types can implement `Fn`. An easy
way to give something a size is to take a reference to it, as references
have a known size. So we'd write this:
have a known size. So wed write this:
```rust,ignore
fn factory() -> &(Fn(i32) -> Vec<i32>) {
@@ -385,7 +385,7 @@ fn factory() -> &(Fn(i32) -> i32) {
```
Right. Because we have a reference, we need to give it a lifetime. But
our `factory()` function takes no arguments, so elision doesn't kick in
our `factory()` function takes no arguments, so elision doesnt kick in
here. What lifetime can we choose? `'static`:
```rust,ignore
@@ -414,7 +414,7 @@ error: mismatched types:
```
This error is letting us know that we don't have a `&'static Fn(i32) -> i32`,
This error is letting us know that we dont have a `&'static Fn(i32) -> i32`,
we have a `[closure <anon>:7:9: 7:20]`. Wait, what?
Because each closure generates its own environment `struct` and implementation
@@ -422,7 +422,7 @@ of `Fn` and friends, these types are anonymous. They exist just solely for
this closure. So Rust shows them as `closure <anon>`, rather than some
autogenerated name.
But why doesn't our closure implement `&'static Fn`? Well, as we discussed before,
But why doesnt our closure implement `&'static Fn`? Well, as we discussed before,
closures borrow their environment. And in this case, our environment is based
on a stack-allocated `5`, the `num` variable binding. So the borrow has a lifetime
of the stack frame. So if we returned this closure, the function call would be
@@ -445,7 +445,7 @@ assert_eq!(6, answer);
# }
```
We use a trait object, by `Box`ing up the `Fn`. There's just one last problem:
We use a trait object, by `Box`ing up the `Fn`. Theres just one last problem:
```text
error: `num` does not live long enough
@@ -471,5 +471,5 @@ assert_eq!(6, answer);
```
By making the inner closure a `move Fn`, we create a new stack frame for our
closure. By `Box`ing it up, we've given it a known size, and allowing it to
closure. By `Box`ing it up, weve given it a known size, and allowing it to
escape our stack frame.
+1 -1
View File
@@ -176,7 +176,7 @@ for a full example, the core of which is reproduced here:
```ignore
declare_lint!(TEST_LINT, Warn,
"Warn about items named 'lintme'")
"Warn about items named 'lintme'");
struct Pass;
+6 -6
View File
@@ -116,7 +116,7 @@ use std::thread;
fn main() {
let mut data = vec![1u32, 2, 3];
for i in 0..2 {
for i in 0..3 {
thread::spawn(move || {
data[i] += 1;
});
@@ -154,7 +154,7 @@ use std::sync::Mutex;
fn main() {
let mut data = Mutex::new(vec![1u32, 2, 3]);
for i in 0..2 {
for i in 0..3 {
let data = data.lock().unwrap();
thread::spawn(move || {
data[i] += 1;
@@ -176,8 +176,8 @@ Here's the error:
^~~~~~~~~~~~~
```
You see, [`Mutex`](std/sync/struct.Mutex.html) has a
[`lock`](http://doc.rust-lang.org/nightly/std/sync/struct.Mutex.html#method.lock)
You see, [`Mutex`](../std/sync/struct.Mutex.html) has a
[`lock`](../std/sync/struct.Mutex.html#method.lock)
method which has this signature:
```ignore
@@ -196,7 +196,7 @@ use std::thread;
fn main() {
let data = Arc::new(Mutex::new(vec![1u32, 2, 3]));
for i in 0..2 {
for i in 0..3 {
let data = data.clone();
thread::spawn(move || {
let mut data = data.lock().unwrap();
@@ -217,7 +217,7 @@ thread more closely:
# use std::thread;
# fn main() {
# let data = Arc::new(Mutex::new(vec![1u32, 2, 3]));
# for i in 0..2 {
# for i in 0..3 {
# let data = data.clone();
thread::spawn(move || {
let mut data = data.lock().unwrap();
+91 -1
View File
@@ -1,3 +1,93 @@
% Conditional Compilation
Coming Soon!
Rust has a special attribute, `#[cfg]`, which allows you to compile code
based on a flag passed to the compiler. It has two forms:
```rust
#[cfg(foo)]
# fn foo() {}
#[cfg(bar = "baz")]
# fn bar() {}
```
They also have some helpers:
```rust
#[cfg(any(unix, windows))]
# fn foo() {}
#[cfg(all(unix, target_pointer_width = "32"))]
# fn bar() {}
#[cfg(not(foo))]
# fn not_foo() {}
```
These can nest arbitrarily:
```rust
#[cfg(any(not(unix), all(target_os="macos", target_arch = "powerpc")))]
# fn foo() {}
```
As for how to enable or disable these switches, if youre using Cargo,
they get set in the [`[features]` section][features] of your `Cargo.toml`:
[features]: http://doc.crates.io/manifest.html#the-[features]-section
```toml
[features]
# no features by default
default = []
# The “secure-password” feature depends on the bcrypt package.
secure-password = ["bcrypt"]
```
When you do this, Cargo passes along a flag to `rustc`:
```text
--cfg feature="${feature_name}"
```
The sum of these `cfg` flags will determine which ones get activated, and
therefore, which code gets compiled. Lets take this code:
```rust
#[cfg(feature = "foo")]
mod foo {
}
```
If we compile it with `cargo build --features "foo"`, it will send the `--cfg
feature="foo"` flag to `rustc`, and the output will have the `mod foo` in it.
If we compile it with a regular `cargo build`, no extra flags get passed on,
and so, no `foo` module will exist.
# cfg_attr
You can also set another attribute based on a `cfg` variable with `cfg_attr`:
```rust
#[cfg_attr(a, b)]
# fn foo() {}
```
Will be the same as `#[b]` if `a` is set by `cfg` attribute, and nothing otherwise.
# cfg!
The `cfg!` [syntax extension][compilerplugins] lets you use these kinds of flags
elsewhere in your code, too:
```rust
if cfg!(target_os = "macos") || cfg!(target_os = "ios") {
println!("Think Different!");
}
```
[compilerplugins]: compiler-plugins.html
These will be replaced by a `true` or `false` at compile-time, depending on the
configuration settings.
+86
View File
@@ -0,0 +1,86 @@
% `const` and `static`
Rust has a way of defining constants with the `const` keyword:
```rust
const N: i32 = 5;
```
Unlike [`let`][let] bindings, you must annotate the type of a `const`.
[let]: variable-bindings.html
Constants live for the entire lifetime of a program. More specifically,
constants in Rust have no fixed address in memory. This is because theyre
effectively inlined to each place that theyre used. References to the same
constant are not necessarily guaranteed to refer to the same memory address for
this reason.
# `static`
Rust provides a global variable sort of facility in static items. Theyre
similar to constants, but static items arent inlined upon use. This means that
there is only one instance for each value, and its at a fixed location in
memory.
Heres an example:
```rust
static N: i32 = 5;
```
Unlike [`let`][let] bindings, you must annotate the type of a `static`.
[let]: variable-bindings.html
Statics live for the entire lifetime of a program, and therefore any
reference stored in a constant has a [`static` lifetime][lifetimes]:
```rust
static NAME: &'static str = "Steve";
```
[lifetimes]: lifetimes.html
## Mutability
You can introduce mutability with the `mut` keyword:
```rust
static mut N: i32 = 5;
```
Because this is mutable, one thread could be updating `N` while another is
reading it, causing memory unsafety. As such both accessing and mutating a
`static mut` is [`unsafe`][unsafe], and so must be done in an `unsafe` block:
```rust
# static mut N: i32 = 5;
unsafe {
N += 1;
println!("N: {}", N);
}
```
[unsafe]: unsafe.html
Furthermore, any type stored in a `static` must be `Sync`.
# Initializing
Both `const` and `static` have requirements for giving them a value. They may
only be given a value thats a constant expression. In other words, you cannot
use the result of a function call or anything similarly complex or at runtime.
# Which construct should I use?
Almost always, if you can choose between the two, choose `const`. Its pretty
rare that you actually want a memory location associated with your constant,
and using a const allows for optimizations like constant propagation not only
in your crate but downstream crates.
A const can be thought of as a `#define` in C: it has metadata overhead but it
has no runtime overhead. “Should I use a #define or a static in C,” is largely
the same question as whether you should use a const or a static in Rust.
+43 -43
View File
@@ -1,16 +1,16 @@
% Crates and Modules
When a project starts getting large, it's considered good software
When a project starts getting large, its considered good software
engineering practice to split it up into a bunch of smaller pieces, and then
fit them together. It's also important to have a well-defined interface, so
fit them together. Its also important to have a well-defined interface, so
that some of your functionality is private, and some is public. To facilitate
these kinds of things, Rust has a module system.
# Basic terminology: Crates and Modules
Rust has two distinct terms that relate to the module system: *crate* and
*module*. A crate is synonymous with a *library* or *package* in other
languages. Hence "Cargo" as the name of Rust's package management tool: you
Rust has two distinct terms that relate to the module system: crate and
module. A crate is synonymous with a library or package in other
languages. Hence Cargo as the name of Rusts package management tool: you
ship your crates to others with Cargo. Crates can produce an executable or a
library, depending on the project.
@@ -18,10 +18,10 @@ Each crate has an implicit *root module* that contains the code for that crate.
You can then define a tree of sub-modules under that root module. Modules allow
you to partition your code within the crate itself.
As an example, let's make a *phrases* crate, which will give us various phrases
in different languages. To keep things simple, we'll stick to "greetings" and
"farewells" as two kinds of phrases, and use English and Japanese (日本語) as
two languages for those phrases to be in. We'll use this module layout:
As an example, lets make a *phrases* crate, which will give us various phrases
in different languages. To keep things simple, well stick to greetings and
farewells as two kinds of phrases, and use English and Japanese (日本語) as
two languages for those phrases to be in. Well use this module layout:
```text
+-----------+
@@ -47,7 +47,7 @@ In this example, `phrases` is the name of our crate. All of the rest are
modules. You can see that they form a tree, branching out from the crate
*root*, which is the root of the tree: `phrases` itself.
Now that we have a plan, let's define these modules in code. To start,
Now that we have a plan, lets define these modules in code. To start,
generate a new crate with Cargo:
```bash
@@ -72,7 +72,7 @@ above.
# Defining Modules
To define each of our modules, we use the `mod` keyword. Let's make our
To define each of our modules, we use the `mod` keyword. Lets make our
`src/lib.rs` look like this:
```
@@ -101,7 +101,7 @@ Within a given `mod`, you can declare sub-`mod`s. We can refer to sub-modules
with double-colon (`::`) notation: our four nested modules are
`english::greetings`, `english::farewells`, `japanese::greetings`, and
`japanese::farewells`. Because these sub-modules are namespaced under their
parent module, the names don't conflict: `english::greetings` and
parent module, the names dont conflict: `english::greetings` and
`japanese::greetings` are distinct, even though their names are both
`greetings`.
@@ -116,11 +116,11 @@ build deps examples libphrases-a7448e02a0468eaa.rlib native
```
`libphrase-hash.rlib` is the compiled crate. Before we see how to use this
crate from another crate, let's break it up into multiple files.
crate from another crate, lets break it up into multiple files.
# Multiple file crates
If each crate were just one file, these files would get very large. It's often
If each crate were just one file, these files would get very large. Its often
easier to split up crates into multiple files, and Rust supports this in two
ways.
@@ -141,7 +141,7 @@ mod english;
If we do that, Rust will expect to find either a `english.rs` file, or a
`english/mod.rs` file with the contents of our module.
Note that in these files, you don't need to re-declare the module: that's
Note that in these files, you dont need to re-declare the module: thats
already been done with the initial `mod` declaration.
Using these two techniques, we can break up our crate into two directories and
@@ -180,7 +180,7 @@ mod japanese;
These two declarations tell Rust to look for either `src/english.rs` and
`src/japanese.rs`, or `src/english/mod.rs` and `src/japanese/mod.rs`, depending
on our preference. In this case, because our modules have sub-modules, we've
on our preference. In this case, because our modules have sub-modules, weve
chosen the second. Both `src/english/mod.rs` and `src/japanese/mod.rs` look
like this:
@@ -192,11 +192,11 @@ mod farewells;
Again, these declarations tell Rust to look for either
`src/english/greetings.rs` and `src/japanese/greetings.rs` or
`src/english/farewells/mod.rs` and `src/japanese/farewells/mod.rs`. Because
these sub-modules don't have their own sub-modules, we've chosen to make them
these sub-modules dont have their own sub-modules, weve chosen to make them
`src/english/greetings.rs` and `src/japanese/farewells.rs`. Whew!
The contents of `src/english/greetings.rs` and `src/japanese/farewells.rs` are
both empty at the moment. Let's add some functions.
both empty at the moment. Lets add some functions.
Put this in `src/english/greetings.rs`:
@@ -223,7 +223,7 @@ fn hello() -> String {
```
Of course, you can copy and paste this from this web page, or just type
something else. It's not important that you actually put "konnichiwa" to learn
something else. Its not important that you actually put konnichiwa to learn
about the module system.
Put this in `src/japanese/farewells.rs`:
@@ -234,17 +234,17 @@ fn goodbye() -> String {
}
```
(This is "Sayōnara", if you're curious.)
(This is Sayōnara, if youre curious.)
Now that we have some functionality in our crate, let's try to use it from
Now that we have some functionality in our crate, lets try to use it from
another crate.
# Importing External Crates
We have a library crate. Let's make an executable crate that imports and uses
We have a library crate. Lets make an executable crate that imports and uses
our library.
Make a `src/main.rs` and put this in it (it won't quite compile yet):
Make a `src/main.rs` and put this in it (it wont quite compile yet):
```rust,ignore
extern crate phrases;
@@ -259,7 +259,7 @@ fn main() {
```
The `extern crate` declaration tells Rust that we need to compile and link to
the `phrases` crate. We can then use `phrases`' modules in this one. As we
the `phrases` crate. We can then use `phrases` modules in this one. As we
mentioned earlier, you can use double colons to refer to sub-modules and the
functions inside of them.
@@ -267,10 +267,10 @@ Also, Cargo assumes that `src/main.rs` is the crate root of a binary crate,
rather than a library crate. Our package now has two crates: `src/lib.rs` and
`src/main.rs`. This pattern is quite common for executable crates: most
functionality is in a library crate, and the executable crate uses that
library. This way, other programs can also use the library crate, and it's also
library. This way, other programs can also use the library crate, and its also
a nice separation of concerns.
This doesn't quite work yet, though. We get four errors that look similar to
This doesnt quite work yet, though. We get four errors that look similar to
this:
```bash
@@ -287,14 +287,14 @@ note: in expansion of format_args!
phrases/src/main.rs:4:5: 4:76 note: expansion site
```
By default, everything is private in Rust. Let's talk about this in some more
By default, everything is private in Rust. Lets talk about this in some more
depth.
# Exporting a Public Interface
Rust allows you to precisely control which aspects of your interface are
public, and so private is the default. To make things public, you use the `pub`
keyword. Let's focus on the `english` module first, so let's reduce our `src/main.rs`
keyword. Lets focus on the `english` module first, so lets reduce our `src/main.rs`
to just this:
```{rust,ignore}
@@ -306,21 +306,21 @@ fn main() {
}
```
In our `src/lib.rs`, let's add `pub` to the `english` module declaration:
In our `src/lib.rs`, lets add `pub` to the `english` module declaration:
```{rust,ignore}
pub mod english;
mod japanese;
```
And in our `src/english/mod.rs`, let's make both `pub`:
And in our `src/english/mod.rs`, lets make both `pub`:
```{rust,ignore}
pub mod greetings;
pub mod farewells;
```
In our `src/english/greetings.rs`, let's add `pub` to our `fn` declaration:
In our `src/english/greetings.rs`, lets add `pub` to our `fn` declaration:
```{rust,ignore}
pub fn hello() -> String {
@@ -358,12 +358,12 @@ Goodbye in English: Goodbye.
Now that our functions are public, we can use them. Great! However, typing out
`phrases::english::greetings::hello()` is very long and repetitive. Rust has
another keyword for importing names into the current scope, so that you can
refer to them with shorter names. Let's talk about `use`.
refer to them with shorter names. Lets talk about `use`.
# Importing Modules with `use`
Rust has a `use` keyword, which allows us to import names into our local scope.
Let's change our `src/main.rs` to look like this:
Lets change our `src/main.rs` to look like this:
```{rust,ignore}
extern crate phrases;
@@ -378,7 +378,7 @@ fn main() {
```
The two `use` lines import each module into the local scope, so we can refer to
the functions by a much shorter name. By convention, when importing functions, it's
the functions by a much shorter name. By convention, when importing functions, its
considered best practice to import the module, rather than the function directly. In
other words, you _can_ do this:
@@ -395,7 +395,7 @@ fn main() {
```
But it is not idiomatic. This is significantly more likely to introduce a
naming conflict. In our short program, it's not a big deal, but as it grows, it
naming conflict. In our short program, its not a big deal, but as it grows, it
becomes a problem. If we have conflicting names, Rust will give a compilation
error. For example, if we made the `japanese` functions public, and tried to do
this:
@@ -423,7 +423,7 @@ error: aborting due to previous error
Could not compile `phrases`.
```
If we're importing multiple names from the same module, we don't have to type it out
If were importing multiple names from the same module, we dont have to type it out
twice. Instead of this:
```{rust,ignore}
@@ -439,11 +439,11 @@ use phrases::english::{greetings, farewells};
## Re-exporting with `pub use`
You don't just use `use` to shorten identifiers. You can also use it inside of your crate
You dont just use `use` to shorten identifiers. You can also use it inside of your crate
to re-export a function inside another module. This allows you to present an external
interface that may not directly map to your internal code organization.
Let's look at an example. Modify your `src/main.rs` to read like this:
Lets look at an example. Modify your `src/main.rs` to read like this:
```{rust,ignore}
extern crate phrases;
@@ -494,11 +494,11 @@ mod farewells;
```
The `pub use` declaration brings the function into scope at this part of our
module hierarchy. Because we've `pub use`d this inside of our `japanese`
module hierarchy. Because weve `pub use`d this inside of our `japanese`
module, we now have a `phrases::japanese::hello()` function and a
`phrases::japanese::goodbye()` function, even though the code for them lives in
`phrases::japanese::greetings::hello()` and
`phrases::japanese::farewells::goodbye()`. Our internal organization doesn't
`phrases::japanese::farewells::goodbye()`. Our internal organization doesnt
define our external interface.
Here we have a `pub use` for each function we want to bring into the
@@ -507,13 +507,13 @@ everything from `greetings` into the current scope: `pub use self::greetings::*`
What about the `self`? Well, by default, `use` declarations are absolute paths,
starting from your crate root. `self` makes that path relative to your current
place in the hierarchy instead. There's one more special form of `use`: you can
place in the hierarchy instead. Theres one more special form of `use`: you can
`use super::` to reach one level up the tree from your current location. Some
people like to think of `self` as `.` and `super` as `..`, from many shells'
people like to think of `self` as `.` and `super` as `..`, from many shells
display for the current directory and the parent directory.
Outside of `use`, paths are relative: `foo::bar()` refers to a function inside
of `foo` relative to where we are. If that's prefixed with `::`, as in
of `foo` relative to where we are. If thats prefixed with `::`, as in
`::foo::bar()`, it refers to a different `foo`, an absolute path from your
crate root.
+117 -1
View File
@@ -1,3 +1,119 @@
% `Deref` coercions
Coming soon!
The standard library provides a special trait, [`Deref`][deref]. Its normally
used to overload `*`, the dereference operator:
```rust
use std::ops::Deref;
struct DerefExample<T> {
value: T,
}
impl<T> Deref for DerefExample<T> {
type Target = T;
fn deref(&self) -> &T {
&self.value
}
}
fn main() {
let x = DerefExample { value: 'a' };
assert_eq!('a', *x);
}
```
[deref]: ../std/ops/trait.Deref.html
This is useful for writing custom pointer types. However, theres a language
feature related to `Deref`: deref coercions. Heres the rule: If you have a
type `U`, and it implements `Deref<Target=T>`, values of `&U` will
automatically coerce to a `&T`. Heres an example:
```rust
fn foo(s: &str) {
// borrow a string for a second
}
// String implements Deref<Target=str>
let owned = "Hello".to_string();
// therefore, this works:
foo(&owned);
```
Using an ampersand in front of a value takes a reference to it. So `owned` is a
`String`, `&owned` is an `&String`, and since `impl Deref<Target=str> for
String`, `&String` will deref to `&str`, which `foo()` takes.
Thats it. This rule is one of the only places in which Rust does an automatic
conversion for you, but it adds a lot of flexibility. For example, the `Rc<T>`
type implements `Deref<Target=T>`, so this works:
```rust
use std::rc::Rc;
fn foo(s: &str) {
// borrow a string for a second
}
// String implements Deref<Target=str>
let owned = "Hello".to_string();
let counted = Rc::new(owned);
// therefore, this works:
foo(&counted);
```
All weve done is wrap our `String` in an `Rc<T>`. But we can now pass the
`Rc<String>` around anywhere wed have a `String`. The signature of `foo`
didnt change, but works just as well with either type. This example has two
conversions: `Rc<String>` to `String` and then `String` to `&str`. Rust will do
this as many times as possible until the types match.
Another very common implementation provided by the standard library is:
```rust
fn foo(s: &[i32]) {
// borrow a slice for a second
}
// Vec<T> implements Deref<Target=[T]>
let owned = vec![1, 2, 3];
foo(&owned);
```
Vectors can `Deref` to a slice.
## Deref and method calls
`Deref` will also kick in when calling a method. In other words, these are
the same two things in Rust:
```rust
struct Foo;
impl Foo {
fn foo(&self) { println!("Foo"); }
}
let f = Foo;
f.foo();
```
Even though `f` isnt a reference, and `foo` takes `&self`, this works.
Thats because these things are the same:
```rust,ignore
f.foo();
(&f).foo();
(&&f).foo();
(&&&&&&&&f).foo();
```
A value of type `&&&&&&&&&&&&&&&&Foo` can still have methods defined on `Foo`
called, because the compiler will insert as many * operations as necessary to
get it right. And since its inserting `*`s, that uses `Deref`.
+690
View File
@@ -0,0 +1,690 @@
% Dining Philosophers
For our second project, lets look at a classic concurrency problem. Its
called the dining philosophers. It was originally conceived by Dijkstra in
1965, but well use the version from [this paper][paper] by Tony Hoare in 1985.
[paper]: http://www.usingcsp.com/cspbook.pdf
> In ancient times, a wealthy philanthropist endowed a College to accommodate
> five eminent philosophers. Each philosopher had a room in which he could
> engage in his professional activity of thinking; there was also a common
> dining room, furnished with a circular table, surrounded by five chairs, each
> labelled by the name of the philosopher who was to sit in it. They sat
> anticlockwise around the table. To the left of each philosopher there was
> laid a golden fork, and in the centre stood a large bowl of spaghetti, which
> was constantly replenished. A philosopher was expected to spend most of his
> time thinking; but when he felt hungry, he went to the dining room, sat down
> in his own chair, picked up his own fork on his left, and plunged it into the
> spaghetti. But such is the tangled nature of spaghetti that a second fork is
> required to carry it to the mouth. The philosopher therefore had also to pick
> up the fork on his right. When we was finished he would put down both his
> forks, get up from his chair, and continue thinking. Of course, a fork can be
> used by only one philosopher at a time. If the other philosopher wants it, he
> just has to wait until the fork is available again.
This classic problem shows off a few different elements of concurrency. The
reason is that it's actually slightly tricky to implement: a simple
implementation can deadlock. For example, let's consider a simple algorithm
that would solve this problem:
1. A philosopher picks up the fork on their left.
2. They then pick up the fork on their right.
3. They eat.
4. They return the forks.
Now, lets imagine this sequence of events:
1. Philosopher 1 begins the algorithm, picking up the fork on their left.
2. Philosopher 2 begins the algorithm, picking up the fork on their left.
3. Philosopher 3 begins the algorithm, picking up the fork on their left.
4. Philosopher 4 begins the algorithm, picking up the fork on their left.
5. Philosopher 5 begins the algorithm, picking up the fork on their left.
6. ... ? All the forks are taken, but nobody can eat!
There are different ways to solve this problem. Well get to our solution in
the tutorial itself. For now, lets get started modelling the problem itself.
Well start with the philosophers:
```rust
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
}
fn main() {
let p1 = Philosopher::new("Baruch Spinoza");
let p2 = Philosopher::new("Gilles Deleuze");
let p3 = Philosopher::new("Karl Marx");
let p4 = Philosopher::new("Friedrich Nietzsche");
let p5 = Philosopher::new("Michel Foucault");
}
```
Here, we make a [`struct`][struct] to represent a philosopher. For now,
a name is all we need. We choose the [`String`][string] type for the name,
rather than `&str`. Generally speaking, working with a type which owns its
data is easier than working with one that uses references.
Lets continue:
```rust
# struct Philosopher {
# name: String,
# }
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
}
```
This `impl` block lets us define things on `Philosopher` structs. In this case,
we define an associated function called `new`. The first line looks like this:
```rust
# struct Philosopher {
# name: String,
# }
# impl Philosopher {
fn new(name: &str) -> Philosopher {
# Philosopher {
# name: name.to_string(),
# }
# }
# }
```
We take one argument, a `name`, of type `&str`. This is a reference to another
string. It returns an instance of our `Philosopher` struct.
```rust
# struct Philosopher {
# name: String,
# }
# impl Philosopher {
# fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
# }
# }
```
This creates a new `Philosopher`, and sets its `name` to our `name` argument.
Not just the argument itself, though, as we call `.to_string()` on it. This
will create a copy of the string that our `&str` points to, and give us a new
`String`, which is the type of the `name` field of `Philosopher`.
Why not accept a `String` directly? Its nicer to call. If we took a `String`,
but our caller had a `&str`, theyd have to call this method themselves. The
downside of this flexibility is that we _always_ make a copy. For this small
program, thats not particularly important, as we know well just be using
short strings anyway.
One last thing youll notice: we just define a `Philosopher`, and seemingly
dont do anything with it. Rust is an expression based language, which means
that almost everything in Rust is an expression which returns a value. This is
true of functions as well, the last expression is automatically returned. Since
we create a new `Philosopher` as the last expression of this function, we end
up returning it.
This name, `new()`, isnt anything special to Rust, but it is a convention for
functions that create new instances of structs. Before we talk about why, lets
look at `main()` again:
```rust
# struct Philosopher {
# name: String,
# }
#
# impl Philosopher {
# fn new(name: &str) -> Philosopher {
# Philosopher {
# name: name.to_string(),
# }
# }
# }
#
fn main() {
let p1 = Philosopher::new("Baruch Spinoza");
let p2 = Philosopher::new("Gilles Deleuze");
let p3 = Philosopher::new("Karl Marx");
let p4 = Philosopher::new("Friedrich Nietzsche");
let p5 = Philosopher::new("Michel Foucault");
}
```
Here, we create five variable bindings with five new philosophers. These are my
favorite five, but you can substitute anyone you want. If we _didnt_ define
that `new()` function, it would look like this:
```rust
# struct Philosopher {
# name: String,
# }
fn main() {
let p1 = Philosopher { name: "Baruch Spinoza".to_string() };
let p2 = Philosopher { name: "Gilles Deleuze".to_string() };
let p3 = Philosopher { name: "Karl Marx".to_string() };
let p4 = Philosopher { name: "Friedrich Nietzche".to_string() };
let p5 = Philosopher { name: "Michel Foucault".to_string() };
}
```
Thats much noisier. Using `new` has other advantages too, but even in
this simple case, it ends up being nicer to use.
Now that weve got the basics in place, theres a number of ways that we can
tackle the broader problem here. I like to start from the end first: lets
set up a way for each philosopher to finish eating. As a tiny step, lets make
a method, and then loop through all the philosophers, calling it:
```rust
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
fn eat(&self) {
println!("{} is done eating.", self.name);
}
}
fn main() {
let philosophers = vec![
Philosopher::new("Baruch Spinoza"),
Philosopher::new("Gilles Deleuze"),
Philosopher::new("Karl Marx"),
Philosopher::new("Friedrich Nietzsche"),
Philosopher::new("Michel Foucault"),
];
for p in &philosophers {
p.eat();
}
}
```
Lets look at `main()` first. Rather than have five individual variable
bindings for our philosophers, we make a `Vec<T>` of them instead. `Vec<T>` is
also called a vector, and its a growable array type. We then use a
[`for`][for] loop to iterate through the vector, getting a reference to each
philosopher in turn.
[for]: for-loops.html
In the body of the loop, we call `p.eat()`, which is defined above:
```rust,ignore
fn eat(&self) {
println!("{} is done eating.", self.name);
}
```
In Rust, methods take an explicit `self` parameter. Thats why `eat()` is a
method, but `new` is an associated function: `new()` has no `self`. For our
first version of `eat()`, we just print out the name of the philosopher, and
mention theyre done eating. Running this program should give you the following
output:
```text
Baruch Spinoza is done eating.
Gilles Deleuze is done eating.
Karl Marx is done eating.
Friedrich Nietzsche is done eating.
Michel Foucault is done eating.
```
Easy enough, theyre all done! We havent actually implemented the real problem
yet, though, so were not done yet!
Next, we want to make our philosophers not just finish eating, but actually
eat. Heres the next version:
```rust
use std::thread;
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
fn eat(&self) {
println!("{} is eating.", self.name);
thread::sleep_ms(1000);
println!("{} is done eating.", self.name);
}
}
fn main() {
let philosophers = vec![
Philosopher::new("Baruch Spinoza"),
Philosopher::new("Gilles Deleuze"),
Philosopher::new("Karl Marx"),
Philosopher::new("Friedrich Nietzsche"),
Philosopher::new("Michel Foucault"),
];
for p in &philosophers {
p.eat();
}
}
```
Just a few changes. Lets break it down.
```rust,ignore
use std::thread;
```
`use` brings names into scope. Were going to start using the `thread` module
from the standard library, and so we need to `use` it.
```rust,ignore
fn eat(&self) {
println!("{} is eating.", self.name);
thread::sleep_ms(1000);
println!("{} is done eating.", self.name);
}
```
We now print out two messages, with a `sleep_ms()` in the middle. This will
simulate the time it takes a philosopher to eat.
If you run this program, You should see each philosopher eat in turn:
```text
Baruch Spinoza is eating.
Baruch Spinoza is done eating.
Gilles Deleuze is eating.
Gilles Deleuze is done eating.
Karl Marx is eating.
Karl Marx is done eating.
Friedrich Nietzsche is eating.
Friedrich Nietzsche is done eating.
Michel Foucault is eating.
Michel Foucault is done eating.
```
Excellent! Were getting there. Theres just one problem: we arent actually
operating in a concurrent fashion, which is a core part of the problem!
To make our philosophers eat concurrently, we need to make a small change.
Heres the next iteration:
```rust
use std::thread;
struct Philosopher {
name: String,
}
impl Philosopher {
fn new(name: &str) -> Philosopher {
Philosopher {
name: name.to_string(),
}
}
fn eat(&self) {
println!("{} is eating.", self.name);
thread::sleep_ms(1000);
println!("{} is done eating.", self.name);
}
}
fn main() {
let philosophers = vec![
Philosopher::new("Baruch Spinoza"),
Philosopher::new("Gilles Deleuze"),
Philosopher::new("Karl Marx"),
Philosopher::new("Friedrich Nietzsche"),
Philosopher::new("Michel Foucault"),
];
let handles: Vec<_> = philosophers.into_iter().map(|p| {
thread::spawn(move || {
p.eat();
})
}).collect();
for h in handles {
h.join().unwrap();
}
}
```
All weve done is change the loop in `main()`, and added a second one! Heres the
first change:
```rust,ignore
let handles: Vec<_> = philosophers.into_iter().map(|p| {
thread::spawn(move || {
p.eat();
})
}).collect();
```
While this is only five lines, theyre a dense four. Lets break it down.
```rust,ignore
let handles: Vec<_> =
```
We introduce a new binding, called `handles`. Weve given it this name because
we are going to make some new threads, and that will return some handles to those
threads that let us control their operation. We need to explicitly annotate
the type here, though, due to an issue well talk about later. The `_` is
a type placeholder. Were saying “`handles` is a vector of something, but you
can figure out what that something is, Rust.”
```rust,ignore
philosophers.into_iter().map(|p| {
```
We take our list of philosophers and call `into_iter()` on it. This creates an
iterator that takes ownership of each philosopher. We need to do this to pass
them to our threads. We take that iterator and call `map` on it, which takes a
closure as an argument and calls that closure on each element in turn.
```rust,ignore
thread::spawn(move || {
p.eat();
})
```
Heres where the concurrency happens. The `thread::spawn` function takes a closure
as an argument and executes that closure in a new thread. This closure needs
an extra annotation, `move`, to indicate that the closure is going to take
ownership of the values its capturing. Primarily, the `p` variable of the
`map` function.
Inside the thread, all we do is call `eat()` on `p`.
```rust,ignore
}).collect();
```
Finally, we take the result of all those `map` calls and collect them up.
`collect()` will make them into a collection of some kind, which is why we
needed to annotate the return type: we want a `Vec<T>`. The elements are the
return values of the `thread::spawn` calls, which are handles to those threads.
Whew!
```rust,ignore
for h in handles {
h.join().unwrap();
}
```
At the end of `main()`, we loop through the handles and call `join()` on them,
which blocks execution until the thread has completed execution. This ensures
that the threads complete their work before the program exits.
If you run this program, youll see that the philosophers eat out of order!
We have mult-threading!
```text
Gilles Deleuze is eating.
Gilles Deleuze is done eating.
Friedrich Nietzsche is eating.
Friedrich Nietzsche is done eating.
Michel Foucault is eating.
Baruch Spinoza is eating.
Baruch Spinoza is done eating.
Karl Marx is eating.
Karl Marx is done eating.
Michel Foucault is done eating.
```
But what about the forks? We havent modeled them at all yet.
To do that, lets make a new `struct`:
```rust
use std::sync::Mutex;
struct Table {
forks: Vec<Mutex<()>>,
}
```
This `Table` has an vector of `Mutex`es. A mutex is a way to control
concurrency: only one thread can access the contents at once. This is exactly
the property we need with our forks. We use an empty tuple, `()`, inside the
mutex, since were not actually going to use the value, just hold onto it.
Lets modify the program to use the `Table`:
```rust
use std::thread;
use std::sync::{Mutex, Arc};
struct Philosopher {
name: String,
left: usize,
right: usize,
}
impl Philosopher {
fn new(name: &str, left: usize, right: usize) -> Philosopher {
Philosopher {
name: name.to_string(),
left: left,
right: right,
}
}
fn eat(&self, table: &Table) {
let _left = table.forks[self.left].lock().unwrap();
let _right = table.forks[self.right].lock().unwrap();
println!("{} is eating.", self.name);
thread::sleep_ms(1000);
println!("{} is done eating.", self.name);
}
}
struct Table {
forks: Vec<Mutex<()>>,
}
fn main() {
let table = Arc::new(Table { forks: vec![
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
]});
let philosophers = vec![
Philosopher::new("Baruch Spinoza", 0, 1),
Philosopher::new("Gilles Deleuze", 1, 2),
Philosopher::new("Karl Marx", 2, 3),
Philosopher::new("Friedrich Nietzsche", 3, 4),
Philosopher::new("Michel Foucault", 0, 4),
];
let handles: Vec<_> = philosophers.into_iter().map(|p| {
let table = table.clone();
thread::spawn(move || {
p.eat(&table);
})
}).collect();
for h in handles {
h.join().unwrap();
}
}
```
Lots of changes! However, with this iteration, weve got a working program.
Lets go over the details:
```rust,ignore
use std::sync::{Mutex, Arc};
```
Were going to use another structure from the `std::sync` package: `Arc<T>`.
Well talk more about it when we use it.
```rust,ignore
struct Philosopher {
name: String,
left: usize,
right: usize,
}
```
We need to add two more fields to our `Philosopher`. Each philosopher is going
to have two forks: the one on their left, and the one on their right.
Well use the `usize` type to indicate them, as its the type that you index
vectors with. These two values will be the indexes into the `forks` our `Table`
has.
```rust,ignore
fn new(name: &str, left: usize, right: usize) -> Philosopher {
Philosopher {
name: name.to_string(),
left: left,
right: right,
}
}
```
We now need to construct those `left` and `right` values, so we add them to
`new()`.
```rust,ignore
fn eat(&self, table: &Table) {
let _left = table.forks[self.left].lock().unwrap();
let _right = table.forks[self.right].lock().unwrap();
println!("{} is eating.", self.name);
thread::sleep_ms(1000);
println!("{} is done eating.", self.name);
}
```
We have two new lines. Weve also added an argument, `table`. We access the
`Table`s list of forks, and then use `self.left` and `self.right` to access
the fork at that particular index. That gives us access to the `Mutex` at that
index, and we call `lock()` on it. If the mutex is currently being accessed by
someone else, well block until it becomes available.
The call to `lock()` might fail, and if it does, we want to crash. In this
case, the error that could happen is that the mutex is [poisoned][poison],
which is what happens when the thread panics while the lock is held. Since this
shouldnt happen, we just use `unwrap()`.
[poison]: ../std/sync/struct.Mutex.html#poisoning
One other odd thing about these lines: weve named the results `_left` and
`_right`. Whats up with that underscore? Well, we arent planning on
_using_ the value inside the lock. We just want to acquire it. As such,
Rust will warn us that we never use the value. By using the underscore,
we tell Rust that this is what we intended, and it wont throw a warning.
What about releasing the lock? Well, that will happen when `_left` and
`_right` go out of scope, automatically.
```rust,ignore
let table = Arc::new(Table { forks: vec![
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
Mutex::new(()),
]});
```
Next, in `main()`, we make a new `Table` and wrap it in an `Arc<T>`.
arc stands for atomic reference count, and we need that to share
our `Table` across multiple threads. As we share it, the reference
count will go up, and when each thread ends, it will go back down.
```rust,ignore
let philosophers = vec![
Philosopher::new("Baruch Spinoza", 0, 1),
Philosopher::new("Gilles Deleuze", 1, 2),
Philosopher::new("Karl Marx", 2, 3),
Philosopher::new("Friedrich Nietzsche", 3, 4),
Philosopher::new("Michel Foucault", 0, 4),
];
```
We need to pass in our `left` and `right` values to the constructors for our
`Philosopher`s. But theres one more detail here, and its _very_ important. If
you look at the pattern, its all consistent until the very end. Monsieur
Foucault should have `4, 0` as arguments, but instead, has `0, 4`. This is what
prevents deadlock, actually: one of our philosophers is left handed! This is
one way to solve the problem, and in my opinion, its the simplest.
```rust,ignore
let handles: Vec<_> = philosophers.into_iter().map(|p| {
let table = table.clone();
thread::spawn(move || {
p.eat(&table);
})
}).collect();
```
Finally, inside of our `map()`/`collect()` loop, we call `table.clone()`. The
`clone()` method on `Arc<T>` is what bumps up the reference count, and when it
goes out of scope, it decrements the count. Youll notice we can introduce a
new binding to `table` here, and it will shadow the old one. This is often used
so that you dont need to come up with two unique names.
With this, our program works! Only two philosophers can eat at any one time,
and so youll get some output like this:
```text
Gilles Deleuze is eating.
Friedrich Nietzsche is eating.
Friedrich Nietzsche is done eating.
Gilles Deleuze is done eating.
Baruch Spinoza is eating.
Karl Marx is eating.
Baruch Spinoza is done eating.
Michel Foucault is eating.
Karl Marx is done eating.
Michel Foucault is done eating.
```
Congrats! Youve implemented a classic concurrency problem in Rust.
+15 -2
View File
@@ -380,7 +380,10 @@ $ rustdoc --test path/to/my/crate/root.rs
$ cargo test
```
That's right, `cargo test` tests embedded documentation too.
That's right, `cargo test` tests embedded documentation too. However,
`cargo test` will not test binary crates, only library ones. This is
due to the way `rustdoc` works: it links against the library to be tested,
but with a binary, theres nothing to link to.
There are a few more annotations that are useful to help `rustdoc` do the right
thing when testing your code:
@@ -553,10 +556,20 @@ This sets a few different options, with a logo, favicon, and a root URL.
## Generation options
`rustdoc` also contains a few other options on the command line, for further customiziation:
`rustdoc` also contains a few other options on the command line, for further customization:
- `--html-in-header FILE`: includes the contents of FILE at the end of the
`<head>...</head>` section.
- `--html-before-content FILE`: includes the contents of FILE directly after
`<body>`, before the rendered content (including the search bar).
- `--html-after-content FILE`: includes the contents of FILE after all the rendered content.
## Security note
The Markdown in documentation comments is placed without processing into
the final webpage. Be careful with literal HTML:
```rust
/// <script>alert(document.cookie)</script>
# fn foo() {}
```
+66 -2
View File
@@ -1,3 +1,67 @@
% `Drop`
% Drop
Coming soon!
Now that weve discussed traits, lets talk about a particular trait provided
by the Rust standard library, [`Drop`][drop]. The `Drop` trait provides a way
to run some code when a value goes out of scope. For example:
[drop]: ../std/ops/trait.Drop.html
```rust
struct HasDrop;
impl Drop for HasDrop {
fn drop(&mut self) {
println!("Dropping!");
}
}
fn main() {
let x = HasDrop;
// do stuff
} // x goes out of scope here
```
When `x` goes out of scope at the end of `main()`, the code for `Drop` will
run. `Drop` has one method, which is also called `drop()`. It takes a mutable
reference to `self`.
Thats it! The mechanics of `Drop` are very simple, but there are some
subtleties. For example, values are dropped in the opposite order they are
declared. Heres another example:
```rust
struct Firework {
strength: i32,
}
impl Drop for Firework {
fn drop(&mut self) {
println!("BOOM times {}!!!", self.strength);
}
}
fn main() {
let firecracker = Firework { strength: 1 };
let tnt = Firework { strength: 100 };
}
```
This will output:
```text
BOOM times 100!!!
BOOM times 1!!!
```
The TNT goes off before the firecracker does, because it was declared
afterwards. Last in, first out.
So what is `Drop` good for? Generally, `Drop` is used to clean up any resources
associated with a `struct`. For example, the [`Arc<T>` type][arc] is a
reference-counted type. When `Drop` is called, it will decrement the reference
count, and if the total number of references is zero, will clean up the
underlying value.
[arc]: ../std/sync/struct.Arc.html
+50 -129
View File
@@ -1,147 +1,68 @@
% Enums
Finally, Rust has a "sum type", an *enum*. Enums are an incredibly useful
feature of Rust, and are used throughout the standard library. An `enum` is
a type which relates a set of alternates to a specific name. For example, below
we define `Character` to be either a `Digit` or something else. These
can be used via their fully scoped names: `Character::Other` (more about `::`
below).
An `enum` in Rust is a type that represents data that could be one of
several possible variants:
```rust
enum Character {
Digit(i32),
Other,
enum Message {
Quit,
ChangeColor(i32, i32, i32),
Move { x: i32, y: i32 },
Write(String),
}
```
Most normal types are allowed as the variant components of an `enum`. Here are
some examples:
Each variant can optionally have data associated with it. The syntax for
defining variants resembles the syntaxes used to define structs: you can
have variants with no data (like unit-like structs), variants with named
data, and variants with unnamed data (like tuple structs). Unlike
separate struct definitions, however, an `enum` is a single type. A
value of the enum can match any of the variants. For this reason, an
enum is sometimes called a sum type: the set of possible values of the
enum is the sum of the sets of possible values for each variant.
We use the `::` syntax to use the name of each variant: theyre scoped by the name
of the `enum` itself. This allows both of these to work:
```rust
struct Empty;
struct Color(i32, i32, i32);
struct Length(i32);
struct Status { Health: i32, Mana: i32, Attack: i32, Defense: i32 }
struct HeightDatabase(Vec<i32>);
# enum Message {
# Move { x: i32, y: i32 },
# }
let x: Message = Message::Move { x: 3, y: 4 };
enum BoardGameTurn {
Move { squares: i32 },
Pass,
}
let y: BoardGameTurn = BoardGameTurn::Move { squares: 1 };
```
You see that, depending on its type, an `enum` variant may or may not hold data.
In `Character`, for instance, `Digit` gives a meaningful name for an `i32`
value, where `Other` is only a name. However, the fact that they represent
distinct categories of `Character` is a very useful property.
Both variants are named `Move`, but since theyre scoped to the name of
the enum, they can both be used without conflict.
As with structures, the variants of an enum by default are not comparable with
equality operators (`==`, `!=`), have no ordering (`<`, `>=`, etc.), and do not
support other binary operations such as `*` and `+`. As such, the following code
is invalid for the example `Character` type:
A value of an enum type contains information about which variant it is,
in addition to any data associated with that variant. This is sometimes
referred to as a tagged union, since the data includes a tag
indicating what type it is. The compiler uses this information to
enforce that youre accessing the data in the enum safely. For instance,
you cant simply try to destructure a value as if it were one of the
possible variants:
```{rust,ignore}
// These assignments both succeed
let ten = Character::Digit(10);
let four = Character::Digit(4);
// Error: `*` is not implemented for type `Character`
let forty = ten * four;
// Error: `<=` is not implemented for type `Character`
let four_is_smaller = four <= ten;
// Error: `==` is not implemented for type `Character`
let four_equals_ten = four == ten;
```
This may seem rather limiting, but it's a limitation which we can overcome.
There are two ways: by implementing equality ourselves, or by pattern matching
variants with [`match`][match] expressions, which you'll learn in the next
chapter. We don't know enough about Rust to implement equality yet, but we can
use the `Ordering` enum from the standard library, which does:
```
enum Ordering {
Less,
Equal,
Greater,
```rust,ignore
fn process_color_change(msg: Message) {
let Message::ChangeColor(r, g, b) = msg; // compile-time error
}
```
Because `Ordering` has already been defined for us, we will import it with the
`use` keyword. Here's an example of how it is used:
Both variants are named `Digit`, but since theyre scoped to the `enum` name
there's no ambiguity.
```{rust}
use std::cmp::Ordering;
Not supporting these operations may seem rather limiting, but its a limitation
which we can overcome. There are two ways: by implementing equality ourselves,
or by pattern matching variants with [`match`][match] expressions, which youll
learn in the next section. We dont know enough about Rust to implement
equality yet, but well find out in the [`traits`][traits] section.
fn cmp(a: i32, b: i32) -> Ordering {
if a < b { Ordering::Less }
else if a > b { Ordering::Greater }
else { Ordering::Equal }
}
fn main() {
let x = 5;
let y = 10;
let ordering = cmp(x, y); // ordering: Ordering
if ordering == Ordering::Less {
println!("less");
} else if ordering == Ordering::Greater {
println!("greater");
} else if ordering == Ordering::Equal {
println!("equal");
}
}
```
The `::` symbol is used to indicate a namespace. In this case, `Ordering` lives
in the `cmp` submodule of the `std` module. We'll talk more about modules later
in the guide. For now, all you need to know is that you can `use` things from
the standard library if you need them.
Okay, let's talk about the actual code in the example. `cmp` is a function that
compares two things, and returns an `Ordering`. We return either
`Ordering::Less`, `Ordering::Greater`, or `Ordering::Equal`, depending on
whether the first value is less than, greater than, or equal to the second. Note
that each variant of the `enum` is namespaced under the `enum` itself: it's
`Ordering::Greater`, not `Greater`.
The `ordering` variable has the type `Ordering`, and so contains one of the
three values. We then do a bunch of `if`/`else` comparisons to check which
one it is.
This `Ordering::Greater` notation is too long. Let's use another form of `use`
to import the `enum` variants instead. This will avoid full scoping:
```{rust}
use std::cmp::Ordering::{self, Equal, Less, Greater};
fn cmp(a: i32, b: i32) -> Ordering {
if a < b { Less }
else if a > b { Greater }
else { Equal }
}
fn main() {
let x = 5;
let y = 10;
let ordering = cmp(x, y); // ordering: Ordering
if ordering == Less { println!("less"); }
else if ordering == Greater { println!("greater"); }
else if ordering == Equal { println!("equal"); }
}
```
Importing variants is convenient and compact, but can also cause name conflicts,
so do this with caution. For this reason, it's normally considered better style
to `use` an enum rather than its variants directly.
As you can see, `enum`s are quite a powerful tool for data representation, and
are even more useful when they're [generic][generics] across types. Before we
get to generics, though, let's talk about how to use enums with pattern
matching, a tool that will let us deconstruct sum types (the type theory term
for enums) like `Ordering` in a very elegant way that avoids all these messy
and brittle `if`/`else`s.
[match]: ./match.html
[generics]: ./generics.html
[match]: match.html
[if-let]: if-let.html
+23 -20
View File
@@ -20,18 +20,18 @@ panic. A *failure* is an error that can be recovered from in some way. A
*panic* is an error that cannot be recovered from.
What do we mean by "recover"? Well, in most cases, the possibility of an error
is expected. For example, consider the `from_str` function:
is expected. For example, consider the `parse` function:
```{rust,ignore}
from_str("5");
```ignore
"5".parse();
```
This function takes a string argument and converts it into another type. But
because it's a string, you can't be sure that the conversion actually works.
For example, what should this convert to?
This method converts a string into another type. But because it's a string, you
can't be sure that the conversion actually works. For example, what should this
convert to?
```{rust,ignore}
from_str("hello5world");
```ignore
"hello5world".parse();
```
This won't work. So we know that this function will only work properly for some
@@ -40,7 +40,8 @@ inputs. It's expected behavior. We call this kind of error a *failure*.
On the other hand, sometimes, there are errors that are unexpected, or which
we cannot recover from. A classic example is an `assert!`:
```{rust,ignore}
```rust
# let x = 5;
assert!(x == 5);
```
@@ -119,17 +120,19 @@ Rust calls these sorts of errors *panics*.
# Handling errors with `Option` and `Result`
The simplest way to indicate that a function may fail is to use the `Option<T>`
type. Remember our `from_str()` example? Here's its type signature:
type. For example, the `find` method on strings attempts to find a pattern
in a string, and returns an `Option`:
```{rust,ignore}
pub fn from_str<A: FromStr>(s: &str) -> Option<A>
```rust
let s = "foo";
assert_eq!(s.find('f'), Some(0));
assert_eq!(s.find('z'), None);
```
`from_str()` returns an `Option<A>`. If the conversion succeeds, it will return
`Some(value)`, and if it fails, it will return `None`.
This is appropriate for the simplest of cases, but doesn't give us a lot of
information in the failure case. What if we wanted to know _why_ the conversion
information in the failure case. What if we wanted to know _why_ the function
failed? For this, we can use the `Result<T, E>` type. It looks like this:
```rust
@@ -201,7 +204,7 @@ Because these kinds of situations are relatively rare, use panics sparingly.
In certain circumstances, even though a function may fail, we may want to treat
it as a panic instead. For example, `io::stdin().read_line(&mut buffer)` returns
an `Result<usize>`, when there is an error reading the line. This allows us to
a `Result<usize>`, when there is an error reading the line. This allows us to
handle and possibly recover from error.
If we don't want to handle this error, and would rather just abort the program,
@@ -211,7 +214,7 @@ we can use the `unwrap()` method:
io::stdin().read_line(&mut buffer).unwrap();
```
`unwrap()` will `panic!` if the `Option` is `None`. This basically says "Give
`unwrap()` will `panic!` if the `Result` is `Err`. This basically says "Give
me the value, and if something goes wrong, just crash." This is less reliable
than matching the error and attempting to recover, but is also significantly
shorter. Sometimes, just crashing is appropriate.
@@ -249,7 +252,7 @@ struct Info {
}
fn write_info(info: &Info) -> io::Result<()> {
let mut file = File::open("my_best_friends.txt").unwrap();
let mut file = File::create("my_best_friends.txt").unwrap();
if let Err(e) = writeln!(&mut file, "name: {}", info.name) {
return Err(e)
@@ -279,7 +282,7 @@ struct Info {
}
fn write_info(info: &Info) -> io::Result<()> {
let mut file = try!(File::open("my_best_friends.txt"));
let mut file = try!(File::create("my_best_friends.txt"));
try!(writeln!(&mut file, "name: {}", info.name));
try!(writeln!(&mut file, "age: {}", info.age));
@@ -297,5 +300,5 @@ It's worth noting that you can only use `try!` from a function that returns a
`Result`, which means that you cannot use `try!` inside of `main()`, because
`main()` doesn't return anything.
`try!` makes use of [`From<Error>`](../std/convert/trait.From.hml) to determine
`try!` makes use of [`From<Error>`](../std/convert/trait.From.html) to determine
what to return in the error case.
+52 -105
View File
@@ -1,31 +1,13 @@
% Generics
Sometimes, when writing a function or data type, we may want it to work for
multiple types of arguments. For example, remember our `OptionalInt` type?
multiple types of arguments. Luckily, Rust has a feature that gives us a better
way: generics. Generics are called parametric polymorphism in type theory,
which means that they are types or functions that have multiple forms (poly
is multiple, morph is form) over a given parameter (parametric).
```{rust}
enum OptionalInt {
Value(i32),
Missing,
}
```
If we wanted to also have an `OptionalFloat64`, we would need a new enum:
```{rust}
enum OptionalFloat64 {
Valuef64(f64),
Missingf64,
}
```
This is really unfortunate. Luckily, Rust has a feature that gives us a better
way: generics. Generics are called *parametric polymorphism* in type theory,
which means that they are types or functions that have multiple forms (*poly*
is multiple, *morph* is form) over a given parameter (*parametric*).
Anyway, enough with type theory declarations, let's check out the generic form
of `OptionalInt`. It is actually provided by Rust itself, and looks like this:
Anyway, enough with type theory, lets check out some generic code. Rusts
standard library provides a type, `Option<T>`, thats generic:
```rust
enum Option<T> {
@@ -34,41 +16,40 @@ enum Option<T> {
}
```
The `<T>` part, which you've seen a few times before, indicates that this is
The `<T>` part, which youve seen a few times before, indicates that this is
a generic data type. Inside the declaration of our enum, wherever we see a `T`,
we substitute that type for the same type used in the generic. Here's an
we substitute that type for the same type used in the generic. Heres an
example of using `Option<T>`, with some extra type annotations:
```{rust}
```rust
let x: Option<i32> = Some(5);
```
In the type declaration, we say `Option<i32>`. Note how similar this looks to
`Option<T>`. So, in this particular `Option`, `T` has the value of `i32`. On
the right-hand side of the binding, we do make a `Some(T)`, where `T` is `5`.
Since that's an `i32`, the two sides match, and Rust is happy. If they didn't
match, we'd get an error:
Since thats an `i32`, the two sides match, and Rust is happy. If they didnt
match, wed get an error:
```{rust,ignore}
```rust,ignore
let x: Option<f64> = Some(5);
// error: mismatched types: expected `core::option::Option<f64>`,
// found `core::option::Option<_>` (expected f64 but found integral variable)
```
That doesn't mean we can't make `Option<T>`s that hold an `f64`! They just have to
match up:
That doesnt mean we cant make `Option<T>`s that hold an `f64`! They just have
to match up:
```{rust}
```rust
let x: Option<i32> = Some(5);
let y: Option<f64> = Some(5.0f64);
```
This is just fine. One definition, multiple uses.
Generics don't have to only be generic over one type. Consider Rust's built-in
`Result<T, E>` type:
Generics dont have to only be generic over one type. Consider another type from Rusts standard library thats similar, `Result<T, E>`:
```{rust}
```rust
enum Result<T, E> {
Ok(T),
Err(E),
@@ -76,9 +57,9 @@ enum Result<T, E> {
```
This type is generic over _two_ types: `T` and `E`. By the way, the capital letters
can be any letter you'd like. We could define `Result<T, E>` as:
can be any letter youd like. We could define `Result<T, E>` as:
```{rust}
```rust
enum Result<A, Z> {
Ok(A),
Err(Z),
@@ -86,92 +67,58 @@ enum Result<A, Z> {
```
if we wanted to. Convention says that the first generic parameter should be
`T`, for 'type,' and that we use `E` for 'error.' Rust doesn't care, however.
`T`, for type, and that we use `E` for error. Rust doesnt care, however.
The `Result<T, E>` type is intended to be used to return the result of a
computation, and to have the ability to return an error if it didn't work out.
Here's an example:
computation, and to have the ability to return an error if it didnt work out.
```{rust}
let x: Result<f64, String> = Ok(2.3f64);
let y: Result<f64, String> = Err("There was an error.".to_string());
```
## Generic functions
This particular Result will return an `f64` if there's a success, and a
`String` if there's a failure. Let's write a function that uses `Result<T, E>`:
We can write functions that take generic types with a similar syntax:
```{rust}
fn inverse(x: f64) -> Result<f64, String> {
if x == 0.0f64 { return Err("x cannot be zero!".to_string()); }
Ok(1.0f64 / x)
```rust
fn takes_anything<T>(x: T) {
// do something with x
}
```
We don't want to take the inverse of zero, so we check to make sure that we
weren't passed zero. If we were, then we return an `Err`, with a message. If
it's okay, we return an `Ok`, with the answer.
The syntax has two parts: the `<T>` says “this function is generic over one
type, `T`”, and the `x: T` says “x has the type `T`.”
Why does this matter? Well, remember how `match` does exhaustive matches?
Here's how this function gets used:
Multiple arguments can have the same generic type:
```{rust}
# fn inverse(x: f64) -> Result<f64, String> {
# if x == 0.0f64 { return Err("x cannot be zero!".to_string()); }
# Ok(1.0f64 / x)
# }
let x = inverse(25.0f64);
match x {
Ok(x) => println!("The inverse of 25 is {}", x),
Err(msg) => println!("Error: {}", msg),
```rust
fn takes_two_of_the_same_things<T>(x: T, y: T) {
// ...
}
```
The `match` enforces that we handle the `Err` case. In addition, because the
answer is wrapped up in an `Ok`, we can't just use the result without doing
the match:
We could write a version that takes multiple types:
```{rust,ignore}
let x = inverse(25.0f64);
println!("{}", x + 2.0f64); // error: binary operation `+` cannot be applied
// to type `core::result::Result<f64,collections::string::String>`
```
This function is great, but there's one other problem: it only works for 64 bit
floating point values. What if we wanted to handle 32 bit floating point as
well? We'd have to write this:
```{rust}
fn inverse32(x: f32) -> Result<f32, String> {
if x == 0.0f32 { return Err("x cannot be zero!".to_string()); }
Ok(1.0f32 / x)
```rust
fn takes_two_things<T, U>(x: T, y: U) {
// ...
}
```
Bummer. What we need is a *generic function*. Luckily, we can write one!
However, it won't _quite_ work yet. Before we get into that, let's talk syntax.
A generic version of `inverse` would look something like this:
Generic functions are most useful with trait bounds, which well cover in the
[section on traits][traits].
```{rust,ignore}
fn inverse<T>(x: T) -> Result<T, String> {
if x == 0.0 { return Err("x cannot be zero!".to_string()); }
[traits]: traits.html
Ok(1.0 / x)
## Generic structs
You can store a generic type in a `struct` as well:
```
struct Point<T> {
x: T,
y: T,
}
let int_origin = Point { x: 0, y: 0 };
let float_origin = Point { x: 0.0, y: 0.0 };
```
Just like how we had `Option<T>`, we use a similar syntax for `inverse<T>`.
We can then use `T` inside the rest of the signature: `x` has type `T`, and half
of the `Result` has type `T`. However, if we try to compile that example, we'll get
an error:
```text
error: binary operation `==` cannot be applied to type `T`
```
Because `T` can be _any_ type, it may be a type that doesn't implement `==`,
and therefore, the first line would be wrong. What do we do?
To fix this example, we need to learn about another Rust feature: traits.
Similarly to functions, the `<T>` is where we declare the generic parameters,
and we then use `x: T` in the type declaration, too.
+1 -1
View File
@@ -1,5 +1,5 @@
% Getting Started
This first section of the book will get you going with Rust and its tooling.
First, well install Rust. Then: the classic Hello World program. Finally,
First, well install Rust. Then, the classic Hello World program. Finally,
well talk about Cargo, Rusts build system and package manager.
+1 -1
View File
@@ -19,7 +19,7 @@ In the example above `x` and `y` have arity 2. `z` has arity 3.
When a compiler is compiling your program, it does a number of different
things. One of the things that it does is turn the text of your program into an
'abstract syntax tree,' or 'AST.' This tree is a representation of the
abstract syntax tree, orAST. This tree is a representation of the
structure of your program. For example, `2 + 3` can be turned into a tree:
```text
+1003
View File
@@ -0,0 +1,1003 @@
% Guessing Game
For our first project, well implement a classic beginner programming problem:
the guessing game. Heres how it works: Our program will generate a random
integer between one and a hundred. It will then prompt us to enter a guess.
Upon entering our guess, it will tell us if were too low or too high. Once we
guess correctly, it will congratulate us. Sounds good?
# Set up
Lets set up a new project. Go to your projects directory. Remember how we had
to create our directory structure and a `Cargo.toml` for `hello_world`? Cargo
has a command that does that for us. Lets give it a shot:
```bash
$ cd ~/projects
$ cargo new guessing_game --bin
$ cd guessing_game
```
We pass the name of our project to `cargo new`, and then the `--bin` flag,
since were making a binary, rather than a library.
Check out the generated `Cargo.toml`:
```toml
[package]
name = "guessing_game"
version = "0.0.1"
authors = ["Your Name <you@example.com>"]
```
Cargo gets this information from your environment. If its not correct, go ahead
and fix that.
Finally, Cargo generated a Hello, world! for us. Check out `src/main.rs`:
```rust
fn main() {
println!("Hello, world!")
}
```
Lets try compiling what Cargo gave us:
```{bash}
$ cargo build
Compiling guessing_game v0.0.1 (file:///home/you/projects/guessing_game)
```
Excellent! Open up your `src/main.rs` again. Well be writing all of
our code in this file.
Before we move on, let me show you one more Cargo command: `run`. `cargo run`
is kind of like `cargo build`, but it also then runs the produced executable.
Try it out:
```bash
$ cargo run
Compiling guessing_game v0.0.1 (file:///home/you/projects/guessing_game)
Running `target/debug/guessing_game`
Hello, world!
```
Great! The `run` command comes in handy when you need to rapidly iterate on a
project. Our game is just such a project, we need to quickly test each
iteration before moving on to the next one.
# Processing a Guess
Lets get to it! The first thing we need to do for our guessing game is
allow our player to input a guess. Put this in your `src/main.rs`:
```rust,no_run
use std::io;
fn main() {
println!("Guess the number!");
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("Failed to read line");
println!("You guessed: {}", guess);
}
```
Theres a lot here! Lets go over it, bit by bit.
```rust,ignore
use std::io;
```
Well need to take user input, and then print the result as output. As such, we
need the `io` library from the standard library. Rust only imports a few things
into every program, [the prelude][prelude]. If its not in the prelude,
youll have to `use` it directly.
[prelude]: ../std/prelude/index.html
```rust,ignore
fn main() {
```
As youve seen before, the `main()` function is the entry point into your
program. The `fn` syntax declares a new function, the `()`s indicate that
there are no arguments, and `{` starts the body of the function. Because
we didnt include a return type, its assumed to be `()`, an empty
[tuple][tuples].
[tuples]: primitive-types.html#tuples
```rust,ignore
println!("Guess the number!");
println!("Please input your guess.");
```
We previously learned that `println!()` is a [macro][macros] that
prints a [string][strings] to the screen.
[macros]: macros.html
[strings]: strings.html
```rust,ignore
let mut guess = String::new();
```
Now were getting interesting! Theres a lot going on in this little line.
The first thing to notice is that this is a [let statement][let], which is
used to create variable bindings. They take this form:
```rust,ignore
let foo = bar;
```
[let]: variable-bindings.html
This will create a new binding named `foo`, and bind it to the value `bar`. In
many languages, this is called a variable, but Rusts variable bindings have
a few tricks up their sleeves.
For example, theyre [immutable][immutable] by default. Thats why our example
uses `mut`: it makes a binding mutable, rather than immutable. `let` doesnt
take a name on the left hand side, it actually accepts a
[pattern][patterns]. Well use patterns more later. Its easy enough
to use for now:
```
let foo = 5; // immutable.
let mut bar = 5; // mutable
```
[immutable]: mutability.html
[patterns]: patterns.html
Oh, and `//` will start a comment, until the end of the line. Rust ignores
everything in [comments][comments].
[comments]: comments.html
So now we know that `let mut guess` will introduce a mutable binding named
`guess`, but we have to look at the other side of the `=` for what its
bound to: `String::new()`.
`String` is a string type, provided by the standard library. A
[`String`][string] is a growable, UTF-8 encoded bit of text.
[string]: ../std/string/struct.String.html
The `::new()` syntax uses `::` because this is an associated function of
a particular type. That is to say, its associated with `String` itself,
rather than a particular instance of a `String`. Some languages call this a
static method.
This function is named `new()`, because it creates a new, empty `String`.
Youll find a `new()` function on many types, as its a common name for making
a new value of some kind.
Lets move forward:
```rust,ignore
io::stdin().read_line(&mut guess)
.ok()
.expect("Failed to read line");
```
Thats a lot more! Lets go bit-by-bit. The first line has two parts. Heres
the first:
```rust,ignore
io::stdin()
```
Remember how we `use`d `std::io` on the first line of the program? Were now
calling an associated function on it. If we didnt `use std::io`, we could
have written this line as `std::io::stdin()`.
This particular function returns a handle to the standard input for your
terminal. More specifically, a [std::io::Stdin][iostdin].
[iostdin]: ../std/io/struct.Stdin.html
The next part will use this handle to get input from the user:
```rust,ignore
.read_line(&mut guess)
```
Here, we call the [`read_line()`][read_line] method on our handle.
[Method][method]s are like associated functions, but are only available on a
particular instance of a type, rather than the type itself. Were also passing
one argument to `read_line()`: `&mut guess`.
[read_line]: ../std/io/struct.Stdin.html#method.read_line
[method]: methods.html
Remember how we bound `guess` above? We said it was mutable. However,
`read_line` doesnt take a `String` as an argument: it takes a `&mut String`.
Rust has a feature called [references][references], which allows you to have
multiple references to one piece of data, which can reduce copying. References
are a complex feature, as one of Rusts major selling points is how safe and
easy it is to use references. We dont need to know a lot of those details to
finish our program right now, though. For now, all we need to know is that
like `let` bindings, references are immutable by default. Hence, we need to
write `&mut guess`, rather than `&guess`.
Why does `read_line()` take a mutable reference to a string? Its job is
to take what the user types into standard input, and place that into a
string. So it takes that string as an argument, and in order to add
the input, it needs to be mutable.
[references]: references-and-borrowing.html
But were not quite done with this line of code, though. While its
a single line of text, its only the first part of the single logical line of
code:
```rust,ignore
.ok()
.expect("Failed to read line");
```
When you call a method with the `.foo()` syntax, you may introduce a newline
and other whitespace. This helps you split up long lines. We _could_ have
done:
```rust,ignore
io::stdin().read_line(&mut guess).ok().expect("failed to read line");
```
But that gets hard to read. So weve split it up, three lines for three
method calls. We already talked about `read_line()`, but what about `ok()`
and `expect()`? Well, we already mentioned that `read_line()` puts what
the user types into the `&mut String` we pass it. But it also returns
a value: in this case, an [`io::Result`][ioresult]. Rust has a number of
types named `Result` in its standard library: a generic [`Result`][result],
and then specific versions for sub-libraries, like `io::Result`.
[ioresult]: ../std/io/type.Result.html
[result]: ../std/result/enum.Result.html
The purpose of these `Result` types is to encode error handling information.
Values of the `Result` type, like any type, have methods defined on them. In
this case, `io::Result` has an `ok()` method, which says we want to assume
this value is a successful one. If not, just throw away the error
information. Why throw it away? Well, for a basic program, we just want to
print a generic error, as basically any issue means we cant continue. The
[`ok()` method][ok] returns a value which has another method defined on it:
`expect()`. The [`expect()` method][expect] takes a value its called on, and
if it isnt a successful one, [`panic!`][panic]s with a message you
passed it. A `panic!` like this will cause our program to crash, displaying
the message.
[ok]: ../std/result/enum.Result.html#method.ok
[expect]: ../std/option/enum.Option.html#method.expect
[panic]: error-handling.html
If we leave off calling these two methods, our program will compile, but
well get a warning:
```bash
$ cargo build
Compiling guessing_game v0.1.0 (file:///home/you/projects/guessing_game)
src/main.rs:10:5: 10:39 warning: unused result which must be used,
#[warn(unused_must_use)] on by default
src/main.rs:10 io::stdin().read_line(&mut guess);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Rust warns us that we havent used the `Result` value. This warning comes from
a special annotation that `io::Result` has. Rust is trying to tell you that
you havent handled a possible error. The right way to suppress the error is
to actually write error handling. Luckily, if we just want to crash if theres
a problem, we can use these two little methods. If we can recover from the
error somehow, wed do something else, but well save that for a future
project.
Theres just one line of this first example left:
```rust,ignore
println!("You guessed: {}", guess);
}
```
This prints out the string we saved our input in. The `{}`s are a placeholder,
and so we pass it `guess` as an argument. If we had multiple `{}`s, we would
pass multiple arguments:
```rust
let x = 5;
let y = 10;
println!("x and y: {} and {}", x, y);
```
Easy.
Anyway, thats the tour. We can run what we have with `cargo run`:
```bash
$ cargo run
Compiling guessing_game v0.1.0 (file:///home/you/projects/guessing_game)
Running `target/debug/guessing_game`
Guess the number!
Please input your guess.
6
You guessed: 6
```
All right! Our first part is done: we can get input from the keyboard,
and then print it back out.
# Generating a secret number
Next, we need to generate a secret number. Rust does not yet include random
number functionality in its standard library. The Rust team does, however,
provide a [`rand` crate][randcrate]. A crate is a package of Rust code.
Weve been building a binary crate, which is an executable. `rand` is a
library crate, which contains code thats intended to be used with other
programs.
[randcrate]: https://crates.io/crates/rand
Using external crates is where Cargo really shines. Before we can write
the code using `rand`, we need to modify our `Cargo.toml`. Open it up, and
add these few lines at the bottom:
```toml
[dependencies]
rand="0.3.0"
```
The `[dependencies]` section of `Cargo.toml` is like the `[package]` section:
everything that follows it is part of it, until the next section starts.
Cargo uses the dependencies section to know what dependencies on external
crates you have, and what versions you require. In this case, weve used version `0.3.0`.
Cargo understands [Semantic Versioning][semver], which is a standard for writing version
numbers. If we wanted to use the latest version we could use `*` or we could use a range
of versions. [Cargos documentation][cargodoc] contains more details.
[semver]: http://semver.org
[cargodoc]: http://doc.crates.io/crates-io.html
Now, without changing any of our code, lets build our project:
```bash
$ cargo build
Updating registry `https://github.com/rust-lang/crates.io-index`
Downloading rand v0.3.8
Downloading libc v0.1.6
Compiling libc v0.1.6
Compiling rand v0.3.8
Compiling guessing_game v0.1.0 (file:///home/you/projects/guessing_game)
```
(You may see different versions, of course.)
Lots of new output! Now that we have an external dependency, Cargo fetches the
latest versions of everything from the registry, which is a copy of data from
[Crates.io][cratesio]. Crates.io is where people in the Rust ecosystem
post their open source Rust projects for others to use.
[cratesio]: https://crates.io
After updating the registry, Cargo checks our `[dependencies]` and downloads
any we dont have yet. In this case, while we only said we wanted to depend on
`rand`, weve also grabbed a copy of `libc`. This is because `rand` depends on
`libc` to work. After downloading them, it compiles them, and then compiles
our project.
If we run `cargo build` again, well get different output:
```bash
$ cargo build
```
Thats right, no output! Cargo knows that our project has been built, and that
all of its dependencies are built, and so theres no reason to do all that
stuff. With nothing to do, it simply exits. If we open up `src/main.rs` again,
make a trivial change, and then save it again, well just see one line:
```bash
$ cargo build
Compiling guessing_game v0.1.0 (file:///home/you/projects/guessing_game)
```
So, we told Cargo we wanted any `0.3.x` version of `rand`, and so it fetched the latest
version at the time this was written, `v0.3.8`. But what happens when next
week, version `v0.3.9` comes out, with an important bugfix? While getting
bugfixes is important, what if `0.3.9` contains a regression that breaks our
code?
The answer to this problem is the `Cargo.lock` file youll now find in your
project directory. When you build your project for the first time, Cargo
figures out all of the versions that fit your criteria, and then writes them
to the `Cargo.lock` file. When you build your project in the future, Cargo
will see that the `Cargo.lock` file exists, and then use that specific version
rather than do all the work of figuring out versions again. This lets you
have a repeatable build automatically. In other words, well stay at `0.3.8`
until we explicitly upgrade, and so will anyone who we share our code with,
thanks to the lock file.
What about when we _do_ want to use `v0.3.9`? Cargo has another command,
`update`, which says ignore the lock, figure out all the latest versions that
fit what weve specified. If that works, write those versions out to the lock
file. But, by default, Cargo will only look for versions larger than `0.3.0`
and smaller than `0.4.0`. If we want to move to `0.4.x`, wed have to update
the `Cargo.toml` directly. When we do, the next time we `cargo build`, Cargo
will update the index and re-evaluate our `rand` requirements.
Theres a lot more to say about [Cargo][doccargo] and [its
ecosystem][doccratesio], but for now, thats all we need to know. Cargo makes
it really easy to re-use libraries, and so Rustaceans tend to write smaller
projects which are assembled out of a number of sub-packages.
[doccargo]: http://doc.crates.io
[doccratesio]: http://doc.crates.io/crates-io.html
Lets get on to actually _using_ `rand`. Heres our next step:
```rust,ignore
extern crate rand;
use std::io;
use rand::Rng;
fn main() {
println!("Guess the number!");
let secret_number = rand::thread_rng().gen_range(1, 101);
println!("The secret number is: {}", secret_number);
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("failed to read line");
println!("You guessed: {}", guess);
}
```
The first thing weve done is change the first line. It now says
`extern crate rand`. Because we declared `rand` in our `[dependencies]`, we
can use `extern crate` to let Rust know well be making use of it. This also
does the equivalent of a `use rand;` as well, so we can make use of anything
in the `rand` crate by prefixing it with `rand::`.
Next, we added another `use` line: `use rand::Rng`. Were going to use a
method in a moment, and it requires that `Rng` be in scope to work. The basic
idea is this: methods are defined on something called traits, and for the
method to work, it needs the trait to be in scope. For more about the
details, read the [traits][traits] section.
[traits]: traits.html
There are two other lines we added, in the middle:
```rust,ignore
let secret_number = rand::thread_rng().gen_range(1, 101);
println!("The secret number is: {}", secret_number);
```
We use the `rand::thread_rng()` function to get a copy of the random number
generator, which is local to the particular [thread][concurrency] of execution
were in. Because we `use rand::Rng`d above, it has a `gen_range()` method
available. This method takes two arguments, and generates a number between
them. Its inclusive on the lower bound, but exclusive on the upper bound,
so we need `1` and `101` to get a number between one and a hundred.
[concurrency]: concurrency.html
The second line just prints out the secret number. This is useful while
were developing our program, so we can easily test it out. But well be
deleting it for the final version. Its not much of a game if it prints out
the answer when you start it up!
Try running our new program a few times:
```bash
$ cargo run
Compiling guessing_game v0.1.0 (file:///home/you/projects/guessing_game)
Running `target/debug/guessing_game`
Guess the number!
The secret number is: 7
Please input your guess.
4
You guessed: 4
$ cargo run
Running `target/debug/guessing_game`
Guess the number!
The secret number is: 83
Please input your guess.
5
You guessed: 5
```
Great! Next up: lets compare our guess to the secret guess.
# Comparing guesses
Now that weve got user input, lets compare our guess to the random guess.
Heres our next step, though it doesnt quite work yet:
```rust,ignore
extern crate rand;
use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
println!("Guess the number!");
let secret_number = rand::thread_rng().gen_range(1, 101);
println!("The secret number is: {}", secret_number);
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("failed to read line");
println!("You guessed: {}", guess);
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => println!("You win!"),
}
}
```
A few new bits here. The first is another `use`. We bring a type called
`std::cmp::Ordering` into scope. Then, five new lines at the bottom that use
it:
```rust,ignore
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => println!("You win!"),
}
```
The `cmp()` method can be called on anything that can be compared, and it
takes a reference to the thing you want to compare it to. It returns the
`Ordering` type we `use`d earlier. We use a [`match`][match] statement to
determine exactly what kind of `Ordering` it is. `Ordering` is an
[`enum`][enum], short for enumeration, which looks like this:
```rust
enum Foo {
Bar,
Baz,
}
```
[match]: match.html
[enum]: enums.html
With this definition, anything of type `Foo` can be either a
`Foo::Bar` or a `Foo::Baz`. We use the `::` to indicate the
namespace for a particular `enum` variant.
The [`Ordering`][ordering] enum has three possible variants: `Less`, `Equal`,
and `Greater`. The `match` statement takes a value of a type, and lets you
create an arm for each possible value. Since we have three types of
`Ordering`, we have three arms:
```rust,ignore
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => println!("You win!"),
}
```
[ordering]: ../std/cmp/enum.Ordering.html
If its `Less`, we print `Too small!`, if its `Greater`, `Too big!`, and if
`Equal`, `You win!`. `match` is really useful, and is used often in Rust.
I did mention that this wont quite work yet, though. Lets try it:
```bash
$ cargo build
Compiling guessing_game v0.1.0 (file:///home/you/projects/guessing_game)
src/main.rs:28:21: 28:35 error: mismatched types:
expected `&collections::string::String`,
found `&_`
(expected struct `collections::string::String`,
found integral variable) [E0308]
src/main.rs:28 match guess.cmp(&secret_number) {
^~~~~~~~~~~~~~
error: aborting due to previous error
Could not compile `guessing_game`.
```
Whew! This is a big error. The core of it is that we have mismatched types.
Rust has a strong, static type system. However, it also has type inference.
When we wrote `let guess = String::new()`, Rust was able to infer that `guess`
should be a `String`, and so it doesnt make us write out the type. And with
our `secret_number`, there are a number of types which can have a value
between one and a hundred: `i32`, a thirty-two-bit number, or `u32`, an
unsigned thirty-two-bit number, or `i64`, a sixty-four-bit number. Or others.
So far, that hasnt mattered, and so Rust defaults to an `i32`. However, here,
Rust doesnt know how to compare the `guess` and the `secret_number`. They
need to be the same type. Ultimately, we want to convert the `String` we
read as input into a real number type, for comparison. We can do that
with three more lines. Heres our new program:
```rust,ignore
extern crate rand;
use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
println!("Guess the number!");
let secret_number = rand::thread_rng().gen_range(1, 101);
println!("The secret number is: {}", secret_number);
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("failed to read line");
let guess: u32 = guess.trim().parse()
.ok()
.expect("Please type a number!");
println!("You guessed: {}", guess);
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => println!("You win!"),
}
}
```
The new three lines:
```rust,ignore
let guess: u32 = guess.trim().parse()
.ok()
.expect("Please type a number!");
```
Wait a minute, I thought we already had a `guess`? We do, but Rust allows us
to shadow the previous `guess` with a new one. This is often used in this
exact situation, where `guess` starts as a `String`, but we want to convert it
to an `u32`. Shadowing lets us re-use the `guess` name, rather than forcing us
to come up with two unique names like `guess_str` and `guess`, or something
else.
We bind `guess` to an expression that looks like something we wrote earlier:
```rust,ignore
guess.trim().parse()
```
Followed by an `ok().expect()` invocation. Here, `guess` refers to the old
`guess`, the one that was a `String` with our input in it. The `trim()`
method on `String`s will eliminate any white space at the beginning and end of
our string. This is important, as we had to press the return key to satisfy
`read_line()`. This means that if we type `5` and hit return, `guess` looks
like this: `5\n`. The `\n` represents newline, the enter key. `trim()` gets
rid of this, leaving our string with just the `5`. The [`parse()` method on
strings][parse] parses a string into some kind of number. Since it can parse a
variety of numbers, we need to give Rust a hint as to the exact type of number
we want. Hence, `let guess: u32`. The colon (`:`) after `guess` tells Rust
were going to annotate its type. `u32` is an unsigned, thirty-two bit
integer. Rust has [a number of built-in number types][number], but weve
chosen `u32`. Its a good default choice for a small positive number.
[parse]: ../std/primitive.str.html#method.parse
[number]: primitive-types.html#numeric-types
Just like `read_line()`, our call to `parse()` could cause an error. What if
our string contained `A👍%`? Thered be no way to convert that to a number. As
such, well do the same thing we did with `read_line()`: use the `ok()` and
`expect()` methods to crash if theres an error.
Lets try our program out!
```bash
$ cargo run
Compiling guessing_game v0.0.1 (file:///home/you/projects/guessing_game)
Running `target/guessing_game`
Guess the number!
The secret number is: 58
Please input your guess.
76
You guessed: 76
Too big!
```
Nice! You can see I even added spaces before my guess, and it still figured
out that I guessed 76. Run the program a few times, and verify that guessing
the number works, as well as guessing a number too small.
Now weve got most of the game working, but we can only make one guess. Lets
change that by adding loops!
# Looping
The `loop` keyword gives us an infinite loop. Lets add that in:
```rust,ignore
extern crate rand;
use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
println!("Guess the number!");
let secret_number = rand::thread_rng().gen_range(1, 101);
println!("The secret number is: {}", secret_number);
loop {
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("failed to read line");
let guess: u32 = guess.trim().parse()
.ok()
.expect("Please type a number!");
println!("You guessed: {}", guess);
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => println!("You win!"),
}
}
}
```
And try it out. But wait, didnt we just add an infinite loop? Yup. Remember
our discussion about `parse()`? If we give a non-number answer, well `return`
and quit. Observe:
```bash
$ cargo run
Compiling guessing_game v0.0.1 (file:///home/you/projects/guessing_game)
Running `target/guessing_game`
Guess the number!
The secret number is: 59
Please input your guess.
45
You guessed: 45
Too small!
Please input your guess.
60
You guessed: 60
Too big!
Please input your guess.
59
You guessed: 59
You win!
Please input your guess.
quit
thread '<main>' panicked at 'Please type a number!'
```
Ha! `quit` actually quits. As does any other non-number input. Well, this is
suboptimal to say the least. First, lets actually quit when you win the game:
```rust,ignore
extern crate rand;
use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
println!("Guess the number!");
let secret_number = rand::thread_rng().gen_range(1, 101);
println!("The secret number is: {}", secret_number);
loop {
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("failed to read line");
let guess: u32 = guess.trim().parse()
.ok()
.expect("Please type a number!");
println!("You guessed: {}", guess);
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => {
println!("You win!");
break;
}
}
}
}
```
By adding the `break` line after the `You win!`, well exit the loop when we
win. Exiting the loop also means exiting the program, since its the last
thing in `main()`. We have just one more tweak to make: when someone inputs a
non-number, we dont want to quit, we just want to ignore it. We can do that
like this:
```rust,ignore
extern crate rand;
use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
println!("Guess the number!");
let secret_number = rand::thread_rng().gen_range(1, 101);
println!("The secret number is: {}", secret_number);
loop {
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("failed to read line");
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
println!("You guessed: {}", guess);
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => {
println!("You win!");
break;
}
}
}
}
```
These are the lines that changed:
```rust,ignore
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
```
This is how you generally move from crash on error to actually handle the
error, by switching from `ok().expect()` to a `match` statement. The `Result`
returned by `parse()` is an enum just like `Ordering`, but in this case, each
variant has some data associated with it: `Ok` is a success, and `Err` is a
failure. Each contains more information: the successful parsed integer, or an
error type. In this case, we `match` on `Ok(num)`, which sets the inner value
of the `Ok` to the name `num`, and then we just return it on the right-hand
side. In the `Err` case, we dont care what kind of error it is, so we just
use `_` instead of a name. This ignores the error, and `continue` causes us
to go to the next iteration of the `loop`.
Now we should be good! Lets try:
```bash
$ cargo run
Compiling guessing_game v0.0.1 (file:///home/you/projects/guessing_game)
Running `target/guessing_game`
Guess the number!
The secret number is: 61
Please input your guess.
10
You guessed: 10
Too small!
Please input your guess.
99
You guessed: 99
Too big!
Please input your guess.
foo
Please input your guess.
61
You guessed: 61
You win!
```
Awesome! With one tiny last tweak, we have finished the guessing game. Can you
think of what it is? Thats right, we dont want to print out the secret
number. It was good for testing, but it kind of ruins the game. Heres our
final source:
```rust,ignore
extern crate rand;
use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
println!("Guess the number!");
let secret_number = rand::thread_rng().gen_range(1, 101);
loop {
println!("Please input your guess.");
let mut guess = String::new();
io::stdin().read_line(&mut guess)
.ok()
.expect("failed to read line");
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
println!("You guessed: {}", guess);
match guess.cmp(&secret_number) {
Ordering::Less => println!("Too small!"),
Ordering::Greater => println!("Too big!"),
Ordering::Equal => {
println!("You win!");
break;
}
}
}
}
```
# Complete!
At this point, you have successfully built the Guessing Game! Congratulations!
This first project showed you a lot: `let`, `match`, methods, associated
functions, using external crates, and more. Our next project will show off
even more.
+9 -2
View File
@@ -5,7 +5,7 @@ projects. Cargo is currently in a pre-1.0 state, and so it is still a work in
progress. However, it is already good enough to use for many Rust projects, and
so it is assumed that Rust projects will use Cargo from the beginning.
[cratesio]: https://doc.crates.io
[cratesio]: http://doc.crates.io
Cargo manages three things: building your code, downloading the dependencies
your code needs, and building those dependencies. At first, your
@@ -32,6 +32,13 @@ $ mkdir src
$ mv main.rs src/main.rs
```
Note that since we're creating an executable, we used `main.rs`. If we
want to make a library instead, we should use `lib.rs`.
Custom file locations for the entry point can be specified
with a [`[[lib]]` or `[[bin]]`][crates-custom] key in the TOML file described below.
[crates-custom]: http://doc.crates.io/manifest.html#configuring-a-target
Cargo expects your source files to live inside a `src` directory. That leaves
the top level for other things, like READMEs, license information, and anything
not related to your code. Cargo helps us keep our projects nice and tidy. A
@@ -89,7 +96,7 @@ we hadnt changed the source file, and so it just ran the binary. If we had
made a modification, we would have seen it do both:
```bash
$ cargo build
$ cargo run
Compiling hello_world v0.0.1 (file:///home/yourname/projects/hello_world)
Running `target/debug/hello_world`
Hello, world!
+1 -1
View File
@@ -147,7 +147,7 @@ $ ./main # or main.exe on Windows
This prints out our `Hello, world!` text to our terminal.
If you come from a dynamically typed language like Ruby, Python, or JavaScript,
If you come from a dynamic language like Ruby, Python, or JavaScript,
you may not be used to these two steps being separate. Rust is an
ahead-of-time compiled language, which means that you can compile a program,
give it to someone else, and they don't need to have Rust installed. If you
+80 -1
View File
@@ -1,3 +1,82 @@
% if let
COMING SOON
`if let` allows you to combine `if` and `let` together to reduce the overhead
of certain kinds of pattern matches.
For example, lets say we have some sort of `Option<T>`. We want to call a function
on it if its `Some<T>`, but do nothing if its `None`. That looks like this:
```rust
# let option = Some(5);
# fn foo(x: i32) { }
match option {
Some(x) => { foo(x) },
None => {},
}
```
We dont have to use `match` here, for example, we could use `if`:
```rust
# let option = Some(5);
# fn foo(x: i32) { }
if option.is_some() {
let x = option.unwrap();
foo(x);
}
```
Neither of these options is particularly appealing. We can use `if let` to
do the same thing in a nicer way:
```rust
# let option = Some(5);
# fn foo(x: i32) { }
if let Some(x) = option {
foo(x);
}
```
If a [pattern][patterns] matches successfully, it binds any appropriate parts of
the value to the identifiers in the pattern, then evaluates the expression. If
the pattern doesnt match, nothing happens.
If youd rather to do something else when the pattern does not match, you can
use `else`:
```rust
# let option = Some(5);
# fn foo(x: i32) { }
# fn bar() { }
if let Some(x) = option {
foo(x);
} else {
bar();
}
```
## `while let`
In a similar fashion, `while let` can be used when you want to conditionally
loop as long as a value matches a certain pattern. It turns code like this:
```rust
# let option: Option<i32> = None;
loop {
match option {
Some(x) => println!("{}", x),
_ => break,
}
}
```
Into code like this:
```rust
# let option: Option<i32> = None;
while let Some(x) = option {
println!("{}", x);
}
```
[patterns]: patterns.html
+32 -5
View File
@@ -58,7 +58,7 @@ but you must add the right number of `:` if you skip them:
asm!("xor %eax, %eax"
:
:
: "eax"
: "{eax}"
);
# } }
```
@@ -69,7 +69,7 @@ Whitespace also doesn't matter:
# #![feature(asm)]
# #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
# fn main() { unsafe {
asm!("xor %eax, %eax" ::: "eax");
asm!("xor %eax, %eax" ::: "{eax}");
# } }
```
@@ -77,13 +77,13 @@ asm!("xor %eax, %eax" ::: "eax");
Input and output operands follow the same format: `:
"constraints1"(expr1), "constraints2"(expr2), ..."`. Output operand
expressions must be mutable lvalues:
expressions must be mutable lvalues, or not yet assigned:
```
# #![feature(asm)]
# #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn add(a: i32, b: i32) -> i32 {
let mut c = 0;
let c: i32;
unsafe {
asm!("add $2, $0"
: "=r"(c)
@@ -100,6 +100,22 @@ fn main() {
}
```
If you would like to use real operands in this position, however,
you are required to put curly braces `{}` around the register that
you want, and you are required to put the specific size of the
operand. This is useful for very low level programming, where
which register you use is important:
```
# #![feature(asm)]
# #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
# unsafe fn read_byte_in(port: u16) -> u8 {
let result: u8;
asm!("in %dx, %al" : "={al}"(result) : "{dx}"(port));
result
# }
```
## Clobbers
Some instructions modify registers which might otherwise have held
@@ -112,7 +128,7 @@ stay valid.
# #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
# fn main() { unsafe {
// Put the value 0x200 in eax
asm!("mov $$0x200, %eax" : /* no outputs */ : /* no inputs */ : "eax");
asm!("mov $$0x200, %eax" : /* no outputs */ : /* no inputs */ : "{eax}");
# } }
```
@@ -139,3 +155,14 @@ Current valid options are:
the compiler to insert its usual stack alignment code
3. *intel* - use intel syntax instead of the default AT&T.
```
# #![feature(asm)]
# #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
# fn main() {
let result: i32;
unsafe {
asm!("mov eax, 2" : "={eax}"(result) : : : "intel")
}
println!("eax is currently {}", result);
# }
```
+7 -9
View File
@@ -6,16 +6,16 @@ or a Mac, all you need to do is this (note that you don't need to type in the
`$`s, they just indicate the start of each command):
```bash
$ curl -sf -L https://static.rust-lang.org/rustup.sh | sudo sh
$ curl -sf -L https://static.rust-lang.org/rustup.sh | sh
```
If you're concerned about the [potential insecurity][insecurity] of using `curl
| sudo sh`, please keep reading and see our disclaimer below. And feel free to
| sh`, please keep reading and see our disclaimer below. And feel free to
use a two-step version of the installation and examine our installation script:
```bash
$ curl -f -L https://static.rust-lang.org/rustup.sh -O
$ sudo sh rustup.sh
$ sh rustup.sh
```
[insecurity]: http://curlpipesh.tumblr.com
@@ -40,13 +40,11 @@ If you used the Windows installer, just re-run the `.msi` and it will give you
an uninstall option.
Some people, and somewhat rightfully so, get very upset when we tell you to
`curl | sudo sh`. Basically, when you do this, you are trusting that the good
`curl | sh`. Basically, when you do this, you are trusting that the good
people who maintain Rust aren't going to hack your computer and do bad things.
That's a good instinct! If you're one of those people, please check out the
documentation on [building Rust from Source][from source], or [the official
binary downloads][install page]. And we promise that this method will not be
the way to install Rust forever: it's just the easiest way to keep people
updated while Rust is in its alpha state.
binary downloads][install page].
[from source]: https://github.com/rust-lang/rust#building-from-source
[install page]: http://www.rust-lang.org/install.html
@@ -91,9 +89,9 @@ If not, there are a number of places where you can get help. The easiest is
[Mibbit][mibbit]. Click that link, and you'll be chatting with other Rustaceans
(a silly nickname we call ourselves), and we can help you out. Other great
resources include [the users forum][users], and
[Stack Overflow][stack overflow].
[Stack Overflow][stackoverflow].
[irc]: irc://irc.mozilla.org/#rust
[mibbit]: http://chat.mibbit.com/?server=irc.mozilla.org&channel=%23rust
[users]: http://users.rust-lang.org/
[stack overflow]: http://stackoverflow.com/questions/tagged/rust
[stackoverflow]: http://stackoverflow.com/questions/tagged/rust
+5 -16
View File
@@ -212,9 +212,9 @@ see why consumers matter.
As we've said before, an iterator is something that we can call the
`.next()` method on repeatedly, and it gives us a sequence of things.
Because you need to call the method, this means that iterators
are *lazy* and don't need to generate all of the values upfront.
This code, for example, does not actually generate the numbers
`1-100`, and just creates a value that represents the sequence:
can be *lazy* and not generate all of the values upfront. This code,
for example, does not actually generate the numbers `1-100`, instead
creating a value that merely represents the sequence:
```rust
let nums = 1..100;
@@ -235,7 +235,7 @@ Ranges are one of two basic iterators that you'll see. The other is `iter()`.
in turn:
```rust
let nums = [1, 2, 3];
let nums = vec![1, 2, 3];
for num in nums.iter() {
println!("{}", num);
@@ -243,18 +243,7 @@ for num in nums.iter() {
```
These two basic iterators should serve you well. There are some more
advanced iterators, including ones that are infinite. Like using range syntax
and `step_by`:
```rust
# #![feature(step_by)]
(1..).step_by(5);
```
This iterator counts up from one, adding five each time. It will give
you a new integer every time, forever (well, technically, until it reaches the
maximum number representable by an `i32`). But since iterators are lazy,
that's okay! You probably don't want to use `collect()` on it, though...
advanced iterators, including ones that are infinite.
That's enough about iterators. Iterator adapters are the last concept
we need to talk about with regards to iterators. Let's get to it!
+3 -3
View File
@@ -7,7 +7,7 @@
The `rustc` compiler has certain pluggable operations, that is,
functionality that isn't hard-coded into the language, but is
implemented in libraries, with a special marker to tell the compiler
it exists. The marker is the attribute `#[lang="..."]` and there are
it exists. The marker is the attribute `#[lang = "..."]` and there are
various different values of `...`, i.e. various different 'lang
items'.
@@ -28,7 +28,7 @@ extern {
#[lang = "owned_box"]
pub struct Box<T>(*mut T);
#[lang="exchange_malloc"]
#[lang = "exchange_malloc"]
unsafe fn allocate(size: usize, _align: usize) -> *mut u8 {
let p = libc::malloc(size as libc::size_t) as *mut u8;
@@ -39,7 +39,7 @@ unsafe fn allocate(size: usize, _align: usize) -> *mut u8 {
p
}
#[lang="exchange_free"]
#[lang = "exchange_free"]
unsafe fn deallocate(ptr: *mut u8, _size: usize, _align: usize) {
libc::free(ptr as *mut libc::c_void)
}
+7 -2
View File
@@ -1,4 +1,9 @@
% Learn Rust
This section is coming soon! It will eventually have a few tutorials with
building real Rust projects, but they are under development.
Welcome! This section has a few tutorials that teach you Rust through building
projects. Youll get a high-level overview, but well skim over the details.
If youd prefer a more from the ground up-style experience, check
out [Syntax and Semantics][ss].
[ss]: syntax-and-semantics.html
+295 -1
View File
@@ -1,3 +1,297 @@
% Lifetimes
Coming soon!
This guide is one of three presenting Rusts ownership system. This is one of
Rusts most unique and compelling features, with which Rust developers should
become quite acquainted. Ownership is how Rust achieves its largest goal,
memory safety. There are a few distinct concepts, each with its own chapter:
* [ownership][ownership], the key concept
* [borrowing][borrowing], and their associated feature references
* lifetimes, which youre reading now
These three chapters are related, and in order. Youll need all three to fully
understand the ownership system.
[ownership]: ownership.html
[borrowing]: references-and-borrowing.html
# Meta
Before we get to the details, two important notes about the ownership system.
Rust has a focus on safety and speed. It accomplishes these goals through many
zero-cost abstractions, which means that in Rust, abstractions cost as little
as possible in order to make them work. The ownership system is a prime example
of a zero-cost abstraction. All of the analysis well talk about in this guide
is _done at compile time_. You do not pay any run-time cost for any of these
features.
However, this system does have a certain cost: learning curve. Many new users
to Rust experience something we like to call fighting with the borrow
checker, where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmers mental model of
how ownership should work doesnt match the actual rules that Rust implements.
You probably will experience similar things at first. There is good news,
however: more experienced Rust developers report that once they work with the
rules of the ownership system for a period of time, they fight the borrow
checker less and less.
With that in mind, lets learn about lifetimes.
# Lifetimes
Lending out a reference to a resource that someone else owns can be
complicated. For example, imagine this set of operations:
- I acquire a handle to some kind of resource.
- I lend you a reference to the resource.
- I decide Im done with the resource, and deallocate it, while you still have
your reference.
- You decide to use the resource.
Uh oh! Your reference is pointing to an invalid resource. This is called a
dangling pointer or use after free, when the resource is memory.
To fix this, we have to make sure that step four never happens after step
three. The ownership system in Rust does this through a concept called
lifetimes, which describe the scope that a reference is valid for.
When we have a function that takes a reference by argument, we can be implicit
or explicit about the lifetime of the reference:
```rust
// implicit
fn foo(x: &i32) {
}
// explicit
fn bar<'a>(x: &'a i32) {
}
```
The `'a` reads the lifetime a. Technically, every reference has some lifetime
associated with it, but the compiler lets you elide them in common cases.
Before we get to that, though, lets break the explicit example down:
```rust,ignore
fn bar<'a>(...)
```
This part declares our lifetimes. This says that `bar` has one lifetime, `'a`.
If we had two reference parameters, it would look like this:
```rust,ignore
fn bar<'a, 'b>(...)
```
Then in our parameter list, we use the lifetimes weve named:
```rust,ignore
...(x: &'a i32)
```
If we wanted an `&mut` reference, wed do this:
```rust,ignore
...(x: &'a mut i32)
```
If you compare `&mut i32` to `&'a mut i32`, theyre the same, its just that
the lifetime `'a` has snuck in between the `&` and the `mut i32`. We read `&mut
i32` as a mutable reference to an i32 and `&'a mut i32` as a mutable
reference to an `i32` with the lifetime `'a`.
Youll also need explicit lifetimes when working with [`struct`][structs]s:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // this is the same as `let _y = 5; let y = &_y;`
let f = Foo { x: y };
println!("{}", f.x);
}
```
[structs]: structs.html
As you can see, `struct`s can also have lifetimes. In a similar way to functions,
```rust
struct Foo<'a> {
# x: &'a i32,
# }
```
declares a lifetime, and
```rust
# struct Foo<'a> {
x: &'a i32,
# }
```
uses it. So why do we need a lifetime here? We need to ensure that any reference
to a `Foo` cannot outlive the reference to an `i32` it contains.
## Thinking in scopes
A way to think about lifetimes is to visualize the scope that a reference is
valid for. For example:
```rust
fn main() {
let y = &5; // -+ y goes into scope
// |
// stuff // |
// |
} // -+ y goes out of scope
```
Adding in our `Foo`:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // -+ y goes into scope
let f = Foo { x: y }; // -+ f goes into scope
// stuff // |
// |
} // -+ f and y go out of scope
```
Our `f` lives within the scope of `y`, so everything works. What if it didnt?
This code wont work:
```rust,ignore
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let x; // -+ x goes into scope
// |
{ // |
let y = &5; // ---+ y goes into scope
let f = Foo { x: y }; // ---+ f goes into scope
x = &f.x; // | | error here
} // ---+ f and y go out of scope
// |
println!("{}", x); // |
} // -+ x goes out of scope
```
Whew! As you can see here, the scopes of `f` and `y` are smaller than the scope
of `x`. But when we do `x = &f.x`, we make `x` a reference to something thats
about to go out of scope.
Named lifetimes are a way of giving these scopes a name. Giving something a
name is the first step towards being able to talk about it.
## 'static
The lifetime named static is a special lifetime. It signals that something
has the lifetime of the entire program. Most Rust programmers first come across
`'static` when dealing with strings:
```rust
let x: &'static str = "Hello, world.";
```
String literals have the type `&'static str` because the reference is always
alive: they are baked into the data segment of the final binary. Another
example are globals:
```rust
static FOO: i32 = 5;
let x: &'static i32 = &FOO;
```
This adds an `i32` to the data segment of the binary, and `x` is a reference
to it.
## Lifetime Elision
Rust supports powerful local type inference in function bodies, but its
forbidden in item signatures to allow reasoning about the types just based in
the item signature alone. However, for ergonomic reasons a very restricted
secondary inference algorithm called “lifetime elision” applies in function
signatures. It infers only based on the signature components themselves and not
based on the body of the function, only infers lifetime parameters, and does
this with only three easily memorizable and unambiguous rules. This makes
lifetime elision a shorthand for writing an item signature, while not hiding
away the actual types involved as full local inference would if applied to it.
When talking about lifetime elision, we use the term *input lifetime* and
*output lifetime*. An *input lifetime* is a lifetime associated with a parameter
of a function, and an *output lifetime* is a lifetime associated with the return
value of a function. For example, this function has an input lifetime:
```rust,ignore
fn foo<'a>(bar: &'a str)
```
This one has an output lifetime:
```rust,ignore
fn foo<'a>() -> &'a str
```
This one has a lifetime in both positions:
```rust,ignore
fn foo<'a>(bar: &'a str) -> &'a str
```
Here are the three rules:
* Each elided lifetime in a functions arguments becomes a distinct lifetime
parameter.
* If there is exactly one input lifetime, elided or not, that lifetime is
assigned to all elided lifetimes in the return values of that function.
* If there are multiple input lifetimes, but one of them is `&self` or `&mut
self`, the lifetime of `self` is assigned to all elided output lifetimes.
Otherwise, it is an error to elide an output lifetime.
### Examples
Here are some examples of functions with elided lifetimes. Weve paired each
example of an elided lifetime with its expanded form.
```rust,ignore
fn print(s: &str); // elided
fn print<'a>(s: &'a str); // expanded
fn debug(lvl: u32, s: &str); // elided
fn debug<'a>(lvl: u32, s: &'a str); // expanded
// In the preceding example, `lvl` doesnt need a lifetime because its not a
// reference (`&`). Only things relating to references (such as a `struct`
// which contains a reference) need lifetimes.
fn substr(s: &str, until: u32) -> &str; // elided
fn substr<'a>(s: &'a str, until: u32) -> &'a str; // expanded
fn get_str() -> &str; // ILLEGAL, no inputs
fn frob(s: &str, t: &str) -> &str; // ILLEGAL, two inputs
fn frob<'a, 'b>(s: &'a str, t: &'b str) -> &str; // Expanded: Output lifetime is unclear
fn get_mut(&mut self) -> &mut T; // elided
fn get_mut<'a>(&'a mut self) -> &'a mut T; // expanded
fn args<T:ToCStr>(&mut self, args: &[T]) -> &mut Command // elided
fn args<'a, 'b, T:ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command // expanded
fn new(buf: &mut [u8]) -> BufWriter; // elided
fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a> // expanded
```
+157 -48
View File
@@ -1,20 +1,20 @@
% Macros
By now you've learned about many of the tools Rust provides for abstracting and
By now youve learned about many of the tools Rust provides for abstracting and
reusing code. These units of code reuse have a rich semantic structure. For
example, functions have a type signature, type parameters have trait bounds,
and overloaded functions must belong to a particular trait.
This structure means that Rust's core abstractions have powerful compile-time
This structure means that Rusts core abstractions have powerful compile-time
correctness checking. But this comes at the price of reduced flexibility. If
you visually identify a pattern of repeated code, you may find it's difficult
you visually identify a pattern of repeated code, you may find its difficult
or cumbersome to express that pattern as a generic function, a trait, or
anything else within Rust's semantics.
anything else within Rusts semantics.
Macros allow us to abstract at a *syntactic* level. A macro invocation is
Macros allow us to abstract at a syntactic level. A macro invocation is
shorthand for an "expanded" syntactic form. This expansion happens early in
compilation, before any static checking. As a result, macros can capture many
patterns of code reuse that Rust's core abstractions cannot.
patterns of code reuse that Rusts core abstractions cannot.
The drawback is that macro-based code can be harder to understand, because
fewer of the built-in rules apply. Like an ordinary function, a well-behaved
@@ -23,8 +23,8 @@ difficult to design a well-behaved macro! Additionally, compiler errors in
macro code are harder to interpret, because they describe problems in the
expanded code, not the source-level form that developers use.
These drawbacks make macros something of a "feature of last resort". That's not
to say that macros are bad; they are part of Rust because sometimes they're
These drawbacks make macros something of a "feature of last resort". Thats not
to say that macros are bad; they are part of Rust because sometimes theyre
needed for truly concise, well-abstracted code. Just keep this tradeoff in
mind.
@@ -33,14 +33,14 @@ mind.
You may have seen the `vec!` macro, used to initialize a [vector][] with any
number of elements.
[vector]: arrays-vectors-and-slices.html
[vector]: vectors.html
```rust
let x: Vec<u32> = vec![1, 2, 3];
# assert_eq!(x, [1, 2, 3]);
```
This can't be an ordinary function, because it takes any number of arguments.
This cant be an ordinary function, because it takes any number of arguments.
But we can imagine it as syntactic shorthand for
```rust
@@ -57,8 +57,7 @@ let x: Vec<u32> = {
We can implement this shorthand, using a macro: [^actual]
[^actual]: The actual definition of `vec!` in libcollections differs from the
one presented here, for reasons of efficiency and reusability. Some
of these are mentioned in the [advanced macros chapter][].
one presented here, for reasons of efficiency and reusability.
```rust
macro_rules! vec {
@@ -77,20 +76,20 @@ macro_rules! vec {
# }
```
Whoa, that's a lot of new syntax! Let's break it down.
Whoa, thats a lot of new syntax! Lets break it down.
```ignore
macro_rules! vec { ... }
```
This says we're defining a macro named `vec`, much as `fn vec` would define a
function named `vec`. In prose, we informally write a macro's name with an
This says were defining a macro named `vec`, much as `fn vec` would define a
function named `vec`. In prose, we informally write a macros name with an
exclamation point, e.g. `vec!`. The exclamation point is part of the invocation
syntax and serves to distinguish a macro from an ordinary function.
## Matching
The macro is defined through a series of *rules*, which are pattern-matching
The macro is defined through a series of rules, which are pattern-matching
cases. Above, we had
```ignore
@@ -99,14 +98,14 @@ cases. Above, we had
This is like a `match` expression arm, but the matching happens on Rust syntax
trees, at compile time. The semicolon is optional on the last (here, only)
case. The "pattern" on the left-hand side of `=>` is known as a *matcher*.
case. The "pattern" on the left-hand side of `=>` is known as a matcher.
These have [their own little grammar] within the language.
[their own little grammar]: ../reference.html#macros
The matcher `$x:expr` will match any Rust expression, binding that syntax tree
to the *metavariable* `$x`. The identifier `expr` is a *fragment specifier*;
the full possibilities are enumerated in the [advanced macros chapter][].
to the metavariable `$x`. The identifier `expr` is a fragment specifier;
the full possibilities are enumerated later in this chapter.
Surrounding the matcher with `$(...),*` will match zero or more expressions,
separated by commas.
@@ -158,8 +157,8 @@ Each matched expression `$x` will produce a single `push` statement in the
macro expansion. The repetition in the expansion proceeds in "lockstep" with
repetition in the matcher (more on this in a moment).
Because `$x` was already declared as matching an expression, we don't repeat
`:expr` on the right-hand side. Also, we don't include a separating comma as
Because `$x` was already declared as matching an expression, we dont repeat
`:expr` on the right-hand side. Also, we dont include a separating comma as
part of the repetition operator. Instead, we have a terminating semicolon
within the repeated block.
@@ -180,7 +179,7 @@ The outer braces are part of the syntax of `macro_rules!`. In fact, you can use
The inner braces are part of the expanded syntax. Remember, the `vec!` macro is
used in an expression context. To write an expression with multiple statements,
including `let`-bindings, we use a block. If your macro expands to a single
expression, you don't need this extra layer of braces.
expression, you dont need this extra layer of braces.
Note that we never *declared* that the macro produces an expression. In fact,
this is not determined until we use the macro as an expression. With care, you
@@ -194,7 +193,7 @@ The repetition operator follows two principal rules:
1. `$(...)*` walks through one "layer" of repetitions, for all of the `$name`s
it contains, in lockstep, and
2. each `$name` must be under at least as many `$(...)*`s as it was matched
against. If it is under more, it'll be duplicated, as appropriate.
against. If it is under more, itll be duplicated, as appropriate.
This baroque macro illustrates the duplication of variables from outer
repetition levels.
@@ -219,7 +218,7 @@ fn main() {
}
```
That's most of the matcher syntax. These examples use `$(...)*`, which is a
Thats most of the matcher syntax. These examples use `$(...)*`, which is a
"zero or more" match. Alternatively you can write `$(...)+` for a "one or
more" match. Both forms optionally include a separator, which can be any token
except `+` or `*`.
@@ -244,9 +243,9 @@ int main() {
```
After expansion we have `5 * 2 + 3`, and multiplication has greater precedence
than addition. If you've used C macros a lot, you probably know the standard
than addition. If youve used C macros a lot, you probably know the standard
idioms for avoiding this problem, as well as five or six others. In Rust, we
don't have to worry about it.
dont have to worry about it.
```rust
macro_rules! five_times {
@@ -261,8 +260,8 @@ fn main() {
The metavariable `$x` is parsed as a single expression node, and keeps its
place in the syntax tree even after substitution.
Another common problem in macro systems is *variable capture*. Here's a C
macro, using [a GNU C extension] to emulate Rust's expression blocks.
Another common problem in macro systems is variable capture. Heres a C
macro, using [a GNU C extension] to emulate Rusts expression blocks.
[a GNU C extension]: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html
@@ -275,7 +274,7 @@ macro, using [a GNU C extension] to emulate Rust's expression blocks.
})
```
Here's a simple use case that goes terribly wrong:
Heres a simple use case that goes terribly wrong:
```text
const char *state = "reticulating splines";
@@ -315,10 +314,10 @@ fn main() {
```
This works because Rust has a [hygienic macro system][]. Each macro expansion
happens in a distinct *syntax context*, and each variable is tagged with the
syntax context where it was introduced. It's as though the variable `state`
happens in a distinct syntax context, and each variable is tagged with the
syntax context where it was introduced. Its as though the variable `state`
inside `main` is painted a different "color" from the variable `state` inside
the macro, and therefore they don't conflict.
the macro, and therefore they dont conflict.
[hygienic macro system]: http://en.wikipedia.org/wiki/Hygienic_macro
@@ -336,7 +335,7 @@ fn main() {
}
```
Instead you need to pass the variable name into the invocation, so it's tagged
Instead you need to pass the variable name into the invocation, so its tagged
with the right syntax context.
```rust
@@ -368,7 +367,7 @@ fn main() {
# Recursive macros
A macro's expansion can include more macro invocations, including invocations
A macros expansion can include more macro invocations, including invocations
of the very same macro being expanded. These recursive macros are useful for
processing tree-structured input, as illustrated by this (simplistic) HTML
shorthand:
@@ -429,7 +428,7 @@ they are unstable and require feature gates.
Even when Rust code contains un-expanded macros, it can be parsed as a full
[syntax tree][ast]. This property can be very useful for editors and other
tools that process code. It also has a few consequences for the design of
Rust's macro system.
Rusts macro system.
[ast]: glossary.html#abstract-syntax-tree
@@ -454,13 +453,13 @@ consist of valid Rust tokens. Furthermore, parentheses, brackets, and braces
must be balanced within a macro invocation. For example, `foo!([)` is
forbidden. This allows Rust to know where the macro invocation ends.
More formally, the macro invocation body must be a sequence of *token trees*.
More formally, the macro invocation body must be a sequence of token trees.
A token tree is defined recursively as either
* a sequence of token trees surrounded by matching `()`, `[]`, or `{}`, or
* any other single token.
Within a matcher, each metavariable has a *fragment specifier*, identifying
Within a matcher, each metavariable has a fragment specifier, identifying
which syntactic form it matches.
* `ident`: an identifier. Examples: `x`; `foo`.
@@ -482,7 +481,7 @@ There are additional rules regarding the next token after a metavariable:
* `pat` variables must be followed by one of: `=> , =`
* Other variables may be followed by any token.
These rules provide some flexibility for Rust's syntax to evolve without
These rules provide some flexibility for Rusts syntax to evolve without
breaking existing macros.
The macro system does not deal with parse ambiguity at all. For example, the
@@ -500,7 +499,7 @@ One downside is that scoping works differently for macros, compared to other
constructs in the language.
Definition and expansion of macros both happen in a single depth-first,
lexical-order traversal of a crate's source. So a macro defined at module scope
lexical-order traversal of a crates source. So a macro defined at module scope
is visible to any subsequent code in the same module, which includes the body
of any subsequent child `mod` items.
@@ -508,8 +507,8 @@ A macro defined within the body of a single `fn`, or anywhere else not at
module scope, is visible only within that item.
If a module has the `macro_use` attribute, its macros are also visible in its
parent module after the child's `mod` item. If the parent also has `macro_use`
then the macros will be visible in the grandparent after the parent's `mod`
parent module after the childs `mod` item. If the parent also has `macro_use`
then the macros will be visible in the grandparent after the parents `mod`
item, and so forth.
The `macro_use` attribute can also appear on `extern crate`. In this context
@@ -524,7 +523,7 @@ If the attribute is given simply as `#[macro_use]`, all macros are loaded. If
there is no `#[macro_use]` attribute then no macros are loaded. Only macros
defined with the `#[macro_export]` attribute may be loaded.
To load a crate's macros *without* linking it into the output, use `#[no_link]`
To load a crates macros without linking it into the output, use `#[no_link]`
as well.
An example:
@@ -566,7 +565,7 @@ When this library is loaded with `#[macro_use] extern crate`, only `m2` will
be imported.
The Rust Reference has a [listing of macro-related
attributes](../reference.html#macro--and-plugin-related-attributes).
attributes](../reference.html#macro-related-attributes).
# The variable `$crate`
@@ -619,12 +618,12 @@ only appear at the root of your crate, not inside `mod`. This ensures that
The introductory chapter mentioned recursive macros, but it did not give the
full story. Recursive macros are useful for another reason: Each recursive
invocation gives you another opportunity to pattern-match the macro's
invocation gives you another opportunity to pattern-match the macros
arguments.
As an extreme example, it is possible, though hardly advisable, to implement
the [Bitwise Cyclic Tag](http://esolangs.org/wiki/Bitwise_Cyclic_Tag) automaton
within Rust's macro system.
within Rusts macro system.
```rust
macro_rules! bct {
@@ -653,11 +652,121 @@ macro_rules! bct {
Exercise: use macros to reduce duplication in the above definition of the
`bct!` macro.
# Common macros
Here are some common macros youll see in Rust code.
## panic!
This macro causes the current thread to panic. You can give it a message
to panic with:
```rust,no_run
panic!("oh no!");
```
## vec!
The `vec!` macro is used throughout the book, so youve probably seen it
already. It creates `Vec<T>`s with ease:
```rust
let v = vec![1, 2, 3, 4, 5];
```
It also lets you make vectors with repeating values. For example, a hundred
zeroes:
```rust
let v = vec![0; 100];
```
## assert! and assert_eq!
These two macros are used in tests. `assert!` takes a boolean, and `assert_eq!`
takes two values and compares them. Truth passes, success `panic!`s. Like
this:
```rust,no_run
// A-ok!
assert!(true);
assert_eq!(5, 3 + 2);
// nope :(
assert!(5 < 3);
assert_eq!(5, 3);
```
## try!
`try!` is used for error handling. It takes something that can return a
`Result<T, E>`, and gives `T` if its a `Ok<T>`, and `return`s with the
`Err(E)` if its that. Like this:
```rust,no_run
use std::fs::File;
fn foo() -> std::io::Result<()> {
let f = try!(File::create("foo.txt"));
Ok(())
}
```
This is cleaner than doing this:
```rust,no_run
use std::fs::File;
fn foo() -> std::io::Result<()> {
let f = File::create("foo.txt");
let f = match f {
Ok(t) => t,
Err(e) => return Err(e),
};
Ok(())
}
```
## unreachable!
This macro is used when you think some code should never execute:
```rust
if false {
unreachable!();
}
```
Sometimes, the compiler may make you have a different branch that you know
will never, ever run. In these cases, use this macro, so that if you end
up wrong, youll get a `panic!` about it.
```rust
let x: Option<i32> = None;
match x {
Some(_) => unreachable!(),
None => println!("I know x is None!"),
}
```
## unimplemented!
The `unimplemented!` macro can be used when youre trying to get your functions
to typecheck, and dont want to worry about writing out the body of the
function. One example of this situation is implementing a trait with multiple
required methods, where you want to tackle one at a time. Define the others
as `unimplemented!` until youre ready to write them.
# Procedural macros
If Rust's macro system can't do what you need, you may want to write a
[compiler plugin](plugins.html) instead. Compared to `macro_rules!`
If Rusts macro system cant do what you need, you may want to write a
[compiler plugin](compiler-plugins.html) instead. Compared to `macro_rules!`
macros, this is significantly more work, the interfaces are much less stable,
and bugs can be much harder to track down. In exchange you get the
flexibility of running arbitrary Rust code within the compiler. Syntax
extension plugins are sometimes called *procedural macros* for this reason.
extension plugins are sometimes called procedural macros for this reason.
+54 -16
View File
@@ -1,10 +1,8 @@
% Match
Often, a simple `if`/`else` isnt enough, because you have more than two
possible options. Also, `else` conditions can get incredibly complicated, so
whats the solution?
Rust has a keyword, `match`, that allows you to replace complicated `if`/`else`
Often, a simple [`if`][if]/`else` isnt enough, because you have more than two
possible options. Also, conditions can get quite complex. Rust
has a keyword, `match`, that allows you to replace complicated `if`/`else`
groupings with something more powerful. Check it out:
```rust
@@ -20,16 +18,18 @@ match x {
}
```
`match` takes an expression and then branches based on its value. Each *arm* of
[if]: if.html
`match` takes an expression and then branches based on its value. Each arm of
the branch is of the form `val => expression`. When the value matches, that arms
expression will be evaluated. Its called `match` because of the term pattern
matching, which `match` is an implementation of. Theres an [entire section on
patterns][patterns] coming up next, that covers all the options that fit here.
patterns][patterns] that covers all the patterns that are possible here.
[patterns]: patterns.html
So whats the big advantage here? Well, there are a few. First of all, `match`
enforces *exhaustiveness checking*. Do you see that last arm, the one with the
So whats the big advantage? Well, there are a few. First of all, `match`
enforces exhaustiveness checking. Do you see that last arm, the one with the
underscore (`_`)? If we remove that arm, Rust will give us an error:
```text
@@ -37,11 +37,12 @@ error: non-exhaustive patterns: `_` not covered
```
In other words, Rust is trying to tell us we forgot a value. Because `x` is an
integer, Rust knows that it can have a number of different values for example,
`6`. Without the `_`, however, there is no arm that could match, and so Rust refuses
to compile. `_` acts like a catch-all arm. If none of the other arms match,
the arm with `_` will, and since we have this catch-all arm, we now have an arm
for every possible value of `x`, and so our program will compile successfully.
integer, Rust knows that it can have a number of different values for
example, `6`. Without the `_`, however, there is no arm that could match, and
so Rust refuses to compile the code. `_` acts like a catch-all arm. If none
of the other arms match, the arm with `_` will, and since we have this
catch-all arm, we now have an arm for every possible value of `x`, and so our
program will compile successfully.
`match` is also an expression, which means we can use it on the right-hand
side of a `let` binding or directly where an expression is used:
@@ -49,7 +50,7 @@ side of a `let` binding or directly where an expression is used:
```rust
let x = 5;
let numer = match x {
let number = match x {
1 => "one",
2 => "two",
3 => "three",
@@ -59,4 +60,41 @@ let numer = match x {
};
```
Sometimes, its a nice way of converting things.
Sometimes its a nice way of converting something from one type to another.
# Matching on enums
Another important use of the `match` keyword is to process the possible
variants of an enum:
```rust
enum Message {
Quit,
ChangeColor(i32, i32, i32),
Move { x: i32, y: i32 },
Write(String),
}
fn quit() { /* ... */ }
fn change_color(r: i32, g: i32, b: i32) { /* ... */ }
fn move_cursor(x: i32, y: i32) { /* ... */ }
fn process_message(msg: Message) {
match msg {
Message::Quit => quit(),
Message::ChangeColor(r, g, b) => change_color(r, g, b),
Message::Move { x: x, y: y } => move_cursor(x, y),
Message::Write(s) => println!("{}", s),
};
}
```
Again, the Rust compiler checks exhaustiveness, so it demands that you
have a match arm for every variant of the enum. If you leave one off, it
will give you a compile-time error unless you use `_`.
Unlike the previous uses of `match`, you cant use the normal `if`
statement to do this. You can use the [`if let`][if-let] statement,
which can be seen as an abbreviated form of `match`.
[if-let][if-let.html]
+44 -44
View File
@@ -3,27 +3,26 @@
Functions are great, but if you want to call a bunch of them on some data, it
can be awkward. Consider this code:
```{rust,ignore}
```rust,ignore
baz(bar(foo)));
```
We would read this left-to right, and so we see "baz bar foo." But this isn't the
order that the functions would get called in, that's inside-out: "foo bar baz."
Wouldn't it be nice if we could do this instead?
We would read this left-to right, and so we see baz bar foo. But this isnt the
order that the functions would get called in, thats inside-out: foo bar baz.
Wouldnt it be nice if we could do this instead?
```{rust,ignore}
```rust,ignore
foo.bar().baz();
```
Luckily, as you may have guessed with the leading question, you can! Rust provides
the ability to use this *method call syntax* via the `impl` keyword.
the ability to use this method call syntax via the `impl` keyword.
## Method calls
# Method calls
Here's how it works:
Heres how it works:
```{rust}
# #![feature(core)]
```rust
struct Circle {
x: f64,
y: f64,
@@ -44,15 +43,23 @@ fn main() {
This will print `12.566371`.
We've made a struct that represents a circle. We then write an `impl` block,
and inside it, define a method, `area`. Methods take a special first
parameter, of which there are three variants: `self`, `&self`, and `&mut self`.
You can think of this first parameter as being the `foo` in `foo.bar()`. The three
variants correspond to the three kinds of things `foo` could be: `self` if it's
just a value on the stack, `&self` if it's a reference, and `&mut self` if it's
a mutable reference. We should default to using `&self`, as you should prefer
borrowing over taking ownership, as well as taking immutable references
over mutable ones. Here's an example of all three variants:
Weve made a struct that represents a circle. We then write an `impl` block,
and inside it, define a method, `area`.
Methods take a special first parameter, of which there are three variants:
`self`, `&self`, and `&mut self`. You can think of this first parameter as
being the `foo` in `foo.bar()`. The three variants correspond to the three
kinds of things `foo` could be: `self` if its just a value on the stack,
`&self` if its a reference, and `&mut self` if its a mutable reference.
Because we took the `&self` parameter to `area`, we can use it just like any
other parameter. Because we know its a `Circle`, we can access the `radius`
just like we would with any other struct.
We should default to using `&self`, as you should prefer borrowing over taking
ownership, as well as taking immutable references over mutable ones. Heres an
example of all three variants:
```rust
struct Circle {
@@ -76,20 +83,13 @@ impl Circle {
}
```
Finally, as you may remember, the value of the area of a circle is `π*r²`.
Because we took the `&self` parameter to `area`, we can use it just like any
other parameter. Because we know it's a `Circle`, we can access the `radius`
just like we would with any other struct. An import of π and some
multiplications later, and we have our area.
## Chaining method calls
# Chaining method calls
So, now we know how to call a method, such as `foo.bar()`. But what about our
original example, `foo.bar().baz()`? This is called 'method chaining', and we
original example, `foo.bar().baz()`? This is called method chaining, and we
can do it by returning `self`.
```
# #![feature(core)]
struct Circle {
x: f64,
y: f64,
@@ -124,15 +124,15 @@ fn grow(&self) -> Circle {
# Circle } }
```
We just say we're returning a `Circle`. With this method, we can grow a new
We just say were returning a `Circle`. With this method, we can grow a new
circle to any arbitrary size.
## Static methods
# Associated functions
You can also define methods that do not take a `self` parameter. Here's a
pattern that's very common in Rust code:
You can also define associated functions that do not take a `self` parameter.
Heres a pattern thats very common in Rust code:
```
```rust
struct Circle {
x: f64,
y: f64,
@@ -154,20 +154,20 @@ fn main() {
}
```
This *static method* builds a new `Circle` for us. Note that static methods
are called with the `Struct::method()` syntax, rather than the `ref.method()`
syntax.
This associated function builds a new `Circle` for us. Note that associated
functions are called with the `Struct::function()` syntax, rather than the
`ref.method()` syntax. Some other langauges call associated functions static
methods.
## Builder Pattern
# Builder Pattern
Let's say that we want our users to be able to create Circles, but we will
Lets say that we want our users to be able to create Circles, but we will
allow them to only set the properties they care about. Otherwise, the `x`
and `y` attributes will be `0.0`, and the `radius` will be `1.0`. Rust doesn't
and `y` attributes will be `0.0`, and the `radius` will be `1.0`. Rust doesnt
have method overloading, named arguments, or variable arguments. We employ
the builder pattern instead. It looks like this:
```
# #![feature(core)]
struct Circle {
x: f64,
y: f64,
@@ -188,7 +188,7 @@ struct CircleBuilder {
impl CircleBuilder {
fn new() -> CircleBuilder {
CircleBuilder { x: 0.0, y: 0.0, radius: 0.0, }
CircleBuilder { x: 0.0, y: 0.0, radius: 1.0, }
}
fn x(&mut self, coordinate: f64) -> &mut CircleBuilder {
@@ -224,9 +224,9 @@ fn main() {
}
```
What we've done here is make another struct, `CircleBuilder`. We've defined our
builder methods on it. We've also defined our `area()` method on `Circle`. We
What weve done here is make another struct, `CircleBuilder`. Weve defined our
builder methods on it. Weve also defined our `area()` method on `Circle`. We
also made one more method on `CircleBuilder`: `finalize()`. This method creates
our final `Circle` from the builder. Now, we've used the type system to enforce
our final `Circle` from the builder. Now, weve used the type system to enforce
our concerns: we can use the methods on `CircleBuilder` to constrain making
`Circle`s in any way we choose.
+177 -1
View File
@@ -1,3 +1,179 @@
% Mutability
Coming Soon
Mutability, the ability to change something, works a bit differently in Rust
than in other languages. The first aspect of mutability is its non-default
status:
```rust,ignore
let x = 5;
x = 6; // error!
```
We can introduce mutability with the `mut` keyword:
```rust
let mut x = 5;
x = 6; // no problem!
```
This is a mutable [variable binding][vb]. When a binding is mutable, it means
youre allowed to change what the binding points to. So in the above example,
its not so much that the value at `x` is changing, but that the binding
changed from one `i32` to another.
[vb]: variable-bindings.html
If you want to change what the binding points to, youll need a [mutable reference][mr]:
```rust
let mut x = 5;
let y = &mut x;
```
[mr]: references-and-borrowing.html
`y` is an immutable binding to a mutable reference, which means that you cant
bind `y` to something else (`y = &mut z`), but you can mutate the thing thats
bound to `y`. (`*y = 5`) A subtle distinction.
Of course, if you need both:
```rust
let mut x = 5;
let mut y = &mut x;
```
Now `y` can be bound to another value, and the value its referencing can be
changed.
Its important to note that `mut` is part of a [pattern][pattern], so you
can do things like this:
```rust
let (mut x, y) = (5, 6);
fn foo(mut x: i32) {
# }
```
[pattern]: patterns.html
# Interior vs. Exterior Mutability
However, when we say something is immutable in Rust, that doesnt mean that
its not able to be changed: We mean something has exterior mutability. Consider,
for example, [`Arc<T>`][arc]:
```rust
use std::sync::Arc;
let x = Arc::new(5);
let y = x.clone();
```
[arc]: ../std/sync/struct.Arc.html
When we call `clone()`, the `Arc<T>` needs to update the reference count. Yet
weve not used any `mut`s here, `x` is an immutable binding, and we didnt take
`&mut 5` or anything. So what gives?
To understand this, we have to go back to the core of Rusts guiding
philosophy, memory safety, and the mechanism by which Rust guarantees it, the
[ownership][ownership] system, and more specifically, [borrowing][borrowing]:
> You may have one or the other of these two kinds of borrows, but not both at
> the same time:
>
> * one or more references (`&T`) to a resource.
> * exactly one mutable reference (`&mut T`)
[ownership]: ownership.html
[borrowing]: borrowing.html#The-Rules
So, thats the real definition of immutability: is this safe to have two
pointers to? In `Arc<T>`s case, yes: the mutation is entirely contained inside
the structure itself. Its not user facing. For this reason, it hands out `&T`
with `clone()`. If it handed out `&mut T`s, though, that would be a problem.
Other types, like the ones in the [`std::cell`][stdcell] module, have the
opposite: interior mutability. For example:
```rust
use std::cell::RefCell;
let x = RefCell::new(42);
let y = x.borrow_mut();
```
[stdcell]: ../std/cell/index.html
RefCell hands out `&mut` references to whats inside of it with the
`borrow_mut()` method. Isnt that dangerous? What if we do:
```rust,ignore
use std::cell::RefCell;
let x = RefCell::new(42);
let y = x.borrow_mut();
let z = x.borrow_mut();
# (y, z);
```
This will in fact panic, at runtime. This is what `RefCell` does: it enforces
Rusts borrowing rules at runtime, and `panic!`s if theyre violated. This
allows us to get around another aspect of Rusts mutability rules. Lets talk
about it first.
## Field-level mutability
Mutability is a property of either a borrow (`&mut`) or a binding (`let mut`).
This means that, for example, you cannot have a [`struct`][struct] with
some fields mutable and some immutable:
```rust,ignore
struct Point {
x: i32,
mut y: i32, // nope
}
```
The mutability of a struct is in its binding:
```rust,ignore
struct Point {
x: i32,
y: i32,
}
let mut a = Point { x: 5, y: 6 };
a.x = 10;
let b = Point { x: 5, y: 6};
b.x = 10; // error: cannot assign to immutable field `b.x`
```
[struct]: structs.html
However, by using `Cell<T>`, you can emulate field-level mutability:
```
use std::cell::Cell;
struct Point {
x: i32,
y: Cell<i32>,
}
let point = Point { x: 5, y: Cell::new(6) };
point.y.set(7);
println!("y: {:?}", point.y);
```
This will print `y: Cell { value: 7 }`. Weve successfully updated `y`.
+6 -9
View File
@@ -9,16 +9,16 @@ process, see [Stability as a deliverable][stability].
To install nightly Rust, you can use `rustup.sh`:
```bash
$ curl -s https://static.rust-lang.org/rustup.sh | sudo sh -s -- --channel=nightly
$ curl -s https://static.rust-lang.org/rustup.sh | sh -s -- --channel=nightly
```
If you're concerned about the [potential insecurity][insecurity] of using `curl
| sudo sh`, please keep reading and see our disclaimer below. And feel free to
| sh`, please keep reading and see our disclaimer below. And feel free to
use a two-step version of the installation and examine our installation script:
```bash
$ curl -f -L https://static.rust-lang.org/rustup.sh -O
$ sudo sh rustup.sh --channel=nightly
$ sh rustup.sh --channel=nightly
```
[insecurity]: http://curlpipesh.tumblr.com
@@ -43,13 +43,11 @@ If you used the Windows installer, just re-run the `.msi` and it will give you
an uninstall option.
Some people, and somewhat rightfully so, get very upset when we tell you to
`curl | sudo sh`. Basically, when you do this, you are trusting that the good
`curl | sh`. Basically, when you do this, you are trusting that the good
people who maintain Rust aren't going to hack your computer and do bad things.
That's a good instinct! If you're one of those people, please check out the
documentation on [building Rust from Source][from source], or [the official
binary downloads][install page]. And we promise that this method will not be
the way to install Rust forever: it's just the easiest way to keep people
updated while Rust is in its alpha state.
binary downloads][install page].
[from source]: https://github.com/rust-lang/rust#building-from-source
[install page]: http://www.rust-lang.org/install.html
@@ -93,8 +91,7 @@ If not, there are a number of places where you can get help. The easiest is
[the #rust IRC channel on irc.mozilla.org][irc], which you can access through
[Mibbit][mibbit]. Click that link, and you'll be chatting with other Rustaceans
(a silly nickname we call ourselves), and we can help you out. Other great
resources include [the users forum][users], and [Stack Overflow][stack
overflow].
resources include [the users forum][users], and [Stack Overflow][stack overflow].
[irc]: irc://irc.mozilla.org/#rust
[mibbit]: http://chat.mibbit.com/?server=irc.mozilla.org&channel=%23rust
+81 -1
View File
@@ -1,3 +1,83 @@
% Operators and Overloading
Coming soon!
Rust allows for a limited form of operator overloading. There are certain
operators that are able to be overloaded. To support a particular operator
between types, theres a specific trait that you can implement, which then
overloads the operator.
For example, the `+` operator can be overloaded with the `Add` trait:
```rust
use std::ops::Add;
#[derive(Debug)]
struct Point {
x: i32,
y: i32,
}
impl Add for Point {
type Output = Point;
fn add(self, other: Point) -> Point {
Point { x: self.x + other.x, y: self.y + other.y }
}
}
fn main() {
let p1 = Point { x: 1, y: 0 };
let p2 = Point { x: 2, y: 3 };
let p3 = p1 + p2;
println!("{:?}", p3);
}
```
In `main`, we can use `+` on our two `Point`s, since weve implemented
`Add<Output=Point>` for `Point`.
There are a number of operators that can be overloaded this way, and all of
their associated traits live in the [`std::ops`][stdops] module. Check out its
documentation for the full list.
[stdops]: ../std/ops/index.html
Implementing these traits follows a pattern. Lets look at [`Add`][add] in more
detail:
```rust
# mod foo {
pub trait Add<RHS = Self> {
type Output;
fn add(self, rhs: RHS) -> Self::Output;
}
# }
```
[add]: ../std/ops/trait.Add.html
Theres three types in total involved here: the type you `impl Add` for, `RHS`,
which defaults to `Self`, and `Output`. For an expression `let z = x + y`, `x`
is the `Self` type, `y` is the RHS, and `z` is the `Self::Output` type.
```rust
# struct Point;
# use std::ops::Add;
impl Add<i32> for Point {
type Output = f64;
fn add(self, rhs: i32) -> f64 {
// add an i32 to a Point and get an f64
# 1.0
}
}
```
will let you do this:
```rust,ignore
let p: Point = // ...
let x: f64 = p + 2i32;
```
+128 -475
View File
@@ -1,555 +1,208 @@
% Ownership
This guide presents Rust's ownership system. This is one of Rust's most unique
and compelling features, with which Rust developers should become quite
acquainted. Ownership is how Rust achieves its largest goal, memory safety.
The ownership system has a few distinct concepts: *ownership*, *borrowing*,
and *lifetimes*. We'll talk about each one in turn.
This guide is one of three presenting Rusts ownership system. This is one of
Rusts most unique and compelling features, with which Rust developers should
become quite acquainted. Ownership is how Rust achieves its largest goal,
memory safety. There are a few distinct concepts, each with its own
chapter:
* ownership, which youre reading now
* [borrowing][borrowing], and their associated feature references
* [lifetimes][lifetimes], an advanced concept of borrowing
These three chapters are related, and in order. Youll need all three to fully
understand the ownership system.
[borrowing]: references-and-borrowing.html
[lifetimes]: lifetimes.html
# Meta
Before we get to the details, two important notes about the ownership system.
Rust has a focus on safety and speed. It accomplishes these goals through many
*zero-cost abstractions*, which means that in Rust, abstractions cost as little
zero-cost abstractions, which means that in Rust, abstractions cost as little
as possible in order to make them work. The ownership system is a prime example
of a zero cost abstraction. All of the analysis we'll talk about in this guide
of a zero-cost abstraction. All of the analysis well talk about in this guide
is _done at compile time_. You do not pay any run-time cost for any of these
features.
However, this system does have a certain cost: learning curve. Many new users
to Rust experience something we like to call "fighting with the borrow
checker," where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmer's mental model of
how ownership should work doesn't match the actual rules that Rust implements.
to Rust experience something we like to call fighting with the borrow
checker, where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmers mental model of
how ownership should work doesnt match the actual rules that Rust implements.
You probably will experience similar things at first. There is good news,
however: more experienced Rust developers report that once they work with the
rules of the ownership system for a period of time, they fight the borrow
checker less and less.
With that in mind, let's learn about ownership.
With that in mind, lets learn about ownership.
# Ownership
At its core, ownership is about *resources*. For the purposes of the vast
majority of this guide, we will talk about a specific resource: memory. The
concept generalizes to any kind of resource, like a file handle, but to make it
more concrete, we'll focus on memory.
When your program allocates some memory, it needs some way to deallocate that
memory. Imagine a function `foo` that allocates four bytes of memory, and then
never deallocates that memory. We call this problem *leaking* memory, because
each time we call `foo`, we're allocating another four bytes. Eventually, with
enough calls to `foo`, we will run our system out of memory. That's no good. So
we need some way for `foo` to deallocate those four bytes. It's also important
that we don't deallocate too many times, either. Without getting into the
details, attempting to deallocate memory multiple times can lead to problems.
In other words, any time some memory is allocated, we need to make sure that we
deallocate that memory once and only once. Too many times is bad, not enough
times is bad. The counts must match.
There's one other important detail with regards to allocating memory. Whenever
we request some amount of memory, what we are given is a handle to that memory.
This handle (often called a *pointer*, when we're referring to memory) is how
we interact with the allocated memory. As long as we have that handle, we can
do something with the memory. Once we're done with the handle, we're also done
with the memory, as we can't do anything useful without a handle to it.
Historically, systems programming languages require you to track these
allocations, deallocations, and handles yourself. For example, if we want some
memory from the heap in a language like C, we do this:
```c
{
int *x = malloc(sizeof(int));
// we can now do stuff with our handle x
*x = 5;
free(x);
}
```
The call to `malloc` allocates some memory. The call to `free` deallocates the
memory. There's also bookkeeping about allocating the correct amount of memory.
Rust combines these two aspects of allocating memory (and other resources) into
a concept called *ownership*. Whenever we request some memory, that handle we
receive is called the *owning handle*. Whenever that handle goes out of scope,
Rust knows that you cannot do anything with the memory anymore, and so
therefore deallocates the memory for you. Here's the equivalent example in
Rust:
[Variable bindings][bindings] have a property in Rust: they have ownership
of what theyre bound to. This means that when a binding goes out of scope, the
resource that theyre bound to are freed. For example:
```rust
{
let x = Box::new(5);
fn foo() {
let v = vec![1, 2, 3];
}
```
The `Box::new` function creates a `Box<T>` (specifically `Box<i32>` in this
case) by allocating a small segment of memory on the heap with enough space to
fit an `i32`. But where in the code is the box deallocated? We said before that
we must have a deallocation for each allocation. Rust handles this for you. It
knows that our handle, `x`, is the owning reference to our box. Rust knows that
`x` will go out of scope at the end of the block, and so it inserts a call to
deallocate the memory at the end of the scope. Because the compiler does this
for us, it's impossible to forget. We always have exactly one deallocation
paired with each of our allocations.
When `v` comes into scope, a new [`Vec<T>`][vect] is created. In this case, the
vector also allocates space on [the heap][heap], for the three elements. When
`v` goes out of scope at the end of `foo()`, Rust will clean up everything
related to the vector, even the heap-allocated memory. This happens
deterministically, at the end of the scope.
This is pretty straightforward, but what happens when we want to pass our box
to a function? Let's look at some code:
[vect]: ../std/vec/struct.Vec.html
[heap]: the-stack-and-the-heap.html
[bindings]: variable-bindings.html
# Move semantics
Theres some more subtlety here, though: Rust ensures that there is _exactly
one_ binding to any given resource. For example, if we have a vector, we can
assign it to another binding:
```rust
fn main() {
let x = Box::new(5);
let v = vec![1, 2, 3];
add_one(x);
}
fn add_one(mut num: Box<i32>) {
*num += 1;
}
let v2 = v;
```
This code works, but it's not ideal. For example, let's add one more line of
code, where we print out the value of `x`:
But, if we try to use `v` afterwards, we get an error:
```{rust,ignore}
fn main() {
let x = Box::new(5);
```rust,ignore
let v = vec![1, 2, 3];
add_one(x);
let v2 = v;
println!("{}", x);
}
fn add_one(mut num: Box<i32>) {
*num += 1;
}
println!("v[0] is: {}", v[0]);
```
This does not compile, and gives us an error:
It looks like this:
```text
error: use of moved value: `x`
println!("{}", x);
^
error: use of moved value: `v`
println!("v[0] is: {}", v[0]);
^
```
Remember, we need one deallocation for every allocation. When we try to pass
our box to `add_one`, we would have two handles to the memory: `x` in `main`,
and `num` in `add_one`. If we deallocated the memory when each handle went out
of scope, we would have two deallocations and one allocation, and that's wrong.
So when we call `add_one`, Rust defines `num` as the owner of the handle. And
so, now that we've given ownership to `num`, `x` is invalid. `x`'s value has
"moved" from `x` to `num`. Hence the error: use of moved value `x`.
A similar thing happens if we define a function which takes ownership, and
try to use something after weve passed it as an argument:
To fix this, we can have `add_one` give ownership back when it's done with the
box:
```rust,ignore
fn take(v: Vec<i32>) {
// what happens here isnt important.
}
let v = vec![1, 2, 3];
take(v);
println!("v[0] is: {}", v[0]);
```
Same error: use of moved value. When we transfer ownership to something else,
we say that weve moved the thing we refer to. You dont need some sort of
special annotation here, its the default thing that Rust does.
## The details
The reason that we cannot use a binding after weve moved it is subtle, but
important. When we write code like this:
```rust
fn main() {
let x = Box::new(5);
let v = vec![1, 2, 3];
let y = add_one(x);
println!("{}", y);
}
fn add_one(mut num: Box<i32>) -> Box<i32> {
*num += 1;
num
}
let v2 = v;
```
This code will compile and run just fine. Now, we return a `box`, and so the
ownership is transferred back to `y` in `main`. We only have ownership for the
duration of our function before giving it back. This pattern is very common,
and so Rust introduces a concept to describe a handle which temporarily refers
to something another handle owns. It's called *borrowing*, and it's done with
*references*, designated by the `&` symbol.
The first line allocates memory for the vector object, `v`, and for the data it
contains. The vector object is stored on the [stack][sh] and contains a pointer
to the content (`[1, 2, 3]`) stored on the [heap][sh]. When we move `v` to `v2`,
it creates a copy of that pointer, for `v2`. Which means that there would be two
pointers to the content of the vector on the heap. It would violate Rusts
safety guarantees by introducing a data race. Therefore, Rust forbids using `v`
after weve done the move.
# Borrowing
[sh]: the-stack-and-the-heap.html
Here's the current state of our `add_one` function:
Its also important to note that optimizations may remove the actual copy of
the bytes on the stack, depending on circumstances. So it may not be as
inefficient as it initially seems.
## `Copy` types
Weve established that when ownership is transferred to another binding, you
cannot use the original binding. However, theres a [trait][traits] that changes this
behavior, and its called `Copy`. We havent discussed traits yet, but for now,
you can think of them as an annotation to a particular type that adds extra
behavior. For example:
```rust
fn add_one(mut num: Box<i32>) -> Box<i32> {
*num += 1;
let v = 1;
num
}
let v2 = v;
println!("v is: {}", v);
```
This function takes ownership, because it takes a `Box`, which owns its
contents. But then we give ownership right back.
In this case, `v` is an `i32`, which implements the `Copy` trait. This means
that, just like a move, when we assign `v` to `v2`, a copy of the data is made.
But, unlike a move, we can still use `v` afterward. This is because an `i32`
has no pointers to data somewhere else, copying it is a full copy.
In the physical world, you can give one of your possessions to someone for a
short period of time. You still own your possession, you're just letting someone
else use it for a while. We call that *lending* something to someone, and that
person is said to be *borrowing* that something from you.
We will discuss how to make your own types `Copy` in the [traits][traits]
section.
Rust's ownership system also allows an owner to lend out a handle for a limited
period. This is also called *borrowing*. Here's a version of `add_one` which
borrows its argument rather than taking ownership:
[traits]: traits.html
# More than ownership
Of course, if we had to hand ownership back with every function we wrote:
```rust
fn add_one(num: &mut i32) {
*num += 1;
fn foo(v: Vec<i32>) -> Vec<i32> {
// do stuff with v
// hand back ownership
v
}
```
This function borrows an `i32` from its caller, and then increments it. When
the function is over, and `num` goes out of scope, the borrow is over.
We have to change our `main` a bit too:
This would get very tedious. It gets worse the more things we want to take ownership of:
```rust
fn main() {
let mut x = 5;
fn foo(v1: Vec<i32>, v2: Vec<i32>) -> (Vec<i32>, Vec<i32>, i32) {
// do stuff with v1 and v2
add_one(&mut x);
println!("{}", x);
// hand back ownership, and the result of our function
(v1, v2, 42)
}
fn add_one(num: &mut i32) {
*num += 1;
}
let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];
let (v1, v2, answer) = foo(v1, v2);
```
We don't need to assign the result of `add_one()` anymore, because it doesn't
return anything anymore. This is because we're not passing ownership back,
since we just borrow, not take ownership.
Ugh! The return type, return line, and calling the function gets way more
complicated.
# Lifetimes
Luckily, Rust offers a feature, borrowing, which helps us solve this problem.
Its the topic of the next section!
Lending out a reference to a resource that someone else owns can be
complicated, however. For example, imagine this set of operations:
1. I acquire a handle to some kind of resource.
2. I lend you a reference to the resource.
3. I decide I'm done with the resource, and deallocate it, while you still have
your reference.
4. You decide to use the resource.
Uh oh! Your reference is pointing to an invalid resource. This is called a
*dangling pointer* or "use after free," when the resource is memory.
To fix this, we have to make sure that step four never happens after step
three. The ownership system in Rust does this through a concept called
*lifetimes*, which describe the scope that a reference is valid for.
Remember the function that borrowed an `i32`? Let's look at it again.
```rust
fn add_one(num: &mut i32) {
*num += 1;
}
```
Rust has a feature called *lifetime elision*, which allows you to not write
lifetime annotations in certain circumstances. This is one of them. We will
cover the others later. Without eliding the lifetimes, `add_one` looks like
this:
```rust
fn add_one<'a>(num: &'a mut i32) {
*num += 1;
}
```
The `'a` is called a *lifetime*. Most lifetimes are used in places where
short names like `'a`, `'b` and `'c` are clearest, but it's often useful to
have more descriptive names. Let's dig into the syntax in a bit more detail:
```{rust,ignore}
fn add_one<'a>(...)
```
This part _declares_ our lifetimes. This says that `add_one` has one lifetime,
`'a`. If we had two, it would look like this:
```{rust,ignore}
fn add_two<'a, 'b>(...)
```
Then in our parameter list, we use the lifetimes we've named:
```{rust,ignore}
...(num: &'a mut i32)
```
If you compare `&mut i32` to `&'a mut i32`, they're the same, it's just that the
lifetime `'a` has snuck in between the `&` and the `mut i32`. We read `&mut i32` as "a
mutable reference to an i32" and `&'a mut i32` as "a mutable reference to an i32 with the lifetime 'a.'"
Why do lifetimes matter? Well, for example, here's some code:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // this is the same as `let _y = 5; let y = &_y;`
let f = Foo { x: y };
println!("{}", f.x);
}
```
As you can see, `struct`s can also have lifetimes. In a similar way to functions,
```{rust}
struct Foo<'a> {
# x: &'a i32,
# }
```
declares a lifetime, and
```rust
# struct Foo<'a> {
x: &'a i32,
# }
```
uses it. So why do we need a lifetime here? We need to ensure that any reference
to a `Foo` cannot outlive the reference to an `i32` it contains.
## Thinking in scopes
A way to think about lifetimes is to visualize the scope that a reference is
valid for. For example:
```rust
fn main() {
let y = &5; // -+ y goes into scope
// |
// stuff // |
// |
} // -+ y goes out of scope
```
Adding in our `Foo`:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // -+ y goes into scope
let f = Foo { x: y }; // -+ f goes into scope
// stuff // |
// |
} // -+ f and y go out of scope
```
Our `f` lives within the scope of `y`, so everything works. What if it didn't?
This code won't work:
```{rust,ignore}
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let x; // -+ x goes into scope
// |
{ // |
let y = &5; // ---+ y goes into scope
let f = Foo { x: y }; // ---+ f goes into scope
x = &f.x; // | | error here
} // ---+ f and y go out of scope
// |
println!("{}", x); // |
} // -+ x goes out of scope
```
Whew! As you can see here, the scopes of `f` and `y` are smaller than the scope
of `x`. But when we do `x = &f.x`, we make `x` a reference to something that's
about to go out of scope.
Named lifetimes are a way of giving these scopes a name. Giving something a
name is the first step towards being able to talk about it.
## 'static
The lifetime named *static* is a special lifetime. It signals that something
has the lifetime of the entire program. Most Rust programmers first come across
`'static` when dealing with strings:
```rust
let x: &'static str = "Hello, world.";
```
String literals have the type `&'static str` because the reference is always
alive: they are baked into the data segment of the final binary. Another
example are globals:
```rust
static FOO: i32 = 5;
let x: &'static i32 = &FOO;
```
This adds an `i32` to the data segment of the binary, and `x` is a reference
to it.
# Shared Ownership
In all the examples we've considered so far, we've assumed that each handle has
a singular owner. But sometimes, this doesn't work. Consider a car. Cars have
four wheels. We would want a wheel to know which car it was attached to. But
this won't work:
```{rust,ignore}
struct Car {
name: String,
}
struct Wheel {
size: i32,
owner: Car,
}
fn main() {
let car = Car { name: "DeLorean".to_string() };
for _ in 0..4 {
Wheel { size: 360, owner: car };
}
}
```
We try to make four `Wheel`s, each with a `Car` that it's attached to. But the
compiler knows that on the second iteration of the loop, there's a problem:
```text
error: use of moved value: `car`
Wheel { size: 360, owner: car };
^~~
note: `car` moved here because it has type `Car`, which is non-copyable
Wheel { size: 360, owner: car };
^~~
```
We need our `Car` to be pointed to by multiple `Wheel`s. We can't do that with
`Box<T>`, because it has a single owner. We can do it with `Rc<T>` instead:
```rust
use std::rc::Rc;
struct Car {
name: String,
}
struct Wheel {
size: i32,
owner: Rc<Car>,
}
fn main() {
let car = Car { name: "DeLorean".to_string() };
let car_owner = Rc::new(car);
for _ in 0..4 {
Wheel { size: 360, owner: car_owner.clone() };
}
}
```
We wrap our `Car` in an `Rc<T>`, getting an `Rc<Car>`, and then use the
`clone()` method to make new references. We've also changed our `Wheel` to have
an `Rc<Car>` rather than just a `Car`.
This is the simplest kind of multiple ownership possible. For example, there's
also `Arc<T>`, which uses more expensive atomic instructions to be the
thread-safe counterpart of `Rc<T>`.
## Lifetime Elision
Rust supports powerful local type inference in function bodies, but its
forbidden in item signatures to allow reasoning about the types just based in
the item signature alone. However, for ergonomic reasons a very restricted
secondary inference algorithm called “lifetime elision” applies in function
signatures. It infers only based on the signature components themselves and not
based on the body of the function, only infers lifetime parameters, and does
this with only three easily memorizable and unambiguous rules. This makes
lifetime elision a shorthand for writing an item signature, while not hiding
away the actual types involved as full local inference would if applied to it.
When talking about lifetime elision, we use the term *input lifetime* and
*output lifetime*. An *input lifetime* is a lifetime associated with a parameter
of a function, and an *output lifetime* is a lifetime associated with the return
value of a function. For example, this function has an input lifetime:
```{rust,ignore}
fn foo<'a>(bar: &'a str)
```
This one has an output lifetime:
```{rust,ignore}
fn foo<'a>() -> &'a str
```
This one has a lifetime in both positions:
```{rust,ignore}
fn foo<'a>(bar: &'a str) -> &'a str
```
Here are the three rules:
* Each elided lifetime in a function's arguments becomes a distinct lifetime
parameter.
* If there is exactly one input lifetime, elided or not, that lifetime is
assigned to all elided lifetimes in the return values of that function.
* If there are multiple input lifetimes, but one of them is `&self` or `&mut
self`, the lifetime of `self` is assigned to all elided output lifetimes.
Otherwise, it is an error to elide an output lifetime.
### Examples
Here are some examples of functions with elided lifetimes. We've paired each
example of an elided lifetime with its expanded form.
```{rust,ignore}
fn print(s: &str); // elided
fn print<'a>(s: &'a str); // expanded
fn debug(lvl: u32, s: &str); // elided
fn debug<'a>(lvl: u32, s: &'a str); // expanded
// In the preceding example, `lvl` doesn't need a lifetime because it's not a
// reference (`&`). Only things relating to references (such as a `struct`
// which contains a reference) need lifetimes.
fn substr(s: &str, until: u32) -> &str; // elided
fn substr<'a>(s: &'a str, until: u32) -> &'a str; // expanded
fn get_str() -> &str; // ILLEGAL, no inputs
fn frob(s: &str, t: &str) -> &str; // ILLEGAL, two inputs
fn frob<'a, 'b>(s: &'a str, t: &'b str) -> &str; // Expanded: Output lifetime is unclear
fn get_mut(&mut self) -> &mut T; // elided
fn get_mut<'a>(&'a mut self) -> &'a mut T; // expanded
fn args<T:ToCStr>(&mut self, args: &[T]) -> &mut Command // elided
fn args<'a, 'b, T:ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command // expanded
fn new(buf: &mut [u8]) -> BufWriter; // elided
fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a> // expanded
```
# Related Resources
Coming Soon.
+66 -6
View File
@@ -21,6 +21,8 @@ match x {
}
```
This prints `one`.
# Multiple patterns
You can match multiple patterns with `|`:
@@ -35,6 +37,8 @@ match x {
}
```
This prints `one or two`.
# Ranges
You can match a range of values with `...`:
@@ -48,12 +52,25 @@ match x {
}
```
Ranges are mostly used with integers and single characters.
This prints `one through five`.
Ranges are mostly used with integers and `char`s:
```rust
let x = '💅';
match x {
'a' ... 'j' => println!("early letter"),
'k' ... 'z' => println!("late letter"),
_ => println!("something else"),
}
```
This prints `something else`
# Bindings
If youre matching multiple things, via a `|` or a `...`, you can bind
the value to a name with `@`:
You can bind values to names with `@`:
```rust
let x = 1;
@@ -64,6 +81,37 @@ match x {
}
```
This prints `got a range element 1`. This is useful when you want to
do a complicated match of part of a data structure:
```rust
#[derive(Debug)]
struct Person {
name: Option<String>,
}
let name = "Steve".to_string();
let mut x: Option<Person> = Some(Person { name: Some(name) });
match x {
Some(Person { name: ref a @ Some(_), .. }) => println!("{:?}", a),
_ => {}
}
```
This prints `Some("Steve")`: Weve bound the inner `name` to `a`.
If you use `@` with `|`, you need to make sure the name is bound in each part
of the pattern:
```rust
let x = 5;
match x {
e @ 1 ... 5 | e @ 8 ... 10 => println!("got a range element {}", e),
_ => println!("anything"),
}
```
# Ignoring variants
If youre matching on an enum which has variants, you can use `..` to
@@ -83,6 +131,8 @@ match x {
}
```
This prints `Got an int!`.
# Guards
You can introduce match guards with `if`:
@@ -102,6 +152,8 @@ match x {
}
```
This prints `Got an int!`
# ref and ref mut
If you want to get a [reference][ref], use the `ref` keyword:
@@ -114,6 +166,8 @@ match x {
}
```
This prints `Got a reference to 5`.
[ref]: references-and-borrowing.html
Here, the `r` inside the `match` has the type `&i32`. In other words, the `ref`
@@ -130,7 +184,7 @@ match x {
# Destructuring
If you have a compound data type, like a `struct`, you can destructure it
If you have a compound data type, like a [`struct`][struct], you can destructure it
inside of a pattern:
```rust
@@ -146,6 +200,8 @@ match origin {
}
```
[struct]: structs.html
If we only care about some of the values, we dont have to give them all names:
```rust
@@ -161,6 +217,8 @@ match origin {
}
```
This prints `x is 0`.
You can do this kind of match on any member, not just the first:
```rust
@@ -176,6 +234,8 @@ match origin {
}
```
This prints `y is 0`.
This destructuring behavior works on any compound data type, like
[tuples][tuples] or [enums][enums].
@@ -187,10 +247,10 @@ This destructuring behavior works on any compound data type, like
Whew! Thats a lot of different ways to match things, and they can all be
mixed and matched, depending on what youre doing:
```{rust,ignore}
```rust,ignore
match x {
Foo { x: Some(ref name), y: None } => ...
}
```
Patterns are very powerful. Make good use of them.
Patterns are very powerful. Make good use of them.
+42 -17
View File
@@ -15,7 +15,7 @@ let x = true;
let y: bool = false;
```
A common use of booleans is in [`if` statements][if].
A common use of booleans is in [`if` conditionals][if].
[if]: if.html
@@ -62,14 +62,14 @@ let y = 1.0; // y has type f64
Heres a list of the different numeric types, with links to their documentation
in the standard library:
* [i8](../std/primitive.i8.html)
* [i16](../std/primitive.i16.html)
* [i32](../std/primitive.i32.html)
* [i64](../std/primitive.i64.html)
* [i8](../std/primitive.i8.html)
* [u8](../std/primitive.u8.html)
* [u16](../std/primitive.u16.html)
* [u32](../std/primitive.u32.html)
* [u64](../std/primitive.u64.html)
* [u8](../std/primitive.u8.html)
* [isize](../std/primitive.isize.html)
* [usize](../std/primitive.usize.html)
* [f32](../std/primitive.f32.html)
@@ -82,12 +82,12 @@ Lets go over them by category:
Integer types come in two varieties: signed and unsigned. To understand the
difference, lets consider a number with four bits of size. A signed, four-bit
number would let you store numbers from `-8` to `+7`. Signed numbers use
twos compliment representation. An unsigned four bit number, since it does
twos complement representation. An unsigned four bit number, since it does
not need to store negatives, can store values from `0` to `+15`.
Unsigned types use a `u` for their category, and signed types use `i`. The `i`
is for integer. So `u8` is an eight-bit unsigned number, and `i8` is an
eight-bit signed number.
eight-bit signed number.
## Fixed size types
@@ -103,7 +103,7 @@ and unsigned varieties. This makes for two types: `isize` and `usize`.
## Floating-point types
Rust also two floating point types: `f32` and `f64`. These correspond to
Rust also has two floating point types: `f32` and `f64`. These correspond to
IEEE-754 single and double precision numbers.
# Arrays
@@ -168,6 +168,7 @@ like arrays:
```rust
let a = [0, 1, 2, 3, 4];
let middle = &a[1..4]; // A slice of a: just the elements 1, 2, and 3
let complete = &a[..]; // A slice containing all of the elements in a
```
Slices have type `&[T]`. Well talk about that `T` when we cover
@@ -175,7 +176,7 @@ Slices have type `&[T]`. Well talk about that `T` when we cover
[generics]: generics.html
You can find more documentation for `slices`s [in the standard library
You can find more documentation for slices [in the standard library
documentation][slice].
[slice]: ../std/primitive.slice.html
@@ -216,6 +217,18 @@ In systems programming languages, strings are a bit more complex than in other
languages. For now, just read `&str` as a *string slice*, and well learn more
soon.
You can assign one tuple into another, if they have the same contained types
and [arity]. Tuples have the same arity when they have the same length.
[arity]: glossary.html#arity
```rust
let mut x = (1, 2); // x: (i32, i32)
let y = (2, 3); // y: (i32, i32)
x = y;
```
You can access the fields in a tuple through a *destructuring let*. Heres
an example:
@@ -228,27 +241,39 @@ println!("x is {}", x);
Remember [before][let] when I said the left-hand side of a `let` statement was more
powerful than just assigning a binding? Here we are. We can put a pattern on
the left-hand side of the `let`, and if it matches up to the right-hand side,
we can assign multiple bindings at once. In this case, `let` "destructures,"
or "breaks up," the tuple, and assigns the bits to three bindings.
we can assign multiple bindings at once. In this case, `let` destructures
or breaks up the tuple, and assigns the bits to three bindings.
[let]: variable-bindings.html
This pattern is very powerful, and well see it repeated more later.
There are also a few things you can do with a tuple as a whole, without
destructuring. You can assign one tuple into another, if they have the same
contained types and [arity]. Tuples have the same arity when they have the same
length.
You can disambiguate a single-element tuple from a value in parentheses with a
comma:
```
(0,); // single-element tuple
(0); // zero in parentheses
```
## Tuple Indexing
You can also access fields of a tuple with indexing syntax:
[arity]: glossary.html#arity
```rust
let mut x = (1, 2); // x: (i32, i32)
let y = (2, 3); // y: (i32, i32)
let tuple = (1, 2, 3);
x = y;
let x = tuple.0;
let y = tuple.1;
let z = tuple.2;
println!("x is {}", x);
```
Like array indexing, it starts at zero, but unlike array indexing, it uses a
`.`, rather than `[]`s.
You can find more documentation for tuples [in the standard library
documentation][tuple].
+122
View File
@@ -0,0 +1,122 @@
% Raw Pointers
Rust has a number of different smart pointer types in its standard library, but
there are two types that are extra-special. Much of Rusts safety comes from
compile-time checks, but raw pointers dont have such guarantees, and are
[unsafe][unsafe] to use.
`*const T` and `*mut T` are called raw pointers in Rust. Sometimes, when
writing certain kinds of libraries, youll need to get around Rusts safety
guarantees for some reason. In this case, you can use raw pointers to implement
your library, while exposing a safe interface for your users. For example, `*`
pointers are allowed to alias, allowing them to be used to write
shared-ownership types, and even thread-safe shared memory types (the `Rc<T>`
and `Arc<T>` types are both implemented entirely in Rust).
Here are some things to remember about raw pointers that are different than
other pointer types. They:
- are not guaranteed to point to valid memory and are not even
guaranteed to be non-null (unlike both `Box` and `&`);
- do not have any automatic clean-up, unlike `Box`, and so require
manual resource management;
- are plain-old-data, that is, they don't move ownership, again unlike
`Box`, hence the Rust compiler cannot protect against bugs like
use-after-free;
- lack any form of lifetimes, unlike `&`, and so the compiler cannot
reason about dangling pointers; and
- have no guarantees about aliasing or mutability other than mutation
not being allowed directly through a `*const T`.
# Basics
Creating a raw pointer is perfectly safe:
```rust
let x = 5;
let raw = &x as *const i32;
let mut y = 10;
let raw_mut = &mut y as *mut i32;
```
However, dereferencing one is not. This wont work:
```rust,ignore
let x = 5;
let raw = &x as *const i32;
println!("raw points at {}", *raw);
```
It gives this error:
```text
error: dereference of unsafe pointer requires unsafe function or block [E0133]
println!("raw points at{}", *raw);
^~~~
```
When you dereference a raw pointer, youre taking responsibility that its not
pointing somewhere that would be incorrect. As such, you need `unsafe`:
```rust
let x = 5;
let raw = &x as *const i32;
let points_at = unsafe { *raw };
println!("raw points at {}", points_at);
```
For more operations on raw pointers, see [their API documentation][rawapi].
[unsafe]: unsafe.html
[rawapi]: ../std/primitive.pointer.html
# FFI
Raw pointers are useful for FFI: Rusts `*const T` and `*mut T` are similar to
Cs `const T*` and `T*`, respectfully. For more about this use, consult the
[FFI chapter][ffi].
[ffi]: ffi.html
# References and raw pointers
At runtime, a raw pointer `*` and a reference pointing to the same piece of
data have an identical representation. In fact, an `&T` reference will
implicitly coerce to an `*const T` raw pointer in safe code and similarly for
the `mut` variants (both coercions can be performed explicitly with,
respectively, `value as *const T` and `value as *mut T`).
Going the opposite direction, from `*const` to a reference `&`, is not safe. A
`&T` is always valid, and so, at a minimum, the raw pointer `*const T` has to
point to a valid instance of type `T`. Furthermore, the resulting pointer must
satisfy the aliasing and mutability laws of references. The compiler assumes
these properties are true for any references, no matter how they are created,
and so any conversion from raw pointers is asserting that they hold. The
programmer *must* guarantee this.
The recommended method for the conversion is
```rust
let i: u32 = 1;
// explicit cast
let p_imm: *const u32 = &i as *const u32;
let mut m: u32 = 2;
// implicit coercion
let p_mut: *mut u32 = &mut m;
unsafe {
let ref_imm: &u32 = &*p_imm;
let ref_mut: &mut u32 = &mut *p_mut;
}
```
The `&*x` dereferencing style is preferred to using a `transmute`. The latter
is far more powerful than necessary, and the more restricted operation is
harder to use incorrectly; for example, it requires that `x` is a pointer
(unlike `transmute`).
+369 -1
View File
@@ -1,3 +1,371 @@
% References and Borrowing
Coming Soon!
This guide is one of three presenting Rusts ownership system. This is one of
Rusts most unique and compelling features, with which Rust developers should
become quite acquainted. Ownership is how Rust achieves its largest goal,
memory safety. There are a few distinct concepts, each with its own
chapter:
* [ownership][ownership], the key concept
* borrowing, which youre reading now
* [lifetimes][lifetimes], an advanced concept of borrowing
These three chapters are related, and in order. Youll need all three to fully
understand the ownership system.
[ownership]: ownership.html
[lifetimes]: lifetimes.html
# Meta
Before we get to the details, two important notes about the ownership system.
Rust has a focus on safety and speed. It accomplishes these goals through many
zero-cost abstractions, which means that in Rust, abstractions cost as little
as possible in order to make them work. The ownership system is a prime example
of a zero cost abstraction. All of the analysis well talk about in this guide
is _done at compile time_. You do not pay any run-time cost for any of these
features.
However, this system does have a certain cost: learning curve. Many new users
to Rust experience something we like to call fighting with the borrow
checker, where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmers mental model of
how ownership should work doesnt match the actual rules that Rust implements.
You probably will experience similar things at first. There is good news,
however: more experienced Rust developers report that once they work with the
rules of the ownership system for a period of time, they fight the borrow
checker less and less.
With that in mind, lets learn about borrowing.
# Borrowing
At the end of the [ownership][ownership] section, we had a nasty function that looked
like this:
```rust
fn foo(v1: Vec<i32>, v2: Vec<i32>) -> (Vec<i32>, Vec<i32>, i32) {
// do stuff with v1 and v2
// hand back ownership, and the result of our function
(v1, v2, 42)
}
let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];
let (v1, v2, answer) = foo(v1, v2);
```
This is not idiomatic Rust, however, as it doesnt take advantage of borrowing. Heres
the first step:
```rust
fn foo(v1: &Vec<i32>, v2: &Vec<i32>) -> i32 {
// do stuff with v1 and v2
// return the answer
42
}
let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];
let answer = foo(&v1, &v2);
// we can use v1 and v2 here!
```
Instead of taking `Vec<i32>`s as our arguments, we take a reference:
`&Vec<i32>`. And instead of passing `v1` and `v2` directly, we pass `&v1` and
`&v2`. We call the `&T` type a reference, and rather than owning the resource,
it borrows ownership. A binding that borrows something does not deallocate the
resource when it goes out of scope. This means that after the call to `foo()`,
we can use our original bindings again.
References are immutable, just like bindings. This means that inside of `foo()`,
the vectors cant be changed at all:
```rust,ignore
fn foo(v: &Vec<i32>) {
v.push(5);
}
let v = vec![];
foo(&v);
```
errors with:
```text
error: cannot borrow immutable borrowed content `*v` as mutable
v.push(5);
^
```
Pushing a value mutates the vector, and so we arent allowed to do it.
# &mut references
Theres a second kind of reference: `&mut T`. A mutable reference allows you
to mutate the resource youre borrowing. For example:
```rust
let mut x = 5;
{
let y = &mut x;
*y += 1;
}
println!("{}", x);
```
This will print `6`. We make `y` a mutable reference to `x`, then add one to
the thing `y` points at. Youll notice that `x` had to be marked `mut` as well,
if it wasnt, we couldnt take a mutable borrow to an immutable value.
Otherwise, `&mut` references are just like references. There _is_ a large
difference between the two, and how they interact, though. You can tell
something is fishy in the above example, because we need that extra scope, with
the `{` and `}`. If we remove them, we get an error:
```text
error: cannot borrow `x` as immutable because it is also borrowed as mutable
println!("{}", x);
^
note: previous borrow of `x` occurs here; the mutable borrow prevents
subsequent moves, borrows, or modification of `x` until the borrow ends
let y = &mut x;
^
note: previous borrow ends here
fn main() {
}
^
```
As it turns out, there are rules.
# The Rules
Heres the rules about borrowing in Rust:
First, any borrow must last for a smaller scope than the owner. Second, you may
have one or the other of these two kinds of borrows, but not both at the same
time:
* 0 to N references (`&T`) to a resource.
* exactly one mutable reference (`&mut T`)
You may notice that this is very similar, though not exactly the same as,
to the definition of a data race:
> There is a data race when two or more pointers access the same memory
> location at the same time, where at least one of them is writing, and the
> operations are not synchronized.
With references, you may have as many as youd like, since none of them are
writing. If you are writing, you need two or more pointers to the same memory,
and you can only have one `&mut` at a time. This is how Rust prevents data
races at compile time: well get errors if we break the rules.
With this in mind, lets consider our example again.
## Thinking in scopes
Heres the code:
```rust,ignore
let mut x = 5;
let y = &mut x;
*y += 1;
println!("{}", x);
```
This code gives us this error:
```text
error: cannot borrow `x` as immutable because it is also borrowed as mutable
println!("{}", x);
^
```
This is because weve violated the rules: we have a `&mut T` pointing to `x`,
and so we arent allowed to create any `&T`s. One or the other. The note
hints at how to think about this problem:
```text
note: previous borrow ends here
fn main() {
}
^
```
In other words, the mutable borow is held through the rest of our example. What
we want is for the mutable borrow to end _before_ we try to call `println!` and
make an immutable borrow. In Rust, borrowing is tied to the scope that the
borrow is valid for. And our scopes look like this:
```rust,ignore
let mut x = 5;
let y = &mut x; // -+ &mut borrow of x starts here
// |
*y += 1; // |
// |
println!("{}", x); // -+ - try to borrow x here
// -+ &mut borrow of x ends here
```
The scopes conflict: we cant make an `&x` while `y` is in scope.
So when we add the curly braces:
```rust
let mut x = 5;
{
let y = &mut x; // -+ &mut borrow starts here
*y += 1; // |
} // -+ ... and ends here
println!("{}", x); // <- try to borrow x here
```
Theres no problem. Our mutable borrow goes out of scope before we create an
immutable one. But scope is the key to seeing how long a borrow lasts for.
## Issues borrowing prevents
Why have these restrictive rules? Well, as we noted, these rules prevent data
races. What kinds of issues do data races cause? Heres a few.
### Iterator invalidation
One example is iterator invalidation, which happens when you try to mutate a
collection that youre iterating over. Rusts borrow checker prevents this from
happening:
```rust
let mut v = vec![1, 2, 3];
for i in &v {
println!("{}", i);
}
```
This prints out one through three. As we iterate through the vectors, were
only given references to the elements. And `v` is itself borrowed as immutable,
which means we cant change it while were iterating:
```rust,ignore
let mut v = vec![1, 2, 3];
for i in &v {
println!("{}", i);
v.push(34);
}
```
Heres the error:
```text
error: cannot borrow `v` as mutable because it is also borrowed as immutable
v.push(34);
^
note: previous borrow of `v` occurs here; the immutable borrow prevents
subsequent moves or mutable borrows of `v` until the borrow ends
for i in &v {
^
note: previous borrow ends here
for i in &v {
println!(“{}”, i);
v.push(34);
}
^
```
We cant modify `v` because its borrowed by the loop.
### use after free
References must live as long as the resource they refer to. Rust will check the
scopes of your references to ensure that this is true.
If Rust didnt check that this property, we could accidentally use a reference
which was invalid. For example:
```rust,ignore
let y: &i32;
{
let x = 5;
y = &x;
}
println!("{}", y);
```
We get this error:
```text
error: `x` does not live long enough
y = &x;
^
note: reference must be valid for the block suffix following statement 0 at
2:16...
let y: &i32;
{
let x = 5;
y = &x;
}
note: ...but borrowed value is only valid for the block suffix following
statement 0 at 4:18
let x = 5;
y = &x;
}
```
In other words, `y` is only valid for the scope where `x` exists. As soon as
`x` goes away, it becomes invalid to refer to it. As such, the error says that
the borrow doesnt live long enough because its not valid for the right
amount of time.
The same problem occurs when the reference is declared _before_ the variable it refers to:
```rust,ignore
let y: &i32;
let x = 5;
y = &x;
println!("{}", y);
```
We get this error:
```text
error: `x` does not live long enough
y = &x;
^
note: reference must be valid for the block suffix following statement 0 at
2:16...
let y: &i32;
let x = 5;
y = &x;
println!("{}", y);
}
note: ...but borrowed value is only valid for the block suffix following
statement 1 at 3:14
let x = 5;
y = &x;
println!("{}", y);
}
```
+45
View File
@@ -0,0 +1,45 @@
% Release Channels
The Rust project uses a concept called release channels to manage releases.
Its important to understand this process to choose which version of Rust
your project should use.
# Overview
There are three channels for Rust releases:
* Nightly
* Beta
* Stable
New nightly releases are created once a day. Every six weeks, the latest
nightly release is promoted to Beta. At that point, it will only receive
patches to fix serious errors. Six weeks later, the beta is promoted to
Stable, and becomes the next release of `1.x`.
This process happens in parallel. So every six weeks, on the same day,
nightly goes to beta, beta goes to stable. When `1.x` is released, at
the same time, `1.(x + 1)-beta` is released, and the nightly becomes the
first version of `1.(x + 2)-nightly`.
# Choosing a version
Generally speaking, unless you have a specific reason, you should be using the
stable release channel. These releases are intended for a general audience.
However, depending on your interest in Rust, you may choose to use nightly
instead. The basic tradeoff is this: in the nightly channel, you can use
unstable, new Rust features. However, unstable features are subject to change,
and so any new nightly release may break your code. If you use the stable
release, you cannot use experimental features, but the next release of Rust
will not cause significant issues through breaking changes.
# Helping the ecosystem through CI
What about beta? We encourage all Rust users who use the stable release channel
to also test against the beta channel in their continuous integration systems.
This will help alert the team in case theres an accidental regression.
Additionally, testing against nightly can catch regressions even sooner, and so
if you dont mind a third build, wed appreciate testing against all channels.
+353
View File
@@ -0,0 +1,353 @@
% Rust Inside Other Languages
For our third project, were going to choose something that shows off one of
Rusts greatest strengths: a lack of a substantial runtime.
As organizations grow, they increasingly rely on a multitude of programming
languages. Different programming languages have different strengths and
weaknesses, and a polyglot stack lets you use a particular language where
its strengths make sense, and use a different language where its weak.
A very common area where many programming languages are weak is in runtime
performance of programs. Often, using a language that is slower, but offers
greater programmer productivity is a worthwhile trade-off. To help mitigate
this, they provide a way to write some of your system in C, and then call
the C code as though it were written in the higher-level language. This is
called a foreign function interface, often shortened to FFI.
Rust has support for FFI in both directions: it can call into C code easily,
but crucially, it can also be called _into_ as easily as C. Combined with
Rusts lack of a garbage collector and low runtime requirements, this makes
Rust a great candidate to embed inside of other languages when you need
some extra oomph.
There is a whole [chapter devoted to FFI][ffi] and its specifics elsewhere in
the book, but in this chapter, well examine this particular use-case of FFI,
with three examples, in Ruby, Python, and JavaScript.
[ffi]: ffi.html
# The problem
There are many different projects we could choose here, but were going to
pick an example where Rust has a clear advantage over many other languages:
numeric computing and threading.
Many languages, for the sake of consistency, place numbers on the heap, rather
than on the stack. Especially in languages that focus on object-oriented
programming and use garbage collection, heap allocation is the default. Sometimes
optimizations can stack allocate particular numbers, but rather than relying
on an optimizer to do its job, we may want to ensure that were always using
primitive number types rather than some sort of object type.
Second, many languages have a global interpreter lock, which limits
concurrency in many situations. This is done in the name of safety, which is
a positive effect, but it limits the amount of work that can be done at the
same time, which is a big negative.
To emphasize these two aspects, were going to create a little project that
uses these two aspects heavily. Since the focus of the example is the embedding
of Rust into the languages, rather than the problem itself, well just use a
toy example:
> Start ten threads. Inside each thread, count from one to five million. After
> All ten threads are finished, print out done!.
I chose five million based on my particular computer. Heres an example of this
code in Ruby:
```ruby
threads = []
10.times do
threads << Thread.new do
count = 0
5_000_000.times do
count += 1
end
end
end
threads.each {|t| t.join }
puts "done!"
```
Try running this example, and choose a number that runs for a few seconds.
Depending on your computers hardware, you may have to increase or decrease the
number.
On my system, running this program takes `2.156` seconds. And, if I use some
sort of process monitoring tool, like `top`, I can see that it only uses one
core on my machine. Thats the GIL kicking in.
While its true that this is a synthetic program, one can imagine many problems
that are similar to this in the real world. For our purposes, spinning up some
busy threads represents some sort of parallel, expensive computation.
# A Rust library
Lets re-write this problem in Rust. First, lets make a new project with
Cargo:
```bash
$ cargo new embed
$ cd embed
```
This program is fairly easy to write in Rust:
```rust
use std::thread;
fn process() {
let handles: Vec<_> = (0..10).map(|_| {
thread::spawn(|| {
let mut _x = 0;
for _ in (0..5_000_001) {
_x += 1
}
})
}).collect();
for h in handles {
h.join().ok().expect("Could not join a thread!");
}
}
```
Some of this should look familiar from previous examples. We spin up ten
threads, collecting them into a `handles` vector. Inside of each thread, we
loop five million times, and add one to `_x` each time. Why the underscore?
Well, if we remove it and compile:
```bash
$ cargo build
Compiling embed v0.1.0 (file:///home/steve/src/embed)
src/lib.rs:3:1: 16:2 warning: function is never used: `process`, #[warn(dead_code)] on by default
src/lib.rs:3 fn process() {
src/lib.rs:4 let handles: Vec<_> = (0..10).map(|_| {
src/lib.rs:5 thread::spawn(|| {
src/lib.rs:6 let mut x = 0;
src/lib.rs:7 for _ in (0..5_000_001) {
src/lib.rs:8 x += 1
...
src/lib.rs:6:17: 6:22 warning: variable `x` is assigned to, but never used, #[warn(unused_variables)] on by default
src/lib.rs:6 let mut x = 0;
^~~~~
```
That first warning is because we are building a library. If we had a test
for this function, the warning would go away. But for now, its never
called.
The second is related to `x` versus `_x`. Because we never actually _do_
anything with `x`, we get a warning about it. In our case, thats perfectly
okay, as were just trying to waste CPU cycles. Prefixing `x` with the
underscore removes the warning.
Finally, we join on each thread.
Right now, however, this is a Rust library, and it doesnt expose anything
thats callable from C. If we tried to hook this up to another language right
now, it wouldnt work. We only need to make two small changes to fix this,
though. The first is modify the beginning of our code:
```rust,ignore
#[no_mangle]
pub extern fn process() {
```
We have to add a new attribute, `no_mangle`. When you create a Rust library, it
changes the name of the function in the compiled output. The reasons for this
are outside the scope of this tutorial, but in order for other languages to
know how to call the function, we need to not do that. This attribute turns
that behavior off.
The other change is the `pub extern`. The `pub` means that this function should
be callable from outside of this module, and the `extern` says that it should
be able to be called from C. Thats it! Not a whole lot of change.
The second thing we need to do is to change a setting in our `Cargo.toml`. Add
this at the bottom:
```toml
[lib]
name = "embed"
crate-type = ["dylib"]
```
This tells Rust that we want to compile our library into a standard dynamic
library. By default, Rust compiles into an rlib, a Rust-specific format.
Lets build the project now:
```bash
$ cargo build --release
Compiling embed v0.1.0 (file:///home/steve/src/embed)
```
Weve chosen `cargo build --release`, which builds with optimizations on. We
want this to be as fast as possible! You can find the output of the library in
`target/release`:
```bash
$ ls target/release/
build deps examples libembed.so native
```
That `libembed.so` is our shared object library. We can use this file
just like any shared object library written in C! As an aside, this may be
`embed.dll` or `libembed.dylib`, depending on the platform.
Now that weve got our Rust library built, lets use it from our Ruby.
# Ruby
Open up a `embed.rb` file inside of our project, and do this:
```ruby
require 'ffi'
module Hello
extend FFI::Library
ffi_lib 'target/release/libembed.so'
attach_function :process, [], :void
end
Hello.process
puts "done!”
```
Before we can run this, we need to install the `ffi` gem:
```bash
$ gem install ffi # this may need sudo
Fetching: ffi-1.9.8.gem (100%)
Building native extensions. This could take a while...
Successfully installed ffi-1.9.8
Parsing documentation for ffi-1.9.8
Installing ri documentation for ffi-1.9.8
Done installing documentation for ffi after 0 seconds
1 gem installed
```
And finally, we can try running it:
```bash
$ ruby embed.rb
done!
$
```
Whoah, that was fast! On my system, this took `0.086` seconds, rather than
the two seconds the pure Ruby version took. Lets break down this Ruby
code:
```ruby
require 'ffi'
```
We first need to require the `ffi` gem. This lets us interface with our
Rust library like a C library.
```ruby
module Hello
extend FFI::Library
ffi_lib 'target/release/libembed.so'
```
The `ffi` gems authors recommend using a module to scope the functions
well import from the shared library. Inside, we `extend` the necessary
`FFI::Library` module, and then call `ffi_lib` to load up our shared
object library. We just pass it the path that our library is stored,
which as we saw before, is `target/release/libembed.so`.
```ruby
attach_function :process, [], :void
```
The `attach_function` method is provided by the FFI gem. Its what
connects our `process()` function in Rust to a Ruby function of the
same name. Since `process()` takes no arguments, the second parameter
is an empty array, and since it returns nothing, we pass `:void` as
the final argument.
```ruby
Hello.process
```
This is the actual call into Rust. The combination of our `module`
and the call to `attach_function` sets this all up. It looks like
a Ruby function, but is actually Rust!
```ruby
puts "done!"
```
Finally, as per our projects requirements, we print out `done!`.
Thats it! As weve seen, bridging between the two languages is really easy,
and buys us a lot of performance.
Next, lets try Python!
# Python
Create an `embed.py` file in this directory, and put this in it:
```python
from ctypes import cdll
lib = cdll.LoadLibrary("target/release/libembed.so")
lib.process()
print("done!")
```
Even easier! We use `cdll` from the `ctypes` module. A quick call
to `LoadLibrary` later, and we can call `process()`.
On my system, this takes `0.017` seconds. Speedy!
# Node.js
Node isnt a language, but its currently the dominant implementation of
server-side JavaScript.
In order to do FFI with Node, we first need to install the library:
```bash
$ npm install ffi
```
After that installs, we can use it:
```javascript
var ffi = require('ffi');
var lib = ffi.Library('target/release/libembed', {
'process': [ 'void', [] ]
});
lib.process();
console.log("done!");
```
It looks more like the Ruby example than the Python example. We use
the `ffi` module to get access to `ffi.Library()`, which loads up
our shared object. We need to annotate the return type and argument
types of the function, which are 'void' for return, and an empty
array to signify no arguments. From there, we just call it and
print the result.
On my system, this takes a quick `0.092` seconds.
# Conclusion
As you can see, the basics of doing this are _very_ easy. Of course,
there's a lot more that we could do here. Check out the [FFI][ffi]
chapter for more details.
+24
View File
@@ -16,3 +16,27 @@ fn main() {
}
```
The `advanced_slice_patterns` gate lets you use `..` to indicate any number of
elements inside a pattern matching a slice. This wildcard can only be used once
for a given array. If there's an identifier before the `..`, the result of the
slice will be bound to that name. For example:
```rust
#![feature(advanced_slice_patterns, slice_patterns)]
fn is_symmetric(list: &[u32]) -> bool {
match list {
[] | [_] => true,
[x, inside.., y] if x == y => is_symmetric(inside),
_ => false
}
}
fn main() {
let sym = &[0, 1, 4, 2, 4, 1, 0];
assert!(is_symmetric(sym));
let not_sym = &[0, 1, 7, 2, 4, 1, 0];
assert!(!is_symmetric(not_sym));
}
```
+91 -23
View File
@@ -1,36 +1,34 @@
% Strings
Strings are an important concept for any programmer to master. Rust's string
Strings are an important concept for any programmer to master. Rusts string
handling system is a bit different from other languages, due to its systems
focus. Any time you have a data structure of variable size, things can get
tricky, and strings are a re-sizable data structure. That being said, Rust's
tricky, and strings are a re-sizable data structure. That being said, Rusts
strings also work differently than in some other systems languages, such as C.
Let's dig into the details. A *string* is a sequence of Unicode scalar values
encoded as a stream of UTF-8 bytes. All strings are guaranteed to be
validly encoded UTF-8 sequences. Additionally, strings are not null-terminated
and can contain null bytes.
Lets dig into the details. A string is a sequence of Unicode scalar values
encoded as a stream of UTF-8 bytes. All strings are guaranteed to be a valid
encoding of UTF-8 sequences. Additionally, unlike some systems languages,
strings are not null-terminated and can contain null bytes.
Rust has two main types of strings: `&str` and `String`.
Rust has two main types of strings: `&str` and `String`. Lets talk about
`&str` first. These are called string slices. String literals are of the type
`&'static str`:
The first kind is a `&str`. These are called *string slices*. String literals
are of the type `&str`:
```{rust}
let string = "Hello there."; // string: &str
```rust
let string = "Hello there."; // string: &'static str
```
This string is statically allocated, meaning that it's saved inside our
This string is statically allocated, meaning that its saved inside our
compiled program, and exists for the entire duration it runs. The `string`
binding is a reference to this statically allocated string. String slices
have a fixed size, and cannot be mutated.
A `String`, on the other hand, is a heap-allocated string. This string
is growable, and is also guaranteed to be UTF-8. `String`s are
commonly created by converting from a string slice using the
`to_string` method.
A `String`, on the other hand, is a heap-allocated string. This string is
growable, and is also guaranteed to be UTF-8. `String`s are commonly created by
converting from a string slice using the `to_string` method.
```{rust}
```rust
let mut s = "Hello".to_string(); // mut s: String
println!("{}", s);
@@ -54,8 +52,78 @@ fn main() {
Viewing a `String` as a `&str` is cheap, but converting the `&str` to a
`String` involves allocating memory. No reason to do that unless you have to!
That's the basics of strings in Rust! They're probably a bit more complicated
than you are used to, if you come from a scripting language, but when the
low-level details matter, they really matter. Just remember that `String`s
allocate memory and control their data, while `&str`s are a reference to
another string, and you'll be all set.
## Indexing
Because strings are valid UTF-8, strings do not support indexing:
```rust,ignore
let s = "hello";
println!("The first letter of s is {}", s[0]); // ERROR!!!
```
Usually, access to a vector with `[]` is very fast. But, because each character
in a UTF-8 encoded string can be multiple bytes, you have to walk over the
string to find the nᵗʰ letter of a string. This is a significantly more
expensive operation, and we dont want to be misleading. Furthermore, letter
isnt something defined in Unicode, exactly. We can choose to look at a string as
individual bytes, or as codepoints:
```rust
let hachiko = "忠犬ハチ公";
for b in hachiko.as_bytes() {
print!("{}, ", b);
}
println!("");
for c in hachiko.chars() {
print!("{}, ", c);
}
println!("");
```
This prints:
```text
229, 191, 160, 231, 138, 172, 227, 131, 143, 227, 131, 129, 229, 133, 172,
忠, 犬, ハ, チ, 公,
```
As you can see, there are more bytes than `char`s.
You can get something similar to an index like this:
```rust
# let hachiko = "忠犬ハチ公";
let dog = hachiko.chars().nth(1); // kinda like hachiko[1]
```
This emphasizes that we have to go through the whole list of `chars`.
## Concatenation
If you have a `String`, you can concatenate a `&str` to the end of it:
```rust
let hello = "Hello ".to_string();
let world = "world!";
let hello_world = hello + world;
```
But if you have two `String`s, you need an `&`:
```rust
let hello = "Hello ".to_string();
let world = "world!".to_string();
let hello_world = hello + &world;
```
This is because `&String` can automatically coerece to a `&str`. This is a
feature called [`Deref` coercions][dc].
[dc]: deref-coercions.html
+114 -5
View File
@@ -1,6 +1,6 @@
% Structs
Structs are a way of creating more complex datatypes. For example, if we were
Structs are a way of creating more complex data types. For example, if we were
doing calculations involving coordinates in 2D space, we would need both an `x`
and a `y` value:
@@ -24,12 +24,12 @@ fn main() {
}
```
Theres a lot going on here, so lets break it down. We declare a struct with
the `struct` keyword, and then with a name. By convention, structs begin with a
capital letter and are also camel cased: `PointInSpace`, not `Point_In_Space`.
Theres a lot going on here, so lets break it down. We declare a `struct` with
the `struct` keyword, and then with a name. By convention, `struct`s begin with
a capital letter and are camel cased: `PointInSpace`, not `Point_In_Space`.
We can create an instance of our struct via `let`, as usual, but we use a `key:
value` style syntax to set each field. The order doesn't need to be the same as
value` style syntax to set each field. The order doesnt need to be the same as
in the original declaration.
Finally, because fields have names, we can access the field through dot
@@ -87,3 +87,112 @@ fn main() {
point.y = 6; // this causes an error
}
```
# Update syntax
A `struct` can include `..` to indicate that you want to use a copy of some
other struct for some of the values. For example:
```rust
struct Point3d {
x: i32,
y: i32,
z: i32,
}
let mut point = Point3d { x: 0, y: 0, z: 0 };
point = Point3d { y: 1, .. point };
```
This gives `point` a new `y`, but keeps the old `x` and `z` values. It doesnt
have to be the same `struct` either, you can use this syntax when making new
ones, and it will copy the values you dont specify:
```rust
# struct Point3d {
# x: i32,
# y: i32,
# z: i32,
# }
let origin = Point3d { x: 0, y: 0, z: 0 };
let point = Point3d { z: 1, x: 2, .. origin };
```
# Tuple structs
Rust has another data type thats like a hybrid between a [tuple][tuple] and a
struct, called a tuple struct. Tuple structs have a name, but
their fields dont:
```rust
struct Color(i32, i32, i32);
struct Point(i32, i32, i32);
```
[tuple]: primitive-types.html#tuples
These two will not be equal, even if they have the same values:
```rust
# struct Color(i32, i32, i32);
# struct Point(i32, i32, i32);
let black = Color(0, 0, 0);
let origin = Point(0, 0, 0);
```
It is almost always better to use a struct than a tuple struct. We would write
`Color` and `Point` like this instead:
```rust
struct Color {
red: i32,
blue: i32,
green: i32,
}
struct Point {
x: i32,
y: i32,
z: i32,
}
```
Now, we have actual names, rather than positions. Good names are important,
and with a struct, we have actual names.
There _is_ one case when a tuple struct is very useful, though, and thats a
tuple struct with only one element. We call this the newtype pattern, because
it allows you to create a new type, distinct from that of its contained value
and expressing its own semantic meaning:
```rust
struct Inches(i32);
let length = Inches(10);
let Inches(integer_length) = length;
println!("length is {} inches", integer_length);
```
As you can see here, you can extract the inner integer type through a
destructuring `let`, just as with regular tuples. In this case, the
`let Inches(integer_length)` assigns `10` to `integer_length`.
# Unit-like structs
You can define a struct with no members at all:
```rust
struct Electron;
```
Such a struct is called unit-like because it resembles the empty
tuple, `()`, sometimes called unit. Like a tuple struct, it defines a
new type.
This is rarely useful on its own (although sometimes it can serve as a
marker type), but in combination with other features, it can become
useful. For instance, a library may ask you to create a structure that
implements a certain [trait][trait] to handle events. If you dont have
any data you need to store in the structure, you can just create a
unit-like struct.
+11 -11
View File
@@ -219,10 +219,10 @@ fn it_works() {
This is a very common use of `assert_eq!`: call some function with
some known arguments and compare it to the expected output.
# The `test` module
# The `tests` module
There is one way in which our existing example is not idiomatic: it's
missing the test module. The idiomatic way of writing our example
missing the `tests` module. The idiomatic way of writing our example
looks like this:
```{rust,ignore}
@@ -231,7 +231,7 @@ pub fn add_two(a: i32) -> i32 {
}
#[cfg(test)]
mod test {
mod tests {
use super::add_two;
#[test]
@@ -241,7 +241,7 @@ mod test {
}
```
There's a few changes here. The first is the introduction of a `mod test` with
There's a few changes here. The first is the introduction of a `mod tests` with
a `cfg` attribute. The module allows us to group all of our tests together, and
to also define helper functions if needed, that don't become a part of the rest
of our crate. The `cfg` attribute only compiles our test code if we're
@@ -260,7 +260,7 @@ pub fn add_two(a: i32) -> i32 {
}
#[cfg(test)]
mod test {
mod tests {
use super::*;
#[test]
@@ -279,7 +279,7 @@ $ cargo test
Running target/adder-91b3e234d4ed382a
running 1 test
test test::it_works ... ok
test tests::it_works ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
@@ -292,7 +292,7 @@ test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
It works!
The current convention is to use the `test` module to hold your "unit-style"
The current convention is to use the `tests` module to hold your "unit-style"
tests. Anything that just tests one small bit of functionality makes sense to
go here. But what about "integration-style" tests instead? For that, we have
the `tests` directory
@@ -325,7 +325,7 @@ $ cargo test
Running target/adder-91b3e234d4ed382a
running 1 test
test test::it_works ... ok
test tests::it_works ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
@@ -346,7 +346,7 @@ test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured
Now we have three sections: our previous test is also run, as well as our new
one.
That's all there is to the `tests` directory. The `test` module isn't needed
That's all there is to the `tests` directory. The `tests` module isn't needed
here, since the whole thing is focused on tests.
Let's finally check out that third section: documentation tests.
@@ -382,7 +382,7 @@ pub fn add_two(a: i32) -> i32 {
}
#[cfg(test)]
mod test {
mod tests {
use super::*;
#[test]
@@ -405,7 +405,7 @@ $ cargo test
Running target/adder-91b3e234d4ed382a
running 1 test
test test::it_works ... ok
test tests::it_works ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
+568 -1
View File
@@ -1,3 +1,570 @@
% The Stack and the Heap
Coming Soon
As a systems language, Rust operates at a low level. If youre coming from a
high-level language, there are some aspects of systems programming that you may
not be familiar with. The most important one is how memory works, with a stack
and a heap. If youre familiar with how C-like languages use stack allocation,
this chapter will be a refresher. If youre not, youll learn about this more
general concept, but with a Rust-y focus.
# Memory management
These two terms are about memory management. The stack and the heap are
abstractions that help you determine when to allocate and deallocate memory.
Heres a high-level comparison:
The stack is very fast, and is where memory is allocated in Rust by default.
But the allocation is local to a function call, and is limited in size. The
heap, on the other hand, is slower, and is explicitly allocated by your
program. But its effectively unlimited in size, and is globally accessible.
# The Stack
Lets talk about this Rust program:
```rust
fn main() {
let x = 42;
}
```
This program has one variable binding, `x`. This memory needs to be allocated
from somewhere. Rust stack allocates by default, which means that basic
values go on the stack. What does that mean?
Well, when a function gets called, some memory gets allocated for all of its
local variables and some other information. This is called a stack frame, and
for the purpose of this tutorial, were going to ignore the extra information
and just consider the local variables were allocating. So in this case, when
`main()` is run, well allocate a single 32-bit integer for our stack frame.
This is automatically handled for you, as you can see, we didnt have to write
any special Rust code or anything.
When the function is over, its stack frame gets deallocated. This happens
automatically, we didnt have to do anything special here.
Thats all there is for this simple program. The key thing to understand here
is that stack allocation is very, very fast. Since we know all the local
variables we have ahead of time, we can grab the memory all at once. And since
well throw them all away at the same time as well, we can get rid of it very
fast too.
The downside is that we cant keep values around if we need them for longer
than a single function. We also havent talked about what that name, stack
means. To do that, we need a slightly more complicated example:
```rust
fn foo() {
let y = 5;
let z = 100;
}
fn main() {
let x = 42;
foo();
}
```
This program has three variables total: two in `foo()`, one in `main()`. Just
as before, when `main()` is called, a single integer is allocated for its stack
frame. But before we can show what happens when `foo()` is called, we need to
visualize whats going on with memory. Your operating system presents a view of
memory to your program thats pretty simple: a huge list of addresses, from 0
to a large number, representing how much RAM your computer has. For example, if
you have a gigabyte of RAM, your addresses go from `0` to `1,073,741,824`. That
number comes from 2<sup>30</sup>, the number of bytes in a gigabyte.
This memory is kind of like a giant array: addresses start at zero and go
up to the final number. So heres a diagram of our first stack frame:
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
Weve got `x` located at address `0`, with the value `42`.
When `foo()` is called, a new stack frame is allocated:
| Address | Name | Value |
+---------+------+-------+
| 2 | z | 100 |
| 1 | y | 5 |
| 0 | x | 42 |
Because `0` was taken by the first frame, `1` and `2` are used for `foo()`s
stack frame. It grows upward, the more functions we call.
Theres some important things we have to take note of here. The numbers 0, 1,
and 2 are all solely for illustrative purposes, and bear no relationship to the
actual numbers the computer will actually use. In particular, the series of
addresses are in reality going to be separated by some number of bytes that
separate each address, and that separation may even exceed the size of the
value being stored.
After `foo()` is over, its frame is deallocated:
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
And then, after `main()`, even this last value goes away. Easy!
Its called a stack because it works like a stack of dinner plates: the first
plate you put down is the last plate to pick back up. Stacks are sometimes
called last in, first out queues for this reason, as the last value you put
on the stack is the first one you retrieve from it.
Lets try a three-deep example:
```rust
fn bar() {
let i = 6;
}
fn foo() {
let a = 5;
let b = 100;
let c = 1;
bar();
}
fn main() {
let x = 42;
foo();
}
```
Okay, first, we call `main()`:
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
Next up, `main()` calls `foo()`:
| Address | Name | Value |
+---------+------+-------+
| 3 | c | 1 |
| 2 | b | 100 |
| 1 | a | 5 |
| 0 | x | 42 |
And then `foo()` calls `bar()`:
| Address | Name | Value |
+---------+------+-------+
| 4 | i | 6 |
| 3 | c | 1 |
| 2 | b | 100 |
| 1 | a | 5 |
| 0 | x | 42 |
Whew! Our stack is growing tall.
After `bar()` is over, its frame is deallocated, leaving just `foo()` and
`main()`:
| Address | Name | Value |
+---------+------+-------+
| 3 | c | 1 |
| 2 | b | 100 |
| 1 | a | 5 |
| 0 | x | 42 |
And then `foo()` ends, leaving just `main()`
| Address | Name | Value |
+---------+------+-------+
| 0 | x | 42 |
And then were done. Getting the hang of it? Its like piling up dishes: you
add to the top, you take away from the top.
# The Heap
Now, this works pretty well, but not everything can work like this. Sometimes,
you need to pass some memory between different functions, or keep it alive for
longer than a single functions execution. For this, we can use the heap.
In Rust, you can allocate memory on the heap with the [`Box<T>` type][box].
Heres an example:
```rust
fn main() {
let x = Box::new(5);
let y = 42;
}
```
[box]: ../std/boxed/index.html
Heres what happens in memory when `main()` is called:
| Address | Name | Value |
+---------+------+--------+
| 1 | y | 42 |
| 0 | x | ?????? |
We allocate space for two variables on the stack. `y` is `42`, as it always has
been, but what about `x`? Well, `x` is a `Box<i32>`, and boxes allocate memory
on the heap. The actual value of the box is a structure which has a pointer to
the heap. When we start executing the function, and `Box::new()` is called,
it allocates some memory for the heap, and puts `5` there. The memory now looks
like this:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 5 |
| ... | ... | ... |
| 1 | y | 42 |
| 0 | x | 2<sup>30</sup> |
We have 2<sup>30</sup> in our hypothetical computer with 1GB of RAM. And since
our stack grows from zero, the easiest place to allocate memory is from the
other end. So our first value is at the highest place in memory. And the value
of the struct at `x` has a [raw pointer][rawpointer] to the place weve
allocated on the heap, so the value of `x` is 2<sup>30</sup>, the memory
location weve asked for.
[rawpointer]: raw-pointers.html
We havent really talked too much about what it actually means to allocate and
deallocate memory in these contexts. Getting into very deep detail is out of
the scope of this tutorial, but whats important to point out here is that
the heap isnt just a stack that grows from the opposite end. Well have an
example of this later in the book, but because the heap can be allocated and
freed in any order, it can end up with holes. Heres a diagram of the memory
layout of a program which has been running for a while now:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 5 |
| (2<sup>30</sup>) - 1 | | |
| (2<sup>30</sup>) - 2 | | |
| (2<sup>30</sup>) - 3 | | 42 |
| ... | ... | ... |
| 3 | y | (2<sup>30</sup>) - 3 |
| 2 | y | 42 |
| 1 | y | 42 |
| 0 | x | 2<sup>30</sup> |
In this case, weve allocated four things on the heap, but deallocated two of
them. Theres a gap between 2<sup>30</sup> and (2<sup>30</sup>) - 3 which isnt
currently being used. The specific details of how and why this happens depends
on what kind of strategy you use to manage the heap. Different programs can use
different memory allocators, which are libraries that manage this for you.
Rust programs use [jemalloc][jemalloc] for this purpose.
[jemalloc]: http://www.canonware.com/jemalloc/
Anyway, back to our example. Since this memory is on the heap, it can stay
alive longer than the function which allocates the box. In this case, however,
it doesnt.[^moving] When the function is over, we need to free the stack frame
for `main()`. `Box<T>`, though, has a trick up its sleve: [Drop][drop]. The
implementation of `Drop` for `Box` deallocates the memory that was allocated
when it was created. Great! So when `x` goes away, it first frees the memory
allocated on the heap:
| Address | Name | Value |
+---------+------+--------+
| 1 | y | 42 |
| 0 | x | ?????? |
[drop]: drop.html
[moving]: We can make the memory live longer by transferring ownership,
sometimes called moving out of the box. More complex examples will
be covered later.
And then the stack frame goes away, freeing all of our memory.
# Arguments and borrowing
Weve got some basic examples with the stack and the heap going, but what about
function arguments and borrowing? Heres a small Rust program:
```rust
fn foo(i: &i32) {
let z = 42;
}
fn main() {
let x = 5;
let y = &x;
foo(y);
}
```
When we enter `main()`, memory looks like this:
| Address | Name | Value |
+---------+------+-------+
| 1 | y | 0 |
| 0 | x | 5 |
`x` is a plain old `5`, and `y` is a reference to `x`. So its value is the
memory location that `x` lives at, which in this case is `0`.
What about when we call `foo()`, passing `y` as an argument?
| Address | Name | Value |
+---------+------+-------+
| 3 | z | 42 |
| 2 | i | 0 |
| 1 | y | 0 |
| 0 | x | 5 |
Stack frames arent just for local bindings, theyre for arguments too. So in
this case, we need to have both `i`, our argument, and `z`, our local variable
binding. `i` is a copy of the argument, `y`. Since `y`s value is `0`, so is
`i`s.
This is one reason why borrowing a variable doesnt deallocate any memory: the
value of a reference is just a pointer to a memory location. If we got rid of
the underlying memory, things wouldnt work very well.
# A complex example
Okay, lets go through this complex program step-by-step:
```rust
fn foo(x: &i32) {
let y = 10;
let z = &y;
baz(z);
bar(x, z);
}
fn bar(a: &i32, b: &i32) {
let c = 5;
let d = Box::new(5);
let e = &d;
baz(e);
}
fn baz(f: &i32) {
let g = 100;
}
fn main() {
let h = 3;
let i = Box::new(20);
let j = &h;
foo(j);
}
```
First, we call `main()`:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
We allocate memory for `j`, `i`, and `h`. `i` is on the heap, and so has a
value pointing there.
Next, at the end of `main()`, `foo()` gets called:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Space gets allocated for `x`, `y`, and `z`. The argument `x` has the same value
as `j`, since thats what we passed it in. Its a pointer to the `0` address,
since `j` points at `h`.
Next, `foo()` calls `baz()`, passing `z`:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 7 | g | 100 |
| 6 | f | 4 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Weve allocated memory for `f` and `g`. `baz()` is very short, so when its
over, we get rid of its stack frame:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Next, `foo()` calls `bar()` with `x` and `z`:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 20 |
| (2<sup>30</sup>) - 1 | | 5 |
| ... | ... | ... |
| 10 | e | 4 |
| 9 | d | (2<sup>30</sup>) - 1 |
| 8 | c | 5 |
| 7 | b | 4 |
| 6 | a | 0 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
We end up allocating another value on the heap, and so we have to subtract one
from 2<sup>30</sup>. Its easier to just write that than `1,073,741,823`. In any
case, we set up the variables as usual.
At the end of `bar()`, it calls `baz()`:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 20 |
| (2<sup>30</sup>) - 1 | | 5 |
| ... | ... | ... |
| 12 | g | 100 |
| 11 | f | 4 |
| 10 | e | 4 |
| 9 | d | (2<sup>30</sup>) - 1 |
| 8 | c | 5 |
| 7 | b | 4 |
| 6 | a | 0 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
With this, were at our deepest point! Whew! Congrats for following along this
far.
After `baz()` is over, we get rid of `f` and `g`:
| Address | Name | Value |
+----------------------+------+----------------------+
| 2<sup>30</sup> | | 20 |
| (2<sup>30</sup>) - 1 | | 5 |
| ... | ... | ... |
| 10 | e | 4 |
| 9 | d | (2<sup>30</sup>) - 1 |
| 8 | c | 5 |
| 7 | b | 4 |
| 6 | a | 0 |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
Next, we return from `bar()`. `d` in this case is a `Box<T>`, so it also frees
what it points to: (2<sup>30</sup>) - 1.
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 5 | z | 4 |
| 4 | y | 10 |
| 3 | x | 0 |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
And after that, `foo()` returns:
| Address | Name | Value |
+-----------------+------+----------------+
| 2<sup>30</sup> | | 20 |
| ... | ... | ... |
| 2 | j | 0 |
| 1 | i | 2<sup>30</sup> |
| 0 | h | 3 |
And then, finally, `main()`, which cleans the rest up. When `i` is `Drop`ped,
it will clean up the last of the heap too.
# What do other languages do?
Most languages with a garbage collector heap-allocate by default. This means
that every value is boxed. There are a number of reasons why this is done, but
theyre out of scope for this tutorial. There are some possible optimizations
that dont make it true 100% of the time, too. Rather than relying on the stack
and `Drop` to clean up memory, the garbage collector deals with the heap
instead.
# Which to use?
So if the stack is faster and easier to manage, why do we need the heap? A big
reason is that Stack-allocation alone means you only have LIFO semantics for
reclaiming storage. Heap-allocation is strictly more general, allowing storage
to be taken from and returned to the pool in arbitrary order, but at a
complexity cost.
Generally, you should prefer stack allocation, and so, Rust stack-allocates by
default. The LIFO model of the stack is simpler, at a fundamental level. This
has two big impacts: runtime efficiency and semantic impact.
## Runtime Efficiency.
Managing the memory for the stack is trivial: The machine just
increments or decrements a single value, the so-called “stack pointer”.
Managing memory for the heap is non-trivial: heap-allocated memory is freed at
arbitrary points, and each block of heap-allocated memory can be of arbitrary
size, the memory manager must generally work much harder to identify memory for
reuse.
If youd like to dive into this topic in greater detail, [this paper][wilson]
is a great introduction.
[wilson]: http://www.cs.northwestern.edu/~pdinda/icsclass/doc/dsa.pdf
## Semantic impact
Stack-allocation impacts the Rust language itself, and thus the developers
mental model. The LIFO semantics is what drives how the Rust language handles
automatic memory management. Even the deallocation of a uniquely-owned
heap-allocated box can be driven by the stack-based LIFO semantics, as
discussed throughout this chapter. The flexibility (i.e. expressiveness) of non
LIFO-semantics means that in general the compiler cannot automatically infer at
compile-time where memory should be freed; it has to rely on dynamic protocols,
potentially from outside the language itself, to drive deallocation (reference
counting, as used by `Rc<T>` and `Arc<T>`, is one example of this).
When taken to the extreme, the increased expressive power of heap allocation
comes at the cost of either significant runtime support (e.g. in the form of a
garbage collector) or significant programmer effort (in the form of explicit
memory management calls that require verification not provided by the Rust
compiler).
+23 -23
View File
@@ -1,15 +1,15 @@
% Trait Objects
When code involves polymorphism, there needs to be a mechanism to determine
which specific version is actually run. This is called 'dispatch.' There are
which specific version is actually run. This is called dispatch. There are
two major forms of dispatch: static dispatch and dynamic dispatch. While Rust
favors static dispatch, it also supports dynamic dispatch through a mechanism
called 'trait objects.'
called trait objects.
## Background
For the rest of this chapter, we'll need a trait and some implementations.
Let's make a simple one, `Foo`. It has one method that is expected to return a
For the rest of this chapter, well need a trait and some implementations.
Lets make a simple one, `Foo`. It has one method that is expected to return a
`String`.
```rust
@@ -18,7 +18,7 @@ trait Foo {
}
```
We'll also implement this trait for `u8` and `String`:
Well also implement this trait for `u8` and `String`:
```rust
# trait Foo { fn method(&self) -> String; }
@@ -53,7 +53,7 @@ fn main() {
}
```
Rust uses 'monomorphization' to perform static dispatch here. This means that
Rust uses monomorphization to perform static dispatch here. This means that
Rust will create a special version of `do_something()` for both `u8` and
`String`, and then replace the call sites with calls to these specialized
functions. In other words, Rust generates something like this:
@@ -82,7 +82,7 @@ fn main() {
This has a great upside: static dispatch allows function calls to be
inlined because the callee is known at compile time, and inlining is
the key to good optimization. Static dispatch is fast, but it comes at
a tradeoff: 'code bloat', due to many copies of the same function
a tradeoff: code bloat, due to many copies of the same function
existing in the binary, one for each type.
Furthermore, compilers arent perfect and may “optimize” code to become slower.
@@ -99,7 +99,7 @@ reason.
## Dynamic dispatch
Rust provides dynamic dispatch through a feature called 'trait objects.' Trait
Rust provides dynamic dispatch through a feature called trait objects. Trait
objects, like `&Foo` or `Box<Foo>`, are normal values that store a value of
*any* type that implements the given trait, where the precise type can only be
known at runtime.
@@ -109,12 +109,12 @@ implements the trait by *casting* it (e.g. `&x as &Foo`) or *coercing* it
(e.g. using `&x` as an argument to a function that takes `&Foo`).
These trait object coercions and casts also work for pointers like `&mut T` to
`&mut Foo` and `Box<T>` to `Box<Foo>`, but that's all at the moment. Coercions
`&mut Foo` and `Box<T>` to `Box<Foo>`, but thats all at the moment. Coercions
and casts are identical.
This operation can be seen as "erasing" the compiler's knowledge about the
This operation can be seen as erasing the compilers knowledge about the
specific type of the pointer, and hence trait objects are sometimes referred to
as "type erasure".
as type erasure.
Coming back to the example above, we can use the same trait to perform dynamic
dispatch with trait objects by casting:
@@ -155,7 +155,7 @@ A function that takes a trait object is not specialized to each of the types
that implements `Foo`: only one copy is generated, often (but not always)
resulting in less code bloat. However, this comes at the cost of requiring
slower virtual function calls, and effectively inhibiting any chance of
inlining and related optimisations from occurring.
inlining and related optimizations from occurring.
### Why pointers?
@@ -167,7 +167,7 @@ on the heap to store it.
For `Foo`, we would need to have a value that could be at least either a
`String` (24 bytes) or a `u8` (1 byte), as well as any other type for which
dependent crates may implement `Foo` (any number of bytes at all). There's no
dependent crates may implement `Foo` (any number of bytes at all). Theres no
way to guarantee that this last point can work if the values are stored without
a pointer, because those other types can be arbitrarily large.
@@ -177,14 +177,14 @@ when we are tossing a trait object around, only the size of the pointer itself.
### Representation
The methods of the trait can be called on a trait object via a special record
of function pointers traditionally called a 'vtable' (created and managed by
of function pointers traditionally called a vtable (created and managed by
the compiler).
Trait objects are both simple and complicated: their core representation and
layout is quite straight-forward, but there are some curly error messages and
surprising behaviors to discover.
Let's start simple, with the runtime representation of a trait object. The
Lets start simple, with the runtime representation of a trait object. The
`std::raw` module contains structs with layouts that are the same as the
complicated built-in types, [including trait objects][stdraw]:
@@ -199,12 +199,12 @@ pub struct TraitObject {
[stdraw]: ../std/raw/struct.TraitObject.html
That is, a trait object like `&Foo` consists of a "data" pointer and a "vtable"
That is, a trait object like `&Foo` consists of a data pointer and a vtable
pointer.
The data pointer addresses the data (of some unknown type `T`) that the trait
object is storing, and the vtable pointer points to the vtable ("virtual method
table") corresponding to the implementation of `Foo` for `T`.
object is storing, and the vtable pointer points to the vtable (virtual method
table) corresponding to the implementation of `Foo` for `T`.
A vtable is essentially a struct of function pointers, pointing to the concrete
@@ -212,7 +212,7 @@ piece of machine code for each method in the implementation. A method call like
`trait_object.method()` will retrieve the correct pointer out of the vtable and
then do a dynamic call of it. For example:
```{rust,ignore}
```rust,ignore
struct FooVtable {
destructor: fn(*mut ()),
size: usize,
@@ -261,7 +261,7 @@ static Foo_for_String_vtable: FooVtable = FooVtable {
```
The `destructor` field in each vtable points to a function that will clean up
any resources of the vtable's type, for `u8` it is trivial, but for `String` it
any resources of the vtables type, for `u8` it is trivial, but for `String` it
will free the memory. This is necessary for owning trait objects like
`Box<Foo>`, which need to clean-up both the `Box` allocation as well as the
internal type when they go out of scope. The `size` and `align` fields store
@@ -270,11 +270,11 @@ essentially unused at the moment since the information is embedded in the
destructor, but will be used in the future, as trait objects are progressively
made more flexible.
Suppose we've got some values that implement `Foo`, then the explicit form of
Suppose weve got some values that implement `Foo`, then the explicit form of
construction and use of `Foo` trait objects might look a bit like (ignoring the
type mismatches: they're all just pointers anyway):
type mismatches: theyre all just pointers anyway):
```{rust,ignore}
```rust,ignore
let a: String = "foo".to_string();
let x: u8 = 1;
+99 -112
View File
@@ -1,10 +1,9 @@
% Traits
Do you remember the `impl` keyword, used to call a function with method
syntax?
Do you remember the `impl` keyword, used to call a function with [method
syntax][methodsyntax]?
```{rust}
# #![feature(core)]
```rust
struct Circle {
x: f64,
y: f64,
@@ -18,11 +17,12 @@ impl Circle {
}
```
[methodsyntax]: method-syntax.html
Traits are similar, except that we define a trait with just the method
signature, then implement the trait for that struct. Like this:
```{rust}
# #![feature(core)]
```rust
struct Circle {
x: f64,
y: f64,
@@ -41,20 +41,13 @@ impl HasArea for Circle {
```
As you can see, the `trait` block looks very similar to the `impl` block,
but we don't define a body, just a type signature. When we `impl` a trait,
but we dont define a body, just a type signature. When we `impl` a trait,
we use `impl Trait for Item`, rather than just `impl Item`.
So what's the big deal? Remember the error we were getting with our generic
`inverse` function?
```text
error: binary operation `==` cannot be applied to type `T`
```
We can use traits to constrain our generics. Consider this function, which
does not compile, and gives us a similar error:
```{rust,ignore}
```rust,ignore
fn print_area<T>(shape: T) {
println!("This shape has an area of {}", shape.area());
}
@@ -66,11 +59,11 @@ Rust complains:
error: type `T` does not implement any method in scope named `area`
```
Because `T` can be any type, we can't be sure that it implements the `area`
method. But we can add a *trait constraint* to our generic `T`, ensuring
Because `T` can be any type, we cant be sure that it implements the `area`
method. But we can add a trait constraint to our generic `T`, ensuring
that it does:
```{rust}
```rust
# trait HasArea {
# fn area(&self) -> f64;
# }
@@ -83,10 +76,9 @@ The syntax `<T: HasArea>` means `any type that implements the HasArea trait`.
Because traits define function type signatures, we can be sure that any type
which implements `HasArea` will have an `.area()` method.
Here's an extended example of how this works:
Heres an extended example of how this works:
```{rust}
# #![feature(core)]
```rust
trait HasArea {
fn area(&self) -> f64;
}
@@ -144,10 +136,10 @@ This shape has an area of 3.141593
This shape has an area of 1
```
As you can see, `print_area` is now generic, but also ensures that we
have passed in the correct types. If we pass in an incorrect type:
As you can see, `print_area` is now generic, but also ensures that we have
passed in the correct types. If we pass in an incorrect type:
```{rust,ignore}
```rust,ignore
print_area(5);
```
@@ -157,11 +149,11 @@ We get a compile-time error:
error: failed to find an implementation of trait main::HasArea for int
```
So far, we've only added trait implementations to structs, but you can
implement a trait for any type. So technically, we _could_ implement
`HasArea` for `i32`:
So far, weve only added trait implementations to structs, but you can
implement a trait for any type. So technically, we _could_ implement `HasArea`
for `i32`:
```{rust}
```rust
trait HasArea {
fn area(&self) -> f64;
}
@@ -181,102 +173,57 @@ It is considered poor style to implement methods on such primitive types, even
though it is possible.
This may seem like the Wild West, but there are two other restrictions around
implementing traits that prevent this from getting out of hand. First, traits
must be `use`d in any scope where you wish to use the trait's method. So for
example, this does not work:
implementing traits that prevent this from getting out of hand. The first is
that if the trait isnt defined in your scope, it doesnt apply. Heres an
example: the standard library provides a [`Write`][write] trait which adds
extra functionality to `File`s, for doing file I/O. By default, a `File`
wont have its methods:
```{rust,ignore}
mod shapes {
use std::f64::consts;
[write]: ../std/io/trait.Write.html
trait HasArea {
fn area(&self) -> f64;
}
struct Circle {
x: f64,
y: f64,
radius: f64,
}
impl HasArea for Circle {
fn area(&self) -> f64 {
consts::PI * (self.radius * self.radius)
}
}
}
fn main() {
let c = shapes::Circle {
x: 0.0f64,
y: 0.0f64,
radius: 1.0f64,
};
println!("{}", c.area());
}
```rust,ignore
let mut f = std::fs::File::open("foo.txt").ok().expect("Couldnt open foo.txt");
let result = f.write("whatever".as_bytes());
# result.unwrap(); // ignore the error
```
Now that we've moved the structs and traits into their own module, we get an
error:
Heres the error:
```text
error: type `shapes::Circle` does not implement any method in scope named `area`
error: type `std::fs::File` does not implement any method in scope named `write`
let result = f.write(b"whatever");
^~~~~~~~~~~~~~~~~~
```
If we add a `use` line right above `main` and make the right things public,
everything is fine:
We need to `use` the `Write` trait first:
```{rust}
# #![feature(core)]
mod shapes {
use std::f64::consts;
```rust,ignore
use std::io::Write;
pub trait HasArea {
fn area(&self) -> f64;
}
pub struct Circle {
pub x: f64,
pub y: f64,
pub radius: f64,
}
impl HasArea for Circle {
fn area(&self) -> f64 {
consts::PI * (self.radius * self.radius)
}
}
}
use shapes::HasArea;
fn main() {
let c = shapes::Circle {
x: 0.0f64,
y: 0.0f64,
radius: 1.0f64,
};
println!("{}", c.area());
}
let mut f = std::fs::File::open("foo.txt").ok().expect("Couldnt open foo.txt");
let result = f.write("whatever".as_bytes());
# result.unwrap(); // ignore the error
```
This will compile without error.
This means that even if someone does something bad like add methods to `int`,
it won't affect you, unless you `use` that trait.
it wont affect you, unless you `use` that trait.
There's one more restriction on implementing traits. Either the trait or the
type you're writing the `impl` for must be inside your crate. So, we could
implement the `HasArea` type for `i32`, because `HasArea` is in our crate. But
Theres one more restriction on implementing traits. Either the trait or the
type youre writing the `impl` for must be defined by you. So, we could
implement the `HasArea` type for `i32`, because `HasArea` is in our code. But
if we tried to implement `Float`, a trait provided by Rust, for `i32`, we could
not, because both the trait and the type aren't in our crate.
not, because neither the trait nor the type are in our code.
One last thing about traits: generic functions with a trait bound use
*monomorphization* (*mono*: one, *morph*: form), so they are statically
dispatched. What's that mean? Check out the chapter on [trait
objects](trait-objects.html) for more.
monomorphization (mono: one, morph: form), so they are statically dispatched.
Whats that mean? Check out the chapter on [trait objects][to] for more details.
## Multiple trait bounds
[to]: trait-objects.html
# Multiple trait bounds
Youve seen that you can bound a generic type parameter with a trait:
@@ -299,10 +246,10 @@ fn foo<T: Clone + Debug>(x: T) {
`T` now needs to be both `Clone` as well as `Debug`.
## Where clause
# Where clause
Writing functions with only a few generic types and a small number of trait
bounds isn't too bad, but as the number increases, the syntax gets increasingly
bounds isnt too bad, but as the number increases, the syntax gets increasingly
awkward:
```
@@ -318,7 +265,7 @@ fn foo<T: Clone, K: Clone + Debug>(x: T, y: K) {
The name of the function is on the far left, and the parameter list is on the
far right. The bounds are getting in the way.
Rust has a solution, and it's called a '`where` clause':
Rust has a solution, and its called a `where` clause:
```
use std::fmt::Debug;
@@ -391,7 +338,7 @@ plain type parameter (like `T`).
## Default methods
There's one last feature of traits we should cover: default methods. It's
Theres one last feature of traits we should cover: default methods. Its
easiest just to show an example:
```rust
@@ -402,8 +349,8 @@ trait Foo {
}
```
Implementors of the `Foo` trait need to implement `bar()`, but they don't
need to implement `baz()`. They'll get this default behavior. They can
Implementors of the `Foo` trait need to implement `bar()`, but they dont
need to implement `baz()`. Theyll get this default behavior. They can
override the default if they so choose:
```rust
@@ -431,3 +378,43 @@ default.baz(); // prints "We called baz."
let over = OverrideDefault;
over.baz(); // prints "Override baz!"
```
# Inheritance
Sometimes, implementing a trait requires implementing another trait:
```rust
trait Foo {
fn foo(&self);
}
trait FooBar : Foo {
fn foobar(&self);
}
```
Implementors of `FooBar` must also implement `Foo`, like this:
```rust
# trait Foo {
# fn foo(&self);
# }
# trait FooBar : Foo {
# fn foobar(&self);
# }
struct Baz;
impl Foo for Baz {
fn foo(&self) { println!("foo"); }
}
impl FooBar for Baz {
fn foobar(&self) { println!("foobar"); }
}
```
If we forget to implement `Foo`, Rust will tell us:
```text
error: the trait `main::Foo` is not implemented for the type `main::Baz` [E0277]
```
+74 -1
View File
@@ -1,3 +1,76 @@
% `type` Aliases
Coming soon
The `type` keyword lets you declare an alias of another type:
```rust
type Name = String;
```
You can then use this type as if it were a real type:
```rust
type Name = String;
let x: Name = "Hello".to_string();
```
Note, however, that this is an _alias_, not a new type entirely. In other
words, because Rust is strongly typed, youd expect a comparison between two
different types to fail:
```rust,ignore
let x: i32 = 5;
let y: i64 = 5;
if x == y {
// ...
}
```
this gives
```text
error: mismatched types:
expected `i32`,
found `i64`
(expected i32,
found i64) [E0308]
if x == y {
^
```
But, if we had an alias:
```rust
type Num = i32;
let x: i32 = 5;
let y: Num = 5;
if x == y {
// ...
}
```
This compiles without error. Values of a `Num` type are the same as a value of
type `i32`, in every way.
You can also use type aliases with generics:
```rust
use std::result;
enum ConcreteError {
Foo,
Bar,
}
type Result<T> = result::Result<T, ConcreteError>;
```
This creates a specialized version of the `Result` type, which always has a
`ConcreteError` for the `E` part of `Result<T, E>`. This is commonly used
in the standard library to create custom errors for each subsection. For
example, [io::Result][ioresult].
[ioresult]: ../std/io/type.Result.html
+125 -1
View File
@@ -1,3 +1,127 @@
% Universal Function Call Syntax
Coming soon
Sometimes, functions can have the same names. Consider this code:
```rust
trait Foo {
fn f(&self);
}
trait Bar {
fn f(&self);
}
struct Baz;
impl Foo for Baz {
fn f(&self) { println!("Bazs impl of Foo"); }
}
impl Bar for Baz {
fn f(&self) { println!("Bazs impl of Bar"); }
}
let b = Baz;
```
If we were to try to call `b.f()`, wed get an error:
```text
error: multiple applicable methods in scope [E0034]
b.f();
^~~
note: candidate #1 is defined in an impl of the trait `main::Foo` for the type
`main::Baz`
fn f(&self) { println!("Bazs impl of Foo"); }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: candidate #2 is defined in an impl of the trait `main::Bar` for the type
`main::Baz`
fn f(&self) { println!("Bazs impl of Bar"); }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
We need a way to disambiguate which method we need. This feature is called
universal function call syntax, and it looks like this:
```rust
# trait Foo {
# fn f(&self);
# }
# trait Bar {
# fn f(&self);
# }
# struct Baz;
# impl Foo for Baz {
# fn f(&self) { println!("Bazs impl of Foo"); }
# }
# impl Bar for Baz {
# fn f(&self) { println!("Bazs impl of Bar"); }
# }
# let b = Baz;
Foo::f(&b);
Bar::f(&b);
```
Lets break it down.
```rust,ignore
Foo::
Bar::
```
These halves of the invocation are the types of the two traits: `Foo` and
`Bar`. This is what ends up actually doing the disambiguation between the two:
Rust calls the one from the trait name you use.
```rust,ignore
f(&b)
```
When we call a method like `b.f()` using [method syntax][methodsyntax], Rust
will automatically borrow `b` if `f()` takes `&self`. In this case, Rust will
not, and so we need to pass an explicit `&b`.
[methodsyntax]: method-syntax.html
# Angle-bracket Form
The form of UFCS we just talked about:
```rust,ignore
Trait::method(args);
```
Is a short-hand. Theres an expanded form of this thats needed in some
situations:
```rust,ignore
<Type as Trait>::method(args);
```
The `<>::` syntax is a means of providing a type hint. The type goes inside
the `<>`s. In this case, the type is `Type as Trait`, indicating that we want
`Trait`s version of `method` to be called here. The `as Trait` part is
optional if its not ambiguous. Same with the angle brackets, hence the
shorter form.
Heres an example of using the longer form.
```rust
trait Foo {
fn clone(&self);
}
#[derive(Clone)]
struct Bar;
impl Foo for Bar {
fn clone(&self) {
println!("Making a clone of Bar");
<Bar as Clone>::clone(self);
}
}
```
This will call the `Clone` traits `clone()` method, rather than `Foo`s.
+128
View File
@@ -0,0 +1,128 @@
% Unsafe
Rusts main draw is its powerful static guarantees about behavior. But safety
checks are conservative by nature: there are some programs that are actually
safe, but the compiler is not able to verify this is true. To write these kinds
of programs, we need to tell the compiler to relax its restrictions a bit. For
this, Rust has a keyword, `unsafe`. Code using `unsafe` has less restrictions
than normal code does.
Lets go over the syntax, and then well talk semantics. `unsafe` is used in
two contexts. The first one is to mark a function as unsafe:
```rust
unsafe fn danger_will_robinson() {
// scary stuff
}
```
All functions called from [FFI][ffi] must be marked as `unsafe`, for example.
The second use of `unsafe` is an unsafe block:
[ffi]: ffi.html
```rust
unsafe {
// scary stuff
}
```
Its important to be able to explicitly delineate code that may have bugs that
cause big problems. If a Rust program segfaults, you can be sure its somewhere
in the sections marked `unsafe`.
# What does safe mean?
Safe, in the context of Rust, means “doesnt do anything unsafe.” Easy!
Okay, lets try again: what is not safe to do? Heres a list:
* Data races
* Dereferencing a null/dangling raw pointer
* Reads of [undef][undef] (uninitialized) memory
* Breaking the [pointer aliasing rules][aliasing] with raw pointers.
* `&mut T` and `&T` follow LLVMs scoped [noalias][noalias] model, except if
the `&T` contains an `UnsafeCell<U>`. Unsafe code must not violate these
aliasing guarantees.
* Mutating an immutable value/reference without `UnsafeCell<U>`
* Invoking undefined behavior via compiler intrinsics:
* Indexing outside of the bounds of an object with `std::ptr::offset`
(`offset` intrinsic), with
the exception of one byte past the end which is permitted.
* Using `std::ptr::copy_nonoverlapping_memory` (`memcpy32`/`memcpy64`
intrinsics) on overlapping buffers
* Invalid values in primitive types, even in private fields/locals:
* Null/dangling references or boxes
* A value other than `false` (0) or `true` (1) in a `bool`
* A discriminant in an `enum` not included in its type definition
* A value in a `char` which is a surrogate or above `char::MAX`
* Non-UTF-8 byte sequences in a `str`
* Unwinding into Rust from foreign code or unwinding from Rust into foreign
code.
[noalias]: http://llvm.org/docs/LangRef.html#noalias
[undef]: http://llvm.org/docs/LangRef.html#undefined-values
[aliasing]: http://llvm.org/docs/LangRef.html#pointer-aliasing-rules
Whew! Thats a bunch of stuff. Its also important to notice all kinds of
behaviors that are certainly bad, but are expressly _not_ unsafe:
* Deadlocks
* Reading data from private fields
* Leaks due to reference count cycles
* Exiting without calling destructors
* Sending signals
* Accessing/modifying the file system
* Integer overflow
Rust cannot prevent all kinds of software problems. Buggy code can and will be
written in Rust. These things arent great, but they dont qualify as `unsafe`
specifically.
# Unsafe Superpowers
In both unsafe functions and unsafe blocks, Rust will let you do three things
that you normally can not do. Just three. Here they are:
1. Access or update a [static mutable variable][static].
2. Dereference a raw pointer.
3. Call unsafe functions. This is the most powerful ability.
Thats it. Its important that `unsafe` does not, for example, turn off the
borrow checker. Adding `unsafe` to some random Rust code doesnt change its
semantics, it wont just start accepting anything.
But it will let you write things that _do_ break some of the rules. Lets go
over these three abilities in order.
## Access or update a `static mut`
Rust has a feature called `static mut` which allows for mutable global state.
Doing so can cause a data race, and as such is inherently not safe. For more
details, see the [static][static] section of the book.
[static]: const-and-static.html#static
## Dereference a raw pointer
Raw pointers let you do arbitrary pointer arithmetic, and can cause a number of
different memory safety and security issues. In some senses, the ability to
dereference an arbitrary pointer is one of the most dangerous things you can
do. For more on raw pointers, see [their section of the book][rawpointers].
[rawpointers]: raw-pointers.html
## Call unsafe functions
This last ability works with both aspects of `unsafe`: you can only call
functions marked `unsafe` from inside an unsafe block.
This ability is powerful and varied. Rust exposes some [compiler
intrinsics][intrinsics] as unsafe functions, and some unsafe functions bypass
safety checks, trading safety for speed.
Ill repeat again: even though you _can_ do arbitrary things in unsafe blocks
and functions doesnt mean you should. The compiler will act as though youre
upholding its invariants, so be careful!
[intrinsics]: intrinsics.html
+58 -1
View File
@@ -1,3 +1,60 @@
% Unsized Types
Coming Soon!
Most types have a particular size, in bytes, that is knowable at compile time.
For example, an `i32` is thirty-two bits big, or four bytes. However, there are
some types which are useful to express, but do not have a defined size. These are
called unsized or dynamically sized types. One example is `[T]`. This type
represents a certain number of `T` in sequence. But we dont know how many
there are, so the size is not known.
Rust understands a few of these types, but they have some restrictions. There
are three:
1. We can only manipulate an instance of an unsized type via a pointer. An
`&[T]` works just fine, but a `[T]` does not.
2. Variables and arguments cannot have dynamically sized types.
3. Only the last field in a `struct` may have a dynamically sized type; the
other fields must not. Enum variants must not have dynamically sized types as
data.
So why bother? Well, because `[T]` can only be used behind a pointer, if we
didnt have language support for unsized types, it would be impossible to write
this:
```rust,ignore
impl Foo for str {
```
or
```rust,ignore
impl<T> Foo for [T] {
```
Instead, you would have to write:
```rust,ignore
impl Foo for &str {
```
Meaning, this implementation would only work for [references][ref], and not
other types of pointers. With the `impl for str`, all pointers, including (at
some point, there are some bugs to fix first) user-defined custom smart
pointers, can use this `impl`.
[ref]: references-and-borrowing.html
# ?Sized
If you want to write a function that accepts a dynamically sized type, you
can use the special bound, `?Sized`:
```rust
struct Foo<T: ?Sized> {
f: T,
}
```
This `?`, read as “T may be `Sized`”, means that this bound is special: it
lets us match more kinds, not less. Its almost like every `T` implicitly has
`T: Sized`, and the `?` undoes this default.
+1 -1
View File
@@ -1,6 +1,6 @@
% Variable Bindings
Vitually every non-Hello World Rust program uses *variable bindings*. They
Virtually every non-'Hello World Rust program uses *variable bindings*. They
look like this:
```rust
+43 -15
View File
@@ -1,32 +1,60 @@
% Vectors
A *vector* is a dynamic or "growable" array, implemented as the standard
library type [`Vec<T>`](../std/vec/) (Where `<T>` is a [Generic](./generics.md) statement). Vectors always allocate their data on the heap. Vectors are to slices
what `String` is to `&str`. You can create them with the `vec!` macro:
A vector is a dynamic or growable array, implemented as the standard
library type [`Vec<T>`][vec]. The `T` means that we can have vectors
of any type (see the chapter on [generics][generic] for more).
Vectors always allocate their data on the heap.
You can create them with the `vec!` macro:
```{rust}
let v = vec![1, 2, 3]; // v: Vec<i32>
```rust
let v = vec![1, 2, 3, 4, 5]; // v: Vec<i32>
```
(Notice that unlike the `println!` macro we've used in the past, we use square
brackets `[]` with `vec!`. Rust allows you to use either in either situation,
(Notice that unlike the `println!` macro weve used in the past, we use square
brackets `[]` with `vec!` macro. Rust allows you to use either in either situation,
this is just convention.)
There's an alternate form of `vec!` for repeating an initial value:
Theres an alternate form of `vec!` for repeating an initial value:
```
let v = vec![0; 10]; // ten zeroes
```
You can get the length of, iterate over, and subscript vectors just like
arrays. In addition, (mutable) vectors can grow automatically:
## Accessing elements
```{rust}
let mut nums = vec![1, 2, 3]; // mut nums: Vec<i32>
To get the value at a particular index in the vector, we use `[]`s:
nums.push(4);
```rust
let v = vec![1, 2, 3, 4, 5];
println!("The length of nums is now {}", nums.len()); // Prints 4
println!("The third element of v is {}", v[2]);
```
Vectors have many more useful methods.
The indices count from `0`, so the third element is `v[2]`.
## Iterating
Once you have a vector, you can iterate through its elements with `for`. There
are three versions:
```rust
let mut v = vec![1, 2, 3, 4, 5];
for i in &v {
println!("A reference to {}", i);
}
for i in &mut v {
println!("A mutable reference to {}", i);
}
for i in v {
println!("Take ownership of the vector and its element {}", i);
}
```
Vectors have many more useful methods, which you can read about in [their
API documentation][vec].
[vec]: ../std/vec/index.html
[generic]: generics.html
+1 -1
View File
@@ -1,4 +1,4 @@
% while loops
% while Loops
Rust also has a `while` loop. It looks like this: