Translated by AI
Rust: Why str cannot be used as a function argument or return type
Background
In The Rust Programming Language, 4.3 The Slice Type, there is sample code that defines a function with a signature like this:
// A function that takes a string and returns the first word in that string
fn first_word(s: &String) -> &str
In this case, the return type is defined not as str, but as &str, which is a reference to str. When returning a usize, we don't use &usize but simply return usize as is, and it can indeed be implemented that way.
This raised a question for me: "Since usize can be returned and passed as an argument directly without a reference, can str be passed or returned directly instead of &str?" When I tried returning str directly, it resulted in a compilation error with the following message.
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/main.rs:32:24
|
32 | fn test(s: &String) -> str {
| ^^^ doesn't have a size known at compile-time
|
= help: the trait `Sized` is not implemented for `str`
= note: the return type of a function must have a statically known size
Trying to use str as an argument also results in a similar error.
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/main.rs:32:9
|
32 | fn test(s: str) -> String {
| ^ doesn't have a size known at compile-time
|
= help: the trait `Sized` is not implemented for `str`
= help: unsized locals are gated as an unstable feature
help: function arguments must have a statically known size, borrowed types always have a known size
Questions
Why can't str be used directly as a function argument or return value?
Using &str allows it to be used for both arguments and return values, and it seems most other samples use &str as well. So, what exactly does this &str refer to, and why does using a reference make it usable for both?
The error message "the size for values of type str cannot be known at compilation time" appears in both cases. What does "size" mean here, and how does it relate to compilation?
Conclusion
Function arguments and return values are managed on the stack, but because of the nature of the stack, only fixed-size data can be placed on it.
On the other hand, str is a data type of arbitrary length, so it cannot be used directly as a function argument or return value.
Instead of str, using a reference to str (&str) makes it a fixed-size data type, allowing it to be used as an argument or return value.
The error message "cannot be known at compilation time" appears because Rust detects these memory management issues in advance.
Investigation
First, I searched for the error message "the size for values of type str cannot be known at compilation time" and found the following question and answer on StackOverflow.
What it means is harder to explain succinctly. Rust has a number of types that are unsized. The most prevalent ones are str and [T]. Contrast these types to how you normally see them used: &str or &[T]. You might even see them as Box<str> or Arc<[T]>. The commonality is that they are always used behind a reference of some kind.
Because these types don't have a size, they cannot be stored in a variable on the stack — the compiler wouldn't know how much stack space to reserve for them! That's the essence of the error message.
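To make the quoted point concrete, here is a minimal sketch of `str` and `[T]` used behind different kinds of pointers. The helper `byte_len` is a hypothetical name introduced only for this illustration.

```rust
use std::sync::Arc;

/// Hypothetical helper: accepts any string slice and returns its length in bytes.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // Borrowed: a reference to string data stored elsewhere (here, a string literal).
    let borrowed: &str = "hello";
    // Owned: the string data lives on the heap, the Box sits on the stack.
    let boxed: Box<str> = Box::from("hello");
    // Shared ownership: reference-counted string data on the heap.
    let shared: Arc<str> = Arc::from("hello");
    // The same pattern works for slices.
    let numbers: Arc<[i32]> = Arc::from(vec![1, 2, 3]);

    assert_eq!(byte_len(borrowed), 5);
    assert_eq!(byte_len(&boxed), 5); // &Box<str> dereferences to &str
    assert_eq!(byte_len(&shared), 5); // &Arc<str> dereferences to &str
    assert_eq!(numbers.len(), 3);
}
```

In every case the variable on the stack is a pointer-like value of known size, and the `str` or `[T]` it refers to is stored somewhere else.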
First, regarding the error message itself, "the size for values of type str cannot be known at compilation time":
This can be resolved by using &'static str instead of str. In Rust, pointers are not created automatically, so str represents the actual string data of arbitrary length. Returning this from a function would mean, for example, that for a 1,000-character string, you would be copying 1,000 characters of data to return to the caller. Currently, Rust does not allow operations that copy arbitrary-length data to the stack. By using a pointer via &str, the size becomes fixed at 16 bytes (8-byte pointer size + 8-byte str size), so the error no longer occurs.
You can basically remember that "you don't use the str type in its raw form."
Based on these answers, there seem to be two key points:
- str is an arbitrary-length data type and does not have a specific size[1]
- Variables that store arbitrary-length data cannot be placed on the stack[2]
With this information, we can explain it as follows.
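First, str represents the string data of arbitrary length itself. That is, it is the string itself as a "list of characters" (more precisely, a sequence of UTF-8 encoded bytes, where each ASCII character such as those in "Hello" occupies one byte), represented by the following structure: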
[0] -> 'H'
[1] -> 'e'
[2] -> 'l'
[3] -> 'l'
[4] -> 'o'
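As a rough check of this picture, the sketch below measures the size of the string data itself (the `str`, not the reference to it). The helper name `str_data_size` is an assumption made for this example; `std::mem::size_of_val` is the standard library function that reports the size of a value behind a reference at run time.

```rust
use std::mem::size_of_val;

/// Hypothetical helper: size in bytes of the string data a `&str` points to.
fn str_data_size(s: &str) -> usize {
    size_of_val(s)
}

fn main() {
    // "Hello" is stored as five UTF-8 bytes, one per ASCII character.
    assert_eq!("Hello".as_bytes(), b"Hello");
    assert_eq!(str_data_size("Hello"), 5);

    // A different string has a different size, so `str` has no single size
    // that the compiler could know in advance.
    assert_eq!(str_data_size("Hello, world!"), 13);

    // Non-ASCII characters take more than one byte each.
    let greeting = "こんにちは";
    assert_eq!(greeting.chars().count(), 5);
    assert_eq!(str_data_size(greeting), 15);
}
```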
Values passed as function arguments are placed on the stack[3]. At this time, as mentioned earlier[1], all data placed on the stack must be fixed-length, that is, they must have a known, fixed size.
On the other hand, since str is an arbitrary-length data type and not fixed-length, a variable of type str cannot be placed on the stack. The string data has to live somewhere else (for example on the heap, or in the program binary in the case of string literals) and be accessed through a pointer. Therefore, str cannot be used as a data type for an argument. Rust detects these memory issues in advance and issues a compilation error.
Function return values are also placed on the stack[4]. Thus, for the same reason as with arguments, return values must also have a known, fixed size. Consequently, str, being an arbitrary-length data type, cannot be used as a return type either, leading to a compilation error.
Based on the above, str cannot be handled as a function argument or return value. In reality, however, there are cases where we want to use strings—or more accurately, string slices—as arguments or return values, and the Rust documentation cited at the beginning provides exactly such examples.
Therefore, by using the reference &str instead of str, we can use strings (slices) as arguments and return values. This works because the size of a reference is known in advance: a &str consists of a pointer to the string data plus the length of that data, which comes to 16 bytes on a 64-bit target no matter how long the string is. In other words, it is fixed-length[5].
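The fixed size of `&str` can be checked directly. This is a minimal sketch that relies on the layout used by current Rust compilers, where a `&str` is stored as a pointer plus a length; the helper `str_ref_size` is a name made up for this example.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: size in bytes of a `&str` value itself
/// (the reference, not the text it points to).
fn str_ref_size() -> usize {
    size_of::<&str>()
}

fn main() {
    // A `&str` holds an address and a length, each one `usize` wide.
    assert_eq!(str_ref_size(), 2 * size_of::<usize>());

    // A reference to a sized type needs no length, so it is one pointer wide.
    assert_eq!(size_of::<&usize>(), size_of::<usize>());
    assert_eq!(size_of::<&String>(), size_of::<usize>());

    // The size of the reference does not depend on the length of the text.
    let short: &str = "Hi";
    let long: &str = "a much longer string slice";
    assert_eq!(size_of_val(&short), size_of_val(&long));

    println!("size_of::<&str>() = {} bytes", str_ref_size());
}
```

On a 64-bit target this prints 16 bytes, matching the "8-byte pointer + 8-byte length" breakdown.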
Since reference types are fixed-length data types, variables of reference types can be placed on the stack, and thus they can be used as function arguments and return values. The same holds for other unsized types such as the slice [T], which is used as &[T]. Note that String is a different case: a String value is itself fixed-size (a pointer, a length, and a capacity), with only its contents stored on the heap, so it can be passed or returned either by value or through a reference like &String.
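Putting this together, the following sketch shows signatures that do compile. `first_word` here is a simplified stand-in for the book's function (it splits on whitespace instead of scanning bytes), and `first_two` and `shout` are hypothetical helpers added only for illustration.

```rust
/// Simplified stand-in for the book's `first_word`: borrows the first
/// whitespace-delimited word from the input, or "" if there is none.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// Slices `[T]` are unsized as well, so they are passed and returned as `&[T]`.
fn first_two(items: &[i32]) -> &[i32] {
    &items[..items.len().min(2)]
}

/// `String` is a sized type, so it can be passed and returned by value.
fn shout(s: String) -> String {
    s.to_uppercase()
}

fn main() {
    let text = String::from("hello world");
    // A `&String` is automatically coerced to `&str` at the call site.
    assert_eq!(first_word(&text), "hello");
    assert_eq!(first_two(&[1, 2, 3]), &[1, 2]);
    assert_eq!(shout(text), "HELLO WORLD");
    println!("ok");
}
```

Taking `&str` rather than `&String` as the parameter type is a design choice: it lets the same function accept string literals, `String` values, and other string slices.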
Scraps before organizing into an article
[1] https://doc.rust-jp.rs/book-ja/ch08-02-strings.html "Rust has only one string type in the core language: the string slice str, which is usually seen in its borrowed form &str."
[2] https://doc.rust-jp.rs/book-ja/ch04-01-what-is-ownership.html "Another property that makes the stack fast is that all data on the stack must have a known, fixed size."
[3] https://doc.rust-jp.rs/book-ja/ch04-01-what-is-ownership.html "When your code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack."
[4] https://brain.cc.kogakuin.ac.jp/~kanamaru/lecture/MP/final/part06/node9.html "When the square function ends, the stack area becomes as shown in Figure 7(c), and the value of the return value $v0 is assigned to variable b, ending the program."
[5] https://ja.stackoverflow.com/questions/65708/rustのresultについての質問-errore0277-the-size-for-values-of-type-str-cannot-be-kno "By using &str to go through a pointer, the size becomes fixed at 16 bytes (8-byte pointer size + 8-byte str size), so the error no longer occurs."