rust-by-practice/en/src/compound-types/string.md

6.9 KiB
Raw Blame History

string

The type of string literal "hello, world" is &str, e.g let s: &str = "hello, world".

str and &str

  1. 🌟 We can't use str type in normal ways, but we can use &str

// fix error without adding new line
fn main() {
   let s: str = "hello, world";

   println!("Success!")
}
  1. 🌟🌟 We can only use str by boxed it, & can be used to convert Box<str> to &str

// fix the error with at least two solutions
fn main() {
   let s: Box<str> =  "hello, world".into();
   greetings(s)
}

fn greetings(s: &str) {
    println!("{}",s)
}

String

String type is defined in std and stored as a vector of bytes (Vec), but guaranteed to always be a valid UTF-8 sequence. String is heap allocated, growable and not null terminated.

  1. 🌟

// fill the blank
fn main() {
   let mut s = __;
   s.push_str("hello, world");
   s.push('!');

   assert_eq!(s, "hello, world!");

   println!("Success!")
}
  1. 🌟🌟🌟

// fix all errors without adding newline
fn main() {
   let  s =  String::from("hello");
    s.push(',');
    s.push(" world");
    s += "!".to_string();

    println!("{}", s)
}
  1. 🌟🌟 replace can be used to replace substring

// fill the blank
fn main() {
   let s = String::from("I like dogs");
   // Allocate new memory and store the modified string there
   let s1 = s.__("dogs", "cats");

   assert_eq!(s1, "I like cats");

   println!("Success!")
}

More String methods can be found under String module.

  1. 🌟🌟 You can only concat a String with &str, and String's ownership can be moved to another variable

// fix errors without removing any line
fn main() {
    let s1 = String::from("hello,");
    let s2 = String::from("world!");
    let s3 = s1 + s2; 
    assert_eq!(s3,"hello,world!");
    println!("{}",s1);
}

&str and String

Opsite to the seldom using of str, &str and String are used everywhere!

  1. 🌟🌟 &str can be converted to String in two ways

// fix error with at lest two solutions
fn main() {
   let s =  "hello, world";
   greetings(s)
}

fn greetings(s: String) {
    println!("{}",s)
}
  1. 🌟🌟 We can use String::from or to_string to convert a &str to String

// use two approaches to fix the error and without adding a new line
fn main() {
   let s =  "hello, world".to_string();
   let s1: &str = s;

   println!("Success!")
}

string escapes

  1. 🌟
fn main() {
    // You can use escapes to write bytes by their hexadecimal values
    // fill the blank below to show "I'm writing Rust"
    let byte_escape = "I'm writing Ru\x73__!";
    println!("What are you doing\x3F (\\x3F means ?) {}", byte_escape);

    // ...or Unicode code points.
    let unicode_codepoint = "\u{211D}";
    let character_name = "\"DOUBLE-STRUCK CAPITAL R\"";

    println!("Unicode character {} (U+211D) is called {}",
                unicode_codepoint, character_name );

   let long_string = "String literals
                        can span multiple lines.
                        The linebreak and indentation here \
                         can be escaped too!";
    println!("{}", long_string);
}
  1. 🌟🌟🌟 Sometimes there are just too many characters that need to be escaped or it's just much more convenient to write a string out as-is. This is where raw string literals come into play.

fn main() {
    let raw_str = r"Escapes don't work here: \x3F \u{211D}";
    // modify below line to make it work
    assert_eq!(raw_str, "Escapes don't work here: ? ℝ");

    // If you need quotes in a raw string, add a pair of #s
    let quotes = r#"And then I said: "There is no escape!""#;
    println!("{}", quotes);

    // If you need "# in your string, just use more #s in the delimiter.
    // You can use up to 65535 #s.
    let  delimiter = r###"A string with "# in it. And even "##!"###;
    println!("{}", delimiter);

    // fill the blank
    let long_delimiter = __;
    assert_eq!(long_delimiter, "Hello, \"##\"");

    println!("Success!")
}

byte string

Want a string that's not UTF-8? (Remember, str and String must be valid UTF-8). Or maybe you want an array of bytes that's mostly text? Byte strings to the rescue!

Example:

use std::str;

fn main() {
    // Note that this is not actually a `&str`
    let bytestring: &[u8; 21] = b"this is a byte string";

    // Byte arrays don't have the `Display` trait, so printing them is a bit limited
    println!("A byte string: {:?}", bytestring);

    // Byte strings can have byte escapes...
    let escaped = b"\x52\x75\x73\x74 as bytes";
    // ...but no unicode escapes
    // let escaped = b"\u{211D} is not allowed";
    println!("Some escaped bytes: {:?}", escaped);


    // Raw byte strings work just like raw strings
    let raw_bytestring = br"\u{211D} is not escaped here";
    println!("{:?}", raw_bytestring);

    // Converting a byte array to `str` can fail
    if let Ok(my_str) = str::from_utf8(raw_bytestring) {
        println!("And the same as text: '{}'", my_str);
    }

    let _quotes = br#"You can also use "fancier" formatting, \
                    like with normal raw strings"#;

    // Byte strings don't have to be UTF-8
    let shift_jis = b"\x82\xe6\x82\xa8\x82\xb1\x82\xbb"; // "γ‚ˆγ†γ“γ" in SHIFT-JIS

    // But then they can't always be converted to `str`
    match str::from_utf8(shift_jis) {
        Ok(my_str) => println!("Conversion successful: '{}'", my_str),
        Err(e) => println!("Conversion failed: {:?}", e),
    };
}

A more detailed listing of the ways to write string literals and escape characters is given in the 'Tokens' chapter of the Rust Reference.

string index

  1. 🌟🌟🌟 You can't use index to access a char in a string, but you can use slice &s1[start..end].

fn main() {
    let s1 = String::from("hi,δΈ­ε›½");
    let h = s1[0]; //modify this line to fix the error, tips: `h` only takes 1 byte in UTF8 format
    assert_eq!(h, "h");

    let h1 = &s1[3..5];//modify this line to fix the error, tips: `δΈ­`  takes 3 bytes in UTF8 format
    assert_eq!(h1, "δΈ­");

    println!("Success!")
}

operate on UTF8 string

  1. 🌟

fn main() {
    // fill the blank to print each char in "δ½ ε₯½οΌŒδΈ–η•Œ"
    for c in "δ½ ε₯½οΌŒδΈ–η•Œ".__ {
        println!("{}", c)
    }
}

utf8_slice

You can use utf8_slice to slice UTF8 string, it can index chars instead of bytes.

Example

use utf8_slice;
fn main() {
   let s = "The πŸš€ goes to the πŸŒ‘!";

   let rocket = utf8_slice::slice(s, 4, 5);
   // Will equal "πŸš€"
}

You can find the solutions here(under the solutions path), but only use it when you need it