Documentation

Init.Data.String.Defs

Preliminary developments for strings #

This file contains the material about strings which we can write down without the results in Init.Data.String.Decode, i.e., without knowing about the bijection between String and List Char given by UTF-8 decoding and encoding.

Note that this file, despite being called Defs, contains quite a few lemmas.

@[inline]

Decodes an array of bytes that encode a string as UTF-8 into the corresponding string.

Instances For
    @[extern lean_string_to_utf8]

    Encodes a string in UTF-8 as an array of bytes.

    Instances For
      @[extern lean_string_append]

      Appends two strings. Usually accessed via the ++ operator.

      The internal implementation will perform destructive updates if the string is not shared.

      Examples:

      • "abc".append "def" = "abcdef"
      • "abc" ++ "def" = "abcdef"
      • "" ++ "" = ""
      Instances For
        @[implicit_reducible]
        @[deprecated String.toByteArray_ofList (since := "2025-10-30")]
        @[deprecated String.exists_eq_ofList (since := "2025-10-30")]

        The start position of the string, as a String.Pos.Raw.

        Instances For
          @[simp]
          @[deprecated String.byteIdx_rawEndPos (since := "2025-10-20")]
          @[simp]
          theorem String.utf8ByteSize_ofByteArray {b : ByteArray} {h : b.IsValidUTF8} :
          { toByteArray := b, isValidUTF8 := h }.utf8ByteSize = b.size
          @[deprecated String.singleton_eq_ofList (since := "2025-10-30")]
          @[simp]
          theorem String.append_singleton {s : String} {c : Char} :
          s ++ singleton c = s.push c
          @[simp]
          theorem String.append_left_inj {s₁ s₂ : String} (t : String) :
          s₁ ++ t = s₂ ++ t s₁ = s₂
          @[simp]
          theorem String.append_right_inj (s : String) {t₁ t₂ : String} :
          s ++ t₁ = s ++ t₂ t₁ = t₂
          theorem String.append_assoc {s₁ s₂ s₃ : String} :
          s₁ ++ s₂ ++ s₃ = s₁ ++ (s₂ ++ s₃)
          @[deprecated String.rawEndPos_eq_zero_iff (since := "2025-10-20")]
          @[inline]
          def String.pushn (s : String) (c : Char) (n : Nat) :

          Adds multiple repetitions of a character to the end of a string.

          Returns s, with n repetitions of c at the end. Internally, the implementation repeatedly calls String.push, so the string is modified in-place if there is a unique reference to it.

          Examples:

          • "indeed".pushn '!' 2 = "indeed!!"
          • "indeed".pushn '!' 0 = "indeed"
          • "".pushn ' ' 4 = " "
          Instances For
            theorem String.pushn_eq_repeat_push {s : String} {c : Char} {n : Nat} :
            s.pushn c n = Nat.repeat (fun (s : String) => s.push c) n s
            @[export lean_string_pushn]
            Instances For
              @[inline]

              Checks whether a string is empty.

              Empty strings are equal to "" and have length and end position 0.

              Examples:

              Instances For
                @[export lean_string_isempty]
                Instances For
                  @[inline]

                  Appends all the strings in a list of strings, in order.

                  Use String.intercalate to place a separator string between the strings in a list.

                  Examples:

                  Instances For

                    Appends the strings in a list of strings, placing the separator s between each pair.

                    Examples:

                    Instances For
                      @[export lean_string_intercalate]
                      Instances For
                        structure String.Pos.Raw.IsValid (s : String) (off : Raw) :

                        Predicate for validity of positions inside a String.

                        There are multiple equivalent definitions for validity.

                        We say that a position is valid if the string obtained by taking all of the bytes up to, but excluding, the given position, is valid UTF-8; see Pos.isValid_iff_isValidUTF8_extract_zero.

                        Similarly, a position is valid if the string obtained by taking all of the bytes starting at the given position is valid UTF-8; see Pos.isValid_iff_isValidUTF8_extract_utf8ByteSize.

                        An equivalent condition is that the position is the length of the UTF-8 encoding of some prefix of the characters of the string; see Pos.isValid_iff_exists_append and Pos.isValid_iff_exists_take_data.

                        Another equivalent condition that can be checked efficiently is that the position is either the end position or strictly smaller than the end position and the byte at the position satisfies UInt8.IsUTF8FirstByte; see Pos.isValid_iff_isUTF8FirstByte.

                        Examples:

                        • String.Pos.IsValid "abc" ⟨0⟩
                        • String.Pos.IsValid "abc" ⟨1⟩
                        • String.Pos.IsValid "abc" ⟨3⟩
                        • ¬ String.Pos.IsValid "abc" ⟨4⟩
                        • String.Pos.IsValid "𝒫(A)" ⟨0⟩
                        • ¬ String.Pos.IsValid "𝒫(A)" ⟨1⟩
                        • ¬ String.Pos.IsValid "𝒫(A)" ⟨2⟩
                        • ¬ String.Pos.IsValid "𝒫(A)" ⟨3⟩
                        • String.Pos.IsValid "𝒫(A)" ⟨4⟩
                        Instances For
                          @[deprecated String.Pos.Raw.IsValid.le_rawEndPos (since := "2025-10-20")]
                          theorem String.Pos.Raw.IsValid.le_endPos {s : String} {off : Raw} (h : IsValid s off) :
                          @[simp]
                          structure String.Pos (s : String) :

                          A Pos s is a byte offset in s together with a proof that this position is at a UTF-8 character boundary.

                          Instances For
                            theorem String.Pos.ext_iff {s : String} {x y : s.Pos} :
                            x = y x.offset = y.offset
                            theorem String.Pos.ext {s : String} {x y : s.Pos} (offset : x.offset = y.offset) :
                            x = y
                            def String.instDecidableEqPos.decEq {s✝ : String} (x✝ x✝¹ : s✝.Pos) :
                            Decidable (x✝ = x✝¹)
                            Instances For
                              @[implicit_reducible]
                              @[inline]

                              The start position of s, as an s.Pos.

                              Instances For
                                @[implicit_reducible]
                                @[inline]
                                def String.endPos (s : String) :
                                s.Pos

                                The past-the-end position of s, as an s.Pos.

                                Instances For
                                  @[implicit_reducible]
                                  instance String.instLEPos {s : String} :
                                  @[implicit_reducible]
                                  instance String.instLTPos {s : String} :
                                  theorem String.Pos.le_iff {s : String} {l r : s.Pos} :
                                  theorem String.Pos.lt_iff {s : String} {l r : s.Pos} :
                                  l < r l.offset < r.offset
                                  @[implicit_reducible]
                                  instance String.instDecidableLePos {s : String} (l r : s.Pos) :
                                  @[implicit_reducible]
                                  instance String.instDecidableLtPos {s : String} (l r : s.Pos) :
                                  Decidable (l < r)
                                  structure String.Slice :

                                  A region or slice of some underlying string.

                                  A slice consists of a string together with the start and end byte positions of a region of interest. Actually extracting a substring requires copying and memory allocation, while many slices of the same underlying string may exist with very little overhead. While this could be achieved by tracking the bounds by hand, the slice API is much more convenient.

                                  String.Slice bundles proofs to ensure that the start and end positions always delineate a valid string. For this reason, it should be preferred over Substring.Raw.

                                  • str : String

                                    The underlying strings.

                                  • startInclusive : self.str.Pos

                                    The byte position of the start of the string slice.

                                  • endExclusive : self.str.Pos

                                    The byte position of the end of the string slice.

                                  • startInclusive_le_endExclusive : self.startInclusive self.endExclusive

                                    The slice is not degenerate (but it may be empty).

                                  Instances For
                                    theorem String.Slice.ext {x y : Slice} (str : x.str = y.str) (startInclusive : x.startInclusive y.startInclusive) (endExclusive : x.endExclusive y.endExclusive) :
                                    x = y
                                    @[implicit_reducible]
                                    @[inline]

                                    Returns a slice that contains the entire string.

                                    Instances For
                                      @[implicit_reducible]
                                      @[simp]
                                      @[inline]

                                      The number of bytes of the UTF-8 encoding of the string slice.

                                      Instances For
                                        @[implicit_reducible]
                                        @[implicit_reducible]
                                        @[implicit_reducible]
                                        @[inline]

                                        The end position of a slice, as a Pos.Raw.

                                        Instances For

                                          Criterion for validity of positions in string slices.

                                          Instances For
                                            @[inline]

                                            Accesses the indicated byte in the UTF-8 encoding of a string slice.

                                            At runtime, this function is implemented by efficient, constant-time code.

                                            Instances For

                                              Accesses the indicated byte in the UTF-8 encoding of the string slice, or panics if the position is out-of-bounds.

                                              Instances For
                                                structure String.Slice.Pos (s : Slice) :

                                                A Slice.Pos s is a byte offset in s together with a proof that this position is at a UTF-8 character boundary.

                                                Instances For
                                                  theorem String.Slice.Pos.ext {s : Slice} {x y : s.Pos} (offset : x.offset = y.offset) :
                                                  x = y
                                                  theorem String.Slice.Pos.ext_iff {s : Slice} {x y : s.Pos} :
                                                  x = y x.offset = y.offset
                                                  @[implicit_reducible]
                                                  def String.Slice.instDecidableEqPos.decEq {s✝ : Slice} (x✝ x✝¹ : s✝.Pos) :
                                                  Decidable (x✝ = x✝¹)
                                                  Instances For
                                                    @[inline]

                                                    The start position of s, as an s.Pos.

                                                    Instances For
                                                      @[implicit_reducible]
                                                      @[inline]

                                                      The past-the-end position of s, as an s.Pos.

                                                      Instances For
                                                        @[implicit_reducible]
                                                        instance String.instLEPos_1 {s : Slice} :
                                                        @[implicit_reducible]
                                                        instance String.instLTPos_1 {s : Slice} :
                                                        theorem String.Slice.Pos.le_iff {s : Slice} {l r : s.Pos} :
                                                        theorem String.Slice.Pos.lt_iff {s : Slice} {l r : s.Pos} :
                                                        l < r l.offset < r.offset
                                                        @[implicit_reducible]
                                                        instance String.instDecidableLePos_1 {s : Slice} (l r : s.Pos) :
                                                        @[implicit_reducible]
                                                        instance String.instDecidableLtPos_1 {s : Slice} (l r : s.Pos) :
                                                        Decidable (l < r)
                                                        @[reducible, inline]
                                                        abbrev String.Pos.IsAtEnd {s : String} (pos : s.Pos) :

                                                        pos.IsAtEnd is just shorthand for pos = s.endPos that is easier to write if s is long.

                                                        Instances For
                                                          @[simp]
                                                          theorem String.Pos.isAtEnd_iff {s : String} {pos : s.Pos} :
                                                          pos.IsAtEnd pos = s.endPos
                                                          @[implicit_reducible, inline]
                                                          @[reducible, inline]
                                                          abbrev String.Slice.Pos.IsAtEnd {s : Slice} (pos : s.Pos) :

                                                          pos.IsAtEnd is just shorthand for pos = s.endPos that is easier to write if s is long.

                                                          Instances For
                                                            @[simp]
                                                            theorem String.Slice.Pos.isAtEnd_iff {s : Slice} {pos : s.Pos} :
                                                            pos.IsAtEnd pos = s.endPos
                                                            @[implicit_reducible, inline]
                                                            @[inline]
                                                            def String.Slice.Pos.byte {s : Slice} (pos : s.Pos) (h : pos s.endPos) :

                                                            Returns the byte at a position in a slice that is not the end position.

                                                            Instances For
                                                              @[simp]
                                                              @[inline]

                                                              Checks whether a slice is empty.

                                                              Empty slices have {name}utf8ByteSize {lean}0.

                                                              Examples:

                                                              Instances For
                                                                @[simp]
                                                                theorem String.Slice.Pos.le_refl {s : Slice} (p : s.Pos) :
                                                                p p
                                                                theorem String.Slice.Pos.lt_trans {s : Slice} {p q r : s.Pos} :
                                                                p < qq < rp < r
                                                                @[simp]
                                                                theorem String.Pos.le_refl {s : String} (p : s.Pos) :
                                                                p p
                                                                theorem String.Pos.lt_trans {s : String} {p q r : s.Pos} :
                                                                p < qq < rp < r
                                                                @[deprecated String.toRawSubstring (since := "2025-11-18")]
                                                                Instances For
                                                                  @[deprecated String.toRawSubstring' (since := "2025-11-18")]
                                                                  Instances For
                                                                    @[reducible, inline, deprecated String.Pos (since := "2025-11-24")]
                                                                    Instances For
                                                                      @[reducible, inline, deprecated String.startPos (since := "2025-11-24")]
                                                                      Instances For
                                                                        @[reducible, inline, deprecated String.endPos (since := "2025-11-24")]
                                                                        abbrev String.endValidPos (s : String) :
                                                                        s.Pos
                                                                        Instances For
                                                                          @[reducible, inline, deprecated String.toByteArray (since := "2025-11-24")]
                                                                          Instances For