Documentation

Init.Data.String.Defs

Preliminary developments for strings #

This file contains the material about strings which we can write down without the results in Init.Data.String.Decode, i.e., without knowing about the bijection between String and List Char given by UTF-8 decoding and encoding.

Note that this file, despite being called Defs, contains quite a few lemmas.

@[inline]

Decodes an array of bytes that encode a string as UTF-8 into the corresponding string.

Equations
    Instances For
      @[extern lean_string_to_utf8]

      Encodes a string in UTF-8 as an array of bytes.

      Equations
        Instances For
          @[extern lean_string_append]

          Appends two strings. Usually accessed via the ++ operator.

          The internal implementation will perform destructive updates if the string is not shared.

          Examples:

          • "abc".append "def" = "abcdef"
          • "abc" ++ "def" = "abcdef"
          • "" ++ "" = ""
          Equations
            Instances For
              @[deprecated String.toByteArray_ofList (since := "2025-10-30")]
              @[deprecated String.exists_eq_ofList (since := "2025-10-30")]

              The start position of the string, as a String.Pos.Raw.

              Equations
                Instances For
                  @[simp]
                  @[deprecated String.byteIdx_rawEndPos (since := "2025-10-20")]
                  @[simp]
                  theorem String.utf8ByteSize_ofByteArray {b : ByteArray} {h : b.IsValidUTF8} :
                  { toByteArray := b, isValidUTF8 := h }.utf8ByteSize = b.size
                  @[deprecated String.singleton_eq_ofList (since := "2025-10-30")]
                  @[simp]
                  theorem String.append_singleton {s : String} {c : Char} :
                  s ++ singleton c = s.push c
                  @[simp]
                  theorem String.append_left_inj {s₁ s₂ : String} (t : String) :
                  s₁ ++ t = s₂ ++ t s₁ = s₂
                  theorem String.append_assoc {s₁ s₂ s₃ : String} :
                  s₁ ++ s₂ ++ s₃ = s₁ ++ (s₂ ++ s₃)
                  @[deprecated String.rawEndPos_eq_zero_iff (since := "2025-10-20")]
                  @[inline]
                  def String.pushn (s : String) (c : Char) (n : Nat) :

                  Adds multiple repetitions of a character to the end of a string.

                  Returns s, with n repetitions of c at the end. Internally, the implementation repeatedly calls String.push, so the string is modified in-place if there is a unique reference to it.

                  Examples:

                  • "indeed".pushn '!' 2 = "indeed!!"
                  • "indeed".pushn '!' 0 = "indeed"
                  • "".pushn ' ' 4 = " "
                  Equations
                    Instances For
                      theorem String.pushn_eq_repeat_push {s : String} {c : Char} {n : Nat} :
                      s.pushn c n = Nat.repeat (fun (s : String) => s.push c) n s
                      @[export lean_string_pushn]
                      Equations
                        Instances For
                          @[inline]

                          Checks whether a string is empty.

                          Empty strings are equal to "" and have length and end position 0.

                          Examples:

                          Equations
                            Instances For
                              @[export lean_string_isempty]
                              Equations
                                Instances For
                                  @[inline]

                                  Appends all the strings in a list of strings, in order.

                                  Use String.intercalate to place a separator string between the strings in a list.

                                  Examples:

                                  Equations
                                    Instances For

                                      Appends the strings in a list of strings, placing the separator s between each pair.

                                      Examples:

                                      Equations
                                        Instances For
                                          @[export lean_string_intercalate]
                                          Equations
                                            Instances For
                                              structure String.Pos.Raw.IsValid (s : String) (off : Raw) :

                                              Predicate for validity of positions inside a String.

                                              There are multiple equivalent definitions for validity.

                                              We say that a position is valid if the string obtained by taking all of the bytes up to, but excluding, the given position, is valid UTF-8; see Pos.isValid_iff_isValidUTF8_extract_zero.

                                              Similarly, a position is valid if the string obtained by taking all of the bytes starting at the given position is valid UTF-8; see Pos.isValid_iff_isValidUTF8_extract_utf8ByteSize.

                                              An equivalent condition is that the position is the length of the UTF-8 encoding of some prefix of the characters of the string; see Pos.isValid_iff_exists_append and Pos.isValid_iff_exists_take_data.

                                              Another equivalent condition that can be checked efficiently is that the position is either the end position or strictly smaller than the end position and the byte at the position satisfies UInt8.IsUTF8FirstByte; see Pos.isValid_iff_isUTF8FirstByte.

                                              Examples:

                                              • String.Pos.IsValid "abc" ⟨0⟩
                                              • String.Pos.IsValid "abc" ⟨1⟩
                                              • String.Pos.IsValid "abc" ⟨3⟩
                                              • ¬ String.Pos.IsValid "abc" ⟨4⟩
                                              • String.Pos.IsValid "𝒫(A)" ⟨0⟩
                                              • ¬ String.Pos.IsValid "𝒫(A)" ⟨1⟩
                                              • ¬ String.Pos.IsValid "𝒫(A)" ⟨2⟩
                                              • ¬ String.Pos.IsValid "𝒫(A)" ⟨3⟩
                                              • String.Pos.IsValid "𝒫(A)" ⟨4⟩
                                              Instances For
                                                @[deprecated String.Pos.Raw.IsValid.le_rawEndPos (since := "2025-10-20")]
                                                theorem String.Pos.Raw.IsValid.le_endPos {s : String} {off : Raw} (h : IsValid s off) :
                                                @[simp]
                                                structure String.Pos (s : String) :

                                                A Pos s is a byte offset in s together with a proof that this position is at a UTF-8 character boundary.

                                                Instances For
                                                  theorem String.Pos.ext_iff {s : String} {x y : s.Pos} :
                                                  x = y x.offset = y.offset
                                                  theorem String.Pos.ext {s : String} {x y : s.Pos} (offset : x.offset = y.offset) :
                                                  x = y
                                                  Equations
                                                    def String.instDecidableEqPos.decEq {s✝ : String} (x✝ x✝¹ : s✝.Pos) :
                                                    Decidable (x✝ = x✝¹)
                                                    Equations
                                                      Instances For
                                                        @[inline]

                                                        The start position of s, as an s.Pos.

                                                        Equations
                                                          Instances For
                                                            @[inline]
                                                            def String.endPos (s : String) :
                                                            s.Pos

                                                            The past-the-end position of s, as an s.Pos.

                                                            Equations
                                                              Instances For
                                                                instance String.instLEPos {s : String} :
                                                                Equations
                                                                  instance String.instLTPos {s : String} :
                                                                  Equations
                                                                    theorem String.Pos.le_iff {s : String} {l r : s.Pos} :
                                                                    theorem String.Pos.lt_iff {s : String} {l r : s.Pos} :
                                                                    l < r l.offset < r.offset
                                                                    instance String.instDecidableLePos {s : String} (l r : s.Pos) :
                                                                    Equations
                                                                      instance String.instDecidableLtPos {s : String} (l r : s.Pos) :
                                                                      Decidable (l < r)
                                                                      Equations
                                                                        structure String.Slice :

                                                                        A region or slice of some underlying string.

                                                                        A slice consists of a string together with the start and end byte positions of a region of interest. Actually extracting a substring requires copying and memory allocation, while many slices of the same underlying string may exist with very little overhead. While this could be achieved by tracking the bounds by hand, the slice API is much more convenient.

                                                                        String.Slice bundles proofs to ensure that the start and end positions always delineate a valid string. For this reason, it should be preferred over Substring.Raw.

                                                                        • str : String

                                                                          The underlying strings.

                                                                        • startInclusive : self.str.Pos

                                                                          The byte position of the start of the string slice.

                                                                        • endExclusive : self.str.Pos

                                                                          The byte position of the end of the string slice.

                                                                        • startInclusive_le_endExclusive : self.startInclusive self.endExclusive

                                                                          The slice is not degenerate (but it may be empty).

                                                                        Instances For
                                                                          @[inline]

                                                                          Returns a slice that contains the entire string.

                                                                          Equations
                                                                            Instances For
                                                                              @[simp]
                                                                              @[inline]

                                                                              The number of bytes of the UTF-8 encoding of the string slice.

                                                                              Equations
                                                                                Instances For
                                                                                  @[inline]

                                                                                  The end position of a slice, as a Pos.Raw.

                                                                                  Equations
                                                                                    Instances For

                                                                                      Criterion for validity of positions in string slices.

                                                                                      Instances For
                                                                                        @[inline]

                                                                                        Accesses the indicated byte in the UTF-8 encoding of a string slice.

                                                                                        At runtime, this function is implemented by efficient, constant-time code.

                                                                                        Equations
                                                                                          Instances For

                                                                                            Accesses the indicated byte in the UTF-8 encoding of the string slice, or panics if the position is out-of-bounds.

                                                                                            Equations
                                                                                              Instances For
                                                                                                structure String.Slice.Pos (s : Slice) :

                                                                                                A Slice.Pos s is a byte offset in s together with a proof that this position is at a UTF-8 character boundary.

                                                                                                Instances For
                                                                                                  theorem String.Slice.Pos.ext {s : Slice} {x y : s.Pos} (offset : x.offset = y.offset) :
                                                                                                  x = y
                                                                                                  theorem String.Slice.Pos.ext_iff {s : Slice} {x y : s.Pos} :
                                                                                                  x = y x.offset = y.offset
                                                                                                  def String.Slice.instDecidableEqPos.decEq {s✝ : Slice} (x✝ x✝¹ : s✝.Pos) :
                                                                                                  Decidable (x✝ = x✝¹)
                                                                                                  Equations
                                                                                                    Instances For
                                                                                                      @[inline]

                                                                                                      The start position of s, as an s.Pos.

                                                                                                      Equations
                                                                                                        Instances For
                                                                                                          @[inline]

                                                                                                          The past-the-end position of s, as an s.Pos.

                                                                                                          Equations
                                                                                                            Instances For
                                                                                                              instance String.instLEPos_1 {s : Slice} :
                                                                                                              Equations
                                                                                                                instance String.instLTPos_1 {s : Slice} :
                                                                                                                Equations
                                                                                                                  theorem String.Slice.Pos.le_iff {s : Slice} {l r : s.Pos} :
                                                                                                                  theorem String.Slice.Pos.lt_iff {s : Slice} {l r : s.Pos} :
                                                                                                                  l < r l.offset < r.offset
                                                                                                                  instance String.instDecidableLePos_1 {s : Slice} (l r : s.Pos) :
                                                                                                                  Equations
                                                                                                                    instance String.instDecidableLtPos_1 {s : Slice} (l r : s.Pos) :
                                                                                                                    Decidable (l < r)
                                                                                                                    Equations
                                                                                                                      @[reducible, inline]
                                                                                                                      abbrev String.Pos.IsAtEnd {s : String} (pos : s.Pos) :

                                                                                                                      pos.IsAtEnd is just shorthand for pos = s.endPos that is easier to write if s is long.

                                                                                                                      Equations
                                                                                                                        Instances For
                                                                                                                          @[simp]
                                                                                                                          theorem String.Pos.isAtEnd_iff {s : String} {pos : s.Pos} :
                                                                                                                          pos.IsAtEnd pos = s.endPos
                                                                                                                          @[inline]
                                                                                                                          Equations
                                                                                                                            @[reducible, inline]
                                                                                                                            abbrev String.Slice.Pos.IsAtEnd {s : Slice} (pos : s.Pos) :

                                                                                                                            pos.IsAtEnd is just shorthand for pos = s.endPos that is easier to write if s is long.

                                                                                                                            Equations
                                                                                                                              Instances For
                                                                                                                                @[simp]
                                                                                                                                theorem String.Slice.Pos.isAtEnd_iff {s : Slice} {pos : s.Pos} :
                                                                                                                                pos.IsAtEnd pos = s.endPos
                                                                                                                                @[inline]
                                                                                                                                Equations
                                                                                                                                  @[inline]
                                                                                                                                  def String.Slice.Pos.byte {s : Slice} (pos : s.Pos) (h : pos s.endPos) :

                                                                                                                                  Returns the byte at a position in a slice that is not the end position.

                                                                                                                                  Equations
                                                                                                                                    Instances For
                                                                                                                                      @[simp]
                                                                                                                                      @[deprecated String.toRawSubstring (since := "2025-11-18")]
                                                                                                                                      Equations
                                                                                                                                        Instances For
                                                                                                                                          @[deprecated String.toRawSubstring' (since := "2025-11-18")]
                                                                                                                                          Equations
                                                                                                                                            Instances For
                                                                                                                                              @[reducible, inline, deprecated String.Pos (since := "2025-11-24")]
                                                                                                                                              Equations
                                                                                                                                                Instances For
                                                                                                                                                  @[reducible, inline, deprecated String.startPos (since := "2025-11-24")]
                                                                                                                                                  Equations
                                                                                                                                                    Instances For
                                                                                                                                                      @[reducible, inline, deprecated String.endPos (since := "2025-11-24")]
                                                                                                                                                      abbrev String.endValidPos (s : String) :
                                                                                                                                                      s.Pos
                                                                                                                                                      Equations
                                                                                                                                                        Instances For
                                                                                                                                                          @[reducible, inline, deprecated String.toByteArray (since := "2025-11-24")]
                                                                                                                                                          Equations
                                                                                                                                                            Instances For