Getting a UTF8Span directly from a small String doesn't work #82024

Open
glessard opened this issue Jun 5, 2025 · 2 comments
Labels
bug: A deviation from expected or documented behavior. Also: expected but undesirable behavior.
ownership: Feature: Ownership modifiers and semantics

Comments

@glessard
Contributor

glessard commented Jun 5, 2025

Description

UTF8Span instances obtained from a String in the small representation are not valid: they point to the wrong memory. In contrast, getting a UTF8Span from the Span<UInt8> given out by UTF8View is fine. This is very surprising, given that String.utf8Span is a thin wrapper over UTF8View.span.

Reproduction

```swift
// A small string. To compare against a large one, use instead:
// "This string is not a small string."
let master = String(200)

// 1: bytes read through String.utf8Span
func smallStringTest1() {
  let s = master
  let utf8 = s.utf8Span
  let span = utf8.span
  for i in span.indices {
    print(span[i], terminator: " ")
  }
  print()
}

smallStringTest1()

// 2: bytes read through a UTF8Span validated from UTF8View.span
func smallStringTest2() {
  let s = master.utf8
  let view = s.span
  let utf8 = try! UTF8Span(validating: view)
  let span = utf8.span
  for i in span.indices {
    print(span[i], terminator: " ")
  }
  print()
}

smallStringTest2()

// 3: Unicode scalars iterated through String.utf8Span
func smallStringTest3() {
  let s = master
  let utf8 = s.utf8Span
  var it = utf8.makeUnicodeScalarIterator()
  while let c = it.next() {
    print(c, terminator: " ")
  }
  print()
}

smallStringTest3()

// 4: Unicode scalars iterated through a UTF8Span validated from UTF8View.span
func smallStringTest4() {
  let s = master.utf8
  let view = s.span
  let utf8 = try! UTF8Span(validating: view)
  var it = utf8.makeUnicodeScalarIterator()
  while let c = it.next() {
    print(c, terminator: " ")
  }
  print()
}

smallStringTest4()
```

(also here: https://swift.godbolt.org/z/zv15z94MG)

The 1st and 3rd functions print bytes and scalars that are not the string's contents, and the 3rd sometimes crashes. Sample output:

```
Program returned: 0
88 189 14 
50 48 48 
2 ½ � 
2 0 0
```

Expected behavior

Given this code, we'd expect the output of the 1st and 3rd to be identical to the 2nd and 4th, respectively:

```
Program returned: 0
50 48 48 
50 48 48 
2 0 0 
2 0 0 
```

Note that when master's value is changed to a "large" String, all the output is as expected.
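The small-versus-large comparison can be made mechanical rather than visual. The following is a minimal sketch, not part of the original report: the helper names `bytesViaUTF8Span` and `bytesViaUTF8View` are hypothetical, and the code uses only the calls that already appear in the reproduction above. It copies the bytes seen through each path into an array and compares them with `Array(s.utf8)`. On an affected toolchain the first check would be expected to fail for the small string; on macOS the functions may additionally need an availability annotation.

```swift
// Hypothetical helper: bytes as seen through String.utf8Span (the failing path).
func bytesViaUTF8Span(_ s: String) -> [UInt8] {
  let utf8 = s.utf8Span
  let span = utf8.span
  var out: [UInt8] = []
  for i in span.indices {
    out.append(span[i])
  }
  return out
}

// Hypothetical helper: bytes as seen through UTF8View.span (the working path).
func bytesViaUTF8View(_ s: String) -> [UInt8] {
  let view = s.utf8
  let span = view.span
  var out: [UInt8] = []
  for i in span.indices {
    out.append(span[i])
  }
  return out
}

// One small string and one large string, as in the report.
for s in [String(200), "This string is not a small string."] {
  let expected = Array(s.utf8)
  print(
    "utf8 count:", s.utf8.count,
    "utf8Span ok:", bytesViaUTF8Span(s) == expected,
    "utf8.span ok:", bytesViaUTF8View(s) == expected
  )
}
```

A check of this shape could also serve as a regression test once the fix lands, since it does not depend on which garbage bytes happen to be read.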

Environment

swift-DEVELOPMENT-SNAPSHOT-2025-06-03-a
Observed on macOS and Linux.

Code example: https://swift.godbolt.org/z/zv15z94MG

Additional information

Related to: #81931

Also tracked as rdar://152615664

glessard added the bug and ownership labels Jun 5, 2025
@jckarter
Contributor

jckarter commented Jun 5, 2025

From a quick glance, it looks like String isn't @_addressableForDependencies, which it would need to be in order for a Span to be able to depend on its internal representation.
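To illustrate why the address of the String value matters here: a small String keeps its UTF-8 code units inline, inside the String value itself, so a Span over those bytes is really a pointer into wherever that value currently lives. The sketch below is an illustration only and relies on an implementation detail of the standard library (the in-memory layout of String), which is an assumption and not documented API; `rawValueBytes(of:)` is a hypothetical helper. It copies the raw bytes of the String value and checks whether they begin with the string's own UTF-8 contents.

```swift
// Hypothetical helper: copy the raw in-memory bytes of the String *value*
// (its representation, not its UTF-8 contents) into an Array.
func rawValueBytes(of s: String) -> [UInt8] {
  withUnsafeBytes(of: s) { Array($0) }
}

let small = String(200)
let large = "This string is not a small string."

// Size of a String value in bytes on this platform.
print(MemoryLayout<String>.size)

// Assumption (implementation detail): for a small string, the value's own
// bytes begin with its UTF-8 code units; a large string instead stores a
// count, flags, and a reference to out-of-line storage.
print(rawValueBytes(of: small).starts(with: small.utf8.prefix(3)))
print(rawValueBytes(of: large).starts(with: large.utf8.prefix(3)))
```

If the small string's bytes do live inside the value, then a Span derived from a copy or temporary of that value points at memory that may no longer hold the string once the copy goes away, which would be consistent with the wrong bytes shown in the report.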

@glessard
Contributor Author

glessard commented Jun 7, 2025

The original implementation also had a sneaky dependence on a temporary, which is fixed in #82077.
The addressable attribute is fixed in #82013.
