Strings and Text Processing
Swift’s String type is Unicode-compliant, supporting complex text processing with grapheme clusters, bidirectional text, and internationalization. This document covers string manipulation, Unicode, and advanced text processing.
Creating Strings
Example: Basic Strings:
swift
let simple = "Hello, Swift!"
let empty = String()
let fromLiteral = """
Multi-line
string
"""
print(simple, empty, fromLiteral)
Example: String Interpolation:
swift
let name = "Alice"
let greeting = "Hello, \(name.uppercased())!" // "Hello, ALICE!"
String Operations
- Concatenation: Use +, +=, or append.
- Length: Access count for the number of grapheme clusters.
- Access: Use indices, not integers, due to Unicode.
Example: Concatenation and Length:
swift
var message = "Hello"
message += ", World!" // "Hello, World!"
print(message.count) // 13
Example: Safe Access:
swift
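let first = message[message.startIndex] // "H"
// limitedBy returns nil past the limit; endIndex itself cannot be subscripted.
if let index = message.index(message.startIndex, offsetBy: 7, limitedBy: message.endIndex),
   index < message.endIndex {
    print(message[index]) // "W"
}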
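The limitedBy pattern above can be wrapped in a small helper so call sites do not repeat the bounds check. This is a minimal sketch: character(atOffset:) is a hypothetical name, not part of the standard library, and each call walks the string from the start, so it costs O(n).

```swift
extension String {
    /// Hypothetical helper (not in the standard library): returns the
    /// Character at a grapheme-cluster offset, or nil when out of range.
    func character(atOffset offset: Int) -> Character? {
        guard offset >= 0,
              let i = index(startIndex, offsetBy: offset, limitedBy: endIndex),
              i < endIndex else { return nil }
        return self[i]
    }
}

let sample = "Hello, World!"
print(sample.character(atOffset: 7) ?? "?")   // W
print(sample.character(atOffset: 99) ?? "?")  // ? (offset is out of range)
```

For repeated access into the same string, storing String.Index values is usually cheaper than recomputing offsets.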
Unicode and Characters
String is a collection of Character values (Unicode grapheme clusters), not raw code points.
Example: Unicode Handling:
swift
let emoji = "👩\u{200D}🚀🇺🇳" // woman + zero-width joiner + rocket, then a two-scalar flag
for char in emoji {
    print(char) // the astronaut emoji, then the flag
}
print(emoji.count) // 2 (grapheme clusters)
print(emoji.unicodeScalars.count) // 5 (code points)
Example: Combining Characters:
swift
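import Foundation // for precomposedStringWithCanonicalMapping
let cafe = "cafe\u{301}" // e + combining acute accent
let precomposed = cafe.precomposedStringWithCanonicalMapping // NFC: é as a single scalar
print(cafe.count, precomposed.count) // 4 4
print(cafe.unicodeScalars.count, precomposed.unicodeScalars.count) // 5 4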
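To make the difference between Characters and the underlying encodings concrete, the sketch below counts the same string through each of String's views. The variable names are illustrative only.

```swift
let word = "cafe\u{301}"  // "café" written as e + U+0301 combining acute accent

print(word.count)                 // 4 grapheme clusters (Characters)
print(word.unicodeScalars.count)  // 5 Unicode scalars
print(word.utf16.count)           // 5 UTF-16 code units
print(word.utf8.count)            // 6 UTF-8 bytes

// A flag is one Character built from two regional-indicator scalars.
let flag = "\u{1F1FA}\u{1F1F3}"
print(flag.count, flag.unicodeScalars.count, flag.utf16.count, flag.utf8.count)  // 1 2 4 8
```

Which count is appropriate depends on the consumer: user-visible length is count, while storage or wire size is usually utf8.count.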
String Methods
- Searching: contains, hasPrefix, hasSuffix.
- Splitting: split, components(separatedBy:).
- Replacing: replacingOccurrences, replacing.
Example: Searching:
swift
print(message.hasPrefix("Hello")) // true
print(message.contains("World")) // true
Example: Splitting:
swift
let csv = "Alice,25,Engineer"
let fields = csv.split(separator: ",") // ["Alice", "25", "Engineer"]
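The two splitting methods differ in how they treat empty fields and in what they return, which matters for CSV-like input. A minimal sketch with an illustrative row that contains an empty field and a trailing separator:

```swift
import Foundation  // components(separatedBy:) comes from Foundation

let row = "Alice,,Engineer,"

// split drops empty fields by default and returns [Substring].
let compact = row.split(separator: ",")
print(compact)  // ["Alice", "Engineer"]

// Keep empty fields by opting out of the default.
let allFields = row.split(separator: ",", omittingEmptySubsequences: false)
print(allFields)  // ["Alice", "", "Engineer", ""]

// components(separatedBy:) always keeps empty fields and returns [String].
let parts = row.components(separatedBy: ",")
print(parts)  // ["Alice", "", "Engineer", ""]
```

Substring values share storage with the original string, so convert them with String(...) before keeping them long-term.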
Example: Replacing:
swift
let updated = message.replacingOccurrences(of: "World", with: "Swift")
print(updated) // "Hello, Swift!"
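The newer replacing method from the list above works without Foundation and also accepts a regex. This sketch assumes Swift 5.7 or later (and, on Apple platforms, an OS release that ships the string-processing APIs); the sample strings are illustrative.

```swift
// Literal replacement, the standard-library counterpart of replacingOccurrences.
let swapped = "Hello, World!".replacing("World", with: "Swift")
print(swapped)  // Hello, Swift!

// Regex replacement: mask every four-digit run.
// The #/.../# delimiters are used here because they need no compiler flag.
let line = "2024-01-15 and 2025-12-31"
let masked = line.replacing(#/\d{4}/#, with: "YYYY")
print(masked)  // YYYY-01-15 and YYYY-12-31
```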
Regular Expressions
Swift 5.7+ supports regex literals and the Regex type.
Example: Regex Matching:
swift
let text = "Contact: alice@example.com, bob@example.com"
let regex = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/
let emails = text.matches(of: regex).map { String($0.output) }
print(emails) // ["alice@example.com", "bob@example.com"]
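Beyond whole-match extraction, regex literals can expose parts of a match through named captures, which appear as labeled elements of the match output. The sketch below uses a deliberately simplified email pattern (an assumption for illustration, not a validator) and the #/.../# delimiters, which should compile without enabling bare-slash literals.

```swift
let contact = "Contact: alice@example.com"

// Named captures become labeled tuple elements on the match output.
let emailPattern = #/(?<user>[\w.]+)@(?<domain>[\w.]+\.[a-zA-Z]{2,})/#

if let match = contact.firstMatch(of: emailPattern) {
    print(match.output.user)    // alice
    print(match.output.domain)  // example.com
}
```

Because the capture names are checked at compile time, a typo such as match.output.usr fails to build rather than failing at runtime.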
String Comparison
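Compare strings with == (Unicode-aware, based on canonical equivalence), or use the localized comparison methods such as localizedStandardCompare for locale-aware ordering.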
Example:
swift
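import Foundation // for localizedStandardCompare
let str1 = "caf\u{E9}" // precomposed é
let str2 = "cafe\u{301}" // e + combining acute accent
print(str1 == str2) // true (canonical equivalence)
let names = ["Zeina", "zéna"]
print(names.sorted { $0.localizedStandardCompare($1) == .orderedAscending }) // Locale-aware, Finder-style sort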
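When a search or match should ignore case and accents, == is too strict, because it only treats canonically equivalent strings as equal. One possible approach, sketched here with Foundation's compare(_:options:), is to pass explicit comparison options; the sample strings are illustrative.

```swift
import Foundation

let typed = "Café"
let stored = "cafe"

print(typed == stored)  // false: different case and a different base letter sequence

// Ignore case and diacritics for a looser, search-style match.
let loose = typed.compare(stored, options: [.caseInsensitive, .diacriticInsensitive])
print(loose == .orderedSame)  // true
```

For text shown to users in sorted lists, the localized comparison methods remain the better fit because they respect the current locale.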
Attributed Strings
Use NSAttributedString for rich text.
Example:
swift
import UIKit // UIFont is a UIKit type; on macOS, use AppKit and NSFont instead
let attr = NSAttributedString(
string: "Bold Text",
attributes: [.font: UIFont.boldSystemFont(ofSize: 16)]
)
Performance Optimization
- Avoid Repeated Access: Cache indices for iteration.
- Use NSString: For legacy APIs or performance-critical interop.
- Regex Caching: Reuse compiled regexes.
Example: Optimized Iteration:
swift
var indices: [String.Index] = []
var current = message.startIndex
while current < message.endIndex {
indices.append(current)
current = message.index(after: current)
}
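The regex caching tip above matters most for patterns compiled at runtime from a string. A minimal sketch, assuming a hypothetical HashtagFinder type that compiles its pattern once in init and reuses it for every call:

```swift
struct HashtagFinder {
    // Compiled once in init and reused by every call to find(in:).
    private let pattern: Regex<Substring>

    init() throws {
        pattern = try Regex("#[a-zA-Z0-9_]+", as: Substring.self)
    }

    func find(in text: String) -> [String] {
        text.matches(of: pattern).map { String($0.output) }
    }
}

// try! is acceptable here only because the pattern is a fixed, known-valid string.
let finder = try! HashtagFinder()
for post in ["#swift rocks", "no tags", "#a and #b"] {
    print(finder.find(in: post))
}
```

The alternative, constructing the Regex inside find(in:), would re-parse the pattern on every call.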
Best Practices
- Unicode Compliance: Always handle graphemes for user text.
- Localization: Use NSLocalizedString for user-facing strings.
- Regex for Parsing: Prefer regex over manual parsing.
- Immutable Strings: Use let for thread safety.
- Performance: Profile string operations for large texts.
- Testing: Verify string handling with diverse inputs (e.g., emoji, RTL).
Troubleshooting
- Index Errors: Use safe indexing methods.
- Unicode Issues: Normalize strings for consistency.
- Regex Failures: Validate regex syntax and test cases.
- Performance Slowdowns: Profile with Instruments.
- Localization Bugs: Test with different locales.
Example: Comprehensive String Processing
swift
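import Foundation // for precomposedStringWithCanonicalMapping

struct TextProcessor {
    func extractHashtags(_ text: String) -> [String] {
        let regex = /#[a-zA-Z0-9_]+/
        return text.matches(of: regex).map { String($0.output) }
    }
    // NFC-normalize, then lowercase, so equivalent inputs share one representation.
    func normalize(_ text: String) -> String {
        return text.precomposedStringWithCanonicalMapping.lowercased()
    }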
    // Keep the first `length` Characters (grapheme clusters), then append an ellipsis.
    func truncate(_ text: String, to length: Int) -> String {
        guard text.count > length else { return text }
        return String(text.prefix(length)) + "..."
    }
}
let processor = TextProcessor()
let post = "Hello 👋 Welcome to #SwiftCoding and #Programming! café"
print(processor.extractHashtags(post)) // ["#SwiftCoding", "#Programming"]
print(processor.normalize(post)) // "hello 👋 welcome to #swiftcoding and #programming! café"
print(processor.truncate(post, to: 20)) // "Hello 👋 Welcome to #..."