Mastering Swift: essential details about strings


https://rainsoft.io/mastering-swift-essential-details-about-strings/



The string type is an important component of any programming language. The most useful information that a user reads from the screen of an iOS application is plain text.

To reach more users, an iOS application must be internationalised and support many modern languages. The Unicode standard solves this problem, but creates additional complexity when working with strings.

On one hand, the language should strike a good balance between Unicode complexity and performance when processing strings. On the other hand, it should give the developer comfortable structures for handling strings.

In my opinion, Swift does a great job on both counts.

Fortunately, Swift's string is not a simple sequence of UTF-16 code units, as in JavaScript or Java.
With a sequence of UTF-16 code units, Unicode-aware string manipulation is painful: you might break a surrogate pair or a combining character sequence.

Swift implements a better approach. The string itself is not a collection; instead, it provides views over the string content that may be applied according to the situation. One particular view, String.CharacterView, is fully Unicode-aware.

For let myStr = "Hello, world" you can access the following string views:

  • myStr.characters is a String.CharacterView. Valuable for accessing graphemes, which are rendered visually as a single symbol. The most used view.
  • myStr.unicodeScalars is a String.UnicodeScalarView. Valuable for accessing the Unicode code point numbers as 21-bit integers.
  • myStr.utf16 is a String.UTF16View. Useful for accessing the code unit values encoded in UTF-16.
  • myStr.utf8 is a String.UTF8View. Valuable for accessing the code unit values encoded in UTF-8.
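To make the difference between the views concrete, here is a minimal sketch that counts the elements of each view for the same string. It assumes Swift 4 or later, where String is itself a collection of Character values, so sample.count plays the role of sample.characters.count. The constant name sample is only illustrative.

```swift
// "c" + U+0327 COMBINING CEDILLA renders as "ç"; U+1F600 is an emoji outside the BMP.
let sample = "c\u{0327}a\u{1F600}"

print(sample.count)                // => 3 (graphemes: ç, a, 😀)
print(sample.unicodeScalars.count) // => 4 (c, U+0327, a, U+1F600)
print(sample.utf16.count)          // => 5 (the emoji needs a surrogate pair)
print(sample.utf8.count)           // => 8 (1 + 2 + 1 + 4 bytes)
```

The same content gives four different lengths, which is why it matters to pick the view that matches the task.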

Most of the time, a developer deals with simple string characters, without diving into details like encoding or code units.

CharacterView works nicely for most string-related tasks: iterating over the characters, counting the number of characters, verifying substring existence, accessing by index, various manipulations and so on.
Let's see in more detail how these tasks are accomplished in Swift.

1. Character and CharacterView structures

The String.CharacterView structure is a view over the string content that is a collection of Character.

To access the view from a string, use the characters string property:

let message = "Hello, world"
let characters = message.characters
print(type(of: characters)) // => "CharacterView"

message.characters returns the CharacterView structure.

The character view is a collection of Character structures. For example, let's access the first character in a string view:

let message = "Hello, world"
let firstCharacter = message.characters.first!
print(firstCharacter)           // => "H"
print(type(of: firstCharacter)) // => "Character"

let capitalHCharacter: Character = "H"
print(capitalHCharacter == firstCharacter) // => true

message.characters.first returns an optional that holds the first character "H".
The character instance represents a single symbol H.

In Unicode terms, H is Latin Capital Letter H, the U+0048 code point.

Let's go beyond ASCII and see how Swift handles composite symbols. Such characters are rendered as a single visual symbol, but are composed of a sequence of two or more Unicode scalars. Strictly speaking, such characters are named grapheme clusters.

Important: CharacterView is a collection of the grapheme clusters of the string.

Let's take a closer look at the ç grapheme. It may be represented in two ways:

  • Using U+00E7 LATIN SMALL LETTER C WITH CEDILLA: rendered as ç
  • Or using a combining character sequence: U+0063 LATIN SMALL LETTER C plus the combining mark U+0327 COMBINING CEDILLA. The grapheme is composite: c + ◌̧ = ç

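Let's pick the second option and see how Swift handles it:

let message = "c\u{0327}a va bien" // => "ça va bien"
let firstCharacter = message.characters.first!
print(firstCharacter) // => "ç"

let combiningCharacter: Character = "c\u{0327}"
print(combiningCharacter == firstCharacter) // => true

firstCharacter contains a single grapheme ç that is rendered using two Unicode scalars, U+0063 and U+0327.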

The Character structure accepts multiple Unicode scalars as long as they create a single grapheme. If you try to put more graphemes into a single Character, Swift triggers an error:

let singleGrapheme: Character = "c\u{0327}\u{0301}" // Works
print(singleGrapheme) // => "ḉ"

let multipleGraphemes: Character = "ab" // Error!

Even though singleGrapheme is composed of 3 Unicode scalars, it creates a single grapheme ḉ.
multipleGraphemes tries to create a Character from 2 Unicode scalars. This creates 2 separate graphemes a and b in a single Character structure, which is not allowed.
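The two representations of ç described in this section can be compared directly. The following is a small sketch, not taken from the original article: Swift compares Character values by canonical equivalence, so the precomposed and the combining forms are equal even though they hold a different number of Unicode scalars. The constant names are illustrative.

```swift
let precomposed: Character = "\u{00E7}" // ç as a single scalar
let combined: Character = "c\u{0327}"   // c + COMBINING CEDILLA

print(precomposed == combined)                  // => true
print(String(precomposed).unicodeScalars.count) // => 1
print(String(combined).unicodeScalars.count)    // => 2
```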

2. Iterating over characters in a string

The CharacterView collection conforms to the Sequence protocol. This allows you to iterate over the view's characters in a for-in loop:

let weather = "rain"
for char in weather.characters {
  print(char)
}
// => "r"
// => "a"
// => "i"
// => "n"

Each character from weather.characters is accessed using the for-in loop. On every iteration, the char variable is assigned a character from the weather string: "r", "a", "i" and "n".

As an alternative, you can iterate over the characters using the forEach(_:) method, passing a closure as the first argument:

let weather = "rain"
weather.characters.forEach { char in
  print(char)
}
// => "r"
// => "a"
// => "i"
// => "n"

Iteration using the forEach(_:) method is almost the same as for-in, except that you cannot use continue or break statements.
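A short sketch of a case where that difference matters: stopping at the first match requires break, so for-in is the natural choice. The example assumes Swift 4 or later, where the string can be iterated directly without the characters view. The variable name firstVowel is illustrative.

```swift
let weather = "rain"
var firstVowel: Character? = nil

for char in weather {
    if "aeiou".contains(char) {
        firstVowel = char
        break // stop at the first match; not available inside forEach
    }
}

if let vowel = firstVowel {
    print(vowel) // => "a"
}
```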

To access the index of the current character in the loop, CharacterView provides the enumerated() method. The method returns a sequence of tuples (index, character):

let weather = "rain"
for (index, char) in weather.characters.enumerated() {
  print("index: \(index), char: \(char)")
}
// => "index: 0, char: r"
// => "index: 1, char: a"
// => "index: 2, char: i"
// => "index: 3, char: n"

On each iteration, the enumerated() method returns a tuple (index, char). The index variable contains the character index at the current loop step. Correspondingly, the char variable contains the character.
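Note that the integer produced by enumerated() is a counter, not a String.Index, so it cannot be passed to the string subscript. When a real index is needed inside the loop, one possible approach is to zip the indices with the characters. This is a sketch assuming Swift 4 or later; indexOfI is an illustrative name.

```swift
let weather = "rain"
var indexOfI: String.Index? = nil

// zip pairs every character with its String.Index.
for (index, char) in zip(weather.indices, weather) where char == "i" {
    indexOfI = index
}

if let index = indexOfI {
    print(weather[index...]) // => "in"
}
```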

3. Counting characters

Simply use the count property of the CharacterView to get the number of characters:

let weather = "sunny"
print(weather.characters.count) // => 5

weather.characters.count contains the number of characters in the string.

Each character in the view holds a grapheme. When an adjacent character (for example a combining mark) is appended to the string, you may find that the count property does not increase.

This happens because an adjacent character does not create a new grapheme in the string; instead it modifies an existing base Unicode character. Let's see an example:

var drink = "cafe"
print(drink.characters.count) // => 4
drink += "\u{0301}"
print(drink)                  // => "café"
print(drink.characters.count) // => 4

Initially drink has 4 characters.
When the combining mark U+0301 COMBINING ACUTE ACCENT is appended to the string, it modifies the previous base character e and creates a new grapheme é. The count property does not increase, because the number of graphemes is still the same.

4. Accessing character by index

Swift doesn't know the character count of the string view until it actually evaluates the graphemes in it. As a result, there is no subscript that accesses a character directly by an integer index.
You can access the characters through a special type, String.Index.

If you need to access the first or last character in the string, the character view structure has the first and last properties:

let season = "summer"
print(season.characters.first!) // => "s"
print(season.characters.last!)  // => "r"
let empty = ""
print(empty.characters.first == nil) // => true
print(empty.characters.last == nil)  // => true

Notice that the first and last properties are of the optional type Character?.
In the empty string empty these properties are nil.

To get a character at a specific position, you have to use the String.Index type (actually an alias of String.CharacterView.Index). String offers a subscript that accepts a String.Index to access the character, as well as the pre-defined indexes myString.startIndex and myString.endIndex.

Using the string index type, let's access the first and last characters:

let color = "green"
let startIndex = color.startIndex
let beforeEndIndex = color.index(before: color.endIndex)
print(color[startIndex])     // => "g"
print(color[beforeEndIndex]) // => "n"

color.startIndex is the index of the first character, so color[startIndex] evaluates to g.
color.endIndex indicates the past-the-end position, or simply the position one greater than the last valid subscript argument. To access the last character, you must calculate the index right before the string's end index: color.index(before: color.endIndex).

To access a character at a position given by an offset, use the offsetBy argument of the index(theIndex, offsetBy: theOffset) method:

let color = "green"
let secondCharIndex = color.index(color.startIndex, offsetBy: 1)
let thirdCharIndex = color.index(color.startIndex, offsetBy: 2)
print(color[secondCharIndex]) // => "r"
print(color[thirdCharIndex])  // => "e"

By indicating the offsetBy argument, you can access the character at a specific offset.

Of course, the offsetBy argument jumps over string graphemes, i.e. the offset is counted in Character instances of the string's CharacterView.

If the index is out of range, Swift generates an error:

let color = "green"
let oops = color.index(color.startIndex, offsetBy: 100) // Error!

To prevent such situations, indicate the additional argument limitedBy to limit the offset: index(theIndex, offsetBy: theOffset, limitedBy: theLimit). The function returns an optional, which is nil for an out-of-bounds index:

let color = "green"
let oops = color.index(color.startIndex, offsetBy: 100, limitedBy: color.endIndex)
if let charIndex = oops {
  print("Correct index")
} else {
  print("Incorrect index")
}
// => "Incorrect index"

oops is an optional String.Index?. The optional unwrap verifies that the index didn't jump out of the string.
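Building on index(_:offsetBy:limitedBy:), here is a sketch of a safe lookup by integer offset. The helper character(atOffset:) is hypothetical and not part of the standard library. Two details are worth noting: an offset equal to the length returns endIndex rather than nil, so it has to be rejected explicitly, and the limit only applies in the direction of travel, so negative offsets are rejected up front.

```swift
extension String {
    // Hypothetical helper: returns the character at an integer offset,
    // or nil when the offset falls outside the string.
    func character(atOffset offset: Int) -> Character? {
        guard offset >= 0,
              let position = index(startIndex, offsetBy: offset, limitedBy: endIndex),
              position < endIndex // endIndex itself is not a valid subscript
        else { return nil }
        return self[position]
    }
}

let color = "green"
if let second = color.character(atOffset: 1) {
    print(second) // => "r"
}
print(color.character(atOffset: 5) == nil)   // => true
print(color.character(atOffset: 100) == nil) // => true
```

Keep in mind that each call walks the graphemes from the start, so this is convenient for occasional lookups rather than for loops over the whole string.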

5. Checking substring existence

The simplest way to verify that a substring exists is to call the contains(_ other: String) string method:

import Foundation

let animal = "white rabbit"
print(animal.contains("rabbit")) // => true
print(animal.contains("cat"))    // => false

animal.contains("rabbit") returns true because animal contains the "rabbit" substring.
Correspondingly, animal.contains("cat") evaluates to false for a non-existing substring.

To verify whether the string has a specific prefix or suffix, the methods hasPrefix(_:) and hasSuffix(_:) are available. Let's use them in an example:

import Foundation

let animal = "white rabbit"
print(animal.hasPrefix("white"))  // => true
print(animal.hasSuffix("rabbit")) // => true

"white" is a prefix and "rabbit" is a suffix of "white rabbit". So the corresponding method calls animal.hasPrefix("white") and animal.hasSuffix("rabbit") return true.

When you need to search for a particular character, it makes sense to query the character view directly. For example:

let animal = "white rabbit"
let aChar: Character = "a"
let bChar: Character = "b"
print(animal.characters.contains(aChar)) // => true
print(animal.characters.contains { $0 == aChar || $0 == bChar }) // => true

contains(_:) verifies whether the character view has a particular character.
The second function form accepts a closure, contains(where predicate: (Character) -> Bool), and performs the same verification.
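The closure form is handy when the check is about a category of characters rather than one specific character. A minimal sketch, assuming Swift 5 or later, where Character exposes properties such as isNumber and isUppercase and the string can be queried directly:

```swift
let room = "room 42"

let hasDigit = room.contains(where: { $0.isNumber })
let hasUppercase = room.contains(where: { $0.isUppercase })

print(hasDigit)     // => true
print(hasUppercase) // => false
```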

6. String manipulation

The string in Swift is a value type. Whether you pass a string as an argument in a function call or assign it to a variable or constant, a copy of the original string is created every time.

A mutating method call changes the string in place.

This chapter covers the common manipulations of strings.
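The value semantics can be observed with a small sketch. The function appendingExclamation(_:) is an illustrative name, not an API: mutating the copy or the function's local variable leaves the original string untouched.

```swift
func appendingExclamation(_ text: String) -> String {
    var copy = text // parameters are constants, so work on a local copy
    copy += "!"
    return copy
}

let original = "pigeon"
var other = original // other is an independent value
other += "s"
let shouted = appendingExclamation(original)

print(original) // => "pigeon"
print(other)    // => "pigeons"
print(shouted)  // => "pigeon!"
```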

Append a character or another string to a string

The simplest way to append to a string is the += operator. You can append an entire string to the original one:

var bird = "pigeon"
bird += " sparrow"
print(bird) // => "pigeon sparrow"

The String structure provides a mutating method append(). The method accepts a string, a character or even a sequence of characters, and appends it to the original string. For instance:

var bird = "pigeon"
let sChar: Character = "s"
bird.append(sChar)
print(bird) // => "pigeons"
bird.append(" and sparrows")
print(bird) // => "pigeons and sparrows"
bird.append(contentsOf: " fly".characters)
print(bird) // => "pigeons and sparrows fly"

Extract a substring from a string

The method substring() allows you to extract substrings:

  • from a specific index up to the end of the string
  • from the start up to a specific index
  • or based on a range of indexes.

Let's see how it works:

let plant = "red flower"
let strIndex = plant.index(plant.startIndex, offsetBy: 4)
print(plant.substring(from: strIndex)) // => "flower"
print(plant.substring(to: strIndex))   // => "red "

if let index = plant.characters.index(of: "f") {
  let flowerRange = index..<plant.endIndex
  print(plant.substring(with: flowerRange)) // => "flower"
}
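In Swift 4 and later the substring(from:), substring(to:) and substring(with:) methods are deprecated in favour of slicing with range operators. As a sketch of the equivalent extraction, assuming Swift 4.2 or later for firstIndex(of:); the constant names are illustrative:

```swift
let plant = "red flower"
let strIndex = plant.index(plant.startIndex, offsetBy: 4)

let tail = plant[strIndex...] // Substring "flower"
let head = plant[..<strIndex] // Substring "red "

// A Substring shares storage with the original string;
// convert it to String when the result is kept around.
let flower = String(tail)
print(flower) // => "flower"
print(head)   // => "red "

if let fIndex = plant.firstIndex(of: "f") {
    print(plant[fIndex..<plant.endIndex]) // => "flower"
}
```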

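The string subscript accepts a range or closed range of string indexes. This helps to extract substrings based on ranges of indexes:

let plant = "green tree"
let excludeFirstRange =
  plant.index(plant.startIndex, offsetBy: 1)..<plant.endIndex
print(plant[excludeFirstRange]) // => "reen tree"
let lastTwoRange =
  plant.index(plant.endIndex, offsetBy: -2)..<plant.endIndex
print(plant[lastTwoRange]) // => "ee"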

Insert into a string

The string type provides the mutating method insert(). The method allows you to insert a character or a sequence of characters at a specific index.

The new character or sequence is inserted before the element currently at the specified index.

See the following sample:

var plant = "green tree"
plant.insert("s", at: plant.endIndex)
print(plant) // => "green trees"
plant.insert(contentsOf: "nice ".characters, at: plant.startIndex)
print(plant) // => "nice green trees"

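Remove from a string

The mutating method remove(at:) removes the character at an index:

var weather = "sunny day"
if let index = weather.characters.index(of: " ") {
  weather.remove(at: index)
  print(weather) // => "sunnyday"
}

You can remove the characters of the string that are in a range of indexes using removeSubrange(_:):

var weather = "sunny day"
let index = weather.index(weather.startIndex, offsetBy: 6)
let range = index..<weather.endIndex
weather.removeSubrange(range)
print(weather) // => "sunny " (the space at offset 5 is kept)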

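Replace in a string

The method replaceSubrange(_:with:) accepts a range of indexes that should be replaced with a particular string. The method mutates the string.

Let's see a sample:

var weather = "sunny day"
if let index = weather.characters.index(of: " ") {
  let range = weather.startIndex..<index
  weather.replaceSubrange(range, with: "rainy")
  print(weather) // => "rainy day"
}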

The character view mutation alternative

Many of the string manipulations described above may be applied directly to the string's character view.

It is a good alternative if you find it more comfortable to work directly with a collection of characters.

For example, you can remove the character at a specific index, or directly the first or last character:

var fruit = "apple"
fruit.characters.remove(at: fruit.startIndex)
print(fruit) // => "pple"
fruit.characters.removeFirst()
print(fruit) // => "ple"
fruit.characters.removeLast()
print(fruit) // => "pl"

To reverse a word, use the reversed() method of the character view:

var fruit = "peach"
var reversed = String(fruit.characters.reversed())
print(reversed) // => "hcaep"

You can easily filter the string:

let fruit = "or*an*ge"
let filtered = fruit.characters.filter { char in
  return char != "*"
}
print(String(filtered)) // => "orange"

Map the string content by applying a transformer closure:
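As a minimal sketch of such a mapping, assuming Swift 4 or later (where map can be called on the string directly): the closure below replaces every "o" with "0". map returns an array of Character, so the result is wrapped in String. The constant names and the transformation itself are illustrative.

```swift
let fruit = "orange"

// Transform each character; the result is [Character].
let mapped = fruit.map { (char: Character) -> Character in
    return char == "o" ? "0" : char
}

let result = String(mapped)
print(result) // => "0range"
```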
