Introduction to Strings
Having Characters
Introduction
To represent the values of an application, we primarily use characters: letters, digits, and other symbols, whether or not they belong to an alphabet. To recognize these symbols, the F# language provides the char data type. The char data type is identified in the .NET Framework by the Char structure, which is represented with a 16-bit value. In other words, F#'s char is simply an alias for .NET's Char structure.
To declare a variable that holds a single character (a letter, a digit, or a symbol), initialize it with a value enclosed in single quotes. Here are examples:
let gender = 'm'
let moneySymbol = '$'
let multiplication = '*'
let numberOne = '1'

printfn "A few characters"
printfn "Gender: %c" gender
printfn "Money Symbol: %c" moneySymbol
printfn "Multiplication: %c" multiplication
printfn "Number One: %c" numberOne
This would produce:
A few characters
Gender: m
Money Symbol: $
Multiplication: *
Number One: 1
Press any key to continue...
Categories of Characters
As far as computers or operating systems are concerned, every readable or non-readable symbol used in an application is a character. All those symbols are considered objects of type char. The Char structure is able to recognize every one of them. In fact, the Char structure classifies the symbols into various categories.
An alphabetical letter is a readable character recognized by a human language. To let you find out whether a character is a letter, the Char structure is equipped with a static method named IsLetter, which is overloaded with two versions. A digit is a symbol used in a number: 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. To let you find out whether a character is a digit, the Char structure is equipped with the IsDigit() static method, which is also overloaded with two versions.
In the same way, the Char structure provides various methods to test the category of a character. All these methods are static and each comes in two versions. One version takes a single character as argument: if the character belongs to the category sought, the method returns true; otherwise it returns false. The methods are:
Method | Returns true if the argument is |
IsLetter (c : char) : bool | A letter |
IsLower (c : char) : bool | A lowercase letter |
IsUpper (c : char) : bool | An uppercase letter |
IsDigit (c : char) : bool | A digit |
IsNumber (c : char) : bool | A digit or any other type of number |
IsLetterOrDigit (c : char) : bool | A letter or a digit |
IsControl (c : char) : bool | A control character such as a tab, a line feed, or a carriage return |
IsPunctuation (c : char) : bool | A punctuation mark such as , . - ! ? ' " ( ) # \ / % & * @ _ |
IsSymbol (c : char) : bool | A symbol such as + = ^ $ < > ~ or the vertical bar |
IsWhiteSpace (c : char) : bool | A white space, such as the one created by pressing the SPACE bar |
IsSeparator (c : char) : bool | A space, line-separator, or paragraph-separator character |
Here are examples of calling these methods:
open System

printfn "%b" (Char.IsLetter 'q')
printfn "%b" (Char.IsLower 'a')
printfn "%b" (Char.IsUpper 'W')
printfn "%b" (Char.IsDigit '1')
printfn "%b" (Char.IsLetterOrDigit 'w')
printfn "%b" (Char.IsLetterOrDigit '3')
printfn "%b" (Char.IsNumber '0')
printfn "%b" (Char.IsPunctuation '_')
printfn "%b" (Char.IsPunctuation '#')
printfn "%b" (Char.IsPunctuation '\\')
printfn "%b" (Char.IsWhiteSpace ' ')
printfn "%b" (Char.IsSeparator ' ')
printfn "%b" (Char.IsSymbol '+')
This would produce:
True
True
True
True
True
True
True
True
True
True
True
True
True
Press any key to close this window . . .
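The category tests can give surprising answers for a few characters, so it may help to probe some edge cases. The following is a minimal sketch that uses only the Char methods listed above; the value names are chosen here for illustration and are not part of the original lesson.

```fsharp
open System

// Tab and line feed are control characters; a line feed is not a separator.
let tabIsControl = Char.IsControl('\t')
let newlineIsControl = Char.IsControl('\n')
let newlineIsSeparator = Char.IsSeparator('\n')

// A plain space is both a white space and a separator.
let spaceIsWhiteSpace = Char.IsWhiteSpace(' ')
let spaceIsSeparator = Char.IsSeparator(' ')

// The vertical bar is classified as a symbol, not as punctuation.
let barIsSymbol = Char.IsSymbol('|')
let barIsPunctuation = Char.IsPunctuation('|')

printfn "Tab is a control character: %b" tabIsControl
printfn "Line feed is a control character: %b" newlineIsControl
printfn "Line feed is a separator: %b" newlineIsSeparator
printfn "Space is a white space: %b" spaceIsWhiteSpace
printfn "Space is a separator: %b" spaceIsSeparator
printfn "Vertical bar is a symbol: %b" barIsSymbol
printfn "Vertical bar is punctuation: %b" barIsPunctuation
```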
Introduction to Strings
The String as an Array of Characters
In different programs so far, when we needed a string object, we would declare a variable of type string. To support strings, the .NET Framework provides the String class. This class is defined in the F# language as the string data type. Here is an example of declaring, initializing, and using a string object:
let gender = "Female"
printfn "Gender: %s\n" gender
This would produce:
Gender: Female
Press any key to close this window . . .
If you observe a value such as "Female", you may see that it primarily resembles a collection of characters. A string is a group of characters, so it can be treated like an array of characters: after a string is declared and initialized, each character occupies a specific position. The positions are numbered so that the leftmost character of the string occupies index 0, the second character is at index 1, and so on.
To support this idea of an array of characters, the String class is equipped with an indexed property named Chars. This is also how you can retrieve the character at a specific index in the string, using the .[] operator of arrays. Here is an example:
let gender = "Female"
let gdr = gender.[2]
printfn "Gender: %s" gender
printfn "Character: %c" gdr
This would produce:
Gender: Female
Character: m
Press any key to close this window . . .
Because a string is considered a collection of items, you can use a for...in loop to access each member of the collection. Here is an example:
let gender = "Female"

printfn "Gender: %s" gender
printfn "\nIndividual Characters"
for c in gender do
    printfn "Character: %c" c
This would produce:
Gender: Female

Individual Characters
Character: F
Character: e
Character: m
Character: a
Character: l
Character: e
Press any key to close this window . . .
As mentioned already, the Char structure is equipped to find out what type of character is used somewhere. In fact, you can scan a string to find out what type of character is used in a certain position within the string. To support this, the Char structure has various methods that are second versions to the methods we saw for characters passed as arguments. Each of these second versions takes two arguments. The first argument is passed as a string. The second argument is the index where the character is positioned. If the character at that position is the category sought, the method returns true. Otherwise it returns false. The methods are:
Method | Returns true if the character at index i within string s is |
IsLetter (s : string, i : int) : bool | A letter |
IsLower (s : string, i : int) : bool | A lowercase letter |
IsUpper (s : string, i : int) : bool | An uppercase letter |
IsDigit (s : string, i : int) : bool | A digit |
IsNumber (s : string, i : int) : bool | A digit or any other type of number |
IsLetterOrDigit (s : string, i : int) : bool | A letter or a digit |
IsControl (s : string, i : int) : bool | A control character such as a tab, a line feed, or a carriage return |
IsPunctuation (s : string, i : int) : bool | A punctuation mark such as , . - ! ? ' " ( ) # \ / % & * @ _ |
IsSymbol (s : string, i : int) : bool | A symbol such as + = ^ $ < > ~ or the vertical bar |
IsWhiteSpace (s : string, i : int) : bool | A white space, such as the one created by pressing the SPACE bar |
IsSeparator (s : string, i : int) : bool | A space, line-separator, or paragraph-separator character |
Here are examples of calling these methods:
open System

let strSentence : string = "On Rte 29, the speed limit is set to 45m/hr. Violators may receive a $35 ticket!"
// i tracks the index of the character currently visited by the loop
let mutable i : int = 0

for c in strSentence do
    if Char.IsLetter(strSentence, i) then printfn "%c is a letter." c
    if Char.IsLower(strSentence, i) then printfn "%c is a lowercase letter." c
    if Char.IsUpper(strSentence, i) then printfn "%c is an uppercase letter." c
    if Char.IsDigit(strSentence, i) then printfn "%c is a digit." c
    if Char.IsNumber(strSentence, i) then printfn "%c is a number." c
    if Char.IsLetterOrDigit(strSentence, i) then printfn "%c is either a letter or a digit." c
    if Char.IsPunctuation(strSentence, i) then printfn "%c is a punctuation character." c
    if Char.IsSymbol(strSentence, i) then printfn "%c is a symbol." c
    if Char.IsWhiteSpace(strSentence, i) then printfn "%c is a white space." c
    if Char.IsSeparator(strSentence, i) then printfn "%c is a separator." c
    i <- i + 1
This would produce:
O is a letter. O is an uppercase letter. O is either a letter or a digit. n is a letter. n is a lowercase letter. n is either a letter or a digit. is a white space. is a separator. R is a letter. R is an uppercase letter. R is either a letter or a digit. t is a letter. t is a lowercase letter. t is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. is a white space. is a separator. 2 is a digit. 2 is a number. 2 is either a letter or a digit. 9 is a digit. 9 is a number. 9 is either a letter or a digit. , is a punctuation character. is a white space. is a separator. t is a letter. t is a lowercase letter. t is either a letter or a digit. h is a letter. h is a lowercase letter. h is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. is a white space. is a separator. s is a letter. s is a lowercase letter. s is either a letter or a digit. p is a letter. p is a lowercase letter. p is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. d is a letter. d is a lowercase letter. d is either a letter or a digit. is a white space. is a separator. l is a letter. l is a lowercase letter. l is either a letter or a digit. i is a letter. i is a lowercase letter. i is either a letter or a digit. m is a letter. m is a lowercase letter. m is either a letter or a digit. i is a letter. i is a lowercase letter. i is either a letter or a digit. t is a letter. t is a lowercase letter. t is either a letter or a digit. is a white space. is a separator. i is a letter. i is a lowercase letter. i is either a letter or a digit. s is a letter. s is a lowercase letter. s is either a letter or a digit. is a white space. is a separator. s is a letter. s is a lowercase letter. s is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. 
t is a letter. t is a lowercase letter. t is either a letter or a digit. is a white space. is a separator. t is a letter. t is a lowercase letter. t is either a letter or a digit. o is a letter. o is a lowercase letter. o is either a letter or a digit. is a white space. is a separator. 4 is a digit. 4 is a number. 4 is either a letter or a digit. 5 is a digit. 5 is a number. 5 is either a letter or a digit. m is a letter. m is a lowercase letter. m is either a letter or a digit. / is a punctuation character. h is a letter. h is a lowercase letter. h is either a letter or a digit. r is a letter. r is a lowercase letter. r is either a letter or a digit. . is a punctuation character. is a white space. is a separator. V is a letter. V is an uppercase letter. V is either a letter or a digit. i is a letter. i is a lowercase letter. i is either a letter or a digit. o is a letter. o is a lowercase letter. o is either a letter or a digit. l is a letter. l is a lowercase letter. l is either a letter or a digit. a is a letter. a is a lowercase letter. a is either a letter or a digit. t is a letter. t is a lowercase letter. t is either a letter or a digit. o is a letter. o is a lowercase letter. o is either a letter or a digit. r is a letter. r is a lowercase letter. r is either a letter or a digit. s is a letter. s is a lowercase letter. s is either a letter or a digit. is a white space. is a separator. m is a letter. m is a lowercase letter. m is either a letter or a digit. a is a letter. a is a lowercase letter. a is either a letter or a digit. y is a letter. y is a lowercase letter. y is either a letter or a digit. is a white space. is a separator. r is a letter. r is a lowercase letter. r is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. c is a letter. c is a lowercase letter. c is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. i is a letter. 
i is a lowercase letter. i is either a letter or a digit. v is a letter. v is a lowercase letter. v is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. is a white space. is a separator. a is a letter. a is a lowercase letter. a is either a letter or a digit. is a white space. is a separator. $ is a symbol. 3 is a digit. 3 is a number. 3 is either a letter or a digit. 5 is a digit. 5 is a number. 5 is either a letter or a digit. is a white space. is a separator. t is a letter. t is a lowercase letter. t is either a letter or a digit. i is a letter. i is a lowercase letter. i is either a letter or a digit. c is a letter. c is a lowercase letter. c is either a letter or a digit. k is a letter. k is a lowercase letter. k is either a letter or a digit. e is a letter. e is a lowercase letter. e is either a letter or a digit. t is a letter. t is a lowercase letter. t is either a letter or a digit. ! is a punctuation character. Press any key to close this window . . .
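When the goal is a summary rather than a line per character, the same category tests can be used to count characters. The following is a minimal sketch under that assumption; countWhere is a hypothetical helper introduced here, and it relies on the one-argument versions of the Char methods.

```fsharp
open System

let sentence = "On Rte 29, the speed limit is set to 45m/hr. Violators may receive a $35 ticket!"

// countWhere (hypothetical helper): counts the characters of text that pass the test.
let countWhere (test : char -> bool) (text : string) : int =
    text |> Seq.filter test |> Seq.length

let letters = countWhere Char.IsLetter sentence
let digits = countWhere Char.IsDigit sentence
let punctuation = countWhere Char.IsPunctuation sentence
let symbols = countWhere Char.IsSymbol sentence
let whiteSpaces = countWhere Char.IsWhiteSpace sentence

printfn "Letters: %i" letters
printfn "Digits: %i" digits
printfn "Punctuation: %i" punctuation
printfn "Symbols: %i" symbols
printfn "White spaces: %i" whiteSpaces
```

In this sentence every character falls into exactly one of the five groups, so the counts add up to sentence.Length.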
Converting Characters to the Opposite Case
The English language uses two character representations: lowercase and uppercase. The characters in lowercase are: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, and z. The equivalent characters in uppercase are A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, and Z. Characters used for counting are called numeric characters; each one of them is called a digit. They are 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. Other characters are used to represent things in computer applications, mathematics, and other fields. Some of these characters, also called symbols, are ~ ! @ # $ % ^ & * ( ) _ + { } ` | = [ ] \ : " ; ' < > ? , . / These characters are used for various reasons and under different circumstances. For example, some of them are used as operators in mathematics or in computer programming. Regardless of whether a character is easily identifiable or not, all these symbols are characters and can be stored in a value of type char.
An alphabetic character can, for any reason judged necessary, be converted from one case to the other. The other characters (non-alphabetic symbols and digits) do not have a case and therefore cannot be converted from one case to another.
To convert a string from lowercase to uppercase, you can call the ToUpper() method of the String class. It is overloaded with two versions. One of the versions of this method uses the following syntax:
member ToUpper : unit -> string
This method takes no argument. This method considers each character of the string that called it. If the character is already in uppercase, it would not change. If the character is a lowercase alphabetic character, it would be converted to uppercase. If the character is not an alphabetic character, it would be kept "as-is". Here is an example:
open System

let strFullName = "Alexander Patrick Katts"
let strConversion = strFullName.ToUpper()

printfn "Full Name: %s" strFullName
printfn "Full Name: %s" strConversion
This would produce:
Full Name: Alexander Patrick Katts
Full Name: ALEXANDER PATRICK KATTS
Press any key to close this window . . .
To convert a string to lowercase, you can call the String.ToLower() method. Its syntax is:
member ToLower : unit -> string
This method follows the same logic as its counterpart: it scans the string that called it, visiting each character. If the character is not an alphabetic character, it is kept "as-is". If the character is an uppercase alphabetic character, it is converted to lowercase. If it is already in lowercase, it is not converted.
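The lowercase conversion can be illustrated the same way as the uppercase one. The sketch below also uses Char.ToUpper and Char.ToLower to convert a single character; those two methods are not covered in the text above and are shown here only as an assumption about what a reader may want next.

```fsharp
open System

let strFullName = "Alexander Patrick Katts"
// Every uppercase letter is converted; spaces and lowercase letters stay as they are.
let strLower = strFullName.ToLower()

// Single characters can be converted with the static methods of Char.
let upperM = Char.ToUpper('m')
let lowerW = Char.ToLower('W')
// A character without a case is returned unchanged.
let dollar = Char.ToUpper('$')

printfn "Full Name: %s" strFullName
printfn "Lowercase: %s" strLower
printfn "Characters: %c %c %c" upperM lowerW dollar
```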
Replacing a Character
If you have a string that contains a wrong character, you can either delete that character or replace it with another character of your choice. To support this operation, the String class is equipped with the Replace() method that is overloaded with two versions. One of the versions of the String.Replace() method uses the following syntax:
member Replace : oldChar:char * newChar:char -> string
The first argument of this method identifies the character to look for. Everywhere that character is found in the string, it is replaced by the character passed as the second argument. Because a char cannot be empty, this version cannot delete a character; to remove one, use the version of Replace() that takes two strings and pass an empty string as the second argument. Here is an example that starts from a telephone number and strips the spaces, parentheses, and dash until only the digits remain:
open System

let mutable phoneNumber : string = "(105) 293-8074"
printfn "Phone Number: %s" phoneNumber

// Remove the spaces
phoneNumber <- phoneNumber.Replace(" ", "")
printfn "Phone Number: %s" phoneNumber

// Remove the left parenthesis, if any
phoneNumber <- phoneNumber.Replace("(", "")
printfn "Phone Number: %s" phoneNumber

// Remove the right parenthesis, if any
phoneNumber <- phoneNumber.Replace(")", "")
printfn "Phone Number: %s" phoneNumber

// Remove the dash, if any
phoneNumber <- phoneNumber.Replace("-", "")
printfn "Phone Number: %s" phoneNumber
This would produce:
Phone Number: (105) 293-8074
Phone Number: (105)293-8074
Phone Number: 105)293-8074
Phone Number: 105293-8074
Phone Number: 1052938074
Press any key to close this window . . .
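For comparison, here is a minimal sketch of the character-based version of Replace(), which swaps one character for another. The last line shows an alternative way to keep only the digits with String.filter from the F# core library; that function is not discussed in the text above and is included only as an illustration.

```fsharp
open System

let phoneNumber = "(105) 293-8074"

// Character version: every dash becomes a period.
let dotted = phoneNumber.Replace('-', '.')

// Alternative sketch: keep only the characters that are digits.
let digitsOnly = String.filter Char.IsDigit phoneNumber

printfn "Original: %s" phoneNumber
printfn "Dotted:   %s" dotted
printfn "Digits:   %s" digitsOnly
```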
Working With Strings
An Empty String
A string is referred to as empty if it contains nothing at all. Here is an example:
let empty : string = ""
printfn "String: %s" empty
This would produce:
String:
Press any key to close this window . . .
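To test whether a string is empty, one option is to compare it with String.Empty or to call String.IsNullOrEmpty(). Neither member is introduced in the text above, so the following is only a sketch of how such a check might look.

```fsharp
open System

let empty : string = ""
let name = "Female"

// String.Empty is the predefined empty string.
let emptyMatches = (empty = String.Empty)
// IsNullOrEmpty returns true for a null reference or for a string with no characters.
let emptyIsEmpty = String.IsNullOrEmpty(empty)
let nameIsEmpty = String.IsNullOrEmpty(name)

printfn "empty equals String.Empty: %b" emptyMatches
printfn "empty is empty: %b" emptyIsEmpty
printfn "name is empty: %b" nameIsEmpty
```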
The Length of a String
In many operations, you will need to know the number of characters a string consists of. To get the size of a string, the String class provides the Length property. Here is an example of using it:
open System
let gender = "Female"
printfn "Gender: %s" gender
printfn "Length: %i Characters\n" gender.Length
This would produce:
Gender: Female
Length: 6 Characters

Press any key to close this window . . .
In the same way, you can access the Length property when processing the individual characters of a string. Here is an example:
open System

let gender = "Female"

printfn "Gender: %s" gender
printfn "Length: %i Characters" gender.Length
printfn "\nIndividual Characters"
for c = 0 to gender.Length - 1 do
    printfn "Index.[%i]: %c" c gender.[c]
This would produce:
Gender: Female
Length: 6 Characters

Individual Characters
Index.[0]: F
Index.[1]: e
Index.[2]: m
Index.[3]: a
Index.[4]: l
Index.[5]: e
Press any key to close this window . . .
String Concatenation
One of the routine operations you can perform on two strings consists of adding one to another, that is, putting one string to the right of another string, to produce a new string made of both. There are two techniques you can use.
To add one string to another, you can use the addition operator as done in arithmetic. Here is an example:
let strNeed = "Needs"
let strRepair = "Repair"
let strAddition = strNeed + strRepair
printfn "%s" strAddition
This would produce:
NeedsRepair
Press any key to close this window . . .
In the same way, you can add as many strings as necessary using +. Here is an example:
let strfirstName = "Alexander"
let strMiddleName = "Patrick"
let strlastName = "Katts"
let strFullName = strfirstName + " " + strMiddleName + " " + strlastName

printfn "First Name: %s" strfirstName
printfn "Middle Name: %s" strMiddleName
printfn "Last Name: %s" strlastName
printfn "Full Name: %s\n" strFullName
This would produce:
First Name: Alexander
Middle Name: Patrick
Last Name: Katts
Full Name: Alexander Patrick Katts

Press any key to close this window . . .
Besides the addition operator, to formally support string concatenation, the String class provides the Concat() method that is overloaded in various versions. One of the versions of this method takes two String arguments. Its syntax is:
static member Concat : str0:string * str1:string -> string
This version takes two strings and returns a new string made of the first followed by the second. Two similar versions use the following syntaxes:
static member Concat : str0:string * str1:string * str2:string -> string
static member Concat : str0:string * str1:string * str2:string * str3:string -> string
In each case, the method takes the indicated number of strings and joins them in order.
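As an illustration, the following sketch calls the two-, three-, and four-string versions of String.Concat(). The sample values are reused from the earlier examples; the variable names are chosen here.

```fsharp
open System

// Two strings
let strAddition = String.Concat("Needs", "Repair")
// Three strings
let strShortName = String.Concat("Alexander", " ", "Katts")
// Four strings
let strFullName = String.Concat("Alexander", " ", "Patrick", " Katts")

printfn "%s" strAddition
printfn "%s" strShortName
printfn "%s" strFullName
```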
Replacing a Sub-String
Inside a string, if you have a combination of consecutive characters you don't want to keep, you can either remove that sub-string or replace it with a new combination of consecutive characters of your choice. To support this operation, the String class provides another version of the Replace() method whose syntax is:
member Replace : oldValue:string * newValue:string -> string
The oldValue argument is the sub-string to look for in the string. Wherever that sub-string is found in the string, it is replaced by the newValue argument. To remove the sub-string instead, pass an empty string as newValue.
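Here is a minimal sketch of the string-based Replace(), first replacing a sub-string and then removing one. The sample strings are made up for illustration.

```fsharp
let strStatus = "Needs Repair"

// Replace one sub-string with another.
let strUpdated = strStatus.Replace("Needs", "Under")

// Remove a sub-string by replacing it with an empty string.
let strSpeed = "45m/hr".Replace("/hr", "")

printfn "Before: %s" strStatus
printfn "After:  %s" strUpdated
printfn "Speed:  %s" strSpeed
```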
Formatting a String
Formatting a string consists of specifying how it would be presented as an object. To support this operation, the String class is equipped with a static method named Format. The String.Format() method is overloaded in various versions; the syntax of the simplest is:
static member Format : format:string * arg0:Object -> string
This method takes two arguments. The first argument is a format string that can contain one or more placeholders written as an index between curly brackets, such as {0}. The second argument is the value that replaces the {0} placeholder. Other versions of the method accept more values, which fill {1}, {2}, and so on.
Here is an example:
open System

let wage = 22.45
let strDisplay = String.Format("Hourly Salary: {0}", wage)

printfn "%s" strDisplay
This would produce:
Hourly Salary: 22.45
Press any key to close this window . . .
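A version of String.Format() that takes several values can be sketched as follows. The example assumes the overload that accepts a format provider as its first argument, and it passes CultureInfo.InvariantCulture so that the decimal separator does not depend on the machine's regional settings; the F2 specifier requests two decimal places. The employee name and wage are made-up sample values.

```fsharp
open System
open System.Globalization

let employee = "Alexander"
let wage = 22.5

// {0} receives the name; {1:F2} receives the wage formatted with two decimals.
let strDisplay =
    String.Format(CultureInfo.InvariantCulture, "{0} earns {1:F2} per hour", employee, wage)

printfn "%s" strDisplay
```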
Copying a String
After declaring and initializing one string variable, you can assign it to another string variable using the assignment operator. Here is an example:
let strPerson = "Charles Stanley"
let strSomebody = strPerson
printfn "Full Name: %s" strPerson
printfn "Full Name: %s" strSomebody
This would produce:
Full Name: Charles Stanley
Full Name: Charles Stanley
Press any key to close this window . . .
Assigning one variable to another is referred to as copying it. To formally support this operation, the String class is equipped with the Copy() method. Its syntax is:
static member Copy : str:string -> string
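This method takes an existing String object as argument and copies it, producing a new string with the same characters. Note that recent versions of .NET (.NET Core 3.0 and later) mark String.Copy() as obsolete; because strings are immutable, plain assignment is normally sufficient. Here is an example: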
open System
let strPerson = "Charles Stanley"
let strSomebody = String.Copy(strPerson)
printfn "Full Name: %s" strPerson
printfn "Full Name: %s" strSomebody
The String.Copy() method is used to copy all characters of one string into another. If you want to copy only a few characters, use the String.CopyTo() method. Its syntax is:
member CopyTo : sourceIndex:int * destination:char[] * destinationIndex:int * count:int -> unit
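The arguments of CopyTo() are easier to follow with an example. The sketch below copies the first seven characters of a string into a character array created for that purpose, then builds a new string from the array; the array name and the chosen indexes are illustrative.

```fsharp
open System

let strPerson = "Charles Stanley"

// Destination array with room for the 7 characters of "Charles".
let buffer : char[] = Array.zeroCreate 7

// Copy 7 characters, starting at index 0 of the string, to index 0 of the array.
strPerson.CopyTo(0, buffer, 0, 7)

// Build a string from the copied characters.
let strFirstName = new String(buffer)

printfn "Full Name:  %s" strPerson
printfn "First Name: %s" strFirstName
```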
String Comparisons
Introduction
String comparison consists of examining the characters of two strings, comparing each character of one string to the character at the same position in the other string. To support this operation, the String class is equipped with the Compare() method that is overloaded with many versions. One of the versions uses the following syntax:
static member Compare : strA:string * strB:string -> int
This method is declared static and it takes two arguments. When it starts, the first character of the first argument is compared to the first character of the second string. If the first character of the first string comes alphabetically before the first character of the second, this method returns a negative value. If the first character of the first string comes alphabetically after the first character of the second, this method returns a positive value. If the first characters of both strings are the same, the method continues with the second character of each string. If both strings have the exact same characters, the method returns 0. This can be summarized as follows. The method returns:
A negative value if strA comes before strB
0 if strA and strB are the same
A positive value if strA comes after strB
Here is an example:
open System

let firstName1 = "Andy"
let lastName1 = "Stanley"
let firstName2 = "Charles"
let lastName2 = "Stanley"

let value1 = String.Compare(firstName1, firstName2)
let value2 = String.Compare(firstName2, firstName1)
let value3 = String.Compare(lastName1, lastName2)

printfn "The result of comparing %s and %s is %i" firstName1 firstName2 value1
printfn "The result of comparing %s and %s is %i" firstName2 firstName1 value2
printfn "The result of comparing %s and %s is %i\n" lastName1 lastName2 value3
This would produce:
The result of comparing Andy and Charles is -1
The result of comparing Charles and Andy is 1
The result of comparing Stanley and Stanley is 0

Press any key to continue...
When using this version of the String.Compare() method, the case (upper or lower) of each character is considered. If you don't want to consider this option, the String class proposes another version of the method. Its syntax is:
static member Compare : strA:string * strB:string * ignoreCase:bool -> int
The third argument allows you to ignore the case of the characters when performing the comparison.
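The effect of the third argument can be sketched as follows. Only the sign of the result is relied on here, because the exact non-zero value returned by Compare() should not be assumed; the sample strings are chosen for illustration.

```fsharp
open System

let lastName1 = "Stanley"
let lastName2 = "STANLEY"

// Case is considered: the two strings are reported as different.
let caseSensitive = String.Compare(lastName1, lastName2)
// Case is ignored: the two strings are reported as equal.
let caseInsensitive = String.Compare(lastName1, lastName2, true)

printfn "Considering the case, the strings are equal: %b" (caseSensitive = 0)
printfn "Ignoring the case, the strings are equal: %b" (caseInsensitive = 0)
```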
String Equality
In the previous section, we saw that the characters at the same positions in two strings can be compared to find out whether one string comes before or after the other. If you are only interested in knowing whether two strings are equal, you can call the Equals() method of the String class. It is overloaded with various versions. Two versions use the following syntaxes:
override Equals : obj:Object -> bool
override Equals : value:string -> bool
Call one of these versions on a String variable and pass the value to compare it with as the argument: an Object in the first version, a String in the second. If both values are exactly the same, the method returns true. The comparison takes the case of each character into account. If you don't want to consider the case, use the following version of the method:
member Equals : value:string * comparisonType:StringComparison -> bool
An alternative to the second syntax is to use a static version of this method whose syntax is:
static member Equals : a:string * b:string -> bool
This method takes two String arguments and compares them. If they are the same, the method returns true. This method considers the case of the characters. If you don't want this factor taken into consideration, use the following static version of the method:
static member Equals : a:string * b:string * comparisonType:StringComparison -> bool
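The following sketch exercises the instance and static forms of Equals(), with and without a StringComparison value. StringComparison.OrdinalIgnoreCase is one of several members of that enumeration and is picked here as an assumption about the most common way to ignore case; the static call with three arguments is likewise an illustration, not something shown in the text above.

```fsharp
open System

let name1 = "Stanley"
let name2 = "STANLEY"

// Instance versions
let sameExact = name1.Equals(name2)
let sameIgnoringCase = name1.Equals(name2, StringComparison.OrdinalIgnoreCase)

// Static versions
let staticExact = String.Equals(name1, name2)
let staticIgnoringCase = String.Equals(name1, name2, StringComparison.OrdinalIgnoreCase)

printfn "Instance, case considered: %b" sameExact
printfn "Instance, case ignored:    %b" sameIgnoringCase
printfn "Static, case considered:   %b" staticExact
printfn "Static, case ignored:      %b" staticIgnoringCase
```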
Working With Sub-Strings
Introduction
A sub-string is a section or part of a string. To create a sub-string, you first need a string, from which you retrieve one or more consecutive characters. To support this, the String class is equipped with the Substring() method that is overloaded in two versions. The syntax of one is:
member Substring : startIndex:int -> string
The integer argument specifies the position of the first character from the variable that called the method. The return value is a new String that is made of the characters from startIndex to the end of the string.
Sub-String Creation
Probably the most consistent way to create a sub-string is to control both where it begins and how many characters are retrieved from the original string. To support this, the String class is equipped with another version of the Substring() method. Its syntax is:
member Substring : startIndex:int * length:int -> string
The first argument specifies the index of the first character to retrieve from the String variable that calls this method. The second argument specifies the number of characters in the sub-string.
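Both versions of Substring() can be sketched with the sample strings used earlier in this lesson. The indexes are zero-based, so in "Alexander Patrick Katts" the letter P is at index 10.

```fsharp
let gender = "Female"
let strFullName = "Alexander Patrick Katts"

// From index 2 to the end of the string.
let strEnd = gender.Substring(2)

// 7 characters, starting at index 10.
let strMiddleName = strFullName.Substring(10, 7)

printfn "From index 2: %s" strEnd
printfn "Middle Name:  %s" strMiddleName
```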
Copyright © 2014-2024, FunctionX | Monday 04 September 2016