Having Characters

Introduction

To represent the values of an application, we primarily use characters: letters, digits, and other symbols, whether or not they belong to an alphabet. To recognize these symbols, the F# language provides the char data type. The char data type is identified in the .NET Framework by the Char structure, which represents each character as a 16-bit value. As a result, F#'s char is simply another name for .NET's Char structure.

To declare a variable that can hold one character (a letter, a digit, or a symbol), initialize it with a value included between two single-quotes. Here are examples:

let gender = 'm'
let moneySymbol = '$'
let multiplication = '*'
let numberOne = '1'

printfn "A few characters"
printfn "Gender:         %c" gender
printfn "Money Symbol:   %c" moneySymbol
printfn "Multiplication: %c" multiplication
printfn "Number One:     %c" numberOne

This would produce:

A few characters
Gender:         m
Money Symbol:   $
Multiplication: *
Number One:     1

Press any key to continue...
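Because a char is stored as a 16-bit value, it can be converted to and from its numeric code. The following is a small sketch that uses the built-in int and char conversion functions; the names letter, code, and following are chosen only for illustration:

```fsharp
let letter    = 'A'
let code      = int letter       // numeric code of the character
let following = char (code + 1)  // character whose code comes right after

printfn "Character: %c" letter
printfn "Code:      %i" code
printfn "Following: %c" following
printfn "Size:      %i bytes" sizeof<char>
```

Here, the code of 'A' is 65, the following character is 'B', and the size of a char is 2 bytes.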

Categories of Characters

As far as computers or operating systems are concerned, every readable or non-readable symbol used in an application is a character. All those symbols are considered objects of type char. The Char structure is able to recognize every one of them. In fact, the Char structure classifies the symbols into various categories.

An alphabetical letter is a readable character recognized by a human language. To let you find out whether a character is a letter, the Char structure is equipped with a static method named IsLetter. It is overloaded with two versions.

A digit is a symbol used in a number. It can be 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. To let you find out whether a character is a digit, the Char structure is equipped with the IsDigit() static method, which is also overloaded with two versions.

In the same way, the Char structure provides various methods to test the category of a character. All these methods are static and each comes in two versions. One version takes a single character as argument. If the argument belongs to the category sought, the method returns true. Otherwise it returns false. The methods are:

Method                               Returns true if the argument is
IsLetter (c : char) : bool           A letter
IsLower (c : char) : bool            A lowercase letter
IsUpper (c : char) : bool            An uppercase letter
IsDigit (c : char) : bool            A digit
IsNumber (c : char) : bool           A digit or any other type of numeric character
IsLetterOrDigit (c : char) : bool    A letter or a digit
IsControl (c : char) : bool          A control character such as a tab, a line feed, or a carriage return
IsPunctuation (c : char) : bool      A punctuation mark such as , . - _ ! ? ' " ( ) # \ / % & * @
IsSymbol (c : char) : bool           A symbol such as | + = ^ $ < >
IsWhiteSpace (c : char) : bool       A white space such as the one created by pressing the SPACE bar, a tab, or a line break
IsSeparator (c : char) : bool        A space, a line separator, or a paragraph separator

Here are examples of calling these methods:

open System

printfn "%b" (Char.IsLetter 'q')
printfn "%b" (Char.IsLower 'a')
printfn "%b" (Char.IsUpper 'W')
printfn "%b" (Char.IsDigit '1')
printfn "%b" (Char.IsLetterOrDigit 'w')
printfn "%b" (Char.IsLetterOrDigit '3')
printfn "%b" (Char.IsNumber '0')
printfn "%b" (Char.IsPunctuation '_')
printfn "%b" (Char.IsPunctuation '#')
printfn "%b" (Char.IsPunctuation '\\')
printfn "%b" (Char.IsWhiteSpace ' ')
printfn "%b" (Char.IsSeparator ' ')
printfn "%b" (Char.IsSymbol '+')

This would produce:

true
true
true
true
true
true
true
true
true
true
true
true
true
Press any key to close this window . . .
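Some categories are easy to confuse. The following sketch, which uses only the Char methods listed above, shows a few cases where the result may be unexpected:

```fsharp
open System

printfn "%b" (Char.IsControl '\t')     // true: a tab is a control character
printfn "%b" (Char.IsWhiteSpace '\n')  // true: a line feed counts as white space
printfn "%b" (Char.IsSeparator '\n')   // false: a line feed is not a separator
printfn "%b" (Char.IsPunctuation '<')  // false: '<' is classified as a symbol
printfn "%b" (Char.IsSymbol '<')       // true
```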

Introduction to Strings

The String as an Array of Characters

In different programs so far, when we needed a string object, we would declare a variable of type string. To support strings, the .NET Framework provides the String class. The F# language refers to this class as the string data type. Here is an example of declaring, initializing, and using a string object:

let gender = "Female"

printfn "Gender: %s\n" gender

This would produce:

Gender: Female

Press any key to close this window . . .

If you observe a value such as "Female", you may see that it primarily resembles a collection of characters. A string is a group of characters. This also means that a string behaves like an array of characters: after declaring and initializing a string, each character occupies a specific position. The positions are numbered so that the leftmost character of the string occupies index 0, the second character is at index 1, and so on.

To support this idea of an array of characters, the String class is equipped with an indexed property named Chars. In F#, you retrieve the character at a specific index in the string by using the .[] operator, as done for arrays. Here is an example:

let gender = "Female"
let gdr = gender.[2]

printfn "Gender:    %s" gender
printfn "Character: %c" gdr

This would produce:

Gender:    Female
Character: m

Press any key to close this window . . .

Because a string is considered a collection of items, you can use a for...in loop to access each member of the collection. Here is an example:

let gender = "Female"

printfn "Gender: %s" gender

printfn "\nIndividual Characters"
for c : char in gender do
    printfn "Character: %c" c

This would produce:

Gender: Female

Individual Characters
Character: F
Character: e
Character: m
Character: a
Character: l
Character: e
Press any key to close this window . . .

As mentioned already, the Char structure is equipped to find out what type of character is used somewhere. In fact, you can scan a string to find out what type of character is used at a certain position within the string. To support this, each of the methods we saw for characters has a second version that takes two arguments. The first argument is the string. The second argument is the index where the character is positioned. If the character at that position belongs to the category sought, the method returns true. Otherwise it returns false. The methods are:

Method                                          Returns true if the character at index i within string s is
IsLetter (s : string, i : int) : bool           A letter
IsLower (s : string, i : int) : bool            A lowercase letter
IsUpper (s : string, i : int) : bool            An uppercase letter
IsDigit (s : string, i : int) : bool            A digit
IsNumber (s : string, i : int) : bool           A digit or any other type of numeric character
IsLetterOrDigit (s : string, i : int) : bool    A letter or a digit
IsControl (s : string, i : int) : bool          A control character such as a tab, a line feed, or a carriage return
IsPunctuation (s : string, i : int) : bool      A punctuation mark such as , . - _ ! ? ' " ( ) # \ / % & * @
IsSymbol (s : string, i : int) : bool           A symbol such as | + = ^ $ < >
IsWhiteSpace (s : string, i : int) : bool       A white space such as the one created by pressing the SPACE bar, a tab, or a line break
IsSeparator (s : string, i : int) : bool        A space, a line separator, or a paragraph separator

Here are examples of calling these methods:

open System

let strSentence : string = "On Rte 29, the speed limit is set to 45m/hr. Violators may receive a $35 ticket!"

// Index of the character currently being examined
let mutable i : int = 0

for c : Char in strSentence do
    if Char.IsLetter(strSentence, i) then
        printfn "%c is a letter." c
    if Char.IsLower(strSentence, i) then
        printfn "%c is a lowercase letter." c
    if Char.IsUpper(strSentence, i) then
        printfn "%c is an uppercase letter." c
    if Char.IsDigit(strSentence, i) then
        printfn "%c is a digit." c
    if Char.IsNumber(strSentence, i) then
        printfn "%c is a number." c
    if Char.IsLetterOrDigit(strSentence, i) then
        printfn "%c is either a letter or a digit." c
    if Char.IsPunctuation(strSentence, i) then
        printfn "%c is a punctuation character." c
    if Char.IsSymbol(strSentence, i) then
        printfn "%c is a symbol." c
    if Char.IsWhiteSpace(strSentence, i) then
        printfn "%c is a white space." c
    if Char.IsSeparator(strSentence, i) then
        printfn "%c is a separator." c

    // Move to the next position in the string
    i <- i + 1

This would produce:

O is a letter.
O is an uppercase letter.
O is either a letter or a digit.
n is a letter.
n is a lowercase letter.
n is either a letter or a digit.
  is a white space.
  is a separator.
R is a letter.
R is an uppercase letter.
R is either a letter or a digit.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
  is a white space.
  is a separator.
2 is a digit.
2 is a number.
2 is either a letter or a digit.
9 is a digit.
9 is a number.
9 is either a letter or a digit.
, is a punctuation character.
  is a white space.
  is a separator.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
h is a letter.
h is a lowercase letter.
h is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
  is a white space.
  is a separator.
s is a letter.
s is a lowercase letter.
s is either a letter or a digit.
p is a letter.
p is a lowercase letter.
p is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
d is a letter.
d is a lowercase letter.
d is either a letter or a digit.
  is a white space.
  is a separator.
l is a letter.
l is a lowercase letter.
l is either a letter or a digit.
i is a letter.
i is a lowercase letter.
i is either a letter or a digit.
m is a letter.
m is a lowercase letter.
m is either a letter or a digit.
i is a letter.
i is a lowercase letter.
i is either a letter or a digit.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
  is a white space.
  is a separator.
i is a letter.
i is a lowercase letter.
i is either a letter or a digit.
s is a letter.
s is a lowercase letter.
s is either a letter or a digit.
  is a white space.
  is a separator.
s is a letter.
s is a lowercase letter.
s is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
  is a white space.
  is a separator.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
o is a letter.
o is a lowercase letter.
o is either a letter or a digit.
  is a white space.
  is a separator.
4 is a digit.
4 is a number.
4 is either a letter or a digit.
5 is a digit.
5 is a number.
5 is either a letter or a digit.
m is a letter.
m is a lowercase letter.
m is either a letter or a digit.
/ is a punctuation character.
h is a letter.
h is a lowercase letter.
h is either a letter or a digit.
r is a letter.
r is a lowercase letter.
r is either a letter or a digit.
. is a punctuation character.
  is a white space.
  is a separator.
V is a letter.
V is an uppercase letter.
V is either a letter or a digit.
i is a letter.
i is a lowercase letter.
i is either a letter or a digit.
o is a letter.
o is a lowercase letter.
o is either a letter or a digit.
l is a letter.
l is a lowercase letter.
l is either a letter or a digit.
a is a letter.
a is a lowercase letter.
a is either a letter or a digit.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
o is a letter.
o is a lowercase letter.
o is either a letter or a digit.
r is a letter.
r is a lowercase letter.
r is either a letter or a digit.
s is a letter.
s is a lowercase letter.
s is either a letter or a digit.
  is a white space.
  is a separator.
m is a letter.
m is a lowercase letter.
m is either a letter or a digit.
a is a letter.
a is a lowercase letter.
a is either a letter or a digit.
y is a letter.
y is a lowercase letter.
y is either a letter or a digit.
  is a white space.
  is a separator.
r is a letter.
r is a lowercase letter.
r is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
c is a letter.
c is a lowercase letter.
c is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
i is a letter.
i is a lowercase letter.
i is either a letter or a digit.
v is a letter.
v is a lowercase letter.
v is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
  is a white space.
  is a separator.
a is a letter.
a is a lowercase letter.
a is either a letter or a digit.
  is a white space.
  is a separator.
$ is a symbol.
3 is a digit.
3 is a number.
3 is either a letter or a digit.
5 is a digit.
5 is a number.
5 is either a letter or a digit.
  is a white space.
  is a separator.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
i is a letter.
i is a lowercase letter.
i is either a letter or a digit.
c is a letter.
c is a lowercase letter.
c is either a letter or a digit.
k is a letter.
k is a lowercase letter.
k is either a letter or a digit.
e is a letter.
e is a lowercase letter.
e is either a letter or a digit.
t is a letter.
t is a lowercase letter.
t is either a letter or a digit.
! is a punctuation character.

Press any key to close this window . . .
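When you only need to know how many characters of a string belong to a category, you can pass one of the single-character Char methods to a function instead of maintaining an index. The following is a sketch only; countWhere is a hypothetical helper written for this illustration, not a member of the Char structure or the String class:

```fsharp
open System

let sentence = "On Rte 29, the speed limit is set to 45m/hr."

// Hypothetical helper: counts the characters of s that satisfy the test
let countWhere (test : char -> bool) (s : string) =
    s |> Seq.filter test |> Seq.length

let letters      = countWhere Char.IsLetter sentence
let digits       = countWhere Char.IsDigit sentence
let punctuations = countWhere Char.IsPunctuation sentence

printfn "Letters:      %i" letters
printfn "Digits:       %i" digits
printfn "Punctuations: %i" punctuations
```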

Converting Characters to the Opposite Case

The English language uses two character representations: lowercase and uppercase. The characters in lowercase are: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, and z. The equivalent characters in uppercase are A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, and Z. Characters used for counting are called numeric characters; each one of them is called a digit. They are 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. There are other characters used to represent things in computer applications, mathematics, and other fields. Some of these characters, also called symbols, are ~ ! @ # $ % ^ & * ( ) _ + { } ` | = [ ] \ : " ; ' < > ? , . / These characters are used for various reasons and under different circumstances. For example, some of them are used as operators in mathematics or in computer programming. Regardless of whether a character is easily identifiable or not, all these symbols are characters and can be stored in values of the char data type.

An alphabetic character, for any reason judged necessary, can be converted from one case to another. The other characters, such as non-alphabetic symbols and digits, do not have a case and therefore cannot be converted from one case to the other.

To convert a string from lowercase to uppercase, you can call the ToUpper() method of the String class. It is overloaded with two versions. One of the versions of this method uses the following syntax:

member ToUpper : unit -> string

This method takes no argument. This method considers each character of the string that called it. If the character is already in uppercase, it would not change. If the character is a lowercase alphabetic character, it would be converted to uppercase. If the character is not an alphabetic character, it would be kept "as-is". Here is an example:

open System

let strFullName   = "Alexander Patrick Katts"
let strConversion = strFullName.ToUpper()

printfn "Full Name: %s" strFullName
printfn "Full Name: %s" strConversion

This would produce:

Full Name: Alexander Patrick Katts
Full Name: ALEXANDER PATRICK KATTS
Press any key to close this window . . .

To convert a string to lowercase, you can call the String.ToLower() method. Its syntax is:

member ToLower : unit -> string

This method follows the same logic as its counterpart: it scans the string that called it, visiting each character. If the character is not an alphabetic character, it would be kept "as-is". If the character is an uppercase alphabetic character, it would be converted to lowercase. If it is in lowercase, it would not be converted.
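As an illustration, here is the previous example adapted to ToLower(). This is only a sketch that reuses the names of the ToUpper() example:

```fsharp
let strFullName   = "Alexander Patrick Katts"
let strConversion = strFullName.ToLower()

printfn "Full Name: %s" strFullName
printfn "Full Name: %s" strConversion
```

The second line displayed would be alexander patrick katts, while the original string is left unchanged.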

Replacing a Character

If you have a string that contains a wrong character, you can either delete that character or replace it with another character of your choice. To support this operation, the String class is equipped with the Replace() method that is overloaded with two versions. One of the versions of the String.Replace() method uses the following syntax:

member Replace : 
        oldChar:char * 
        newChar:char -> string

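The first argument of this method is used to identify the sought character. Everywhere that character is found in the string, it is replaced by the character passed as the second argument. Because a char value cannot be empty, this version cannot delete a character; to remove characters, use the version of Replace() that takes strings and pass an empty string as the second argument. Here is an example that starts from a telephone number and progressively strips the spaces, the parentheses, and the dash to end up with only the digits: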

open System

let mutable phoneNumber : string = "(105) 293-8074"

printfn "Phone Number: %s" phoneNumber

// Remove the spaces
phoneNumber <- phoneNumber.Replace(" ", "")
printfn "Phone Number: %s" phoneNumber

// Remove the left parenthesis, if any
phoneNumber <- phoneNumber.Replace("(", "")
printfn "Phone Number: %s" phoneNumber

// Remove the right parenthesis, if any
phoneNumber <- phoneNumber.Replace(")", "")
printfn "Phone Number: %s" phoneNumber

// Remove the dash, if any
phoneNumber <- phoneNumber.Replace("-", "")
printfn "Phone Number: %s" phoneNumber

This would produce:

Phone Number: (105) 293-8074
Phone Number: (105)293-8074
Phone Number: 105)293-8074
Phone Number: 105293-8074
Phone Number: 1052938074
Press any key to close this window . . .
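To exchange one character for another, rather than remove it, the char version of Replace() can be used. Here is a minimal sketch; the names strDate and strDashed are chosen only for illustration:

```fsharp
let strDate   = "2016/09/04"
// Every '/' found in the string is replaced with '-'
let strDashed = strDate.Replace('/', '-')

printfn "Date: %s" strDate
printfn "Date: %s" strDashed
```

The second line displayed would be 2016-09-04.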

Working With Strings

An Empty String

A string is referred to as empty if it contains nothing at all. Here is an example:

let empty : string = ""

printfn "String: %s" empty

This would produce:

String:
Press any key to close this window . . .
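To find out whether a string is empty, one option is the static String.IsNullOrEmpty() method of the .NET Framework, which also returns true for a null string. The following is a sketch; the name isEmpty is chosen only for illustration:

```fsharp
open System

let empty : string = ""
let isEmpty = String.IsNullOrEmpty empty

printfn "Is the string empty? %b" isEmpty
```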

The Length of a String

In many operations, you will need to know the number of characters a string consists of. To get the size of a string, the String class provides the Length property. Here is an example of using it:

open System

let gender = "Female"
      
printfn "Gender: %s" gender
printfn "Length: %i Characters\n" gender.Length

This would produce:

Gender: Female
Length: 6 Characters

Press any key to close this window . . .

In the same way, you can access the Length property when processing the individual characters of a string. Here is an example:

open System

let gender = "Female"

printfn "Gender: %s" gender
printfn "Length: %i Characters" gender.Length

printfn "\nIndividual Characters"

for c = 0 to gender.Length - 1 do
    printfn "Index.[%i]: %c" c gender.[c]

This would produce:

Gender: Female
Length: 6 Characters

Individual Characters
Index.[0]: F
Index.[1]: e
Index.[2]: m
Index.[3]: a
Index.[4]: l
Index.[5]: e
Press any key to close this window . . .

String Concatenation

One of the routine operations you can perform on two strings consists of adding one to another, that is, putting one string to the right of another string, to produce a new string made of both. There are two techniques you can use.

To add one string to another, you can use the addition operator as done in arithmetic. Here is an example:

let strNeed = "Needs"
let strRepair = "Repair"
let strAddition = strNeed + strRepair

printfn "%s" strAddition

This would produce:

NeedsRepair
Press any key to close this window . . .

In the same way, you can add as many strings as necessary using +. Here is an example:

let strfirstName  = "Alexander"
let strMiddleName = "Patrick"
let strlastName   = "Katts"
let strFullName   = strfirstName + " " + strMiddleName + " " + strlastName

printfn "First Name:  %s" strfirstName
printfn "Middle Name: %s" strMiddleName
printfn "Last Name:   %s" strlastName
printfn "Full Name:   %s\n" strFullName

This would produce:

First Name:  Alexander
Middle Name: Patrick
Last Name:   Katts
Full Name:   Alexander Patrick Katts

Press any key to close this window . . .

Besides the addition operator, to formally support string concatenation, the String class provides the Concat() method that is overloaded in various versions. One of the versions of this method takes two String arguments. Its syntax is:

static member Concat : 
        str0:string * 
        str1:string -> string

This version takes two strings that should be concatenated. The method returns a new string made of the first string followed by the second. Two variations of this version use the following syntaxes:

static member Concat : 
        str0:string * 
        str1:string * 
        str2:string -> string
static member Concat : 
        str0:string * 
        str1:string * 
        str2:string * 
        str3:string -> string

In each case, the method concatenates the strings in the order in which they are passed and returns the result as a new string.
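Here is a short sketch of calling the two-string and the three-string versions of String.Concat(); the variable names are chosen only for illustration:

```fsharp
open System

let strAddition = String.Concat("Needs", "Repair")
let strFullName = String.Concat("Alexander", " ", "Katts")

printfn "%s" strAddition
printfn "%s" strFullName
```

This would display NeedsRepair, followed by Alexander Katts.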

Replacing a Sub-String

Inside of a string, if you have a combination of consecutive characters you don't want to keep, you can either remove that sub-string or replace it with a new combination of consecutive characters of your choice. To support this operation, the String class provides another version of the Replace() method whose syntax is:

member Replace : 
        oldValue:string * 
        newValue:string -> string

The oldValue argument is the sub-string to look for in the string. Everywhere that sub-string is found in the string, it is replaced by the newValue argument.
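Here is a minimal sketch of replacing a sub-string; the names strNotice and strUpdated are chosen only for illustration:

```fsharp
let strNotice  = "The speed limit is set to 45m/hr"
// Replace every occurrence of "45m/hr" with "55m/hr"
let strUpdated = strNotice.Replace("45m/hr", "55m/hr")

printfn "%s" strNotice
printfn "%s" strUpdated
```

The second line displayed would be The speed limit is set to 55m/hr.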

Formatting a String

Formatting a string consists of specifying how it would be presented as an object. To support this operation, the String class is equipped with a static method named Format. The String.Format() method is overloaded in various versions; the syntax of the simplest is:

static member Format : 
        format:string * 
        arg0:Object -> string

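This method takes two arguments. The first argument is a string that can contain one or more placeholders, each written as an index between curly brackets, such as {0}. The second argument is the value that would replace the {0} placeholder of the first argument. Other versions of the method take additional arguments for the {1}, {2}, and subsequent placeholders.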

Here is an example:

open System

let wage = 22.45
let strDisplay = String.Format("Hourly Salary: {0}", wage)

printfn "%s" strDisplay

This would produce:

Hourly Salary: 22.45
Press any key to close this window . . .
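Other versions of String.Format() accept more than one value. The following sketch uses two placeholders, {0} and {1}; the names employee and hours are chosen only for illustration:

```fsharp
open System

let employee = "Alexander Katts"
let hours    = 40
// {0} receives the employee name and {1} receives the number of hours
let strSummary = String.Format("{0} worked {1} hours", employee, hours)

printfn "%s" strSummary
```

This would display Alexander Katts worked 40 hours.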

Copying a String

After declaring and initializing one string variable, you can use it to initialize another string variable. Here is an example:

let strPerson   = "Charles Stanley"
let strSomebody = strPerson

printfn "Full Name: %s" strPerson
printfn "Full Name: %s" strSomebody

This would produce:

Full Name: Charles Stanley
Full Name: Charles Stanley
Press any key to close this window . . .

Assigning one variable to another is referred to as copying it. To formally support this operation, the String class is equipped with the Copy() method. Note that this method is marked as obsolete in .NET Core 3.0 and later versions. Its syntax is:

static member Copy : 
        str:string -> string

This method takes as argument an existing String object and copies it, producing a new string. Here is an example:

open System

let strPerson   = "Charles Stanley"
let strSomebody = String.Copy(strPerson)

printfn "Full Name: %s" strPerson
printfn "Full Name: %s" strSomebody

The String.Copy() method is used to copy all characters of one string into another. If you want to copy only a few characters, use the String.CopyTo() method. Its syntax is:

member CopyTo : 
        sourceIndex:int * 
        destination:char[] * 
        destinationIndex:int * 
        count:int -> unit
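In this syntax, sourceIndex is the position of the first character to copy from the string, destination is the array of characters that receives the copy, destinationIndex is the position in that array where the copy starts, and count is the number of characters to copy. Here is a sketch that copies the first 7 characters of a string; the names buffer and strFirstName are chosen only for illustration:

```fsharp
let strPerson = "Charles Stanley"

// Destination array with room for the 7 characters of the first name
let buffer : char[] = Array.zeroCreate 7

// Copy 7 characters, from index 0 of the string to index 0 of the array
strPerson.CopyTo(0, buffer, 0, 7)

// Build a new string from the copied characters
let strFirstName = System.String(buffer)

printfn "Full Name:  %s" strPerson
printfn "First Name: %s" strFirstName
```

The second line displayed would show Charles as the first name.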

String Comparisons

Introduction

String comparison consists of examining the characters of two strings, with each character of one string compared to the character at the same position in the other string. To support this operation, the String class is equipped with the Compare() method that is overloaded with many versions. One of the versions uses the following syntax:

static member Compare : 
        strA:string * 
        strB:string -> int

This method is declared static and it takes two arguments. When it starts, the first character of the first argument is compared to the first character of the second argument. If the first character of the first string comes before the first character of the second string in alphabetical order, this method returns a negative value. If it comes after, this method returns a positive value. If the first characters of both strings are the same, the method continues with the second character of each string, and so on. If both strings have the exact same characters, the method returns 0. In summary, the method returns a negative value if strA comes before strB, 0 if strA and strB are the same, and a positive value if strA comes after strB.

Here is an example:

open System

let firstName1 = "Andy"
let lastName1  = "Stanley"
let firstName2 = "Charles"
let lastName2  = "Stanley"

let value1 = String.Compare(firstName1, firstName2)
let value2 = String.Compare(firstName2, firstName1)
let value3 = String.Compare(lastName1, lastName2)

printfn "The result of comparing %s and %s is %i" firstName1 firstName2 value1
printfn "The result of comparing %s and %s is %i" firstName2 firstName1 value2
printfn "The result of comparing %s and %s is %i\n" lastName1 lastName2 value3

This would produce:

The result of comparing Andy and Charles is -1
The result of comparing Charles and Andy is 1
The result of comparing Stanley and Stanley is 0

Press any key to continue...

When using this version of the String.Compare() method, the case (upper or lower) of each character is considered. If you don't want to consider this option, the String class proposes another version of the method. Its syntax is:

static member Compare : 
        strA:string * 
        strB:string * 
        ignoreCase:bool -> int

The third argument allows you to ignore the case of the characters when performing the comparison: pass it as true to ignore the case, or as false to consider it.
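Here is a sketch that compares the same two strings with and without considering the case; the names value1 and value2 are chosen only for illustration:

```fsharp
open System

let lastName1 = "stanley"
let lastName2 = "STANLEY"

// The case is considered: the strings are different
let value1 = String.Compare(lastName1, lastName2)
// The case is ignored: the strings are the same
let value2 = String.Compare(lastName1, lastName2, true)

printfn "Considering the case: %i" value1
printfn "Ignoring the case:    %i" value2
```

The first comparison would produce a non-zero value while the second would produce 0.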

String Equality

In the previous section, we saw that the characters at the same positions in two strings can be compared to know whether one string comes before or after the other. If you are only interested in knowing whether two strings are equivalent, you can call the Equals() method of the String class. It is overloaded with various versions. Two versions use the following syntaxes:

override Equals : 
        obj:Object -> bool
member Equals : 
        value:string -> bool

To call one of these versions, use a String variable and pass either an Object object or a String value as argument. The variable that calls the method is compared to the value passed as argument. If both values are exactly the same, the method returns true. The comparison is performed considering the case of each character. If you don't want to consider the case, use the following version of the method:

member Equals : 
        value:string * 
        comparisonType:StringComparison -> bool

An alternative to the second syntax is to use a static version of this method whose syntax is:

static member Equals : 
        a:string * 
        b:string -> bool

This method takes two String arguments and compares them. If they are the same, the method returns true. This method considers the cases of the characters. If you don't want this factor taken into consideration, use the following version of the method:

static member Equals : 
        a:string * 
        b:string * 
        comparisonType:StringComparison -> bool
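Here is a sketch that calls the instance and the static versions of Equals(), with and without considering the case. It uses the StringComparison.OrdinalIgnoreCase member of the .NET Framework to ignore the case; the variable names are chosen only for illustration:

```fsharp
open System

let lastName1 = "Stanley"
let lastName2 = "STANLEY"

// The case is considered
let same1 = lastName1.Equals(lastName2)
let same2 = String.Equals(lastName1, lastName2)
// The case is ignored
let same3 = lastName1.Equals(lastName2, StringComparison.OrdinalIgnoreCase)
let same4 = String.Equals(lastName1, lastName2, StringComparison.OrdinalIgnoreCase)

printfn "Considering the case: %b and %b" same1 same2
printfn "Ignoring the case:    %b and %b" same3 same4
```

The first two comparisons would produce false and the last two would produce true.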

Working With Sub-Strings

Introduction

A sub-string is a section or part of a string. To create a sub-string, you first need a string from which you can retrieve one or more characters. To support this, the String class is equipped with the Substring() method that is overloaded in two versions. The syntax of one is:

member Substring : 
        startIndex:int -> string

The integer argument specifies the position of the first character from the variable that called the method. The return value is a new String that is made of the characters from startIndex to the end of the string.
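Here is a minimal sketch of this version; the name strLastName is chosen only for illustration. In the string used, the last name starts at index 18:

```fsharp
let strFullName = "Alexander Patrick Katts"
// Retrieve the characters from index 18 to the end of the string
let strLastName = strFullName.Substring(18)

printfn "Full Name: %s" strFullName
printfn "Last Name: %s" strLastName
```

The second line displayed would show Katts as the last name.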

Sub-String Creation

Probably the most consistent way to create a sub-string is to control both where it begins and how many characters are retrieved from the original string. To support this, the String class is equipped with another version of the Substring() method. Its syntax is:

member Substring : 
        startIndex:int * 
        length:int -> string

The first argument specifies the index of the first character to retrieve from the String variable that calls this method. The second argument specifies the number of characters to retrieve, which is the length of the resulting sub-string.
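Here is a minimal sketch of this version; the name strMiddleName is chosen only for illustration. In the string used, the middle name starts at index 10 and has 7 characters:

```fsharp
let strFullName   = "Alexander Patrick Katts"
// Retrieve 7 characters, starting at index 10
let strMiddleName = strFullName.Substring(10, 7)

printfn "Full Name:   %s" strFullName
printfn "Middle Name: %s" strMiddleName
```

The second line displayed would show Patrick as the middle name.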


Copyright © 2014-2024, FunctionX Monday 04 September 2016