C# Tips

Turkish i in C#

When comparing a parameter against a literal value, it is common to convert the parameter to uppercase or lowercase first. That conversion works in most cultures, but not in all of them. Here is an example.

string arg = "Login";

// Reject anything that is not "LOGIN" after culture-sensitive uppercasing
if (arg.ToUpper() != "LOGIN")
	throw new InvalidOperationException();

// Normal operation...

But when the current culture is Turkish (for example tr-TR), this code always throws an exception, because "i".ToUpper() returns İ instead of I, so "Login" becomes "LOGİN".

Turkish has four i letters: a dotted i and a dotless ı, each with an uppercase and a lowercase form. The uppercase of dotted i is İ, and the uppercase of dotless ı is I.
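The four mappings can be checked without touching the thread culture by passing a CultureInfo to ToUpper and ToLower. This is only a minimal sketch: the class and helper names (TurkishCasingDemo, CodePoint, UpperTr, LowerTr) are made up for illustration, and it assumes tr-TR culture data is available, which is not the case when the runtime is in globalization-invariant mode.

```csharp
using System;
using System.Globalization;

static class TurkishCasingDemo
{
    // Turkish culture used explicitly, so the current thread culture does not matter.
    static readonly CultureInfo Turkish = new CultureInfo("tr-TR");

    // Hypothetical helper: formats the first char of s as "U+XXXX".
    public static string CodePoint(string s) => "U+" + ((int)s[0]).ToString("X4");

    // Upper/lowercase with Turkish casing rules.
    public static string UpperTr(string s) => s.ToUpper(Turkish);
    public static string LowerTr(string s) => s.ToLower(Turkish);

    static void Main()
    {
        // Code points are printed because some consoles cannot display İ and ı.
        Console.WriteLine("i -> " + CodePoint(UpperTr("i")));           // U+0130 (İ)
        Console.WriteLine("ı -> " + CodePoint(UpperTr("\u0131")));      // U+0049 (I)
        Console.WriteLine("I -> " + CodePoint(LowerTr("I")));           // U+0131 (ı)
        Console.WriteLine("İ -> " + CodePoint(LowerTr("\u0130")));      // U+0069 (i)
    }
}
```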

In C#, we can use the invariant-culture versions of ToUpper and ToLower: ToUpperInvariant() and ToLowerInvariant(). ToUpperInvariant is generally recommended over ToLowerInvariant, because a small group of characters cannot make a round trip when they are converted to lowercase.

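// Requires: using System.Diagnostics; using System.Globalization; using System.Threading;
// Set the current culture to Turkish (Turkey), for testing only
Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");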
string s = "i";
Debug.WriteLine(s.ToUpper()); // İ
Debug.WriteLine(s.ToUpperInvariant()); // I

s = "I";
Debug.WriteLine(s.ToLower()); // ı
Debug.WriteLine(s.ToLowerInvariant()); // i

So the first example above can be rewritten in several ways, as shown below.

// Use ToUpperInvariant
if (arg.ToUpperInvariant() == "LOGIN")   

// Use Equals with InvariantCultureIgnoreCase
if (arg.Equals("Login", StringComparison.InvariantCultureIgnoreCase))

// Use Equals with OrdinalIgnoreCase
if (arg.Equals("LOGIN", StringComparison.OrdinalIgnoreCase))

ToUpperInvariant() and Equals with the InvariantCultureIgnoreCase option both avoid the Turkish i problem. But InvariantCulture comparison has its own problem: it is linguistic, so character sequences that differ in their code units can still be treated as equal. For example, \u0061\u030a (a followed by a combining ring above) is interpreted as \u00e5 (å).
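The \u0061\u030a case can be reproduced with a few lines. This is a sketch rather than a definitive test: the class and member names (CombiningSequenceDemo, EqualUnder) are invented here, and the expected results assume normal culture data (in globalization-invariant mode the comparison falls back to ordinal behaviour).

```csharp
using System;

static class CombiningSequenceDemo
{
    public const string Decomposed = "\u0061\u030a";  // 'a' followed by COMBINING RING ABOVE
    public const string Precomposed = "\u00e5";       // LATIN SMALL LETTER A WITH RING ABOVE

    // Compares the two spellings of å under the given comparison mode.
    public static bool EqualUnder(StringComparison comparison) =>
        string.Equals(Decomposed, Precomposed, comparison);

    static void Main()
    {
        // Linguistic comparison treats both spellings as the same text.
        Console.WriteLine(EqualUnder(StringComparison.InvariantCulture)); // expected: True

        // Ordinal comparison sees two chars versus one char.
        Console.WriteLine(EqualUnder(StringComparison.Ordinal));          // False
    }
}
```

For an identifier such as a command name, treating these two sequences as equal is usually not what is wanted, which is the argument for an ordinal comparison.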

InvariantCulture uses a culture-independent linguistic table to compare characters, whereas Ordinal compares the numeric value of each char (often described as a byte-by-byte comparison). OrdinalIgnoreCase is the same as Ordinal except that casing is ignored: it behaves as if both strings were converted with ToUpperInvariant and then compared ordinally. Letters in [A-Z] and [a-z] are matched directly, and for all other characters OrdinalIgnoreCase uses the InvariantCulture casing table to look up the uppercase form.
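A small sketch of that behaviour follows. The names OrdinalIgnoreCaseDemo, Same and SameViaUpper are hypothetical, and SameViaUpper only illustrates the "uppercase with invariant rules, then compare ordinally" description; it is not how the runtime implements the comparison. The non-ASCII result assumes normal culture data is available.

```csharp
using System;

static class OrdinalIgnoreCaseDemo
{
    // The built-in comparison.
    public static bool Same(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

    // Illustration of the equivalent two-step description.
    public static bool SameViaUpper(string a, string b) =>
        string.Equals(a.ToUpperInvariant(), b.ToUpperInvariant(), StringComparison.Ordinal);

    static void Main()
    {
        // ASCII letters: casing is ignored.
        Console.WriteLine(Same("login", "LOGIN"));          // True

        // Non-ASCII letters: é (U+00E9) and É (U+00C9) match through the invariant casing table.
        Console.WriteLine(Same("\u00e9", "\u00c9"));        // True

        // No Turkish rule here: i and İ (U+0130) are different letters.
        Console.WriteLine(Same("i", "\u0130"));             // False

        // No linguistic equivalence either: a + combining ring is not å.
        Console.WriteLine(Same("\u0061\u030a", "\u00e5"));  // False

        // The two-step version agrees on the first case.
        Console.WriteLine(SameViaUpper("login", "LOGIN"));  // True
    }
}
```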

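So what is the best approach for the first example code? Go with Equals and StringComparison.OrdinalIgnoreCase: the literal is a fixed identifier rather than linguistic text, so it needs neither culture-specific casing nor linguistic equivalence.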
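Put together, the check could look like the sketch below. LoginCheck and IsLogin are names chosen for this illustration, and the Turkish culture is set only to show that the result no longer depends on it (this assumes tr-TR culture data is available). The static string.Equals overload is used so that a null argument returns false instead of throwing.

```csharp
using System;
using System.Globalization;
using System.Threading;

static class LoginCheck
{
    // Culture-independent, case-insensitive match against the fixed literal.
    public static bool IsLogin(string arg) =>
        string.Equals(arg, "LOGIN", StringComparison.OrdinalIgnoreCase);

    static void Main()
    {
        // Simulate a Turkish user (for testing only).
        Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");

        // The original check fails: "Login".ToUpper() is "LOGİN" under tr-TR.
        Console.WriteLine("Login".ToUpper() == "LOGIN"); // False

        // The ordinal check is unaffected by the current culture.
        Console.WriteLine(IsLogin("Login"));             // True
        Console.WriteLine(IsLogin("login"));             // True
        Console.WriteLine(IsLogin("Logout"));            // False
        Console.WriteLine(IsLogin(null));                // False, no exception
    }
}
```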