strings¶
This chapter shows some tips, tricks, and pitfalls when working with strings.
Use StringBuilder when concatenating a lot of strings¶
Because strings are immutable in C#, every concatenation creates a new string, which results in many allocations and a loss of performance. In scenarios where many strings are concatenated, a StringBuilder is preferred. The same applies to operations like string.Join or adding a single character to a string.
❌ Bad Each += creates a new string, which causes many allocations and a performance penalty.
var outputString = "";
for (var i = 0; i < 25; i++)
{
    outputString += "test" + i;
}
✅ Good Using StringBuilder reduces the allocations dramatically and performs better as the number of concatenations grows.
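The snippet for the good case is not shown above, so the following is only a minimal sketch of what a StringBuilder variant of the same loop could look like. The variable names are illustrative and not taken from the original.

```csharp
using System;
using System.Text;

// Collect all parts in one growable buffer instead of creating
// a new string on every loop iteration.
var builder = new StringBuilder();
for (var i = 0; i < 25; i++)
{
    // Appending the parts separately also avoids the temporary "test" + i string.
    builder.Append("test").Append(i);
}

// The final string is materialized only once.
var outputString = builder.ToString();
Console.WriteLine(outputString);
```

If the approximate final size is known up front, passing a capacity to the StringBuilder constructor can save a few internal buffer resizes.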
Here is a comparison of both methods:
| Method | Times | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Allocated |
|-------------------- |------ |------------:|----------:|------------:|------------:|------:|--------:|--------:|----------:|
| StringConcat | 10 | 298.7 ns | 1.86 ns | 1.45 ns | 298.3 ns | 1.00 | 0.00 | 0.4549 | 2 KB |
| StringBuilderAppend | 10 | 436.2 ns | 5.31 ns | 4.15 ns | 437.1 ns | 1.46 | 0.02 | 0.4206 | 2 KB |
| | | | | | | | | | |
| StringConcat | 100 | 15,025.7 ns | 739.33 ns | 2,011.40 ns | 14,579.0 ns | 1.00 | 0.00 | 39.5203 | 161 KB |
| StringBuilderAppend | 100 | 5,989.5 ns | 415.73 ns | 1,225.78 ns | 6,416.1 ns | 0.41 | 0.12 | 3.9063 | 16 KB |
Getting the printable length of a string or character¶
Retrieving the length of a string can often be done via "my string".Length, which is good enough in a lot of scenarios. Under the hood, string.Length returns the number of char values (UTF-16 code units) in the string object. Unfortunately, that does not always map one to one to the characters printed on screen.
❌ Bad Assuming every printed character has a length of 1.
Console.Write("The following string has the length of 1: ");
Console.WriteLine("🏴".Length);
Output:
The following string has the length of 1: 14
Emojis can be composed of “other” emojis and modifiers, which makes their length vary widely. Other characters, like the following, also occupy more than one char each:
Console.WriteLine("𝖙𝖍𝖎𝖘".Length); // Prints 8
✅ Good Use StringInfo.LengthInTextElements to get the number of printed characters.
Console.WriteLine(new StringInfo("🏴").LengthInTextElements);
Output:
1
To summarize: string.Length returns the number of char values (UTF-16 code units) stored in the string, not the number of printed characters. StringInfo.LengthInTextElements returns the number of printed characters (text elements).
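To make the difference more tangible, here is a small sketch that counts the same strings in three ways: UTF-16 code units, Unicode scalar values (via EnumerateRunes, available since .NET Core 3.0), and text elements. The sample strings and variable names are chosen for illustration only.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Four letters outside the Basic Multilingual Plane: each needs two UTF-16 code units.
var fraktur = "𝖙𝖍𝖎𝖘";

// The letter 'e' followed by a combining acute accent: two code points, one printed character.
var accented = "e\u0301";

Console.WriteLine(fraktur.Length);                                 // 8 code units
Console.WriteLine(fraktur.EnumerateRunes().Count());               // 4 scalar values
Console.WriteLine(new StringInfo(fraktur).LengthInTextElements);   // 4 text elements

Console.WriteLine(accented.Length);                                // 2 code units
Console.WriteLine(accented.EnumerateRunes().Count());              // 2 scalar values
Console.WriteLine(new StringInfo(accented).LengthInTextElements);  // 1 text element
```

Counting runes fixes the surrogate pair case, but only the text element count also handles combining sequences.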
💡 Info: Some more information about Unicode, UTF-8, UTF-16 and UTF-32 can be found here.
Use StringComparison instead of ToLower or ToUpper for case-insensitive comparison¶
Lots of code uses "ABC".ToLower() == "abc".ToLower() to compare two strings when casing doesn’t matter. The problem with that code is that ToLower, as well as ToUpper, creates a new string instance, resulting in unnecessary allocations and performance loss.
❌ Bad Allocating new strings just to compare them.
var areStringsEqual = "abc".ToUpper() == "ABC".ToUpper();
✅ Good Use the string.Equals overload with the appropriate StringComparison value.
var areStringsEqual = string.Equals("ABC", "abc", StringComparison.OrdinalIgnoreCase);
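The same StringComparison argument is accepted by several other string methods, so the allocation-free approach is not limited to equality checks. The following is a small sketch with an illustrative input. It assumes .NET Core 2.1 or later, where Contains has a StringComparison overload.

```csharp
using System;

var input = "Hello World";

// Equality without creating lower- or upper-cased copies.
var isEqual = string.Equals(input, "hello world", StringComparison.OrdinalIgnoreCase);

// Other search and comparison methods accept the same enum.
var startsWith = input.StartsWith("HELLO", StringComparison.OrdinalIgnoreCase);
var contains = input.Contains("WORLD", StringComparison.OrdinalIgnoreCase);
var index = input.IndexOf("world", StringComparison.OrdinalIgnoreCase);

Console.WriteLine($"{isEqual} {startsWith} {contains} {index}"); // True True True 6
```

OrdinalIgnoreCase is a reasonable default for identifiers, keys, and protocol values. For text shown to users, a culture-aware comparison may be the better fit.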
Benchmark¶
[MemoryDiagnoser]
[HideColumns(Column.Arguments)]
public class StringBenchmark
{
    [Benchmark(Baseline = true)]
    [Arguments(
        "HellO WoRLD, how are you? You are doing good?",
        "hElLO wOrLD, how Are you? you are doing good?")]
    public bool AreEqualToLower(string a, string b) => a.ToLower() == b.ToLower();

    [Benchmark(Baseline = false)]
    [Arguments(
        "HellO WoRLD, how are you? You are doing good?",
        "hElLO wOrLD, how Are you? you are doing good?")]
    public bool AreEqualStringComparison(string a, string b) => string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
}
Results:
| Method | Mean | Error | StdDev | Ratio | Gen0 | Allocated | Alloc Ratio |
|------------------------- |---------:|---------:|---------:|------:|-------:|----------:|------------:|
| AreEqualToLower | 60.93 ns | 1.008 ns | 0.943 ns | 1.00 | 0.0356 | 224 B | 1.00 |
| AreEqualStringComparison | 16.10 ns | 0.030 ns | 0.028 ns | 0.26 | - | - | 0.00 |
Prefer StartsWith over IndexOf() == 0¶
The problem with IndexOf is that, in the worst case, it goes through the whole string. StartsWith, on the contrary, aborts on the first mismatch.
❌ Bad Using IndexOf, which might run through the whole string.
var startsWithHallo = "Hello World".IndexOf("Hallo") == 0;
✅ Good More readable, as well as more performant, with StartsWith.
var startsWithHallo = "Hello World".StartsWith("Hallo");
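As a small addition, here is a sketch of StartsWith with an explicit comparison mode. Without a StringComparison argument, StartsWith(string) compares using the current culture, so stating the mode explicitly makes the intent clearer. The variable names are illustrative.

```csharp
using System;

var text = "Hello World";

// Ordinal requests a plain character-by-character comparison.
var startsWithHello = text.StartsWith("Hello", StringComparison.Ordinal);
var startsWithHallo = text.StartsWith("Hallo", StringComparison.Ordinal);

// A single leading character has its own overload.
var startsWithH = text.StartsWith('H');

Console.WriteLine($"{startsWithHello} {startsWithHallo} {startsWithH}"); // True False True
```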
Benchmark¶
[Benchmark(Baseline = true)]
[Arguments("That is a sentence", "Thzt")]
public bool IndexOf(string haystack, string needle) => haystack.IndexOf(needle, StringComparison.OrdinalIgnoreCase) == 0;
[Benchmark]
[Arguments("That is a sentence", "Thzt")]
public bool StartsWith(string haystack, string needle) =>
    haystack.StartsWith(needle, StringComparison.OrdinalIgnoreCase);
Results:
| Method | haystack | needle | Mean | Error | StdDev | Ratio |
|----------- |------------------- |------- |----------:|----------:|----------:|------:|
| IndexOf | That is a sentence | Thzt | 21.966 ns | 0.1584 ns | 0.1482 ns | 1.00 |
| StartsWith | That is a sentence | Thzt | 3.066 ns | 0.0142 ns | 0.0126 ns | 0.14 |
Prefer AsSpan over Substring¶
Substring allocates a new string object on the heap whenever it returns a real sub-range of the original. If you call a method that accepts a Span<char> or ReadOnlySpan<char>, you can avoid these allocations. A prime example is string.Concat, which has overloads that take ReadOnlySpan<char> as input parameters.
❌ Bad Creating new string objects that are directly discarded afterward.
var output = Text.Substring(0, 5) + " - " + Text.Substring(11, 4);
✅ Good Directly use the underlying memory to avoid heap allocations.
var output = string.Concat(Text.AsSpan(0, 5), " - ", Text.AsSpan(11, 4));
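string.Concat is not the only API that accepts spans. As a further sketch, the numeric Parse methods also take a ReadOnlySpan<char>, so parts of a string can be parsed without creating intermediate substrings. The date format and variable names below are assumptions made for this example.

```csharp
using System;
using System.Globalization;

var date = "2024-05-17";

// Each AsSpan call is a view over the original string's memory; nothing is copied.
var year = int.Parse(date.AsSpan(0, 4), NumberStyles.None, CultureInfo.InvariantCulture);
var month = int.Parse(date.AsSpan(5, 2), NumberStyles.None, CultureInfo.InvariantCulture);
var day = int.Parse(date.AsSpan(8, 2), NumberStyles.None, CultureInfo.InvariantCulture);

Console.WriteLine($"{year} {month} {day}"); // 2024 5 17
```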
Benchmark¶
[MemoryDiagnoser]
public class Benchmark
{
    [Params("Hello dear world")]
    public string Text { get; set; }

    [Benchmark]
    public string Substring()
        => Text.Substring(0, 5) + " - " + Text.Substring(11, 4);

    [Benchmark]
    public string AsSpanConcat()
        => string.Concat(Text.AsSpan(0, 5), " - ", Text.AsSpan(11, 4));
}
Results:
| Method | Text | Mean | Error | StdDev | Gen0 | Allocated |
|------------- |----------------- |---------:|---------:|---------:|-------:|----------:|
| Substring | Hello dear world | 21.18 ns | 0.085 ns | 0.076 ns | 0.0179 | 112 B |
| AsSpanConcat | Hello dear world | 10.20 ns | 0.021 ns | 0.018 ns | 0.0076 | 48 B |