• Sometimes valid RSA signatures in .NET

    One of the nice things about .NET Core being open source is following along with some of the issues that people report. I keep an eye on System.Security-tagged issues, since those tend to be at the intersection of things that interest me and things I can maybe help with.

    A user filed an issue where .NET Framework considered a CMS valid, and .NET Core did not. This didn’t entirely surprise me. In the .NET Framework, the SignedCms class is heavily backed by Windows’ handling of CMS/PKCS#7. In .NET Core, the implementation is managed (sans the cryptography). The managed implementation adheres somewhat strictly to the CMS specification. As other issues have shown, Windows’, and thus .NET Framework’s, implementation is a little more relaxed in some ways.

    This turned out not to be one of those cases. The CMS part was actually working just fine. What was failing was RSA itself. The core of the issue was that different implementations of RSA disagreed on the RSA signature’s validity.

    That seems pretty strange!

    When I talk about different implementations on Windows, I am usually referring to CAPI vs CNG, or RSACryptoServiceProvider and RSACng, respectively. For now, I’m keeping this post to the .NET Framework. We’ll bring .NET Core into the discussion later.

    There are two implementations because, well, Windows has two of them. CNG, or “Cryptography API: Next Generation”, is the newer of the two and is intended to be the future of cryptographic primitives on Windows. It shipped in Windows Vista, and offers functionality that CAPI does not. An example of that is PSS RSA signatures, shown in the sketch below.
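
    For example, producing a PSS signature works with the CNG-backed class. This is a minimal sketch of my own; data is a placeholder and the key is generated on the fly:

    byte[] data = new byte[] { ... };
    using (var rsa = new RSACng())
    {
        // PSS padding is available through CNG; CAPI-based
        // RSACryptoServiceProvider does not support PSS.
        byte[] signature = rsa.SignData(data, HashAlgorithmName.SHA256,
                                        RSASignaturePadding.Pss);
    }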

    .NET Framework exposes these implementations as RSACryptoServiceProvider and RSACng. They should be interchangeable, and the CNG implementation should be used going forward. However, there is one corner case where the old CAPI implementation considers a signature valid while the CNG one does not.

    The issue can be demonstrated like so:

    byte[] n = new byte[] { ... };
    byte[] e = new byte[] { ... };
    byte[] signature = new byte[] { ... };
    var digest = new byte[] {
        0x68, 0xB4, 0xF9, 0x26, 0x34, 0x31, 0x25, 0xDD,
        0x26, 0x50, 0x13, 0x68, 0xC1, 0x99, 0x26, 0x71,
        0x19, 0xA2, 0xDE, 0x81, 
    };
    using (var rsa = new RSACng())
    {
        rsa.ImportParameters(new RSAParameters {
            Modulus = n,
            Exponent = e
        });
        var valid = rsa.VerifyHash(digest, signature, HashAlgorithmName.SHA1,
                                   RSASignaturePadding.Pkcs1);
        Console.WriteLine(valid);
    }
    using (var rsa = new RSACryptoServiceProvider())
    {
        rsa.ImportParameters(new RSAParameters {
            Modulus = n,
            Exponent = e
        });
        var valid = rsa.VerifyHash(digest, signature, HashAlgorithmName.SHA1,
                                   RSASignaturePadding.Pkcs1);
        Console.WriteLine(valid);
    }
    

    When used with one of the curious signatures that exhibits this behavior, such as the one in the GitHub link, the first result will be false, and the second will be true.

    Nothing jumped out at me as being problematic. The signature padding is PKCS#1 v1.5, the public exponent is the very typical 65,537, and the RSA key is sensible in size.

    To make it stranger, this signature came off the timestamp of Firefox’s own signed installer. So why are the results different?

    Jeremy Barton from Microsoft, who works on .NET Core, made the observation that the padding in the RSA signature itself is incorrect, but in a way that CAPI tolerates and CNG does not, at least by default. Let’s look at the raw signature. To do that, we need the public key and signature on disk, and we can poke at them with OpenSSL.

    Using the command:

    openssl rsautl -verify -in sig.bin -inkey key.der \
        -pubin -hexdump -raw -keyform der
    

    We get the following output:

    0000 - 00 01 ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0010 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0020 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0030 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0040 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0050 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0060 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0070 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0080 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0090 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00a0 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00b0 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00c0 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00d0 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00e0 - ff ff ff ff ff ff ff ff-ff ff ff 00 68 b4 f9 26
    00f0 - 34 31 25 dd 26 50 13 68-c1 99 26 71 19 a2 de 81
    

    This is a PKCS#1 v1.5 padded signature, as indicated by the leading 00 01. The digest can be seen at the end, 68 b4 f9 26 ... 19 a2 de 81, which matches the digest above, so we know that the signature is for the right digest.

    What is not correct in this signature is how the digest is encoded. The signature contains the bare digest. It should be encoded as an ASN.1 SEQUENCE along with the AlgorithmIdentifier of the digest:

    DigestInfo ::= SEQUENCE {
        digestAlgorithm AlgorithmIdentifier,
        digest OCTET STRING
    }
    

    This goes back all the way to a document (warning: link is to an ftp:// site) written in 1993 by RSA Laboratories explaining how PKCS#1 v1.5 works, and it was standardized into an RFC in 1998.

    The RSA signature we have only contains the raw digest. It is not part of a DigestInfo. If the digest were properly encoded, it would look something like this:

    0000 - 00 01 ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0010 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0020 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0030 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0040 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0050 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0060 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0070 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0080 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    0090 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00a0 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00b0 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00c0 - ff ff ff ff ff ff ff ff-ff ff ff ff ff ff ff ff
    00d0 - ff ff ff ff ff ff ff ff-ff ff ff ff 00 30 21 30
    00e0 - 09 06 05 2b 0e 03 02 1a-05 00 04 14 68 b4 f9 26
    00f0 - 34 31 25 dd 26 50 13 68-c1 99 26 71 19 a2 de 81
    

    The signature now includes DigestInfo along with the OID 1.3.14.3.2.26 to indicate that the digest is SHA1.
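
    For illustration, we can check for this by hand. The sketch below is my own helper, not from the original post; it tests whether a decoded PKCS#1 v1.5 block (such as the openssl rsautl output above) ends with the DigestInfo plus the digest, rather than the bare digest:

    using System.Linq;

    internal static class DigestInfoCheck
    {
        // The fixed DigestInfo prefix for SHA-1: SEQUENCE, AlgorithmIdentifier
        // with OID 1.3.14.3.2.26 and NULL parameters, then a 20-byte OCTET STRING.
        private static readonly byte[] Sha1DigestInfoPrefix = {
            0x30, 0x21, 0x30, 0x09, 0x06, 0x05, 0x2B, 0x0E,
            0x03, 0x02, 0x1A, 0x05, 0x00, 0x04, 0x14,
        };

        // Returns true when the decoded block "em" ends with DigestInfo
        // followed by the digest, meaning the signature is encoded per
        // PKCS#1 v1.5; a bare digest will return false.
        internal static bool EndsWithDigestInfo(byte[] em, byte[] digest)
        {
            byte[] expected = Sha1DigestInfoPrefix.Concat(digest).ToArray();
            return em.Length >= expected.Length &&
                   em.Skip(em.Length - expected.Length).SequenceEqual(expected);
        }
    }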

    At this point we know what the difference is, and the original specification in section 10.1.2 makes it fairly clear that the “data” should be a digest and should be encoded as DigestInfo, not a bare digest.

    This signature came from Verisign’s timestamp authority at http://timestamp.verisign.com/scripts/timstamp.dll. According to someone at DigiCert (which now runs this TSA), it was launched in May 1995.

    I suspect that the TSA is old enough that the implementation was made before the specification was complete, or it simply got the specification wrong and no one noticed. Bringing this back to CNG and CAPI: CNG can validate these signatures, but you must explicitly tell CNG that the signature does not have an object identifier. BCRYPT_PKCS1_PADDING_INFO’s documentation has the detail there, but the gist of it is:

    If there is no OID in the signature, then verification fails unless this member is NULL.

    This would be used with {B,N}CryptVerifySignature. To bring this back around to the .NET Framework: how do we use RSACng and pass in null for the padding algorithm? The short answer is: you cannot. If you try, you will get an explicit ArgumentException saying that the hash algorithm name cannot be null.
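
    Calling CNG directly does allow it, for what it’s worth. Here is a rough P/Invoke sketch, assuming the public key has already been imported into a BCRYPT_KEY_HANDLE elsewhere; the helper and its names are illustrative, not a drop-in implementation:

    using System;
    using System.Runtime.InteropServices;

    internal static class BcryptSketch
    {
        private const uint BCRYPT_PAD_PKCS1 = 0x00000002;

        [StructLayout(LayoutKind.Sequential)]
        private struct BCRYPT_PKCS1_PADDING_INFO
        {
            // A null algorithm identifier tells CNG that the signature
            // contains no DigestInfo OID, just the bare digest.
            [MarshalAs(UnmanagedType.LPWStr)]
            public string pszAlgId;
        }

        [DllImport("bcrypt.dll")]
        private static extern int BCryptVerifySignature(
            IntPtr hKey,
            ref BCRYPT_PKCS1_PADDING_INFO pPaddingInfo,
            byte[] pbHash,
            uint cbHash,
            byte[] pbSignature,
            uint cbSignature,
            uint dwFlags);

        // hKey is assumed to be a BCRYPT_KEY_HANDLE imported elsewhere.
        internal static bool VerifyBareDigestSignature(
            IntPtr hKey, byte[] digest, byte[] signature)
        {
            var paddingInfo = new BCRYPT_PKCS1_PADDING_INFO { pszAlgId = null };
            int status = BCryptVerifySignature(
                hKey, ref paddingInfo,
                digest, (uint)digest.Length,
                signature, (uint)signature.Length,
                BCRYPT_PAD_PKCS1);
            return status == 0; // STATUS_SUCCESS
        }
    }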

    For .NET Framework, the solution is “keep using RSACryptoServiceProvider”. If you need to validate these signatures, chances are you do not need CNG’s newer capabilities like PSS, since these malformed signatures appear to be coming from old systems. Higher-level types like SignedCms and SignedXml use RSACryptoServiceProvider by default, so they will continue to work.

    To bring in .NET Core, the situation is a little more difficult. If you are using SignedCms like so:

    var signedCms = new SignedCms();
    signedCms.Decode(File.ReadAllBytes("cms-with-sig.bin"));
    signedCms.CheckSignature(true);
    

    This will start throwing when you migrate to .NET Core. .NET Core will use CNG when run on Windows to validate RSA signatures for SignedCms and SignedXml. This is currently not configurable, either. When used with SignedCms, it ultimately calls the X509Certificate2.GetRSAPublicKey() extension method, and that will always return an implementation based on CNG.
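
    You can observe this on Windows with something like the following; the certificate path here is a placeholder:

    var cert = new X509Certificate2("some-cert.cer"); // placeholder path
    using (RSA rsa = cert.GetRSAPublicKey())
    {
        // On Windows under .NET Core this prints a CNG-backed type.
        Console.WriteLine(rsa.GetType());
    }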

    If you are using SignedCms on .NET Core and need to validate a CMS that is signed with one of these problematic signatures, you are currently out of luck with in-the-box components. As far as other platforms go, both macOS and Linux environments for .NET Core agree with CNG: the signature is invalid.

    The good news is, these signatures are not easy to come by. So far, only the old Verisign timestamp authority is known to have produced signatures like this.

  • C# ReadOnlySpan and static data

    Since C# 7 there have been a lot of point releases that contain all kinds of goodies. Many of them are performance focused, such as safe stack allocations using Span<T>, or improvements to fixed for interoperability.

    One that I love, but that is not well documented, is the special treatment that ReadOnlySpan<byte> gets when its contents are known at compile time.

    Here’s an example of a lookup table I used to aid with hex encoding, written with a byte[]:

    private static byte[] LookupTable => new byte[] {
        (byte)'0', (byte)'1', (byte)'2', (byte)'3', (byte)'4',
        (byte)'5', (byte)'6', (byte)'7', (byte)'8', (byte)'9',
        (byte)'A', (byte)'B', (byte)'C', (byte)'D', (byte)'E',
        (byte)'F',
    };
    

    This binary data has to be stored somewhere in the produced library. If we use dumpbin, we can see it in the .text section of the binary.

    dumpbin /RAWDATA /SECTION:.text mylib.dll
    

    Right at the bottom, we see:

    00402A40: 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46  0123456789ABCDEF
    

    I won’t go into a lot of the details on how this data is compiled into the .text section, but at this point we need to get that data into the array somehow.

    If we look at the jit assembly of LookupTable, we see:

    sub rsp, 0x28
    vzeroupper
    mov rcx, 0x7ffc4638746a
    mov edx, 0x10
    call 0x7ffc49b52630
    mov rdx, 0x1b51450099c
    lea rcx, [rax+0x10]
    vmovdqu xmm0, [rdx]
    vmovdqu [rcx], xmm0
    add rsp, 0x28
    ret
    

    Where 0x7ffc49b52630 is InitializeArray.

    With an array, our property leans on InitializeArray, the source of which is in the CoreCLR. For little-endian platforms, it boils down to a memcpy from a runtime field handle.

    Indeed, with a debugger we finally see:

    00007ffd`b18b701a e831a40e00       call    coreclr!memcpy (00007ffd`b19a1450)
    

    Dumping @rdx L10 yields:

    000001f0`4c552a90  30 31 32 33 34 35 36 37-38 39 41 42 43 44 45 46  0123456789ABCDEF
    

    So that was a very long-winded way of saying that, with arrays, initializing a field or variable with bytes results in a memcpy from the image into the array, which means more data on the heap.

    Now, starting in C# 7.3, we can avoid that memcpy when using ReadOnlySpan<byte>.

    private static ReadOnlySpan<byte> LookupTable => new byte[] {
        (byte)'0', (byte)'1', (byte)'2', (byte)'3', (byte)'4',
        (byte)'5', (byte)'6', (byte)'7', (byte)'8', (byte)'9',
        (byte)'A', (byte)'B', (byte)'C', (byte)'D', (byte)'E',
        (byte)'F',
    };
    

    Looking at the jit assembly:

    mov eax, 0x10
    xor edx, edx
    mov r8, 0x1b5144c0968
    mov [rcx], rdx
    mov [rcx+0x8], r8
    mov [rcx+0x10], eax
    mov rax, rcx
    ret
    

    We see that there is mov r8, 0x1b5144c0968. The contents of 0x1b5144c0968 are:

    000001b5`144c0968  30 31 32 33 34 35 36 37-38 39 41 42 43 44 45 46  0123456789ABCDEF
    

    The method now returns the data directly and omits the memcpy entirely; our ReadOnlySpan<byte> points directly into the .text section.
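
    To tie this back to hex encoding, here is a sketch of how such a table might get used. The ToHex helper is my own illustration, not from the original post:

    private static string ToHex(ReadOnlySpan<byte> data)
    {
        var buffer = new char[data.Length * 2];
        for (var i = 0; i < data.Length; i++)
        {
            // Each access to LookupTable hands back a span over the same
            // static data in the .text section; no array is allocated.
            buffer[i * 2] = (char)LookupTable[data[i] >> 4];
            buffer[i * 2 + 1] = (char)LookupTable[data[i] & 0xF];
        }
        return new string(buffer);
    }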

    This works for property getters as shown above, but also as the return of a method:

    ReadOnlySpan<byte> GetBytes() {
        return new byte[] { ... };
    }
    

    This works similarly to the property getter. It also works for locals in a method body:

    void Write200Ok(Stream s) {
        ReadOnlySpan<byte> data = new byte[] {
            (byte)'H', (byte)'T', (byte)'T', (byte)'P',
            (byte)'/', (byte)'1', (byte)'.', (byte)'1',
            (byte)' ', (byte)'2', (byte)'0', (byte)'0',
            (byte)' ', (byte)'O', (byte)'K'
        };
        s.Write(data);
    }
    

    Which also produces a reasonable JIT disassembly:

    sub     rsp, 0x38
    xor     eax, eax
    mov     qword ptr [rsp+0x28], rax
    mov     qword ptr [rsp+0x30], rax
    mov     rcx, 0x1e595b42ade
    mov     eax, 0x0F
    lea     r8, [rsp+0x28]
    mov     qword ptr [r8], rcx
    mov     dword ptr [r8+8], eax
    mov     rcx, rdx
    lea     rdx, [rsp+0x28]
    cmp     dword ptr [rcx], ecx
    call    0x7ff89ede10c8 (Stream.Write(System.ReadOnlySpan`1<Byte>), mdToken: 0000000006000001)
    add     rsp, 0x38
    ret
    

    Here we see mov rcx, 0x1e595b42ade, which moves the address of the static data directly into the register with no additional work to create a byte array.

    This optimization currently works only with ReadOnlySpan<byte>. Other types will continue to use InitializeArray, because the stored data would otherwise need to account for each platform’s endianness.
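
    For example, a table with a wider element type (my example, not from the post) still allocates an array and copies into it:

    // Not byte-sized elements, so the raw bytes in the image would differ
    // between big- and little-endian targets; the JIT falls back to
    // InitializeArray and a copy.
    private static ReadOnlySpan<int> IntTable => new int[] { 1, 2, 3, 4 };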

  • C# 8 using declarations

    Visual Studio 2019 Preview 2 was released a few days ago, and I took the time to install it. Visual Studio itself is actually rather uninteresting to me; however, the inclusion of the next C# 8 preview got my attention. I glanced at the feature highlights and posted “looks nice” on Twitter.

    Predictably, I got a few responses like “I’m not sure I like that”, and there is always a guarantee that if F# has a similar feature, an F# developer will appear and tell you F# has had this feature for 600 years.

    The one I like a lot is using declarations. This allows a local to automatically be disposed at the end of the block. Essentially, it hides the try/finally or the using() {...}. The .NET team’s blog kind of gave a bad example of this, so I’ll use one from Open OPC SignTool. Here is the original snippet:

    private static X509Certificate2 GetCertificateFromCertificateStore(string sha1)
    {
        using (var store = new X509Store(StoreName.My, StoreLocation.LocalMachine))
        {
            store.Open(OpenFlags.OpenExistingOnly | OpenFlags.ReadOnly);
            var certificates = store.Certificates.Find(X509FindType.FindByThumbprint, sha1, false);
            return certificates.Count > 0 ? certificates[0] : null;
        }
    }
    

    With a using declaration, this becomes:

    private static X509Certificate2 GetCertificateFromCertificateStore(string sha1)
    {
        using var store = new X509Store(StoreName.My, StoreLocation.LocalMachine);
        store.Open(OpenFlags.OpenExistingOnly | OpenFlags.ReadOnly);
        var certificates = store.Certificates.Find(X509FindType.FindByThumbprint, sha1, false);
        return certificates.Count > 0 ? certificates[0] : null;
    }
    

    This has the same effect: store has Dispose called on it at the end of the method. The benefit is less indentation and fewer braces, which keeps me focused on the code that matters. I don’t care when store is disposed in the method; I can just observe that the local has a using modifier and be assured that Dispose will be called.

    This isn’t the same as garbage collection or finalizers. Both of those are non-deterministic and can lead to unexpected program behavior. That’s less of a concern in the case of X509Store, so let’s look at another example:

    using Stream stream = entry.Open();
    var xmlDocument = XDocument.Load(stream, LoadOptions.PreserveWhitespace);
    return new OpcRelationships(location, xmlDocument, readOnlyMode);
    

    Not disposing a stream that is backed by a file can cause access errors later in software that tries to open that file again while it is still open. Leaving streams to the GC is not just a bad idea; it is simply incorrect.

    Again, a using declaration on the local ensures it is deterministically closed.

    I can see when it gets disposed being slightly unclear to developers. The quick explanation is that disposal happens when the local is no longer reachable, not when it is last used. The C# 8 code above compiles roughly to:

    var stream = entry.Open();
    try
    {
        var xmlDocument = XDocument.Load(stream, LoadOptions.PreserveWhitespace);
        return new OpcRelationships(location, xmlDocument, readOnlyMode);
    }
    finally
    {
        if (stream != null)
        {
            ((IDisposable)stream).Dispose();
        }
    }
    

    The disposal is done after the return, when the local is no longer reachable, not after XDocument is created.

    I find this very helpful for keeping code readable. It doesn’t work when you need fine control over when Dispose is called, though. A place where it does not work well is when the Dispose pattern is used for scopes, such as logging. The AzureSignTool project has code similar to this in SignCommand:

    var logger = loggerFactory.CreateLogger<SignCommand>();
    Parallel.ForEach(AllFiles, options, () => (succeeded: 0, failed: 0), (filePath, pls, state) =>
    {
        using (var loopScope = logger.BeginScope("File: {Id}", filePath))
        {
            logger.LogInformation("Signing file.");
            //Sign the file. omit a bunch of other code.
            logger.LogInformation("Done signing the file.");
        }
        logger.LogDebug("Incrementing success count.");
        return (state.succeeded + 1, state.failed);
    }, result => { /* aggregate the per-thread counts; omitted here */ });
    

    Here, we cannot change this to a using declaration, because then the LogDebug would be inside that logging scope, which it wasn’t before. This is a place where we want Dispose to be called at a different time from when loopScope would no longer be in scope.

    My impression of C# developers is that they do not tend to call Dispose on a resource as soon as it can be disposed, just at a reasonable point in the same method. Most developers do not write this code:

    public bool MightBeExe(string filePath)
    {
        var firstBytes = new byte[2];
        int bytesRead;
        using (var file = File.Open(filePath, FileMode.Open))
        {
            bytesRead = file.Read(firstBytes, 0, 2);
        }
        return bytesRead == 2 && firstBytes[0] == (byte)'M' && firstBytes[1] == (byte)'Z';
    }
    

    They will instead write something like:

    public bool MightBeExe(string filePath)
    {
        using (var file = File.Open(filePath, FileMode.Open))
        {
            var firstBytes = new byte[2];
            var bytesRead = file.Read(firstBytes, 0, 2);
            return bytesRead == 2 && firstBytes[0] == (byte)'M' && firstBytes[1] == (byte)'Z';
        }
    }
    

    Which is a perfect candidate for using var:

    public bool MightBeExe(string filePath)
    {
        using var file = File.Open(filePath, FileMode.Open);
        var firstBytes = new byte[2];
        var bytesRead = file.Read(firstBytes, 0, 2);
        return bytesRead == 2 && firstBytes[0] == (byte)'M' && firstBytes[1] == (byte)'Z';
    }
    

    There are of course some reasonable limitations to this feature. For example, it cannot be combined with out variables. Today you would write:

    if (Crypto32.CryptEncodeObjectEx(
        // other stuff
        out var handle,
        ref size)
    )
    {
        using (handle)
        {
            // Do stuff
        }
    }
    

    This does not work:

    if (Crypto32.CryptEncodeObjectEx(
        // other stuff
        out using var handle,
        ref size)
    )
    {
        // Do stuff
    }
    

    Jared Parsons said on Twitter that C# folks thought of this, and decided that it had “Too much confusion about ownership.” Thinking about it myself, I agree, so it’s nice that the feature is limited in that regard.

    Another limitation is that the variable cannot be reassigned. For example:

    using var stream = entry.Open();
    stream = entry2.Open();
    

    This will produce error CS1656, “Cannot assign to ‘stream’ because it is a ‘using variable’”.

    All in all, I very much like this small feature in C# 8. It has reasonable guard rails that keep you from doing something too weird, like re-assigning to the local, while giving the benefit of fewer blocks, braces, and levels of indentation.

  • Secure Random Integers in .NET Core 3

    .NET Core 3.0 is tentatively set to include a new API for securely generating a random integer bound to a specific range.

    I won’t be shy in admitting that it was something I pushed for and made the initial attempt at implementing, though it’s unfair to say that I implemented it by myself given all of the outstanding feedback I got on the initial pull request (thanks Levi and Jeremy!).

    It’s been known for a while that System.Random shouldn’t be used when cryptographic randomness is required. Despite that, there wasn’t anything built into .NET that made creating bounded random integers easy. You could either use System.Random and hope for the best, or use a CSPRNG like RandomNumberGenerator that gave back raw bytes, which requires some thought on how to properly convert them to a random integer without introducing any kind of bias.

    Starting in .NET Core 3.0, you’ll be able to do:

    var min = 1;
    var max = 1_000;
    var randomNumber = RandomNumberGenerator.GetInt32(min, max);
    

    If you need this before .NET Core 3, well, the source is right there. It can be adapted with a bit of effort to work on the .NET Framework, as well as other environments that don’t have Span<T>.
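
    If you do need to roll this yourself in the meantime, the core idea is rejection sampling. Here is a minimal sketch of that idea; it is not the exact .NET Core implementation, just an illustration of how to stay unbiased:

    using System;
    using System.Security.Cryptography;

    internal static class RandomIntSketch
    {
        public static int GetInt32(int fromInclusive, int toExclusive)
        {
            if (fromInclusive >= toExclusive)
                throw new ArgumentException("Range is empty.");

            // Size of the range as an unsigned value to avoid overflow.
            uint range = (uint)(toExclusive - fromInclusive);

            // Build a mask covering all bits needed to represent range - 1.
            uint mask = range - 1;
            mask |= mask >> 1;
            mask |= mask >> 2;
            mask |= mask >> 4;
            mask |= mask >> 8;
            mask |= mask >> 16;

            var buffer = new byte[sizeof(uint)];
            using (var rng = RandomNumberGenerator.Create())
            {
                while (true)
                {
                    rng.GetBytes(buffer);
                    uint candidate = BitConverter.ToUInt32(buffer, 0) & mask;

                    // Rejecting out-of-range draws (instead of using modulo)
                    // is what prevents bias toward the low end of the range.
                    if (candidate < range)
                        return (int)(fromInclusive + candidate);
                }
            }
        }
    }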

  • FixedTimeEquals in .NET Core

    .NET Core introduced FixedTimeEquals, which I have personally found to be very helpful. It’s a small method, but given the kind of code I tend to write, I was writing it a lot from project to project, and I am happy to see it in the box.

    This API is meant to prevent a timing side-channel when comparing two sequences of bytes. The goal is that the comparison should take the same amount of time regardless of the contents of the bytes, assuming they are the same length. This is often required when comparing MACs or doing simple possession demonstrations for APIs.
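
    Usage is a single static call on CryptographicOperations; the MAC values here are placeholders:

    byte[] computedMac = new byte[] { ... };  // e.g. an HMAC computed server-side
    byte[] presentedMac = new byte[] { ... }; // the value presented by the caller

    var equal = CryptographicOperations.FixedTimeEquals(computedMac, presentedMac);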

    Suppose you had a web API that required an API key, implemented in such a way that it just expected a magic string to act as a password for interacting with the API.

    It required an HTTP header, like

    X-Api-Key: SecretVooDooWord
    

    and TLS was used to ensure it was kept confidential in transit. The API used a naive implementation like this:

    [HttpGet]
    public IActionResult SuperSensitiveApi() {
        var key = Request.Headers["X-Api-Key"].Single();
        if (key != "SecretVooDooWord") {
            return Unauthorized();
        }
        return new JsonResult(new { AllTheSecretData = "tada" });
    }
    

    The issue here is that neither != nor == is fixed-time; they return false as soon as there is a difference. In pseudocode, the comparison might look something like this:

    compare(word1, word2) {
        if word1.length != word2.length
            return false
        i = 0
        while (i < word1.length) {
            letter1 = word1[i]
            letter2 = word2[i]
            if letter1 != letter2
                return false
            else
                i = i + 1
        }
        return true
    }
    

    This function will compare the letters one by one. As soon as it encounters a difference, it returns false and doesn’t bother checking the rest. Why should it? The answer is going to be false every time.

    To an attacker that is trying to figure out the contents of the string, carefully measuring the time can leak the contents of the string. For argument’s sake, let’s say that checking each letter takes 2ns and the attacker has a very accurate way of measuring time over a network and can account for jitter and latency.

    GET /SuperSensitiveApi HTTP/1.1
    X-Api-Key: Raaaaaaaaaaaaaaa
    

    Checking the API key will take 2ns because the first letters do not match. The attacker moves on to the next letter.

    GET /SuperSensitiveApi HTTP/1.1
    X-Api-Key: Saaaaaaaaaaaaaaa
    

    This time, checking the API key takes 4ns because the first letter matched (2ns) and the second failed (another 2ns).

    GET /SuperSensitiveApi HTTP/1.1
    X-Api-Key: Sbaaaaaaaaaaaaaa
    

    It fails again, taking 4ns to check the API key. After a few more tries…

    GET /SuperSensitiveApi HTTP/1.1
    X-Api-Key: Seaaaaaaaaaaaaaa
    

    This time it took 6ns to check the API key because the first and second letters were checked, and the third failed.

    The attacker can keep doing this by observing the amount of time each call takes. The longer a call takes, the more letters the attacker can assume have been guessed correctly. In practice, this is extremely difficult to do over a network, and getting accurate enough timing information is hard, but it is believed to be possible given enough persistence and an adversary with the means to maintain a very stable network position.

    These kinds of attacks do exist. An example timing side-channel is Lucky 13, which affected many libraries’ approaches to handling CBC padding in TLS.

    So == in C# is a bad way to check strings for equality where timing side-channels need to be mitigated. Perhaps, then, you’ll do something like this:

    private static bool CheckStringsFixedTime(string str1, string str2) {
        if (str1.Length != str2.Length) {
            return false;
        }
        var allTheSame = true;
        for (var i = 0; i < str1.Length; i++) {
            if (str1[i] != str2[i]) {
                allTheSame = false;
            }
        }
        return allTheSame;
    }
    

    This looks like it’s constant time, but on modern CPUs and .NET, it’s not. Branch statements, like if, have timing implications. We can’t take a branch.

    OK, what about this?

    private static bool CheckStringsFixedTime(string str1, string str2) {
        if (str1.Length != str2.Length) {
            return false;
        }
        var allTheSame = true;
        for (var i = 0; i < str1.Length; i++) {
            allTheSame &= str1[i] == str2[i];
        }
        return allTheSame;
    }
    

    This looks somewhat promising for C#. We know from the language specification that &= does not short circuit, so str1[i] == str2[i] will always be evaluated.

    We still have a few problems, though, and this is where the JIT and x86 can make things more complicated for us.

    The first issue is that allTheSame is a simple true/false flag. That still leaves the .NET JIT and x86 instruction execution room to make optimizations that reintroduce timing differences. Timing side-channel mitigation should avoid all decisions until the very end, and & is a decision. However, we can improve on that:

    private static bool CheckStringsFixedTime(string str1, string str2) {
        if (str1.Length != str2.Length) {
            return false;
        }
        var result = 0;
        for (var i = 0; i < str1.Length; i++) {
            result |= str1[i] ^ str2[i];
        }
        return result == 0;
    }
    

    This is fairly close to fixed-time. We update an integer using | to combine the bits from the result of an XOR. XOR a value with itself and the result is zero; XOR it with anything else and you get a non-zero value. The binary OR accumulates any set bits, so at the end, a non-zero result means one of the comparison checks failed.

    There is some debate as to which arithmetic operators are better for fixed-time comparison of str1[i] and str2[i]. Some implementations use XOR, like above; others may use SUB for subtraction. Unfortunately, most CPU architectures make no guarantees about whether any operations are constant-time.

    We still have a problem to address: the JIT could be making unintended optimizations. The C# compiler itself performs very few optimizations, but the JIT may do all sorts of things, like unrolling the loop or applying a variety of other optimizations so the code executes faster.

    We can tell the JIT to leave it alone, like so:

    [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
    private static bool CheckStringsFixedTime(string str1, string str2) {
        if (str1.Length != str2.Length) {
            return false;
        }
        var result = 0;
        for (var i = 0; i < str1.Length; i++) {
            result |= str1[i] ^ str2[i];
        }
        return result == 0;
    }
    

    Now the JIT will not optimize this method at all, and it will execute closer to as-written.

    This is as close to time-independent as possible with the string type. We can improve things a bit by making the parameters ReadOnlySpan<char>, which generates very similar x86 to the string version; however, the null checks are eliminated, since a ReadOnlySpan cannot be null.
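
    That adaptation is mechanical; a span-based version of the sketch above looks like this:

    [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
    private static bool CheckStringsFixedTime(ReadOnlySpan<char> str1, ReadOnlySpan<char> str2) {
        if (str1.Length != str2.Length) {
            return false;
        }
        var result = 0;
        for (var i = 0; i < str1.Length; i++) {
            result |= str1[i] ^ str2[i];
        }
        return result == 0;
    }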

    This is what .NET Core 2.1’s FixedTimeEquals does. It just does it over a ReadOnlySpan<byte> instead of characters. The x86 produced by the current JIT will be identical for ReadOnlySpan<char> and ReadOnlySpan<byte> with the exception of the size of the MOVZX.
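
    To close the loop on the earlier example, the naive API-key check could be rewritten along these lines. This is a sketch: a real key would come from configuration, and comparing UTF-8 bytes is my own choice here:

    [HttpGet]
    public IActionResult SuperSensitiveApi() {
        var key = Request.Headers["X-Api-Key"].Single();
        var presented = Encoding.UTF8.GetBytes(key);
        var expected = Encoding.UTF8.GetBytes("SecretVooDooWord");

        // FixedTimeEquals returns false for different lengths without
        // comparing contents; otherwise it examines every byte.
        if (!CryptographicOperations.FixedTimeEquals(presented, expected)) {
            return Unauthorized();
        }
        return new JsonResult(new { AllTheSecretData = "tada" });
    }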

    It’s handy that this is just in the box for .NET Core now.