This blog entry tells the story of how I figured this out with the help of a friend, and the information is mostly presented in the order we discovered it. It covers very advanced topics within C# and the .NET runtime, so be sure to have Google ready for any unrecognized terms.
The research behind this article took a week in May 2020, in a Discord channel filled with an obscene number of theories and swear words. My good friend ghorsington helped significantly with this work.
I maintain and co-own a project on GitHub named BepInEx, whose main goal is to add modding capability to C#-based games (either games that have no modding system at all, or games whose existing system is limited).
A general design goal we have is to not modify any existing game assemblies, or files in general. Not only does this make us immune to game updates and file integrity checks, we can also edit assemblies at runtime with Mono.Cecil before loading them into the CLR.
We already have a system to do this within Unity games. However, with BepInEx v6.0 we branched out into pure C# games and executables (such as games built with XNA, FNA or MonoGame).
We had two goals we needed to accomplish to get a similar system working for pure C# executables:
The library we use for doing this in Unity is named UnityDoorstop. The concept behind it is simple to anyone who has dabbled in executable hooking, but I'll go over how it works in a very simplified way.
Every native .exe file has a list of .dll files that it requires to be able to load functions from them. This is called an Import Address Table (IAT), and it sits in the header of the executable. If the executable wants to load a function from a .dll file, it uses this to check which .dll file it needs to load to get that function.
Instead of knowing in advance where to find a certain .dll file, Windows will search a list of folders to find a requested .dll file.
The most interesting item in that list is that "The directory from which the application loaded" is the very first directory that is checked, even before System32 or any core Windows directories.
For example, the Unity player executable relies on a function from winhttp.dll. UnityDoorstop pretends to be winhttp.dll, redirecting any library calls to the original .dll. When a .dll file is loaded by the system or with LoadLibrary/LoadLibraryEx, it will execute an entrypoint function if it exists.
Putting all these puzzle pieces together, by pure virtue of having a system .dll in the same folder as the executable you're trying to trick, you can get Windows to execute your custom entrypoint code for you.
This is recognized as an entrypoint for malware, as it can allow code to execute in places it shouldn't. On a similar note, if you've ever used a DirectX9 fix on a game in Windows 10 or used hacks in a game, you may have been asked to place a file called d3d9.dll in the same folder as the game. This is the reason why it works as it does.
Naturally, we attempted to use a similar entrypoint for .NET executables. All .NET executables have mscoree.dll as the only item in their IAT, so surely we could use a similar system for all .NET executables, right?
Unfortunately, the answer is no. It turns out this is just a formality from the .NET 1.0 and 1.1 days, and since Windows XP this will not work:
However, starting with Windows XP, the operating system loader knows natively how to deal with .NET assemblies, rendering most of this legacy code & structure unnecessary. It still is part of the spec, and so is part of every .NET assembly.
In case you didn't understand that, Windows itself will recognize if an .exe file is a .NET Framework executable and will use specialized code to load & run it. The IAT is not even checked. [1]
Another solution is needed.
In .NET Framework, there exists the concept of a binding redirect. The primary use of a binding redirect is to tell the runtime to use a certain version of a .dll file (for example, if you want to use a .NET Framework 4.0 version of a .dll file instead of the latest version).
You would add binding redirects by editing the .exe.config file that Visual Studio builds alongside an application.
They can also be used in a much more nefarious manner.
By adding a .config file to the same folder as a .NET Framework executable and filling it with our custom binding redirects, we can essentially tell the runtime to use our DLL instead of the one that the game expects. Great! What now then?
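As a rough illustration of what such a config file contains, here is a minimal sketch that builds one binding redirect entry with System.Xml.Linq. The element layout follows the documented `<assemblyBinding>` schema for .NET Framework config files. The class name, method name and every argument value are hypothetical, and the article does not show exactly which attributes the author's tool emits.

```csharp
using System;
using System.Xml.Linq;

static class RedirectConfig
{
    // XML namespace required on <assemblyBinding> in .NET Framework config files.
    static readonly XNamespace Asm = "urn:schemas-microsoft-com:asm.v1";

    // Builds a config document holding a single binding redirect.
    // All argument values passed in are illustrative, not taken from the article.
    public static XDocument Build(string name, string publicKeyToken,
                                  string oldVersion, string newVersion)
    {
        return new XDocument(
            new XElement("configuration",
                new XElement("runtime",
                    new XElement(Asm + "assemblyBinding",
                        new XElement(Asm + "dependentAssembly",
                            // Identifies the assembly whose binding is redirected.
                            new XElement(Asm + "assemblyIdentity",
                                new XAttribute("name", name),
                                new XAttribute("publicKeyToken", publicKeyToken),
                                new XAttribute("culture", "neutral")),
                            // Maps a version (or version range) to the replacement version.
                            new XElement(Asm + "bindingRedirect",
                                new XAttribute("oldVersion", oldVersion),
                                new XAttribute("newVersion", newVersion)))))));
    }
}
```

Calling `RedirectConfig.Build("Example.Library", "0123456789abcdef", "0.0.0.0-65535.65535.65535.65535", "9.9.9.9").Save("Game.exe.config")` would write the file next to a hypothetical Game.exe.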
First of all, we still need to be able to execute our custom code as soon as possible. With .NET, the easiest way is to use a module initializer. It is identical to the entrypoint I explained above for native .dll files, except for C# libraries. We can use this to inject our custom code.
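For readers who have not seen one, a module initializer can be sketched as follows. This uses the `[ModuleInitializer]` attribute that the C# 9 compiler understands. On .NET Framework the attribute type has to be declared by hand, or the initializer injected into the module's static constructor with an IL tool such as Mono.Cecil. Which approach the author's tooling uses is not stated in the article, and the class and member names here are hypothetical.

```csharp
using System;
using System.Runtime.CompilerServices;

static class ProxyEntry
{
    // Lets other code observe that the initializer has fired.
    public static bool Ran;

    // The runtime calls this once, before any other code in the module executes.
    // It must be static, parameterless, return void, and be accessible within the module.
    [ModuleInitializer]
    internal static void Initialize()
    {
        Ran = true;
        Console.WriteLine("proxy module initializer running");
    }
}
```

Anything placed in `Initialize` runs as a side effect of the assembly being loaded and first used, which is what makes it a managed counterpart to a native .dll entrypoint.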
On paper this looks like it will work out of the box. However, there are still two major problems.
The first one is much simpler and has a similarly simple fix. When using a binding redirect for a .NET library, the replacement .dll should be somewhat similar to the .dll file it's attempting to replace. This is something I was never able to completely nail down; if the files are too different, Fusion (the part of .NET Framework that handles finding and loading libraries) will outright crash without any notice.
This is expected undefined behaviour since we're way beyond what the .NET Framework development team had intended with this feature, but the fix is not difficult. Simply make the replacement library as similar as possible to the original, with a module initializer as the only addition. Fusion is happy at this point and will load the replacement library that we list.
I can't explain the next issue without explaining what custom code we actually add first.
People deeply knowledgeable about the CLR may have seen this coming. One of the goals I was trying to accomplish with this was to have my custom code execute before any other code had executed (or more specifically, before any assemblies had loaded).
Since I can't guarantee how early this executes, and the current AppDomain may already be tainted with loaded assemblies, it's pretty much a requirement for me to create a new AppDomain.
For the uninitiated, in .NET Framework an AppDomain is what assemblies (your .NET .exe and .dll files) are loaded into. Unless you explicitly create more, there is only one in a process which is why you've never heard of it if you've never been in this area before.
They're primarily used for creating sandboxes, because you can apply restrictions on AppDomains (such as not allowing filesystem access) and AppDomains are isolated from each other. This is also a downside, as it makes communication between AppDomains pretty difficult.
We create an AppDomain since it's a blank slate with no loaded assemblies, which is a requirement for the "preloader patcher" functionality in BepInEx.
However our binding redirect trick has come back to bite us in the ass.
When .NET Framework creates a new AppDomain, it uses the same .config file (and therefore the same binding redirects) that the process started with. So when our new AppDomain wants to load the library that we created a binding redirect for, it will begin this process all over again in an endless loop.
There's no obvious way around this either. The runtime checks for the .exe.config file first before checking for assemblies already loaded into the AppDomain, so we can't preemptively load the original assembly.
This is where most of the week went: simply attempting to tell the runtime to stop applying binding redirects. If you're thinking that there's a way of preventing it with static properties or something similar, it's straight up not possible. [2] As we're in a new AppDomain, everything is isolated and new, and that includes static properties.
This kind of functionality is CLR territory, so we need to interface with the .NET Framework itself.
We checked the MSDN documentation for something that could manipulate binding redirects at runtime, and found something that looked promising:
In case the documentation has changed since the time of publication, here's a snippet from the page:
Modifies the binding policy for the specified assembly, and creates a new version of the policy.
...
pwzSourceAssemblyIdentity
[in] The identity of the assembly to modify.
pwzTargetAssemblyIdentity
[in] The new identity of the modified assembly.
pbApplicationPolicy
[in] A pointer to a buffer that contains the binding policy data for the assembly to modify.
cbAppPolicySize
[in] The size of the binding policy to be replaced.
dwPolicyModifyFlags
[in] A logical OR combination of EHostBindingPolicyModifyFlags values, indicating control of redirection.
pbNewApplicationPolicy
[out] A pointer to a buffer that contains the new binding policy data.
pcbNewAppPolicySize
[in, out] A pointer to the size of the new binding policy buffer.
Now this seemed really ambiguous to us at the time. Both the description of "creating a new version" and the fact that the policies were pointers to byte arrays didn't make much sense, but since the documentation was lacking here we had to experiment a bit.
As the documentation lacked information on how to access the CLR's COM objects, we referenced mscoree.h instead:
We finally got stumped on this last error though. We had to find out what the binary blobs actually were if we wanted to get this working. ghorsington dusted off IDA and took a look at the disassembly of this function.
Can you guess what it was?
It was XML data. Yep, this function did not actually manipulate anything at runtime, but was for editing the .config file of an executable. While I'm unconvinced of the usefulness of a function doing this, it was pretty frustrating to invest so much time into getting this interoperability working within C#, only for it to not be what we thought it was because of the lacking documentation.
Luckily we had found another lead; when creating an AppDomain, you can pass in an AppDomainSetup object which contains a very interesting flag: DisallowBindingRedirects.
This seemed like a no-brainer. When creating the new AppDomain in our custom code, we could add a flag to get it to disable binding redirects and it would solve our issues. But surprise, again, this didn't work. Why?
We can take a look at the CLR source code to see why.
While not the latest version of the .NET Framework CLR, older versions of the CLR (.NET 2.0) are available for public viewing. This is the repo we used:
https://github.com/SSCLI/sscli20_20060311
We had already been referencing this during the events of the previous chapter, but this is where it actually helped us pinpoint an issue. Here's the smoking gun:
https://github.com/SSCLI/sscli20_20060311/blob/master/clr/src/fusion/binder/policy.cpp#L285
What does this mean? If the assembly that is being loaded into the AppDomain (in our case, the original assembly that we redirected from) does not have a strong name, then the logic that checks for DisallowBindingRedirects is not executed. Even though it's only 20 lines of code away.
Unfortunately, this is also something that the documentation fails to specify:
Gets or sets a value that indicates whether an application domain allows assembly binding redirection.
Honestly this is just straight up misleading, as there's a huge caveat here that is not mentioned. Microsoft has really dropped the ball when it comes to documenting these lower level functions, but it's probably because the only people consuming these functions would be corporations with access to enterprise-level support.
Stumbling upon this by pure chance, we found a property in AppDomainSetup that would cater to our needs: ConfigurationFile.
Normally this is used to configure individual .exe.config files for each AppDomain, but we had also found an unintended use for it: removing a configuration file from being used in an AppDomain.
All we really have to do is set it to pretty much anything [3] that isn't the file, and then suddenly the runtime will stop applying binding redirects.
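A minimal sketch of that idea, for .NET Framework only, might look like the following. The class name and the "nonexistent.config" file name are hypothetical. Pointing at a path that does not exist is an assumption on my part about what counts as "anything that isn't the file", so treat it as a starting point for experimentation, not a verified recipe.

```csharp
using System;
using System.IO;

static class CleanDomain
{
    // Builds a setup object whose config path is deliberately not the real .exe.config.
    public static AppDomainSetup BuildSetup()
    {
        string baseDir = AppDomain.CurrentDomain.BaseDirectory;
        return new AppDomainSetup
        {
            ApplicationBase = baseDir,
            // Assumption: any non-empty path other than the real config file will do.
            // An empty string is avoided because the framework expects a value here.
            ConfigurationFile = Path.Combine(baseDir, "nonexistent.config")
        };
    }

    // Creates the new AppDomain; this call is only supported on .NET Framework.
    public static AppDomain Create(string friendlyName)
    {
        return AppDomain.CreateDomain(friendlyName, null, BuildSetup());
    }
}
```

Code that runs inside the domain returned by `CleanDomain.Create("Preloader")` should then resolve assemblies without the redirects from the original .config file.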
An anti-climactic ending, but we were beyond relieved when we found this. Next task was to write the tool that generated the binding redirect and proxy assemblies for us, and then we could run our own code in a .NET executable without overwriting any files. Cool!
When integrating this with Terraria, I found something interesting: it was an ILMerged assembly.
ILMerging is the act of combining multiple assemblies into a single assembly, for example when you want to combine all of your .dll files into a single .exe file.
Most (if not all) ILMerging implementations rely on AppDomain.AssemblyResolve to tell the runtime where to look for dependencies. In a nutshell, the CLR tries to find a dependency, can't find the .dll file, flags the event and the application says "Here's the assembly, it's been embedded within the .exe the entire time".
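The pattern described above can be sketched roughly like this. The resource naming scheme ("<SimpleName>.dll") and the class name are hypothetical, and real implementations differ in how they name and store the embedded assemblies.

```csharp
using System;
using System.IO;
using System.Reflection;

static class EmbeddedResolver
{
    // Hypothetical naming scheme: each dependency is embedded as "<SimpleName>.dll".
    public static string ResourceNameFor(string assemblyFullName)
    {
        return new AssemblyName(assemblyFullName).Name + ".dll";
    }

    // Hooks AssemblyResolve, which only fires after normal probing has failed.
    public static void Register()
    {
        AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
        {
            Assembly host = Assembly.GetExecutingAssembly();
            using (Stream stream = host.GetManifestResourceStream(ResourceNameFor(args.Name)))
            {
                if (stream == null)
                    return null; // not embedded here; let resolution fail normally

                using (var buffer = new MemoryStream())
                {
                    stream.CopyTo(buffer);
                    // Load the dependency from its raw bytes.
                    return Assembly.Load(buffer.ToArray());
                }
            }
        };
    }
}
```

Because the handler is only consulted when the runtime cannot find the assembly by itself, a file with the same name sitting on disk is picked up first and the handler never runs.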
This is vulnerable because it relies on the fact that the .dll it's searching for doesn't already exist on disk. We can just put our own .dll in its place, and the runtime will pick up our proxy assembly by mistake, giving us the same level of execution as before.
It's useful because it doesn't rely on binding redirects, but of course it only works if an assembly has an ILMerged dependency.
So to recap, here's what's required to be able to have custom code injected like this:
I've uploaded a repo that has all of the code I've made related to this:
https://github.com/BepInEx/NRedirect
It contains a tool that can find a suitable dependency from a .NET Framework executable, generate a proxy assembly with a hook and the relevant binding redirect, and point to a custom assembly of your choice.
The final AppDomain logic is located here, and contains more comments than code.
Hope you found this interesting. Investigating .NET Core / .NET 5 may or may not be in the future, since they don't support creating additional AppDomains ¯\_(ツ)_/¯
- Bepis
https://github.com/bbepis
1: IAT abuse like this works when chained; for example, if a setup such as Exe -> DllA -> DllB exists, after loading DllA the executable will load DllB because DllA relies on it. However, this doesn't work for most (if not all) system DLL files like mscoree.dll, as they explicitly only load from System32 and skip the search directories.
2: While we could technically add a check for the current AppDomain in the module initializer for the proxy assembly created in chapter 2, it is not a perfect solution since it would cause the proxy assembly to be used in place of the original assembly.
When an assembly is loaded into the current AppDomain, the CLR will use the types and definitions from that first assembly no matter how many times you call Assembly.Load() on other assemblies in an attempt to override them.
Therefore the proxy would need to be regenerated on every game update, otherwise there's a risk of out-of-date code being used. The better option is to only recreate the proxy assembly when Fusion does not like a binding redirect and crashes, which is much less frequent.
3: You need to be careful here. While an empty string might sound like a good option, in practice it will cause errors within the framework itself that expect a value there. Most notably FileIOPermission.CheckIllegalCharacters will throw exceptions, and there may be more caveats as this is completely undocumented territory.