Phalanger on Mono (part 2)
Posted on November 21st, 2007 in Personal | Comments
I blogged earlier about getting Phalanger running on Mono and the missing feature missing being the native extension support because the bridge was written in Managed C++.
Well I came up with an all C# solution. It’s not close to being done, more just a proof concept. I’m not sure the willingness of the upstream with supporting this method because it probably means a lot more work to maintain this method then the managed C++ version. With the managed C++ version, its less complicated because at least you can reuse the headers from PHP to help out with all the function declarations, typedefs, macros, and structs (that sometimes break ABI compatibility on rare occasion in PHP from version to version which everyone who has ever used the Zend optimizer knows about when they upgrade their PHP version the old version of Zend’s optimizer breaks). With the all C# solution, it means reversing all those headers and duplicating their function in C#.
There are also a few typedefs in the headers that compile to two different sizes between 32bit and 64bit platforms so I have to declare two versions of the same p/invoke calls on some occasions and use one or the other if the code is running on 32bit or 64bit.
The way the code works on the managed C++ side is that the code will iterate through all the php extension dynamic libraries. All the extensions in PHP are required to expose a standardized set of function exports. The managed C++ code will dlopen them (I’m using Unix terminology here, LoadLibrary on windows), use dlsym to get the function pointers for that extension, store them in to a struct, and stick the struct on to an array where it holds references to all the extensions. (I’m way over simplifying what it does, as phalanger also does some neat isolation tricks by loading the extensions into a different memory spaces/processes to keep buggy extensions from crashing the process phalanger is running in and remotes into them).
This same kind of operation using dlopen and dlsym like this can be done from Mono/.NET natively in a cross platform safe manner using just plain old P/Invoke and some really cute System.Reflection.Emit code. (Cecil maybe able to help me here for what I’m doing later). Basically, it means I emit an class for each extension on the native side, with all the emited classes inheriting from a interface that matches the same method signatures of the exports I’m P/invoking in on with each lib. It’s not possible to declare an extern in an interface, so I emit both a private pinvoke call and a public function with the same signature and name that just calls it which is in the interface. The class I emit each time is almost identical except for the given library name each of the p/invoke calls call out to. Poor mans dynamic p/invoke. This entire process can be at runtime, or done before hand, where I can generate assemblies or even exes that remote into a parent process for poor mans process isolation (which is really easy since I’m using a well known interface for each of the extensions so remoting to it is simple).
After the extension is loaded up, the managed C++ code will then do some System.Reflection.Emit operations to generate a wrapper assembly around the functions provided by the php extension library. This is done using the reflection like information provided by the extension to emit a similar call in the .NET side that proxies down to the extension. PHP has setter and getter methods, overloading, purely dynamic functions, well known functions, etc. Thankfully this managed C++ code converts to C# without to much trouble. There is still a lot of structs to rebuild though.
In all, its about a 6 to 8 week project if I had the time. (which I don’t)