Phalanger on Mono (part 2)

I blogged earlier about getting Phalanger running on Mono, and about the one missing feature being native extension support, because the bridge was written in Managed C++.

Well, I came up with an all-C# solution. It’s not close to being done; it’s more of a proof of concept. I’m not sure how willing upstream will be to support this approach, because it probably means a lot more work to maintain than the managed C++ version. The managed C++ version is less complicated because at least you can reuse the headers from PHP to help out with all the function declarations, typedefs, macros, and structs. (Those occasionally break ABI compatibility from one PHP version to the next, as anyone who has used the Zend optimizer knows: upgrade PHP and the old version of Zend’s optimizer breaks.) With the all-C# solution, it means reversing all those headers and duplicating what they do in C#.

There are also a few typedefs in the headers that compile to two different sizes between 32bit and 64bit platforms so I have to declare two versions of the same p/invoke calls on some occasions and use one or the other if the code is running on 32bit or 64bit.
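To make the size problem concrete, here is a minimal C sketch. The typedef `ext_ulong` and the `binding_kind` names are hypothetical stand-ins, not anything taken from the PHP headers; the only assumption is that the real typedefs follow the width of `long`, which is what makes them differ between platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a header typedef whose width follows `long`. */
typedef unsigned long ext_ulong;

/* Which set of managed declarations would have to be used. */
typedef enum { BINDING_32BIT, BINDING_64BIT, BINDING_UNKNOWN } binding_kind;

/* Map a width in bytes to the binding variant that matches it. */
static binding_kind binding_for_width(size_t bytes) {
    switch (bytes) {
    case 4:  return BINDING_32BIT;
    case 8:  return BINDING_64BIT;
    default: return BINDING_UNKNOWN;
    }
}

/* Report the variant that matches the platform this code was compiled for. */
static binding_kind current_binding(void) {
    binding_kind kind = binding_for_width(sizeof(ext_ulong));
    assert(kind != BINDING_UNKNOWN); /* `long` is neither 4 nor 8 bytes */
    return kind;
}
```

The managed side has no `sizeof(ext_ulong)` to ask, so it has to carry both declarations and make this choice itself at runtime.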

The way the code works on the managed C++ side is that it iterates through all the PHP extension dynamic libraries. All extensions in PHP are required to expose a standardized set of function exports. The managed C++ code will dlopen them (I’m using Unix terminology here; it’s LoadLibrary on Windows), use dlsym to get the function pointers for that extension, store them in a struct, and stick the struct onto an array that holds references to all the extensions. (I’m way oversimplifying what it does, as Phalanger also does some neat isolation tricks: it loads the extensions into a different memory space/process and remotes into them, to keep buggy extensions from crashing the process Phalanger is running in.)
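The load-and-register loop described above looks roughly like this in plain C. This is a sketch of the general dlopen/dlsym pattern, not Phalanger’s actual code: `loaded_extension`, `extension_registry`, `registry_load`, and the fixed array size are all names I made up for illustration, and the exported symbol name is passed in by the caller rather than assumed.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_EXTENSIONS 64 /* arbitrary limit for this sketch */

/* Hypothetical shape of an extension's exported entry point. */
typedef void *(*entry_fn)(void);

/* One loaded extension: its library handle and resolved entry point. */
typedef struct {
    void    *handle;
    entry_fn entry;
} loaded_extension;

/* The array that holds references to all loaded extensions. */
typedef struct {
    loaded_extension items[MAX_EXTENSIONS];
    size_t           count;
} extension_registry;

/* Open the library at `path`, resolve `symbol`, and append the result to
 * the registry. Returns 0 on success, -1 on any failure. */
static int registry_load(extension_registry *reg, const char *path,
                         const char *symbol) {
    assert(reg != NULL && path != NULL && symbol != NULL);
    if (reg->count >= MAX_EXTENSIONS)
        return -1;

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    void *sym = dlsym(handle, symbol);
    if (sym == NULL) {
        fprintf(stderr, "dlsym failed for %s\n", symbol);
        dlclose(handle);
        return -1;
    }

    loaded_extension *slot = &reg->items[reg->count++];
    slot->handle = handle;
    /* Idiom from the POSIX dlsym notes for storing a function pointer. */
    *(void **)(&slot->entry) = sym;
    return 0;
}

/* Close every library handle and empty the registry. */
static void registry_unload_all(extension_registry *reg) {
    assert(reg != NULL);
    for (size_t i = 0; i < reg->count; i++)
        dlclose(reg->items[i].handle);
    reg->count = 0;
}
```

A caller would zero-initialize an `extension_registry`, call `registry_load` once per extension file it finds, and invoke each stored `entry` pointer afterwards. On older glibc versions this needs to be linked with `-ldl`.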

This same kind of operation using dlopen and dlsym can be done from Mono/.NET natively, in a cross-platform-safe manner, using just plain old P/Invoke and some really cute System.Reflection.Emit code. (Cecil may be able to help me here with what I’m doing later.) Basically, I emit a class for each native extension, with all the emitted classes implementing an interface whose method signatures match the exports I’m P/Invoking into in each lib. It’s not possible to declare an extern in an interface, so I emit both a private P/Invoke declaration and a public function with the same signature and name, which is the one in the interface, that just calls it. The class I emit each time is almost identical except for the library name each of the P/Invoke calls points at. Poor man’s dynamic P/Invoke. This entire process can be done at runtime or beforehand, where I can generate assemblies or even exes that remote into a parent process for poor man’s process isolation (which is really easy, since I’m using a well-known interface for each of the extensions, so remoting to it is simple).

After the extension is loaded, the managed C++ code then does some System.Reflection.Emit operations to generate a wrapper assembly around the functions provided by the PHP extension library. This is done using the reflection-like information provided by the extension to emit a similar call on the .NET side that proxies down to the extension. PHP has setter and getter methods, overloading, purely dynamic functions, well-known functions, etc. Thankfully this managed C++ code converts to C# without too much trouble. There are still a lot of structs to rebuild, though.

In all, it’s about a 6 to 8 week project if I had the time (which I don’t). :-)

5 Responses to “Phalanger on Mono (part 2)”

  1. pic_micro Says:

    Nice idea.

  2. Asbjørn Ulsberg Says:

    Wow, it almost made me sweat just reading that! Sounds exhausting. :) However, wouldn’t it be possible to perform some of the code-generation as an automatic step, perhaps in the (pre-) build process? That would at least help the upstream maintainers and perhaps make them embrace the all-C#-solution you’ve created (which I think sounds much better, if only just for the cross-platform interoperability, than the managed C++ one).

    Anyway, nice work! What would be even more interesting than writing PHP in a .NET environment, though, would be to write C# in a PHP environment. ;-)

  3. zbowling Says:

    Generation of the assemblies with Reflection.Emit is easy. It’s the reversing of the interfaces back and forth that is hard. Code generation sucks, plus it’s not dynamic to each new extension you drop in, and you have to have more than just the .NET runtime: you need the entire SDK. You would assume that the SDK would be there if they are using ASP.NET, but not every time. :-)

  4. Alan Says:

    [quote]
    There are also a few typedefs in the headers that compile to two different sizes between 32bit and 64bit platforms so I have to declare two versions of the same p/invoke calls on some occasions and use one or the other if the code is running on 32bit or 64bit.
    [/quote]

    Well, one trick is to have a native wrapper library which exposes a single API which does not have a change in size between 32bit/64bit platforms and then that native library will interop with your existing C++ code.

    The size of ‘long’ in your C++ code will be the same as in the wrapper library, so you’ll have no troubles interoping there, and your wrapper library will just expose it as int32_t or int64_t as required, so you are guaranteed fixed size variables.

    Then, the C# code P/Invokes your wrapper library on all platforms, so you only need 1 set of p/Invokes in C# and you don’t need runtime detection.

    Alternatively, if you don’t care about windows64 support, just represent a ‘long’ as an IntPtr in your C# code and you support all 32bit windows systems and all platforms with sizeof(pointer) == sizeof(long). Which is everything except win64.

    Hope that makes sense.

  5. zbowling Says:

    yes. makes sense.