Moonlight 4 Preview 1 is out

Yesterday we released Moonlight 4, Preview 1.

This release of Moonlight completes the Moonlight 3 feature set and includes many features from Silverlight 4. Check out our release notes for the list of things that are currently missing from our Silverlight 4 support.

Rendering

Moonlight rendering system uses a painter's algorithm coupled with culling to reduce the amount of rasterization that needs to take place.

For example, if we had these objects all rendered at the same location, on top of each other:

A simple implementation would render the triangle, then the rectangle and then the circle, and we would get this result:

Moonlight optimizes this rendering by computing the bounding boxes of each element and determining which objects are completely obscured. In this case, the triangle never needs to be rendered as it is fully covered.

Since we have the entire graph scene in memory, we can push all the rendering to the X server without ever having to pull data back from it.

Each visible element on Silverlight derives from the class UIElement and Moonlight tracks the bounding box for all of each individual element. As you compose elements, the aggregate bounding box is computed. For example, a canvas that hosts these three UIElements would have a bounding box that contains all three of them:

New Rendering Features

With Silverlight 3, Microsoft introduced two large important changes to the framework: 3D transformations on objects and support for pixel shaders. Both of these are applied to every visual element in Silverlight (this is implemented in the class UIElement).

In addition to the properties inherited from Silverlight 2, UIElements now have two new properties: Projection and Effect.

The Projection property is a 3D matrix transformation (the 3D variation of the 2D affine transform that is available in most 2D graphics APIs). Silverlight exposes both the raw 3D matrix or a set of convenient properties that are easier to use and require no understanding of the interactions of the twelve elements of the 3D matrix (see this page for an explanation).

Just like 2D affine APIs typically expose convenience methods to scale, rotate, skew and translate, the PlaneProjection properties allow developers to focus on those components.

You can see a sample here.

Effects follow a similar pattern. The blur and drop-shadow effects are given convenient names and accessors (BlurEffect and DropShadowEffect but Silverlight exposes an API to create programmable pixel shaders that go beyond these two simple cases.

The ShaderEffect allows users to load pixel shaders written using the High Level Shader Language (HLSL). Here is a sample app showing pixel shaders in action.

3D transformations and pixel shaders require that the contents of a UIElement are rendered to an intermediate output surface. The 3D transformation and shader effect is applied when this surface is composited onto the parent output surface. This compositing operation can be accelerated using graphics hardware.

From our previous example, the three elements would be rendered into a 2D surface, and the actual transformation can be done in the hardware:

Finally, the third new rendering upgrade was the introduction of a bitmap cache that can be applied to a UIElement. When a UIElement is flagged for being bitmap cached, the same kind of intermediate surface rendering and hardware accelerated compositing is performed as for elements with 3D transformations or pixel shaders. The contents of bitmap cache elements are also rendered once and kept on a bitmap that is later composited. This can improve performance vastly for complex controls with many interlocking pieces: instead of computing and re-rendering the entire contents every time, the bitmap cache is used.

This of course has some visible effect. If you instruct Silverlight to use a bitmap cache, and then you zoom-in the contents, you will see the result get pixelated. You can learn more about this on the BitmapCache documentation.

Moonlight's New Rendering Pipeline and GPU Acceleration

Both effects and projections can be implemented purely in software. Effects can be implemented by providing a small interpreter for HLSL code and projections by performing the rendering in software and compositing the results.

David Reveman, the hacker behind Compiz joined the Moonlight team last year to implement the new rendering primitives, but also to figure out how we could hardware accelerate Moonlight. The results of his work are available on yesterday's release of Moonlight 4.

The rendering pipeline was modified. Now, whenever a UIElement has either a Projection, an Effect or has the the flag BitmapCache set the entire UIElement and its children are rendered into a surface that is then off-loaded for the GPU to render.

When OpenGL is available on the system, the composition of UIElements is entirely done by the GPU.

Moonlight will use the GPU to accelerate for compositing, perspective transformations and pixel shaders if the hardware is present without having to turn this on. Silverlight by default requires developers to opt into hardware acceleration and has its own set of features that it will hardware accelerate.

In general, Moonlight uses the GPU more than Silverlight does, except in the case of DeepZoom, where we still do not accelerate this code path.

Gallium3D

Our new rendering pipeline is built using OpenGL and Gallium3D.

Gallium is an engine that is typically used to implement OpenGL. In our case, we use the Gallium3D API when we need to fallback to software rendering 3D transforms on Linux. Otherwise we go through the OpenGL pipeline:

If we were to only support Linux/X11, Gallium3D would have been a better choice for us. But we want to allow the Moonlight runtime to be ported to other windowing systems (Like Wayland on Linux) and other operating systems.

Room for Growth

Now that we have this 3D accelerated platform in place, it is easy to fine-tune specific UIElements to be hardware accelerated.

This first release did only the basics, but we can now trivially use hardware decoders for video and have that directly composited in hardware with the rest of a scene, or we can offload image transformations entirely to the hardware on a type-by-type basis and of course, DeepZoom.

Object Lifecycle

Objects in moonlight live in two spaces, low-level components live in C++ and are surfaced to C#. Typically this means that we have code that looks like this in C++:

	//
	class MyElement : public UIElement {
	  protected:
		MyElement ();
	  private:
	        // fields and methods for MyElement go here
	}

In C# we create a mirror, like this:


	public class DependencyObject {
		// Points to the C++ instance.
		IntPtr _native;
	}

When a user wrote in C# "new MyElement", we would P/Invoke into C++ code that does "new MyElement", get a pointer back to this instance and store it in the "_native" field.

In the other direction, if we created a C++ object and then we had to "surface" it to the managed world, we would initialize the object based on our C++ "this" pointer.

We could create instances of MyElement in C++, and when this instance needs to be surfaces to the managed world, we would create an instance of the managed object, and store the pointer to the underlying C++ object in the _native pointer.

In the Moonlight 2.0 days we used to have C++ objects that would only create managed objects on demand. At the time, we did this to avoid creating thousands of managed objects when loading a XAML file when only a handful of those would be controlled by user code.

The Moonlight runtime, running in pure C++ code, surfaced objects to the C# world and we tracked the object life cycle with good old reference counts. But with Silverlight 2, we started to see problems with the design as it was possible to end up with cycles. These cycles did not happen only in the C++ side or the C# side, but they spanned the boundaries. This made it very hard to debug and it made it hard to keep Moonlight from not leaking memory.

Templates for example could create these cycles.

With Moonlight 4, we have landed a new life cycle management system that works like this:

Every C++ object that we create always points to a managed counterpart. Gone are the days where the managed peer was created only when needed.
Every C++ instance field that points to a DependencyObject subclass goes through this cool C++ templated class that notifies managed when the reference changes.
There are no ref/unref pairs surrounding stores to instance fields in c++ anymore.

Now our base class in C++ has this:

	// Our entire hierarchy exposed to managed code
	// derives from EventObject
	class EventObject {
        	GCHandle managed_handle;
	}

Now all the c++ objects exist and are kept alive solely by their managed peers (there are some rare exceptions for things like async calls) and the whole graph is traversable by Mono's GC because all stores to c++ instance fields result in a managed ref between the respective peers.

With the new code, we no longer live in a world of refs/unrefs (again, except for some rare cases) and whenever we assign to a C++ field that points to a managed object we notify the managed peer of the change.

We were not able to ship Moonlight 4 with our new garbage collection system (Sgen) as we ran into a couple of hard to track down bugs at the last minute, but we are going to switch it on again soon.

Future Work

There is still room for improvement, and now that we know how to cook this kind of code, the goal is to use Mono's new GC in Moonlight more extensively.

We want to teach SGen to scan the C++ heap and track references to managed objects, dropping another potential source of problems and reducing the code needed. We would also love to go back to only creating managed objects on demand.

Platform Abstraction Layer

Moonlight was originally designed as a Linux-only, X11-only plugin for rendering Silverlight content. Developers constantly ask us whether they could run Moonlight on platform XX that is either not Linux or does not use X11.

The amount of work to port Moonlight 2 to those kinds of scenarios was so overwhelming that most people abandoned the efforts relatively quickly.

With Moonlight 4, we have introduced a new platform abstraction layer that will make it simpler for developers to port the Moonlight engine to other platforms.

Whether you want hardware accelerated user experiences in your video game or you want to put Moonlight on a the FreezeMaster 10000 Domestic Fridge with an Internet Connection and SmoothStreaming running on a barebones ARM CPU, you can now enjoy this in the comfort of your home.

We have done some minimal tests to exercise the API and can run the Moonlight engine on both MacOS and Android. You can look at exclusive footage of the animation test suite running on OSX and on Android.

If you are like me, not much of a click-on-the-video kind of person, and would rather get a screenshot, you can bask on the smooth colors of this screenshot on Android or in this delightful test on MacOS.

We are currently not planning on completing that work ourselves, so this is a fabulous opportunity for a caffeine-driven hacker to complete these ports.

Some possibilities, from the top of my head include being able to use Silverlight to design parts of your user experience for apps on the Mac AppStore (think MoonlightView in your MonoMac apps), or for your Android app.

Using Expression beats coding cute animations and futuristic UIs by hand. That is why every major video game embeds ScaleForm, the embeddable Flash runtime for handling the UI.

New XAML Parser

Our original XAML parser was written in C++, this worked fine for Moonlight 1, but with Moonlight 2 it started to become obvious that we were spending too much time calling back from C++ to C# to create managed objects.

This was acceptable up to version 2, but it no longer scaled with Moonlight 3. Too much time was spent going back and forth between the C++ world and the C# world. Those following the development of Moonlight would have noticed that every couple of weeks a new extra parameter to the xaml_load function was added to deal with yet another callback.

The new XAML parser is entirely written in C#, is faster and more reliable.