Mono's SIMD Support: Making Mono safe for Gaming

by Miguel de Icaza

This week at the Microsoft PDC we introduced a new feature in the Mono virtual machine that we have been working on quietly and will appear in our upcoming Mono 2.2 release (due in early December).

I believe we are the first VM for managed code that provides an object-oriented API to the underlying CPU SIMD instructions.

In short, this means that developers will be able to use the types in the Mono.Simd library and have those mapped directly to efficient vector operations on the hardware that supports it.

With Mono.Simd, the core of a vector operation such as updating the coordinates of an existing vector, as in the following example, will go from 40-60 CPU instructions to 4 or so SSE instructions.

void Move (Vector4f [] pos, ref Vector4f delta)
{
	for (int i = 0; i < pos.Length; i++)
		pos [i] += delta;
}

Which in C# turns out to be a call into the method Vector4f.operator + (Vector4f a, Vector4f b) that is implemented like this:

public static Vector4f operator + (Vector4f a, Vector4f b)
{
	return new Vector4f (a.x+b.x, a.y+b.y, a.z+b.z, a.w+b.w);
}

With Mono.Simd's hardware acceleration, the core of the operation is inlined into the `Move' method and looks like this:

movups (%eax),%xmm0
movups (%edi),%xmm1
addps  %xmm1,%xmm0
movups %xmm0,(%eax)

You can see the details in the slides that I used at the PDC and look at the changes in the generated assembly; they are very large.

Ideally, once we tune the API based on our user feedback and contributions, it should be brought to ECMA for standardization. Hopefully we can get Microsoft to implement the SIMD support as well so all .NET developers have access to this.

Making Managed Code Viable for Gaming

Many developers have to resort to C++ or assembly language because managed languages did not provide the performance they needed. We believe that we can bring the productivity gains of managed languages to developers who need high-performance applications.

But even if you want to keep using your hand-tuned C++ game engine, the SIMD extensions will improve the performance of your scripting code. You can accelerate your ray casting operations by doing all the work in the managed world instead of paying for a costly managed to unmanaged transition and back.

You can avoid moving plenty of code from C# into C++ with this new functionality.

Some SIMD Background

Most modern CPUs contain special instructions that are able to perform arithmetic operations on multiple values at once. For example it is possible to add two 4-float vectors in one pass, or perform these arithmetic operations on 16-bytes at a time.

These are usually referred to as SIMD instructions and started showing up a few years ago in CPUs. On x86-class machines these new instructions were part of the MMX, 3DNow or SSEx extensions; on PowerPC they are called Altivec.

CPU manufacturers have been evolving the extensions, and newer versions always include more functionality and expand on the previous generations.

On x86 processors these instructions use a new register bank (the XMM registers) and can be configured to work on 16 bytes at a time using a number of possible combinations:

  • byte-level operations on 16 elements.
  • short-level operations on 8 elements.
  • single precision or integer-level operations on 4 elements.
  • double precision or long-integer operations on 2 elements.
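The element counts in the list above follow directly from dividing the 16-byte register width by the size of each element. A minimal sketch of that arithmetic (the `SimdLanes` class and `LaneCount` helper are hypothetical names used only for illustration):

```csharp
using System;

public static class SimdLanes
{
    // Width of one XMM register in bytes.
    public const int RegisterBytes = 16;

    // Number of elements of the given size (in bytes) that fit in one register.
    public static int LaneCount (int elementSize)
    {
        return RegisterBytes / elementSize;
    }

    public static void Main ()
    {
        Console.WriteLine ("byte:   {0}", LaneCount (sizeof (byte)));
        Console.WriteLine ("short:  {0}", LaneCount (sizeof (short)));
        Console.WriteLine ("float:  {0}", LaneCount (sizeof (float)));
        Console.WriteLine ("double: {0}", LaneCount (sizeof (double)));
    }
}
```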

The byte level operations are useful for example when doing image composition, scaling or format conversions. The floating point operations are useful for 3D math or physics simulations (useful for example when building video games).

Typically developers write the code in assembly language to take advantage of this feature, or they use compiler-specific intrinsic operations that map to these underlying instructions.

The Idea

Unlike native code generated by a compiler, Common Intermediate Language (CIL) or Java class files contain enough semantic information from the original language that it is very easy to build tools to compute code metrics (with tools like NDepend), find bugs in the code (with tools like Gendarme or FxCop), recreate the original program flow-analysis with libraries like Cecil.FlowAnalysis or even decompile the code and get back something relatively close to the original source code.

With this rich information, virtual machines can tune code when it is just-in-time compiled, adapting it to run best on the particular target system or recompiling it on demand.

We had proposed in the past mechanisms to improve code performance of specific code patterns or languages like Lisp by creating special helper classes that are intimately linked with the runtime.

As Mono continues to be used as a high-performance scripting engine for games we were wondering how we could better serve our gaming users.

During the Game Developers Conference early this year, we had a chance to meet with Realtime Worlds, which is using Mono as the foundation for their new work, and we wanted to understand how we could help them be more effective.

One of the issues that came up was the performance of Vector operations and how this could be optimized. We discussed with them the possibility of providing an object-oriented API that would map directly to the SIMD hardware available on modern computers. Realtime Worlds shared with us their needs in this space, and we promised that we would look into this.

The Development

Our initial discussion with Realtime Worlds was in May, and at the time we were working both towards Mono 2.0 and also on a new code generation engine that would improve Mono's performance.

The JIT engine that shipped with Mono 2.0 was not a great place to start adding SIMD support, so we decided to postpone this work until we switched Mono to the Linear IL engine.

Rodrigo started work on a proof-of-concept implementation for SIMD and after a weekend he managed to get the basics in place and got a simple demo working.

Beyond the proof of concept, there was a lingering question: was the code using Vector operations going to be noticeably faster than the regular code? We were afraid that the register spill/reload would eclipse the benefits of using the SIMD instructions or that our assumptions had been wrong.

Over the next few weeks the rest of the team (Zoltan, Paolo and Mark) worked with Rodrigo to turn the prototype into something that could be integrated into Mono and would execute efficiently.

For example, with Mono 2.2 we will now align the stack conveniently to a 16-byte boundary to improve performance for stack-allocated Mono.Simd structures.

So far the reception from developers building games has been very positive.

Although today we only support x86 up to SSE3 and some SSE4, we will be expanding both the API and the reach of our SIMD mapping based on our users' feedback. For example, on other architectures we will map the operations to their own SIMD instructions.


The API lives in the Mono.Simd assembly and is available today from our SVN Repository (browse the API or get a tarball). You can also check our Mono.Simd documentation.

This assembly can be used in Mono or .NET and contains the following hardware accelerated types (as of today):

Mono.Simd.Vector16b  - 16 unsigned bytes
Mono.Simd.Vector16sb - 16 signed bytes
Mono.Simd.Vector2d   - 2 doubles
Mono.Simd.Vector2l   - 2 signed 64-bit longs
Mono.Simd.Vector2ul  - 2 unsigned 64-bit longs
Mono.Simd.Vector4f   - 4 floats
Mono.Simd.Vector4i   - 4 signed 32-bit ints
Mono.Simd.Vector4ui  - 4 unsigned 32-bit ints
Mono.Simd.Vector8s   - 8 signed 16-bit shorts
Mono.Simd.Vector8us  - 8 unsigned 16-bit shorts

The above are structs that occupy 16 bytes each, very similar to equivalent types found in libraries like OpenTK.

Our library provides C# fallbacks for all of the accelerated instructions. This means that if your code runs on a machine that does not provide any SIMD support, or one of the operations that you are using is not supported in your machine, the code will continue to work correctly.
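To make the idea of a software fallback concrete, here is a minimal sketch of what a non-accelerated 4-float vector type looks like. `Vec4` and `FallbackDemo` are hypothetical stand-ins written for this illustration; they are not the Mono.Simd types, and the real library's fallback code may differ.

```csharp
using System;

// Hypothetical stand-in for a 16-byte, 4-float vector; not the Mono.Simd type.
public struct Vec4
{
    public float X, Y, Z, W;

    public Vec4 (float x, float y, float z, float w)
    {
        X = x; Y = y; Z = z; W = w;
    }

    // Scalar fallback: four separate float additions.
    public static Vec4 operator + (Vec4 a, Vec4 b)
    {
        return new Vec4 (a.X + b.X, a.Y + b.Y, a.Z + b.Z, a.W + b.W);
    }
}

public static class FallbackDemo
{
    // Same shape as the Move example: add delta to every position in place.
    public static void Move (Vec4 [] pos, ref Vec4 delta)
    {
        for (int i = 0; i < pos.Length; i++)
            pos [i] += delta;
    }

    public static void Main ()
    {
        Vec4 [] pos = { new Vec4 (1, 2, 3, 4), new Vec4 (10, 20, 30, 40) };
        Vec4 delta = new Vec4 (0.5f, 0.5f, 0.5f, 0.5f);
        Move (pos, ref delta);
        Console.WriteLine ("{0} {1} {2} {3}", pos [0].X, pos [0].Y, pos [0].Z, pos [0].W);
    }
}
```

Code written against a type like this produces the same results whether or not the runtime replaces the operator body with a single vector instruction, which is what makes the fallback transparent to the caller.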

This also means that you can use the Mono.Simd API with Microsoft's .NET on Windows to prototype and develop your code, and then run it at full speed using Mono.

With every new generation of SIMD instructions, new features are supported. To provide a seamless experience, you can always use the same API and Mono will automatically fall back to software implementations if the target processor does not support the instructions.

For the sake of documentation, and to allow developers to detect at runtime whether a particular method is hardware accelerated, developers can use the Mono.Simd.SimdRuntime.IsMethodAccelerated method or look at the [Acceleration] attribute on the methods.

The Speed Tests

When we were measuring the performance improvement of the SIMD extensions we wrote our own home-grown tests and they showed some nice improvements. But I wanted to implement a real game workload and compare it to the non-accelerated case.

I picked a C++ implementation and did a straight-forward port to Mono.Simd without optimizing anything to compare Simd vs Simd. The result was surprising, as it was even faster than the C++ version:

Based on the C++ code from F# for Game Development

The source code for the above tests is available here.

I use the C++ version just because it piqued my curiosity. If you use compiler-specific features in C++ to use SIMD instructions you will likely improve the C++ performance (please post the updated version and numbers if you do).

I would love to see whether Johann Deneux from the F# for Game Development Blog could evaluate the performance of Mono.Simd in his scenarios.

If you are curious and want to look at the assembly code generated with or without the SIMD optimizations, run Mono with the -v -v flags (yes, twice) and use -O=simd or -O=-simd to enable or disable them.


You can watch the presentation to get some color on the above discussion, check it in the Silverlight player, or get it as PDF or PPTX.

Posted on 03 Nov 2008

Interactive GUI Shell

by Miguel de Icaza

This week at the Microsoft PDC I showed gsharp, our GUI REPL for the C# 3.0 language, a tool that I had previously talked about.

Before the PDC we copied an idea from Owen's great reinteract shell where we provide our REPL with a mechanism to turn objects into Gtk.Widgets which we then insert.

Out of the box we support System.Drawing.Bitmap images, we turn those into Gtk Widgets that then we render:

I also added a toy Plot command that can take a number of lambdas that return floats to do some cute plots. The plots are just rendered into a System.Drawing.Bitmap so they get painted on the screen automatically:

But you can add your own handlers for any data types you want, all you have to do is call RegisterTransformHandler with a function that can return a Gtk.Widget based on an input object, or null if it does not know how to render it.

The implementation to render images is very simple:

using System;

public class MyRenderHelpers {
	public static object RenderBitmaps (object o)
	{
		System.Drawing.Bitmap bitmap = o as System.Drawing.Bitmap;
		if (bitmap == null)
			return null;

		return new BitmapWidget (bitmap);
	}
}

You can put your own library of helper methods in a compiled assembly in ~/.config/gsharp, and then register all of your types from a file ending with the extension .cs in ~/.config/gsharp:

RegisterTransformHandler (MyRenderHelpers.RenderBitmaps);

And you are done.
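The handler contract described above (return a rendering, or null if you do not know how to render the object) can be sketched as a small chain of handlers. This is only an illustration of the pattern: `TransformRegistry`, `Register` and `Render` are hypothetical names, strings stand in for Gtk widgets, and none of this is gsharp's actual implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry; the names below are illustrative, not gsharp's internals.
public static class TransformRegistry
{
    static readonly List<Func<object, object>> handlers = new List<Func<object, object>> ();

    public static void Register (Func<object, object> handler)
    {
        handlers.Add (handler);
    }

    // Returns the first non-null rendering, or null if no handler claims the value.
    public static object Render (object value)
    {
        foreach (var handler in handlers) {
            object rendered = handler (value);
            if (rendered != null)
                return rendered;
        }
        return null;
    }
}

public static class RegistryDemo
{
    // Handles ints only; returns null for everything else.
    static object RenderInt (object o)
    {
        if (!(o is int))
            return null;
        return "int:" + o;
    }

    public static void Main ()
    {
        TransformRegistry.Register (RenderInt);
        Console.WriteLine (TransformRegistry.Render (42) ?? "(default rendering)");
        Console.WriteLine (TransformRegistry.Render ("hi") ?? "(default rendering)");
    }
}
```

Returning null rather than throwing lets each handler stay small and ignore types it does not care about, so the caller can fall back to its default text rendering.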

The above could be used for example to create all kinds of information visualizers for the GUI REPL. I would love to see a LINQ query navigator, similar to the one in LinqPad.

Update: A one-line change that brings gsharp into the new millennium by rendering `true' and `false' with icons instead of text:

Posted on 02 Nov 2008

Mono and .NET talk at PDC video.

by Miguel de Icaza

The PDC 2008 was a blast. It was a privilege to be able to present Mono to all of these developers.

Joseph Hill helped me prepare my presentation for the PDC. Our goal was to explore how Mono could help .NET developers, but we did not want to go down a laundry list of features, or a list of APIs, or rehash what the advanced audience at the PDC already knew.

player, pdf, pptx

The idea was to pick a couple of interesting innovations from Mono and try to stitch a story together. I discussed our embeddable C# compiler (which we need to start calling "Compiler Service"), some applications of Mono in gaming, our recent SIMD extensions and using Mono on the iPhone.

As for me, I am catching up on all the sessions I missed this weekend. All of the videos and slide decks are available for free from the Microsoft PDC site, and republished in Channel9.

In the next few days I will blog in more detail about each topic.

Posted on 02 Nov 2008

In LA for the Microsoft PDC

by Miguel de Icaza

After a great week in Copenhagen with the Unity community I spent 14 hours on a high-tech tin can flying to LA for the Microsoft PDC conference.

Mono Talk

I am doing a talk on Wednesday at 4:45 in room 515B.

Unlike previous talks that I have done about Mono, this is an advanced talk and will skip over all the generalities and go straight to Mono CLI/C# features and innovations.

I decided against talking about Moonlight or APIs, as information about them can be better learned elsewhere.


There has been enough leaked information that we know some bits about C# 4. Some guess it includes dynamic support, others that it will be more F#-like and add immutability, others that it will introduce some Alef-like threading capabilities.

Then there is talk about .NET 4, and I just have no clue what they will announce.

So what do you think they are announcing this week?

Speculate away in the comments.

Posted on 25 Oct 2008

Live Blogging from Unite

by Miguel de Icaza

I am live blogging from Unite, the Unity3D conference, one of the most fun users of Mono.


The next UnityEditor will run on Windows, and it is rewritten in C# (it originated on MacOS, and is now moving to Windows).

Unity as of today ships for building games on the iPhone. These are fully legit binaries, no need to crack your iPhone, they are using Mono's batch compilation to generate static binaries with no JITing involved (per Apple licensing requirements).

Side Note

Since I am a Linuxista, you might be wondering why I am so excited about Unity. Of course I am excited because they use Mono, but I am also excited because Novell is working with Unity to bring this to Linux:

erandi$ uname -a
Linux erandi #1 SMP 2008-08-21 00:34:25 +0200 i686 i686 i386 GNU/Linux
erandi$ ls -l unity-linux/build/LinuxPlayer 
-rwxr-xr-x 1 miguel users 45735629 2008-10-12 00:37 unity-linux/build/LinuxPlayer*

We do not have a timeline yet, please do not spam the Unity guys with requests, stay tuned to this blog for updates.

FusionFall pre-mortem

12:17pm Joachim explains their strategy with the web plugin and how they will cope with multiple versions of it as time goes by.

12:06pm Cartoon Network will drive a lot of the plugin penetration.

11:50am Joachim is showing profiling information (particle, physics, the top scripts, time taken per frame).

The game went from 2 gigs to 300 megs by using some interesting compression of meshes and animations.

All of the features that were added for FusionFall are being folded back into the future Unity 2.5.

Pics: I wish I had brought my USB cable to post pictures from the camera. I will try to spice up this post tonight with the photos.

11:45am Runtime World Streaming: scenes are dynamically loaded and unloaded based on the player position in the world; the world is made up of a 16x16 grid. Scene loading happens when a player approaches a boundary in the world.

11:40am The Unity guys are talking about the challenge of converting the assets from Gamebryo to Unity; the volume was large (25k files of Gamebryo data, which was constantly changing and growing).

They added support to Unity to interoperate with the Gamebryo and Cartoon Network data and wrote plenty of C# importing scripts and tools.

11:30am FusionFall talk by Joachim Ante.. FusionFall is a project that was done for the Cartoon Network.

It is an MMO with platform game elements.

It has a huge streaming world, so there are no pauses as you navigate the world. The game is targeted at kids (8-14).

The game was produced by Cartoon Network and developed by Grigon Entertainment, a team of 40 developers, 10 of them programmers, and has been under development for 3 years. Originally this was a Windows-only standalone executable, developed with Gamebryo.

The cycle at some point included the prototyping done in Unity, shipping the executable to engineering, and engineering reimplementing in Gamebryo. They realized that they could turn the prototype into the actual client and that they could just communicate with their backend server. This allowed them to switch the entire game from Gamebryo to Unity.

Originally the standalone game was 2 GB in size (lots of music, voice overs, terrains). This was a problem for kids, since they are not going to wait for 2 gigs to download; it was a big barrier to entry.

Unity's web based distribution and the strong world-streaming features were a good match for this project. This allowed Cartoon Network to give a great experience.

The entire MMO was ported in four months by a team of Grigon developers and four Unity engineers working with them. They were on a very tight schedule. Four developers at Grigon ported the game client from C++ to C#.


10:52am Interesting overview of the challenges of the game industry. Where does the game industry go next? A discussion on integrating games with the web and delivering games as services.

On one end there is the flashy stuff, on the other end lots of talk about the enterprise components of gaming. I had no idea.

Between the low-barrier to entry and the high-barrier to entry markets, there is a large space for gaming and where 3D games on the browser will make a difference.

Phil sees Unity as an agent that will help transform the game industry.

10:31am Phil Harrison president of Atari is now on stage, "First time I have done a presentation in a Planetarium". Atari has no commercial relationship with Unity or investing-wise. He is here because "I wanted to be here, what David and his team are doing is transformational for the industry. I had an Eureka moment early this year, I had just joined Atari, and someone told me `check out', I had heard about it, but never used it. Using the Web player demo was eye opening for me. [...] This is something similar to what I saw at Sony in 1993, when we first got the dinosaur demo on the PS1 [...] The island demo I believe is a game changer in this industry".

"I have become an unpaid evangelist of Unity".

Phil is now going to talk about the game industry, won't blog that.

10:29am Introspection, why we want to do this? Goals: we want people to build games for the web, the iPhone, the Wii, and for everything else.

10:29am Announcing Indie version, 199 dollars, but has some watermark/splashscreen at the beginning.

10:26am Windows Vista logo on screen.

"This is true, I have to admit it", they are demonstrating the new Unity3D IDE on Unix. The same Unity3D tool but now running on Windows.

"We are going completely cross platform", every script that you write will run on both platforms (Woohoo! Go CIL! Go!).

10:21am Nicholas is going to talk about "Secret Labs". He talks about "Just Press Play", "Building for multiple targets" and the script property editor.

They wanted to improve upon the Unity editor.

They rewrote the Unity editor from zero. Created entirely on top of the Unity APIs themselves - EditorWindows, unityGUI, GameObjects and Components. Unity is built on Unity now. "It is way faster than cocoa".

He is showing Unity 2.5, looks like Unity 2.1; They now use tabs and the various windows can be dragged around, very much like MonoDevelop. He is showing the editor by dragging a lightpost into a paradise island and showing the new GUI tools like snapping, rotation and UIs that are closer to Maya's tools.

He shows "Snap to surface" so you can easily position stuff on the 3D terrain. People like it.

The UI is a lot easier to use. "We have been focused on the tiny details, but we are not shipping this yet". Everything in the IDE can now be scripted.

He shows how, with a few lines of code, a developer can attach a camera view when clicking on a property.

130 new API entry points, Unity developers can now do everything that the Unity3D GUI can do.

10:20am David talks about the pricing; Two pricing: cheap and expensive.

10:13am Joachim Ante is introducing Unity for the iPhone. "With Unity we have always focused on iteration time". He goes into some of the technical details, "With all the new input mechanisms for the iPhone, how do we provide a quick feedback system, we wanted to keep the experience of develop, hit run under a second".

They provide a "Unity remote" that runs on the iPhone, you use the iPhone to control the game, but the debugging actually happens on the PC. So you hit "run" and there is no wait at all.

Joachim is explaining how they are using AOT to run native code on the iPhone without having any JIT on the iPhone.

Joachim shows "Build and Run", which cross-compiles the code and sends it to the iPhone. It takes about 10 seconds and he now has the game running on the iPhone and shows both Javascript and C# code running on the system.

10:10am Some stats: sold out event; 180 attendees; community doubled in the last year (2x employees, 2x posts, 2x users);

Last year they released Unity 2.0.

Unity 2.1 was released, should have been 3.0, but they did not charge more.

They announced Unity for Wii.

Big games using Unity: Sony, Disney. Virtual worlds like Hangout.Net is out (very pretty!) A new online community was created (Blurst). SeriousGames released a new title "Global Conflicts" Latin America.

Hard to keep up with the list of users.

10:04am The Unity founders Joachim, David and Nicholas are on stage to start Unite'08.

They moved us from the room downstairs to the iMax auditorium, which is packed with game guys. During the moments of panic that ensued after the room switch, I ended up in the front row, which in retrospect was a big mistake: you can't take pictures of this massive screen with a standard lens.


9:48am Waiting for the keynote at the Unite conference. There are about two hundred people packed in the Tycho Planetarium. The Blurst guys are all wearing matching shirts and have taken over the first row of the auditorium. From here I can see one of them debugging a Sonic-like game that he is prototyping. Then he switched to some game that has an angry minotaur breaking dishes in a museum.

The minotaur looks angry and just scored 3,000 points of some kind.

Everyone in this conference seems to be using a Mac, and I could swear this computer is the only Linux machine in the audience.


Last night I had dinner at the Unity headquarters and got a chance to meet some of the Unity hackers and users before the conference started.

Gained a deeper insight into what we can do to improve Mono's VM for games. Lots of good ideas.

Phil Harrison brought up "the debugger" issue ;-).

Hopefully Rasmus from CellDotNet will show up for the Unity Mingle tonight.

Posted on 22 Oct 2008

Mono 2.0 OSX Installer Ready

by Miguel de Icaza

We released an updated installer for Mono 2.0 on MacOS X.

This release got delayed because we wanted to upgrade our bundled Gtk# stack to contain the latest release of Imendio's Gtk+ for MacOS X.

Banshee coming to an OSX near you this week.

Mono OSX Survey!

We are trying to understand how we can improve Mono on the OSX space. Help us figure this out by filling out our Mono on OSX Survey.

Relocatable Applications

If you have followed our Guidelines for Application Deployment your software should be easy to be packaged and distributed for MacOS X as a relocatable application.

Eoin Hennessy worked on integrating Banshee into the OS, and packaging it into a bundle that runs out of the box on MacOS. The following are some screenshots from Aaron's box:

The Banshee open source media player.

Sandy has ported Tomboy and Tasque to MacOS and Windows and provided installers for both.

Tomboy integrates into the dock and menus.

Aaron Bockover from the Novell Desktop Team has promised that Eoin's work will be part of Friday's Banshee release. From this point on, Banshee will be released both for Linux and MacOS X at the same time.

Maybe F-Spot is not too far behind?

The Small Print

  • We downgraded the bundled Gtk+ from 2.15 to 2.14.3, as 2.15 was a development version and 2.14 is the officially supported Gtk+.
    This means that applications that linked directly against Gtk+ 2.15 from Mono 1.9 will fail to run. Please re-link those binaries.
  • We removed MonoDevelop from this distribution, so our package only contains the Mono SDK and Mono runtime.
    A MonoDevelop installation package will come later, we apologize for this delay.
    On the upside: now that the distribution is split, we will be doing MonoDevelop Beta 2 previews as DMGs after the PDC.

Help us improve Mono on OSX by completing the Mono on OSX Survey and providing comments at the end.

Posted on 20 Oct 2008

Parallel Programming

by Miguel de Icaza

As much as I personally dislike the use of threads for user code, multi-core systems are here to stay. They are becoming increasingly popular (most laptops now ship with dual-core systems, home computers ship with 3 cpus and gaming consoles ship with multiple general purpose cpus as well).

Developers will need new frameworks for developing software that is ready to take advantage of multiple CPUs. But most importantly they will need to learn the traps and pitfalls of writing parallel/threaded code.

Here are two fantastic articles on MSDN that cover these topics:

Jérémie Laval worked on a ParallelFX implementation for Mono over the summer as part of the Google Summer of Code.

The implementation currently lives in the student repository. I cannot wait for the API to be stabilized so we can move it into the main Mono distribution.

Posted on 19 Oct 2008

Going to Copenhagen

by Miguel de Icaza

Next week I will be in Copenhagen for Unity3D's Unite conference.

Unity3D is one of the most fun users of Mono as they create IDEs for Game Developers and they are driving the adoption of Mono, C#, Boo and their own UnityScript in the gaming space.

For a newcomer to this industry like me, there are various sessions from actual Unity users on how they have built their games from start to finish. Other sessions include details on publishing, production (ArtPlant), Physics (Flashbang), Shader Programming (Unity), developing on the iPhone (Unity), a post-mortem on FusionFall's work for Cartoon Network and the hands-on lab.

Some cool stuff from the agenda includes a keynote participation from Atari's President.

If you want to meet up, drop me an email. I will likely be going to the Unity Mingle events at night and departing early on Friday to fly to the Microsoft PDC in LA.

Posted on 17 Oct 2008

Alan "BitSharp" McGovern Joins Novell

by Miguel de Icaza

Alan McGovern, who created BitSharp during a Google Summer of Code for Mono has joined the Moonlight team at Novell.

Imagine the possibilities! Bittorrent clients, servers, trackers all running from Silverlight 2.0 Web Applets!


Posted on 14 Oct 2008

MonoDevelop gets VI bindings

by Miguel de Icaza

I grew up mostly with Turbo Pascal as my development environment. When I started to write C code in DOS, I used Turbo C briefly but for some reason I switched to the BRIEF text editor for a while.

Around 1989 my friend Max Mendizabal who used nothing but Epsilon told me "Unix is the future, if you learn Epsilon, you will be ready to switch to Emacs when the time comes".

Prophetic words.

When I eventually switched to Unix in 1992, having learned Epsilon was useful, but Emacs was too slow for quick edits. I still used Emacs for programming, but for quickly making changes to a file, I ended up learning vi.

When computers got faster, I tried to switch to Emacs for all my editing tasks, but my brain had been hardwired. I even added "alias vi=emacs" to my shell, and I would find myself subconsciously typing "/usr/bin/vi".

To this day, I use both editors interchangeably.

In any case, the above story was just an excuse to introduce VI Mode for MonoDevelop.

Posted on 14 Oct 2008
