Posted on 09 Aug 2005
Atsushi Enomoto has a great list of hints for developers who use XML and Mono's XML implementation and care about performance; his list is available here.
Mono originally used IBM's Classes for Unicode (ICU) library, a C library that provides many tools for handling internationalized strings (comparing strings, finding substrings, handling case sensitivity in a culture-aware fashion, and so on).
Basically, all the managed code in Mono would call into the C runtime, and the C runtime would use ICU's functionality to carry out the job. Unluckily, the behavior of Microsoft's Unicode operations differed from ICU's implementation, and the fixes that we applied in our wrapper code around ICU were insufficient to provide the same semantics. Developers were running into various unexpected problems and erratic behavior that came out of our mapping, which prompted us first to discourage the use of ICU, and later to completely disable the ICU support code in Mono.
A few months ago, Atsushi was wrapping up his work on System.XML 2.x and asked me what he should look into as his next task. I asked Atsushi to look into implementing a replacement for ICU that we could use for Mono. He took this challenge very seriously and this past week he finally landed the new string collation code in the repository.
His latest blog post has some performance information and he links to the various posts that detail his quest into implementing string collation for Mono.
Posted on 09 Aug 2005
On Friday I will be presenting some of the recent work being done at Novell on improving Linux including Beagle, Xgl and other Mono-based applications.
There are a few Mono sessions and a Mono BOF that I am planning on attending to discuss recent developments.
Posted on 02 Aug 2005
I just called Nat to go out and have brunch at Henrietta's Table in Harvard Square and a panting Nat informed me that he was biking to Provincetown and back. Unlike his last trip two weeks ago he left Boston fairly late. I just got the following SMS with Nat's current location:
14:50: N:41.690170, W:-70.317350. 74.9 miles from home or about 100 minutes by car.
18:21: Nat has arrived in Provincetown. The trip was one hour shorter than his last trip. Now the painful return home begins.
Posted on 30 Jul 2005
Mono's C# compiler has a fairly complete regression test suite: we routinely use mcs to compile the roughly two million lines of code that make up Mono, and the suite also contains about two thousand individual positive and negative test cases.
Since we run continuous builds of Mono to detect regressions in the code base, the two thousand compiler test cases were starting to slow down both the regular Mono builds and the release regression tests considerably.
Faced with this problem Marek Safar, one of the main C# compiler contributors, refactored the compiler so that it could be reused as a library. Then he built a couple of drivers that would process the compiler's negative and positive tests. This reduced the time to run all the tests from about 6-8 minutes to 24 seconds.
The time savings come from not having to restart the compiler for each invocation; instead the compiler resets itself at the end of each compilation and moves on to the next target:
test-anon-28.cs... OK
test-anon-29.cs... OK
test-anon-30.cs... OK
The bottleneck moved from C# to my terminal program.
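The reuse-and-reset idea can be sketched in a few lines. This is only an illustration in Python, not Marek's driver or the real mcs interface: the ToyCompiler class, its compile and reset methods, and the test names are all made up for the example. What it shows is a single long-lived compiler object that runs every positive and negative test in one process and clears its state between compilations.

```python
# Hypothetical stand-in for a compiler exposed as a library.
# This is NOT the real mcs interface; every name here is an assumption.
class ToyCompiler:
    def __init__(self):
        self.errors = []    # per-compilation state
        self.startups = 1   # how many times the startup cost was paid

    def compile(self, name, source):
        # Toy rule: a source containing the word "error" fails to compile.
        if "error" in source:
            self.errors.append(f"{name}: compilation failed")
        return not self.errors

    def reset(self):
        # Clear per-compilation state so the next test starts clean.
        self.errors = []


def run_tests(compiler, tests):
    """Run (name, source, should_compile) cases inside one process."""
    results = {}
    for name, source, should_compile in tests:
        compiled = compiler.compile(name, source)
        results[name] = "OK" if compiled == should_compile else "FAILED"
        compiler.reset()  # reuse the same compiler instead of restarting it
    return results


# Made-up test cases: two positive tests and one negative test.
TESTS = [
    ("test-1.cs", "class A {}", True),
    ("cs0001.cs", "class B { error }", False),
    ("test-2.cs", "class C {}", True),
]

compiler = ToyCompiler()
RESULTS = run_tests(compiler, TESTS)
for name, status in RESULTS.items():
    print(f"{name}... {status}")
```

Note that the negative test runs before the last positive one: without the reset call, the error left over from cs0001.cs would make test-2.cs fail, which is why the reset step matters once the compiler is no longer restarted for every file.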
This embedding turned out to be useful for a small hack. ASP.NET invokes the compiler the first time that a page is hit. When you reference an ASP.NET page like "demo.aspx", the file is translated into C# code, which is then passed to the C# compiler; the output is dynamically loaded into the server, which in turn dispatches the request to it.
While developing applications, developers might notice a 1-2 second delay on the first hit to a page as the compiler processes the file. Some months ago I cooked up a patch that uses Marek's interface to embed the C# compiler directly into the ASP.NET engine (original posting, patch).
On my machine the time to compile and load a page on the first hit went from about 1.5 seconds to 0.9 seconds.
This patch is not very interesting anymore, as ASP.NET 2.x allows websites to be deployed by precompiling the whole site and copying the resulting DLL.
Posted on 28 Jul 2005
The third edition of the ECMA C# language specification and the Common Language Infrastructure (CLI) have been approved by the ECMA General Assembly. Check the press release for the juicy quote from my boss.
The introductory chapters of the C# specification read like a tutorial. I recommend that those interested in the C# language read the first few chapters, as they have most of what they need.
Mono 126.96.36.199 includes a complete implementation of the new C# specification (written in C# of course) and our runtime contains all the new CLI features required.
We are still missing a few new classes and new methods from the updated class libraries; these should be with us soon.
Posted on 28 Jul 2005
Some people have drawn the wrong conclusion about J2EE and ASP.NET in terms of scalability. All that I said in the interview is that J2EE was entrenched in a segment of the application server market, in particular the segment where project budgets are north of two million dollars (this is what our research a few years ago showed). My data comes from the research that we did in 2003 as part of a Mono survey that talked to various companies about ASP.NET and J2EE.
I do not personally track the scalability and speed claims made by the Java or .NET camps carefully enough when it comes to application server performance. All that our research showed back then was that people investing those large sums of money were using expensive non-Intel hardware and purchasing expensive application servers and suites on the high end, and Java had been in that market for longer than ASP.NET.
In my subjective opinion ASP.NET is thinner, lighter and simpler to grasp than the J2EE frameworks out there.
Posted on 28 Jul 2005
Joe correctly points out that a major sore point in Mono development today is the lack of a functional debugger. The lack of this tool is duly noted. All we can say at this point is that progress is happening in this area.
Joe, you might want to look into using Mono's profiler in statistical mode instead of using it in the default instrumentation mode. This is described in the man page, but here is a shorthand:
$ mono --profile=default:stat app.exe
In statistical mode, the program is interrupted a number of times per second and the information about the routine currently being executed is gathered. This technique has a much lower overhead than tracking every entry to and every exit from a routine.
For instance, bootstrapping the Mono C# compiler with or without statistical profiling takes the same amount of time (2.5 seconds on my ThinkPad T42p), while using the default settings for the Mono profiler takes four minutes and 21 seconds.
In addition, on Linux with RTC support it is possible to set the sample rate anywhere between 64 and 8192 samples per second by setting the MONO_RTC environment variable (more details are on the manual page).
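To make the sampling idea concrete, here is a rough sketch in Python. It is not how Mono's profiler is implemented: it only mimics the technique by having a background thread peek at the running thread at a fixed interval and count which function it finds there. The names busy_work, sample_thread and profile are invented for the example, and sys._current_frames is a CPython-specific call.

```python
import collections
import sys
import threading
import time


def sample_thread(target_id, counts, stop, interval=0.001):
    """Periodically record which function the target thread is executing."""
    while not stop.is_set():
        frame = sys._current_frames().get(target_id)  # CPython-specific
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)


def busy_work(duration):
    """Made-up workload: burn CPU for roughly `duration` seconds."""
    end = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < end:
        total += 1
    return total


def profile(func, *args):
    """Run func while a sampler thread counts where the time goes."""
    counts = collections.Counter()
    stop = threading.Event()
    sampler = threading.Thread(
        target=sample_thread, args=(threading.get_ident(), counts, stop)
    )
    sampler.start()
    try:
        func(*args)
    finally:
        stop.set()
        sampler.join()
    return counts


COUNTS = profile(busy_work, 0.3)
for name, hits in COUNTS.most_common(3):
    print(f"{hits:6d} {name}")
```

The profiled function is never touched or wrapped; the only cost is the periodic peek, which is why this style of profiling stays cheap compared to recording every routine entry and exit. The trade-off is that the result is an estimate: routines that run briefly may never be sampled at all.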
Another tool that Mono developers should get familiar with is the tracing facility, which lets developers peek at the program execution. The tracing facility has improved in the last few months and it's worth taking a look at some of its new features.
Posted on 27 Jul 2005
Tim Anderson at ITWriting interviewed me on various subjects related to Mono: the Novell/Mainsoft collaboration on Mono; the application server space and Mono's plans in terms of Indigo.
I would like to expand on what I said in the interview, as I am afraid that the relationship between Mainsoft and Novell was not clear.
Mainsoft collaboration: Mainsoft's product Grasshopper is a solution that allows .NET ASP.NET applications to run on a J2EE server by doing code translation from CLI bytecodes into JVM bytecodes. Mono allows the same, but instead of running on a J2EE server we provide an ECMA CLI execution system to execute the code.
Although we use different virtual machines to provide this feature, the class libraries that provide the functionality are shared between Mainsoft's Grasshopper product and Novell's Mono. As of a few weeks ago we finally merged both codebases into a single tree. There are some small differences in the hosting environment, which we cope with by using a number of defines (TARGET_JVM, TARGET_J2EE); in some cases a completely different implementation of the class exists (XXX.jvm.cs), which is compiled instead of the XXX.cs equivalent.
As part of this integration work we are also integrating their build process into our continuous build infrastructure to ensure that we do not accidentally break the build of the Java version of the Mono class libraries.
On Indigo: A lot of folks are curious about Mono's plans regarding Indigo and I have not really had an answer in the past because I had not investigated Indigo in depth.
There are a number of issues with Indigo. The first one is that I am not very well versed in all the things that Indigo has to offer, but I lament certain things about its design. In particular, in this hyped world of Service Oriented Architecture where the contract must be precisely spelled out, instead of adopting an IDL-like language Indigo uses a meta-language built by sprinkling attributes left and right to decorate classes, methods and so on. This meta-language is likely more obfuscated than any possible IDL incarnation of the same idea. The fact that interfaces are now the thing to describe should have been a warning to the designers that it was probably time to look at IDL again.
My second issue with Indigo (and in general with many of the new WS-* protocols) is that the more they evolve, the more they look and feel like the complex high-level CORBA services, the very services that got CORBA the reputation it has today.
In the interview I point out that Indigo is not available today as a released product and that it will take a few years before we see Indigo-based applications.
In the meantime, ZeroC is a startup founded by some ex-CORBA people who wanted something simpler. They have a fairly interesting OO RPC system called Ice which has many of the features that application developers need today and is available for a wide range of languages: Python, PHP, C++, Java and .NET languages. You can read about how it compares to CORBA and SOAP. It is available today and it's dual licensed: free for free software applications, with a commercial license available for commercial applications.
Posted on 26 Jul 2005
My recent favorite is Crooks And Liars since they have copious amounts of video clips on the stories of the day.
Today I found a site that has a little bit of a lag but seems to have fairly good coverage of Fisk's work here. For old Robert Fisk articles, robert-fisk.com has a good selection and usually has a few audio interviews.
Posted on 17 Jul 2005