DevTeach
I will be speaking in a few weeks @ DevTeach Toronto 2008. I'm giving a talk on using Mono & .NET as a cross-platform development environment, but I'll likely also talk about Moonlight and some other neat things we have.
Hope to see you there.
Another day in the life of a Banshee developer
Your friendly neighbourhood Banshee developer playing goth-scotch:
BONUS: Free shot of a shocker developer.
EDIT: I'm obviously too lazy to rotate this myself
Apple specific git repositories
I've published the latest objc2#, cocoa2# and cocoatouch# source code (modulo a few changes that aren't ready for public consumption yet) into a few git repos for the interested parties:
git://sublimeintervention.com/objc2-sharp.git
git://sublimeintervention.com/cocoa2-sharp.git
git://sublimeintervention.com/cocoatouch-sharp.git
Take a look if you're curious.
The longest 36 hours of my life.
At 9:08AM EST, Piers Matthew William Norton was delivered by c-section. He weighed 9lbs 3 oz. He's the best thing I've ever made:
Moonlight colorspace conversion
I recently took on the task of implementing the YUV to RGB colorspace converter for Moonlight. There is a lot of code out there that does this already; in fact we were using libswscale to do it previously, but this was less than ideal for us from a performance and licensing perspective. Let me start with a bit of background on YUV to RGB conversions (4:2:0 specifically) and then I'll describe our C/MMX/SSE2 implementation.
Traditional digital media is transmitted in a colorspace known as YUV. Rather than storing red, green and blue values, YUV splits each pixel into luminance (Y) and chrominance (U and V). This is mainly because of how the human eye detects color: it is more sensitive to brightness than to color, so luminance values that are near each other can share chrominance values. The 4:2:0 colorspace shares the same U/V value for 4 Y values in the following manner:
Let's assume a 16 pixel wide image, 4 pixels high. Our planes would look as follows:
Y00 Y01 Y02 Y03 Y04 Y05 Y06 Y07 Y08 Y09 Y10 Y11 Y12 Y13 Y14 Y15
Y16 Y17 Y18 Y19 Y20 Y21 Y22 Y23 Y24 Y25 Y26 Y27 Y28 Y29 Y30 Y31
Y32 Y33 Y34 Y35 Y36 Y37 Y38 Y39 Y40 Y41 Y42 Y43 Y44 Y45 Y46 Y47
Y48 Y49 Y50 Y51 Y52 Y53 Y54 Y55 Y56 Y57 Y58 Y59 Y60 Y61 Y62 Y63

U00 U01 U02 U03 U04 U05 U06 U07
U08 U09 U10 U11 U12 U13 U14 U15

V00 V01 V02 V03 V04 V05 V06 V07
V08 V09 V10 V11 V12 V13 V14 V15
The Y plane is split into 2x2 blocks, and each block shares a single value from the U plane and a single value from the V plane:
Y00 Y01   Y02 Y03   ...
Y16 Y17   Y18 Y19   ...

U00       U01       ...
V00       V01       ...
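To make the sharing concrete, here is a minimal C sketch of how a pixel coordinate maps to offsets in the planes laid out above. It assumes tightly packed planes (stride equal to width, no padding) and an even width. The function names are made up for illustration and aren't taken from the Moonlight source.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the luma sample for pixel (x, y) in the Y plane.
 * Assumes the plane is tightly packed: stride == width. */
static size_t y_index(int width, int x, int y)
{
    assert(x >= 0 && x < width && y >= 0);
    return (size_t)y * (size_t)width + (size_t)x;
}

/* Offset of the shared chroma sample for pixel (x, y).
 * The U and V planes are each (width / 2) samples wide, and every
 * 2x2 block of Y samples maps to one U and one V sample, so the
 * same offset is used for both chroma planes. */
static size_t chroma_index(int width, int x, int y)
{
    assert(width % 2 == 0 && x >= 0 && x < width && y >= 0);
    return (size_t)(y / 2) * (size_t)(width / 2) + (size_t)(x / 2);
}
```

For the 16 pixel wide example, pixels (0,0), (1,0), (0,1) and (1,1) are Y00, Y01, Y16 and Y17, and all four map to chroma offset 0 (U00/V00).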
YUV to RGB can be mathematically represented as:
R = 1.164 * (Y - 16) + 1.596 * (V - 128)
G = 1.164 * (Y - 16) - 0.813 * (V - 128) - 0.391 * (U - 128)
B = 1.164 * (Y - 16) + 2.018 * (U - 128)
A corresponding integer math based approximation can be represented as:
R = CLAMP((298 * (Y - 16) + 409 * (V - 128) + 128) >> 8, 0, 255);
G = CLAMP((298 * (Y - 16) - 100 * (U - 128) - 208 * (V - 128) + 128) >> 8, 0, 255);
B = CLAMP((298 * (Y - 16) + 516 * (U - 128) + 128) >> 8, 0, 255);
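The integer constants are the floating-point coefficients scaled by 256 (1.164 * 256 is roughly 298, 1.596 * 256 is roughly 409), the + 128 rounds to nearest, and the >> 8 divides by 256 again. As a rough sanity check, here is a small C sketch that computes the red channel both ways. The helper names are hypothetical, and it assumes the compiler implements >> on negative ints as an arithmetic shift (true of gcc and clang, but implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Red channel via the floating-point formula, rounded to nearest. */
static uint8_t red_float(uint8_t y, uint8_t v)
{
    double r = 1.164 * (y - 16) + 1.596 * (v - 128);
    return clamp_u8((int)(r + (r < 0 ? -0.5 : 0.5)));
}

/* Red channel via the integer approximation: coefficients scaled by
 * 256, +128 for rounding, then >> 8 to undo the scaling.
 * Assumes >> on a negative int is an arithmetic shift. */
static uint8_t red_fixed(uint8_t y, uint8_t v)
{
    assert(y >= 0 && v >= 0);
    return clamp_u8((298 * (y - 16) + 409 * (v - 128) + 128) >> 8);
}
```

For a mid-grey input (Y = 128, V = 128) both versions give 130, and for Y = 235, V = 128 both give 255.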
So a simple C based implementation of this algorithm for YUV would be:
#include <stdint.h>
#include <glib.h>   /* assumed source of CLAMP(x, low, high) */

/* Convert one pixel; dst receives 4 bytes in B, G, R, A order. */
static inline void YUV444ToBGRA(uint8_t Y, uint8_t U, uint8_t V, uint8_t *dst)
{
    dst[2] = CLAMP((298 * (Y - 16) + 409 * (V - 128) + 128) >> 8, 0, 255);
    dst[1] = CLAMP((298 * (Y - 16) - 100 * (U - 128) - 208 * (V - 128) + 128) >> 8, 0, 255);
    dst[0] = CLAMP((298 * (Y - 16) + 516 * (U - 128) + 128) >> 8, 0, 255);
    dst[3] = 0xFF;
}

void Convert (int width, int height, uint8_t *y_plane, uint8_t *u_plane, uint8_t *v_plane, uint8_t *dest) {
    uint8_t *y_row1 = y_plane;
    uint8_t *y_row2 = y_plane+width;
    uint8_t *dest_row1 = dest;
    uint8_t *dest_row2 = dest+(width*4);

    // Each pass handles two rows, so we only iterate height/2 times.
    // The inner loop already advances every pointer by one row; the
    // increments here skip the row the other pointer just processed.
    for (int i = 0; i < height >> 1; i++, y_row1+=width, y_row2+=width, dest_row1+=(width*4), dest_row2+=(width*4)) {
        // Unroll the loop: one U/V pair covers a 2x2 block of Y values
        for (int j = 0; j < width >> 1; j++, dest_row1+=8, dest_row2+=8, y_row1+=2, y_row2+=2, u_plane+=1, v_plane+=1) {
            // Process Y1 U0 V0
            YUV444ToBGRA (*y_row1, *u_plane, *v_plane, dest_row1);
            // Process Y2 U0 V0
            YUV444ToBGRA (y_row1[1], *u_plane, *v_plane, (dest_row1+4));
            // Process Y16 U0 V0
            YUV444ToBGRA (*y_row2, *u_plane, *v_plane, dest_row2);
            // Process Y17 U0 V0
            YUV444ToBGRA (y_row2[1], *u_plane, *v_plane, (dest_row2+4));
        }
    }
}
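When calling a converter like this, the buffer sizes follow directly from the 4:2:0 layout. Here is a minimal sketch of that arithmetic, assuming tightly packed planes, even dimensions and a 4 byte per pixel BGRA destination. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte counts for the three source planes and the BGRA destination. */
typedef struct {
    size_t y_bytes;     /* one luma byte per pixel */
    size_t u_bytes;     /* one U byte per 2x2 block of pixels */
    size_t v_bytes;     /* one V byte per 2x2 block of pixels */
    size_t bgra_bytes;  /* four output bytes per pixel */
} yuv420_sizes;

static yuv420_sizes yuv420_buffer_sizes(int width, int height)
{
    yuv420_sizes s;
    size_t pixels;

    assert(width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);

    pixels = (size_t)width * (size_t)height;
    s.y_bytes = pixels;
    s.u_bytes = pixels / 4;
    s.v_bytes = pixels / 4;
    s.bgra_bytes = pixels * 4;
    return s;
}
```

For the 16x4 example image this gives 64 Y bytes, 16 U bytes, 16 V bytes and 256 bytes of BGRA output.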
This works perfectly well, however it's slow. On modern architectures we have SIMD instructions which allow us to do simple math like this on multiple values at the same time. So let's dissect our MMX implementation of YUV2RGB.
Taking our above mathematical representation:
R = 1.164 * (Y - 16) + 1.596 * (V - 128)
G = 1.164 * (Y - 16) - 0.813 * (V - 128) - 0.391 * (U - 128)
B = 1.164 * (Y - 16) + 2.018 * (U - 128)
This floating-point math isn't ideal for SIMD (MMX only operates on integers), so we're going to promote and demote precision by a factor of 64 (multiplying or dividing by 64 is a simple shift by 6 bits):
R = (1.164*64)*(Y-16)/64 + (1.596*64)*(V-128)/64
G = (1.164*64)*(Y-16)/64 - (0.813*64)*(V-128)/64 - (0.391*64)*(U-128)/64
B = (1.164*64)*(Y-16)/64 + (2.018*64)*(U-128)/64
Expanding out our constants:
R_V = 1.596*64 =~ 102
G_V = 0.813*64 =~ 52
G_U = 0.391*64 =~ 25
B_U = 2.018*64 =~ 129
Y_C = 1.164*64 =~ 74
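The constants above can be reproduced with a couple of lines of C. This is only a sketch of the scaling step: the helper names are hypothetical, and truncating toward zero is an assumption chosen because it matches the listed values (1.164*64 is 74.496, which is listed as 74).

```c
#include <assert.h>

/* Scale a non-negative floating-point coefficient to 6-bit fixed
 * point (multiply by 64), truncating toward zero. */
static int to_fixed6(double coeff)
{
    assert(coeff >= 0.0);
    return (int)(coeff * 64.0);
}

/* Undo the scaling for a non-negative value: dividing by 64 is the
 * same as shifting right by 6 bits. */
static int from_fixed6(int value)
{
    assert(value >= 0);
    return value >> 6;
}
```

The price of using only 6 fractional bits is some precision: for Y = 235, from_fixed6(74 * (235 - 16)) gives 253, where the floating-point term 1.164 * 219 is about 254.9.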
We store all these constants in a constant aligned memory buffer declared at compile time so we can do aligned loads from it in SSE mode.
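This leaves us with 3 unknowns for each pixel: the chroma contributions to red, green and blue. We start our MMX/SSE2 loop by calculating the R_V term, the combined G_V and G_U term, and the B_U term for 8 (16 for SSE2) U/V values. We actually only use 4 (8 for SSE2) of these in each iteration, but we can precalculate the next 4 in the SIMD stream and this prevents an unaligned load/store in SSE2. At this point we're left with simple math to calculate the RGB, where R_V, G_V, G_U and B_U now stand for the precalculated terms (each constant multiplied by its V-128 or U-128 value and divided by 64):
R = 74*(Y-16)/64 + R_V
G = 74*(Y-16)/64 - (G_V+G_U)
B = 74*(Y-16)/64 + B_U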
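A scalar C model of that split may help: the chroma terms are computed once per U/V pair and then reused for every Y value that shares it. This is a sketch under stated assumptions and not the Moonlight code. The names are hypothetical, it handles one pixel at a time instead of 8 or 16, and it uses integer division (which truncates toward zero) where a SIMD version would shift. The green term subtracts both chroma contributions, matching the floating-point formula.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point coefficients scaled by 64. */
enum { Y_C = 74, R_V = 102, G_V = 52, G_U = 25, B_U = 129 };

/* Chroma contributions shared by every Y sample in a 2x2 block. */
typedef struct {
    int r;  /* R_V * (V - 128) / 64 */
    int g;  /* (G_V * (V - 128) + G_U * (U - 128)) / 64 */
    int b;  /* B_U * (U - 128) / 64 */
} chroma_terms;

static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Step 1: computed once per U/V pair. */
static chroma_terms precalc_chroma(uint8_t u, uint8_t v)
{
    chroma_terms t;
    int du = u - 128;
    int dv = v - 128;

    t.r = (R_V * dv) / 64;
    t.g = (G_V * dv + G_U * du) / 64;
    t.b = (B_U * du) / 64;
    return t;
}

/* Step 2: cheap per-pixel math, writing bytes in B, G, R, A order. */
static void pixel_to_bgra(uint8_t y, const chroma_terms *t, uint8_t *dst)
{
    int luma = (Y_C * (y - 16)) / 64;

    assert(t != 0 && dst != 0);
    dst[0] = clamp_u8(luma + t->b);
    dst[1] = clamp_u8(luma - t->g);
    dst[2] = clamp_u8(luma + t->r);
    dst[3] = 0xFF;
}
```

With neutral chroma (U = V = 128) all three terms are zero, so Y = 235 comes out as B = G = R = 253.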
After performing this simple math, we do a bit of unpacking to order things into cairo's expected BGRA32 and dump it into the output buffer. Please refer to SVN if you want to see the nitty gritty on how we process 8/16 pels at a time, but basically we unpack the low and high portions and operate on them separately. This gives us the benefit of always being able to do aligned loads/stores for SSE.
For the curious, we do a simple alignment check to determine whether we have to calculate the U/V terms in this loop iteration or not. It looks roughly like this (again for MMX):
mov u_plane, %eax
and $7, %eax
test %eax, %eax
je do_calc
movq backup_r_v, %mm1
movq backup_g_u_g_v, %mm2
movq backup_b_u, %mm3
jmp done_calc
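In C terms, the check above masks the low three bits of the U plane pointer: when they are zero the pointer sits on an 8-byte boundary and the chroma terms are recalculated, otherwise the values saved from the previous iteration are reloaded. A rough equivalent, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when u_plane sits on an 8-byte boundary (its low three
 * address bits are zero), which is the case where the assembly jumps
 * to do_calc. Returns 0 when the backed-up terms would be reused. */
static int needs_chroma_calc(const uint8_t *u_plane)
{
    assert(u_plane != 0);
    return ((uintptr_t)u_plane & 7u) == 0;
}
```

Since the MMX loop consumes 4 U values per iteration, a pointer that starts on an 8-byte boundary passes this check on every other iteration.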
If you're more curious, our implementation is in SVN in yuv-converter.cpp. Feel free to ask me any questions you might have.
PS: If someone wants to contribute a LUT based implementation for our C fallback under MIT/X11, that would be appreciated.
More iPhone
The patches are now in SVN (for mono, the objc2-sharp patches need some clean up). Install the SDK from Apple and run ./build-iphone.sh from our mono directory. You'll need to get an mcs build from another machine and put it on the iPhone as you're cross compiling. Right before I went to lunch some people wanted to see buttons:
Monkeys in your iPhone
I figured I'd clean things up a bit and see the state of the JIT:
Credit to Luke Howard for building this up first. It turns out that Apple is using mostly the ARM EABI. So aside from a few toolchain issues (Apple, please provide a proper preprocessor), it just required a couple changes to exceptions-arm.c to deal with the differences in ucontext.
It's here
It took us a little longer than originally expected, but we've uploaded the Mono.framework which includes Gtk# and MonoDevelop to our website.
Grab it here. Take it out for a test drive. PLEASE report all bugs as this is still a preview release.
Flicker be gone!
I've updated the MWF driver on my blog again with some major changes. Flickering should be gone 100% now.
This version contains a number of other fixes for common problems as well. It unfortunately introduces a few edge cases where drawing can become messed up so I need atomic test cases for these please!
UPDATE: Edited to clarify that I eliminated flickering in MWF-Mac, not on my blog. Thanks hutch.
New Years Thoughts
So while waiting for guests I came across the fact that Apple shipped the Java SE 6 Developer Preview on Dec 19th. Good for them. However we supported Mono on Leopard on Nov 1st (SVN), 5 days after its official release, and shipped our first version Dec 12th.
It's probably worth noting as well: "This version Java SE 6 Developer Preview requires a 64-bit capable Intel-based Mac (a Core 2 Duo or Xeon) and Mac OS X v 10.5.1 or later."
Don't worry PowerPC users, we have a home for you in Mono. Oh and you early Intel adopters at 32-bit, we support you too.
Food for a new years thought.
UPDATE: To clarify the above, I was talking about compiling-from-source compatibility. We ran on Leopard from day 1 with our binaries :)
Whoops
So the package I published earlier today was missing a small portion for D&D support. I've updated the download again and now that is as it should be. Here is the latest.
Updates a plenty
I've published another updated driver here. This fixes the mouse bugs that left you unable to move windows in the previous updated driver, and implements initial support for drag & drop as well as the clipboard. This also marks the point where I'll start encouraging bug reports with test cases on anything except the following issues:
- Flickering. This is a known issue I'll be tackling soon.
- Keyboard. There is a fair amount of work still to be done here.
For any other issues you have, please make a small test case, file a bug and assign it to me, and it'll get fixed for the next updated driver.
Get Summit with it
I'll drop a quick note here for those who didn't see Miguel's entry on the Mac stuff at the summit. We've done a number of cool things recently; probably the largest would be the patchset that landed yesterday that makes MonoDevelop run on the Mac without X11. We're using Imendio's Gtk+ Quartz driver and Gtk# and I must say it runs quite nicely. We're going to ship packages for this, but you're going to need to give us a bit of time to get it all built up.
I also showed off the state of my MWF driver from my tree at the summit, and it's so much nicer than the one in 1.2.6 that I'm going to release some "upgrade" packages to bring your 1.2.6 installs (coming soon to a download location near you) more up to date. I make no promises that these will work (they should though). You can download them here.
More summit info to come including blurry pictures of random awesomeness when I get a spare moment.
Look ma, no X
I took 30 minutes out of my day today from MWF driver land to revisit Gtk# on Mac (Imendio Gtk+ driver).
PS. I have one offer open for the O200 from hub; but if there is a kind Toronto soul willing to donate 4U in a cage we'd still appreciate it. Again, we need practically 0 bandwidth (it'll be a build / development box). My e-mail is on the right hand side of my blog.
Reflector
Now that bug days are over, I'm back full time on the MWF driver. Today I decided to see where we're at with some real world applications. I'm happy to say that Reflector runs without crashing on trunk now. It's by no means perfect but you can get it to show you disassembly:
I'm going to concentrate on the fixes I've got to our event translation stuff and hope to land it before we branch, but if not there is at least a developer preview out there. As always we encourage patches! :)
PS. I'm still looking for 4U in a Toronto colo for this MIPS box. Any kind souls? My e-mail is on the right hand side of my blog.
Mono Leopard Redux
The Mono patch for Leopard listed in my previous blog is now on the trunk. If you have a working glib compile with the patches mentioned in my previous post, Mono will compile for you from trunk on x86 or ppc.
Mono Leopard
So Leopard is out and I finally went and picked up my DVD, so it's time to get Mono in place for it. Thankfully there isn't too much wrong but there are a few caveats.
First, Apple changed some internal structures in their mcontext and thread_state structures. This patch fixes that for you.
Next, the MWF native driver that I'm working on works fine out of the box, except libgdiplus has some issues with Leopard's versions of fontconfig and freetype2. This hack fixes that for you.
Unfortunately it seems our old trick of leveraging fontconfig and freetype2 causes X11 to actually launch now on Leopard. While annoying, it's not really using X11. That's on the TODO list.
There is one more thing you need to be aware of to build on Leopard. Leopard's ld doesn't link glib nicely, so you need to redefine G_INLINE_FUNC from extern inline to static inline in gutils.h.
UPDATE2: Patches are cross-platform now and correct. Same bat-link.
Help Needed
Are you in Toronto with a free 4U? I have an Origin200 that needs a home. We need practically 0 bandwidth (it'll be used as a Mono build-bot) but we need some U's and power. Got some space to donate for us? Drop me an e-mail.
News as promised
So it's exciting times for me as I said. News #1 as promised? I have accepted a position with Novell on the Mono team. I suppose this means Miguel wins. He's been telling me for years that he'd eventually get me employed on the Mono team.
More to come...
Back in the saddle again
Ever since a server move several months ago I hadn't set up my personal domains and/or blog. Due to some interesting news in the coming weeks it's time for this to change.
Will I keep up with it this time? Tune in for our next installment soon.
I think I've finally managed to overcome one of my biggest hatreds of blogging, however. Why do I need a whizz-bang CMS to blog? Vim works fine for me, thank you. Thus, as the astute among you will notice, I'm using LameBlog and so far am quite happy. Want to design me a new skin?






