Mahalo is just plain silly

Software

After reading Dave Winer rant on about how silly the new startup Mahalo is, I decided to give it a shot. Less than a minute into my exploration you can colour me even more unimpressed. Not just because the theory behind it is iffy, but simply because their results suck!

I'll explain why, with some sidebar images of actual results from their site.

Konqueror on Mahalo

Konqueror doesn’t have a search result page, and doesn’t even have any Related Result Pages. Ironically they use the results from Google to hide the fact they returned nothing themselves.

I guess it is a fairly obscure search. I’ll try FreeBSD next.

Konqueror results from Google: About 7,620,000

If you haven't heard of Mahalo (and let's face it, they've focused their marketing efforts only on the US) it's a new human-edited web directory based not on categories of links but on search terms. Essentially it's the same thing as Dmoz and what Yahoo! used to be, and uses the same human element. You know, the way we used to do it before we realised how much more efficient computers are at gathering and organising large volumes of information.

FreeBSD on Mahalo

FreeBSD doesn’t have a search result page, and doesn’t even have any Related Result Pages. Ironically as before they use results from Google.

I guess being a critical engine that powers millions of websites and servers across the planet isn’t enough to warrant a search result page. Let’s try something more generic next… what about Singapore?

FreeBSD results from Google: About 39,200,000

I've got to hand it to them for taking on the gargantuan task of creating a search engine where every single search query you could conceive has already been thought of and has had a page of links created for it, not to mention the equally daunting task of making sure each of these billions of pages is kept up to date and relevant in a world where information is updated and changed every minute.

Singapore on Mahalo

Okay we’re getting a bit back this time, but it’s only silly tourist information! What about at least an information bar or economic data or something?

I guess being an Alpha World City isn’t enough to warrant a dedicated search result page either. Okay fine, what about Australia? It’s bigger and more well known…

Singapore results from Google: About 224,000,000

To me it's a nirvana-like ideal: as humans we know what we want and need, so as humans we are the most capable of fulfilling those needs and wants. The reality as I see it though is that it's an unrealisable fantasy to think we could ever match computers in this field. Heck, that's one of the reasons why we invented computers in the first place: to take care of these repetitive tasks for us accurately and quickly, isn't it?

Australia on Mahalo

Okay now we’re really being silly. As with Singapore we’re given links to Related Pages which give us tourist information and an article about Horse Flu, but still nothing actually useful about the country itself. Nothing.

To see where they place their priorities, let’s see their results for Paris Hilton after this.

Australia results from Google: About 391,000,000

Don't get me wrong, I think search engines still have a long way to go in terms of usability and search result relevance, but I surely don't think the answer is to give up on computers and do it ourselves.

Paris Hilton in Mahalo

So the truth comes out. They have no pages for Singapore, Konqueror, FreeBSD or Australia, but they have a thorough, detailed and comprehensive search result page for Paris Hilton.

Paris Hilton results from Google: About 38,900,000

So let's look at these results. Going by Google's counts, Australia has roughly ten times as many results as Paris Hilton and Singapore nearly six times as many, and FreeBSD and Konqueror are no lightweights themselves, yet Paris Hilton was the only one of the five with a dedicated Mahalo page.

Now I understand that generating billions of pages and keeping them all up to date and relevant is a huge undertaking, so it's only natural to expect fewer results for queries than Google returns. It's also true that not all of Google's results are relevant themselves, but come on, this is ridiculous.

I think that last search result says it all! Google has absolutely nothing to worry about as far as Mahalo is concerned.


Earthquake hits Indonesia, rocks Singapore

Thoughts

LATEST, THURSDAY MORNING: The Indonesians are feeling aftershocks and so are we here. I definitely felt the one this morning.

An earthquake rocked Indonesia this afternoon and we in Singapore felt it too. Actually I didn't; maybe I was on the MRT at the time and didn't feel any difference. But it was enough for the Singaporean authorities to evacuate people from downtown:

SINGAPORE : Singapore buildings swayed after an earthquake hit Indonesia on Wednesday evening.

Residents in various parts of the island felt the quake and people in some buildings, including in the central business district, were evacuated as a safety precaution.

Areas in Singapore which felt the tremors included Novena, Pasir Ris, Raffles Place, Potong Pasir, Marsiling, Toa Payoh and Thomson Road.

Singapore’s Meteorological Services said the earthquake measured 8.5 on the Richter Scale. The preliminary reading was 7.9.

The earthquake struck out at sea at 7.10pm. Its epicentre was 120 kilometres south-west of the Sumatran town of Bengkulu, at a depth of 15 kilometres.

This is some 670 kilometres from Singapore.

~ Channel News Asia

Indonesia has sent out a tsunami alert; let's all hope those poor people are spared too much damage. God knows they've suffered enough from huge waves in the last few years. The 2004 Asian tsunami was one of my first posts on this weblog.


Rainer Schade’s open source juice

Internet

Rainer Schade's Patented Ultimate Juice

What do you do with a whole fridge of fruit when you only have a day to eat it all? You make Rainer Schade's Ultimate Juice (patent pending). As with all enlightened individuals, my father has agreed to release the source code:

  • one banana
  • two Fuji apples
  • one large kiwifruit
  • quarter papaya
  • one carrot

I suggested adding Red Bull to it, but he refused. I might have to fork from the main project.


Maths spam experiment kaput

Internet

Swollen head
Spam makes my head swell, and not because I bought dubious medication advertised in said spam which turned out to have scary skull swelling side effects. Was that fancy alliteration or what?

After a couple of weeks using a WordPress plugin which would ask you to answer a mathematics question before you could post, I've decided to take it off. It didn't really do anything to stop pesky spam getting through and just caused problems for some people.

I'm really torn about what to do. This is a pretty meek weblog in the grand scheme of things and the Rubenerd Show is in a small niche, but sorting through all these messages every week is becoming a full-time job in itself. That said though, for now I can't think of any alternatives.

WordPress allows me to automagically authorise any messages you post if you've posted a message in the past I approved, but if you're a first time poster you'll have to wait for me to allow it… just like before. Sorry lah.

Guess it'll give me more material for my silly spam commentary :).


Thinning universal binaries with ditto

Software

In November of 2006 I uploaded a post called A Closer Look At Apple’s Universal Binaries where I tried to describe what UBs are and how to use the lipo command in the Terminal to remove unnecessary code. Since then I've learned a bit more, and have found a slightly easier way to do it.

Previous post in a nutshell: If you've used Mac computers at all since 2005 you're probably aware of Universal binaries, the fancy name Apple gave to applications that have native code for PowerPC and Intel processors. While they really simplify distribution, they store code on your machine you don't actually need.

Fortunately in Tiger Apple bundled the ditto (and lipo) utilities, which you can use to create a thin version of a universal binary that only contains code for your processor.

TAKE NOTE! Some older applications that are only compiled for PowerPC CPUs require shared libraries or resources from other applications, some of which may have been updated as Universal. Therefore if you start deleting PowerPC code from them, you may start breaking things. If you're not sure, always keep the original universal binary just in case you need to restore it!

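# Keep only the Intel (i386) code:
ditto --arch i386 FooBar.app ThinFooBar.app
# Or keep only the PowerPC code:
ditto --arch ppc FooBar.app ThinFooBar.app

The first command thins FooBar.app down to only its i386 code; the second preserves only the PowerPC code. Run whichever one matches your processor, not both.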

Shakugan no Shana
The use of an unnecessarily long sword to split universal binaries is not recommended… unless you're a flame haze… I'm sorry, but I can't stand dry weblog posts that don't have pictures. The thought of weblog posts without pictures keeps me awake at night, as I'm sure it does for you too.

Given my current obsession with tables, below is a selection of my favourite Mac open source applications I thinned down on my MacBook Pro to compare the difference between their universal and thin binaries:

Application | Version | Fat universal | Thin i386 | % original | % saved
VLC | 0.8.6c | 75.2 MiB | 43.1 MiB | 57.31 | 42.69
Camino | 1.5.1int | 53.7 MiB | 34.2 MiB | 63.69 | 36.31
Inkscape | 0.45.1 | 84.1 MiB | 56.4 MiB | 67.06 | 32.94
iTerm | 0.9.5.x | 4.1 MiB | 3.2 MiB | 78.05 | 21.95
Gimpshop | 2.2.11 | 191.0 MiB | 149.3 MiB | 78.17 | 21.83
Smultron | 3.1 | 10.1 MiB | 9.4 MiB | 93.06 | 6.94

And here's a similar table looking at bundled Apple applications:

Application | Version | Fat universal | Thin i386 | % original | % saved
Safari | 3.0b3 | 6.5 MiB | 5.1 MiB | 78.46 | 21.54
iTunes | 7.4.1 | 113.0 MiB | 98.8 MiB | 87.43 | 12.57
Terminal | 1.5 | 5.0 MiB | 4.6 MiB | 92.00 | 8.00
TextEdit | 1.4 | 2.2 MiB | 2.1 MiB | 95.45 | 4.55
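
If you want to work out the two percentage columns for your own applications, here's a rough Python sketch of the arithmetic. The function name thinning_stats is just something I made up for illustration, and it assumes you've already measured the fat and thin sizes yourself (in MiB or any other unit, as long as both use the same one).

```python
def thinning_stats(fat_size, thin_size):
    """Return (% of original kept, % saved), each rounded to two decimals.

    Both sizes must be in the same unit; the name and signature are
    illustrative only and not part of ditto or lipo.
    """
    if fat_size <= 0:
        raise ValueError("fat size must be positive")
    pct_original = thin_size / fat_size * 100
    return round(pct_original, 2), round(100 - pct_original, 2)


# Sizes in MiB copied from the tables above.
print("VLC:   ", thinning_stats(75.2, 43.1))
print("Safari:", thinning_stats(6.5, 5.1))
```

The second value is simply 100 minus the first, so the closer the thin size is to the fat size, the less there was to strip out.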

Some pretty interesting results, the most noticeable of which, I would think, is that none of the applications reached a 50% reduction in file size even though half their compiled instructions were removed. This is due to applications having shared resources such as images, text files and whatnot that are used by both the PPC and Intel code.

In this case, we can see that VLC, Camino and Safari had a sizable amount of specialised code, whereas the bulk of the TextEdit and Smultron applications consisted of shared resources. We can infer just by looking at these results that rendering video and webpages requires more processor-specific instructions compared to, say, a text editor.

Plus it gave me the chance to show some Mac icons. I use KDE on FreeBSD and NetBSD and have used all the flavours of Windows at some point, and the Mac is still the prettiest ;).


MTV ads on Twitter?

Internet

MTV advertisement on the side of Twitter

We may be starting to see the introduction of ads on Twitter. Around 22:00, we all saw a link for the MTV Twitter account on the site’s sidebar.

On most sites, this wouldn’t be such a big deal. On Twitter though, we have a different situation. If MTV were really something Twitter users were interested in, wouldn’t people have just tweeted about it?

I suppose Twitter have to make money somehow. But MTV, really? Who cares?


Review of Cranky Geeks 080

Media

Cranky Geeks

Cranky Geeks is one of the best video podcasts I watch… probably because it's one of the only video podcasts I watch. No but seriously it's a fantastic show, I encourage you to check it out especially if you enjoy lighthearted and cranky discussion of tech trends and the well-deserved ridiculing of stupid news stories.

This is my review of Episode 080, dated the 4th of September 2007.

Guy Kawasaki on Cranky Geeks

"Yeah well the Zune phone STARTS in the toilet!" Guy Kawasaki definitely seemed like he was having a great time, I wish I could have been there… unless that would mean I have to use one of those Vista microphones he mentioned!

Adam Curry on Cranky Geeks

Adam Curry dragged on a tad (as I seem to always do) with some of the points he was making, but he was definitely an interesting guest and a great guy to have on considering the discussion points. He made a great point about crime in the US versus the Netherlands, which I think could easily apply to many, many places. And I hadn't thought about not being able to drive away in a flying vehicle after landing at an airport. That would be a real bummer.

Sebastian and Adam's point that the "green" label is being used to sell things is increasingly true in so many consumer products, but as with both of them I can't help but feel skeptical at the same time. Woolworths in my birth country of Australia was recently busted because they claimed their tissue products were from sustainable forests when actually they were from endangered rainforests in Indonesia. If a company is sincere in its efforts to be greener that's great, but if they're just using it as a marketing ploy without much real substance it's a bit of a worry.

Sebastian Rupley, the Co-Crank on Cranky Geeks

Sebastian was really sharp this episode; he really looked as though he knew what he was talking about. Not that he usually doesn't, that didn't come out right! I thought his comments about blog linking, spamming and Wifi were right on the mark.

What's with the "paper" newspapers though? Do they still make those stone-tablet-era things? And does anyone still use Yahoo Messenger anymore? Or ActiveX? Or Monster? Or flying cars?

I was in Malaysia when the DVD sniffing dogs were there and it really seems like their authorities are finally starting to crack down on piracy. It's still much easier to find pirated discs there than in most places, but many of the shopping centres that used to be full of discs are being boarded up. Whether this is just a token move like the Russians shutting down AllOfMp3 to appease American copyright owners or whether it's a genuine effort (pun intended) remains to be seen.

John C. Dvorak on Cranky Geeks

I couldn't care less about American football (I'm a nerd at university and jocks are my sworn enemies) but perhaps not wearing the suit jacket allowed John to be a bit less formal. Hookers. He certainly looked better this episode too because I watched this episode on my laptop instead of my iPod. That didn't come out right either.


Coffee, free wifi and Twitter

Internet

Morning coffee, free wifi, Twitter

If there's any better way to start a day, I sure as hell don't know about it.


Rzip is absolutely incredible

Software

mikuru.jpg
Mikuru tried to compress my files too using her superpower energy. Rzip still worked better.

After reading an old post on Jeremy Zawodny's weblog and installing it myself, I have to say Rzip is my new favourite compression algorithm!

From the developer's website:

rzip is a compression program, similar in functionality to gzip or bzip2, but able to take advantage of long distance redundancies in files, which can sometimes allow rzip to produce much better compression ratios than other programs. The original idea behind rzip is described in my PhD thesis.

For a bit of real world testing, I decided to try compressing the www folder in my home directory on my MacBook Pro. I thought this folder would be a useful test because it's relatively large and contains a few large files mixed in with hundreds of smaller ones. From what I understand of compression algorithms, they each tend to favour compressing certain types of files and in certain quantities so I figured this way it would show a more balanced result.

The original folder size was 436.0 MiB with 312 files. The Tape Archive is the control because it's needed for all but ZIP to archive the files before they can be compressed. For convenience the names also redirect to their associated Wikipedia pages.

Algorithm | File name | File size | % of original | % saved
Tape Archive | www.tar | 423.9 MiB | - | -
ZIP | www.tar.zip | 290.9 MiB | 68.62 | 31.38
Bzip2 | www.tar.bz2 | 286.3 MiB | 67.72 | 32.28
GNU zip | www.tar.gz | 284.8 MiB | 67.54 | 32.46
Rzip | www.tar.rz | 104.7 MiB | 24.70 | 75.30

What's curious is that Gzip was more efficient than Bzip2; in almost every other circumstance I've come across the reverse was true. I'm not sure how much that affected the results of the other formats. The final result is clear though: Rzip was able to squash like nobody else!

steamroller.jpg
Image © Jan Mehlich, from Wikimedia Commons. As with the image above, I thought it was mildly amusing given the subject matter. I hate dry weblog posts without pictures you see.

From what I can make out from reading the developer's website (and with help from dadaist in real time on Twitter), Rzip isn't an entirely new compression algorithm per se. It essentially just uses larger chunks of data over much longer distances, and then uses existing algorithms to process it all.
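
To get a feel for why those longer distances matter, here's a small Python sketch. It doesn't use rzip at all: I'm assuming the standard library's zlib (the same DEFLATE method as gzip, which can only look back 32 KiB) and lzma (whose default dictionary is several MiB) as stand-ins for a short-range and a long-range compressor. The test data is made up: a block of pseudo-random bytes followed by an exact copy of itself, so the only redundancy sits 100,000 bytes back.

```python
import lzma
import random
import zlib

# Build 100,000 pseudo-random (effectively incompressible) bytes,
# then repeat them once. The seed just makes the run repeatable.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(100_000))
data = block + block

# DEFLATE can only refer back 32 KiB, so the second copy is out of reach.
deflate_size = len(zlib.compress(data, 9))

# LZMA's larger dictionary can reach the first copy.
# This is a stand-in for long-range matching, not rzip itself.
lzma_size = len(lzma.compress(data))

print(f"input:   {len(data)} bytes")
print(f"deflate: {deflate_size} bytes")
print(f"lzma:    {lzma_size} bytes")
```

The short-range compressor should end up at about the size of the input, while the long-range one should land near half of it, because it only has to store the random block once.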

I theorise from reading up on this that only in the last decade have computers had enough processing power, and more importantly memory, to be able to pull this off. 900 MiB of look-back space is great for compression, but can suck up all your resources pretty fast if you don't have much memory to spare. This is why we haven't seen this level of compression until recently.

In any case, I know what I'll be using to compress all my large files and folders with now :).


Twitter down, internet addiction!

Internet

I guess this is what you get when you live on the other side of the planet from the web service you're maniacally addicted to: they take their site down for maintenance during their night time, which for us here in Asia Pacific is the early afternoon!

Oh no: Twitter outage!

I can remember back a few years ago when I wanted everything hosted on my own website with my own software so I would only be dependent on the web host I was on at the time, and that's it. Now I have Twitter (and Jaiku feeding off that), Flickr, del.icio.us… not to mention my Google calendars, documents and search history. It's actually mildly scary when I think about it; I don't care about the privacy concerns (yet), I'm more worried about my important data being spread all over the place on so many different services, any of which could disappear at any time.

Or maybe I'm just having the inevitable Twitter withdrawal jitters. It's very hard to tell at this stage. I think I need a very stiff, black cup of coffee!

Twitter being upgraded

SIDENOTE: I had no idea how useful Flickr is for screenshots as well as photos. Previously I would have to take the screenshot, convert it to PNG and a scaled down JPG, upload both to my server and link them up. With Flickr I just have to upload the master and it does all the heavy lifting. Very nice.