Posts tagged with "xml"

If this format doesn’t work, you’re not using enough of it.


Replacing RSS author with Dublin Core's dc:creator

Photo of Dublin I took in 2010

When Dave Winer introduced RSS 2.0, he included a number of new tags, one of which was author, which is designed for an email address:

<author>123@fakestreet.springfield (Ruben Schade)</author>

This probably made more sense back then than it does now. While it'd be lovely to contact the owner of a feed, it's mostly just a huge spam target bullseye.

We have four options (probably more). We can use our regular email address here, and hope our increasingly sophisticated spam filters can handle the sudden and inevitable onslaught. We can pollute our metadata with a fake address, or defeat the purpose of the tag with a junk address we'll probably never check.

A fourth option is to take the RDF-spirited approach and import the Dublin Core namespace:

<rss version="2.0" 
    xmlns:dc="http://purl.org/dc/elements/1.1/">

With it, we get the lovely dc:creator tag which lets us do this:

<dc:creator>Ruben Schade</dc:creator>
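For anyone generating their feeds with a script, here is a minimal sketch of the same idea in Python using the standard library's ElementTree. The function names, the channel details and the example.com post are all made up for illustration; only the namespace URI and the dc:creator element come from the snippets above.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Register the prefix so the output says dc: instead of an auto-generated ns0:
ET.register_namespace("dc", DC_NS)


def build_item(title, link, creator):
    """Return an RSS <item> that credits the author by name via dc:creator."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # Namespaced tags are written as {namespace-uri}localname in ElementTree
    ET.SubElement(item, f"{{{DC_NS}}}creator").text = creator
    return item


def build_feed(items):
    """Wrap the items in a bare-bones RSS 2.0 channel and return it as text."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # Hypothetical channel details, here only to keep the feed well formed
    ET.SubElement(channel, "title").text = "Example feed"
    ET.SubElement(channel, "link").text = "http://example.com/"
    ET.SubElement(channel, "description").text = "A feed using dc:creator"
    channel.extend(items)
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_feed(
    [build_item("Example post", "http://example.com/post", "Ruben Schade")]
)
print(feed_xml)
```

ElementTree adds the xmlns:dc declaration to the root element by itself once a dc: element appears anywhere in the tree, so there is no need to write it by hand.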

There is some semantic impact to this; an author isn't the same as a dc:creator. In some ways however, I think it's superior. Author asserts a person is defined by their email address. dc:creator assumes an author's unique identifier (if you will) is their name.

It probably comes down to personal preference above all else; I for one prefer the latter. To be fair though, I have an unusual name combination!

Software packages like WordPress have been doing this for years, and I've finally decided to implement it myself on the feeds I generate. I try to avoid importing namespaces in RSS 2.0 when I can, but this is a useful addition. Plus then I get all the extra Dublin Core goodies ^_^.

Photo taken by me of the Dublin skyline in 2010. Ireland is so beautiful!


Google Reader should update feed addresses

Google Reader redirect example

Here's an idea for a Google Reader feature that in my opinion is long overdue. If Reader attempts to fetch a web feed and encounters a 301 permanent redirect to a legitimate new address, it should update its own records in user accounts to point to the new address instead of still pinging the old one.

I ask for my own selfish reasons, because as of now more people are still subscribed to my blog through the old http://rubenerdshow.com/blog/feed/ address than through http://rubenerd.com/feed/. Each request to the old URI takes more effort and bandwidth than one to the new URI, and I've noticed items that appear in the new feed instantly can sometimes take an hour or longer to appear in the old one. An automatic update would fix this.
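To make the suggestion concrete, here is a rough Python sketch of the rule an aggregator could apply after each fetch. This is not how Google Reader works internally (I have no idea how it does); the function name and its parameters are hypothetical, and it assumes the aggregator already has the HTTP status code and Location header from its own fetch.

```python
from urllib.parse import urljoin

# Status codes that mean "this resource has moved for good"
PERMANENT_REDIRECTS = {301, 308}


def updated_feed_url(stored_url, status, location):
    """Return the URL a subscription should store after one fetch.

    stored_url: the address currently saved for the subscription
    status:     HTTP status code the fetch returned
    location:   value of the Location header, or None if absent
    """
    if status in PERMANENT_REDIRECTS and location:
        # Location may be relative, so resolve it against the old address
        return urljoin(stored_url, location)
    # Temporary redirects (302, 307) and everything else keep the old record
    return stored_url


print(updated_feed_url("http://rubenerdshow.com/blog/feed/", 301,
                       "http://rubenerd.com/feed/"))
print(updated_feed_url("http://rubenerdshow.com/blog/feed/", 302,
                       "http://rubenerd.com/feed/"))
```

The first call returns the new address, the second keeps the old one because a 302 only promises a temporary move.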

Good idea?


Uh oh, I killed The Google Readers

Google Reader

As I've alluded to previously, I gave up on Firefox 3.5.x on my MacBook Pro OS X and FreeBSD partitions because it was far too unstable to use without going bat crazy insane. I left Windows for a reason!

For some reason though, going back to 3.0.x has caused Google Reader to generate a few errors a day after not having any trouble at all. It could very well be a problem with our home internet connection here and not Firefox, but it is a weird coincidence.

If it weren't for the fact all my friends from Twitter, Whole Wheat Radio and the real world used it I'd probably go back to Bloglines full time. In fact at one point I was going to research whether I could subscribe to people's Google shared items and comments in Bloglines and have people subscribe to my Bloglines shared items and comments from Google Reader. Might be worth looking into again.


iTunes Rubenerd Show problems

I've figured out why some iTunes users have been reporting problems with subscribing to the Rubenerd Show through the iTunes Store. I deleted my own subscription, searched for "Rubenerd Show" in the iTunes Store and resubscribed, only to be given a small circle and an exclamation point.

When I right clicked and chose "Show Description" I was given the above dialog box. No wonder it isn't working, it's trying to access new shows from http:///show/feed/ for some reason!
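The broken address is missing its host name entirely, which is easy to check for in code. Here is a small Python sketch using the standard library's URL parser; the has_host helper is a made-up name for illustration, and it says nothing about how iTunes itself validates feed addresses.

```python
from urllib.parse import urlsplit


def has_host(url):
    """Return True when the URL actually names a server to contact."""
    # hostname is None when nothing sits between "//" and the path
    return bool(urlsplit(url).hostname)


print(has_host("http:///show/feed/"))              # False
print(has_host("http://rubenerd.com/show/feed/"))  # True
```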

I don't know how or why this happened. I'll be contacting Apple about this to see if I can get it pointing back to the proper URI again. I believe my good friend Felix Tanjono submitted my podcast to the iTunes Store in 2005, back when Australia and Singapore didn't have access to it.

While I'm sorting this out you can still go to iTunes, choose the Advanced menu and click Subscribe to Podcast, then enter the following address as a stopgap:

http://rubenerd.com/show/feed/

Sorry about this, I don't know how this could have happened :-(.


Servage and I are officially no more!

Great Servage graphic from WeAreMovieGeeks.com

For those of you subscribed to my blog through an aggregator using the old URL for the RSS feed instead of the new one, you may have noticed four recent posts with identical timestamps. You probably don't care why this happened, but I'm so excited I just have to relay it!

When I moved from Servage to Segment Publishing because the former was absolutely awful and because I had so much success with the latter for other projects, I also took the opportunity to move the site back to Rubenerd.com which I had previously lost to domain squatters. RubenerdShow.com was still with Servage, but all it contained was a simple .htaccess script to redirect all requests to the new domain.

Well as of today, I finally got around to moving the domain off Servage and onto Segment Publishing including said text file. This means, FINALLY, I am completely, 110% off Servage. I don't have anything hosted with them whatsoever. Clear as mud!

I'm putting the finishing touches on my post detailing what an awful webhost Servage is but it won't be ready for a few days. Another post I'm positive you're all anxiously awaiting ;-).


Web aggregators: the chocolate shop problem

Max Brenners at the Esplanade in Singapore, by Angie Teo

One of the problems with using a feed aggregator or blog reader is you tend to act like a kid in a chocolate shop: you just keep adding and adding feeds because they're free and they're full of goodness until one day you're subscribed to so many feeds and you're getting so many entries you start to drown. As a result you start to click the "Clear Unread Items" or equivalent more often than you'd care to admit.

I've never understood why blog aggregators must treat each item as if it were an email or to do list item in dire need of my attention. When I read a newspaper or magazine I don't read every article or story, I only read what's interesting to me. I guess the comeback to that would be that if you receive too many email messages you only start reading ones you find interesting or necessary, but I think that's pushing it.

If the "to do list" metaphor for stories is so flawed though, what do we replace it with?

Bloglines unread items
Whoops!

As with a newspaper, unless we specify we want to keep something or share it with friends, we probably don't want to read the same story twice. By greying out an item from our subscribed feeds our software is telling us we don't need to read that material any more because we've already seen it. Short of deleting a story altogether from our own cache of previously read articles, this is probably the most logical thing to do.

ASIDE: Notice my careful wording above: I said the software tells us that we've already "seen" a story, not read it. Unfortunately we've only scratched the surface here. Should our software be able to tell whether I just skimmed an article, just looked at the pictures or read it in full? Could it have a timer perhaps? I'm getting in way over my head!

That's not to say though we want to be prompted in the opposite way if we haven't read an item, because again to me that's akin to the software telling me I'm slack that I haven't read every single story, which I don't want to do. But then again, it's useful to tell me what I haven't read, otherwise how do I know what's new? Bummer, we're back where we started!

I've often heard it said that one of the strengths of computers is their ability to process large volumes of data in an instant that would take a human an eternity. Silly jokes about politicians and physical education teachers aside, as humans we have the upper hand in having intelligence. The fact that so called "tags" and "categories" even exist for posts and other media online shows that artificial intelligence still has a long, long, long way to go. And I mean a LONG way. A computer can download every news story and media item from hundreds of feeds to my aggregator every time I check my browser and perhaps do some rudimentary filtering based on what I've previously read or what I've defined as my topics of interest, but its speed and accuracy abruptly stop there. "Rudimentary" is the operative word.

I have a lot of reading ahead of me!
Whoops!

Perhaps it's not the software that needs retraining, it's us. Perhaps I need to train myself to stop subscribing to every single news feed I come across with the thought in the back of my mind that my aggregator will handle it for me somehow. Because every morning when I wake up, turn my computer on and am told that I have 1000+ unread stories along with comments from friends for several dozen of them, I end up reading just the latter, a few other bits and pieces, then leaving. I reckon if my Google Reader and Bloglines accounts told me exactly how many items I've failed to read over the years, the integer would be of sufficient length that if I had that amount in my bank account, I could purchase myself a small planet and retire there.

I haven't even touched on the problem of missing out on good stories I should have read because there's so much other stuff crowding around them, but I suspect if you've read this far and use an aggregator yourself you don't need me to elaborate any further!

As I've alluded to previously, what I really need is an electronic secretary of some sort who picks out important blog posts, emails, Tweets and so forth, then sends them to me in an email for me to skim each morning. Technologies like RSS and Atom allow us to deliver that material, but after that computers still have a long way to go.

Thesis material perhaps?


Podcast feed has been fixed

I was alerted by Todd and Kaede that the Rubenerd Show RSS podcast feed thingy wasn't working properly for them. I had a look on my server and it turns out the RSS feed PHP file was corrupted when I upgraded to the latest version of WordPress last night. As far as I can tell that was the only file affected, but to play it safe I re-uploaded every server-side file.

Hope that clears it up for people, sorry for the trouble.


Camino and Google Reader atom problems

Sharon777 on Twitter pointed out a possible problem with either the Camino browser or Google Reader. If you use Camino to browse someone's Google Reader Shared Items page (such as mine or Whole Wheat Radio's), a web feed notification icon doesn't appear in the address bar:

Google Reader in Camino not showing a web feed icon

However if you click View Page Source in the View menu, you can clearly see the link to the web feed:

Google Reader in Camino not showing a web feed icon

I can't really think why it shouldn't find it. Perhaps Camino has trouble with Atom feeds as opposed to RSS. When I have some more time I'll see if I can reproduce the error somehow.
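For reference, browsers usually find feeds by looking for link elements with rel="alternate" and a feed MIME type in the page head. Here is a rough Python sketch of that kind of autodiscovery using the standard library's HTML parser. It is not Camino's actual logic, and the sample page with its example.com address is invented for illustration rather than copied from Google Reader.

```python
from html.parser import HTMLParser

# MIME types commonly used to advertise Atom and RSS feeds
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (type, href) pairs from <link rel="alternate"> feed tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None for bare attributes, so default to ""
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        feed_type = attributes.get("type", "").lower()
        href = attributes.get("href", "")
        if "alternate" in rels and feed_type in FEED_TYPES and href:
            self.feeds.append((feed_type, href))


def find_feeds(html):
    """Return every advertised feed in the given HTML text."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page source, standing in for a shared items page
SAMPLE_PAGE = """<html><head>
<title>Shared items</title>
<link rel="alternate" type="application/atom+xml" href="http://example.com/shared/atom" />
<link rel="stylesheet" type="text/css" href="style.css">
</head><body></body></html>"""

print(find_feeds(SAMPLE_PAGE))
```

A finder that only checked for application/rss+xml would skip the Atom link in a page like this, which is one plausible way a browser could miss a feed that is clearly there in the source.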


The Internet Explorer Q Continuum

As you may have gathered from reading previous posts, I'm a Mac OS X user on laptops and a hopeless FreeBSD fanboy on desktops. Therefore it probably wouldn't surprise you to find out I'm not a fan of Internet Explorer, or Windows Internet Explorer, or Chuck Norris Explorer or whatever they're calling it at the moment.

Why though? Is it the fact that it successfully and demonstrably held back innovation on the intertubes for so many years? Is it the silly user interface in version 7, which has people constantly calling me to ask how they get to the menu bar? Is it the fact its CSS support is so patchy and inconsistent it makes a part of my work even more difficult than it has to be? Is it because it was bundled with a monopolistic operating system? Is it because the e logo just looks plain silly?

No. It's for one simple fact: Internet Explorer doesn't support the <q> tag!

Look at that browser Jean Luc, it doesn't support my existence!

You could be forgiven for not knowing about this tiny little tag. It was included by the W3C back in the HTML 4.0 specification in 1997 to delimit small inline quotations which are not large enough to justify the use of a block level element, but even in 2008 current versions of IE are the only browsers not to support it, despite every other game in town having no trouble with it.

For example, one of the sentences below is enclosed in <q> tags. If you're using Internet Explorer they will look exactly the same:

Ruben Schade is an incredibly smart, devilishly attractive and very self deluded person.

Ruben Schade is an incredibly smart, devilishly attractive and very self deluded person.

But why is the lack of support for a seemingly insignificant and easily replaceable tag my number one gripe with Internet Explorer? Because of its stupefying simplicity! How difficult would it have been for Microsoft to have added full support for such a simple tag? It's mind blowing!

<q>This is an inline quote, complete with CSS support!</q>

<span class="quote">Here's another inline quote, but with support for IE </span>

I guess until Internet Explorer 12 comes out some time after 2095, I'll have to stick with using the latter example above. What a mess!

FOOTNOTE: For what it's worth, Opera, Mozilla Firefox, Netscape Navigator (rest in peace), Apple's Safari, KDE's Konqueror and even zippy little dillo, links and lynx support the <q> tag. Obviously it's not hard!



Insanely useful webapp to convert PDF to SVG

Sometimes you find a webapp on the intertubes that's so useful you'd be charged for criminally negligent conduct if you didn't tell other people about it. As far as I understand it.

In this case it's Texterity's FreeSVG service which takes PDFs you upload, breaks them down into their individual components (including text, images, fonts and vector drawings), converts it all into an SVG file, bundles all the material into a zip file, then emails you a 48 hour link to fetch it. Very, very nice.

Choosing the file to upload and convert

From their site:

FreeSVG is provided by Texterity to encourage the use of SVG (Scalable Vector Graphics) on the web and enable free, highly functional conversion of PDF documents into fully accessible and navigable SVG.

SVG is an industry standard, W3C (World Wide Web Consortium) specification describing text and graphics in XML. To learn more about SVG visit the W3C site at http://www.w3.org/Graphics/SVG/.

Now obviously for security reasons you'd definitely not want to use this service for confidential or sensitive information, but for general use it's a fantastic solution, and has saved my arse many times!