Posts tagged with "formats"


Every friggen page is now XHTML 1.0 Strict

The W3C Validator
As of February 2009, page 145 is one of the last ones!

Checking, correcting and validating malformed (or in most cases typo-filled) XHTML code is a very useful thing to do when you're sick: it takes almost no brains whatsoever, and it keeps you occupied instead of watching daytime television.

I can now say with 99.95% certainty (my solicitor advises me against ever being 100% certain) that every friggen page and post on this blog is valid XHTML 1.0 Strict. Sheesh.

Valid XHTML 1.0 Strict

Tentatively, every page and post is valid XHTML 1.1 as well, with two caveats:

  1. I don't have an XML declaration in the first line of the source so I don't trip Internet Explorer's Quirks Mode
  2. Pages are still being served with the text/html MIME type instead of the technically correct application/xhtml+xml, again for compatibility with even the latest versions of Internet Explorer such as 7 and 8 Beta.

Both of these could be handled on the web server: depending on which browser the client is using, you could add the declaration and change the MIME type on the fly (I believe that's the W3C recommendation with XHTML 1.1). It seems like a bit too much trouble to deal with now, but I'll be keeping my eye on it.
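As a rough illustration of that server-side switching, here is a minimal Python sketch. It assumes the decision is made from the request's Accept header, on the basis that browsers which can handle application/xhtml+xml list it there; the function names are hypothetical and not part of any particular web server.

```python
XHTML_TYPE = "application/xhtml+xml"


def choose_content_type(accept_header):
    """Pick a MIME type for an XHTML page based on the Accept header."""
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        if fields[0].lower() != XHTML_TYPE:
            continue
        # A quality value of 0 means the client explicitly refuses the type.
        quality = 1.0
        for param in fields[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return XHTML_TYPE
    # Anything that does not ask for XHTML gets plain text/html.
    return "text/html"


def xml_declaration_for(content_type):
    """Only emit the XML declaration when serving real XHTML."""
    if content_type == XHTML_TYPE:
        return '<?xml version="1.0" encoding="UTF-8"?>\n'
    return ""
```

A server would call choose_content_type() once per request, set the Content-Type header from the result, and prepend whatever xml_declaration_for() returns to the page.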

Obligatory Microsoft rant

Die IE!

Microsoft was sued by the European Union, which felt the company was being anti-competitive by bundling Internet Explorer.

While I believe that suit did have some merit, they really should have gone after Microsoft's appalling standards record instead. Microsoft has been responsible for untold amounts of damage by holding back the progress of the internet, and has caused no end of frustration and head butting on walls for programmers and web designers.

What a mess!


W3C's XHTML ordered list mistake

Icon from the Tango Desktop Project

I was under the impression that newer web standards emphasised the separation of content from presentation markup. This was the reason for the creation of CSS and relegating the humble table back to displaying... tabular data.

I've been an unabashed and unapologetic supporter of the web standards themselves, even if in the past my interpretations of them weren't exactly correct ;-). I'm attempting to correct this though, because I see real value in everyone being on the same page on the net, as it were. That was a really clever and entirely unintended pun. I'm not Bill Kurtis.

What concerns me though is the removal of the value attribute from the items of the humble ordered list. This attribute is vital for generating non-contiguous but ordered lists of items, or lists where selected items share the same value, such as this example ranking some of the cities I grew up in by the amount of time I spent there:

<ol>
<li value="1">Singapore</li>
<li value="2">Melbourne, Victoria, Australia</li>
<li value="3">Adelaide, South Australia, Australia</li>
<li value="4">Brisbane, Queensland, Australia</li>
<li value="4">Kuala Lumpur, Malaysia</li>
<li value="6">Sydney, New South Wales, Australia</li>
<li value="0">Orion's Belt, Far Far Away!</li>
</ol>

According to the W3Schools list element article, the value attribute was deprecated by the W3C for use in XHTML because you can "use styles instead". There's just one problem with this line of reasoning: The value of a list item is NOT a style attribute, it's DATA.

By removing this so called "presentation information" we're also removing an integral part of the information itself which is absolutely unacceptable.

If we took what they were saying as gospel and represented these values in CSS (it is possible), then rendered our now standards-compliant document in a browser that didn't support CSS, we would be presented with a list without this data.
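To make the "it's data" argument concrete, here is a minimal sketch of a program reading the list with no CSS involved at all. It uses Python's standard html.parser; the class and function names are made up for illustration. When the value lives in the markup, any consumer of the document can recover the ranking; when it lives only in a style sheet, this kind of reader gets nothing.

```python
from html.parser import HTMLParser


class RankedListParser(HTMLParser):
    """Collect (rank, text) pairs from <li value="..."> elements."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._rank = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            raw = dict(attrs).get("value")
            # Keep the rank only when it is a plain non-negative integer.
            self._rank = int(raw) if raw is not None and raw.isdigit() else None
            self._in_item = True
            self._text = []

    def handle_data(self, data):
        if self._in_item:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._in_item:
            self.items.append((self._rank, "".join(self._text).strip()))
            self._in_item = False


def ranked_items(markup):
    """Return the list items of the markup together with their ranks."""
    parser = RankedListParser()
    parser.feed(markup)
    parser.close()
    return parser.items
```

Feeding it the city list above would give back each city paired with its rank, including the two cities that share fourth place; an item with no value attribute comes back with a rank of None.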

I urge the W3C (in my very limited capacity!) to seriously reconsider the omission of this attribute in their specifications.


Rubenerd Blog XHTML 1.0 Strict-yness

The W3C Validator

One of the final pieces of the puzzle (to use a worn-out cliche that has long since overstayed its welcome) of moving web servers is making sure that the code in the blog posts here and for the show is valid XHTML (I had kept this in mind for several years, but I have made mistakes that need correcting). This is important for several reasons which I won't bore you with here; suffice to say it has to do with moving over to a new server with a new theme, possible assignment marks, and some XML software I'm writing.

For those who are staring at me with blank faces right now (even more than usual I mean), XHTML is a reformulation of HTML into strict XML instead of the more lenient SGML. Alphabet soup sentences aside, pragmatically this means you can pass your web pages through an XML parser, which allows you to do some really cool things like converting pages into other file formats, extracting data more easily, using microformats to generate feeds, and so on. The theory also is that because XML is stricter than SGML, pages written in XHTML are more "correct" and should be easier for browsers to render.
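As a small taste of what "passing a page through an XML parser" looks like, here is a minimal sketch using Python's standard xml.etree.ElementTree to pull every link out of a well-formed XHTML page. The sample page and the function name are invented for illustration. One caveat: the standard parser does not load the XHTML DTD, so named entities such as &nbsp; would be rejected unless written numerically.

```python
import xml.etree.ElementTree as ET

# Elements in an XHTML document live in this XML namespace.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sample</title></head>
<body><p>Visit <a href="http://example.com/">Example</a>.</p></body>
</html>"""


def extract_links(xhtml_source):
    """Return (href, link text) pairs for every anchor in the page."""
    root = ET.fromstring(xhtml_source)
    return [
        (anchor.get("href"), "".join(anchor.itertext()).strip())
        for anchor in root.iter(XHTML_NS + "a")
        if anchor.get("href")
    ]
```

The strictness cuts both ways: a page with a single unclosed tag makes ET.fromstring() raise ET.ParseError instead of guessing what was meant, which is exactly the behaviour that makes validation worthwhile for this kind of processing.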

ASIDE: You can tell if a page has been optimised for an XHTML standard by looking at the head of the source code for a page for a DOCTYPE definition.
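The same check can be done by a script. This is a minimal sketch that looks for the DOCTYPE near the top of the source and maps its public identifier to a readable name; the identifiers are the ones the W3C publishes for each XHTML version, while the function name and the 1024 character window are arbitrary choices for illustration.

```python
import re

DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)

# Public identifiers from the W3C XHTML DTDs.
KNOWN_DOCTYPES = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "XHTML 1.0 Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "XHTML 1.0 Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "XHTML 1.0 Frameset",
    "-//W3C//DTD XHTML 1.1//EN": "XHTML 1.1",
}


def detect_xhtml_flavour(source):
    """Return the XHTML version a page declares, or None if unknown."""
    match = DOCTYPE_RE.search(source[:1024])  # the DOCTYPE sits near the top
    if match is None:
        return None
    return KNOWN_DOCTYPES.get(match.group(1))
```

Keep in mind that a DOCTYPE is only a claim: it says which standard the page is aiming for, not that the page actually validates against it.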

The current standards are XHTML 1.0 Frameset, XHTML 1.0 Transitional and XHTML 1.0 Strict (which is what I currently use). XHTML 1.1 also exists, but has seen limited adoption given Internet Explorer 6's hostility towards the recommended XML version declaration, which triggers its quirks mode, which isn't what we want!

There seems to be a huge difference of opinion between people who see the value of validating web pages against a W3C XHTML specification, and those who say it's a waste of time and more of a hindrance to the web than an assistance.

I'm firmly in the first camp, but that's not to say I approve of everything the W3C is doing with their specs. Unfortunately the way I see it, for every three steps forward they make, they take one step backwards. This means they're definitely heading in the right direction and making progress (albeit at a snail's pace), but they're shedding some useful stuff along the way in their absolute rigid pursuit of code purity and correctness. Iframe elements and ordered lists come to mind, but I'll save them for another post where I can elaborate further.

Valid XHTML 1.0 Strict

As of the 5th of February 2009 the home page of this blog is valid XHTML 1.0 Strict, but there's still lots of work to be done for individual posts. A lot of this can be done automatically with a few Perl scripts I've hacked together, but a few tags that will need replacement can really only be taken care of by a human looking at it and making the correct substitutions. At least I feel I'm making progress.
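The Perl scripts themselves aren't shown here, so the following is not them; it's a hypothetical Python sketch of one kind of mechanical fix that can be automated: closing empty elements such as br and img the XHTML way and lowercasing their tag names. It assumes attribute values never contain a literal > character, which is one reason a human still needs to look over the results.

```python
import re

# Empty elements that must be written as <tag ... /> in XHTML.
VOID_TAG_RE = re.compile(
    r"<(br|hr|img|input|meta|link)\b([^>]*?)\s*/?>", re.IGNORECASE
)


def _close_tag(match):
    # XHTML element names are lowercase; the attributes are kept untouched.
    return "<{}{} />".format(match.group(1).lower(), match.group(2))


def close_void_elements(html):
    """Rewrite <br>, <img ...> and friends as self-closing XHTML tags."""
    return VOID_TAG_RE.sub(_close_tag, html)
```

Running the function over already fixed markup leaves it unchanged, so it is safe to apply to every post in a loop.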

XHTML sounds like an isotonic energy drink. It doesn't sound like Bill Kurtis, which is useful because I'm not Bill Kurtis.


Camino and Google Reader atom problems

Sharon777 on Twitter pointed out a possible problem with either the Camino browser or Google Reader. If you use Camino to browse someone's Google Reader Shared Items page (such as mine or Whole Wheat Radio's), a web feed notification icon doesn't appear in the address bar:

Google Reader in Camino not showing a web feed icon

However if you click View Page Source in the View menu, you can clearly see the link to the web feed:

Google Reader in Camino not showing a web feed icon

I can't really think why it shouldn't find it. Perhaps Camino has trouble with Atom feeds as opposed to RSS. When I have some more time I'll see if I can reproduce the error somehow.
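For anyone wanting to reproduce this, here is a minimal sketch of the feed autodiscovery a browser performs: scan the page for link elements with rel="alternate" and a feed MIME type. It uses Python's standard html.parser, the names are invented for illustration, and it makes no claim about how Camino itself implements the check.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (type, href) pairs for feeds advertised by a page."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        feed_type = attributes.get("type", "").lower()
        if "alternate" in rels and feed_type in FEED_TYPES:
            self.feeds.append((feed_type, attributes.get("href", "")))


def find_feeds(page_source):
    """Return every Atom or RSS feed link found in the page source."""
    finder = FeedLinkFinder()
    finder.feed(page_source)
    finder.close()
    return finder.feeds
```

Running it against the saved source of a Shared Items page would show whether the feed is advertised as Atom or RSS, which helps narrow down whether the feed type is what the browser trips over.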


Specify image dimensions and save the world!

Jo Anne Hook painting: Australian Wildflowers

One of the more pleasurable things in life is when you can get on your high horse and let the rest of the world know why they're wrong, and you're right. Or maybe that only applies to conversations about music. Sorry Elke, comparing Akon to The Rat Pack is like comparing Barry Manilow to Jo Anne Hook. Wait, Jo Anne Hook is a painter. Never mind.

My gripe today is with people who use images in HTML on websites without defining their dimensions! You've probably seen pages at one point that seem to rearrange themselves as material moves around to make way for images that are loading. When image sizes are defined in advance, the browser knows how much visual space to allocate for them as it draws the page.

Without declared dimensions
<img src="image.jpg" alt="description" />
Using HTML dimensions
<img src="image.jpg" alt="description"
width="320" height="240" />
Using inline CSS
<img src="image.jpg" alt="description"
style="width:320px; height:240px;" />
Using an external style sheet
Same as the latter, but using an external style sheet linked with an id attribute for individual images, or more practically using class for many images on a page with the same dimensions.

Autumn anime art using... defined image dimensions!

Without this information, the browser is forced to render the page as it would look without the image until enough of the image has arrived to reveal its size; this is especially noticeable on slower internet connections and on mobile phones. It also does nothing to help the sanity of people who are halfway through reading a paragraph and suddenly have the text disappear as it's pushed away by an image that has started loading!
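Hunting for offenders by hand gets old quickly, so here is a minimal sketch of a checker that lists images with no declared size. It accepts either width and height attributes or an inline style that sets both; sizes set from an external style sheet are invisible to it, so those images would be reported as false positives. The names are invented for illustration.

```python
import re
from html.parser import HTMLParser


def _style_sets(style, prop):
    # Match "width:" at the start or after ";", so "max-width:" doesn't count.
    return re.search(r"(?:^|;)\s*" + prop + r"\s*:", style) is not None


class UnsizedImageFinder(HTMLParser):
    """Collect the src of every <img> without width and height."""

    def __init__(self):
        super().__init__()
        self.unsized = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = {name: (value or "") for name, value in attrs}
        style = attributes.get("style", "").lower()
        has_width = bool(attributes.get("width")) or _style_sets(style, "width")
        has_height = bool(attributes.get("height")) or _style_sets(style, "height")
        if not (has_width and has_height):
            self.unsized.append(attributes.get("src", ""))


def find_unsized_images(page_source):
    """Return the sources of images that declare no dimensions."""
    finder = UnsizedImageFinder()
    finder.feed(page_source)
    finder.close()
    return finder.unsized
```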

As far as I know from my own experience, Typo, WordPress and MediaWiki conveniently specify image sizes automagically on images you upload and insert, and I assume most other content management systems do too... save for Blosxom of course! Wow Blosxom, the first weblog publishing system I ever used, that brings back memories!

Save the world: specify image dimensions!


The Internet Explorer Q Continuum

As you may have gathered from reading previous posts, I'm a Mac OS X user on laptops and a hopeless FreeBSD fanboy on desktops. Therefore it probably wouldn't surprise you to find out I'm not a fan of Internet Explorer, or Windows Internet Explorer, or Chuck Norris Explorer or whatever they're calling it at the moment.

Why though? Is it the fact that it successfully and demonstrably held back innovation on the intertubes for so many years? Is it the silly user interface in version 7, about which I constantly get calls from people asking how to get to the menu bar? Is it the fact its CSS support is so patchy and inconsistent it makes a part of my work even more difficult than it has to be? Is it because it was bundled with a monopolistic operating system? Is it because the e logo just looks plain silly?

No. It's for one simple fact: Internet Explorer doesn't support the <q> tag!

Look at that browser Jean Luc, it doesn't support my existence!

You could be forgiven for not knowing about this tiny little tag; it was included by the W3C back in the HTML 4.0 specification in 1997 to delimit small inline quotations which are not large enough to justify the use of a block level element. Even in 2008 though, current versions of IE are the only browsers not to support it, despite every other game in town having no trouble with it.

For example, one of the sentences below is enclosed in <q> tags. If you're using Internet Explorer they will look exactly the same:

Ruben Schade is an incredibly smart, devilishly attractive and very self deluded person.

Ruben Schade is an incredibly smart, devilishly attractive and very self deluded person.

But why is the lack of support for a seemingly insignificant and easily replaceable tag my number one gripe with Internet Explorer? Because of its stupefying simplicity! How difficult would it have been for Microsoft to have added full support for such a simple tag? It's mind blowing!

<q>This is an inline quote, complete with CSS support!</q>

<span class="quote">Here's another inline quote, but with support for IE </span>

I guess until Internet Explorer 12 comes out some time after 2095, I'll have to stick with using the latter example above. What a mess!

FOOTNOTE: For what it's worth, Opera, Mozilla Firefox, Netscape Navigator (rest in peace), Apple's Safari, KDE's Konqueror and even zippy little dillo, links and lynx support the <q> tag. Obviously it's not hard!



A philosophical security question

If implementing a standard leads to an unavoidable security hole, should you follow it?


Insanely useful webapp to convert PDF to SVG

Sometimes you find a webapp on the intertubes that's so useful you'd be charged for criminally negligent conduct if you didn't tell other people about it. As far as I understand it.

In this case it's Texterity's FreeSVG service which takes PDFs you upload, breaks them down into their individual components (including text, images, fonts and vector drawings), converts it all into an SVG file, bundles all the material into a zip file, then emails you a 48 hour link to fetch it. Very, very nice.

Choosing the file to upload and convert

From their site:

FreeSVG is provided by Texterity to encourage the use of SVG (Scalable Vector Graphics) on the web and enable free, highly functional conversion of PDF documents into fully accessible and navigable SVG.

SVG is an industry standard, W3C (World Wide Web Consortium) specification describing text and graphics in XML. To learn more about SVG visit the W3C site at http://www.w3.org/Graphics/SVG/.

Now obviously for security reasons you'd definitely not want to use this service for confidential or sensitive information, but for general use it's a fantastic solution, and has saved my arse many times!


WordPress eXtended RSS fun

WXR

I haven't been having much luck with technology this week, but this seems to be the icing on the cake so to speak. The problem is no matter how hard I try I just can't get WXR working.

WXR is of course the WordPress eXtended RSS format which allows you to quickly export the entire written contents of your weblog including posts, pages, categories, tags and kitchen sinks. It means you can pick up the guts of your weblog, then do a backup of your wp-content folder which contains all your uploaded media, plugins and themes, then import them somewhere else.

Only problem is, this is the seventh time and I still can't get it to work on one WordPress installation. I have a local web server running on my MacBook Pro which I've set up to test new themes and plugins I'm working on, and on this local installation of WordPress I can import my Rubenerd Show material without any trouble at all, but I've had no end of trouble when I try to do the same thing from the Rubenerd Blog.

The curious thing is that there's no consistency to the errors. On Thursday I tried importing from this weblog and WordPress silently failed; the import page just stopped rendering after it had uploaded the file. Then yesterday I tried again and it was able to import posts but only up to September 2006 when it decided to stop.

The only things I can think of that could be causing this problem are that the WXR export PHP file in WordPress wasn't uploaded to the server correctly, that the file (2.2 MiB) is somehow too big for my local web server to handle, or that there's some malformed HTML in one of my posts which breaks the XML file it's contained in... or maybe it's just gremlins.

One clue though showed itself when I tried to open the exported WXR file in Smultron:

So perhaps it's an encoding issue? Or does WordPress not output UTF-8? Could it be failing because some of my posts have East Asian characters which need UTF-8?
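Two of these theories, bad encoding and malformed markup breaking the XML, can be tested without WordPress at all. This is a minimal diagnostic sketch, assuming the export has been read into memory as raw bytes; the function name and messages are made up, and it says nothing about how the WordPress importer itself parses the file.

```python
import xml.etree.ElementTree as ET


def diagnose_export(data):
    """Report the first encoding or well-formedness problem in an export."""
    try:
        # East Asian characters are fine as long as the bytes are valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError as error:
        return "invalid UTF-8 at byte offset {}".format(error.start)
    try:
        ET.fromstring(data)
    except ET.ParseError as error:
        line, column = error.position
        return "not well-formed XML at line {}, column {}".format(line, column)
    return "ok"
```

To use it, read the export with open(path, "rb").read() and pass the result in. A reported byte offset or line number points at where to start looking in the file, which beats waiting for the import page to silently stop rendering.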

Whatever this blasted problem is, it looks like this is going to be a very, very, very long Saturday.


Sign the NoOOXML petition!

If you haven't signed the No Office Open XML petition yet, hurry over and do it right now. Microsoft's efforts to ratify their file formats as a standard must be stopped, not just because there's already an OpenDocument Format standard, but because their implementation is flawed and broken, and adopting it will cause all kinds of problems and the eventual destruction of the world as we know it.

For the reasons why, and to sign the petition, go to http://www.noooxml.org/petition.

Ruben Schade, Singapore, 54 seconds ago
Comments: This newly proposed "standard" being pushed (and bribed) by Microsoft is a genuine threat to interoperability and must be dismissed. Microsoft, a convicted monopolist in multiple jurisdictions, must NOT be allowed under any circumstances to push through these file formats.