Replacing XML::XPath with XML::LibXML in Perl

I’ve used XML::XPath whenever I need to process XML, including RSS, OPML, Apache configuration, sitemaps, and such. There are dedicated parsers for these I should certainly be using instead, but I find it easier just to use the one tool and build my own data structure around it. Is that a folly? Almost certainly yes! Is it robust? So far, yes.

I’ve seen people recommend XML::LibXML on sites like PerlMonks and mailing lists, so I thought I’d give it a try for a new personal project. It’s mostly a drop-in replacement, with familiar syntax:

#!/usr/bin/env perl

use strict;
use warnings;
use URI;
use XML::LibXML;

my $rss = URI->new('http://showfeed.rubenerd.com');
my $xml = XML::LibXML->load_xml(location => $rss);

foreach my $title ($xml->findnodes('//item/title')) {
    print $title->to_literal() . "\n";
}

==> Rubenerd Show 414: The thingy stuff episode
==> Rubenerd Show 413: The Wheaty 2021 episode
==> Rubenerd Show 412: The wandering mug episode
==> Rubenerd Show 411: The FreeBSD cat(1) episode
==> Rubenerd Show 410: The apothecary coffee episode
==> [..]
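
Since I mentioned building my own data structure around the parser, here's a rough sketch of what that can look like. The inline feed, the @episodes array, and the hash keys are all made up for illustration; only load_xml, findnodes, and findvalue come from XML::LibXML.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use XML::LibXML;

# Hypothetical inline feed so the sketch runs without a network connection
my $feed = <<'XML';
<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item><title>Episode 2</title><link>https://example.com/2</link></item>
    <item><title>Episode 1</title><link>https://example.com/1</link></item>
  </channel>
</rss>
XML

my $xml = XML::LibXML->load_xml(string => $feed);

# Build a list of hashes, one per <item>. findvalue() evaluates the
# XPath relative to each item node and returns the matching text.
my @episodes;
foreach my $item ($xml->findnodes('//item')) {
    push @episodes, {
        title => $item->findvalue('./title'),
        link  => $item->findvalue('./link'),
    };
}

print "$_->{title} ($_->{link})\n" foreach @episodes;
```

From there the list can be sorted, filtered, or dumped however you like, without touching the XML again.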

The load_xml method can accept a location (a filename or URL), a string, or IO in the form of a Perl file handle. Like XML::XPath though, it can't fetch over HTTPS itself. For that you need an external module like LWP or LWP::Simple to download the document first, then pass the content in as a string:

#!/usr/bin/env perl

use strict;
use warnings;
use URI;
use LWP::Simple;
use XML::LibXML;

my $rss = URI->new('https://the.geekorium.com.au/index.xml');
my $content = get($rss);
my $xml = XML::LibXML->load_xml(string => $content);

print $xml->findnodes('/rss/channel/title')->to_literal() . "\n";

==> The Geekorium
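
Both steps in that script can fail: get from LWP::Simple returns undef when the request doesn't succeed, and load_xml dies on malformed input. Here's a minimal sketch of guarding against both. The parse_feed helper is a name I've made up, and I'm feeding it inline strings instead of a live fetch so it runs offline.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns a document object, or undef on failure
sub parse_feed {
    my ($content) = @_;

    # LWP::Simple's get() hands back undef when the fetch fails
    return unless defined $content;

    # load_xml() dies on malformed XML, so trap the exception
    my $xml = eval { XML::LibXML->load_xml(string => $content) };
    warn "Could not parse feed: $@" unless defined $xml;
    return $xml;
}

my $good = parse_feed('<rss><channel><title>Example</title></channel></rss>');
my $bad  = parse_feed('<rss><channel>');   # truncated on purpose
my $none = parse_feed(undef);              # stands in for a failed get()

print $good->findvalue('/rss/channel/title'), "\n";
print "Malformed feed skipped\n" unless defined $bad;
print "Missing content skipped\n" unless defined $none;
```

For a script polling a pile of feeds, this means one broken feed produces a warning rather than taking down the whole run.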

And finally, here we are using it with a local file:

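#!/usr/bin/env perl

use strict;
use warnings;
use XML::LibXML;

my $file = 'feeds.opml';
open(my $filehandle, '<', $file) or die "Could not open, $!";

# Drop any PerlIO layers; the parser handles the encoding itself
binmode($filehandle);

# Declared with "my", otherwise "use strict" refuses to compile this
my $xml = XML::LibXML->load_xml(IO => $filehandle);

print $xml->findnodes('/opml/head/docs')->to_literal() . "\n";

close($filehandle);
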
==> http://dev.opml.org/spec2.html
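
If you don't need the file handle for anything else, location takes a filename as well as a URL, which saves the open and close. Here's a small sketch of that, assuming a throwaway OPML file I write out with File::Temp so the example is self-contained; the file contents are made up.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::LibXML;

# Write a tiny stand-in OPML file (hypothetical content) to a temp path
my ($fh, $file) = tempfile(SUFFIX => '.opml', UNLINK => 1);
print {$fh} '<opml version="2.0"><head><title>Feeds</title></head><body/></opml>';
close($fh) or die "Could not close, $!";

# location accepts a plain filename, so no file handle is needed here
my $xml = XML::LibXML->load_xml(location => $file);

print $xml->findvalue('/opml/head/title'), "\n";
```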

There’s a script in my lunchbox for those who want to tinker. But the best resource I’ve found is Grant McLean’s Perl XML::LibXML by Example. Check out the links to the packages to read their docs on MetaCPAN.

Author bio and support

Ruben Schade is a technical writer and infrastructure architect in Sydney, Australia who refers to himself in the third person. Hi!
