Perl libraries often try to DWIM (Do What I Mean) when deciding how to handle input and output. These conveniences can have unintended consequences if you're not careful with them. I'm going to pick on an example I ran into today:
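#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
# Hand the first command-line argument straight to XMLin().
my $data = XMLin($ARGV[0]);
print Dumper($data);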
What’s that do? It takes the input from @ARGV and parses it as XML, right? It returns that information as a hash, right? Maybe.
If this were a little script named test.pl and I ran this:
./test.pl "<test><value>foo</value></test>"
I would end up with output of:
$VAR1 = { value => 'foo' };
However, if you just ran:
./test.pl
what would happen? With no argument, XMLin() receives undef, so it would try to find and open a file named after the script, test.xml, in the script's own directory.
Or I could run it like this:
cat some.xml | ./test.pl -
That would cause it to read in standard input. Okay, great! That’s pretty handy, except…
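To put the three behaviors above side by side, here is a rough sketch of the kind of decision a DWIM interface like this makes. The `describe_input` function is a hypothetical name of mine and a simplified model, not XML::Simple's actual code (the real module also accepts filehandles, which I leave out here).

```perl
use strict;
use warnings;

# Hypothetical helper: names the source an XMLin()-style interface
# would read from for a given argument. Simplified model only.
sub describe_input {
    my ($input) = @_;
    return 'default file (script name with .xml)' unless defined $input;
    return 'standard input' if $input eq '-';
    return 'literal XML'    if $input =~ /<.*?>/s;
    return "file named $input";
}

for my $input (undef, '-', '<test><value>foo</value></test>', 'some.xml') {
    my $label = defined $input ? $input : '(no argument)';
    printf "%-35s => %s\n", $label, describe_input($input);
}
```

The point to notice is the last branch: anything that is not recognized as XML quietly becomes a filename.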
Imagine you were using this in a web application. Suppose you have an upload widget where someone uploads an XML file to your app, and you pass that value straight into XMLin() without checking it. This could be very bad.
my $xml = $c->parameters->{xml_file};
my $data = XMLin($xml);
print Dumper($data);
What if your user submits nothing but the string “/root/secret.xml”, and that path happens to refer to a real file that is secret?! XMLin() treats the string as a filename, opens and parses that file, and you have just passed its contents back to the user.
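Here is a small sketch that reproduces the problem safely, assuming XML::Simple is installed. A temporary file stands in for the secret file, and its contents (the `password` element and its value) are made up for illustration. The variable `$user_input` stands in for the request parameter.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Simple qw(XMLin);

# Stand-in for a sensitive file on the server (made-up contents).
my ($fh, $secret_path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} '<secret><password>hunter2</password></secret>';
close $fh or die "cannot close $secret_path: $!";

# Stand-in for attacker-controlled input: a path, not XML.
my $user_input = $secret_path;

# XMLin() sees no markup in the string, so it treats it as a filename.
my $data = XMLin($user_input);
print "leaked: $data->{password}\n";
```

Nothing in the calling code asked to open a file, yet the parsed contents of one end up in `$data`.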
Beware of the DWIM! Always check your inputs! In the case of the web app, I would recommend against calling XMLin() directly. Instead, create some sort of wrapper that only allows the input you expect:
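sub parse_xml_content {
    my ($xml, @options) = @_;
    # Reject undef, references (such as filehandles), and strings with
    # no tag in them (plain filenames, "-").
    die 'input does not look like XML'
        unless defined $xml && !ref $xml && $xml =~ /<.*?>/s;
    return XMLin($xml, @options);
}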
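To see which inputs a check like this turns away, here is a sketch that isolates just the validation step so it runs without XML::Simple. The name `assert_inline_xml` is hypothetical, and the explicit undef and reference checks are my own assumptions about what a web app would want to refuse.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the string unchanged if it looks like
# inline XML and dies otherwise. It never touches the filesystem.
sub assert_inline_xml {
    my ($xml) = @_;
    die "no input given\n"                     unless defined $xml;
    die "expected a string, got a reference\n" if ref $xml;
    die "input does not look like XML\n"       unless $xml =~ /<.*?>/s;
    return $xml;
}

for my $input ('<test><value>foo</value></test>', '/root/secret.xml', '-', undef) {
    my $label = defined $input ? $input : '(undef)';
    my $ok    = eval { assert_inline_xml($input); 1 };
    chomp(my $why = $@);
    print $ok ? "accepted: $label\n" : "rejected: $label ($why)\n";
}
```

Only the first input gets through. The path, the "-" shortcut, and the missing argument are all refused before a parser ever sees them.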
This goes not just for XML::Simple, but for just about anything that tries to be smart about its input.
Cheers.
