How to use Mojolicious for web scraping | Linux Format April 2022

PERL

How to use Mojolicious for web scraping

Mark Gardner reveals how you can retrieve and parse HTML and XML from websites with a few lines of Perl and the Mojolicious framework.

Part One

OUR EXPERT

Mark Gardner is a software developer and blogger with over 25 years of IT experience. You can reach him at www.phoenixtrap.com and @markjgardner.

So much of the modern web is driven by services and front-end interfaces talking to APIs that it’s easy to lose sight of the fact that everything is ultimately presented in a soup of HTML markup. In the absence of a well-structured interface or format, sometimes the code you’re writing needs to scrape the ingredients of that soup apart and parse out meaningful data. Perl’s Mojolicious web framework includes a set of components that make this task easier.
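As a rough illustration of what pulling data out of that soup can look like, here is a minimal sketch using Mojo::DOM, the HTML/XML parser that ships with Mojolicious. It assumes Mojolicious is already installed; the markup, the element id and the variable names are invented for this example, and the HTML is an inline string so that nothing is fetched from a real site.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical markup standing in for a page you have already downloaded.
my $html = <<'HTML';
<html>
  <head><title>Minestrone</title></head>
  <body>
    <ul id="ingredients">
      <li>carrots</li>
      <li>celery</li>
      <li><a href="/beans">beans</a></li>
    </ul>
  </body>
</html>
HTML

# Parse the markup into a tree that can be queried with CSS selectors.
my $dom = Mojo::DOM->new($html);

# at() returns the first element matching the selector.
my $title = $dom->at('title')->text;

# find() returns a collection of every match; all_text includes the
# text of nested elements such as the link in the last list item.
my @ingredients = $dom->find('#ingredients li')->map('all_text')->each;

# Collect the href attribute of every link inside the list.
my @links = $dom->find('#ingredients a[href]')->map(attr => 'href')->each;

print "Title: $title\n";
print "Ingredient: $_\n" for @ingredients;
print "Link: $_\n" for @links;
```

The same selectors should work on markup retrieved from a live site; only the source of the HTML string changes.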

Although most Linux distros come with a version of Perl, it helps to have your own installation separate from the system so you’re not tied to a possibly older version that’s required to support operating system tools and other packages. This separate installation can live in your $HOME directory (or wherever you specify) with its own modules that neither require sudo to install nor interfere with those handled by the package manager.

The most popular tool for managing separate Perl installations is called Perlbrew. Installation instructions are at https://perlbrew.pl/Installation.html. You can install it with either of the following shell commands, depending on what you already have installed:
