How to scrape using Guzzle, Simple HTML Dom and anyIP.io?
In this quick tutorial, we show you how to start scraping any website with Guzzle (a PHP HTTP client library) and rotating proxies from anyIP.io. You will need an anyIP.io account for the proxy credentials.
Guzzle installation
The recommended way to install Guzzle is through Composer.
composer require guzzlehttp/guzzle
How to use Guzzle
Following the documentation, opening a page using Guzzle is pretty simple:
require 'vendor/autoload.php';
$client = new GuzzleHttp\Client();
$res = $client->request('GET', 'https://www.example.com');
echo $res->getBody();
To use a proxy, you have to add a proxy parameter:
$res = $client->request("GET", "https://www.example.com", [
 "proxy" => "https://username:password@portal.anyip.io",
]);
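Credentials that contain characters such as "@" or ":" can break the proxy URL. The following is a minimal sketch of one way to build the request options safely. The function name buildProxyOptions and the timeout values are assumptions made for illustration, not part of Guzzle or anyIP.io; only the 'proxy', 'connect_timeout' and 'timeout' option keys are Guzzle request options. It also assumes the underlying HTTP handler URL-decodes the credentials, as cURL does.

```php
<?php
// Hypothetical helper (not part of Guzzle or anyIP.io): builds Guzzle request options.
function buildProxyOptions(string $username, string $password, string $host, string $scheme = 'https'): array
{
    // Percent-encode the credentials so characters such as "@" or ":" do not break the URL.
    $proxyUrl = sprintf('%s://%s:%s@%s', $scheme, rawurlencode($username), rawurlencode($password), $host);

    return [
        'proxy'           => $proxyUrl,
        'connect_timeout' => 10, // seconds; assumed value, tune for your target
        'timeout'         => 30, // seconds; assumed value, tune for your target
    ];
}

$options = buildProxyOptions('username', 'p@ss:word', 'portal.anyip.io');
echo $options['proxy'], PHP_EOL;
```

The resulting array can then be passed as the third argument: $client->request('GET', 'https://www.example.com', $options).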
How to parse the page?
The content of the page is in $res->getBody(). After checking that you actually got a usable result (the status code is 200, the Content-Type header is text/html or similar, etc.), you can start to parse the page. There are many options for this:
Use a regex
Use the DOM library from PHP
Use Simple HTML Dom
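Whichever parser you pick, the sanity check mentioned above can be sketched as a small function. The name isParsableHtml and the list of accepted media types are assumptions for illustration; adjust them to the pages you target.

```php
<?php
// Hypothetical helper: decides whether a response is worth passing to an HTML parser.
function isParsableHtml(int $statusCode, string $contentType): bool
{
    if ($statusCode !== 200) {
        return false;
    }

    // Drop parameters such as "; charset=UTF-8" and normalise the case.
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));

    // Assumed whitelist of media types that an HTML parser can handle.
    return in_array($mediaType, ['text/html', 'application/xhtml+xml'], true);
}

var_dump(isParsableHtml(200, 'text/html; charset=UTF-8')); // bool(true)
var_dump(isParsableHtml(403, 'text/html'));                // bool(false)
var_dump(isParsableHtml(200, 'application/json'));         // bool(false)
```

With a Guzzle response, the two arguments would come from $res->getStatusCode() and $res->getHeaderLine('Content-Type'). Note that Guzzle throws an exception on 4xx and 5xx responses by default, so those cases may also need a try/catch around the request.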
As a quick introduction to the scraping world, we will use Simple HTML Dom. After installing and initializing it, you can use CSS selectors to retrieve the content of your choice:
// getBody() returns a stream, so cast it to a string before parsing
$simpleHTMLDom = str_get_html((string) $res->getBody());
$links = $simpleHTMLDom->find('a');
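For comparison, here is a rough sketch of the same link extraction with PHP's built-in DOM library, one of the other options listed above, which needs no extra install. The $html string is made-up markup standing in for (string) $res->getBody().

```php
<?php
// Stand-in for (string) $res->getBody(); the markup is invented for illustration.
$html = '<html><body><a href="/a">First</a> <a href="https://www.example.com/b">Second</a> <a>No href</a></body></html>';

$dom = new DOMDocument();
// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// Keep the href of every <a> element that actually has one.
$hrefs = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    if ($anchor->hasAttribute('href')) {
        $hrefs[] = $anchor->getAttribute('href');
    }
}

print_r($hrefs);
```

Relative links such as /a still have to be resolved against the page URL before they can be requested.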