How to scrape using Guzzle, Simple HTML Dom and anyIP.io?

By
Khaled Bentoumi
Updated
November 28, 2024
12 min read

In this quick tutorial, we show you how to start scraping a website with Guzzle (a PHP HTTP client library), send your requests through rotating proxies from anyIP.io, and parse the result with Simple HTML Dom.

Guzzle installation

The recommended way to install Guzzle is through Composer.

Bash
    composer require guzzlehttp/guzzle
  

How to use Guzzle

Following the documentation, opening a page using Guzzle is pretty simple:

PHP
    // Load Composer's autoloader so the Guzzle classes are available
    require 'vendor/autoload.php';

    $client = new \GuzzleHttp\Client();
    $res = $client->request('GET', 'https://www.example.com');
    echo $res->getBody();
  

To send the request through a proxy, add the proxy request option:

PHP
    $res = $client->request('GET', 'https://www.example.com', [
        'proxy' => 'https://username:password@portal.anyip.io',
    ]);
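
If your proxy username or password contains characters such as `@` or `:`, writing them directly into the proxy URL makes it ambiguous. The sketch below shows one way to build the URL safely by percent-encoding the credentials. `buildProxyUrl` is a hypothetical helper, not part of Guzzle, and its default host and scheme simply mirror the example above; check your own proxy settings for the exact values.

```php
<?php
// Hypothetical helper (not part of Guzzle): builds a proxy URL and
// percent-encodes the credentials so that characters such as "@" or ":"
// in the password cannot break the URL structure.
function buildProxyUrl(
    string $user,
    string $pass,
    string $host = 'portal.anyip.io', // assumption: same host as the example above
    string $scheme = 'https'          // assumption: same scheme as the example above
): string {
    return $scheme . '://' . rawurlencode($user) . ':' . rawurlencode($pass) . '@' . $host;
}

// Placeholder credentials, for illustration only
$proxyUrl = buildProxyUrl('username', 'p@ss:word');
echo $proxyUrl . PHP_EOL; // prints: https://username:p%40ss%3Aword@portal.anyip.io
```

The resulting string can be used as the value of the `proxy` option. Passing it to the client constructor, as in `new \GuzzleHttp\Client(['proxy' => $proxyUrl])`, should make it the default for every request sent by that client. It is worth confirming with a test request that the encoded credentials are accepted in your setup.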
  

How to parse the page?

The content of the page is in $res->getBody(). There are many options for parsing it, but first check that you actually got a valid result: the status code is 200, the Content-Type header is text or similar, and so on.
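
The checks described above can be sketched as a small function. `isParsableResponse` is a hypothetical helper written for this tutorial, and the list of accepted content types is an assumption that you may want to adapt to the sites you scrape.

```php
<?php
// Hypothetical helper: returns true when a response looks like a page
// that is worth handing to an HTML parser.
function isParsableResponse(int $statusCode, string $contentType): bool
{
    // Only accept a plain "200 OK"
    if ($statusCode !== 200) {
        return false;
    }

    // The header can carry parameters, e.g. "text/html; charset=UTF-8",
    // so keep only the media type and normalize its case.
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));

    // Assumption: any text/* type or XHTML counts as parsable
    return strpos($mediaType, 'text/') === 0
        || $mediaType === 'application/xhtml+xml';
}

var_dump(isParsableResponse(200, 'text/html; charset=UTF-8')); // bool(true)
var_dump(isParsableResponse(404, 'text/html'));                // bool(false)
var_dump(isParsableResponse(200, 'application/pdf'));          // bool(false)
```

With a Guzzle response, the two arguments would come from `$res->getStatusCode()` and `$res->getHeaderLine('Content-Type')`. Note that Guzzle throws an exception on 4xx and 5xx responses by default (the `http_errors` option), so with default settings those cases are usually handled in a try/catch block rather than by this check.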

As a quick introduction to the scraping world, we will use Simple HTML Dom. After installing and initializing it, you can use CSS selectors to retrieve the content of your choice:

PHP
    // Cast the response body (a stream object) to a string before parsing
    $simpleHTMLDom = str_get_html((string) $res->getBody());
    $links = $simpleHTMLDom->find('a');
  
Khaled Bentoumi

Khaled is a software engineer. He’s been involved in many startups of different sizes. Previously, he founded Data to Page, an AI Programmatic SEO startup. He now handles all the marketing at anyIP.
