PHP: Using cURL

cURL is a PHP library that’s meant to safely retrieve remote data. The term ‘safely’ is a relative one, of course, depending on what you do with it. But since cURL only retrieves the remote data and does not execute it, it is considered much safer than options that do, such as calling include() on a remote URL.

cURL is a PHP module that should be available on most web hosts. However, if you run PHP on your own server or computer, you will need to make sure it is installed and enabled.
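One quick way to check is to ask PHP itself whether the extension is loaded. This is only a sketch, and the helper name has_curl() is made up for this example; it is not part of PHP.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether the cURL extension is usable.
function has_curl() {
	return extension_loaded('curl') && function_exists('curl_init');
}

if (has_curl()) {
	// curl_version() returns an array; its 'version' key holds the libcurl version string.
	$info = curl_version();
	print "cURL is available (libcurl " . $info['version'] . ")\n";
} else {
	print "cURL is not installed or not enabled\n";
}
```

If the second message appears on your own machine, the extension usually has to be installed or switched on in php.ini before any of the examples here will work.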

Simple Fetch

The most basic usage for cURL is simply a ‘fetch this document’ function. You should first initialize cURL and then request the data. Here’s a quick and dirty example:

$ch = curl_init();

This line initializes cURL and gives us a handle to work with. From there, we use the curl_setopt() function to set the remote site address, then call curl_exec() to fetch and print the retrieved data.
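Put together, those steps might look like the sketch below. The function name print_page() and the example.com address in the usage note are assumptions made for illustration; only the curl_* calls come from PHP itself.

```php
<?php
// Hypothetical wrapper around the basic steps: initialize, set the URL, fetch.
// Returns true if the transfer succeeded and false otherwise.
function print_page($url) {
	$ch = curl_init();                    // 1. initialize a cURL handle
	curl_setopt($ch, CURLOPT_URL, $url);  // 2. tell cURL which address to fetch
	$ok = curl_exec($ch);                 // 3. fetch; by default the response is printed directly
	curl_close($ch);                      // 4. close the handle (has no effect since PHP 8.0)
	return $ok;
}
```

Calling print_page("http://www.example.com/") would then print that page's source straight into the output.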

When ‘curl_exec()’ is called, it will, by default, print the content onto the page. If you want different behavior (to store the data in a variable, for instance), set the CURLOPT_RETURNTRANSFER option before calling it:

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

With CURLOPT_RETURNTRANSFER set to 1 (TRUE), curl_exec() returns the data as a string. With it set to 0 (FALSE), the data is automatically printed to the document.
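As a rough sketch of storing the result in a variable, the helper below (fetch_to_string() is a made-up name) turns the option on and checks for failure, since curl_exec() returns false when the transfer does not succeed.

```php
<?php
// Hypothetical helper: fetches $url and returns the response as a string,
// or false if the transfer failed.
function fetch_to_string($url) {
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the data instead of printing it

	$data = curl_exec($ch); // string on success, false on failure
	if ($data === false) {
		// curl_error() describes the last error on this handle.
		error_log("cURL error: " . curl_error($ch));
	}
	curl_close($ch);
	return $data;
}
```

A call such as $page = fetch_to_string("http://www.example.com/"); (the address is a placeholder) leaves the page source in $page, where it can be inspected or changed before anything is printed.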

Complete Function

Below is a simple ‘fetch’ function to be used with cURL to retrieve remote web pages. Feel free to use this function for your own purposes. An attribution would be appreciated.

/**
 * Reads the source of a remote URL
 * @param String $url	The URL of the requested page
 * @param int $retHeader	Set to 1 to include the response headers in the output
 * @return String|false	The page source, or false on failure
 */
function curlfetch($url, $retHeader = 0) {
	/* Bail out if the cURL extension is not available */
	if (!function_exists('curl_init')){
		return false;
	}

	$ch = curl_init();
	/* Get the URL */
	curl_setopt($ch, CURLOPT_URL, $url);

	/* Referer */
	curl_setopt($ch, CURLOPT_REFERER, "");

	/* User Agent */
	curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");

	/* Include Header? */
	curl_setopt($ch, CURLOPT_HEADER, $retHeader);

	/* Return the data (Do not print) */
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

	/* Retrieve */
	$output = curl_exec($ch);

	/* Close cURL */
	curl_close($ch);

	return $output;
}

Usage Example

Now that we have our function, we can decide what page to call at any given time. The function will return the data (not automatically print it) so we can manipulate it before printing the result.

In Miso Offensivator, I use cURL to retrieve the requested website into a variable. The system then uses regular expressions to replace certain terms in the text and only then prints it out. This is useful when you want to analyze or change the retrieved data before you display it to your users.
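The replacement step could look something like the sketch below. This is not the actual Miso Offensivator code; the helper name replace_terms() and the sample terms are invented for illustration.

```php
<?php
// Hypothetical helper: replaces whole-word occurrences of each term, ignoring case.
// $replacements maps a term to its substitute.
function replace_terms($text, array $replacements) {
	foreach ($replacements as $term => $substitute) {
		// preg_quote() escapes regex metacharacters in the term;
		// \b restricts the match to word boundaries.
		$pattern = '/\b' . preg_quote($term, '/') . '\b/i';
		$text = preg_replace($pattern, $substitute, $text);
	}
	return $text;
}

print replace_terms("The cat sat on the mat.", array("cat" => "dog", "mat" => "rug"));
// prints "The dog sat on the rug."
```

Note that a plain regular expression like this also touches matching words inside HTML tags and attributes, so real pages may need more careful handling.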

For example, the code below fetches a requested URL and displays only its <body> element, that is, the <body> tags and everything between them:

$html = curlfetch($url);
/* Only print when the fetch succeeded and a <body> element was found */
if ($html !== false && preg_match("/<body.*\/body>/s", $html, $matches)) {
	$bodytext = $matches[0];
	print $bodytext;
}
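If you want only the content inside the body, without the tags themselves, a capture group is one way to do it. The sketch below assumes a made-up helper name, body_contents(), and a small hard-coded sample instead of a fetched page.

```php
<?php
// Hypothetical helper: returns the text between <body ...> and </body>,
// or null when $html is not a string or contains no body element.
function body_contents($html) {
	if (!is_string($html)) {
		return null;
	}
	// [^>]* skips any attributes on the opening tag; (.*) captures the inner content.
	// The s modifier lets . match newlines; i makes the tag names case-insensitive.
	if (preg_match('/<body[^>]*>(.*)<\/body>/is', $html, $matches)) {
		return $matches[1];
	}
	return null;
}

$sample = "<html><body class=\"x\"><p>Hello</p></body></html>";
print body_contents($sample);
// prints "<p>Hello</p>"
```

Regular expressions are fine for a quick extraction like this, though badly formed pages may not match at all, which is why the helper returns null instead of assuming a match.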

cURL Options

Remember that cURL has many other variables you can tweak and work with. For a full reference, read the official PHP cURL documentation.
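As a small taste, here is a sketch that sets several options at once with curl_setopt_array(). The helper name make_handle() and the specific values (timeouts and redirect limit) are arbitrary choices for this example, not recommendations.

```php
<?php
// Hypothetical helper: creates a handle for $url with a few commonly used options.
// Returns the handle, or false if any option could not be set.
function make_handle($url) {
	$ch = curl_init();
	$ok = curl_setopt_array($ch, array(
		CURLOPT_URL            => $url,
		CURLOPT_RETURNTRANSFER => true, // return the data instead of printing it
		CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
		CURLOPT_MAXREDIRS      => 3,    // give up after 3 redirects
		CURLOPT_CONNECTTIMEOUT => 5,    // seconds to wait while connecting
		CURLOPT_TIMEOUT        => 10,   // seconds allowed for the whole transfer
	));
	// curl_setopt_array() returns false as soon as one option cannot be set.
	return $ok ? $ch : false;
}
```

The returned handle can then be passed to curl_exec() in the same way as in the earlier examples.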
