The PHP cURL functions use the libcurl library to let you connect to many kinds of servers over a range of protocols. If the user agent string is not explicitly set, no User-Agent header is sent to the web server. If you are scraping a website with cURL, there may be times when you need to specify the user agent string, and this post shows how to do it.
The PHP code
The following PHP code example gets the webpage at http://www.example.com/path/to/webpage, using Firefox 3.5.2 on Windows as the user-agent string. The output from the page is saved to the $html variable.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/path/to/webpage');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$html = curl_exec($ch);
CURLOPT_USERAGENT
The CURLOPT_USERAGENT line sets the user agent string. Whatever value you specify is sent as the User-Agent request header, and that is what will appear in the access logs of the server receiving the request.
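The same options can also be applied in one call with curl_setopt_array. The following is a minimal sketch rather than the only way to do it: the build_curl_options helper and the 'ExampleScraper/1.0' user agent string are hypothetical names chosen for illustration, and the request itself is left commented out.

```php
<?php
// Hypothetical helper (not part of PHP): collects the options for a
// request that sends a custom user agent string.
function build_curl_options(string $url, string $userAgent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,       // make curl_exec() return the body
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent header
    ];
}

// 'ExampleScraper/1.0' is a made-up user agent string for illustration.
$options = build_curl_options(
    'http://www.example.com/path/to/webpage',
    'ExampleScraper/1.0'
);

$ch = curl_init();
curl_setopt_array($ch, $options); // apply every option in a single call
// $html = curl_exec($ch);        // uncomment to actually send the request
```

Keeping the options in an array makes it easy to reuse the same user agent across several requests.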
CURLOPT_RETURNTRANSFER
The "curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);" line makes the call to curl_exec return the response body (here, the HTML of the web page) as a string instead of echoing it straight to the output. If the request fails, curl_exec returns false instead.
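Because curl_exec returns false on failure when CURLOPT_RETURNTRANSFER is set, it can be worth checking the result before using it. The sketch below wraps the request in a hypothetical fetch_page function; the function name and the 5 second connection timeout are assumptions made for this example, not something the original code requires.

```php
<?php
// Hypothetical helper (not part of PHP): fetches a URL with the given
// user agent and throws if the transfer fails.
function fetch_page(string $url, string $userAgent): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body, do not echo it
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // assumed timeout, in seconds

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_error() describes the last error on this handle
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    return $body;
}

// Example usage (performs a real request, so it is left commented out):
// $html = fetch_page('http://www.example.com/path/to/webpage', 'ExampleScraper/1.0');
```

Note that a false return only signals a transfer error; an HTTP error page such as a 404 is still returned as a normal string.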