Most of PHP’s file functions can open remote files over HTTP and other protocols as well as local files. By default PHP sends an empty user agent string when making an HTTP request, but you can change it to something else. This post looks at how to do this.
For example, if you were going to download a copy of an RSS file you could do this using the file_get_contents() function, where $rss is the full URL to the RSS file:
$xml = file_get_contents($rss);
It may be better to use cURL or another method with better error handling for a request like this, but that’s not the point of this post 🙂
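Even without cURL, the result of file_get_contents() can at least be checked, since it returns false on failure. The following is only a minimal sketch of that check; the fetch_feed() helper name is made up for illustration and is not part of PHP:

```php
<?php
// Hypothetical helper (not a PHP built-in): fetches a feed and throws
// an exception instead of silently returning false.
function fetch_feed(string $url): string
{
    // The @ suppresses the warning PHP raises on failure;
    // the return value is checked explicitly instead.
    $xml = @file_get_contents($url);
    if ($xml === false) {
        throw new RuntimeException("Could not read feed from $url");
    }
    return $xml;
}

// Usage, assuming $rss holds the full URL to the RSS file:
// $xml = fetch_feed($rss);
```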
The above request would look like this in an Apache access log file:
192.168.1.15 - - [01/Oct/2008:21:52:43 +1300] "GET / HTTP/1.0" 200 5194 "-" "-"
The last two "-" items in the above line are the referrer URL and the user agent string. If either was not specified by the browser, script or other client, it is logged as "-".
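If you want to pull those two fields out of a log line in a script, something like the following could work. It is only a sketch: the parse_combined_log_line() function and its regular expression are my own assumptions, they expect the log layout shown above, and they do not handle escaped quotes inside a field.

```php
<?php
// Hypothetical helper: splits one access log line (in the layout shown
// above) into its fields. Returns null if the line does not match.
function parse_combined_log_line(string $line): ?array
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return [
        'ip'         => $m[1],
        'time'       => $m[2],
        'request'    => $m[3],
        'status'     => (int) $m[4],
        // A "-" means the client did not send the value.
        'referrer'   => $m[6] === '-' ? null : $m[6],
        'user_agent' => $m[7] === '-' ? null : $m[7],
    ];
}

$entry = parse_combined_log_line(
    '192.168.1.15 - - [01/Oct/2008:21:52:43 +1300] "GET / HTTP/1.0" 200 5194 "-" "-"'
);
var_dump($entry['referrer'], $entry['user_agent']); // both are NULL for this line
```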
PHP has an ini setting called "user_agent" which lets you specify the user agent string when making these sorts of HTTP requests. It can be set anywhere: in httpd.conf, php.ini, .htaccess or in your scripts.
To set the user_agent value in your PHP script, use the ini_set() function as in the following example. This sets the user agent string to that of Firefox 3.0.3 on Windows, which is the current version of Firefox for Windows at the time of this post:
ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3');
Now when we make the same request as at the start of this post, the line in the Apache log file would look like this:
192.168.1.15 - - [01/Oct/2008:21:54:29 +1300] "GET / HTTP/1.0" 200 5193 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3"
The entry may wrap across lines here, but it appears on a single line in the log file. You can see that while the HTTP referrer is still empty, the user agent string is now populated with the value we specified.
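Changing the ini setting affects every HTTP request the script makes from then on. If you only want a particular user agent for one request, PHP's HTTP stream context has a "user_agent" option that can be passed to file_get_contents() instead. Here is a rough sketch of that approach; the user_agent_context() helper and the "MyFeedReader/1.0" string are placeholders I made up for the example:

```php
<?php
// Hypothetical helper: builds a stream context that carries the given
// user agent string for HTTP requests made with that context.
function user_agent_context(string $agent)
{
    return stream_context_create([
        'http' => ['user_agent' => $agent],
    ]);
}

// "MyFeedReader/1.0" is a placeholder value, not a real product.
$context = user_agent_context('MyFeedReader/1.0');

// Inspect the option that was stored in the context.
$options = stream_context_get_options($context);
echo $options['http']['user_agent'], "\n";

// Usage, assuming $rss holds the full URL to the RSS file:
// $xml = file_get_contents($rss, false, $context);
```

Requests made without the context still use whatever the user_agent ini setting says, so the two approaches can be mixed.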