Lynx is a command-line web browser that I often use to check redirection headers and content types in a web page’s HTTP headers. The previous post showed how to get the headers with Lynx; this post shows how to set the user agent.
Why use Lynx?
There are browser extensions for getting HTTP headers from a page, and the browser itself can often show this sort of information as well, but I find Lynx easier to use: I have a lot of control over settings such as the user-agent string, and I can run it from the command line on remote hosts as well as locally.
Setting the user-agent string with Lynx
It’s easy to set the user-agent string with Lynx; simply pass the browser string you want with the -useragent flag, for example:
lynx -useragent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.79 Safari/537.1" -head -dump http://www.example.com/
Note that if the string doesn’t include "Lynx" or "L_y_n_x", Lynx will complain about it like this:
Warning: User-Agent string does not contain "Lynx" or "L_y_n_x"!
This doesn’t really matter, but for what it’s worth, here’s the explanation from the Lynx Users Guide of why it does this:
Some sites may regard misrepresenting the browser as fraudulent deception, or as gaining unauthorized access, if it is used to circumvent blocking that was intentionally put in place. Some browser manufacturers may find the transmission of their product’s name objectionable. If you change the User-Agent string, it is your responsibility. The Options Menu issues a reminder whenever the header is changed to one which does not include "Lynx" or "L_y_n_x".
You can either choose to ignore the message, or add "Lynx" as part of the user agent string if you want to suppress the message 🙂
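If you do want to suppress the warning, a small helper can add the word for you. This is only a sketch: the function name ua_with_lynx is made up for illustration, and it relies solely on the rule quoted above (the string must contain "Lynx" or "L_y_n_x"). Bear in mind that the server then sees the modified string, so this may not suit tests where the site matches on the exact user agent.

```shell
# Hypothetical helper (not part of Lynx): append "Lynx" to a user-agent
# string unless it already contains "Lynx" or "L_y_n_x".
ua_with_lynx() {
  case "$1" in
    *Lynx*|*L_y_n_x*) printf '%s\n' "$1" ;;       # already acceptable, leave as-is
    *)                printf '%s Lynx\n' "$1" ;;  # add the word Lynx at the end
  esac
}

# Example use (needs lynx installed and network access):
# lynx -useragent="$(ua_with_lynx "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")" -head -dump http://www.example.com/
```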
Testing as Googlebot and Googlebot-Mobile
I’ve been setting up a mobile version of a website on an m-dot domain, with server-side redirection for mobile devices from the www version to the m version.
I used Lynx to check that the redirection worked as expected for Google’s bots, setting the user agent first to Googlebot and then to Googlebot-Mobile. The expected behaviour was:
a) that regular Googlebot would not be redirected off the www version of the site
b) that Googlebot-Mobile would be redirected off to the m version of the site
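Both expectations come down to two things in the response headers: the status code and, for a redirect, the Location header. The sketch below boils the headers down to just those two values. It assumes the headers arrive on standard input one per line, as they do from -head -dump; the function name summarise_headers is my own invention.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print
# the status code and the Location header, if there is one.
summarise_headers() {
  tr -d '\r' | awk '
    $1 ~ /^HTTP\//             { print "status: " $2 }    # e.g. HTTP/1.1 301 Moved Permanently
    tolower($1) == "location:" { print "location: " $2 }  # header names are case-insensitive
  '
}

# Example use (needs lynx installed and network access):
# lynx -useragent="..." -head -dump http://www.example.com/ | summarise_headers
```

For case a) you would expect a 200 status and no location line; for case b) a 301 or 302 status with a location on the m version of the site.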
I grepped the Apache logs for the user agent strings that Google had recently been using to spider one of my sites, and used the following command for each:
lynx -useragent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" -head -dump http://www.example.com/
lynx -useragent="Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" -head -dump http://www.example.com/
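For the log step mentioned above, something along these lines would pull out the distinct Googlebot user agent strings. It is a sketch rather than the exact command I ran: it assumes the logs are in Apache’s combined format, where the user agent is the last quoted field, and both the function name and the log path in the example are placeholders.

```shell
# Hypothetical helper: read combined-format access log lines on stdin and
# print each distinct user-agent string that mentions Googlebot.
googlebot_agents() {
  # Splitting on double quotes makes the user agent the sixth field:
  # prefix, "request", status and size, "referer", space, "user agent"
  awk -F'"' '$6 ~ /Googlebot/ { print $6 }' | sort -u
}

# Example use (the log path is an assumption and varies by system):
# googlebot_agents < /var/log/apache2/access.log
```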
So that’s how I used Lynx to fetch the HTTP headers and test whether redirection would affect Google’s bots. I hope this is useful to you too.