This is a summary of the posts I've written about PHP's file_get_contents() function when used to download remote content (e.g. webpages, XML files, images, etc.) and the CURL functions, which can be used to do the same thing.
PHP Manual Pages
As always, don't forget to read the relevant pages in the PHP manual. They don't always cover everything well and are sometimes short on examples, but they always make a good starting point when you want to know how a particular function works. The user-submitted comments are often useful too.
- Change the user agent string in PHP which shows how to set the user agent that the scripts and logs on the server you are downloading content from will see; this is useful to mask your script and make it appear to be a browser instead of a bot.
- Sending a username and password with PHP file_get_contents() using HTTP basic authentication, which is needed if a website is password protected. This won't help if you need to log in using a web form; it only covers HTTP authentication.
- Setting the user agent with PHP CURL which shows how to set the user agent for the same reasons as listed in the file_get_contents() post linked to above.
- Sending a username and password with PHP CURL; as with the equivalent file_get_contents() post listed above, this is for sending a username and password with HTTP basic authentication.
- PHP CURL and Cookies which shows how to configure CURL to store cookies in a cookie file. The file can be used straight away for session cookies, and in the longer term, if the same file is reused, for cookies that are set for a longer duration.
- Submitting a form post with PHP and CURL for doing just that 🙂
- Setting the http referer with PHP CURL in case you need to scrape a page that expects a referrer to be set.
- MAMP PHP cURL and SSL which deals with an issue where MAMP on Mac OS X is not able to access https:// URLs.
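
As a rough illustration of what the posts listed above cover, here is a minimal sketch that sets a user agent and basic authentication for file_get_contents() through a stream context, and does the same with CURL along with a cookie file, a referer and an optional form post. It is not the code from the individual posts. The function names, user agent string, credentials and file paths are all hypothetical placeholders, and the CURL parts assume the CURL extension is installed.

```php
<?php
// Sketch only: every function name and value here is a hypothetical placeholder.

/**
 * Build stream context options for file_get_contents() that set a
 * user agent and send HTTP basic authentication credentials.
 */
function buildHttpContextOptions(string $userAgent, string $username, string $password): array
{
    return [
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            // Basic authentication sends base64("username:password").
            'header'     => 'Authorization: Basic ' . base64_encode($username . ':' . $password),
        ],
    ];
}

/**
 * Build a CURL option array covering the user agent, basic authentication,
 * a cookie file, a referer and an optional form post.
 * Needs the CURL extension, which defines the CURLOPT_* constants.
 */
function buildCurlOptions(
    string $userAgent,
    string $username,
    string $password,
    string $cookieFile,
    string $referer,
    array $postFields = []
): array {
    $options = [
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_HTTPAUTH       => CURLAUTH_BASIC,
        CURLOPT_USERPWD        => $username . ':' . $password,
        CURLOPT_COOKIEJAR      => $cookieFile,  // cookies are saved here when the handle is released
        CURLOPT_COOKIEFILE     => $cookieFile,  // cookies are loaded from here
        CURLOPT_REFERER        => $referer,
    ];
    if ($postFields !== []) {
        // Send the fields as a URL-encoded form post.
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

/** Fetch a URL with file_get_contents(); returns the body, or false on failure. */
function fetchWithFileGetContents(string $url, array $contextOptions)
{
    return file_get_contents($url, false, stream_context_create($contextOptions));
}

/** Fetch a URL with CURL; returns the body, or false on failure. */
function fetchWithCurl(string $url, array $curlOptions)
{
    $handle = curl_init($url);
    curl_setopt_array($handle, $curlOptions);
    // The handle is released when it goes out of scope at the end of the function.
    return curl_exec($handle);
}
```

To try it, something like fetchWithFileGetContents('https://example.com/', buildHttpContextOptions('ExampleFetcher/1.0', 'user', 'secret')) should work, with example.com standing in for whatever site you are actually downloading from. Building the options separately from the request makes it easy to inspect what will be sent before touching the network.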