Extract inline image attachments from an email with PHP

A month ago I posted a PHP email message class for extracting attachments. Since then a couple of people have asked how to extract inline images and make them appear in the HTML content of the email. This post shows how to do that, using my email class to fetch the email.

The email class

The email class can be downloaded here. The file is a plain text PHP file, compressed as a zip file. I can’t really offer any support for this code other than with additional tutorials like this one. For the moment it is poorly documented but when I get more time I’ll do more work on it.

Inline images

Images can be embedded in the HTML content of an email and sent along with it. For “inline images” like this, the src attribute of the img tag starts with cid: followed by a unique identifier for the image. The image itself is added as a “message part” (more commonly known as an “attachment”) and can be looked up by that unique identifier with my email class.

The exact format of the cid varies depending on the email client and is not consistent. For example, I dragged and dropped two images (named 1.jpg and 2.jpg) into Apple’s Mail client and then into Gmail, and this is the resulting HTML from each.

Apple Mail:

I’ve removed the html and body tags in the example below so it’s not quite such a mess of tag soup.

<img id="9583f2e3-aa95-49dd-9576-a29dc4e2a72b" height="450" width="600" apple-width="yes" apple-height="yes" src="cid:3FE80658-A359-4A43-82EA-BEAEA7CEBAEA@local">
<img id="e710c040-0288-44c5-afb5-14cf5294879d" height="450" width="600" apple-width="yes" apple-height="yes" src="cid:BE4DF124-417B-4AFC-9DC9-F01818A942B2@local">

Gmail:

<img src="cid:ii_136a895175f87da3" alt="Inline images 1"><br>
<div><img src="cid:ii_136a895663ed0afd" alt="Inline images 2"><br></div>

So both have cid: followed by a unique identifier. Apple adds @local at the end, which is presumably the domain name; Gmail does not. Other email clients will format the identifier differently.
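
Under the hood, the identifier after cid: corresponds to the Content-ID header of the message part. That header normally wraps the identifier in angle brackets, and RFC 2392 allows the cid: form to be URL-encoded. Here is a minimal sketch of reducing both forms to the same bare identifier so they can be compared. normalizeContentId() is a hypothetical helper for illustration and is not part of my email class.

```php
<?php
// Hypothetical helper: reduce a Content-ID header value or a cid: URL
// to the bare identifier so the two can be compared.
function normalizeContentId(string $value): string
{
    $value = trim($value);
    // A value taken from an src attribute starts with the "cid:" scheme
    // and may be URL-encoded, so strip the scheme and decode it.
    if (stripos($value, 'cid:') === 0) {
        $value = rawurldecode(substr($value, 4));
    }
    // A value taken from the Content-ID header is wrapped in angle brackets.
    return trim($value, '<>');
}

// Both forms of the Gmail identifier reduce to the same string.
var_dump(normalizeContentId('<ii_136a895175f87da3>') === normalizeContentId('cid:ii_136a895175f87da3'));
```

Comparing normalized values like this avoids having to special-case each email client's format.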

Using the email class to download the message

Connect using the IMAP mail functions first and then use the email class to fetch the first message. The class doesn’t yet have functions for seeing how many messages are in the mailbox and so on; use the regular IMAP functions for that.

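$login = 'me@example.com';
$password = 'mypassword';
$server = '{imap.gmail.com:993/ssl/novalidate-cert}';
$connection = imap_open($server, $login, $password);
// imap_open() returns false on failure, so stop here if we could not connect
if ($connection === false) {
	die('Could not connect: ' . imap_last_error());
}
// fetch message number 1 (IMAP message numbers start at 1)
$emailMessage = new EmailMessage($connection, 1);
$emailMessage->fetch();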
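
To fetch more than the first message, the regular imap_num_msg() function reports how many messages are in the mailbox. The following is a rough sketch only: fetchAllMessages() is a hypothetical helper, and it assumes the imap extension is loaded and the EmailMessage class has already been included.

```php
<?php
// Sketch: fetch every message in the currently open mailbox.
// Assumes the imap extension and the EmailMessage class are available.
function fetchAllMessages($connection): array
{
    $messages = array();

    // imap_num_msg() returns the message count, or false on failure.
    $total = imap_num_msg($connection);
    if ($total === false) {
        return $messages;
    }

    // IMAP message numbers run from 1 to the message count.
    for ($number = 1; $number <= $total; $number++) {
        $message = new EmailMessage($connection, $number);
        $message->fetch();
        $messages[] = $message;
    }

    return $messages;
}
```

Call it as fetchAllMessages($connection) after a successful imap_open(). For a large mailbox you would probably want to process each message inside the loop rather than hold them all in memory.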

The structure of $emailMessage will be like this for the Gmail example:

EmailMessage Object
(
    [connection:protected] => Resource id #4
    [messageNumber:protected] => 1
    [bodyHTML] => <img src="cid:ii_136a895175f87da3" alt="Inline images 1"><br>
<div><img src="cid:ii_136a895663ed0afd" alt="Inline images 2"><br></div>

    [bodyPlain] => [image: Inline images 1]
[image: Inline images 2]

    [attachments] => Array
        (
            [ii_136a895663ed0afd] => Array
                (
                    [type] => 5
                    [subtype] => JPEG
                    [filename] => 2.jpg
                    [data] => ...
                    [inline] => 1
                )

            [ii_136a895175f87da3] => Array
                (
                    [type] => 5
                    [subtype] => JPEG
                    [filename] => 1.jpg
                    [data] => ...
                    [inline] => 1
                )

        )

    [getAttachments] => 
)
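
Going by the structure above, the inline images can be picked out of the attachments array by checking the inline flag. Here is a small sketch, assuming the array shape shown in the dump; listInlineAttachments() is a hypothetical helper and the sample data is copied from the Gmail example with the image data left empty.

```php
<?php
// Sketch: return identifier => filename for attachments flagged as inline.
function listInlineAttachments(array $attachments): array
{
    $inline = array();
    foreach ($attachments as $id => $attachment) {
        if (!empty($attachment['inline'])) {
            $inline[$id] = $attachment['filename'] ?? '';
        }
    }
    return $inline;
}

// Sample data in the same shape as $emailMessage->attachments.
$sample = array(
    'ii_136a895663ed0afd' => array('type' => 5, 'subtype' => 'JPEG', 'filename' => '2.jpg', 'data' => '', 'inline' => 1),
    'ii_136a895175f87da3' => array('type' => 5, 'subtype' => 'JPEG', 'filename' => '1.jpg', 'data' => '', 'inline' => 1),
);

print_r(listInlineAttachments($sample));
```

With a real message you would pass $emailMessage->attachments instead of $sample.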

We can now do a regular expression match to extract all the src="cid:…" attributes from bodyHTML and then replace them with a full http (or https) URL to the image on our filesystem.

preg_match_all('/src="cid:(.*)"/Uims', $emailMessage->bodyHTML, $matches);

The $matches array will contain the following:

Array
(
    [0] => Array
        (
            [0] => src="cid:ii_136a895175f87da3"
            [1] => src="cid:ii_136a895663ed0afd"
        )
    [1] => Array
        (
            [0] => ii_136a895175f87da3
            [1] => ii_136a895663ed0afd
        )
)
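
The pattern above only matches src attributes written with double quotes. Both clients I tested use double quotes, but if you want to allow for single quotes as well, a slightly more tolerant variant could look like the sketch below. extractCids() is a hypothetical helper, not something in my email class.

```php
<?php
// Sketch: collect the unique cid identifiers from img src attributes,
// accepting either single or double quotes around the value.
function extractCids(string $html): array
{
    // Group 1 captures the opening quote; \1 requires the same quote to close it.
    preg_match_all('/\bsrc\s*=\s*(["\'])cid:(.*?)\1/is', $html, $matches);

    // Group 2 holds the identifiers; drop duplicates and reindex.
    return array_values(array_unique($matches[2]));
}

$html = '<img src="cid:ii_136a895175f87da3" alt="Inline images 1"><br>'
      . '<div><img src="cid:ii_136a895663ed0afd" alt="Inline images 2"><br></div>';

print_r(extractCids($html));
```

If you do accept single quotes, remember that the search strings used in the replacement step need to match the same quote style.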

So we want to loop over the items in $matches[1], as that contains the unique identifier which is used as the array key for the attachments array. Each image can be saved to file with a unique name and the src attribute in the HTML changed to a full URL.

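// $matches always has two elements after preg_match_all, so check $matches[1] itself
if(!empty($matches[1])) {
	
	$search = array();
	$replace = array();
	
	foreach($matches[1] as $match) {
		// skip any cid that has no matching attachment
		if(!isset($emailMessage->attachments[$match])) {
			continue;
		}
		$uniqueFilename = "A UNIQUE_FILENAME.extension";
		file_put_contents("/path/to/images/$uniqueFilename", $emailMessage->attachments[$match]['data']);
		// the inner double quotes must be escaped inside a double quoted string
		$search[] = "src=\"cid:$match\"";
		$replace[] = "src=\"http://www.example.com/images/$uniqueFilename\"";
	}
	
	$emailMessage->bodyHTML = str_replace($search, $replace, $emailMessage->bodyHTML);
	
}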

I’ll leave it up to you to work out how you want to craft your unique filename. Note you’ll need to change /path/to/images to the actual path where the file should be saved, and http://www.example.com/images/ to the actual URL that points to that location.
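
As one possible starting point, the sketch below builds the filename from a hash of the image data and takes the extension from the attachment’s subtype. buildUniqueFilename() is a hypothetical helper, and the JPEG to jpg mapping and the bin fallback are my own assumptions.

```php
<?php
// Sketch: derive a filename from the attachment's data and subtype.
function buildUniqueFilename(array $attachment): string
{
    // Keep only letters and digits so the subtype is safe to use in a filename.
    $extension = strtolower(preg_replace('/[^A-Za-z0-9]/', '', $attachment['subtype']));
    if ($extension === 'jpeg') {
        $extension = 'jpg';
    }
    if ($extension === '') {
        $extension = 'bin';
    }

    // Identical image data always produces the same name.
    return sha1($attachment['data']) . '.' . $extension;
}

echo buildUniqueFilename(array('subtype' => 'JPEG', 'data' => 'example bytes')), "\n";
```

Hashing the data means the same image sent twice is only stored once, and it avoids trusting the filename supplied by the sender. If you would rather keep one file per email, include something like the message number in the name as well.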

Download the code from this post

To save you having to copy and paste, here are download links for the full PHP example from this page (which contains additional inline comments), and also the EmailMessage class:

PHP example
EmailMessage class

Both are gzip compressed and contain a single PHP plain text file.