

mail.google.com referer from outlook email

How did google get involved?

     
10:52 pm on Oct 16, 2018 (gmt 0)

New User

joined:Oct 16, 2018
posts: 5
votes: 2


First post, please excuse any ineptitudes.

I sent an email from a GoDaddy-based email account to an example at live.co.uk address. I routinely include images in emails for basic tracking, served from an Apache server on my home PC.

The email was sent with a timestamp of 14 Oct 2018 09:39:47 -0500.
This is what popped up in the access log:
66.249.73.196 - - [14/Oct/2018:09:39:52 -0500] "GET /robots.txt HTTP/1.1" 404 208 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.90.167 - - [14/Oct/2018:09:39:52 -0500] "GET /dox/graphic.jpg HTTP/1.1" 200 14855 "http://mail.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246 Mozilla/5.0"


I'm assuming the recipient looked at the email right after he received it. The UA is a fairly old Edge, looking at an Outlook-based email, so how did Google get into the act? And why the attempt to get robots.txt? Normally the images are just accessed, with no other attempts to search the site. I checked back through my logs and this only happened once before, back in February of this year, with the same recipient and the same user agent, including the Edge version. These are also the only examples in my logs where the UA had that extra "Mozilla/5.0" tacked on the end; that isn't a typo.
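
For anyone who wants to search their own logs for the same pattern, here is a minimal Python sketch, assuming the server writes Apache's "combined" log format as in the lines above. The names `COMBINED`, `parse_line` and `has_duplicate_mozilla_token` are made up for this example; the sample line is the image hit quoted in this post.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one combined-format line, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

def has_duplicate_mozilla_token(agent):
    """True when the UA starts with Mozilla/5.0 and has it tacked on again at the end."""
    return agent.startswith("Mozilla/5.0") and agent.rstrip().endswith(" Mozilla/5.0")

# The image request from the access log quoted above.
sample = ('66.249.90.167 - - [14/Oct/2018:09:39:52 -0500] '
          '"GET /dox/graphic.jpg HTTP/1.1" 200 14855 "http://mail.google.com/" '
          '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
          '(KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246 Mozilla/5.0"')

hit = parse_line(sample)
print(hit["host"], hit["referer"], has_duplicate_mozilla_token(hit["agent"]))
```

Running `parse_line` over every line of the access log and filtering on `has_duplicate_mozilla_token` would list only the hits with the odd trailing token, which makes it easier to compare their IPs and referers.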

There were two other emails to same recipient without this oddity, only a straight access of the image files.

Does anyone have an idea of what's going on here? Thanks for any help you can give. [I have another even weirder example I'll post later]

[edited by: engine at 8:18 am (utc) on Oct 18, 2018]
[edit reason] examplified [/edit]

8:44 am on Oct 18, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 893


Hi mechtheist and welcome to WebmasterWorld [webmasterworld.com]

...so how did Google get into the act? And why the attempt to get robots.txt?
It appears the recipient opened the email in mail.google.com. That means they are signed into their Google account. If this is not where you sent the email, the recipient may be forwarding to their Google mail account (I actually do this).

The IP address confirms that the Googlebot UA is genuine. If Googlebot hasn't crawled your pages in a while, it would request robots.txt first. I see no mystery in the Googlebot visit.
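
For anyone who wants to repeat that check on their own logs, here is a minimal Python sketch of the reverse-then-forward DNS lookup that Google documents for verifying Googlebot. The function name `is_verified_googlebot` and the injectable `reverse`/`forward` parameters are made up for this example, and the hostname in the demo is a stand-in value, not the result of a real lookup.

```python
import socket

def _reverse(ip):
    # PTR lookup: returns the primary hostname registered for the address.
    return socket.gethostbyaddr(ip)[0]

def _forward(hostname):
    # A-record lookup: returns the list of IPv4 addresses for the hostname.
    return socket.gethostbyname_ex(hostname)[2]

def is_verified_googlebot(ip, reverse=_reverse, forward=_forward):
    """Reverse-then-forward DNS check for an address that claims to be Googlebot."""
    try:
        host = reverse(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    # The reverse name must sit under one of Google's crawler domains.
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # The forward lookup must map back to the original address.
        return ip in forward(host)
    except OSError:
        return False

# Offline demo with stand-in resolvers (assumed values, no network access).
fake_reverse = lambda ip: "crawl-66-249-73-196.googlebot.com"
fake_forward = lambda host: ["66.249.73.196"]
print(is_verified_googlebot("66.249.73.196", fake_reverse, fake_forward))
```

Calling `is_verified_googlebot("66.249.73.196")` without the extra arguments performs the real DNS lookups. The resolvers are injectable only so the logic can be exercised without a network connection.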

As for the odd Edge UA, I have no idea. Yes, it is strange, but so are a lot of log entries, and I do this a lot :)
9:07 am on Oct 18, 2018 (gmt 0)

New User

joined:Oct 16, 2018
posts: 5
votes: 2


Hey keyplyr, thanks for the reply. Forwarding their email, I hadn't thought of that. It makes sense, but it's kinda weird: who would use an old Edge version to sign into their gmail account to check their forwarded outlook/live account email? I can test that easily enough.

If you like figuring out weird log entries, please keep an eye out for my next post. I'll get it up within a few hours; it's got a lot of weird on multiple levels. I've already tried a couple of forums and gotten no responses yet.

Thanks again, and thanks for the welcome.
11:52 am on Oct 19, 2018 (gmt 0)

New User

joined:Oct 16, 2018
posts: 5
votes: 2


I tried forwarding from my live.com email to Gmail, logged into my Gmail in Edge and looked at the forwarded email, and this is what shows in the access log:

66.102.7.82 - - [19/Oct/2018:03:55:40 -0500] "GET /dox/testgraphic.png HTTP/1.1" 200 2112521 "-" "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)"


Gmail fetched the image through its image proxy, with no mail.google.com referer and no Edge UA, so it looks like some other weirdness is going on.
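
To separate these kinds of hits when scanning a log, a rough Python sketch could bucket each request by tokens in its user agent. The tokens are the ones that appear in the log lines in this thread; the function name `classify_agent` and the labels it returns are made up for this example, and a matching token only shows what the client claims to be.

```python
def classify_agent(agent):
    """Bucket a User-Agent string by the tokens seen in this thread's log lines."""
    if "GoogleImageProxy" in agent:
        return "gmail-image-proxy"   # image fetched via Google's image proxy
    if "Googlebot" in agent:
        return "googlebot-claimed"   # a claim only; the IP still needs checking
    if "Thunderbird/" in agent:
        return "thunderbird"         # desktop mail client fetching directly
    return "other"                   # ordinary browsers and anything unrecognised

proxy_ua = ("Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 "
            "(via ggpht.com GoogleImageProxy)")
print(classify_agent(proxy_ua))
```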
5:31 pm on Oct 19, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:June 20, 2006
posts:2137
votes: 85


"who would use an old Edge version to sign into their gmail account to check their forwarded outlook/live account email?"

What if it's not an auto forward, but a manual forward... like I get the email, and think to myself "mom would love this", and I forward it to her, and she's using the default browser that came with her computer, Edge...
8:06 pm on Oct 19, 2018 (gmt 0)

New User

joined:Oct 16, 2018
posts: 5
votes: 2


I don't think that would change how Gmail handles embedded images; you'd likely still see the image proxy as the UA. Plus, the timing makes a manual forward unlikely, as does the email content.

OK, I thought maybe a non-browser email client was used, so I tried Thunderbird. It displayed the image without re-downloading it: I had used Thunderbird to send the message, so it already had the image, and I couldn't figure out how to make it fetch it again, so there were no hits in the access log. I even edited the image, but that didn't help, and there was no '304' hit in the log, so it wouldn't have worked anyway. So I tried installing Thunderbird on another PC and this is what I got:

10.0.0.1 - - [19/Oct/2018:14:19:34 -0500] "GET /dox/graphic.png HTTP/1.1" 200 2186946 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 Lightning/6.2.2.1"



AAARRRGGGHH. No referer, but also no image proxy, so maybe this is close. But that still wouldn't work: it would have to be some web-based email client that could log into Gmail from a browser, since the UA is Edge. I checked the headers of all of these emails and found no "mail.google.com" anywhere. This is the closest bit in the header:

Received: from NAM04-BN3-obe.outbound.protection.outlook.com (mail-bn3nam04lp0115.outbound.protection.outlook.com. [216.32.180.115]) by mx.google.com with ESMTPS id q124-v6si15327162iod.118.2018.10.19.01.48.20
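
To pull the relay hosts out of headers like this one, a minimal Python sketch using the standard library's `email` parser might look like the following. The names `RECEIVED` and `received_hops` are made up for this example, and the regex is a rough assumption about the usual "from ... by ..." layout rather than a full parser for Received lines.

```python
import re
from email import message_from_string

# Rough pattern for "from <host> ... by <host>" inside one Received value.
RECEIVED = re.compile(r"from\s+(?P<from>\S+).*?\bby\s+(?P<by>\S+)", re.S)

def received_hops(raw_headers):
    """Return (from_host, by_host) pairs for each Received header, top to bottom."""
    msg = message_from_string(raw_headers)
    hops = []
    for value in msg.get_all("Received", []):
        m = RECEIVED.search(value)
        if m:
            hops.append((m.group("from"), m.group("by")))
    return hops

# The header quoted above, followed by the blank line that ends the header block.
raw = (
    "Received: from NAM04-BN3-obe.outbound.protection.outlook.com "
    "(mail-bn3nam04lp0115.outbound.protection.outlook.com. [216.32.180.115]) "
    "by mx.google.com with ESMTPS id q124-v6si15327162iod.118.2018.10.19.01.48.20\n"
    "\n"
)
print(received_hops(raw))
```

The "by" host in each pair is the server that accepted the message at that hop, so listing the pairs shows which mail systems handled a given email.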


I'm well out of my depth here. I don't know why there is no referer. It's possible the recipient used some other email client that would send the referer. I was going to try Opera, since I thought it had an email client built in, but now all I see is WhatsApp and FB Messenger.

Maybe I should have mentioned this sooner: the recipient is a Brit, so it's really an xxx@live.co.uk address. Maybe that would change something?

Oh well, thanks for the replies, maybe someone with a lot more knowledge in this crap will see this.