Home / Forums Index / WebmasterWorld / Webmaster General
Forum Library, Charter, Moderators: phranque

Webmaster General Forum

Android and referers

 8:25 pm on May 29, 2014 (gmt 0)

Was going to post this in SSID and then realized it's neither an ID question nor a search-engine question; it's about behavior as reflected in logs.

Look at this sequence:
184.151.37.xyz - - [28/May/2014:19:44:03 -0700] "GET /fonts/hamlet.html HTTP/1.1" 200 7861 "http://www.google.ca/" "Mozilla/5.0 (Linux; Android 4.3; SM-N900W8 Build/JSS15J) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.114 Mobile Safari/537.36"
184.151.37.xyz - - [28/May/2014:19:44:03 -0700] "GET /sharedstyles.css HTTP/1.1" 200 6316 "http://www.google.ca/" "{same UA}"
184.151.37.xyz - - [28/May/2014:19:44:03 -0700] "GET /fonts/fontstyles.css HTTP/1.1" 200 6401 "http://www.google.ca/" "{same UA}"
184.151.37.xyz - - [28/May/2014:19:44:03 -0700] "GET /fonts/fontcheck.js HTTP/1.1" 200 6710 "http://www.google.ca/" "{same UA}"
184.151.37.xyz - - [28/May/2014:19:44:03 -0700] "GET /fonts/images/bigH.png HTTP/1.1" 200 600 "http://www.google.ca/" "{same UA}"
184.151.37.xyz - - [28/May/2014:19:44:03 -0700] "GET /fonts/images/fonts-icon.png HTTP/1.1" 200 915 "http://www.google.ca/" "{same UA}"
184.151.37.xyz - - [28/May/2014:19:44:03 -0700] "GET /piwik/piwik.js HTTP/1.1" 200 21990 "http://www.google.ca/" "{same UA}"
184.151.37.xyz - - [28/May/2014:19:44:05 -0700] "GET /favicon.ico HTTP/1.1" 200 661 "http://www.google.ca/" "{same UA}"

This is a human. All requests for supporting files, including the favicon (which normally has no referer at all), give the same referer as the page. (piwik.php lives on a different site, so it isn't included in these logs, but it too gives google.ca as referer.)

First sighting: Last September, sporadic since then but becoming more frequent.

UA: Linux; Android 4.x (never the version with U; in the middle, and never Android 3-or-earlier)
Browser: Chrome (never Safari as such), assorted recent versions

Referer: http(s)://www.google.various
This might be a red herring or it might be an app. Always with no visible query, even the http forms. Plain google, not image search.
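For anyone wanting to pick these visits out of their own logs, here is a rough sketch of the pattern described above: Android 4.x without the "U;" token, Chrome, and a bare Google referer with no query. The function name and the regex are made up for illustration and are only assumptions about how to express that pattern; the host check is deliberately loose and would need tightening for real use.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: "(Linux; Android 4.x; ...)" with no "U;" token,
# followed somewhere later by a Chrome version token.
UA_RE = re.compile(r'\(Linux; Android 4\.[^;)]*;[^)]*\).*\bChrome/\d+')

def matches_reported_pattern(ua, referer):
    """True if UA and referer both fit the pattern described in the post."""
    if not UA_RE.search(ua):
        return False
    parts = urlsplit(referer)
    host = parts.hostname or ""
    # Bare Google homepage: http or https, www.google.<tld>, no path, no query.
    # Loose check: it would also accept hosts like www.google.example.com.
    return (parts.scheme in ("http", "https")
            and host.startswith("www.google.")
            and parts.path in ("", "/")
            and not parts.query)
```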

Question: Is this an inherent behavior of Android Chrome (meaning that I'll have to tweak my log-processing functions to remove spurious image requests)?
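If the log-processing side does need tweaking, one possible approach is sketched below. It assumes Apache combined log format (as in the lines quoted above) and drops any supporting-file request that arrives from the same IP shortly after a page request and carries the same referer as that page. The function names, the 10-second window, and the "ends in / or .html" test for pages are all assumptions chosen for illustration, not anything confirmed in this thread.

```python
import re
from datetime import datetime, timedelta

# Apache combined log format, as in the sample lines above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

PAGE_RE = re.compile(r'(/|\.html?)$')   # assumption: what counts as a "page"
WINDOW = timedelta(seconds=10)          # assumption: how long a page load lasts

def parse(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LOG_RE.match(line)
    if not m:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    return rec

def drop_inherited_referers(lines):
    """Keep page requests; drop supporting files that repeat the page's referer."""
    last_page = {}   # ip -> (time, referer) of the most recent page request
    kept = []
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        path = rec["path"].split("?", 1)[0]
        if PAGE_RE.search(path):
            last_page[rec["ip"]] = (rec["time"], rec["referer"])
            kept.append(rec)
            continue
        prev = last_page.get(rec["ip"])
        if prev and rec["referer"] == prev[1] and rec["time"] - prev[0] <= WINDOW:
            continue   # same referer as the page just loaded: treat as inherited
        kept.append(rec)
    return kept
```

Supporting files that arrive with a different referer, or outside the window, are kept so that genuine hotlinks and image-search hits still show up.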



 7:09 pm on Jun 2, 2014 (gmt 0)

"This might be a red herring or it might be an app"

I don't know the answer to your question, but you set me to wondering what kind of app does general Google searches, and how you can identify it in your logs, if you can. I don't think I've used any apps that do general Google searches, but I don't have much experience using apps of any kind, so I don't know much about the different types.


 6:18 pm on Jun 5, 2014 (gmt 0)

"wondering what kind of an app does general Google searches, and how you can identify it in your logs, if you can"

The Google app is one of that vast family of apps that are simply websites dressed up in app clothing so you can get there by touching an icon instead of having to navigate a bookmarks menu in your mobile browser. I guess. It only just occurred to me that I can try this for myself, using the Google app on the iPad. The resulting log entry looks like this, with all edits in {braces}:
{my IP here} - - [05/Jun/2014:11:09:21 -0700] "GET /hovercraft/tango.html HTTP/1.1" 200 9731 "http://www.google.com/url?sa=t&rct=j&q={exact+search+string+here}&source=web&cd=25&ved=0CBUQFjAEOBQ&url=http%3A%2F%2Fexample.com%2Fhovercraft%2Ftango.html&ei={buncha letters & numbers}&usg={more letters & numbers}" "{my iPad's UA here}"

Interesting that they include the full search term. They're about the only google entity that still does. (Notice the "cd=25"? Bing would put me on page 1 for this particular search.)
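Since the referer in that log line carries everything in its query string, pulling the pieces out is straightforward. Below is a minimal sketch using Python's standard urllib.parse; the function name is hypothetical, the host check is a loose assumption, and treating "cd" as the result position simply follows the reading given above rather than any documented meaning.

```python
from urllib.parse import urlsplit, parse_qs

def google_referer_info(referer):
    """Extract search term, 'cd' value and target URL from a Google referer.

    Returns None for non-Google referers. For a bare referer such as
    http://www.google.ca/ every field comes back as None.
    """
    parts = urlsplit(referer)
    host = parts.hostname or ""
    # Loose host check (assumption): www.google.<anything>
    if not host.startswith("www.google."):
        return None
    params = parse_qs(parts.query)   # decodes + and %XX escapes
    cd = params.get("cd", [""])[0]
    return {
        "query": params.get("q", [None])[0],
        "position": int(cd) if cd.isdigit() else None,
        "target": params.get("url", [None])[0],
    }
```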

Or did you mean who does the crawling? I'm pretty sure they have separate tablet apps and smartphone apps. Tablets fall on the "browser" side of the user-agent divide (that is, they use @media screen CSS), so it would just be the ordinary Googlebot.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved