Forum Moderators: Robert Charlton & goodroi

the imaginary /m/ and Googlebot

lucy24

10:58 pm on Dec 4, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Has anyone else found Googlebot asking for /m/ and-that's-all? What triggers it? Is it another of those automated functions, like asking for /index.html even if you don't use it in URLs? I've certainly done nothing to make them think I've got any separate mobile sites; after all, they can see my responsive CSS.

All requests come from the ordinary 66.249.64-79 crawl range, with the ordinary mobile-Googlebot UA (the Linux Android one). On one site they asked a total of 4 times from late May to early June. Everywhere else it's been just one request, at some random time in the last few weeks.

Most recently there have been a couple of different requests in the form
/m/real-directory/real-subdir/realpage.html (new directories and pages each time)
wtf? These leave me wondering if they've found a link on somebody's bona fide mobile site. (If so, they're sites with even less traffic than me ;) because I don't find any /m/ referers for these specific pages, though /m/ does crop up now and then as a referer for other pages.)
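
For anyone who wants to check their own logs for the same thing, here is a minimal sketch in Python. It assumes the access log is in the common "combined" format, writes the 66.249.64-79 range as 66.249.64.0/20, and treats both /m/ and /mobile/ as prefixes worth flagging. The function name `mobile_probe` and the regex are illustrative, not taken from any particular log tool.

```python
import ipaddress
import re

# The 66.249.64-79 range from the post, expressed as a CIDR block.
CRAWL_RANGE = ipaddress.ip_network("66.249.64.0/20")

# Assumed "combined" log format:
# IP ident user [time] "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

# Path prefixes to flag (assumption: both variants are of interest).
MOBILE_PREFIXES = ("/m/", "/mobile/")


def mobile_probe(line):
    """Return (time, path, status) if the line is a Googlebot request for an
    /m/ or /mobile/ path from the crawl range, otherwise None."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    try:
        ip = ipaddress.ip_address(match["ip"])
    except ValueError:
        return None  # first field was a hostname or otherwise not an IP
    if ip not in CRAWL_RANGE:
        return None
    if "Googlebot" not in match["ua"]:
        return None
    if not match["path"].startswith(MOBILE_PREFIXES):
        return None
    return match["time"], match["path"], int(match["status"])
```

Feeding each log line through `mobile_probe` and keeping the non-None results gives the dates and paths to compare across sites. A user-agent string can be faked, so the IP check does most of the work here; a bare /m with no trailing slash would need its own prefix entry.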

not2easy

5:08 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I think it can only be an automated function, another form of "let's try this and see what we get", because I have seen requests for /m/ only and /mobile/ only, and for real pages prefixed with both of those nonexistent directories. I look at it as more of a survey than a crawl.

keyplyr

5:11 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I agree. Google is checking how sites are set up for mobile: separate /m/ copy or responsive. The new mobile index is being built.

not2easy

6:04 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I've been seeing the /mobile/ 404's for quite a long time so they must have started their checking long before their mobile index was announced. The /m/ queries are more recent.

keyplyr

6:14 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



More than one purpose? What a concept :)

lucy24

7:10 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"I've been seeing the /mobile/ 404's for quite a long time"

I would have been prepared to say I've never seen a request for /mobile/ -- but doing a quick search of raw logs, I find a few right around the same time (May/June) that they seemed determined to find /m/ on one minor site. (Hm. Why that site, only? Are they hinting that there really should be a mobile version?) And then one more on yet another secondary site in early November. Nothing earlier, though.

And then there was the time Google Search claimed that my test site--which is completely roboted-out--existed in an /m/ version. (Middle of last year, based on timestamped screen shot.) So the hypothetical searcher was being told “We don't know anything at all about this site, so we're going to assume it does have an /m/ division.”

:: irritably wondering what cat thinks he will achieve by repeatedly fussing at cat door and then immediately leaving the room every time I approach to open it for him ::

keyplyr

7:28 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That's it... you figured out that enormously enigmatic algo - Googlebot is a cat!

Wilburforce

8:24 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"I've been seeing the /mobile/ 404's for quite a long time"


Yes. I didn't make a note of when I started noticing /m 404s, but that wasn't yesterday either.

I'm also noticing 404s for URLs that are missing extensions and/or partly truncated: e.g. /widgets (where /widgets.htm exists), or /blue- (where /blue-widgets.htm exists).

I'm not sure how or whether this might be related: probably not, as they are all Desktop, and the /mobile and /m 404s are all Smartphone, but I am curious. Anyone else?

keyplyr

8:40 am on Dec 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Wilburforce - the extensionless and truncated Googlebot requests have been discussed at least twice at WW with no solution AFAIK.

Happens with Bingbot as well. I traded emails with Bing support a couple of years ago, and their story was that a corrupted DB was to blame and that it would eventually resolve itself. It didn't, and later Googlebot started doing it as well.

Whatever the cause, it doesn't seem to affect the index, so I just filter it out of my logs and disregard it.

I see no connection between the mobile sub-directory requests and those.
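
A minimal sketch of that kind of filter, in Python: it assumes you can supply a set of the site's real paths, and it flags a requested path as "truncated" when it is not a real path itself but some real path starts with it. The name `looks_truncated` and the length guard are illustrative assumptions.

```python
def looks_truncated(requested, known_paths):
    """Return True if `requested` is not a real path but is the beginning
    of at least one real path (e.g. /widgets vs. /widgets.htm)."""
    if requested in known_paths:
        return False
    if len(requested) < 2:
        # A lone "/" would be a prefix of every path, so ignore it.
        return False
    return any(real.startswith(requested) for real in known_paths)


# Example with the paths mentioned earlier in the thread.
known = {"/widgets.htm", "/blue-widgets.htm"}
for path in ("/widgets", "/blue-", "/widgets.htm", "/gadgets"):
    print(path, looks_truncated(path, known))
```

The first two paths are flagged and the last two are not, so the flagged 404s can be dropped from a log report while genuine 404s stay visible.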

lucy24

10:31 pm on Dec 5, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"I'm also noticing 404s for URLs that are missing extensions and/or partly truncated: e.g. /widgets (where /widgets.htm exists), or /blue- (where /blue-widgets.htm exists)."

That one sounds very much like the standard "entrapment" request (as with /index.html): if you've got an extension, see if extensionless also yields valid content.

They don't seem to have tried this with me. The rare googlebot requests for /\w+ leading to a 404 are pretty obviously following someone else's mis-typed link, often stopping in mid-word. Matter of fact, I wouldn't be surprised if some HTML generators (handling UGC, forums, that kind of thing) hypercorrect by putting the </a> tag immediately before the first . they meet. This would lead to a spurious extensionless request for anyone--including a search engine--following the link.
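
To make the guess concrete, here is a hypothetical autolinker in Python that behaves that way. It is not the code of any real forum package; the regex simply stops the link at the first period in the path, which is one way a generator might try to avoid swallowing sentence-ending punctuation.

```python
import re

# Hypothetical over-cautious pattern: scheme and host, then a path that
# ends at the first "." or whitespace it meets.
NAIVE_URL = re.compile(r'https?://[^\s/]+(?:/[^\s.]*)?')


def naive_autolink(text):
    """Wrap each URL-like run in an <a> tag, closing the link before the
    first period in the path."""
    return NAIVE_URL.sub(
        lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)), text
    )


print(naive_autolink("See http://example.com/widgets.htm for details"))
# prints: See <a href="http://example.com/widgets">http://example.com/widgets</a>.htm for details
```

Anyone following that link, crawler or human, ends up requesting /widgets instead of /widgets.htm.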

Admittedly it would make more sense the other way around: if you've got extensionless URLs, like /widgets, check periodically to see if (a) /widgets.xtn (especially a language extension such as .php) or (b) /widgets/ leads to a 200-class response. Or maybe they do do this; anyone know?