
Search Engine Spider and User Agent Identification Forum

Googlebot bloopers

WebmasterWorld Senior Member 5+ Year Member

Msg#: 4414115 posted 1:04 am on Feb 4, 2012 (gmt 0)

Hard to say whether Googlebot's following these "WebResult"-containing URIs from somewhere or causing them. They're huge -- the two I've been hit with were 558 and 466 characters long. Here's how they begin:


They include different dates and different content from who knows where, all separated by ASCII and who knows what.

I found mention of the same messes here: [google.co.uk...] (Note: The error generated is not the important part, the matching URIs are.) I reckon it's easier for you to see what they've posted than to slog through my including strikingly similar, huge URIs.

And here's a description of the 'decoded' URIs from that thread. It's akin to what I'm seeing:

"It appears to be based on a malformed XML with elements <web:Results>, <web:Title>, <web:Description>, and <web:URL> that should end between these two elements </web:WebResult><web:WebResult> and contains information on at least two different sites."

Yep. Yep. The structure's identical but the parts differ. And the two sites included in each URI share a word but aren't otherwise related, like apple and appleaday.

Ultimately, that thread's so-called 'helper' summary --

"I see any more time spent on this issue as purely an academic exercise."

-- struck me as a rude, 'It's all in your head' dodge to avoid admitting: "I don't know."

Anyway. Anyone else seeing massive "WebResult" URIs all of a sudden? They almost look as if Googlebot has scraped someone else's search results, or is barfing up parts of its own indices. Beats me.
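
If you want to check your own logs for the same thing, here's a rough Python sketch. It assumes Apache/nginx "combined" log format, and the function name, the regex, and the "WebResult" / "Googlebot" defaults are my own choices for illustration, not anything from Google. It percent-decodes each requested path and keeps the ones that contain the marker and claim a Googlebot user agent. Note that it only matches the UA string, so it can't tell real Googlebot from a spoofer.

```python
import re
from urllib.parse import unquote

# Assumed layout (combined log format):
#   ... "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def find_webresult_hits(lines, marker="WebResult", ua_token="Googlebot"):
    """Return (raw_length, decoded_path) for each matching request.

    A request matches when its user-agent string contains ua_token and
    its percent-decoded path contains marker. Lines that don't fit the
    assumed log layout are skipped.
    """
    hits = []
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue
        if ua_token not in m.group("ua"):
            continue
        raw_path = m.group("path")
        decoded = unquote(raw_path)  # turns %3C into <, %3E into >, etc.
        if marker in decoded:
            hits.append((len(raw_path), decoded))
    return hits
```

To run it over a log, something like `with open("access.log") as f: hits = find_webresult_hits(f)` should do, where `access.log` stands in for wherever your server writes its log. Sorting the hits by the first element puts the longest URIs at the end.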


