Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Google Crawling Out of Referral logs?

 11:20 am on Apr 21, 2010 (gmt 0)

Google started indexing SERPS from our new site search engine. The question comes up, HOW are they getting those search links? These are full keyword searches as performed by users.

When someone searches Google/bing/yahoo/alltheweb and then comes to WebmasterWorld - we highlight the page with those kw's and we also print "try this search on webmasterworld" with a full keyword link to perform a search on our site search engine. So a raw http link is there on the screen, only when someone kicks out a referral from a search engine. How does that link get into GoogleBot?
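The mechanism described above can be sketched roughly as follows. This is only an illustration, not the code WebmasterWorld actually runs: the `https://example.com/search` base URL, the `terms` parameter, and both function names are assumptions made for the example, and the engine-to-parameter table is a small assumed subset.

```python
from urllib.parse import urlparse, parse_qs, urlencode

# Query parameter that carries the search phrase for each engine (assumed subset).
ENGINE_QUERY_PARAMS = {
    "google": "q",
    "bing": "q",
    "yahoo": "p",
    "alltheweb": "q",
}


def keywords_from_referrer(referrer):
    """Return the search phrase from a search-engine referrer, or None."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    for engine, param in ENGINE_QUERY_PARAMS.items():
        if engine in host:
            values = parse_qs(parsed.query).get(param)
            if values:
                return values[0].strip() or None
    return None


def site_search_link(referrer, base="https://example.com/search"):
    """Build the 'try this search' link, or None for non-search referrers.

    The base URL and the 'terms' parameter are hypothetical.
    """
    keywords = keywords_from_referrer(referrer)
    if keywords is None:
        return None
    return base + "?" + urlencode({"terms": keywords})
```

The point of the sketch is that the link only exists in the HTML served to a visitor who arrived with a search-engine referrer, so a crawler fetching the page directly would never see it.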

The only way I can think of is if Google is reading pages via the toolbar or via that Google accelerator proxy? Or is this just a reconfirmation that Google is crawling out of its referral logs?


brotherhood of LAN

 11:33 am on Apr 21, 2010 (gmt 0)

Perhaps you could try altering the order of the search GET variables to work out which chain of events is responsible. Maybe too late now though?
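One way to read that suggestion, as a rough sketch: give every place that prints the search link its own ordering of the same GET variables, then look at which ordering shows up in the crawler's requests. The placement names, the `terms`/`type`/`page` variables, and the `/search` path below are all made up for the example.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical parameter names; each placement of the link gets its own order.
PARAM_ORDERS = {
    "referral_banner": ("terms", "type", "page"),
    "results_page": ("type", "terms", "page"),
    "sitewide_form": ("page", "type", "terms"),
}


def build_link(placement, terms, base="/search"):
    """Emit the search link with the parameter order assigned to a placement."""
    values = {"terms": terms, "type": "all", "page": "1"}
    ordered = [(name, values[name]) for name in PARAM_ORDERS[placement]]
    return base + "?" + urlencode(ordered)


def placement_of(request_uri):
    """Map a logged request URI back to the placement that emitted it."""
    _, _, query = request_uri.partition("?")
    seen = tuple(name for name, _ in parse_qsl(query))
    for placement, order in PARAM_ORDERS.items():
        if seen == order:
            return placement
    return None
```

This only tells you anything if the crawler requests the URL exactly as it was printed; a crawler that reorders or drops parameters would come back as `None`.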


 11:36 am on Apr 21, 2010 (gmt 0)

via the spybar


 2:19 pm on Apr 21, 2010 (gmt 0)

Could it be Firefox (Safe Browsing) URL data that is collected as well?


 2:49 pm on Apr 21, 2010 (gmt 0)

What if someone does a WebmasterWorld search, then links to one of the search results that includes the highlighted term in the URL? Is that the URL pattern you're seeing requested by Googlebot?

You might be able to find some of these links if you download and scan through the "Links to your site" report from Webmaster Tools.

This could be a good case for the canonical tag.


 3:59 pm on Apr 21, 2010 (gmt 0)

Most logical answer for me is that it probably crawls the Web History that Google keeps from users logged into a G account.


 8:02 pm on Apr 21, 2010 (gmt 0)

> Could it be a FireFox(safebrowsing)

Might be Chrome safe browsing too.


 10:15 pm on Apr 21, 2010 (gmt 0)

Brett_Tabke, my understanding is no, not directly. I have seen this on one of our large sites: gbot is testing bogus query strings to see if they return a proper 404 error. In our case I managed to fix a major issue [snip potential security issue]

How gbot discovered it: as said above, it only needs one on-page link or bookmark that the gtoolbar can follow to a search result page from a user, and that gets passed on for the next crawl. Because it's a search string, gbot will test bogus search requests while it is at it.
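For what "return a proper 404 error" could look like on the site's side, here is a minimal sketch of the decision a search handler might make. The allowed parameter names (`terms`, `page`) and the function name are assumptions for illustration, and whether an empty search deserves a 404 rather than some other response is a design choice, not something settled in this thread.

```python
from urllib.parse import parse_qs

# Hypothetical whitelist of parameters the search page understands.
ALLOWED_PARAMS = {"terms", "page"}


def status_for_search_request(query_string):
    """Return 200 for a well-formed search request, 404 for anything else."""
    params = parse_qs(query_string, keep_blank_values=True)

    # Unknown parameters mean the URL was not produced by the site itself.
    if set(params) - ALLOWED_PARAMS:
        return 404

    # A search with no terms has nothing to show.
    terms = params.get("terms", [""])[0].strip()
    if not terms:
        return 404

    # The page number, when present, must be a positive integer.
    page = params.get("page", ["1"])[0]
    if not page.isdecimal() or int(page) < 1:
        return 404

    return 200
```

A crawler probing made-up variations of a search URL would then get a hard 404 instead of a soft "no results" page served with a 200.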

Here are the steps:

User searches for, say, ajax help, and lands on a link that is useless to gbot, [webmasterworld.com...], as it uses a POST form rather than a GET form and all the rest of the search params are missing, so it is not crawlable. BUT that page shows the search results returned, one of which is:


Now that is bookmarkable, and many will find it interesting enough to post on their sites. Gbot collects it and spiders it, AND finds inside the page the link for the other option, which says search for this or that on WebmasterWorld. It was there a few minutes ago; have you removed it, Brett_Tabke? Anyway, that link is the one gbot tried as [snip], and it also tries it as [snip].... or a similar request which does not exist, though I believe the ".../?terms=" structure existed but was buggy, which is probably why it was removed.

[edited by: Brett_Tabke at 4:45 am (utc) on Apr 22, 2010]


 4:47 am on Apr 22, 2010 (gmt 0)

Appreciate the thoughts, dusky, but you are missing some stuff there. I'm not going to explain that here, as it is a security issue you are putting us at risk for...

Either the toolbar or referral log is the general consensus.

MC Hammer - any comment?

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved