
Google Crawling Out of Referral logs?

   
11:20 am on Apr 21, 2010 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Google started indexing SERPS from our new site search engine. The question comes up, HOW are they getting those search links? These are full keyword searches as performed by users.

When someone searches Google/Bing/Yahoo/AlltheWeb and then comes to WebmasterWorld, we highlight those keywords on the page and we also print "try this search on webmasterworld" with a full keyword link to perform a search on our site search engine. So a raw HTTP link is on the screen only when someone arrives with a referral from a search engine. How does that link get into Googlebot?

The only way I can think of is if Google is reading pages via the toolbar or via that Google accelerator proxy. Or is this just a reconfirmation that Google is crawling out of its referral logs?
11:33 am on Apr 21, 2010 (gmt 0)

WebmasterWorld Administrator brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Perhaps you could try altering the order of the search GET variables in each place the link appears, to see which route the URLs are taking to Googlebot. Maybe too late now though?
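
A rough sketch of that idea, in Python: give each placement of the search link its own parameter order, then check which order shows up in Googlebot's requests. The path `/search`, the parameter names `q` and `mode`, and the placement labels are all made up for illustration; the real site search URL is not shown in this thread.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

# Hypothetical path and parameter names, chosen only for this sketch.
SEARCH_PATH = "/search"

# Each place the link is printed gets its own parameter order.
ORDERINGS = {
    "referral_page": ("q", "mode"),  # link shown to search-engine visitors
    "site_search": ("mode", "q"),    # link used everywhere else
}


def build_search_link(terms, placement):
    """Build a search URL whose parameter order identifies the placement."""
    values = {"q": terms, "mode": "all"}
    ordered = [(key, values[key]) for key in ORDERINGS[placement]]
    return SEARCH_PATH + "?" + urlencode(ordered)


def placement_of(request_url):
    """Return the placement whose parameter order matches a logged URL."""
    keys = tuple(key for key, _ in parse_qsl(urlsplit(request_url).query))
    for name, order in ORDERINGS.items():
        if keys == order:
            return name
    return None
```

Running `placement_of` over the URLs Googlebot requests would show which copy of the link it picked up, provided the crawler keeps the parameter order it found.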
11:36 am on Apr 21, 2010 (gmt 0)



via the spybar
2:19 pm on Apr 21, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Could it be Firefox (safebrowsing) URL data that is collected as well?
2:49 pm on Apr 21, 2010 (gmt 0)

10+ Year Member



What if someone does a WebmasterWorld search, then links to one of the search results that includes the highlighted term in the URL? Is that the URL pattern you're seeing requested by Googlebot?

You might be able to find some of these links if you download and scan through the "Links to your site" report from Webmaster Tools.

This could be a good case for the canonical tag.
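
A minimal sketch of what that could look like, assuming the highlighted term travels in a query parameter. The name `highlight` is a guess for illustration, not the parameter WebmasterWorld actually uses. The function strips that parameter and emits a `rel="canonical"` link element pointing at the clean URL.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical name of the parameter that carries the highlighted keywords.
DROP_PARAMS = {"highlight"}


def canonical_link_tag(url):
    """Return a canonical <link> element for the URL minus highlight params."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DROP_PARAMS
    ]
    # Rebuild the URL without the dropped parameters and without a fragment.
    clean = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
    return '<link rel="canonical" href="%s">' % escape(clean, quote=True)
```

Every highlighted variant of a thread URL would then point back to the same plain URL.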
3:59 pm on Apr 21, 2010 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Most logical answer for me is that it probably crawls the Web History that Google keeps from users logged into a G account.
8:02 pm on Apr 21, 2010 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



> Could it be a FireFox(safebrowsing)

Might be Chrome safe browsing too.
10:15 pm on Apr 21, 2010 (gmt 0)

10+ Year Member



Brett_Tabke, my understanding is no, not directly. I see this on one of our large sites: gbot is testing bogus query strings to see if they return a proper 404 error. In our case I managed to fix a major issue [snip potential security issue]

As for how gbot discovered it: as said above, it only needs one on-page link or bookmark that the gtoolbar can follow from a user to a search result page, and that gets passed on for the next crawl. Because it's a search string, gbot will test bogus search requests while it is at it.
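
To illustrate the "proper 404" point, here is a sketch of how a search handler might pick a status code for a query string. The parameter names `q` and `mode` are hypothetical, and this is only one possible policy, not how WebmasterWorld or any particular site handles it.

```python
from urllib.parse import parse_qsl

# Hypothetical set of parameters the search page understands.
ALLOWED_PARAMS = {"q", "mode"}


def status_for_search_query(query_string):
    """Return 200 for a well-formed search query, 404 for anything else."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    if not pairs:
        return 404  # no parameters at all
    for key, value in pairs:
        # Unknown parameter names and empty values are treated as bogus.
        if key not in ALLOWED_PARAMS or not value.strip():
            return 404
    return 200
```

A page that answers 200 to any made-up query string is the kind of thing such a probe would catch.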

Here are the steps:

User searches for, say, ajax help, and lands on a link that is useless to gbot, which is [webmasterworld.com...], as it's using a POST form, not a GET form, and all the rest of the search params are missing, so it is not crawlable. BUT on that page the search results are returned, one of which is:

[snip]

Now that is bookmarkable, and many will find it interesting enough to post on their sites. Gbot collects it and spiders it, AND finds inside the page the link for the other option, which says search for this or that on WebmasterWorld. It was there a few minutes ago; have you removed it, Brett_Tabke? Anyway, that link is the one gbot tried as [snip], and it also tries it as [snip].... or a similar request that does not exist, though I believe the ".../?terms=" structure existed but was buggy, which is probably why it was removed.

[edited by: Brett_Tabke at 4:45 am (utc) on Apr 22, 2010]

4:47 am on Apr 22, 2010 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Appreciate the thoughts, dusky, but you are missing some stuff there. I am not going to explain that here, as it is a security issue you are putting us at risk for...

Either the toolbar or referral log is the general consensus.

MC Hammer - any comment?
 
