
Forum Moderators: Ocean10000


What's Gigablast up to?

trying to spider with 'languages'

8:31 am on Oct 2, 2007 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 2, 2005
votes: 0

Until a few days ago, I very rarely saw Gigabot. It read robots.txt, saw it was unwelcome and went away happy. Now I'm getting regular visits throughout the day, attempting to spider old, 301ed URLs with some sort of 'language' query. Requests take the format:

example.com/arabic.php?u=www.example.com/obsolete page/

with arabic replaced in quick succession by german, french, spanish or some other language. The IP belongs to Gigablast.

Site is English only, and as the bot's now completely ignoring robots.txt it's eating 403s. Anyone else seen this odd activity?
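For anyone wanting to count these hits in their own logs, here is a minimal Python sketch that flags request paths of the shape described above. The function name and the language list are my own assumptions (the list only covers the languages mentioned in this thread), so extend it to match whatever shows up in your logs.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: only the language names mentioned in this thread.
LANGUAGES = {"arabic", "german", "french", "spanish", "korean"}

# Matches a top-level script such as /arabic.php
LANG_PATH = re.compile(r"^/([a-z]+)\.php$")


def is_language_probe(request_path: str) -> bool:
    """Return True if the path looks like /<language>.php?u=<some URL>."""
    parts = urlsplit(request_path)
    match = LANG_PATH.match(parts.path.lower())
    if not match or match.group(1) not in LANGUAGES:
        return False
    # The probes carry the target page in a non-empty 'u' query parameter.
    return "u" in parse_qs(parts.query)
```

Feeding it the request path from each access-log line and tallying the matches should show how often the bot is coming back and which languages it is trying.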

[edited by: volatilegx at 11:28 pm (utc) on Oct. 2, 2007]
[edit reason] for examples, please use example.com [/edit]

1:28 pm on Oct 19, 2007 (gmt 0)

New User

10+ Year Member

joined:Oct 19, 2007
posts: 1
votes: 0

I've seen this too, in addition to http://example.com/korean.php?u=http://www.example.com/page.html

Does anybody know what's going on and how to avoid it?

It seems to be slightly confusing Google...

2:33 am on Oct 21, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 22, 2001
votes: 0

Welcome to WebmasterWorld, tbogus :)

How is it affecting Google? I don't get it.

1:26 pm on Oct 22, 2007 (gmt 0)

New User

10+ Year Member

joined:Oct 19, 2007
posts: 1
votes: 0

Google is somehow registering these URLs (with the language.php extension) as URLs that are in our index, but cannot be found as valid URLs.

My initial thought is that Google is somehow indexing GigaBlast, which has the valid URL, but Google cannot correctly process the link.

In turn, I have about two hundred URLs that Google is saying don't exist. We've seen in the past that these invalid URLs affect the performance of our Google ranking.

Most of all - I'm curious as to why and how Google is registering these URLs as 'valid' site URLs...

Ahh - the joys of SEO...