Strange result in allinurl

         

superscript

11:44 am on Dec 20, 2003 (gmt 0)



Need a bit of advice here folks. When I do an allinurl a few results like this show up instead of proper ones:

search.google.com/go/gs/***long string of characters***/http/www.mysite.com.uk/oneofmypages.html

What's worrying is that the 'oneofmypages.html' listed above now appears to be out of Google. Duplicate penalty as a result of this Fast link?

I have no idea what this link is, apart from its connection to Fast, and no idea why 2 or 3 of them are appearing in Google's allinurl and apparently getting the affected pages booted.

Any ideas?

onfire

8:14 pm on Dec 20, 2003 (gmt 0)

10+ Year Member



Saw this yesterday, but did not take much notice, until now!

superscript

8:19 pm on Dec 20, 2003 (gmt 0)



I can now confirm that 2 of the pages these links refer to have been de-indexed in Google.

edit: the missing pages still show a PR6, the site isn't penalised, but these pages are no longer indexed. Are there any Fast experts who can explain how Google has picked up these strange links?

claus

8:47 pm on Dec 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Have you tried entering that long redirect URL into the Server Header Checker?

I suspect it's a redirect returning a 301 or 302.
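
For anyone who wants to run the same check by hand, here is a minimal Python sketch. It is not the Server Header Checker itself: the function names `describe_status` and `head_status` are made up for illustration. It sends a single HEAD request without following redirects, so a 301 or 302 shows up as-is together with its Location header.

```python
import http.client
from urllib.parse import urlsplit

# Labels for the two redirect codes discussed in this thread.
REDIRECT_CODES = {301: "permanent redirect", 302: "temporary redirect"}


def describe_status(code):
    """Return a short label for an HTTP status code (hypothetical helper)."""
    if code in REDIRECT_CODES:
        return REDIRECT_CODES[code]
    if code == 403:
        return "forbidden"
    if 200 <= code < 300:
        return "ok"
    return "other"


def head_status(url, timeout=10):
    """Send one HEAD request and return (status, Location header).

    http.client never follows redirects, so the first response is reported.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `head_status()` on the long URL and passing the status to `describe_status()` would show whether it is a redirect and where it points, assuming the server answers HEAD requests at all.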

If that link is the reason that these pages have disappeared, then it's a serious Google bug. Note "if".

/claus

superscript

10:43 pm on Dec 20, 2003 (gmt 0)



Hi Claus,

Tried server header checker as suggested - received 403 'Forbidden'

Any suggestions?

Worth noting - if you click on these strange, long links, in the allinurl SERPs, the pages show up fine. But they're no longer indexed by Google.

p.s. How come I have a re-direct to my site by a third party indexed by Google? I only have a 301 re-direct from non-www to www in place, pointing to exactly the same IP. Note that the 301 is fairly recent, replacing an earlier 302 that was causing this site to be separately listed as non-www, and www. The non-www and www site are one and the same - nothing fancy going on - same data at same IP.

claus

12:57 am on Dec 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, it's odd. I don't know your URL, but i can tell you what it is as i just made a test with some random query on ATW.

It's a link from a search result on Alltheweb. The link returns a "HTTP/1.1 302 Found" and redirects to your page. The Server Header Checker at searchengineworld cannot handle URLs that long, so it only checked a truncated URL, and that partial URL is what returned the 403.

The "click" subdomain at fastsearch.com has a robots.txt file that disallows all spiders, so someone must have copied a page of ATW SERPS to somewhere and Googlebot has indexed it there. Even if it was allowed to, i don't think Googlebot is capable of running random queries at other SEs, so the 56,000 similar results must come from somewhere else.
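
To illustrate what a blanket disallow means for a crawler, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt text, the `can_crawl` helper and the example URL are assumptions for illustration only; they are not copied from the real file on the "click" subdomain.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: a blanket disallow for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL, not a real tracking link.
print(can_crawl(ROBOTS_TXT, "Googlebot", "http://click.example.com/go/abc"))
```

With a rule set like this, a well-behaved spider never requests the redirect URL, so it never sees the 302 or the page it points to.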

Now, as the "click" subdomain is unavailable to Googlebot, this is not something that Google will be able to correct automagically, as it will not be possible for Gbot to visit those links. I'd suggest that you write a nice email to webmaster (at) google.com and tell them about this, because they might not know about it, and it's clearly affecting thousands of pages.

As for your 301 redirect on your own site: Don't change that, it's perfectly fine. If this had anything to do with it, it would not just hit two pages.

I can imagine that Google has somehow gotten the impression that you are doing something dodgy with those two pages, and decided to eliminate them - anyway, perhaps it's just because they're unavailable for indexing, but the effect is the same.

/claus


Edit: Delinked the similar results after checking a few of those pages.
Added: The pages were from a wide range of industries, all were clearly SEO'ed, and some even ranked high for their keywords. So, a link like that does not seem to be the culprit by itself.

As to where that link came from, i'd say that you have probably placed it on the web by yourself. Think about it.

superscript

9:15 am on Dec 21, 2003 (gmt 0)



Hi Claus, many thanks for the advice - but I don't get this:

As to where that link came from, i'd say that you have probably placed it on the web by yourself

Yidaki

9:30 am on Dec 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



We have had this reported a few times in the past. Unfortunately Google seems to be ignorant of this problem, although there are some analyses that explain the problem pretty well:

Indexed AlltheWeb pages causing Google duplicates - Aug 14, 2003
[webmasterworld.com...]

click.fastsearch.com shows instead of my url? - Oct 8, 2002
[webmasterworld.com...]

superscript

9:45 am on Dec 21, 2003 (gmt 0)



Thanks for that Yidaki - a relatively recent problem then. It's possible my recent 301 has caused a temporary de-indexing. I expected this to cause a few gremlins (most of my site is still indexed as non-www, although I've had it all converted to www).

Any opinions on whether it is likely only 2 pages from a site could be penalised in some way, and not the whole site? The index page is still on page 1 of the SERPs, unfortunately the two MIA pages are faqs pages with a lot of content.

takagi

1:18 pm on Dec 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Recently I've seen a lot of links in the SERPs of Google and ATW to pages that were just search results of different search engines. These pages have the query in the title and all/most of the snippets on these pages contain the keywords (so resulting in a nice keyword density). Some of the pages are cloaked, others have a redirect in JavaScript or are used to help pages of a certain site to be found by search engines. I don't know about superscript's case, but it is quite likely that he is not directly involved in this.

heini

2:34 pm on Dec 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is in fact an old and recurring problem not related to any software, though I won't rule that out as an additional possibility.
Google has problems with redirects, nothing new. Those ctr urls from ATW are coming up time and again. Other redirect urls from directories have often led to the same consequences: original url vanished, redirect url taking its place.
Google problem. Other engines are reported to have similar problems.

claus

2:54 pm on Dec 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I see my previous post was modded out - that's good, as i got a strong reaction from superscript on it and was about to modify it.

Anyway, the essence of the problem is that there are pages out there that do not belong to the search engines, but still display full or partial SERPS from said engines. Of course, if a SERP URL to any of your pages is found on such a page, it is entirely without your knowledge or approval. I'm really sorry if i have suggested anything else, superscript - i'm sure you haven't placed that link by yourself.

GoogleGuy has stated a couple of times on this board that others cannot link to you in such a way that it will hurt your rankings. If this is true, then this should not even be a problem. If, otoh, these pages are really the cause of the problems (not just getting ignored), then this is a serious Google bug.

Still, it's got nothing to do with the 301 - those things need around a month to settle in the SERPS.

/claus


Added: Heini, the special thing about this problem is that the robots.txt of ATW disallows spiders, e.g. googlebot, so it's not an easy thing to solve unless you choose to ignore such links altogether - which should really be the solution. The link in question is perfectly fine technically, it returns a 302 and all - googlebot just can't follow it.
Edit: Removed reference to their supposed view of such pages, i agree this is a purely technical problem. Also, edited paragraphs 2 and 3.

Added (2): Just to remove any doubt:

I have no reason to believe that superscript has done anything wrong in any way whatsoever. I have not looked at his pages, i don't even know their URL and i assume they are perfectly compliant with Google guidelines and everything. I'm honestly sorry if it has appeared differently.

[edited by: claus at 3:56 pm (utc) on Dec. 21, 2003]

heini

3:02 pm on Dec 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Claus, I think it's really much simpler than that: Google follows redirect urls, in most cases this works just fine, in some cases they mess it up. Nothing to do with how they look at anything. As said before, it's a well known problem, involving large important directories, which get regularly spidered by Google passing PR through redirects. In some cases they mess it up, easy as that.

superscript

3:14 pm on Dec 21, 2003 (gmt 0)



Hi Heini,

You wrote:

Other redirect urls from directories have often led to the same consequences: original url vanished, redirect url taking its place.

Is this likely to have happened in my case? I have three of these fast re-directs in my allinurl: 2 of the pages are no longer listed in Google. My concern is that the third one points to my index page - could this be the next to go?

onfire

4:59 pm on Dec 21, 2003 (gmt 0)

10+ Year Member



I have 5 of these strange Fast links showing up in allinurl, and I am alarmed to read some comments that this could be self-inflicted. I use nothing that could possibly bring these links into play.

It's as clean as clean could be, and I have today posted this problem to the Google Team to look into and see if they could shed some light on this and its possible negative effect.

Hissingsid

5:51 pm on Dec 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

Someone here suggested that it may be PFI (pay for inclusion) click tracking. You know, if you have a Positiontech account your reports tell you how many clicks your URLs got in the period reported.

[webmasterworld.com...]

Best wishes

Sid