G shows links for index.html -- but it never existed

     
7:03 pm on Sep 30, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 3, 2003
posts:143
votes: 0


I was just checking Google for indexed pages (site:) and backlinks (link:) and noticed something very strange.

G site: shows - www.mysite.com/index.html
when I check - link:www.mysite.com/index.html
it shows all of my backlinks

The problem is, I don't have a page named index.html and never have; I use index.shtml. I checked all of the pages G says are linking to www.mysite.com/index.html, and none of them link to index.html. (Some of them are internal pages, and I know I never would have linked to a page that has never existed.) I even use mod_rewrite to point everything to www.mysite.com.

I'm not sure what this means, but it looks like Google is just assuming that I have a page named index.html and that www.domain.com is the same as www.domain.com/index.html, even for sites that use different default pages.

Anyone else seen anything like this before?

6:36 pm on Oct 3, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Now there's an odd one -- no, I have never seen that. But there have been lots of oddities about in the past few weeks.
6:39 pm on Oct 3, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 20, 2003
posts:197
votes: 0


I've been noticing Google indexing pages that don't exist a lot recently.

This is all part of their effort to inflate their index size to aid in the "mine's bigger" contest they're engaged in with Yahoo (also known as being a publicly traded company and caring more about stock price than quality of results).

1:32 pm on Oct 5, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 3, 2003
posts:143
votes: 0


I also have Google showing pages in the index that have never existed, as well as a slew of pages that were removed and have been 301'd for 6-12 months.

Anyone have any idea why Google would be showing backlinks for index.html when I have never had an index.html, and none of the pages they show actually link to it? They all link to www.mydomain.com.

I wanted to repost to bump; the mods didn't activate the thread for a couple of days.

3:50 pm on Oct 5, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


What does your server respond when someone requests index.html? I can see how any spider might be programmed to guess at the most common index page name, so how the server responds would be key. For example, I've seen "custom 404" pages that were actually set to serve a 200 header.
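
A rough way to run that check yourself, sketched in Python with only the standard library. The function names `fetch_status` and `describe` are made up for this example, and the URL in the usage note is a placeholder; a HEAD request is assumed to get the same status a spider's GET would, which not every server guarantees.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for url.

    Note: urlopen follows redirects, so this is the status of the
    final URL, not of any intermediate 301/302 hop.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the code is still the server's answer.
        return err.code


def describe(status: int) -> str:
    """Interpret the status returned for a page that should NOT exist."""
    if status == 200:
        return "soft 404: missing page answered with 200"
    if status in (404, 410):
        return "correct: page reported as missing"
    return "unexpected status: %d" % status
```

Usage would be something like `describe(fetch_status("http://www.example.com/index.html"))` against a page you know was never there. A "soft 404" result is the custom-404-with-200 case described above.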
5:01 pm on Oct 5, 2005 (gmt 0)

Senior Member from DE 

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 20, 2003
posts:870
votes: 4


Google has shown this behaviour for years: "index.html" is merged with "/".
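
To illustrate what such a merge could look like, here is a small Python sketch. It is only a guess at the idea, not Google's actual logic; the function name `merge_index` and the list of default file names are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed list of file names treated as equivalent to the directory itself.
DEFAULT_INDEX_NAMES = ("index.html", "index.htm")


def merge_index(url: str) -> str:
    """Map a URL ending in a default index name onto its directory URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_INDEX_NAMES:
        path = head + "/"
    # Host names are case-insensitive, so lower-case the host; drop any fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


print(merge_index("http://www.mysite.com/index.html"))   # http://www.mysite.com/
print(merge_index("http://www.mysite.com/index.shtml"))  # http://www.mysite.com/index.shtml
```

With a rule like this, links to "/" would be reported under index.html whether or not that file exists, while index.shtml is left alone because it is not in the assumed list.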
5:44 pm on Oct 5, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 3, 2003
posts:143
votes: 0


no custom 404, just a plain old 404

Server Response: [mysite.com...]
HTTP Status Code: HTTP/1.1 404 Not Found
Date: Wed, 05 Oct 2005 18:39:31 GMT
Server: Apache/2.0.46 (Red Hat)

I use index.shtml and always have - site is 4+ years old

For the past year Google has indexed www and non-www versions of almost every page; they show about twice as many pages indexed as I actually have.

5:46 pm on Oct 5, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 3, 2003
posts:143
votes: 0


doc_z

Don't you think it's a bit strange that they take the liberty of merging index.html with / but they can't figure out that www and non-www pages that are identical should be merged?

If my employees did something this stupid, I would fire them!

8:45 pm on Oct 5, 2005 (gmt 0)

Senior Member from DE 

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 20, 2003
posts:870
votes: 4


Indeed, sometimes Google's behaviour is strange. However, I've also seen several cases where they merged www and non-www pages.
1:38 am on Oct 19, 2005 (gmt 0)

New User

10+ Year Member

joined:Mar 12, 2004
posts:31
votes: 0


I've noticed a strange G cache lately for my site as well. The site is over 2 years old, and recently, after multiple 302 redirect hijackings, I have noticed that G is showing a cache of:

www.mysite.com/index.html%20

Now my main index page is not showing in the index, of course, and there is no G cache for it. This should be in there:

www.mysite.com/index.html

Anyone ever seen this kind of activity? Suggestions?
Thanks!
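
For anyone hunting for similar entries in their logs or link lists: `%20` is a URL-encoded space, so the cached URL is really "index.html" followed by a trailing space, which is a different URL from the real page. A minimal Python sketch for spotting such URLs follows; the helper name `has_stray_whitespace` is made up for this example.

```python
from urllib.parse import unquote, urlsplit


def has_stray_whitespace(url: str) -> bool:
    """True if the decoded path starts or ends with whitespace."""
    path = unquote(urlsplit(url).path)
    return path != path.strip()


print(has_stray_whitespace("www.mysite.com/index.html%20"))  # True
print(has_stray_whitespace("www.mysite.com/index.html"))     # False
```

An inbound link written with a trailing space inside the href is one plausible source of such a URL, though that is only a guess here.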

2:02 pm on Oct 25, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Oct 6, 2003
posts:234
votes: 0


Suggest ... use MSN instead and tell your friends to do the same