Forum Moderators: open
I have noticed in my logs this past week or so that domain.com (without the www.) is showing some hits for my homepage.
Why would it be showing so many more pages?
And is this hurting me?
[edited by: ciml at 6:09 pm (utc) on Dec. 20, 2004]
[edit reason] Examplified [/edit]
I see the same thing: Google is counting hundreds of pages as indexed that are blocked by robots.txt. I blocked them to avoid duplicate-content problems, and to avoid exactly what Google currently seems unable to do: differentiate between a link to a URL and a physical HTML file at that URL.
beta.search.msn.com shows the same pattern, but it treats the site's pages much better when you do a site: type command; the junk wasn't obvious in the first 100 results when I checked, unlike Google. Only Yahoo currently appears to list the actual allowed URLs on my site, but it has other problems, like dropping pages and failing to index new ones.
The blocked links point to things like sections of pages, not entire pages, and so on. There's also old junk: pages that haven't been online for a year or more. They shouldn't have mixed their junk/sandbox index with the main index. Haste makes waste; maybe they were too busy hacking in their teens to learn some basic truisms. Or maybe they just wanted to create the illusion of not having an indexing issue. Whatever it is, it didn't fool enough of the people this time around. Nice try, Google.
This leaves us with a crippled giant, a limping contender, and a new kid with a lot of issues to work out before he joins the big boys. Fun times in SEO land.
My site recently underwent a complete re-alignment of its architecture, as well as a major change to a jump-tracking page, in order to prevent Googlebot et al. from following the tracking links and storing them in the SERPs.
As I have just found out, our indexed page count has jumped massively: all the old tracking pages are still there (which is fine; they will take a while to fall off at current rates), but there is now an almost identical number of entries for the new pages, which sit behind a folder forbidden by robots.txt.
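For reference, a robots.txt rule of the kind described here might look like the following. The folder name /track/ is an assumption for illustration, not the poster's actual path; note that a Disallow rule stops crawling, but (as this thread shows) search engines may still list the blocked URLs as bare entries if external links point at them.

```
User-agent: *
Disallow: /track/
```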
These links mean nothing once they have been clicked on, as we use a one-time link constructed from the URL, IP, time, and date. So if you find the link after 20 minutes, you are directed through to a blank page, because one or more of those factors will have changed.
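A minimal sketch of how such a one-time link could work, binding the destination URL, the visitor's IP, and a timestamp into a token that dies after a ~20-minute window or when any factor changes. All names and the HMAC scheme here are assumptions for illustration, not the poster's actual implementation:

```python
import hashlib
import hmac
import time

SECRET = b"server-side secret"  # hypothetical key; never exposed in the URL
WINDOW = 20 * 60                # link lifetime in seconds (20 minutes)

def make_token(url, ip, now=None):
    """Build a token tied to url, ip and the current time."""
    ts = str(int(now if now is not None else time.time()))
    msg = "{}|{}|{}".format(url, ip, ts).encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]
    return "{}.{}".format(ts, sig)

def check_token(token, url, ip, now=None):
    """True only if the token matches url+ip and is still inside the window."""
    try:
        ts, sig = token.split(".")
        ts_val = int(ts)
    except ValueError:
        return False
    current = now if now is not None else time.time()
    if current - ts_val > WINDOW:
        return False  # expired: serve the blank page instead
    msg = "{}|{}|{}".format(url, ip, ts).encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]
    return hmac.compare_digest(sig, expected)
```

A click within the window from the same IP validates; a stale crawl of the stored URL (or a visitor with a different IP, such as a search-engine user) fails the check and gets the blank page.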
There is no title or description for these entries, just the bare URL in the Google database.
Is this a permanent entry against the site, or just a temporary 8-billion-page glitch... until MSN launches?
Who knows, but it is a bit puzzling. But hey, haven't the shareholders got value: a whole load more duff links. IMHO.