|5000 Pages but Google Says 12000|
Do Relative Links have something to do with it?
Hi, I have 5,000 pages on my site, but Google always says I have 12,000. Is this possibly because I have both relative links and full links to pages?
For example, directory/subdirectory/page.htm and
Does it perhaps consider those two separate pages?
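On the relative vs. full link question, here is a minimal Python sketch of how a relative link resolves against the page it appears on. The base URL and the full URL are assumptions for illustration (the thread does not give them); only the relative path comes from the question. It is not a statement of how Google counts pages, just the standard URL resolution any crawler or browser has to perform.

```python
from urllib.parse import urljoin

# Hypothetical page on which both kinds of link appear (assumed, not from the thread).
base = "http://www.example.com/index.htm"

relative = "directory/subdirectory/page.htm"
# Assumed full form of the same link.
absolute = "http://www.example.com/directory/subdirectory/page.htm"

# Resolve each link against the page it was found on.
resolved_relative = urljoin(base, relative)
resolved_absolute = urljoin(base, absolute)

print(resolved_relative)
print(resolved_relative == resolved_absolute)
```

Under these assumptions both links resolve to the same URL, so mixing relative and full links would not by itself create two addresses for one page.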
Are any of your pages dynamic? It is possible that Google may be indexing your dynamic pages with variables.
It is also possible that people are linking to you using both www.example.com and example.com. In that case, Google may be seeing the same pages under two different hostnames.
Do all of your 12,000 pages have PageRank? If the answer is yes, would you consider helping me with my site ;) since you are clearly doing twice as well as me.
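A rough way to check for this on your own site is to take a list of URLs (from your logs or a crawl) and count how many are left once the www prefix and the query string are ignored. The sketch below does that in Python. The sample URLs and the `sessionid` parameter are made up, and dropping the query string is an assumption that only holds when the variables do not select different content.

```python
from urllib.parse import urlsplit

def variant_key(url):
    """Collapse www/non-www and query-string variants into one key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com and example.com as one host
    # Assumption: the query string does not change which page is served.
    return host + (parts.path or "/")

# Hypothetical URLs that all point at the same page.
urls = [
    "http://www.example.com/page.htm",
    "http://example.com/page.htm",
    "http://www.example.com/page.htm?sessionid=123",
]

distinct_as_written = len(set(urls))
distinct_pages = len({variant_key(u) for u in urls})
print(distinct_as_written, distinct_pages)
```

If the first number is much larger than the second on real data, hostname or variable duplication is a plausible explanation for an inflated page count.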
Do any of your links point to index.html, or just to a directory name?
That is, usually www.example.com/ (note the slash) is the same page as www.example.com/index.html. If you have it both ways, Google could well think you have twice as many pages as you really do. After someone pointed this out to me a while back, I've been steadily removing explicit mentions of index.html and just giving directory links. I hope all five thousand of your pages aren't hand coded!
About the trailing slash (a forward slash, not a backslash): a link to a directory properly ends in a slash. Your web server will still serve the page if it's missing, but it will do so by redirecting the reader's browser to the URL with the slash. This puts a slightly heavier load on your server and slows response times. So always put the trailing slash in your directory links.
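If the pages aren't hand coded, the cleanup of internal links can be scripted. The following is a minimal Python sketch of the two rules above: drop an explicit index file name and make sure directory links end in a slash. The function name `canonical_url`, the list of index file names, and the "no dot in the last segment means directory" heuristic are all assumptions for illustration; adjust them to how your server is actually set up.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; your server may use others.
INDEX_NAMES = {"index.html", "index.htm"}

def canonical_url(url):
    """Rewrite a link to the directory form with a trailing slash."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, tail = posixpath.split(path)
    if tail in INDEX_NAMES:
        # /dir/index.html -> /dir/
        path = head if head.endswith("/") else head + "/"
    elif tail and "." not in tail:
        # Heuristic (assumption): a last segment without a dot is a directory.
        path = path + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

print(canonical_url("http://www.example.com/index.html"))
print(canonical_url("http://www.example.com/dir"))
```

Links to ordinary files such as /dir/page.htm pass through unchanged, so the function can be run over every internal link on the site.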
|After someone pointed this out to me a while back, I've been steadily removing explicit mentions of index.html and just giving directory links. |
Will that accomplish anything if even one outsider links to index.html on your site (which may well be the case if you receive organic links and a linking party simply copies the page URL into his HTML editor)?
I get organic links all the time (mostly from obscure sources), and there's no way to even keep track of all such links, let alone police the URLs that third-party Webmasters may be using.
I have a site that is showing 3x the actual number of pages. It is plain old vanilla HTML, BTW.
If you keep clicking through the pages shown it does stop at roughly the right number.
I was worried about duplicate content etc when I first saw this some weeks ago.
My guess now is that it is, hopefully, just a page count problem similar to the one on MSN.
|Will that accomplish anything if even one outsider links to index.html on your site (which may well be the case if you receive organic links and a linking party simply copies the page URL into his HTML editor)? |
You can't stop someone from saying "index.html" in his own link, but you can avoid intentionally showing him index.html in your own links.
A while after I had removed most of them from my site, I stopped getting much traffic to the index.html URLs and just got directory hits. I think what will happen is that new links that others place will use your proper directory link, and new links to index.html won't appear much, so index.html will fade in search engine rank as the directories rise.
Before my site was hijacked, I also saw more pages than I really had when I did a site:mydomain.com search. Try looking for other domains that carry your description and title.
I think that at the start Google counted my pages plus the hijacker's copies. Then, 3-4 weeks later, I was out of the SERPs. I hope that's not the case with you; it's just one possibility.
I know of several sites with this happening. They are all dynamic, so that is probably the cause.
I have often wondered why Google does not penalize this, since it appears to be duplicate content.
|You can't stop someone from saying "index.html" in his own link, but you can avoid intentionally showing him index.html in your own links. |
What about the trailing slash in domain names? If you're using an absolute link for your home page in a navigation bar, does Google treat [mysite.com...] differently than [mysite.com?...]