Forum Moderators: open
I would appreciate any insight into my question. I searched the forums but could not find a satisfactory answer.
Problem.
A few months ago we noticed that all pages except the index page of our domain xyz.com were no longer indexed by Googlebot. The content for xyz had been taken from our main domain zyx.com, so we thought Googlebot was penalizing us for duplicate content and had therefore excluded the inside pages. We redesigned xyz and added a lot of new content. The site has now been up for a month and a half. Googlebot still indexes only the index page: it has been spidering the site very frequently, but it refuses to include any of the inside pages. We cannot figure out what the problem is. Does anyone know what exactly is happening? How long will Googlebot take to include all the pages of xyz in its database?
We would appreciate any honest attempt at answering these questions.
thank you
I have a number of sites on the Internet, and the most recent that I have put up for a local charity went up with only two links from other sites. It has 50 pages, but only the index page got indexed. After two months, one more page has now made it into the index.
The others will arrive in due course, I'm sure, but how quickly you get indexed depends on how many pages Google has to visit and how many well-travelled paths lead to your sub-pages.
Three months doesn't seem to be all that long for a site that's not well connected to other sites on the web.
DerekH
hi Jesse,
The site is completely static, with no session IDs, just regular HTML pages. It's an information site, so I do not think that is the problem.
thank you again to both of you for your input.
Can you sticky me your index page?
Your link structure may be causing this, or else you have been very unlucky. I created a site with 2,300 pages and every single page was indexed within a month.
The most common mistakes are a lack of links on your main index page, or links embedded in a JavaScript/Flash-type nav bar.
When I call it I receive your error page, which does show a 404, but it redirects within a few seconds.
Google and all search engines look for the robots.txt file before they spider. If no file can be found, Google will index the whole site.
I may be wrong, but in your case you don't have a robots.txt file, and the redirect may be causing the problem.
If you can't turn the redirect off, I suggest you create an empty robots.txt file so Google finds it properly but reads nothing.
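For reference, an "empty" robots.txt is just a plain text file at the web root. The explicit allow-everything version (a blank Disallow means nothing is blocked) looks like this:

```
User-agent: *
Disallow:
```

Either a zero-byte file or the two lines above will do; what matters is that the URL /robots.txt answers with a plain 200 instead of redirecting.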
I could be wrong but maybe someone else could add to this more.
Otherwise your site is nice :)
Added
---------------------
After some more testing: your robots.txt file is not producing a plain 200. It is not served as a text file, and the request redirects. I am almost 90% sure this is the problem:
HTTP/1.1 301 Moved Permanently
[real URL replaced by http*//www.xyz.com/]
Date: Thu, 20 Nov 2003 16:13:47 GMT
Server: Apache/1.3.26 (Unix)
Location: http*//www.xyz.com/
Connection: close
Content-Type: text/html; charset=iso-8859-1
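To make clear what that header dump is telling us, here is a small sketch that parses a raw response-header block like the one above and pulls out the status code and Location header. (The parse_status helper is purely illustrative, not a standard tool, and the sample uses the obfuscated URL de-obfuscated as a placeholder.)

```python
def parse_status(raw_headers):
    """Return (status code, Location header value or None) from a
    raw HTTP response header block."""
    lines = raw_headers.strip().splitlines()
    # Status line looks like "HTTP/1.1 301 Moved Permanently"
    status = int(lines[0].split()[1])
    location = None
    for line in lines[1:]:
        # partition splits only at the first colon, so URLs survive intact
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status, location

raw = """HTTP/1.1 301 Moved Permanently
Date: Thu, 20 Nov 2003 16:13:47 GMT
Server: Apache/1.3.26 (Unix)
Location: http://www.xyz.com/
Connection: close
Content-Type: text/html; charset=iso-8859-1"""

status, location = parse_status(raw)
# A 301 with a Location header means the robots.txt request never
# returns the file itself -- the spider is bounced somewhere else.
print(status, location)
```

A healthy robots.txt fetch would come back as a 200 with Content-Type: text/plain and no Location header at all.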
As far as I know the 301 redirect from the hyphenated domain is the correct way, and Google should discard the old site and re-index the new site.
It could be that Google is getting very confused by the old site 301-redirecting to the new site, and then being redirected again when it tries to fetch the robots.txt file.
Two redirects usually sets the alarm bells ringing.
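If the blanket redirect is done with Apache's mod_rewrite, one way to keep robots.txt out of it is to exempt that file before the redirect rule fires. This is only a sketch under that assumption; the hyphenated domain below (x-yz.com) is a placeholder, and your actual rules may differ:

```
RewriteEngine On
# Serve robots.txt as-is, before any redirect can catch it
RewriteRule ^robots\.txt$ - [L]
# Then 301 everything on the hyphenated domain to the main one
RewriteCond %{HTTP_HOST} ^(www\.)?x-yz\.com$ [NC]
RewriteRule ^(.*)$ http://www.xyz.com/$1 [R=301,L]
```

The `[L]` flag stops rewrite processing for robots.txt, so the spider gets a straight 200 there while every other URL still gets the 301.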
Just another thought :)
You are redirecting correctly, I presume, from the hyphenated domain to xyz with a 301, which we can see in the server headers. Now that the robots.txt file is no longer redirecting, xyz should get indexed and the hyphenated domain should get completely deleted.
I can't see any other reason for you not to get indexed. I think only GoogleGuy can tell us if we are right or wrong, and at the moment he's a little busy persuading people not to jump out of the windows :)
We can only try this and be patient; if nothing gets added within the next 30 days, try something else.