Forum Moderators: Robert Charlton & goodroi


Group of "same template" sites we host went URL only

         

justsnooping

5:41 pm on Jan 26, 2006 (gmt 0)

10+ Year Member



One product that my company provides is template websites. [edited] We then host these websites on one of our webservers.

Recently we have found that Google does NOT LIKE these sites. By that I mean a simple site:mydomainname search in Google shows the domain name only, with no Cached link, no Title, and no brief Description. I cannot understand why these pages have been penalized.

1) They are written in JSP, which as I understand it (from Google's own guidelines) should not be a problem.
2) They have unique content (except the parts that are inherently templated, mostly the names of divs and CSS rules).
3) They have unique titles, meta descriptions, etc.

Possible reasons they aren't listed in Google (MY THEORIES):

1) We don't redirect non-www domain names to www domain names. (Could this have really penalized them so?)
2) We allow home pages that aren't redirected to the domain root; for example, www.<mydomainname>.com/ and www.<mydomainname>.com/home.jsp serve the same page. (A concern, but hopefully not critical.)
3) A website on the same IP address was delisted for doing some things against Google's guidelines, but has since been relisted. (They wouldn't penalize all websites on the same IP address, would they?)
4) We are using a templated system, so some elements are created the same way for different websites. The <div> tags and CSS rules will be the same.
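
Theories 1 and 2 both come down to one page answering at more than one URL. The following is a minimal sketch, in Python rather than JSP, of the decision a 301 redirect rule would make. The function name, the `home_paths` default, and the example.com host are illustrative assumptions, not part of the setup described above.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host, home_paths=("/home.jsp",)):
    """Return the URL to 301-redirect to, or None if `url` is already canonical.

    Assumptions for this sketch: one canonical host per site, and a small set
    of "home page" paths that should collapse to "/". Any port in the original
    URL is dropped, and the fragment is discarded.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    # Collapse duplicate home-page paths onto the site root.
    new_path = "/" if path in home_paths else path
    if host == canonical_host.lower() and new_path == path:
        return None
    return urlunsplit((parts.scheme, canonical_host, new_path, parts.query, ""))
```

Under these assumptions, a request for the non-www host or for /home.jsp yields a single target URL, and a request that is already canonical yields None, so no redirect is issued.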

Has anyone encountered a problem like this before? Especially with Template Websites, or with sites that exist on the same IP address as one that was delisted?

Powdork, here are the responses to your questions...

1) I will see if I can determine whether the Googlebot has come around for these sites by inspecting the log files.
2) I will check this as well...
3) We do not have robots.txt files for each of these websites. If someone requests a page that doesn't exist, the server throws a 500 error because of how our web.xml is set up (I believe).

Thanks for your response.

[edited by: tedster at 6:02 pm (utc) on Jan. 26, 2006]
[edit reason] remove product details [/edit]

Powdork

8:22 am on Jan 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<admin note -- these were Powdork's questions from another thread>

Justsnooping,
We need more details. Were the websites moved to a new host? If so:
1. Is Googlebot no longer spidering the sites, or have the pages gone URL only despite being spidered? Has Googlebot visited any pages, or requested robots.txt?
2. Has Slurp been visiting the pages? Good crawls?
3. Is there a robots.txt file? If so, is its syntax correct? If not, is the new host serving the proper response (404 page not found)? Use a 'header checker tool' for this.
Typically, changes that only involve the nameservers don't have an effect with Google so the first thing to check would be problems spidering the pages on the new host's servers. You can use a 'spider simulator' to check this sort of thing. Also consider the reputation of the new website builder. Search G for them and their domain(s). You can also check to see how other clients of the host are indexed in Google.

Even if they weren't moved, these are some things to look for. Keep in mind I know very little about JavaServer Pages, so there could be many server issues there that I know nothing about.
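
For readers without a 'header checker tool' to hand, the check in question 3 can be approximated with Python's standard library. This is only a sketch: the function names and the User-Agent string are made up for illustration, and a server may answer a real crawler differently from this script.

```python
import http.client
from urllib.parse import urlsplit


def split_target(url):
    """Break a URL into (scheme, host, port, request path) for http.client."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.hostname, parts.port, path


def fetch_status(url, timeout=10.0):
    """Return (status code, Location header) without following redirects."""
    scheme, host, port, path = split_target(url)
    conn_cls = (http.client.HTTPSConnection if scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(host, port, timeout=timeout)
    try:
        # http.client never follows redirects, so a 302 is reported as a 302.
        conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling `fetch_status("http://www.example.com/robots.txt")`, with example.com standing in for a real client domain, returns the raw status code, so a 404, a 302, and a 500 can be told apart.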

[edited by: tedster at 7:21 pm (utc) on Jan. 26, 2006]

Powdork

6:27 pm on Jan 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A 500 error for a nonexistent robots.txt file would cause exactly what you are describing IMO.
When you get to your log files check for a Googlebot request for robots.txt and see if it gets the 500 response.
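
A rough way to do that log check, assuming the server writes access logs in the common or combined log format (an assumption; the actual log layout isn't given in this thread). The function name and the regular expression are illustrative only.

```python
import re

# Matches the quoted request and the status code that follows it, e.g.
#   "GET /robots.txt HTTP/1.1" 500 1234
REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')


def robots_hits(lines, agent_substring="Googlebot"):
    """Return (status, line) for each robots.txt request whose log line
    mentions `agent_substring`. Matching on the user-agent text alone does
    not prove the request really came from Google."""
    hits = []
    for line in lines:
        if agent_substring not in line:
            continue
        match = REQUEST.search(line)
        if match and match.group("path").split("?")[0] == "/robots.txt":
            hits.append((int(match.group("status")), line.rstrip("\n")))
    return hits
```

Passing an open log file, for example `robots_hits(open("access.log"))` with a hypothetical file name, lists every matching request together with the status code the server returned.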

justsnooping

7:43 pm on Jan 26, 2006 (gmt 0)

10+ Year Member



Thank you so much:) We are trying that right now....

tedster

8:15 pm on Jan 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are the client sites also reachable from a url on your flagship domain -- and if so, does that url always 301 redirect to the client's domain?

Powdork

9:37 pm on Jan 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just visited one of the client sites (I think) and when requesting the robots.txt file I am getting 302 redirected to the home page of the client site.
Unless I am mistaken, this is effectively serving Google the homepage as the robots.txt file.
I think Google would ignore the site since it can't obey the robots.txt file.
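
For comparison, here is a sketch of how the much later Robots Exclusion Protocol standard (RFC 9309) groups robots.txt responses. It postdates this thread by many years, so it may not match how Googlebot behaved in 2006. The function name and return strings are made up for illustration.

```python
def robots_outcome(status):
    """Classify a robots.txt HTTP status roughly along the lines of RFC 9309.

    This only looks at the status code. It does not model what a crawler
    does when a redirect ends on an HTML page instead of a rules file.
    """
    if 200 <= status < 300:
        return "parse the returned rules"
    if 300 <= status < 400:
        return "follow the redirect, then classify the final response"
    if 400 <= status < 500:
        return "unavailable: crawler may fetch any URL"
    if 500 <= status < 600:
        return "unreachable: crawler must assume everything is disallowed"
    return "not covered by this sketch"
```

Under that reading, a 404 for a missing robots.txt is harmless, while a 500 tells the crawler to stay away from the whole site.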

justsnooping

9:42 pm on Jan 26, 2006 (gmt 0)

10+ Year Member



No, they do not exist on our flagship domain.

justsnooping

9:57 pm on Jan 26, 2006 (gmt 0)

10+ Year Member



Powdork... We have set them to return 404 pages, not 302s. We don't host all our customers' sites. If you PM me I can give you an actual domain to check...

Rob

tedster

10:05 pm on Jan 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I suggest you explore some of these domains on the Yahoo Site Explorer at [siteexplorer.search.yahoo.com...]

Sometimes you can turn up important clues there.

justsnooping

4:43 am on Jan 31, 2006 (gmt 0)

10+ Year Member



Thanks for all the help. As of today the sites are beginning to return. It must have been the 500 errors causing it. Thanks a lot!

Robert