Apache Server Headers, IP, What Else?

apache, rewrite rules, and search engine spiders

         

rover

11:55 pm on Jan 5, 2004 (gmt 0)

Hi-

We have one domain -- domainA.com -- on our dedicated server that we have excluded from search engine indexing through robots.txt. domainA.com holds our main data, which is used/rewritten by other domains on the same server (domainB.com, domainC.com, etc.).

The other domains, which are indexed by the search engines, use rewrite rules: they get the data for each HTML page by running dynamic URLs from domainA.com, but the user only sees the domain they are on (e.g. domainB.com or domainC.com). Users are never redirected to domainA.com and are never even aware of it.

I checked, and a request for a specific page on domainB.com or domainC.com returns the correct data, and the server headers give back:

HTTP/1.1 200 OK
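
For anyone who wants to repeat this check on more than one page, here is a rough Python sketch. It assumes you already have the raw response head as text (for example, copied from a header checker). The function names `parse_response_head` and `exposes_other_host` are made up for illustration. It flags a response if it is a redirect of any kind, or if any header value mentions the hidden domain.

```python
from typing import Dict, Tuple


def parse_response_head(raw_head: str) -> Tuple[str, int, str, Dict[str, str]]:
    """Split a raw HTTP response head into version, status code, reason, and headers."""
    lines = raw_head.strip().splitlines()
    # The status line looks like "HTTP/1.1 200 OK".
    parts = lines[0].split(" ", 2)
    version = parts[0]
    status = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, status, reason, headers


def exposes_other_host(raw_head: str, hidden_host: str) -> bool:
    """Return True if the response is any 3xx redirect, or if any
    header value mentions hidden_host (hypothetical check)."""
    _, status, _, headers = parse_response_head(raw_head)
    if 300 <= status < 400:
        return True
    needle = hidden_host.lower()
    return any(needle in value.lower() for value in headers.values())
```

A plain 200 response with no mention of domainA.com in its headers comes back as `False`. This only covers the headers, not the page body.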

Does anyone know if there is any other site/page information (other than the Server Headers and IP address) that Google or other search engines get from a web page when they spider?

We don't include any duplicate content on the different domains. I just want to make sure that I'm not overlooking something that would make us appear to be spamming from a search engine's perspective.

--Jack

jdMorgan

3:52 am on Jan 7, 2004 (gmt 0)

Jack,

Welcome to WebmasterWorld [webmasterworld.com]!

It looks like you've stumped the panel...

Generally (unless you redirect or block them), robots have access to anything a human visitor would have access to: your server headers, your page content, your robots.txt, your DNS information, and the same information from the other sites and pages that link to you. From your description of what you've done, it sounds like everything is in order.
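
Since page content is on that list, one thing worth checking is whether the generated HTML itself points back at the hidden domain. The following is only a sketch in Python using the standard library. The names `LinkCollector` and `urls_pointing_at` are invented for this example, and it only looks at `href`, `src`, and `action` attributes, so it will not catch a domain name that appears in visible text or in scripts.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect URL-bearing attribute values from an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                self.urls.append(value)


def urls_pointing_at(html: str, hidden_host: str) -> List[str]:
    """Return the collected URLs whose host is hidden_host or a subdomain of it."""
    collector = LinkCollector()
    collector.feed(html)
    hidden = hidden_host.lower()
    hits = []
    for url in collector.urls:
        # Relative URLs have no hostname, so they never match.
        host = (urlparse(url).hostname or "").lower()
        if host == hidden or host.endswith("." + hidden):
            hits.append(url)
    return hits
```

Run against a page served from domainB.com, an empty list means no link, image, or form on that page names domainA.com directly.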

Jim