Forum Moderators: open

Message Too Old, No Replies

inurl issues - domain crosslinking?

What in the world is Google doing here, and how can we avoid it?

         

mapsEdge

4:12 pm on Jan 27, 2010 (gmt 0)

10+ Year Member



One of our clients has a single website that serves three domains. Using Ionic's ISAPI Rewrite Filter (IIRF, on IIS), we direct traffic to the right pages based on a unique path, for instance:

# http : //domain1
RewriteRule ^/location1/car-rental /index.asp?p={1's page number}

# http : //domain2
RewriteRule ^/location2/car-rental /index.asp?p={2's page number}

# http : //domain3
RewriteRule ^/location3/car-rental /index.asp?p={3's page number}

None of the sites' pages links to any of the other sites except through a "Visit our Location2 Homepage" link.

Google, however, has found a way. It has indexed

http : //domain1/location2/car-rental
http : //domain2/location3/car-rental

and so on, for every similar page in every domain.

This is causing quite a few problems. Is Google ignoring the first "folder" (/location1/) and just reading the last piece (car-rental), crosslinking on its own? What's going on here and how can we prevent it?

Thanks!

- Bill in Kansas City, Mo, USA

Measure with a micrometer. Mark with a crayon. Cut with an ax.

[edited by: phranque at 12:35 pm (utc) on Jan. 28, 2010]
[edit reason] No urls, please. See TOS [webmasterworld.com] [/edit]

phranque

12:40 pm on Jan 28, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], mapsEdge!

i'm not sure i precisely understand your problem, but it looks like you need to externally redirect (301) the request to the correct domain, based on the subdirectory, before you internally rewrite the request to the asp script.
the RewriteCond directive [isapirewrite.com] will help with this.
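to sketch that idea (hedged: this uses the Apache-style syntax that Ionic's IIRF understands; Helicon's ISAPI_Rewrite v2 writes RewriteCond differently, and domain1/location1 are stand-ins for the real names, so check your filter's docs before copying):

```
# if a /location1/ path arrives on any host other than domain1,
# send a permanent (301) redirect to the canonical domain first;
# the internal rewrite to index.asp then only ever fires on the
# correct host.
RewriteCond %{HTTP_HOST} !^domain1$ [I]
RewriteRule ^/location1/(.*)$ http://domain1/location1/$1 [R=301]
```

a matching pair of rules would be needed for each of the other two locations.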

Ocean10000

6:06 am on Jan 29, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One way I can think of is to serve up a different robots.txt file for each domain with isapirewrite, so that only the folders belonging to that domain get indexed and the others are ignored/not indexed.
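A hedged sketch of that idea in IIRF-style rewrite syntax (the robots-domain1.txt / robots-domain2.txt filenames are made up for illustration; they would be ordinary physical files in the web root):

```
# serve a host-specific robots file for each domain
RewriteCond %{HTTP_HOST} ^domain1$ [I]
RewriteRule ^/robots\.txt$ /robots-domain1.txt [L]

RewriteCond %{HTTP_HOST} ^domain2$ [I]
RewriteRule ^/robots\.txt$ /robots-domain2.txt [L]
```

where robots-domain1.txt would disallow the other locations' folders, e.g. a User-agent: * record with Disallow: /location2/ and Disallow: /location3/ lines.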

As a backup method you may also want to update the asp page to check the domain being requested and, if it is the wrong one, issue a permanent (301) redirect to the correct domain based on the path.
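That check might look something like this in classic ASP/VBScript (a hedged sketch: it assumes the rewrite filter exposes the original, pre-rewrite path in HTTP_X_REWRITE_URL, which Helicon's ISAPI_Rewrite does; verify what your filter provides, and note the domainN hostnames are placeholders):

```
<%
' redirect permanently when a location path is served under the
' wrong domain. HTTP_X_REWRITE_URL and the domainN names are
' assumptions; verify both for your setup.
Dim path, host, canonicalHost
path = Request.ServerVariables("HTTP_X_REWRITE_URL")
host = LCase(Request.ServerVariables("HTTP_HOST"))

If InStr(path, "/location1/") = 1 Then canonicalHost = "domain1"
If InStr(path, "/location2/") = 1 Then canonicalHost = "domain2"
If InStr(path, "/location3/") = 1 Then canonicalHost = "domain3"

If canonicalHost <> "" And host <> canonicalHost Then
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", "http://" & canonicalHost & path
    Response.End
End If
%>
```

placed at the top of index.asp, this runs before any content is written, so the redirect header goes out clean.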

phranque

2:43 pm on Feb 1, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



if google has already indexed the wrong urls, it will never find its way to the correct urls while those urls are excluded by robots.txt.
you need an external redirect to fix that mess.

mapsEdge

3:44 pm on Feb 1, 2010 (gmt 0)

10+ Year Member



Ocean10000 - okay, you have my undivided attention. Serve different robots.txt files using isapiRewrite? I'm trying to wrap my head around that; can you provide an example?

For posterity, I did find the solution to the original issue: the client had inadvertently introduced incorrect links very deep in the website where it wasn't immediately obvious. Google was simply doing what Google does.

Ocean10000

6:26 pm on Feb 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I am not as familiar with isapiRewrite's abilities as others are, but I think the concepts are the same.

But I have one website which runs multiple domain names and tweaks the content based on that bit of information. I had to become a bit more defensive with this website due to scrapers and other assorted troublemakers.

So I use a 404 error trap to capture requests for robots.txt, which I feed into an ASP.NET module (rules engine). This code checks the domain name (among other things) and serves either (i) the standard version (free to grab everything, within certain directories) or (ii) the other version (bug off, don't read anything or be banned).

But in your case you want multiple versions, each tied to a single domain name, allowing only that domain's directories and forbidding the others. This should keep future duplicates from showing up in the index.

Also, you may want to adjust the code on the dynamic pages to check the domain and do a permanent redirect when the wrong domain is requested, which will further reduce this problem going forward.