Forum Moderators: Robert Charlton & goodroi

301 Redirection post BD

         

asusplay

4:04 pm on Aug 31, 2006 (gmt 0)

10+ Year Member



Back in April, when the whole Big Daddy thing was going on, I implemented 301 redirects from the non-www versions of pages that Google had spidered, as well as a quick fix for the homepage being indexed separately from the domain name.

I simply changed the homepage from index.asp to default.asp and made sure there were no direct references to the file name, and that all links pointed to the domain name. The site is on ASP and shared hosting, so I had no access to IIS.

However, I have since realised that G has spidered the default.asp page separately, and it shows under the "omitted" results when doing a site: search, therefore incurring a duplicate content penalty, I'm sure. It is also showing non-www versions of some pages again, which is frustrating.

I thought that Google had somehow sorted this whole situation out. How can it spider a page that is not referenced directly (i.e. www.example.com/default.asp) when all links point to www.example.com?

Is there any advice anyone can give regarding my situation? It's doing my head in once again...

hvacdirect

5:08 pm on Aug 31, 2006 (gmt 0)

10+ Year Member



I've got the same problem for all of my IIS hosted sites. I wish there was a way to resolve this. One even has three versions indexed, "/", "/default.asp", and "/Default.asp". All links in the site are "/".

Quadrille

5:12 pm on Aug 31, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First check your site for all the rogue domain.com/default.suffix and change them all to domain.com/

Second, check that there are no old pages on your server, orphans, even, which could conceivably hold the faulty link.

Third, be sure you have a user friendly 404 page, to catch those deflected visitors.

Hard to be certain what has happened, but sounds like you have a ghost somewhere; a link to the page that Google is having trouble not believing. Provided the correct page also appears, then the old one will eventually drop out. Sometimes these are reinforced by long forgotten pages that happen to have an external link to them. Good housekeeping often helps them on their way. Xenu is your friend.

But much better to have no links to index or default pages (always use domain.com/ or folder/), and never a need for those 301s
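
The housekeeping check described above (hunting for links that name the default document) can be sketched in a few lines. This is only an illustration in Python, not a replacement for a crawler such as Xenu; the list of file names in `DEFAULT_DOCS` and the helper `find_default_doc_links` are assumptions made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical list of default-document names to flag; adjust for your server.
DEFAULT_DOCS = {"default.asp", "index.asp", "index.html", "index.htm"}


class DefaultDocLinkFinder(HTMLParser):
    """Collects href values whose path ends in a default-document file name."""

    def __init__(self):
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Last path segment, lower-cased so /Default.asp is caught too.
                filename = urlsplit(value).path.rsplit("/", 1)[-1].lower()
                if filename in DEFAULT_DOCS:
                    self.flagged.append(value)


def find_default_doc_links(html):
    """Return every link in the HTML that points at a default document."""
    finder = DefaultDocLinkFinder()
    finder.feed(html)
    return finder.flagged


sample = (
    '<a href="/">Home</a> '
    '<a href="/Default.asp">Home</a> '
    '<a href="/products/index.asp?x=1">Products</a>'
)
print(find_default_doc_links(sample))
```

Run over each page of a site, anything this reports is a link that should be rewritten to the folder URL ("/" or "folder/").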

asusplay

7:04 pm on Aug 31, 2006 (gmt 0)

10+ Year Member



Hi, thanks for the replies.

I ran Xenu and it seems fine, plus I made sure there are no links to default.asp from inside the site.

It's as if Google has identified the path information (default.asp) and indexed it of its own accord. This does not happen with the other search engines, so I have no idea why G has done it.

Would it help if I ran an exclusion in robots.txt to disallow www.example.com/default.asp, or would this disallow indexing of the homepage as www.example.com?

Am I completely off the mark here?

Bewenched

8:29 pm on Aug 31, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I thought that Google had somehow sorted this whole situation out. How can it spider a page that is not referenced directly (i.e. www.example.com/default.asp) when all links point to www.example.com?

I noticed the same thing as well. We have never linked directly to our default.asp in the root directory, and it had never shown in the SERPs. However, once we started using Google Analytics, all of a sudden it showed up.

After seeing this show up in the SERPs, I have to say there is a direct link between the Google spider and Analytics.

hvacdirect

9:34 pm on Aug 31, 2006 (gmt 0)

10+ Year Member



I too have made sure that there are no internal links to default.asp. One of the possible sources may have been the original 301 redirect we had in place to fix canonicalization problems, which for some reason sent visitors to /default.asp if they entered through the non-www version. I have since updated the code to send them to the "/" version.

The following is the code I've set up. It can be used on any ASP page on any domain, as it reads the host name from the server. It can also be used in any subfolder (www.example.com/products/) and will send visitors to the "/" rather than default.asp in subfolders as well.

When setting up a new site I just include it at the top of the pages.

To redirect to the www version:


<%
Dim Domain_Name, theURL, QUERY_STRING, HTTP_PATH
' Get the host name the page was requested on
Domain_Name = LCase(Request.ServerVariables("HTTP_HOST"))
' Check whether the host already starts with "www." (four characters are
' compared so that a host such as wwwexample.com is not mistaken for it)
If Left(Domain_Name, 4) <> "www." Then
    HTTP_PATH = Request.ServerVariables("PATH_INFO")
    ' If the page is default.asp, redirect to the folder URL ("/") instead.
    ' For another index page, such as index.asp, change the file name and the
    ' two lengths below (12 = Len("/default.asp"), 11 = Len("default.asp")).
    ' LCase makes the test match /Default.asp as well.
    If LCase(Right(HTTP_PATH, 12)) = "/default.asp" Then
        HTTP_PATH = Left(HTTP_PATH, Len(HTTP_PATH) - 11)
    End If
    ' Build the new URL on the www host
    QUERY_STRING = Request.ServerVariables("QUERY_STRING")
    theURL = "http://www." & Domain_Name & HTTP_PATH
    ' Pass on the query string variables
    If Len(QUERY_STRING) > 0 Then
        theURL = theURL & "?" & QUERY_STRING
    End If
    ' Send 301 response and new location
    Response.Clear
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", theURL
    Response.Flush
    Response.End
End If
%>

To redirect to the non-www version I use:

<%
Dim Domain_Name, theURL, QUERY_STRING, HTTP_PATH
' Get the host name the page was requested on
Domain_Name = LCase(Request.ServerVariables("HTTP_HOST"))
' Check whether this is the www version (compare all four characters of "www.")
If Left(Domain_Name, 4) = "www." Then
    ' Strip the leading "www." to get the non-www host
    Domain_Name = Mid(Domain_Name, 5)
    HTTP_PATH = Request.ServerVariables("PATH_INFO")
    ' If the page is default.asp, redirect to the folder URL ("/") instead.
    ' For another index page, such as index.asp, change the file name and the
    ' two lengths below (12 = Len("/default.asp"), 11 = Len("default.asp")).
    ' LCase makes the test match /Default.asp as well.
    If LCase(Right(HTTP_PATH, 12)) = "/default.asp" Then
        HTTP_PATH = Left(HTTP_PATH, Len(HTTP_PATH) - 11)
    End If
    ' Build the new URL on the non-www host
    QUERY_STRING = Request.ServerVariables("QUERY_STRING")
    theURL = "http://" & Domain_Name & HTTP_PATH
    ' Pass on the query string variables
    If Len(QUERY_STRING) > 0 Then
        theURL = theURL & "?" & QUERY_STRING
    End If
    ' Send 301 response and new location
    Response.Clear
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", theURL
    Response.Flush
    Response.End
End If
%>
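
For anyone who wants to check the URL arithmetic without deploying to a server, the decision made by the two scripts above can be mirrored as a small pure function. This is a sketch in Python rather than ASP, and `canonical_redirect` with its parameters is a name invented for this illustration. Like the scripts, it only redirects when the host is the wrong one; a request that already arrives on the preferred host is left alone, even if it names default.asp.

```python
def canonical_redirect(host, path, query="", want_www=True, default_doc="default.asp"):
    """Return the 301 target URL, or None when the host is already the preferred one."""
    host = host.lower()
    has_www = host.startswith("www.")
    if has_www == want_www:
        return None  # same as the ASP: only the non-preferred host is redirected

    # Add or strip the "www." prefix.
    host = "www." + host if want_www else host[len("www."):]

    # Trim a trailing default document, keeping the slash in front of it.
    if path.lower().endswith("/" + default_doc):
        path = path[: -len(default_doc)]

    url = "http://" + host + path
    if query:
        url += "?" + query  # pass the query string through unchanged
    return url


print(canonical_redirect("example.com", "/products/Default.asp", "id=3"))
print(canonical_redirect("www.example.com", "/default.asp"))
```

The second call returning None is the gap discussed in the rest of the thread: nothing here moves www.example.com/default.asp to www.example.com/.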

g1smd

12:45 am on Sep 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Get a 301 redirect from /index.html (or whatever it was) to / installed to fix this. Make sure that you cater for all index pages in folders too, not just those in the root.

If any of those pages are shown as supplemental results then they will hang around for a year after the redirect is set up. Don't worry about that, they will not be harming things at all.

asusplay

7:28 am on Sep 1, 2006 (gmt 0)

10+ Year Member



Hi g1smd, thanks for your reply, but the point is that when you're on shared Windows hosting it is impossible to create a 301 redirect from index.asp or default.asp to "/" because it just creates an endless loop. Whichever way you try it, it does not work, because you have to read the path from the server variables and these will always contain index.asp or default.asp, so the page loops as it redirects to itself.

The code that was given above does not work for this purpose (I tried it), and I have similar code on my websites. I thought I had it covered when I specifically did not reference the default.asp page in any link whatsoever, but somehow Google has spidered and indexed this page.

I feel that the site this has happened to will always suffer from duplicate content then because I don't see what else I can do. There's a massive limitation on what this type of hosting provides (and there's a hell of a lot of sites on shared hosting). The other search engines do not have this problem. I don't think it's too difficult to have in the algorithm something along the lines of:
If default.asp, index.asp, home.asp (or any other default page) = "/" then ignore and don't index.

Can anyone tell me if specifically disallowing www.example.com/default.asp in the robots.txt file will stop the homepage from being indexed as www.example.com?

g1smd

9:52 am on Sep 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You can disallow /filename and / will still be indexed.

hvacdirect

12:55 pm on Sep 1, 2006 (gmt 0)

10+ Year Member



I'm guessing the robots.txt analysis tool in the Google webmaster console could be used to test this: add the disallow for the /default.asp page, and then check whether it can still reach your homepage.

As far as I know, you are right about not being able to forward default.asp to "/" through on-page code. Even when "/" is all that shows in the browser, the variables below all show default.asp, so there is nothing to check against. Thus the loop you spoke of.

PATH_INFO
SCRIPT_NAME
URL
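
The loop can be reproduced on paper with a toy model. The Python sketch below simply assumes the behaviour described in this thread, namely that a request for a folder URL reaches the script with the default document already appended; `path_info_for`, `naive_rule`, and `follow` are hypothetical helpers for this illustration, not server APIs.

```python
DEFAULT_DOC = "default.asp"


def path_info_for(requested_path):
    """Model of the thread's observation: a folder request is served by the
    default document, and the script only ever sees that file name."""
    if requested_path.endswith("/"):
        return requested_path + DEFAULT_DOC
    return requested_path


def naive_rule(path_info):
    """On-page rule: if the script is the default document, redirect to the folder."""
    if path_info.lower().endswith("/" + DEFAULT_DOC):
        return path_info[: -len(DEFAULT_DOC)]
    return None


def follow(requested_path, max_hops=5):
    """Follow redirects the way a browser would; return the number of hops taken."""
    hops = 0
    while hops < max_hops:
        target = naive_rule(path_info_for(requested_path))
        if target is None:
            return hops
        requested_path = target
        hops += 1
    return hops  # limit reached: the rule never stops firing


print(follow("/contact.asp"))  # an ordinary page is never redirected
print(follow("/default.asp"))  # redirected to "/", which looks like default.asp again
print(follow("/"))             # even the clean URL hits the hop limit
```

Because the script cannot tell "/" from "/default.asp" in this model, the rule fires on both and the redirect chases itself until the client gives up.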

hvacdirect

1:48 pm on Sep 1, 2006 (gmt 0)

10+ Year Member



I checked using Google Webmaster Tools, and a robots.txt containing

User-agent: *
Disallow: /default.asp
Disallow: /Default.asp

will allow the home page but block /default.asp and /Default.asp. However, that does not pass on any PageRank from links pointing at those URLs. In my case, and probably the most common one, I have no internal links at all to default or Default, so there must be external links out there that I'd like to get credit for.
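
The same check can be approximated offline with the robots.txt parser in Python's standard library. Treat this as a rough sketch: Python's parser is not Google's, so it only shows how plain prefix matching treats these rules, and `is_allowed` is a helper name made up for this example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /default.asp
Disallow: /Default.asp
"""


def is_allowed(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for url in (
    "http://www.example.com/",
    "http://www.example.com/default.asp",
    "http://www.example.com/Default.asp",
):
    print(url, is_allowed(url))
```

The rules are matched as path prefixes, so "/" stays fetchable because it does not begin with "/default.asp", while both spellings of the file name are blocked.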

g1smd

2:08 pm on Sep 1, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yahoo's Site Explorer will show you the exact incoming links to the "default" page; then you can contact those sites to ask them to update the link.