I have a problem with my web site. It's not easy to explain but I'll try...
The structure of my site is such that a lot of people do not link directly to the root of my domain (www.example.com). Instead they link to a directory (www.example.com/directory1/). Fine.
A lot of them forget the final slash (/)! This is annoying because, for a browser, the address without the slash (/) does not exist; but browsers are smart and redirect the user to the correct address.
You can try it out if you don't believe me: watch the address bar in Internet Explorer. At first the slash is not there, and after a second or two it appears.
The real problem is with Google's index. I fear that the link is not counted (is this possible?).
The reason I think this is that when I do a "site:www.example.com" search on Google, it finds the correct page BUT I also get a ridiculous residual entry for the page without the slash! No title, no description, nothing. As if Google considers there are two pages...
Can some of you share your thoughts?
Am I wrong when I say Google does not count it?
What can I do?
[edited by: ciml at 9:25 am (utc) on Mar. 14, 2004]
[edit reason] Examplified domains. [/edit]
If Google indexes a URL to a folder without the trailing slash, then it's because Googlebot "thinks" that it is a file and not a folder. When there's "no title, no description, nothing" then Googlebot has not been able to get any content when it asked your web server about the URL (or, in some cases, it has just not visited yet).
Googlebot is not totally wrong here, as it is possible to have file names without file extensions on most web servers (certainly it is on Apache, which is the most common one).
>> Am I wrong when I say Google does not count it?
No. And Yes. Google will count the two types of links as "votes" for two different URLs, unless your web server is set up to redirect from the URL without the trailing slash to the URL with the trailing slash. If your server is set up to do that, then Yes, you are wrong and the link does get counted for the right page; otherwise No, you are right.
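To make the "two different URLs" point concrete, here is a toy tally in Python. It is only an illustration of the idea of merging votes through a permanent redirect, not how Google actually computes anything; the function name count_votes and the sample links are made up for the example.

```python
from collections import Counter


def count_votes(links, redirects):
    """Tally inbound links per URL, crediting a redirected URL's links to its target.

    `redirects` maps a URL to the URL it permanently (301) redirects to.
    This is a toy model, not a description of Google's algorithm.
    """
    votes = Counter()
    for url in links:
        votes[redirects.get(url, url)] += 1
    return votes


# Hypothetical inbound links: one with the trailing slash, two without.
links = [
    "www.example.com/directory1/",
    "www.example.com/directory1",
    "www.example.com/directory1",
]

# No redirect known: the votes are split across two URLs.
split = count_votes(links, {})

# 301 from the no-slash URL to the slash URL: the votes are merged.
merged = count_votes(
    links, {"www.example.com/directory1": "www.example.com/directory1/"}
)
```

Without the redirect the three links are credited to two separate entries; with it, all three end up on the URL with the trailing slash.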
>> browsers are smart and re-direct the user to the correct address
Actually it's not the browser that does this - it's your web server.
Try entering the URL without the trailing slash in the Server Header Checker [searchengineworld.com]. You should see a second line like this:
Status: HTTP/1.1 301 Moved Permanently
...and three lines below that you should see the location with a trailing slash. Google "understands" a 301 status code so that the two URLs can be merged into one in the index.
Now, if you see anything but the above, then your server setup might not be working like it should. Specifically: if it says "302" (or "200-something") instead of "301", then you will have a problem with Googlebot, as it will not pass the link "votes" on to the right page, and it will not erase the wrong URL from the index.
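If you would rather run the check yourself than use an online tool, something like the following Python sketch should do roughly the same job. It uses only the standard library; the function names fetch_status and classify are my own, and the messages it returns are just labels for the three cases described above. Note that http.client does not follow redirects, which is exactly what is wanted here.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send one HEAD request (redirects are NOT followed) and
    return (status_code, location_header_or_None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def classify(url, status, location):
    """Label the response to a request for a folder URL without its trailing slash."""
    wanted_path = urlsplit(url).path + "/"
    if status == 301 and location and urlsplit(location).path == wanted_path:
        return "ok: 301 to the URL with the trailing slash"
    if status == 301:
        return "check: 301, but not to the URL with the trailing slash"
    return "problem: expected 301, got %d" % status


if __name__ == "__main__":
    # Placeholder address; replace it with your own folder URL, without the slash.
    url = "http://www.example.com/directory1"
    status, location = fetch_status(url)
    print(classify(url, status, location))
```

Only the last few lines touch the network; classify can be tried on its own with any status code and Location value you have seen in a header checker.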
>> What can I do?
First option: Try the server header checker a few times - not just two times in a row, but spread across the day. It might be a problem that has to do with server load.
Second: If all looks fine (i.e. a 301) then you'll have to wait for Googlebot to visit the URL without the slash, as only then will it know that it's not the right one. After that you will have to wait until the changes that Googlebot sees are reflected in the index.
Sometimes this can take a very long time - if the wrong links are on pages that Googlebot visits very rarely, then you might have to wait for months (literally).
Third: If the server header checker returns anything but a 301, then you will have to speak to your hosting provider and have them set up the server to return the proper status code (i.e. that 301 code).