|Receiving links to a directory on a site.|
People forget the last slash...
| 6:35 am on Mar 14, 2004 (gmt 0)|
I have a problem with my web site. It's not easy to explain but I'll try...
The structure of my site is such that a lot of people do not link directly to the root of my domain (www.example.com). Instead they link to a directory (www.example.com/directory1/). Fine.
A lot of them forget the final slash (/)! This is annoying because, for a browser, the address without the slash (/) does not exist; but browsers are smart and redirect the user to the correct address.
You can try it out if you don't believe me: look at the address bar in Explorer. At first the slash is not there, and after a second or two it appears.
The real problem is with Google's index. I fear that the link is not counted (is this possible?).
The reason I think this is that when I do "site:www.example.com" on Google, it finds the correct page BUT I also have a ridiculous residual entry for the page without the slash! No title, no description, nothing. As if Google considers there are two pages...
Can some of you share your thoughts?
Am I wrong when I say Google does not count it?
What can I do?
[edited by: ciml at 9:25 am (utc) on Mar. 14, 2004]
[edit reason] Examplified domains. [/edit]
| 9:41 am on Mar 14, 2004 (gmt 0)|
[edited by storevalley 14 March 2004 09:34]
Re-read your post, and removed comments ... they didn't really answer your question.
| 12:33 pm on Mar 14, 2004 (gmt 0)|
>> ridiculous residual entry with the page without the slash
>> No title, no description, nothing. As if Google considers there are two pages...
If Google indexes a URL to a folder without the trailing slash, then it's because Googlebot "thinks" that it is a file and not a folder. When there's "no title, no description, nothing" then Googlebot has not been able to get any content when it asked your web server about the URL (or, in some cases, it has just not visited yet).
Googlebot is not totally wrong here, as it is possible to have file names without file extensions on most web servers (certainly it is on Apache, which is the most common one).
>> Am I wrong when I say Google does not count it?
No. And Yes. Google will count the two types of links as "votes" for two different URLs, unless your web server is set up to redirect from the URL without the trailing slash to the URL with the trailing slash. If your server is set up to do that, then Yes (you are wrong, and the links are counted for the right page); otherwise No.
>> browsers are smart and re-direct the user to the correct address
Actually it's not the browser that does this - it's your web server.
Try entering the URL without the trailing slash in the Server Header Checker [searchengineworld.com]. You should see a second line like this:
Status: HTTP/1.1 301 Moved Permanently
...and three lines below that you should see the location with a trailing slash. Google "understands" a 301 status code so that the two URLs can be merged into one in the index.
Now, if you see anything but the above, then your server setup might not be working like it should. Specifically: If it says "302" (or "200-something") instead of "301" then you will have a problem with Googlebot, as then it will not pass the link "votes" on to the right page, and it will not erase the wrong URL from the index.
>> What can I do?
First option: Try the server header checker a few times - not just two times in a row, but spread across the day. It might be a problem that has to do with server load.
Second: If all looks fine (i.e. a 301) then you'll have to wait for Googlebot to visit the URL without the slash, as then it will know that it's not the right one. After that you will have to wait until the changes that Googlebot sees are reflected in the index.
Sometimes this can take a very long time - if the wrong links are on pages that are visited by Googlebot very rarely, then you might have to wait for months (literally).
Third: If the server header checker returns anything but a 301, then you will have to speak to your hosting provider, and have them set up the server to return the proper status code (i.e. that 301 code).
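For anyone who would rather script this check than use the web tool, here is a rough Python sketch. It is not how the header checker mentioned above works internally; the function names `fetch_status` and `interpret` are made up for illustration, and it assumes a plain URL with no login details or query string. It sends a HEAD request without following redirects and reports whether the reply is a 301 pointing at the same path plus a trailing slash.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url):
    """Send one HEAD request, without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def interpret(url, status, location):
    """Classify the reply for a directory URL requested without its slash."""
    wanted_path = urlsplit(url).path + "/"
    if status == 301:
        # Location may be absolute or relative; only the path is compared.
        if location is not None and urlsplit(location).path == wanted_path:
            return "ok: 301 to the trailing-slash URL"
        return "check: 301, but not to the trailing-slash URL"
    if status == 302:
        return "problem: 302 (temporary redirect) instead of 301"
    if 200 <= status < 300:
        return "problem: served directly, no redirect"
    return "problem: unexpected status %d" % status
```

For example, with `url = "http://www.example.com/directory1"`, calling `interpret(url, *fetch_status(url))` should give an "ok" result on a server that redirects as described. Running it at several different times of day covers the server-load case from the first option.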
| 7:15 pm on Mar 14, 2004 (gmt 0)|
Thanks for this great info then!
So far, it seems that I get the 301 message every time, so everything is fine, but I'll check again in the future to be sure.
| 2:49 am on Mar 15, 2004 (gmt 0)|
The directory links in my new site also have no final slash (/), e.g. www.example.com/example. After reading this post, I am starting to worry that in this case it will be difficult to pass PR from other linked pages.
Could anyone tell me if it really matters?