In my case, it happened because a higher profile site linked to my site as http://example.com instead of http://www.example.com. I fixed the problem, and I jumped up in the serps again.
Today, while going through some links at Yahoo with link:www....., I found that some of my priority pages had weird links pointing to them from within my own site.
Example of link location going to the home page: www.example.com/~mysi/product1.htm
If I key that into the url, it gives me the product page (which should be just www.example.com/product1.htm).
To me, this looks just like the dupe content fiasco when you keyed in the url with or without the WWW, and still got the same result.
I'm not completely knocked out of the serps for my key terms, but page three is no picnic either. It really should be better than that.
Should I fix it? How do I fix it? What causes it?
[edited by: tedster at 4:46 pm (utc) on Feb. 13, 2008]
[edit reason] switch to example.com [/edit]
"If I key that into the url, it gives me the product page"
I don't understand why - your server configuration should not resolve urls with "extra" directories inserted. Are you using some form of url rewriting? If so, it needs some tweaking so the url cannot be "hacked" and still resolve with a 200 status.
The code -
RewriteEngine On
#
# Redirect direct client requests for index.html or index.htm to the bare directory URL
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(([^/]+/)*)index\.html?\ HTTP/
RewriteRule index\.html?$ http://www.example.com/%1 [R=301,L]
#
# Redirect the non-www hostname to www (hostnames are case-insensitive, hence NC)
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
[edited by: tedster at 5:52 pm (utc) on Feb. 13, 2008]
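To see which URLs the rules above actually catch, here is a rough Python sketch that mimics them. It is only an approximation of what mod_rewrite does, and the function name redirect_target is made up for illustration.

```python
import re
from urllib.parse import urlsplit

CANONICAL_HOST = "www.example.com"


def redirect_target(url):
    """Return the URL the posted rules would 301 to, or None if no rule matches."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""

    # index.html / index.htm rules: only fire when the request line ends
    # right after the file name, so a query string prevents a match.
    m = re.fullmatch(r"/((?:[^/]+/)*)index\.html?", path)
    if m and not parts.query:
        return f"http://{CANONICAL_HOST}/{m.group(1)}"

    # Non-www rule: same path (and query string) on the www hostname.
    if host.startswith("example.com"):
        return f"http://{CANONICAL_HOST}{path}{query}"

    # Nothing matched, so the server handles the URL as requested.
    return None


for candidate in (
    "http://example.com/product1.htm",
    "http://www.example.com/shop/index.htm",
    "http://www.example.com/~mysi/product1.htm",
):
    print(candidate, "->", redirect_target(candidate))
```

In this sketch nothing matches the /~mysi/ path on the www hostname, so none of these rules would redirect it. That fits with the odd URL still returning the product page.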
I know absolutely for sure that there is NO folder called ~mysi or any variation thereof.
However, I believe that the mysit (it IS supposed to be missing the last letter) is actually a login name. The host chops off the last letter of the domain, and makes that your login for certain things.
Think there's a chance that somehow the programming caused a nightmare wiring mess, resulting in odd urls tied to a username? Yeah.... that sounds crazy.
So, if I add that line in the robots.txt, that would just tell the robots to completely ignore anything related to said directory... even though the directory isn't there in the first place?
How about I do both - 301 AND robots.txt?
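For what it's worth, the effect of a robots.txt line can be checked offline with Python's standard urllib.robotparser. This is just a sketch: the Disallow line below is an assumption about what "that line" would look like (the thread does not show it), and is_crawlable is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real line was not quoted in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /~mysi/
"""


def is_crawlable(url, agent="Googlebot"):
    """Return True if the assumed robots.txt lets this agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


for candidate in (
    "http://www.example.com/product1.htm",
    "http://www.example.com/~mysi/product1.htm",
):
    print(candidate, "crawlable:", is_crawlable(candidate))
```

robots.txt matches on the URL path, so it does not matter whether a real folder exists. One thing to keep in mind when combining the two: a robot that obeys the Disallow never requests the /~mysi/ URL, so it would not get to see a 301 placed on it.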