So to lay it out step by step:
1) When I do a search for "keywords x", the URL for our index page is returned in the SERP with "%20" tacked on to the end of it.
2) If you click on it, it goes to the homepage of our site, but only after being redirected by a page-not-found redirect.
3) Basically www.mydomain.com/%20 does not exist and we would like it to be changed to just www.mydomain.com/ - I see that there has been a new update, and this is the 2nd month in a row that it is happening (we waited for a new indexing because we thought it would automatically get fixed, but since it didn't I thought it was time to speak up about it). This is affecting our page ranking, and we would like to know if anything can be done to fix it.
This does not happen for all searches that send people to our site - If we do a search for "keywords y" the result that comes back is correct [mydomain.com...] (the correct URL).
I would be happy to email anyone the specifics of this to get it resolved, as it is having a major effect that I don't think will go away automatically anytime soon.
Go back through any of your referrals and start checking actual code.
I am concerned that it has nothing to do with our code. Rather, I'm fairly sure it is a google hiccup that is recurring, and I don't know who can slap it on the back!
Again, searching for 'keywords x' will return www.mydomain.com/%20 in the SERP. However, searching for 'keywords y' will return www.mydomain.com/ (without the %20).
Possibly the miscommunication I had with Brett is that I didn't make myself clear that I'm using two different keyword phrase searches to bring up my homepage. One of the searches, "keywords y", returns the homepage URL correctly - www.mydomain.com/ - while the other search, for "keywords x", returns my homepage URL incorrectly - www.mydomain.com/%20
However, as stated previously, it does not go to an error page, as it is scripted through yahoo store to return unfound URLs to the homepage, so it will never give a 404 error.
Thus google will never see this URL - www.mydomain.com/%20 - as a broken link and it will continue to stay indexed.
Hopefully this clarification, along with the full description previously in this thread, will show how google is erring here.
Odd ... over a hundred external links point to [mysite.com...], as well as internal links, and Google used the quirky link with the stray characters.
Again, searching for 'keywords x' will return www.mydomain.com/%20 in the SERP. However, searching for 'keywords y' will return www.mydomain.com/ (without the %20).
There was no miscommunication. You have 2 (or more) different copies of that page in the index. On one search term the file "/" is returned as the most relevant, and on the other "/%20" is considered to be more relevant.
Since you are not returning an error, that means that when it follows that link with a space, you are returning a 200 status. That means that the URL that they gave to your server was correct.
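As a rough illustration of why two copies end up in the index, here is a toy Python sketch of an indexer that only looks at status codes. The `build_index` function and its rules are hypothetical simplifications for this thread, not Google's actual logic.

```python
def build_index(crawl_results):
    """Toy model: keep one index entry per URL that answered 200.

    crawl_results is a list of (url, status, location) tuples.
    Hypothetical simplification, not how any real search engine is known to work.
    """
    index = set()
    for url, status, location in crawl_results:
        if status == 200:
            index.add(url)        # treated as a real, distinct page
        elif status == 301 and location:
            index.add(location)   # credit goes to the redirect target
        # 404 and 410 responses are simply not indexed
    return index


# Both URLs answer 200, so both stay in the index as separate pages.
soft_200_site = [
    ("http://www.mydomain.com/", 200, None),
    ("http://www.mydomain.com/%20", 200, None),
]
print(sorted(build_index(soft_200_site)))

# With a 301 on the stray URL, only the real home page remains.
fixed_site = [
    ("http://www.mydomain.com/", 200, None),
    ("http://www.mydomain.com/%20", 301, "http://www.mydomain.com/"),
]
print(sorted(build_index(fixed_site)))
```

Under this toy model, which of the two 200 copies ranks for a given phrase is left open. The sketch only shows why both are candidates in the first place.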
So Brett and others, I went to alltheweb and did the same thing buckworks did above. I found one external link that had me listed as www.mydomain.com/%20 in it. You are saying that I should contact that site and have them remove it, and google will then stop returning certain results as www.mydomain.com/%20? I agree with buckworks that this is very peculiar, seeing as I have hundreds of backlinks that list me correctly as www.mydomain.com/ and this one rogue backlink is enough to throw google off for certain very important keywords. Wow! And what is more disturbing is that someone (a competitor perhaps) could maliciously add a www.mydomain.com/%20 link to my site on their own site - and, boom, I'm gone for certain keyphrases in google.
In reply to the suggestion that I adjust my header script, I simply can't because I don't have that freedom in a Yahoo store format.
The other option I considered is using the google remove an outdated link tool and submitting www.mydomain.com/%20 but I am worried that will eliminate my index page (www.mydomain.com/) as well.
The link text only has our company name in it. No keyword phrase matches from 'keywords x' or 'keywords y'. The rogue link, as stated previously, was found doing a backlink check at alltheweb, but isn't even registering with google at all when backlink checking both with the %20 and without it - which kind of throws out the theory that one incorrect backlink could be the reason google is delivering www.mydomain.com/%20 for certain keyphrase results.
Incredibly frustrating. I wish someone from google could shed some conclusive light on this matter...
Do ask the other site to tidy up their link to you. It won't hurt, and might help, although no one can say for sure.
My page that this happened to has dropped about 200 ranks for its target phrase (I finally found it!). If that circumstance is related to the quirky link on someone else's site, then there's a much deeper problem here. If Google can't see past a quirk like that it opens the door to serious sabotage.
A quick Google search found a site whose owner had linked his email address with:
<a href="mailto: bob@mail.somedomain.com"> (notice the space directly after the colon) and one quick email to get the link
1. corrected without a space in it, and
2. written as a javascript link to fool spiders,
has cleared up the problem.
The amount of spam email has also decreased slightly in the last few months (and MailWasher has taken care of the rest of it).
So, this problem can easily occur on a mistyped link. I have seen %20 in a number of Google SERPs over the years, and all concerned were probably oblivious to the effects.
[edited by: g1smd at 3:31 am (utc) on June 18, 2003]
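To show how a mistyped link can turn into a %20 URL, here is a minimal Python sketch. It assumes a careless crawler that percent-encodes an href without trimming whitespace first. Both function names are hypothetical, and nothing here is known to be what any real spider does.

```python
from urllib.parse import quote


def naive_normalize(href):
    """Percent-encode the href as written, stray whitespace included (assumed crawler behaviour)."""
    scheme, sep, rest = href.partition("://")
    return scheme + sep + quote(rest, safe="/:@")


def careful_normalize(href):
    """Trim surrounding whitespace before encoding."""
    return naive_normalize(href.strip())


mistyped = "http://www.mydomain.com/ "   # note the trailing space inside the href
print(naive_normalize(mistyped))         # http://www.mydomain.com/%20
print(careful_normalize(mistyped))       # http://www.mydomain.com/
```

The trailing space is invisible in the page source at a glance, which fits the point above that the people concerned were probably oblivious to it.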
In reply to the suggestion that I adjust my header script, I simply can't because I don't have that freedom in a Yahoo store format.
Then Yahoo is broken, not Google. Google is doing just what it's supposed to - indexing a page that is linked to.
If Google hits the page and gets a valid response, how should it know that it's not actually valid?
What is important is WHY google is returning a result with %20 in the URL at all. From what I've gleaned so far, the only suggestion is that one rogue backlink with a %20 in the URL could trigger google to provide an erroneous result, when hundreds of other backlinks are all correct.
I've experienced a similar phenomenon. A major site of mine was indexed with a /?tracking after the domain, where "tracking" was a three-letter code for a referral site. That was a year and a half ago. Now, whenever the index of my site shows in the SERPs (for any keyword that brings up the index, including looking just for the URL), it shows as www.mysite.com/?track
But it didn't affect my rankings at all. In fact, after other optimizations, the index was ranked for a term it hadn't ranked for before (not well, or possibly not at all), still showing the tracking code.
Are you certain your rank change hasn't occurred naturally via Google updates?
What is important is WHY google is returning a result with %20 in the URL in it at all.
Why shouldn't Google return that as a result? It's a URL, and if clicking it doesn't return an HTTP error, then it's a valid URL.
Now, why does it rank higher than the page that you want to rank higher? I don't know. But if web sites don't return the correct error codes for incorrect URLs, then search engines will just keep filling up with useless pages.
The reason google is showing that result is because YOUR SITE IS BROKEN!
It really is that simple.
The internet works a specific way, and the HTTP protocol has specific return codes. Your site is violating the specs.
I could link to your yahoo store to a file named "argle-bargle-woof-woof" that is not there, and your site will send me to the home page and return a 200 telling me that it was successful. I have no way of knowing that that file does not exist.
Now if you were to return a 301 pointing to your home page you would not have this problem. If you were to return a 404 or a 410 while returning the code for your home page, you also would not have this problem.
You say you are going elsewhere to look for help? I would suggest going to yahoo stores and pointing them to this thread. They are the ONLY ones that can help you. All we can do is tell you what the problem is, and the problem is that your site is broken.
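The three behaviours described above can be sketched side by side in Python. This is only an illustration under stated assumptions: `HOME_PAGE`, `KNOWN_PATHS` and the three handler functions are made up for this example and have nothing to do with how Yahoo Store is actually implemented.

```python
HOME_PAGE = "<html><body>Home page</body></html>"   # placeholder content
KNOWN_PATHS = {"/": HOME_PAGE}                      # the only URL that really exists


def soft_200(path):
    """The behaviour complained about: any unknown URL gets the home page with a 200."""
    return 200, {}, KNOWN_PATHS.get(path, HOME_PAGE)


def redirect_301(path):
    """Fix 1: unknown URLs are permanently redirected to the home page."""
    if path in KNOWN_PATHS:
        return 200, {}, KNOWN_PATHS[path]
    return 301, {"Location": "/"}, ""


def home_with_404(path):
    """Fix 2: unknown URLs still show the home page content, but with a 404 status."""
    if path in KNOWN_PATHS:
        return 200, {}, KNOWN_PATHS[path]
    return 404, {}, HOME_PAGE


for handler in (soft_200, redirect_301, home_with_404):
    status, headers, _body = handler("/%20")
    print(handler.__name__, status, headers)
```

Visitors see much the same thing in all three cases. The only difference is the status code, and that is the part a spider relies on.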
In my case, the pages with the rogue links did not have as high a PR as some of the other pages (both internal and external) which had the correct links.
BigDave, I'm not enough of a techie to diagnose anyone's error code problems, but with my site, if you type the rogue URL into the browser, it shows my 404 custom error page before taking you to the home page.
Despite that, Google still indexed the rogue URL.
Something is definitely weird.
I've had this problem recently too.
Assuming you are able to use PHP, maybe create a directory called /%20 and put an index.php or something in with:
<?php header("HTTP/1.1 301 Moved Permanently"); header("Location: [mydomain.com...]"); ?>
That will tell google that the /%20 URL has permanently moved to the real home page.
Unfortunately, you do not have a custom 404 page. You have a 302 temporary redirect to an error page that returns a 200 OK. At no time do you tell the client that there was an error.
If you want to see what the return code is, at the top of this page, click on the "control panel" link. Then near the bottom of the column on the left there is a link for "server headers".
Enter a URL on your site that does not exist. You want your 404 page to return a 404.
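If you would rather check the headers yourself, a small Python script can do roughly the same job as the "server headers" tool. This is a minimal sketch using only the standard library. The function names `head`, `trace` and `signals_error` are made up for this example, and the URLs in the usage part are placeholders.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def head(url):
    """Send one HEAD request without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace(url, fetch=head, max_hops=5):
    """Follow redirects by hand and record (url, status) for every hop."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status in (301, 302, 303, 307, 308) and location:
            url = urljoin(url, location)
        else:
            break
    return hops


def signals_error(hops):
    """True only if some hop actually told the client the page is missing."""
    return any(status in (404, 410) for _, status in hops)


# Simulated responses matching the 302 -> 200 chain described above (no network needed).
fake_site = {
    "http://mysite.example/no-such-page": (302, "/error.html"),
    "http://mysite.example/error.html": (200, None),
}
hops = trace("http://mysite.example/no-such-page", fetch=lambda u: fake_site[u])
print(hops)
print("error signalled:", signals_error(hops))
```

To test a live site, call `trace()` with a URL that should not exist and leave out the `fetch` argument. A chain that ends in 200 without ever showing a 404 or 410 is the situation described above.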
I just went to check some search terms so I could share more details with someone, and things are different. In www-fi, where I saw the problem, I can't find the rogue listing anymore in any search, and the "proper" URL is ranking okay for its target term. For now, anyhow!
Hmm ... what was Winnie the Pooh's line about figuring out puzzles?
The site with the rogue link problem is not the one in my profile. However, it has a similar .htaccess file.
The line I'm using in my .htaccess to call the error page is this:
ErrorDocument 404 [mysite.com...]
In the HEAD tags of the error page, I use an HTTP refresh to take folks to the home page after they've had a second to read the error message and know what's going on. Fortunately few people ever see that error page!
The code I'm using for that is:
<META HTTP-EQUIV="refresh" CONTENT="1;URL=http://mysite.com/">
What do I need to change?