| 10:45 pm on Jun 16, 2003 (gmt 0)|
I've seen this several times before. Someone has linked to you, or you have linked to yourself, accidentally with a space on the end of the URL. It is very difficult to backtrack and figure out where or who it was. Many log analyzers will strip that space out, or you just won't notice it in the display.
Go back through any of your referrals and start checking actual code.
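One way to "check actual code" is to scan the raw HTML of a referring page for link targets that begin or end with whitespace. The following is a minimal sketch, not anyone's actual tool: the class name, function name and sample markup are made up for illustration, and it only looks at `href` attributes on `<a>` tags. A crawler that does not trim the attribute may end up requesting the trailing space as `%20`.

```python
from html.parser import HTMLParser


class StrayWhitespaceLinkFinder(HTMLParser):
    """Collect href values that start or end with whitespace."""

    def __init__(self):
        super().__init__()
        self.suspect_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # value is None when the attribute is written without a value
            if name == "href" and value and value != value.strip():
                self.suspect_links.append(value)


def find_suspect_links(html_text):
    """Return every href in html_text that has leading or trailing whitespace."""
    finder = StrayWhitespaceLinkFinder()
    finder.feed(html_text)
    return finder.suspect_links


# Hypothetical page source: one clean link and one with a trailing space.
sample = (
    '<a href="http://www.mydomain.com/">good</a> '
    '<a href="http://www.mydomain.com/ ">bad</a>'
)
print(find_suspect_links(sample))
```

Running it over the saved source of each referring page narrows the search to the links that a log analyzer would display as if they were clean.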
| 11:11 pm on Jun 16, 2003 (gmt 0)|
Thanks for the prompt reply brett.
I am concerned that it has nothing to do with our code. Rather, I'm fairly sure it is a google hiccup that is recurring and I don't know who can slap it on the back!
Again, searching for 'keywords x' will return www.mydomain.com/%20 in the SERP. However, searching for 'keywords y' will return www.mydomain.com/ (without the %20).
Possibly the miscommunication I had with brett is that I didn't make myself clear: I'm using two different keyword phrase searches to bring up my homepage. One of the searches, "keywords y", returns the homepage URL correctly (www.mydomain.com/), while the other search, "keywords x", returns my homepage URL incorrectly (www.mydomain.com/%20).
| 11:24 pm on Jun 16, 2003 (gmt 0)|
If the link returns a 404 error or a 301 permanent, it should take care of itself.
| 11:48 pm on Jun 16, 2003 (gmt 0)|
Again thanks to mcavic for the reply.
However, as stated previously, it does not go to an error page, as it is scripted through yahoo store to return unfound URLs to the homepage, so it will never give a 404 error.
Thus google will never see this URL - www.mydomain.com/%20 - as a broken link and it will continue to stay indexed.
Hopefully this clarification, along with the full description previously in this thread, will show how google is erring here.
| 2:18 am on Jun 17, 2003 (gmt 0)|
|to return unfound URLs to the homepage |
Point taken. But the page should do that using a 301, so that Google and others will realize that it's invalid.
| 2:20 am on Jun 17, 2003 (gmt 0)|
%20 is a space...
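For anyone who wants to see the encoding in both directions, here is a small sketch using Python's standard `urllib.parse` module. The path `"/ "` is simply the home page path with a stray trailing space, as discussed in this thread.

```python
from urllib.parse import quote, unquote

# Percent-decoding: %20 is the encoded form of a single space character.
print(repr(unquote("/%20")))

# Percent-encoding: a path of "/" followed by a space becomes "/%20".
print(quote("/ "))
```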
| 2:43 am on Jun 17, 2003 (gmt 0)|
I noticed this exact same thing happening to one of my sites yesterday. By checking backlinks for [mysite.com...] in AllTheWeb I found two pages whose link to my site included those stray characters. Both pages were on the same site. I sent a note requesting "a small correction" while also expressing gratitude for the links. I heard back from the webmaster today and the corrections have been made.
Odd ... over a hundred external links point to [mysite.com...] , as well as internal links, and Google used the quirky link with the stray characters.
| 3:09 am on Jun 17, 2003 (gmt 0)|
For me, I have a couple of hundred links like www.domain.com/mypage.php?user=someone etc.
There are about 5 different types of URLs showing up in my SERPs, and they seem random. I know for a fact no one is linking to me with them, and I have no links anywhere to these strange URLs.
| 4:01 am on Jun 17, 2003 (gmt 0)|
|Again, searching for 'keywords x' will return www.mydomain.com/%20 in the SERP. However, searching for 'keywords y' will return www.mydomain.com/ (without the %20). |
There was no miscommunication. You have 2 (or more) different copies of that page in the index. On one search term the file "/" is returned as the most relevant, and on the other "/%20" is considered to be more relevant.
Since you are not returning an error, that means that when it follows that link with a space, you are returning a 200 status. That means that the URL that they gave to your server was correct.
| 6:37 am on Jun 17, 2003 (gmt 0)|
Yes, you can still return your index page for /%20, but set the headers to 301 (to conserve PR), or to 404 just to get the bad links out. Replies in this thread are right on the mark.
| 2:19 pm on Jun 17, 2003 (gmt 0)|
"I noticed this exact same thing happening to one of my sites yesterday. By checking backlinks for [mysite.com...] in AllTheWeb I found two pages whose link to my site included those stray characters. Both pages were on the same site. I sent a note requesting "a small correction" while also expressing gratitude for the links. I heard back from the webmaster today and the corrections have been made."
So Brett and others, I went to alltheweb and did the same thing buckworks did above. I found one external link that had me listed as www.mydomain.com/%20. Are you saying that I should contact that site and have them fix it, and google will then stop returning certain results as www.mydomain.com/%20? I agree with buckworks that this is very peculiar, seeing as I have hundreds of backlinks that list me correctly as www.mydomain.com/ and this one rogue backlink is enough to throw google off for certain very important keywords. Wow! And what is more disturbing is that someone (a competitor perhaps) could intentionally add a www.mydomain.com/%20 link to my site on their site - and, boom, I'm gone for certain keyphrases in google.
In reply to the suggestion that I adjust my header script, I simply can't because I don't have that freedom in a Yahoo store format.
The other option I considered is using the google remove an outdated link tool and submitting www.mydomain.com/%20 but I am worried that will eliminate my index page (www.mydomain.com/) as well.
| 3:05 pm on Jun 17, 2003 (gmt 0)|
Did the external link you found that linked to www.mydomain.com/%20 have 'keywords x' in the link text, or 'keywords y'?
My money's on 'keywords x'.
| 3:18 pm on Jun 17, 2003 (gmt 0)|
The link text only has our company name in it. No keyword phrase matches from 'keywords x' or 'keywords y'. The rogue link, as stated previously, was found doing a backlink check at alltheweb, but isn't even registering with google at all when backlink checking both with the %20 and without it - which kind of throws out the theory that one incorrect backlink could be the reason google is delivering www.mydomain.com/%20 for certain keyphrase results.
Incredibly frustrating. I wish someone from google could shed some conclusive light on this matter...
| 3:36 pm on Jun 17, 2003 (gmt 0)|
Remember that Google doesn't show nearly as many backlinks as AllTheWeb ... only those with a PR higher than a certain threshold (believed to be around PR4).
Do ask the other site to tidy up their link to you. It won't hurt, and might help, although no one can say for sure.
My page that this happened to has dropped about 200 ranks for its target phrase (I finally found it!). If that circumstance is related to the quirky link on someone else's site, then there's a much deeper problem here. If Google can't see past a quirk like that it opens the door to serious sabotage.
| 4:38 pm on Jun 17, 2003 (gmt 0)|
Regarding the single errant www.mydomain.com/%20 backlink, I have just notified the webmaster of the mistake. While I'm not happy to see someone else with this same frustrating problem, it does show that there is something wrong here with google and it's really hurting rankings. I have dropped from #5 SERP to #43.
| 3:18 am on Jun 18, 2003 (gmt 0)|
A friend had been getting spam and junk mail addressed to "%20email@example.com" for some time. He was always puzzled by it but never tried to find out why.
A quick Google search found a site that had linked his email address with:
<a href="mailto: firstname.lastname@example.org"> (notice the space directly after the colon), and one quick email to get the link corrected, without a space in it, has cleared up the problem.
The amount of spam email has also decreased slightly in the last few months too (and MailWasher has taken care of the rest of it).
So, this problem can easily occur on a mistyped link. I have seen %20 in a number of Google SERPs over the years, and all concerned were probably oblivious to the effects.
[edited by: g1smd at 3:31 am (utc) on June 18, 2003]
| 3:26 am on Jun 18, 2003 (gmt 0)|
|In reply to the suggestion that I adjust my header script, I simply can't because I don't have that freedom in a Yahoo store format. |
Then Yahoo is broken, not Google. Google is doing just what it's supposed to - indexing a page that is linked to.
If Google hits the page and gets a valid response, how should it know that it's not actually valid?
| 1:19 pm on Jun 18, 2003 (gmt 0)|
You're really missing the point here, mcavic. What is important is WHY google is returning a result with %20 in the URL at all. From what I've gleaned so far, the only suggestion is that one rogue backlink with a %20 in the URL could trigger google to provide an erroneous result, when hundreds of other backlinks are all correct. It doesn't add up, though, because one different link shouldn't ever be seen as more relevant than 100s of others. If this is indeed the answer then, yes, google and webmasters alike are in jeopardy of competitors intentionally creating skewed backlinks that will damage SERPs.
| 1:31 pm on Jun 18, 2003 (gmt 0)|
I guess it depends on the quality of the backlink and the anchor text used - if you have one PR 7 site linking to the wrong URL (and only linking to you) and 100 PR 4 sites linking to you and dozens of other pages then I can see the one link having more weight.
| 1:37 pm on Jun 18, 2003 (gmt 0)|
looking for help elsewhere now
| 3:56 pm on Jun 18, 2003 (gmt 0)|
|What is important is WHY google is returning a result with %20 in the URL in it at all. From what I've gleaned so far, the only suggestion is that one rogue backlink with a %20 in the url could trigger google to provide an erroneous result, when hundreds of other backlinks are all correct |
I've experienced a similar phenomenon. A major site of mine was indexed with a /?tracking after the domain, where tracking was a three-letter code of a referral site. That was a year and a half ago. Now, whenever the index of my site shows in the SERPs (for any keyword that brings up the index, including searching just for the URL), it shows as www.mysite.com/?track
It didn't affect my rankings at all. In fact, after other optimizations, the index was ranked for a term it hadn't ranked (well, or possibly at all) on before, still showing the tracking code.
Are you certain your rank change hasn't occurred naturally via Google updates?
| 4:37 pm on Jun 18, 2003 (gmt 0)|
|What is important is WHY google is returning a result with %20 in the URL in it at all. |
Why shouldn't Google return that as a result? It's a URL, and if clicking it doesn't return an HTTP error, then it's a valid URL.
Now, why does it rank higher than the page that you want to rank higher? I don't know. But if web sites don't return the correct error codes for incorrect URLs, then search engines will just keep filling up with useless pages.
| 4:50 pm on Jun 18, 2003 (gmt 0)|
The reason google is showing that result is because YOUR SITE IS BROKEN!
It really is that simple.
The internet works a specific way, and the HTTP protocol has specific return codes. Your site is violating the specs.
I could link to your yahoo store with a file named "argle-bargle-woof-woof" that is not there, and your site will send me to the home page and return a 200 telling me that it was successful. I have no way of knowing that that file does not exist.
Now if you were to return a 301 pointing to your home page you would not have this problem. If you were to return a 404 or a 410 while returning the code for your home page, you also would not have this problem.
You say you are going elsewhere to look for help? I would suggest going to yahoo stores and pointing them to this thread. They are the ONLY ones that can help you. All we can do is tell you what the problem is, and the problem is that your site is broken.
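To make the 200-versus-301 point concrete, here is a toy server written with Python's standard `http.server` module. It is only a sketch under stated assumptions: the site has exactly one real page (`/`), and the names `response_for`, `Handler` and `serve` are invented for this illustration. It says nothing about how Yahoo Store is actually implemented.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HOME = "/"  # assumption: the only path that really exists on this toy site


def response_for(path):
    """Return (status, location) for a raw request path."""
    if path == HOME:
        return 200, None
    # Unknown path (for example "/%20"): redirect permanently to the home
    # page instead of serving the home page content with a 200.
    return 301, HOME


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = response_for(self.path)
        self.send_response(status)
        if location is not None:
            self.send_header("Location", location)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"<h1>Home page</h1>"
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the toy site on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting `/%20` or `/argle-bargle-woof-woof` yields a 301 with `Location: /`, so a client can tell that the requested URL is not the real one. Returning 404 or 410 from `response_for` while still sending the home page body would be the other option described above.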
| 5:15 pm on Jun 18, 2003 (gmt 0)|
<<I guess it depends on the quality of the backlink and the anchor text used - if you have one PR 7 site linking to the wrong URL (and only linking to you) and 100 PR 4 sites linking to you and dozens of other pages then I can see the one link having more weight. >>
In my case, the pages with the rogue links did not have as high a PR as some of the other pages (both internal and external) which had the correct links.
BigDave, I'm not enough of a techie to diagnose anyone's error code problems, but with my site, if you type the rogue URL into the browser, it shows my 404 custom error page before taking you to the home page.
Despite that, Google still indexed the rogue URL.
Something is definitely weird.
| 5:46 pm on Jun 18, 2003 (gmt 0)|
I also still think something is screwy here. However, bigdave, I'll try anything at this point and am willing to contact yahoo and discuss the possibility of setting up a more appropriate return code. I would like to know what you suggest to be the best choice for return code? The 301 seems great from my end because it informs the viewer that the page has been moved and then automatically takes them there. But, more importantly, will a 301 provide the correct info to google so that it will drop the www.mydomain.com/%20 from its index?
| 5:49 pm on Jun 18, 2003 (gmt 0)|
Your host is probably using a 302 header, instead of 301 - which would work correctly. Get a new host ;).
I've had this problem recently too.
Assuming you are able to use PHP, maybe create a directory called /%20 and put an index.php or something in with:
<?php header("HTTP/1.1 301 Moved Permanently"); header("Location: [mydomain.com...]"); ?>
That will tell google that it doesn't exist.
| 6:07 pm on Jun 18, 2003 (gmt 0)|
Unfortunately, you do not have a custom 404 page. You have a 302 temporary redirect to an error page that returns a 200 OK. At no time do you tell the client that there was an error.
If you want to see what the return code is, at the top of this page, click on the "control panel" link. Then near the bottom of the column on the left there is a link for "server headers".
Enter a URL on your site that does not exist. You want your 404 page to return a 404.
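The same check can be done from a script. Below is a rough sketch using Python's standard `http.client`, which does not follow redirects, so the first status code the server sends is the one reported. The function names and the wording of the explanations are this sketch's own; the host and path in the usage note are placeholders.

```python
import http.client


def fetch_status(host, path, port=80):
    """Send one GET request and return (status, Location header).

    Redirects are not followed, so a 301 or 302 is reported as-is.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def explain(status):
    """Describe what a status code tells a client about the requested URL."""
    meanings = {
        200: "OK: the server says this URL is valid",
        301: "moved permanently: use the Location URL instead",
        302: "found: temporary redirect, the original URL stays valid",
        404: "not found: this URL does not exist",
        410: "gone: this URL was removed on purpose",
    }
    return meanings.get(status, "other status: check the HTTP specification")
```

For example, `fetch_status("www.mydomain.com", "/no-such-page")` followed by `explain(...)` on the returned status shows whether a missing page answers with a 404, or with a 302 and then a 200 as described above.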
| 6:11 pm on Jun 18, 2003 (gmt 0)|
Something's happening ...
I just went to check some search terms so I could share more details with someone, and things are different. In www-fi, where I saw the problem, I can't find the rogue listing anymore in any search, and the "proper" URL is ranking okay for its target term. For now, anyhow!
Hmm ... what was Winnie the Pooh's line about figuring out puzzles?
| 6:32 pm on Jun 18, 2003 (gmt 0)|
The site with the rogue link problem is not the one in my profile. However, it has a similar .htaccess file.
The line I'm using in my .htaccess to call the error page is this:
ErrorDocument 404 [mysite.com...]
In the HEAD tags of the error page, I use an HTTP refresh to take folks to the home page after they've had a second to read the error message and know what's going on. Fortunately few people ever see that error page!
The code I'm using for that is:
<META HTTP-EQUIV="refresh" CONTENT="1;URL=http://mysite.com/">
What do I need to change?
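One thing worth checking, assuming the elided `ErrorDocument` target above is a full `http://` URL: Apache's documentation notes that when `ErrorDocument` points at a remote URL (anything with a scheme in front of it), the server sends the client a redirect, and the client never receives the original error status. That would match the 302 described earlier in this thread. The sketch below flags such lines in an `.htaccess` file; the function name and the sample paths are hypothetical.

```python
def remote_error_documents(htaccess_text):
    """Return (status_code, target) pairs whose ErrorDocument target is a full URL."""
    flagged = []
    for line in htaccess_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].lower() == "errordocument":
            code, target = parts[1], parts[2].strip()
            # A target with a scheme makes the server redirect the client
            # instead of answering with the error status itself.
            if target.lower().startswith(("http://", "https://")):
                flagged.append((code, target))
    return flagged


# Hypothetical .htaccess content: one remote target and one local path.
sample = (
    "ErrorDocument 404 http://mysite.example/error.html\n"
    "ErrorDocument 403 /forbidden.html\n"
)
print(remote_error_documents(sample))
```

A local path such as `/error.html` lets the server return the error page together with the 404 status.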