Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 325 message thread spans 11 pages; this is page 11.
Google Is Working on an Algo Fix - to help wrongly demoted sites

 8:56 am on Mar 2, 2011 (gmt 0)

Here's official news that many sites have been waiting to hear. Google Fellow Amit Singhal is quoted in Wired:

"We deeply care about the people who are generating high-quality content sites, which are the key to a healthy web ecosystem," Singhal said.

"Therefore any time a good site gets a lower ranking or falsely gets caught by our algorithm - and that does happen once in a while even though all of our testing shows this change was very accurate - we make a note of it and go back the next day to work harder to bring it closer to 100 percent."

"That's exactly what we are going to do, and our engineers are working as we speak building a new layer on top of this algorithm to make it even more accurate than it is," Singhal said.




 3:48 am on Mar 14, 2011 (gmt 0)

crobb305, I too saw these strange 404s in GWT for URLs that never existed on the affected domain. I gave the example in some thread here. However, I haven't addressed them with 301s, as these pages never existed on the domain. Are other sites affected by this update seeing this?

Odd characters are being put into the URLs that Googlebot tries to crawl, thereby producing a 404. For example, www.example.com/&837262intendedpage.htm. I don't know how to eradicate these if they are being "discovered" on search portals.

I also see those 404s for URLs like the ones you described above. These also happen when the incoming links are from PDF files. I used to redirect them to the correct URLs, as those pages existed on the domain and the errors were caused by malformed incoming URLs on third-party sites.

But I am now thinking of removing the redirects for those odd URLs, as Google might be viewing them negatively.

Can someone help me with this? Should I retain the redirects for the strange URLs that crobb305 has described, or should I remove them? I added those 301s after I saw them in the GWT report on crawl errors.


 3:53 am on Mar 14, 2011 (gmt 0)

I'd do what's best for the visitor, especially in light of this recent thread: The TRUTH about Linking [webmasterworld.com]

Google's got so many people scared [bad-word]less to do anything [that just plain makes sense from a visitor perspective] it's not even funny ... I'd forget about G's BS for a minute and listen to some of G's BS: Build Your Site for Visitors, NOT Google ... [preceding emphasis mine] ... By not redirecting the URLs because you're afraid of the Big Bad Bot you're NOT building your site for visitors, you're building it for Google.


 3:56 am on Mar 14, 2011 (gmt 0)

By not redirecting the URLs because you're afraid of the Big Bad Bot you're NOT building your site for visitors, you're building it for Google.

I had been redirecting them all along, but this Google algo change is what is making me reconsider. Anyway, thanks for your input.


 4:02 am on Mar 14, 2011 (gmt 0)

Yeah, I wouldn't re-think...

What makes sense from a visitor perspective, redirecting a poorly formed URL to the correct information or serving a 410 page?

What makes a better user experience, redirecting from a poorly formed URL to the correct information or serving a 410 page?

Which is the higher 'quality' website, the one that serves up a bunch of 410 pages when a visitor tries to follow a link to it or the one that takes the visitor to the correct information?

If people quit adjusting things that make sense to do from a visitor perspective 'because Google might ... [blah here]', it would probably actually make everyone's job easier, including theirs.


 4:16 am on Mar 14, 2011 (gmt 0)

It looks like every one of the malformed URLs reported in GWT has this inserted in front of the file name: %E2%80%8B

Every one of these junk portals is adding that, and Googlebot is encountering 404s (30 to 50 on each crawl). My best bet might be to create an htaccess rule to strip that out and 301 to the correct form, but my htaccess skills are poor. I'm not sure how to strip it out. Like you said, it would be better to keep those visitors coming from those sources by doing a redirect.


 4:19 am on Mar 14, 2011 (gmt 0)

Something I think people should keep in mind when considering whether to redirect is (yes, this is a visitor-based perspective, but it's the one I usually work with): If someone clicks on a link to your site somewhere and the page is 'not found' or 'gone' you're the one who looks bad ... You know it's a malformed URL on the other site, but from a visitor perspective, I can't imagine complaining about (and I haven't ever had a complaint from) ending up at the information I wanted to find, but I know there are sites I don't visit if I get a couple of 'not found' errors in a row ... I don't go find another page of resources if I've found one already, I just look for the information I could have found at your site somewhere else, because you didn't bother to take me to it when I clicked.


 4:21 am on Mar 14, 2011 (gmt 0)

Searching on %E2%80%8B is yielding some information; apparently others have been reporting it recently and trying to resolve the same issue. I agree with you, MadScientist. I am trying to do this so I can reroute the visitor to the correct page, without returning a 404 or 410, while simultaneously eliminating excessive 404s that Google has said can cause problems.


 4:26 am on Mar 14, 2011 (gmt 0)

Remember there is a big difference between excessive 404s on internal links compared to external links. If excessive external 404s could cause problems, a lot of webmasters would aim them at their competition.

External 404 links are shown in WMT as an FYI - not as a warning of impending trouble. So I ask whether any real traffic would come from those links. If not, I just forget about it.
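The internal/external distinction above can be checked mechanically if you have the "linked from" page for each 404 (for example from the crawl-error report or the Referer field in your server logs). The following is a minimal Python sketch, not anything Google provides; `SITE_HOST` and the function name are assumptions made for illustration.

```python
from urllib.parse import urlsplit

SITE_HOST = "www.example.com"  # assumption: your site's own hostname


def classify_404_source(referrer):
    """Label the page that linked to a 404 as 'internal', 'external', or 'unknown'."""
    # An empty or placeholder referrer (such as "-" in many logs) has no hostname.
    host = urlsplit(referrer).hostname if referrer else None
    if host is None:
        return "unknown"
    return "internal" if host.lower() == SITE_HOST else "external"
```

Only the 'internal' group points at links you control and can fix directly; the 'external' group is where the question of whether any real traffic arrives applies.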

[edited by: tedster at 4:27 am (utc) on Mar 14, 2011]


 4:27 am on Mar 14, 2011 (gmt 0)

This should be in the Apache Forum [webmasterworld.com]. To improve, error correct, get installation help with, troubleshoot, or even understand this code, please visit there so we don't get way off track and topic here. That said, a 'down and dirty' way to error correct the beginning of a malformed URL is:

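# Redirect any request that ends in the page's path but is not the exact correct URL.
RewriteEngine on
# Skip requests that are already correct (an optional query string is allowed, to avoid a redirect loop).
RewriteCond %{THE_REQUEST} !^[A-Z]{3,9}\ /the-path/to/the-page\.ext(\?[^\ ]*)?\ HTTP/1\.
# The target is a plain string, not a regex, so its dot needs no backslash.
RewriteRule the-path/to/the-page\.ext$ http://www.example.com/the-path/to/the-page.ext [R=301,L]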

* Understanding exactly what the preceding does is HIGHLY recommended prior to installation, but please, have that discussion in the Apache Forum, where you can find some very high-quality information, and resources, both in the library and from the regular posters.
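For anyone who wants to trace the logic before installing anything, here is a rough Python sketch of what a condition/rule pair of this shape does. It is an approximation written for illustration, not a substitute for testing on a real server. It assumes the rule lives in a root .htaccess file, where the rule pattern is matched against the URL-decoded path without its leading slash. It treats the correct URL, with or without a query string, as already correct; the query-string allowance is an assumption chosen so that such requests are not redirected to themselves. The function name is hypothetical.

```python
import re
from urllib.parse import unquote

# Placeholder page path, matching the one used in the rule above.
CANONICAL = "http://www.example.com/the-path/to/the-page.ext"

# Condition: the raw request line is already the exact, correct URL.
COND = re.compile(r"^[A-Z]{3,9} /the-path/to/the-page\.ext(\?[^ ]*)? HTTP/1\.")
# Rule pattern: the decoded path merely ENDS with the correct path.
RULE = re.compile(r"the-path/to/the-page\.ext$")


def redirect_target(request_line):
    """Return the 301 target for a raw request line, or None if no redirect applies."""
    # The "!" on the condition means: go on only if the request is NOT already correct.
    if COND.search(request_line):
        return None
    raw_path = request_line.split(" ")[1]
    # Drop any query string, decode percent-escapes, strip the leading slash.
    path = unquote(raw_path.split("?", 1)[0]).lstrip("/")
    return CANONICAL if RULE.search(path) else None
```

A request line such as `GET /%E2%80%8Bthe-path/to/the-page.ext HTTP/1.1` yields the canonical URL, while the correct request and requests for unrelated pages yield `None`.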

[edited by: TheMadScientist at 4:32 am (utc) on Mar 14, 2011]


 4:30 am on Mar 14, 2011 (gmt 0)

Thanks MadScientist. Definitely don't want to get the thread off track, I just find it odd that there are so many malformed inbound links to my "thin" pages, serving 404 to Googlebot. Only my 5 thin/money pages are having this problem, so just trying to cover all my bases and figure out what has caused the 60% traffic drop. I already added new content to the thinnest pages.

Thanks for the tips.


 4:35 am on Mar 14, 2011 (gmt 0)

Np crobb305 ... I've seen you around here for years and you contribute quite a bit, so I'm happy to be able to give you something back ... Hope it helps you out some.


 4:46 am on Mar 14, 2011 (gmt 0)

%E2%80%8B is the URL-encoded (percent-encoded) UTF-8 byte sequence for the Unicode character U+200B, a "zero width space". It has been implicated in several exploits of various kinds, and some email address obfuscation scripts insert it to hide the real address from email harvesters.
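The encoding is easy to confirm with a few lines of Python. This is a small sketch using only the standard library; the function name and the sample path are made up for illustration.

```python
from urllib.parse import quote, unquote

ZERO_WIDTH_SPACE = "\u200b"  # U+200B; its UTF-8 bytes are E2 80 8B


def strip_zero_width_space(path):
    """Decode a percent-encoded path, drop any U+200B, and re-encode it."""
    decoded = unquote(path)
    cleaned = decoded.replace(ZERO_WIDTH_SPACE, "")
    return quote(cleaned)  # "/" is left unescaped by default


# Hypothetical malformed inbound link path
print(strip_zero_width_space("/%E2%80%8Bintendedpage.htm"))
```

Note that decoding and re-encoding the whole path can also alter other escaped characters (an encoded slash, for instance), so this is meant to show what the sequence is, not to serve as a general URL cleaner.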

I've got to ask why any legitimate website would insert that into a URL - and whether any traffic that comes from such a page would be real or of any value.

If receiving a bunch of those bad links is in any way tripping a ranking problem, I'd be astounded.


 4:53 am on Mar 14, 2011 (gmt 0)

True enough, but why not redirect to ensure the 'off chance visitor(s)' actually finds what's probably a welcome relief (content) on your site?

According to Matt Cutts (cited in the Links thread I linked previously) 'bad' inbound links generally just don't count, so if they don't count either way for ranking purposes, personally, I'm going to try to make sure any visitors from those links can actually find some content on a well-built site, because they might remember it ... I think of it as a customer service 'thing' and if 'the number is always disconnected or no longer in service' not only do they have a bad experience with the 'phone book' (the site they found the link on) they have a bad experience with yours, because you didn't bother to 'forward' the 'number' you knew they were trying to 'call'.


 4:57 am on Mar 14, 2011 (gmt 0)

If receiving a bunch of those bad links is in any way tripping a ranking problem, I'd be astounded.

I thought Google had made some comments in the past about excessive 404s (I believe around the time Caffeine was launched). I may be mistaken, but something that was said about a year ago has caused me to pay extra attention to the number of 404s that Gbot encounters, particularly when I see such a recent surge and then my rankings get spanked. It could all be pure coincidence, and I tend to be a perfectionist by nature :)


 5:33 am on Mar 14, 2011 (gmt 0)

If the excessive 404s are coming from internal links, then at some point that causes trouble. Google definitely did confirm that. And it makes sense - they'd be sending visitors into a poor experience.

But they also confirmed what I'm saying about external sources of 404s - I believe it was John Mueller on their Webmaster Forum. How could you be responsible for that? And if you try to get a "perfect report card" in WMT, it will definitely drive you bonkers.


 5:36 am on Mar 14, 2011 (gmt 0)

That's right, 404s on internal urls. Thank you for jogging my memory. I guess I will work on other issues with the thin content and move forward.

Thanks again for the help. Hopefully the almighty GoogleBot will like the changes I have made the past two days :)


 8:10 am on Mar 14, 2011 (gmt 0)

SearchEngineLand have done a good write-up of the information coming out of SMX West about the panda update:



Nothing majorly new, but it confirms a few points of speculation.


 9:03 am on Mar 14, 2011 (gmt 0)

Hi all. I am one of those "passive readers" who has been coming back to WebmasterWorld for years. Thanks for your helpful tips over the years.
The Panda update, though, is motivation enough to share some of my results.
We are now in the 3rd week of Panda and traffic is still on the decline. Our site is a tech news site with over a 10 year history.
In my latest tests it is clear that all content is ranked down. And not just a little bit. In most cases we got pushed to page 2. Pages that rank #1 on Google.com Europe are ranking on page 2 in Panda. Sites that post excerpts of our stories rank higher than our original content.
In the first days of Panda it looked as if not all content was affected. Now it has settled on all of it.

Things we did so far:
delete short aka thin content (breaking news posts)
delete tag pages
remove ad units
clean up outbound links

For me this is no longer one of the usual Google updates. I have somehow fallen into a really dark hole without doing any black hat or violating any other Google guidelines.

Have you guys seen any positive effect yet from changes like these?


 1:53 pm on Mar 14, 2011 (gmt 0)

@reblaus, I am seeing exactly the same thing as you on one of the sites, and nothing has helped so far.

I noticed a fresh cache for the home page of this site after nearly 10 days. The last cache date was March 3 and this new cache is dated March 12.

Here is another interesting fact for users of WordPress.

WordPress released a new version on Feb 23, and the Simple Tags plugin then caused a mess with category and tag pages. I wouldn't fault the plugin developer, as everything was working fine until this new version. Unfortunately this new WordPress release broke that plugin, which in turn broke the site's navigation. All category and tag pages displayed the content of the home page. We were using "noindex,follow" on these pages.

We noticed this only around March 3 and fixed it. The developer had made a quick release of the updated plugin by then. But I am not sure whether the home page was crawled before or after we fixed it on March 3. The re-crawl happened only on March 12.

This explains the problem that I reported earlier. I was seeing 404 errors for URLs like

domain.com/category/wdget/page/4, etc.

in GWT, but those pages never existed for those category pages. Since category pages were redirected to home, somehow Googlebot crawled sub pages of home as sub pages of categories and reported those errors.

I have a feeling that these not only created some 404 errors but might also have triggered a duplicate content penalty.

This entire mess happened because we updated WordPress to the new version as soon as it was released. It was bad luck for us.

Now we do not know whether the Panda update or the duplicate content penalty induced by this WordPress/plugin problem has caused these issues.


 2:48 pm on Mar 14, 2011 (gmt 0)

I see a fresh set of 404s in GWT today, and these were all discovered on Feb 28 and March 1. These are due to the issue that I described above. We had fixed it by March 3, but GWT is now reporting these errors for the past days.

The number of 404s seems to be pretty large now, though GWT is reporting only 20-25 of them. Could this have a bad impact? How long will it take for G to wipe out this negative effect?

We still aren't sure whether it is the Panda update or these 404 errors that caused the site to see a drop in rankings for several of its pages.


 3:58 pm on Mar 14, 2011 (gmt 0)

you probably want google to see as noindex and then block or 404 it.

Thanks walkman for the confirmation ~ I will spend much of the day evaluating each and every page ... don't see any other recourse or path at this point.



 5:12 pm on Mar 14, 2011 (gmt 0)

reblaus and Indyank

Regarding when you'll see ranking improvements due to your changes/tweaks (I'm waiting just as you guys are), I wanted to suggest that you look in your Google Webmaster Tools console and check the crawl data. There are some graphs that show the spidering volume and crawl frequency. I see daily spidering, but I get a different spider with a deep crawl about every 45 days. In the past, when I have seen penalties, they were both initiated and removed as a result of this deeper crawl/spider. And just last week, I got a deep crawl (first one in a month), on March 10, then I was penalized by March 11. I have continued to see minor spidering since, but obviously no positive changes in rankings.

So my guess is, I will have to wait until that other spider comes around to re-rank me. So glance at your crawl charts (or your raw data) and look for spikes (in the raw data, I believe it's Googlebot/2.0). Then, you can estimate your deep-crawl frequency and how long it might take to see it come back around.

I'm sure spidering frequency is related to numerous factors (authority/trust, content updates, etc), so some sites get crawled more regularly by the bot(s) that really matter. I am not so fortunate. I sometimes have to wait 45 to 60 days to see that spike. Just don't beat your head against a wall every day by hitting refresh on the results, because they aren't likely to change (unless it's your time to get re-ranked).
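The "look for spikes and estimate the interval" idea above can be sketched in a few lines of Python, assuming you have already tallied Googlebot requests per day from your raw logs. The counts below are made up, the function names are hypothetical, and the "three times the median day" threshold is an arbitrary assumption. Whether such spikes have anything to do with re-ranking is this poster's hypothesis only.

```python
from datetime import date
from statistics import median


def spike_days(daily_hits, factor=3.0):
    """Return the days whose Googlebot hit count exceeds factor x the median day."""
    typical = median(daily_hits.values())
    return sorted(day for day, hits in daily_hits.items() if hits > factor * typical)


def gaps_between(days):
    """Days elapsed between consecutive spike days."""
    return [(later - earlier).days for earlier, later in zip(days, days[1:])]


# Made-up daily Googlebot request counts, for illustration only
hits = {
    date(2011, 1, 24): 950,
    date(2011, 1, 25): 120,
    date(2011, 2, 20): 110,
    date(2011, 3, 10): 1020,
    date(2011, 3, 11): 130,
}
spikes = spike_days(hits)
print(spikes, gaps_between(spikes))
```

With real data you would feed in every day in the log period, so that the median reflects a normal crawl day.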



 7:11 pm on Mar 14, 2011 (gmt 0)


Yes, I saw a deep crawl around Mar 10. So you're saying I need to wait for another crawler that looks at quality, and then see if Google updates my quality score?


 7:32 pm on Mar 14, 2011 (gmt 0)

That wouldn't make sense to me. Crawlers are not algorithms, they only collect the server's response.


 10:49 am on Apr 1, 2011 (gmt 0)

Many sites will benefit from this, and I think my site will also get a good ranking after some time.
