
Crawl Errors Jump after Shopping Cart Change

     
10:05 pm on Jun 20, 2011 (gmt 0)

New User

5+ Year Member

joined:Sept 22, 2010
posts: 17
votes: 0


I am running an ecommerce site that was doing well on Google. We switched our shopping cart to Magento Enterprise, and after a couple of weeks Google Webmaster Tools reported 15,000 crawl errors. At the same time our rankings dropped by 20-30 positions. After a couple of months the error count has grown to 32,623 not-found pages. Does anyone know how to deal with such a huge number of errors? Do crawl errors correlate with a drop in rankings? Some of the errors need 301 redirects, but the majority come from query pages that the system generates based on product attributes. I recently blocked all query pages in robots.txt, but I am not sure that is the best solution.

Any advice on this topic would be greatly appreciated.
11:51 pm on June 20, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member netmeg is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2005
posts:12678
votes: 144


I took a client site to Magento Enterprise last year. I put in a bunch of redirects for the old URLs (several thousand at least), and I also blocked off tons of stuff that didn't need to be in the search engines, like sort and page parameters and whatnot. And of course, I blocked the search results pages.

Have you implemented the canonical URL plugin? It was written for the Community version, but it works on Enterprise.

I guess I would take a look at what those errors actually are, and find a way to make sure that the store isn't generating them anymore. If you block with robots.txt, that's not going to cause them to fall out of the index; it'll just keep Google from going in to see them at all (and presumably you won't get any more errors reported).

If you want them out of the index, you're going to have to remove them manually one by one (ouch) or figure out a way to serve up a 404 or 410 result code on them.

I'm a little curious about what these pages really are, though; I don't think we have anything similar.
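
For what it's worth, a minimal robots.txt sketch of the kind of blocking I mean - the search path assumes the default Magento /catalogsearch/ location, and the parameter names are just placeholders, not your actual query pages:

User-agent: *
# keep crawlers out of on-site search results
Disallow: /catalogsearch/
# keep crawlers out of sort/paging/attribute query variations
Disallow: /*?dir=
Disallow: /*?order=
Disallow: /*?limit=

As noted above, this only stops crawling; it won't remove URLs that are already indexed.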
3:19 am on June 21, 2011 (gmt 0)

New User

5+ Year Member

joined:Sept 22, 2010
posts: 17
votes: 0


Yikes... I don't like the idea of doing 32k redirects. Do you know of any way to do a bulk upload of redirects in Magento?
3:25 am on June 21, 2011 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member netmeg is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Mar 30, 2005
posts:12678
votes: 144


The way we did it was to bypass Magento altogether and put them in .htaccess - the previous store was in a subdirectory, so we set up a separate Apache configuration for that directory and put an .htaccess in it *just* for serving redirects. Seemed to work okay.

If your pages aren't serving any particular purpose, or aren't helpful to your users, there's not much point redirecting them. Just get 'em out of there.
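
As a rough sketch (the paths here are made-up examples, not real URLs), a redirect-only .htaccess like that can just be a flat list of Redirect directives:

# 301 each old cart URL to its closest new equivalent
Redirect 301 /oldstore/product-123.html /new-category/new-product.html
Redirect 301 /oldstore/category-45/ /new-category/

For tens of thousands of mappings, a RewriteMap in the main server config is kinder to the server than a giant .htaccess, since .htaccess gets re-read and scanned on every request.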
7:51 am on June 21, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


I'm just working on a site (not Magento, but a similar problem) that has exposed 100,000 URLs that should never have been indexed.

Since the URLs all had a common pattern, the solution was to add a rule to .htaccess to send "410 Gone" for all of those.

You could also set up redirects for URLs that actually have traffic, but only redirect to a new page if the content on the new page closely matches the content of the old page. In particular, do not "funnel" hundreds of URLs to one destination, and especially do not redirect users to the root home page. If there's no close match, serve "410 Gone" instead, and ensure the ErrorDocument has clickable links to the home page and to the major category pages.
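
To sketch the "410 Gone" approach (the path pattern and error page name here are placeholders, not the real site's URLs):

# send "410 Gone" for everything under the retired URL pattern
RedirectMatch 410 ^/old-filter/
# serve a friendly page with links to the home page and the main category pages
ErrorDocument 410 /410-gone.html

The same thing can be done with a mod_rewrite rule and the [G] flag if the pattern needs to test the query string rather than the path.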
10:22 pm on June 21, 2011 (gmt 0)

New User

5+ Year Member

joined:Sept 22, 2010
posts: 17
votes: 0


Correct me if I am wrong, but doesn't adding a large number of rules to your .htaccess slow down your server? This is great advice, and I really appreciate it!
10:28 pm on June 21, 2011 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14624
votes: 88


Since the URLs all had a common pattern, the solution was to add a rule to .htaccess to send "410 Gone" for all of those.


I had a similar situation and put NOINDEX on those pages so Googlebot would eventually stop bothering with them and WMT wouldn't be cluttered with the junk. It took a couple of months to resolve, but Googlebot got there and it's all good now.
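
To be clear, the noindex signal is just a robots meta tag in the head of each of those pages (a generic snippet, not tied to any particular cart):

<meta name="robots" content="noindex">

If editing the templates is awkward, the equivalent X-Robots-Tag response header from .htaccess does the same job (needs mod_headers, and applies to whatever the .htaccess covers unless you scope it with FilesMatch):

Header set X-Robots-Tag "noindex"

Either way the pages have to stay crawlable - don't also block them in robots.txt, or Googlebot never sees the directive.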
11:22 pm on June 21, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Correct me if I am wrong, but doesn't adding a large number of rules to your .htaccess slow down your server? This is great advice, and I really appreciate it!

Since the URLs all had a common pattern, the solution was to add one single rule to .htaccess to send "410 Gone" for all of those. One pattern-matching rule covers the whole lot, so there's no long list of rules to slow the server down.
5:39 pm on June 23, 2011 (gmt 0)

New User

5+ Year Member

joined:Sept 22, 2010
posts: 17
votes: 0


Can you send me an example of the .htaccess rule I could use?
7:21 pm on June 23, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


It depends on the URLs themselves. For a URL pattern like /index.php?pid=123, the pair of rules would be along these lines:

# match a "pid" parameter anywhere in the query string...
RewriteCond %{QUERY_STRING} (^|&)pid=([^&]+)(&|$)
# ...and return "410 Gone" when the requested path is empty or index.php
RewriteRule ^(index\.php)?$ - [G]