
Forum Moderators: goodroi


More than 60,000 404 error pages

     

mirkko22

12:01 am on Mar 14, 2012 (gmt 0)



Hello,

I have a Joomla website in two languages, made up of more than 12,000 pages. The template used for this site has two "font size" buttons that let visitors increase or decrease the font size of the site.

I launched my site 5 months ago; Google has indexed everything well and I'm happy. However, I have now disabled these font size buttons, because I read in many places that this kind of button can result in duplicate content. This is a serious issue: instead of being indexed once, the site is indexed three times, which can count as duplicate content and lead to a bad Google ranking.

That means I have pages:

- with normal font size
- with larger font size, whose URL contains: ?fontstyle=f-larger
- with smaller font size, whose URL contains: ?fontstyle=f-smaller

In Google Webmaster Tools I see crawl problems amounting to more than 60,000 404 error pages. Most of these are related to the "fontstyle" parameter, which is no longer present on the site. Many weeks ago I set up "URL parameters" in Google Webmaster Tools for "fontstyle", to try to remove from the Google index all pages/URLs that contain "fontstyle". Unfortunately this does not seem to work: instead of the already indexed URLs with this parameter being removed, their number continues to increase.

I have the same problem with some other parameters, such as criteria, limit and task.

I would like to know whether setting up robots.txt to stop Google from crawling all pages that contain the above parameters could be a solution.

I don't know what I can do to remove all these 404 error pages from the Google index and from Google Webmaster Tools... I have read many things about this, including many contradictory solutions, so I'm totally lost...

Does anybody have any suggestions, please?

Thanks for your time.

Cheers

lucy24

12:13 am on Mar 14, 2012 (gmt 0)




<< Whoops, wrong thread >>

g1smd

1:25 am on Mar 14, 2012 (gmt 0)




If you block access via robots.txt you will just have tens of thousands of errors in that part of the report instead.

The alternative URLs should have been served with a <link rel="canonical"> element in their <head> section, pointing to the standard URL for each page.
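
As a rough illustration of the canonical idea, here is a minimal sketch in Python (not Joomla's PHP, and not the site's actual code) that strips the presentation-only "fontstyle" parameter from a URL and builds the matching tag for the <head>. The example.com URLs, the function names and the PRESENTATION_PARAMS set are all assumptions made for the example.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters that only change presentation, not content.
# "fontstyle" comes from this thread; the set itself is an assumption.
PRESENTATION_PARAMS = {"fontstyle"}


def canonical_url(url):
    """Return url with presentation-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in PRESENTATION_PARAMS
    ]
    # Drop the fragment too; it never reaches the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link> element to place inside the page's <head>."""
    return '<link rel="canonical" href="%s" />' % escape(canonical_url(url), quote=True)


# Hypothetical URL of a "larger font" variant of a page
print(canonical_link_tag("http://www.example.com/en/news?fontstyle=f-larger"))
```

Every variant of a page (normal, f-larger, f-smaller) would emit the same tag, so all three point at one standard URL.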

The alternative is to alter the PHP script that generates the page so that it no longer shows the font sizer button/link when it is Google requesting the normal page.
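
A minimal sketch of this approach, in Python rather than the site's PHP template code: the font size links are left out when the User-Agent header contains "Googlebot". The function names, link labels and example.com URL are hypothetical, and a User-Agent check is only a rough test since anyone can send that header.

```python
from html import escape


def is_googlebot(user_agent):
    """Rough check: Googlebot includes 'Googlebot' in its User-Agent header."""
    return "googlebot" in (user_agent or "").lower()


def font_sizer_html(page_url, user_agent):
    """Return the font size links, or an empty string when Googlebot asks."""
    if is_googlebot(user_agent):
        return ""
    # Append to an existing query string if the page URL already has one.
    sep = "&" if "?" in page_url else "?"
    links = []
    for style, label in (("f-larger", "A+"), ("f-smaller", "A-")):
        href = escape(page_url + sep + "fontstyle=" + style, quote=True)
        links.append('<a href="%s">%s</a>' % (href, label))
    return " ".join(links)


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (Windows NT 6.1)"  # hypothetical visitor

print(repr(font_sizer_html("http://www.example.com/en/news", googlebot_ua)))
print(font_sizer_html("http://www.example.com/en/news", browser_ua))
```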

If there are no links pointing to the alternative URLs and those alternative pages no longer exist, the errors will clear themselves from the list within a few months.

mirkko22

9:09 am on Mar 14, 2012 (gmt 0)



Thanks for your reply...

Yes, I have already read about rel="canonical", but with a CMS like Joomla it is not easy to implement.

From what I understand from this page:

[support.google.com...]

...setting a canonical consists of telling Google which URLs are my preferred ones. But instead of doing that, I would prefer to tell Google which URLs I do NOT want crawled, because this seems more logical and easier to implement.

So I followed the Google Webmaster Tools tutorial explained here:

[support.google.com...]

I added a URL parameter entry for "fontstyle", declaring that this parameter changes how the content is displayed and choosing "Other". Then I selected "No URLs". But after many weeks/months I don't see any result, and my 404 errors keep increasing instead of decreasing.. :-(

I analyzed many of the 404 error pages, and 90% of them occur because the "fontstyle" parameter is present in the URL...
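
For anyone wanting to repeat this kind of tally on their own list of 404 URLs, here is a minimal Python sketch that counts how many URLs carry each query parameter. The sample URLs are made up for illustration, and the function name is my own; in practice the list would come from wherever you export your crawl errors.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_params(urls):
    """Count how many URLs carry each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # A set, so a parameter repeated in one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample of URLs reported as 404
sample = [
    "http://www.example.com/en/news?fontstyle=f-larger",
    "http://www.example.com/fr/news?fontstyle=f-smaller",
    "http://www.example.com/en/list?limit=20&task=view",
    "http://www.example.com/en/old-page",
]

for name, n in count_params(sample).most_common():
    print("%s: %d of %d URLs" % (name, n, len(sample)))
```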

Can you tell me why Google continues to crawl those URLs?

One more question, please:

To solve this "fontstyle" problem, I am wondering whether publishing my font size buttons again and applying rel=nofollow OR rel=noindex OR both to them (if possible, but I'm unsure whether both can be combined) would do the job.

Many thanks
 
