Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
More than 60'000 404 error pages
mirkko22



 
Msg#: 4428841 posted 12:01 am on Mar 14, 2012 (gmt 0)

Hello,

I have a Joomla website in two languages with more than 12'000 pages. The template used for this site has two "Font Size" buttons that let visitors increase or decrease the font size of the site.

I launched my site 5 months ago, and Google has indexed everything well, so I'm happy. However, I have now disabled the Font Size buttons because I read in many places that this kind of button can result in duplicate content. This is a serious issue: instead of being indexed once, each page is indexed three times, which can count as duplicate content and lead to a bad Google ranking.

That means I have pages:

- with normal font size
- with larger font size, whose URL contains: ?fontstyle=f-larger
- with smaller font size, whose URL contains: ?fontstyle=f-smaller

In Google Webmaster Tools I see crawl problems that add up to more than 60'000 404 errors. Most of those problems are related to this "fontstyle" parameter, which is no longer present on the site. Many weeks ago I set up "URL parameters" in Google Webmaster Tools for the "fontstyle" parameter, to try to remove from the Google index all pages/URLs that contain "fontstyle". Unfortunately this does not seem to work: instead of the already-indexed URLs with this parameter being removed, their number continues to increase.

I have the same problem with some other parameters, such as criteria, limit and task.

I would like to know whether setting up robots.txt to stop Google from crawling all pages that contain the above parameters could be a solution.

I don't know what I can do to remove all these 404 error pages from the Google index and from Google Webmaster Tools. I have read many things about this, and many contradictory solutions, so I'm totally lost.

Does anybody have any suggestions, please?

Thanks for your time.

Cheers

 

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributors Of The Month



 
Msg#: 4428841 posted 12:13 am on Mar 14, 2012 (gmt 0)

<< Whoops, wrong thread >>

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4428841 posted 1:25 am on Mar 14, 2012 (gmt 0)

If you block access via robots.txt you will just have tens of thousands of errors in that part of the report instead.

The alternative URLs should have been served with the rel="canonical" attribute in their <head> section, pointing to the standard URL for each page.
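
As a rough illustration of the canonical approach, here is a minimal Python sketch that strips the "fontstyle" parameter from a requested URL and builds the matching link element for the <head>. The function names and the example.com URLs are hypothetical, and treating "fontstyle" as the only display-only parameter is an assumption; on a real Joomla site this logic would live in the template or a plugin, not in a standalone script.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only "fontstyle" is purely cosmetic. Add other names here
# only after checking that they never change the page content.
DISPLAY_ONLY_PARAMS = {"fontstyle"}


def canonical_url(url: str) -> str:
    """Return the URL with display-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in DISPLAY_ONLY_PARAMS
    ]
    # Drop the fragment as well; it plays no part in the canonical URL.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    return '<link rel="canonical" href="{}">'.format(
        escape(canonical_url(url), quote=True)
    )


# Hypothetical request URL for one of the "larger font" variants.
print(canonical_link_tag("http://www.example.com/en/page.html?fontstyle=f-larger"))
```

Every font-size variant of a page would then point at the same standard URL, which is the signal described above.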

Another option is to alter the PHP script that generates the page so that it no longer shows the font sizer button/link when it is Google requesting the normal page.
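
The idea can be sketched as follows, in Python rather than PHP. This is only an illustration: the function names and the button markup are hypothetical placeholders, not Joomla template code, and detecting Google by looking for "Googlebot" in the User-Agent header is an assumption about how the check would be done.

```python
def show_font_sizer(user_agent: str) -> bool:
    """Return False when the request appears to come from Googlebot."""
    # Assumption: a simple substring test on the User-Agent header.
    return "googlebot" not in (user_agent or "").lower()


def render_font_sizer(user_agent: str) -> str:
    """Return the font sizer links, or nothing for Googlebot."""
    if not show_font_sizer(user_agent):
        return ""
    # Placeholder markup; the real template output will differ.
    return (
        '<a href="?fontstyle=f-larger">A+</a> '
        '<a href="?fontstyle=f-smaller">A-</a>'
    )


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(repr(render_font_sizer(googlebot_ua)))
print(repr(render_font_sizer("Mozilla/5.0 (Windows NT 6.1) Firefox/10.0")))
```

With the links left out of the page Google fetches, the crawler has no new "fontstyle" URLs to discover from the normal pages.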

If there are no links pointing to the alternative URLs and those alternative pages no longer exist, the errors will clear themselves from the list within a few months.

mirkko22



 
Msg#: 4428841 posted 9:09 am on Mar 14, 2012 (gmt 0)

Thanks for your reply.

Yes, I have already read about rel="canonical", but with a CMS like Joomla it is not easy to implement.

From what I understand on this page:

[support.google.com...]

...setting a canonical means telling Google which URLs I prefer. But instead of doing that, I would rather tell Google which URLs I do NOT want crawled, because this seems more logical and easier to implement.

So I followed the Google Webmaster Tools tutorial explained here:

[support.google.com...]

I set up a URL parameter for "fontstyle", telling Google that this parameter changes the way the content is displayed, with the effect set to "Other". Then I selected "No URLs". But after many weeks/months I don't see any result, and my 404 errors are increasing instead of decreasing. :-(

I analyzed many of the 404 error pages, and 90% of them occur because the "fontstyle" parameter is present in the URL.
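
For anyone wanting to repeat this kind of analysis, a small Python sketch along these lines can tally which query parameters show up in a list of 404 URLs. The sample URLs are made up for illustration; in practice the list would come from whatever export of the crawl errors is available, and the function name is hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_query_params(urls):
    """Count how many URLs contain each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # Use a set so a parameter repeated in one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample of URLs reported as 404.
sample_404s = [
    "http://www.example.com/en/page-a.html?fontstyle=f-larger",
    "http://www.example.com/en/page-b.html?fontstyle=f-smaller",
    "http://www.example.com/en/list.html?limit=20&task=view",
    "http://www.example.com/en/old-page.html",
]

for name, hits in count_query_params(sample_404s).most_common():
    print(f"{name}: {hits} of {len(sample_404s)} URLs")
```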

Can you tell me why Google continues to crawl those URLs?

One other question, please:

To solve this "fontstyle" problem, I am wondering whether publishing my Font Size buttons again and applying rel=nofollow OR rel=noindex OR both to those buttons (I'm unsure whether both are possible) could do the job.

Many thanks

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved