
Google SEO News and Discussion Forum

"Not Selected" URLs In GWT Too Large -> How Do I Fix Them?
TheBigK




msg:4522344
 11:24 am on Nov 25, 2012 (gmt 0)

On September 4-5, our website (blog + forums) traffic went down by about 50%. Since then I've been investigating the cause, and over the last two months I've reverted most of the things that could have gone wrong.

Recently I discovered that in GWT, the 'Index Status' shows that we've a HUGE number of 'Not Selected' URLs. I've been told that this could be one of the main reasons our traffic went down. I'm attaching a screenshot of how it looks in GWT-

[i.imgur.com...]

A few more points that might be useful:

1. Around September 4-5, the number of 404 errors suddenly shot up and continued to grow until mid-September. This was caused by a JavaScript error in the DISQUS plugin. I've already fixed this problem by setting up proper 301 redirects for all the error URLs that were created.

2. I had a tag auto-link plugin that would find tag keywords on our WordPress blog and link them to relevant tag pages. I disabled it 3 days ago.

3. Our website URL is example.com and the forums reside on example.com/community/. Our community traffic seems to be affected the most.

I'd really appreciate your inputs on fixing the 'Not Selected' URLs.

[edited by: goodroi at 11:32 am (utc) on Nov 25, 2012]
[edit reason] Please no URLs [/edit]

 

lucy24




msg:4522346
 12:37 pm on Nov 25, 2012 (gmt 0)

You left out a pair of key facts.

How many physically distinct pages do you actually have?
AND
How many ways can those pages be reached?

"Not selected" doesn't necessarily mean that g### didn't like the page. It also includes every page you've ever redirected, every non-significant parameter, you name it.

I don't see a drop in Pages Indexed, which should be the important one. I see a surge in Not Selected. So unless your site really got a lot fatter around the end of September, I'd look for duplicates. Forums are a good place to start, because a single page can easily rack up 10, 20, 30 or more URLs.
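
To make that concrete, here is a minimal Python sketch of the idea; the forum URLs and parameter names below are made-up assumptions, not from this thread, so substitute whatever your forum software actually uses. It counts how many crawled URL variants collapse onto a single page once non-significant query parameters are stripped.

from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse
from collections import Counter

# Assumption: only the thread id and page number are significant;
# "sid", "sort" and the like are session/display noise.
SIGNIFICANT_PARAMS = {"t", "page"}

def canonicalize(url):
    parts = urlparse(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in SIGNIFICANT_PARAMS)
    return urlunparse(parts._replace(query=urlencode(kept), fragment=""))

crawled = [
    "http://example.com/community/viewtopic.php?t=123",
    "http://example.com/community/viewtopic.php?t=123&sid=abc123",
    "http://example.com/community/viewtopic.php?t=123&sort=date&sid=def456",
]

variants_per_page = Counter(canonicalize(u) for u in crawled)
for page, count in variants_per_page.items():
    print(count, "crawled variants ->", page)   # 3 variants -> 1 real page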

TheBigK




msg:4522348
 12:46 pm on Nov 25, 2012 (gmt 0)

My apologies. I should have mentioned that. The site has over 50k discussions, and I'd expect it to have about 90-110k individual pages (pagination considered). The blog has about 4k posts. Considering how 'good' I am at estimates, I don't think there are more than 150k pages. I've blocked several 'thin content' pages (like member profiles, which would have added about 150k pages).

But the "not selected" pages are well over 5 million - which is certainly out of place.

I've removed all the 'duplicate' content as reported in GWT and don't really think there is any 'significant' duplicate content on the site. At least not to the tune of what's being reported by GWT.

I've been told that the exorbitant "not selected" count could be an indication to Google that the website has a lot of duplicate content. But that isn't the case. I suspect some technical error on the site, which I'm not able to find.

Awarn




msg:4522379
 4:44 pm on Nov 25, 2012 (gmt 0)

Your graph has a lot of similarities to mine. I have that same blue spike in the indexed pages at exactly the same time. When your non-indexed pages climbed in late April, I instead had a spike in URLs blocked by robots.txt. However, my robots.txt hadn't been changed since 2009 and there were no prior issues. I have since stripped the robots.txt and the blocked count gradually came down, but the not selected count climbed proportionately. Just a question here: did you increase security on the server or anything? I have not figured out what triggered it.

TheBigK




msg:4522380
 4:49 pm on Nov 25, 2012 (gmt 0)

@Awarn: Thank you for your response. I haven't touched my robots.txt in a long time either. I didn't do anything on the server, but I had implemented a few things on our forums (/community/). Google reported that we had an exceptionally large number of URLs on September 9, but those were all pagination related. Google seems to have dropped those URLs.

I'm totally worried about the 'not selected' URLs. I don't see what's going wrong on the site; technically the website looks all right. I just want to know what could be triggering those URLs.

[edited by: incrediBILL at 5:21 am (utc) on Nov 26, 2012]
[edit reason] removed specifics, no specifics please [/edit]

TheBigK




msg:4522988
 6:30 pm on Nov 27, 2012 (gmt 0)

Update:

I'm aware of all the sections and pages that are generated on my website and believe that I don't have pages that are 'duplicates' or 'substantially similar' to other pages.

Is there any way of identifying such pages through Google's eyes?

I mean, if I don't know what to fix, how do I go about fixing it?

TheMadScientist




msg:4522991
 6:47 pm on Nov 27, 2012 (gmt 0)

1. Around September 4-5, the number of 404 errors suddenly shot up and continued to grow until mid-September. This was caused by a JavaScript error in the DISQUS plugin. I've already fixed this problem by setting up proper 301 redirects for all the error URLs that were created.

When was this initially installed? All of those URLs would be 'not selected', because they are 'not selected' to be included in the SERPs, but Google would have found them.

2. I had a tag auto-link plugin that would find tag keywords on our WordPress blog and link them to relevant tag pages. I disabled it 3 days ago.

Why?

I'm totally worried about the 'not selected' URLs.

Why?

You haven't replaced indexed (selected) pages with not selected pages; there are just more pages Google has found and is not including in the results for some reason.

Once they have a URL they don't 'let it go'. So if they have found 6 million unique URLs on your site, even if they came from external links pointing to your site (e.g. if some 'friendly neighbor' webmaster put up a bunch of 404 links to your site thinking it would hurt you, or an external site had a malfunctioning script linking to yours, or for whatever other reason), those URLs aren't going anywhere now that Google has them in the system.

There's not much chance of getting the not-selected number down, since you say you only have at most a few hundred thousand pages. The only way I know of to make URLs Google has found but 'not selected' actually go lower is to get them indexed, but they shouldn't be indexed (selected), so if you try to get them in you'll probably do more harm than good.

The not selected URLs are not your traffic problem, unless it's URLs that were included (selected) previously.

My personal advice is:
Ignore the not selected number. It sounds as if, whatever the URLs are and wherever they came from, they should not be included in the index, so they are correctly not selected; but now that Google has the URLs they're not going to 'get rid of them' any time soon, so they will continue being reported as not selected. (IOW: look at the number, give it a nice 'whatever' and move on.)

My personal guess is the malfunctioning script (huge spike in internal broken (404) links) is what triggered the drop, because it's a bad visitor experience to send someone to a site with a ton of broken links.

The broken links from the malfunctioning script would increase the not selected number, but the not selected number is (was) not the actual issue. It's likely the number of broken links that were on the site (or perceived to be on the site by an algo) due to the malfunctioning script that actually caused the problem ... I'd just wait it out, because it will probably take a while to rebuild the (for lack of a better way of explaining it) 'trust' in the visitor experience you're providing after the little '404 meltdown' you had for a while.

[edited by: TheMadScientist at 7:08 pm (utc) on Nov 27, 2012]

TheBigK




msg:4522999
 7:07 pm on Nov 27, 2012 (gmt 0)

Hello TheMadScientist,

Thank you for your response. Allow me to answer the questions you've asked above.

1. The Disqus plugin has been on the blog for the last several months (maybe since 2010). The problem was common - [seroundtable.com...]

By the way, Google continues to say that 404 errors don't really cause a SERP penalty, and I've been told to look elsewhere. However, it's a fact that the rise in 404s and the traffic drop are in sync.

2. We installed the tag auto-link plugin because we loved the concept, which has been on Engadget for a long time. It helps users find more posts related to any topic. I disabled it because I was told that it could lead to several 'similar pages' on our website.

3. I'm worried about the 'not selected' URLs because I've been told by the experts(?) that the increasing 'Not Selected' trend we have on our site is much the same as that of websites penalized by Google. Frankly speaking, I've fixed all the problems on the site and gone back to the default setup that worked fine. The 'Not Selected' count is the only 'issue' yet to be addressed, if it's an issue at all. I don't know what else I should be fixing to get all the Google love back.

For the last 2 months I've been asking whether it's the 404 errors; Google (including JohnMu) has said that they don't affect rankings.

So at the end of all the discussions, I'm still left with one question: what should I really fix to get our rankings back? Should I start removing discussions contributed by our members which "I think Google will think" are 'thin'? Come on, we have 50k discussions on our boards!

TheMadScientist




msg:4523014
 7:44 pm on Nov 27, 2012 (gmt 0)

By the way, Google continues to say that 404 errors don't really cause a SERP penalty, and I've been told to look elsewhere. However, it's a fact that the rise in 404s and the traffic drop are in sync.

That's true, and welcome to WebmasterWorld ... Here's your first 'gotta read exactly sometimes' note ;)

You're correct, 404 errors are not the issue, because if they were, a competitor linking to your site and creating 404 errors could tank you, and removing thin pages that should not have been on a site in the first place could tank you too. BUT I didn't say 404 errors were the issue; I said it was the broken links on your site. (IOW: broken links TO your site and broken links ON your site are two totally different things.)

IOW: It was not the 404 errors on your site I was talking about, but rather the links that were pointing to 404 error pages. (There's a huge difference).

2. We installed the tag auto-link plugin because we loved the concept, which has been on Engadget for a long time. It helps users find more posts related to any topic. I disabled it because I was told that it could lead to several 'similar pages' on our website.

If it's really good for visitors, I would try to find the fix for the similar pages (like making sure a rel=canonical is on every version of the page(s) and points to the main page if a 301 is not possible). And so what if your not selected page count goes up because the duplicates are not included?
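
As a rough illustration of that canonical check, here's a small Python sketch; it assumes the third-party requests package and uses made-up URLs, so adjust to your own pages. It fetches each duplicate variant and confirms it declares a rel=canonical pointing at the main page.

import requests
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Grabs the href of the first <link rel="canonical"> in a page."""
    def __init__(self):
        super().__init__()
        self.canonical = None
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel", "").lower() == "canonical" and self.canonical is None:
            self.canonical = a.get("href")

MAIN = "http://example.com/tag/widgets/"            # the page you want indexed
VARIANTS = [                                        # auto-generated look-alikes
    "http://example.com/tag/widgets/?sort=newest",
    "http://example.com/tag/widgets/?utm_source=feed",
]

for url in VARIANTS:
    finder = CanonicalFinder()
    finder.feed(requests.get(url, timeout=10).text)
    ok = finder.canonical == MAIN
    print("OK  " if ok else "FIX ", url, "->", finder.canonical)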

TMS Pauses & Wishes Google would explain their sh*t to people better so we didn't have so much *bleeping* confusion about things, because it's absolutely ridiculous to have someone thinking they need to tone down visitor experience to manage a stupid number reported in WMT assuming the management of the number will make Google 'happier' than a better visitor experience would ... There should be rules and required (detailed) explanations for 'new toy additions' to WMT.

3. I'm worried about the 'not selected' URLs because I've been told by the experts(?) that the increasing 'Not Selected' trend we have on our site is much the same as that of websites penalized by Google. Frankly speaking, I've fixed all the problems on the site and gone back to the default setup that worked fine. The 'Not Selected' count is the only 'issue' yet to be addressed, if it's an issue at all. I don't know what else I should be fixing to get all the Google love back.

Again, the detail in the wording makes a huge difference.

A penalized or filtered site will generally show a decrease in the number of indexed (selected) URLs, but we don't see that in the chart you provided; what we see is only an increase in the number of URLs Google has decided not to include in the index.

So, what we can see is:

Google found a bunch of new URLs.
We know they found those URLs to be 404 errors.
We know the 404 errors specifically are not the issue.
We know the URLs should not be included in the index (selected), so them not being included is fine and we don't need to worry about that. (It would actually be worse if they were included, because they would likely generate some 'negative signals' when visitors clicked and then went straight back to the results only to immediately click again, or worse, block you before they immediately click again.)

We know 'user experience' is a ranking factor.
We know broken links are a bad user experience.
We know (unfortunately) the links to the 404 pages Google found were on your site.

We know your traffic started dropping around the time the 404 links (which generated the 404 errors and the increase in not selected pages) started showing up on your site due to a malfunctioning script.

We can reasonably conclude:
The links to the 404 pages decreased the 'user experience score' (for lack of a better phrase) on your site, and your rankings were decreased accordingly.

We know you already fixed the linking issue, which we can reasonably conclude caused the problem.

We also know trust is a ranking factor, and it's reasonable to conclude that having a large number of broken links (not 404 error pages, but the links to the 404 error pages) on a site, which can seriously decrease the quality of a user experience, could have an impact on the 'trust' awarded to a site.

My personal advice is:
Forget about the not selected pages, because managing the number is not likely to help you. Go back to building a great visitor experience, get some more quality inbound links, and let Google work the rankings out.

I would guess that since the 404 links were on your site for a period of time, it will take more time to recover than it would if they were only there for a couple of days. So I'd personally go back to the basics of traffic/ranking building (quality site, quality inbound links, great visitor experience), let things work their way through G's system for two or three months, and then give it another look if things haven't started recovering.

TheBigK




msg:4523017
 8:04 pm on Nov 27, 2012 (gmt 0)

TMS - it feels good when someone does a detailed analysis rather than just posting 'do this' or 'do that'. BIG THANK YOU!

May I mention that the 404 count rose over a period of 2-3 weeks (it took us some time to figure out that it was the DISQUS plugin). What it did was create several versions of a valid URL, for example:

Correct URL: sample.com/my-happy-url/
Disqus Generated : sample.com/my-happy-url/random-number -> 404
sample.com/my-happy-url/random-number -> 404

Each good URL got an average of about 3-6 bad URLs. So I fixed it by issuing a 301 redirect from each bad URL to the correct URL.
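
For anyone wanting to sanity-check that kind of fix, a minimal Python sketch along these lines would confirm each bad URL now answers with a 301 to its parent post. It assumes the requests package, and the numeric URL below is just a stand-in for the 'random-number' example above.

import requests

# Map of bad Disqus-generated URL -> the correct parent URL it should 301 to.
redirect_map = {
    "http://sample.com/my-happy-url/1234567890": "http://sample.com/my-happy-url/",
}

for bad_url, good_url in redirect_map.items():
    resp = requests.get(bad_url, allow_redirects=False, timeout=10)
    location = resp.headers.get("Location")
    ok = resp.status_code == 301 and location == good_url
    print("OK " if ok else "FIX", bad_url, "->", resp.status_code, location)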

Point to note: these URLs were all internal, so Google might have thought we had some pretty bad pages. By the time we figured out the problem, the 404 count had touched 99k. It's down to about 52k now.

Point to note: Google's crawl rate dropped almost in sync with the 404 errors on the site. It hasn't recovered yet. So I had concluded that because Google thinks our site has a large number of 'bad' URLs, it decided to crawl us less and even push us down in the rankings.

Problem with wording: I think I should have mentioned that the URLs are all *internal* URLs that resulted in 404 errors. I fixed them up in one go (through redirects) but Google's still not acknowledging that they're all gone.

Is there any ballpark figure available on how long it will take Google to really start trusting us?

TheMadScientist




msg:4523019
 8:29 pm on Nov 27, 2012 (gmt 0)

TMS - it feels good when someone does a detailed analysis rather than just posting 'do this' or 'do that'. BIG THANK YOU!

NP, Glad I could help.

Is there any ballpark figure available on how long it will take Google to really start trusting us?

None that I know of, but 'trust' (overall) 'cascades' much like PageRank, so inbound links from trusted sites, coupled with the fact that the problem has been fixed and great visitor experience signals, should in my opinion help to rebuild that faster than 'just waiting'.

One of the biggest issues these days with drops in rankings is that there are so many sites to choose from; once you get 'dinged' or 'slip up', it's tougher to make a comeback than it used to be. So I would say patience, quality inbound links from trusted sites, a great visitor experience and no more 'issues' are the keys to recovering in your situation.

Another thing I would always recommend is taking the numbers reported in WMT (for anything) with a grain of salt ... It's notoriously buggy, so I personally always try to stick with 'relative' and 'directional' when possible.

IOW: If your not selected pages are not increasing, to me, that's a way better indicator of what's going on than the overall number, and, along the same lines, if the number of selected pages is not decreasing relative to the number of not selected pages that's what I would work with more than the actual number.

A couple of quick examples are:

If I had 100,000 pages 'selected' and 1,000,000 pages 'not selected' but thought there were only 60,000 pages that should really exist and be indexed.

I would not worry if:
Selected pages dropped to 50,000 and not selected dropped to 500,000 when traffic stayed [relatively] constant.

Selected pages increased to 150,000 and not selected increased to 1,100,000 or even stayed at 1,000,000.

Selected pages stayed at 100,000 and not selected increased to 1,100,000 or even jumped to 1,500,000, but traffic stayed [relatively] constant.

I would worry if:
Selected pages dropped to 90,000, not selected increased to 1,100,000, and traffic began decreasing.
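
Put as code, the rule of thumb behind those examples might look roughly like this (my own framing of it, with made-up numbers): watch the direction of selected pages and traffic together, not the raw not-selected count.

def should_worry(selected_delta, not_selected_delta, traffic_delta):
    """Deltas are simple period-over-period changes (new value minus old)."""
    # Only the combination matters: indexed (selected) pages falling AND
    # traffic falling while not-selected keeps climbing. A rising
    # not-selected count on its own is just noise to shrug at.
    return selected_delta < 0 and traffic_delta < 0 and not_selected_delta > 0

print(should_worry(-10000, 100000, -0.2))   # True  -> dig into what dropped out
print(should_worry(0, 500000, 0.0))         # False -> give it a 'whatever'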

TheBigK




msg:4523021
 8:48 pm on Nov 27, 2012 (gmt 0)

Lysis (who seems to be a Top Contributor at Google Webmaster Forums) has said that thousands of internal 404 errors won't affect rankings.

Every single thing I identify as a 'problem' is shot down by the 'experts'.

There's another webmaster I found on the Internet who said his website had massive internal 404s and it took him 4 months to recover. But according to Google, it's not a problem!

@TMS: My graph pretty much remains the same these days.

TheMadScientist




msg:4523024
 8:58 pm on Nov 27, 2012 (gmt 0)

Top Contributor at Google Webmaster Forums

There are those here too who contribute a bunch of posts (as far as count goes) but don't really have the knowledge level to back up their statements or assessments of situations in many cases. And, again, 404 Errors are different from Links to 404 Error Pages.

Also, many people post with a distinct lack of precision in their wording, so you have to 'insert common sense' on many occasions to 'get the right answer' from what they have to say.

404 Errors on a site are not an issue.
Links to 404 Error Pages likely are.

Unless you get the distinction between Links to 404 Error Pages and having 404 Errors reported in WMT (or your stats) you won't see what I'm saying.

Google says 404 Errors are not an issue (it's true), but if you can find anywhere they (not some poster on some other forum or even here, but an actual Google rep. or somewhere in their help/guidelines) explicitly state "Links to 404 Error Pages are not an issue.", please, link the source.

### Short Version ###

404 Errors = Not a Problem
Links to 404 Error Pages = Bad User Experience = Problem
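
A quick way to audit the "links to 404 error pages" side of that distinction is to scan your own pages for internal links that resolve to 404. Here's a minimal Python sketch of the idea; it assumes the requests package and a made-up start URL, and only checks one page, whereas a real check would crawl the whole site.

import requests
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

START = "http://example.com/"            # page to scan (made-up)
HOST = urlparse(START).netloc

class LinkCollector(HTMLParser):
    """Collects every href found in <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

collector = LinkCollector()
collector.feed(requests.get(START, timeout=10).text)

for href in collector.links:
    url = urljoin(START, href)
    if urlparse(url).netloc != HOST:
        continue                          # only links ON the site TO the site
    status = requests.head(url, allow_redirects=True, timeout=10).status_code
    if status == 404:
        print("broken internal link:", START, "->", url)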

TheMadScientist




msg:4523037
 9:24 pm on Nov 27, 2012 (gmt 0)

...thousands of internal 404 errors won't affect rankings.

That's actually a true statement, because the word links is not in there...

True:
Thousands of internal 404 errors won't affect rankings.

False:
Thousands of internal links to 404 errors won't affect rankings.

An example of 'imprecise' or at least 'confusing' wording is the initial quote above, because the use of 'internal' regarding 404 errors implies you could have an 'external 404 error', which you really can't, because if it's 'external' it's not on your site, so it's not your error.

You could have a link to an external 404 error, but you yourself cannot have an external 404 error. So the use of the word internal is not only unnecessary; the implication that 'having external 404 errors' is possible when it is not makes the statement a bit imprecise, in my opinion, and in any case it's likely confusing to many.

aakk9999




msg:4523086
 2:23 am on Nov 28, 2012 (gmt 0)

Perhaps an example will help:


THIS IS BAD USER EXPERIENCE AND MAY AFFECT RANKING
You have a site
On your page there are link(s) (to your other page or to another site's page)
When you click these link(s), you get 404

THIS WILL NOT AFFECT RANKING
You have a site
Your links on the site are fine (return 200 OK when clicked)
Someone else's site has a link to your site
The link on that other site goes to a URL that does not exist on your site
When the link on that other site is clicked, your server returns 404

TheBigK




msg:4523097
 4:12 am on Nov 28, 2012 (gmt 0)

@TMS: Woah! I never knew I had to be so precise with the wording. After posting examples showing that the links were originating from my own site and pointing to bad locations on my own site, I thought things were crystal clear.

I'll proceed on the assumption that since I've fixed all the 404 errors, which originated from several thousand bad links on my own site pointing to bad locations on my own site, it's just a matter of time before Google acknowledges they're gone and we have a good user experience. That also tells me I might have to wait several weeks before I see it happening.

I'd also take the opportunity to ask whether there are any ways of speeding up Google's crawl rate (apart from WMT settings) to ensure that Google 'realises' all those errors are gone.

PS: @TMS - you're a Godsend.

@akk9999: Thanks a lot for confirming!

TheMadScientist




msg:4523099
 4:52 am on Nov 28, 2012 (gmt 0)

First, glad we could help you out.

Second, I got what you were saying about where the links were and where they were pointing, but it sounded like you didn't understand part of what I was saying. I also really (mostly) wanted to make sure you (and future readers - tons of people read here, even if they don't post or post often) understood why it might seem like there are differing positions on (or statements about) things in different places that seem to contradict each other at times, but may actually be correct and non-contradictory, or just easily misunderstood.

The wording of things is really an important aspect of understanding SEO and search engines, exactly because of the subtle wording difference changing the statement I edited above about 404 errors from true to false ... It's also one of the most difficult things for most people to get, because they aren't used to the level of 'exactness' necessary to describe a situation or how to fix something and a couple of words can make a huge difference in what someone says or how a situation is assessed.

As far as speeding up crawl rate goes, I'd try and get links, deep ones from quality sources...

* I actually started paying close attention to the wording of things way back when GoogleGuy used to post here, because he called what we lightly refer to as 'the algo' a 'heuristic', and as soon as he did, a bunch of bells and whistles went off in my head. There's a difference between the two: we loosely call what Google, Bing, etc. use an 'algo', but it's technically a heuristic, and there's a difference in the type of results (answers) each generates.

TheMadScientist




msg:4523100
 5:06 am on Nov 28, 2012 (gmt 0)

I'll proceed on the assumption that since I've fixed all the 404 errors, which originated from several thousand bad links on my own site pointing to bad locations on my own site, it's just a matter of time before Google acknowledges they're gone and we have a good user experience. That also tells me I might have to wait several weeks before I see it happening.

And, yeah, that's what I would do too ... If it's still a non-recovering issue after some time has passed, then I would definitely revisit the situation, but it seems to me you have the issue removed (corrected) and, in my opinion, it's just going to take time to 'get through the system'.

TheBigK




msg:4523105
 5:27 am on Nov 28, 2012 (gmt 0)

For the past 1.5-2 months, my routine has been to get up in the morning, log in to GWT and mark the top 1000 reported errors as 'Fixed', then watch the total error count reduce by 1000, 900 or sometimes 1100. I began at 99k errors and I'm now at 50k. Google gave me two surprises in the last few days: on the 26th it dropped those errors by ~4k, and by ~3k today. On the 27th, it followed the regular 1k drop, like all other days. The graph says it all -

[i.imgur.com...]

I'm confident that all the errors have been fixed because I downloaded the error data through the API (which allowed me to view all the errors in one go).

I think once the errors are cleared off, Google might take a week to restore our rankings. I'd still look for ways to improve the crawl rate.
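
If it helps anyone doing the same clean-up: the thread doesn't show the export format, but assuming you have the reported error URLs in a plain text file (one per line) and the requests package installed, a short Python sketch can re-check what each URL returns today, so you can see how many of the reported errors are already redirecting.

import requests
from collections import Counter

statuses = Counter()
with open("crawl_errors.txt") as fh:          # one reported error URL per line
    for line in fh:
        url = line.strip()
        if not url:
            continue
        resp = requests.get(url, allow_redirects=False, timeout=10)
        statuses[resp.status_code] += 1

# e.g. Counter({301: 4800, 404: 150}) would mean nearly all reported errors
# are already redirecting and just waiting to be recrawled.
print(statuses)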

TheBigK




msg:4524199
 7:47 pm on Dec 1, 2012 (gmt 0)

Quick Update:

I've been researching this topic of 'internal bad links' causing traffic drops and have discovered some interesting stuff.

There are webmasters who've cited their experiences confirming that a large number of bad URLs pointing to non-existent locations on the same domain has resulted in a drop in rankings. Some even mention that fixing the bad links recovered their traffic to levels slightly higher than before the drop.

The general advice on the Internet is that '404 errors don't cause a traffic drop, period'. But not all 404s are of the same nature. If another domain creates an incorrect link to your website, it's not your fault. But what if your own domain has bad links to itself?

I'd be interested in reading the experiences of webmasters who've been through a similar situation.

aakk9999




msg:4524231
 10:55 pm on Dec 1, 2012 (gmt 0)

There are webmasters who've cited their experiences confirming that a large number of bad URLs pointing to non-existent locations on the same domain has resulted in a drop in rankings.

This is what TMS has been telling you all along - re-read his posts.

TheBigK




msg:4524273
 3:50 am on Dec 2, 2012 (gmt 0)

@aakk9999: I've read and re-read his posts. Yes, my cynicism comes from those people with 'green bubbles' next to their usernames in the Google Webmaster Forums. They tell me their websites had a million of those errors and were not affected by Google.

You may search for "Can large number of internal broken links cause search penalty?" in GWT to see what those folks have been saying.

I'm sorry for being very paranoid. It sucks, but I just want the crawl rate and the traffic to come back and have been researching a lot these days. :-(

TheBigK




msg:4524757
 4:41 am on Dec 4, 2012 (gmt 0)

Alright, here's the 'latest' from Google's John Mu on the same discussion I had initiated. Who should I really trust? I'm turning paranoid!

---

Hi TheBigK

The number of crawl errors on your site generally doesn't affect the rest of your site's crawling, indexing, or ranking. It's completely normal for a site to have URLs that are invalid and which return 404 (or 410, etc). That wouldn't be something which we would count against a site -- on the contrary, it's something which tells us that the site is configured correctly.

For more information about 404's in particular, I'd also check out our blog post at [googlewebmastercentral.blogspot.com...]

When it comes to marking crawl errors as "fixed" in Webmaster Tools, this is something which is not used for the rest of our crawling / indexing systems, it's only used to simplify things in your UI. We try to bubble up more important crawl errors (such as pages that recently existed), so if you're not seeing any important URLs there (such as some that you thought were valid URLs), then I wouldn't worry about those crawl errors.

Cheers
John
------

So in essence, he's suggesting that broken links that originate from and point to your own domain are 'fine'. WTF?

TheMadScientist




msg:4524758
 4:56 am on Dec 4, 2012 (gmt 0)

No he isn't... He doesn't say a word about links.

He says 404 Error Pages and Crawl Errors. You're really going to have to pay attention to exactly what people say to ever get anywhere in this game, and right now, you're definitely not getting it.

TheBigK




msg:4524761
 5:10 am on Dec 4, 2012 (gmt 0)

TMS - Yes, I do understand what he's saying. He's talking 'in general' about crawl errors reported in GWT and 404 pages, and keeps pointing at the same blog post that says 404s don't affect rankings. My frustration comes from having explained the complete situation, that the links 'originate from my site and end on my site', but he continues to ignore it.

I'll just ignore him.

lucy24




msg:4524783
 6:33 am on Dec 4, 2012 (gmt 0)

When it comes to marking crawl errors as "fixed" in Webmaster Tools, this is something which is not used for the rest of our crawling / indexing systems, it's only used to simplify things in your UI.

Uh... Is this saying that marking an error as "fixed" means
absolutely
nothing
at all
whatsoever--
it just clears out some room for the next batch of errors to swim into view?

So "fixed" = {display: none} ?

TheBigK




msg:4524874
 9:15 am on Dec 4, 2012 (gmt 0)

@Lucy24 - Yes, that's what he means. So why even bother displaying only 1000 errors? Why give the option to 'mark as fixed'? Just show all the errors you know about on the site. I wonder if John even reads the messages before he replies to them.

aakk9999




msg:4524944
 12:56 pm on Dec 4, 2012 (gmt 0)

^^^^
Hmm... this explains why WMT yesterday said I have 500 "Not Found" errors but was showing only 183, even though it says it can show up to 1000... because a couple of days earlier I had marked something like 300 URLs as "Fixed"...

Sgt_Kickaxe




msg:4524957
 1:58 pm on Dec 4, 2012 (gmt 0)

- Switch from 404 to 410 and GWT will continue to report them as 404 for a week anyway; the error data is delayed.

- If you redirect links for things like affiliate programs and each link uses different parameters, then your 'not selected' number will increase.

- Marking things "fixed" when you actually want them to be 404 just brings them back as broken on the next crawl; don't bother.

- While I appreciate the data given to me by GWT, I do not feel it is a two-way tool; they are reporting findings, but my input is largely ignored.

TheBigK




msg:4525007
 4:23 pm on Dec 4, 2012 (gmt 0)

Okay, I almost feel like I'm spamming this thread. I'd, however, appreciate thoughts on the following post by John (Googler) in my thread -

-----
Hi TheBigK
This thread is moving too quickly :-) - let me respond to your questions:

The number of broken links -- assuming the links point to URLs that don't exist -- generally does not affect your site's crawling, indexing or ranking at all, regardless if it's a handful or millions of them. This does not make us assume that a website is of lower quality (personally, it's more like a sign that the website is technically handling these invalid URLs correctly, which would be a good sign). The number of 404/410 crawl errors would also not negatively affect the crawl rate of the website -- it might even increase the crawl rate since the server can likely respond to these requests a bit faster than to normal requests.

The caveat that Luzie mentioned still applies though -- if these links were meant to point to legitimate content, then of course those links won't work, and that can make it harder for us to find that legitimate content. In other words, if your website is linking internally with broken URLs instead of correct URLs, then that would be worth fixing. In your case, these links appear to be pointing to invalid URLs, so that's not a worry here.

The reason for the "generally" is somewhat technical and not something that most websites would need to worry about. In particular, we try to limit our crawling on a per-server basis to avoid overloading the server and its websites. If we were to crawl invalid URLs instead of useful URLs, then it could take longer for us to recrawl the useful URLs. Since we try to prioritize normal URLs that we know about over URLs that we're just double-checking to see if they exist, this wouldn't be an issue; we'd still crawl your normal URLs normally and just try to squeeze these extra URLs in on the side. Even if we were to crawl some normal URLs a bit less frequently, that would generally not affect their indexing or ranking.

Hope this helps!
John
-----
