
Forum Moderators: Robert Charlton & goodroi

"Crawled - Currently Not Indexed" urls in Google Search Console

     
12:01 pm on Jul 23, 2019 (gmt 0)

New User

joined:July 23, 2019
posts: 24
votes: 0


Hello,
How to Diagnose "Crawled - Currently Not Indexed" in Google Search Console
There are over 60,000 URLs on my site in that category now. But when I inspected some of those URLs, some are showing "Submitted and indexed" and some are showing "indexed, but not in the sitemap." I checked my sitemap and the URLs are there. So I'm not sure whether I need to do anything about this "Crawled, currently not indexed" issue.
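One quick sanity check is to confirm programmatically that the URLs GSC flags really are in your sitemap, rather than eyeballing 60,000 entries. A minimal Python sketch (the sitemap path and the URL list are placeholders for your own):

```python
import xml.etree.ElementTree as ET

# Sitemap <loc> elements live in this namespace per the sitemaps.org protocol.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(path):
    """Return the set of <loc> URLs found in a sitemap file."""
    tree = ET.parse(path)
    return {loc.text.strip() for loc in tree.iter(NS + "loc")}

def missing_from_sitemap(reported_urls, sitemap_path):
    """URLs flagged by GSC that are not actually in the sitemap."""
    return sorted(set(reported_urls) - sitemap_urls(sitemap_path))
```

Feed it the URL list exported from the coverage report; anything it returns as missing explains an "indexed, not submitted in sitemap" status.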
3:51 am on July 24, 2019 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4520
votes: 350


Hi designergweb, and welcome to WebmasterWorld [webmasterworld.com]

Clearly, with that number of URLs, this is not a new site. Since you are seeing conflicting information for some of the URLs, I would say you should trust what you can verify and ignore what you determine to be untrue. If there were suddenly 60,000 URLs not indexed, I would expect a similar drop in traffic. But if further testing (inspecting those non-indexed URLs) shows that they are indexed, then you should not be seeing a related drop in traffic. Google is not always 100% correct in its data reporting. I have not seen much of anything in the 'new' GSC that I trust 100%. Personally, I would not take any action based on the "crawled, currently not indexed" message.
5:07 am on July 24, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:10465
votes: 1098


@designergweb.... Welcome to Webmasterworld!

G's reporting has been all over the place in recent days (try years!) and the revamped reporting tool(s) are more iffy than ever.

Raw logs are your backup to determine crawl ... as for "index" that's out of your hands ... and g has been inconsistent in what is DISPLAYED in the serps.
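Pulling that crawl picture out of raw logs is a quick scripting job. A rough sketch, assuming Combined Log Format and a naive user-agent match (real Googlebot verification needs a reverse-DNS check, which this skips):

```python
import re
from collections import Counter

# Matches the request line in a Combined Log Format entry, e.g.
# 66.249.66.1 - - [24/Jul/2019:05:07:01 +0000] "GET /page HTTP/1.1" 200 ...
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')

def googlebot_hits(log_lines):
    """Count URL fetches by anything claiming to be Googlebot.

    Naive user-agent substring match only; spoofers will be counted too
    unless you verify the requesting IP via reverse DNS.
    """
    hits = Counter()
    for line in log_lines:
        if "Googlebot" in line:
            m = REQUEST_RE.search(line)
            if m:
                hits[m.group(1)] += 1
    return hits
```

Comparing the set of paths it returns against your full URL list tells you what g actually fetched, regardless of what GSC claims.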

I have never used a sitemap ... that means g (or any bot) actually has to CRAWL my site to find everything ... and for the most part they do exactly that ... find everything. :)

Take the notice from g for exactly what it says: "Hey pardner, I got your stuff, but haven't had time to figure it out or put it on my list!"

Or be a bit more pessimistic in reading it as: "I got your stuff. I'll let you know what I think of it later, don't bother me."
5:56 am on July 24, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15889
votes: 876


Never take GSC’s unsupported word for anything.

Try exact-text searches for a random selection of the “crawled but not indexed” pages and see if they really do not come up in the index. Personally I have a hard time believing there exists anything that G has seen but not indexed. (Possible exception for things with explicit noindex tags--but even they have their place in the database.)
6:43 am on July 30, 2019 (gmt 0)

New User

joined:July 23, 2019
posts: 24
votes: 0


thanks a lot!
9:08 am on July 31, 2019 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2000
posts:12390
votes: 409


Try exact-text searches for a random selection of the “crawled but not indexed” pages and see if they really do not come up in the index.

designergweb, while I agree with many of the comments in this thread about approaching GSC with some skepticism, there is a trend with large sites not noted in the comments, and that is that Google is no longer simply indexing everything that's been submitted... I therefore would pay some attention to the "crawled but not indexed" pages you see reported, as they may actually be telling you something real about your site. That said, GSC's reporting function is necessarily the last link in a long chain in the indexing process, and Google's indexes are so large that there's no easy way to change that.

First, do check lucy24's suggestion: search for exact text strings from these pages (ie, put the strings in quotation marks) and see what you find. IMO, exact-text searches are the most dependable way to determine whether the pages you're looking for have been "indexed" in the particular layer of Google's indexes that you care about... the visible pages... those that are shown in the serps and are considered for ranking.

There's been a history, though, over the past year (actually longer) of Google trying to discourage the use of the Fetch tool as a way of getting pages into the serps, as it's been greatly misused by spammers, and the tool was never intended for mass submissions. It did become, for a while, kind of a mini-industry, where so-called SEOs who knew about the tool used it as a way to get less-than-adequate pages into the index.

See the first of our threads on the topic here, spearheaded by some very persistent reporting by Barry Schwartz of seroundtable, whose posts we frequently cite in the thread....

Big reductions in crawl-to-index limits on Google Fetch tool
March, 2018
https://www.webmasterworld.com/google/4893740.htm [webmasterworld.com]

From Barry's article, quoting Google's John Mueller...
...The "Request indexing" feature on Fetch as Google is a convenience method for easily requesting indexing for a few URLs; if you have a large number of URLs to submit, it is easier to submit a sitemap instead. Both methods are about the same in terms of response times....

Note also item #5 on the page...
5 - Recrawling is not immediate or guaranteed. It typically takes several days for a successful request to be granted. Also, understand that we can't guarantee that Google will index all your changes, as Google relies on a complex algorithm to update indexed materials.

Over time, Google raised the bar on the quality it would accept and imposed severe limits on submission quantities. The indexing problems in the early part of this year came, I think, in part from a glitch in an algorithm in this scenario... and from time to time I think these problems might also have been used to obscure the nature of the changes being made... but that's personal conjecture.

It's hard to say how much of what you're seeing now, in the way of "Crawled, Currently Not Indexed", goes back to how your site achieved its indexed status in the first place, and to these recent changes in Google.

Please post your observations as your indexed status evolves. I should add that I myself prefer to rely on natural crawling, as that gives me a better view of how the site structure and its integration with the web are performing.

3:58 am on Aug 1, 2019 (gmt 0)

Senior Member from ES 

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 13, 2005
posts:703
votes: 12


If you use WordPress, each post generates many complementary pages with no value to Google, such as:
example.com/post-title/feed/
example.com/post-title/comments/feed/
example.com/post-title/image2/
example.com/tag/post-title/
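If those auto-generated URLs account for a chunk of the "crawled, not indexed" list, you can filter them out before worrying about the rest. A rough sketch; the pattern list simply mirrors the examples above and is an assumption about your permalink structure, so adjust it to your own setup:

```python
import re

# WordPress-generated auxiliary URL patterns (assumed from the examples
# above -- feeds, comment feeds, attachment pages, tag archives).
LOW_VALUE = [
    re.compile(r"/feed/?$"),
    re.compile(r"/comments/feed/?$"),
    re.compile(r"/image\d+/?$"),
    re.compile(r"^/tag/"),
]

def is_low_value(path):
    """True if the URL path matches a known low-value WordPress pattern."""
    return any(p.search(path) for p in LOW_VALUE)
```

Anything the filter catches can safely be noindexed or disallowed rather than chased through GSC.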
6:12 am on Aug 1, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 29, 2005
posts:10465
votes: 1098


My last six months of raw logs with g have been interesting. In that full period of time, g has NOT managed a COMPLETE crawl of a 700-page site ... but turned around and crawled a 20k-page site three times! Head scratching involved, but these days I don't have sleepless nights, since there is not a lot anyone can do to bring that pony to water and make it drink...

Meanwhile, Bing is spot on, doing the same thing every two weeks. Very aggressive! DDG about once a month ... Others try, but most are either roboted or htaccess denied (no value to me).

I suspect g's crawl budget is getting diminished simply because of their scale, which is immense.

YMMV
12:36 pm on Aug 1, 2019 (gmt 0)

New User

joined:Mar 31, 2019
posts:13
votes: 13


Am I the only one who finds the new GSC completely useless?
I used Webmaster Tools a lot, and while it was a few days behind, it was useful.
The new version is completely useless. I don't find anything on it worth a damn.
12:49 pm on Aug 28, 2019 (gmt 0)

New User

joined:July 23, 2019
posts: 24
votes: 0


thanks a lot!