Forum Moderators: Robert Charlton & goodroi

Google stopped indexing my deeper pages (143k “Crawled – not indexed”) – should I merge them?


guarriman3

10:46 pm on Oct 6, 2025 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hey everyone,

I run a big site (about 350,000 insect species) organized like this:
Home > Families > Subfamilies > Species > Photos/Maps > Individual photo/data pages.
Everything’s static, in Spanish, and only updated once a year (population data, new photos, etc.).

Here’s what I’m seeing in Search Console:
  • 143k URLs: Crawled – currently not indexed (half are from the photo/map/data levels)
  • 68k URLs: Discovered – currently not indexed (almost all from those lower levels)

    A few years ago, all of these used to index fine. Since around 2022 (Helpful Content + Core Updates), Google barely indexes anything below the main species pages.

    So my questions are:
  • Does this sound normal for big structured sites like this?
  • Could all those “not indexed” URLs hurt crawling or ranking for the upper levels?
  • Should I just merge everything (photos, maps, etc.) into the main species page and kill the extra URLs?

    Would love to hear if anyone managing large database-style or taxonomy-type sites ran into the same issue.

    Thanks!
not2easy

    4:57 pm on Oct 7, 2025 (gmt 0)

    WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



    I don't manage a large database/taxonomy site myself, so I'm only guessing at the number of photos, maps, etc. involved, but you probably don't want all that data on one page. I'd check how people use the main species pages first. If they are using the additional pages, it might be better to leave things as they are.

    Once a person seeking information lands on a species page, it seems better to offer them links to that specific information than to make them hunt for it within the species page itself.

    I don't think merging would improve ranking, and it could confuse Google even more than it already is. Google can treat major structural changes as manipulation attempts these days.

    Others who have dealt with a similar situation can no doubt give a better answer, so hopefully someone else will weigh in.

    lucy24

    5:05 pm on Oct 7, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Obligatory addendum, G being G: Have you done some exact-text searches to confirm that when they say “not indexed” they are telling the truth?

    On the other hand, do you really need multiple data pages for a single species? And do they actually need to be indexed, so long as users can find the species page and go from there? (I haven't, of course, seen your site. But I do spend a fair amount of time at GBIF, so I'm picturing the extra information linked from each main page.)

    Whitey

    8:46 pm on Oct 7, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    What you’re seeing is completely normal since around 2022. After Google’s *Helpful Content* and *Core* updates, it became far more selective about what it indexes. Large structured sites (like yours) with deep taxonomies or thin sub-pages were hit hardest.

    Those “Crawled – not indexed” URLs aren’t hurting you directly, but they dilute crawl focus. External links can help indexing by signalling importance, but only if the page also has unique, valuable content and good internal links.

    Best move: prune or merge low-value photo/map/data pages into richer species pages, keep sitemaps clean, and strengthen internal linking. Quality and usefulness now matter far more than quantity.
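    If the merge goes ahead, the mechanical part is 301-redirecting each retired photo/map/data URL to its parent species page so the old URLs pass their equity along. A minimal sketch, assuming a hypothetical URL scheme like `/familia/subfamilia/especie/fotos/foto-17` (the real path segments will differ):

    ```python
    from urllib.parse import urlsplit

    # Sub-page types to fold back into the parent species page.
    # These segment names are hypothetical; adjust to the real URL scheme.
    MERGEABLE = {"fotos", "mapas", "datos"}

    def redirect_target(url: str) -> str | None:
        """Return the parent species URL if `url` is a mergeable
        sub-page, else None (the page is kept as-is)."""
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        # e.g. ['familia', 'subfamilia', 'especie', 'fotos', 'foto-17']
        for i, seg in enumerate(segments):
            if seg in MERGEABLE:
                parent = "/" + "/".join(segments[:i]) + "/"
                return f"{parts.scheme}://{parts.netloc}{parent}"
        return None

    if __name__ == "__main__":
        print(redirect_target("https://example.com/a/b/apis-mellifera/fotos/foto-17"))
        print(redirect_target("https://example.com/a/b/apis-mellifera/"))
    ```

    Feed the output into whatever redirect mechanism the server uses, and drop the redirected URLs from the sitemap at the same time.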

    See Search Engine Land, “Fix ‘Crawled – currently not indexed’ error in Google Search Console,” Aug 2022 — for reference [searchengineland.com...]

    Whitey

    2:05 am on Oct 8, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    TL;DR (follow-up):
    Once you’ve handled the technical / crawl / structure basics, the next priority is authority, both for your site overall and for the inner pages you want indexed. Google now makes stronger “selective indexing” decisions than before: even genuinely unique pages can be skipped if they lack the reputation or signals to show they deserve a place in the index.

    That’s why solid backlinks (external and internal) become the next critical lever. Pages that receive link equity are more likely to be seen by Google as “worth indexing.” As your site’s authority grows, many of those previously unindexed URLs often drift back into the index, quietly, without further changes.
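    The internal side of “link equity” can be made concrete with a toy PageRank pass over a hypothetical internal link graph (the page names below are made up, not from the site in question). Pages that everything links to accumulate equity; orphaned deep pages don’t:

    ```python
    # Toy illustration of internal link equity: a few power-iteration
    # steps of PageRank over a hypothetical internal link graph.
    links = {
        "home":    ["familia", "especie"],
        "familia": ["especie"],
        "especie": ["fotos", "mapa", "home"],
        "fotos":   ["especie"],
        "mapa":    ["especie"],
    }

    def pagerank(links, damping=0.85, iters=50):
        pages = list(links)
        rank = {p: 1 / len(pages) for p in pages}
        for _ in range(iters):
            new = {p: (1 - damping) / len(pages) for p in pages}
            for page, outs in links.items():
                share = damping * rank[page] / len(outs)
                for out in outs:
                    new[out] += share
            rank = new
        return rank

    rank = pagerank(links)
    # The species page, which everything links to, accumulates the most equity.
    print(max(rank, key=rank.get))
    ```

    The practical takeaway: linking every photo/map sub-page back to (and from) its species page concentrates equity where the indexing decision matters.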

    But make sure the backlinks come from sites with topical relevance to yours, or you could do more harm than good. A few quality referrals beat thousands of low-quality, irrelevant ones.

    So yes: after getting the fundamentals right (crawlability, content, internal linking), shift your focus to building and distributing authority across your site and pages.

    tangor

    7:11 am on Oct 8, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Scientific sites can often exceed that "old rule of thumb" that NO PAGE should be more than three links from HOME. Not because of search engines, but simply USER annoyance at having to click YET ANOTHER LINK to get the full picture of the topic of interest. Then again, search engines might have that same frame of mind.

    Sometimes "granularity" is just that: sand in the machine clogging efficiency and useful results.

    If your MAIN TOPIC pages (top two levels below HOME) are INDEXED then you are good, as that is YOUR CONTENT; the rest is dicta that can be found by the USER if desired, but is otherwise TOO THIN in "discoverable" content for a search engine to bother with IN THE SERPS. MEANWHILE, those pages ARE CRAWLED (they have to be to get the "not indexed" tag) and are LIKELY IN THE SERPS at a much more obscure location. As suggested above, do a few site-specific EXACT MATCH searches to confirm.
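    Before any exact-match spot checks, it's worth confirming that the "not indexed" URLs really do cluster at the lower levels. A quick sketch, assuming the Search Console coverage export is a plain one-URL-per-line list (export formats vary, so treat this as a starting point):

    ```python
    from collections import Counter
    from urllib.parse import urlsplit

    def depth_histogram(urls):
        """Count URLs by number of path segments (site depth)."""
        counts = Counter()
        for url in urls:
            path = urlsplit(url.strip()).path
            counts[len([s for s in path.split("/") if s])] += 1
        return counts

    # Hypothetical sample; in practice, read the exported URL list from a file.
    sample = [
        "https://example.com/familia/",
        "https://example.com/familia/subfamilia/especie/",
        "https://example.com/familia/subfamilia/especie/fotos/foto-1",
        "https://example.com/familia/subfamilia/especie/fotos/foto-2",
    ]
    for depth, n in sorted(depth_histogram(sample).items()):
        print(f"depth {depth}: {n} URLs")
    ```

    If the histogram shows the top two levels clean and everything below depth 3 or 4 excluded, that matches the pattern described above.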

    Brett_Tabke

    3:24 pm on Oct 14, 2025 (gmt 0)

    WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month Best Post Of The Month



    So, "currently not indexed" is a huge refrain throughout SEO right now. Not Indexed counts seem to be climbing for everyone.

    <.02> My opinion is that Google is pruning its DB to reduce size, to free up more resources for AI usage. Pages that don't, and likely aren't going to, get clicks are being de-indexed.</.02>

    fearlessrick

    3:30 pm on Oct 14, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Agree completely with Brett. I have maybe 5,000 pages across two sites with vastly varied interests. Most are not indexed by Google, though all of them used to be. Google is increasingly becoming a detriment to human awareness.

    We should all thank them for being so "not evil." (yes, that deserves a <sarc> tag)

    lucy24

    6:02 pm on Oct 14, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Google is pruning their DB to reduce size
    May not be a bad thing, at least for some sites. I’m already quasi-blocking* visitors from certain geographic regions because I am utterly confident that they are not actually looking for material to be found on my pages, but are too lazy and/or too stupid to even glance at what the SERP says. (Or possibly all use the “I feel lucky” option, which again is wildly unlikely to be applicable.) People who are looking for what I have are more likely to use {reputable curated directory}.**


    * 302 to a page that says “You have accidentally replicated the behavior of an undesirable robot”, with link to the originally requested page. I believe I can count on my thumbs the number of people who have availed themselves of this option.
    ** Whose curator likes me ;)

    Whitey

    8:45 pm on Oct 14, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Brett’s right on the money. What we’re seeing looks a lot like Google pruning its index - not just to save space, but to concentrate on pages that actually get used.

    It’s not necessarily a “quality” issue. It’s more about engagement probability. If a page hasn’t attracted clicks or interaction in a while, it’s quietly parked out of the main index to free up resources for what users seem to care about — and for AI systems that need cleaner, higher-signal data.

    Merging the thinner photo/map/data layers into richer species pages makes sense, but I’d go steady. Don’t nuke deep pages that add genuine value or serve niche queries - just strengthen internal linking and consolidate authority where it counts.

    Feels less like a penalty, more like a reallocation of index bandwidth. Authority, freshness, and user signals are carrying more weight than ever.

    smallcompany

    8:40 pm on Dec 10, 2025 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    I noticed the same on my small sites, and experienced:

    - Search Console (WMT) reports a specific page as not indexed, but a live test of the same URL comes back all green, as if indexed and fine.
    - Pages drop out of the index; I request indexing, they return, then get dropped again.

    So I concluded Google started expelling pages that were not getting traffic anyway, similar to or the same as what Brett described above.