
Forum Moderators: Robert Charlton & goodroi

Conflicting Data in GSC - # of Pages Indexed

     
9:03 pm on Jan 9, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 29, 2003
posts:800
votes: 22


On my recently SSL converted site,
In the Google Search Console,
--- Under "Google Index" - "Index Status", it says ---> 424 Indexed (new HTTPS version), (old HTTP version 14 Indexed)

--- Under "Crawl" - Sitemaps, it says ---> 421 Submitted, 407 Indexed (both HTTPS and HTTP - notice delta of 14)
That 14 just won't go away.

Been this way for a while. Can anyone clarify what is going on?

Is there a surefire way to determine JUST WHICH PAGES are gumming up the works (remain unindexed)?
And why am I JUST FINE by one "pages indexed" metric (over 100%), but 3% screwed in the other?

I have been running Screaming Frog, but can't find a thing. Thanks in advance for any insight offered.
.
10:56 pm on Jan 9, 2018 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Sept 7, 2006
posts: 1025
votes: 88


When you say "recently", how long ago? It was a couple of months before GSC sorted the whole of my (smaller) site, and unless there are any errors I wouldn't worry too much.

If you can access the raw server logs, that is the place to check: look for erroneous 200s in the http log (there should be none), and for anything else (but particularly 4xx codes) in the https log. If everything there checks out OK, just give it time.
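
The log check described above could be sketched roughly as follows. This is only an illustration, assuming the host writes access logs in the usual common/combined format; the function name `suspicious_requests` and the sample addresses are made up for the example, and the real log layout may differ from host to host.

```python
import re
from collections import Counter

# Picks the request line and status code out of a common/combined log entry, e.g.
# 192.0.2.1 - - [09/Jan/2018:21:03:00 +0000] "GET /page.htm HTTP/1.1" 200 5120 "-" "UA"
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def suspicious_requests(lines, is_http_log):
    """Count (status, path) pairs that deserve a look in the given log.

    http log:  any 200 is suspect, since every request should be redirected.
    https log: anything other than 200 is reported (304s will show up too
               and are usually harmless; 4xx codes are the ones to chase).
    """
    hits = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        status = int(m.group("status"))
        bad = (status == 200) if is_http_log else (status != 200)
        if bad:
            hits[(status, m.group("path"))] += 1
    return hits
```

To try it on a downloaded log, something like `suspicious_requests(open("access_log"), is_http_log=True).most_common(20)` would list the most frequent offenders; the file name here is a placeholder.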
1:10 am on Jan 10, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 29, 2003
posts:800
votes: 22


I converted everything (I THOUGHT) in July.
My host did not properly implement the 301 redirections, until early November (plesk system).
By December, I was almost fully reindexed as HTTPS.
I lost most of the links to my top page (was over 30,000, now 4,100).
Still trying to crawl my way back to relevance.

I will bring up viewing "raw server logs", which I have never seen before. Are they normally provided by the host? (I am not a programmer - just a little self-taught HTML.)
.
3:46 am on Jan 10, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3724
votes: 205


You mention under Crawl > Sitemaps that you still have sitemaps for both http and https. Is that correct? When you create a new domain in GSC for the SSL version, keeping the old non-https version listed in GSC as well can help you check on indexing changes (from old to new). But you should not have a non-https sitemap in that domain at GSC after switching to https, unless there are still pages that are accessible only via http.
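
A related check that is easy to run locally: confirm that the sitemap file itself no longer lists any http URLs. This is a small sketch, assuming a standard sitemaps.org XML file; `non_https_entries` is a name invented for the example and has nothing to do with GSC itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def non_https_entries(sitemap_xml):
    """Return every <loc> URL in the sitemap that does not start with https://."""
    root = ET.fromstring(sitemap_xml)
    urls = [(loc.text or "").strip() for loc in root.iter(SITEMAP_NS + "loc")]
    return [u for u in urls if u and not u.startswith("https://")]
```

For a file on disk, `non_https_entries(open("sitemap.xml", "rb").read())` should return an empty list once the conversion is complete; the file name is a placeholder.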

@Sally Stitts - re: raw server logs - You can download your logs via ftp; usually they are available in a zip or gzip format. They can be viewed in a text editor or as an office spreadsheet. Best to use a tool that has decent search capabilities. Various hosts have minor differences in their log policies. Generally each line shows you the IP, the date/time, the file requested, the protocol, the browser (User Agent) used and occasionally a referrer.

[edited by: not2easy at 4:24 am (utc) on Jan 10, 2018]

3:58 am on Jan 10, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11526
votes: 702


Can anyone clarify what is going on?
In addition to what's been said, know that GSC is notorious for displaying conflicting data... especially indexed & crawl stats. It's intermittent, but chronic.
2:34 am on Jan 31, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 29, 2003
posts:800
votes: 22


I knew there was something I could do during the Trump SOTU speech, with my TV volume TURNED OFF.

I have added a few pages.

The latest -
--- Under "Google Index" - "Index Status", it says ---> 426 Indexed (new HTTPS version GOOD), old HTTP version 13 Indexed BAD

--- Under "Crawl" - "Sitemaps", it says ---> 425 Submitted, 408 Indexed (both HTTPS and HTTP - notice delta of ... 17?) (Needs weekly update on Feb.4.)
That old 14, now 17, just won't go away. BAD

Maybe I should just click on the "Resolve Conflicting Indexing Data" button?
Makes a nice acronym of RCID, or arrsse ID. (Just for fun ... strike it if you must.)

AND, for the first time in months, I received an UPTICK in my incoming link count. Maybe there is hope, after all.
Salvation after a monumental error? Glory be. There's stuff hiding in those old data banks.
.
5:53 pm on Jan 31, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14718
votes: 614


Are you on the new GSC or the old one? The new version should give a list of specific pages that have been submitted but not indexed. The list is not infinitely long, but it's definitely longer than 14.
8:57 pm on Feb 6, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:May 29, 2003
posts:800
votes: 22


Sorry for the delay in answering.
How the heck do you get to the NEW GSC? I can only get there from an email they sent to me.
I cannot find a way to get to the NEW GSC from the OLD GSC.
I have been "dealing" with the 13 files issues STILL indicated as HTTP by the NEW GSC.

9 are PDFs. (Does Googs not like converting PDF files from HTTP to HTTPS?) I have no clue. Shown as BOTH HTTP and HTTPS.
2 are files that have been deleted FOR YEARS. NOT https, not http. GONE. FOR YEARS.
1 file is garbage, with a bazillion characters added to the filename - blahblah.htm?&usa=U7ei=(bazillion characters)
1 file should be fine, redirected to HTTPS, but ALSO shown as HTTP. Over 400 files redirected, why not this one (completely)?

Missing from the new interface - the KILL IT OPTION! Why no option for this?
I am telling you that it is gone, dead and buried, but I cannot tell you DUMP IT NOW?
Why is it good, not to be able to?
.
10:24 pm on Feb 6, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:3724
votes: 205


I would be looking at my redirects at this point. If you can browse to files under http that you want indexed under https, they are not properly redirected. Read through some of the old threads on http to https at [webmasterworld.com...] (2016) or see [webmasterworld.com...] from 2015
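
For a quick way to test a handful of stubborn URLs without a browser (which follows redirects silently), here is a rough sketch. It assumes the goal is a single 301 from each http URL to the same address under https; the helper names `head` and `classify` are invented for the example, and `head` needs network access to the site being checked.

```python
import http.client
from urllib.parse import urlsplit

def head(http_url):
    """Send one HEAD request over plain http, without following redirects."""
    parts = urlsplit(http_url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def classify(http_url, status, location):
    """Judge the response to an http:// request against the expected 301."""
    expected = "https://" + http_url[len("http://"):]
    if status == 200:
        return "not redirected"  # page is still served over plain http
    if status == 301 and location == expected:
        return "ok"
    if status in (301, 302, 307, 308):
        return f"check: {status} -> {location}"  # wrong type or wrong target
    return f"check: status {status}"
```

Usage would be along the lines of `classify(url, *head(url))` for each of the 13 URLs that GSC still reports under http.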

Re: no kill it option - from what I've seen hitting Validate does the same thing but it takes longer and they insist on keeping you updated. I do not like the new interface, but I think they are still working on it for now. Sure looks like it isn't quite ready.
12:53 am on Feb 7, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14718
votes: 614


I cannot find a way to get to the NEW GSC from the OLD GSC.

I suppose eventually they will redirect you automatically--I'm pretty sure by now I've got all eight notifications (three sites, one of them https), and I'm sure I am not at the top of their list--but for now all I could figure out was to make a bookmark once I'm on a New GSC page. And then from any one New GSC page you can go to all the others.

Validate does the same thing but it takes longer
Boy, does it ever. I just checked on an URL that they previously claimed as a 404. (They appear to have requested the file several days before I created it. I can only conjecture that I goofed by adding it to a banner header at some earlier date, and--in really bing-esque fashion--Google noticed.) They successfully crawled it four days ago, nine days after the Validation Started date, which seems excessive for a single 404. If it had been a whole new site, they'd be on it in nine hours.
 
