Reindexing problem

chainazo

5:22 pm on Sep 13, 2022 (gmt 0)

Top Contributors Of The Month



Hi, I have a very basic question:
2 months ago I uploaded a site, but:
1) I didn't change anything on it (it just stayed with the "sample page" and the "Hello world!" post). So it's a website with only one page and one post.
2) I didn't tell the search engine not to index it.
3) I didn't install SSL.
4) I registered the site in Search Console (with http, because there was no SSL).
5) I forgot about the site

Very newbie!

A month ago I went back to work on it, made the pages I wanted, installed ssl, etc. Everything ok.

When I asked SC to index the new pages (correct home with https and 8 new pages), I found that:
1) The http pages (home, author and rss) were already indexed, each with a 2013 date (?).
2) I told Google "these are now the correct pages" but it only correctly replaced the https home page. (The other 8 pages were never indexed, but were recognized by SC in the sitemap I sent it.)

So a month after asking SC to reindex the correct pages, I only have one correct https home page indexed and two http pages I don't want, plus 8 "recognized but not indexed" URLs.

I know google has long indexing times but in this month I have been uploading many other websites and all have been indexed perfectly with times of 9 days max.

So I have only had the indexing time problem with this site. Not that I am in a hurry for indexing.

Thanks for your help

not2easy

7:50 pm on Sep 13, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You have set up a GSC account using the http version of the domain. If there are ancient pages indexed, maybe the domain was previously indexed with those pages. Now you have added https so you will need a new GSC account using the https version of that domain. Google sees these as two different domains. Be sure that https version is where you submit your sitemap. You should be able to see the indexed pages decrease in the http version as the "new" domain pages are indexed in the https account.

Yes, they are the same site, but Google views each version (http/https and www/non-www) as 4 different domains.
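To make the four versions concrete, here is a minimal Python sketch that lists them for a given URL. The function name `property_variants` and the host `example.com` are placeholders for illustration, not anything from GSC itself.

```python
from urllib.parse import urlsplit


def property_variants(url: str) -> list[str]:
    """Return the four scheme/host combinations for the URL's host."""
    host = urlsplit(url).hostname or ""
    # Strip a leading "www." so both host forms can be rebuilt.
    bare = host[4:] if host.startswith("www.") else host
    return [
        f"{scheme}://{prefix}{bare}/"
        for scheme in ("http", "https")
        for prefix in ("", "www.")
    ]


for variant in property_variants("http://example.com/"):
    print(variant)
```

Each printed URL is one version that may need its own entry in GSC, and only one of them should be the version the others redirect to.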

tangor

7:53 pm on Sep 13, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



First thing to do is 410 the pages you don't want!

Have you checked the URL inspection tool in SC?

chainazo

11:39 pm on Sep 18, 2022 (gmt 0)

Top Contributors Of The Month



Hi, I have done what you have said here, but there has been no change whatsoever.

When I request indexing of the "new pages" (i.e. those with httpS), Google reports this error: "HTTPS is invalid and may prevent indexing of the page".

But the SSL certificate is totally valid: I confirmed it with the hosting provider, and I have uploaded many websites with this hosting and never had a problem with SSL.

tangor

3:13 am on Sep 19, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"HTTPS is invalid and may prevent indexing of the page".


This suggests that Google has found something out of place regarding SSL, so take another look at your pages to make sure you aren't accidentally including content that might break HTTPS.
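One common way a page "breaks HTTPS" is mixed content, meaning an https page that loads images, scripts or stylesheets over plain http. That is an assumption about what is meant above, not something GSC confirmed in this thread. A rough Python sketch for spotting such references in a page's HTML (the class and function names are made up for this example):

```python
from html.parser import HTMLParser


class InsecureResourceFinder(HTMLParser):
    """Collect (tag, url) pairs for subresources referenced over http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # src/data/poster load subresources; <link href> covers stylesheets
            # and icons (it also flags an http canonical, worth knowing here).
            loads_resource = name in ("src", "data", "poster") or (
                tag == "link" and name == "href"
            )
            if loads_resource and value and value.lower().startswith("http://"):
                self.insecure.append((tag, value))


def find_insecure_resources(html: str) -> list:
    finder = InsecureResourceFinder()
    finder.feed(html)
    return finder.insecure


sample = (
    '<img src="http://example.com/a.png">'
    '<a href="http://example.com/">plain link, not a subresource</a>'
    '<script src="https://example.com/s.js"></script>'
)
print(find_insecure_resources(sample))
```

Ordinary `<a href>` links are deliberately ignored, since linking to an http page does not make the current page insecure. This is only a heuristic: it will not see resources pulled in by CSS or JavaScript.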

Sgt_Kickaxe

3:59 am on Oct 4, 2022 (gmt 0)



"HTTPS is invalid and may prevent indexing of the page".

There is currently a delay in the HTTPS reports. You can see it if you try to index a fresh page. Google will report it as fine but re-visit the page in SC a day later and it will say HTTPS is invalid. Just wait another 2 days and it's valid again in the reporting.

It was never invalid, Google just wasn't done evaluating the page yet. The section is new, expect it to get fixed soon enough.

My advice is to keep building the site for now. If your sitemap is fine and HTTPS is fine and your htaccess file correctly redirects http to https then you're good to go, give Google time to catch up.

Fun fact: Many years ago, circa 2006 or so, you could launch a new site, link the URL from a PR 6 site and the new domain would become PR 5 a few months later, even if it only had a Hello World post. No traffic, but great for selling links, for a couple of years, lol.

chainazo

8:28 am on Oct 7, 2022 (gmt 0)

Top Contributors Of The Month



Well, I did what I was told here. I let several days go by to see if the situation changed.
It hasn't.
GSC says my links can't be indexed because they have "nofollow" (which of course they don't).
What I did was to resend the links, but it took a long time and they are still being "inspected".
So I used a plugin to speed up the indexing of the links I was interested in.
And it indexed them all, but with the following peculiarity: In GSC it still shows only the home page (i.e. it doesn't recognize any of the other links).
But in the search box when typing site:mysite, it shows absolutely all the links I sent to index.
So on the one hand google seems to tell me "it's all good with your links, I got you, stay calm", but GSC wrongly tells me "you asked me not to follow you!".

I guess now I can rest easy because everything is indexed, but the problem is that I can't read the site behavior data because GSC almost doesn't recognize me.
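For anyone wanting to double-check the "nofollow" claim independently of GSC, a page can carry robots directives in two places: a `<meta name="robots">` (or `googlebot`) tag in the HTML, and an `X-Robots-Tag` response header. The sketch below, with hypothetical names, collects both from input you supply yourself; it does not fetch anything.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> / <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def robots_directives(html: str, x_robots_tag: str = "") -> list:
    """Return the sorted, de-duplicated directives from the HTML and the header."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    header = [d.strip().lower() for d in x_robots_tag.split(",") if d.strip()]
    return sorted(set(finder.directives + header))


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_directives(page))
```

An empty list for a page's source and headers would support the view that the report is wrong rather than the site. A theme, plugin or server setting can add these directives without the author noticing, so it is worth checking the HTML as served, not the editor view.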

not2easy

12:31 pm on Oct 7, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



GSC seems to have been having some growing pains for at least a month or so. I say that based on the number and type of anomalies showing up in various forums here. It is anyone's guess when or if it will show reliable data.

not2easy

4:58 pm on Oct 7, 2022 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If the http: site is not properly 301 redirected to the new https version you might also see problems getting Google to understand the changes. Can you still visit the http: site? If so, there is a problem with your redirect. It also helps to edit the Sitemap line in robots.txt so that it points to the sitemap on the https: version, the one submitted in GSC. Keep checking the old http: site's GSC account to see that it has fewer indexed pages.

If you are unsure, there are hundreds of discussions in the Apache forum that explain all the steps. If you type in the old http: domain URL, you should land on the new https: version and in your access logs you can see the server's 301 redirect response to that activity.
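The redirect can also be checked from outside the server. The following is a sketch only, using Python's standard `http.client`, which does not follow redirects, so the first response is visible as-is. The function names and `example.com` are placeholders.

```python
import http.client
from urllib.parse import urlsplit


def is_https_redirect(status, location) -> bool:
    """True when the response is a permanent (301) redirect to an https URL.

    This only checks the status and the target scheme, not the target host.
    """
    if status != 301 or not location:
        return False
    return urlsplit(location).scheme == "https"


def fetch_first_response(url: str, timeout: float = 10):
    """Send one HEAD request over plain http and return (status, Location)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example (needs network access, so it is left commented out):
# status, location = fetch_first_response("http://example.com/")
# print(status, location, is_https_redirect(status, location))
```

A 200 here means the http: site is still being served directly; a 302 means a redirect exists but is temporary rather than permanent.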

lucy24

6:18 pm on Oct 7, 2022 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



and in your access logs you can see the server's 301 redirect response to that activity
Maybe, maybe not. Last time I added HTTPS to a site--a few weeks back--I found that my host has switched to a default of redirecting all http requests before they even reach the site. I don't, of course, know the mechanics, but it means the request doesn't show up in http logs at all. (Equally of-course: as soon as I figured this out, I turned off the default setting, ignoring dire warnings about making the site run slower.)
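Where the redirect does reach the site's own logs, the 301 lines can be pulled out with a few lines of Python. This sketch assumes the common Apache "combined" log layout, and the sample lines are invented for illustration. Given the caveat above, an empty result does not prove the redirect is missing; the host may be answering before the request is logged.

```python
import re

# Matches the quoted request and the status code that follows it, e.g.
# "GET /page HTTP/1.1" 301 245
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def redirected_paths(lines) -> list:
    """Return the request paths of all log lines answered with a 301."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "301":
            hits.append(match.group("path"))
    return hits


sample_log = [
    '203.0.113.5 - - [07/Oct/2022:18:18:00 +0000] "GET / HTTP/1.1" 301 245 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [07/Oct/2022:18:18:01 +0000] "GET /about/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [07/Oct/2022:18:19:10 +0000] "GET /feed/ HTTP/1.1" 301 245 "-" "Mozilla/5.0"',
]
print(redirected_paths(sample_log))
```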