Without affecting [widgets.com?...]
I suspect you can, as it uses full absolute URLs, but I am fearful that it may disregard the http status when logging the pages for non-inclusion over the next six months!
What's the opinion out there?
...if you serve content via both http and https, you'll need a separate robots.txt file for each of these protocols. For example, to allow Googlebot to index all http pages but no https pages, you'd use the robots.txt files below.

For your http protocol (http://yourserver.com/robots.txt):

User-agent: *
Allow: /

For the https protocol (https://yourserver.com/robots.txt):

User-agent: *
Disallow: /

[google.com...]
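A note on mechanics: both protocols usually share one document root, so a single static robots.txt file can't say different things over http and https. One way to get the effect Google describes is to generate it dynamically. A minimal classic ASP sketch, assuming IIS is configured to hand requests for /robots.txt to a hypothetical robots.asp script:

<%
' Hypothetical robots.asp: emit a protocol-dependent robots.txt body.
' Assumes IIS maps /robots.txt requests to this script (not shown here).
Response.ContentType = "text/plain"
If LCase(Request.ServerVariables("HTTPS")) = "on" Then
    ' Secure requests: block everything so https pages drop out.
    Response.Write "User-agent: *" & vbCrLf
    Response.Write "Disallow: /"
Else
    ' Plain http requests: allow everything.
    Response.Write "User-agent: *" & vbCrLf
    Response.Write "Allow: /"
End If
%>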
To use the removal tool, you must also have a robots.txt in place. So if your https: removal request sends the bot to http: by some accident, the bot would NOT see the proper robots.txt -- so theoretically the http: URLs would be safe.
Warning - I have not tested this in the real world.
Correct me if I am wrong, but I have asked everywhere and have yet to get an answer that is workable on a basic HTML site.
Also, you could use the Google removal tool without using robots.txt to block, provided the https version of the page shows a 404 error -- say by removing the SSL cert.
Do you have a dynamic site? ASP / ASPX?
If so, you can redirect each page using code. (Probably a lot of work, but worth it in the long run.)
All you need to do is create the https version of the site as a completely separate site, duplicate the pages, and add only the redirect code in the pages.
I'm assuming that you don't want to use the https version again.
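A minimal sketch of the kind of redirect stub being suggested, assuming classic ASP on IIS (the server-variable names are IIS-specific): every page on the duplicated https site would contain only this block.

<%
' Hypothetical redirect-only page for the duplicated https site:
' send a permanent redirect to the same path on the plain-http site.
Response.Status = "301 Moved Permanently"
Response.AddHeader "Location", "http://" & _
    Request.ServerVariables("HTTP_HOST") & _
    Request.ServerVariables("URL")
Response.End
%>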
We do not have a dynamic site.
Basic HTML plus JavaScript & a few ASP pages.
Ellio, did you find your https:// pages in the SERPs out of the blue?
We have noticed it on one of our sites. It is back to http:// now.
It is quite strange, as there is no link to the https:// page from anywhere.
That leads to a suspicion that someone could have entered the https:// page by mistake with a toolbar installed.
I noticed it when doing a site: search, as the http index page (and others) had been replaced by the https version.
I have worked out how it happened and corrected the error. We had secure forms on the same domain, but they have now been transferred to a related domain; the SSL cert has been removed from the original and a new one installed on the related domain.
The problem was with relative linking. There were no direct links to the https index page, but there was a single relative /page.html link on the form pages, and in turn that page had a relative /index.html link that the robot then saw as [mysite.co.uk...]
A good reason for ONLY using absolute linking. All our links have been changed to absolute, but we are still waiting for the https pages to go and the http pages to return.
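To illustrate the leak with hypothetical markup (assuming a form page served from https://mysite.co.uk):

<!-- Relative link: fetched over https, it inherits the protocol,
     so the robot resolves it as https://mysite.co.uk/index.html -->
<a href="/index.html">Home</a>

<!-- Absolute link: the protocol is pinned, so the robot stays on http -->
<a href="http://mysite.co.uk/index.html">Home</a>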
Luckily the problem occurred in the default index and not Big Daddy.
Google help emailed to say that they had removed the offending https pages as requested.
They confirmed that the http page versions would not be affected, as these pages did not return a 404 error.
Sounds like it's safe to use the removal tool provided ONLY the exact URL being removed shows a 404 error.
I would not rely on Google remove/robots.txt for this.