I got spam yesterday showing a snapshot of a new site that is not yet live but is accessible in the root.
"We noticed that "www.unpublished-site.com" is not listed in over 753,000 search engines etc etc.."
Use a robots.txt file - (See the WW one)
J
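For anyone wanting to try the robots.txt suggestion, here is a rough sketch of how you could sanity-check a rule set before uploading it. It uses Python's standard urllib.robotparser. The two rule sets and the function name is_crawlable are just made up for illustration: one blocks the whole unpublished site, the other blocks only a single demo file such as indexblue.html.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set 1: keep every compliant crawler out of the whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Hypothetical rule set 2: block only one demo page, leave the rest crawlable.
BLOCK_DEMO_PAGE = """\
User-agent: *
Disallow: /indexblue.html
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    site = "http://www.unpublished-site.com"
    print(is_crawlable(BLOCK_EVERYTHING, "Googlebot", site + "/index.html"))
    print(is_crawlable(BLOCK_DEMO_PAGE, "Googlebot", site + "/indexblue.html"))
    print(is_crawlable(BLOCK_DEMO_PAGE, "Googlebot", site + "/index.html"))
```

Keep in mind robots.txt only asks well-behaved crawlers to stay away. It does not stop anyone from requesting the files directly, so it would not have prevented the snapshot in the spam by itself.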
It was not named "index.html," it was named something like "indexblue.html," but basically it was identical to the index page as far as content goes (it was there as a demo for some clients).
Could this have gotten our index page blown up? It never occurred to us that G would index a page not linked to by any other page in the site. Geez, G has trouble just finding all of our regular linked pages...
The page I referred to was not crawled...I looked in the logs and also checked variations of the URL in G; found no results.
I did find another odd thing however, or maybe not odd. I will note it since it relates to posts about duplicate content.
During Dom we realized that a page called homepage.html was potentially causing us problems. We used it as an alternate homepage for non-AdWords PPC efforts, since it triggered a pop-up (which Google doesn't allow for AdWords efforts). The page was identical to the index.html page but with a pop-up javascript.
When we stopped running pop-ups back in April, we stopped using the homepage.html in PPC efforts...but we didn't delete homepage.html from the server until Dom/Es, when we realized it might be causing us a penalty for duplicate content.
Right after Dom, we deleted the page, and did a 301 back to the index.html.
Yesterday I did a search on G for the exact URL of that old page (www.mydomain.com/homepage.html) and the search returned our current "index.html" homepage.
Since we did a 301 from homepage.html back to the index.html page, perhaps this is not surprising. What bothers me is that G still keeps this way old, deleted URL (/homepage.html) with a cache of the current homepage. Essentially, they show two URLs with the same cache - our index.html homepage.
Any chance that this could be causing a problem? FYI, shortly after doing the 301, our index page reappeared. But the page has been gone since Florida for its main KW search. It still shows up for other searches.
<mods, if this belongs in one of the threads re dup content by all means move it>
I'd say yes, there's a chance. If the old URI is totally gone then it should correct eventually (maybe, perhaps, at a guess).
In the early days of Florida, people were reporting finding pages in the serps that had been gone for many months. Back during dom/esm, GG indicated that they liked to use an older, more stable database for their algo changes. Maybe that's what happened to you. As soon as we're full into the rolling updates again, (judging by the way I'm getting crawled that's now), then it might self-correct.
Right after Dom, we deleted the page, and did a 301 back to the index.html.
The page, URI, the whole thing, is totally off the server, is it?
ADDED: Wondering why you needed a redirect if the file was gone... if there's a server redirect still in place then maybe google thinks the old URI is still good, and then sees it as dupe content.
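One way to see what the server actually answers for the old URI is to request it without following the redirect and look at the status code. Below is a minimal sketch using Python's standard http.client. The function names fetch_status and describe are made up for this example, and the host and path in the usage lines are the placeholder ones from this thread.

```python
import http.client
from typing import Optional, Tuple


def fetch_status(host: str, path: str) -> Tuple[int, Optional[str]]:
    """Send a HEAD request and return (status code, Location header).

    http.client never follows redirects on its own, so a 301 is reported
    as a 301 instead of being silently resolved to the target page.
    """
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status: int, location: Optional[str]) -> str:
    """Turn a status code into a plain-English note about the old URI."""
    if status == 301:
        return f"permanent redirect to {location}"
    if status in (302, 307):
        return f"temporary redirect to {location}"
    if status in (404, 410):
        return "gone: the server no longer answers for this URI"
    if status == 200:
        return "still served: the old URI returns a page of its own"
    return f"other response ({status})"


if __name__ == "__main__":
    # Placeholder host and path taken from the posts above.
    status, location = fetch_status("www.mydomain.com", "/homepage.html")
    print(describe(status, location))
```

If this reports a permanent redirect, the file may be deleted but the URI is still being answered by the server, which is the situation being asked about. A 404 or 410 would mean the URI really is gone.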
I don't really think that this is the cause of my index page sinking into the depths, especially since:
- it comes up for searches other than the optimized two-word phrase that it was primarily targeting, and,
- 19 of the top 20 sites pre-Florida are also gone.
But, one never knows. Perhaps if not for this, there would have been two of us left standing, not just one lone competitor using cloaking!
As you say, that might not be the problem, but I always figured when you really wanted a page gone, you just removed it from the server along with all internal links pointing to it, then the SEs wouldn't be able to find the URI, and it would eventually disappear from the serps.
If you do find that removing the redirect solves it, could you post a follow-up in this thread? It might help others with a similar problem.
ADDED: Perhaps the homepage crawl wouldn't have shown in the logs because of the server redirect... I'm not 100% on that.
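If you want to check the logs for that, here is a rough sketch that pulls out crawler requests for one path along with the status code the server returned. It assumes an Apache-style "combined" access log, where a redirected request is normally logged with its 301 status. The sample lines, the IP addresses, and the function name crawler_hits are invented for illustration and are not real log data.

```python
import re

# Pattern for the Apache "combined" log format (an assumption about the
# server setup; adjust it if your log format differs).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Made-up sample lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [10/Dec/2003:06:25:24 -0500] "GET /homepage.html HTTP/1.0" '
    '301 245 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [10/Dec/2003:06:25:25 -0500] "GET /index.html HTTP/1.0" '
    '200 10432 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [10/Dec/2003:07:01:02 -0500] "GET /homepage.html HTTP/1.1" '
    '301 245 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]


def crawler_hits(lines, path, agent_substring="Googlebot"):
    """Return (timestamp, status) for each request to path by a matching agent."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match.group("path") == path and agent_substring in match.group("agent"):
            hits.append((match.group("time"), int(match.group("status"))))
    return hits


if __name__ == "__main__":
    for timestamp, status in crawler_hits(SAMPLE, "/homepage.html"):
        print(timestamp, status)
```

To run it against a real log, pass the open file object instead of SAMPLE. An empty result would suggest the crawler has not requested the old URI in the period the log covers.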