2by4 - 8:17 pm on Oct 3, 2005 (gmt 0)
"G needs to sort this out." Exactly. Basically, all the recent google tweaks have resulted in my needing to do an unprecedented amount of google-specific tweaking just so it can handle the data I give it: www rewrites, index.htm -> / rewrites, full explicit 301s for moved pages to avoid possible dupe penalties, and more stuff that's too boring to talk about. The overall effect is that I now have to consider google's requirements before I even start recoding a site, and build them in from the beginning.
This has nothing to do with my information, it's a direct result of me having to organize my information for google. In other words, google is no longer able to 'organize the world's information', it needs me to do its work for it. Google Sitemaps are an especially obvious example of this failure.
I can accept this, but it's getting ridiculous how much I have to do to make sure my content does not trip some filter or other. This is a failure as far as I'm concerned, and on a fairly deep level. I am not talking about spam here. I'm talking about making sure google doesn't think I'm presenting dupe content when I'm not, things like rewriting all index pages to / and so on, since google by itself seems to be requesting pages that are not even linked to, things like testing /
But here's one very recent thing I saw on a site. It's properly search engine friendly, and google had the pages indexed at roughly 2x the actual total for a year, but recently it decided to increase the total page count to roughly 50x the actual total. Literally. The only way this could have happened is if it's including each and every link that is blocked by robots.txt in the site's total page count. I've seen this behavior on a few different sites now. It's very recent, a few weeks at most.
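For anyone who hasn't set these up, here is a rough sketch of the www and index.htm -> / rewrites mentioned above, written as a plain Python function that decides where a request should be 301'd. The host name, the list of index file names, and the function name are all made up for illustration. In practice this logic would more likely live in the web server's rewrite rules, so treat it as a description of the mapping, not a recommended implementation.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values for illustration only; not taken from any real site.
CANONICAL_HOST = "www.example.com"
INDEX_NAMES = ("index.htm", "index.html")


def canonical_redirect(url):
    """Return the URL a request should be 301'd to, or None if already canonical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"

    # www rewrite: send the bare domain to the www host.
    if host == CANONICAL_HOST[len("www."):]:
        host = CANONICAL_HOST

    # index.htm -> / rewrite: strip a trailing index file name.
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment in INDEX_NAMES:
        path = path[: -len(last_segment)]

    # Fragments never reach the server, so they are dropped here.
    target = urlunsplit((parts.scheme, host, path, parts.query, ""))
    return None if target == url else target
```

With this mapping, both http://example.com/index.htm and http://www.example.com/index.htm collapse to a single address, so only one version of each page is left to be indexed.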
Interesting observations, caveman. This explains an oddity I've been watching for about 4 months. I did a small site, then expanded it without completely filling out the new content pages (too lazy, figured it's easier to create the pages all at once with filler on them and pad them out later than to hold them all back and add them in later). Google has steadfastly refused to spider anything but the major index pages of the site. No penalty or anything, it just won't run through them. Not a big deal in this case, small site, small client, but interesting anyway as a case study.
Your observations also make me wonder about a roughly 5-8 place drop we saw for a single keyword on one collection of sites. We've been trying to figure out the cause, but nothing stands out, since the sites rank exactly as before for all other major keywords. I have to wonder though, not sure it's related.
Anyway, on sites where I've written all the content, I'm not seeing any such dupe content problems, obviously. But on other sites, I have to wonder. There are a lot of pages, and I haven't really read them to see how repetitive they are, or whether we've accidentally reused articles etc. It's quite possible.
[edited by: 2by4 at 8:25 pm (utc) on Oct. 3, 2005]
"In any case, I refuse. Talk about rigging sites simply for the purpose of ranking. That's exactly what we're NOT supposed to be doing.