Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 249 message thread spans 9 pages (this is page 5).
Pages Dropping Out of Big Daddy Index
Part 2

 7:59 pm on May 8, 2006 (gmt 0)

Continued from: [webmasterworld.com...]

internetheaven, you said:

I had 20,300 pages showing for a site:www.example.com search yesterday and for the past month. Today it dropped to 509 but my traffic is still pretty constant. I normally get around 4,500 - 5,000 to that site per day and today I've already got 4,000.

So, either Google doesn't account for even a small percentage of my traffic (which I doubt) or the way Google stores information about my site has changed. i.e. the 20,300 pages are still there, Google will only tell me about 509 of them. As far as I can tell, I think the other pages have been supplemented.

That resonated with something that I was talking about with the crawl/index team. internetheaven, was that post about the site in your profile, or a different site? Your post aligns exactly with one thing I've seen in a couple ways. It would align even more if you were talking about a different site than the one in your profile. :) If you were talking about a different site, would you mind sending the site name to bostonpubcon2006 [at] gmail.com with the subject line of "crawlpages" and the name of your site, plus the handle "internetheaven"? I'd like to check the theory.

Just to give folks an update, we've been going through the feedback and noticed one thing. We've been refreshing some (but not all) of the supplemental results. One part of the supplemental indexing system didn't return any results for [site:domain.com] (that is, a site: search with no additional terms). So that would match with fewer results being reported for site: queries but traffic not changing much. The pages are available for queries matching the supplemental results, but just adding a term or stopword to site: wouldn't automatically access those supplemental results.

I'm checking with the crawl/index folks if this might factor into what people are seeing, and I should hear back later today or tomorrow. In the meantime, interested folks might want to check if their search traffic has gone up/down by a major amount, and see if there are fewer/more supplemental results for a site: search for their domain. Since folks outside Google couldn't force the supplemental results to return site: results, it needed a crawl/index person to notice that fact based on the feedback that we've gotten.

Anyone that wants to send more info along those lines to bostonpubcon2006 [at] gmail.com with the subject line "crawlpages" is welcome to. So you might send something like "I originally wrote about domain.com. I looked at my logs and haven't seen a major decrease in traffic; my traffic is about the same. I used to have about X% supplemental results, and now I hardly see any supplemental results with a site:domain.com query."

I've still got someone reading the bostonpubcon email alias, and I've worked with the Sitemaps team to exclude that as a factor. The crawl/index folks are reading portions of the feedback too; if there's more that I notice, I'll stop by to let you know.

[edited by: Brett_Tabke at 8:07 pm (utc) on May 8, 2006]



 9:04 am on May 12, 2006 (gmt 0)

Thanks, Google Guy.
I received a response from bostonpubcon2006 at gmail.com.
I have filled out the reinclusion request as you told me, and I hope that, as you said, my page will get a review from the team.

Again thanks a lot.

[edited by: tedster at 4:54 pm (utc) on May 13, 2006]


 11:21 am on May 12, 2006 (gmt 0)

I too received a reply from bostonpubcon2006 at gmail.com today. Thank You. Much appreciated.

My situation stands better now than it has since this site's issues began last September.

Until the current situation of missing pages started, my site was fully indexed but only a handful of pages were ranking.

Then 1100 pages went missing and the remaining pages returned to their pre-September rankings. A few days after emailing the bostonpubcon2006 at gmail.com address, my site's pages started returning.

According to today's reply I have "939 pages listed".

When I do a site: search on Google using the 4 variants suggested, they all return 10,300 pages. Of course the site doesn't have that many pages; it has around 1300, and up to 999 there are no supps.

Other points to note from the email:
"This suggests to me that the situation is currently self-correcting"
"I've verified that your site has not been manually penalized"

I hope my mess and slow improvements give others a little hope in these times of trouble.

Thanks again to whoever took the time to look at my site and reply.

Edit to add that most of the reindexed pages are back to pre-September positions.


 1:30 pm on May 12, 2006 (gmt 0)

How long did it take you guys to get a response from them?


 1:57 pm on May 12, 2006 (gmt 0)

about 3 weeks for mine - but it really didn't tell me much


 2:34 pm on May 12, 2006 (gmt 0)

On my sitemap diagnostics I got a message from Google, "you have more than 10 pages with HTTP errors you may take a look at this"

4 out of the 10 URLs have a %22 appended, for example:

www.widget.com/super-widgets.html%22 and that is the reason we get a 404 error.

What could be the problem?

Would anyone know if this has anything to do with the site not getting fully indexed?


 2:55 pm on May 12, 2006 (gmt 0)

%22 is a " (quote) I believe. Not sure if that helps you figure it out, or not.
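A quick way to confirm that reading is to decode the reported URL. This is only a minimal sketch using Python's standard `urllib.parse`; the URL is the example from the post above, and nothing here says where the stray character came from.

```python
from urllib.parse import quote, unquote

# Example URL from the post above, with the stray %22 on the end.
reported = "www.widget.com/super-widgets.html%22"

# Percent-decoding turns %22 back into the character it stands for.
decoded = unquote(reported)
print(decoded)

# Going the other way: a literal double quote is encoded as %22.
print(quote('"'))
```

If the decoded URL ends in a double quote, the crawler most likely picked up a quote character that was treated as part of a link's address.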


 3:32 pm on May 12, 2006 (gmt 0)

May be a little off the subject, but I noticed in my AWStats and Google Sitemaps that Google is looking for pages that do not exist. The site is strictly PHP based, yet Google Sitemaps is looking for some odd URLs.

They were all returned May 7th and 9th as 404 not found. I have no idea why or where Google got these URLs. They are not in our site map.
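For anyone who wants to see exactly which URLs Googlebot is requesting and getting a 404 for, the raw access log can be scanned directly. The sketch below is an illustration only: it assumes an Apache-style "combined" log format, the function name `googlebot_404s` is made up, and the sample lines (including the IP addresses) are invented for the example.

```python
import re
from collections import Counter

# Assumes the Apache "combined" log format:
# ip - - [date] "METHOD path PROTO" status bytes "referer" "user-agent"
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_404s(lines):
    """Count 404 responses per path for requests whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        m = LINE.search(line)
        if m and m.group("status") == "404" and "Googlebot" in m.group("agent"):
            hits[m.group("path")] += 1
    return hits

# Invented sample lines, for illustration only.
sample_log = [
    '192.0.2.10 - - [07/May/2006:10:00:00 +0000] "GET /super-widgets.html%22 HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [07/May/2006:10:00:05 +0000] "GET /index.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.20 - - [09/May/2006:08:15:00 +0000] "GET /missing.php HTTP/1.1" 404 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

result = googlebot_404s(sample_log)
for path, count in result.most_common():
    print(count, path)
```

The user-agent string can be spoofed, so this only shows requests that claim to be Googlebot.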



 3:51 pm on May 12, 2006 (gmt 0)

%22 is a missing ". I found ours; it is a coding problem. I am not sure whether it is an error or not, as ours just came up as well. We had used a ' instead of a ", so look for that as the root of the error.
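If mismatched quotes are the cause, they can be hunted down in the page source. The following is a rough sketch, not a real HTML parser: the function name `suspicious_hrefs` and the sample markup are made up, and it assumes href values contain no spaces.

```python
import re

# Grab the raw token after href= up to the next whitespace or '>'.
HREF_TOKEN = re.compile(r'href\s*=\s*([^\s>]+)', re.IGNORECASE)

def suspicious_hrefs(html):
    """Return raw href tokens whose quoting looks unbalanced."""
    flagged = []
    for token in HREF_TOKEN.findall(html):
        quotes = [ch for ch in token if ch in "\"'"]
        # Balanced means: no quotes at all, or exactly one matching pair
        # wrapping the whole value.
        balanced = (
            not quotes
            or (len(quotes) == 2 and token[0] == token[-1] and token[0] in "\"'")
        )
        if not balanced:
            flagged.append(token)
    return flagged

# Made-up markup showing two good links and two broken ones.
sample = '''
<a href="good.html">ok</a>
<a href='also-good.html'>ok</a>
<a href=super-widgets.html">missing opening quote</a>
<a href='mixed.html">mismatched quotes</a>
'''

for token in suspicious_hrefs(sample):
    print(token)
```

A link flagged this way is one where a crawler could plausibly read the stray quote as part of the URL and request it as %22.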


 3:55 pm on May 12, 2006 (gmt 0)

There is another post on this in the Google area, and I see it as well. There was a forum site and a PHP site; I am plain .htm, so no database, and I have the error as well. Looks to be generated by Google, as I can't explain it either.

Hope Google Guy looks into this, as it seems widespread.


 2:14 pm on May 13, 2006 (gmt 0)

"how long did it take you guys to get a response from them? "

2 days.


 2:42 pm on May 13, 2006 (gmt 0)

I see the %22 problem too, but I searched my page source and found no problem on my side.

PS, if you are not using Sitemaps yet, I recommend you start now.


 6:43 pm on May 13, 2006 (gmt 0)

As of today, Google has dropped 1490 pages that I deleted in October '05 and that had stayed in the index until yesterday. Real pages are still coming back slowly, about 2-5 pages a day.


 7:34 pm on May 13, 2006 (gmt 0)

I received a response from bostonpubcon2006 approximately 2 weeks after I sent an email. There were no comments regarding my site (which I already knew was fine - no spam, duplicate content etc.) Other than asking me to consider creating a site map the email simply asked me how many pages I previously had indexed. I greatly appreciated hearing anything back from bostonpubcon2006 but I am reluctant to try sitemaps - see:


 6:40 am on May 14, 2006 (gmt 0)

Try sitemaps! Try sitemaps! That's all it seems to come back and say. Funny, you'll never guess what my email said: "try sitemaps".


 6:50 am on May 14, 2006 (gmt 0)

Do you really, truly want the solution to this problem?

The sooner webmasters take G's crap affiliate program (adsense) off their sites, the sooner this whole mess will be solved.

Don't believe me?

Get 10,000+ webmasters to remove their adsense codes today and I'll bet this "impossible" issue will be resolved by sometime next week.

As soon as webmasters look at the bigger picture, (honestly is a couple weeks of missed adsense income REALLY hurting you/us?), the sooner we get our voices heard.


 7:32 am on May 14, 2006 (gmt 0)

Try sitemaps! Try sitemaps! That's all it seems to come back and say. Funny, you'll never guess what my email said: "try sitemaps".

It's answers like this that indicate how messed up things are over at Google. You never needed sitemaps before, but we messed things up so bad that now the only way into our index is a special tool that still might not help you. Or are they implying that to be indexed you HAVE to use sitemaps, but they phrase it in such a way that it doesn't sound like extortion? I think that if you didn't need a Google sitemap for 10 years, then it's someone else's screw-up. Google staff are no doubt working hard to fix what they totally blew it on, but these emails are insulting. It's not a good strategy unless you are intending to piss everyone off. If that's the plan, great. If it's not, then they need to change the automatically generated email to something like, "Sorry we F'd up so bad. Please be patient while we do our jobs. Don't waste your time fixing things that aren't your fault."


 7:34 am on May 14, 2006 (gmt 0)


Thanks. That's a very interesting post you bring to our attention:


Especially this part of the post:

"There are a few things to consider about our overall crawl and indexing pipeline. As part of some recent updates (http://www.mattcutts.com/blog/bigdaddy/) we're taking a much closer look at affiliate links, linkfarms, duplicate content, and other factors as described in our webmaster quality guidelines"


 6:03 pm on May 14, 2006 (gmt 0)

I have just checked my rankings. Hard drops for pages with detailed widget information.
The main pages rank very well for 1 and 2 keyword searches. But that's not what people are looking for. They make 3 keyword searches, and that 3rd keyword is only present on the detailed widget information pages. Even if they were in supp hell, it would be better than not being in the index.

Is it the same for you?


 7:03 pm on May 14, 2006 (gmt 0)

"we're taking a much closer look at affiliate links, linkfarms, duplicate content, and other factors as described in our webmaster quality guidelines"

We have links to affiliate sites, is that in any way affecting us?


 10:04 pm on May 14, 2006 (gmt 0)

Nofollow all affiliate links. Although I doubt that will fix it and we shouldn’t have to do that. Affiliate marketing is a very useful and appropriate thing for site owners and users.


 10:16 pm on May 14, 2006 (gmt 0)

All of our affiliate links have a javascript:void(0); for the bots not to follow the link.

However, is there still a possibility that Google is not listing all of our pages for this reason?


 10:30 pm on May 14, 2006 (gmt 0)

Is Google spidering the sites you are having problems getting into the index? Also, has the spidering frequency increased, decreased or stayed steady? How many bot visits do you get on average daily, and how many total pages do you have?


 10:35 pm on May 14, 2006 (gmt 0)

Spidering hasn't been a problem. Googlebot is still spidering at the same frequency. It's seeing those pages; it's the site:domain.com search that's not showing them.


 11:33 pm on May 14, 2006 (gmt 0)

I think that Google meant that affiliates using approved copy might suffer a duplication penalty. That was how I read it in the webmaster guidelines and that makes perfect sense. Penalizing web pages with affiliate banners or links just because they have them would be terribly unfair and downright evil.


 11:37 pm on May 14, 2006 (gmt 0)

But what's interesting is we have 24 pages listed from our site. Some pages are affiliate pages linking to our affiliate site, others are internal pages with our own shopping cart.

I am trying to figure out why Google has chosen these 24 pages out of 440 pages.

What's wrong with the rest?


 11:52 pm on May 14, 2006 (gmt 0)

I agree. Basically, Google doesn't want 1000 sites in the index selling the same Pocahontas shirt with the same Disney description.

How many do you have on Google now, and how many do you have in total? Are the pages increasing or decreasing, and at what rate?

I ask because as Google increased its crawling I noticed more and more pages appearing.


 12:36 am on May 15, 2006 (gmt 0)

I don't think the dropping of pages at this stage can be too closely related to the duplicate content issues described above [ although in time - when things are clearer it may reveal itself as an issue ].

There are 100s of webmasters with perfectly legitimate sites using original and interesting content in trouble.

Even Google knows that webmasters and good site owners in general do not adhere to their quality guidelines, and to cut them out completely would damage their search product's access to interesting information for users.

The overall problem is still suspended at an earlier stage with the DC's being monitored and tweaked [ and yes we webmasters are part of the testing phase ] to produce accurately what Google intends.

We still have to see the reliable effects of backlinks onto fully indexable pages. We're not even close to a solution. If the pages aren't even indexed properly, how long do you think it will be before the backlinks start to work?

2 months, 3 months, 6 months?

And then you have issues with unrelated or old pages [ supps ] still to be sorted.

When this is all sorted, then one can look at the on-page content and say, this isn't appearing because of these reasons.

But how I'd love to be wrong!


 1:11 am on May 15, 2006 (gmt 0)


Thank you so much for the feedback.

I guess we've got no choice..but wait. Was just trying to figure out if..maybe..maybe..there is something wrong with our site and that might be the cause..

Haven't come up with anything on our end..I guess we have done a lot..We will not give up..Hopefully we will succeed..


 6:05 am on May 15, 2006 (gmt 0)

I think that Google meant that affiliates using approved copy might suffer a duplication penalty.

NO - Google meant downright affiliate links.

"...we're taking a much closer look at affiliate links, linkfarms, duplicate content, and other factors as described in our webmaster quality guidelines..."

You'll have to use their crappy adwords program and pay up if you want to promote affiliate links...and they can freely channel it all over the web via their spammy MFA affiliates. As long as you pay there are no penalties or crawling errors.

No free affiliate links promotion for you (Soup Nazi - Seinfeld)


 6:29 am on May 15, 2006 (gmt 0)

>we're taking a much closer look at affiliate links

It would be nice if we could get a little bit more clarification on what they mean here. Are they just talking about affiliate sites that are just full of banners with no content, or, as in my case, a site with 40% affiliate content mixed in with my own unique content?

After all, if you've got a site selling, let's say, books, unless you're going to read every review you are going to have to use snippets from the affiliate. But if that snippet is in between your own content, with additional information from you, why is that so bad?


 7:51 am on May 15, 2006 (gmt 0)

That affiliate statement is very broad. There is no way Google is going to kick out all affiliates as that would be financial suicide for them. Plus the user experience would become horrendous.

On a side note, my site: command shows no change in the # of pages and I have checked to make sure some of my newer pages weren't indexed. Note that they are not in supplemental so there is no way they were just penalized either.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved