Forum Moderators: Robert Charlton & goodroi

Google's site: query returns bad results - now being fixed

hyphenated domains and trailing slash get incorrect results

         

Adam_Lasnik

10:46 pm on May 20, 2006 (gmt 0)

10+ Year Member



Hey everyone,

My colleague Vanessa just posted a note which should hopefully make your weekend a less stressful one :).

Specifically, she's noted that we've had some issues with our "site:" operator [sitemaps.blogspot.com] not returning appropriate results when used with hyphenated domains... so it's quite likely that much of the falloff you've seen is thankfully illusory.

We've noted similar discrepancies with site: searches using domains with a trailing slash as well.

And yes, Googlers are working to correct this stuff as quickly as possible!

tiori

12:45 pm on May 28, 2006 (gmt 0)

10+ Year Member



All my hyphenated domains are back with many supplemental results. Most cache dates are June 2005 and some as old as 2004.

tigger

12:52 pm on May 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>"Check your sites in the datacenters. I think it's *almost* fixed."

Yes, I'd like to know which DC is showing this "almost fixed" problem.

jpservicez1

5:15 pm on May 28, 2006 (gmt 0)

10+ Year Member



Try these DCs; the results are promising:

64.233.187.104
64.233.171.99
64.233.171.104
64.233.171.107
64.233.171.147
64.233.179.99
64.233.179.104

g1smd

5:28 pm on May 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Those DCs have today been updated with recently spidered data. I changed a page title 2 weeks ago, and despite the new title being cached within just a few days, the Google SERPs continued to show the old title. Today the new title shows in the SERPs for the very first time, but only on some datacentres (including those above).

.

>> This morning Google Sitemaps shows me a 404 HTTP-error for the page: domain.com/1.html calculated on May 22. <<

>> We do not have pages named 1.html, but there is a link to a page like “whatever-1.html” on a page cached on May 21. <<

>> I checked internal and external linking, no links to 1.html <<

Hmm, another problem with hyphens in URLs, on top of the problem with hyphenated domains already acknowledged earlier in the month.

steveb

7:48 pm on May 28, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"All my hyphenated domains are back with many supplemental results. Most cache dates are June 2005 and some as old as 2004."

Yeah, unfortunately this is not progress. Losing a page is only slightly bad since it can come back. Getting "new" (from 2004 no less) supplementals is a terrible thing that Google now shows little sign of being able to truly fix.

gendude

8:07 pm on May 28, 2006 (gmt 0)

10+ Year Member



Things are looking better, but I hope they do not do something like this again without better testing. I still can't believe it made it through whatever QA process they had.

Petra Kaiser

8:38 pm on May 28, 2006 (gmt 0)

10+ Year Member



Did anyone with problems check on this:
- [&#045;]
– [Ctrl] [Num Lock] [minus]
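
For anyone who wants to check this on their own URLs, here is a minimal Python sketch that flags dash characters other than the plain ASCII hyphen (&#045;, U+002D). The function name and the sample URLs are made up for illustration, and the list of look-alike characters is an assumption, not an exhaustive set.

```python
import unicodedata

# Characters that look like a hyphen but are not ASCII hyphen-minus (U+002D).
# This set is illustrative, not exhaustive.
LOOKALIKE_DASHES = {
    "\u2010",  # hyphen
    "\u2011",  # non-breaking hyphen
    "\u2012",  # figure dash
    "\u2013",  # en dash
    "\u2014",  # em dash
    "\u2212",  # minus sign
}


def find_lookalike_dashes(text):
    """Return (index, Unicode name) for every dash that is not U+002D."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in LOOKALIKE_DASHES
    ]


# Hypothetical URLs: the first uses plain hyphens, the second an en dash.
print(find_lookalike_dashes("my-site.example/whatever-1.html"))
print(find_lookalike_dashes("my\u2013site.example/whatever-1.html"))
```

An empty list means every dash in the string is the ordinary keyboard hyphen.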

bigdrummer

9:07 pm on May 28, 2006 (gmt 0)

10+ Year Member



Having gone down to 2 pages, I bounced back to 280-ish, all of which escaped going supplemental. This was on a hyphenated domain, and it picked up a load of new pages recently. All in all, pretty satisfied...

LisaWeber

12:42 am on May 29, 2006 (gmt 0)

10+ Year Member



I also went down to two pages. I had only two pages in the index for a month. Yesterday I had forty-one, all but two of them supplemental, and it was the same an hour ago. I just checked and now I have 75 pages (all) in the index, none supplemental.

Woohoo

g1smd

12:47 am on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Which datacentre is that in?

Do you get a different result in any other datacentres?

LisaWeber

1:04 am on May 29, 2006 (gmt 0)

10+ Year Member



g1, your sticky box is full. How do I know what datacentre I am checking?

BillyS

1:34 am on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>How do I know what datacentre I am checking?

If you're checking Google.com just look at the cache link. You'll see the IP address there...

LisaWeber

2:24 am on May 29, 2006 (gmt 0)

10+ Year Member



ok, the datacentre is 72.14.209.104 and I'm off to check some others now.

What I just discovered that's weird is if I click the cached link of any of these newly indexed pages, they have no cache. I get:
Your search - cache:****:www.*****.com - did not match any documents

eta:

I get 75 with 66.102.7.99 and 216.239.37.104, back to 41 and supplemental with 216.239.53.104, and only 2 with 64.233.161.99 and 64.233.161.104. That's all I'm going to check.
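
One way to make sense of numbers like these is to group the datacentre IPs by the page count they report, so IPs that appear to share the same index state end up together. A rough Python sketch, using only the counts quoted in this post and assuming the first 75-page result came from 72.14.209.104, as the post implies:

```python
from collections import defaultdict

# Page counts per datacentre IP, as reported in the post above.
counts = {
    "72.14.209.104": 75,
    "66.102.7.99": 75,
    "216.239.37.104": 75,
    "216.239.53.104": 41,
    "64.233.161.99": 2,
    "64.233.161.104": 2,
}


def group_by_count(counts):
    """Map each reported page count to the list of IPs showing it."""
    groups = defaultdict(list)
    for ip, n in counts.items():
        groups[n].append(ip)
    # Largest count first, so the most complete index state is listed on top.
    return dict(sorted(groups.items(), reverse=True))


for n, ips in group_by_count(counts).items():
    print(n, "pages:", ", ".join(ips))
```

Three distinct counts suggest three different index states were being served at that moment, though identical counts do not prove the underlying results are identical.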

Jesse_Smith

10:23 am on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>>If you're checking Google.com just look at the cache link. You'll see the IP address there...

I don't think so. For example, Google.com shows 192 results for a domain, and the cache IP datacenter shows 19,600 results.

daveVk

12:10 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The trailing slash case seems to have been fixed; that is, I get the same result with and without it, showing all the sups. Hyphenated domains?

BillyS

12:28 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>I don't think so.

Okay Jesse, with your helpful "I don't think so" post.

Then just go to the command line and ping www.google.com and see what IP you're hitting.
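
The same lookup can be scripted instead of pinging by hand. This is only a sketch: `resolve_ipv4` is a made-up helper name, the `resolver` parameter exists so the function can be tried without a network, and the addresses returned depend on your DNS at that moment.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses reported for hostname."""
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_ipv4("www.google.com")` a few times, some seconds apart, shows which IPs your resolver is handing out. It raises `socket.gaierror` if the name cannot be resolved.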

g1smd

7:10 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Even with Ping, you can get two different IPs just a few seconds apart.

I do all my searches at specific IPs. I do see differences in results even when I switch from 10 to 100 URLs per page. Nothing is 100% guaranteed.

tedster

7:39 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



differences in results even when I switch from 10 to 100 URLs per page.

Except for rare situations, that SHOULD give different results. Any domain that gets 2 urls on the same page of a SERP will see those two results clustered under the highest listing. If you switch to 100 results, the clustering will definitely change and the order of all the urls will shift to accommodate the change. For example, a domain with a #2 and a #99 when you use 10 results per page, now sees a #2 and a #3 when you use 100 results per page.
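
To make the #2/#99 example concrete, here is a toy Python model of that clustering. It is a simplification for illustration only and not Google's actual logic: it assumes every result from the same domain on a page is moved directly under that domain's highest listing, and the domains and URLs are made up.

```python
def cluster_page(page):
    """Group same-domain results under that domain's highest listing on the page."""
    order = []       # domains in order of first appearance on this page
    by_domain = {}   # domain -> urls, kept in their original relative order
    for domain, url in page:
        if domain not in by_domain:
            by_domain[domain] = []
            order.append(domain)
        by_domain[domain].append(url)
    return [(d, u) for d in order for u in by_domain[d]]


def displayed_positions(results, per_page, domain):
    """1-based display positions of `domain` after clustering each page."""
    shown = []
    for start in range(0, len(results), per_page):
        shown.extend(cluster_page(results[start:start + per_page]))
    return [i + 1 for i, (d, _) in enumerate(shown) if d == domain]


# Hypothetical ranking: 100 results, with example.com at raw ranks 2 and 99.
results = [(f"site{i}.test", f"http://site{i}.test/") for i in range(1, 101)]
results[1] = ("example.com", "http://example.com/a")
results[98] = ("example.com", "http://example.com/b")

print(displayed_positions(results, 10, "example.com"))   # [2, 99]
print(displayed_positions(results, 100, "example.com"))  # [2, 3]
```

With 10 per page the two listings never share a page, so nothing moves. With 100 per page they do, and every result that sat between them is pushed down one place.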

g1smd

8:00 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I see the clustering changes all the time on the indented results (with stuff from other pages "moving up"), but sometimes (not recently though) I have seen other changes too.

BillyS

9:08 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>Even with Ping, you can get two different IPs just a few seconds apart.

I realize that, but the poster only wanted to know how she could identify a particular Google.com.

steveb

9:41 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



But Jesse was basically right. Seeing the dc of the cache link doesn't guarantee anything anymore. Two people searching the same datacenter IP can get different results. That's the brave new world.

g1smd

9:55 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sometimes all you need to do is hit Reload on the same IP and you see a different result.

I guess that happens when some of the servers in that cluster have been updated and some have not.

BillyS

10:30 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Look, if you want to talk exception and not the rule fine. I give up, you win.

That being said, why don't you answer the question? When checking Google.com how do you know what DC you're hitting? And don't give the lame "I don't check Google.com, I check by DC." Because you just said that can change too.

In fact according to your logic:

g1smd said "Sometimes all you need to do is hit Reload on the same IP and you see a different result."

steveb said "Seeing the dc of the cache link doesn't guarantee anything anymore."

Checking DCs is useless, right? Then why do you talk about them?

g1smd

10:51 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Accessing google.com takes you to a random IP every time. You will always get different answers just seconds apart. There is no traceability at all.

Accessing a specific IP will give almost the same results day in, day out, with minor fluctuations, and show major differences when compared to some other DC, at any time, with that other DC being quasi-static too.

For instance 72.14.207.99 and 72.14.207.104 showed completely different results to every other IP for several weeks before those results eventually started spreading to other DCs. There were changes on those two IPs from day to day, but they were always different to everything else.

There were other differences elsewhere, and eventually those results changed too.

When you have tracked a range of certain keyword SERPs for several years, even minor changes and minuscule little "patterns" become very obvious.

steveb

11:25 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's not useless, but it is what it is. A single DC can serve up different results. Those different results tend to be more similar than when comparing dc to dc, but things are now more difficult to analyze.

colin_h

11:26 pm on May 29, 2006 (gmt 0)



Can someone please explain why we need to know which DC Google is using for which country code? I'm at a loss as to how it can help us go forward.

All the Best

Col :-)

g1smd

11:51 pm on May 29, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Accessing any sort of google.TLD shows different results from minute to minute because what you get comes from a random datacentre, of which Google has at least 80.

Looking at a single IP by the IP address, allows tracking of results from day to day.

For instance if I look at a certain IP, then a certain site has been #11 for many weeks. If I look at a different IP that same site is at #3 and has been at #3 for nearly 3 years.

If I look at google.com then I see the site randomly bouncing up and down between #3 and #11 - with the specific IP searches I can track the two sets of results independently for a long time.
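
A rough Python sketch of that kind of per-IP tracking follows. The `RankLog` class, the dates, and the positions are all hypothetical, and the IPs are reserved documentation addresses rather than real datacentres. The only point is that each IP keeps its own history instead of being mixed into one google.com series.

```python
from collections import defaultdict


class RankLog:
    """Keep observed positions separate for each datacentre IP."""

    def __init__(self):
        self._by_ip = defaultdict(list)  # ip -> [(date, position), ...]

    def record(self, ip, date, position):
        self._by_ip[ip].append((date, position))

    def history(self, ip):
        """Observations for one IP, oldest first (ISO dates sort correctly)."""
        return sorted(self._by_ip[ip])

    def is_stable(self, ip):
        """True if every observation for this IP shows the same position."""
        positions = {pos for _, pos in self._by_ip[ip]}
        return len(positions) == 1


# Hypothetical observations: one IP holds #11, another holds #3.
log = RankLog()
for day in ("2006-05-27", "2006-05-28", "2006-05-29"):
    log.record("192.0.2.10", day, 11)
    log.record("192.0.2.20", day, 3)

print(log.is_stable("192.0.2.10"), log.history("192.0.2.10")[-1])
print(log.is_stable("192.0.2.20"), log.history("192.0.2.20")[-1])
```

Logged this way, each IP looks steady on its own, while the same observations pooled under google.com would look like random bouncing between #3 and #11.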

icedowl

12:19 am on May 30, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On 66.102.7.99 I've again lost about 200 pages. It shows the cache as 72.14.209.104 with cache dates of 5/24/2006.

When is this see-saw going to level out!?!

Swanson

1:59 am on May 30, 2006 (gmt 0)

10+ Year Member



All I can say is that it is surely not in the interests of the user - and that is what google bang on about all the time.

Well, I can see results change radically by the refresh or by clicking next etc. This has been going on for ages (since bug daddy).

Sooner or later the user will notice they can't find what they looked at previously (I often have customers ring me thinking I am "on google").

The huge difference in live results on google datacentres is a joke - we all know it, that's why there is so much interest in it.

At the end of the day though if your site has been marginalised and you see a few pages left in the index I don't think you are coming back soon.

Maybe if you want the Google traffic now you need to break the rules - or play by them and wait forever in line to be included?

phish

2:19 am on May 30, 2006 (gmt 0)

10+ Year Member



"All I can say is that it is surely not in the interests of the user - and that is what google bang on about all the time"

That concept flew out the window the day after the ipo. The new motto is "Do what ya gotta do to make the stockholders happy".
Honestly, all this stuff about site query and hyphenated domains is simply one "thing" they found that is messed up that they think they can fix. In my honest opinion there are many more of those "one things" out there which may or may not ever get fixed.

-phish

This 132 message thread spans 5 pages.