
Google News Archive Forum

    
I just penalized Google with PR4 from PR10!
believe it or not ... (fun)
jamesyap




msg:87651
 6:43 am on Feb 26, 2003 (gmt 0)

Check this out, a PR4 Google.com ;) Magic from David Copperfield!

Google.com [google.com.]

Take a few seconds and you will figure out why.

... ...

But I never expected that syntax to work! Experts, please explain.

 

pendanticist




msg:87652
 6:46 am on Feb 26, 2003 (gmt 0)

google.com./

FillDeCube




msg:87653
 6:50 am on Feb 26, 2003 (gmt 0)

Guess someone must have accidentally linked to "google.com."

The backlink check shows 453 results.

All my sites show a grey bar if I do the same.

johnraphone




msg:87654
 6:55 am on Feb 26, 2003 (gmt 0)

[yahoo.com....] has a PR3.

jamesyap




msg:87655
 9:03 am on Feb 26, 2003 (gmt 0)

The reason they accidentally link to it is that it is placed at the end of a sentence, such as:

You can use Google.com. <- see the .

So, can someone explain technically how the . works?

cornwall




msg:87656
 9:17 am on Feb 26, 2003 (gmt 0)

FWIW

Google has a selection of language sites eg

[google.com...]

For French, with "es" for Spanish, "eng" for English, etc

The "English" Google you have found with the trailing "dot" is not part of this group, but is not the main Google site either (that does not use the word "English" on it)

yetanotheruser




msg:87657
 10:54 am on Feb 26, 2003 (gmt 0)

So, can someone explain technically how the . works?

Very odd.. Your browser most likely tidies up the URL as best it can before resolving the name to an actual machine, but probably still sends the . when it requests the page, so which site is returned is a matter of which virtual host the server decides to serve.. Perhaps the 'English' version of Google is simply first/last/default in the list of vhosts?

(I can't get our (Apache) servers to accept the trailing . as they just resolve the address without it.) But Google are using their own server, GWS/2.0, so this is all just a big guess ;)

aspdesigner




msg:87658
 12:05 pm on Feb 26, 2003 (gmt 0)

Same reason that if you type the following command in a DOS prompt -

CD .

It looks like nothing happened.

In disk subdirectories, "." and ".." have special meanings. ".." is a shortcut to the parent directory, and "." is a shortcut for the current directory.

Those familiar with DOS should recognize the command "CD .." which will take you up a directory level.

When you use just one period, you are asking for the current directory. So when you ask for this web page -

www.mysite.com/.

You are simply asking for the current directory that you are already on, so you get the home page.
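As a rough illustration of the "." and ".." shortcuts described above, here is a minimal Python sketch that collapses them in a Unix-style path. The sample paths (such as "/windows/..") are made up for illustration, and this only shows the directory convention itself; whether a given web server applies it to a requested URL is a separate question.

```python
import posixpath


def collapse_dots(path: str) -> str:
    """Collapse "." (current dir) and ".." (parent dir) in a Unix-style path."""
    return posixpath.normpath(path)


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from any real site.
    for sample in ["/.", "/windows/..", "/a/./b", "/a/b/../c"]:
        print(sample, "->", collapse_dots(sample))
```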

Google seems to see this as a different URL, thus the difference in PageRank.

Depending on how a site's hosting is set-up, you may be able to use the ".." option in a url as well.

For example, try -

[microsoft.com...]

Rather than giving you a page in their windows subdirectory, you will find that this displays their home page!

yetanotheruser




msg:87659
 12:15 pm on Feb 26, 2003 (gmt 0)

erm.. Not sure I agree.. google.com/. returns a 404.. In fact, AFAIK, obeying ../'s is potentially dangerous, since script kiddies try to use it to access folders outside the server root /

Anyhoo.. in the url the dot comes before the first slash:

google.com./ rather than google.com/.

Still not convinced there's any more truth in my explanation though! ;)

aspdesigner




msg:87660
 1:28 pm on Feb 26, 2003 (gmt 0)

Depends on what OS they are hosting on and how it is set-up.

For example, at Microsoft (which is likely hosted on Windows), this works just fine -

[microsoft.com...]

Google doesn't like this, but then they aren't hosting on Windows either ;)

Also, remember that using a period rather than a slash at the start of a directory path in Windows is perfectly OK.

./filename

means to start from the current directory, and

/filename

means to start from the root directory

It just so happens that on a web site, they both usually point to the same thing!

[edited by: aspdesigner at 1:33 pm (utc) on Feb. 26, 2003]

andreasfriedrich




msg:87661
 1:32 pm on Feb 26, 2003 (gmt 0)

>>>So, can someone explain technically how the . works?

The Domain Name System is a tree-like system. The trailing dot is the root of that domain tree.

See RFC1034 DOMAIN NAMES - CONCEPTS AND FACILITIES [faqs.org] for more information on DNS.

"google.com./" is entirely different from "google.com/." and even "google.com/./". Only in the first example does the dot belong to the authority section of the URL. In the second example, any UA will connect to google.com and request the resource "/.". In the third example, it will indeed request "/", since "./" will just be removed from the URL (RFC2396 - Uniform Resource Identifiers (URI): Generic Syntax [faqs.org]).
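To make the authority/path distinction concrete, here is a small sketch using Python's standard `urllib.parse.urlsplit`. The `http://` scheme is an assumption added so the strings parse as full URLs; the helper name `authority_and_path` is made up for this example. Note that `urlsplit` only splits the URL and does not remove "./" segments.

```python
from typing import Tuple
from urllib.parse import urlsplit


def authority_and_path(url: str) -> Tuple[str, str]:
    """Split a URL into its authority (host) part and its path part."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


if __name__ == "__main__":
    for url in ("http://google.com./", "http://google.com/.", "http://google.com/./"):
        host, path = authority_and_path(url)
        print(f"{url}: authority={host!r} path={path!r}")
```

Only the first URL carries the dot in the authority; in the other two it sits in the path.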

HTH Andreas

[edited by: andreasfriedrich at 1:33 pm (utc) on Feb. 26, 2003]

daisho




msg:87662
 1:33 pm on Feb 26, 2003 (gmt 0)

I don't think it has anything to do with your host. This is a DNS issue (or non issue) nothing more.

Since there is nothing after the ".", it will simply ask the root servers where "com" is, then ask "com" where "google" is, then ask "google" where "www" is, and then go there.
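The walk from the root down can be sketched in a few lines of Python. This only models the naming hierarchy (root, then "com", then "google", and so on); it does not perform any real DNS queries, and real resolvers also rely on caching. The function name `delegation_chain` is hypothetical.

```python
from typing import List


def delegation_chain(hostname: str) -> List[str]:
    """List the zones visited from the DNS root down to the full name."""
    # A trailing dot only marks the root, so it is dropped before splitting.
    labels = [label for label in hostname.rstrip(".").split(".") if label]
    chain = ["."]  # the root zone, written as a single dot
    suffix = ""
    for label in reversed(labels):
        suffix = f"{label}.{suffix}"
        chain.append(suffix)
    return chain


if __name__ == "__main__":
    print(delegation_chain("www.google.com."))
    print(delegation_chain("www.google.com"))
```

Both spellings produce the same chain, which matches the point that the trailing dot makes no difference to DNS.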

Made In Sheffield




msg:87663
 1:34 pm on Feb 26, 2003 (gmt 0)

(for aspdesigner)

Yes but the discussion isn't about a period used in that sense. Like yetanotheruser already said it's google.com./ not google.com/.

wackybrit




msg:87664
 1:37 pm on Feb 26, 2003 (gmt 0)

yetanotheruser is entirely on the money. I found this a few weeks ago when I was researching long IPs and discovered the same thing. I didn't think enough of it to post it.

Let me show you another way.. Google's 'Long IP' is.. 3639554917

This means you can access Google via [3639554917...] which results in the same page as the one with the extra dot :) In this case, however, it gets no pagerank at all ;-)
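For anyone curious how a 'Long IP' maps to the usual dotted form, here is a minimal sketch using Python's standard `ipaddress` module. The helper names are made up for this example; the number is the one quoted above.

```python
import ipaddress


def long_to_dotted(value: int) -> str:
    """Convert a 32-bit integer ('Long IP') to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


def dotted_to_long(dotted: str) -> int:
    """Convert dotted-quad notation back to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


if __name__ == "__main__":
    print(long_to_dotted(3639554917))        # 216.239.51.101
    print(dotted_to_long("216.239.51.101"))  # 3639554917
```

The integer is simply the four octets packed together: 216*256^3 + 239*256^2 + 51*256 + 101.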

Long IPs are often used by spammers to disguise their servers.. but they're legitimate too.

HuhuFruFru




msg:87665
 1:38 pm on Feb 26, 2003 (gmt 0)

WOW, this is great. I just found some nice backlinks for my site with a dot at the end of my domain! I'd better write to them to change it.

thank you jamesyap!

cornwall




msg:87666
 1:41 pm on Feb 26, 2003 (gmt 0)

Since there is nothing after the ".", it will simply ask the root servers where "com" is, then ask "com" where "google" is, then ask "google" where "www" is, and then go there.

That's not right

[google.com...] which is plain Google

is a completely different page to

[google.com....] which is Google "English"

aspdesigner




msg:87667
 2:20 pm on Feb 26, 2003 (gmt 0)

Made_In_Sheffield, you will note that both my first post, as well as yetanotheruser's response, discussed both ways. I was responding to his comments about server errors with /. as well as the issue of ./ in directory paths.

cornwall is quite correct, the "." is being passed to and resolved by the web server, as these two URLs display DIFFERENT pages!

andreasfriedrich




msg:87668
 2:38 pm on Feb 26, 2003 (gmt 0)

>>cornwall is quite correct, the "." is being passed to
>>and resolved by the web server

Yes and no. Of course the trailing dot (as part of the host name) will be passed to the web server in the host field of the request header. And the web server does use that field to do something like resolving.

Both google.com and google.com. resolve to the same IP address (216.239.57.100 or 216.239.53.101), since they are the same name. Getting one of two addresses is probably the result of some round-robin DNS load balancing.

When you request a page, the browser will connect to the web server and then request a certain URL. The resolving of the authority section of a URL is done on the client side, or by the client sending a request to its domain's NS, which will then query other name servers when it is not able to resolve the domain name itself.

>>as these two URLs display DIFFERENT pages!

The effect you are seeing is Google's geo-targeting at work.

Andreas

daisho




msg:87669
 3:05 pm on Feb 26, 2003 (gmt 0)

As andreasfriedrich said the trailing "." is part of the hostname.

In DNS I believe (but am not absolutely sure) that the trailing "." is implied. It is the root of DNS and it is what the root servers control. Under that there are "com", "org", "net" and all the other TLDs (top-level domains), and it goes down from there.

So there is *NO* difference between "www.google.com." and "www.google.com" as far as DNS goes. As a matter of fact, if you do "dig www.google.com +trace" it will add the trailing "." in the results, because it is implied.

My guess on the PR difference is that Google's PR has nothing to do with DNS. It's a string match; in that context "www.google.com" != "www.google.com.", and therefore as far as PR goes they are different sites.
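The string-match guess can be illustrated with a short Python sketch. This is not how Google actually computes PageRank (nobody in this thread knows that); it only shows how counting links by raw host string splits them, while stripping the trailing dot first merges them. The link list and the `canonical_host` helper are made up.

```python
from collections import Counter


def canonical_host(host: str) -> str:
    """Lower-case a host name and drop one trailing root dot, if present."""
    host = host.lower()
    return host[:-1] if host.endswith(".") else host


# Made-up sample of host names found in inbound links.
LINKS = ["www.google.com", "www.google.com.", "www.google.com"]

raw_counts = Counter(LINKS)
merged_counts = Counter(canonical_host(h) for h in LINKS)

if __name__ == "__main__":
    print(raw_counts)     # two separate "sites" when compared as plain strings
    print(merged_counts)  # one site once the trailing dot is normalized away
```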

Nonetheless, it is quite interesting to see the different PRs.

jamesyap




msg:87670
 4:38 pm on Feb 26, 2003 (gmt 0)

What I guess is

You type google.com. into the browser. Your computer resolves the version without the period and gets the IP. (In other words, the DNS server treats both versions the same.)

But THEN virtual hosting comes in. When the browser requests data from the IP, it sends the web server the domain name, which is google.com., so the server knows which web site to display. (This is all about name-based virtual hosting: same IP, hosting different web sites.)

So I guess google.com. will return the web server's default site, which is the ENGLISH version of Google. I fully agree with 'yetanotheruser'.
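The default-site guess above can be sketched as a simple lookup. This is only a toy model of name-based virtual hosting, assuming an exact match on the host name with a fallback; the table contents and the names `VHOSTS`, `DEFAULT_SITE` and `pick_site` are hypothetical, and nothing here is known about how Google's GWS really chooses a page.

```python
from typing import Dict

# Hypothetical virtual host table: host name -> site to serve.
VHOSTS: Dict[str, str] = {
    "google.com": "main site",
    "www.google.com": "main site",
    "www.google.fr": "French site",
}
DEFAULT_SITE = "default (English) site"


def pick_site(host_header: str, vhosts: Dict[str, str], default: str) -> str:
    """Pick a site by exact host name match, falling back to the default."""
    name = host_header.split(":")[0].lower()  # ignore any ":port" suffix
    return vhosts.get(name, default)


if __name__ == "__main__":
    for host in ("google.com", "google.com.", "www.google.fr"):
        print(host, "->", pick_site(host, VHOSTS, DEFAULT_SITE))
```

Because "google.com." is not in the table as written, it falls through to the default, which is the behaviour being guessed at.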

For HuhuFruFru: the . must appear in the href attribute of the link.

aspdesigner




msg:87671
 4:55 pm on Feb 26, 2003 (gmt 0)

I have been able to verify that the "." is NOT passed as part of the domain name when attempting to do DNS resolution, but is only sent to the server after the IP address has been resolved without the ".", as part of the URL/path.

I have a DNS proxy installed on my machine. It allows me to view all DNS requests made by my browser.

When I tried to view the web page -

www.google.com.

My browser makes a DNS resolution request for this domain -

www.google.com

with no period at the end. The period is not included in the DNS request (whether serviced locally from cache or sent to an external DNS Server), and it is not used to determine the host address.

The extra "." is only included as part of the page request that is sent to the web server after the host address has been resolved.

daisho, I believe you may be correct. It is possible that the browser may be stripping-off the "." as redundant, thus it would never issue a DNS request for "www.google.com." as it would interpret it simply as a request for "www.google.com"

As it is never seen by the DNS, it would be ignored except in cases where the server is specifically examining the entire URL, such as what might happen with virtual web servers that share the same IP.

Google appears to be using a kind of "virtual server" approach. It appears that Google is using domain name DNS resolution only to direct you to the appropriate data center, but using a virtual server approach to direct you to the proper language, by examining what URL you typed. For example, www.google.de, www.google.fr and www.google.co.uk all appear to be CNAME aliases of www.google.com, they all would have the same IP address!

It appears that the web page displayed when you try www.google.com. (with the extra period) is actually the home page for "www.google.us"

With regards to the difference in PR, this is not unexpected, as Google is already known to do this in other circumstances. For example, try a search for -

link:yahoo.com

This is why it is a good idea to make sure that all of your inbound links are consistent (i.e. - www.mysite.com vs. mysite.com) in order to avoid splitting your PR into two "different" sites.

andreasfriedrich




msg:87672
 5:30 pm on Feb 26, 2003 (gmt 0)

>>where the server is specifically examining the entire URL

The problem is that the server never sees the entire URL. In the request the browsers will only give the absolute path. To allow for name based virtual hosting the hostname is included in the host header field. See Pointing multiple domain names to main site without mirrors - How to do this without hosting them separately and using 301s? [webmasterworld.com] for an in-depth discussion on virtual name based hosting.
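To show what actually goes over the wire, here is a minimal Python sketch that builds the text of an HTTP/1.1 request from a URL. It assumes a client that passes the authority through unchanged into the Host header; the function name `build_request` and the `http://` scheme are assumptions for the example, and no network connection is made.

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Build the raw text of a minimal HTTP/1.1 GET request for a URL."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path is requested as "/"
    if parts.query:
        path += "?" + parts.query
    # The request line carries only the path; the host name travels separately.
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


if __name__ == "__main__":
    print(build_request("http://www.google.com./"))
```

The server can of course reassemble the pieces, but the host name and the path arrive as two separate fields, and the trailing dot lives in the Host field.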

Andreas

aspdesigner




msg:87673
 6:29 pm on Feb 26, 2003 (gmt 0)


The problem is that the server never sees the entire URL. In the request the browsers will only give the absolute path.

Not so on our servers. We have access to pretty much everything in the URL from various server variables, including the domain name, both the relative AND absolute paths, as well as any parameters. About the only thing that we might not see is intra-page anchor points in a URL, and that is only with some browsers.

Perhaps you need better hosting?

With regards to the other thread you mentioned, I took a look at it, but it seemed to be mostly about domain aliasing ("parking") rather than virtual hosting.

However, I would expect that Google implemented this using something much more sophisticated than the simplistic approach described there. My intent was simply to compare their implementation of the foreign google sites to virtual hosting only as to the end result - serving-up different content from the same IP based on the domain name in the URL.

andreasfriedrich




msg:87674
 6:56 pm on Feb 26, 2003 (gmt 0)

>>Perhaps you need better hosting?

This is not about hosting but about the way the HTT protocol works.

See RFC2616 - Hypertext Transfer Protocol -- HTTP/1.1 [faqs.org].

Andreas

taxpod




msg:87675
 7:08 pm on Feb 26, 2003 (gmt 0)

I lost interest in this thread as I just thought it was a novel trick. Now the thread has come back up. The interesting thing to me is that this little trick seems to work on every domain I try it with. When I try this with Google.com./ and then go after the cache, I get a really weird result.

What is going on here?


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved