The site has a PR 6 with 200 back links, which is good compared to most of the competition, but I don’t think it is worthy of #1. Naturally I wanted to find out how they did it.
Looking at xsite’s source in a browser there was nothing untoward. Loads of JavaScript and style sheet includes, typical of a large corporation site. In fact there was nothing in the code to suggest that the developers had SEO in mind, or cross browser compatibility, when developing.
Intrigued, I investigated further and used this site’s spider simulator:
[searchengineworld.com...]
The spider returned status 200 but picked up nothing in the page.
I then used another spider simulator. This time it returned this code:
<html>
<head>
<script language="javascript">
function doRes () {
document.screen.resolution.value = (screen.width + 'x' + screen.height)
document.screen.submit();
}
</script>
</head>
<body onLoad="javascript:doRes();">
<form name="screen" action="/index.asp?" method="post">
<input type="hidden" name="resolution" value="">
</form>
</body>
</html>
Viewing the page with an older browser with JavaScript disabled returns the above code. Refreshing the page gave the correct content (though messy).
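To illustrate why the first simulator "picked up nothing", here is a minimal sketch of what a client that does not execute JavaScript could extract from the sniffer page. The `visibleText` helper is hypothetical and its regex-based tag stripping is deliberately crude; a real spider uses a proper HTML parser, so treat this as an approximation only.

```javascript
// Crude stand-in for a spider that does not execute JavaScript:
// drop <script> blocks, then every remaining tag, and keep the text.
// (Hypothetical helper; not how any particular search engine works.)
function visibleText(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, " ") // scripts are never run
    .replace(/<[^>]+>/g, " ")                             // strip remaining tags
    .replace(/\s+/g, " ")                                 // collapse whitespace
    .trim();
}

// The sniffer page as returned by the second spider simulator.
const snifferPage = `<html>
<head>
<script language="javascript">
function doRes () {
document.screen.resolution.value = (screen.width + 'x' + screen.height)
document.screen.submit();
}
</script>
</head>
<body onLoad="javascript:doRes();">
<form name="screen" action="/index.asp?" method="post">
<input type="hidden" name="resolution" value="">
</form>
</body>
</html>`;

// Prints "" because the page has no text outside the script and the hidden form.
console.log(JSON.stringify(visibleText(snifferPage)));
```

With JavaScript never run, the form is never submitted, so there is no text and no link for such a client to follow.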
I wondered if the screen.width + 'x' + screen.height was a crude method of determining whether the viewer was a spider or a browser. I tried to fool the script by doing a GET to the script rather than a POST.
My inquisitiveness has got the better of me, hence the post. I’m just wondering whether the developers have used a method that inadvertently gives food for Google. I would really like to see the ASP source to see what’s going on, but I realise this is just about impossible.
What’s bugging me is that if a spider is to GET this site’s content, all that must be returned is the code above. Is this correct? To the best of my knowledge, spiders do not read JavaScript and they also cannot do POSTs. Is this correct?
Any thoughts on these matters appreciated.
I use something like this to log the JavaScript 'document.referrer' with the form submit:
<form method="POST" action="http://mydomain.com/">
<input type="text" name="search" size=80>
<script TYPE="text/javascript">
document.write("<input type=\"hidden\" name=\"REF\" value=\""+document.referrer+"\">");
</script>
<br>
<input type="submit" name="submit1" value="Submit Request">
<input type="reset" value="Reset Request">
</form>
The reason is that the server doesn't always get the 'HTTP_REFERER',
and this gives it another go if JavaScript is enabled.
On another page I log the screen resolution using a JavaScript include file
(the script 'SRC' attribute), along with 'document.URL', 'navigator.userAgent',
'navigator.platform', 'screen.colorDepth' or 'screen.pixelDepth', and 'document.referrer'.
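As a rough sketch of that kind of include file (not the actual script in use here), the values could be gathered into a query string like this. The `buildLogQuery` name, the field names, and the sample values are all assumptions for illustration; in a page you would pass the real `document`, `navigator` and `screen` objects.

```javascript
// Hypothetical helper: build a logging query string from browser-like objects.
// In a page: buildLogQuery(document, navigator, screen).
function buildLogQuery(doc, nav, scr) {
  const fields = {
    url: doc.URL,
    ref: doc.referrer,
    ua: nav.userAgent,
    platform: nav.platform,
    res: scr.width + "x" + scr.height,
    depth: scr.colorDepth || scr.pixelDepth, // whichever the browser provides
  };
  return Object.keys(fields)
    .map(function (key) {
      const value = fields[key] == null ? "" : String(fields[key]);
      return encodeURIComponent(key) + "=" + encodeURIComponent(value);
    })
    .join("&");
}

// Stand-in values so the sketch runs outside a browser.
const query = buildLogQuery(
  { URL: "http://mydomain.com/page.html", referrer: "" },
  { userAgent: "ExampleBrowser/1.0", platform: "Win32" },
  { width: 1024, height: 768, colorDepth: 32 }
);
console.log(query);
```

A client that never runs the script never sends this string, which would fit with logs like these showing nothing from search engine spiders.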
None of my JavaScript logs ever show any info from Google
or any se...
GeorgeGG
So to confirm, Google doesn't read JavaScript. The site is doing some kind of sniffing.
With reference to the cached versions, the site has two entries in Google SERPs, #1 and #2.
The #1 cached version is the actual site. That is, no optimisation, html code typical of a large corporation. How is Google managing to get to the content of the page? As stated before, viewing the site with JavaScript disabled returns the Screen Resolution Sniffer. Refreshing the same page gave the correct content, incorrectly rendered.
The #2 cached version goes to the normal Google Cached Header then the page redirects to the site. Viewing Source of the cached version shows the Google Cache Header and the Screen Resolution Sniffer.
I just don't think they had SEO in mind when developing, and I doubt the company is spider sniffing. Does Googlebot send two GET requests, the first returning the sniffer and a status 200, and then a second GET for the content?
I still find it difficult to understand how Googlebot is managing to index their site, let alone rank it #1. Any thoughts appreciated.
So, how has Google spidered and cached the page? The only way I can see this working is if the bot sends two GET requests.
First GET: the server returns the Screen Size Sniffer.
Second GET: something like
if referrer = self
{
    if posted query string present
        parse the screen size from the posted query string
        serve content matching screen size
    else
        a bot or browser with no JavaScript
        serve default content
}
This will only work if bots send out two GETs to index a site. Is this normal?
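The guess above can be written out as a small runnable sketch. Nothing here is taken from the real site's ASP source, which is not available; `chooseResponse`, the request shape, and the rules are all assumptions that simply mirror the pseudocode.

```javascript
// Hypothetical decision logic for /index.asp, following the guess above.
// Nothing here comes from the real site; names and rules are assumptions.
function chooseResponse(request) {
  const resolution = (request.form && request.form.resolution) || "";

  // A browser with JavaScript posts the hidden "resolution" field, e.g. "1024x768".
  if (request.method === "POST" && /^\d+x\d+$/.test(resolution)) {
    return { page: "content", layout: resolution };
  }

  // A second request that names the page itself as referrer:
  // assumed to be a bot or a browser with no JavaScript.
  if (request.referer && request.referer === request.url) {
    return { page: "content", layout: "default" };
  }

  // Anything else gets the screen size sniffer first.
  return { page: "sniffer" };
}

console.log(chooseResponse({ method: "GET", url: "/index.asp" }));
console.log(chooseResponse({ method: "GET", url: "/index.asp", referer: "/index.asp" }));
console.log(chooseResponse({ method: "POST", url: "/index.asp", form: { resolution: "1024x768" } }));
```

Under these assumptions a single plain GET only ever receives the sniffer, so the scheme depends on the client coming back a second time, which is exactly the open question.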
On alltheWeb the same site is #1. However, the site details are
[anysite.site...] (410 B)
which is the .asp with just the screen sniffer in it.
Excuse my inexperience in this field, but can anybody explain what is going on?