this is fine and appears to have a page rank of 4. However, all the internal links on the site point to
www.domain.com/default.asp?pageid=home&product=ti
Note that all letters in the domain are now lower case. This page now shows up as PR0 and shows no backlinks.
Do Google and the toolbar really see these as separate pages, or is this just a glitch in the system?
Moff
Also, strictly speaking, www.domain.com/ is a different page to www.domain.com/default.asp (or www.domain.com/index.html, or whatever). It's only due to the implementation/configuration of the webserver that /default.asp is returned for requests for /.
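To make the distinction concrete, here is a minimal Python sketch of comparing URLs the way the URI rules (RFC 3986) describe: the host name is case-insensitive, but the path and query string are not. The mixed-case URL below is a made-up stand-in for illustration, and `url_key` is a hypothetical helper. Nothing here claims to show how Googlebot actually compares URLs.

```python
from urllib.parse import urlsplit

def url_key(url):
    """Build a comparison key: lowercase the host only, keep path and query as written."""
    # urlsplit only finds the host when a scheme is present, so assume http:// if missing.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # An empty path and "/" address the same resource, so treat them alike.
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query)

# Hypothetical mixed-case variant vs. the lower-case URL used by the internal links.
a = url_key("www.Domain.com/Default.asp?PageId=Home&Product=TI")
b = url_key("www.domain.com/default.asp?pageid=home&product=ti")

# The bare root vs. the default document the server happens to return for it.
c = url_key("www.domain.com/")
d = url_key("www.domain.com/default.asp")

print(a == b)  # False: same host, but path and query differ in case
print(c == d)  # False: "/" and "/default.asp" are different paths
```

Whether two such URLs return the same content is up to the webserver, which is why a crawler cannot safely assume they are the same page.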
Jon
www.domain.com/pageId=Home&Product=TI
and
www.domain.com/default.asp?pageid=home&product=ti
then in theory you could end up tripping duplicate content filters?
I knew domain.com/ and domain.com/index.html could be seen as different pages but was under the impression that most of the time Googlebot was smart enough to recognise these as the same page and credit it as a single page.
Moff
In one case, a site's filenames were changed when it switched over to SSI. All the interior pages linked back to the homepage using index.shtml, and no internal backlinks accrued to the homepage's PR or showed up as backlinks. The situation was remedied by linking to the homepage with an absolute URL throughout the site.
Just recently, I've had to deal with a site where the web designer linked back to the homepage 3 different ways. There were links to example.com, www.example.com and index.shtml, all within the site itself, and many pages used more than one of these forms. The site still shows a PR3 when it should be a high PR4, based on the PR of the pages its inbound links are on. The same thing had to be done as in the first case, with all internal links to the homepage changed to www.example.com/ - and some people do prefer to include the forward slash.
What we see on the toolbar is for the most part outdated, and so tells us little about what the actual PR of a page should be at any given time. The exception is right after backlinks and PR have been freshly updated, which is when it's about as accurate as I believe we'll see it.
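As a rough illustration of that cleanup, the sketch below resolves each internal link against the page it appears on and rewrites any homepage variant to one absolute URL. The `http://` scheme, the `about.shtml` page name, and the names `CANONICAL_HOME`, `HOME_HOSTS`, `HOME_PATHS` and `canonical_home_link` are all assumptions made for the example, not anything taken from the sites described.

```python
from urllib.parse import urljoin, urlsplit

# Assumed canonical form: www host, trailing slash, http scheme.
CANONICAL_HOME = "http://www.example.com/"
HOME_HOSTS = {"example.com", "www.example.com"}
HOME_PATHS = {"", "/", "/index.shtml"}

def canonical_home_link(href, page_url):
    """Return CANONICAL_HOME for any homepage variant, else the resolved absolute URL."""
    absolute = urljoin(page_url, href)  # resolve relative links like "index.shtml"
    parts = urlsplit(absolute)
    is_home = (
        parts.hostname in HOME_HOSTS
        and parts.path in HOME_PATHS
        and not parts.query
    )
    return CANONICAL_HOME if is_home else absolute

page = "http://www.example.com/about.shtml"  # hypothetical interior page
for href in ["http://example.com", "http://www.example.com/", "index.shtml", "contact.shtml"]:
    print(href, "->", canonical_home_link(href, page))
```

The first three links all come out as `http://www.example.com/`, while `contact.shtml` is only made absolute. Running something like this over a site's templates is one way to find the stray variants before fixing them by hand.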
>>Since many sites run Apache/Unix, Google must recognize this
I don't think it's a matter of the technology not being there to handle the differences. They are technically different pages, and there are some who take advantage of that fact. At one time there was a site populating the local search for just about every major US city for a certain search category, with expired domains mixed into its bundle, that was serving "circle jerks" you couldn't get out of. They used exactly that technique - a radically different number of backlinks to the domain with and without the www.
Question about this type of URL, because it's come up before, though in a different context:
>>www.domain.com/default.asp?pageid=home&product=ti
What shows up at Google when using allinurl: to see which of the site pages are included in the index? What bothers me about that kind of URL is wondering what happens when you can also have
www.domain.com/default.asp?pageid=home&product=tj
and
www.domain.com/default.asp?pageid=home&product=tk
I don't know how asp pages are actually generated, but do the above all refer to the same default.asp page, or are there multiple ones depending on the product code? We once had a member who was having serious problems getting his site properly indexed, and his URLs looked quite similar to that, with the pagename.asp done in the same way, with different parameters following in the URL.
>>Does Google and the toolbar really see these as separate pages or is this just a glitch in the system?
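For what it's worth, the structure of those URLs can be pulled apart with a few lines of Python. This is only a sketch of the URL mechanics: it assumes an `http://` scheme, `group_by_script` is a hypothetical helper, and it says nothing about what default.asp does with the parameters or how any search engine treats them.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit

urls = [
    "http://www.domain.com/default.asp?pageid=home&product=ti",
    "http://www.domain.com/default.asp?pageid=home&product=tj",
    "http://www.domain.com/default.asp?pageid=home&product=tk",
]

def group_by_script(urls):
    """Group URLs by path and collect the query parameters seen for each path."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # parse_qs maps each parameter name to a list of its values.
        groups[parts.path].append(parse_qs(parts.query))
    return dict(groups)

groups = group_by_script(urls)
for path, variants in groups.items():
    print(path, "requested with", len(variants), "parameter sets")
    for params in variants:
        print("  ", params)
```

All three requests hit the same script path, /default.asp, and differ only in the product value. As URL strings they are still three distinct addresses, so whether they count as one page or three depends on what the script returns for each and on how the crawler treats query strings.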
It isn't the toolbar that's seeing anything beyond what's being sent to it, and what the toolbar reflects may or may not be an actual representation. I think what we really need to take a closer look at is not so much the toolbar, but how the naming of files affects crawling and subsequent computations for scoring.