Site Map Problems

Google (fresh) won't crawl my site map... why?

         

vincevincevince

1:15 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I made a site map, as recommended on these forums. Freshbot hit it but didn't follow any of the links. The site map was generated with:


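<?php
// Build a simple site map that links to every file in the current directory.
echo "<title>Site Map</title><h1>Site Map</h1>";
$dirname = ".";  // was undefined in the original, so is_dir() checked "/$file"
$dh = opendir($dirname);
while (($file = readdir($dh)) !== false) {
    // is_dir() also skips "." and "..", so there is no need to discard
    // the first two entries (their order is not guaranteed anyway).
    if (!is_dir("$dirname/$file")) {
        $name = htmlspecialchars($file);
        print "<a href=\"$name\">$name</a><br>\n";
    }
}
closedir($dh);
?>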

Can you suggest why Google didn't follow those links?

chiyo

1:32 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



vince3,

The PHP code doesn't help, because what Google reads is the HTML that the script generates. If you view the source of the page in your browser, you will see what Google sees. Make sure that all the links are simple and easy to read, with no strange formatting inside them, and that the whole page is valid HTML.

I'm no expert PHP coder, but shouldn't you also echo the HTML and BODY tags and their closing tags?

[edited by: chiyo at 1:34 pm (utc) on May 4, 2003]

takagi

1:34 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I presume the code in your message is not the complete code. The title should be inside the <head>, and the <body> tag is missing as well.

It might take some time before the links are followed. There is no guarantee that links in a new page found by freshbot are automatically followed. It could depend on the PR of the page linking to the site map, the work load for freshbot, etc.

If you look in the cache of the site map, what do you see?
When did you first see the site map in the SERPs?
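
For reference, a complete page along the lines suggested above (title inside the head, links inside the body) might look like the sketch below. The function name `sitemap_page` and the sample link are made up for illustration, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: wrap ready-made link markup in a complete HTML page,
// with the <title> inside <head> and the links inside <body>.
function sitemap_page(string $linksHtml): string
{
    return "<html>\n"
        . "<head><title>Site Map</title></head>\n"
        . "<body>\n"
        . "<h1>Site Map</h1>\n"
        . $linksHtml
        . "</body>\n"
        . "</html>\n";
}

// Example with a single link in the same form as the ones in this thread.
echo sitemap_page("<a href=\"file1.php\">file1.php</a><br>\n");
```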

vincevincevince

1:34 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The page validates, and the links on the list are in the form:
<a href="file1.php">file1.php</a><br>
<a href="file2.php">file2.php</a><br>

(And yes, the code is not complete. I wanted to keep it as short as possible and show only what is relevant.)

The cache says "Your search - cache:http://www.widgets.com/sitemap.php - did not match any documents."

BUT my log shows that Google requested that page.

It is linked from every page on the site, which is PR4.

chiyo

1:38 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google should follow that, but by all accounts you lose a lot of value by using just file names as anchor text instead of page titles.

vincevincevince

1:39 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



chiyo... thanks. I hadn't thought about that. Maybe Google thinks it's a directory listing and doesn't crawl it?

Thanks for everyone's help... I'm going to rewrite the script to grab the <title> from each page and output it as the link text :)
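
One way to do the title lookup is sketched below. The helper names `page_title` and `sitemap_link` are invented for this example, and the type hints assume PHP 7 or later. The helpers take the page's HTML as a string: reading a .php file straight from disk would return its PHP source rather than the rendered page, so how the HTML is obtained is left open here.

```php
<?php
// Hypothetical helper: return the <title> text found in an HTML string,
// or the fallback (for example the file name) when there is no usable title.
function page_title(string $html, string $fallback): string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        $title = trim(html_entity_decode($m[1]));
        if ($title !== '') {
            return $title;
        }
    }
    return $fallback;
}

// Build one site map link: file name in the href, page title as link text.
function sitemap_link(string $file, string $html): string
{
    $title = page_title($html, $file);
    return '<a href="' . htmlspecialchars($file) . '">'
        . htmlspecialchars($title) . "</a><br>\n";
}

// Prints: <a href="file1.php">Blue Widgets</a><br>
echo sitemap_link('file1.php', '<html><head><title>Blue Widgets</title></head><body></body></html>');
```

Falling back to the file name keeps the link usable for pages that have no title.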

Jakpot

1:46 pm on May 4, 2003 (gmt 0)

10+ Year Member



"I made a site map, as recommended on these forums"

It would be worthwhile to be somewhat cautious in implementing some of the recommendations offered on these forums.
I too changed to a sitemap, and my PR went down the tubes.
It could have been some other cause(s), but my conclusion is that it was the sitemap.

takagi

1:52 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi Jakpot, linking to a site map that itself links to a lot of pages will spread the PR over those pages. The PR of the homepage could drop, especially if a lot of new pages were found through the site map. So it could be better to keep the PR of the site map low, unless you want to spread the PR over the sub pages.

Jakpot

2:02 pm on May 4, 2003 (gmt 0)

10+ Year Member



Takagi,
Yep! That's what happened.
I dropped the sitemap and reverted to links (<100) on my home page.
I really hope Google will restore my previous PRs and SERP positions.
My old billfold really took a hit because of my folly.

chiyo

2:03 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I agree with takagi, and will add that if the only reason for the site map is to make sure all pages are spidered, then link to it ONLY from the home page or, if you really don't want to upset your carefully engineered internal PR sharing, from a page with lower PageRank.

On a related topic, remember that Google probably only follows a certain number of links from each page.
<just an instinct>
I'm not convinced that Google takes much notice of site maps any more, now or in the future, for various reasons. Especially if they are just, as is normal, long lists of linked URLs.

I think it takes more notice of natural structure, i.e. the home page links to major section indexes -> major section indexes link to sub section indexes -> sub section indexes link to etc. etc. This makes more sense for both user and spider.

</just an instinct>

takagi

2:46 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A site map is ideal for getting a new site completely indexed. For an existing site with a good linking structure and all pages already indexed in Google, Google has no need for a site map (other SEs might still need it). In that case you could consider adding links to new pages on a page with a relatively high PR. Doing so will help the new pages get indexed quickly.

mrguy

3:28 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes,

Give it some time.

I've seen a new page get hit a couple times with Freshbot before the links on the page started to get followed.

martinibuster

4:11 pm on May 4, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



"the PR of the homepage could drop."

I don't think so. If you are linking to one sitemap (that contains 100 links on it), then the total PR flowing out to that sitemap is the same as what goes to any other page. The one hundred links on the site map have to share that one spurt of PR.

Adding new content to your web site is not going to lower your PR.

Also remember that the value of a sitemap lies not only in getting your deep pages spidered, but also in helping your visitors. Group the links into relevant categories and add keyword-rich category headings that identify what those groups are about. If the sitemap is good for your users, it will be amazingly good to you.
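
A grouped site map of the kind described above could be generated along these lines. This is only a sketch: the category names, file names and the `render_sections` function are invented for illustration, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical data: category heading => [file name => link text].
$sections = [
    'Blue Widgets' => [
        'blue-small.php' => 'Small Blue Widgets',
        'blue-large.php' => 'Large Blue Widgets',
    ],
    'Widget Care' => [
        'cleaning.php' => 'Cleaning Your Widgets',
    ],
];

// Render each category as a heading followed by a list of links.
function render_sections(array $sections): string
{
    $html = '';
    foreach ($sections as $heading => $pages) {
        $html .= '<h2>' . htmlspecialchars($heading) . "</h2>\n<ul>\n";
        foreach ($pages as $file => $text) {
            $html .= '<li><a href="' . htmlspecialchars($file) . '">'
                . htmlspecialchars($text) . "</a></li>\n";
        }
        $html .= "</ul>\n";
    }
    return $html;
}

echo render_sections($sections);
```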

annej

4:26 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Jakpot, there appears to have been an overall PR drop over the last couple of months. That may have been a bigger factor than the site map.

Also as we get involved in PR we need to remember that Google only gets people to visit our sites. The next challenge is to keep them there. A good site map can help.

takagi

4:47 pm on May 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know why vince3, chiyo and Jakpot added a site map, but adding one helps a lot if you have problems getting all your (dynamic) pages indexed. If this approach is successful, the PR will spread out over more pages. Linking to the site map from every page (see msg4; unfortunately I didn't make that clear in my own msg8) will cause a large flow of PR to that page. Since the PR calculation runs for many iterations, a significant part of that large flow of PR to the site map will pass on to the many sub pages. In this scenario the PageRank of the homepage could drop.
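
Anyone who wants to experiment with the two linking scenarios discussed in this thread can try a toy model like the one below. It is only a sketch and rests on several assumptions: it uses the simplified, publicly described PageRank formula with a damping factor of 0.85, the seven-page site and its page names are invented, external links are ignored, and the type hints assume PHP 7 or later. How Google actually calculates PR is not known in detail, so the numbers only illustrate how link structure moves rank around inside a closed site.

```php
<?php
// Simplified PageRank by power iteration.
// $links maps each page to the pages it links to. Every page must have at
// least one outgoing link, and every target must itself be a key of $links.
function pagerank(array $links, float $damping = 0.85, int $iterations = 100): array
{
    $pages = array_keys($links);
    $n = count($pages);
    $rank = array_fill_keys($pages, 1.0 / $n);

    for ($i = 0; $i < $iterations; $i++) {
        // Every page starts each round with the same small base share.
        $next = array_fill_keys($pages, (1.0 - $damping) / $n);
        foreach ($links as $page => $targets) {
            // A page splits its damped rank evenly over its outgoing links.
            $share = $damping * $rank[$page] / count($targets);
            foreach ($targets as $target) {
                $next[$target] += $share;
            }
        }
        $rank = $next;
    }
    return $rank;
}

// Hypothetical site: the site map links to every other page.
$sitemapTargets = ['home', 'a', 'b', 'd1', 'd2', 'd3'];

// Scenario 1: only the home page links to the site map.
$homeOnly = [
    'home'    => ['a', 'b', 'sitemap'],
    'a'       => ['home'],
    'b'       => ['home'],
    'd1'      => ['home'],
    'd2'      => ['home'],
    'd3'      => ['home'],
    'sitemap' => $sitemapTargets,
];

// Scenario 2: every page links to the site map.
$everyPage = [
    'home'    => ['a', 'b', 'sitemap'],
    'a'       => ['home', 'sitemap'],
    'b'       => ['home', 'sitemap'],
    'd1'      => ['home', 'sitemap'],
    'd2'      => ['home', 'sitemap'],
    'd3'      => ['home', 'sitemap'],
    'sitemap' => $sitemapTargets,
];

foreach (['home only' => $homeOnly, 'every page' => $everyPage] as $label => $links) {
    $rank = pagerank($links);
    printf("%-10s home=%.4f sitemap=%.4f\n", $label, $rank['home'], $rank['sitemap']);
}
```

Because both scenarios contain the same pages, the ranks in each sum to 1 and the home page figures can be compared directly. Changing the link lists shows how sensitive the result is to the structure of the site.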