Google didn't deep crawl on this update

What went wrong?

         

Helpmebe1

2:49 pm on Aug 23, 2002 (gmt 0)

10+ Year Member



Hey all...
I just noticed why I didn't get the results I wanted on this last dance: Google didn't deep crawl. I have a site of over 2k pages. I sell, shall we say, camera supplies. The site is laid out with name-brand sections: Canon, Minolta, Fuji, etc. In my footers across the site there is a link for Canon, Minolta, etc., linking to each main section. From each brand-name section, I link to Model X and Accessories, Model Y and Accessories, etc.
Google crawled my name-brand sections such as Canon and so on, but did not crawl any deeper than the main brand sections. Google did not go on to crawl the specific model pages...

My site just came online about 6 weeks ago. I saw the PR on the homepage go from a PR1 to a PR4, I think on this crawl, with just my DMOZ link (it looks like it has not picked up my Yahoo or other links yet). My main brand sections have gone to a PR3; the individual model sections, well, most are a white bar with no PR, which tells me they didn't get crawled. Is Google unreliable with its deep crawls? Why would this happen? I have no fancy script or anything on my site to prevent it from crawling the entire site.
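
A rough way to picture the layout described above is as a link graph, with each page sitting some number of clicks from the homepage. The page names below are made up for illustration, and nothing here models how Googlebot actually schedules its crawl; the sketch only shows that the model pages sit one level deeper than the brand sections.

```python
from collections import deque

# Hypothetical link graph mirroring the layout described in the post:
# homepage footer -> brand sections -> model pages.
links = {
    "home": ["canon", "minolta", "fuji"],
    "canon": ["canon-model-x", "canon-model-y"],
    "minolta": ["minolta-model-x"],
    "fuji": [],
    "canon-model-x": [],
    "canon-model-y": [],
    "minolta-model-x": [],
}


def click_depths(graph, start):
    """Breadth-first search returning the fewest clicks from start to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


for page, depth in click_depths(links, "home").items():
    print(depth, page)
```

A page linking to every URL on the site, reachable from the footer, would put every page at depth 2 or less in this picture.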

ikbenhet1

2:59 pm on Aug 23, 2002 (gmt 0)

10+ Year Member



Google didn't deep crawl..

Google did deep crawl my site; it found the extra 1.6k pages this month.

So maybe Google crawled a day before you put up your new pages?

What extension do these files have? Not all files get indexed in Google even if they have backlinks. For example,

aaa.htmlt

will never be crawled by Google.

[edit: added line]

bateman_ap

3:01 pm on Aug 23, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a few pages with a totally white PR bar. I think it is because the page they are linked from has a PR of 3 and there are probably many links on that page, so each linked-to page ends up with a PR of something like 0.01, which just shows as 0.
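
To illustrate the arithmetic behind this guess, here is a minimal sketch assuming the simplified PageRank model from its original published description, in which a page passes a damped share of its score evenly across its outgoing links. The damping factor of 0.85, the raw score of 1.0, and the count of 85 links are illustrative assumptions; the toolbar PR is a rounded display value, and how it maps to raw scores is not public.

```python
def pr_passed_per_link(source_score: float, outlinks: int, damping: float = 0.85) -> float:
    """Score that one outgoing link passes on under the simplified PageRank model."""
    if outlinks <= 0:
        raise ValueError("outlinks must be positive")
    return damping * source_score / outlinks


# Hypothetical numbers: a section page with raw score 1.0 and 85 links on it.
share = pr_passed_per_link(1.0, 85)
print(f"each linked page receives about {share:.2f}")
```

The more links a page carries, the thinner each share becomes, which fits the idea that a crawled page can still display an empty bar.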

ikbenhet1

3:03 pm on Aug 23, 2002 (gmt 0)

10+ Year Member




I was thinking about this too.

I can't imagine all PR0 pages are banned sites; some of them just get very little PR passed to them, I guess.

Helpmebe1

3:24 pm on Aug 23, 2002 (gmt 0)

10+ Year Member



Thank you all for responding. The extensions are .html; it is actually a Yahoo store. I don't see the extensions being a problem with .html, and all pages are written in plain HTML, no Java, CSS, etc.
I put a link in my footers to the "url page" hoping this will help, but I only made that change yesterday, in hopes Google will then spider the url page and pick up ALL the pages in the site, since it is somewhat large at over 2k pages.

Could it be that the PR of the main sections is only a PR3? I don't see that being it, though, as I looked at one of my competitors' sites, which has a somewhat similar layout, and all his pages have been crawled and have some kind of PR rating. I have seen links off a PR3 page get crawled. I wonder what's up with this? I can't even check my logs to see if I was deep crawled, because it is a Yahoo store and I don't have access to my raw logs. Ugh, frustrating this can be sometimes!

ikbenhet1

3:26 pm on Aug 23, 2002 (gmt 0)

10+ Year Member



Please put the URL in your profile, I want to see this page.

Note: if you did this just yesterday, you will have to wait for the next Google crawl.

Note 1: PR3 is excellent for the purpose of just getting listed in the database. This month's crawl shows me that Google deep crawls PR3 backlinks as well, because I added 1 link from a PR3 site to 1.6k new files linking to each other. They all show up in www2.google.com

I think you're a few days late; the crawl was already done a few days ago.
[edit:added note]

bcc1234

4:30 pm on Aug 23, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



aaa.htmlt

will never be crawled by google.

Why not ??

Helpmebe1

4:32 pm on Aug 23, 2002 (gmt 0)

10+ Year Member



ikben,
I noticed that too. I also noticed that my DMOZ page, which only has a PR of 2, looks like it brought our site from a PR1 to a PR4, since as far as I can see it is the only link to our site. I think I got into Yahoo too late to be picked up, and some other sites look like they didn't get picked up yet...
Ya know something, these forums do kick a**! I love being able to share thoughts, successes and things not to do in here.
I think I was too late in adding the url page link to the footers. Should this work? What are your thoughts: will they follow that link and then pick up and PR all the pages in the site through the URL link page? I hope so, since it contains a URL to every page on our site. Just found out we got into Ink for free, which seems to be bringing in some traffic for us as well.
Congrats on them finding, wow, all those new pages on your site! Are you ranking nicely for your newly found pages?

ikbenhet1

4:59 pm on Aug 23, 2002 (gmt 0)

10+ Year Member



bcc1234

Google only shows: .asp .htm .html .cgi .pdf and some more. [I put a link in my profile showing exactly which extensions Google crawls.]

But if you have an extension other than those in the list, Google does not crawl it, since it doesn't know what the file is.

[edited by: ikbenhet1 at 6:19 pm (utc) on Aug. 23, 2002]

ikbenhet1

5:19 pm on Aug 23, 2002 (gmt 0)

10+ Year Member



I surely recovered from last month's mistake: #1 with lots of domains.

I still can't say yet for the new HTML files.
It will become clear somewhere around Monday, when the PR will be calculated.
Even if these pages get a minimal PR, that will boost me up nicely, because each of the 1.6k HTML files links to my 10 best domains, and each has 20 links to other files within those 1.6k.

But I will know for sure what effect it has in a few days. For now it looks good.

bcc1234

9:46 pm on Aug 23, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



google only shows: .asp .htm .html .cgi .pdf and some more. [i put a link in my profile, showing exactly which extensions google crawls]

Hmmm, that does not seem reasonable.
Can anybody else please confirm this?

I would think googlebot uses the Content-Type field of the response header to detect the type of the document.
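
As a sketch of what that idea could look like, assume a crawler decides from the `Content-Type` response header rather than from the URL's extension. The helper names and the set of accepted types below are hypothetical and are not Googlebot's actual rules.

```python
# Hypothetical set of media types a crawler might choose to index.
INDEXABLE_TYPES = {"text/html", "text/plain", "application/pdf"}


def media_type(content_type_header: str) -> str:
    """Return the lowercase media type, dropping parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def looks_indexable(content_type_header: str) -> bool:
    """Decide from the header alone, ignoring the URL's file extension."""
    return media_type(content_type_header) in INDEXABLE_TYPES


print(looks_indexable("text/html; charset=ISO-8859-1"))
print(looks_indexable("image/gif"))
```

Under this assumption, a file named aaa.htmlt served as `text/html` would be treated like any other HTML page.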

ikbenhet1

1:48 pm on Aug 24, 2002 (gmt 0)

10+ Year Member




Since no one answered your question, bcc1234, I'm going to give a simple example.

I put an example page in my profile, which is crawled by Google, so we can check it out.

Now look at the navigation bar; it starts with the link 'whatsnew'.
You see it? OK.

This file ends with .msnw and is therefore not indexed by Google.
(Check it: please type the URL into Google and see, it's not crawled.)

Now you also see in the navigation bar a 'become a member' link, ending with blablabla.com/join
(Please type the URL into Google and see, it's crawled.)

This is proof for me.

This page is crawled, and all other links on this page ending in / or .htm or .html are crawled also, but the links ending in .msnw and some others are not crawled.

Also, there are a lot more extensions that do not get crawled.

bcc1234

8:55 am on Aug 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The page:
/HerbieNl1961/_whatsnew.msnw

Returns the following headers:

Expires: Mon, 11 Jan 1999 01:23:45 GMT
Pragma: No-Cache
Cache-Control: no-cache

I would say that's more than enough not to index the page.

Btw, did not find "become a member" link.
Also did not find any .htm or .html links on that page at all.

And that reference page with extensions lost any credibility once I ran across the line: "jsp - Java Script Page".
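
For reference, a small sketch of how one might inspect the headers quoted above. It only reports what the headers say (an `Expires` date in the past and no-cache directives); whether a search engine skips such a page is the poster's reading, not something the code establishes. The function name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def cache_signals(headers: dict[str, str], now: datetime) -> dict[str, bool]:
    """Report whether the headers mark the response as expired or non-cacheable."""
    lowered = {name.lower(): value for name, value in headers.items()}
    expired = False
    if "expires" in lowered:
        # HTTP dates use the same format that email.utils can parse.
        expired = parsedate_to_datetime(lowered["expires"]) < now
    no_cache = (
        "no-cache" in lowered.get("pragma", "").lower()
        or "no-cache" in lowered.get("cache-control", "").lower()
    )
    return {"expired": expired, "no_cache": no_cache}


# The three headers quoted in the post, checked against the post's date.
headers = {
    "Expires": "Mon, 11 Jan 1999 01:23:45 GMT",
    "Pragma": "No-Cache",
    "Cache-Control": "no-cache",
}
print(cache_signals(headers, datetime(2002, 8, 25, tzinfo=timezone.utc)))
```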

zeus

11:01 am on Aug 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



ikbenhet1

If I look at the directory there are different results: my site has been given a PR0, where before it was PR4. You think the PR only comes later? But why do all the other sites in my category have a good PR in the directory now?

zeus