First, make sure your robots.txt is valid -- try the tools on SearchEngineWorld -- [searchengineworld.com...]
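For a quick local sanity check alongside an online validator, something like the sketch below may help. It uses Python's standard urllib.robotparser to ask whether a given user agent may fetch a URL under a given robots.txt. The sample rules, the example.com URLs and the helper name can_crawl are made up for illustration, and it only shows how this one parser reads the file, not how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /cgi-bin/
"""


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for page in ("http://www.example.com/index.html",
                 "http://www.example.com/cgi-bin/form.pl"):
        print(page, can_crawl(SAMPLE_ROBOTS, page))
```

If a page you expect to be indexed comes back as not crawlable, the robots.txt is the first thing to fix.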
Next, sit back and relax. You'll show up in the index eventually.
Consider yourself lucky that you didn't start your web site about 4 months ago like I did, just before all the strange update problems -- the first deep crawl from Google didn't happen until last week.
Welcome to WW,
Google seems to be picking up pages quicker now and you can get indexed within hours :) However, I would not panic if you are not; this probably depends largely on links to your site and freshness.
Assuming that Google did have a deep(er) crawl last week, and assuming that there will be a more traditional update, the crawl from last week could appear in 2-3 weeks (or so; maybe less, maybe more).
Hard to say with the latest developments at Google but good luck and I am sure you will appear in the index sooner or later - after all it looks like Google found you OK - so it should work out in the end.
Get links: DMOZ, Yahoo, and many more, and you will be OK eventually.
Fwiw - I've had some pages crawled on the 5th of July which now seem to have stabilized in the SERPS without fresh tags.
This as of this morning... So, the 'traditional update' is, for me (provided these pages stay 'stuck' in the serps), looking like the dinosaurs...
Thanks guys! I can wait to be found in the result pages!
Quick question: how do you know which pages Googlebot has actually crawled? So far I only see that my robots.txt file has been requested. Will I see the bot fetch the actual pages later this month? Is it a two-step process for Google to index pages?
What are the "SERPS"?
SERP - Search Engine Results Page
Googlebot asking for robots.txt is the ritual beginning. Relax, the bot knows you are there, and will return for pages.
So you have seen the start of it. Get as many high-ranking links as possible; the bot follows those links. The more important the bot thinks you are, the deeper it will go.
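On the question of how to tell what Googlebot has actually fetched: the usual place to look is your raw server access log. Below is a minimal sketch, assuming the log is in the common Apache "combined" format and that the crawler identifies itself with "Googlebot" in the user-agent field. The sample lines, IP addresses and the function name googlebot_paths are invented for illustration.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format. Adjust if your server differs.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_paths(lines):
    """Count requests per path for lines whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[match.group("path")] += 1
    return hits


# Made-up sample log lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [05/Jul/2003:10:12:01 +0000] "GET /robots.txt HTTP/1.0" '
    '200 68 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [05/Jul/2003:10:12:09 +0000] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [05/Jul/2003:10:13:30 +0000] "GET /contact.html HTTP/1.1" '
    '200 2048 "http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    for path, count in googlebot_paths(SAMPLE_LOG).items():
        print(path, count)
```

If only /robots.txt ever shows up for the bot, you are still at the "ritual beginning" stage. Keep in mind that a user-agent string can be faked, so this is a rough indicator rather than proof.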
Now, what if I change a page URL?
Let's say I had a contact.html page last week when Google came, but yesterday I changed it to contact-us.html.
Will Google look for contact.html? Will Google also pick up contact-us.html?
I may have made a mistake by changing a few file names... did I?
I have been listed on Google for a while, but I never had a robots.txt file. I added one today. Should I just delete it or will it help?
Question for you Experts
I thought when google does its deepcrawl and comes to visit your site it will grab all the pages that exist. However when I was speaking to my webmaster he stated that they only got about half of my pages and are still working on getting the rest.
Also, the weird thing about it all is that I am getting the same amount of traffic now, after being indexed, as I was before my site was indexed on Google. Logically this does not make sense at all. I say this because several of my keywords rank high, and these keywords are popular searches according to the numbers.
Is it because Google may still be unsettled and my backlinks may be popping in and out? Please advise.
I had a site created April 7, 2003, just show up yesterday!
Showed up with no fresh tags.
Don't know how this works into the rolling update.
Sounds like you're in good shape, bobosse. :) Over time, Google should find more of your pages, especially as more people link to you when they find your site.
I'm in the same boat.
I created a site three months ago. Got the same combination of links that have worked in the past (DMOZ, YAHOO, state organization, national organization, affiliated national organizations, local directories, etc)
I manually submitted our front (index) page and that's all that has been spidered or added to the index. The bots stroll by, request robots.txt (which is there) but then move on without scanning the site. [We use the sample robots.txt file from Searchengineworld, basically]
Has Google changed its criteria for non-commercial sites? Is there some threshold before the bots will crawl our site?
Fearless: sounds like there are problems with the site. A site first online 2003-05-04 dropped cleanly into the index on 2003-06-16 with all pages listed (still no PR though).
You should only need at least one incoming link of at least PR4 to get you started. Run the pages through [validator.w3.org...] in case there is something tripping the spiders up. Next, find a long phrase on any one of your pages then do a search for that phrase on Google. See if someone else has duplicated your content and is stopping you being listed.
There are other things to try, but those are the two I would do first.
g1smd, Thanks for replying.
|Fearless: sounds like there are problems with the site. |
This is my fourth one. There is nothing significantly changed in terms of site creation, file layout, etc.
|A site first online 2003-05-04 dropped cleanly into the index on 2003-06-16 with all pages listed (still no PR though). |
That may be true, however, there have been other posts from people in similar straits as myself.
|You should only need at least one incoming link of at least PR4 to get you started. |
I've got well over that, plus DMOZ and Yahoo.
|Run the pages through [validator.w3.org...] in case there is something tripping the spiders up. Next, find a long phrase on any one of your pages then do a search for that phrase on Google. See if someone else has duplicated your content and is stopping you being listed. |
Nope, not the problem.
G-guy has responded to similar posts on other threads and he always says about what he did above
|Over time, Google should find more of your pages, especially as more people link to you when they find your site. |
I note that the dictionary defines "should" as "ought to, but not necessarily will."
Googleguy's posts on this topic have been remarkably vague. Clearly, something has changed [regarding new non commercial modest size sites] and apparently he's not at liberty to discuss it, other than to offer reassurances so vague that nothing can be read into them.
Again to beat the horse, same problem as Fearless.
Two sites ... one commercial, one non-commercial
Commercial is linked from dmoz, yahoo, business.com, joeant, and two of my own on-topic PR7 and PR7 sites. Also an ad on a PR9 site [very on topic] and a PR6 [also on topic]
Only the index page has been grabbed in 2 months, and the copy being shown is itself 1 month old.
Page Rank: 1
Pages Indexed: 1
Total Pages: 204
Non commercial .. also in Yahoo, zeal, goguides, joeant, plus also a mention on the official site!
Just index page grabbed.
Page Rank: 0 [n/a]
Pages Indexed: 1
Total Pages: 10,521
The sites are very clean, zero spam, not 'overly' optimised. I have several other sites on this same server and IP [it's a dedicated server] and they are getting crawled left and right, so it cannot be an IP ban.
Manually submitted too, and emailed webmaster@ [no response]
I put up a site on 6th June and gave it 1 link from a PR4 site I have.
It was crawled on 11th June and in the index and sticking about a week later.
New pages I have added since June 6th are not in the index yet and googlebot visits have been infrequent.
Recent threads have discussed Google's ability to index php or asp generated sites. I wonder if links from those pages count toward the backlinks G guy was referring to when he said:
|as more people link to you when they find your site |
He is obviously implying that backlinks are what we need.
More and more of the logical, (legitimate) high ranking backlinks for my sites are script generated. I've already mentioned that "jump menu" links don't register with the google bots. (Even though the "links" query isn't working for my latest site) just by looking at the results for "widget+county+widget+party" several of our linking sites are returned.
But none of the ones from jump menus OR php pages show up! (Despite their high "significance.")
Hang in there Ahmed.
|"I'm pullin' for ya. We're all in this together!" |
(From Red Green, one of Canada's great minds......)
Yes fearless ... nothing to do but wait.
It's weird ... suddenly about 80 pages were added to the index ... but only 3 of them have a cache to show, and the main index page still shows the 4 week old version :-/
[newb question - when/why is there no cache to be shown for a page?]
I thought Google's deep crawl for a particular month picks up all the spiderable pages you have on your site at that time? Why is it then that only half of my spiderable pages were picked up by Google during last month (June) and not all of them?
Am I wrong about this, or does it take Google several months to pick up all pages that are spiderable?
My site is only a couple of months old, could this be why?
When will I know exactly how well my site has done in regards to google PR, indexing etc....
Does someone out there have an answer for me?
I was forced to move my site three and a half months ago. I got an immediate deep crawl picking up a third of 300+ files, but it is only in the last week or two that any more files have been picked up and indexed. The readership has now, in the past three days, climbed back to half of what it was at the previous provider. But before that, spider hits were taking up half of the hits. I will never regain the positions my files built up over 7/8 years, but at least recovery is in sight (I hope). Currently my highest PR is about 3, but I didn't have the toolbar at the previous provider. My site is hobbyist. What irks is that even direct emails to people having links to the old site haven't produced any resetting. But things are now looking better.
New Site launched 8 weeks ago:
Index page showed up in Google mid June
85 pages showed up in Google on 1st July
Comet is showing an archive copy with a date June 18th for all pages except the index page which has a date of June 11th. Have since added 30+ new pages but nothing yet is showing.
Conclusion for our new site: it took 2 weeks for our new pages to show up following the first deep crawl of the site.
[darn fumble fingers!]
[edited by: Fearless at 6:48 pm (utc) on July 13, 2003]
Googleguy was nice enough to provide this google query (sort of like the "links:" query) that shows exactly what Google has indexed of your site (in my case- one page.....)
I think that another problem for us under the new regimen, (besides jump menus, php and asp backlinks) is that ALL of our backlinks are to the site not to individual pages.
(and it's likely to stay that way.....)
Hey Fearless (or GG)...
|Googleguy was nice enough to provide this google query (sort of like the "links:" query) that shows exactly what Google has indexed of your site |
I'm in a similar boat as you, and have been reading this thread with great interest! Where/how does one use that "google query"?
Just stick it in the Google search box and hit "Search".
Replace the yourdomain with, well, your domain.
You can also add &num=100 to the end of the search URL in the browser address bar, to get 100 results per page, and add &filter=0 as well if you want to see all of the results.
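If you would rather build that URL than edit the address bar by hand, here is a rough sketch. It only appends the num=100 and filter=0 parameters mentioned above to a search URL. The function name, the http://www.google.com/search base and the placeholder query text are assumptions, so paste in whatever query you are actually using.

```python
from urllib.parse import urlencode


def full_results_url(query: str,
                     base: str = "http://www.google.com/search") -> str:
    """Build a search URL asking for 100 results per page, unfiltered."""
    # num=100 -> 100 results per page; filter=0 -> show all of the results.
    params = {"q": query, "num": 100, "filter": 0}
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # "your query here" is a placeholder, not a real query.
    print(full_results_url("your query here"))
```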
Ah! Thanx g1smd! 'precitate the answer...
Generally, how long does it take Google to list all of your pages? 1 month, 3 months, 6 months? What's the typical norm?
My site's approx. 2 months old and Google has about half my pages. Anyone have the answer?
Depends if you have a few dozen or tens of thousands of pages.
Certainly allow 2 or 3 months, maybe more on bigger sites.