This site is not big by any stretch (maybe 100 pages max).
What's he up to? Following links out, indexing another site and then back to us? Presumably if that's the case, a good sign?
TJ
Fortunately, we have big bandwidth, otherwise the site would probably be brought down by all this activity. It's going deep too - down through all layers and links so far. I think it will index every single page at this rate. They've been there for *hours* now.
I assume this is a good thing, but I'm fairly new to this.
Site has only been online for 3 weeks, but thanks to freshbot we got high in the SERPs pretty quickly (No. 1 for one main key search term two days after launch) and I think we've been linked to by quite a lot of decent sites (we have really good content).
But grey pagerank at the moment - hoping next dance we show at least a 4.
Tj
Brett was right it seems - we concentrated purely on content and let the web do the rest of the work for us.
That's what I'll do in future - just build good content sites, with good page titles and forget about the rest.
One link is all you need to get started, the rest is just patience (although having said that, this has all happened very fast for us!).
TJ
Is this normal? About 7 hours crawling on a site of <300 pages (rough calculation).
He's maxed out my bandwidth (even with all image directories etc. blocked). I don't mind, got to be good for us in the long run, but my users are going to wonder why it takes a page 10 seconds to load at the moment!
Any way of getting Google to back off the gas a little?
TJ
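For anyone wanting to double-check that their image directories really are blocked, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `/images/` path and the `www.example.com` host are made up for illustration, not taken from TJ's site. It only confirms what a robots.txt file permits; it says nothing about how fast a crawler fetches the pages that remain allowed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; substitute the rules from your own site.
rules = """\
User-agent: *
Disallow: /images/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# The URLs below are placeholders, not real pages from the thread.
image_allowed = parser.can_fetch("Googlebot", "http://www.example.com/images/logo.gif")
page_allowed = parser.can_fetch("Googlebot", "http://www.example.com/index.html")

print("image fetch allowed:", image_allowed)
print("page fetch allowed:", page_allowed)
```

If the image URL comes back as allowed, the Disallow line is probably not matching the directory you think it is.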
I suspect the maxing out may actually have been several people hitting an image intensive page all at the same time - so I think you're right, not google.
I'm quite fascinated by all this - I didn't realise the depth to which our site was going to be crawled (about 16 more pages and he has the entirety of the site). I think I got my internal link strategy right!
Thanks for all the help,
TJ
Yah, that was me! It almost crashed my server during the last deepcrawl! It was bombing me, and took the server CPU load average up to over 20! I did way too much SEO on my vBulletin boards!
Is there good reason for this creeping sense of dread I feel, or is it just first-timer's jitters?
Darryl.
Can anyone point me to some resources to stop the deepcrawler being so aggressive on bandwidth? This is ridiculous. We're currently running at about 76% of capacity and 90% of that is damn googlebot!
I never thought I'd see the day I was annoyed by a robot crawling one of our sites!
TJ
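To put those two percentages together, a quick sketch of the arithmetic. It assumes "90% of that" means 90% of the traffic currently on the link, and the function name is just something made up for this example.

```python
def googlebot_share(link_utilisation: float, bot_fraction: float) -> float:
    """Fraction of total link capacity consumed by the crawler."""
    return link_utilisation * bot_fraction

# Figures quoted in the post: 76% of capacity in use, 90% of that from Googlebot.
share = googlebot_share(0.76, 0.90)
others = 0.76 - share

print(f"Googlebot: {share:.1%} of capacity")
print(f"Everyone else: {others:.1%} of capacity")
```

On those figures the crawler alone would be using roughly 68% of the connection, with all other visitors sharing under 8%.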
Sorry to have delayed anyone else's index crawling by preoccupying the little guys!
I telephoned google and they were really helpful (10/10 for that google).
They are working on the problem now - have already phoned me back once. They are naturally very interested in having a copy of my logfiles - now over 200 MB in size.
Hmmm.... perhaps they're worth money? lol
I suspect right about now we rank about a PR10.
LOL
TJ
Do you know how to create a Googlebot log in your public_html directory? I don't think you would want to e-mail it to them! Here's what you do. Get in telnet and enter these commands, after you change the paths to the correct directories. Then send them the Google log URLs and let them get the logs themselves. I'll bet their internet connections are much faster!
cat /logs/web.log | grep 216.239.46 > /public_html/deep.log
cat /logs/web.log | grep 64.68.82 > /public_html/fresh.log
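If you'd rather not rely on a plain grep (which matches the numbers anywhere in a line and treats each dot as "any character"), here is a rough Python sketch of the same filtering. It assumes each log line starts with the client IP, as in the common log format, and the sample lines are invented purely for the demo. The two prefixes are the ones from the commands above; whether they are the right ranges for your logs is something to check yourself.

```python
from typing import Iterable, Iterator

# Prefixes copied from the commands above; treat them as assumptions.
DEEP_PREFIX = "216.239.46."
FRESH_PREFIX = "64.68.82."

def lines_from(lines: Iterable[str], prefix: str) -> Iterator[str]:
    """Yield log lines whose first field (assumed to be the client IP) starts with prefix."""
    for line in lines:
        fields = line.split(None, 1)
        if fields and fields[0].startswith(prefix):
            yield line

# Invented sample lines, only to show the behaviour.
sample = [
    '216.239.46.10 - - [18/Apr/2003:21:00:01 +0000] "GET /a.html HTTP/1.0" 200 512',
    '10.0.0.5 - - [18/Apr/2003:21:00:02 +0000] "GET /b.html HTTP/1.1" 200 900',
    '64.68.82.7 - - [18/Apr/2003:21:00:03 +0000] "GET /c.html HTTP/1.0" 200 300',
]

deep = list(lines_from(sample, DEEP_PREFIX))
fresh = list(lines_from(sample, FRESH_PREFIX))
print(len(deep), "deepcrawl line(s),", len(fresh), "freshbot line(s)")
```

For a real log you would pass an open file object instead of `sample` and write the matching lines out to a file of your choice.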
:::Is there good reason for this creeping sense of dread I feel, or is it just first-timer's jitters?
That's normal. There are seven days left in the deepcrawl. My big sites are being bombed, while my smaller sites haven't got a lot of hits yet. Last month it was in the second half of the deepcrawl that it got my smaller sites. Small being around 500-700 files vs around 10,000-20,000 files.
[edited by: Jesse_Smith at 9:49 pm (utc) on April 18, 2003]
A certain forum script in a certain configuration leads google to believe that we have a site of infinite depth and content.
Would be great if it actually stopped at some point. PR11 maybe?!
The old "recursion: see recursion" style definition....
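One rough way to spot this kind of thing from the logs is to count how many distinct URLs the crawler has asked for and compare that with the number of pages you know you actually have. The sketch below assumes common-log-format request strings; the `factor` threshold, the function names and the sample URLs are all invented for illustration, and it makes no claim about what the forum script was really doing.

```python
import re

# Pulls the requested path out of a quoted request such as "GET /page HTTP/1.0".
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+)')

def distinct_urls(log_lines):
    """Return the set of distinct request paths found in the log lines."""
    urls = set()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            urls.add(match.group(1))
    return urls

def looks_like_trap(log_lines, known_pages, factor=3):
    """Heuristic: far more distinct URLs crawled than pages known to exist."""
    return len(distinct_urls(log_lines)) > factor * known_pages

# Invented sample: ten ever-changing forum URLs.
sample = [
    f'216.239.46.10 - - [18/Apr/2003:21:00:00 +0000] "GET /forum/index.php?page={n} HTTP/1.0" 200 100'
    for n in range(10)
]

print(looks_like_trap(sample, known_pages=2))    # 10 distinct URLs vs 2 known pages
print(looks_like_trap(sample, known_pages=300))  # 10 distinct URLs vs 300 known pages
```

A count that keeps climbing well past the real page total is a hint that the crawler is following generated links rather than real content.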
It's chomping through the full 3 Mbit connection at the moment. I only just managed to squeeze in to get the backup out.
I don't really know enough about the backend stuff to fully understand this. My partner in crime is dealing. I get to make the phone calls. I got the bum deal I think!
TJ
I don't want to pull the server down because we currently have users on the site (no doubt experiencing a slight slow down!).
I ended up telephoning google when the bandwidth started to get into my "now you pay" section...
I would go to google.com and "contacts" and get some numbers handy if you think this is a problem for you.
TJ
Thanks,
Kevin