"Jeeves/Teoma" is spidering my site right now. I have many, many dynamic pages that go like this: /country/state/city/, and I have almost every city in the US and Canada. That makes ~20,000 pages. I have little content on those pages yet, though. And Ask is getting *every single page*. I already reached 80% of my bandwidth limit just overnight.
Is it worth it? I didn't pay for inclusion, so I might not even get listed, right? And if I do get listed, will the number of pages affect my ranking?
And the same question for other search engines: should I let them crawl, or is it not worth it?
Please respond quickly, because the bot is still out there and I don't know what to do.
[edited by: moltar at 3:28 pm (utc) on Aug. 8, 2003]
Just a thought--What will you do when googlebot comes calling for a deep crawl?
[edited by: fiestagirl at 3:38 pm (utc) on Aug. 8, 2003]
I don't know if I should let them spider it. Maybe once, or twice.
Ask took 174 MB in one visit and didn't even finish spidering.
That is why I asked here. If the pages get indexed, how will it affect my placement? There is not much content on those pages.
There is a plus side. Each page has the country, state, and city name on it, as well as the "widgets" keyword. Some people look for "widgets state", but I'm still not sure if that will help...
It's not only an Ask problem. Hopefully other bots will show up, and I have no idea how they will act. Especially googlebot. It might come every once in a while and grab all the pages, or come once a month and take one. It's kind of unpredictable.
Ahhh so much confusion :)
Teoma took up 199 MB of bandwidth from my site earlier this month. I tried a robots.txt file to disallow it, but it still kept coming. I finally had to block it with an IP filter.
I made the personal decision that bandwidth preservation was more important than the chance that Teoma will bring any significant number of visitors to my site. My site can be compared to yours in that mine doesn't really have true original content at the moment. It's a bunch of affiliate links, and my objective was to control cost.
You should be able to purchase additional bandwidth from your host "temporarily". In my case, I did that at the end of last month and don't want to do it again. My site shuts down if the bandwidth limit for the month is exceeded. This temporary bandwidth can add up to a big expense.
I am saving my bandwidth for googlebot. Google has already brought in over 1300 referrals since the beginning of the month, while Teoma brought in just one! In the past, Teoma hasn't referred much, if at all, to my site (unless it doesn't give out its referrer info).
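For anyone wondering what "block it by IP filter" amounts to, here is a minimal sketch of the check itself, not the poster's actual setup. The address ranges are placeholders from the reserved documentation space, not the crawler's real addresses, and `is_blocked` is a made-up helper name. In practice the same test is usually done in the web server or firewall configuration rather than in application code.

```python
import ipaddress

# Placeholder ranges from the reserved documentation address space.
# These are NOT the crawler's real addresses; substitute the ranges
# you actually see in your own access logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(remote_addr: str) -> bool:
    """Return True if the client address falls inside a blocked range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: this sketch lets it through.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

A request handler would call `is_blocked` with the client address and answer with a 403 before generating the page, so a blocked crawler costs almost no bandwidth.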
It was partially my fault anyways. I have a reseller account, and I can assign bandwidth limits myself. So I assigned a 100 MB limit to that account. Who knew? I had 7 MB of transfer in 2 weeks. 100 MB looked like enough.
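A quick back-of-the-envelope check shows why a 100 MB limit is tight for this site. It uses only the numbers from this thread (about 20,000 pages, a 100 MB limit) and assumes 1 MB = 1,000,000 bytes; the function name is just for illustration.

```python
# Figures quoted in the thread; decimal megabytes are an assumption.
PAGES = 20_000
MONTHLY_LIMIT_MB = 100
BYTES_PER_MB = 1_000_000


def max_avg_page_bytes(limit_mb: float, pages: int) -> float:
    """Largest average page size that lets ONE full crawl fit in the limit."""
    return limit_mb * BYTES_PER_MB / pages


if __name__ == "__main__":
    budget = max_avg_page_bytes(MONTHLY_LIMIT_MB, PAGES)
    print(f"One full crawl fits only if pages average <= {budget:.0f} bytes")
```

So a single complete crawl by a single bot fits only if the pages average about 5 KB each, before counting any human visitors or a second spider.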
Well, I will see. If Ask is going to come frequently and grab *everything* it finds, I will just limit the access to that specific folder I guess... Let it crawl the rest of the site.
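Limiting one bot to part of the site is normally done in robots.txt. Below is a rough sketch that builds such a file and checks it with Python's standard `urllib.robotparser`. The `/locations/` folder and the `Teoma` user-agent token are assumptions for illustration, so check your logs for the name the crawler really sends. As reported earlier in this thread, a bot may ignore robots.txt entirely, so treat it as a request rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep one crawler out of one folder and
# leave everything else open. Folder name and user-agent token are
# assumptions, not values taken from the thread.
ROBOTS_TXT = """\
User-agent: Teoma
Disallow: /locations/

User-agent: *
Disallow:
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text into a parser we can query."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    checks = [
        ("Teoma", "/locations/us/texas/austin/"),
        ("Teoma", "/about.html"),
        ("Googlebot", "/locations/us/texas/austin/"),
    ]
    for agent, path in checks:
        print(agent, path, rules.can_fetch(agent, path))
```

Checking the rules locally like this catches typos before the file goes live, which matters when a single bad crawl can eat the whole month's transfer.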
If, however, the content is not for commercial advertising, then the cost of not having some kind of income from the site may outweigh the cost of upping your upstream throughput.
Unique commercial content - as long as the site is built well -> worth allowing all bots to crawl.
If not commercial, then I would look at restrictive measures on pages that probably don't need to be crawled.