Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

PromptCloud bot
Web Extraction, crawling and scraping service

 3:31 pm on Jul 30, 2013 (gmt 0)

Have you guys seen PromptCloud? <snip>

Anyone had any experience tracking and blocking it yet?

[edited by: incrediBILL at 6:03 pm (utc) on Aug 1, 2013]
[edit reason] Removed URL. No self-promo URLs please [/edit]



 6:08 pm on Aug 1, 2013 (gmt 0)

Hi Annie and thanks for letting us know about this bot.

I haven't seen it yet and have a few questions; it would be great if you could answer them.

1. What's the User Agent String?
2. What's the IP range it operates from?
3. Does it honor robots.txt?
4. Is there a page on your site describing your bot? Normally there is a bot page, and a link to it is typically provided in the bot's User Agent string for webmasters to follow. I looked all over your site and couldn't find any reference to one.

Please let us know about these items at your convenience.
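
For anyone who wants to answer questions 1 to 3 from their own logs in the meantime, here is a minimal sketch. It assumes Apache/nginx "combined" log format. The bot's real User Agent string is not known in this thread, so the `ExampleBot` token, the sample lines, and the function name `summarize_bot` are all hypothetical placeholders.

```python
import re
from collections import defaultdict

# Combined Log Format:
# IP identity user [time] "METHOD path protocol" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def summarize_bot(lines, ua_token):
    """Collect source IPs, distinct UA strings, and a robots.txt flag for one bot."""
    token = ua_token.lower()
    hits_by_ip = defaultdict(int)
    agents = set()
    fetched_robots = False
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip unparseable lines and requests from other user agents.
        if not m or token not in m.group("ua").lower():
            continue
        hits_by_ip[m.group("ip")] += 1
        agents.add(m.group("ua"))
        # Ignore any query string when checking for robots.txt.
        if m.group("path").split("?")[0] == "/robots.txt":
            fetched_robots = True
    return {
        "ips": dict(hits_by_ip),
        "agents": sorted(agents),
        "fetched_robots": fetched_robots,
    }


# Hypothetical sample data: documentation IPs and a made-up bot name.
SAMPLE = [
    '192.0.2.10 - - [01/Aug/2013:18:08:00 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "ExampleBot/1.0 (+http://example.com/bot.html)"',
    '192.0.2.10 - - [01/Aug/2013:18:08:02 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "ExampleBot/1.0 (+http://example.com/bot.html)"',
    '192.0.2.11 - - [01/Aug/2013:18:09:00 +0000] "GET /forum/?page=2 HTTP/1.1" 200 8300 "-" "ExampleBot/1.0 (+http://example.com/bot.html)"',
    '198.51.100.7 - - [01/Aug/2013:18:10:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

if __name__ == "__main__":
    print(summarize_bot(SAMPLE, "examplebot"))
```

Note that a fetch of `/robots.txt` only shows the bot asked for the file. Whether it honors the rules still has to be checked by comparing its later requests against your Disallow lines.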



 7:53 pm on Aug 1, 2013 (gmt 0)

FWIW, their domain is sub-hosted with Bluehost, which presents no issue for the regulars here.

PromptCloud operates on a “Data as a Service” (DaaS) model and deals with large-scale data crawl and extraction, using cutting-edge technologies and cloud computing solutions (Nutch, Hadoop, Lucene, Cassandra, etc.). These data could be from reviews, blogs, product catalogs, social sites, travel data, basically anything and everything on the WWW, and can be useful across all verticals: market research, travel, comparison shopping, deal aggregation, reputation management and more.


 8:19 pm on Aug 1, 2013 (gmt 0)

My stats are showing that their website is hosted on a shared hosting platform together with 450 other websites, so crawling would be done from other IPs.
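
Since the website's own IP says nothing about where the crawler runs, the ranges have to come from the IPs actually seen in logs. A rough sketch of that step, using Python's standard `ipaddress` module: the function name `candidate_ranges`, the /24 grouping, and the sample addresses are assumptions for illustration, and it handles IPv4 only.

```python
import ipaddress


def candidate_ranges(ips, prefix=24):
    """Group observed IPv4 addresses into /prefix networks and merge adjacent ones."""
    # strict=False lets us derive the enclosing network from a host address.
    nets = {ipaddress.ip_network(f"{ip}/{prefix}", strict=False) for ip in ips}
    # collapse_addresses merges adjacent or overlapping networks into larger blocks.
    return [str(n) for n in ipaddress.collapse_addresses(sorted(nets))]


if __name__ == "__main__":
    # Hypothetical addresses from the reserved documentation ranges.
    seen = ["192.0.2.10", "192.0.2.11", "198.51.100.7"]
    print(candidate_ranges(seen))
```

A /24 is only a starting guess. Check the result against a WHOIS lookup before blocking, because a cloud-hosted crawler may share a block with unrelated traffic.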


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved