| 1:20 pm on Nov 22, 2005 (gmt 0)|
Yes, but I think you will need a good in-house programmer.
| 3:01 pm on Nov 22, 2005 (gmt 0)|
OK, is there a program that you could recommend?
| 3:08 pm on Nov 22, 2005 (gmt 0)|
You need to scrape the SERPs heavily, so you will need a program written for you.
| 3:15 pm on Nov 22, 2005 (gmt 0)|
Getting more than 1000 results on Google.co.uk:
Do searches for UK sites but limit results to .co.uk.
Then repeat for .com, .org.uk, .net, etc.
If you need more than that, also try using a date operator to restrict results to a set period, then get rid of the duplicates.
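Getting rid of the duplicates could look something like the sketch below. It is only an illustration: the function names and the normalisation rules (ignore the scheme, a leading "www." and a trailing slash) are my own assumptions, not anything the search engines define, so adjust them to taste.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Build a comparison key for a result URL (assumed rules, see above)."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()   # hostname is already lowercased
    if host.startswith("www."):
        host = host[4:]                      # treat www.example.com as example.com
    key = host + parts.path.rstrip("/")      # ignore scheme and trailing slash
    if parts.query:
        key += "?" + parts.query
    return key

def merge_unique(*result_lists):
    """Merge several lists of result URLs, keeping the first copy of each."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged
```

Feed it one list per query (per TLD, per date range, per engine) and it returns a single list in first-seen order.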
If you need more than that, you could try doing something similar with Yahoo and MSN and combining the results, or use common terms with positive and negative operators to split the results, e.g.
widget site:co.uk -commonterm
widget site:co.uk commonterm
gives up to 2000 results. Be careful to use terms that are likely to appear on roughly half of the pages, otherwise you will skew your result set.
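To show how the TLD and positive/negative splits multiply out, here is a rough sketch that only builds the query strings; it does not fetch anything. The function name and the placeholder terms are made up for illustration. Each split term doubles the number of queries, and because every page either contains the term or does not, the partitions do not overlap.

```python
from itertools import product

def build_queries(keyword, tlds, split_terms=()):
    """Return one query string per (TLD, include/exclude) combination."""
    # Each split term offers two alternatives: required or excluded.
    choices = [(term, "-" + term) for term in split_terms]
    queries = []
    for tld in tlds:
        # product() of zero choices yields one empty combination,
        # so with no split terms we still get one query per TLD.
        for combo in product(*choices):
            queries.append(" ".join([keyword, "site:" + tld, *combo]))
    return queries

for q in build_queries("widget", ["co.uk", "com"], ["commonterm"]):
    print(q)
```

With four TLDs and two split terms that is 4 x 2 x 2 = 16 queries, each of which can return its own batch of up to 1000 results.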
Other ways include:
Restrict the file format to return Word, PDF, etc.
Pay Gigablast for a commercial feed and get 10,000 at a time from their data :)
Use common first names or surnames as positive qualifiers, e.g. widget john site:co.uk
Use large town names as positive qualifiers (this will also allow you to filter out directory sites, as they will have many appearances in the geographical lists).
If you still need more, then I think that buying Google would be the next step, certainly easier than combining all of that ;)
| 3:17 pm on Nov 22, 2005 (gmt 0)|
[google.com...] is the home of the legitimate way to query Google en masse; however, it's severely limited by the number of daily queries, so you may need to scrape as suggested.
| 3:21 pm on Nov 22, 2005 (gmt 0)|
Inbound, did you find that maxresults will only return 10 even if set to 100?
| 6:35 pm on Nov 22, 2005 (gmt 0)|
Yes, the Google Search API is a pain with its throttling. I just thought I had to include it to give the legitimate route to get the results. It's not so much of an issue if you have a program set up to churn away, but it severely restricts the applications you can build for real-time queries, which is kind of daft.
| 7:22 pm on Nov 22, 2005 (gmt 0)|
Hmmm, I think you're looking at it from the wrong angle. The API key lets people run their apps without overloading Google's system. Google could remove the facility, and then where would we be?
| 7:36 pm on Nov 22, 2005 (gmt 0)|
Nah, what we mean, Engine, is that in normal Google you have a results setting like NUM=100, but in the API maxresults can only be 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1,
and no more than 10. So it's poor: even with multithreading you use up 10 requests to get 100 results back. When you only have 1000 requests a day, that soon adds up to API key query depletion.
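To put numbers on that, a quick back-of-envelope sketch using only the figures from this thread (10 results per request, 1000 requests per key, 1000 results maximum per query). The function names are just for illustration.

```python
import math

def requests_needed(results_wanted, per_request=10):
    """API requests needed to page through a given number of results."""
    return math.ceil(results_wanted / per_request)

def max_results_per_day(daily_requests=1000, per_request=10):
    """Upper bound on results one key can pull in a day."""
    return daily_requests * per_request

def full_queries_per_day(daily_requests=1000, results_per_query=1000):
    """How many queries can be paged to the full result cap per day."""
    return daily_requests // requests_needed(results_per_query)

print(requests_needed(100))      # requests for one page of 100 results
print(max_results_per_day())     # ceiling on results per key per day
print(full_queries_per_day())    # queries you can exhaust completely
```

So one key covers at most 10,000 results a day, or only 10 queries if you page each one out to the full 1000.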