
Advanced Google checking

Check ranking, number of pages indexed, and inbound links from every Google datacenter

     
8:30 pm on Jun 22, 2003 (gmt 0)

New User

10+ Year Member

joined:May 23, 2003
posts:15
votes: 0


I'm not sure if it's OK to post this or not...

Anyway, I got inspired by the discussion [webmasterworld.com...]

so I quickly wrote a PHP script (tested on PHP 4.3.1). I can say this is very handy for webmasters: it saves the time of going through loads of result pages and typing in all those "allinxxx" operators.
I call it Advanced Google Checker. It checks:

ranking position in the top 100 results
number of pages indexed
number of inbound links

from every Google datacenter (9 of them).

Keep in mind that this script is pretty slow, since it fetches and parses about 30 pages per check; performance also depends on your connection speed.

This code is for educational use only; use at your own risk.

Enjoy it!

<?php
/*
File Name: google.php
Advanced Google Checker
Author: SPEEDMAX

HOW TO USE
[yourdomain.com...]
*/
$query     = $_GET['key'];
$domain    = $_GET['url'];
$numResult = 100;

// If you want to add or remove datacenters, edit this line.
$datacenter = array("www","www2","www3","www-ex","www-fi","www-cw","www-sj","www-ab","www-zu");

echo "Google Checker : $domain <br>";
echo '<table border=1><tr><td>Data Center</td><td>Ranking</td><td>No. of pages indexed</td><td>Inbound Links</td></tr>';

foreach ($datacenter as $v) {
    // 1. Ranking: fetch the top 100 results and look for the domain.
    $url  = "http://$v.google.com/ie?q=" . urlencode($query) . "&hl=en&lr=&ie=ISO-8859-1&num=$numResult&start=0&sa=N";
    $file = implode("", file($url));
    $file = explode("<NOBR>", $file);

    $result = "n/a"; // reset per datacenter so a miss doesn't reuse the previous value
    foreach ($file as $key => $value) {
        // eregi() treats $domain as a case-insensitive pattern; fine for plain domain names.
        if (eregi($domain, $value)) {
            $result = $key; // position of the first match in the result list
            break;
        }
    }
    echo "<tr><td>" . $v . "</td><td>" . $result . "</td>";

    // 2. Number of pages indexed: site: query.
    $url2 = "http://$v.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=site%3A$domain+-wx34A&num=1";
    $file = implode("", file($url2));
    if (preg_match("/ of about \<b\>([0-9,]*)/", $file, $text)) {
        echo "<td>" . $text[1] . "</td>";
    } else {
        echo "<td>n/a</td>"; // keep the table aligned when no count is found
    }

    // 3. Inbound links: link: query.
    $url3 = "http://$v.google.com/search?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=link%3A$domain&num=1";
    $file = implode("", file($url3));
    if (preg_match("/ of about \<b\>([0-9,]*)/", $file, $text)) {
        echo "<td>" . $text[1] . "</td>";
    } else {
        echo "<td>n/a</td>";
    }

    echo "</tr>";
}
echo "</table>";
?>
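
To give an idea of how it's called: given the two $_GET parameters at the top of the script, the request looks something like this (hypothetical host, keyword, and domain):

http://www.yourserver.example/google.php?key=lyrics&url=www.example.com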

10:37 pm on June 22, 2003 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38257
votes: 115


It will also be very effective at getting your IP banned by Google for abuse. At least use a stock agent name.
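
For reference, PHP's file() does not send a browser-like User-Agent by default, which is why the agent shows up as "-" in the Apache log sample later in this thread. A minimal sketch of sending a stock agent string instead (the agent value is just an example, not anything Google requires):

<?php
// Sketch: set a stock browser User-Agent for all URL-wrapper requests
// (file(), fopen(), file_get_contents(), ...) made after this call.
ini_set('user_agent', 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');

// Subsequent fetches now carry that agent string.
$page = implode('', file('http://www.google.com/search?q=example&num=10'));
?>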
3:00 am on June 23, 2003 (gmt 0)

New User

10+ Year Member

joined:May 23, 2003
posts:15
votes: 0


I thought the same at first, but then I realised it only downloads three files from the same server. Plus, in the Apache log it will show up like this, which is very hard to track:

---------------------------------------------------
192.168.0.1 - - [23/Jun/2003:11:00:09 -0500] "GET /search?q=keyword HTTP/1.1" 200 4953 "-" "-"
---------------------------------------------------

I do believe it's okay even if you run this script on a daily basis. I mean, who cares if someone downloads three pages from your site compared to millions of requests? :P

[edited by: speedmax at 4:46 am (utc) on June 23, 2003]

4:10 am on June 23, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 9, 2003
posts:42
votes: 0


Interesting. Shouldn't be a problem if you use your API key, though?
4:52 am on June 23, 2003 (gmt 0)

New User

10+ Year Member

joined:May 23, 2003
posts:15
votes: 0


For those who are interested, this is the output of the script.
=========================================
Google Checker
URL : www.example.com
Keyword : lyrics
Data Center   Ranking   No. of pages indexed   Inbound Links
www           97        19,500                 154
www2          90        17,000                 154
www3          90        17,000                 154
www-ex        98        19,100                 154
www-fi        90        17,000                 154
www-cw        97        19,500                 154
www-sj        97        19,100                 154
www-ab        97        17,200                 154
www-zu        97        17,200                 154
6:18 am on June 23, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 19, 2002
posts:423
votes: 0


It will also be very effective at getting your IP banned by Google for abuse.

I thought this myth had been debunked once and for all by GG when he confirmed in a previous thread...

...the things you mentioned are good reasons why IP address penalties are usually not a good idea. All the comments about manual penalties apply, plus it's easy for a bad guy to switch to a different IP. I haven't seen an IP-based penalty in a long time.

There are very good reasons not to make a habit of programmed interrogation of Google servers - but this does not appear to be one of them.

11:12 am on June 23, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Mar 14, 2002
posts:207
votes: 0


> interesting. Shouldn't be prob if you use your
> API key?

You couldn't run this program via the API -- the API doesn't allow access to all the Google servers referenced.

11:14 am on June 23, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 4, 2000
posts:446
votes: 0


Hmmm... Well, in the past a company I was working at sent lots of programmed requests to Google, and the IP address it used got banned from searching Google. Oops.
4:47 pm on June 23, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 8, 2001
posts:2882
votes: 0


georgeek, that post was talking about doing an IP address ban for web sites that spam--that's a bad idea. However, we have no problem banning IP addresses that send automated queries to Google. If that happens, you'll get a 403 Forbidden error any time you want to access Google from that IP address. Sometimes, your ISP will want to talk to you, because we can drop them reports of abuse as well.

Overall, I highly recommend *not* hitting Google with automated queries.

4:58 pm on June 23, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2003
posts:76
votes: 0


There's a website that people use all the time as a 'link popularity check' - how do they get away with it and maintain a PR 7?
Also, how does using the free Google API software differ from automated queries written in PHP and other languages?
5:15 pm on June 23, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 19, 2002
posts:423
votes: 0


GoogleGuy

...we have no problem banning IP addresses that send automated queries to Google...

Thanks for the clarification - I hope you have your official hat on for this one :)

Bandwidth theft gets punished on my sites too ;)

5:47 pm on June 23, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 8, 2001
posts:2882
votes: 0


coolasafanman, with the API we can throttle back on queries if we needed to--that's why the API is there, so that people can do low-level queries on a non-commercial basis without worrying at all.

Any IP that does automatic or programmatic queries can get banned, so I would use the API as the legitimate way to access Google instead.

6:11 pm on June 23, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:May 20, 2003
posts:76
votes: 0


Sounds good to me, GG. Now all I need to do is figure out how to get it to work ;-)
6:12 pm on June 23, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 20, 2003
posts:146
votes: 0



I thought the same at first, but then I realised it only downloads three files from the same server. Plus, in the Apache log it will show up like this, which is very hard to track:
---------------------------------------------------
192.168.0.1 - - [23/Jun/2003:11:00:09 -0500] "GET /search?q=keyword HTTP/1.1" 200 4953 "-" "-"
---------------------------------------------------
I do believe it's okay even if you run this script on a daily basis. I mean, who cares if someone downloads three pages from your site compared to millions of requests? :P

Can someone else independently confirm this?

Surely, running this script a few times a day, you could never be tracked or banned by Google?

Can someone explain clearly and concisely (no wishy-washy answers) why sites like google-dance don't get banned then? Is it because they just use URLs/frames, so no automatic queries are ever sent to Google?
Surely, if some PHP in those same pages counted how many times your domain appeared, that would be OK? (Rather than manually looking at each datacenter and trying to find your domain.)
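
On that last point, counting occurrences of your domain in a page you already have is the easy part; whether fetching that results page in the first place counts as an automated query is the real question. A minimal sketch, with a hypothetical helper name:

<?php
// Hypothetical helper: count how many times a domain string appears in a
// results page that has already been fetched (e.g. inside a frame), rather
// than issuing any extra queries of your own.
function countDomainHits($html, $domain) {
    return substr_count(strtolower($html), strtolower($domain));
}

// Example: $page already holds the HTML of a results page.
// echo countDomainHits($page, 'www.example.com');
?>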

6:19 pm on June 23, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 8, 2003
posts:95
votes: 0


So all I want to know is:

how, with Google's tools, can I do a search for our domain together with a given keyword and find out where we are ranked for that keyword?

Until now I have been checking the referring URLs in my "who's on now" statistics tracking to find out what page we show up on for the various terms we get visitors from.

Thanks, anybody.

10:15 pm on June 23, 2003 (gmt 0)

New User

10+ Year Member

joined:May 23, 2003
posts:15
votes: 0


I think it would be pretty easy to add the API thingy.

But anyway, this is nice and simple.
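
For anyone who wants to try that, here is a rough sketch of what an API version might look like. It assumes the SOAP-based Google Web APIs and their doGoogleSearch call, and uses PHP 5's built-in SoapClient for brevity (under PHP 4 you would need a SOAP library such as NuSOAP); the key and query are placeholders, and the parameter order shown is as commonly documented, so check the WSDL before relying on it.

<?php
// Rough sketch only, not the script above: query the SOAP-based Google Web APIs.
$apiKey = 'insert-your-key-here'; // placeholder: your own developer key
$query  = 'lyrics';               // placeholder keyword

$client = new SoapClient('http://api.google.com/GoogleSearch.wsdl');
$result = $client->doGoogleSearch(
    $apiKey, $query,
    0,        // start
    10,       // maxResults (10 per call is the API limit)
    false,    // filter
    '',       // restrict
    false,    // safeSearch
    '',       // lr
    'latin1', // ie
    'latin1'  // oe
);

echo 'Estimated results: ' . $result->estimatedTotalResultsCount . "\n";
foreach ($result->resultElements as $i => $r) {
    echo ($i + 1) . '. ' . $r->URL . "\n";
}
?>

Note that, as pointed out further down the thread, the API only talks to www.google.com, so it can't replace the per-datacenter checks.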

10:35 pm on June 23, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member ogletree is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 14, 2003
posts:4320
votes: 42


Would Google be willing to work with a 3rd party to offer a paid service for automated searches? I would provide a server and a dedicated T1 to a Google datacenter or any other hardware needed. They could send me updates at off peak times. I would pay a licensing fee or whatever.
1:15 am on June 24, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 7, 2003
posts:54
votes: 0


Hypothetically: would Google consider an automatic search tool to be abusive if there were a 10-second delay between each query?

I would like to find out how I stand on each of my important keyword phrases. I can sit there and enter the phrases one at a time (which usually takes about 10 seconds between queries), or it would be quite simple to write a little auto-query program with a 10-second delay to do it for me. The impact on bandwidth would be exactly the same.

Any opinions on this option?
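
For what it's worth, the delay itself would be a one-line change to a loop like the one in the script above; whether Google considers that acceptable is the open question. A minimal sketch, with made-up placeholder phrases:

<?php
// Sketch only: pause ten seconds between queries. The phrases are placeholders.
$phrases = array('first keyword phrase', 'second keyword phrase', 'third keyword phrase');

foreach ($phrases as $p) {
    $url  = 'http://www.google.com/search?q=' . urlencode($p) . '&num=100';
    $page = implode('', file($url));
    // ... parse $page for your domain here ...
    sleep(10); // wait 10 seconds before the next query
}
?>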

2:31 am on June 24, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Apr 9, 2003
posts:42
votes: 0


speedmax, if you add the API code we would be grateful. I can't see anyone using it more than a few times a day, though. Thanks for the cool script.
8:30 pm on June 24, 2003 (gmt 0)

New User

10+ Year Member

joined:Feb 22, 2003
posts:9
votes: 0


I could be wrong, but I don't think the Google API will let you query datacenters other than www.
5:45 am on June 27, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Jan 5, 2003
posts:380
votes: 0


Exactly, lippsy.

The Google API is of no use if you want to check your rankings on multiple datacenters.

5:54 am on June 27, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 8, 2001
posts:2882
votes: 0


Depends, bigace. 10 seconds between queries would still allow 8640 queries a day, which starts to add up if everyone does it. :)

Personally, I don't recommend following queries down to comparing data center by data center. See my "knob" post for the reasons why. :)

6:29 am on June 27, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Jan 5, 2003
posts:380
votes: 0


GoogleGuy Said
Personally, I don't recommend following queries down to comparing data center by data center.


Well, why is it possible then? Why not just redirect www-fi.google.com to www.google.com? If Google doesn't want people to check individual datacenters, then they should do something about it.
9:17 am on June 27, 2003 (gmt 0)

New User

10+ Year Member

joined:Feb 21, 2003
posts:8
votes: 0


This is the "knob" post from GG
[webmasterworld.com...]
11:28 pm on July 1, 2003 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member ogletree is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 14, 2003
posts:4320
votes: 42


He is not saying that looking at the datacenters will hurt us somehow; he just thinks people who do it are wasting their time, that there is no value in doing so. I like to watch the seti@home screensaver or the defrag display on my computer, but it really does not help me find aliens or understand how to use my computer better so that it is less fragmented next time.