Grabbing 2 parts of a remote webpage

2:32 pm on May 25, 2013 (gmt 0)

New User

5+ Year Member

joined:May 6, 2013
posts: 32
votes: 0

I created a Google CSE but did not like that it is limited to 100 results. I have since worked out how to search multiple sites on Google directly, and now I want to "rip" the results, plus the link to the next set of results, from the search page. I have worked out which elements to rip: #rso (the results) and #pnnext (the next-page link).

So how do I do this?

11:08 pm on May 26, 2013 (gmt 0)

Senior Member


joined:Dec 13, 2009
votes: 0

You could use cURL to get the raw content, and then use a fairly simple regular expression to split out just the #pnnext and #rso elements.

Something like this should do. Please bear in mind that this was typed on the fly, and if I were sensible I would have been in bed a while ago, so I offer no guarantee that it will work out of the box.


// Get the contents of your Google search. Define $url as wherever you're querying
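$ch = curl_init();
curl_setopt_array($ch, array(
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true, // return the page as a string instead of printing it
    CURLOPT_FOLLOWLOCATION => true, // follow any redirects
));
$page_contents = curl_exec($ch); // false on failure
curl_close($ch);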

// Get the search results out of the ol#rso
preg_match('/<ol[^>]+id="rso"[^>]*>((?:<li.*?<\/li>)+)<\/ol>/ms', $page_contents, $results);
// echo $results[1]; // Should give you the content of #rso

// Get the URL from the a#pnnext
preg_match('/<a[^>]+id="pnnext"[^>]*href="([^"]+)"/', $page_contents, $next_url);
// echo $next_url[1]; // Should be the next page link
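
To check the two patterns without hitting Google, you can run them against a small hand-written fragment first. This is only a rough sketch: the function names, the sample markup and the base host are all made up for illustration, and the real page markup may well differ. It also assumes the next-page href may be relative and HTML-encoded, so it decodes &amp; and prefixes the host.

```php
<?php
// Hypothetical helpers wrapping the two patterns; the names are illustrative only.

// Return the <li> items inside <ol id="rso">, or null if the list is not found.
function extract_results($html) {
    $pattern = '/<ol[^>]+id="rso"[^>]*>((?:<li.*?<\/li>)+)<\/ol>/ms';
    return preg_match($pattern, $html, $m) ? $m[1] : null;
}

// Return the href of <a id="pnnext">, or null if there is no next-page link.
// $base is an assumed host used to complete relative links.
function extract_next_url($html, $base = 'https://www.google.com') {
    $pattern = '/<a[^>]+id="pnnext"[^>]*href="([^"]+)"/';
    if (!preg_match($pattern, $html, $m)) {
        return null;
    }
    $href = html_entity_decode($m[1]); // turn &amp; back into &
    // Prefix links that start with "/" with the assumed base host.
    return strpos($href, '/') === 0 ? $base . $href : $href;
}

// Made-up sample markup standing in for a fetched page.
$sample = '<ol id="rso"><li>First</li><li>Second</li></ol>'
        . '<a id="pnnext" class="pn" href="/search?q=test&amp;start=10">Next</a>';

$results  = extract_results($sample);
$next_url = extract_next_url($sample);
```

Both patterns are fussy: the first expects the <li> items to sit back to back with no whitespace between them, and the second expects id to come before href inside the tag. If the real markup does not match that, the helpers return null rather than a partial result.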

