What exactly is the issue? What is blocking you?
Just to clarify, Selenium is used to control the browser: it lets you interact with the browser programmatically (through bindings for languages such as Python, Java or JavaScript) and gives you access to the data it loads (HTML, XML, JSON, CSV, etc.).
First you will need to write a crawler/scraper script that tells Selenium what to do. This can be done in PHP, Python or any other scripting language that has Selenium bindings.
Once you have "collected" the data with Selenium, you need to parse it into a usable format. Typically you collect a mix of formats (HTML, JSON, etc.), strip away all the HTML, and keep only the text elements and data. Finally, once the data is parsed, you save it to your DB to be used in the future. You can certainly use PHP for the parsing, but I doubt it is well suited to that task. Python has several packages designed specifically for it (e.g. Beautiful Soup). If you're lucky, the data you collect may already be in a usable format such as JSON or CSV, but that typically isn't the case: webpages are meant to be read by people, and these formats are not.
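The parse-and-save step described above can be sketched roughly as follows. This is only an illustration, assuming Beautiful Soup (`bs4`) is installed and using SQLite from the standard library as a stand-in for your DB. The `pages` table, the `html_to_text` and `save_page` helpers, the URL and the sample HTML are all made up for the example.

```python
import sqlite3

from bs4 import BeautifulSoup


def html_to_text(html):
    """Strip the markup and keep only the visible text."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop script/style blocks: their contents are not readable text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # Join the text nodes and collapse runs of whitespace.
    return " ".join(soup.get_text(separator=" ").split())


def save_page(conn, url, text):
    """Store one parsed page (hypothetical schema: url, body)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)", (url, text)
    )
    conn.commit()


sample_html = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Prices</h1><p>Widget: <b>4.99</b></p></body></html>"
)
conn = sqlite3.connect(":memory:")  # in-memory DB, just for the demo
text = html_to_text(sample_html)
save_page(conn, "https://example.com/prices", text)
```

In a real scraper you would pass `driver.page_source` to `html_to_text` instead of the sample string, and connect to your actual database rather than `:memory:`.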
I understand that Python may be new to you, but it is a very easy language to learn: it is easily readable and far less cryptic than PHP. The time spent trying to write a script in PHP to do something it is not really intended to do is likely to be the same as the time needed to learn how to do it in Python. Here is an example of a Python script I wrote to scrape paginated content off the web. This script handles only the first part, that is, directing Selenium and collecting the data. Parsing is handled afterwards and is very specific to the content collected.
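import time

from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.common import exceptions as scx
from selenium.webdriver.chrome import service
from selenium.webdriver.common.by import By

webdriver_service = service.Service('path_to/Selenium_drivers/chromedriver_win32/chromedriver.exe')
webdriver_service.start()


def paginated_scrapper(url):
    """Scrape paginated results when the pages are the standard
    page 1,2,3...
    url: string representing the base url to crawl
    return a dict of beautiful soup objects where the key is an int
    representing the page number."""
    # Selenium 4 style; older releases took webdriver.DesiredCapabilities.CHROME here.
    driver = webdriver.Remote(command_executor=webdriver_service.service_url,
                              options=webdriver.ChromeOptions())
    driver.get(url)
    time.sleep(5)  # crude wait for the first page to load
    data = {1: bs(driver.page_source, 'html.parser')}
    for i in range(2, 1000):
        try:
            # The pagination link is located by its visible text: "2", "3", ...
            element = driver.find_element(By.LINK_TEXT, str(i))
            # Click through JavaScript; a plain element.click() can fail
            # when the link is covered by another element.
            driver.execute_script("arguments[0].click();", element)
            time.sleep(3)  # give the next page time to render
            data[i] = bs(driver.page_source, 'html.parser')
        except scx.NoSuchElementException:
            # No link to page i, so page i - 1 was the last one.
            print('Last page reached:', i - 1)
            break
        except Exception as e:
            print('error occurred on page', i, ':', e)
            break
    driver.quit()
    return data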
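
Once the function returns, the parsing step works on the dict of soup objects. What to extract depends entirely on the site, so the following is only a sketch: the `li.result` selector, the `extract_items` helper, and the hand-built `data` dict (standing in for the return value of `paginated_scrapper`) are assumptions for illustration.

```python
from bs4 import BeautifulSoup as bs


def extract_items(data, selector="li.result"):
    """Flatten {page_number: soup} into a list of (page, text) tuples."""
    rows = []
    for page in sorted(data):
        # select() takes a CSS selector; adjust it to the markup of your site.
        for node in data[page].select(selector):
            rows.append((page, node.get_text(strip=True)))
    return rows


# Stand-in for `data = paginated_scrapper(url)`, so the sketch runs without a browser.
data = {
    1: bs('<ul><li class="result">alpha</li><li class="result">beta</li></ul>', "html.parser"),
    2: bs('<ul><li class="result">gamma</li><li class="ad">skip me</li></ul>', "html.parser"),
}
rows = extract_items(data)
```

Keeping the page number next to each item makes it easy to trace a bad record back to the page it came from before you write the rows to your DB.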