
Web Scraping An "onclick" Object Table On A Website With Python

I am trying to scrape the data for this link: page. If you click the up arrow you will notice the highlighted days in the month sections. Clicking on a highlighted day, a table wit

Solution 1:

Please try the solution below.

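from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Assumes `driver` is an existing WebDriver that has already loaded the page.
driver.maximize_window()
wait = WebDriverWait(driver, 20)

# Click the arrow icon to expand the calendar.
elemnt = wait.until(EC.presence_of_element_located((By.XPATH, "//body/div[@id='wrapper']/div[@id='content']/div[@class='tenders']/div[@class='form-group']/div[1]/div[1]//i")))
elemnt.click()
# Click a day cell in the calendar (sixth row, first column).
elemnt1 = wait.until(EC.presence_of_element_located((By.XPATH, "//div[@class='form-group']//div[1]//div[3]//table[1]//tbody[1]//tr[6]//td[1]")))
elemnt1.click()
# Print the text of every table that appears after the click.
lists = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//table[@class='tenders-table cloned']")))
for element in lists:
    print(element.text)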

Solution 2:

Well, I see no reason to use Selenium for this case, as it will slow down your task.

The website uses a JavaScript event that renders its data dynamically once the page loads.

The requests library cannot render JavaScript, so you could use selenium or requests_html; there are plenty of other modules that can do that as well.

That said, we have another option: track where the data comes from. I was able to locate the XHR request that retrieves the data from the back-end API and renders it on the user's side.

You can find the XHR request by opening the browser's Developer Tools, going to the Network tab, and checking the XHR/JS requests being made, depending on the type of call (such as fetch).

import requests
import json

data = {
    'from': '2020-1-01',
    'to': '2020-3-01'
}


def main(url):
    r = requests.post(url, data=data).json()
    print(json.dumps(r, indent=4))  # to see it in a nice format
    print(r.keys())


main("http://www.ibex.bg/ajax/tenders_ajax.php")

Because I'm just a lazy coder, I would do it this way:

import requests
import re
import pandas as pd
import ast
from datetime import datetime

data = {
    'from': '2020-1-01',
    'to': '2020-3-01'
}


def main(url):
    r = requests.post(url, data=data).json()
    matches = set(re.findall(r"tender_date': '([^']*)'", str(r)))
    sort = (sorted(matches, key=lambda k: datetime.strptime(k, '%d.%m.%Y')))
    print(f"Available Dates: {sort}")
    opa = re.findall(r"({\'id.*?})", str(r))
    convert = [ast.literal_eval(x) for x in opa]
    df = pd.DataFrame(convert)
    print(df)
    df.to_csv("data.csv", index=False)


main("http://www.ibex.bg/ajax/tenders_ajax.php")

Output: view-online
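
As an alternative to running regular expressions over str(r) and parsing the matches with ast.literal_eval, the decoded JSON can be walked directly. The following is only a minimal sketch: the collect_records helper and the sample dictionary are hypothetical, since the real layout of the endpoint's response is not shown above. The only assumption carried over from the answer is that each record is a dict with a 'tender_date' key in day.month.year format.

```python
from datetime import datetime


def collect_records(node, key="tender_date"):
    """Recursively gather every dict containing `key` from nested JSON data."""
    found = []
    if isinstance(node, dict):
        if key in node:
            found.append(node)
        # Keep descending: records may sit at any depth.
        for value in node.values():
            found.extend(collect_records(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_records(item, key))
    return found


# Hypothetical sample standing in for requests.post(url, data=data).json();
# the actual structure returned by the endpoint may differ.
sample = {
    "tenders": [
        {"id": "1", "tender_date": "15.02.2020"},
        {"id": "2", "tender_date": "03.01.2020"},
    ],
    "meta": {"count": 2},
}

records = collect_records(sample)

# Unique dates, sorted chronologically rather than alphabetically.
dates = sorted(
    {rec["tender_date"] for rec in records},
    key=lambda d: datetime.strptime(d, "%d.%m.%Y"),
)
print(dates)
```

The resulting list of dicts can be passed to pd.DataFrame(records) in the same way as the convert list in the answer above.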

