I want to capture the output (data) of my web scraper (written in Sublime Text) as a string in a different program using subprocess. I then want to use that string in an if/else statement to prevent duplicates: the statement should read the file I'm importing the data into and check whether any of the dates in the string already exist in that file. I also want to treat the date as the primary key.
Example:
Output String: 03012019, 03022019, 03032019, 03042019, 03052019, 03062019
Text File: 03042019, 03052019, 03062019, 03072019, 03082019, 03092019
If the example above happens, I want the first three dates of the output string to be written to the file and the last three to be ignored because they already exist.
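The behaviour described in the example can be sketched roughly as follows. This is only a minimal illustration, assuming the dates are comma-separated strings as shown above; the function name new_dates is hypothetical and not part of the original code.

```python
def new_dates(scraped, existing):
    """Return the scraped dates that are not already in existing, keeping order."""
    seen = set(existing)
    fresh = []
    for d in scraped:
        if d not in seen:
            fresh.append(d)
            seen.add(d)  # also guards against repeats inside the scraped batch
    return fresh


# The two strings from the example above.
output_string = "03012019, 03022019, 03032019, 03042019, 03052019, 03062019"
text_file = "03042019, 03052019, 03062019, 03072019, 03082019, 03092019"

scraped = [d.strip() for d in output_string.split(",")]
existing = [d.strip() for d in text_file.split(",")]

print(new_dates(scraped, existing))  # ['03012019', '03022019', '03032019']
```

Using a set for the lookup is what makes the date behave like a primary key: a date is accepted at most once, no matter how often it appears.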
Webscraper:
import requests
from bs4 import BeautifulSoup
from datetime import datetime
response = requests.get('https://www.lotteryusa.com/michigan/powerball/')
soup = BeautifulSoup(response.text, 'html.parser')
title = soup.find(class_='game-title').get_text()
dates = soup.find_all("td", {"class": "date"})
results = soup.find_all("ul", {"class": "draw-result list-unstyled list-inline"})
print(title)
# Each date cell is paired with one list of drawn numbers.
for date, result in zip(dates, results):
    d = datetime.strptime(date.time['datetime'], '%Y-%m-%d')
    print(d.strftime("%m%d%Y") + ',' + result.get_text()[:-20].strip().replace('\n', ','))
Subprocessor:
from subprocess import check_output
p2 = check_output("Webscraper", shell=True)
data = p2.decode("utf-8")
print(data)
with open('Notepad++ Text File', 'r+') as file3:
    lines = file3.readlines()
    for line in lines:
        if '23' in line:
            # Can only put integers in ().
            print(line)
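For reference, one way the whole flow could fit together is sketched below. It is a rough sketch under several assumptions, not a definitive solution: the helper names run_scraper and append_new_rows are hypothetical, the scraper is assumed to print one title line followed by lines that start with an 8-digit MMDDYYYY date, and the text file is assumed to hold one such line per row and to end with a newline.

```python
import subprocess
import sys


def run_scraper(script_path):
    """Run the scraper script with the current interpreter and return its stdout as text."""
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,  # requires Python 3.7+
        text=True,
        check=True,
    )
    return result.stdout


def append_new_rows(scraper_output, file_path):
    """Append rows whose leading date is not yet in the file; return the rows written."""
    # The date is the first comma-separated field and acts as the key.
    try:
        with open(file_path, "r", encoding="utf-8") as f:
            known = {line.split(",", 1)[0].strip() for line in f if line.strip()}
    except FileNotFoundError:
        known = set()

    written = []
    with open(file_path, "a", encoding="utf-8") as f:
        for line in scraper_output.splitlines():
            key = line.split(",", 1)[0].strip()
            # Skip the title line and anything that does not start with an 8-digit date.
            if len(key) != 8 or not key.isdigit():
                continue
            if key in known:
                continue  # date already stored, so treat the row as a duplicate
            f.write(line.strip() + "\n")
            known.add(key)
            written.append(line.strip())
    return written
```

With hypothetical file names, a call might look like append_new_rows(run_scraper("webscraper.py"), "results.txt"). Passing the command as a list avoids shell=True, and comparing whole date strings with a set sidesteps the substring check on '23'.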