I am trying to pull out the meta description of a few webpages. Below is my code:
import requests
from bs4 import BeautifulSoup

# `headers` is assumed to be defined earlier (not shown in this snippet)
URL_List = ['https://digisapient.com', 'https://dataquest.io']
Meta_Description = []
for url in URL_List:
    response = requests.get(url, headers=headers)
    #lower_response_text = response.text.lower()
    soup = BeautifulSoup(response.text, 'lxml')
    metas = soup.find_all('meta')
    # Note: this appends one entry per <meta> tag, not one per page
    for m in metas:
        if m.get('name') == 'description':
            desc = m.get('content')
            Meta_Description.append(desc)
        else:
            desc = "Not Found"
            Meta_Description.append(desc)
This currently returns the following:
['Not Found',
'Not Found',
'Not Found',
'Not Found',
'Learn Python, R, and SQL skills. Follow career paths to become a job-qualified data scientist, analyst, or engineer with interactive data science courses!',
'Not Found',
'Not Found',
'Not Found',
'Not Found']
I want to pull the content where the meta name == 'description'. If the condition doesn't match, i.e., the page doesn't have a meta tag with name == 'description', it should return Not Found, so that I get exactly one entry per page.
Expected Output:
['Not Found',
'Learn Python, R, and SQL skills. Follow career paths to become a job-qualified data scientist, analyst, or engineer with interactive data science courses!']
Please suggest.
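One possible direction, offered as a minimal sketch rather than a definitive fix: the extra 'Not Found' entries seem to come from appending inside the loop over every meta tag, so the idea is to look up the single matching tag per page and fall back only once. The helper name `meta_description`, its `default` parameter, and the inline HTML strings (standing in for `response.text` so the example runs offline) are assumptions made for illustration; the sketch also uses the built-in `html.parser` instead of `lxml` to avoid an extra dependency.

```python
from bs4 import BeautifulSoup


def meta_description(html, default="Not Found"):
    """Return the content of <meta name="description">, or `default`."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns the first matching tag, or None when nothing matches,
    # so each page yields exactly one result.
    # attrs= is needed because `name` is find()'s own first parameter.
    tag = soup.find("meta", attrs={"name": "description"})
    if tag is None or not tag.get("content"):
        return default
    return tag["content"]


# Hypothetical pages standing in for response.text
with_desc = (
    '<html><head><meta charset="utf-8">'
    '<meta name="description" content="Learn Python"></head></html>'
)
without_desc = (
    '<html><head><meta charset="utf-8">'
    '<meta property="og:title" content="Home"></head></html>'
)

Meta_Description = [meta_description(page) for page in (without_desc, with_desc)]
print(Meta_Description)  # ['Not Found', 'Learn Python']
```

In the original loop, the same helper could presumably be called as `Meta_Description.append(meta_description(response.text))`, once per URL. The match on `name` is case-sensitive here, so a page using `name="Description"` would still come back as Not Found.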