Wednesday, May 27, 2020

Using FOR loop and IF for BeautifulSoup in Python

I am trying to pull out the meta description of a few webpages. Below is my code:

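import requests
from bs4 import BeautifulSoup

# Assumed value: the original snippet does not show how headers is defined.
headers = {'User-Agent': 'Mozilla/5.0'}

URL_List = ['https://digisapient.com', 'https://dataquest.io']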
Meta_Description = []

for url in URL_List:
    response = requests.get(url, headers=headers)
    #lower_response_text = response.text.lower()
    soup = BeautifulSoup(response.text, 'lxml')
    metas = soup.find_all('meta')
    for m in metas:
        if m.get('name') == 'description':
            desc = m.get('content')
            Meta_Description.append(desc)
        else:
            desc = "Not Found"
            Meta_Description.append(desc)

This returns the following:

['Not Found',
 'Not Found',
 'Not Found',
 'Not Found',
 'Learn Python, R, and SQL skills. Follow career paths to become a job-qualified data scientist, analyst, or engineer with interactive data science courses!',
 'Not Found',
 'Not Found',
 'Not Found',
 'Not Found']

I want to pull the content only where the meta name == 'description'. If the condition doesn't match, i.e., the page has no meta tag with name == 'description', it should return 'Not Found' once for that page.

Expected Output:

['Not Found',
 'Learn Python, R, and SQL skills. Follow career paths to become a job-qualified data scientist, analyst, or engineer with interactive data science courses!']

Please suggest how I can achieve this.
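
One possible way to get exactly one entry per URL is sketched below. The extra 'Not Found' values seem to come from the else branch running for every meta tag that is not the description, so the sketch moves the fallback outside the inner loop. The names pick_description and collect_descriptions are hypothetical helpers, not part of BeautifulSoup or requests; the 'html.parser' backend and the 10 second timeout are assumptions chosen for the example. Unlike the original loop, it also falls back to 'Not Found' when the description tag has no content.

```python
def pick_description(metas, default="Not Found"):
    """Return the content of the first meta tag named 'description'.

    `metas` can be any iterable of objects with a dict-style .get(),
    such as the Tag objects returned by soup.find_all('meta').
    """
    for m in metas:
        if m.get("name") == "description":
            # Found it: stop scanning this page's remaining tags.
            # A missing or empty content attribute falls back to the default.
            return m.get("content") or default
    # Reached only when no tag matched, so the default is used once per page.
    return default


def collect_descriptions(urls, headers=None):
    """Fetch each URL and return one description (or the default) per URL."""
    # Third-party imports are kept local so pick_description can be tried
    # on its own without requests or bs4 installed.
    import requests
    from bs4 import BeautifulSoup

    results = []
    for url in urls:
        response = requests.get(url, headers=headers, timeout=10)
        soup = BeautifulSoup(response.text, "html.parser")
        results.append(pick_description(soup.find_all("meta")))
    return results
```

Calling collect_descriptions(URL_List, headers=headers) should then give a list with the same length as URL_List. An alternative that stays closer to the original code is a for ... else loop: break after appending the description, and append 'Not Found' in the else clause of the for statement, which runs only if the loop finished without a break.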
