Thursday, 16 April 2020

My list if, else statement only returns the "if" statement

I want to extract the profile information of this care home from the URL https://www.carehome.co.uk/carehome.cfm/searchazref/10001005FITA. The information is given in the following format on the website:

Group: Excelcare Holdings

Person in charge: Denise Marks (Registered Manager)

Local Authority / Social Services: London Borough of Tower Hamlets Council (click for contact details)

etc

My get_deets function only outputs the first element of each of its lists, "tag" and "sibling". I want the entire list of tag text and the corresponding information as well.

SCRIPT

import numpy as np
import pandas as pd
from bs4 import BeautifulSoup as soup
from selenium import webdriver

driver = webdriver.Chrome(executable_path=r'C:\Users\Main\Documents\Work\Projects\chromedriver')

my_url = "https://www.carehome.co.uk/carehome.cfm/searchazref/10001005FITA"

def make_soup(url):
  driver.get(url)
  m_soup = soup(driver.page_source, features='html.parser')
  return m_soup 

main_page = make_soup(my_url)

strongs = main_page.select(".blue")

def get_deets(strongs):
    tag = []
    sibling = []
    for strong_tag in strongs:
        if strong_tag.next_sibling == '\n':
            # Value is wrapped in its own element (e.g. a link) after a newline
            tag.append(strong_tag.text)
            sibling.append(strong_tag.next_sibling.next_sibling.text)
        else:
            # Value is plain text directly after the <strong> label
            tag.append(strong_tag.text)
            sibling.append(strong_tag.next_sibling.strip())
        return tag, sibling

My current output:

get_deets(strongs)

    (['Group:'], ['Excelcare Holdings'])
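A one-element result like this is what you would expect if the `return` sits inside the body of the `for` loop: the function then exits during the first iteration, whichever branch of the `if`/`else` ran. The sketch below isolates that effect with two hypothetical helpers, `first_only` and `all_items`, which are not part of the original script and differ only in the indentation of `return`.

```python
def first_only(items):
    collected = []
    for item in items:
        collected.append(item)
        return collected  # inside the loop: exits during the first iteration


def all_items(items):
    collected = []
    for item in items:
        collected.append(item)
    return collected  # after the loop: runs once every item has been handled


labels = ["Group:", "Person in charge:", "Local Authority / Social Services:"]
first_result = first_only(labels)  # holds only the first label
all_result = all_items(labels)     # holds all three labels
```

Moving the `return` in `get_deets` out one indentation level, so it lines up with the `for`, should have the same effect as the second helper.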

Desired output:

tag

['Group:', 'Person in charge:', 'Local Authority / Social Services:']

sibling

['Excelcare Holdings', 'Denise Marks (Registered Manager)', 'London Borough of Tower Hamlets Council (click for contact details)']

Using this HTML:

<div class="profile-group-description col-xs-12 col-sm-8">

    <p><strong class="blue">Group:</strong>

        <a href="https://www.carehome.co.uk/care_search_results.cfm/searchgroup/36151505EXCA">Excelcare Holdings</a>
    </p>

    <p><strong class="blue">Person in charge:</strong>

      Denise Marks (Registered Manager)</p>

    <p><strong class="blue">Local Authority / Social Services:</strong> 
      London Borough of Tower Hamlets Council (<a href="https://www.carehome.co.uk/local-authorities/profile.cfm/id/Tower-Hamlets">click for contact details</a>)</p>

    <p>
        <strong class="blue">Type of Service:</strong>
      Care Home only (Residential Care) – Privately Owned , Registered for a maximum of 44 Service Users
    </p>

    <p>
        <strong class="blue">Registered Care Categories*:</strong> 
      Dementia • Learning Disability • Mental Health Condition • Old Age
    </p>
