Tuesday, 17 December 2019

How to sort values from lists into categories in order to access data from variables in Python?

I have a Python scraping script that gets info about some upcoming concerts. The text follows the same pattern every time, no matter how many concerts appear. The only difference is that an additional line with the ticket price is sometimes shown when tickets are still available to book, as in the example below:

LIVE 01/01/99 9PM
Iron Maiden
Madison Square Garden 
New York City
LIVE 01/01/99 9.30PM
The Doors
Staples Center
Los Angeles
LIVE 01/02/99 8.45PM
Dr Dre & Snoop Dogg
Staples Center
Los Angeles
Book a ticket now for $99,99
LIVE 01/02/99 9PM
Diana Ross
City Hall
New York City 
Book a ticket now for $79,99       etc.

I need to sort these blocks into 2 categories (4-line blocks and 5-line blocks) and then iterate through the values in my variables (bands, date, location, price, etc.).

It works perfectly fine if I have only 4-line blocks or only 5-line blocks, but when I have both kinds of blocks, as in my sample text, I don't know how to put them into their own categories. I've tried many formulations of my if statement, but none of them worked:

live_lines = []   # indices of the lines that start with 'LIVE'
block_sizes = []  # number of lines between consecutive 'LIVE' lines

with open('concerts_list.txt', 'r') as file:
    lines = file.read().split('\n')

for line_counter, line in enumerate(lines):
    if line.startswith('LIVE'):
        live_lines.append(line_counter)

# size of a block = distance from its 'LIVE' line to the next one
for position in range(len(live_lines) - 1):
    block_lines = live_lines[position + 1] - live_lines[position]
    block_sizes.append(block_lines)

print('live_lines:', live_lines)  # output = [0, 4, 8, 13]
print('block_sizes:', block_sizes)  # output = [4, 4, 5]
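
Because the sizes above are computed as gaps between consecutive LIVE lines, the last block is never measured. One possible alternative, given here only as a minimal sketch, is to collect the lines of each block directly, starting a new block at every line that begins with LIVE. The function name `split_into_blocks` and the shortened inline sample are assumptions made for illustration; they are not part of the original script.

```python
def split_into_blocks(text):
    """Group non-empty lines into blocks, each starting at a 'LIVE' line."""
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('LIVE'):
            blocks.append([line])  # a new concert block begins here
        elif blocks:
            blocks[-1].append(line)  # attach the line to the current block
    return blocks


# Shortened sample (hypothetical stand-in for the contents of concerts_list.txt)
sample = """LIVE 01/01/99 9PM
Iron Maiden
Madison Square Garden
New York City
LIVE 01/02/99 8.45PM
Dr Dre & Snoop Dogg
Staples Center
Los Angeles
Book a ticket now for $99,99"""

blocks = split_into_blocks(sample)
block_sizes = [len(block) for block in blocks]
print(block_sizes)  # [4, 5]
```

With this grouping, the size of every block, including the last one, is simply `len(block)`.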

if block_sizes == 4 for block_lines in live_lines:
    dates = [i for i in lines [0::4]] 
if block_sizes == 5 for block_lines in live_lines:
    dates = [i for i in lines [0::5]] 

This line of code for the dates variable works perfectly fine without an if statement when there are ONLY 4-line blocks, but it gets messed up and reads 1 char less when a 5-line block appears:

if block_sizes == 4 for block_lines in live_lines:
    dates = [i for i in lines [0::4]] 

This line of code for the dates variable works perfectly fine without an if statement when there are ONLY 5-line blocks, but it gets messed up and reads 1 char more when a 4-line block appears:

elif block_sizes == 5 for block_lines in live_lines:
    dates = [i for i in lines [0::5]] 
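
A fixed step such as `lines[0::4]` or `lines[0::5]` assumes that every block has the same length, so the indices drift as soon as the two sizes are mixed. A hedged sketch of one alternative is to parse each block on its own and treat the price line as optional. The helper `parse_concerts`, the dictionary keys, and the list names `without_price` and `with_price` are assumptions made for illustration. The sketch also assumes the text starts with a LIVE line and that every block has at least 4 lines.

```python
def parse_concerts(text):
    """Return one dict per concert; 'price' is None when no price line exists."""
    concerts = []
    block = []
    # A trailing sentinel 'LIVE' line flushes the final block.
    for line in text.splitlines() + ['LIVE']:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('LIVE') and block:
            # Assumption: each block holds date, band, venue, city, [price]
            date_line, band, venue, city = block[:4]
            price = block[4] if len(block) > 4 else None
            concerts.append({
                'date': date_line[len('LIVE'):].strip(),
                'band': band,
                'venue': venue,
                'city': city,
                'price': price,
            })
            block = []
        block.append(line)
    return concerts


sample = """LIVE 01/01/99 9PM
Iron Maiden
Madison Square Garden
New York City
LIVE 01/02/99 8.45PM
Dr Dre & Snoop Dogg
Staples Center
Los Angeles
Book a ticket now for $99,99
LIVE 01/02/99 9PM
Diana Ross
City Hall
New York City
Book a ticket now for $79,99"""

concerts = parse_concerts(sample)
without_price = [c for c in concerts if c['price'] is None]   # 4-line blocks
with_price = [c for c in concerts if c['price'] is not None]  # 5-line blocks

dates = [c['date'] for c in concerts]
bands = [c['band'] for c in concerts]
print(bands)  # ['Iron Maiden', 'Dr Dre & Snoop Dogg', 'Diana Ross']
```

Keeping each concert as one record means the two categories are just two filtered lists, and `dates`, `bands`, and the other fields stay aligned regardless of block length.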
