Thursday, October 19, 2017

Getting duplicate values while appending cleaned data to dictionary

I'm new to Python, so please excuse me if I'm missing anything.

I have a txt file and I need to extract some values and organize them as a dictionary. The needed format is {State: Town}, for example {'Alabama': 'Auburn', 'Alabama': 'Florence', ..., 'Wyoming': 'Laramie'}.
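One caveat about this target format: a Python dict keeps only one value per key, so a second 'Alabama' entry would overwrite the first. A minimal sketch of one workable shape, assuming a list of towns per state is acceptable. The helper name `group_by_state` and the sample `pairs` are made up for illustration, using values that appear in this post.

```python
from collections import defaultdict


def group_by_state(pairs):
    """Collect every town under its state: {state: [town, ...]}."""
    grouped = defaultdict(list)
    for state, town in pairs:
        grouped[state].append(town)
    return dict(grouped)


# Hypothetical sample pairs (not the full data set).
pairs = [
    ["Alabama", "Auburn"],
    ["Alabama", "Florence"],
    ["Alaska", "Fairbanks"],
    ["Wyoming", "Laramie"],
]
towns_by_state = group_by_state(pairs)
```

With this shape, `towns_by_state['Alabama']` holds both 'Auburn' and 'Florence' instead of only the last town seen.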

Here is my code:

with open('my.txt') as file:
    output = []
    current_state = ""
    region = ""
    for line in file:
        if (len(line.split("[edit]")) == 2):
            current_state = line.split("[edit]")[0]
        else:
            region = line.split(" (")[0]
        if (region != ""):
            output.append([current_state, region])
    return output

However, my code doesn't do what I want it to do. It seems that I'm storing the previously extracted "region" value and appending it to the next state. So there is something wrong with the logic, but I'm not sure what exactly. This is the output I get:

[['Alabama', 'Auburn'],
 ['Alabama', 'Florence'],
 ['Alabama', 'Jacksonville'],
 ['Alabama', 'Livingston'],
 ['Alabama', 'Montevallo'],
 ['Alabama', 'Troy'],
 ['Alabama', 'Tuscaloosa'],
 ['Alabama', 'Tuskegee'],
 ['Alaska', 'Tuskegee'],
 ['Alaska', 'Fairbanks'],
 ['Arizona', 'Fairbanks'],
 ['Arizona', 'Flagstaff'],
 ['Arizona', 'Tempe'],
 ['Arizona', 'Tucson'],...]

As you can see, I'm getting 'Fairbanks' twice: first it is appended to Alaska, which is correct, and then it is appended to Arizona, which is not. The same thing happens at every state boundary.

...
 ['Alaska', 'Fairbanks'],
 ['Arizona', 'Fairbanks'],
...
 ['Wisconsin', 'Whitewater'],
 ['Wyoming', 'Whitewater'],
 ['Wyoming', 'Laramie']]

Please advise.
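One possible explanation, offered as a sketch rather than a definitive answer: `region` keeps its value from the previous town line, and the `if (region != "")` check runs on every line, including the state header lines. So when an `[edit]` line updates `current_state`, the stale town is appended again under the new state. Appending only inside the town branch avoids that. The function name `parse_states`, the blank-line handling, and the sample lines below are assumptions for illustration, since the real layout of `my.txt` is not shown in the post.

```python
def parse_states(lines):
    """Return [state, town] pairs from lines shaped like the post's file."""
    output = []
    current_state = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no data
        parts = line.split("[edit]")
        if len(parts) == 2:
            # State header: remember it, but append nothing for this line.
            current_state = parts[0]
        else:
            # Town line: the town is everything before " (".
            town = line.split(" (")[0]
            output.append([current_state, town])
    return output


# Hypothetical sample; the real file's exact layout is an assumption.
sample = [
    "Alabama[edit]\n",
    "Auburn (Auburn University)\n",
    "Tuskegee (Tuskegee University)\n",
    "Alaska[edit]\n",
    "Fairbanks (University of Alaska Fairbanks)\n",
    "Arizona[edit]\n",
    "Flagstaff (Northern Arizona University)\n",
]
pairs = parse_states(sample)
```

Because an open file is also an iterable of lines, the same function could be called as `parse_states(file)` inside `with open('my.txt') as file:`.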
