I'm writing a Python program that removes duplicate words from a file. A word is defined as any sequence of characters without spaces, and duplicates are matched regardless of case, so duplicate, Duplicate, DUPLICATE and dUplIcaTe all count as the same word.

The way it works is that I read in the original file and store it as a list of strings. I then create a new empty list and populate it one string at a time, checking whether the current string already exists in the new list. I run into problems when I try to implement the case handling, which checks the new list for each case variant of the current word (upper case, lower case, title case). I've tried rewriting the if statement as:
if elem and capital and title and lower not in uniqueList:
    uniqueList.append(elem)
I've also tried writing it with or operators:
if elem or capital or title or lower not in uniqueList:
    uniqueList.append(elem)
However, I still get duplicates. The only way the program works properly is if I write the code like so:
def remove_duplicates(self):
    """
    self.words is a class variable, which stores the original text as a list of strings
    """
    uniqueList = []
    for elem in self.words:
        capital = elem.upper()
        lower = elem.lower()
        title = elem.title()
        if elem == '\n':
            uniqueList.append(elem)
        else:
            if elem not in uniqueList:
                if capital not in uniqueList:
                    if title not in uniqueList:
                        if lower not in uniqueList:
                            uniqueList.append(elem)
    self.words = uniqueList
Is there any way I can write these nested if statements more elegantly?
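For reference, one possible direction, offered as a minimal sketch rather than a definitive answer: the two one-line attempts probably fail because `not in` binds more tightly than `and`/`or`, so only `lower` is actually tested against the list and the other names are just checked for truthiness. The nested version can also miss mixed-case words (for example, `dUplIcaTe` stored first, then `duplicate`). Comparing a single normalised form of each word avoids both issues. The standalone function signature and the `seen` set below are assumptions made for illustration; they are not part of the original class.

```python
def remove_duplicates(words):
    """Return words without case-insensitive duplicates, keeping first occurrences.

    Newline entries are always kept, mirroring the original method.
    """
    seen = set()       # hypothetical helper: lowercased forms of words kept so far
    unique_list = []
    for elem in words:
        if elem == '\n':
            # Line breaks are not treated as words, so they are never dropped.
            unique_list.append(elem)
            continue
        key = elem.lower()  # one normalised form replaces the four separate checks
        if key not in seen:
            seen.add(key)
            unique_list.append(elem)
    return unique_list


words = ['duplicate', 'Duplicate', '\n', 'DUPLICATE', 'dUplIcaTe', 'word', '\n']
print(remove_duplicates(words))  # ['duplicate', '\n', 'word', '\n']
```

Inside the class, the same loop could end with `self.words = unique_list` instead of returning. Using a set for the lookups also avoids scanning the whole list on every word.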