I am trying to use `if` statements in several places to append suffix and prefix features for a token. Python seems to pick up only the suffixes, not the prefixes. How can I make it clear to Python that these conditional statements are used for different feature extractions?
```
# Suffix up to length 5
if len(token) > 1:
    feature_list.append("SUF_" + token[-1:])
if len(token) > 2:
    feature_list.append("SUF_" + token[-2:])
if len(token) > 3:
    feature_list.append("SUF_" + token[-3:])
if len(token) > 4:
    feature_list.append("SUF_" + token[-4:])
if len(token) > 5:
    feature_list.append("SUF_" + token[-5:])
# Prefix up to length 5
if len(token) < 1:
    feature_list.append("PRE_1" + token[0])
if len(token) < 2:
    feature_list.append("PRE_2" + token[:1])
if len(token) < 3:
    feature_list.append("PRE_3" + token[:2])
if len(token) < 4:
    feature_list.append("PRE_4" + token[:3])
if len(token) < 5:
    feature_list.append("PRE_5" + token[:4])
```
Below is the current output:
```
['SUF_P', 'SUF_BP', 'SUF_VBP', 'SUF_BP', 'SUF_VBP', 'WORD_steve', 'POS_PRPVBP']
['SUF_N', 'SUF_BN', 'SUF_VBN', 'SUF_BP', 'SUF_VBP', 'WORD_mcqueen', 'POS_VBN']
['SUF_N', 'SUF_BN', 'SUF_VBN', 'SUF_BP', 'SUF_VBP', 'WORD_provided', 'POS_VBN']
['SUF_T', 'SUF_DT', 'SUF__DT', 'SUF_BP', 'SUF_VBP', 'WORD_a', 'POS_DT']
['SUF_N', 'SUF_NN', 'SUF__NN', 'SUF_BP', 'SUF_VBP', 'WORD_thrilling', 'POS_NN']
['SUF_N', 'SUF_NN', 'SUF__NN', 'SUF_BP', 'SUF_VBP', 'WORD_motorcycle', 'POS_NN']
```
The desired output would also include the prefix features, so that each line would look like the following:
['SUF_P', 'SUF_BP', 'SUF_VBP', 'SUF_BP', 'SUF_VBP', 'PRE_1P', 'PRE_2BP', 'PRE_3VBP', 'PRE_4BP', 'PRE_5VBP', 'WORD_steve', 'POS_PRPVBP']
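One likely cause, judging only from the snippet above: the prefix checks use `<` where the suffix checks use `>`, so a prefix is appended only when the token is shorter than the threshold, and `len(token) < 1` can only be true for an empty string, where `token[0]` would raise an `IndexError`. The slices also look one character short (`token[:1]` under the `PRE_2` label). Below is a minimal sketch of how both blocks might be written with the same comparison. The helper name `affix_features` and the `max_len` parameter are made up for illustration, and the `PRE_<n>` label format is assumed from the question's code.

```python
def affix_features(token, max_len=5):
    """Return suffix and prefix features for one token (hypothetical helper)."""
    features = []
    # Suffixes of length 1..max_len, using the same `len(token) > n`
    # test as the suffix block in the question.
    for n in range(1, max_len + 1):
        if len(token) > n:
            features.append("SUF_" + token[-n:])
    # Prefixes of length 1..max_len: same `>` comparison (assumed to be
    # the intent), and token[:n] takes exactly n leading characters.
    for n in range(1, max_len + 1):
        if len(token) > n:
            features.append(f"PRE_{n}" + token[:n])
    return features


print(affix_features("steve"))
# ['SUF_e', 'SUF_ve', 'SUF_eve', 'SUF_teve', 'PRE_1s', 'PRE_2st', 'PRE_3ste', 'PRE_4stev']
```

With the strict `>` test, an affix is only emitted when it is shorter than the token itself. If the whole token should also count as its own prefix or suffix, `>=` would be the comparison to use instead.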