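So I'm trying to tag a bunch of words in a list (POS tagging, to be exact) like this:
pos = [nltk.pos_tag(i,tagset='universal') for i in lw]
where lw is a list of words (it's really long, otherwise I would post it, but it looks like [['hello'],['world']], i.e. a list of lists, each containing one word). But when I try to run it, I get:
Traceback (most recent call last):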
  File "", line 1, in
    pos = [nltk.pos_tag(i,tagset='universal') for i in lw]
  File "", line 1, in
    pos = [nltk.pos_tag(i,tagset='universal') for i in lw]
  File "C:\Users\my system\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\tag\__init__.py", line 134, in pos_tag
    return _pos_tag(tokens, tagset, tagger)
  File "C:\Users\my system\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\tag\__init__.py", line 102, in _pos_tag
    tagged_tokens = tagger.tag(tokens)
  File "C:\Users\my system\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\tag\perceptron.py", line 152, in tag
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "C:\Users\my system\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\tag\perceptron.py", line 152, in
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "C:\Users\my system\AppData\Local\Programs\Python\Python35\lib\site-packages\nltk\tag\perceptron.py", line 240, in normalize
    elif word[0].isdigit():
IndexError: string index out of range
Can anyone tell me why I'm getting this error and how to fix it? Thanks a lot.
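The 'IndexError: string index out of range' is raised while POS-tagging the word lists with NLTK because at least one of the lists passed to pos_tag contains an empty string. As the last frame of the traceback shows, the tagger's normalize step evaluates word[0], and indexing an empty string fails. To fix it, make sure the input word lists contain no empty strings or otherwise invalid data before tagging.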
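One way to do that cleanup, as a rough sketch rather than the only fix: filter each inner list before it reaches pos_tag. The helper names clean_token_lists and tag_all are made up for this example, and dropping whitespace-only tokens and inner lists that end up empty is an assumption about what counts as "invalid data" here.

```python
def clean_token_lists(token_lists):
    """Drop empty or whitespace-only tokens, then drop lists left empty.

    Hypothetical helper for illustration; not part of NLTK.
    """
    cleaned = (
        [tok for tok in tokens if tok and tok.strip()]
        for tokens in token_lists
    )
    return [tokens for tokens in cleaned if tokens]


def tag_all(token_lists):
    """POS-tag every cleaned token list with the universal tagset."""
    # Imported here so the cleaning helper can be used without NLTK installed.
    import nltk

    return [
        nltk.pos_tag(tokens, tagset='universal')
        for tokens in clean_token_lists(token_lists)
    ]


# Small stand-in for the real lw, with the kind of entries that trigger the error.
lw = [['hello'], [''], ['world'], []]
print(clean_token_lists(lw))
```

Calling tag_all(lw) in place of the original list comprehension should then avoid the IndexError, assuming the tagger and universal tagset data are already downloaded. Note that the result no longer lines up index-for-index with lw, since the bad entries are removed; if you need to keep positions, skip tagging for those entries instead of dropping them.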