Why does a TypeError occur when importing a text file line by line for sentiment analysis instead of using hard-coded sentences?

Question


Anonymous · February 7, 2023


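I am trying to analyze the sentiment of each sentence in a text file, line by line. With the hard-coded sentences from the first linked question, the code works. With text-file input, I get a TypeError.

This is related to the question asked here. The code for reading the text file line by line comes from this question:

The first snippet works, but the version that reads the text file ("I love you. I hate him. You are nice. He is dumb") does not. Here is the code: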

from pycorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('http://localhost:9000')
results = []    
with open("c:/nlp/test.txt","r") as f:
    for line in f.read().split('\n'):
        print("Line:" + line)
        res = nlp.annotate(line,
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })
        results.append(res)      
for res in results:             
    s = res["sentences"]         
    print("%d: '%s': %s %s" % (
        s["index"], 
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))

I get this error:

line 21,
s["index"],
TypeError: list indices must be integers or slices, not str


2 Answers

Anonymous · Answer 1 · 2023-04-25T01:06:28+00:00

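The problem is that the code sets s to a list when it should be a dict: res["sentences"] is a list of per-sentence dicts, so s["index"] fails. The fix is to move the printing into the same loop that reads and analyzes the text file line by line, iterate over res["sentences"], and print each result right there.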
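To see the failure without a running CoreNLP server, here is a minimal sketch. The dictionary is a hand-written stand-in for a parsed response, not real server output, and it contains only the fields used in this thread.

```python
# Hypothetical stand-in for a parsed CoreNLP JSON response; only the
# fields used in this thread are included.
res = {
    "sentences": [
        {
            "index": 0,
            "tokens": [{"word": "I"}, {"word": "love"},
                       {"word": "you"}, {"word": "."}],
            "sentimentValue": "3",
            "sentiment": "Positive",
        }
    ]
}

s = res["sentences"]      # a list of sentence dicts, not a dict
print(type(s).__name__)

error_message = None
try:
    s["index"]            # the failing access from the question
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

first = s[0]              # index the list first, then use string keys
print(first["index"], first["sentiment"])
```

Indexing the list with an integer, or looping over it, yields the individual sentence dicts that the print statement expects.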

The corrected code:

from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')
with open("c:/nlp/test.txt", "r") as f:
    for line in f.read().split('\n'):
        # Annotate one line of the file at a time
        res = nlp.annotate(line,
                           properties={
                               'annotators': 'sentiment',
                               'outputFormat': 'json',
                               'timeout': 15000,
                           })
        # res["sentences"] is a list, so loop over it to get each dict
        for s in res["sentences"]:
            print("%d: '%s': %s %s" % (
                s["index"],
                " ".join([t["word"] for t in s["tokens"]]),
                s["sentimentValue"], s["sentiment"]))

Output:

0: 'I love you .': 3 Positive
0: 'I hate him .': 1 Negative
0: 'You are nice .': 3 Positive
0: 'He is dumb .': 1 Negative

The output matches what was expected, with no error messages.

Anonymous · Answer 2 · 2023-07-11T00:11:03+00:00

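The TypeError that appears when the text file is imported line by line is caused by a type mismatch. In the original example the variable "s" is a dict, whereas in the text-file code "s" is a list, which triggers the error. The fix is to keep "s" a dict. This can be done by looping over the list instead of assigning it: replace "s = res['sentences']" with a "for s in ..." loop so that each "s" is a single sentence dict.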

An example implementation, with placeholder values in place of a real API response:

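results = []
with open("tester.txt", "r") as f:
    for line in f.read().split('\n'):
        print("Line:" + line)
        # Placeholder values stand in for a real sentiment response
        sentences = [
            {
                "index": 1,
                "word": line,
                "sentimentValue": "sentVal",
                "sentiment": "senti"
            }
        ]
        results.append(sentences)  # append inside the loop, once per line
for res in results:
    for s in res:  # res is a list of dicts, so iterate over it directly
        print("%d: '%s': %s %s" % (
            s["index"],
            " ".join(s["word"]),  # joins the characters of the line
            s["sentimentValue"], s["sentiment"]))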

With these changes the code runs, and the final loop prints:

1: 'I   l o v e   y o u .': sentVal senti
1: 'I   h a t e   h i m .': sentVal senti
1: 'Y o u   a r e   n i c e .': sentVal senti
1: 'H e   i s   d u m b': sentVal senti

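Note that the Stanford API documentation says you can simply pass the entire text to the API, without splitting or formatting it first. Also watch for the case where no result is returned.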
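Both points can be sketched with a small helper. The function name sentence_rows and the mock response are hypothetical, and the guard assumes that a failed or timed-out request can leave the result as something other than a dict (for example a plain error string); check this against your own setup.

```python
def sentence_rows(res):
    """Return (index, text, sentimentValue, sentiment) tuples from a response."""
    # Assumption: a failed request may yield a non-dict result (such as an
    # error string), so return no rows instead of raising a TypeError.
    if not isinstance(res, dict):
        return []
    rows = []
    for s in res.get("sentences", []):
        text = " ".join(t["word"] for t in s["tokens"])
        rows.append((s["index"], text, s["sentimentValue"], s["sentiment"]))
    return rows


# Hypothetical response for a whole text sent in a single request:
# one entry per sentence, each with its own index.
mock = {
    "sentences": [
        {"index": 0, "sentimentValue": "3", "sentiment": "Positive",
         "tokens": [{"word": "I"}, {"word": "love"},
                    {"word": "you"}, {"word": "."}]},
        {"index": 1, "sentimentValue": "1", "sentiment": "Negative",
         "tokens": [{"word": "I"}, {"word": "hate"},
                    {"word": "him"}, {"word": "."}]},
    ]
}

for index, text, value, label in sentence_rows(mock):
    print("%d: '%s': %s %s" % (index, text, value, label))

# A non-dict result produces no rows rather than an exception.
print(sentence_rows("request timed out"))
```

With a live server, the argument would be the value returned by nlp.annotate for the full file contents, using the same properties as in the answers above.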

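In summary, the TypeError when importing a text file for sentiment analysis is caused by a type mismatch: the variable holds a list where the code expects a dict. The fix is to keep the variable the correct type by iterating over the list of sentences.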