NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH environment variable

Question

I am working on a project that requires me to tag tokens using NLTK and Python, so I wanted to use the Stanford POS tagger, but I ran into a few problems. I went through a lot of previously asked questions and other forums, but I was still unable to find a solution. The problem occurs when I try to execute the following:

from nltk.tag import StanfordPOSTagger
st = StanfordPOSTagger('english-bidirectional-distsim.tagger')

I get the following:

    Traceback (most recent call last):
      File "<pyshell#13>", line 1, in <module>
        st = StanfordPOSTagger('english-bidirectional-distsim.tagger')
      File "C:\Users\MY3\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk-3.1-py3.5.egg\nltk\tag\stanford.py", line 131, in __init__
        super(StanfordPOSTagger, self).__init__(*args, **kwargs)
      File "C:\Users\MY3\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk-3.1-py3.5.egg\nltk\tag\stanford.py", line 53, in __init__
        verbose=verbose)
      File "C:\Users\MY3\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk-3.1-py3.5.egg\nltk\internals.py", line 652, in find_jar
        searchpath, url, verbose, is_regex))
      File "C:\Users\MY3\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk-3.1-py3.5.egg\nltk\internals.py", line 647, in find_jar_iter
        raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
    LookupError:

    ===========================================================================
      NLTK was unable to find stanford-postagger.jar! Set the CLASSPATH
      environment variable.

    ===========================================================================

I have already set CLASSPATH to C:\Users\MY3\Desktop\nltk\stanford\stanford-postagger.jar. I also tried setting it to C:\Users\MY3\Desktop\nltk\stanford.

STANFORD_MODELS is set to C:\Users\MY3\Desktop\nltk\stanford\models\

I tried doing this as well, in vain:

    File "C:\Python27\lib\site-packages\nltk\tag\stanford.py", line 45, in __init__
        env_vars=('STANFORD_MODELS',), verbose=verbose)

It doesn't solve the problem either. Please help me solve this issue.

I use Windows 8, Python 3.5 and NLTK 3.1.
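When a variable such as CLASSPATH appears to be set but the lookup still fails, it can help to check what the running Python process actually sees, since a shell or IDE started before the variable was changed keeps its old environment. The following is a minimal diagnostic sketch using only the standard library. The helper name `find_jar_on_classpath` is hypothetical, and the check is a rough approximation for debugging, not NLTK's own lookup logic.

```python
import os


def find_jar_on_classpath(jar_name, classpath=None):
    """Return CLASSPATH entries that are jar_name itself or a directory holding it."""
    if classpath is None:
        classpath = os.environ.get("CLASSPATH", "")
    hits = []
    # os.pathsep is ';' on Windows and ':' on POSIX systems.
    for entry in classpath.split(os.pathsep):
        entry = entry.strip().strip('"')
        if not entry:
            continue
        if os.path.isfile(entry) and os.path.basename(entry) == jar_name:
            # The entry points directly at the jar file.
            hits.append(entry)
        elif os.path.isfile(os.path.join(entry, jar_name)):
            # The entry is a directory that contains the jar file.
            hits.append(os.path.join(entry, jar_name))
    return hits


if __name__ == "__main__":
    # Show the values as seen from inside this Python process.
    print("CLASSPATH =", os.environ.get("CLASSPATH"))
    print("STANFORD_MODELS =", os.environ.get("STANFORD_MODELS"))
    print("jar found at:", find_jar_on_classpath("stanford-postagger.jar"))
```

If the printed CLASSPATH is `None` or the list of hits is empty, the variable is not reaching the interpreter in the form expected, and restarting the shell or IDE after setting it is worth trying first.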

Recommended answer

Update

The original answer was written for Stanford POS Tagger version 3.6.0 (dated 2015-12-09).

There is a new version (3.7.0, released 2016-10-31). Here's the code for the newer version:

from nltk.tag import StanfordPOSTagger
from nltk import word_tokenize

# Add the jar and model via their path (instead of setting environment variables):
jar = 'your_path/stanford-postagger-full-2016-10-31/stanford-postagger.jar'
model = 'your_path/stanford-postagger-full-2016-10-31/models/english-left3words-distsim.tagger'

pos_tagger = StanfordPOSTagger(model, jar, encoding='utf8')

text = pos_tagger.tag(word_tokenize("What's the airspeed of an unladen swallow ?"))
print(text)
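Because the approach above depends entirely on the two paths being correct, it may be worth verifying them before constructing the tagger, so that a typo produces a clear message instead of a lookup error. This is a sketch under assumptions: the helper names `resolve_tagger_paths` and `make_tagger` are hypothetical, and the directory layout (jar at the top level, models in a `models` subfolder) is taken from the paths used in the answer.

```python
import os


def resolve_tagger_paths(base_dir, model_name="english-left3words-distsim.tagger"):
    """Return (model_path, jar_path) under base_dir, or raise FileNotFoundError."""
    jar_path = os.path.join(base_dir, "stanford-postagger.jar")
    model_path = os.path.join(base_dir, "models", model_name)
    for label, path in (("jar", jar_path), ("model", model_path)):
        if not os.path.isfile(path):
            raise FileNotFoundError("Stanford {} not found: {}".format(label, path))
    return model_path, jar_path


def make_tagger(base_dir):
    """Build the tagger; requires NLTK to be installed and Java to be available."""
    # Imported here so the path check above can be used without NLTK.
    from nltk.tag import StanfordPOSTagger
    model_path, jar_path = resolve_tagger_paths(base_dir)
    return StanfordPOSTagger(model_path, jar_path, encoding="utf8")
```

A call such as `make_tagger('your_path/stanford-postagger-full-2016-10-31')` would then either return a tagger or name the file that is missing.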


Original answer

I had the same problem (but using OS X and PyCharm), finally got it to work. Here's what I've pieced together from the StanfordPOSTagger Documentation and alvas' work on the issue (big thanks!):

from nltk.internals import find_jars_within_path
from nltk.tag import StanfordPOSTagger
from nltk import word_tokenize

# Alternatively to setting the CLASSPATH add the jar and model via their path:
jar = '/Users/nischi/PycharmProjects/stanford-postagger-full-2015-12-09/stanford-postagger.jar'
model = '/Users/nischi/PycharmProjects/stanford-postagger-full-2015-12-09/models/english-left3words-distsim.tagger'

pos_tagger = StanfordPOSTagger(model, jar)

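# Add the other jars from the Stanford directory.
# Note: '/' and ':' below are POSIX separators; on Windows the classpath separator is ';'.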
stanford_dir = pos_tagger._stanford_jar.rpartition('/')[0]
stanford_jars = find_jars_within_path(stanford_dir)
pos_tagger._stanford_jar = ':'.join(stanford_jars)

text = pos_tagger.tag(word_tokenize("What's the airspeed of an unladen swallow ?"))
print(text)
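The snippet above hardcodes `'/'` and `':'`, which suits OS X but not the Windows setup described in the question. A platform-independent way to build the same kind of classpath string could look like the sketch below. The helper `jars_classpath` is hypothetical and is only a standard-library stand-in for the jar collection step; it does not call NLTK's `find_jars_within_path`.

```python
import fnmatch
import os


def jars_classpath(stanford_dir):
    """Join every .jar found under stanford_dir into one classpath string."""
    jars = []
    for root, _dirs, files in os.walk(stanford_dir):
        for name in sorted(fnmatch.filter(files, "*.jar")):
            jars.append(os.path.join(root, name))
    # os.pathsep is ';' on Windows and ':' elsewhere, which is what Java expects.
    return os.pathsep.join(jars)
```

Under the same assumptions as the answer, the result could be assigned with `pos_tagger._stanford_jar = jars_classpath(os.path.dirname(jar))`. Note that `_stanford_jar` is a private attribute, so this may stop working in other NLTK versions.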

Hope this helps.
