In a Scrapy bot, I can't increment a global variable (but I can assign to the same variable). Why?

Question

I know that using global variables is not a good idea, and I plan to do something different. But while playing around, I ran into a strange global variable issue within Scrapy. In pure Python, I don't see this problem.

When I run this bot code:

import scrapy

from tutorial.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["lib-web.org"]
    start_urls = [
        "http://www.lib-web.org/united-states/public-libraries/michigan/"
    ]

    count = 0

    def parse(self, response):
        for sel in response.xpath('//div/div/div/ul/li'):
            item = DmozItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('p/text()').extract()

            global count;
            count += 1
            print count

            yield item

DmozItem:

import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()

I get this error:

  File "/Users/Admin/scpy_projs/tutorial/tutorial/spiders/dmoz_spider.py", line 22, in parse
    count += 1
NameError: global name 'count' is not defined

But if I simply change 'count += 1' to just 'count = 1', it runs fine.

What's going on here? Why can I not increment the variable?

Again, if I run similar code outside of a Scrapy context, in pure Python, it runs fine. Here's the code:

count = 0

def doIt():
    global count
    for i in range(0, 10):
        count += 1

doIt()
doIt()
print count

Result:

Admin$ python count_test.py
20

Recommended answer

count is a class variable (a class attribute) in your example, not a global, so you should access it as self.count. That fixes the error, but what you probably want is an instance variable, because a class variable is shared between all the instances of the class.
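
To make the difference concrete, here is a minimal sketch in plain Python, with no Scrapy required. CountingSpider, total and the row lists are hypothetical names invented for this illustration; they are not part of the question's project.

```python
class CountingSpider(object):
    # Hypothetical stand-in for DmozSpider, used only for illustration.
    total = 0  # class attribute: one value shared by every instance

    def __init__(self):
        self.count = 0  # instance attribute: one value per object

    def parse(self, rows):
        for row in rows:
            self.count += 1            # updates this object's own counter
            CountingSpider.total += 1  # updates the shared class attribute
            yield row


first = CountingSpider()
second = CountingSpider()

# parse() is a generator, so it has to be consumed for the counters to move.
first_items = list(first.parse(["a", "b", "c"]))
second_items = list(second.parse(["d", "e"]))

print(first.count)           # 3
print(second.count)          # 2
print(CountingSpider.total)  # 5
```

In a real scrapy.Spider subclass, an overridden __init__ should also call the parent class's __init__ so that Scrapy's own setup still runs.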

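Assigning count = 1 inside the parse method works because a plain assignment never has to read an existing value. With the global count declaration in place, it creates a new module-level variable called count (without that declaration it would create a local one); either way, that variable is different from the class variable count. count += 1, on the other hand, must first read a module-level count, and no such name exists, hence the NameError.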
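
The behaviour can be reproduced without Scrapy. The following is a small sketch; Demo, assign and increment are hypothetical names chosen for this example, and it assumes no module-level count exists when it starts.

```python
class Demo(object):
    count = 0  # class attribute: stored on Demo, not in the module's globals

    def assign(self):
        global count
        count = 1   # plain assignment: creates the module-level name

    def increment(self):
        global count
        count += 1  # augmented assignment: has to read the module-level name first


demo = Demo()

try:
    demo.increment()      # no module-level count exists yet
    outcome = "incremented"
except NameError:
    outcome = "NameError"

demo.assign()             # now a module-level count exists, equal to 1
demo.increment()          # succeeds, because there is something to read

print(outcome)     # NameError
print(count)       # 2
print(Demo.count)  # 0, the class attribute was never touched
```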

Your pure Python example works because count is defined at module level rather than inside a class body. That makes it a true global, so the global count declaration inside the function finds it.
