This article covers how to exclude directories from an os.walk traversal.

Problem description

I'm writing a script that descends into a directory tree (using os.walk()) and then visits each file matching a certain file extension. However, since some of the directory trees that my tool will be used on also contain subdirectories that in turn contain a LOT of useless (for the purpose of this script) stuff, I figured I'd add an option for the user to specify a list of directories to exclude from the traversal.

This is easy enough with os.walk(). After all, it's up to me to decide whether I actually want to visit the respective files / dirs yielded by os.walk() or just skip them. The problem is that if I have, for example, a directory tree like this:

root--
     |
     --- dirA
     |
     --- dirB
     |
     --- uselessStuff --
                       |
                       --- moreJunk
                       |
                       --- yetMoreJunk

and I want to exclude uselessStuff and all its children, os.walk() will still descend into all the (potentially thousands of) subdirectories of uselessStuff, which, needless to say, slows things down a lot. In an ideal world, I could tell os.walk() to not even bother yielding any more children of uselessStuff, but to my knowledge there is no way of doing that (is there?).

Does anyone have an idea? Maybe there's a third-party library that provides something like that?

Solution

Modifying dirs in-place will prune the (subsequent) files and directories visited by os.walk:

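import os

exclude = {'New folder', 'Windows', 'Desktop'}  # directory names to skip
for root, dirs, files in os.walk(top, topdown=True):  # top: root of the tree
    # Slice assignment edits the list in place, so os.walk skips what is removed.
    dirs[:] = [d for d in dirs if d not in exclude]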
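Applied to the scenario in the question, the pruning idiom might be wrapped up as follows. This is only a sketch: the function name find_files, the helper build_demo_tree, and the demo file names are made up for illustration, and the directory layout is rebuilt in a temporary directory so the snippet can be run as is.

```python
import os
import tempfile


def find_files(top, extension, exclude=()):
    """Yield paths under top ending in extension, skipping excluded dir names."""
    exclude = set(exclude)
    for root, dirs, files in os.walk(top, topdown=True):
        # Prune in place so os.walk never descends into excluded directories.
        dirs[:] = [d for d in dirs if d not in exclude]
        for name in files:
            if name.endswith(extension):
                yield os.path.join(root, name)


def build_demo_tree(base):
    """Create the layout from the question inside base (hypothetical files)."""
    for rel in ("dirA", "dirB",
                os.path.join("uselessStuff", "moreJunk"),
                os.path.join("uselessStuff", "yetMoreJunk")):
        os.makedirs(os.path.join(base, rel))
    for rel in (os.path.join("dirA", "a.txt"),
                os.path.join("dirB", "b.txt"),
                os.path.join("uselessStuff", "moreJunk", "junk.txt")):
        with open(os.path.join(base, rel), "w") as fh:
            fh.write("demo")


with tempfile.TemporaryDirectory() as base:
    build_demo_tree(base)
    # Only files outside uselessStuff should be reported.
    found = sorted(os.path.relpath(p, base)
                   for p in find_files(base, ".txt", exclude=["uselessStuff"]))
    print(found)
```

Note that this matches on the bare directory name, so a directory called uselessStuff is skipped at any depth. To exclude one specific location only, the filter could compare os.path.join(root, d) against a set of full paths instead.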


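From help(os.walk):

When topdown is true, the caller can modify the dirnames list in-place (e.g., via del or slice assignment), and walk will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search...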
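The "in place" part is easy to get wrong, so here is a small sketch comparing three ways of filtering dirs. The helper names (visited_dirs, prune_by_slice, prune_by_del, prune_by_rebinding) are invented for this illustration. The point is that os.walk only sees changes made to the list object it yielded; assigning a new list to the name dirs leaves that object untouched.

```python
import os
import tempfile


def visited_dirs(top, prune):
    """Return the sorted relative directories os.walk visits, applying prune to dirs."""
    seen = []
    for root, dirs, files in os.walk(top, topdown=True):
        seen.append(os.path.relpath(root, top))
        prune(dirs)
    return sorted(seen)


def prune_by_slice(dirs):
    # Replaces the contents of the same list object that os.walk holds.
    dirs[:] = [d for d in dirs if d != "uselessStuff"]


def prune_by_del(dirs):
    # Also in place: deletes the entry from the list os.walk holds.
    if "uselessStuff" in dirs:
        del dirs[dirs.index("uselessStuff")]


def prune_by_rebinding(dirs):
    # Only rebinds the local name; os.walk's list is unchanged, so nothing is pruned.
    dirs = [d for d in dirs if d != "uselessStuff"]


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "dirA"))
    os.makedirs(os.path.join(top, "uselessStuff", "moreJunk"))

    pruned = visited_dirs(top, prune_by_slice)
    also_pruned = visited_dirs(top, prune_by_del)
    not_pruned = visited_dirs(top, prune_by_rebinding)
    print(pruned, also_pruned, not_pruned)
```

The first two variants skip uselessStuff entirely, while the rebinding variant still walks into it. The same applies to writing dirs = [...] directly in the loop body. Pruning also depends on topdown=True (the default); with topdown=False the subdirectories have already been visited by the time dirs is yielded, so modifying it has no effect.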

