Every so often you will find yourself needing to write code that traverse a directory. They tend to be one-off scripts or clean up scripts that run in cron in my experience. Anyway, Python provides a very useful methods of walking a directory structure. We cover best of them.
Testing directory structure
Here is my testing filesystem tree. Root is in /test
~] tree -a /test
/test
├── A
│ ├── AA
│ │ └── aa.png
│ ├── a.png
│ └── a.txt
├── B
│ ├── BB
│ └── b.txt
├── broken_symlink -> /aaa
├── symlink -> /etc
├── .test
├── test.png
└── test.txt
python os.walk()
os.walk(top, topdown=True, onerror=None, followlinks=False)
- Generate the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at directory top (including top itself), it yields a 3-tuple
(dirpath, dirnames, filenames)
- dirpath is a string, the path to the directory. dirnames is a list of the names of the subdirectories in dirpath (excluding '
.
' and '..
'). filenames is a list of the names of the non-directory files in dirpath. Note that the names in the lists contain no path components. To get a full path (which begins with top) to a file or directory in dirpath, doos.path.join(dirpath, name)
. Whether or not the lists are sorted depends on the file system. If a file is removed from or added to the dirpath directory during generating the lists, whether a name for that file be included is unspecified. - If optional argument topdown is True or not specified, the triple for a directory is generated before the triples for any of its subdirectories (directories are generated top-down). If topdown is False, the triple for a directory is generated after the triples for all of its subdirectories (directories are generated bottom-up). No matter the value of topdown, the list of subdirectories is retrieved before the tuples for the directory and its subdirectories are generated.
- By default, walk() will not walk down into symbolic links that resolve to directories. Set followlinks to True to visit directories pointed to by symlinks, on systems that support them.