mirror of https://github.com/arsenetar/dupeguru.git synced 2026-02-01 19:11:38 +00:00

Compare commits


12 Commits

Author SHA1 Message Date
a37b5b0eeb Fix #988 2022-03-30 01:06:51 -05:00
efd500ecc1 Update directory scanning to use os.scandir()
- Change to use os.scandir() instead of os.walk() to leverage DirEntry objects.
- Avoids extra calls to stat() on files during fs.can_handle()
- See 3x speed improvement on Windows in some cases
2022-03-29 23:37:56 -05:00
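
Illustration (a minimal sketch of the general technique, not dupeGuru code; the project's actual version is in the core/directories.py diff below): os.scandir() yields DirEntry objects whose type information is cached from the directory read itself, so a recursive walk can classify entries without issuing a separate stat() per file — that is where the Windows speedup comes from.

import os

def iter_files(root):
    # DirEntry.is_dir()/is_file() reuse data fetched by the directory scan,
    # so no extra stat() call is needed per entry on most platforms.
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from iter_files(entry.path)
            elif entry.is_file(follow_symlinks=False):
                yield entry.path
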
43fcc52291 Replace pathlib.glob() with os.scandir() in fs.py 2022-03-29 22:35:38 -05:00
50f5db1543 Update fs to support DirEntry on get_file() 2022-03-29 22:32:36 -05:00
a5b0ccdd02 Improve performance of Directories.get_state() 2022-03-29 21:48:14 -05:00
143147cb8e Remove Cocoa specific and other unused code 2022-03-28 00:47:46 -05:00
ebb81d9f03 Remove pathlib function added in Python 3.9 2022-03-28 00:06:32 -05:00
da9f8b2b9d Squashed commit of the following:
commit 8b15fe9a502ebf4841c6529e7098cef03a6a5e6f
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 23:48:15 2022 -0500

    Finish up changes to copy_or_move

commit 21f6a32cf3186a400af8f30e67ad2743dc9a49bd
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Thu Mar 17 23:56:52 2022 -0500

    Migrate from hscommon.path to pathlib
    - Part one, this gets all hscommon and core tests passing
    - App appears to be able to load directories and complete scans, need further testing
    - app.py copy_or_move needs some additional work
2022-03-27 23:50:03 -05:00
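
For orientation while reading the diffs below, the recurring substitutions in this migration follow a small set of pathlib equivalences (a sketch with made-up paths, not code from the commit):

from pathlib import Path

p = Path("/tmp/example")
child = p.joinpath("sub")      # hscommon.path: p["sub"]
assert child.parent == p       # hscommon.path: child.parent() == p
assert p in child.parents      # hscommon.path: child in p
assert len(child.parts) == 4   # hscommon.path: len(child)
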
5ed5eddde6 Add polib back to requirements.txt 2022-03-27 22:35:34 -05:00
9f40e4e786 Squashed commit of the following:
commit 5eb515f666bfa1ff06c2e96bdc351a4b7456580e
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 22:19:39 2022 -0500

    Add fallback to md5 if xxhash not available

    Mainly here for the case when distributions have not packaged python3-xxhash.

commit 51b18d4c84
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sat Mar 19 15:25:46 2022 -0500

    Switch file hashing to xxhash instead of md5

    - Improves performance significantly in some cases
    - Add xxhash to requirements.txt and sort requirements
    - Rename md5 based members to digest
    - Update all tests to use new member names and hashing methods
    - Update hash db code to upgrade schema

    NOTE: May consider supporting multiple hashing algorithms in the future.
2022-03-27 22:27:13 -05:00
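
The fallback pattern this commit describes, in isolation (a minimal sketch; the actual change is visible in the core/fs.py diff further down):

try:
    import xxhash

    hasher = xxhash.xxh128  # much faster than md5 on large inputs
except ImportError:
    import hashlib

    hasher = hashlib.md5  # fallback when python3-xxhash is not packaged

def digest_file(path, chunk_size=1024 * 1024):
    # Read in 1 MiB chunks so huge files don't exhaust memory.
    file_hash = hasher()
    with open(path, "rb") as fp:
        while chunk := fp.read(chunk_size):
            file_hash.update(chunk)
    return file_hash.digest()
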
86bf9b39d0 Add update check function and call from about
- Implement an update check against the GitHub releases via the API
- Add semantic-version dependency
- Add automatic check when opening about dialog
2022-03-27 21:13:27 -05:00
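
The update-check code itself is not among the diffs shown below; a minimal sketch of the approach the message describes could look like this (the helper name and dialog wiring are illustrative assumptions, not dupeGuru's actual implementation):

import json
from urllib.request import urlopen

import semantic_version  # the new semantic-version dependency

def check_for_update(current_version):
    # GitHub's releases API reports the latest release's tag name.
    url = "https://api.github.com/repos/arsenetar/dupeguru/releases/latest"
    with urlopen(url) as resp:
        latest = json.load(resp)["tag_name"]
    if semantic_version.Version(latest) > semantic_version.Version(current_version):
        return latest  # newer release available; surface it in the about dialog
    return None
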
c0be0aecbd Minor documentation update 2022-03-27 21:04:37 -05:00
36 changed files with 652 additions and 1675 deletions

View File

@@ -1,16 +1,12 @@
 # dupeGuru
 
 [dupeGuru][dupeguru] is a cross-platform (Linux, OS X, Windows) GUI tool to find duplicate files in
-a system. It is written mostly in Python 3 and has the peculiarity of using
-[multiple GUI toolkits][cross-toolkit], all using the same core Python code. On OS X, the UI layer
-is written in Objective-C and uses Cocoa. On Linux, it is written in Python and uses Qt5.
-
-The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa
+a system. It is written mostly in Python 3 and uses [qt](https://www.qt.io/) for the UI.
 
 ## Current status
 Still looking for additional help especially with regards to:
-* OSX maintenance: reproducing bugs & cocoa version, building package with Cocoa UI.
-* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package.
+* OSX maintenance: reproducing bugs, packaging verification.
+* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package, rpm package.
 * Translations: updating missing strings, transifex project at https://www.transifex.com/voltaicideas/dupeguru-1
 * Documentation: keeping it up-to-date.

@@ -43,12 +39,10 @@ For macos instructions (qt version) see the [macOS Instructions](macos.md).
 When running in a linux based environment the following system packages or equivalents are needed to build:
 * python3-pyqt5
 * pyqt5-dev-tools (on some systems, see note)
-* python3-wheel (for hsaudiotag3k)
 * python3-venv (only if using a virtual environment)
 * python3-dev
 * build-essential
 
 Note: On some linux systems pyrcc5 is not put on the path when installing python3-pyqt5, this will cause some issues with the resource files (and icons). These systems should have a respective pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked with `which pyrcc5`. Debian based systems need the extra package, and Arch does not.
 
 To create packages the following are also needed:

View File

@@ -109,10 +109,7 @@ def build_updatepot():
     print("Building columns.pot")
     loc.generate_pot(["core"], Path("locale", "columns.pot"), ["coltr"])
     print("Building ui.pot")
-    # When we're not under OS X, we don't want to overwrite ui.pot because it contains Cocoa locs
-    # We want to merge the generated pot with the old pot in the most preserving way possible.
-    ui_packages = ["qt", Path("cocoa", "inter")]
-    loc.generate_pot(ui_packages, Path("locale", "ui.pot"), ["tr"], merge=True)
+    loc.generate_pot(["qt"], Path("locale", "ui.pot"), ["tr"], merge=True)
     print("Building qtlib.pot")
     loc.generate_pot(["qtlib"], Path("qtlib", "locale", "qtlib.pot"), ["tr"])

@@ -121,13 +118,11 @@ def build_mergepot():
     print("Updating .po files using .pot files")
     loc.merge_pots_into_pos("locale")
     loc.merge_pots_into_pos(Path("qtlib", "locale"))
-    # loc.merge_pots_into_pos(Path("cocoalib", "locale"))


 def build_normpo():
     loc.normalize_all_pos("locale")
     loc.normalize_all_pos(Path("qtlib", "locale"))
-    # loc.normalize_all_pos(Path("cocoalib", "locale"))


 def build_pe_modules():

View File

@@ -10,11 +10,11 @@ import logging
 import subprocess
 import re
 import shutil
+from pathlib import Path

 from send2trash import send2trash
 from hscommon.jobprogress import job
 from hscommon.notify import Broadcaster
-from hscommon.path import Path
 from hscommon.conflict import smart_move, smart_copy
 from hscommon.gui.progress_window import ProgressWindow
 from hscommon.util import delete_if_empty, first, escape, nonone, allsame

@@ -248,7 +248,7 @@ class DupeGuru(Broadcaster):
             ref = group.ref
             linkfunc = os.link if use_hardlinks else os.symlink
             linkfunc(str(ref.path), str_path)
-        self.clean_empty_dirs(dupe.path.parent())
+        self.clean_empty_dirs(dupe.path.parent)

     def _create_file(self, path):
         # We add fs.Folder to fileclasses in case the file we're loading contains folder paths.

@@ -415,7 +415,7 @@ class DupeGuru(Broadcaster):
     def clean_empty_dirs(self, path):
         if self.options["clean_empty_dirs"]:
             while delete_if_empty(path, [".DS_Store"]):
-                path = path.parent()
+                path = path.parent

     def clear_picture_cache(self):
         try:

@@ -428,25 +428,25 @@ class DupeGuru(Broadcaster):
     def copy_or_move(self, dupe, copy: bool, destination: str, dest_type: DestType):
         source_path = dupe.path
-        location_path = first(p for p in self.directories if dupe.path in p)
+        location_path = first(p for p in self.directories if p in dupe.path.parents)
         dest_path = Path(destination)
         if dest_type in {DestType.RELATIVE, DestType.ABSOLUTE}:
             # no filename, no windows drive letter
-            source_base = source_path.remove_drive_letter().parent()
+            source_base = source_path.relative_to(source_path.anchor).parent
             if dest_type == DestType.RELATIVE:
-                source_base = source_base[location_path:]
-            dest_path = dest_path[source_base]
+                source_base = source_base.relative_to(location_path.relative_to(location_path.anchor))
+            dest_path = dest_path.joinpath(source_base)
         if not dest_path.exists():
-            dest_path.makedirs()
+            dest_path.mkdir(parents=True)
         # Add filename to dest_path. For file move/copy, it's not required, but for folders, yes.
-        dest_path = dest_path[source_path.name]
+        dest_path = dest_path.joinpath(source_path.name)
         logging.debug("Copy/Move operation from '%s' to '%s'", source_path, dest_path)
         # Raises an EnvironmentError if there's a problem
         if copy:
             smart_copy(source_path, dest_path)
         else:
             smart_move(source_path, dest_path)
-        self.clean_empty_dirs(source_path.parent())
+        self.clean_empty_dirs(source_path.parent)

     def copy_or_move_marked(self, copy):
         """Start an async move (or copy) job on marked duplicates.

View File

@@ -7,9 +7,9 @@
 import os
 from xml.etree import ElementTree as ET
 import logging
+from pathlib import Path

 from hscommon.jobprogress import job
-from hscommon.path import Path
 from hscommon.util import FileOrPath
 from hscommon.trans import tr

@@ -63,7 +63,7 @@ class Directories:
     def __contains__(self, path):
         for p in self._dirs:
-            if path in p:
+            if path == p or p in path.parents:
                 return True
         return False

@@ -90,45 +90,45 @@ class Directories:
         return DirectoryState.EXCLUDED

     def _get_files(self, from_path, fileclasses, j):
-        for root, dirs, files in os.walk(str(from_path)):
-            j.check_if_cancelled()
-            root_path = Path(root)
-            state = self.get_state(root_path)
-            if state == DirectoryState.EXCLUDED and not any(p[: len(root_path)] == root_path for p in self.states):
-                # Recursively get files from folders with lots of subfolder is expensive. However, there
-                # might be a subfolder in this path that is not excluded. What we want to do is to skim
-                # through self.states and see if we must continue, or we can stop right here to save time
-                del dirs[:]
-            try:
-                if state != DirectoryState.EXCLUDED:
-                    # Old logic
-                    if self._exclude_list is None or not self._exclude_list.mark_count:
-                        found_files = [fs.get_file(root_path + f, fileclasses=fileclasses) for f in files]
-                    else:
-                        found_files = []
-                        # print(f"len of files: {len(files)} {files}")
-                        for f in files:
-                            if not self._exclude_list.is_excluded(root, f):
-                                found_files.append(fs.get_file(root_path + f, fileclasses=fileclasses))
-                    found_files = [f for f in found_files if f is not None]
-                    # In some cases, directories can be considered as files by dupeGuru, which is
-                    # why we have this line below. In fact, there only one case: Bundle files under
-                    # OS X... In other situations, this forloop will do nothing.
-                    for d in dirs[:]:
-                        f = fs.get_file(root_path + d, fileclasses=fileclasses)
-                        if f is not None:
-                            found_files.append(f)
-                            dirs.remove(d)
-                logging.debug(
-                    "Collected %d files in folder %s",
-                    len(found_files),
-                    str(root_path),
-                )
-                for file in found_files:
-                    file.is_ref = state == DirectoryState.REFERENCE
-                    yield file
-            except (EnvironmentError, fs.InvalidPath):
-                pass
+        try:
+            with os.scandir(from_path) as iter:
+                root_path = Path(from_path)
+                state = self.get_state(root_path)
+                # if we have no un-excluded dirs under this directory skip going deeper
+                skip_dirs = state == DirectoryState.EXCLUDED and not any(
+                    p.parts[: len(root_path.parts)] == root_path.parts for p in self.states
+                )
+                count = 0
+                for item in iter:
+                    j.check_if_cancelled()
+                    try:
+                        if item.is_dir():
+                            if skip_dirs:
+                                continue
+                            yield from self._get_files(item.path, fileclasses, j)
+                            continue
+                        elif state == DirectoryState.EXCLUDED:
+                            continue
+                        # File excluding or not
+                        if (
+                            self._exclude_list is None
+                            or not self._exclude_list.mark_count
+                            or not self._exclude_list.is_excluded(str(from_path), item.name)
+                        ):
+                            file = fs.get_file(item, fileclasses=fileclasses)
+                            if file:
+                                file.is_ref = state == DirectoryState.REFERENCE
+                                count += 1
+                                yield file
+                    except (EnvironmentError, OSError, fs.InvalidPath):
+                        pass
+                logging.debug(
+                    "Collected %d files in folder %s",
+                    count,
+                    str(root_path),
+                )
+        except OSError:
+            pass

     def _get_folders(self, from_folder, j):
         j.check_if_cancelled()

@@ -159,7 +159,7 @@ class Directories:
             raise AlreadyThereError()
         if not path.exists():
             raise InvalidPathError()
-        self._dirs = [p for p in self._dirs if p not in path]
+        self._dirs = [p for p in self._dirs if path not in p.parents]
         self._dirs.append(path)

     @staticmethod

@@ -170,7 +170,7 @@ class Directories:
         :rtype: list of Path
         """
         try:
-            subpaths = [p for p in path.listdir() if p.isdir()]
+            subpaths = [p for p in path.glob("*") if p.is_dir()]
             subpaths.sort(key=lambda x: x.name.lower())
             return subpaths
         except EnvironmentError:

@@ -220,14 +220,11 @@ class Directories:
         if state != DirectoryState.NORMAL:
             self.states[path] = state
             return state
-        prevlen = 0
-        # we loop through the states to find the longest matching prefix
-        # if the parent has a state in cache, return that state
-        for p, s in self.states.items():
-            if p.is_parent_of(path) and len(p) > prevlen:
-                prevlen = len(p)
-                state = s
+        # find the longest parent path that is in states and return that state if found
+        # NOTE: path.parents is ordered longest to shortest
+        for parent_path in path.parents:
+            if parent_path in self.states:
+                return self.states[parent_path]
         return state

     def has_any_file(self):

@@ -296,6 +293,6 @@ class Directories:
         if self.get_state(path) == state:
             return
         for iter_path in list(self.states.keys()):
-            if path.is_parent_of(iter_path):
+            if path in iter_path.parents:
                 del self.states[iter_path]
         self.states[path] = state
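
A note on the pathlib idiom the rewritten get_state() relies on (a standalone sketch of standard-library behavior, not project code): Path.parents iterates from the immediate parent up to the root, so returning on the first cache hit honors the deepest configured ancestor.

from pathlib import Path

p = Path("/data/photos/2021/img.jpg")
assert p.parent == Path("/data/photos/2021")  # property, not a method call
assert list(p.parents)[:2] == [Path("/data/photos/2021"), Path("/data/photos")]
assert Path("/data/photos") in p.parents      # replaces the old "path in p" containment
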

View File

@@ -283,7 +283,7 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
     """Returns a list of :class:`Match` within ``files`` if their contents is the same.

     :param bigsize: The size in bytes over which we consider files big enough to
-        justify taking samples of md5. If 0, compute md5 as usual.
+        justify taking samples of the file for hashing. If 0, compute digest as usual.
     :param j: A :ref:`job progress instance <jobs>`.
     """
     size2files = defaultdict(set)

@@ -300,15 +300,15 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
             if first.is_ref and second.is_ref:
                 continue  # Don't spend time comparing two ref pics together.
             if first.size == 0 and second.size == 0:
-                # skip md5 for zero length files
+                # skip hashing for zero length files
                 result.append(Match(first, second, 100))
                 continue
-            if first.md5partial == second.md5partial:
+            if first.digest_partial == second.digest_partial:
                 if bigsize > 0 and first.size > bigsize:
-                    if first.md5samples == second.md5samples:
+                    if first.digest_samples == second.digest_samples:
                         result.append(Match(first, second, 100))
                 else:
-                    if first.md5 == second.md5:
+                    if first.digest == second.digest:
                         result.append(Match(first, second, 100))
             group_count += 1
         j.add_progress(desc=PROGRESS_MESSAGE % (len(result), group_count))

View File

@@ -11,14 +11,24 @@
 # resulting needless complexity and memory usage. It's been a while since I wanted to do that fork,
 # and I'm doing it now.

-import hashlib
+import os
+
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
+
 from math import floor
 import logging
 import sqlite3
 from threading import Lock
-from typing import Any
+from typing import Any, AnyStr, Union
+from pathlib import Path

-from hscommon.path import Path
 from hscommon.util import nonone, get_file_ext

 __all__ = [

@@ -40,7 +50,7 @@ NOT_SET = object()
 # CPU.
 CHUNK_SIZE = 1024 * 1024  # 1 MiB

-# Minimum size below which partial hashes don't need to be computed
+# Minimum size below which partial hashing is not used
 MIN_FILE_SIZE = 3 * CHUNK_SIZE  # 3MiB, because we take 3 samples

@@ -83,9 +93,11 @@ class OperationError(FSError):

 class FilesDB:
+    schema_version = 1
+    schema_version_description = "Changed from md5 to xxhash if available."

-    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, md5 BLOB, md5partial BLOB)"
-    drop_table_query = "DROP TABLE files;"
+    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, digest BLOB, digest_partial BLOB, digest_samples BLOB)"
+    drop_table_query = "DROP TABLE IF EXISTS files;"
     select_query = "SELECT {key} FROM files WHERE path=:path AND size=:size and mtime_ns=:mtime_ns"
     insert_query = """
         INSERT INTO files (path, size, mtime_ns, entry_dt, {key}) VALUES (:path, :size, :mtime_ns, datetime('now'), :value)

@@ -97,24 +109,37 @@ class FilesDB:
         self.cur = None
         self.lock = None

-    def connect(self, path):
-        # type: (str, ) -> None
-
+    def connect(self, path: Union[AnyStr, os.PathLike]) -> None:
         self.conn = sqlite3.connect(path, check_same_thread=False)
         self.cur = self.conn.cursor()
-        self.cur.execute(self.create_table_query)
         self.lock = Lock()
+        self._check_upgrade()

-    def clear(self):
-        # type: () -> None
+    def _check_upgrade(self) -> None:
+        with self.lock:
+            has_schema = self.cur.execute(
+                "SELECT NAME FROM sqlite_master WHERE type='table' AND name='schema_version'"
+            ).fetchall()
+            version = None
+            if has_schema:
+                version = self.cur.execute("SELECT version FROM schema_version ORDER BY version DESC").fetchone()[0]
+            else:
+                self.cur.execute("CREATE TABLE schema_version (version int PRIMARY KEY, description TEXT)")
+            if version != self.schema_version:
+                self.cur.execute(self.drop_table_query)
+                self.cur.execute(
+                    "INSERT OR REPLACE INTO schema_version VALUES (:version, :description)",
+                    {"version": self.schema_version, "description": self.schema_version_description},
+                )
+            self.cur.execute(self.create_table_query)
+            self.conn.commit()

+    def clear(self) -> None:
         with self.lock:
             self.cur.execute(self.drop_table_query)
             self.cur.execute(self.create_table_query)

-    def get(self, path, key):
-        # type: (Path, str) -> bytes
-
+    def get(self, path: Path, key: str) -> Union[bytes, None]:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns

@@ -128,9 +153,7 @@ class FilesDB:
         return None

-    def put(self, path, key, value):
-        # type: (Path, str, Any) -> None
-
+    def put(self, path: Path, key: str, value: Any) -> None:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns

@@ -141,15 +164,11 @@ class FilesDB:
             {"path": str(path), "size": size, "mtime_ns": mtime_ns, "value": value},
         )

-    def commit(self):
-        # type: () -> None
-
+    def commit(self) -> None:
         with self.lock:
             self.conn.commit()

-    def close(self):
-        # type: () -> None
-
+    def close(self) -> None:
         with self.lock:
             self.cur.close()
             self.conn.close()

@@ -161,7 +180,7 @@ filesdb = FilesDB()  # Singleton
 class File:
     """Represents a file and holds metadata to be used for scanning."""

-    INITIAL_INFO = {"size": 0, "mtime": 0, "md5": b"", "md5partial": b"", "md5samples": b""}
+    INITIAL_INFO = {"size": 0, "mtime": 0, "digest": b"", "digest_partial": b"", "digest_samples": b""}
     # Slots for File make us save quite a bit of memory. In a memory test I've made with a lot of
     # files, I saved 35% memory usage with "unread" files (no _read_info() call) and gains become
    # even greater when we take into account read attributes (70%!). Yeah, it's worth it.

@@ -187,32 +206,51 @@ class File:
                 result = self.INITIAL_INFO[attrname]
         return result

-    def _calc_md5(self):
+    def _calc_digest(self):
         # type: () -> bytes
         with self.path.open("rb") as fp:
-            md5 = hashlib.md5()
+            file_hash = hasher()
             # The goal here is to not run out of memory on really big files. However, the chunk
             # size has to be large enough so that the python loop isn't too costly in terms of
             # CPU.
             CHUNK_SIZE = 1024 * 1024  # 1 mb
             filedata = fp.read(CHUNK_SIZE)
             while filedata:
-                md5.update(filedata)
+                file_hash.update(filedata)
                 filedata = fp.read(CHUNK_SIZE)
-            return md5.digest()
+            return file_hash.digest()

-    def _calc_md5partial(self):
+    def _calc_digest_partial(self):
         # type: () -> bytes
-        # This offset is where we should start reading the file to get a partial md5
+        # This offset is where we should start reading the file to get a partial hash
         # For audio file, it should be where audio data starts
         offset, size = (0x4000, 0x4000)
         with self.path.open("rb") as fp:
             fp.seek(offset)
-            partialdata = fp.read(size)
-        return hashlib.md5(partialdata).digest()
+            partial_data = fp.read(size)
+        return hasher(partial_data).digest()
+
+    def _calc_digest_samples(self) -> bytes:
+        size = self.size
+        with self.path.open("rb") as fp:
+            # Chunk at 25% of the file
+            fp.seek(floor(size * 25 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash = hasher(file_data)
+            # Chunk at 60% of the file
+            fp.seek(floor(size * 60 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+            # Last chunk of the file
+            fp.seek(-CHUNK_SIZE, 2)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+        return file_hash.digest()

     def _read_info(self, field):
         # print(f"_read_info({field}) for {self}")

@@ -220,48 +258,35 @@ class File:
             stats = self.path.stat()
             self.size = nonone(stats.st_size, 0)
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field == "md5partial":
+        elif field == "digest_partial":
             try:
-                self.md5partial = filesdb.get(self.path, "md5partial")
-                if self.md5partial is None:
-                    self.md5partial = self._calc_md5partial()
-                    filesdb.put(self.path, "md5partial", self.md5partial)
+                self.digest_partial = filesdb.get(self.path, "digest_partial")
+                if self.digest_partial is None:
+                    self.digest_partial = self._calc_digest_partial()
+                    filesdb.put(self.path, "digest_partial", self.digest_partial)
             except Exception as e:
-                logging.warning("Couldn't get md5partial for %s: %s", self.path, e)
-        elif field == "md5":
+                logging.warning("Couldn't get digest_partial for %s: %s", self.path, e)
+        elif field == "digest":
             try:
-                self.md5 = filesdb.get(self.path, "md5")
-                if self.md5 is None:
-                    self.md5 = self._calc_md5()
-                    filesdb.put(self.path, "md5", self.md5)
+                self.digest = filesdb.get(self.path, "digest")
+                if self.digest is None:
+                    self.digest = self._calc_digest()
+                    filesdb.put(self.path, "digest", self.digest)
             except Exception as e:
-                logging.warning("Couldn't get md5 for %s: %s", self.path, e)
-        elif field == "md5samples":
+                logging.warning("Couldn't get digest for %s: %s", self.path, e)
+        elif field == "digest_samples":
+            size = self.size
+            # Might as well hash such small files entirely.
+            if size <= MIN_FILE_SIZE:
+                setattr(self, field, self.digest)
+                return
             try:
-                with self.path.open("rb") as fp:
-                    size = self.size
-                    # Might as well hash such small files entirely.
-                    if size <= MIN_FILE_SIZE:
-                        setattr(self, field, self.md5)
-                        return
-                    # Chunk at 25% of the file
-                    fp.seek(floor(size * 25 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5 = hashlib.md5(filedata)
-                    # Chunk at 60% of the file
-                    fp.seek(floor(size * 60 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-                    # Last chunk of the file
-                    fp.seek(-CHUNK_SIZE, 2)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-                    setattr(self, field, md5.digest())
+                self.digest_samples = filesdb.get(self.path, "digest_samples")
+                if self.digest_samples is None:
+                    self.digest_samples = self._calc_digest_samples()
+                    filesdb.put(self.path, "digest_samples", self.digest_samples)
             except Exception as e:
-                logging.error(f"Error computing md5samples: {e}")
+                logging.warning(f"Couldn't get digest_samples for {self.path}: {e}")

     def _read_all_info(self, attrnames=None):
         """Cache all possible info.

@@ -277,14 +302,14 @@ class File:
     @classmethod
     def can_handle(cls, path):
         """Returns whether this file wrapper class can handle ``path``."""
-        return not path.islink() and path.isfile()
+        return not path.is_symlink() and path.is_file()

     def rename(self, newname):
         if newname == self.name:
             return
-        destpath = self.path.parent()[newname]
+        destpath = self.path.parent.joinpath(newname)
         if destpath.exists():
-            raise AlreadyExistsError(newname, self.path.parent())
+            raise AlreadyExistsError(newname, self.path.parent)
         try:
             self.path.rename(destpath)
         except EnvironmentError:

@@ -308,13 +333,13 @@ class File:
     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent


 class Folder(File):
     """A wrapper around a folder path.

-    It has the size/md5 info of a File, but its value is the sum of its subitems.
+    It has the size/digest info of a File, but its value is the sum of its subitems.
     """

     __slots__ = File.__slots__ + ("_subfolders",)

@@ -335,31 +360,31 @@ class Folder(File):
             self.size = size
             stats = self.path.stat()
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field in {"md5", "md5partial", "md5samples"}:
+        elif field in {"digest", "digest_partial", "digest_samples"}:
             # What's sensitive here is that we must make sure that subfiles'
-            # md5 are always added up in the same order, but we also want a
-            # different md5 if a file gets moved in a different subdirectory.
+            # digest are always added up in the same order, but we also want a
+            # different digest if a file gets moved in a different subdirectory.

-            def get_dir_md5_concat():
+            def get_dir_digest_concat():
                 items = self._all_items()
                 items.sort(key=lambda f: f.path)
-                md5s = [getattr(f, field) for f in items]
-                return b"".join(md5s)
+                digests = [getattr(f, field) for f in items]
+                return b"".join(digests)

-            md5 = hashlib.md5(get_dir_md5_concat())
-            digest = md5.digest()
+            digest = hasher(get_dir_digest_concat()).digest()
             setattr(self, field, digest)

     @property
     def subfolders(self):
         if self._subfolders is None:
-            subfolders = [p for p in self.path.listdir() if not p.islink() and p.isdir()]
-            self._subfolders = [self.__class__(p) for p in subfolders]
+            with os.scandir(self.path) as iter:
+                subfolders = [p.path for p in iter if not p.is_symlink() and p.is_dir()]
+            self._subfolders = [self.__class__(Path(p)) for p in subfolders]
         return self._subfolders

     @classmethod
     def can_handle(cls, path):
-        return not path.islink() and path.isdir()
+        return not path.is_symlink() and path.is_dir()


 def get_file(path, fileclasses=[File]):

@@ -372,6 +397,8 @@ def get_file(path, fileclasses=[File]):
     """
     for fileclass in fileclasses:
         if fileclass.can_handle(path):
+            if type(path) is os.DirEntry:
+                return fileclass(Path(path.path))
             return fileclass(path)

@@ -384,10 +411,11 @@ def get_files(path, fileclasses=[File]):
     assert all(issubclass(fileclass, File) for fileclass in fileclasses)
     try:
         result = []
-        for path in path.listdir():
-            file = get_file(path, fileclasses=fileclasses)
-            if file is not None:
-                result.append(file)
+        with os.scandir(path) as iter:
+            for item in iter:
+                file = get_file(item, fileclasses=fileclasses)
+                if file is not None:
+                    result.append(file)
         return result
     except EnvironmentError:
         raise InvalidPath(path)

View File

@@ -97,11 +97,6 @@ class MusicFile(fs.File):
             "dupe_count": format_dupe_count(dupe_count),
         }

-    def _get_md5partial_offset_and_size(self):
-        # No longer calculating the offset and audio size, just whole file
-        size = self.path.stat().st_size
-        return (0, size)
-
     def _read_info(self, field):
         fs.File._read_info(self, field)
         if field in TAG_FIELDS:

View File

@@ -238,7 +238,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljob):
     for ref_id, other_id, percentage in myiter:
         ref = id2picture[ref_id]
         other = id2picture[other_id]
-        if percentage == 100 and ref.md5 != other.md5:
+        if percentage == 100 and ref.digest != other.digest:
             percentage = 99
         if percentage >= threshold:
             ref.dimensions  # pre-read dimensions for display in results

View File

@@ -82,10 +82,12 @@ class FolderCategory(ValueListCategory):
     def sort_key(self, dupe, crit_value):
         value = self.extract_value(dupe)
-        if value[: len(crit_value)] == crit_value:
-            return 0
-        else:
+        # This is instead of using is_relative_to() which was added in py 3.9
+        try:
+            value.relative_to(crit_value)
+        except ValueError:
             return 1
+        return 0


 class FilenameCategory(CriterionCategory):
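
For reference, on Python 3.9+ the try/except above could be written with Path.is_relative_to(); this snippet shows the equivalence (standard-library behavior, not project code):

from pathlib import Path

value = Path("/music/album/track.mp3")
crit = Path("/music")
# is_relative_to(crit) is True exactly when relative_to(crit) does not raise ValueError.
assert value.is_relative_to(crit)
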

View File

@@ -134,7 +134,7 @@ class Scanner:
             return False
         if is_same_with_digit(refname, dupename):
             return True
-        return len(dupe.path) > len(ref.path)
+        return len(dupe.path.parts) > len(ref.path.parts)

     @staticmethod
     def get_scan_options():

@@ -164,7 +164,7 @@ class Scanner:
         toremove = set()
         last_parent_path = sortedpaths[0]
         for p in sortedpaths[1:]:
-            if p in last_parent_path:
+            if last_parent_path in p.parents:
                 toremove.add(p)
             else:
                 last_parent_path = p

View File

@@ -9,7 +9,7 @@ import os.path as op
 import logging

 import pytest
-from hscommon.path import Path
+from pathlib import Path
 import hscommon.conflict
 import hscommon.util
 from hscommon.testutil import eq_, log_calls

@@ -56,7 +56,7 @@ class TestCaseDupeGuru:
         # for this unit is pathetic. What's done is done. My approach now is to add tests for
         # every change I want to make. The blowup was caused by a missing import.
         p = Path(str(tmpdir))
-        p["foo"].open("w").close()
+        p.joinpath("foo").touch()
         monkeypatch.setattr(
             hscommon.conflict,
             "smart_copy",

@@ -71,19 +71,19 @@ class TestCaseDupeGuru:
         dgapp.copy_or_move(f, True, "some_destination", 0)
         eq_(1, len(hscommon.conflict.smart_copy.calls))
         call = hscommon.conflict.smart_copy.calls[0]
-        eq_(call["dest_path"], op.join("some_destination", "foo"))
+        eq_(call["dest_path"], Path("some_destination", "foo"))
         eq_(call["source_path"], f.path)

     def test_copy_or_move_clean_empty_dirs(self, tmpdir, monkeypatch):
         tmppath = Path(str(tmpdir))
-        sourcepath = tmppath["source"]
+        sourcepath = tmppath.joinpath("source")
         sourcepath.mkdir()
-        sourcepath["myfile"].open("w")
+        sourcepath.joinpath("myfile").touch()
         app = TestApp().app
         app.directories.add_path(tmppath)
         [myfile] = app.directories.get_files()
         monkeypatch.setattr(app, "clean_empty_dirs", log_calls(lambda path: None))
-        app.copy_or_move(myfile, False, tmppath["dest"], 0)
+        app.copy_or_move(myfile, False, tmppath.joinpath("dest"), 0)
         calls = app.clean_empty_dirs.calls
         eq_(1, len(calls))
         eq_(sourcepath, calls[0]["path"])

@@ -106,8 +106,8 @@ class TestCaseDupeGuru:
         # If the ignore_hardlink_matches option is set, don't match files hardlinking to the same
         # inode.
         tmppath = Path(str(tmpdir))
-        tmppath["myfile"].open("w").write("foo")
-        os.link(str(tmppath["myfile"]), str(tmppath["hardlink"]))
+        tmppath.joinpath("myfile").open("wt").write("foo")
+        os.link(str(tmppath.joinpath("myfile")), str(tmppath.joinpath("hardlink")))
         app = TestApp().app
         app.directories.add_path(tmppath)
         app.options["scan_type"] = ScanType.CONTENTS

@@ -153,7 +153,7 @@ class TestCaseDupeGuruCleanEmptyDirs:
         # delete_if_empty must be recursively called up in the path until it returns False
         @log_calls
         def mock_delete_if_empty(path, files_to_delete=[]):
-            return len(path) > 1
+            return len(path.parts) > 1

         monkeypatch.setattr(hscommon.util, "delete_if_empty", mock_delete_if_empty)
         # XXX This monkeypatch is temporary. will be fixed in a better monkeypatcher.

@@ -180,8 +180,8 @@ class TestCaseDupeGuruWithResults:
         self.rtable.refresh()
         tmpdir = request.getfixturevalue("tmpdir")
         tmppath = Path(str(tmpdir))
-        tmppath["foo"].mkdir()
-        tmppath["bar"].mkdir()
+        tmppath.joinpath("foo").mkdir()
+        tmppath.joinpath("bar").mkdir()
         self.app.directories.add_path(tmppath)

     def test_get_objects(self, do_setup):

@@ -424,12 +424,9 @@ class TestCaseDupeGuruRenameSelected:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        fp = open(str(p["foo bar 1"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 2"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 3"]), mode="w")
-        fp.close()
+        p.joinpath("foo bar 1").touch()
+        p.joinpath("foo bar 2").touch()
+        p.joinpath("foo bar 3").touch()
         files = fs.get_files(p)
         for f in files:
             f.is_ref = False

@@ -451,7 +448,7 @@ class TestCaseDupeGuruRenameSelected:
         g = self.groups[0]
         self.rtable.select([1])
         assert app.rename_selected("renamed")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" in names
         assert "foo bar 2" not in names
         eq_(g.dupes[0].name, "renamed")

@@ -464,7 +461,7 @@ class TestCaseDupeGuruRenameSelected:
         assert not app.rename_selected("renamed")
         msg = logging.warning.calls[0]["msg"]
         eq_("dupeGuru Warning: list index out of range", msg)
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" not in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")

@@ -477,7 +474,7 @@ class TestCaseDupeGuruRenameSelected:
         assert not app.rename_selected("foo bar 1")
         msg = logging.warning.calls[0]["msg"]
         assert msg.startswith("dupeGuru Warning: 'foo bar 1' already exists in")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "foo bar 1" in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")

@@ -488,9 +485,9 @@ class TestAppWithDirectoriesInTree:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        p["sub1"].mkdir()
-        p["sub2"].mkdir()
-        p["sub3"].mkdir()
+        p.joinpath("sub1").mkdir()
+        p.joinpath("sub2").mkdir()
+        p.joinpath("sub3").mkdir()
         app = TestApp()
         self.app = app.app
         self.dtree = app.dtree

View File

@@ -5,7 +5,7 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.testutil import TestApp as TestAppBase, CallLogger, eq_, with_app  # noqa
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.util import get_file_ext, format_size
 from hscommon.gui.column import Column
 from hscommon.jobprogress.job import nulljob, JobCancelled

@@ -86,9 +86,9 @@ class NamedObject:
             folder = "basepath"
         self._folder = Path(folder)
         self.size = size
-        self.md5partial = name
-        self.md5 = name
-        self.md5samples = name
+        self.digest_partial = name
+        self.digest = name
+        self.digest_samples = name
         if with_words:
             self.words = getwords(name)
         self.is_ref = False

@@ -111,11 +111,11 @@ class NamedObject:
     @property
     def path(self):
-        return self._folder[self.name]
+        return self._folder.joinpath(self.name)

     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent

     @property
     def extension(self):

View File

@@ -10,7 +10,7 @@ import tempfile
 import shutil

 from pytest import raises
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.testutil import eq_
 from hscommon.plat import ISWINDOWS

@@ -26,29 +26,23 @@ from ..exclude import ExcludeList, ExcludeDict
 def create_fake_fs(rootpath):
     # We have it as a separate function because other units are using it.
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
     rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
-    fp = rootpath["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
+    with rootpath.joinpath("file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("file3.test").open("wt") as fp:
+        fp.write("123")
+    with rootpath.joinpath("dir1", "file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("dir2", "file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("dir3", "file3.test").open("wt") as fp:
+        fp.write("123")
     return rootpath

@@ -60,11 +54,10 @@ def setup_module(module):
     # and another with a more complex structure.
     testpath = Path(tempfile.mkdtemp())
     module.testpath = testpath
-    rootpath = testpath["onefile"]
+    rootpath = testpath.joinpath("onefile")
     rootpath.mkdir()
-    fp = rootpath["test.txt"].open("w")
-    fp.write("test_data")
-    fp.close()
+    with rootpath.joinpath("test.txt").open("wt") as fp:
+        fp.write("test_data")
     create_fake_fs(testpath)

@@ -80,13 +73,13 @@ def test_empty():
 def test_add_path():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(1, len(d))
     assert p in d
-    assert (p["foobar"]) in d
-    assert p.parent() not in d
-    p = testpath["fs"]
+    assert (p.joinpath("foobar")) in d
+    assert p.parent not in d
+    p = testpath.joinpath("fs")
     d.add_path(p)
     eq_(2, len(d))
     assert p in d

@@ -94,18 +87,18 @@
 def test_add_path_when_path_is_already_there():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     with raises(AlreadyThereError):
         d.add_path(p)
     with raises(AlreadyThereError):
-        d.add_path(p["foobar"])
+        d.add_path(p.joinpath("foobar"))
     eq_(1, len(d))


 def test_add_path_containing_paths_already_there():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(1, len(d))
     d.add_path(testpath)
     eq_(len(d), 1)

@@ -114,7 +107,7 @@ def test_add_path_containing_paths_already_there():
 def test_add_path_non_latin(tmpdir):
     p = Path(str(tmpdir))
-    to_add = p["unicode\u201a"]
+    to_add = p.joinpath("unicode\u201a")
     os.mkdir(str(to_add))
     d = Directories()
     try:

@@ -125,25 +118,25 @@ def test_add_path_non_latin(tmpdir):
 def test_del():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     try:
         del d[1]
         assert False
     except IndexError:
         pass
-    d.add_path(testpath["fs"])
+    d.add_path(testpath.joinpath("fs"))
     del d[1]
     eq_(1, len(d))


 def test_states():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(DirectoryState.NORMAL, d.get_state(p))
     d.set_state(p, DirectoryState.REFERENCE)
     eq_(DirectoryState.REFERENCE, d.get_state(p))
-    eq_(DirectoryState.REFERENCE, d.get_state(p["dir1"]))
+    eq_(DirectoryState.REFERENCE, d.get_state(p.joinpath("dir1")))
     eq_(1, len(d.states))
     eq_(p, list(d.states.keys())[0])
     eq_(DirectoryState.REFERENCE, d.states[p])

@@ -152,7 +145,7 @@ def test_states():
 def test_get_state_with_path_not_there():
     # When the path's not there, just return DirectoryState.Normal
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(d.get_state(testpath), DirectoryState.NORMAL)

@@ -160,26 +153,26 @@ def test_states_overwritten_when_larger_directory_eat_smaller_ones():
     # ref #248
     # When setting the state of a folder, we overwrite previously set states for subfolders.
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     d.add_path(testpath)
     d.set_state(testpath, DirectoryState.REFERENCE)
     eq_(d.get_state(p), DirectoryState.REFERENCE)
-    eq_(d.get_state(p["dir1"]), DirectoryState.REFERENCE)
+    eq_(d.get_state(p.joinpath("dir1")), DirectoryState.REFERENCE)
     eq_(d.get_state(testpath), DirectoryState.REFERENCE)


 def test_get_files():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     files = list(d.get_files())
     eq_(5, len(files))
     for f in files:
-        if f.path.parent() == p["dir1"]:
+        if f.path.parent == p.joinpath("dir1"):
             assert f.is_ref
         else:
             assert not f.is_ref

@@ -193,7 +186,7 @@ def test_get_files_with_folders():
             return True

     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = list(d.get_files(fileclasses=[FakeFile]))
     # We have the 3 root files and the 3 root dirs

@@ -202,23 +195,23 @@
 def test_get_folders():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     folders = list(d.get_folders())
     eq_(len(folders), 3)
     ref = [f for f in folders if f.is_ref]
     not_ref = [f for f in folders if not f.is_ref]
     eq_(len(ref), 1)
-    eq_(ref[0].path, p["dir1"])
+    eq_(ref[0].path, p.joinpath("dir1"))
     eq_(len(not_ref), 2)
     eq_(ref[0].size, 1)


 def test_get_files_with_inherited_exclusion():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     eq_([], list(d.get_files()))

@@ -234,13 +227,13 @@ def test_save_and_load(tmpdir):
     d1.add_path(p1)
     d1.add_path(p2)
     d1.set_state(p1, DirectoryState.REFERENCE)
-    d1.set_state(p1["dir1"], DirectoryState.EXCLUDED)
+    d1.set_state(p1.joinpath("dir1"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     d1.save_to_file(tmpxml)
     d2.load_from_file(tmpxml)
     eq_(2, len(d2))
     eq_(DirectoryState.REFERENCE, d2.get_state(p1))
-    eq_(DirectoryState.EXCLUDED, d2.get_state(p1["dir1"]))
+    eq_(DirectoryState.EXCLUDED, d2.get_state(p1.joinpath("dir1")))


 def test_invalid_path():

@@ -268,7 +261,7 @@ def test_load_from_file_with_invalid_path(tmpdir):
     # This test simulates a load from file resulting in a
     # InvalidPath raise. Other directories must be loaded.
     d1 = Directories()
-    d1.add_path(testpath["onefile"])
+    d1.add_path(testpath.joinpath("onefile"))
     # Will raise InvalidPath upon loading
     p = Path(str(tmpdir.join("toremove")))
     p.mkdir()

@@ -283,11 +276,11 @@ def test_unicode_save(tmpdir):
     d = Directories()
-    p1 = Path(str(tmpdir))["hello\xe9"]
+    p1 = Path(str(tmpdir), "hello\xe9")
     p1.mkdir()
-    p1["foo\xe9"].mkdir()
+    p1.joinpath("foo\xe9").mkdir()
     d.add_path(p1)
-    d.set_state(p1["foo\xe9"], DirectoryState.EXCLUDED)
+    d.set_state(p1.joinpath("foo\xe9"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     try:
         d.save_to_file(tmpxml)

@@ -297,12 +290,12 @@ def test_get_files_refreshes_its_directories():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = d.get_files()
     eq_(6, len(list(files)))
     time.sleep(1)
-    os.remove(str(p["dir1"]["file1.test"]))
+    os.remove(str(p.joinpath("dir1", "file1.test")))
     files = d.get_files()
     eq_(5, len(list(files)))

@@ -311,15 +304,15 @@ def test_get_files_does_not_choke_on_non_existing_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
     d.add_path(p)
-    p.rmtree()
+    shutil.rmtree(str(p))
     eq_([], list(d.get_files()))


 def test_get_state_returns_excluded_by_default_for_hidden_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
-    hidden_dir_path = p[".foo"]
-    p[".foo"].mkdir()
+    hidden_dir_path = p.joinpath(".foo")
+    p.joinpath(".foo").mkdir()
     d.add_path(p)
     eq_(d.get_state(hidden_dir_path), DirectoryState.EXCLUDED)
     # But it can be overriden

@@ -331,22 +324,22 @@ def test_default_path_state_override(tmpdir):
     # It's possible for a subclass to override the default state of a path
     class MyDirectories(Directories):
         def _default_state_for_path(self, path):
-            if "foobar" in path:
+            if "foobar" in path.parts:
                 return DirectoryState.EXCLUDED

     d = MyDirectories()
     p1 = Path(str(tmpdir))
-    p1["foobar"].mkdir()
-    p1["foobar/somefile"].open("w").close()
-    p1["foobaz"].mkdir()
-    p1["foobaz/somefile"].open("w").close()
+    p1.joinpath("foobar").mkdir()
+    p1.joinpath("foobar/somefile").touch()
+    p1.joinpath("foobaz").mkdir()
+    p1.joinpath("foobaz/somefile").touch()
     d.add_path(p1)
-    eq_(d.get_state(p1["foobaz"]), DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.EXCLUDED)
+    eq_(d.get_state(p1.joinpath("foobaz")), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.EXCLUDED)
     eq_(len(list(d.get_files())), 1)  # only the 'foobaz' file is there
     # However, the default state can be changed
-    d.set_state(p1["foobar"], DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.NORMAL)
+    d.set_state(p1.joinpath("foobar"), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.NORMAL)
     eq_(len(list(d.get_files())), 2)

@@ -372,42 +365,42 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         self.d._exclude_list.add(regex)
         self.d._exclude_list.mark(regex)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
         self.d.add_path(p1)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
         # By default, subdirs should be excluded too, but this can be overridden separately
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)

     def test_exclude_refined(self, tmpdir):
         regex1 = r"^\$Recycle\.Bin$"
         self.d._exclude_list.add(regex1)
         self.d._exclude_list.mark(regex1)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somefile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["somesubdirfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"]["unwanted_subdirfile.gif"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"].mkdir()
-        p1["$Recycle.Bin"]["subdar"]["somesubdarfile.jpeg"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"]["unwanted_subdarfile.png"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somefile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "somesubdirfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "subdir", "unwanted_subdirfile.gif").touch()
+        p1.joinpath("$Recycle.Bin", "subdar").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdar", "somesubdarfile.jpeg").touch()
+        p1.joinpath("$Recycle.Bin", "subdar", "unwanted_subdarfile.png").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))
# Filter should set the default state to Excluded # Filter should set the default state to Excluded
eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
# The subdir should inherit its parent state # The subdir should inherit its parent state
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
# Override a child path's state # Override a child path's state
self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
# Parent should keep its default state, and the other child too # Parent should keep its default state, and the other child too
eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
# only the 2 files directly under the Normal directory # only the 2 files directly under the Normal directory
@@ -419,8 +412,8 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
assert "somesubdirfile.png" in files assert "somesubdirfile.png" in files
assert "unwanted_subdirfile.gif" in files assert "unwanted_subdirfile.gif" in files
# Overriding the parent should enable all children # Overriding the parent should enable all children
self.d.set_state(p1["$Recycle.Bin"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin"), DirectoryState.NORMAL)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.NORMAL)
# all files there # all files there
files = self.get_files_and_expect_num_result(6) files = self.get_files_and_expect_num_result(6)
assert "somefile.png" in files assert "somefile.png" in files
@@ -444,7 +437,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
assert self.d._exclude_list.error(regex3) is None assert self.d._exclude_list.error(regex3) is None
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
# Directory shouldn't change its state here, unless explicitely done by user # Directory shouldn't change its state here, unless explicitely done by user
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
assert "unwanted_subdirfile.gif" not in files assert "unwanted_subdirfile.gif" not in files
assert "unwanted_subdarfile.png" in files assert "unwanted_subdarfile.png" in files
@@ -453,15 +446,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
regex4 = r".*subdir$" regex4 = r".*subdir$"
self.d._exclude_list.rename(regex3, regex4) self.d._exclude_list.rename(regex3, regex4)
assert self.d._exclude_list.error(regex4) is None assert self.d._exclude_list.error(regex4) is None
p1["$Recycle.Bin"]["subdar"]["file_ending_with_subdir"].open("w").close() p1.joinpath("$Recycle.Bin", "subdar", "file_ending_with_subdir").touch()
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
files = self.get_files_and_expect_num_result(4) files = self.get_files_and_expect_num_result(4)
assert "file_ending_with_subdir" not in files assert "file_ending_with_subdir" not in files
assert "somesubdarfile.jpeg" in files assert "somesubdarfile.jpeg" in files
assert "somesubdirfile.png" not in files assert "somesubdirfile.png" not in files
assert "unwanted_subdirfile.gif" not in files assert "unwanted_subdirfile.gif" not in files
self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
files = self.get_files_and_expect_num_result(6) files = self.get_files_and_expect_num_result(6)
assert "file_ending_with_subdir" not in files assert "file_ending_with_subdir" not in files
@@ -471,9 +464,9 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
regex5 = r".*subdir.*" regex5 = r".*subdir.*"
self.d._exclude_list.rename(regex4, regex5) self.d._exclude_list.rename(regex4, regex5)
# Files containing substring should be filtered # Files containing substring should be filtered
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
# The path should not match, only the filename, the "subdir" in the directory name shouldn't matter # The path should not match, only the filename, the "subdir" in the directory name shouldn't matter
p1["$Recycle.Bin"]["subdir"]["file_which_shouldnt_match"].open("w").close() p1.joinpath("$Recycle.Bin", "subdir", "file_which_shouldnt_match").touch()
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
assert "somesubdirfile.png" not in files assert "somesubdirfile.png" not in files
assert "unwanted_subdirfile.gif" not in files assert "unwanted_subdirfile.gif" not in files
@@ -493,7 +486,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
assert self.d._exclude_list.error(regex6) is None assert self.d._exclude_list.error(regex6) is None
assert regex6 in self.d._exclude_list assert regex6 in self.d._exclude_list
# This still should not be affected # This still should not be affected
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
# These files are under the "/subdir" directory # These files are under the "/subdir" directory
assert "somesubdirfile.png" not in files assert "somesubdirfile.png" not in files
@@ -505,20 +498,20 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
def test_japanese_unicode(self, tmpdir): def test_japanese_unicode(self, tmpdir):
p1 = Path(str(tmpdir)) p1 = Path(str(tmpdir))
p1["$Recycle.Bin"].mkdir() p1.joinpath("$Recycle.Bin").mkdir()
p1["$Recycle.Bin"]["somerecycledfile.png"].open("w").close() p1.joinpath("$Recycle.Bin", "somerecycledfile.png").touch()
p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close() p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
p1["$Recycle.Bin"]["subdir"].mkdir() p1.joinpath("$Recycle.Bin", "subdir").mkdir()
p1["$Recycle.Bin"]["subdir"]["過去白濁物語~]_カラー.jpg"].open("w").close() p1.joinpath("$Recycle.Bin", "subdir", "過去白濁物語~]_カラー.jpg").touch()
p1["$Recycle.Bin"]["思叫物語"].mkdir() p1.joinpath("$Recycle.Bin", "思叫物語").mkdir()
p1["$Recycle.Bin"]["思叫物語"]["なししろ会う前"].open("w").close() p1.joinpath("$Recycle.Bin", "思叫物語", "なししろ会う前").touch()
p1["$Recycle.Bin"]["思叫物語"]["堂~ロ"].open("w").close() p1.joinpath("$Recycle.Bin", "思叫物語", "堂~ロ").touch()
self.d.add_path(p1["$Recycle.Bin"]) self.d.add_path(p1.joinpath("$Recycle.Bin"))
regex3 = r".*物語.*" regex3 = r".*物語.*"
self.d._exclude_list.add(regex3) self.d._exclude_list.add(regex3)
self.d._exclude_list.mark(regex3) self.d._exclude_list.mark(regex3)
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
eq_(self.d.get_state(p1["$Recycle.Bin"]["思叫物語"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "思叫物語")), DirectoryState.EXCLUDED)
files = self.get_files_and_expect_num_result(2) files = self.get_files_and_expect_num_result(2)
assert "過去白濁物語~]_カラー.jpg" not in files assert "過去白濁物語~]_カラー.jpg" not in files
assert "なししろ会う前" not in files assert "なししろ会う前" not in files
@@ -527,7 +520,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
regex4 = r".*物語$" regex4 = r".*物語$"
self.d._exclude_list.rename(regex3, regex4) self.d._exclude_list.rename(regex3, regex4)
assert self.d._exclude_list.error(regex4) is None assert self.d._exclude_list.error(regex4) is None
self.d.set_state(p1["$Recycle.Bin"]["思叫物語"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin", "思叫物語"), DirectoryState.NORMAL)
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
assert "過去白濁物語~]_カラー.jpg" in files assert "過去白濁物語~]_カラー.jpg" in files
assert "なししろ会う前" in files assert "なししろ会う前" in files
@@ -539,15 +532,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
self.d._exclude_list.add(regex) self.d._exclude_list.add(regex)
self.d._exclude_list.mark(regex) self.d._exclude_list.mark(regex)
p1 = Path(str(tmpdir)) p1 = Path(str(tmpdir))
p1["foobar"].mkdir() p1.joinpath("foobar").mkdir()
p1["foobar"][".hidden_file.txt"].open("w").close() p1.joinpath("foobar", ".hidden_file.txt").touch()
p1["foobar"][".hidden_dir"].mkdir() p1.joinpath("foobar", ".hidden_dir").mkdir()
p1["foobar"][".hidden_dir"]["foobar.jpg"].open("w").close() p1.joinpath("foobar", ".hidden_dir", "foobar.jpg").touch()
p1["foobar"][".hidden_dir"][".hidden_subfile.png"].open("w").close() p1.joinpath("foobar", ".hidden_dir", ".hidden_subfile.png").touch()
self.d.add_path(p1["foobar"]) self.d.add_path(p1.joinpath("foobar"))
# It should not inherit its parent's state originally # It should not inherit its parent's state originally
eq_(self.d.get_state(p1["foobar"][".hidden_dir"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("foobar", ".hidden_dir")), DirectoryState.EXCLUDED)
self.d.set_state(p1["foobar"][".hidden_dir"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("foobar", ".hidden_dir"), DirectoryState.NORMAL)
# The files should still be filtered # The files should still be filtered
files = self.get_files_and_expect_num_result(1) files = self.get_files_and_expect_num_result(1)
eq_(len(self.d._exclude_list.compiled_paths), 0) eq_(len(self.d._exclude_list.compiled_paths), 0)
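Most of the diff above is mechanical: the old `hscommon.path.Path` overloaded item access for joining, which `pathlib.Path` does not support. A minimal sketch of the equivalences these tests rely on (illustrative only, not part of the diff):

```python
from pathlib import Path

p = Path("/tmp/project")
# old hscommon style: p["dir1"]["file1.test"]
child = p.joinpath("dir1", "file1.test")  # or: p / "dir1" / "file1.test"
child.parent.mkdir(parents=True, exist_ok=True)
child.touch()  # replaces .open("w").close()
# membership tests now go through .parts, hence `"foobar" in path.parts`
print("dir1" in child.parts)  # True
```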

View File

@@ -530,7 +530,7 @@ class TestCaseGetMatches:
class TestCaseGetMatchesByContents:
-    def test_big_file_partial_hashes(self):
+    def test_big_file_partial_hashing(self):
        smallsize = 1
        bigsize = 100 * 1024 * 1024  # 100MB
        f = [
@@ -539,17 +539,17 @@ class TestCaseGetMatchesByContents:
            no("smallfoo", size=smallsize),
            no("smallbar", size=smallsize),
        ]
-        f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-        f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-        f[2].md5 = f[2].md5partial = "bleh"
-        f[3].md5 = f[3].md5partial = "bleh"
+        f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+        f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+        f[2].digest = f[2].digest_partial = "bleh"
+        f[3].digest = f[3].digest_partial = "bleh"
        r = getmatches_by_contents(f, bigsize=bigsize)
        eq_(len(r), 2)
-        # User disabled optimization for big files, compute hashes as usual
+        # User disabled optimization for big files, compute digests as usual
        r = getmatches_by_contents(f, bigsize=0)
        eq_(len(r), 2)
-        # Other file is now slightly different, md5partial is still the same
-        f[1].md5 = f[1].md5samples = "foobardiff"
+        # Other file is now slightly different, digest_partial is still the same
+        f[1].digest = f[1].digest_samples = "foobardiff"
        r = getmatches_by_contents(f, bigsize=bigsize)
        # Successfully filter it out
        eq_(len(r), 1)
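For context, the optimization this test pins down: files over `bigsize` are matched on sampled digests instead of a full-file digest. A rough sketch of the decision logic (simplified and hypothetical; only the attribute names come from the diff, the control flow here is not the actual engine code):

```python
def looks_identical(a, b, bigsize):
    # Cheapest checks first: size, then the digest of the first chunk.
    if a.size != b.size or a.digest_partial != b.digest_partial:
        return False
    if bigsize and a.size >= bigsize:
        # Big files: compare digests of a few sampled chunks rather than
        # hashing the whole file.
        return a.digest_samples == b.digest_samples
    # Small files (or optimization disabled with bigsize=0): full digest.
    return a.digest == b.digest
```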

View File

@@ -6,10 +6,18 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
-import hashlib
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
from os import urandom
-from hscommon.path import Path
+from pathlib import Path
from hscommon.testutil import eq_
from core.tests.directories_test import create_fake_fs
@@ -17,32 +25,26 @@ from .. import fs
def create_fake_fs_with_random_data(rootpath):
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
    rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
-    fp = rootpath["file1.test"].open("wb")
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
    data1 = urandom(200 * 1024)  # 200KiB
    data2 = urandom(1024 * 1024)  # 1MiB
    data3 = urandom(10 * 1024 * 1024)  # 10MiB
-    fp.write(data1)
-    fp.close()
-    fp = rootpath["file2.test"].open("wb")
-    fp.write(data2)
-    fp.close()
-    fp = rootpath["file3.test"].open("wb")
-    fp.write(data3)
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("wb")
-    fp.write(data1)
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("wb")
-    fp.write(data2)
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("wb")
-    fp.write(data3)
-    fp.close()
+    with rootpath.joinpath("file1.test").open("wb") as fp:
+        fp.write(data1)
+    with rootpath.joinpath("file2.test").open("wb") as fp:
+        fp.write(data2)
+    with rootpath.joinpath("file3.test").open("wb") as fp:
+        fp.write(data3)
+    with rootpath.joinpath("dir1", "file1.test").open("wb") as fp:
+        fp.write(data1)
+    with rootpath.joinpath("dir2", "file2.test").open("wb") as fp:
+        fp.write(data2)
+    with rootpath.joinpath("dir3", "file3.test").open("wb") as fp:
+        fp.write(data3)
    return rootpath
@@ -52,54 +54,54 @@ def test_size_aggregates_subfiles(tmpdir):
    eq_(b.size, 12)
-def test_md5_aggregate_subfiles_sorted(tmpdir):
-    # dir.allfiles can return child in any order. Thus, bundle.md5 must aggregate
-    # all files' md5 it contains, but it must make sure that it does so in the
+def test_digest_aggregate_subfiles_sorted(tmpdir):
+    # dir.allfiles can return child in any order. Thus, bundle.digest must aggregate
+    # all files' digests it contains, but it must make sure that it does so in the
    # same order everytime.
    p = create_fake_fs_with_random_data(Path(str(tmpdir)))
    b = fs.Folder(p)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5
-    md52 = fs.File(p["dir2"]["file2.test"]).md5
-    md53 = fs.File(p["dir3"]["file3.test"]).md5
-    md54 = fs.File(p["file1.test"]).md5
-    md55 = fs.File(p["file2.test"]).md5
-    md56 = fs.File(p["file3.test"]).md5
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest
+    digest4 = fs.File(p.joinpath("file1.test")).digest
+    digest5 = fs.File(p.joinpath("file2.test")).digest
+    digest6 = fs.File(p.joinpath("file3.test")).digest
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest, digest)
-def test_partial_md5_aggregate_subfile_sorted(tmpdir):
+def test_partial_digest_aggregate_subfile_sorted(tmpdir):
    p = create_fake_fs_with_random_data(Path(str(tmpdir)))
    b = fs.Folder(p)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5partial
-    md52 = fs.File(p["dir2"]["file2.test"]).md5partial
-    md53 = fs.File(p["dir3"]["file3.test"]).md5partial
-    md54 = fs.File(p["file1.test"]).md5partial
-    md55 = fs.File(p["file2.test"]).md5partial
-    md56 = fs.File(p["file3.test"]).md5partial
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5partial, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_partial
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_partial
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_partial
+    digest4 = fs.File(p.joinpath("file1.test")).digest_partial
+    digest5 = fs.File(p.joinpath("file2.test")).digest_partial
+    digest6 = fs.File(p.joinpath("file3.test")).digest_partial
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_partial, digest)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5samples
-    md52 = fs.File(p["dir2"]["file2.test"]).md5samples
-    md53 = fs.File(p["dir3"]["file3.test"]).md5samples
-    md54 = fs.File(p["file1.test"]).md5samples
-    md55 = fs.File(p["file2.test"]).md5samples
-    md56 = fs.File(p["file3.test"]).md5samples
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5samples, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_samples
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_samples
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_samples
+    digest4 = fs.File(p.joinpath("file1.test")).digest_samples
+    digest5 = fs.File(p.joinpath("file2.test")).digest_samples
+    digest6 = fs.File(p.joinpath("file3.test")).digest_samples
+    # The expected digest is the digest of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_samples, digest)
def test_has_file_attrs(tmpdir):
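The expected values above encode how a folder's digest aggregates its children: a subfolder contributes the hash of its own digest, a file contributes its digest directly, and children are taken in a stable sorted order. A standalone sketch of that scheme, reusing the xxhash-with-md5-fallback from the top of the file (assumption: ordering is by name, which matches the dir1..dir3/file1..file3 ordering in the tests):

```python
try:
    import xxhash

    hasher = xxhash.xxh128
except ImportError:
    import hashlib

    hasher = hashlib.md5  # fallback when python3-xxhash isn't packaged


def folder_digest(children):
    # children: iterable of (name, digest, is_folder) tuples
    parts = []
    for name, digest, is_folder in sorted(children):
        # A subfolder's digest gets hashed once more before concatenation.
        parts.append(hasher(digest).digest() if is_folder else digest)
    return hasher(b"".join(parts)).digest()
```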

View File

@@ -447,7 +447,7 @@ class TestCaseResultsXML:
        self.results.groups = self.groups
    def get_file(self, path):  # use this as a callback for load_from_xml
-        return [o for o in self.objects if o.path == path][0]
+        return [o for o in self.objects if str(o.path) == path][0]
    def test_save_to_xml(self):
        self.objects[0].is_ref = True
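The `str(o.path) == path` change is a direct consequence of the pathlib migration: paths loaded back from the results XML arrive as plain strings, and a `pathlib.Path` never compares equal to a string. Illustrative only:

```python
from pathlib import Path

stored = "/photos/a.jpg"          # as read back from the XML
obj_path = Path("/photos/a.jpg")  # as held on the file object
assert obj_path != stored         # Path vs str is always unequal
assert str(obj_path) == stored    # hence the coercion in the callback
```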

View File

@@ -7,7 +7,7 @@
import pytest
from hscommon.jobprogress import job
-from hscommon.path import Path
+from pathlib import Path
from hscommon.testutil import eq_
from .. import fs
@@ -22,7 +22,7 @@ class NamedObject:
        if path is None:
            path = Path(name)
        else:
-            path = Path(path)[name]
+            path = Path(path, name)
        self.name = name
        self.size = size
        self.path = path
@@ -123,19 +123,19 @@ def test_content_scan(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = f[1].md5samples = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[1].digest_samples = "bleh"
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
    eq_(len(r[0]), 2)
-    eq_(s.discarded_file_count, 0)  # don't count the different md5 as discarded!
+    eq_(s.discarded_file_count, 0)  # don't count the different digest as discarded!
def test_content_scan_compare_sizes_first(fake_fileexists):
    class MyFile(no):
        @property
-        def md5(self):
+        def digest(self):
            raise AssertionError()
    s = Scanner()
@@ -161,14 +161,14 @@ def test_ignore_file_size(fake_fileexists):
        no("largeignore1", large_size + 1),
        no("largeignore2", large_size + 1),
    ]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "smallignore"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "smallignore"
-    f[2].md5 = f[2].md5partial = f[2].md5samples = "small"
-    f[3].md5 = f[3].md5partial = f[3].md5samples = "small"
-    f[4].md5 = f[4].md5partial = f[4].md5samples = "large"
-    f[5].md5 = f[5].md5partial = f[5].md5samples = "large"
-    f[6].md5 = f[6].md5partial = f[6].md5samples = "largeignore"
-    f[7].md5 = f[7].md5partial = f[7].md5samples = "largeignore"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "smallignore"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "smallignore"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "small"
+    f[3].digest = f[3].digest_partial = f[3].digest_samples = "small"
+    f[4].digest = f[4].digest_partial = f[4].digest_samples = "large"
+    f[5].digest = f[5].digest_partial = f[5].digest_samples = "large"
+    f[6].digest = f[6].digest_partial = f[6].digest_samples = "largeignore"
+    f[7].digest = f[7].digest_partial = f[7].digest_samples = "largeignore"
    r = s.get_dupe_groups(f)
    # No ignores
@@ -197,21 +197,21 @@ def test_big_file_partial_hashes(fake_fileexists):
    s.big_file_size_threshold = bigsize
    f = [no("bigfoo", bigsize), no("bigbar", bigsize), no("smallfoo", smallsize), no("smallbar", smallsize)]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = "bleh"
-    f[3].md5 = f[3].md5partial = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = "bleh"
+    f[3].digest = f[3].digest_partial = "bleh"
    r = s.get_dupe_groups(f)
    eq_(len(r), 2)
-    # md5partial is still the same, but the file is actually different
-    f[1].md5 = f[1].md5samples = "difffoobar"
-    # here we compare the full md5s, as the user disabled the optimization
+    # digest_partial is still the same, but the file is actually different
+    f[1].digest = f[1].digest_samples = "difffoobar"
+    # here we compare the full digests, as the user disabled the optimization
    s.big_file_size_threshold = 0
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
-    # here we should compare the md5samples, and see they are different
+    # here we should compare the digest_samples, and see they are different
    s.big_file_size_threshold = bigsize
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
@@ -221,9 +221,9 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = f[2].md5samples = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "bleh"
    s.min_match_percentage = 101
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
@@ -234,12 +234,16 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
    eq_(len(r[0]), 2)
-def test_content_scan_doesnt_put_md5_in_words_at_the_end(fake_fileexists):
+def test_content_scan_doesnt_put_digest_in_words_at_the_end(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[0].digest = f[0].digest_partial = f[
+        0
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[1].digest = f[1].digest_partial = f[
+        1
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
    r = s.get_dupe_groups(f)
    # FIXME looks like we are missing something here?
    r[0]
@@ -568,12 +572,14 @@ def test_dont_group_files_that_dont_exist(tmpdir):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    p = Path(str(tmpdir))
-    p["file1"].open("w").write("foo")
-    p["file2"].open("w").write("foo")
+    with p.joinpath("file1").open("w") as fp:
+        fp.write("foo")
+    with p.joinpath("file2").open("w") as fp:
+        fp.write("foo")
    file1, file2 = fs.get_files(p)
    def getmatches(*args, **kw):
-        file2.path.remove()
+        file2.path.unlink()
        return [Match(file1, file2, 100)]
    s._getmatches = getmatches
@@ -587,21 +593,21 @@ def test_folder_scan_exclude_subfolder_matches(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.FOLDERS
    topf1 = no("top folder 1", size=42)
-    topf1.md5 = topf1.md5partial = topf1.md5samples = b"some_md5_1"
+    topf1.digest = topf1.digest_partial = topf1.digest_samples = b"some_digest__1"
    topf1.path = Path("/topf1")
    topf2 = no("top folder 2", size=42)
-    topf2.md5 = topf2.md5partial = topf2.md5samples = b"some_md5_1"
+    topf2.digest = topf2.digest_partial = topf2.digest_samples = b"some_digest__1"
    topf2.path = Path("/topf2")
    subf1 = no("sub folder 1", size=41)
-    subf1.md5 = subf1.md5partial = subf1.md5samples = b"some_md5_2"
+    subf1.digest = subf1.digest_partial = subf1.digest_samples = b"some_digest__2"
    subf1.path = Path("/topf1/sub")
    subf2 = no("sub folder 2", size=41)
-    subf2.md5 = subf2.md5partial = subf2.md5samples = b"some_md5_2"
+    subf2.digest = subf2.digest_partial = subf2.digest_samples = b"some_digest__2"
    subf2.path = Path("/topf2/sub")
    eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2])), 1)  # only top folders
    # however, if another folder matches a subfolder, keep in in the matches
    otherf = no("other folder", size=41)
-    otherf.md5 = otherf.md5partial = otherf.md5samples = b"some_md5_2"
+    otherf.digest = otherf.digest_partial = otherf.digest_samples = b"some_digest__2"
    otherf.path = Path("/otherfolder")
    eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2, otherf])), 2)
@@ -624,9 +630,9 @@ def test_dont_count_ref_files_as_discarded(fake_fileexists):
    o1 = no("foo", path="p1")
    o2 = no("foo", path="p2")
    o3 = no("foo", path="p3")
-    o1.md5 = o1.md5partial = o1.md5samples = "foobar"
-    o2.md5 = o2.md5partial = o2.md5samples = "foobar"
-    o3.md5 = o3.md5partial = o3.md5samples = "foobar"
+    o1.digest = o1.digest_partial = o1.digest_samples = "foobar"
+    o2.digest = o2.digest_partial = o2.digest_samples = "foobar"
+    o3.digest = o3.digest_partial = o3.digest_samples = "foobar"
    o1.is_ref = True
    o2.is_ref = True
    eq_(len(s.get_dupe_groups([o1, o2, o3])), 1)
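One detail worth noting in `test_content_scan_compare_sizes_first`: the `digest` property raises if it is ever read, a cheap way to assert that an expensive attribute is never touched. The same trick, reduced to a standalone example (hypothetical names, not scanner code):

```python
class FileStub:
    def __init__(self, size):
        self.size = size

    @property
    def digest(self):
        raise AssertionError("digest should not be computed for size mismatches")


def content_match(a, b):
    # Comparing sizes first means digest is never evaluated on a mismatch.
    return a.size == b.size and a.digest == b.digest


assert content_match(FileStub(1), FileStub(2)) is False  # no AssertionError raised
```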

View File

@@ -7,6 +7,12 @@
import time
import sys
import os
+import urllib.request
+import urllib.error
+import json
+import semantic_version
+import logging
+from typing import Union
from hscommon.util import format_time_decimal
@@ -64,3 +70,34 @@ def fix_surrogate_encoding(s, encoding="utf-8"):
def executable_folder():
    return os.path.dirname(os.path.abspath(sys.argv[0]))
+def check_for_update(current_version: str, include_prerelease: bool = False) -> Union[None, dict]:
+    request = urllib.request.Request(
+        "https://api.github.com/repos/arsenetar/dupeguru/releases",
+        headers={"Accept": "application/vnd.github.v3+json"},
+    )
+    try:
+        with urllib.request.urlopen(request) as response:
+            if response.status != 200:
+                logging.warning(f"Error retrieving updates. Status: {response.status}")
+                return None
+            try:
+                response_json = json.loads(response.read())
+            except json.JSONDecodeError as ex:
+                logging.warning(f"Error parsing updates. {ex.msg}")
+                return None
+    except urllib.error.URLError as ex:
+        logging.warning(f"Error retrieving updates. {ex.reason}")
+        return None
+    new_version = semantic_version.Version(current_version)
+    new_url = None
+    for release in response_json:
+        release_version = semantic_version.Version(release["name"])
+        if new_version < release_version and (include_prerelease or not release_version.prerelease):
+            new_version = release_version
+            new_url = release["html_url"]
+    if new_url is not None:
+        return {"version": new_version, "url": new_url}
+    else:
+        return None
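A possible call site for the new helper, e.g. from an about dialog (hypothetical snippet; the version string here is made up):

```python
from core.util import check_for_update

update = check_for_update("4.2.0", include_prerelease=False)
if update is not None:
    print(f"dupeGuru {update['version']} is available: {update['url']}")
else:
    print("Up to date (or the check failed; see the log).")
```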

View File

@@ -20,13 +20,8 @@ import re
import importlib
from datetime import datetime
import glob
-import sysconfig
-import modulefinder
-from setuptools import setup, Extension
from .plat import ISWINDOWS
-from .util import ensure_folder, delete_files_with_pattern
def print_and_do(cmd):
@@ -181,23 +176,6 @@ def build_dmg(app_path, destfolder):
print("Build Complete") print("Build Complete")
def copy_sysconfig_files_for_embed(destpath):
# This normally shouldn't be needed for Python 3.3+.
makefile = sysconfig.get_makefile_filename()
configh = sysconfig.get_config_h_filename()
shutil.copy(makefile, destpath)
shutil.copy(configh, destpath)
with open(op.join(destpath, "site.py"), "w") as fp:
fp.write(
"""
import os.path as op
from distutils import sysconfig
sysconfig.get_makefile_filename = lambda: op.join(op.dirname(__file__), 'Makefile')
sysconfig.get_config_h_filename = lambda: op.join(op.dirname(__file__), 'pyconfig.h')
"""
)
def add_to_pythonpath(path):
    """Adds ``path`` to both ``PYTHONPATH`` env and ``sys.path``."""
    abspath = op.abspath(path)
@@ -248,20 +226,6 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
    shutil.copy(source_path, dest_path)
def copy_qt_plugins(folder_names, dest): # This is only for Windows
from PyQt5.QtCore import QLibraryInfo
qt_plugin_dir = QLibraryInfo.location(QLibraryInfo.PluginsPath)
def ignore(path, names):
if path == qt_plugin_dir:
return [n for n in names if n not in folder_names]
else:
return [n for n in names if not n.endswith(".dll")]
shutil.copytree(qt_plugin_dir, dest, ignore=ignore)
def build_debian_changelog(
    changelogpath,
    destfile,
@@ -349,183 +313,6 @@ def read_changelog_file(filename):
    return result
class OSXAppStructure:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Contents")
self.macos = op.join(self.contents, "MacOS")
self.resources = op.join(self.contents, "Resources")
self.frameworks = op.join(self.contents, "Frameworks")
self.infoplist = op.join(self.contents, "Info.plist")
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.macos)
os.mkdir(self.resources)
os.mkdir(self.frameworks)
copy(infoplist, self.infoplist)
open(op.join(self.contents, "PkgInfo"), "wt").write("APPLxxxx")
def copy_executable(self, executable):
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.macos, self.executablename)
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_frameworks(self, *frameworks):
for path in frameworks:
framework_dest = op.join(self.frameworks, op.basename(path))
copy(path, framework_dest)
def create_osx_app_structure(
dest,
executable,
infoplist,
resources=None,
frameworks=None,
symlink_resources=False,
):
# `dest`: A path to the destination .app folder
# `executable`: the path of the executable file that goes in "MacOS"
# `infoplist`: The path to your Info.plist file.
# `resources`: A list of paths of files or folders going in the "Resources" folder.
# `frameworks`: Same as above for "Frameworks".
# `symlink_resources`: If True, will symlink resources into the structure instead of copying them.
app = OSXAppStructure(dest)
app.create(infoplist)
app.copy_executable(executable)
app.copy_resources(*resources, use_symlinks=symlink_resources)
app.copy_frameworks(*frameworks)
class OSXFrameworkStructure:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Versions", "A")
self.resources = op.join(self.contents, "Resources")
self.headers = op.join(self.contents, "Headers")
self.infoplist = op.join(self.resources, "Info.plist")
self._update_executable_path()
def _update_executable_path(self):
if not op.exists(self.infoplist):
self.executablename = self.executablepath = None
return
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.contents, self.executablename)
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.contents)
os.mkdir(self.resources)
os.mkdir(self.headers)
copy(infoplist, self.infoplist)
self._update_executable_path()
def create_symlinks(self):
# Only call this after create() and copy_executable()
os.symlink("A", op.join(self.dest, "Versions", "Current"))
os.symlink(op.relpath(self.executablepath, self.dest), op.join(self.dest, self.executablename))
os.symlink(op.relpath(self.headers, self.dest), op.join(self.dest, "Headers"))
os.symlink(op.relpath(self.resources, self.dest), op.join(self.dest, "Resources"))
def copy_executable(self, executable):
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_headers(self, *headers, use_symlinks=False):
for path in headers:
header_dest = op.join(self.headers, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), header_dest)
def copy_embeddable_python_dylib(dst):
runtime = op.join(
sysconfig.get_config_var("PYTHONFRAMEWORKPREFIX"),
sysconfig.get_config_var("LDLIBRARY"),
)
filedest = op.join(dst, "Python")
shutil.copy(runtime, filedest)
os.chmod(filedest, 0o774) # We need write permission to use install_name_tool
cmd = "install_name_tool -id @rpath/Python %s" % filedest
print_and_do(cmd)
def collect_stdlib_dependencies(script, dest_folder, extra_deps=None):
sysprefix = sys.prefix # could be a virtualenv
basesysprefix = sys.base_prefix # seems to be path to non-virtual sys
real_lib_prefix = sysconfig.get_config_var("LIBDEST") # leaving this in case it is needed
def is_stdlib_path(path):
# A module path is only a stdlib path if it's in either sys.prefix or
# sysconfig.get_config_var('prefix') (the 2 are different if we are in a virtualenv) and if
# there's no "site-package in the path.
if not path:
return False
if "site-package" in path:
return False
if not (path.startswith(sysprefix) or path.startswith(basesysprefix) or path.startswith(real_lib_prefix)):
return False
return True
ensure_folder(dest_folder)
mf = modulefinder.ModuleFinder()
mf.run_script(script)
modpaths = [mod.__file__ for mod in mf.modules.values()]
modpaths = filter(is_stdlib_path, modpaths)
for p in modpaths:
if p.startswith(real_lib_prefix):
relpath = op.relpath(p, real_lib_prefix)
elif p.startswith(sysprefix):
relpath = op.relpath(p, sysprefix)
assert relpath.startswith("lib/python3.") # we want to get rid of that lib/python3.x part
relpath = relpath[len("lib/python3.X/") :]
elif p.startswith(basesysprefix):
relpath = op.relpath(p, basesysprefix)
assert relpath.startswith("lib/python3.")
relpath = relpath[len("lib/python3.X/") :]
else:
raise AssertionError()
if relpath.startswith("lib-dynload"): # We copy .so files in lib-dynload directly in our dest
relpath = relpath[len("lib-dynload/") :]
if relpath.startswith("encodings") or relpath.startswith("distutils"):
# We force their inclusion later.
continue
dest_path = op.join(dest_folder, relpath)
ensure_folder(op.dirname(dest_path))
copy(p, dest_path)
# stringprep is used by encodings.
# We use real_lib_prefix with distutils because virtualenv messes with it and we need to refer
# to the original distutils folder.
FORCED_INCLUSION = [
"encodings",
"stringprep",
op.join(real_lib_prefix, "distutils"),
]
if extra_deps:
FORCED_INCLUSION += extra_deps
copy_packages(FORCED_INCLUSION, dest_folder)
# There's a couple of rather big exe files in the distutils folder that we absolutely don't
# need. Remove them.
delete_files_with_pattern(op.join(dest_folder, "distutils"), "*.exe")
# And, finally, create an empty "site.py" that Python needs around on startup.
open(op.join(dest_folder, "site.py"), "w").close()
def fix_qt_resource_file(path):
    # pyrcc5 under Windows, if the locale is non-english, can produce a source file with a date
    # containing accented characters. If it does, the encoding is wrong and it prevents the file
@@ -537,21 +324,3 @@ def fix_qt_resource_file(path):
    lines = [line for line in lines if not line.startswith(b"#")]
    with open(path, "wb") as fp:
        fp.write(b"\n".join(lines))
def build_cocoa_ext(extname, dest, source_files, extra_frameworks=(), extra_includes=()):
extra_link_args = ["-framework", "CoreFoundation", "-framework", "Foundation"]
for extra in extra_frameworks:
extra_link_args += ["-framework", extra]
ext = Extension(
extname,
source_files,
extra_link_args=extra_link_args,
include_dirs=extra_includes,
)
setup(script_args=["build_ext", "--inplace"], ext_modules=[ext])
# Our problem here is to get the fully qualified filename of the resulting .so but I couldn't
# find a documented way to do so. The only thing I could find is this below :(
fn = ext._file_name
assert op.exists(fn)
move(fn, op.join(dest, fn))

View File

@@ -14,7 +14,7 @@ import re
import os
import shutil
-from .path import Path, pathify
+from pathlib import Path
# This matches [123], but not [12] (3 digits being the minimum).
# It also matches [1234] [12345] etc..
@@ -52,16 +52,15 @@ def is_conflicted(name):
    return re_conflict.match(name) is not None
-@pathify
def _smart_move_or_copy(operation, source_path: Path, dest_path: Path):
    """Use move() or copy() to move and copy file with the conflict management."""
-    if dest_path.isdir() and not source_path.isdir():
-        dest_path = dest_path[source_path.name]
+    if dest_path.is_dir() and not source_path.is_dir():
+        dest_path = dest_path.joinpath(source_path.name)
    if dest_path.exists():
        filename = dest_path.name
-        dest_dir_path = dest_path.parent()
+        dest_dir_path = dest_path.parent
        newname = get_conflicted_name(os.listdir(str(dest_dir_path)), filename)
-        dest_path = dest_dir_path[newname]
+        dest_path = dest_dir_path.joinpath(newname)
    operation(str(source_path), str(dest_path))
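For context, the conflict management these helpers implement: when the destination already holds a file with the same name, the incoming file gets a `[NNN]` prefix instead of overwriting. A small usage sketch (assuming the module's public helpers keep their current signatures):

```python
from hscommon.conflict import get_conflicted_name, is_conflicted

existing = ["report.txt", "[000] report.txt"]
newname = get_conflicted_name(existing, "report.txt")
print(newname)                 # e.g. "[001] report.txt"
print(is_conflicted(newname))  # True
```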

View File

@@ -1,23 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-04-19
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import sys
import traceback
# Taken from http://bzimmer.ziclix.com/2008/12/17/python-thread-dumps/
def stacktraces():
code = []
for thread_id, stack in sys._current_frames().items():
code.append("\n# ThreadID: %s" % thread_id)
for filename, lineno, name, line in traceback.extract_stack(stack):
code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
if line:
code.append(" %s" % (line.strip()))
return "\n".join(code)

View File

@@ -42,73 +42,50 @@ def special_folder_path(special_folder, appname=None, portable=False):
try:
-    # Normally, we would simply do "from cocoa import proxy", but due to a bug in pytest (currently
-    # at v2.4.2), our test suite is broken when we do that. This below is a workaround until that
-    # bug is fixed.
-    import cocoa
-    if not hasattr(cocoa, "proxy"):
-        raise ImportError()
-    proxy = cocoa.proxy
-    _open_url = proxy.openURL_
-    _open_path = proxy.openPath_
-    _reveal_path = proxy.revealPath_
+    from PyQt5.QtCore import QUrl, QStandardPaths
+    from PyQt5.QtGui import QDesktopServices
+    from qtlib.util import get_appdata
+    from core.util import executable_folder
+    from hscommon.plat import ISWINDOWS, ISOSX
+    import subprocess
+    def _open_url(url):
+        QDesktopServices.openUrl(QUrl(url))
+    def _open_path(path):
+        url = QUrl.fromLocalFile(str(path))
+        QDesktopServices.openUrl(url)
+    def _reveal_path(path):
+        if ISWINDOWS:
+            subprocess.run(["explorer", "/select,", op.abspath(path)])
+        elif ISOSX:
+            subprocess.run(["open", "-R", op.abspath(path)])
+        else:
+            _open_path(op.dirname(str(path)))
    def _special_folder_path(special_folder, appname=None, portable=False):
        if special_folder == SpecialFolder.CACHE:
-            base = proxy.getCachePath()
+            if ISWINDOWS and portable:
+                folder = op.join(executable_folder(), "cache")
+            else:
+                folder = QStandardPaths.standardLocations(QStandardPaths.CacheLocation)[0]
        else:
-            base = proxy.getAppdataPath()
-        if not appname:
-            appname = proxy.bundleInfo_("CFBundleName")
-        return op.join(base, appname)
+            folder = get_appdata(portable)
+        return folder
except ImportError:
-    try:
-        from PyQt5.QtCore import QUrl, QStandardPaths
-        from PyQt5.QtGui import QDesktopServices
-        from qtlib.util import get_appdata
-        from core.util import executable_folder
-        from hscommon.plat import ISWINDOWS, ISOSX
-        import subprocess
-        def _open_url(url):
-            QDesktopServices.openUrl(QUrl(url))
-        def _open_path(path):
-            url = QUrl.fromLocalFile(str(path))
-            QDesktopServices.openUrl(url)
-        def _reveal_path(path):
-            if ISWINDOWS:
-                subprocess.run(["explorer", "/select,", op.abspath(path)])
-            elif ISOSX:
-                subprocess.run(["open", "-R", op.abspath(path)])
-            else:
-                _open_path(op.dirname(str(path)))
-        def _special_folder_path(special_folder, appname=None, portable=False):
-            if special_folder == SpecialFolder.CACHE:
-                if ISWINDOWS and portable:
-                    folder = op.join(executable_folder(), "cache")
-                else:
-                    folder = QStandardPaths.standardLocations(QStandardPaths.CacheLocation)[0]
-            else:
-                folder = get_appdata(portable)
-            return folder
-    except ImportError:
-        # We're either running tests, and these functions don't matter much or we're in a really
-        # weird situation. Let's just have dummy fallbacks.
-        logging.warning("Can't setup desktop functions!")
-        def _open_path(path):
-            # Dummy for tests
-            pass
-        def _reveal_path(path):
-            # Dummy for tests
-            pass
-        def _special_folder_path(special_folder, appname=None, portable=False):
-            return "/tmp"
+    # We're either running tests, and these functions don't matter much or we're in a really
+    # weird situation. Let's just have dummy fallbacks.
+    logging.warning("Can't setup desktop functions!")
+    def _open_path(path):
+        # Dummy for tests
+        pass
+    def _reveal_path(path):
+        # Dummy for tests
+        pass
+    def _special_folder_path(special_folder, appname=None, portable=False):
+        return "/tmp"

View File

@@ -1,216 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-08-05
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
from sys import maxsize as INF
from math import sqrt
VERY_SMALL = 0.0000001
class Point:
def __init__(self, x, y):
self.x = x
self.y = y
def __repr__(self):
return "<Point {:2.2f}, {:2.2f}>".format(*self)
def __iter__(self):
yield self.x
yield self.y
def distance_to(self, other):
return Line(self, other).length()
class Line:
def __init__(self, p1, p2):
self.p1 = p1
self.p2 = p2
def __repr__(self):
return "<Line {}, {}>".format(*self)
def __iter__(self):
yield self.p1
yield self.p2
def dx(self):
return self.p2.x - self.p1.x
def dy(self):
return self.p2.y - self.p1.y
def length(self):
return sqrt(self.dx() ** 2 + self.dy() ** 2)
def slope(self):
if self.dx() == 0:
return INF if self.dy() > 0 else -INF
else:
return self.dy() / self.dx()
def intersection_point(self, other):
# with help from http://paulbourke.net/geometry/lineline2d/
if abs(self.slope() - other.slope()) < VERY_SMALL:
# parallel. Even if coincident, we return nothing
return None
A, B = self
C, D = other
denom = (D.y - C.y) * (B.x - A.x) - (D.x - C.x) * (B.y - A.y)
if denom == 0:
return None
numera = (D.x - C.x) * (A.y - C.y) - (D.y - C.y) * (A.x - C.x)
numerb = (B.x - A.x) * (A.y - C.y) - (B.y - A.y) * (A.x - C.x)
mua = numera / denom
mub = numerb / denom
if (0 <= mua <= 1) and (0 <= mub <= 1):
x = A.x + mua * (B.x - A.x)
y = A.y + mua * (B.y - A.y)
return Point(x, y)
else:
return None
class Rect:
def __init__(self, x, y, w, h):
self.x = x
self.y = y
self.w = w
self.h = h
def __iter__(self):
yield self.x
yield self.y
yield self.w
yield self.h
def __repr__(self):
return "<Rect {:2.2f}, {:2.2f}, {:2.2f}, {:2.2f}>".format(*self)
@classmethod
def from_center(cls, center, width, height):
x = center.x - width / 2
y = center.y - height / 2
return cls(x, y, width, height)
@classmethod
def from_corners(cls, pt1, pt2):
x1, y1 = pt1
x2, y2 = pt2
return cls(min(x1, x2), min(y1, y2), abs(x1 - x2), abs(y1 - y2))
def center(self):
return Point(self.x + self.w / 2, self.y + self.h / 2)
def contains_point(self, point):
x, y = point
(x1, y1), (x2, y2) = self.corners()
return (x1 <= x <= x2) and (y1 <= y <= y2)
def contains_rect(self, rect):
pt1, pt2 = rect.corners()
return self.contains_point(pt1) and self.contains_point(pt2)
def corners(self):
return Point(self.x, self.y), Point(self.x + self.w, self.y + self.h)
def intersects(self, other):
r1pt1, r1pt2 = self.corners()
r2pt1, r2pt2 = other.corners()
if r1pt1.x < r2pt1.x:
xinter = r1pt2.x >= r2pt1.x
else:
xinter = r2pt2.x >= r1pt1.x
if not xinter:
return False
if r1pt1.y < r2pt1.y:
yinter = r1pt2.y >= r2pt1.y
else:
yinter = r2pt2.y >= r1pt1.y
return yinter
def lines(self):
pt1, pt4 = self.corners()
pt2 = Point(pt4.x, pt1.y)
pt3 = Point(pt1.x, pt4.y)
l1 = Line(pt1, pt2)
l2 = Line(pt2, pt4)
l3 = Line(pt3, pt4)
l4 = Line(pt1, pt3)
return l1, l2, l3, l4
def scaled_rect(self, dx, dy):
"""Returns a rect that has the same borders at self, but grown/shrunk by dx/dy on each side."""
x, y, w, h = self
x -= dx
y -= dy
w += dx * 2
h += dy * 2
return Rect(x, y, w, h)
def united(self, other):
"""Returns the bounding rectangle of this rectangle and `other`."""
# ul=upper left lr=lower right
ulcorner1, lrcorner1 = self.corners()
ulcorner2, lrcorner2 = other.corners()
corner1 = Point(min(ulcorner1.x, ulcorner2.x), min(ulcorner1.y, ulcorner2.y))
corner2 = Point(max(lrcorner1.x, lrcorner2.x), max(lrcorner1.y, lrcorner2.y))
return Rect.from_corners(corner1, corner2)
# --- Properties
@property
def top(self):
return self.y
@top.setter
def top(self, value):
self.y = value
@property
def bottom(self):
return self.y + self.h
@bottom.setter
def bottom(self, value):
self.y = value - self.h
@property
def left(self):
return self.x
@left.setter
def left(self, value):
self.x = value
@property
def right(self):
return self.x + self.w
@right.setter
def right(self, value):
self.x = value - self.w
@property
def width(self):
return self.w
@width.setter
def width(self, value):
self.w = value
@property
def height(self):
return self.h
@height.setter
def height(self, value):
self.h = value

View File

@@ -1,14 +1,11 @@
import os
import os.path as op
import shutil
-import re
import tempfile
import polib
from . import pygettext
-from .util import modified_after, dedupe, ensure_folder
-from .build import print_and_do, ensure_empty_folder
LC_MESSAGES = "LC_MESSAGES"
@@ -116,118 +113,3 @@ def normalize_all_pos(base_folder):
    for pofile in pofiles:
        p = polib.pofile(pofile)
        p.save()
# --- Cocoa
def all_lproj_paths(folder):
return files_with_ext(folder, ".lproj")
def escape_cocoa_strings(s):
return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
def unescape_cocoa_strings(s):
return s.replace("\\\\", "\\").replace('\\"', '"').replace("\\n", "\n")
def strings2pot(target, dest):
with open(target, "rt", encoding="utf-8") as fp:
contents = fp.read()
# We're reading an en.lproj file. We only care about the righthand part of the translation.
re_trans = re.compile(r'".*" = "(.*)";')
strings = re_trans.findall(contents)
if op.exists(dest):
po = polib.pofile(dest)
else:
po = polib.POFile()
for s in dedupe(strings):
s = unescape_cocoa_strings(s)
entry = po.find(s)
if entry is None:
entry = polib.POEntry(msgid=s)
po.append(entry)
# we don't know or care about a line number so we put 0
entry.occurrences.append((target, "0"))
entry.occurrences = dedupe(entry.occurrences)
po.save(dest)
def allstrings2pot(lprojpath, dest, excludes=None):
allstrings = files_with_ext(lprojpath, STRING_EXT)
if excludes:
allstrings = [p for p in allstrings if op.splitext(op.basename(p))[0] not in excludes]
for strings_path in allstrings:
strings2pot(strings_path, dest)
def po2strings(pofile, en_strings, dest):
# Takes en_strings and replace all righthand parts of "foo" = "bar"; entries with translations
# in pofile, then puts the result in dest.
po = polib.pofile(pofile)
if not modified_after(pofile, dest):
return
ensure_folder(op.dirname(dest))
print("Creating {} from {}".format(dest, pofile))
with open(en_strings, "rt", encoding="utf-8") as fp:
contents = fp.read()
re_trans = re.compile(r'(?<= = ").*(?=";\n)')
def repl(match):
s = match.group(0)
unescaped = unescape_cocoa_strings(s)
entry = po.find(unescaped)
if entry is None:
print("WARNING: Could not find entry '{}' in .po file".format(s))
return s
trans = entry.msgstr
return escape_cocoa_strings(trans) if trans else s
contents = re_trans.sub(repl, contents)
with open(dest, "wt", encoding="utf-8") as fp:
fp.write(contents)
def generate_cocoa_strings_from_code(code_folder, dest_folder):
# Uses the "genstrings" command to generate strings file from all .m files in "code_folder".
# The strings file (their name depends on the localization table used in the source) will be
# placed in "dest_folder".
# genstrings produces utf-16 files with comments. After having generated the files, we convert
# them to utf-8 and remove the comments.
ensure_empty_folder(dest_folder)
print_and_do('genstrings -o "{}" `find "{}" -name *.m | xargs`'.format(dest_folder, code_folder))
for stringsfile in os.listdir(dest_folder):
stringspath = op.join(dest_folder, stringsfile)
with open(stringspath, "rt", encoding="utf-16") as fp:
content = fp.read()
content = re.sub(r"/\*.*?\*/", "", content)
content = re.sub(r"\n{2,}", "\n", content)
# I have no idea why, but genstrings seems to have problems with "%" character in strings
# and inserts (number)$ after it. Find these bogus inserts and remove them.
content = re.sub(r"%\d\$", "%", content)
with open(stringspath, "wt", encoding="utf-8") as fp:
fp.write(content)
def generate_cocoa_strings_from_xib(xib_folder):
xibs = [op.join(xib_folder, fn) for fn in os.listdir(xib_folder) if fn.endswith(".xib")]
for xib in xibs:
dest = xib.replace(".xib", STRING_EXT)
print_and_do("ibtool {} --generate-strings-file {}".format(xib, dest))
print_and_do("iconv -f utf-16 -t utf-8 {0} | tee {0}".format(dest))
def localize_stringsfile(stringsfile, dest_root_folder):
stringsfile_name = op.basename(stringsfile)
for lang in get_langs("locale"):
pofile = op.join("locale", lang, "LC_MESSAGES", "ui.po")
cocoa_lang = PO2COCOA.get(lang, lang)
dest_lproj = op.join(dest_root_folder, cocoa_lang + ".lproj")
ensure_folder(dest_lproj)
po2strings(pofile, stringsfile, op.join(dest_lproj, stringsfile_name))
def localize_all_stringsfiles(src_folder, dest_root_folder):
stringsfiles = [op.join(src_folder, fn) for fn in os.listdir(src_folder) if fn.endswith(STRING_EXT)]
for path in stringsfiles:
localize_stringsfile(path, dest_root_folder)

View File

@@ -7,208 +7,9 @@
# http://www.gnu.org/licenses/gpl-3.0.html
import logging
-import os
-import os.path as op
-import shutil
-import sys
-from itertools import takewhile
from functools import wraps
from inspect import signature
+from pathlib import Path
class Path(tuple):
"""A handy class to work with paths.
We subclass ``tuple``, each element of the tuple represents an element of the path.
* ``Path('/foo/bar/baz')[1]`` --> ``'bar'``
* ``Path('/foo/bar/baz')[1:2]`` --> ``Path('bar/baz')``
* ``Path('/foo/bar')['baz']`` --> ``Path('/foo/bar/baz')``
* ``str(Path('/foo/bar/baz'))`` --> ``'/foo/bar/baz'``
"""
# Saves a little bit of memory usage
__slots__ = ()
def __new__(cls, value, separator=None):
def unicode_if_needed(s):
if isinstance(s, str):
return s
else:
try:
return str(s, sys.getfilesystemencoding())
except UnicodeDecodeError:
logging.warning("Could not decode %r", s)
raise
if isinstance(value, Path):
return value
if not separator:
separator = os.sep
if isinstance(value, bytes):
value = unicode_if_needed(value)
if isinstance(value, str):
if value:
if (separator not in value) and ("/" in value):
separator = "/"
value = value.split(separator)
else:
value = ()
else:
if any(isinstance(x, bytes) for x in value):
value = [unicode_if_needed(x) for x in value]
# value is a tuple/list
if any(separator in x for x in value):
# We have a component with a separator in it. Let's rejoin it, and generate another path.
return Path(separator.join(value), separator)
if (len(value) > 1) and (not value[-1]):
value = value[
:-1
] # We never want a path to end with a '' (because Path() can be called with a trailing slash ending path)
return tuple.__new__(cls, value)
def __add__(self, other):
other = Path(other)
if other and (not other[0]):
other = other[1:]
return Path(tuple.__add__(self, other))
def __contains__(self, item):
if isinstance(item, Path):
return item[: len(self)] == self
else:
return tuple.__contains__(self, item)
def __eq__(self, other):
return tuple.__eq__(self, Path(other))
def __getitem__(self, key):
if isinstance(key, slice):
if isinstance(key.start, Path):
equal_elems = list(takewhile(lambda pair: pair[0] == pair[1], zip(self, key.start)))
key = slice(len(equal_elems), key.stop, key.step)
if isinstance(key.stop, Path):
equal_elems = list(
takewhile(
lambda pair: pair[0] == pair[1],
zip(reversed(self), reversed(key.stop)),
)
)
stop = -len(equal_elems) if equal_elems else None
key = slice(key.start, stop, key.step)
return Path(tuple.__getitem__(self, key))
elif isinstance(key, (str, Path)):
return self + key
else:
return tuple.__getitem__(self, key)
def __hash__(self):
return tuple.__hash__(self)
def __ne__(self, other):
return not self.__eq__(other)
def __radd__(self, other):
return Path(other) + self
def __str__(self):
if len(self) == 1:
first = self[0]
if (len(first) == 2) and (first[1] == ":"): # Windows drive letter
return first + "\\"
elif not len(first): # root directory
return "/"
return os.sep.join(self)
def has_drive_letter(self):
if not self:
return False
first = self[0]
return (len(first) == 2) and (first[1] == ":")
def is_parent_of(self, other):
"""Whether ``other`` is a subpath of ``self``.
Almost the same as ``other in self``, but it's a bit more self-explicative and when
``other == self``, returns False.
"""
if other == self:
return False
else:
return other in self
def remove_drive_letter(self):
if self.has_drive_letter():
return self[1:]
else:
return self
def tobytes(self):
return str(self).encode(sys.getfilesystemencoding())
def parent(self):
"""Returns the parent path.
``Path('/foo/bar/baz').parent()`` --> ``Path('/foo/bar')``
"""
return self[:-1]
@property
def name(self):
"""Last element of the path (filename), with extension.
``Path('/foo/bar/baz').name`` --> ``'baz'``
"""
return self[-1]
# OS method wrappers
def exists(self):
return op.exists(str(self))
def copy(self, dest_path):
return shutil.copy(str(self), str(dest_path))
def copytree(self, dest_path, *args, **kwargs):
return shutil.copytree(str(self), str(dest_path), *args, **kwargs)
def isdir(self):
return op.isdir(str(self))
def isfile(self):
return op.isfile(str(self))
def islink(self):
return op.islink(str(self))
def listdir(self):
return [self[name] for name in os.listdir(str(self))]
def mkdir(self, *args, **kwargs):
return os.mkdir(str(self), *args, **kwargs)
def makedirs(self, *args, **kwargs):
return os.makedirs(str(self), *args, **kwargs)
def move(self, dest_path):
return shutil.move(str(self), str(dest_path))
def open(self, *args, **kwargs):
return open(str(self), *args, **kwargs)
def remove(self):
return os.remove(str(self))
def rename(self, dest_path):
return os.rename(str(self), str(dest_path))
def rmdir(self):
return os.rmdir(str(self))
def rmtree(self):
return shutil.rmtree(str(self))
def stat(self):
return os.stat(str(self))
def pathify(f):
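
The removed class duplicated much of what the standard library now provides. As a reading aid for the rest of this changeset, a minimal sketch of the pathlib equivalents used throughout the migration:

from pathlib import Path

p = Path("/foo/bar")
p.joinpath("baz")       # replaces p["baz"] and p + "baz"
p.name                  # unchanged: last component, "bar"
p.parent                # a property, replaces the old parent() method
p.exists(), p.is_dir()  # replace the exists()/isdir() wrappers
list(p.glob("*"))       # replaces p.listdir()
p.unlink()              # replaces p.remove()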

View File

@@ -15,7 +15,7 @@ from ..conflict import (
    smart_copy,
    smart_move,
)
-from ..path import Path
+from pathlib import Path
from ..testutil import eq_
@@ -71,43 +71,43 @@ class TestCaseMoveCopy:
    def do_setup(self, request):
        tmpdir = request.getfixturevalue("tmpdir")
        self.path = Path(str(tmpdir))
-        self.path["foo"].open("w").close()
-        self.path["bar"].open("w").close()
-        self.path["dir"].mkdir()
+        self.path.joinpath("foo").touch()
+        self.path.joinpath("bar").touch()
+        self.path.joinpath("dir").mkdir()
    def test_move_no_conflict(self, do_setup):
-        smart_move(self.path + "foo", self.path + "baz")
-        assert self.path["baz"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("baz"))
+        assert self.path.joinpath("baz").exists()
+        assert not self.path.joinpath("foo").exists()
    def test_copy_no_conflict(self, do_setup):  # No need to duplicate the rest of the tests... Let's just test on move
-        smart_copy(self.path + "foo", self.path + "baz")
-        assert self.path["baz"].exists()
-        assert self.path["foo"].exists()
+        smart_copy(self.path.joinpath("foo"), self.path.joinpath("baz"))
+        assert self.path.joinpath("baz").exists()
+        assert self.path.joinpath("foo").exists()
    def test_move_no_conflict_dest_is_dir(self, do_setup):
-        smart_move(self.path + "foo", self.path + "dir")
-        assert self.path["dir"]["foo"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        assert self.path.joinpath("dir", "foo").exists()
+        assert not self.path.joinpath("foo").exists()
    def test_move_conflict(self, do_setup):
-        smart_move(self.path + "foo", self.path + "bar")
-        assert self.path["[000] bar"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("bar"))
+        assert self.path.joinpath("[000] bar").exists()
+        assert not self.path.joinpath("foo").exists()
    def test_move_conflict_dest_is_dir(self, do_setup):
-        smart_move(self.path["foo"], self.path["dir"])
-        smart_move(self.path["bar"], self.path["foo"])
-        smart_move(self.path["foo"], self.path["dir"])
-        assert self.path["dir"]["foo"].exists()
-        assert self.path["dir"]["[000] foo"].exists()
-        assert not self.path["foo"].exists()
-        assert not self.path["bar"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        smart_move(self.path.joinpath("bar"), self.path.joinpath("foo"))
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        assert self.path.joinpath("dir", "foo").exists()
+        assert self.path.joinpath("dir", "[000] foo").exists()
+        assert not self.path.joinpath("foo").exists()
+        assert not self.path.joinpath("bar").exists()
    def test_copy_folder(self, tmpdir):
        # smart_copy also works on folders
        path = Path(str(tmpdir))
-        path["foo"].mkdir()
-        path["bar"].mkdir()
-        smart_copy(path["foo"], path["bar"])  # no crash
-        assert path["[000] bar"].exists()
+        path.joinpath("foo").mkdir()
+        path.joinpath("bar").mkdir()
+        smart_copy(path.joinpath("foo"), path.joinpath("bar"))  # no crash
+        assert path.joinpath("[000] bar").exists()

View File

@@ -6,261 +6,8 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
-import sys
-import os
-import pytest
-from ..path import Path, pathify
-from ..testutil import eq_
+from ..path import pathify
+from pathlib import Path
@pytest.fixture
def force_ossep(request):
monkeypatch = request.getfixturevalue("monkeypatch")
monkeypatch.setattr(os, "sep", "/")
def test_empty(force_ossep):
path = Path("")
eq_("", str(path))
eq_(0, len(path))
path = Path(())
eq_("", str(path))
eq_(0, len(path))
def test_single(force_ossep):
path = Path("foobar")
eq_("foobar", path)
eq_(1, len(path))
def test_multiple(force_ossep):
path = Path("foo/bar")
eq_("foo/bar", path)
eq_(2, len(path))
def test_init_with_tuple_and_list(force_ossep):
path = Path(("foo", "bar"))
eq_("foo/bar", path)
path = Path(["foo", "bar"])
eq_("foo/bar", path)
def test_init_with_invalid_value(force_ossep):
try:
Path(42)
assert False
except TypeError:
pass
def test_access(force_ossep):
path = Path("foo/bar/bleh")
eq_("foo", path[0])
eq_("foo", path[-3])
eq_("bar", path[1])
eq_("bar", path[-2])
eq_("bleh", path[2])
eq_("bleh", path[-1])
def test_slicing(force_ossep):
path = Path("foo/bar/bleh")
subpath = path[:2]
eq_("foo/bar", subpath)
assert isinstance(subpath, Path)
def test_parent(force_ossep):
path = Path("foo/bar/bleh")
subpath = path.parent()
eq_("foo/bar", subpath)
assert isinstance(subpath, Path)
def test_filename(force_ossep):
path = Path("foo/bar/bleh.ext")
eq_(path.name, "bleh.ext")
def test_deal_with_empty_components(force_ossep):
"""Keep ONLY a leading space, which means we want a leading slash."""
eq_("foo//bar", str(Path(("foo", "", "bar"))))
eq_("/foo/bar", str(Path(("", "foo", "bar"))))
eq_("foo/bar", str(Path("foo/bar/")))
def test_old_compare_paths(force_ossep):
eq_(Path("foobar"), Path("foobar"))
eq_(Path("foobar/"), Path("foobar\\", "\\"))
eq_(Path("/foobar/"), Path("\\foobar\\", "\\"))
eq_(Path("/foo/bar"), Path("\\foo\\bar", "\\"))
eq_(Path("/foo/bar"), Path("\\foo\\bar\\", "\\"))
assert Path("/foo/bar") != Path("\\foo\\foo", "\\")
# We also have to test __ne__
assert not (Path("foobar") != Path("foobar"))
assert Path("/a/b/c.x") != Path("/a/b/c.y")
def test_old_split_path(force_ossep):
eq_(Path("foobar"), ("foobar",))
eq_(Path("foo/bar"), ("foo", "bar"))
eq_(Path("/foo/bar/"), ("", "foo", "bar"))
eq_(Path("\\foo\\bar", "\\"), ("", "foo", "bar"))
def test_representation(force_ossep):
eq_("('foo', 'bar')", repr(Path(("foo", "bar"))))
def test_add(force_ossep):
eq_("foo/bar/bar/foo", Path(("foo", "bar")) + Path("bar/foo"))
eq_("foo/bar/bar/foo", Path("foo/bar") + "bar/foo")
eq_("foo/bar/bar/foo", Path("foo/bar") + ("bar", "foo"))
eq_("foo/bar/bar/foo", ("foo", "bar") + Path("bar/foo"))
eq_("foo/bar/bar/foo", "foo/bar" + Path("bar/foo"))
# Invalid concatenation
try:
Path(("foo", "bar")) + 1
assert False
except TypeError:
pass
def test_path_slice(force_ossep):
foo = Path("foo")
bar = Path("bar")
foobar = Path("foo/bar")
eq_("bar", foobar[foo:])
eq_("foo", foobar[:bar])
eq_("foo/bar", foobar[bar:])
eq_("foo/bar", foobar[:foo])
eq_((), foobar[foobar:])
eq_((), foobar[:foobar])
abcd = Path("a/b/c/d")
a = Path("a")
d = Path("d")
z = Path("z")
eq_("b/c", abcd[a:d])
eq_("b/c/d", abcd[a : d + z])
eq_("b/c", abcd[a : z + d])
eq_("a/b/c/d", abcd[:z])
def test_add_with_root_path(force_ossep):
"""if I perform /a/b/c + /d/e/f, I want /a/b/c/d/e/f, not /a/b/c//d/e/f"""
eq_("/foo/bar", str(Path("/foo") + Path("/bar")))
def test_create_with_tuple_that_have_slash_inside(force_ossep, monkeypatch):
eq_(("", "foo", "bar"), Path(("/foo", "bar")))
monkeypatch.setattr(os, "sep", "\\")
eq_(("", "foo", "bar"), Path(("\\foo", "bar")))
def test_auto_decode_os_sep(force_ossep, monkeypatch):
"""Path should decode any either / or os.sep, but always encode in os.sep."""
eq_(("foo\\bar", "bleh"), Path("foo\\bar/bleh"))
monkeypatch.setattr(os, "sep", "\\")
eq_(("foo", "bar/bleh"), Path("foo\\bar/bleh"))
path = Path("foo/bar")
eq_(("foo", "bar"), path)
eq_("foo\\bar", str(path))
def test_contains(force_ossep):
p = Path(("foo", "bar"))
assert Path(("foo", "bar", "bleh")) in p
assert Path(("foo", "bar")) in p
assert "foo" in p
assert "bleh" not in p
assert Path("foo") not in p
def test_is_parent_of(force_ossep):
assert Path(("foo", "bar")).is_parent_of(Path(("foo", "bar", "bleh")))
assert not Path(("foo", "bar")).is_parent_of(Path(("foo", "baz")))
assert not Path(("foo", "bar")).is_parent_of(Path(("foo", "bar")))
def test_windows_drive_letter(force_ossep):
p = Path(("c:",))
eq_("c:\\", str(p))
def test_root_path(force_ossep):
p = Path("/")
eq_("/", str(p))
def test_str_encodes_unicode_to_getfilesystemencoding(force_ossep):
p = Path(("foo", "bar\u00e9"))
eq_("foo/bar\u00e9".encode(sys.getfilesystemencoding()), p.tobytes())
def test_unicode(force_ossep):
p = Path(("foo", "bar\u00e9"))
eq_("foo/bar\u00e9", str(p))
def test_str_repr_of_mix_between_non_ascii_str_and_unicode(force_ossep):
u = "foo\u00e9"
encoded = u.encode(sys.getfilesystemencoding())
p = Path((encoded, "bar"))
print(repr(tuple(p)))
eq_("foo\u00e9/bar".encode(sys.getfilesystemencoding()), p.tobytes())
def test_path_of_a_path_returns_self(force_ossep):
# if Path() is called with a path as value, just return value.
p = Path("foo/bar")
assert Path(p) is p
def test_getitem_str(force_ossep):
# path['something'] returns the child path corresponding to the name
p = Path("/foo/bar")
eq_(p["baz"], Path("/foo/bar/baz"))
def test_getitem_path(force_ossep):
# path[Path('something')] returns the child path corresponding to the name (or subpath)
p = Path("/foo/bar")
eq_(p[Path("baz/bleh")], Path("/foo/bar/baz/bleh"))
@pytest.mark.xfail(reason="pytest's capture mechanism is flaky, I have to investigate")
def test_log_unicode_errors(force_ossep, monkeypatch, capsys):
# When an there's a UnicodeDecodeError on path creation, log it so it can be possible
# to debug the cause of it.
monkeypatch.setattr(sys, "getfilesystemencoding", lambda: "ascii")
with pytest.raises(UnicodeDecodeError):
Path(["", b"foo\xe9"])
out, err = capsys.readouterr()
assert repr(b"foo\xe9") in err
def test_has_drive_letter(monkeypatch):
monkeypatch.setattr(os, "sep", "\\")
p = Path("foo\\bar")
assert not p.has_drive_letter()
p = Path("C:\\")
assert p.has_drive_letter()
p = Path("z:\\foo")
assert p.has_drive_letter()
def test_remove_drive_letter(monkeypatch):
monkeypatch.setattr(os, "sep", "\\")
p = Path("foo\\bar")
eq_(p.remove_drive_letter(), Path("foo\\bar"))
p = Path("C:\\")
eq_(p.remove_drive_letter(), Path(""))
p = Path("z:\\foo")
eq_(p.remove_drive_letter(), Path("foo"))
def test_pathify():

View File

@@ -11,7 +11,7 @@ from io import StringIO
from pytest import raises
from ..testutil import eq_
-from ..path import Path
+from pathlib import Path
from ..util import (
    nonone,
    tryint,
@@ -245,30 +245,30 @@ class TestCaseDeleteIfEmpty:
    def test_not_empty(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].mkdir()
+        testpath.joinpath("foo").mkdir()
        assert not delete_if_empty(testpath)
        assert testpath.exists()
    def test_with_files_to_delete(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].open("w")
-        testpath["bar"].open("w")
+        testpath.joinpath("foo").touch()
+        testpath.joinpath("bar").touch()
        assert delete_if_empty(testpath, ["foo", "bar"])
        assert not testpath.exists()
    def test_directory_in_files_to_delete(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].mkdir()
+        testpath.joinpath("foo").mkdir()
        assert not delete_if_empty(testpath, ["foo"])
        assert testpath.exists()
    def test_delete_files_to_delete_only_if_dir_is_empty(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].open("w")
-        testpath["bar"].open("w")
+        testpath.joinpath("foo").touch()
+        testpath.joinpath("bar").touch()
        assert not delete_if_empty(testpath, ["foo"])
        assert testpath.exists()
-        assert testpath["foo"].exists()
+        assert testpath.joinpath("foo").exists()
    def test_doesnt_exist(self):
        # When the 'path' doesn't exist, just do nothing.
@@ -276,8 +276,8 @@ class TestCaseDeleteIfEmpty:
    def test_is_file(self, tmpdir):
        # When 'path' is a file, do nothing.
-        p = Path(str(tmpdir)) + "filename"
-        p.open("w").close()
+        p = Path(str(tmpdir)).joinpath("filename")
+        p.touch()
        delete_if_empty(p)  # no crash
    def test_ioerror(self, tmpdir, monkeypatch):

View File

@@ -129,18 +129,6 @@ def install_gettext_trans(base_folder, lang):
    installed_lang = lang
-def install_gettext_trans_under_cocoa():
-    from cocoa import proxy
-    res_folder = proxy.getResourcePath()
-    base_folder = op.join(res_folder, "locale")
-    current_lang = proxy.systemLang()
-    install_gettext_trans(base_folder, current_lang)
-    localename = get_locale_name(current_lang)
-    if localename is not None:
-        locale.setlocale(locale.LC_ALL, localename)
def install_gettext_trans_under_qt(base_folder, lang=None):
    # So, we install the gettext locale, great, but we also should try to install qt_*.qm if
    # available so that strings that are inside Qt itself over which I have no control are in the
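
For context, loading Qt's own qt_*.qm catalog from PyQt5 typically looks like this minimal sketch (the function body itself is outside this hunk):

from PyQt5.QtCore import QCoreApplication, QLibraryInfo, QLocale, QTranslator

qt_translator = QTranslator()  # keep a reference alive for the app's lifetime
qm_dir = QLibraryInfo.location(QLibraryInfo.TranslationsPath)
if qt_translator.load(QLocale(), "qt", "_", qm_dir):
    QCoreApplication.installTranslator(qt_translator)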

View File

@@ -15,7 +15,8 @@ import glob
import shutil
from datetime import timedelta
-from .path import Path, pathify, log_io_error
+from pathlib import Path
+from .path import pathify, log_io_error
def nonone(value, replace_value):
@@ -354,13 +355,13 @@ def find_in_path(name, paths=None):
@pathify
def delete_if_empty(path: Path, files_to_delete=[]):
    """Deletes the directory at 'path' if it is empty or if it only contains files_to_delete."""
-    if not path.exists() or not path.isdir():
+    if not path.exists() or not path.is_dir():
        return
-    contents = path.listdir()
-    if any(p for p in contents if (p.name not in files_to_delete) or p.isdir()):
+    contents = list(path.glob("*"))
+    if any(p for p in contents if (p.name not in files_to_delete) or p.is_dir()):
        return False
    for p in contents:
-        p.remove()
+        p.unlink()
    path.rmdir()
    return True
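
Typical use of the migrated helper, assuming it stays importable as hscommon.util and using hypothetical paths:

from pathlib import Path
from hscommon.util import delete_if_empty

# Removes the folder only if nothing but the named junk files remain in it.
if delete_if_empty(Path("/tmp/staging"), ["Thumbs.db", ".DS_Store"]):
    print("staging folder removed")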

View File

@@ -1,5 +1,7 @@
## How to build dupeGuru for macOS
+These instructions are for the Qt version of the UI on macOS.
+*Note: The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa and is no longer "supported".*
### Prerequisites
- [Python 3.7+][python]

View File

@@ -10,7 +10,7 @@ Vcs-Git: https://github.com/arsenetar/dupeguru.git
Package: {pkgname}
Architecture: {arch}
-Depends: ${shlibs:Depends}, python3 (>=3.7), python3-pyqt5, python3-mutagen
+Depends: ${shlibs:Depends}, python3 (>=3.7), python3-pyqt5, python3-mutagen, python3-semantic-version
Provides: dupeguru-se, dupeguru-me, dupeguru-pe
Replaces: dupeguru-se, dupeguru-me, dupeguru-pe
Conflicts: dupeguru-se, dupeguru-me, dupeguru-pe

View File

@@ -6,18 +6,11 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
-from PyQt5.QtCore import Qt, QCoreApplication
+from PyQt5.QtCore import Qt, QCoreApplication, QTimer
from PyQt5.QtGui import QPixmap, QFont
-from PyQt5.QtWidgets import (
-    QDialog,
-    QDialogButtonBox,
-    QSizePolicy,
-    QHBoxLayout,
-    QVBoxLayout,
-    QLabel,
-    QApplication,
-)
+from PyQt5.QtWidgets import QDialog, QDialogButtonBox, QSizePolicy, QHBoxLayout, QVBoxLayout, QLabel
+from core.util import check_for_update
from qtlib.util import move_to_screen_center
from hscommon.trans import trget
@@ -31,61 +24,56 @@ class AboutBox(QDialog):
        self.app = app
        self._setupUi()
-        self.buttonBox.accepted.connect(self.accept)
-        self.buttonBox.rejected.connect(self.reject)
+        self.button_box.accepted.connect(self.accept)
+        self.button_box.rejected.connect(self.reject)
    def _setupUi(self):
        self.setWindowTitle(tr("About {}").format(QCoreApplication.instance().applicationName()))
-        self.resize(400, 290)
        size_policy = QSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
-        size_policy.setHorizontalStretch(0)
-        size_policy.setVerticalStretch(0)
-        size_policy.setHeightForWidth(self.sizePolicy().hasHeightForWidth())
        self.setSizePolicy(size_policy)
-        self.horizontalLayout = QHBoxLayout(self)
-        self.logoLabel = QLabel(self)
-        self.logoLabel.setPixmap(QPixmap(":/%s_big" % self.app.LOGO_NAME))
-        self.horizontalLayout.addWidget(self.logoLabel)
-        self.verticalLayout = QVBoxLayout()
-        self.nameLabel = QLabel(self)
+        main_layout = QHBoxLayout(self)
+        logo_label = QLabel()
+        logo_label.setPixmap(QPixmap(":/%s_big" % self.app.LOGO_NAME))
+        main_layout.addWidget(logo_label)
+        detail_layout = QVBoxLayout()
+        name_label = QLabel()
        font = QFont()
        font.setWeight(75)
        font.setBold(True)
-        self.nameLabel.setFont(font)
-        self.nameLabel.setText(QCoreApplication.instance().applicationName())
-        self.verticalLayout.addWidget(self.nameLabel)
-        self.versionLabel = QLabel(self)
-        self.versionLabel.setText(tr("Version {}").format(QCoreApplication.instance().applicationVersion()))
-        self.verticalLayout.addWidget(self.versionLabel)
-        self.label_3 = QLabel(self)
-        self.verticalLayout.addWidget(self.label_3)
-        self.label_3.setText(tr("Licensed under GPLv3"))
-        self.label = QLabel(self)
-        font = QFont()
-        font.setWeight(75)
-        font.setBold(True)
-        self.label.setFont(font)
-        self.verticalLayout.addWidget(self.label)
-        self.buttonBox = QDialogButtonBox(self)
-        self.buttonBox.setOrientation(Qt.Horizontal)
-        self.buttonBox.setStandardButtons(QDialogButtonBox.Ok)
-        self.verticalLayout.addWidget(self.buttonBox)
-        self.horizontalLayout.addLayout(self.verticalLayout)
+        name_label.setFont(font)
+        name_label.setText(QCoreApplication.instance().applicationName())
+        detail_layout.addWidget(name_label)
+        version_label = QLabel()
+        version_label.setText(tr("Version {}").format(QCoreApplication.instance().applicationVersion()))
+        detail_layout.addWidget(version_label)
+        self.update_label = QLabel(tr("Checking for updates..."))
+        self.update_label.setTextInteractionFlags(Qt.TextBrowserInteraction)
+        self.update_label.setOpenExternalLinks(True)
+        detail_layout.addWidget(self.update_label)
+        license_label = QLabel()
+        license_label.setText(tr("Licensed under GPLv3"))
+        detail_layout.addWidget(license_label)
+        spacer_label = QLabel()
+        spacer_label.setFont(font)
+        detail_layout.addWidget(spacer_label)
+        self.button_box = QDialogButtonBox()
+        self.button_box.setOrientation(Qt.Horizontal)
+        self.button_box.setStandardButtons(QDialogButtonBox.Ok)
+        detail_layout.addWidget(self.button_box)
+        main_layout.addLayout(detail_layout)
+    def _check_for_update(self):
+        update = check_for_update(QCoreApplication.instance().applicationVersion(), include_prerelease=False)
+        if update is None:
+            self.update_label.setText(tr("No update available."))
+        else:
+            self.update_label.setText(
+                tr('New version {} available, download <a href="{}">here</a>.').format(update["version"], update["url"])
+            )
    def showEvent(self, event):
+        self.update_label.setText(tr("Checking for updates..."))
        # have to do this here as the frameGeometry is not correct until shown
        move_to_screen_center(self)
        super().showEvent(event)
+        QTimer.singleShot(0, self._check_for_update)
-if __name__ == "__main__":
-    import sys
-    app = QApplication([])
-    QCoreApplication.setOrganizationName("Hardcoded Software")
-    QCoreApplication.setApplicationName("FooApp")
-    QCoreApplication.setApplicationVersion("1.2.3")
-    app.LOGO_NAME = ""
-    dialog = AboutBox(None, app)
-    dialog.show()
-    sys.exit(app.exec_())
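
check_for_update() itself is not part of this hunk; a hedged sketch of what a GitHub-releases-based implementation matching the call site could look like (the actual core.util code may differ):

import json
import urllib.request
from semantic_version import Version

# Hypothetical constant; the real code may build this URL differently.
RELEASES_URL = "https://api.github.com/repos/arsenetar/dupeguru/releases"

def check_for_update(current_version, include_prerelease=False):
    with urllib.request.urlopen(RELEASES_URL) as response:
        releases = json.load(response)
    current = Version(current_version)
    candidates = []
    for release in releases:
        if release["prerelease"] and not include_prerelease:
            continue
        version = Version(release["tag_name"].lstrip("v"))
        if version > current:
            candidates.append({"version": str(version), "url": release["html_url"]})
    # Highest version newer than the running one; None means up to date.
    return max(candidates, key=lambda c: Version(c["version"]), default=None)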

View File

@@ -4,18 +4,30 @@ msgstr ""
"Content-Type: text/plain; charset=utf-8\n" "Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: utf-8\n" "Content-Transfer-Encoding: utf-8\n"
#: qtlib\about_box.py:38 #: qtlib\about_box.py:31
msgid "About {}" msgid "About {}"
msgstr "" msgstr ""
#: qtlib\about_box.py:58 #: qtlib\about_box.py:47
msgid "Version {}" msgid "Version {}"
msgstr "" msgstr ""
#: qtlib\about_box.py:62 #: qtlib\about_box.py:49 qtlib\about_box.py:75
msgid "Checking for updates..."
msgstr ""
#: qtlib\about_box.py:54
msgid "Licensed under GPLv3" msgid "Licensed under GPLv3"
msgstr "" msgstr ""
#: qtlib\about_box.py:68
msgid "No update available."
msgstr ""
#: qtlib\about_box.py:71
msgid "New version {} available, download <a href=\"{}\">here</a>."
msgstr ""
#: qtlib\error_report_dialog.py:50 #: qtlib\error_report_dialog.py:50
msgid "Error Report" msgid "Error Report"
msgstr "" msgstr ""

View File

@@ -1,7 +1,9 @@
+distro>=1.5.0
+mutagen>=1.44.0
+polib>=1.1.0
+PyQt5 >=5.14.1,<6.0; sys_platform != 'linux'
+pywin32>=228; sys_platform == 'win32'
+semantic-version>=2.0.0,<3.0.0
Send2Trash>=1.3.0
sphinx>=3.0.0
-polib>=1.1.0
-mutagen>=1.44.0
-distro>=1.5.0
-PyQt5 >=5.14.1,<6.0; sys_platform != 'linux'
-pywin32>=228; sys_platform == 'win32'
+xxhash>=3.0.0,<4.0.0

View File

@@ -34,6 +34,8 @@ install_requires =
    distro>=1.5.0
    PyQt5 >=5.14.1,<6.0; sys_platform != 'linux'
    pywin32>=228; sys_platform == 'win32'
+    semantic-version>=2.0.0,<3.0.0
+    xxhash>=3.0.0,<4.0.0
setup_requires =
    sphinx>=3.0.0
    polib>=1.1.0