mirror of https://github.com/arsenetar/dupeguru.git, synced 2026-02-01 19:11:38 +00:00

Compare commits: 4.2.1...a37b5b0eeb (12 commits)
| SHA1 |
|---|
| a37b5b0eeb |
| efd500ecc1 |
| 43fcc52291 |
| 50f5db1543 |
| a5b0ccdd02 |
| 143147cb8e |
| ebb81d9f03 |
| da9f8b2b9d |
| 5ed5eddde6 |
| 9f40e4e786 |
| 86bf9b39d0 |
| c0be0aecbd |
README.md (12 changed lines)
@@ -1,16 +1,12 @@
 # dupeGuru

 [dupeGuru][dupeguru] is a cross-platform (Linux, OS X, Windows) GUI tool to find duplicate files in
-a system. It is written mostly in Python 3 and has the peculiarity of using
-[multiple GUI toolkits][cross-toolkit], all using the same core Python code. On OS X, the UI layer
-is written in Objective-C and uses Cocoa. On Linux, it is written in Python and uses Qt5.
-
-The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa
+a system. It is written mostly in Python 3 and uses [qt](https://www.qt.io/) for the UI.

 ## Current status
 Still looking for additional help especially with regards to:
-* OSX maintenance: reproducing bugs & cocoa version, building package with Cocoa UI.
-* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package.
+* OSX maintenance: reproducing bugs, packaging verification.
+* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package, rpm package.
 * Translations: updating missing strings, transifex project at https://www.transifex.com/voltaicideas/dupeguru-1
 * Documentation: keeping it up-to-date.
@@ -43,12 +39,10 @@ For macos instructions (qt version) see the [macOS Instructions](macos.md).
 When running in a linux based environment the following system packages or equivalents are needed to build:
 * python3-pyqt5
 * pyqt5-dev-tools (on some systems, see note)
-* python3-wheel (for hsaudiotag3k)
 * python3-venv (only if using a virtual environment)
 * python3-dev
 * build-essential
-

 Note: On some linux systems pyrcc5 is not put on the path when installing python3-pyqt5, this will cause some issues with the resource files (and icons). These systems should have a respective pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked with `which pyrcc5`. Debian based systems need the extra package, and Arch does not.

 To create packages the following are also needed:
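The `which pyrcc5` check in the note can also be scripted. A minimal Python sketch using only the standard library (the error message is illustrative, not part of the project):

```python
import shutil
import sys

# shutil.which() does the same PATH lookup as the shell's `which`.
if shutil.which("pyrcc5") is None:
    sys.exit("pyrcc5 not found on PATH; install your distro's pyqt5-dev-tools package")
```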
build.py (7 changed lines)
@@ -109,10 +109,7 @@ def build_updatepot():
     print("Building columns.pot")
     loc.generate_pot(["core"], Path("locale", "columns.pot"), ["coltr"])
     print("Building ui.pot")
-    # When we're not under OS X, we don't want to overwrite ui.pot because it contains Cocoa locs
-    # We want to merge the generated pot with the old pot in the most preserving way possible.
-    ui_packages = ["qt", Path("cocoa", "inter")]
-    loc.generate_pot(ui_packages, Path("locale", "ui.pot"), ["tr"], merge=True)
+    loc.generate_pot(["qt"], Path("locale", "ui.pot"), ["tr"], merge=True)
     print("Building qtlib.pot")
     loc.generate_pot(["qtlib"], Path("qtlib", "locale", "qtlib.pot"), ["tr"])
@@ -121,13 +118,11 @@ def build_mergepot():
     print("Updating .po files using .pot files")
     loc.merge_pots_into_pos("locale")
     loc.merge_pots_into_pos(Path("qtlib", "locale"))
-    # loc.merge_pots_into_pos(Path("cocoalib", "locale"))


 def build_normpo():
     loc.normalize_all_pos("locale")
     loc.normalize_all_pos(Path("qtlib", "locale"))
-    # loc.normalize_all_pos(Path("cocoalib", "locale"))


 def build_pe_modules():
core/app.py (20 changed lines)
@@ -10,11 +10,11 @@ import logging
 import subprocess
 import re
 import shutil
+from pathlib import Path

 from send2trash import send2trash
 from hscommon.jobprogress import job
 from hscommon.notify import Broadcaster
-from hscommon.path import Path
 from hscommon.conflict import smart_move, smart_copy
 from hscommon.gui.progress_window import ProgressWindow
 from hscommon.util import delete_if_empty, first, escape, nonone, allsame
@@ -248,7 +248,7 @@ class DupeGuru(Broadcaster):
             ref = group.ref
             linkfunc = os.link if use_hardlinks else os.symlink
             linkfunc(str(ref.path), str_path)
-        self.clean_empty_dirs(dupe.path.parent())
+        self.clean_empty_dirs(dupe.path.parent)

     def _create_file(self, path):
         # We add fs.Folder to fileclasses in case the file we're loading contains folder paths.
@@ -415,7 +415,7 @@ class DupeGuru(Broadcaster):
     def clean_empty_dirs(self, path):
         if self.options["clean_empty_dirs"]:
             while delete_if_empty(path, [".DS_Store"]):
-                path = path.parent()
+                path = path.parent

     def clear_picture_cache(self):
         try:
@@ -428,25 +428,25 @@ class DupeGuru(Broadcaster):

     def copy_or_move(self, dupe, copy: bool, destination: str, dest_type: DestType):
         source_path = dupe.path
-        location_path = first(p for p in self.directories if dupe.path in p)
+        location_path = first(p for p in self.directories if p in dupe.path.parents)
         dest_path = Path(destination)
         if dest_type in {DestType.RELATIVE, DestType.ABSOLUTE}:
             # no filename, no windows drive letter
-            source_base = source_path.remove_drive_letter().parent()
+            source_base = source_path.relative_to(source_path.anchor).parent
             if dest_type == DestType.RELATIVE:
-                source_base = source_base[location_path:]
-            dest_path = dest_path[source_base]
+                source_base = source_base.relative_to(location_path.relative_to(location_path.anchor))
+            dest_path = dest_path.joinpath(source_base)
             if not dest_path.exists():
-                dest_path.makedirs()
+                dest_path.mkdir(parents=True)
         # Add filename to dest_path. For file move/copy, it's not required, but for folders, yes.
-        dest_path = dest_path[source_path.name]
+        dest_path = dest_path.joinpath(source_path.name)
         logging.debug("Copy/Move operation from '%s' to '%s'", source_path, dest_path)
         # Raises an EnvironmentError if there's a problem
         if copy:
             smart_copy(source_path, dest_path)
         else:
             smart_move(source_path, dest_path)
-        self.clean_empty_dirs(source_path.parent())
+        self.clean_empty_dirs(source_path.parent)

     def copy_or_move_marked(self, copy):
         """Start an async move (or copy) job on marked duplicates.
core/directories.py

@@ -7,9 +7,9 @@
 import os
 from xml.etree import ElementTree as ET
 import logging
+from pathlib import Path

 from hscommon.jobprogress import job
-from hscommon.path import Path
 from hscommon.util import FileOrPath
 from hscommon.trans import tr
@@ -63,7 +63,7 @@ class Directories:

     def __contains__(self, path):
         for p in self._dirs:
-            if path in p:
+            if path == p or p in path.parents:
                 return True
         return False
@@ -90,44 +90,44 @@ class Directories:
             return DirectoryState.EXCLUDED

     def _get_files(self, from_path, fileclasses, j):
-        for root, dirs, files in os.walk(str(from_path)):
-            j.check_if_cancelled()
-            root_path = Path(root)
-            state = self.get_state(root_path)
-            if state == DirectoryState.EXCLUDED and not any(p[: len(root_path)] == root_path for p in self.states):
-                # Recursively get files from folders with lots of subfolder is expensive. However, there
-                # might be a subfolder in this path that is not excluded. What we want to do is to skim
-                # through self.states and see if we must continue, or we can stop right here to save time
-                del dirs[:]
-            try:
-                if state != DirectoryState.EXCLUDED:
-                    # Old logic
-                    if self._exclude_list is None or not self._exclude_list.mark_count:
-                        found_files = [fs.get_file(root_path + f, fileclasses=fileclasses) for f in files]
-                    else:
-                        found_files = []
-                        # print(f"len of files: {len(files)} {files}")
-                        for f in files:
-                            if not self._exclude_list.is_excluded(root, f):
-                                found_files.append(fs.get_file(root_path + f, fileclasses=fileclasses))
-                    found_files = [f for f in found_files if f is not None]
-                    # In some cases, directories can be considered as files by dupeGuru, which is
-                    # why we have this line below. In fact, there only one case: Bundle files under
-                    # OS X... In other situations, this forloop will do nothing.
-                    for d in dirs[:]:
-                        f = fs.get_file(root_path + d, fileclasses=fileclasses)
-                        if f is not None:
-                            found_files.append(f)
-                            dirs.remove(d)
-                logging.debug(
-                    "Collected %d files in folder %s",
-                    len(found_files),
-                    str(root_path),
-                )
-                for file in found_files:
-                    file.is_ref = state == DirectoryState.REFERENCE
-                    yield file
-            except (EnvironmentError, fs.InvalidPath):
-                pass
+        try:
+            with os.scandir(from_path) as iter:
+                root_path = Path(from_path)
+                state = self.get_state(root_path)
+                # if we have no un-excluded dirs under this directory skip going deeper
+                skip_dirs = state == DirectoryState.EXCLUDED and not any(
+                    p.parts[: len(root_path.parts)] == root_path.parts for p in self.states
+                )
+                count = 0
+                for item in iter:
+                    j.check_if_cancelled()
+                    try:
+                        if item.is_dir():
+                            if skip_dirs:
+                                continue
+                            yield from self._get_files(item.path, fileclasses, j)
+                            continue
+                        elif state == DirectoryState.EXCLUDED:
+                            continue
+                        # File excluding or not
+                        if (
+                            self._exclude_list is None
+                            or not self._exclude_list.mark_count
+                            or not self._exclude_list.is_excluded(str(from_path), item.name)
+                        ):
+                            file = fs.get_file(item, fileclasses=fileclasses)
+                            if file:
+                                file.is_ref = state == DirectoryState.REFERENCE
+                                count += 1
+                                yield file
+                    except (EnvironmentError, OSError, fs.InvalidPath):
+                        pass
+                logging.debug(
+                    "Collected %d files in folder %s",
+                    count,
+                    str(root_path),
+                )
+        except OSError:
+            pass

     def _get_folders(self, from_folder, j):
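The rewrite above trades a single os.walk() pass for explicit os.scandir() recursion: DirEntry objects carry type information cached from the directory read, so is_dir() checks usually avoid an extra stat() call, and excluded subtrees can be skipped before they are ever listed. A stripped-down sketch of the same traversal shape (hypothetical helper, without dupeGuru's state and exclusion handling):

```python
import os

def iter_files(root):
    # Recursive scandir walk: descend into directories, yield file paths.
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from iter_files(entry.path)
            elif entry.is_file(follow_symlinks=False):
                yield entry.path
```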
@@ -159,7 +159,7 @@ class Directories:
             raise AlreadyThereError()
         if not path.exists():
             raise InvalidPathError()
-        self._dirs = [p for p in self._dirs if p not in path]
+        self._dirs = [p for p in self._dirs if path not in p.parents]
         self._dirs.append(path)

     @staticmethod
@@ -170,7 +170,7 @@ class Directories:
         :rtype: list of Path
         """
         try:
-            subpaths = [p for p in path.listdir() if p.isdir()]
+            subpaths = [p for p in path.glob("*") if p.is_dir()]
             subpaths.sort(key=lambda x: x.name.lower())
             return subpaths
         except EnvironmentError:
@@ -220,14 +220,11 @@ class Directories:
         if state != DirectoryState.NORMAL:
             self.states[path] = state
             return state
-
-        prevlen = 0
-        # we loop through the states to find the longest matching prefix
-        # if the parent has a state in cache, return that state
-        for p, s in self.states.items():
-            if p.is_parent_of(path) and len(p) > prevlen:
-                prevlen = len(p)
-                state = s
+        # find the longest parent path that is in states and return that state if found
+        # NOTE: path.parents is ordered longest to shortest
+        for parent_path in path.parents:
+            if parent_path in self.states:
+                return self.states[parent_path]
         return state

     def has_any_file(self):
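The NOTE in the new code leans on a pathlib guarantee worth spelling out: `Path.parents` iterates from the immediate parent up to the root, so the first ancestor found in `self.states` is automatically the longest (most specific) matching prefix. For example:

```python
from pathlib import PurePosixPath

# parents is ordered from nearest ancestor to the root:
parents = list(PurePosixPath("/music/albums/2021/track.flac").parents)
assert parents[0] == PurePosixPath("/music/albums/2021")
assert parents[-1] == PurePosixPath("/")
```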
@@ -296,6 +293,6 @@ class Directories:
         if self.get_state(path) == state:
             return
         for iter_path in list(self.states.keys()):
-            if path.is_parent_of(iter_path):
+            if path in iter_path.parents:
                 del self.states[iter_path]
         self.states[path] = state
core/engine.py

@@ -283,7 +283,7 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
     """Returns a list of :class:`Match` within ``files`` if their contents is the same.

     :param bigsize: The size in bytes over which we consider files big enough to
-        justify taking samples of md5. If 0, compute md5 as usual.
+        justify taking samples of the file for hashing. If 0, compute digest as usual.
     :param j: A :ref:`job progress instance <jobs>`.
     """
     size2files = defaultdict(set)
@@ -300,15 +300,15 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
             if first.is_ref and second.is_ref:
                 continue  # Don't spend time comparing two ref pics together.
             if first.size == 0 and second.size == 0:
-                # skip md5 for zero length files
+                # skip hashing for zero length files
                 result.append(Match(first, second, 100))
                 continue
-            if first.md5partial == second.md5partial:
+            if first.digest_partial == second.digest_partial:
                 if bigsize > 0 and first.size > bigsize:
-                    if first.md5samples == second.md5samples:
+                    if first.digest_samples == second.digest_samples:
                         result.append(Match(first, second, 100))
                 else:
-                    if first.md5 == second.md5:
+                    if first.digest == second.digest:
                         result.append(Match(first, second, 100))
         group_count += 1
         j.add_progress(desc=PROGRESS_MESSAGE % (len(result), group_count))
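The renamed attributes keep the same three-tier cascade as the md5 version: a cheap partial digest first, then either sampled digests (files above `bigsize`) or a full-content digest. A standalone sketch of that decision flow (hypothetical `first`/`second` objects exposing the same attributes, not the project's File class):

```python
def contents_match(first, second, bigsize=0):
    """Tiered comparison sketch mirroring getmatches_by_contents."""
    if first.size == 0 and second.size == 0:
        return True  # two zero-length files match without any hashing
    if first.digest_partial != second.digest_partial:
        return False  # the cheap partial probe already differs
    if bigsize > 0 and first.size > bigsize:
        # big file: compare three sampled chunks instead of hashing it all
        return first.digest_samples == second.digest_samples
    return first.digest == second.digest  # otherwise hash the whole file
```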
core/fs.py (194 changed lines)
@@ -11,14 +11,24 @@
 # resulting needless complexity and memory usage. It's been a while since I wanted to do that fork,
 # and I'm doing it now.

-import hashlib
+import os
+
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
+
 from math import floor
 import logging
 import sqlite3
 from threading import Lock
-from typing import Any
+from typing import Any, AnyStr, Union

-from hscommon.path import Path
+from pathlib import Path
 from hscommon.util import nonone, get_file_ext

 __all__ = [
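A minimal sketch of the optional-dependency pattern introduced above: callers only ever touch the `hasher` alias, which works because xxhash's `xxh128` and `hashlib.md5` share the same constructor/update()/digest() interface:

```python
try:
    import xxhash
    hasher = xxhash.xxh128  # fast, non-cryptographic 128-bit hash
except ImportError:
    import hashlib
    hasher = hashlib.md5  # stdlib fallback with the same API

def digest_bytes(data: bytes) -> bytes:
    # Identical call pattern regardless of which backend was imported.
    h = hasher()
    h.update(data)
    return h.digest()
```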
@@ -40,7 +50,7 @@ NOT_SET = object()
 # CPU.
 CHUNK_SIZE = 1024 * 1024  # 1 MiB

-# Minimum size below which partial hashes don't need to be computed
+# Minimum size below which partial hashing is not used
 MIN_FILE_SIZE = 3 * CHUNK_SIZE  # 3MiB, because we take 3 samples
@@ -83,9 +93,11 @@ class OperationError(FSError):


 class FilesDB:
+    schema_version = 1
+    schema_version_description = "Changed from md5 to xxhash if available."

-    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, md5 BLOB, md5partial BLOB)"
-    drop_table_query = "DROP TABLE files;"
+    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, digest BLOB, digest_partial BLOB, digest_samples BLOB)"
+    drop_table_query = "DROP TABLE IF EXISTS files;"
     select_query = "SELECT {key} FROM files WHERE path=:path AND size=:size and mtime_ns=:mtime_ns"
     insert_query = """
         INSERT INTO files (path, size, mtime_ns, entry_dt, {key}) VALUES (:path, :size, :mtime_ns, datetime('now'), :value)
@@ -97,24 +109,37 @@ class FilesDB:
         self.cur = None
         self.lock = None

-    def connect(self, path):
-        # type: (str, ) -> None
-
+    def connect(self, path: Union[AnyStr, os.PathLike]) -> None:
         self.conn = sqlite3.connect(path, check_same_thread=False)
         self.cur = self.conn.cursor()
-        self.cur.execute(self.create_table_query)
         self.lock = Lock()
+        self._check_upgrade()

-    def clear(self):
-        # type: () -> None
+    def _check_upgrade(self) -> None:
+        with self.lock:
+            has_schema = self.cur.execute(
+                "SELECT NAME FROM sqlite_master WHERE type='table' AND name='schema_version'"
+            ).fetchall()
+            version = None
+            if has_schema:
+                version = self.cur.execute("SELECT version FROM schema_version ORDER BY version DESC").fetchone()[0]
+            else:
+                self.cur.execute("CREATE TABLE schema_version (version int PRIMARY KEY, description TEXT)")
+            if version != self.schema_version:
+                self.cur.execute(self.drop_table_query)
+                self.cur.execute(
+                    "INSERT OR REPLACE INTO schema_version VALUES (:version, :description)",
+                    {"version": self.schema_version, "description": self.schema_version_description},
+                )
+                self.cur.execute(self.create_table_query)
+            self.conn.commit()
+
+    def clear(self) -> None:
         with self.lock:
             self.cur.execute(self.drop_table_query)
             self.cur.execute(self.create_table_query)

-    def get(self, path, key):
-        # type: (Path, str) -> bytes
-
+    def get(self, path: Path, key: str) -> Union[bytes, None]:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns
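A hedged usage sketch of the new upgrade gate (assuming the module is importable as core.fs): the first connect() creates both tables and stamps `schema_version`; a later connect() with a different `schema_version` class value drops and recreates `files`, which is how stale md5-era cache rows get invalidated:

```python
from core.fs import filesdb

filesdb.connect(":memory:")  # creates the files and schema_version tables
filesdb.commit()
filesdb.close()
```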
@@ -128,9 +153,7 @@ class FilesDB:
         return None

-    def put(self, path, key, value):
-        # type: (Path, str, Any) -> None
-
+    def put(self, path: Path, key: str, value: Any) -> None:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns
@@ -141,15 +164,11 @@ class FilesDB:
             {"path": str(path), "size": size, "mtime_ns": mtime_ns, "value": value},
         )

-    def commit(self):
-        # type: () -> None
-
+    def commit(self) -> None:
         with self.lock:
             self.conn.commit()

-    def close(self):
-        # type: () -> None
-
+    def close(self) -> None:
         with self.lock:
             self.cur.close()
             self.conn.close()
@@ -161,7 +180,7 @@ filesdb = FilesDB()  # Singleton


 class File:
     """Represents a file and holds metadata to be used for scanning."""

-    INITIAL_INFO = {"size": 0, "mtime": 0, "md5": b"", "md5partial": b"", "md5samples": b""}
+    INITIAL_INFO = {"size": 0, "mtime": 0, "digest": b"", "digest_partial": b"", "digest_samples": b""}
     # Slots for File make us save quite a bit of memory. In a memory test I've made with a lot of
     # files, I saved 35% memory usage with "unread" files (no _read_info() call) and gains become
     # even greater when we take into account read attributes (70%!). Yeah, it's worth it.
@@ -187,32 +206,51 @@ class File:
                 result = self.INITIAL_INFO[attrname]
         return result

-    def _calc_md5(self):
+    def _calc_digest(self):
         # type: () -> bytes

         with self.path.open("rb") as fp:
-            md5 = hashlib.md5()
+            file_hash = hasher()
             # The goal here is to not run out of memory on really big files. However, the chunk
             # size has to be large enough so that the python loop isn't too costly in terms of
             # CPU.
             CHUNK_SIZE = 1024 * 1024  # 1 mb
             filedata = fp.read(CHUNK_SIZE)
             while filedata:
-                md5.update(filedata)
+                file_hash.update(filedata)
                 filedata = fp.read(CHUNK_SIZE)
-            return md5.digest()
+            return file_hash.digest()

-    def _calc_md5partial(self):
+    def _calc_digest_partial(self):
         # type: () -> bytes

-        # This offset is where we should start reading the file to get a partial md5
+        # This offset is where we should start reading the file to get a partial hash
         # For audio file, it should be where audio data starts
         offset, size = (0x4000, 0x4000)

         with self.path.open("rb") as fp:
             fp.seek(offset)
-            partialdata = fp.read(size)
-            return hashlib.md5(partialdata).digest()
+            partial_data = fp.read(size)
+            return hasher(partial_data).digest()

+    def _calc_digest_samples(self) -> bytes:
+        size = self.size
+        with self.path.open("rb") as fp:
+            # Chunk at 25% of the file
+            fp.seek(floor(size * 25 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash = hasher(file_data)
+
+            # Chunk at 60% of the file
+            fp.seek(floor(size * 60 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+
+            # Last chunk of the file
+            fp.seek(-CHUNK_SIZE, 2)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+            return file_hash.digest()
+
     def _read_info(self, field):
         # print(f"_read_info({field}) for {self}")
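As a worked example of the sampling scheme above: three 1 MiB chunks are hashed, starting at 25% and 60% of the file plus the final 1 MiB (for sizes between 3 and 4 MiB the regions overlap, which is harmless). A standalone sketch of the offsets, mirroring `_calc_digest_samples`:

```python
from math import floor

CHUNK_SIZE = 1024 * 1024  # 1 MiB, same constant as above

def sample_offsets(size):
    # (offset, length) pairs: the 25% mark, the 60% mark, and the tail.
    return [
        (floor(size * 25 / 100), CHUNK_SIZE),
        (floor(size * 60 / 100), CHUNK_SIZE),
        (size - CHUNK_SIZE, CHUNK_SIZE),
    ]

# For a 4 MiB (4194304-byte) file: offsets 1048576, 2516582 and 3145728.
```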
@@ -220,48 +258,35 @@ class File:
             stats = self.path.stat()
             self.size = nonone(stats.st_size, 0)
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field == "md5partial":
+        elif field == "digest_partial":
             try:
-                self.md5partial = filesdb.get(self.path, "md5partial")
-                if self.md5partial is None:
-                    self.md5partial = self._calc_md5partial()
-                    filesdb.put(self.path, "md5partial", self.md5partial)
+                self.digest_partial = filesdb.get(self.path, "digest_partial")
+                if self.digest_partial is None:
+                    self.digest_partial = self._calc_digest_partial()
+                    filesdb.put(self.path, "digest_partial", self.digest_partial)
             except Exception as e:
-                logging.warning("Couldn't get md5partial for %s: %s", self.path, e)
-        elif field == "md5":
+                logging.warning("Couldn't get digest_partial for %s: %s", self.path, e)
+        elif field == "digest":
             try:
-                self.md5 = filesdb.get(self.path, "md5")
-                if self.md5 is None:
-                    self.md5 = self._calc_md5()
-                    filesdb.put(self.path, "md5", self.md5)
+                self.digest = filesdb.get(self.path, "digest")
+                if self.digest is None:
+                    self.digest = self._calc_digest()
+                    filesdb.put(self.path, "digest", self.digest)
             except Exception as e:
-                logging.warning("Couldn't get md5 for %s: %s", self.path, e)
-        elif field == "md5samples":
-            try:
-                with self.path.open("rb") as fp:
-                    size = self.size
-                    # Might as well hash such small files entirely.
-                    if size <= MIN_FILE_SIZE:
-                        setattr(self, field, self.md5)
-                        return
-
-                    # Chunk at 25% of the file
-                    fp.seek(floor(size * 25 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5 = hashlib.md5(filedata)
-
-                    # Chunk at 60% of the file
-                    fp.seek(floor(size * 60 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-
-                    # Last chunk of the file
-                    fp.seek(-CHUNK_SIZE, 2)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-                    setattr(self, field, md5.digest())
-            except Exception as e:
-                logging.error(f"Error computing md5samples: {e}")
+                logging.warning("Couldn't get digest for %s: %s", self.path, e)
+        elif field == "digest_samples":
+            size = self.size
+            # Might as well hash such small files entirely.
+            if size <= MIN_FILE_SIZE:
+                setattr(self, field, self.digest)
+                return
+            try:
+                self.digest_samples = filesdb.get(self.path, "digest_samples")
+                if self.digest_samples is None:
+                    self.digest_samples = self._calc_digest_samples()
+                    filesdb.put(self.path, "digest_samples", self.digest_samples)
+            except Exception as e:
+                logging.warning(f"Couldn't get digest_samples for {self.path}: {e}")
@@ -277,14 +302,14 @@ class File:
     @classmethod
     def can_handle(cls, path):
         """Returns whether this file wrapper class can handle ``path``."""
-        return not path.islink() and path.isfile()
+        return not path.is_symlink() and path.is_file()

     def rename(self, newname):
         if newname == self.name:
             return
-        destpath = self.path.parent()[newname]
+        destpath = self.path.parent.joinpath(newname)
         if destpath.exists():
-            raise AlreadyExistsError(newname, self.path.parent())
+            raise AlreadyExistsError(newname, self.path.parent)
         try:
             self.path.rename(destpath)
         except EnvironmentError:
@@ -308,13 +333,13 @@ class File:

     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent


 class Folder(File):
     """A wrapper around a folder path.

-    It has the size/md5 info of a File, but its value is the sum of its subitems.
+    It has the size/digest info of a File, but its value is the sum of its subitems.
     """

     __slots__ = File.__slots__ + ("_subfolders",)
@@ -335,31 +360,31 @@ class Folder(File):
             self.size = size
             stats = self.path.stat()
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field in {"md5", "md5partial", "md5samples"}:
+        elif field in {"digest", "digest_partial", "digest_samples"}:
             # What's sensitive here is that we must make sure that subfiles'
-            # md5 are always added up in the same order, but we also want a
-            # different md5 if a file gets moved in a different subdirectory.
+            # digest are always added up in the same order, but we also want a
+            # different digest if a file gets moved in a different subdirectory.

-            def get_dir_md5_concat():
+            def get_dir_digest_concat():
                 items = self._all_items()
                 items.sort(key=lambda f: f.path)
-                md5s = [getattr(f, field) for f in items]
-                return b"".join(md5s)
+                digests = [getattr(f, field) for f in items]
+                return b"".join(digests)

-            md5 = hashlib.md5(get_dir_md5_concat())
-            digest = md5.digest()
+            digest = hasher(get_dir_digest_concat()).digest()
             setattr(self, field, digest)

     @property
     def subfolders(self):
         if self._subfolders is None:
-            subfolders = [p for p in self.path.listdir() if not p.islink() and p.isdir()]
-            self._subfolders = [self.__class__(p) for p in subfolders]
+            with os.scandir(self.path) as iter:
+                subfolders = [p.path for p in iter if not p.is_symlink() and p.is_dir()]
+            self._subfolders = [self.__class__(Path(p)) for p in subfolders]
         return self._subfolders

     @classmethod
     def can_handle(cls, path):
-        return not path.islink() and path.isdir()
+        return not path.is_symlink() and path.is_dir()


 def get_file(path, fileclasses=[File]):
@@ -372,6 +397,8 @@ def get_file(path, fileclasses=[File]):
     """
     for fileclass in fileclasses:
         if fileclass.can_handle(path):
+            if type(path) is os.DirEntry:
+                return fileclass(Path(path.path))
             return fileclass(path)
@@ -384,8 +411,9 @@ def get_files(path, fileclasses=[File]):
     assert all(issubclass(fileclass, File) for fileclass in fileclasses)
     try:
         result = []
-        for path in path.listdir():
-            file = get_file(path, fileclasses=fileclasses)
+        with os.scandir(path) as iter:
+            for item in iter:
+                file = get_file(item, fileclasses=fileclasses)
                 if file is not None:
                     result.append(file)
         return result
core/me/fs.py

@@ -97,11 +97,6 @@ class MusicFile(fs.File):
             "dupe_count": format_dupe_count(dupe_count),
         }

-    def _get_md5partial_offset_and_size(self):
-        # No longer calculating the offset and audio size, just whole file
-        size = self.path.stat().st_size
-        return (0, size)
-
     def _read_info(self, field):
         fs.File._read_info(self, field)
         if field in TAG_FIELDS:
core/pe/matchblock.py

@@ -238,7 +238,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljob):
     for ref_id, other_id, percentage in myiter:
         ref = id2picture[ref_id]
         other = id2picture[other_id]
-        if percentage == 100 and ref.md5 != other.md5:
+        if percentage == 100 and ref.digest != other.digest:
             percentage = 99
         if percentage >= threshold:
             ref.dimensions  # pre-read dimensions for display in results
core/prioritize.py

@@ -82,10 +82,12 @@ class FolderCategory(ValueListCategory):

     def sort_key(self, dupe, crit_value):
         value = self.extract_value(dupe)
-        if value[: len(crit_value)] == crit_value:
-            return 0
-        else:
+        # This is instead of using is_relative_to() which was added in py 3.9
+        try:
+            value.relative_to(crit_value)
+        except ValueError:
             return 1
+        return 0


 class FilenameCategory(CriterionCategory):
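The try/except above is a small compatibility shim, since `PurePath.is_relative_to()` only exists from Python 3.9. An equivalent standalone helper (hypothetical name):

```python
from pathlib import PurePath

def is_relative_to(value, base):
    # relative_to() raises ValueError when `value` is not under `base`,
    # which is exactly the signal the pre-3.9 shim needs.
    try:
        value.relative_to(base)
        return True
    except ValueError:
        return False

assert is_relative_to(PurePath("/a/b/c"), PurePath("/a"))
assert not is_relative_to(PurePath("/a/b"), PurePath("/x"))
```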
core/scanner.py

@@ -134,7 +134,7 @@ class Scanner:
             return False
         if is_same_with_digit(refname, dupename):
             return True
-        return len(dupe.path) > len(ref.path)
+        return len(dupe.path.parts) > len(ref.path.parts)

     @staticmethod
     def get_scan_options():
@@ -164,7 +164,7 @@ class Scanner:
         toremove = set()
         last_parent_path = sortedpaths[0]
         for p in sortedpaths[1:]:
-            if p in last_parent_path:
+            if last_parent_path in p.parents:
                 toremove.add(p)
             else:
                 last_parent_path = p
core/tests/app_test.py

@@ -9,7 +9,7 @@ import os.path as op
 import logging

 import pytest
-from hscommon.path import Path
+from pathlib import Path
 import hscommon.conflict
 import hscommon.util
 from hscommon.testutil import eq_, log_calls
@@ -56,7 +56,7 @@ class TestCaseDupeGuru:
         # for this unit is pathetic. What's done is done. My approach now is to add tests for
         # every change I want to make. The blowup was caused by a missing import.
         p = Path(str(tmpdir))
-        p["foo"].open("w").close()
+        p.joinpath("foo").touch()
         monkeypatch.setattr(
             hscommon.conflict,
             "smart_copy",
@@ -71,19 +71,19 @@ class TestCaseDupeGuru:
         dgapp.copy_or_move(f, True, "some_destination", 0)
         eq_(1, len(hscommon.conflict.smart_copy.calls))
         call = hscommon.conflict.smart_copy.calls[0]
-        eq_(call["dest_path"], op.join("some_destination", "foo"))
+        eq_(call["dest_path"], Path("some_destination", "foo"))
         eq_(call["source_path"], f.path)

     def test_copy_or_move_clean_empty_dirs(self, tmpdir, monkeypatch):
         tmppath = Path(str(tmpdir))
-        sourcepath = tmppath["source"]
+        sourcepath = tmppath.joinpath("source")
         sourcepath.mkdir()
-        sourcepath["myfile"].open("w")
+        sourcepath.joinpath("myfile").touch()
         app = TestApp().app
         app.directories.add_path(tmppath)
         [myfile] = app.directories.get_files()
         monkeypatch.setattr(app, "clean_empty_dirs", log_calls(lambda path: None))
-        app.copy_or_move(myfile, False, tmppath["dest"], 0)
+        app.copy_or_move(myfile, False, tmppath.joinpath("dest"), 0)
         calls = app.clean_empty_dirs.calls
         eq_(1, len(calls))
         eq_(sourcepath, calls[0]["path"])
@@ -106,8 +106,8 @@ class TestCaseDupeGuru:
         # If the ignore_hardlink_matches option is set, don't match files hardlinking to the same
         # inode.
         tmppath = Path(str(tmpdir))
-        tmppath["myfile"].open("w").write("foo")
-        os.link(str(tmppath["myfile"]), str(tmppath["hardlink"]))
+        tmppath.joinpath("myfile").open("wt").write("foo")
+        os.link(str(tmppath.joinpath("myfile")), str(tmppath.joinpath("hardlink")))
         app = TestApp().app
         app.directories.add_path(tmppath)
         app.options["scan_type"] = ScanType.CONTENTS
@@ -153,7 +153,7 @@ class TestCaseDupeGuruCleanEmptyDirs:
         # delete_if_empty must be recursively called up in the path until it returns False
         @log_calls
         def mock_delete_if_empty(path, files_to_delete=[]):
-            return len(path) > 1
+            return len(path.parts) > 1

         monkeypatch.setattr(hscommon.util, "delete_if_empty", mock_delete_if_empty)
         # XXX This monkeypatch is temporary. will be fixed in a better monkeypatcher.
@@ -180,8 +180,8 @@ class TestCaseDupeGuruWithResults:
         self.rtable.refresh()
         tmpdir = request.getfixturevalue("tmpdir")
         tmppath = Path(str(tmpdir))
-        tmppath["foo"].mkdir()
-        tmppath["bar"].mkdir()
+        tmppath.joinpath("foo").mkdir()
+        tmppath.joinpath("bar").mkdir()
         self.app.directories.add_path(tmppath)

     def test_get_objects(self, do_setup):
@@ -424,12 +424,9 @@ class TestCaseDupeGuruRenameSelected:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        fp = open(str(p["foo bar 1"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 2"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 3"]), mode="w")
-        fp.close()
+        p.joinpath("foo bar 1").touch()
+        p.joinpath("foo bar 2").touch()
+        p.joinpath("foo bar 3").touch()
         files = fs.get_files(p)
         for f in files:
             f.is_ref = False
@@ -451,7 +448,7 @@ class TestCaseDupeGuruRenameSelected:
         g = self.groups[0]
         self.rtable.select([1])
         assert app.rename_selected("renamed")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" in names
         assert "foo bar 2" not in names
         eq_(g.dupes[0].name, "renamed")
@@ -464,7 +461,7 @@ class TestCaseDupeGuruRenameSelected:
         assert not app.rename_selected("renamed")
         msg = logging.warning.calls[0]["msg"]
         eq_("dupeGuru Warning: list index out of range", msg)
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" not in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")
@@ -477,7 +474,7 @@ class TestCaseDupeGuruRenameSelected:
         assert not app.rename_selected("foo bar 1")
         msg = logging.warning.calls[0]["msg"]
         assert msg.startswith("dupeGuru Warning: 'foo bar 1' already exists in")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "foo bar 1" in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")
@@ -488,9 +485,9 @@ class TestAppWithDirectoriesInTree:
     def do_setup(self, request):
        tmpdir = request.getfixturevalue("tmpdir")
        p = Path(str(tmpdir))
-        p["sub1"].mkdir()
-        p["sub2"].mkdir()
-        p["sub3"].mkdir()
+        p.joinpath("sub1").mkdir()
+        p.joinpath("sub2").mkdir()
+        p.joinpath("sub3").mkdir()
        app = TestApp()
        self.app = app.app
        self.dtree = app.dtree
core/tests/base.py

@@ -5,7 +5,7 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.testutil import TestApp as TestAppBase, CallLogger, eq_, with_app  # noqa
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.util import get_file_ext, format_size
 from hscommon.gui.column import Column
 from hscommon.jobprogress.job import nulljob, JobCancelled
@@ -86,9 +86,9 @@ class NamedObject:
             folder = "basepath"
         self._folder = Path(folder)
         self.size = size
-        self.md5partial = name
-        self.md5 = name
-        self.md5samples = name
+        self.digest_partial = name
+        self.digest = name
+        self.digest_samples = name
         if with_words:
             self.words = getwords(name)
         self.is_ref = False
@@ -111,11 +111,11 @@ class NamedObject:

     @property
     def path(self):
-        return self._folder[self.name]
+        return self._folder.joinpath(self.name)

     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent

     @property
     def extension(self):
core/tests/directories_test.py

@@ -10,7 +10,7 @@ import tempfile
 import shutil

 from pytest import raises
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.testutil import eq_
 from hscommon.plat import ISWINDOWS
@@ -26,29 +26,23 @@ from ..exclude import ExcludeList, ExcludeDict

 def create_fake_fs(rootpath):
     # We have it as a separate function because other units are using it.
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
     rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
-    fp = rootpath["file1.test"].open("w")
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
+    with rootpath.joinpath("file1.test").open("wt") as fp:
         fp.write("1")
-    fp.close()
-    fp = rootpath["file2.test"].open("w")
+    with rootpath.joinpath("file2.test").open("wt") as fp:
         fp.write("12")
-    fp.close()
-    fp = rootpath["file3.test"].open("w")
+    with rootpath.joinpath("file3.test").open("wt") as fp:
         fp.write("123")
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("w")
+    with rootpath.joinpath("dir1", "file1.test").open("wt") as fp:
         fp.write("1")
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("w")
+    with rootpath.joinpath("dir2", "file2.test").open("wt") as fp:
         fp.write("12")
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("w")
+    with rootpath.joinpath("dir3", "file3.test").open("wt") as fp:
         fp.write("123")
-    fp.close()
     return rootpath
@@ -60,11 +54,10 @@ def setup_module(module):
     # and another with a more complex structure.
     testpath = Path(tempfile.mkdtemp())
     module.testpath = testpath
-    rootpath = testpath["onefile"]
+    rootpath = testpath.joinpath("onefile")
     rootpath.mkdir()
-    fp = rootpath["test.txt"].open("w")
+    with rootpath.joinpath("test.txt").open("wt") as fp:
         fp.write("test_data")
-    fp.close()
     create_fake_fs(testpath)
@@ -80,13 +73,13 @@ def test_empty():

 def test_add_path():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(1, len(d))
     assert p in d
-    assert (p["foobar"]) in d
-    assert p.parent() not in d
-    p = testpath["fs"]
+    assert (p.joinpath("foobar")) in d
+    assert p.parent not in d
+    p = testpath.joinpath("fs")
     d.add_path(p)
     eq_(2, len(d))
     assert p in d
@@ -94,18 +87,18 @@

 def test_add_path_when_path_is_already_there():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     with raises(AlreadyThereError):
         d.add_path(p)
     with raises(AlreadyThereError):
-        d.add_path(p["foobar"])
+        d.add_path(p.joinpath("foobar"))
     eq_(1, len(d))


 def test_add_path_containing_paths_already_there():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(1, len(d))
     d.add_path(testpath)
     eq_(len(d), 1)
@@ -114,7 +107,7 @@ def test_add_path_containing_paths_already_there():

 def test_add_path_non_latin(tmpdir):
     p = Path(str(tmpdir))
-    to_add = p["unicode\u201a"]
+    to_add = p.joinpath("unicode\u201a")
     os.mkdir(str(to_add))
     d = Directories()
     try:
@@ -125,25 +118,25 @@

 def test_del():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     try:
         del d[1]
         assert False
     except IndexError:
         pass
-    d.add_path(testpath["fs"])
+    d.add_path(testpath.joinpath("fs"))
     del d[1]
     eq_(1, len(d))


 def test_states():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(DirectoryState.NORMAL, d.get_state(p))
     d.set_state(p, DirectoryState.REFERENCE)
     eq_(DirectoryState.REFERENCE, d.get_state(p))
-    eq_(DirectoryState.REFERENCE, d.get_state(p["dir1"]))
+    eq_(DirectoryState.REFERENCE, d.get_state(p.joinpath("dir1")))
     eq_(1, len(d.states))
     eq_(p, list(d.states.keys())[0])
     eq_(DirectoryState.REFERENCE, d.states[p])
@@ -152,7 +145,7 @@ def test_states():

 def test_get_state_with_path_not_there():
     # When the path's not there, just return DirectoryState.Normal
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(d.get_state(testpath), DirectoryState.NORMAL)
@@ -160,26 +153,26 @@ def test_states_overwritten_when_larger_directory_eat_smaller_ones():
     # ref #248
     # When setting the state of a folder, we overwrite previously set states for subfolders.
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     d.add_path(testpath)
     d.set_state(testpath, DirectoryState.REFERENCE)
     eq_(d.get_state(p), DirectoryState.REFERENCE)
-    eq_(d.get_state(p["dir1"]), DirectoryState.REFERENCE)
+    eq_(d.get_state(p.joinpath("dir1")), DirectoryState.REFERENCE)
     eq_(d.get_state(testpath), DirectoryState.REFERENCE)


 def test_get_files():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     files = list(d.get_files())
     eq_(5, len(files))
     for f in files:
-        if f.path.parent() == p["dir1"]:
+        if f.path.parent == p.joinpath("dir1"):
             assert f.is_ref
         else:
             assert not f.is_ref
@@ -193,7 +186,7 @@ def test_get_files_with_folders():
             return True

     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = list(d.get_files(fileclasses=[FakeFile]))
     # We have the 3 root files and the 3 root dirs
@@ -202,23 +195,23 @@

 def test_get_folders():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     folders = list(d.get_folders())
     eq_(len(folders), 3)
     ref = [f for f in folders if f.is_ref]
     not_ref = [f for f in folders if not f.is_ref]
     eq_(len(ref), 1)
-    eq_(ref[0].path, p["dir1"])
+    eq_(ref[0].path, p.joinpath("dir1"))
     eq_(len(not_ref), 2)
     eq_(ref[0].size, 1)


 def test_get_files_with_inherited_exclusion():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     eq_([], list(d.get_files()))
@@ -234,13 +227,13 @@ def test_save_and_load(tmpdir):
     d1.add_path(p1)
     d1.add_path(p2)
     d1.set_state(p1, DirectoryState.REFERENCE)
-    d1.set_state(p1["dir1"], DirectoryState.EXCLUDED)
+    d1.set_state(p1.joinpath("dir1"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     d1.save_to_file(tmpxml)
     d2.load_from_file(tmpxml)
     eq_(2, len(d2))
     eq_(DirectoryState.REFERENCE, d2.get_state(p1))
-    eq_(DirectoryState.EXCLUDED, d2.get_state(p1["dir1"]))
+    eq_(DirectoryState.EXCLUDED, d2.get_state(p1.joinpath("dir1")))
@@ -268,7 +261,7 @@ def test_load_from_file_with_invalid_path(tmpdir):
     # This test simulates a load from file resulting in a
     # InvalidPath raise. Other directories must be loaded.
     d1 = Directories()
-    d1.add_path(testpath["onefile"])
+    d1.add_path(testpath.joinpath("onefile"))
     # Will raise InvalidPath upon loading
     p = Path(str(tmpdir.join("toremove")))
     p.mkdir()
@@ -283,11 +276,11 @@ def test_load_from_file_with_invalid_path(tmpdir):
|
||||
|
||||
def test_unicode_save(tmpdir):
|
||||
d = Directories()
|
||||
p1 = Path(str(tmpdir))["hello\xe9"]
|
||||
p1 = Path(str(tmpdir), "hello\xe9")
|
||||
p1.mkdir()
|
||||
p1["foo\xe9"].mkdir()
|
||||
p1.joinpath("foo\xe9").mkdir()
|
||||
d.add_path(p1)
|
||||
d.set_state(p1["foo\xe9"], DirectoryState.EXCLUDED)
|
||||
d.set_state(p1.joinpath("foo\xe9"), DirectoryState.EXCLUDED)
|
||||
tmpxml = str(tmpdir.join("directories_testunit.xml"))
|
||||
try:
|
||||
d.save_to_file(tmpxml)
|
||||
@@ -297,12 +290,12 @@ def test_unicode_save(tmpdir):
|
||||
|
||||
def test_get_files_refreshes_its_directories():
|
||||
d = Directories()
|
||||
p = testpath["fs"]
|
||||
p = testpath.joinpath("fs")
|
||||
d.add_path(p)
|
||||
files = d.get_files()
|
||||
eq_(6, len(list(files)))
|
||||
time.sleep(1)
|
||||
os.remove(str(p["dir1"]["file1.test"]))
|
||||
os.remove(str(p.joinpath("dir1", "file1.test")))
|
||||
files = d.get_files()
|
||||
eq_(5, len(list(files)))
|
||||
|
||||
@@ -311,15 +304,15 @@ def test_get_files_does_not_choke_on_non_existing_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
     d.add_path(p)
-    p.rmtree()
+    shutil.rmtree(str(p))
     eq_([], list(d.get_files()))


 def test_get_state_returns_excluded_by_default_for_hidden_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
-    hidden_dir_path = p[".foo"]
-    p[".foo"].mkdir()
+    hidden_dir_path = p.joinpath(".foo")
+    p.joinpath(".foo").mkdir()
     d.add_path(p)
     eq_(d.get_state(hidden_dir_path), DirectoryState.EXCLUDED)
     # But it can be overriden
@@ -331,22 +324,22 @@ def test_default_path_state_override(tmpdir):
     # It's possible for a subclass to override the default state of a path
     class MyDirectories(Directories):
         def _default_state_for_path(self, path):
-            if "foobar" in path:
+            if "foobar" in path.parts:
                 return DirectoryState.EXCLUDED

     d = MyDirectories()
     p1 = Path(str(tmpdir))
-    p1["foobar"].mkdir()
-    p1["foobar/somefile"].open("w").close()
-    p1["foobaz"].mkdir()
-    p1["foobaz/somefile"].open("w").close()
+    p1.joinpath("foobar").mkdir()
+    p1.joinpath("foobar/somefile").touch()
+    p1.joinpath("foobaz").mkdir()
+    p1.joinpath("foobaz/somefile").touch()
     d.add_path(p1)
-    eq_(d.get_state(p1["foobaz"]), DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.EXCLUDED)
+    eq_(d.get_state(p1.joinpath("foobaz")), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.EXCLUDED)
     eq_(len(list(d.get_files())), 1)  # only the 'foobaz' file is there
     # However, the default state can be changed
-    d.set_state(p1["foobar"], DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.NORMAL)
+    d.set_state(p1.joinpath("foobar"), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.NORMAL)
     eq_(len(list(d.get_files())), 2)
@@ -372,42 +365,42 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         self.d._exclude_list.add(regex)
         self.d._exclude_list.mark(regex)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
         self.d.add_path(p1)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
         # By default, subdirs should be excluded too, but this can be overridden separately
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)

     def test_exclude_refined(self, tmpdir):
         regex1 = r"^\$Recycle\.Bin$"
         self.d._exclude_list.add(regex1)
         self.d._exclude_list.mark(regex1)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somefile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["somesubdirfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"]["unwanted_subdirfile.gif"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"].mkdir()
-        p1["$Recycle.Bin"]["subdar"]["somesubdarfile.jpeg"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"]["unwanted_subdarfile.png"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somefile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "somesubdirfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "subdir", "unwanted_subdirfile.gif").touch()
+        p1.joinpath("$Recycle.Bin", "subdar").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdar", "somesubdarfile.jpeg").touch()
+        p1.joinpath("$Recycle.Bin", "subdar", "unwanted_subdarfile.png").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))

         # Filter should set the default state to Excluded
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
         # The subdir should inherit its parent state
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
         # Override a child path's state
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # Parent should keep its default state, and the other child too
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")

         # only the 2 files directly under the Normal directory
@@ -419,8 +412,8 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert "somesubdirfile.png" in files
         assert "unwanted_subdirfile.gif" in files
         # Overriding the parent should enable all children
-        self.d.set_state(p1["$Recycle.Bin"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.NORMAL)
+        self.d.set_state(p1.joinpath("$Recycle.Bin"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.NORMAL)
         # all files there
         files = self.get_files_and_expect_num_result(6)
         assert "somefile.png" in files
@@ -444,7 +437,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert self.d._exclude_list.error(regex3) is None
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         # Directory shouldn't change its state here, unless explicitely done by user
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         files = self.get_files_and_expect_num_result(5)
         assert "unwanted_subdirfile.gif" not in files
         assert "unwanted_subdarfile.png" in files
@@ -453,15 +446,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex4 = r".*subdir$"
         self.d._exclude_list.rename(regex3, regex4)
         assert self.d._exclude_list.error(regex4) is None
-        p1["$Recycle.Bin"]["subdar"]["file_ending_with_subdir"].open("w").close()
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
+        p1.joinpath("$Recycle.Bin", "subdar", "file_ending_with_subdir").touch()
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
         files = self.get_files_and_expect_num_result(4)
         assert "file_ending_with_subdir" not in files
         assert "somesubdarfile.jpeg" in files
         assert "somesubdirfile.png" not in files
         assert "unwanted_subdirfile.gif" not in files
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         files = self.get_files_and_expect_num_result(6)
         assert "file_ending_with_subdir" not in files
@@ -471,9 +464,9 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex5 = r".*subdir.*"
         self.d._exclude_list.rename(regex4, regex5)
         # Files containing substring should be filtered
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # The path should not match, only the filename, the "subdir" in the directory name shouldn't matter
-        p1["$Recycle.Bin"]["subdir"]["file_which_shouldnt_match"].open("w").close()
+        p1.joinpath("$Recycle.Bin", "subdir", "file_which_shouldnt_match").touch()
         files = self.get_files_and_expect_num_result(5)
         assert "somesubdirfile.png" not in files
         assert "unwanted_subdirfile.gif" not in files
@@ -493,7 +486,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert self.d._exclude_list.error(regex6) is None
         assert regex6 in self.d._exclude_list
         # This still should not be affected
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         files = self.get_files_and_expect_num_result(5)
         # These files are under the "/subdir" directory
         assert "somesubdirfile.png" not in files
@@ -505,20 +498,20 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled

     def test_japanese_unicode(self, tmpdir):
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somerecycledfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["過去白濁物語~]_カラー.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["思叫物語"].mkdir()
-        p1["$Recycle.Bin"]["思叫物語"]["なししろ会う前"].open("w").close()
-        p1["$Recycle.Bin"]["思叫物語"]["堂~ロ"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somerecycledfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "過去白濁物語~]_カラー.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "思叫物語").mkdir()
+        p1.joinpath("$Recycle.Bin", "思叫物語", "なししろ会う前").touch()
+        p1.joinpath("$Recycle.Bin", "思叫物語", "堂~ロ").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))
         regex3 = r".*物語.*"
         self.d._exclude_list.add(regex3)
         self.d._exclude_list.mark(regex3)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["思叫物語"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "思叫物語")), DirectoryState.EXCLUDED)
         files = self.get_files_and_expect_num_result(2)
         assert "過去白濁物語~]_カラー.jpg" not in files
         assert "なししろ会う前" not in files
@@ -527,7 +520,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
|
||||
regex4 = r".*物語$"
|
||||
self.d._exclude_list.rename(regex3, regex4)
|
||||
assert self.d._exclude_list.error(regex4) is None
|
||||
self.d.set_state(p1["$Recycle.Bin"]["思叫物語"], DirectoryState.NORMAL)
|
||||
self.d.set_state(p1.joinpath("$Recycle.Bin", "思叫物語"), DirectoryState.NORMAL)
|
||||
files = self.get_files_and_expect_num_result(5)
|
||||
assert "過去白濁物語~]_カラー.jpg" in files
|
||||
assert "なししろ会う前" in files
|
||||
@@ -539,15 +532,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
|
||||
self.d._exclude_list.add(regex)
|
||||
self.d._exclude_list.mark(regex)
|
||||
p1 = Path(str(tmpdir))
|
||||
p1["foobar"].mkdir()
|
||||
p1["foobar"][".hidden_file.txt"].open("w").close()
|
||||
p1["foobar"][".hidden_dir"].mkdir()
|
||||
p1["foobar"][".hidden_dir"]["foobar.jpg"].open("w").close()
|
||||
p1["foobar"][".hidden_dir"][".hidden_subfile.png"].open("w").close()
|
||||
self.d.add_path(p1["foobar"])
|
||||
p1.joinpath("foobar").mkdir()
|
||||
p1.joinpath("foobar", ".hidden_file.txt").touch()
|
||||
p1.joinpath("foobar", ".hidden_dir").mkdir()
|
||||
p1.joinpath("foobar", ".hidden_dir", "foobar.jpg").touch()
|
||||
p1.joinpath("foobar", ".hidden_dir", ".hidden_subfile.png").touch()
|
||||
self.d.add_path(p1.joinpath("foobar"))
|
||||
# It should not inherit its parent's state originally
|
||||
eq_(self.d.get_state(p1["foobar"][".hidden_dir"]), DirectoryState.EXCLUDED)
|
||||
self.d.set_state(p1["foobar"][".hidden_dir"], DirectoryState.NORMAL)
|
||||
eq_(self.d.get_state(p1.joinpath("foobar", ".hidden_dir")), DirectoryState.EXCLUDED)
|
||||
self.d.set_state(p1.joinpath("foobar", ".hidden_dir"), DirectoryState.NORMAL)
|
||||
# The files should still be filtered
|
||||
files = self.get_files_and_expect_num_result(1)
|
||||
eq_(len(self.d._exclude_list.compiled_paths), 0)
|
||||
|
||||
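These exclude-list tests drive filename-based regex filtering: a marked pattern such as `r".*subdir$"` filters out any directory or file whose name matches it fully, and renaming a marked pattern re-applies the filter. A minimal sketch of that matching rule (hypothetical helper and pattern list, not dupeGuru's actual implementation):

```python
import re

# Hypothetical reduction of the behaviour the tests exercise: marked
# patterns are compiled and matched against names, not full paths.
compiled = [re.compile(r".*subdir$")]

def is_excluded(name: str) -> bool:
    # A name is filtered out when any marked pattern matches it fully.
    return any(p.fullmatch(name) for p in compiled)

assert is_excluded("subdir")
assert is_excluded("file_ending_with_subdir")
assert not is_excluded("somesubdarfile.jpeg")
```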
@@ -530,7 +530,7 @@ class TestCaseGetMatches:


 class TestCaseGetMatchesByContents:
-    def test_big_file_partial_hashes(self):
+    def test_big_file_partial_hashing(self):
         smallsize = 1
         bigsize = 100 * 1024 * 1024  # 100MB
         f = [
@@ -539,17 +539,17 @@ class TestCaseGetMatchesByContents:
             no("smallfoo", size=smallsize),
             no("smallbar", size=smallsize),
         ]
-        f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-        f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-        f[2].md5 = f[2].md5partial = "bleh"
-        f[3].md5 = f[3].md5partial = "bleh"
+        f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+        f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+        f[2].digest = f[2].digest_partial = "bleh"
+        f[3].digest = f[3].digest_partial = "bleh"
         r = getmatches_by_contents(f, bigsize=bigsize)
         eq_(len(r), 2)
-        # User disabled optimization for big files, compute hashes as usual
+        # User disabled optimization for big files, compute digests as usual
         r = getmatches_by_contents(f, bigsize=0)
         eq_(len(r), 2)
-        # Other file is now slightly different, md5partial is still the same
-        f[1].md5 = f[1].md5samples = "foobardiff"
+        # Other file is now slightly different, digest_partial is still the same
+        f[1].digest = f[1].digest_samples = "foobardiff"
         r = getmatches_by_contents(f, bigsize=bigsize)
         # Successfully filter it out
         eq_(len(r), 1)

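The renamed test keeps the same scenario: files whose cheap partial digest collides can only be told apart by sample digests or a full digest, depending on the `bigsize` threshold. A hedged sketch of that decision order (the attributes mirror the test doubles above; this is not the engine's real code):

```python
# Hypothetical sketch of the comparison order getmatches_by_contents
# relies on: partial digest first, then sample digests for big files,
# then the full digest when the optimization is disabled (bigsize == 0).
def same_contents(a, b, bigsize: int) -> bool:
    if a.digest_partial != b.digest_partial:
        return False  # cheap reject on the first few kilobytes
    if bigsize and a.size >= bigsize:
        return a.digest_samples == b.digest_samples  # spot-check chunks
    return a.digest == b.digest  # full-content digest
```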
@@ -6,10 +6,18 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
+
 from os import urandom

-from hscommon.path import Path
+from pathlib import Path
 from hscommon.testutil import eq_
 from core.tests.directories_test import create_fake_fs

@@ -17,32 +25,26 @@ from .. import fs


 def create_fake_fs_with_random_data(rootpath):
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
     rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
-    fp = rootpath["file1.test"].open("wb")
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
     data1 = urandom(200 * 1024)  # 200KiB
     data2 = urandom(1024 * 1024)  # 1MiB
     data3 = urandom(10 * 1024 * 1024)  # 10MiB
+    with rootpath.joinpath("file1.test").open("wb") as fp:
         fp.write(data1)
-    fp.close()
-    fp = rootpath["file2.test"].open("wb")
+    with rootpath.joinpath("file2.test").open("wb") as fp:
         fp.write(data2)
-    fp.close()
-    fp = rootpath["file3.test"].open("wb")
+    with rootpath.joinpath("file3.test").open("wb") as fp:
         fp.write(data3)
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("wb")
+    with rootpath.joinpath("dir1", "file1.test").open("wb") as fp:
         fp.write(data1)
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("wb")
+    with rootpath.joinpath("dir2", "file2.test").open("wb") as fp:
         fp.write(data2)
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("wb")
+    with rootpath.joinpath("dir3", "file3.test").open("wb") as fp:
         fp.write(data3)
-    fp.close()
     return rootpath


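The guarded import above lets the whole module use a single `hasher` name whether or not xxhash is installed: xxh128 when available, MD5 otherwise. Both expose the same incremental interface, so chunked hashing works unchanged with either backend; a small usage sketch (the input file name is hypothetical):

```python
try:
    import xxhash
    hasher = xxhash.xxh128
except ImportError:
    import hashlib
    hasher = hashlib.md5

h = hasher()
with open("somefile.bin", "rb") as fp:  # hypothetical input file
    for chunk in iter(lambda: fp.read(1024 * 1024), b""):
        h.update(chunk)  # both xxh128 and md5 support incremental update
print(h.digest())
```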
@@ -52,54 +54,54 @@ def test_size_aggregates_subfiles(tmpdir):
     eq_(b.size, 12)


-def test_md5_aggregate_subfiles_sorted(tmpdir):
-    # dir.allfiles can return child in any order. Thus, bundle.md5 must aggregate
-    # all files' md5 it contains, but it must make sure that it does so in the
+def test_digest_aggregate_subfiles_sorted(tmpdir):
+    # dir.allfiles can return child in any order. Thus, bundle.digest must aggregate
+    # all files' digests it contains, but it must make sure that it does so in the
     # same order everytime.
     p = create_fake_fs_with_random_data(Path(str(tmpdir)))
     b = fs.Folder(p)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5
-    md52 = fs.File(p["dir2"]["file2.test"]).md5
-    md53 = fs.File(p["dir3"]["file3.test"]).md5
-    md54 = fs.File(p["file1.test"]).md5
-    md55 = fs.File(p["file2.test"]).md5
-    md56 = fs.File(p["file3.test"]).md5
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest
+    digest4 = fs.File(p.joinpath("file1.test")).digest
+    digest5 = fs.File(p.joinpath("file2.test")).digest
+    digest6 = fs.File(p.joinpath("file3.test")).digest
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest, digest)


-def test_partial_md5_aggregate_subfile_sorted(tmpdir):
+def test_partial_digest_aggregate_subfile_sorted(tmpdir):
     p = create_fake_fs_with_random_data(Path(str(tmpdir)))
     b = fs.Folder(p)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5partial
-    md52 = fs.File(p["dir2"]["file2.test"]).md5partial
-    md53 = fs.File(p["dir3"]["file3.test"]).md5partial
-    md54 = fs.File(p["file1.test"]).md5partial
-    md55 = fs.File(p["file2.test"]).md5partial
-    md56 = fs.File(p["file3.test"]).md5partial
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5partial, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_partial
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_partial
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_partial
+    digest4 = fs.File(p.joinpath("file1.test")).digest_partial
+    digest5 = fs.File(p.joinpath("file2.test")).digest_partial
+    digest6 = fs.File(p.joinpath("file3.test")).digest_partial
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_partial, digest)

-    md51 = fs.File(p["dir1"]["file1.test"]).md5samples
-    md52 = fs.File(p["dir2"]["file2.test"]).md5samples
-    md53 = fs.File(p["dir3"]["file3.test"]).md5samples
-    md54 = fs.File(p["file1.test"]).md5samples
-    md55 = fs.File(p["file2.test"]).md5samples
-    md56 = fs.File(p["file3.test"]).md5samples
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5samples, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_samples
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_samples
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_samples
+    digest4 = fs.File(p.joinpath("file1.test")).digest_samples
+    digest5 = fs.File(p.joinpath("file2.test")).digest_samples
+    digest6 = fs.File(p.joinpath("file3.test")).digest_samples
+    # The expected digest is the digest of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_samples, digest)


 def test_has_file_attrs(tmpdir):

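The aggregation rule these tests pin down: a folder's digest hashes the digests of its sub-folders (each wrapped in one extra `hasher(...)` round) concatenated with its files' digests, in a stable order so that directory-listing order cannot change the result. A rough sketch under those assumptions (the `is_folder` and `name` attributes are stand-ins, and the exact traversal order `fs.Folder` uses may differ):

```python
import hashlib

hasher = hashlib.md5  # stand-in; see the xxhash fallback above

def folder_digest(children):
    parts = []
    for child in sorted(children, key=lambda c: c.name):  # stable order
        if child.is_folder:
            parts.append(hasher(child.digest).digest())  # one extra round
        else:
            parts.append(child.digest)
    return hasher(b"".join(parts)).digest()
```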
@@ -447,7 +447,7 @@ class TestCaseResultsXML:
         self.results.groups = self.groups

     def get_file(self, path):  # use this as a callback for load_from_xml
-        return [o for o in self.objects if o.path == path][0]
+        return [o for o in self.objects if str(o.path) == path][0]

     def test_save_to_xml(self):
         self.objects[0].is_ref = True

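The small change above matters because pathlib's `Path` never compares equal to a plain string, and paths read back from the XML are strings:

```python
from pathlib import Path

# Why the callback now compares string forms:
Path("/tmp/foo") == "/tmp/foo"       # False: Path.__eq__ with a str
str(Path("/tmp/foo")) == "/tmp/foo"  # True
```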
@@ -7,7 +7,7 @@
 import pytest

 from hscommon.jobprogress import job
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.testutil import eq_

 from .. import fs
@@ -22,7 +22,7 @@ class NamedObject:
         if path is None:
             path = Path(name)
         else:
-            path = Path(path)[name]
+            path = Path(path, name)
         self.name = name
         self.size = size
         self.path = path
@@ -123,19 +123,19 @@ def test_content_scan(fake_fileexists):
     s = Scanner()
     s.scan_type = ScanType.CONTENTS
     f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = f[1].md5samples = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "bleh"
     r = s.get_dupe_groups(f)
     eq_(len(r), 1)
     eq_(len(r[0]), 2)
-    eq_(s.discarded_file_count, 0)  # don't count the different md5 as discarded!
+    eq_(s.discarded_file_count, 0)  # don't count the different digest as discarded!


 def test_content_scan_compare_sizes_first(fake_fileexists):
     class MyFile(no):
         @property
-        def md5(self):
+        def digest(self):
             raise AssertionError()

     s = Scanner()
@@ -161,14 +161,14 @@ def test_ignore_file_size(fake_fileexists):
         no("largeignore1", large_size + 1),
         no("largeignore2", large_size + 1),
     ]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "smallignore"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "smallignore"
-    f[2].md5 = f[2].md5partial = f[2].md5samples = "small"
-    f[3].md5 = f[3].md5partial = f[3].md5samples = "small"
-    f[4].md5 = f[4].md5partial = f[4].md5samples = "large"
-    f[5].md5 = f[5].md5partial = f[5].md5samples = "large"
-    f[6].md5 = f[6].md5partial = f[6].md5samples = "largeignore"
-    f[7].md5 = f[7].md5partial = f[7].md5samples = "largeignore"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "smallignore"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "smallignore"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "small"
+    f[3].digest = f[3].digest_partial = f[3].digest_samples = "small"
+    f[4].digest = f[4].digest_partial = f[4].digest_samples = "large"
+    f[5].digest = f[5].digest_partial = f[5].digest_samples = "large"
+    f[6].digest = f[6].digest_partial = f[6].digest_samples = "largeignore"
+    f[7].digest = f[7].digest_partial = f[7].digest_samples = "largeignore"

     r = s.get_dupe_groups(f)
     # No ignores
@@ -197,21 +197,21 @@ def test_big_file_partial_hashes(fake_fileexists):
     s.big_file_size_threshold = bigsize

     f = [no("bigfoo", bigsize), no("bigbar", bigsize), no("smallfoo", smallsize), no("smallbar", smallsize)]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = "bleh"
-    f[3].md5 = f[3].md5partial = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = "bleh"
+    f[3].digest = f[3].digest_partial = "bleh"
     r = s.get_dupe_groups(f)
     eq_(len(r), 2)

-    # md5partial is still the same, but the file is actually different
-    f[1].md5 = f[1].md5samples = "difffoobar"
-    # here we compare the full md5s, as the user disabled the optimization
+    # digest_partial is still the same, but the file is actually different
+    f[1].digest = f[1].digest_samples = "difffoobar"
+    # here we compare the full digests, as the user disabled the optimization
     s.big_file_size_threshold = 0
     r = s.get_dupe_groups(f)
     eq_(len(r), 1)

-    # here we should compare the md5samples, and see they are different
+    # here we should compare the digest_samples, and see they are different
     s.big_file_size_threshold = bigsize
     r = s.get_dupe_groups(f)
     eq_(len(r), 1)
@@ -221,9 +221,9 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
     s = Scanner()
     s.scan_type = ScanType.CONTENTS
     f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = f[2].md5samples = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "bleh"
     s.min_match_percentage = 101
     r = s.get_dupe_groups(f)
     eq_(len(r), 1)
@@ -234,12 +234,16 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
     eq_(len(r[0]), 2)


-def test_content_scan_doesnt_put_md5_in_words_at_the_end(fake_fileexists):
+def test_content_scan_doesnt_put_digest_in_words_at_the_end(fake_fileexists):
     s = Scanner()
     s.scan_type = ScanType.CONTENTS
     f = [no("foo"), no("bar")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[0].digest = f[0].digest_partial = f[
+        0
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[1].digest = f[1].digest_partial = f[
+        1
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
     r = s.get_dupe_groups(f)
     # FIXME looks like we are missing something here?
     r[0]
@@ -568,12 +572,14 @@ def test_dont_group_files_that_dont_exist(tmpdir):
     s = Scanner()
     s.scan_type = ScanType.CONTENTS
     p = Path(str(tmpdir))
-    p["file1"].open("w").write("foo")
-    p["file2"].open("w").write("foo")
+    with p.joinpath("file1").open("w") as fp:
+        fp.write("foo")
+    with p.joinpath("file2").open("w") as fp:
+        fp.write("foo")
     file1, file2 = fs.get_files(p)

     def getmatches(*args, **kw):
-        file2.path.remove()
+        file2.path.unlink()
         return [Match(file1, file2, 100)]

     s._getmatches = getmatches
@@ -587,21 +593,21 @@ def test_folder_scan_exclude_subfolder_matches(fake_fileexists):
     s = Scanner()
     s.scan_type = ScanType.FOLDERS
     topf1 = no("top folder 1", size=42)
-    topf1.md5 = topf1.md5partial = topf1.md5samples = b"some_md5_1"
+    topf1.digest = topf1.digest_partial = topf1.digest_samples = b"some_digest__1"
     topf1.path = Path("/topf1")
     topf2 = no("top folder 2", size=42)
-    topf2.md5 = topf2.md5partial = topf2.md5samples = b"some_md5_1"
+    topf2.digest = topf2.digest_partial = topf2.digest_samples = b"some_digest__1"
     topf2.path = Path("/topf2")
     subf1 = no("sub folder 1", size=41)
-    subf1.md5 = subf1.md5partial = subf1.md5samples = b"some_md5_2"
+    subf1.digest = subf1.digest_partial = subf1.digest_samples = b"some_digest__2"
     subf1.path = Path("/topf1/sub")
     subf2 = no("sub folder 2", size=41)
-    subf2.md5 = subf2.md5partial = subf2.md5samples = b"some_md5_2"
+    subf2.digest = subf2.digest_partial = subf2.digest_samples = b"some_digest__2"
     subf2.path = Path("/topf2/sub")
     eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2])), 1)  # only top folders
     # however, if another folder matches a subfolder, keep it in the matches
     otherf = no("other folder", size=41)
-    otherf.md5 = otherf.md5partial = otherf.md5samples = b"some_md5_2"
+    otherf.digest = otherf.digest_partial = otherf.digest_samples = b"some_digest__2"
     otherf.path = Path("/otherfolder")
     eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2, otherf])), 2)

@@ -624,9 +630,9 @@ def test_dont_count_ref_files_as_discarded(fake_fileexists):
     o1 = no("foo", path="p1")
     o2 = no("foo", path="p2")
     o3 = no("foo", path="p3")
-    o1.md5 = o1.md5partial = o1.md5samples = "foobar"
-    o2.md5 = o2.md5partial = o2.md5samples = "foobar"
-    o3.md5 = o3.md5partial = o3.md5samples = "foobar"
+    o1.digest = o1.digest_partial = o1.digest_samples = "foobar"
+    o2.digest = o2.digest_partial = o2.digest_samples = "foobar"
+    o3.digest = o3.digest_partial = o3.digest_samples = "foobar"
     o1.is_ref = True
     o2.is_ref = True
     eq_(len(s.get_dupe_groups([o1, o2, o3])), 1)

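Taken together, the scanner-test hunks follow a small translation table from the old `hscommon.path.Path` API to pathlib; a summary sketch:

```python
from pathlib import Path

p = Path("/tmp/example")  # hypothetical root; nothing is touched on disk

# old hscommon.path API        ->  pathlib equivalent in these hunks
# p["sub"]                     ->  p.joinpath("sub")
# Path("/tmp")["name"]         ->  Path("/tmp", "name")
# p["f"].open("w").close()     ->  p.joinpath("f").touch()
# p["f"].remove()              ->  p.joinpath("f").unlink()
# p.parent()                   ->  p.parent  (method became a property)
```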
core/util.py (37 lines changed)
@@ -7,6 +7,12 @@
 import time
 import sys
 import os
+import urllib.request
+import urllib.error
+import json
+import semantic_version
+import logging
+from typing import Union

 from hscommon.util import format_time_decimal

@@ -64,3 +70,34 @@ def fix_surrogate_encoding(s, encoding="utf-8"):

 def executable_folder():
     return os.path.dirname(os.path.abspath(sys.argv[0]))
+
+
+def check_for_update(current_version: str, include_prerelease: bool = False) -> Union[None, dict]:
+    request = urllib.request.Request(
+        "https://api.github.com/repos/arsenetar/dupeguru/releases",
+        headers={"Accept": "application/vnd.github.v3+json"},
+    )
+    try:
+        with urllib.request.urlopen(request) as response:
+            if response.status != 200:
logging.warn(f"Error retriving updates. Status: {response.status}")
+                return None
+            try:
+                response_json = json.loads(response.read())
+            except json.JSONDecodeError as ex:
logging.warn(f"Error parsing updates. {ex.msg}")
+                return None
+    except urllib.error.URLError as ex:
logging.warn(f"Error retriving updates. {ex.reason}")
+        return None
+    new_version = semantic_version.Version(current_version)
+    new_url = None
+    for release in response_json:
+        release_version = semantic_version.Version(release["name"])
+        if new_version < release_version and (include_prerelease or not release_version.prerelease):
+            new_version = release_version
+            new_url = release["html_url"]
+    if new_url is not None:
+        return {"version": new_version, "url": new_url}
+    else:
+        return None

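A usage sketch for the new helper (it queries the live GitHub releases API, so it needs network access; the printed strings are illustrative):

```python
from core.util import check_for_update

result = check_for_update("4.2.1", include_prerelease=False)
if result is not None:
    print(f"Update available: {result['version']} at {result['url']}")
else:
    print("No newer release found (or the check failed).")
```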
@@ -20,13 +20,8 @@ import re
 import importlib
 from datetime import datetime
 import glob
-import sysconfig
-import modulefinder

-from setuptools import setup, Extension

 from .plat import ISWINDOWS
 from .util import ensure_folder, delete_files_with_pattern


 def print_and_do(cmd):
@@ -181,23 +176,6 @@ def build_dmg(app_path, destfolder):
     print("Build Complete")


-def copy_sysconfig_files_for_embed(destpath):
-    # This normally shouldn't be needed for Python 3.3+.
-    makefile = sysconfig.get_makefile_filename()
-    configh = sysconfig.get_config_h_filename()
-    shutil.copy(makefile, destpath)
-    shutil.copy(configh, destpath)
-    with open(op.join(destpath, "site.py"), "w") as fp:
-        fp.write(
-            """
-import os.path as op
-from distutils import sysconfig
-sysconfig.get_makefile_filename = lambda: op.join(op.dirname(__file__), 'Makefile')
-sysconfig.get_config_h_filename = lambda: op.join(op.dirname(__file__), 'pyconfig.h')
-"""
-        )


 def add_to_pythonpath(path):
     """Adds ``path`` to both ``PYTHONPATH`` env and ``sys.path``."""
     abspath = op.abspath(path)
@@ -248,20 +226,6 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
             shutil.copy(source_path, dest_path)


-def copy_qt_plugins(folder_names, dest):  # This is only for Windows
-    from PyQt5.QtCore import QLibraryInfo
-
-    qt_plugin_dir = QLibraryInfo.location(QLibraryInfo.PluginsPath)
-
-    def ignore(path, names):
-        if path == qt_plugin_dir:
-            return [n for n in names if n not in folder_names]
-        else:
-            return [n for n in names if not n.endswith(".dll")]
-
-    shutil.copytree(qt_plugin_dir, dest, ignore=ignore)


 def build_debian_changelog(
     changelogpath,
     destfile,
@@ -349,183 +313,6 @@ def read_changelog_file(filename):
     return result


-class OSXAppStructure:
-    def __init__(self, dest):
-        self.dest = dest
-        self.contents = op.join(dest, "Contents")
-        self.macos = op.join(self.contents, "MacOS")
-        self.resources = op.join(self.contents, "Resources")
-        self.frameworks = op.join(self.contents, "Frameworks")
-        self.infoplist = op.join(self.contents, "Info.plist")
-
-    def create(self, infoplist):
-        ensure_empty_folder(self.dest)
-        os.makedirs(self.macos)
-        os.mkdir(self.resources)
-        os.mkdir(self.frameworks)
-        copy(infoplist, self.infoplist)
-        open(op.join(self.contents, "PkgInfo"), "wt").write("APPLxxxx")
-
-    def copy_executable(self, executable):
-        info = plistlib.readPlist(self.infoplist)
-        self.executablename = info["CFBundleExecutable"]
-        self.executablepath = op.join(self.macos, self.executablename)
-        copy(executable, self.executablepath)
-
-    def copy_resources(self, *resources, use_symlinks=False):
-        for path in resources:
-            resource_dest = op.join(self.resources, op.basename(path))
-            action = symlink if use_symlinks else copy
-            action(op.abspath(path), resource_dest)
-
-    def copy_frameworks(self, *frameworks):
-        for path in frameworks:
-            framework_dest = op.join(self.frameworks, op.basename(path))
-            copy(path, framework_dest)
-
-
-def create_osx_app_structure(
-    dest,
-    executable,
-    infoplist,
-    resources=None,
-    frameworks=None,
-    symlink_resources=False,
-):
-    # `dest`: A path to the destination .app folder
-    # `executable`: the path of the executable file that goes in "MacOS"
-    # `infoplist`: The path to your Info.plist file.
-    # `resources`: A list of paths of files or folders going in the "Resources" folder.
-    # `frameworks`: Same as above for "Frameworks".
-    # `symlink_resources`: If True, will symlink resources into the structure instead of copying them.
-    app = OSXAppStructure(dest)
-    app.create(infoplist)
-    app.copy_executable(executable)
-    app.copy_resources(*resources, use_symlinks=symlink_resources)
-    app.copy_frameworks(*frameworks)
-
-
-class OSXFrameworkStructure:
-    def __init__(self, dest):
-        self.dest = dest
-        self.contents = op.join(dest, "Versions", "A")
-        self.resources = op.join(self.contents, "Resources")
-        self.headers = op.join(self.contents, "Headers")
-        self.infoplist = op.join(self.resources, "Info.plist")
-        self._update_executable_path()
-
-    def _update_executable_path(self):
-        if not op.exists(self.infoplist):
-            self.executablename = self.executablepath = None
-            return
-        info = plistlib.readPlist(self.infoplist)
-        self.executablename = info["CFBundleExecutable"]
-        self.executablepath = op.join(self.contents, self.executablename)
-
-    def create(self, infoplist):
-        ensure_empty_folder(self.dest)
-        os.makedirs(self.contents)
-        os.mkdir(self.resources)
-        os.mkdir(self.headers)
-        copy(infoplist, self.infoplist)
-        self._update_executable_path()
-
-    def create_symlinks(self):
-        # Only call this after create() and copy_executable()
-        os.symlink("A", op.join(self.dest, "Versions", "Current"))
-        os.symlink(op.relpath(self.executablepath, self.dest), op.join(self.dest, self.executablename))
-        os.symlink(op.relpath(self.headers, self.dest), op.join(self.dest, "Headers"))
-        os.symlink(op.relpath(self.resources, self.dest), op.join(self.dest, "Resources"))
-
-    def copy_executable(self, executable):
-        copy(executable, self.executablepath)
-
-    def copy_resources(self, *resources, use_symlinks=False):
-        for path in resources:
-            resource_dest = op.join(self.resources, op.basename(path))
-            action = symlink if use_symlinks else copy
-            action(op.abspath(path), resource_dest)
-
-    def copy_headers(self, *headers, use_symlinks=False):
-        for path in headers:
-            header_dest = op.join(self.headers, op.basename(path))
-            action = symlink if use_symlinks else copy
-            action(op.abspath(path), header_dest)
-
-
-def copy_embeddable_python_dylib(dst):
-    runtime = op.join(
-        sysconfig.get_config_var("PYTHONFRAMEWORKPREFIX"),
-        sysconfig.get_config_var("LDLIBRARY"),
-    )
-    filedest = op.join(dst, "Python")
-    shutil.copy(runtime, filedest)
-    os.chmod(filedest, 0o774)  # We need write permission to use install_name_tool
-    cmd = "install_name_tool -id @rpath/Python %s" % filedest
-    print_and_do(cmd)
-
-
-def collect_stdlib_dependencies(script, dest_folder, extra_deps=None):
-    sysprefix = sys.prefix  # could be a virtualenv
-    basesysprefix = sys.base_prefix  # seems to be path to non-virtual sys
-    real_lib_prefix = sysconfig.get_config_var("LIBDEST")  # leaving this in case it is needed
-
-    def is_stdlib_path(path):
-        # A module path is only a stdlib path if it's in either sys.prefix or
-        # sysconfig.get_config_var('prefix') (the 2 are different if we are in a virtualenv) and if
-        # there's no "site-package" in the path.
-        if not path:
-            return False
-        if "site-package" in path:
-            return False
-        if not (path.startswith(sysprefix) or path.startswith(basesysprefix) or path.startswith(real_lib_prefix)):
-            return False
-        return True
-
-    ensure_folder(dest_folder)
-    mf = modulefinder.ModuleFinder()
-    mf.run_script(script)
-    modpaths = [mod.__file__ for mod in mf.modules.values()]
-    modpaths = filter(is_stdlib_path, modpaths)
-    for p in modpaths:
-        if p.startswith(real_lib_prefix):
-            relpath = op.relpath(p, real_lib_prefix)
-        elif p.startswith(sysprefix):
-            relpath = op.relpath(p, sysprefix)
-            assert relpath.startswith("lib/python3.")  # we want to get rid of that lib/python3.x part
-            relpath = relpath[len("lib/python3.X/") :]
-        elif p.startswith(basesysprefix):
-            relpath = op.relpath(p, basesysprefix)
-            assert relpath.startswith("lib/python3.")
-            relpath = relpath[len("lib/python3.X/") :]
-        else:
-            raise AssertionError()
-        if relpath.startswith("lib-dynload"):  # We copy .so files in lib-dynload directly in our dest
-            relpath = relpath[len("lib-dynload/") :]
-        if relpath.startswith("encodings") or relpath.startswith("distutils"):
-            # We force their inclusion later.
-            continue
-        dest_path = op.join(dest_folder, relpath)
-        ensure_folder(op.dirname(dest_path))
-        copy(p, dest_path)
-    # stringprep is used by encodings.
-    # We use real_lib_prefix with distutils because virtualenv messes with it and we need to refer
-    # to the original distutils folder.
-    FORCED_INCLUSION = [
-        "encodings",
-        "stringprep",
-        op.join(real_lib_prefix, "distutils"),
-    ]
-    if extra_deps:
-        FORCED_INCLUSION += extra_deps
-    copy_packages(FORCED_INCLUSION, dest_folder)
-    # There's a couple of rather big exe files in the distutils folder that we absolutely don't
-    # need. Remove them.
-    delete_files_with_pattern(op.join(dest_folder, "distutils"), "*.exe")
-    # And, finally, create an empty "site.py" that Python needs around on startup.
-    open(op.join(dest_folder, "site.py"), "w").close()


 def fix_qt_resource_file(path):
     # pyrcc5 under Windows, if the locale is non-english, can produce a source file with a date
     # containing accented characters. If it does, the encoding is wrong and it prevents the file
@@ -537,21 +324,3 @@ def fix_qt_resource_file(path):
     lines = [line for line in lines if not line.startswith(b"#")]
     with open(path, "wb") as fp:
         fp.write(b"\n".join(lines))


-def build_cocoa_ext(extname, dest, source_files, extra_frameworks=(), extra_includes=()):
-    extra_link_args = ["-framework", "CoreFoundation", "-framework", "Foundation"]
-    for extra in extra_frameworks:
-        extra_link_args += ["-framework", extra]
-    ext = Extension(
-        extname,
-        source_files,
-        extra_link_args=extra_link_args,
-        include_dirs=extra_includes,
-    )
-    setup(script_args=["build_ext", "--inplace"], ext_modules=[ext])
-    # Our problem here is to get the fully qualified filename of the resulting .so but I couldn't
-    # find a documented way to do so. The only thing I could find is this below :(
-    fn = ext._file_name
-    assert op.exists(fn)
-    move(fn, op.join(dest, fn))

@@ -14,7 +14,7 @@ import re
 import os
 import shutil

-from .path import Path, pathify
+from pathlib import Path

 # This matches [123], but not [12] (3 digits being the minimum).
 # It also matches [1234] [12345] etc..
@@ -52,16 +52,15 @@ def is_conflicted(name):
     return re_conflict.match(name) is not None


-@pathify
 def _smart_move_or_copy(operation, source_path: Path, dest_path: Path):
     """Use move() or copy() to move and copy file with the conflict management."""
-    if dest_path.isdir() and not source_path.isdir():
-        dest_path = dest_path[source_path.name]
+    if dest_path.is_dir() and not source_path.is_dir():
+        dest_path = dest_path.joinpath(source_path.name)
     if dest_path.exists():
         filename = dest_path.name
-        dest_dir_path = dest_path.parent()
+        dest_dir_path = dest_path.parent
         newname = get_conflicted_name(os.listdir(str(dest_dir_path)), filename)
-        dest_path = dest_dir_path[newname]
+        dest_path = dest_dir_path.joinpath(newname)
     operation(str(source_path), str(dest_path))


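The conflict handling this function relies on prefixes a zero-padded counter when the destination name is already taken, which is what produces the `[000] bar` names asserted in the tests further down. A minimal sketch of that convention (not hscommon's exact `get_conflicted_name`):

```python
def conflicted_name(existing_names, name):
    # Pick the first "[NNN] name" not already present in the destination.
    if name not in existing_names:
        return name
    for i in range(1000):
        candidate = f"[{i:03d}] {name}"
        if candidate not in existing_names:
            return candidate
    raise ValueError("too many conflicted copies of " + name)

assert conflicted_name({"bar"}, "bar") == "[000] bar"
assert conflicted_name({"bar", "[000] bar"}, "bar") == "[001] bar"
```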
@@ -1,23 +0,0 @@
-# Created By: Virgil Dupras
-# Created On: 2011-04-19
-# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
-#
-# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
-# which should be included with this package. The terms are also available at
-# http://www.gnu.org/licenses/gpl-3.0.html
-
-import sys
-import traceback
-
-
-# Taken from http://bzimmer.ziclix.com/2008/12/17/python-thread-dumps/
-def stacktraces():
-    code = []
-    for thread_id, stack in sys._current_frames().items():
-        code.append("\n# ThreadID: %s" % thread_id)
-        for filename, lineno, name, line in traceback.extract_stack(stack):
-            code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
-            if line:
-                code.append(" %s" % (line.strip()))
-
-    return "\n".join(code)
@@ -41,29 +41,6 @@ def special_folder_path(special_folder, appname=None, portable=False):
     return _special_folder_path(special_folder, appname, portable=portable)


-try:
-    # Normally, we would simply do "from cocoa import proxy", but due to a bug in pytest (currently
-    # at v2.4.2), our test suite is broken when we do that. This below is a workaround until that
-    # bug is fixed.
-    import cocoa
-
-    if not hasattr(cocoa, "proxy"):
-        raise ImportError()
-    proxy = cocoa.proxy
-    _open_url = proxy.openURL_
-    _open_path = proxy.openPath_
-    _reveal_path = proxy.revealPath_
-
-    def _special_folder_path(special_folder, appname=None, portable=False):
-        if special_folder == SpecialFolder.CACHE:
-            base = proxy.getCachePath()
-        else:
-            base = proxy.getAppdataPath()
-        if not appname:
-            appname = proxy.bundleInfo_("CFBundleName")
-        return op.join(base, appname)
-
-except ImportError:
-    try:
+try:
     from PyQt5.QtCore import QUrl, QStandardPaths
     from PyQt5.QtGui import QDesktopServices

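With the cocoa proxy gone, the Qt branch is the one resolving special folders on every platform. A hedged sketch of what a Qt-only lookup can look like (the mapping below is an assumption for illustration, not necessarily the code outside this hunk):

```python
from PyQt5.QtCore import QStandardPaths

def _special_folder_path_qt(use_cache: bool) -> str:
    # QStandardPaths resolves per-user cache/appdata locations portably.
    location = (
        QStandardPaths.CacheLocation if use_cache
        else QStandardPaths.AppDataLocation
    )
    return QStandardPaths.writableLocation(location)
```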
@@ -1,216 +0,0 @@
-# Created By: Virgil Dupras
-# Created On: 2011-08-05
-# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
-#
-# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
-# which should be included with this package. The terms are also available at
-# http://www.gnu.org/licenses/gpl-3.0.html
-
-from sys import maxsize as INF
-from math import sqrt
-
-VERY_SMALL = 0.0000001
-
-
-class Point:
-    def __init__(self, x, y):
-        self.x = x
-        self.y = y
-
-    def __repr__(self):
-        return "<Point {:2.2f}, {:2.2f}>".format(*self)
-
-    def __iter__(self):
-        yield self.x
-        yield self.y
-
-    def distance_to(self, other):
-        return Line(self, other).length()
-
-
-class Line:
-    def __init__(self, p1, p2):
-        self.p1 = p1
-        self.p2 = p2
-
-    def __repr__(self):
-        return "<Line {}, {}>".format(*self)
-
-    def __iter__(self):
-        yield self.p1
-        yield self.p2
-
-    def dx(self):
-        return self.p2.x - self.p1.x
-
-    def dy(self):
-        return self.p2.y - self.p1.y
-
-    def length(self):
-        return sqrt(self.dx() ** 2 + self.dy() ** 2)
-
-    def slope(self):
-        if self.dx() == 0:
-            return INF if self.dy() > 0 else -INF
-        else:
-            return self.dy() / self.dx()
-
-    def intersection_point(self, other):
-        # with help from http://paulbourke.net/geometry/lineline2d/
-        if abs(self.slope() - other.slope()) < VERY_SMALL:
-            # parallel. Even if coincident, we return nothing
-            return None
-
-        A, B = self
-        C, D = other
-
-        denom = (D.y - C.y) * (B.x - A.x) - (D.x - C.x) * (B.y - A.y)
-        if denom == 0:
-            return None
-        numera = (D.x - C.x) * (A.y - C.y) - (D.y - C.y) * (A.x - C.x)
-        numerb = (B.x - A.x) * (A.y - C.y) - (B.y - A.y) * (A.x - C.x)
-
-        mua = numera / denom
-        mub = numerb / denom
-        if (0 <= mua <= 1) and (0 <= mub <= 1):
-            x = A.x + mua * (B.x - A.x)
-            y = A.y + mua * (B.y - A.y)
-            return Point(x, y)
-        else:
-            return None
-
-
-class Rect:
-    def __init__(self, x, y, w, h):
-        self.x = x
-        self.y = y
-        self.w = w
-        self.h = h
-
-    def __iter__(self):
-        yield self.x
-        yield self.y
-        yield self.w
-        yield self.h
-
-    def __repr__(self):
-        return "<Rect {:2.2f}, {:2.2f}, {:2.2f}, {:2.2f}>".format(*self)
-
-    @classmethod
-    def from_center(cls, center, width, height):
-        x = center.x - width / 2
-        y = center.y - height / 2
-        return cls(x, y, width, height)
-
-    @classmethod
-    def from_corners(cls, pt1, pt2):
-        x1, y1 = pt1
-        x2, y2 = pt2
-        return cls(min(x1, x2), min(y1, y2), abs(x1 - x2), abs(y1 - y2))
-
-    def center(self):
-        return Point(self.x + self.w / 2, self.y + self.h / 2)
-
-    def contains_point(self, point):
-        x, y = point
-        (x1, y1), (x2, y2) = self.corners()
-        return (x1 <= x <= x2) and (y1 <= y <= y2)
-
-    def contains_rect(self, rect):
-        pt1, pt2 = rect.corners()
-        return self.contains_point(pt1) and self.contains_point(pt2)
-
-    def corners(self):
-        return Point(self.x, self.y), Point(self.x + self.w, self.y + self.h)
-
-    def intersects(self, other):
-        r1pt1, r1pt2 = self.corners()
-        r2pt1, r2pt2 = other.corners()
-        if r1pt1.x < r2pt1.x:
-            xinter = r1pt2.x >= r2pt1.x
-        else:
-            xinter = r2pt2.x >= r1pt1.x
-        if not xinter:
-            return False
-        if r1pt1.y < r2pt1.y:
-            yinter = r1pt2.y >= r2pt1.y
-        else:
-            yinter = r2pt2.y >= r1pt1.y
-        return yinter
-
-    def lines(self):
-        pt1, pt4 = self.corners()
-        pt2 = Point(pt4.x, pt1.y)
-        pt3 = Point(pt1.x, pt4.y)
-        l1 = Line(pt1, pt2)
-        l2 = Line(pt2, pt4)
-        l3 = Line(pt3, pt4)
-        l4 = Line(pt1, pt3)
-        return l1, l2, l3, l4
-
-    def scaled_rect(self, dx, dy):
-        """Returns a rect that has the same borders at self, but grown/shrunk by dx/dy on each side."""
-        x, y, w, h = self
-        x -= dx
-        y -= dy
-        w += dx * 2
-        h += dy * 2
-        return Rect(x, y, w, h)
-
-    def united(self, other):
-        """Returns the bounding rectangle of this rectangle and `other`."""
-        # ul=upper left lr=lower right
-        ulcorner1, lrcorner1 = self.corners()
-        ulcorner2, lrcorner2 = other.corners()
-        corner1 = Point(min(ulcorner1.x, ulcorner2.x), min(ulcorner1.y, ulcorner2.y))
-        corner2 = Point(max(lrcorner1.x, lrcorner2.x), max(lrcorner1.y, lrcorner2.y))
-        return Rect.from_corners(corner1, corner2)
-
-    # --- Properties
-    @property
-    def top(self):
-        return self.y
-
-    @top.setter
-    def top(self, value):
-        self.y = value
-
-    @property
-    def bottom(self):
-        return self.y + self.h
-
-    @bottom.setter
-    def bottom(self, value):
-        self.y = value - self.h
-
-    @property
-    def left(self):
-        return self.x
-
-    @left.setter
-    def left(self, value):
-        self.x = value
-
-    @property
-    def right(self):
-        return self.x + self.w
-
-    @right.setter
-    def right(self, value):
-        self.x = value - self.w
-
-    @property
-    def width(self):
-        return self.w
-
-    @width.setter
-    def width(self, value):
-        self.w = value
-
-    @property
-    def height(self):
-        return self.h
-
-    @height.setter
-    def height(self, value):
-        self.h = value
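For reference, the removed `Line.intersection_point` implements the standard parametric segment-intersection test (after the Paul Bourke note cited in the code). Restated:

```latex
% Segments A->B and C->D intersect where
%   P(u_a) = A + u_a (B - A)  and  Q(u_b) = C + u_b (D - C)  coincide:
\[
u_a = \frac{(D_x - C_x)(A_y - C_y) - (D_y - C_y)(A_x - C_x)}
           {(D_y - C_y)(B_x - A_x) - (D_x - C_x)(B_y - A_y)},
\qquad
u_b = \frac{(B_x - A_x)(A_y - C_y) - (B_y - A_y)(A_x - C_x)}
           {(D_y - C_y)(B_x - A_x) - (D_x - C_x)(B_y - A_y)}
\]
% The code returns A + u_a (B - A) iff 0 <= u_a <= 1 and 0 <= u_b <= 1,
% and None when the shared denominator is 0 (parallel segments).
```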
hscommon/loc.py (118 lines changed)
@@ -1,14 +1,11 @@
 import os
 import os.path as op
 import shutil
 import re
 import tempfile

 import polib

 from . import pygettext
 from .util import modified_after, dedupe, ensure_folder
 from .build import print_and_do, ensure_empty_folder

 LC_MESSAGES = "LC_MESSAGES"

@@ -116,118 +113,3 @@ def normalize_all_pos(base_folder):
     for pofile in pofiles:
         p = polib.pofile(pofile)
         p.save()


-# --- Cocoa
-def all_lproj_paths(folder):
-    return files_with_ext(folder, ".lproj")
-
-
-def escape_cocoa_strings(s):
-    return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
-
-
-def unescape_cocoa_strings(s):
-    return s.replace("\\\\", "\\").replace('\\"', '"').replace("\\n", "\n")
-
-
-def strings2pot(target, dest):
-    with open(target, "rt", encoding="utf-8") as fp:
-        contents = fp.read()
-    # We're reading an en.lproj file. We only care about the righthand part of the translation.
-    re_trans = re.compile(r'".*" = "(.*)";')
-    strings = re_trans.findall(contents)
-    if op.exists(dest):
-        po = polib.pofile(dest)
-    else:
-        po = polib.POFile()
-    for s in dedupe(strings):
-        s = unescape_cocoa_strings(s)
-        entry = po.find(s)
-        if entry is None:
-            entry = polib.POEntry(msgid=s)
-            po.append(entry)
-        # we don't know or care about a line number so we put 0
-        entry.occurrences.append((target, "0"))
-        entry.occurrences = dedupe(entry.occurrences)
-    po.save(dest)
-
-
-def allstrings2pot(lprojpath, dest, excludes=None):
-    allstrings = files_with_ext(lprojpath, STRING_EXT)
-    if excludes:
-        allstrings = [p for p in allstrings if op.splitext(op.basename(p))[0] not in excludes]
-    for strings_path in allstrings:
-        strings2pot(strings_path, dest)
-
-
-def po2strings(pofile, en_strings, dest):
-    # Takes en_strings and replace all righthand parts of "foo" = "bar"; entries with translations
-    # in pofile, then puts the result in dest.
-    po = polib.pofile(pofile)
-    if not modified_after(pofile, dest):
-        return
-    ensure_folder(op.dirname(dest))
-    print("Creating {} from {}".format(dest, pofile))
-    with open(en_strings, "rt", encoding="utf-8") as fp:
-        contents = fp.read()
-    re_trans = re.compile(r'(?<= = ").*(?=";\n)')
-
-    def repl(match):
-        s = match.group(0)
-        unescaped = unescape_cocoa_strings(s)
-        entry = po.find(unescaped)
-        if entry is None:
-            print("WARNING: Could not find entry '{}' in .po file".format(s))
-            return s
-        trans = entry.msgstr
-        return escape_cocoa_strings(trans) if trans else s
-
-    contents = re_trans.sub(repl, contents)
-    with open(dest, "wt", encoding="utf-8") as fp:
-        fp.write(contents)
-
-
-def generate_cocoa_strings_from_code(code_folder, dest_folder):
-    # Uses the "genstrings" command to generate strings file from all .m files in "code_folder".
-    # The strings file (their name depends on the localization table used in the source) will be
-    # placed in "dest_folder".
-    # genstrings produces utf-16 files with comments. After having generated the files, we convert
-    # them to utf-8 and remove the comments.
-    ensure_empty_folder(dest_folder)
-    print_and_do('genstrings -o "{}" `find "{}" -name *.m | xargs`'.format(dest_folder, code_folder))
-    for stringsfile in os.listdir(dest_folder):
-        stringspath = op.join(dest_folder, stringsfile)
-        with open(stringspath, "rt", encoding="utf-16") as fp:
-            content = fp.read()
-        content = re.sub(r"/\*.*?\*/", "", content)
-        content = re.sub(r"\n{2,}", "\n", content)
-        # I have no idea why, but genstrings seems to have problems with "%" character in strings
-        # and inserts (number)$ after it. Find these bogus inserts and remove them.
-        content = re.sub(r"%\d\$", "%", content)
-        with open(stringspath, "wt", encoding="utf-8") as fp:
-            fp.write(content)
-
-
-def generate_cocoa_strings_from_xib(xib_folder):
-    xibs = [op.join(xib_folder, fn) for fn in os.listdir(xib_folder) if fn.endswith(".xib")]
-    for xib in xibs:
-        dest = xib.replace(".xib", STRING_EXT)
-        print_and_do("ibtool {} --generate-strings-file {}".format(xib, dest))
-        print_and_do("iconv -f utf-16 -t utf-8 {0} | tee {0}".format(dest))
-
-
-def localize_stringsfile(stringsfile, dest_root_folder):
-    stringsfile_name = op.basename(stringsfile)
-    for lang in get_langs("locale"):
-        pofile = op.join("locale", lang, "LC_MESSAGES", "ui.po")
-        cocoa_lang = PO2COCOA.get(lang, lang)
-        dest_lproj = op.join(dest_root_folder, cocoa_lang + ".lproj")
-        ensure_folder(dest_lproj)
-        po2strings(pofile, stringsfile, op.join(dest_lproj, stringsfile_name))
-
-
-def localize_all_stringsfiles(src_folder, dest_root_folder):
-    stringsfiles = [op.join(src_folder, fn) for fn in os.listdir(src_folder) if fn.endswith(STRING_EXT)]
-    for path in stringsfiles:
-        localize_stringsfile(path, dest_root_folder)

hscommon/path.py (201 lines changed)
@@ -7,208 +7,9 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

-import logging
-import os
-import os.path as op
-import shutil
-import sys
-from itertools import takewhile
-from functools import wraps
-from inspect import signature
-
-
-class Path(tuple):
-    """A handy class to work with paths.
-
-    We subclass ``tuple``, each element of the tuple represents an element of the path.
-
-    * ``Path('/foo/bar/baz')[1]`` --> ``'bar'``
-    * ``Path('/foo/bar/baz')[1:2]`` --> ``Path('bar/baz')``
-    * ``Path('/foo/bar')['baz']`` --> ``Path('/foo/bar/baz')``
-    * ``str(Path('/foo/bar/baz'))`` --> ``'/foo/bar/baz'``
-    """
-
-    # Saves a little bit of memory usage
-    __slots__ = ()
-
-    def __new__(cls, value, separator=None):
-        def unicode_if_needed(s):
-            if isinstance(s, str):
-                return s
-            else:
-                try:
-                    return str(s, sys.getfilesystemencoding())
-                except UnicodeDecodeError:
-                    logging.warning("Could not decode %r", s)
-                    raise
-
-        if isinstance(value, Path):
-            return value
-        if not separator:
-            separator = os.sep
-        if isinstance(value, bytes):
-            value = unicode_if_needed(value)
-        if isinstance(value, str):
-            if value:
-                if (separator not in value) and ("/" in value):
-                    separator = "/"
-                value = value.split(separator)
-            else:
-                value = ()
-        else:
-            if any(isinstance(x, bytes) for x in value):
-                value = [unicode_if_needed(x) for x in value]
-            # value is a tuple/list
-            if any(separator in x for x in value):
-                # We have a component with a separator in it. Let's rejoin it, and generate another path.
-                return Path(separator.join(value), separator)
-        if (len(value) > 1) and (not value[-1]):
-            value = value[
-                :-1
-            ]  # We never want a path to end with a '' (because Path() can be called with a trailing slash ending path)
-        return tuple.__new__(cls, value)
-
-    def __add__(self, other):
-        other = Path(other)
-        if other and (not other[0]):
-            other = other[1:]
-        return Path(tuple.__add__(self, other))
-
-    def __contains__(self, item):
-        if isinstance(item, Path):
-            return item[: len(self)] == self
-        else:
-            return tuple.__contains__(self, item)
-
-    def __eq__(self, other):
-        return tuple.__eq__(self, Path(other))
-
-    def __getitem__(self, key):
-        if isinstance(key, slice):
-            if isinstance(key.start, Path):
-                equal_elems = list(takewhile(lambda pair: pair[0] == pair[1], zip(self, key.start)))
-                key = slice(len(equal_elems), key.stop, key.step)
-            if isinstance(key.stop, Path):
-                equal_elems = list(
-                    takewhile(
-                        lambda pair: pair[0] == pair[1],
-                        zip(reversed(self), reversed(key.stop)),
-                    )
-                )
-                stop = -len(equal_elems) if equal_elems else None
-                key = slice(key.start, stop, key.step)
-            return Path(tuple.__getitem__(self, key))
-        elif isinstance(key, (str, Path)):
-            return self + key
-        else:
-            return tuple.__getitem__(self, key)
-
-    def __hash__(self):
-        return tuple.__hash__(self)
-
-    def __ne__(self, other):
-        return not self.__eq__(other)
-
-    def __radd__(self, other):
-        return Path(other) + self
-
-    def __str__(self):
-        if len(self) == 1:
-            first = self[0]
-            if (len(first) == 2) and (first[1] == ":"):  # Windows drive letter
-                return first + "\\"
-            elif not len(first):  # root directory
-                return "/"
-        return os.sep.join(self)
-
-    def has_drive_letter(self):
-        if not self:
-            return False
-        first = self[0]
-        return (len(first) == 2) and (first[1] == ":")
-
-    def is_parent_of(self, other):
-        """Whether ``other`` is a subpath of ``self``.
-
-        Almost the same as ``other in self``, but it's a bit more self-explicative and when
-        ``other == self``, returns False.
-        """
-        if other == self:
-            return False
-        else:
-            return other in self
-
-    def remove_drive_letter(self):
-        if self.has_drive_letter():
-            return self[1:]
-        else:
-            return self
-
-    def tobytes(self):
-        return str(self).encode(sys.getfilesystemencoding())
-
-    def parent(self):
-        """Returns the parent path.
-
-        ``Path('/foo/bar/baz').parent()`` --> ``Path('/foo/bar')``
-        """
-        return self[:-1]
-
-    @property
-    def name(self):
-        """Last element of the path (filename), with extension.
-
-        ``Path('/foo/bar/baz').name`` --> ``'baz'``
-        """
-        return self[-1]
-
-    # OS method wrappers
-    def exists(self):
-        return op.exists(str(self))
-
-    def copy(self, dest_path):
-        return shutil.copy(str(self), str(dest_path))
-
-    def copytree(self, dest_path, *args, **kwargs):
-        return shutil.copytree(str(self), str(dest_path), *args, **kwargs)
-
-    def isdir(self):
-        return op.isdir(str(self))
-
-    def isfile(self):
-        return op.isfile(str(self))
-
-    def islink(self):
-        return op.islink(str(self))
-
-    def listdir(self):
-        return [self[name] for name in os.listdir(str(self))]
-
-    def mkdir(self, *args, **kwargs):
-        return os.mkdir(str(self), *args, **kwargs)
-
-    def makedirs(self, *args, **kwargs):
-        return os.makedirs(str(self), *args, **kwargs)
-
-    def move(self, dest_path):
-        return shutil.move(str(self), str(dest_path))
-
-    def open(self, *args, **kwargs):
-        return open(str(self), *args, **kwargs)
-
-    def remove(self):
-        return os.remove(str(self))
-
-    def rename(self, dest_path):
-        return os.rename(str(self), str(dest_path))
-
-    def rmdir(self):
-        return os.rmdir(str(self))
-
-    def rmtree(self):
-        return shutil.rmtree(str(self))
-
-    def stat(self):
-        return os.stat(str(self))
+from pathlib import Path


 def pathify(f):

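`pathify` survives the rewrite while the homegrown `Path` class does not; its body falls outside this hunk, so the following is only a hedged sketch of what such a decorator can do, namely coerce string arguments whose annotation is `Path` (an assumption, not necessarily hscommon's implementation):

```python
from functools import wraps
from inspect import signature
from pathlib import Path

def pathify_sketch(f):
    sig = signature(f)

    @wraps(f)
    def wrapped(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            # Coerce str -> Path for parameters annotated as Path.
            if sig.parameters[name].annotation is Path and isinstance(value, str):
                bound.arguments[name] = Path(value)
        return f(*bound.args, **bound.kwargs)

    return wrapped
```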
@@ -15,7 +15,7 @@ from ..conflict import (
|
||||
smart_copy,
|
||||
smart_move,
|
||||
)
|
||||
from ..path import Path
|
||||
from pathlib import Path
|
||||
from ..testutil import eq_
|
||||
|
||||
|
||||
@@ -71,43 +71,43 @@ class TestCaseMoveCopy:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         self.path = Path(str(tmpdir))
-        self.path["foo"].open("w").close()
-        self.path["bar"].open("w").close()
-        self.path["dir"].mkdir()
+        self.path.joinpath("foo").touch()
+        self.path.joinpath("bar").touch()
+        self.path.joinpath("dir").mkdir()

     def test_move_no_conflict(self, do_setup):
-        smart_move(self.path + "foo", self.path + "baz")
-        assert self.path["baz"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("baz"))
+        assert self.path.joinpath("baz").exists()
+        assert not self.path.joinpath("foo").exists()

     def test_copy_no_conflict(self, do_setup):  # No need to duplicate the rest of the tests... Let's just test on move
-        smart_copy(self.path + "foo", self.path + "baz")
-        assert self.path["baz"].exists()
-        assert self.path["foo"].exists()
+        smart_copy(self.path.joinpath("foo"), self.path.joinpath("baz"))
+        assert self.path.joinpath("baz").exists()
+        assert self.path.joinpath("foo").exists()

     def test_move_no_conflict_dest_is_dir(self, do_setup):
-        smart_move(self.path + "foo", self.path + "dir")
-        assert self.path["dir"]["foo"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        assert self.path.joinpath("dir", "foo").exists()
+        assert not self.path.joinpath("foo").exists()

     def test_move_conflict(self, do_setup):
-        smart_move(self.path + "foo", self.path + "bar")
-        assert self.path["[000] bar"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("bar"))
+        assert self.path.joinpath("[000] bar").exists()
+        assert not self.path.joinpath("foo").exists()

     def test_move_conflict_dest_is_dir(self, do_setup):
-        smart_move(self.path["foo"], self.path["dir"])
-        smart_move(self.path["bar"], self.path["foo"])
-        smart_move(self.path["foo"], self.path["dir"])
-        assert self.path["dir"]["foo"].exists()
-        assert self.path["dir"]["[000] foo"].exists()
-        assert not self.path["foo"].exists()
-        assert not self.path["bar"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        smart_move(self.path.joinpath("bar"), self.path.joinpath("foo"))
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        assert self.path.joinpath("dir", "foo").exists()
+        assert self.path.joinpath("dir", "[000] foo").exists()
+        assert not self.path.joinpath("foo").exists()
+        assert not self.path.joinpath("bar").exists()

     def test_copy_folder(self, tmpdir):
         # smart_copy also works on folders
         path = Path(str(tmpdir))
-        path["foo"].mkdir()
-        path["bar"].mkdir()
-        smart_copy(path["foo"], path["bar"])  # no crash
-        assert path["[000] bar"].exists()
+        path.joinpath("foo").mkdir()
+        path.joinpath("bar").mkdir()
+        smart_copy(path.joinpath("foo"), path.joinpath("bar"))  # no crash
+        assert path.joinpath("[000] bar").exists()
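These tests pin down the behaviour of hscommon's conflict helpers: when smart_move or smart_copy finds the destination name taken, the incoming file is renamed with a [000]-style counter prefix instead of overwriting. A rough sketch of that naming rule, using a hypothetical conflicted_name helper (the real logic lives in hscommon.conflict and may differ in detail):

```python
from pathlib import Path


def conflicted_name(dest_dir: Path, name: str) -> str:
    """Return name, or a '[NNN] name' variant that is free in dest_dir."""
    if not dest_dir.joinpath(name).exists():
        return name
    for i in range(1000):
        candidate = "[%03d] %s" % (i, name)
        if not dest_dir.joinpath(candidate).exists():
            return candidate
    raise RuntimeError("no conflict-free name found")
```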
@@ -6,261 +6,8 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-import sys
-import os
-
-import pytest
-
-from ..path import Path, pathify
+from ..path import pathify
+from pathlib import Path
 from ..testutil import eq_


-@pytest.fixture
-def force_ossep(request):
-    monkeypatch = request.getfixturevalue("monkeypatch")
-    monkeypatch.setattr(os, "sep", "/")
-
-
-def test_empty(force_ossep):
-    path = Path("")
-    eq_("", str(path))
-    eq_(0, len(path))
-    path = Path(())
-    eq_("", str(path))
-    eq_(0, len(path))
-
-
-def test_single(force_ossep):
-    path = Path("foobar")
-    eq_("foobar", path)
-    eq_(1, len(path))
-
-
-def test_multiple(force_ossep):
-    path = Path("foo/bar")
-    eq_("foo/bar", path)
-    eq_(2, len(path))
-
-
-def test_init_with_tuple_and_list(force_ossep):
-    path = Path(("foo", "bar"))
-    eq_("foo/bar", path)
-    path = Path(["foo", "bar"])
-    eq_("foo/bar", path)
-
-
-def test_init_with_invalid_value(force_ossep):
-    try:
-        Path(42)
-        assert False
-    except TypeError:
-        pass
-
-
-def test_access(force_ossep):
-    path = Path("foo/bar/bleh")
-    eq_("foo", path[0])
-    eq_("foo", path[-3])
-    eq_("bar", path[1])
-    eq_("bar", path[-2])
-    eq_("bleh", path[2])
-    eq_("bleh", path[-1])
-
-
-def test_slicing(force_ossep):
-    path = Path("foo/bar/bleh")
-    subpath = path[:2]
-    eq_("foo/bar", subpath)
-    assert isinstance(subpath, Path)
-
-
-def test_parent(force_ossep):
-    path = Path("foo/bar/bleh")
-    subpath = path.parent()
-    eq_("foo/bar", subpath)
-    assert isinstance(subpath, Path)
-
-
-def test_filename(force_ossep):
-    path = Path("foo/bar/bleh.ext")
-    eq_(path.name, "bleh.ext")
-
-
-def test_deal_with_empty_components(force_ossep):
-    """Keep ONLY a leading space, which means we want a leading slash."""
-    eq_("foo//bar", str(Path(("foo", "", "bar"))))
-    eq_("/foo/bar", str(Path(("", "foo", "bar"))))
-    eq_("foo/bar", str(Path("foo/bar/")))
-
-
-def test_old_compare_paths(force_ossep):
-    eq_(Path("foobar"), Path("foobar"))
-    eq_(Path("foobar/"), Path("foobar\\", "\\"))
-    eq_(Path("/foobar/"), Path("\\foobar\\", "\\"))
-    eq_(Path("/foo/bar"), Path("\\foo\\bar", "\\"))
-    eq_(Path("/foo/bar"), Path("\\foo\\bar\\", "\\"))
-    assert Path("/foo/bar") != Path("\\foo\\foo", "\\")
-    # We also have to test __ne__
-    assert not (Path("foobar") != Path("foobar"))
-    assert Path("/a/b/c.x") != Path("/a/b/c.y")
-
-
-def test_old_split_path(force_ossep):
-    eq_(Path("foobar"), ("foobar",))
-    eq_(Path("foo/bar"), ("foo", "bar"))
-    eq_(Path("/foo/bar/"), ("", "foo", "bar"))
-    eq_(Path("\\foo\\bar", "\\"), ("", "foo", "bar"))
-
-
-def test_representation(force_ossep):
-    eq_("('foo', 'bar')", repr(Path(("foo", "bar"))))
-
-
-def test_add(force_ossep):
-    eq_("foo/bar/bar/foo", Path(("foo", "bar")) + Path("bar/foo"))
-    eq_("foo/bar/bar/foo", Path("foo/bar") + "bar/foo")
-    eq_("foo/bar/bar/foo", Path("foo/bar") + ("bar", "foo"))
-    eq_("foo/bar/bar/foo", ("foo", "bar") + Path("bar/foo"))
-    eq_("foo/bar/bar/foo", "foo/bar" + Path("bar/foo"))
-    # Invalid concatenation
-    try:
-        Path(("foo", "bar")) + 1
-        assert False
-    except TypeError:
-        pass
-
-
-def test_path_slice(force_ossep):
-    foo = Path("foo")
-    bar = Path("bar")
-    foobar = Path("foo/bar")
-    eq_("bar", foobar[foo:])
-    eq_("foo", foobar[:bar])
-    eq_("foo/bar", foobar[bar:])
-    eq_("foo/bar", foobar[:foo])
-    eq_((), foobar[foobar:])
-    eq_((), foobar[:foobar])
-    abcd = Path("a/b/c/d")
-    a = Path("a")
-    d = Path("d")
-    z = Path("z")
-    eq_("b/c", abcd[a:d])
-    eq_("b/c/d", abcd[a : d + z])
-    eq_("b/c", abcd[a : z + d])
-    eq_("a/b/c/d", abcd[:z])
-
-
-def test_add_with_root_path(force_ossep):
-    """if I perform /a/b/c + /d/e/f, I want /a/b/c/d/e/f, not /a/b/c//d/e/f"""
-    eq_("/foo/bar", str(Path("/foo") + Path("/bar")))
-
-
-def test_create_with_tuple_that_have_slash_inside(force_ossep, monkeypatch):
-    eq_(("", "foo", "bar"), Path(("/foo", "bar")))
-    monkeypatch.setattr(os, "sep", "\\")
-    eq_(("", "foo", "bar"), Path(("\\foo", "bar")))
-
-
-def test_auto_decode_os_sep(force_ossep, monkeypatch):
-    """Path should decode any either / or os.sep, but always encode in os.sep."""
-    eq_(("foo\\bar", "bleh"), Path("foo\\bar/bleh"))
-    monkeypatch.setattr(os, "sep", "\\")
-    eq_(("foo", "bar/bleh"), Path("foo\\bar/bleh"))
-    path = Path("foo/bar")
-    eq_(("foo", "bar"), path)
-    eq_("foo\\bar", str(path))
-
-
-def test_contains(force_ossep):
-    p = Path(("foo", "bar"))
-    assert Path(("foo", "bar", "bleh")) in p
-    assert Path(("foo", "bar")) in p
-    assert "foo" in p
-    assert "bleh" not in p
-    assert Path("foo") not in p
-
-
-def test_is_parent_of(force_ossep):
-    assert Path(("foo", "bar")).is_parent_of(Path(("foo", "bar", "bleh")))
-    assert not Path(("foo", "bar")).is_parent_of(Path(("foo", "baz")))
-    assert not Path(("foo", "bar")).is_parent_of(Path(("foo", "bar")))
-
-
-def test_windows_drive_letter(force_ossep):
-    p = Path(("c:",))
-    eq_("c:\\", str(p))
-
-
-def test_root_path(force_ossep):
-    p = Path("/")
-    eq_("/", str(p))
-
-
-def test_str_encodes_unicode_to_getfilesystemencoding(force_ossep):
-    p = Path(("foo", "bar\u00e9"))
-    eq_("foo/bar\u00e9".encode(sys.getfilesystemencoding()), p.tobytes())
-
-
-def test_unicode(force_ossep):
-    p = Path(("foo", "bar\u00e9"))
-    eq_("foo/bar\u00e9", str(p))
-
-
-def test_str_repr_of_mix_between_non_ascii_str_and_unicode(force_ossep):
-    u = "foo\u00e9"
-    encoded = u.encode(sys.getfilesystemencoding())
-    p = Path((encoded, "bar"))
-    print(repr(tuple(p)))
-    eq_("foo\u00e9/bar".encode(sys.getfilesystemencoding()), p.tobytes())
-
-
-def test_path_of_a_path_returns_self(force_ossep):
-    # if Path() is called with a path as value, just return value.
-    p = Path("foo/bar")
-    assert Path(p) is p
-
-
-def test_getitem_str(force_ossep):
-    # path['something'] returns the child path corresponding to the name
-    p = Path("/foo/bar")
-    eq_(p["baz"], Path("/foo/bar/baz"))
-
-
-def test_getitem_path(force_ossep):
-    # path[Path('something')] returns the child path corresponding to the name (or subpath)
-    p = Path("/foo/bar")
-    eq_(p[Path("baz/bleh")], Path("/foo/bar/baz/bleh"))
-
-
-@pytest.mark.xfail(reason="pytest's capture mechanism is flaky, I have to investigate")
-def test_log_unicode_errors(force_ossep, monkeypatch, capsys):
-    # When an there's a UnicodeDecodeError on path creation, log it so it can be possible
-    # to debug the cause of it.
-    monkeypatch.setattr(sys, "getfilesystemencoding", lambda: "ascii")
-    with pytest.raises(UnicodeDecodeError):
-        Path(["", b"foo\xe9"])
-    out, err = capsys.readouterr()
-    assert repr(b"foo\xe9") in err
-
-
-def test_has_drive_letter(monkeypatch):
-    monkeypatch.setattr(os, "sep", "\\")
-    p = Path("foo\\bar")
-    assert not p.has_drive_letter()
-    p = Path("C:\\")
-    assert p.has_drive_letter()
-    p = Path("z:\\foo")
-    assert p.has_drive_letter()
-
-
-def test_remove_drive_letter(monkeypatch):
-    monkeypatch.setattr(os, "sep", "\\")
-    p = Path("foo\\bar")
-    eq_(p.remove_drive_letter(), Path("foo\\bar"))
-    p = Path("C:\\")
-    eq_(p.remove_drive_letter(), Path(""))
-    p = Path("z:\\foo")
-    eq_(p.remove_drive_letter(), Path("foo"))


 def test_pathify():
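Nearly all of path_test.py goes away along with the class it tested; only test_pathify survives. pathify itself is the piece of hscommon.path that remains: a decorator that converts string arguments into pathlib.Path objects based on the function's annotations, which is what lets delete_if_empty further down accept plain strings. A hedged sketch of the idea (the real decorator may differ in its details):

```python
import inspect
from functools import wraps
from pathlib import Path


def pathify(f):
    """Coerce arguments annotated as Path into pathlib.Path instances."""
    sig = inspect.signature(f)

    @wraps(f)
    def wrapped(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if sig.parameters[name].annotation is Path and isinstance(value, str):
                bound.arguments[name] = Path(value)
        return f(*bound.args, **bound.kwargs)

    return wrapped


@pathify
def size_of(path: Path) -> int:
    return path.stat().st_size  # works whether the caller passed str or Path
```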
@@ -11,7 +11,7 @@ from io import StringIO
 from pytest import raises

 from ..testutil import eq_
-from ..path import Path
+from pathlib import Path
 from ..util import (
     nonone,
     tryint,
@@ -245,30 +245,30 @@ class TestCaseDeleteIfEmpty:
     def test_not_empty(self, tmpdir):
         testpath = Path(str(tmpdir))
-        testpath["foo"].mkdir()
+        testpath.joinpath("foo").mkdir()
         assert not delete_if_empty(testpath)
         assert testpath.exists()

     def test_with_files_to_delete(self, tmpdir):
         testpath = Path(str(tmpdir))
-        testpath["foo"].open("w")
-        testpath["bar"].open("w")
+        testpath.joinpath("foo").touch()
+        testpath.joinpath("bar").touch()
         assert delete_if_empty(testpath, ["foo", "bar"])
         assert not testpath.exists()

     def test_directory_in_files_to_delete(self, tmpdir):
         testpath = Path(str(tmpdir))
-        testpath["foo"].mkdir()
+        testpath.joinpath("foo").mkdir()
         assert not delete_if_empty(testpath, ["foo"])
         assert testpath.exists()

     def test_delete_files_to_delete_only_if_dir_is_empty(self, tmpdir):
         testpath = Path(str(tmpdir))
-        testpath["foo"].open("w")
-        testpath["bar"].open("w")
+        testpath.joinpath("foo").touch()
+        testpath.joinpath("bar").touch()
         assert not delete_if_empty(testpath, ["foo"])
         assert testpath.exists()
-        assert testpath["foo"].exists()
+        assert testpath.joinpath("foo").exists()

     def test_doesnt_exist(self):
         # When the 'path' doesn't exist, just do nothing.
@@ -276,8 +276,8 @@ class TestCaseDeleteIfEmpty:

     def test_is_file(self, tmpdir):
         # When 'path' is a file, do nothing.
-        p = Path(str(tmpdir)) + "filename"
-        p.open("w").close()
+        p = Path(str(tmpdir)).joinpath("filename")
+        p.touch()
         delete_if_empty(p)  # no crash

     def test_ioerror(self, tmpdir, monkeypatch):
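A side note on these fixtures: the tests still take pytest's legacy tmpdir (a py.path.local) and convert it by hand with Path(str(tmpdir)). Modern pytest also provides a tmp_path fixture that yields a pathlib.Path directly, which would make the conversion unnecessary; a hedged sketch of what that would look like here (the delete_if_empty import path is assumed):

```python
from pathlib import Path

from hscommon.util import delete_if_empty  # import path assumed


def test_not_empty(tmp_path: Path):
    # tmp_path is already a pathlib.Path; no Path(str(tmpdir)) dance needed.
    tmp_path.joinpath("foo").mkdir()
    assert not delete_if_empty(tmp_path)
    assert tmp_path.exists()
```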
@@ -129,18 +129,6 @@ def install_gettext_trans(base_folder, lang):
     installed_lang = lang


-def install_gettext_trans_under_cocoa():
-    from cocoa import proxy
-
-    res_folder = proxy.getResourcePath()
-    base_folder = op.join(res_folder, "locale")
-    current_lang = proxy.systemLang()
-    install_gettext_trans(base_folder, current_lang)
-    localename = get_locale_name(current_lang)
-    if localename is not None:
-        locale.setlocale(locale.LC_ALL, localename)
-
-
 def install_gettext_trans_under_qt(base_folder, lang=None):
     # So, we install the gettext locale, great, but we also should try to install qt_*.qm if
     # available so that strings that are inside Qt itself over which I have no control are in the
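With the Cocoa branch gone, install_gettext_trans_under_qt is the only UI-specific installer left; its job, per the comment above, is to also load Qt's own qt_*.qm catalogs so that strings originating inside Qt itself come out translated. The bare mechanism looks roughly like this in PyQt5 (a standalone illustration, not dupeGuru's helper):

```python
from PyQt5.QtCore import QCoreApplication, QLibraryInfo, QTranslator
from PyQt5.QtWidgets import QApplication

app = QApplication([])

qt_translator = QTranslator()
# qt_fr.qm ships with Qt; swap in the language you actually want.
if qt_translator.load("qt_fr", QLibraryInfo.location(QLibraryInfo.TranslationsPath)):
    QCoreApplication.installTranslator(qt_translator)
```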
@@ -15,7 +15,8 @@ import glob
 import shutil
 from datetime import timedelta

-from .path import Path, pathify, log_io_error
+from pathlib import Path
+from .path import pathify, log_io_error


 def nonone(value, replace_value):
@@ -354,13 +355,13 @@ def find_in_path(name, paths=None):
 @pathify
 def delete_if_empty(path: Path, files_to_delete=[]):
     """Deletes the directory at 'path' if it is empty or if it only contains files_to_delete."""
-    if not path.exists() or not path.isdir():
+    if not path.exists() or not path.is_dir():
         return
-    contents = path.listdir()
-    if any(p for p in contents if (p.name not in files_to_delete) or p.isdir()):
+    contents = list(path.glob("*"))
+    if any(p for p in contents if (p.name not in files_to_delete) or p.is_dir()):
         return False
     for p in contents:
-        p.remove()
+        p.unlink()
     path.rmdir()
     return True
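The migrated delete_if_empty keeps its contract (False when real content remains, True after deleting); only the primitives changed: isdir() becomes is_dir(), listdir() becomes glob("*"), remove() becomes unlink(). A quick usage sketch (module path hscommon.util assumed):

```python
import tempfile
from pathlib import Path

from hscommon.util import delete_if_empty

root = Path(tempfile.mkdtemp())
(root / ".DS_Store").touch()

# Only an ignorable file inside, so the directory itself gets deleted.
assert delete_if_empty(root, files_to_delete=[".DS_Store"])
assert not root.exists()
```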
2
macos.md
@@ -1,5 +1,7 @@
 ## How to build dupeGuru for macos
+These instructions are for the Qt version of the UI on macOS.

+*Note: The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa and is no longer "supported".*
 ### Prerequisites

 - [Python 3.7+][python]
@@ -10,7 +10,7 @@ Vcs-Git: https://github.com/arsenetar/dupeguru.git

 Package: {pkgname}
 Architecture: {arch}
-Depends: ${shlibs:Depends}, python3 (>=3.7), python3-pyqt5, python3-mutagen
+Depends: ${shlibs:Depends}, python3 (>=3.7), python3-pyqt5, python3-mutagen, python3-semantic-version
 Provides: dupeguru-se, dupeguru-me, dupeguru-pe
 Replaces: dupeguru-se, dupeguru-me, dupeguru-pe
 Conflicts: dupeguru-se, dupeguru-me, dupeguru-pe
@@ -6,18 +6,11 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from PyQt5.QtCore import Qt, QCoreApplication
+from PyQt5.QtCore import Qt, QCoreApplication, QTimer
 from PyQt5.QtGui import QPixmap, QFont
-from PyQt5.QtWidgets import (
-    QDialog,
-    QDialogButtonBox,
-    QSizePolicy,
-    QHBoxLayout,
-    QVBoxLayout,
-    QLabel,
-    QApplication,
-)
+from PyQt5.QtWidgets import QDialog, QDialogButtonBox, QSizePolicy, QHBoxLayout, QVBoxLayout, QLabel

+from core.util import check_for_update
+from qtlib.util import move_to_screen_center
 from hscommon.trans import trget
@@ -31,61 +24,56 @@ class AboutBox(QDialog):
         self.app = app
         self._setupUi()

-        self.buttonBox.accepted.connect(self.accept)
-        self.buttonBox.rejected.connect(self.reject)
+        self.button_box.accepted.connect(self.accept)
+        self.button_box.rejected.connect(self.reject)

     def _setupUi(self):
         self.setWindowTitle(tr("About {}").format(QCoreApplication.instance().applicationName()))
         self.resize(400, 290)
         size_policy = QSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
         size_policy.setHorizontalStretch(0)
         size_policy.setVerticalStretch(0)
         size_policy.setHeightForWidth(self.sizePolicy().hasHeightForWidth())
         self.setSizePolicy(size_policy)
-        self.horizontalLayout = QHBoxLayout(self)
-        self.logoLabel = QLabel(self)
-        self.logoLabel.setPixmap(QPixmap(":/%s_big" % self.app.LOGO_NAME))
-        self.horizontalLayout.addWidget(self.logoLabel)
-        self.verticalLayout = QVBoxLayout()
-        self.nameLabel = QLabel(self)
+        main_layout = QHBoxLayout(self)
+        logo_label = QLabel()
+        logo_label.setPixmap(QPixmap(":/%s_big" % self.app.LOGO_NAME))
+        main_layout.addWidget(logo_label)
+        detail_layout = QVBoxLayout()
+        name_label = QLabel()
         font = QFont()
         font.setWeight(75)
         font.setBold(True)
-        self.nameLabel.setFont(font)
-        self.nameLabel.setText(QCoreApplication.instance().applicationName())
-        self.verticalLayout.addWidget(self.nameLabel)
-        self.versionLabel = QLabel(self)
-        self.versionLabel.setText(tr("Version {}").format(QCoreApplication.instance().applicationVersion()))
-        self.verticalLayout.addWidget(self.versionLabel)
-        self.label_3 = QLabel(self)
-        self.verticalLayout.addWidget(self.label_3)
-        self.label_3.setText(tr("Licensed under GPLv3"))
-        self.label = QLabel(self)
-        font = QFont()
-        font.setWeight(75)
-        font.setBold(True)
-        self.label.setFont(font)
-        self.verticalLayout.addWidget(self.label)
-        self.buttonBox = QDialogButtonBox(self)
-        self.buttonBox.setOrientation(Qt.Horizontal)
-        self.buttonBox.setStandardButtons(QDialogButtonBox.Ok)
-        self.verticalLayout.addWidget(self.buttonBox)
-        self.horizontalLayout.addLayout(self.verticalLayout)
+        name_label.setFont(font)
+        name_label.setText(QCoreApplication.instance().applicationName())
+        detail_layout.addWidget(name_label)
+        version_label = QLabel()
+        version_label.setText(tr("Version {}").format(QCoreApplication.instance().applicationVersion()))
+        detail_layout.addWidget(version_label)
+        self.update_label = QLabel(tr("Checking for updates..."))
+        self.update_label.setTextInteractionFlags(Qt.TextBrowserInteraction)
+        self.update_label.setOpenExternalLinks(True)
+        detail_layout.addWidget(self.update_label)
+        license_label = QLabel()
+        license_label.setText(tr("Licensed under GPLv3"))
+        detail_layout.addWidget(license_label)
+        spacer_label = QLabel()
+        spacer_label.setFont(font)
+        detail_layout.addWidget(spacer_label)
+        self.button_box = QDialogButtonBox()
+        self.button_box.setOrientation(Qt.Horizontal)
+        self.button_box.setStandardButtons(QDialogButtonBox.Ok)
+        detail_layout.addWidget(self.button_box)
+        main_layout.addLayout(detail_layout)
+
+    def _check_for_update(self):
+        update = check_for_update(QCoreApplication.instance().applicationVersion(), include_prerelease=False)
+        if update is None:
+            self.update_label.setText(tr("No update available."))
+        else:
+            self.update_label.setText(
+                tr('New version {} available, download <a href="{}">here</a>.').format(update["version"], update["url"])
+            )

-
-if __name__ == "__main__":
-    import sys
-
-    app = QApplication([])
-    QCoreApplication.setOrganizationName("Hardcoded Software")
-    QCoreApplication.setApplicationName("FooApp")
-    QCoreApplication.setApplicationVersion("1.2.3")
-    app.LOGO_NAME = ""
-    dialog = AboutBox(None, app)
-    dialog.show()
-    sys.exit(app.exec_())
+    def showEvent(self, event):
+        self.update_label.setText(tr("Checking for updates..."))
+        # have to do this here as the frameGeometry is not correct until shown
+        move_to_screen_center(self)
+        super().showEvent(event)
+        QTimer.singleShot(0, self._check_for_update)
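The notable pattern is QTimer.singleShot(0, self._check_for_update) at the end of showEvent: instead of running the network-bound update check inside showEvent, it is queued on the event loop, so the dialog paints first and the user sees "Checking for updates..." while the check runs. The same idea in isolation (a standalone sketch, not dupeGuru code):

```python
import sys

from PyQt5.QtCore import QTimer
from PyQt5.QtWidgets import QApplication, QLabel

app = QApplication(sys.argv)
label = QLabel("Checking for updates...")
label.show()


def fake_check():
    # Stand-in for an HTTP request; by the time this runs, the label is
    # already on screen because singleShot(0, ...) deferred the call.
    label.setText("No update available.")


QTimer.singleShot(0, fake_check)
sys.exit(app.exec_())
```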
@@ -4,18 +4,30 @@ msgstr ""
 "Content-Type: text/plain; charset=utf-8\n"
 "Content-Transfer-Encoding: utf-8\n"

-#: qtlib\about_box.py:38
+#: qtlib\about_box.py:31
 msgid "About {}"
 msgstr ""

-#: qtlib\about_box.py:58
+#: qtlib\about_box.py:47
 msgid "Version {}"
 msgstr ""

-#: qtlib\about_box.py:62
+#: qtlib\about_box.py:49 qtlib\about_box.py:75
+msgid "Checking for updates..."
+msgstr ""
+
+#: qtlib\about_box.py:54
 msgid "Licensed under GPLv3"
 msgstr ""

+#: qtlib\about_box.py:68
+msgid "No update available."
+msgstr ""
+
+#: qtlib\about_box.py:71
+msgid "New version {} available, download <a href=\"{}\">here</a>."
+msgstr ""
+
 #: qtlib\error_report_dialog.py:50
 msgid "Error Report"
 msgstr ""
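The three new msgids only exist in the regenerated template; each language's .po file still has to pick them up. dupeGuru pins polib in its requirements (below), and a plain-polib version of that merge step looks roughly like this (file paths illustrative):

```python
import polib

# An existing French catalog plus the freshly regenerated template.
po = polib.pofile("locale/fr/LC_MESSAGES/qtlib.po")
pot = polib.pofile("qtlib/locale/qtlib.pot")

# merge() keeps existing translations and adds new untranslated entries
# such as "Checking for updates...".
po.merge(pot)
po.save()
```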
@@ -1,7 +1,9 @@
-Send2Trash>=1.3.0
-sphinx>=3.0.0
-polib>=1.1.0
-mutagen>=1.44.0
 distro>=1.5.0
+mutagen>=1.44.0
+polib>=1.1.0
 PyQt5 >=5.14.1,<6.0; sys_platform != 'linux'
 pywin32>=228; sys_platform == 'win32'
+semantic-version>=2.0.0,<3.0.0
+Send2Trash>=1.3.0
+sphinx>=3.0.0
+xxhash>=3.0.0,<4.0.0
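Besides the alphabetical reordering, two pins are new: semantic-version backs the About-box update check (comparing the running version against the latest release), and xxhash is presumably for faster file hashing. A tiny sketch of the version comparison (check_for_update's actual logic is not shown in this diff):

```python
from semantic_version import Version

current = Version("4.2.1")
latest = Version("4.3.0")

if latest > current:
    print(f"New version {latest} available.")
else:
    print("No update available.")
```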