mirror of https://github.com/arsenetar/dupeguru.git synced 2026-02-01 19:11:38 +00:00

Compare commits


12 Commits

Author SHA1 Message Date
a37b5b0eeb Fix #988 2022-03-30 01:06:51 -05:00
efd500ecc1 Update directory scanning to use os.scandir()
- Change to use os.scandir() instead of os.walk() to leverage DirEntry objects.
- Avoids extra calls to stat() on files during fs.can_handle()
- See 3x speed improvement on Windows in some cases
2022-03-29 23:37:56 -05:00
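
Illustration (a minimal sketch of the general technique, not dupeGuru code; the project's actual version is in the core/directories.py diff below): os.scandir() yields DirEntry objects whose type information is cached from the directory read itself, so a recursive walk can classify entries without issuing a separate stat() per file — that is where the Windows speedup comes from.

import os

def iter_files(root):
    # DirEntry.is_dir()/is_file() reuse data fetched by the directory scan,
    # so no extra stat() call is needed per entry on most platforms.
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from iter_files(entry.path)
            elif entry.is_file(follow_symlinks=False):
                yield entry.path
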
43fcc52291 Replace pathlib.glob() with os.scandir() in fs.py 2022-03-29 22:35:38 -05:00
50f5db1543 Update fs to support DirEntry on get_file() 2022-03-29 22:32:36 -05:00
a5b0ccdd02 Improve performance of Directories.get_state() 2022-03-29 21:48:14 -05:00
143147cb8e Remove Cocoa specific and other unused code 2022-03-28 00:47:46 -05:00
ebb81d9f03 Remove pathlib function added in Python 3.9 2022-03-28 00:06:32 -05:00
da9f8b2b9d Squashed commit of the following:
commit 8b15fe9a502ebf4841c6529e7098cef03a6a5e6f
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 23:48:15 2022 -0500

    Finish up changes to copy_or_move

commit 21f6a32cf3186a400af8f30e67ad2743dc9a49bd
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Thu Mar 17 23:56:52 2022 -0500

    Migrate from hscommon.path to pathlib
    - Part one, this gets all hscommon and core tests passing
    - App appears to be able to load directories and complete scans, need further testing
    - app.py copy_or_move needs some additional work
2022-03-27 23:50:03 -05:00
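
For orientation while reading the diffs below, the recurring substitutions in this migration follow a small set of pathlib equivalences (a sketch with made-up paths, not code from the commit):

from pathlib import Path

p = Path("/tmp/example")
child = p.joinpath("sub")      # hscommon.path: p["sub"]
assert child.parent == p       # hscommon.path: child.parent() == p
assert p in child.parents      # hscommon.path: child in p
assert len(child.parts) == 4   # hscommon.path: len(child)
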
5ed5eddde6 Add polib back to requirements.txt 2022-03-27 22:35:34 -05:00
9f40e4e786 Squashed commit of the following:
commit 5eb515f666bfa1ff06c2e96bdc351a4b7456580e
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 22:19:39 2022 -0500

    Add fallback to md5 if xxhash not available

    Mainly here for the case when distributions have not packaged python3-xxhash.

commit 51b18d4c84
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sat Mar 19 15:25:46 2022 -0500

    Switch file hashing to xxhash instead of md5

    - Improves performance significantly in some cases
    - Add xxhash to requirements.txt and sort requirements
    - Rename md5 based members to digest
    - Update all tests to use new member names and hashing methods
    - Update hash db code to upgrade schema

    NOTE: May consider supporting multiple hashing algorithms in the future.
2022-03-27 22:27:13 -05:00
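
The fallback pattern this commit describes, in isolation (a minimal sketch; the actual change is visible in the core/fs.py diff further down):

try:
    import xxhash

    hasher = xxhash.xxh128  # much faster than md5 on large inputs
except ImportError:
    import hashlib

    hasher = hashlib.md5  # fallback when python3-xxhash is not packaged

def digest_file(path, chunk_size=1024 * 1024):
    # Read in 1 MiB chunks so huge files don't exhaust memory.
    file_hash = hasher()
    with open(path, "rb") as fp:
        while chunk := fp.read(chunk_size):
            file_hash.update(chunk)
    return file_hash.digest()
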
86bf9b39d0 Add update check function and call from about
- Implement an update check against the GitHub releases via the API
- Add semantic-version dependency
- Add automatic check when opening about dialog
2022-03-27 21:13:27 -05:00
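
The update-check code itself is not among the diffs shown below; a minimal sketch of the approach the message describes could look like this (the helper name and dialog wiring are illustrative assumptions, not dupeGuru's actual implementation):

import json
from urllib.request import urlopen

import semantic_version  # the new semantic-version dependency

def check_for_update(current_version):
    # GitHub's releases API reports the latest release's tag name.
    url = "https://api.github.com/repos/arsenetar/dupeguru/releases/latest"
    with urlopen(url) as resp:
        latest = json.load(resp)["tag_name"]
    if semantic_version.Version(latest) > semantic_version.Version(current_version):
        return latest  # newer release available; surface it in the about dialog
    return None
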
c0be0aecbd Minor documentation update 2022-03-27 21:04:37 -05:00
36 changed files with 652 additions and 1675 deletions

View File

@@ -1,16 +1,12 @@
 # dupeGuru
 
 [dupeGuru][dupeguru] is a cross-platform (Linux, OS X, Windows) GUI tool to find duplicate files in
-a system. It is written mostly in Python 3 and has the peculiarity of using
-[multiple GUI toolkits][cross-toolkit], all using the same core Python code. On OS X, the UI layer
-is written in Objective-C and uses Cocoa. On Linux, it is written in Python and uses Qt5.
-
-The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa
+a system. It is written mostly in Python 3 and uses [qt](https://www.qt.io/) for the UI.
 
 ## Current status
 Still looking for additional help especially with regards to:
-* OSX maintenance: reproducing bugs & cocoa version, building package with Cocoa UI.
-* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package.
+* OSX maintenance: reproducing bugs, packaging verification.
+* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package, rpm package.
 * Translations: updating missing strings, transifex project at https://www.transifex.com/voltaicideas/dupeguru-1
 * Documentation: keeping it up-to-date.

@@ -43,12 +39,10 @@ For macos instructions (qt version) see the [macOS Instructions](macos.md).
 When running in a linux based environment the following system packages or equivalents are needed to build:
 * python3-pyqt5
 * pyqt5-dev-tools (on some systems, see note)
-* python3-wheel (for hsaudiotag3k)
 * python3-venv (only if using a virtual environment)
 * python3-dev
 * build-essential
 
 Note: On some linux systems pyrcc5 is not put on the path when installing python3-pyqt5, this will cause some issues with the resource files (and icons). These systems should have a respective pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked with `which pyrcc5`. Debian based systems need the extra package, and Arch does not.
 
 To create packages the following are also needed:

View File

@@ -109,10 +109,7 @@ def build_updatepot():
     print("Building columns.pot")
     loc.generate_pot(["core"], Path("locale", "columns.pot"), ["coltr"])
     print("Building ui.pot")
-    # When we're not under OS X, we don't want to overwrite ui.pot because it contains Cocoa locs
-    # We want to merge the generated pot with the old pot in the most preserving way possible.
-    ui_packages = ["qt", Path("cocoa", "inter")]
-    loc.generate_pot(ui_packages, Path("locale", "ui.pot"), ["tr"], merge=True)
+    loc.generate_pot(["qt"], Path("locale", "ui.pot"), ["tr"], merge=True)
     print("Building qtlib.pot")
     loc.generate_pot(["qtlib"], Path("qtlib", "locale", "qtlib.pot"), ["tr"])

@@ -121,13 +118,11 @@ def build_mergepot():
     print("Updating .po files using .pot files")
     loc.merge_pots_into_pos("locale")
     loc.merge_pots_into_pos(Path("qtlib", "locale"))
-    # loc.merge_pots_into_pos(Path("cocoalib", "locale"))


 def build_normpo():
     loc.normalize_all_pos("locale")
     loc.normalize_all_pos(Path("qtlib", "locale"))
-    # loc.normalize_all_pos(Path("cocoalib", "locale"))


 def build_pe_modules():

View File

@@ -10,11 +10,11 @@ import logging
 import subprocess
 import re
 import shutil
+from pathlib import Path

 from send2trash import send2trash
 from hscommon.jobprogress import job
 from hscommon.notify import Broadcaster
-from hscommon.path import Path
 from hscommon.conflict import smart_move, smart_copy
 from hscommon.gui.progress_window import ProgressWindow
 from hscommon.util import delete_if_empty, first, escape, nonone, allsame

@@ -248,7 +248,7 @@ class DupeGuru(Broadcaster):
             ref = group.ref
             linkfunc = os.link if use_hardlinks else os.symlink
             linkfunc(str(ref.path), str_path)
-        self.clean_empty_dirs(dupe.path.parent())
+        self.clean_empty_dirs(dupe.path.parent)

     def _create_file(self, path):
         # We add fs.Folder to fileclasses in case the file we're loading contains folder paths.

@@ -415,7 +415,7 @@ class DupeGuru(Broadcaster):
     def clean_empty_dirs(self, path):
         if self.options["clean_empty_dirs"]:
             while delete_if_empty(path, [".DS_Store"]):
-                path = path.parent()
+                path = path.parent

     def clear_picture_cache(self):
         try:

@@ -428,25 +428,25 @@ class DupeGuru(Broadcaster):
     def copy_or_move(self, dupe, copy: bool, destination: str, dest_type: DestType):
         source_path = dupe.path
-        location_path = first(p for p in self.directories if dupe.path in p)
+        location_path = first(p for p in self.directories if p in dupe.path.parents)
         dest_path = Path(destination)
         if dest_type in {DestType.RELATIVE, DestType.ABSOLUTE}:
             # no filename, no windows drive letter
-            source_base = source_path.remove_drive_letter().parent()
+            source_base = source_path.relative_to(source_path.anchor).parent
             if dest_type == DestType.RELATIVE:
-                source_base = source_base[location_path:]
-            dest_path = dest_path[source_base]
+                source_base = source_base.relative_to(location_path.relative_to(location_path.anchor))
+            dest_path = dest_path.joinpath(source_base)
         if not dest_path.exists():
-            dest_path.makedirs()
+            dest_path.mkdir(parents=True)
         # Add filename to dest_path. For file move/copy, it's not required, but for folders, yes.
-        dest_path = dest_path[source_path.name]
+        dest_path = dest_path.joinpath(source_path.name)
         logging.debug("Copy/Move operation from '%s' to '%s'", source_path, dest_path)
         # Raises an EnvironmentError if there's a problem
         if copy:
             smart_copy(source_path, dest_path)
         else:
             smart_move(source_path, dest_path)
-        self.clean_empty_dirs(source_path.parent())
+        self.clean_empty_dirs(source_path.parent)

     def copy_or_move_marked(self, copy):
         """Start an async move (or copy) job on marked duplicates.

View File

@@ -7,9 +7,9 @@
 import os
 from xml.etree import ElementTree as ET
 import logging
+from pathlib import Path

 from hscommon.jobprogress import job
-from hscommon.path import Path
 from hscommon.util import FileOrPath
 from hscommon.trans import tr

@@ -63,7 +63,7 @@ class Directories:
     def __contains__(self, path):
         for p in self._dirs:
-            if path in p:
+            if path == p or p in path.parents:
                 return True
         return False

@@ -90,45 +90,45 @@ class Directories:
         return DirectoryState.EXCLUDED

     def _get_files(self, from_path, fileclasses, j):
-        for root, dirs, files in os.walk(str(from_path)):
-            j.check_if_cancelled()
-            root_path = Path(root)
-            state = self.get_state(root_path)
-            if state == DirectoryState.EXCLUDED and not any(p[: len(root_path)] == root_path for p in self.states):
-                # Recursively get files from folders with lots of subfolder is expensive. However, there
-                # might be a subfolder in this path that is not excluded. What we want to do is to skim
-                # through self.states and see if we must continue, or we can stop right here to save time
-                del dirs[:]
-            try:
-                if state != DirectoryState.EXCLUDED:
-                    # Old logic
-                    if self._exclude_list is None or not self._exclude_list.mark_count:
-                        found_files = [fs.get_file(root_path + f, fileclasses=fileclasses) for f in files]
-                    else:
-                        found_files = []
-                        # print(f"len of files: {len(files)} {files}")
-                        for f in files:
-                            if not self._exclude_list.is_excluded(root, f):
-                                found_files.append(fs.get_file(root_path + f, fileclasses=fileclasses))
-                    found_files = [f for f in found_files if f is not None]
-                    # In some cases, directories can be considered as files by dupeGuru, which is
-                    # why we have this line below. In fact, there only one case: Bundle files under
-                    # OS X... In other situations, this forloop will do nothing.
-                    for d in dirs[:]:
-                        f = fs.get_file(root_path + d, fileclasses=fileclasses)
-                        if f is not None:
-                            found_files.append(f)
-                            dirs.remove(d)
-                logging.debug(
-                    "Collected %d files in folder %s",
-                    len(found_files),
-                    str(root_path),
-                )
-                for file in found_files:
-                    file.is_ref = state == DirectoryState.REFERENCE
-                    yield file
-            except (EnvironmentError, fs.InvalidPath):
-                pass
+        try:
+            with os.scandir(from_path) as iter:
+                root_path = Path(from_path)
+                state = self.get_state(root_path)
+                # if we have no un-excluded dirs under this directory skip going deeper
+                skip_dirs = state == DirectoryState.EXCLUDED and not any(
+                    p.parts[: len(root_path.parts)] == root_path.parts for p in self.states
+                )
+                count = 0
+                for item in iter:
+                    j.check_if_cancelled()
+                    try:
+                        if item.is_dir():
+                            if skip_dirs:
+                                continue
+                            yield from self._get_files(item.path, fileclasses, j)
+                            continue
+                        elif state == DirectoryState.EXCLUDED:
+                            continue
+                        # File excluding or not
+                        if (
+                            self._exclude_list is None
+                            or not self._exclude_list.mark_count
+                            or not self._exclude_list.is_excluded(str(from_path), item.name)
+                        ):
+                            file = fs.get_file(item, fileclasses=fileclasses)
+                            if file:
+                                file.is_ref = state == DirectoryState.REFERENCE
+                                count += 1
+                                yield file
+                    except (EnvironmentError, OSError, fs.InvalidPath):
+                        pass
+                logging.debug(
+                    "Collected %d files in folder %s",
+                    count,
+                    str(root_path),
+                )
+        except OSError:
+            pass

     def _get_folders(self, from_folder, j):
         j.check_if_cancelled()

@@ -159,7 +159,7 @@ class Directories:
             raise AlreadyThereError()
         if not path.exists():
             raise InvalidPathError()
-        self._dirs = [p for p in self._dirs if p not in path]
+        self._dirs = [p for p in self._dirs if path not in p.parents]
         self._dirs.append(path)

     @staticmethod

@@ -170,7 +170,7 @@ class Directories:
         :rtype: list of Path
         """
         try:
-            subpaths = [p for p in path.listdir() if p.isdir()]
+            subpaths = [p for p in path.glob("*") if p.is_dir()]
             subpaths.sort(key=lambda x: x.name.lower())
             return subpaths
         except EnvironmentError:

@@ -220,14 +220,11 @@ class Directories:
         if state != DirectoryState.NORMAL:
             self.states[path] = state
             return state
-        prevlen = 0
-        # we loop through the states to find the longest matching prefix
-        # if the parent has a state in cache, return that state
-        for p, s in self.states.items():
-            if p.is_parent_of(path) and len(p) > prevlen:
-                prevlen = len(p)
-                state = s
+        # find the longest parent path that is in states and return that state if found
+        # NOTE: path.parents is ordered longest to shortest
+        for parent_path in path.parents:
+            if parent_path in self.states:
+                return self.states[parent_path]
         return state

     def has_any_file(self):

@@ -296,6 +293,6 @@ class Directories:
         if self.get_state(path) == state:
             return
         for iter_path in list(self.states.keys()):
-            if path.is_parent_of(iter_path):
+            if path in iter_path.parents:
                 del self.states[iter_path]
         self.states[path] = state
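
A note on the pathlib idiom the rewritten get_state() relies on (a standalone sketch of standard-library behavior, not project code): Path.parents iterates from the immediate parent up to the root, so returning on the first cache hit honors the deepest configured ancestor.

from pathlib import Path

p = Path("/data/photos/2021/img.jpg")
assert p.parent == Path("/data/photos/2021")  # property, not a method call
assert list(p.parents)[:2] == [Path("/data/photos/2021"), Path("/data/photos")]
assert Path("/data/photos") in p.parents      # replaces the old "path in p" containment
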

View File

@@ -283,7 +283,7 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
     """Returns a list of :class:`Match` within ``files`` if their contents is the same.

     :param bigsize: The size in bytes over which we consider files big enough to
-        justify taking samples of md5. If 0, compute md5 as usual.
+        justify taking samples of the file for hashing. If 0, compute digest as usual.
     :param j: A :ref:`job progress instance <jobs>`.
     """
     size2files = defaultdict(set)

@@ -300,15 +300,15 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
             if first.is_ref and second.is_ref:
                 continue  # Don't spend time comparing two ref pics together.
             if first.size == 0 and second.size == 0:
-                # skip md5 for zero length files
+                # skip hashing for zero length files
                 result.append(Match(first, second, 100))
                 continue
-            if first.md5partial == second.md5partial:
+            if first.digest_partial == second.digest_partial:
                 if bigsize > 0 and first.size > bigsize:
-                    if first.md5samples == second.md5samples:
+                    if first.digest_samples == second.digest_samples:
                         result.append(Match(first, second, 100))
                 else:
-                    if first.md5 == second.md5:
+                    if first.digest == second.digest:
                         result.append(Match(first, second, 100))
             group_count += 1
         j.add_progress(desc=PROGRESS_MESSAGE % (len(result), group_count))

View File

@@ -11,14 +11,24 @@
 # resulting needless complexity and memory usage. It's been a while since I wanted to do that fork,
 # and I'm doing it now.

-import hashlib
+import os
+
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
+
 from math import floor
 import logging
 import sqlite3
 from threading import Lock
-from typing import Any
+from typing import Any, AnyStr, Union
+from pathlib import Path

-from hscommon.path import Path
 from hscommon.util import nonone, get_file_ext

 __all__ = [

@@ -40,7 +50,7 @@ NOT_SET = object()
 # CPU.
 CHUNK_SIZE = 1024 * 1024  # 1 MiB

-# Minimum size below which partial hashes don't need to be computed
+# Minimum size below which partial hashing is not used
 MIN_FILE_SIZE = 3 * CHUNK_SIZE  # 3MiB, because we take 3 samples

@@ -83,9 +93,11 @@ class OperationError(FSError):

 class FilesDB:
+    schema_version = 1
+    schema_version_description = "Changed from md5 to xxhash if available."

-    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, md5 BLOB, md5partial BLOB)"
-    drop_table_query = "DROP TABLE files;"
+    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, digest BLOB, digest_partial BLOB, digest_samples BLOB)"
+    drop_table_query = "DROP TABLE IF EXISTS files;"
     select_query = "SELECT {key} FROM files WHERE path=:path AND size=:size and mtime_ns=:mtime_ns"
     insert_query = """
         INSERT INTO files (path, size, mtime_ns, entry_dt, {key}) VALUES (:path, :size, :mtime_ns, datetime('now'), :value)

@@ -97,24 +109,37 @@ class FilesDB:
         self.cur = None
         self.lock = None

-    def connect(self, path):
-        # type: (str, ) -> None
-
+    def connect(self, path: Union[AnyStr, os.PathLike]) -> None:
         self.conn = sqlite3.connect(path, check_same_thread=False)
         self.cur = self.conn.cursor()
-        self.cur.execute(self.create_table_query)
         self.lock = Lock()
+        self._check_upgrade()

-    def clear(self):
-        # type: () -> None
+    def _check_upgrade(self) -> None:
+        with self.lock:
+            has_schema = self.cur.execute(
+                "SELECT NAME FROM sqlite_master WHERE type='table' AND name='schema_version'"
+            ).fetchall()
+            version = None
+            if has_schema:
+                version = self.cur.execute("SELECT version FROM schema_version ORDER BY version DESC").fetchone()[0]
+            else:
+                self.cur.execute("CREATE TABLE schema_version (version int PRIMARY KEY, description TEXT)")
+            if version != self.schema_version:
+                self.cur.execute(self.drop_table_query)
+                self.cur.execute(
+                    "INSERT OR REPLACE INTO schema_version VALUES (:version, :description)",
+                    {"version": self.schema_version, "description": self.schema_version_description},
+                )
+            self.cur.execute(self.create_table_query)
+            self.conn.commit()

+    def clear(self) -> None:
         with self.lock:
             self.cur.execute(self.drop_table_query)
             self.cur.execute(self.create_table_query)

-    def get(self, path, key):
-        # type: (Path, str) -> bytes
-
+    def get(self, path: Path, key: str) -> Union[bytes, None]:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns

@@ -128,9 +153,7 @@ class FilesDB:
         return None

-    def put(self, path, key, value):
-        # type: (Path, str, Any) -> None
-
+    def put(self, path: Path, key: str, value: Any) -> None:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns

@@ -141,15 +164,11 @@ class FilesDB:
             {"path": str(path), "size": size, "mtime_ns": mtime_ns, "value": value},
         )

-    def commit(self):
-        # type: () -> None
-
+    def commit(self) -> None:
         with self.lock:
             self.conn.commit()

-    def close(self):
-        # type: () -> None
-
+    def close(self) -> None:
         with self.lock:
             self.cur.close()
             self.conn.close()

@@ -161,7 +180,7 @@ filesdb = FilesDB()  # Singleton
 class File:
     """Represents a file and holds metadata to be used for scanning."""

-    INITIAL_INFO = {"size": 0, "mtime": 0, "md5": b"", "md5partial": b"", "md5samples": b""}
+    INITIAL_INFO = {"size": 0, "mtime": 0, "digest": b"", "digest_partial": b"", "digest_samples": b""}
     # Slots for File make us save quite a bit of memory. In a memory test I've made with a lot of
     # files, I saved 35% memory usage with "unread" files (no _read_info() call) and gains become
    # even greater when we take into account read attributes (70%!). Yeah, it's worth it.

@@ -187,32 +206,51 @@ class File:
                 result = self.INITIAL_INFO[attrname]
         return result

-    def _calc_md5(self):
+    def _calc_digest(self):
         # type: () -> bytes
         with self.path.open("rb") as fp:
-            md5 = hashlib.md5()
+            file_hash = hasher()
             # The goal here is to not run out of memory on really big files. However, the chunk
             # size has to be large enough so that the python loop isn't too costly in terms of
             # CPU.
             CHUNK_SIZE = 1024 * 1024  # 1 mb
             filedata = fp.read(CHUNK_SIZE)
             while filedata:
-                md5.update(filedata)
+                file_hash.update(filedata)
                 filedata = fp.read(CHUNK_SIZE)
-            return md5.digest()
+            return file_hash.digest()

-    def _calc_md5partial(self):
+    def _calc_digest_partial(self):
         # type: () -> bytes
-        # This offset is where we should start reading the file to get a partial md5
+        # This offset is where we should start reading the file to get a partial hash
         # For audio file, it should be where audio data starts
         offset, size = (0x4000, 0x4000)
         with self.path.open("rb") as fp:
             fp.seek(offset)
-            partialdata = fp.read(size)
-        return hashlib.md5(partialdata).digest()
+            partial_data = fp.read(size)
+        return hasher(partial_data).digest()
+
+    def _calc_digest_samples(self) -> bytes:
+        size = self.size
+        with self.path.open("rb") as fp:
+            # Chunk at 25% of the file
+            fp.seek(floor(size * 25 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash = hasher(file_data)
+            # Chunk at 60% of the file
+            fp.seek(floor(size * 60 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+            # Last chunk of the file
+            fp.seek(-CHUNK_SIZE, 2)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+        return file_hash.digest()

     def _read_info(self, field):
         # print(f"_read_info({field}) for {self}")

@@ -220,48 +258,35 @@ class File:
             stats = self.path.stat()
             self.size = nonone(stats.st_size, 0)
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field == "md5partial":
+        elif field == "digest_partial":
             try:
-                self.md5partial = filesdb.get(self.path, "md5partial")
-                if self.md5partial is None:
-                    self.md5partial = self._calc_md5partial()
-                    filesdb.put(self.path, "md5partial", self.md5partial)
+                self.digest_partial = filesdb.get(self.path, "digest_partial")
+                if self.digest_partial is None:
+                    self.digest_partial = self._calc_digest_partial()
+                    filesdb.put(self.path, "digest_partial", self.digest_partial)
             except Exception as e:
-                logging.warning("Couldn't get md5partial for %s: %s", self.path, e)
-        elif field == "md5":
+                logging.warning("Couldn't get digest_partial for %s: %s", self.path, e)
+        elif field == "digest":
             try:
-                self.md5 = filesdb.get(self.path, "md5")
-                if self.md5 is None:
-                    self.md5 = self._calc_md5()
-                    filesdb.put(self.path, "md5", self.md5)
+                self.digest = filesdb.get(self.path, "digest")
+                if self.digest is None:
+                    self.digest = self._calc_digest()
+                    filesdb.put(self.path, "digest", self.digest)
             except Exception as e:
-                logging.warning("Couldn't get md5 for %s: %s", self.path, e)
-        elif field == "md5samples":
+                logging.warning("Couldn't get digest for %s: %s", self.path, e)
+        elif field == "digest_samples":
+            size = self.size
+            # Might as well hash such small files entirely.
+            if size <= MIN_FILE_SIZE:
+                setattr(self, field, self.digest)
+                return
             try:
-                with self.path.open("rb") as fp:
-                    size = self.size
-                    # Might as well hash such small files entirely.
-                    if size <= MIN_FILE_SIZE:
-                        setattr(self, field, self.md5)
-                        return
-                    # Chunk at 25% of the file
-                    fp.seek(floor(size * 25 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5 = hashlib.md5(filedata)
-                    # Chunk at 60% of the file
-                    fp.seek(floor(size * 60 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-                    # Last chunk of the file
-                    fp.seek(-CHUNK_SIZE, 2)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-                    setattr(self, field, md5.digest())
+                self.digest_samples = filesdb.get(self.path, "digest_samples")
+                if self.digest_samples is None:
+                    self.digest_samples = self._calc_digest_samples()
+                    filesdb.put(self.path, "digest_samples", self.digest_samples)
             except Exception as e:
-                logging.error(f"Error computing md5samples: {e}")
+                logging.warning(f"Couldn't get digest_samples for {self.path}: {e}")

     def _read_all_info(self, attrnames=None):
         """Cache all possible info.

@@ -277,14 +302,14 @@ class File:
     @classmethod
     def can_handle(cls, path):
         """Returns whether this file wrapper class can handle ``path``."""
-        return not path.islink() and path.isfile()
+        return not path.is_symlink() and path.is_file()

     def rename(self, newname):
         if newname == self.name:
             return
-        destpath = self.path.parent()[newname]
+        destpath = self.path.parent.joinpath(newname)
         if destpath.exists():
-            raise AlreadyExistsError(newname, self.path.parent())
+            raise AlreadyExistsError(newname, self.path.parent)
         try:
             self.path.rename(destpath)
         except EnvironmentError:

@@ -308,13 +333,13 @@ class File:
     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent


 class Folder(File):
     """A wrapper around a folder path.

-    It has the size/md5 info of a File, but its value is the sum of its subitems.
+    It has the size/digest info of a File, but its value is the sum of its subitems.
     """

     __slots__ = File.__slots__ + ("_subfolders",)

@@ -335,31 +360,31 @@ class Folder(File):
             self.size = size
             stats = self.path.stat()
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field in {"md5", "md5partial", "md5samples"}:
+        elif field in {"digest", "digest_partial", "digest_samples"}:
             # What's sensitive here is that we must make sure that subfiles'
-            # md5 are always added up in the same order, but we also want a
-            # different md5 if a file gets moved in a different subdirectory.
+            # digest are always added up in the same order, but we also want a
+            # different digest if a file gets moved in a different subdirectory.

-            def get_dir_md5_concat():
+            def get_dir_digest_concat():
                 items = self._all_items()
                 items.sort(key=lambda f: f.path)
-                md5s = [getattr(f, field) for f in items]
-                return b"".join(md5s)
+                digests = [getattr(f, field) for f in items]
+                return b"".join(digests)

-            md5 = hashlib.md5(get_dir_md5_concat())
-            digest = md5.digest()
+            digest = hasher(get_dir_digest_concat()).digest()
             setattr(self, field, digest)

     @property
     def subfolders(self):
         if self._subfolders is None:
-            subfolders = [p for p in self.path.listdir() if not p.islink() and p.isdir()]
-            self._subfolders = [self.__class__(p) for p in subfolders]
+            with os.scandir(self.path) as iter:
+                subfolders = [p.path for p in iter if not p.is_symlink() and p.is_dir()]
+            self._subfolders = [self.__class__(Path(p)) for p in subfolders]
         return self._subfolders

     @classmethod
     def can_handle(cls, path):
-        return not path.islink() and path.isdir()
+        return not path.is_symlink() and path.is_dir()


 def get_file(path, fileclasses=[File]):

@@ -372,6 +397,8 @@ def get_file(path, fileclasses=[File]):
     """
     for fileclass in fileclasses:
         if fileclass.can_handle(path):
+            if type(path) is os.DirEntry:
+                return fileclass(Path(path.path))
             return fileclass(path)

@@ -384,10 +411,11 @@ def get_files(path, fileclasses=[File]):
     assert all(issubclass(fileclass, File) for fileclass in fileclasses)
     try:
         result = []
-        for path in path.listdir():
-            file = get_file(path, fileclasses=fileclasses)
-            if file is not None:
-                result.append(file)
+        with os.scandir(path) as iter:
+            for item in iter:
+                file = get_file(item, fileclasses=fileclasses)
+                if file is not None:
+                    result.append(file)
         return result
     except EnvironmentError:
         raise InvalidPath(path)

View File

@@ -97,11 +97,6 @@ class MusicFile(fs.File):
             "dupe_count": format_dupe_count(dupe_count),
         }

-    def _get_md5partial_offset_and_size(self):
-        # No longer calculating the offset and audio size, just whole file
-        size = self.path.stat().st_size
-        return (0, size)
-
     def _read_info(self, field):
         fs.File._read_info(self, field)
         if field in TAG_FIELDS:

View File

@@ -238,7 +238,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljob):
     for ref_id, other_id, percentage in myiter:
         ref = id2picture[ref_id]
         other = id2picture[other_id]
-        if percentage == 100 and ref.md5 != other.md5:
+        if percentage == 100 and ref.digest != other.digest:
             percentage = 99
         if percentage >= threshold:
             ref.dimensions  # pre-read dimensions for display in results

View File

@@ -82,10 +82,12 @@ class FolderCategory(ValueListCategory):
     def sort_key(self, dupe, crit_value):
         value = self.extract_value(dupe)
-        if value[: len(crit_value)] == crit_value:
-            return 0
-        else:
+        # This is instead of using is_relative_to() which was added in py 3.9
+        try:
+            value.relative_to(crit_value)
+        except ValueError:
             return 1
+        return 0


 class FilenameCategory(CriterionCategory):
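
For reference, on Python 3.9+ the try/except above could be written with Path.is_relative_to(); this snippet shows the equivalence (standard-library behavior, not project code):

from pathlib import Path

value = Path("/music/album/track.mp3")
crit = Path("/music")
# is_relative_to(crit) is True exactly when relative_to(crit) does not raise ValueError.
assert value.is_relative_to(crit)
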

View File

@@ -134,7 +134,7 @@ class Scanner:
             return False
         if is_same_with_digit(refname, dupename):
             return True
-        return len(dupe.path) > len(ref.path)
+        return len(dupe.path.parts) > len(ref.path.parts)

     @staticmethod
     def get_scan_options():

@@ -164,7 +164,7 @@ class Scanner:
         toremove = set()
         last_parent_path = sortedpaths[0]
         for p in sortedpaths[1:]:
-            if p in last_parent_path:
+            if last_parent_path in p.parents:
                 toremove.add(p)
             else:
                 last_parent_path = p

View File

@@ -9,7 +9,7 @@ import os.path as op
 import logging

 import pytest
-from hscommon.path import Path
+from pathlib import Path
 import hscommon.conflict
 import hscommon.util
 from hscommon.testutil import eq_, log_calls

@@ -56,7 +56,7 @@ class TestCaseDupeGuru:
         # for this unit is pathetic. What's done is done. My approach now is to add tests for
         # every change I want to make. The blowup was caused by a missing import.
         p = Path(str(tmpdir))
-        p["foo"].open("w").close()
+        p.joinpath("foo").touch()
         monkeypatch.setattr(
             hscommon.conflict,
             "smart_copy",

@@ -71,19 +71,19 @@ class TestCaseDupeGuru:
         dgapp.copy_or_move(f, True, "some_destination", 0)
         eq_(1, len(hscommon.conflict.smart_copy.calls))
         call = hscommon.conflict.smart_copy.calls[0]
-        eq_(call["dest_path"], op.join("some_destination", "foo"))
+        eq_(call["dest_path"], Path("some_destination", "foo"))
         eq_(call["source_path"], f.path)

     def test_copy_or_move_clean_empty_dirs(self, tmpdir, monkeypatch):
         tmppath = Path(str(tmpdir))
-        sourcepath = tmppath["source"]
+        sourcepath = tmppath.joinpath("source")
         sourcepath.mkdir()
-        sourcepath["myfile"].open("w")
+        sourcepath.joinpath("myfile").touch()
         app = TestApp().app
         app.directories.add_path(tmppath)
         [myfile] = app.directories.get_files()
         monkeypatch.setattr(app, "clean_empty_dirs", log_calls(lambda path: None))
-        app.copy_or_move(myfile, False, tmppath["dest"], 0)
+        app.copy_or_move(myfile, False, tmppath.joinpath("dest"), 0)
         calls = app.clean_empty_dirs.calls
         eq_(1, len(calls))
         eq_(sourcepath, calls[0]["path"])

@@ -106,8 +106,8 @@ class TestCaseDupeGuru:
         # If the ignore_hardlink_matches option is set, don't match files hardlinking to the same
         # inode.
         tmppath = Path(str(tmpdir))
-        tmppath["myfile"].open("w").write("foo")
-        os.link(str(tmppath["myfile"]), str(tmppath["hardlink"]))
+        tmppath.joinpath("myfile").open("wt").write("foo")
+        os.link(str(tmppath.joinpath("myfile")), str(tmppath.joinpath("hardlink")))
         app = TestApp().app
         app.directories.add_path(tmppath)
         app.options["scan_type"] = ScanType.CONTENTS

@@ -153,7 +153,7 @@ class TestCaseDupeGuruCleanEmptyDirs:
         # delete_if_empty must be recursively called up in the path until it returns False
         @log_calls
         def mock_delete_if_empty(path, files_to_delete=[]):
-            return len(path) > 1
+            return len(path.parts) > 1

         monkeypatch.setattr(hscommon.util, "delete_if_empty", mock_delete_if_empty)
         # XXX This monkeypatch is temporary. will be fixed in a better monkeypatcher.

@@ -180,8 +180,8 @@ class TestCaseDupeGuruWithResults:
         self.rtable.refresh()
         tmpdir = request.getfixturevalue("tmpdir")
         tmppath = Path(str(tmpdir))
-        tmppath["foo"].mkdir()
-        tmppath["bar"].mkdir()
+        tmppath.joinpath("foo").mkdir()
+        tmppath.joinpath("bar").mkdir()
         self.app.directories.add_path(tmppath)

     def test_get_objects(self, do_setup):

@@ -424,12 +424,9 @@ class TestCaseDupeGuruRenameSelected:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        fp = open(str(p["foo bar 1"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 2"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 3"]), mode="w")
-        fp.close()
+        p.joinpath("foo bar 1").touch()
+        p.joinpath("foo bar 2").touch()
+        p.joinpath("foo bar 3").touch()
         files = fs.get_files(p)
         for f in files:
             f.is_ref = False

@@ -451,7 +448,7 @@ class TestCaseDupeGuruRenameSelected:
         g = self.groups[0]
         self.rtable.select([1])
         assert app.rename_selected("renamed")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" in names
         assert "foo bar 2" not in names
         eq_(g.dupes[0].name, "renamed")

@@ -464,7 +461,7 @@ class TestCaseDupeGuruRenameSelected:
         assert not app.rename_selected("renamed")
         msg = logging.warning.calls[0]["msg"]
         eq_("dupeGuru Warning: list index out of range", msg)
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" not in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")

@@ -477,7 +474,7 @@ class TestCaseDupeGuruRenameSelected:
         assert not app.rename_selected("foo bar 1")
         msg = logging.warning.calls[0]["msg"]
         assert msg.startswith("dupeGuru Warning: 'foo bar 1' already exists in")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "foo bar 1" in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")

@@ -488,9 +485,9 @@ class TestAppWithDirectoriesInTree:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        p["sub1"].mkdir()
-        p["sub2"].mkdir()
-        p["sub3"].mkdir()
+        p.joinpath("sub1").mkdir()
+        p.joinpath("sub2").mkdir()
+        p.joinpath("sub3").mkdir()
         app = TestApp()
         self.app = app.app
         self.dtree = app.dtree

View File

@@ -5,7 +5,7 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.testutil import TestApp as TestAppBase, CallLogger, eq_, with_app  # noqa
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.util import get_file_ext, format_size
 from hscommon.gui.column import Column
 from hscommon.jobprogress.job import nulljob, JobCancelled

@@ -86,9 +86,9 @@ class NamedObject:
             folder = "basepath"
         self._folder = Path(folder)
         self.size = size
-        self.md5partial = name
-        self.md5 = name
-        self.md5samples = name
+        self.digest_partial = name
+        self.digest = name
+        self.digest_samples = name
         if with_words:
             self.words = getwords(name)
         self.is_ref = False

@@ -111,11 +111,11 @@ class NamedObject:
     @property
     def path(self):
-        return self._folder[self.name]
+        return self._folder.joinpath(self.name)

     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent

     @property
     def extension(self):

View File

@@ -10,7 +10,7 @@ import tempfile
 import shutil

 from pytest import raises
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.testutil import eq_
 from hscommon.plat import ISWINDOWS

@@ -26,29 +26,23 @@ from ..exclude import ExcludeList, ExcludeDict
 def create_fake_fs(rootpath):
     # We have it as a separate function because other units are using it.
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
     rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
-    fp = rootpath["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
+    with rootpath.joinpath("file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("file3.test").open("wt") as fp:
+        fp.write("123")
+    with rootpath.joinpath("dir1", "file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("dir2", "file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("dir3", "file3.test").open("wt") as fp:
+        fp.write("123")
     return rootpath

@@ -60,11 +54,10 @@ def setup_module(module):
     # and another with a more complex structure.
     testpath = Path(tempfile.mkdtemp())
     module.testpath = testpath
-    rootpath = testpath["onefile"]
+    rootpath = testpath.joinpath("onefile")
     rootpath.mkdir()
-    fp = rootpath["test.txt"].open("w")
-    fp.write("test_data")
-    fp.close()
+    with rootpath.joinpath("test.txt").open("wt") as fp:
+        fp.write("test_data")
     create_fake_fs(testpath)

@@ -80,13 +73,13 @@ def test_empty():
 def test_add_path():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(1, len(d))
     assert p in d
-    assert (p["foobar"]) in d
-    assert p.parent() not in d
-    p = testpath["fs"]
+    assert (p.joinpath("foobar")) in d
+    assert p.parent not in d
+    p = testpath.joinpath("fs")
     d.add_path(p)
     eq_(2, len(d))
     assert p in d

@@ -94,18 +87,18 @@
 def test_add_path_when_path_is_already_there():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     with raises(AlreadyThereError):
         d.add_path(p)
     with raises(AlreadyThereError):
-        d.add_path(p["foobar"])
+        d.add_path(p.joinpath("foobar"))
     eq_(1, len(d))


 def test_add_path_containing_paths_already_there():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(1, len(d))
     d.add_path(testpath)
     eq_(len(d), 1)

@@ -114,7 +107,7 @@ def test_add_path_containing_paths_already_there():
 def test_add_path_non_latin(tmpdir):
     p = Path(str(tmpdir))
-    to_add = p["unicode\u201a"]
+    to_add = p.joinpath("unicode\u201a")
     os.mkdir(str(to_add))
     d = Directories()
     try:

@@ -125,25 +118,25 @@ def test_add_path_non_latin(tmpdir):
 def test_del():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     try:
         del d[1]
         assert False
     except IndexError:
         pass
-    d.add_path(testpath["fs"])
+    d.add_path(testpath.joinpath("fs"))
     del d[1]
     eq_(1, len(d))


 def test_states():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(DirectoryState.NORMAL, d.get_state(p))
     d.set_state(p, DirectoryState.REFERENCE)
     eq_(DirectoryState.REFERENCE, d.get_state(p))
-    eq_(DirectoryState.REFERENCE, d.get_state(p["dir1"]))
+    eq_(DirectoryState.REFERENCE, d.get_state(p.joinpath("dir1")))
     eq_(1, len(d.states))
     eq_(p, list(d.states.keys())[0])
     eq_(DirectoryState.REFERENCE, d.states[p])

@@ -152,7 +145,7 @@ def test_states():
 def test_get_state_with_path_not_there():
     # When the path's not there, just return DirectoryState.Normal
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(d.get_state(testpath), DirectoryState.NORMAL)

@@ -160,26 +153,26 @@ def test_states_overwritten_when_larger_directory_eat_smaller_ones():
     # ref #248
     # When setting the state of a folder, we overwrite previously set states for subfolders.
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     d.add_path(testpath)
     d.set_state(testpath, DirectoryState.REFERENCE)
     eq_(d.get_state(p), DirectoryState.REFERENCE)
-    eq_(d.get_state(p["dir1"]), DirectoryState.REFERENCE)
+    eq_(d.get_state(p.joinpath("dir1")), DirectoryState.REFERENCE)
     eq_(d.get_state(testpath), DirectoryState.REFERENCE)


 def test_get_files():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     files = list(d.get_files())
     eq_(5, len(files))
     for f in files:
-        if f.path.parent() == p["dir1"]:
+        if f.path.parent == p.joinpath("dir1"):
             assert f.is_ref
         else:
             assert not f.is_ref

@@ -193,7 +186,7 @@ def test_get_files_with_folders():
             return True

     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = list(d.get_files(fileclasses=[FakeFile]))
     # We have the 3 root files and the 3 root dirs

@@ -202,23 +195,23 @@
 def test_get_folders():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     folders = list(d.get_folders())
     eq_(len(folders), 3)
     ref = [f for f in folders if f.is_ref]
     not_ref = [f for f in folders if not f.is_ref]
     eq_(len(ref), 1)
-    eq_(ref[0].path, p["dir1"])
+    eq_(ref[0].path, p.joinpath("dir1"))
     eq_(len(not_ref), 2)
     eq_(ref[0].size, 1)


 def test_get_files_with_inherited_exclusion():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     eq_([], list(d.get_files()))

@@ -234,13 +227,13 @@ def test_save_and_load(tmpdir):
     d1.add_path(p1)
     d1.add_path(p2)
     d1.set_state(p1, DirectoryState.REFERENCE)
-    d1.set_state(p1["dir1"], DirectoryState.EXCLUDED)
+    d1.set_state(p1.joinpath("dir1"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     d1.save_to_file(tmpxml)
     d2.load_from_file(tmpxml)
     eq_(2, len(d2))
     eq_(DirectoryState.REFERENCE, d2.get_state(p1))
-    eq_(DirectoryState.EXCLUDED, d2.get_state(p1["dir1"]))
+    eq_(DirectoryState.EXCLUDED, d2.get_state(p1.joinpath("dir1")))


 def test_invalid_path():

@@ -268,7 +261,7 @@ def test_load_from_file_with_invalid_path(tmpdir):
     # This test simulates a load from file resulting in a
     # InvalidPath raise. Other directories must be loaded.
     d1 = Directories()
-    d1.add_path(testpath["onefile"])
+    d1.add_path(testpath.joinpath("onefile"))
     # Will raise InvalidPath upon loading
     p = Path(str(tmpdir.join("toremove")))
     p.mkdir()

@@ -283,11 +276,11 @@ def test_unicode_save(tmpdir):
     d = Directories()
-    p1 = Path(str(tmpdir))["hello\xe9"]
+    p1 = Path(str(tmpdir), "hello\xe9")
     p1.mkdir()
-    p1["foo\xe9"].mkdir()
+    p1.joinpath("foo\xe9").mkdir()
     d.add_path(p1)
-    d.set_state(p1["foo\xe9"], DirectoryState.EXCLUDED)
+    d.set_state(p1.joinpath("foo\xe9"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     try:
         d.save_to_file(tmpxml)

@@ -297,12 +290,12 @@ def test_get_files_refreshes_its_directories():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = d.get_files()
     eq_(6, len(list(files)))
     time.sleep(1)
-    os.remove(str(p["dir1"]["file1.test"]))
+    os.remove(str(p.joinpath("dir1", "file1.test")))
     files = d.get_files()
     eq_(5, len(list(files)))

@@ -311,15 +304,15 @@ def test_get_files_does_not_choke_on_non_existing_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
     d.add_path(p)
-    p.rmtree()
+    shutil.rmtree(str(p))
     eq_([], list(d.get_files()))


 def test_get_state_returns_excluded_by_default_for_hidden_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
-    hidden_dir_path = p[".foo"]
-    p[".foo"].mkdir()
+    hidden_dir_path = p.joinpath(".foo")
+    p.joinpath(".foo").mkdir()
     d.add_path(p)
     eq_(d.get_state(hidden_dir_path), DirectoryState.EXCLUDED)
     # But it can be overriden

@@ -331,22 +324,22 @@ def test_default_path_state_override(tmpdir):
     # It's possible for a subclass to override the default state of a path
     class MyDirectories(Directories):
         def _default_state_for_path(self, path):
-            if "foobar" in path:
+            if "foobar" in path.parts:
                 return DirectoryState.EXCLUDED

     d = MyDirectories()
     p1 = Path(str(tmpdir))
-    p1["foobar"].mkdir()
-    p1["foobar/somefile"].open("w").close()
-    p1["foobaz"].mkdir()
-    p1["foobaz/somefile"].open("w").close()
+    p1.joinpath("foobar").mkdir()
+    p1.joinpath("foobar/somefile").touch()
+    p1.joinpath("foobaz").mkdir()
+    p1.joinpath("foobaz/somefile").touch()
     d.add_path(p1)
-    eq_(d.get_state(p1["foobaz"]), DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.EXCLUDED)
+    eq_(d.get_state(p1.joinpath("foobaz")), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.EXCLUDED)
     eq_(len(list(d.get_files())), 1)  # only the 'foobaz' file is there
     # However, the default state can be changed
-    d.set_state(p1["foobar"], DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.NORMAL)
+    d.set_state(p1.joinpath("foobar"), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.NORMAL)
     eq_(len(list(d.get_files())), 2)

@@ -372,42 +365,42 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         self.d._exclude_list.add(regex)
         self.d._exclude_list.mark(regex)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
         self.d.add_path(p1)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
         # By default, subdirs should be excluded too, but this can be overridden separately
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)

     def test_exclude_refined(self, tmpdir):
         regex1 = r"^\$Recycle\.Bin$"
         self.d._exclude_list.add(regex1)
         self.d._exclude_list.mark(regex1)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somefile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["somesubdirfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"]["unwanted_subdirfile.gif"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"].mkdir()
-        p1["$Recycle.Bin"]["subdar"]["somesubdarfile.jpeg"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"]["unwanted_subdarfile.png"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somefile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "somesubdirfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "subdir", "unwanted_subdirfile.gif").touch()
+        p1.joinpath("$Recycle.Bin", "subdar").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdar", "somesubdarfile.jpeg").touch()
+        p1.joinpath("$Recycle.Bin", "subdar", "unwanted_subdarfile.png").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))
# Filter should set the default state to Excluded # Filter should set the default state to Excluded
eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
# The subdir should inherit its parent state # The subdir should inherit its parent state
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
# Override a child path's state # Override a child path's state
self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
# Parent should keep its default state, and the other child too # Parent should keep its default state, and the other child too
eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
# only the 2 files directly under the Normal directory # only the 2 files directly under the Normal directory
@@ -419,8 +412,8 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
assert "somesubdirfile.png" in files assert "somesubdirfile.png" in files
assert "unwanted_subdirfile.gif" in files assert "unwanted_subdirfile.gif" in files
# Overriding the parent should enable all children # Overriding the parent should enable all children
self.d.set_state(p1["$Recycle.Bin"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin"), DirectoryState.NORMAL)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.NORMAL)
# all files there # all files there
files = self.get_files_and_expect_num_result(6) files = self.get_files_and_expect_num_result(6)
assert "somefile.png" in files assert "somefile.png" in files
@@ -444,7 +437,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
assert self.d._exclude_list.error(regex3) is None assert self.d._exclude_list.error(regex3) is None
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
# Directory shouldn't change its state here, unless explicitely done by user # Directory shouldn't change its state here, unless explicitely done by user
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
assert "unwanted_subdirfile.gif" not in files assert "unwanted_subdirfile.gif" not in files
assert "unwanted_subdarfile.png" in files assert "unwanted_subdarfile.png" in files
@@ -453,15 +446,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
regex4 = r".*subdir$" regex4 = r".*subdir$"
self.d._exclude_list.rename(regex3, regex4) self.d._exclude_list.rename(regex3, regex4)
assert self.d._exclude_list.error(regex4) is None assert self.d._exclude_list.error(regex4) is None
p1["$Recycle.Bin"]["subdar"]["file_ending_with_subdir"].open("w").close() p1.joinpath("$Recycle.Bin", "subdar", "file_ending_with_subdir").touch()
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
files = self.get_files_and_expect_num_result(4) files = self.get_files_and_expect_num_result(4)
assert "file_ending_with_subdir" not in files assert "file_ending_with_subdir" not in files
assert "somesubdarfile.jpeg" in files assert "somesubdarfile.jpeg" in files
assert "somesubdirfile.png" not in files assert "somesubdirfile.png" not in files
assert "unwanted_subdirfile.gif" not in files assert "unwanted_subdirfile.gif" not in files
self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
files = self.get_files_and_expect_num_result(6) files = self.get_files_and_expect_num_result(6)
assert "file_ending_with_subdir" not in files assert "file_ending_with_subdir" not in files
@@ -471,9 +464,9 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
regex5 = r".*subdir.*" regex5 = r".*subdir.*"
self.d._exclude_list.rename(regex4, regex5) self.d._exclude_list.rename(regex4, regex5)
# Files containing substring should be filtered # Files containing substring should be filtered
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
# The path should not match, only the filename, the "subdir" in the directory name shouldn't matter # The path should not match, only the filename, the "subdir" in the directory name shouldn't matter
p1["$Recycle.Bin"]["subdir"]["file_which_shouldnt_match"].open("w").close() p1.joinpath("$Recycle.Bin", "subdir", "file_which_shouldnt_match").touch()
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
assert "somesubdirfile.png" not in files assert "somesubdirfile.png" not in files
assert "unwanted_subdirfile.gif" not in files assert "unwanted_subdirfile.gif" not in files
@@ -493,7 +486,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
assert self.d._exclude_list.error(regex6) is None assert self.d._exclude_list.error(regex6) is None
assert regex6 in self.d._exclude_list assert regex6 in self.d._exclude_list
# This still should not be affected # This still should not be affected
eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
# These files are under the "/subdir" directory # These files are under the "/subdir" directory
assert "somesubdirfile.png" not in files assert "somesubdirfile.png" not in files
@@ -505,20 +498,20 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
def test_japanese_unicode(self, tmpdir): def test_japanese_unicode(self, tmpdir):
p1 = Path(str(tmpdir)) p1 = Path(str(tmpdir))
p1["$Recycle.Bin"].mkdir() p1.joinpath("$Recycle.Bin").mkdir()
p1["$Recycle.Bin"]["somerecycledfile.png"].open("w").close() p1.joinpath("$Recycle.Bin", "somerecycledfile.png").touch()
p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close() p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
p1["$Recycle.Bin"]["subdir"].mkdir() p1.joinpath("$Recycle.Bin", "subdir").mkdir()
p1["$Recycle.Bin"]["subdir"]["過去白濁物語~]_カラー.jpg"].open("w").close() p1.joinpath("$Recycle.Bin", "subdir", "過去白濁物語~]_カラー.jpg").touch()
p1["$Recycle.Bin"]["思叫物語"].mkdir() p1.joinpath("$Recycle.Bin", "思叫物語").mkdir()
p1["$Recycle.Bin"]["思叫物語"]["なししろ会う前"].open("w").close() p1.joinpath("$Recycle.Bin", "思叫物語", "なししろ会う前").touch()
p1["$Recycle.Bin"]["思叫物語"]["堂~ロ"].open("w").close() p1.joinpath("$Recycle.Bin", "思叫物語", "堂~ロ").touch()
self.d.add_path(p1["$Recycle.Bin"]) self.d.add_path(p1.joinpath("$Recycle.Bin"))
regex3 = r".*物語.*" regex3 = r".*物語.*"
self.d._exclude_list.add(regex3) self.d._exclude_list.add(regex3)
self.d._exclude_list.mark(regex3) self.d._exclude_list.mark(regex3)
# print(f"get_folders(): {[x for x in self.d.get_folders()]}") # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
eq_(self.d.get_state(p1["$Recycle.Bin"]["思叫物語"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "思叫物語")), DirectoryState.EXCLUDED)
files = self.get_files_and_expect_num_result(2) files = self.get_files_and_expect_num_result(2)
assert "過去白濁物語~]_カラー.jpg" not in files assert "過去白濁物語~]_カラー.jpg" not in files
assert "なししろ会う前" not in files assert "なししろ会う前" not in files
@@ -527,7 +520,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
regex4 = r".*物語$" regex4 = r".*物語$"
self.d._exclude_list.rename(regex3, regex4) self.d._exclude_list.rename(regex3, regex4)
assert self.d._exclude_list.error(regex4) is None assert self.d._exclude_list.error(regex4) is None
self.d.set_state(p1["$Recycle.Bin"]["思叫物語"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("$Recycle.Bin", "思叫物語"), DirectoryState.NORMAL)
files = self.get_files_and_expect_num_result(5) files = self.get_files_and_expect_num_result(5)
assert "過去白濁物語~]_カラー.jpg" in files assert "過去白濁物語~]_カラー.jpg" in files
assert "なししろ会う前" in files assert "なししろ会う前" in files
@@ -539,15 +532,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
self.d._exclude_list.add(regex) self.d._exclude_list.add(regex)
self.d._exclude_list.mark(regex) self.d._exclude_list.mark(regex)
p1 = Path(str(tmpdir)) p1 = Path(str(tmpdir))
p1["foobar"].mkdir() p1.joinpath("foobar").mkdir()
p1["foobar"][".hidden_file.txt"].open("w").close() p1.joinpath("foobar", ".hidden_file.txt").touch()
p1["foobar"][".hidden_dir"].mkdir() p1.joinpath("foobar", ".hidden_dir").mkdir()
p1["foobar"][".hidden_dir"]["foobar.jpg"].open("w").close() p1.joinpath("foobar", ".hidden_dir", "foobar.jpg").touch()
p1["foobar"][".hidden_dir"][".hidden_subfile.png"].open("w").close() p1.joinpath("foobar", ".hidden_dir", ".hidden_subfile.png").touch()
self.d.add_path(p1["foobar"]) self.d.add_path(p1.joinpath("foobar"))
# It should not inherit its parent's state originally # It should not inherit its parent's state originally
eq_(self.d.get_state(p1["foobar"][".hidden_dir"]), DirectoryState.EXCLUDED) eq_(self.d.get_state(p1.joinpath("foobar", ".hidden_dir")), DirectoryState.EXCLUDED)
self.d.set_state(p1["foobar"][".hidden_dir"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("foobar", ".hidden_dir"), DirectoryState.NORMAL)
# The files should still be filtered # The files should still be filtered
files = self.get_files_and_expect_num_result(1) files = self.get_files_and_expect_num_result(1)
eq_(len(self.d._exclude_list.compiled_paths), 0) eq_(len(self.d._exclude_list.compiled_paths), 0)
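Most of the diff above is mechanical: the old `hscommon.path.Path` overloaded item access for joining, which `pathlib.Path` does not support. A minimal sketch of the equivalences these tests rely on (illustrative only, not part of the diff):

```python
from pathlib import Path

p = Path("/tmp/project")
# old hscommon style: p["dir1"]["file1.test"]
child = p.joinpath("dir1", "file1.test")  # or: p / "dir1" / "file1.test"
child.parent.mkdir(parents=True, exist_ok=True)
child.touch()  # replaces .open("w").close()
# membership tests now go through .parts, hence `"foobar" in path.parts`
print("dir1" in child.parts)  # True
```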

View File

@@ -530,7 +530,7 @@ class TestCaseGetMatches:
class TestCaseGetMatchesByContents:
-    def test_big_file_partial_hashes(self):
+    def test_big_file_partial_hashing(self):
        smallsize = 1
        bigsize = 100 * 1024 * 1024  # 100MB
        f = [
@@ -539,17 +539,17 @@ class TestCaseGetMatchesByContents:
            no("smallfoo", size=smallsize),
            no("smallbar", size=smallsize),
        ]
-        f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-        f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-        f[2].md5 = f[2].md5partial = "bleh"
-        f[3].md5 = f[3].md5partial = "bleh"
+        f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+        f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+        f[2].digest = f[2].digest_partial = "bleh"
+        f[3].digest = f[3].digest_partial = "bleh"
        r = getmatches_by_contents(f, bigsize=bigsize)
        eq_(len(r), 2)
-        # User disabled optimization for big files, compute hashes as usual
+        # User disabled optimization for big files, compute digests as usual
        r = getmatches_by_contents(f, bigsize=0)
        eq_(len(r), 2)
-        # Other file is now slightly different, md5partial is still the same
-        f[1].md5 = f[1].md5samples = "foobardiff"
+        # Other file is now slightly different, digest_partial is still the same
+        f[1].digest = f[1].digest_samples = "foobardiff"
        r = getmatches_by_contents(f, bigsize=bigsize)
        # Successfully filter it out
        eq_(len(r), 1)
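For context, the optimization this test pins down: files over `bigsize` are matched on sampled digests instead of a full-file digest. A rough sketch of the decision logic (simplified and hypothetical; only the attribute names come from the diff, the control flow here is not the actual engine code):

```python
def looks_identical(a, b, bigsize):
    # Cheapest checks first: size, then the digest of the first chunk.
    if a.size != b.size or a.digest_partial != b.digest_partial:
        return False
    if bigsize and a.size >= bigsize:
        # Big files: compare digests of a few sampled chunks rather than
        # hashing the whole file.
        return a.digest_samples == b.digest_samples
    # Small files (or optimization disabled with bigsize=0): full digest.
    return a.digest == b.digest
```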

View File

@@ -6,10 +6,18 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
-import hashlib
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
from os import urandom
-from hscommon.path import Path
+from pathlib import Path
from hscommon.testutil import eq_
from core.tests.directories_test import create_fake_fs
@@ -17,32 +25,26 @@ from .. import fs
def create_fake_fs_with_random_data(rootpath):
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
    rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
-    fp = rootpath["file1.test"].open("wb")
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
    data1 = urandom(200 * 1024)  # 200KiB
    data2 = urandom(1024 * 1024)  # 1MiB
    data3 = urandom(10 * 1024 * 1024)  # 10MiB
-    fp.write(data1)
-    fp.close()
-    fp = rootpath["file2.test"].open("wb")
-    fp.write(data2)
-    fp.close()
-    fp = rootpath["file3.test"].open("wb")
-    fp.write(data3)
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("wb")
-    fp.write(data1)
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("wb")
-    fp.write(data2)
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("wb")
-    fp.write(data3)
-    fp.close()
+    with rootpath.joinpath("file1.test").open("wb") as fp:
+        fp.write(data1)
+    with rootpath.joinpath("file2.test").open("wb") as fp:
+        fp.write(data2)
+    with rootpath.joinpath("file3.test").open("wb") as fp:
+        fp.write(data3)
+    with rootpath.joinpath("dir1", "file1.test").open("wb") as fp:
+        fp.write(data1)
+    with rootpath.joinpath("dir2", "file2.test").open("wb") as fp:
+        fp.write(data2)
+    with rootpath.joinpath("dir3", "file3.test").open("wb") as fp:
+        fp.write(data3)
    return rootpath
@@ -52,54 +54,54 @@ def test_size_aggregates_subfiles(tmpdir):
    eq_(b.size, 12)
-def test_md5_aggregate_subfiles_sorted(tmpdir):
-    # dir.allfiles can return child in any order. Thus, bundle.md5 must aggregate
-    # all files' md5 it contains, but it must make sure that it does so in the
+def test_digest_aggregate_subfiles_sorted(tmpdir):
+    # dir.allfiles can return child in any order. Thus, bundle.digest must aggregate
+    # all files' digests it contains, but it must make sure that it does so in the
    # same order everytime.
    p = create_fake_fs_with_random_data(Path(str(tmpdir)))
    b = fs.Folder(p)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5
-    md52 = fs.File(p["dir2"]["file2.test"]).md5
-    md53 = fs.File(p["dir3"]["file3.test"]).md5
-    md54 = fs.File(p["file1.test"]).md5
-    md55 = fs.File(p["file2.test"]).md5
-    md56 = fs.File(p["file3.test"]).md5
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest
+    digest4 = fs.File(p.joinpath("file1.test")).digest
+    digest5 = fs.File(p.joinpath("file2.test")).digest
+    digest6 = fs.File(p.joinpath("file3.test")).digest
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest, digest)
-def test_partial_md5_aggregate_subfile_sorted(tmpdir):
+def test_partial_digest_aggregate_subfile_sorted(tmpdir):
    p = create_fake_fs_with_random_data(Path(str(tmpdir)))
    b = fs.Folder(p)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5partial
-    md52 = fs.File(p["dir2"]["file2.test"]).md5partial
-    md53 = fs.File(p["dir3"]["file3.test"]).md5partial
-    md54 = fs.File(p["file1.test"]).md5partial
-    md55 = fs.File(p["file2.test"]).md5partial
-    md56 = fs.File(p["file3.test"]).md5partial
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5partial, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_partial
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_partial
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_partial
+    digest4 = fs.File(p.joinpath("file1.test")).digest_partial
+    digest5 = fs.File(p.joinpath("file2.test")).digest_partial
+    digest6 = fs.File(p.joinpath("file3.test")).digest_partial
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_partial, digest)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5samples
-    md52 = fs.File(p["dir2"]["file2.test"]).md5samples
-    md53 = fs.File(p["dir3"]["file3.test"]).md5samples
-    md54 = fs.File(p["file1.test"]).md5samples
-    md55 = fs.File(p["file2.test"]).md5samples
-    md56 = fs.File(p["file3.test"]).md5samples
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5samples, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_samples
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_samples
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_samples
+    digest4 = fs.File(p.joinpath("file1.test")).digest_samples
+    digest5 = fs.File(p.joinpath("file2.test")).digest_samples
+    digest6 = fs.File(p.joinpath("file3.test")).digest_samples
+    # The expected digest is the digest of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_samples, digest)
def test_has_file_attrs(tmpdir):
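The expected values above encode how a folder's digest aggregates its children: a subfolder contributes the hash of its own digest, a file contributes its digest directly, and children are taken in a stable sorted order. A standalone sketch of that scheme, reusing the xxhash-with-md5-fallback from the top of the file (assumption: ordering is by name, which matches the dir1..dir3/file1..file3 ordering in the tests):

```python
try:
    import xxhash

    hasher = xxhash.xxh128
except ImportError:
    import hashlib

    hasher = hashlib.md5  # fallback when python3-xxhash isn't packaged


def folder_digest(children):
    # children: iterable of (name, digest, is_folder) tuples
    parts = []
    for name, digest, is_folder in sorted(children):
        # A subfolder's digest gets hashed once more before concatenation.
        parts.append(hasher(digest).digest() if is_folder else digest)
    return hasher(b"".join(parts)).digest()
```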

View File

@@ -447,7 +447,7 @@ class TestCaseResultsXML:
        self.results.groups = self.groups
    def get_file(self, path):  # use this as a callback for load_from_xml
-        return [o for o in self.objects if o.path == path][0]
+        return [o for o in self.objects if str(o.path) == path][0]
    def test_save_to_xml(self):
        self.objects[0].is_ref = True
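The `str(o.path) == path` change is a direct consequence of the pathlib migration: paths loaded back from the results XML arrive as plain strings, and a `pathlib.Path` never compares equal to a string. Illustrative only:

```python
from pathlib import Path

stored = "/photos/a.jpg"          # as read back from the XML
obj_path = Path("/photos/a.jpg")  # as held on the file object
assert obj_path != stored         # Path vs str is always unequal
assert str(obj_path) == stored    # hence the coercion in the callback
```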

View File

@@ -7,7 +7,7 @@
import pytest
from hscommon.jobprogress import job
-from hscommon.path import Path
+from pathlib import Path
from hscommon.testutil import eq_
from .. import fs
@@ -22,7 +22,7 @@ class NamedObject:
        if path is None:
            path = Path(name)
        else:
-            path = Path(path)[name]
+            path = Path(path, name)
        self.name = name
        self.size = size
        self.path = path
@@ -123,19 +123,19 @@ def test_content_scan(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = f[1].md5samples = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[1].digest_samples = "bleh"
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
    eq_(len(r[0]), 2)
-    eq_(s.discarded_file_count, 0)  # don't count the different md5 as discarded!
+    eq_(s.discarded_file_count, 0)  # don't count the different digest as discarded!
def test_content_scan_compare_sizes_first(fake_fileexists):
    class MyFile(no):
        @property
-        def md5(self):
+        def digest(self):
            raise AssertionError()
    s = Scanner()
@@ -161,14 +161,14 @@ def test_ignore_file_size(fake_fileexists):
        no("largeignore1", large_size + 1),
        no("largeignore2", large_size + 1),
    ]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "smallignore"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "smallignore"
-    f[2].md5 = f[2].md5partial = f[2].md5samples = "small"
-    f[3].md5 = f[3].md5partial = f[3].md5samples = "small"
-    f[4].md5 = f[4].md5partial = f[4].md5samples = "large"
-    f[5].md5 = f[5].md5partial = f[5].md5samples = "large"
-    f[6].md5 = f[6].md5partial = f[6].md5samples = "largeignore"
-    f[7].md5 = f[7].md5partial = f[7].md5samples = "largeignore"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "smallignore"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "smallignore"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "small"
+    f[3].digest = f[3].digest_partial = f[3].digest_samples = "small"
+    f[4].digest = f[4].digest_partial = f[4].digest_samples = "large"
+    f[5].digest = f[5].digest_partial = f[5].digest_samples = "large"
+    f[6].digest = f[6].digest_partial = f[6].digest_samples = "largeignore"
+    f[7].digest = f[7].digest_partial = f[7].digest_samples = "largeignore"
    r = s.get_dupe_groups(f)
    # No ignores
@@ -197,21 +197,21 @@ def test_big_file_partial_hashes(fake_fileexists):
    s.big_file_size_threshold = bigsize
    f = [no("bigfoo", bigsize), no("bigbar", bigsize), no("smallfoo", smallsize), no("smallbar", smallsize)]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = "bleh"
-    f[3].md5 = f[3].md5partial = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = "bleh"
+    f[3].digest = f[3].digest_partial = "bleh"
    r = s.get_dupe_groups(f)
    eq_(len(r), 2)
-    # md5partial is still the same, but the file is actually different
-    f[1].md5 = f[1].md5samples = "difffoobar"
-    # here we compare the full md5s, as the user disabled the optimization
+    # digest_partial is still the same, but the file is actually different
+    f[1].digest = f[1].digest_samples = "difffoobar"
+    # here we compare the full digests, as the user disabled the optimization
    s.big_file_size_threshold = 0
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
-    # here we should compare the md5samples, and see they are different
+    # here we should compare the digest_samples, and see they are different
    s.big_file_size_threshold = bigsize
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
@@ -221,9 +221,9 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar"
-    f[2].md5 = f[2].md5partial = f[2].md5samples = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "bleh"
    s.min_match_percentage = 101
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
@@ -234,12 +234,16 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
    eq_(len(r[0]), 2)
-def test_content_scan_doesnt_put_md5_in_words_at_the_end(fake_fileexists):
+def test_content_scan_doesnt_put_digest_in_words_at_the_end(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar")]
-    f[0].md5 = f[0].md5partial = f[0].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
-    f[1].md5 = f[1].md5partial = f[1].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[0].digest = f[0].digest_partial = f[
+        0
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[1].digest = f[1].digest_partial = f[
+        1
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
    r = s.get_dupe_groups(f)
    # FIXME looks like we are missing something here?
    r[0]
@@ -568,12 +572,14 @@ def test_dont_group_files_that_dont_exist(tmpdir):
    s = Scanner()
    s.scan_type = ScanType.CONTENTS
    p = Path(str(tmpdir))
-    p["file1"].open("w").write("foo")
-    p["file2"].open("w").write("foo")
+    with p.joinpath("file1").open("w") as fp:
+        fp.write("foo")
+    with p.joinpath("file2").open("w") as fp:
+        fp.write("foo")
    file1, file2 = fs.get_files(p)
    def getmatches(*args, **kw):
-        file2.path.remove()
+        file2.path.unlink()
        return [Match(file1, file2, 100)]
    s._getmatches = getmatches
@@ -587,21 +593,21 @@ def test_folder_scan_exclude_subfolder_matches(fake_fileexists):
    s = Scanner()
    s.scan_type = ScanType.FOLDERS
    topf1 = no("top folder 1", size=42)
-    topf1.md5 = topf1.md5partial = topf1.md5samples = b"some_md5_1"
+    topf1.digest = topf1.digest_partial = topf1.digest_samples = b"some_digest__1"
    topf1.path = Path("/topf1")
    topf2 = no("top folder 2", size=42)
-    topf2.md5 = topf2.md5partial = topf2.md5samples = b"some_md5_1"
+    topf2.digest = topf2.digest_partial = topf2.digest_samples = b"some_digest__1"
    topf2.path = Path("/topf2")
    subf1 = no("sub folder 1", size=41)
-    subf1.md5 = subf1.md5partial = subf1.md5samples = b"some_md5_2"
+    subf1.digest = subf1.digest_partial = subf1.digest_samples = b"some_digest__2"
    subf1.path = Path("/topf1/sub")
    subf2 = no("sub folder 2", size=41)
-    subf2.md5 = subf2.md5partial = subf2.md5samples = b"some_md5_2"
+    subf2.digest = subf2.digest_partial = subf2.digest_samples = b"some_digest__2"
    subf2.path = Path("/topf2/sub")
    eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2])), 1)  # only top folders
    # however, if another folder matches a subfolder, keep in in the matches
    otherf = no("other folder", size=41)
-    otherf.md5 = otherf.md5partial = otherf.md5samples = b"some_md5_2"
+    otherf.digest = otherf.digest_partial = otherf.digest_samples = b"some_digest__2"
    otherf.path = Path("/otherfolder")
    eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2, otherf])), 2)
@@ -624,9 +630,9 @@ def test_dont_count_ref_files_as_discarded(fake_fileexists):
    o1 = no("foo", path="p1")
    o2 = no("foo", path="p2")
    o3 = no("foo", path="p3")
-    o1.md5 = o1.md5partial = o1.md5samples = "foobar"
-    o2.md5 = o2.md5partial = o2.md5samples = "foobar"
-    o3.md5 = o3.md5partial = o3.md5samples = "foobar"
+    o1.digest = o1.digest_partial = o1.digest_samples = "foobar"
+    o2.digest = o2.digest_partial = o2.digest_samples = "foobar"
+    o3.digest = o3.digest_partial = o3.digest_samples = "foobar"
    o1.is_ref = True
    o2.is_ref = True
    eq_(len(s.get_dupe_groups([o1, o2, o3])), 1)
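One detail worth noting in `test_content_scan_compare_sizes_first`: the `digest` property raises if it is ever read, a cheap way to assert that an expensive attribute is never touched. The same trick, reduced to a standalone example (hypothetical names, not scanner code):

```python
class FileStub:
    def __init__(self, size):
        self.size = size

    @property
    def digest(self):
        raise AssertionError("digest should not be computed for size mismatches")


def content_match(a, b):
    # Comparing sizes first means digest is never evaluated on a mismatch.
    return a.size == b.size and a.digest == b.digest


assert content_match(FileStub(1), FileStub(2)) is False  # no AssertionError raised
```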

View File

@@ -7,6 +7,12 @@
import time
import sys
import os
+import urllib.request
+import urllib.error
+import json
+import semantic_version
+import logging
+from typing import Union
from hscommon.util import format_time_decimal
@@ -64,3 +70,34 @@ def fix_surrogate_encoding(s, encoding="utf-8"):
def executable_folder():
    return os.path.dirname(os.path.abspath(sys.argv[0]))
+def check_for_update(current_version: str, include_prerelease: bool = False) -> Union[None, dict]:
+    request = urllib.request.Request(
+        "https://api.github.com/repos/arsenetar/dupeguru/releases",
+        headers={"Accept": "application/vnd.github.v3+json"},
+    )
+    try:
+        with urllib.request.urlopen(request) as response:
+            if response.status != 200:
+                logging.warning(f"Error retrieving updates. Status: {response.status}")
+                return None
+            try:
+                response_json = json.loads(response.read())
+            except json.JSONDecodeError as ex:
+                logging.warning(f"Error parsing updates. {ex.msg}")
+                return None
+    except urllib.error.URLError as ex:
+        logging.warning(f"Error retrieving updates. {ex.reason}")
+        return None
+    new_version = semantic_version.Version(current_version)
+    new_url = None
+    for release in response_json:
+        release_version = semantic_version.Version(release["name"])
+        if new_version < release_version and (include_prerelease or not release_version.prerelease):
+            new_version = release_version
+            new_url = release["html_url"]
+    if new_url is not None:
+        return {"version": new_version, "url": new_url}
+    else:
+        return None
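A possible call site for the new helper, e.g. from an about dialog (hypothetical snippet; the version string here is made up):

```python
from core.util import check_for_update

update = check_for_update("4.2.0", include_prerelease=False)
if update is not None:
    print(f"dupeGuru {update['version']} is available: {update['url']}")
else:
    print("Up to date (or the check failed; see the log).")
```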

View File

@@ -20,13 +20,8 @@ import re
import importlib
from datetime import datetime
import glob
-import sysconfig
-import modulefinder
-from setuptools import setup, Extension
from .plat import ISWINDOWS
-from .util import ensure_folder, delete_files_with_pattern
def print_and_do(cmd):
@@ -181,23 +176,6 @@ def build_dmg(app_path, destfolder):
print("Build Complete") print("Build Complete")
def copy_sysconfig_files_for_embed(destpath):
# This normally shouldn't be needed for Python 3.3+.
makefile = sysconfig.get_makefile_filename()
configh = sysconfig.get_config_h_filename()
shutil.copy(makefile, destpath)
shutil.copy(configh, destpath)
with open(op.join(destpath, "site.py"), "w") as fp:
fp.write(
"""
import os.path as op
from distutils import sysconfig
sysconfig.get_makefile_filename = lambda: op.join(op.dirname(__file__), 'Makefile')
sysconfig.get_config_h_filename = lambda: op.join(op.dirname(__file__), 'pyconfig.h')
"""
)
def add_to_pythonpath(path):
    """Adds ``path`` to both ``PYTHONPATH`` env and ``sys.path``."""
    abspath = op.abspath(path)
@@ -248,20 +226,6 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
    shutil.copy(source_path, dest_path)
def copy_qt_plugins(folder_names, dest): # This is only for Windows
from PyQt5.QtCore import QLibraryInfo
qt_plugin_dir = QLibraryInfo.location(QLibraryInfo.PluginsPath)
def ignore(path, names):
if path == qt_plugin_dir:
return [n for n in names if n not in folder_names]
else:
return [n for n in names if not n.endswith(".dll")]
shutil.copytree(qt_plugin_dir, dest, ignore=ignore)
def build_debian_changelog(
    changelogpath,
    destfile,
@@ -349,183 +313,6 @@ def read_changelog_file(filename):
    return result
class OSXAppStructure:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Contents")
self.macos = op.join(self.contents, "MacOS")
self.resources = op.join(self.contents, "Resources")
self.frameworks = op.join(self.contents, "Frameworks")
self.infoplist = op.join(self.contents, "Info.plist")
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.macos)
os.mkdir(self.resources)
os.mkdir(self.frameworks)
copy(infoplist, self.infoplist)
open(op.join(self.contents, "PkgInfo"), "wt").write("APPLxxxx")
def copy_executable(self, executable):
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.macos, self.executablename)
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_frameworks(self, *frameworks):
for path in frameworks:
framework_dest = op.join(self.frameworks, op.basename(path))
copy(path, framework_dest)
def create_osx_app_structure(
dest,
executable,
infoplist,
resources=None,
frameworks=None,
symlink_resources=False,
):
# `dest`: A path to the destination .app folder
# `executable`: the path of the executable file that goes in "MacOS"
# `infoplist`: The path to your Info.plist file.
# `resources`: A list of paths of files or folders going in the "Resources" folder.
# `frameworks`: Same as above for "Frameworks".
# `symlink_resources`: If True, will symlink resources into the structure instead of copying them.
app = OSXAppStructure(dest)
app.create(infoplist)
app.copy_executable(executable)
app.copy_resources(*resources, use_symlinks=symlink_resources)
app.copy_frameworks(*frameworks)
class OSXFrameworkStructure:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Versions", "A")
self.resources = op.join(self.contents, "Resources")
self.headers = op.join(self.contents, "Headers")
self.infoplist = op.join(self.resources, "Info.plist")
self._update_executable_path()
def _update_executable_path(self):
if not op.exists(self.infoplist):
self.executablename = self.executablepath = None
return
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.contents, self.executablename)
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.contents)
os.mkdir(self.resources)
os.mkdir(self.headers)
copy(infoplist, self.infoplist)
self._update_executable_path()
def create_symlinks(self):
# Only call this after create() and copy_executable()
os.symlink("A", op.join(self.dest, "Versions", "Current"))
os.symlink(op.relpath(self.executablepath, self.dest), op.join(self.dest, self.executablename))
os.symlink(op.relpath(self.headers, self.dest), op.join(self.dest, "Headers"))
os.symlink(op.relpath(self.resources, self.dest), op.join(self.dest, "Resources"))
def copy_executable(self, executable):
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_headers(self, *headers, use_symlinks=False):
for path in headers:
header_dest = op.join(self.headers, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), header_dest)
def copy_embeddable_python_dylib(dst):
runtime = op.join(
sysconfig.get_config_var("PYTHONFRAMEWORKPREFIX"),
sysconfig.get_config_var("LDLIBRARY"),
)
filedest = op.join(dst, "Python")
shutil.copy(runtime, filedest)
os.chmod(filedest, 0o774) # We need write permission to use install_name_tool
cmd = "install_name_tool -id @rpath/Python %s" % filedest
print_and_do(cmd)
def collect_stdlib_dependencies(script, dest_folder, extra_deps=None):
sysprefix = sys.prefix # could be a virtualenv
basesysprefix = sys.base_prefix # seems to be path to non-virtual sys
real_lib_prefix = sysconfig.get_config_var("LIBDEST") # leaving this in case it is needed
def is_stdlib_path(path):
# A module path is only a stdlib path if it's in either sys.prefix or
# sysconfig.get_config_var('prefix') (the 2 are different if we are in a virtualenv) and if
# there's no "site-package in the path.
if not path:
return False
if "site-package" in path:
return False
if not (path.startswith(sysprefix) or path.startswith(basesysprefix) or path.startswith(real_lib_prefix)):
return False
return True
ensure_folder(dest_folder)
mf = modulefinder.ModuleFinder()
mf.run_script(script)
modpaths = [mod.__file__ for mod in mf.modules.values()]
modpaths = filter(is_stdlib_path, modpaths)
for p in modpaths:
if p.startswith(real_lib_prefix):
relpath = op.relpath(p, real_lib_prefix)
elif p.startswith(sysprefix):
relpath = op.relpath(p, sysprefix)
assert relpath.startswith("lib/python3.") # we want to get rid of that lib/python3.x part
relpath = relpath[len("lib/python3.X/") :]
elif p.startswith(basesysprefix):
relpath = op.relpath(p, basesysprefix)
assert relpath.startswith("lib/python3.")
relpath = relpath[len("lib/python3.X/") :]
else:
raise AssertionError()
if relpath.startswith("lib-dynload"): # We copy .so files in lib-dynload directly in our dest
relpath = relpath[len("lib-dynload/") :]
if relpath.startswith("encodings") or relpath.startswith("distutils"):
# We force their inclusion later.
continue
dest_path = op.join(dest_folder, relpath)
ensure_folder(op.dirname(dest_path))
copy(p, dest_path)
# stringprep is used by encodings.
# We use real_lib_prefix with distutils because virtualenv messes with it and we need to refer
# to the original distutils folder.
FORCED_INCLUSION = [
"encodings",
"stringprep",
op.join(real_lib_prefix, "distutils"),
]
if extra_deps:
FORCED_INCLUSION += extra_deps
copy_packages(FORCED_INCLUSION, dest_folder)
# There's a couple of rather big exe files in the distutils folder that we absolutely don't
# need. Remove them.
delete_files_with_pattern(op.join(dest_folder, "distutils"), "*.exe")
# And, finally, create an empty "site.py" that Python needs around on startup.
open(op.join(dest_folder, "site.py"), "w").close()
def fix_qt_resource_file(path):
    # pyrcc5 under Windows, if the locale is non-english, can produce a source file with a date
    # containing accented characters. If it does, the encoding is wrong and it prevents the file
@@ -537,21 +324,3 @@ def fix_qt_resource_file(path):
    lines = [line for line in lines if not line.startswith(b"#")]
    with open(path, "wb") as fp:
        fp.write(b"\n".join(lines))
def build_cocoa_ext(extname, dest, source_files, extra_frameworks=(), extra_includes=()):
extra_link_args = ["-framework", "CoreFoundation", "-framework", "Foundation"]
for extra in extra_frameworks:
extra_link_args += ["-framework", extra]
ext = Extension(
extname,
source_files,
extra_link_args=extra_link_args,
include_dirs=extra_includes,
)
setup(script_args=["build_ext", "--inplace"], ext_modules=[ext])
# Our problem here is to get the fully qualified filename of the resulting .so but I couldn't
# find a documented way to do so. The only thing I could find is this below :(
fn = ext._file_name
assert op.exists(fn)
move(fn, op.join(dest, fn))

View File

@@ -14,7 +14,7 @@ import re
import os
import shutil
-from .path import Path, pathify
+from pathlib import Path
# This matches [123], but not [12] (3 digits being the minimum).
# It also matches [1234] [12345] etc..
@@ -52,16 +52,15 @@ def is_conflicted(name):
    return re_conflict.match(name) is not None
-@pathify
def _smart_move_or_copy(operation, source_path: Path, dest_path: Path):
    """Use move() or copy() to move and copy file with the conflict management."""
-    if dest_path.isdir() and not source_path.isdir():
-        dest_path = dest_path[source_path.name]
+    if dest_path.is_dir() and not source_path.is_dir():
+        dest_path = dest_path.joinpath(source_path.name)
    if dest_path.exists():
        filename = dest_path.name
-        dest_dir_path = dest_path.parent()
+        dest_dir_path = dest_path.parent
        newname = get_conflicted_name(os.listdir(str(dest_dir_path)), filename)
-        dest_path = dest_dir_path[newname]
+        dest_path = dest_dir_path.joinpath(newname)
    operation(str(source_path), str(dest_path))
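For context, the conflict management these helpers implement: when the destination already holds a file with the same name, the incoming file gets a `[NNN]` prefix instead of overwriting. A small usage sketch (assuming the module's public helpers keep their current signatures):

```python
from hscommon.conflict import get_conflicted_name, is_conflicted

existing = ["report.txt", "[000] report.txt"]
newname = get_conflicted_name(existing, "report.txt")
print(newname)                 # e.g. "[001] report.txt"
print(is_conflicted(newname))  # True
```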

View File

@@ -1,23 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-04-19
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import sys
import traceback
# Taken from http://bzimmer.ziclix.com/2008/12/17/python-thread-dumps/
def stacktraces():
code = []
for thread_id, stack in sys._current_frames().items():
code.append("\n# ThreadID: %s" % thread_id)
for filename, lineno, name, line in traceback.extract_stack(stack):
code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
if line:
code.append(" %s" % (line.strip()))
return "\n".join(code)

View File

@@ -42,73 +42,50 @@ def special_folder_path(special_folder, appname=None, portable=False):
try:
-    # Normally, we would simply do "from cocoa import proxy", but due to a bug in pytest (currently
-    # at v2.4.2), our test suite is broken when we do that. This below is a workaround until that
-    # bug is fixed.
-    import cocoa
-    if not hasattr(cocoa, "proxy"):
-        raise ImportError()
-    proxy = cocoa.proxy
-    _open_url = proxy.openURL_
-    _open_path = proxy.openPath_
-    _reveal_path = proxy.revealPath_
+    from PyQt5.QtCore import QUrl, QStandardPaths
+    from PyQt5.QtGui import QDesktopServices
+    from qtlib.util import get_appdata
+    from core.util import executable_folder
+    from hscommon.plat import ISWINDOWS, ISOSX
+    import subprocess
+    def _open_url(url):
+        QDesktopServices.openUrl(QUrl(url))
+    def _open_path(path):
+        url = QUrl.fromLocalFile(str(path))
+        QDesktopServices.openUrl(url)
+    def _reveal_path(path):
+        if ISWINDOWS:
+            subprocess.run(["explorer", "/select,", op.abspath(path)])
+        elif ISOSX:
+            subprocess.run(["open", "-R", op.abspath(path)])
+        else:
+            _open_path(op.dirname(str(path)))
    def _special_folder_path(special_folder, appname=None, portable=False):
        if special_folder == SpecialFolder.CACHE:
-            base = proxy.getCachePath()
+            if ISWINDOWS and portable:
+                folder = op.join(executable_folder(), "cache")
+            else:
+                folder = QStandardPaths.standardLocations(QStandardPaths.CacheLocation)[0]
        else:
-            base = proxy.getAppdataPath()
-        if not appname:
-            appname = proxy.bundleInfo_("CFBundleName")
-        return op.join(base, appname)
+            folder = get_appdata(portable)
+        return folder
except ImportError:
-    try:
-        from PyQt5.QtCore import QUrl, QStandardPaths
-        from PyQt5.QtGui import QDesktopServices
-        from qtlib.util import get_appdata
-        from core.util import executable_folder
-        from hscommon.plat import ISWINDOWS, ISOSX
-        import subprocess
-        def _open_url(url):
-            QDesktopServices.openUrl(QUrl(url))
-        def _open_path(path):
-            url = QUrl.fromLocalFile(str(path))
-            QDesktopServices.openUrl(url)
-        def _reveal_path(path):
-            if ISWINDOWS:
-                subprocess.run(["explorer", "/select,", op.abspath(path)])
-            elif ISOSX:
-                subprocess.run(["open", "-R", op.abspath(path)])
-            else:
-                _open_path(op.dirname(str(path)))
-        def _special_folder_path(special_folder, appname=None, portable=False):
-            if special_folder == SpecialFolder.CACHE:
-                if ISWINDOWS and portable:
-                    folder = op.join(executable_folder(), "cache")
-                else:
-                    folder = QStandardPaths.standardLocations(QStandardPaths.CacheLocation)[0]
-            else:
-                folder = get_appdata(portable)
-            return folder
-    except ImportError:
-        # We're either running tests, and these functions don't matter much or we're in a really
-        # weird situation. Let's just have dummy fallbacks.
-        logging.warning("Can't setup desktop functions!")
-        def _open_path(path):
-            # Dummy for tests
-            pass
-        def _reveal_path(path):
-            # Dummy for tests
-            pass
-        def _special_folder_path(special_folder, appname=None, portable=False):
-            return "/tmp"
+    # We're either running tests, and these functions don't matter much or we're in a really
+    # weird situation. Let's just have dummy fallbacks.
+    logging.warning("Can't setup desktop functions!")
+    def _open_path(path):
+        # Dummy for tests
+        pass
+    def _reveal_path(path):
+        # Dummy for tests
+        pass
+    def _special_folder_path(special_folder, appname=None, portable=False):
+        return "/tmp"

View File

@@ -1,216 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-08-05
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
from sys import maxsize as INF
from math import sqrt
VERY_SMALL = 0.0000001
class Point:
def __init__(self, x, y):
self.x = x
self.y = y
def __repr__(self):
return "<Point {:2.2f}, {:2.2f}>".format(*self)
def __iter__(self):
yield self.x
yield self.y
def distance_to(self, other):
return Line(self, other).length()
class Line:
def __init__(self, p1, p2):
self.p1 = p1
self.p2 = p2
def __repr__(self):
return "<Line {}, {}>".format(*self)
def __iter__(self):
yield self.p1
yield self.p2
def dx(self):
return self.p2.x - self.p1.x
def dy(self):
return self.p2.y - self.p1.y
def length(self):
return sqrt(self.dx() ** 2 + self.dy() ** 2)
def slope(self):
if self.dx() == 0:
return INF if self.dy() > 0 else -INF
else:
return self.dy() / self.dx()
def intersection_point(self, other):
# with help from http://paulbourke.net/geometry/lineline2d/
if abs(self.slope() - other.slope()) < VERY_SMALL:
# parallel. Even if coincident, we return nothing
return None
A, B = self
C, D = other
denom = (D.y - C.y) * (B.x - A.x) - (D.x - C.x) * (B.y - A.y)
if denom == 0:
return None
numera = (D.x - C.x) * (A.y - C.y) - (D.y - C.y) * (A.x - C.x)
numerb = (B.x - A.x) * (A.y - C.y) - (B.y - A.y) * (A.x - C.x)
mua = numera / denom
mub = numerb / denom
if (0 <= mua <= 1) and (0 <= mub <= 1):
x = A.x + mua * (B.x - A.x)
y = A.y + mua * (B.y - A.y)
return Point(x, y)
else:
return None
class Rect:
def __init__(self, x, y, w, h):
self.x = x
self.y = y
self.w = w
self.h = h
def __iter__(self):
yield self.x
yield self.y
yield self.w
yield self.h
def __repr__(self):
return "<Rect {:2.2f}, {:2.2f}, {:2.2f}, {:2.2f}>".format(*self)
@classmethod
def from_center(cls, center, width, height):
x = center.x - width / 2
y = center.y - height / 2
return cls(x, y, width, height)
@classmethod
def from_corners(cls, pt1, pt2):
x1, y1 = pt1
x2, y2 = pt2
return cls(min(x1, x2), min(y1, y2), abs(x1 - x2), abs(y1 - y2))
def center(self):
return Point(self.x + self.w / 2, self.y + self.h / 2)
def contains_point(self, point):
x, y = point
(x1, y1), (x2, y2) = self.corners()
return (x1 <= x <= x2) and (y1 <= y <= y2)
def contains_rect(self, rect):
pt1, pt2 = rect.corners()
return self.contains_point(pt1) and self.contains_point(pt2)
def corners(self):
return Point(self.x, self.y), Point(self.x + self.w, self.y + self.h)
def intersects(self, other):
r1pt1, r1pt2 = self.corners()
r2pt1, r2pt2 = other.corners()
if r1pt1.x < r2pt1.x:
xinter = r1pt2.x >= r2pt1.x
else:
xinter = r2pt2.x >= r1pt1.x
if not xinter:
return False
if r1pt1.y < r2pt1.y:
yinter = r1pt2.y >= r2pt1.y
else:
yinter = r2pt2.y >= r1pt1.y
return yinter
def lines(self):
pt1, pt4 = self.corners()
pt2 = Point(pt4.x, pt1.y)
pt3 = Point(pt1.x, pt4.y)
l1 = Line(pt1, pt2)
l2 = Line(pt2, pt4)
l3 = Line(pt3, pt4)
l4 = Line(pt1, pt3)
return l1, l2, l3, l4
def scaled_rect(self, dx, dy):
"""Returns a rect that has the same borders at self, but grown/shrunk by dx/dy on each side."""
x, y, w, h = self
x -= dx
y -= dy
w += dx * 2
h += dy * 2
return Rect(x, y, w, h)
def united(self, other):
"""Returns the bounding rectangle of this rectangle and `other`."""
# ul=upper left lr=lower right
ulcorner1, lrcorner1 = self.corners()
ulcorner2, lrcorner2 = other.corners()
corner1 = Point(min(ulcorner1.x, ulcorner2.x), min(ulcorner1.y, ulcorner2.y))
corner2 = Point(max(lrcorner1.x, lrcorner2.x), max(lrcorner1.y, lrcorner2.y))
return Rect.from_corners(corner1, corner2)
# --- Properties
@property
def top(self):
return self.y
@top.setter
def top(self, value):
self.y = value
@property
def bottom(self):
return self.y + self.h
@bottom.setter
def bottom(self, value):
self.y = value - self.h
@property
def left(self):
return self.x
@left.setter
def left(self, value):
self.x = value
@property
def right(self):
return self.x + self.w
@right.setter
def right(self, value):
self.x = value - self.w
@property
def width(self):
return self.w
@width.setter
def width(self, value):
self.w = value
@property
def height(self):
return self.h
@height.setter
def height(self, value):
self.h = value

View File

@@ -1,14 +1,11 @@
import os
import os.path as op
import shutil
-import re
import tempfile
import polib
from . import pygettext
-from .util import modified_after, dedupe, ensure_folder
-from .build import print_and_do, ensure_empty_folder
LC_MESSAGES = "LC_MESSAGES"
@@ -116,118 +113,3 @@ def normalize_all_pos(base_folder):
    for pofile in pofiles:
        p = polib.pofile(pofile)
        p.save()
# --- Cocoa
def all_lproj_paths(folder):
return files_with_ext(folder, ".lproj")
def escape_cocoa_strings(s):
return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
def unescape_cocoa_strings(s):
return s.replace("\\\\", "\\").replace('\\"', '"').replace("\\n", "\n")
def strings2pot(target, dest):
with open(target, "rt", encoding="utf-8") as fp:
contents = fp.read()
# We're reading an en.lproj file. We only care about the righthand part of the translation.
re_trans = re.compile(r'".*" = "(.*)";')
strings = re_trans.findall(contents)
if op.exists(dest):
po = polib.pofile(dest)
else:
po = polib.POFile()
for s in dedupe(strings):
s = unescape_cocoa_strings(s)
entry = po.find(s)
if entry is None:
entry = polib.POEntry(msgid=s)
po.append(entry)
# we don't know or care about a line number so we put 0
entry.occurrences.append((target, "0"))
entry.occurrences = dedupe(entry.occurrences)
po.save(dest)
def allstrings2pot(lprojpath, dest, excludes=None):
allstrings = files_with_ext(lprojpath, STRING_EXT)
if excludes:
allstrings = [p for p in allstrings if op.splitext(op.basename(p))[0] not in excludes]
for strings_path in allstrings:
strings2pot(strings_path, dest)
def po2strings(pofile, en_strings, dest):
# Takes en_strings and replace all righthand parts of "foo" = "bar"; entries with translations
# in pofile, then puts the result in dest.
po = polib.pofile(pofile)
if not modified_after(pofile, dest):
return
ensure_folder(op.dirname(dest))
print("Creating {} from {}".format(dest, pofile))
with open(en_strings, "rt", encoding="utf-8") as fp:
contents = fp.read()
re_trans = re.compile(r'(?<= = ").*(?=";\n)')
def repl(match):
s = match.group(0)
unescaped = unescape_cocoa_strings(s)
entry = po.find(unescaped)
if entry is None:
print("WARNING: Could not find entry '{}' in .po file".format(s))
return s
trans = entry.msgstr
return escape_cocoa_strings(trans) if trans else s
contents = re_trans.sub(repl, contents)
with open(dest, "wt", encoding="utf-8") as fp:
fp.write(contents)
def generate_cocoa_strings_from_code(code_folder, dest_folder):
# Uses the "genstrings" command to generate strings file from all .m files in "code_folder".
# The strings file (their name depends on the localization table used in the source) will be
# placed in "dest_folder".
# genstrings produces utf-16 files with comments. After having generated the files, we convert
# them to utf-8 and remove the comments.
ensure_empty_folder(dest_folder)
print_and_do('genstrings -o "{}" `find "{}" -name *.m | xargs`'.format(dest_folder, code_folder))
for stringsfile in os.listdir(dest_folder):
stringspath = op.join(dest_folder, stringsfile)
with open(stringspath, "rt", encoding="utf-16") as fp:
content = fp.read()
content = re.sub(r"/\*.*?\*/", "", content)
content = re.sub(r"\n{2,}", "\n", content)
# I have no idea why, but genstrings seems to have problems with "%" character in strings
# and inserts (number)$ after it. Find these bogus inserts and remove them.
content = re.sub(r"%\d\$", "%", content)
with open(stringspath, "wt", encoding="utf-8") as fp:
fp.write(content)
def generate_cocoa_strings_from_xib(xib_folder):
xibs = [op.join(xib_folder, fn) for fn in os.listdir(xib_folder) if fn.endswith(".xib")]
for xib in xibs:
dest = xib.replace(".xib", STRING_EXT)
print_and_do("ibtool {} --generate-strings-file {}".format(xib, dest))
print_and_do("iconv -f utf-16 -t utf-8 {0} | tee {0}".format(dest))
def localize_stringsfile(stringsfile, dest_root_folder):
stringsfile_name = op.basename(stringsfile)
for lang in get_langs("locale"):
pofile = op.join("locale", lang, "LC_MESSAGES", "ui.po")
cocoa_lang = PO2COCOA.get(lang, lang)
dest_lproj = op.join(dest_root_folder, cocoa_lang + ".lproj")
ensure_folder(dest_lproj)
po2strings(pofile, stringsfile, op.join(dest_lproj, stringsfile_name))
def localize_all_stringsfiles(src_folder, dest_root_folder):
stringsfiles = [op.join(src_folder, fn) for fn in os.listdir(src_folder) if fn.endswith(STRING_EXT)]
for path in stringsfiles:
localize_stringsfile(path, dest_root_folder)

View File

@@ -7,208 +7,9 @@
# http://www.gnu.org/licenses/gpl-3.0.html
import logging
-import os
-import os.path as op
-import shutil
-import sys
-from itertools import takewhile
from functools import wraps
from inspect import signature
+from pathlib import Path
class Path(tuple):
"""A handy class to work with paths.
We subclass ``tuple``, each element of the tuple represents an element of the path.
* ``Path('/foo/bar/baz')[1]`` --> ``'bar'``
* ``Path('/foo/bar/baz')[1:2]`` --> ``Path('bar/baz')``
* ``Path('/foo/bar')['baz']`` --> ``Path('/foo/bar/baz')``
* ``str(Path('/foo/bar/baz'))`` --> ``'/foo/bar/baz'``
"""
# Saves a little bit of memory usage
__slots__ = ()
def __new__(cls, value, separator=None):
def unicode_if_needed(s):
if isinstance(s, str):
return s
else:
try:
return str(s, sys.getfilesystemencoding())
except UnicodeDecodeError:
logging.warning("Could not decode %r", s)
raise
if isinstance(value, Path):
return value
if not separator:
separator = os.sep
if isinstance(value, bytes):
value = unicode_if_needed(value)
if isinstance(value, str):
if value:
if (separator not in value) and ("/" in value):
separator = "/"
value = value.split(separator)
else:
value = ()
else:
if any(isinstance(x, bytes) for x in value):
value = [unicode_if_needed(x) for x in value]
# value is a tuple/list
if any(separator in x for x in value):
# We have a component with a separator in it. Let's rejoin it, and generate another path.
return Path(separator.join(value), separator)
if (len(value) > 1) and (not value[-1]):
value = value[
:-1
] # We never want a path to end with a '' (because Path() can be called with a trailing slash ending path)
return tuple.__new__(cls, value)
def __add__(self, other):
other = Path(other)
if other and (not other[0]):
other = other[1:]
return Path(tuple.__add__(self, other))
def __contains__(self, item):
if isinstance(item, Path):
return item[: len(self)] == self
else:
return tuple.__contains__(self, item)
def __eq__(self, other):
return tuple.__eq__(self, Path(other))
def __getitem__(self, key):
if isinstance(key, slice):
if isinstance(key.start, Path):
equal_elems = list(takewhile(lambda pair: pair[0] == pair[1], zip(self, key.start)))
key = slice(len(equal_elems), key.stop, key.step)
if isinstance(key.stop, Path):
equal_elems = list(
takewhile(
lambda pair: pair[0] == pair[1],
zip(reversed(self), reversed(key.stop)),
)
)
stop = -len(equal_elems) if equal_elems else None
key = slice(key.start, stop, key.step)
return Path(tuple.__getitem__(self, key))
elif isinstance(key, (str, Path)):
return self + key
else:
return tuple.__getitem__(self, key)
def __hash__(self):
return tuple.__hash__(self)
def __ne__(self, other):
return not self.__eq__(other)
def __radd__(self, other):
return Path(other) + self
def __str__(self):
if len(self) == 1:
first = self[0]
if (len(first) == 2) and (first[1] == ":"): # Windows drive letter
return first + "\\"
elif not len(first): # root directory
return "/"
return os.sep.join(self)
def has_drive_letter(self):
if not self:
return False
first = self[0]
return (len(first) == 2) and (first[1] == ":")
def is_parent_of(self, other):
"""Whether ``other`` is a subpath of ``self``.
Almost the same as ``other in self``, but it's a bit more self-explicative and when
``other == self``, returns False.
"""
if other == self:
return False
else:
return other in self
def remove_drive_letter(self):
if self.has_drive_letter():
return self[1:]
else:
return self
def tobytes(self):
return str(self).encode(sys.getfilesystemencoding())
def parent(self):
"""Returns the parent path.
``Path('/foo/bar/baz').parent()`` --> ``Path('/foo/bar')``
"""
return self[:-1]
@property
def name(self):
"""Last element of the path (filename), with extension.
``Path('/foo/bar/baz').name`` --> ``'baz'``
"""
return self[-1]
# OS method wrappers
def exists(self):
return op.exists(str(self))
def copy(self, dest_path):
return shutil.copy(str(self), str(dest_path))
def copytree(self, dest_path, *args, **kwargs):
return shutil.copytree(str(self), str(dest_path), *args, **kwargs)
def isdir(self):
return op.isdir(str(self))
def isfile(self):
return op.isfile(str(self))
def islink(self):
return op.islink(str(self))
def listdir(self):
return [self[name] for name in os.listdir(str(self))]
def mkdir(self, *args, **kwargs):
return os.mkdir(str(self), *args, **kwargs)
def makedirs(self, *args, **kwargs):
return os.makedirs(str(self), *args, **kwargs)
def move(self, dest_path):
return shutil.move(str(self), str(dest_path))
def open(self, *args, **kwargs):
return open(str(self), *args, **kwargs)
def remove(self):
return os.remove(str(self))
def rename(self, dest_path):
return os.rename(str(self), str(dest_path))
def rmdir(self):
return os.rmdir(str(self))
def rmtree(self):
return shutil.rmtree(str(self))
def stat(self):
return os.stat(str(self))
def pathify(f):
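
The removed class duplicated much of what the standard library now provides. As a reading aid for the rest of this changeset, a minimal sketch of the pathlib equivalents used throughout the migration:

from pathlib import Path

p = Path("/foo/bar")
p.joinpath("baz")       # replaces p["baz"] and p + "baz"
p.name                  # unchanged: last component, "bar"
p.parent                # a property, replaces the old parent() method
p.exists(), p.is_dir()  # replace the exists()/isdir() wrappers
list(p.glob("*"))       # replaces p.listdir()
p.unlink()              # replaces p.remove()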

View File

@@ -15,7 +15,7 @@ from ..conflict import (
    smart_copy,
    smart_move,
)
-from ..path import Path
+from pathlib import Path
from ..testutil import eq_
@@ -71,43 +71,43 @@ class TestCaseMoveCopy:
    def do_setup(self, request):
        tmpdir = request.getfixturevalue("tmpdir")
        self.path = Path(str(tmpdir))
-        self.path["foo"].open("w").close()
-        self.path["bar"].open("w").close()
-        self.path["dir"].mkdir()
+        self.path.joinpath("foo").touch()
+        self.path.joinpath("bar").touch()
+        self.path.joinpath("dir").mkdir()
    def test_move_no_conflict(self, do_setup):
-        smart_move(self.path + "foo", self.path + "baz")
-        assert self.path["baz"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("baz"))
+        assert self.path.joinpath("baz").exists()
+        assert not self.path.joinpath("foo").exists()
    def test_copy_no_conflict(self, do_setup):  # No need to duplicate the rest of the tests... Let's just test on move
-        smart_copy(self.path + "foo", self.path + "baz")
-        assert self.path["baz"].exists()
-        assert self.path["foo"].exists()
+        smart_copy(self.path.joinpath("foo"), self.path.joinpath("baz"))
+        assert self.path.joinpath("baz").exists()
+        assert self.path.joinpath("foo").exists()
    def test_move_no_conflict_dest_is_dir(self, do_setup):
-        smart_move(self.path + "foo", self.path + "dir")
-        assert self.path["dir"]["foo"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        assert self.path.joinpath("dir", "foo").exists()
+        assert not self.path.joinpath("foo").exists()
    def test_move_conflict(self, do_setup):
-        smart_move(self.path + "foo", self.path + "bar")
-        assert self.path["[000] bar"].exists()
-        assert not self.path["foo"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("bar"))
+        assert self.path.joinpath("[000] bar").exists()
+        assert not self.path.joinpath("foo").exists()
    def test_move_conflict_dest_is_dir(self, do_setup):
-        smart_move(self.path["foo"], self.path["dir"])
-        smart_move(self.path["bar"], self.path["foo"])
-        smart_move(self.path["foo"], self.path["dir"])
-        assert self.path["dir"]["foo"].exists()
-        assert self.path["dir"]["[000] foo"].exists()
-        assert not self.path["foo"].exists()
-        assert not self.path["bar"].exists()
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        smart_move(self.path.joinpath("bar"), self.path.joinpath("foo"))
+        smart_move(self.path.joinpath("foo"), self.path.joinpath("dir"))
+        assert self.path.joinpath("dir", "foo").exists()
+        assert self.path.joinpath("dir", "[000] foo").exists()
+        assert not self.path.joinpath("foo").exists()
+        assert not self.path.joinpath("bar").exists()
    def test_copy_folder(self, tmpdir):
        # smart_copy also works on folders
        path = Path(str(tmpdir))
-        path["foo"].mkdir()
-        path["bar"].mkdir()
-        smart_copy(path["foo"], path["bar"])  # no crash
-        assert path["[000] bar"].exists()
+        path.joinpath("foo").mkdir()
+        path.joinpath("bar").mkdir()
+        smart_copy(path.joinpath("foo"), path.joinpath("bar"))  # no crash
+        assert path.joinpath("[000] bar").exists()

View File

@@ -6,261 +6,8 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
-import sys
-import os
-import pytest
-from ..path import Path, pathify
-from ..testutil import eq_
+from ..path import pathify
+from pathlib import Path
@pytest.fixture
def force_ossep(request):
monkeypatch = request.getfixturevalue("monkeypatch")
monkeypatch.setattr(os, "sep", "/")
def test_empty(force_ossep):
path = Path("")
eq_("", str(path))
eq_(0, len(path))
path = Path(())
eq_("", str(path))
eq_(0, len(path))
def test_single(force_ossep):
path = Path("foobar")
eq_("foobar", path)
eq_(1, len(path))
def test_multiple(force_ossep):
path = Path("foo/bar")
eq_("foo/bar", path)
eq_(2, len(path))
def test_init_with_tuple_and_list(force_ossep):
path = Path(("foo", "bar"))
eq_("foo/bar", path)
path = Path(["foo", "bar"])
eq_("foo/bar", path)
def test_init_with_invalid_value(force_ossep):
try:
Path(42)
assert False
except TypeError:
pass
def test_access(force_ossep):
path = Path("foo/bar/bleh")
eq_("foo", path[0])
eq_("foo", path[-3])
eq_("bar", path[1])
eq_("bar", path[-2])
eq_("bleh", path[2])
eq_("bleh", path[-1])
def test_slicing(force_ossep):
path = Path("foo/bar/bleh")
subpath = path[:2]
eq_("foo/bar", subpath)
assert isinstance(subpath, Path)
def test_parent(force_ossep):
path = Path("foo/bar/bleh")
subpath = path.parent()
eq_("foo/bar", subpath)
assert isinstance(subpath, Path)
def test_filename(force_ossep):
path = Path("foo/bar/bleh.ext")
eq_(path.name, "bleh.ext")
def test_deal_with_empty_components(force_ossep):
"""Keep ONLY a leading space, which means we want a leading slash."""
eq_("foo//bar", str(Path(("foo", "", "bar"))))
eq_("/foo/bar", str(Path(("", "foo", "bar"))))
eq_("foo/bar", str(Path("foo/bar/")))
def test_old_compare_paths(force_ossep):
eq_(Path("foobar"), Path("foobar"))
eq_(Path("foobar/"), Path("foobar\\", "\\"))
eq_(Path("/foobar/"), Path("\\foobar\\", "\\"))
eq_(Path("/foo/bar"), Path("\\foo\\bar", "\\"))
eq_(Path("/foo/bar"), Path("\\foo\\bar\\", "\\"))
assert Path("/foo/bar") != Path("\\foo\\foo", "\\")
# We also have to test __ne__
assert not (Path("foobar") != Path("foobar"))
assert Path("/a/b/c.x") != Path("/a/b/c.y")
def test_old_split_path(force_ossep):
eq_(Path("foobar"), ("foobar",))
eq_(Path("foo/bar"), ("foo", "bar"))
eq_(Path("/foo/bar/"), ("", "foo", "bar"))
eq_(Path("\\foo\\bar", "\\"), ("", "foo", "bar"))
def test_representation(force_ossep):
eq_("('foo', 'bar')", repr(Path(("foo", "bar"))))
def test_add(force_ossep):
eq_("foo/bar/bar/foo", Path(("foo", "bar")) + Path("bar/foo"))
eq_("foo/bar/bar/foo", Path("foo/bar") + "bar/foo")
eq_("foo/bar/bar/foo", Path("foo/bar") + ("bar", "foo"))
eq_("foo/bar/bar/foo", ("foo", "bar") + Path("bar/foo"))
eq_("foo/bar/bar/foo", "foo/bar" + Path("bar/foo"))
# Invalid concatenation
try:
Path(("foo", "bar")) + 1
assert False
except TypeError:
pass
def test_path_slice(force_ossep):
foo = Path("foo")
bar = Path("bar")
foobar = Path("foo/bar")
eq_("bar", foobar[foo:])
eq_("foo", foobar[:bar])
eq_("foo/bar", foobar[bar:])
eq_("foo/bar", foobar[:foo])
eq_((), foobar[foobar:])
eq_((), foobar[:foobar])
abcd = Path("a/b/c/d")
a = Path("a")
d = Path("d")
z = Path("z")
eq_("b/c", abcd[a:d])
eq_("b/c/d", abcd[a : d + z])
eq_("b/c", abcd[a : z + d])
eq_("a/b/c/d", abcd[:z])
def test_add_with_root_path(force_ossep):
"""if I perform /a/b/c + /d/e/f, I want /a/b/c/d/e/f, not /a/b/c//d/e/f"""
eq_("/foo/bar", str(Path("/foo") + Path("/bar")))
def test_create_with_tuple_that_have_slash_inside(force_ossep, monkeypatch):
eq_(("", "foo", "bar"), Path(("/foo", "bar")))
monkeypatch.setattr(os, "sep", "\\")
eq_(("", "foo", "bar"), Path(("\\foo", "bar")))
def test_auto_decode_os_sep(force_ossep, monkeypatch):
"""Path should decode any either / or os.sep, but always encode in os.sep."""
eq_(("foo\\bar", "bleh"), Path("foo\\bar/bleh"))
monkeypatch.setattr(os, "sep", "\\")
eq_(("foo", "bar/bleh"), Path("foo\\bar/bleh"))
path = Path("foo/bar")
eq_(("foo", "bar"), path)
eq_("foo\\bar", str(path))
def test_contains(force_ossep):
p = Path(("foo", "bar"))
assert Path(("foo", "bar", "bleh")) in p
assert Path(("foo", "bar")) in p
assert "foo" in p
assert "bleh" not in p
assert Path("foo") not in p
def test_is_parent_of(force_ossep):
assert Path(("foo", "bar")).is_parent_of(Path(("foo", "bar", "bleh")))
assert not Path(("foo", "bar")).is_parent_of(Path(("foo", "baz")))
assert not Path(("foo", "bar")).is_parent_of(Path(("foo", "bar")))
def test_windows_drive_letter(force_ossep):
p = Path(("c:",))
eq_("c:\\", str(p))
def test_root_path(force_ossep):
p = Path("/")
eq_("/", str(p))
def test_str_encodes_unicode_to_getfilesystemencoding(force_ossep):
p = Path(("foo", "bar\u00e9"))
eq_("foo/bar\u00e9".encode(sys.getfilesystemencoding()), p.tobytes())
def test_unicode(force_ossep):
p = Path(("foo", "bar\u00e9"))
eq_("foo/bar\u00e9", str(p))
def test_str_repr_of_mix_between_non_ascii_str_and_unicode(force_ossep):
u = "foo\u00e9"
encoded = u.encode(sys.getfilesystemencoding())
p = Path((encoded, "bar"))
print(repr(tuple(p)))
eq_("foo\u00e9/bar".encode(sys.getfilesystemencoding()), p.tobytes())
def test_path_of_a_path_returns_self(force_ossep):
# if Path() is called with a path as value, just return value.
p = Path("foo/bar")
assert Path(p) is p
def test_getitem_str(force_ossep):
# path['something'] returns the child path corresponding to the name
p = Path("/foo/bar")
eq_(p["baz"], Path("/foo/bar/baz"))
def test_getitem_path(force_ossep):
# path[Path('something')] returns the child path corresponding to the name (or subpath)
p = Path("/foo/bar")
eq_(p[Path("baz/bleh")], Path("/foo/bar/baz/bleh"))
@pytest.mark.xfail(reason="pytest's capture mechanism is flaky, I have to investigate")
def test_log_unicode_errors(force_ossep, monkeypatch, capsys):
# When an there's a UnicodeDecodeError on path creation, log it so it can be possible
# to debug the cause of it.
monkeypatch.setattr(sys, "getfilesystemencoding", lambda: "ascii")
with pytest.raises(UnicodeDecodeError):
Path(["", b"foo\xe9"])
out, err = capsys.readouterr()
assert repr(b"foo\xe9") in err
def test_has_drive_letter(monkeypatch):
monkeypatch.setattr(os, "sep", "\\")
p = Path("foo\\bar")
assert not p.has_drive_letter()
p = Path("C:\\")
assert p.has_drive_letter()
p = Path("z:\\foo")
assert p.has_drive_letter()
def test_remove_drive_letter(monkeypatch):
monkeypatch.setattr(os, "sep", "\\")
p = Path("foo\\bar")
eq_(p.remove_drive_letter(), Path("foo\\bar"))
p = Path("C:\\")
eq_(p.remove_drive_letter(), Path(""))
p = Path("z:\\foo")
eq_(p.remove_drive_letter(), Path("foo"))
def test_pathify():

View File

@@ -11,7 +11,7 @@ from io import StringIO
from pytest import raises
from ..testutil import eq_
-from ..path import Path
+from pathlib import Path
from ..util import (
    nonone,
    tryint,
@@ -245,30 +245,30 @@ class TestCaseDeleteIfEmpty:
    def test_not_empty(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].mkdir()
+        testpath.joinpath("foo").mkdir()
        assert not delete_if_empty(testpath)
        assert testpath.exists()
    def test_with_files_to_delete(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].open("w")
-        testpath["bar"].open("w")
+        testpath.joinpath("foo").touch()
+        testpath.joinpath("bar").touch()
        assert delete_if_empty(testpath, ["foo", "bar"])
        assert not testpath.exists()
    def test_directory_in_files_to_delete(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].mkdir()
+        testpath.joinpath("foo").mkdir()
        assert not delete_if_empty(testpath, ["foo"])
        assert testpath.exists()
    def test_delete_files_to_delete_only_if_dir_is_empty(self, tmpdir):
        testpath = Path(str(tmpdir))
-        testpath["foo"].open("w")
-        testpath["bar"].open("w")
+        testpath.joinpath("foo").touch()
+        testpath.joinpath("bar").touch()
        assert not delete_if_empty(testpath, ["foo"])
        assert testpath.exists()
-        assert testpath["foo"].exists()
+        assert testpath.joinpath("foo").exists()
    def test_doesnt_exist(self):
        # When the 'path' doesn't exist, just do nothing.
@@ -276,8 +276,8 @@ class TestCaseDeleteIfEmpty:
    def test_is_file(self, tmpdir):
        # When 'path' is a file, do nothing.
-        p = Path(str(tmpdir)) + "filename"
-        p.open("w").close()
+        p = Path(str(tmpdir)).joinpath("filename")
+        p.touch()
        delete_if_empty(p)  # no crash
    def test_ioerror(self, tmpdir, monkeypatch):

View File

@@ -129,18 +129,6 @@ def install_gettext_trans(base_folder, lang):
    installed_lang = lang
-def install_gettext_trans_under_cocoa():
-    from cocoa import proxy
-    res_folder = proxy.getResourcePath()
-    base_folder = op.join(res_folder, "locale")
-    current_lang = proxy.systemLang()
-    install_gettext_trans(base_folder, current_lang)
-    localename = get_locale_name(current_lang)
-    if localename is not None:
-        locale.setlocale(locale.LC_ALL, localename)
def install_gettext_trans_under_qt(base_folder, lang=None):
    # So, we install the gettext locale, great, but we also should try to install qt_*.qm if
    # available so that strings that are inside Qt itself over which I have no control are in the
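
For context, loading Qt's own qt_*.qm catalog from PyQt5 typically looks like this minimal sketch (the function body itself is outside this hunk):

from PyQt5.QtCore import QCoreApplication, QLibraryInfo, QLocale, QTranslator

qt_translator = QTranslator()  # keep a reference alive for the app's lifetime
qm_dir = QLibraryInfo.location(QLibraryInfo.TranslationsPath)
if qt_translator.load(QLocale(), "qt", "_", qm_dir):
    QCoreApplication.installTranslator(qt_translator)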

View File

@@ -15,7 +15,8 @@ import glob
import shutil
from datetime import timedelta
-from .path import Path, pathify, log_io_error
+from pathlib import Path
+from .path import pathify, log_io_error
def nonone(value, replace_value):
@@ -354,13 +355,13 @@ def find_in_path(name, paths=None):
@pathify
def delete_if_empty(path: Path, files_to_delete=[]):
    """Deletes the directory at 'path' if it is empty or if it only contains files_to_delete."""
-    if not path.exists() or not path.isdir():
+    if not path.exists() or not path.is_dir():
        return
-    contents = path.listdir()
-    if any(p for p in contents if (p.name not in files_to_delete) or p.isdir()):
+    contents = list(path.glob("*"))
+    if any(p for p in contents if (p.name not in files_to_delete) or p.is_dir()):
        return False
    for p in contents:
-        p.remove()
+        p.unlink()
    path.rmdir()
    return True
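
Typical use of the migrated helper, assuming it stays importable as hscommon.util and using hypothetical paths:

from pathlib import Path
from hscommon.util import delete_if_empty

# Removes the folder only if nothing but the named junk files remain in it.
if delete_if_empty(Path("/tmp/staging"), ["Thumbs.db", ".DS_Store"]):
    print("staging folder removed")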

View File

@@ -1,5 +1,7 @@
## How to build dupeGuru for macOS
+These instructions are for the Qt version of the UI on macOS.
+*Note: The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa and is no longer "supported".*
### Prerequisites
- [Python 3.7+][python]

View File

@@ -10,7 +10,7 @@ Vcs-Git: https://github.com/arsenetar/dupeguru.git
Package: {pkgname}
Architecture: {arch}
-Depends: ${shlibs:Depends}, python3 (>=3.7), python3-pyqt5, python3-mutagen
+Depends: ${shlibs:Depends}, python3 (>=3.7), python3-pyqt5, python3-mutagen, python3-semantic-version
Provides: dupeguru-se, dupeguru-me, dupeguru-pe
Replaces: dupeguru-se, dupeguru-me, dupeguru-pe
Conflicts: dupeguru-se, dupeguru-me, dupeguru-pe

View File

@@ -6,18 +6,11 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
-from PyQt5.QtCore import Qt, QCoreApplication
+from PyQt5.QtCore import Qt, QCoreApplication, QTimer
from PyQt5.QtGui import QPixmap, QFont
-from PyQt5.QtWidgets import (
-    QDialog,
-    QDialogButtonBox,
-    QSizePolicy,
-    QHBoxLayout,
-    QVBoxLayout,
-    QLabel,
-    QApplication,
-)
+from PyQt5.QtWidgets import QDialog, QDialogButtonBox, QSizePolicy, QHBoxLayout, QVBoxLayout, QLabel
+from core.util import check_for_update
from qtlib.util import move_to_screen_center
from hscommon.trans import trget
@@ -31,61 +24,56 @@ class AboutBox(QDialog):
        self.app = app
        self._setupUi()
-        self.buttonBox.accepted.connect(self.accept)
-        self.buttonBox.rejected.connect(self.reject)
+        self.button_box.accepted.connect(self.accept)
+        self.button_box.rejected.connect(self.reject)
    def _setupUi(self):
        self.setWindowTitle(tr("About {}").format(QCoreApplication.instance().applicationName()))
-        self.resize(400, 290)
        size_policy = QSizePolicy(QSizePolicy.Fixed, QSizePolicy.Fixed)
-        size_policy.setHorizontalStretch(0)
-        size_policy.setVerticalStretch(0)
-        size_policy.setHeightForWidth(self.sizePolicy().hasHeightForWidth())
        self.setSizePolicy(size_policy)
-        self.horizontalLayout = QHBoxLayout(self)
-        self.logoLabel = QLabel(self)
-        self.logoLabel.setPixmap(QPixmap(":/%s_big" % self.app.LOGO_NAME))
-        self.horizontalLayout.addWidget(self.logoLabel)
-        self.verticalLayout = QVBoxLayout()
-        self.nameLabel = QLabel(self)
+        main_layout = QHBoxLayout(self)
+        logo_label = QLabel()
+        logo_label.setPixmap(QPixmap(":/%s_big" % self.app.LOGO_NAME))
+        main_layout.addWidget(logo_label)
+        detail_layout = QVBoxLayout()
+        name_label = QLabel()
        font = QFont()
        font.setWeight(75)
        font.setBold(True)
-        self.nameLabel.setFont(font)
-        self.nameLabel.setText(QCoreApplication.instance().applicationName())
-        self.verticalLayout.addWidget(self.nameLabel)
-        self.versionLabel = QLabel(self)
-        self.versionLabel.setText(tr("Version {}").format(QCoreApplication.instance().applicationVersion()))
-        self.verticalLayout.addWidget(self.versionLabel)
-        self.label_3 = QLabel(self)
-        self.verticalLayout.addWidget(self.label_3)
-        self.label_3.setText(tr("Licensed under GPLv3"))
-        self.label = QLabel(self)
-        font = QFont()
-        font.setWeight(75)
-        font.setBold(True)
-        self.label.setFont(font)
-        self.verticalLayout.addWidget(self.label)
-        self.buttonBox = QDialogButtonBox(self)
-        self.buttonBox.setOrientation(Qt.Horizontal)
-        self.buttonBox.setStandardButtons(QDialogButtonBox.Ok)
-        self.verticalLayout.addWidget(self.buttonBox)
-        self.horizontalLayout.addLayout(self.verticalLayout)
+        name_label.setFont(font)
+        name_label.setText(QCoreApplication.instance().applicationName())
+        detail_layout.addWidget(name_label)
+        version_label = QLabel()
+        version_label.setText(tr("Version {}").format(QCoreApplication.instance().applicationVersion()))
+        detail_layout.addWidget(version_label)
+        self.update_label = QLabel(tr("Checking for updates..."))
+        self.update_label.setTextInteractionFlags(Qt.TextBrowserInteraction)
+        self.update_label.setOpenExternalLinks(True)
+        detail_layout.addWidget(self.update_label)
+        license_label = QLabel()
+        license_label.setText(tr("Licensed under GPLv3"))
+        detail_layout.addWidget(license_label)
+        spacer_label = QLabel()
+        spacer_label.setFont(font)
+        detail_layout.addWidget(spacer_label)
+        self.button_box = QDialogButtonBox()
+        self.button_box.setOrientation(Qt.Horizontal)
+        self.button_box.setStandardButtons(QDialogButtonBox.Ok)
+        detail_layout.addWidget(self.button_box)
+        main_layout.addLayout(detail_layout)
+    def _check_for_update(self):
+        update = check_for_update(QCoreApplication.instance().applicationVersion(), include_prerelease=False)
+        if update is None:
+            self.update_label.setText(tr("No update available."))
+        else:
+            self.update_label.setText(
+                tr('New version {} available, download <a href="{}">here</a>.').format(update["version"], update["url"])
+            )
    def showEvent(self, event):
+        self.update_label.setText(tr("Checking for updates..."))
        # have to do this here as the frameGeometry is not correct until shown
        move_to_screen_center(self)
        super().showEvent(event)
+        QTimer.singleShot(0, self._check_for_update)
-if __name__ == "__main__":
-    import sys
-    app = QApplication([])
-    QCoreApplication.setOrganizationName("Hardcoded Software")
-    QCoreApplication.setApplicationName("FooApp")
-    QCoreApplication.setApplicationVersion("1.2.3")
-    app.LOGO_NAME = ""
-    dialog = AboutBox(None, app)
-    dialog.show()
-    sys.exit(app.exec_())
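
check_for_update() itself is not part of this hunk; a hedged sketch of what a GitHub-releases-based implementation matching the call site could look like (the actual core.util code may differ):

import json
import urllib.request
from semantic_version import Version

# Hypothetical constant; the real code may build this URL differently.
RELEASES_URL = "https://api.github.com/repos/arsenetar/dupeguru/releases"

def check_for_update(current_version, include_prerelease=False):
    with urllib.request.urlopen(RELEASES_URL) as response:
        releases = json.load(response)
    current = Version(current_version)
    candidates = []
    for release in releases:
        if release["prerelease"] and not include_prerelease:
            continue
        version = Version(release["tag_name"].lstrip("v"))
        if version > current:
            candidates.append({"version": str(version), "url": release["html_url"]})
    # Highest version newer than the running one; None means up to date.
    return max(candidates, key=lambda c: Version(c["version"]), default=None)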

View File

@@ -4,18 +4,30 @@ msgstr ""
"Content-Type: text/plain; charset=utf-8\n" "Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: utf-8\n" "Content-Transfer-Encoding: utf-8\n"
#: qtlib\about_box.py:38 #: qtlib\about_box.py:31
msgid "About {}" msgid "About {}"
msgstr "" msgstr ""
#: qtlib\about_box.py:58 #: qtlib\about_box.py:47
msgid "Version {}" msgid "Version {}"
msgstr "" msgstr ""
#: qtlib\about_box.py:62 #: qtlib\about_box.py:49 qtlib\about_box.py:75
msgid "Checking for updates..."
msgstr ""
#: qtlib\about_box.py:54
msgid "Licensed under GPLv3" msgid "Licensed under GPLv3"
msgstr "" msgstr ""
#: qtlib\about_box.py:68
msgid "No update available."
msgstr ""
#: qtlib\about_box.py:71
msgid "New version {} available, download <a href=\"{}\">here</a>."
msgstr ""
#: qtlib\error_report_dialog.py:50 #: qtlib\error_report_dialog.py:50
msgid "Error Report" msgid "Error Report"
msgstr "" msgstr ""

View File

@@ -1,7 +1,9 @@
+distro>=1.5.0
+mutagen>=1.44.0
+polib>=1.1.0
+PyQt5 >=5.14.1,<6.0; sys_platform != 'linux'
+pywin32>=228; sys_platform == 'win32'
+semantic-version>=2.0.0,<3.0.0
Send2Trash>=1.3.0
sphinx>=3.0.0
-polib>=1.1.0
-mutagen>=1.44.0
-distro>=1.5.0
-PyQt5 >=5.14.1,<6.0; sys_platform != 'linux'
-pywin32>=228; sys_platform == 'win32'
+xxhash>=3.0.0,<4.0.0

View File

@@ -34,6 +34,8 @@ install_requires =
    distro>=1.5.0
    PyQt5 >=5.14.1,<6.0; sys_platform != 'linux'
    pywin32>=228; sys_platform == 'win32'
+    semantic-version>=2.0.0,<3.0.0
+    xxhash>=3.0.0,<4.0.0
setup_requires =
    sphinx>=3.0.0
    polib>=1.1.0