- Add "safe" existence check to files which catches OSErrors that may
occur when trying to stat files
- Use "safe" existence check during final existence check
- Remove lock on read operations, only needed for write operations
- Change to use context manager for sqlite connection
- Remove long lived cursor object and use short lived cursors instead
Fixes#1080
- Add exclude pattern for flake8 when running with pre-commit as it does
not fully honor the exclude paths.
- Cleanup exclude paths for flake8 in tox.ini
- Re-enable line length check and correct three affected files
- Add option to include file existence check at end of scan, speeds up
end of scan operation time considerably, however if user has removed
or moved files since starting a scan there could be later errors when
interacting with results. Defaults to existing behavior of including
the check, until it can be verified later dialogs and actions handle
non-existent items better.
- Add option to ignore differences in mtime when checking hash cache.
Option is present in advanced tab of preferences. Closes#1022.
- Regenerate pot files for translations
- Move the try/except of cache db calls to the calls themselves.
- Add some additional information to logging statements on cache db
exception to improve troubleshooting.
- Add Callable type to hasher (should realy be more specific...)
- Add type hint to COLUMNS in qtlib/table.py
- Use Qt.ItemFlag.ItemIsEnabled instead of Qt.itemIsEnabled in qtlib/table.py
- Update to get size and mtime at time of class creation when os.DirEntry is used for initialization.
- Folders still calculate size later for folder scans.
- Ref #962, #959
- Change to use os.scandir() instead of os.walk() to leverage DirEntry objects.
- Avoids extra calls to stat() on files during fs.can_handle()
- See 3x speed improvement on Windows in some cases
commit 8b15fe9a502ebf4841c6529e7098cef03a6a5e6f
Author: Andrew Senetar <arsenetar@gmail.com>
Date: Sun Mar 27 23:48:15 2022 -0500
Finish up changes to copy_or_move
commit 21f6a32cf3186a400af8f30e67ad2743dc9a49bd
Author: Andrew Senetar <arsenetar@gmail.com>
Date: Thu Mar 17 23:56:52 2022 -0500
Migrate from hscommon.path to pathlib
- Part one, this gets all hscommon and core tests passing
- App appears to be able to load directories and complete scans, need further testing
- app.py copy_or_move needs some additional work
commit 5eb515f666bfa1ff06c2e96bdc351a4b7456580e
Author: Andrew Senetar <arsenetar@gmail.com>
Date: Sun Mar 27 22:19:39 2022 -0500
Add fallback to md5 if xxhash not available
Mainly here for the case when distributions have not packaged python3-xxhash.
commit 51b18d4c84
Author: Andrew Senetar <arsenetar@gmail.com>
Date: Sat Mar 19 15:25:46 2022 -0500
Switch file hashing to xxhash instead of md5
- Improves performance significantly in some cases
- Add xxhash to requirements.txt and sort requirements
- Rename md5 based members to digest
- Update all tests to use new member names and hashing methods
- Update hash db code to upgrade schema
NOTE: May consider supporting multiple hashing algorithms in the future.
Computing 3 hash samples for files less than 3MiB (3 * CHUNK_SIZE) is not efficient since spans of later samples would overlap a previous one.
Therefore we can simply return the hash of the entire small file instead.
* Instead of keeping md5 samples separate, merge them as one hash computed from the various selected chunks we picked.
* We don't need to keep a boolean to see whether or not the user chose to optimize; we can simply compare the value of the threshold, since 0 means no optimization currently active.
* Big files above the user selected threshold can be partially hashed in 3 places.
* If the user is willing to take the risk, we consider files with identical md5samples as being identical.
- Format all files with black
- Update tox.ini flake8 arguments to be compatible
- Add black to requirements-extra.txt
- Reduce ignored flake8 rules and fix a few violations
Refactored dupeGuru to make hscommon.path's API a bit close to pathlib's
API. It's not 100% compatible yet, but it's much better than before.
This is more of a hscommon refactoring than a dupeguru one, but since
duepGuru is the main user of Path, it was the driver behind the
refactoring.
This refactoring also see the introduction of @pathify, which ensure
Path arguments. Previously, we were often unsure of whether the caller
of a function was passing a Path or a str. This problem is now solved
and this allows us to remove hscommon.io, an ill-conceived attempt to
solve that same ambiguity problem.
Fixes#235.