mirror of https://github.com/arsenetar/dupeguru.git synced 2026-01-25 16:11:39 +00:00

Compare commits


162 Commits
4.1.1 ... 4.3.0

Author SHA1 Message Date
360dceca7b Update to version 4.3.0, update changelog 2022-06-30 23:27:14 -05:00
92b27801c3 Update translations, remove iphoto_plist.py 2022-06-30 23:03:40 -05:00
Marcus Yanello
b9aabb8545 Redirect stdout from custom command to the log files (#1008)
Send the logs for the custom command subprocess to the logs
Closes #1007
2022-06-13 21:04:40 -05:00
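
A minimal sketch of the approach in this commit, assuming hypothetical names (LOG_PATH, run_custom_command); the actual dupeGuru code differs:

    # Run a user-defined command and capture its output in a log file
    # instead of discarding it.
    import subprocess

    LOG_PATH = "custom_command.log"  # hypothetical log location

    def run_custom_command(cmd):
        # Append both stdout and stderr of the subprocess to the log file.
        with open(LOG_PATH, "a") as log_file:
            result = subprocess.run(cmd, stdout=log_file, stderr=subprocess.STDOUT)
        return result.returncode
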
d5eeab4a17 Additional type hints in hscommon 2022-05-11 00:50:34 -05:00
7865e4aeac Type hinting hscommon & cleanup 2022-05-09 23:36:39 -05:00
58863b1728 Change to use a real temporary directory for test
app_test was not using a real temporary location originally
2022-05-09 01:46:42 -05:00
e382683f66 Replace all relative imports 2022-05-09 01:40:08 -05:00
f7ed1c801c Add type hinting to desktop.py 2022-05-09 01:15:25 -05:00
f587c7b5d8 Removed unused code in hscommon/util
Also added type hints throughout
2022-05-09 00:47:57 -05:00
40ff40bea8 Move create_qsettings() out of preferences
- Load order was impacting translations
- Fix by moving create_qsettings() for now
2022-05-08 20:33:31 -05:00
7a44c72a0a Complete removal of qtlib locale files 2022-05-08 19:52:25 -05:00
66aff9f74e Update pot files
This "moves" the translation points from qtlib.pot to ui.pot.
Needs further updates to propagate across.
2022-05-08 19:28:37 -05:00
5451f55219 Move qtlib localization files to top level 2022-05-08 19:23:13 -05:00
36280b01e6 Finish moving all qtlib py files to qt 2022-05-08 19:22:08 -05:00
18359c3ea6 Start flattening Qtlib into qt
- Remove app.py from qtlib (unused)
- Remove .gitignore from qtlib (unnecessary)
- Move contents of preferences.py in qtlib to qt, clean up references
- Simplify language dropdown code
2022-05-08 18:51:10 -05:00
0a4e61edf5 Additional cleanup per mypy
- Add Callable type to hasher (should really be more specific...)
- Add type hint to COLUMNS in qtlib/table.py
- Use Qt.ItemFlag.ItemIsEnabled instead of Qt.itemIsEnabled in qtlib/table.py
2022-04-30 05:16:46 -05:00
d73a85b82e Add type hints for compiled modules 2022-04-30 05:11:54 -05:00
81c593399e Format changes with black 2022-04-27 20:59:20 -05:00
6a732a79a8 Remove old tx config 2022-04-27 20:58:30 -05:00
63dd4d4561 Apply pyupgrade changes 2022-04-27 20:53:12 -05:00
e0061d7bc1 Fix #989, typo in debian control file 2022-04-02 16:43:19 -05:00
c5818b1d1f Add option to profile scans
- Add preference for profiling scans
- Move debug options to tab in preferences
- Add label with clickable link to debug output (appdata) to debug tab in preferences
- Update translation source files
2022-03-31 00:16:37 -05:00
a470a8de25 Update fs.py to optimize stat() calls
- Update to get size and mtime at time of class creation when os.DirEntry is used for initialization.
- Folders still calculate size later for folder scans.
- Ref #962, #959
2022-03-30 22:58:01 -05:00
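
The caching described here might look like the following sketch with a simplified File class (not the actual core.fs code):

    import os

    class File:
        def __init__(self, entry):
            # When constructed from an os.DirEntry, capture size and mtime
            # from the entry's cached stat() result right away instead of
            # stat()-ing the path again later.
            if isinstance(entry, os.DirEntry):
                st = entry.stat()
                self.path = entry.path
            else:
                self.path = os.fspath(entry)
                st = os.stat(self.path)
            self.size = st.st_size
            self.mtime = st.st_mtime
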
a37b5b0eeb Fix #988 2022-03-30 01:06:51 -05:00
efd500ecc1 Update directory scanning to use os.scandir()
- Change to use os.scandir() instead of os.walk() to leverage DirEntry objects.
- Avoids extra calls to stat() on files during fs.can_handle()
- See 3x speed improvement on Windows in some cases
2022-03-29 23:37:56 -05:00
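
Roughly, the os.walk() to os.scandir() change trades path strings for DirEntry objects whose type and stat information are cached from the directory read; a sketch, not the project's actual traversal code:

    import os

    def iter_files(root):
        # DirEntry.is_dir()/is_file() reuse data fetched while reading the
        # directory, so no extra stat() syscall is needed per file.
        with os.scandir(root) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    yield from iter_files(entry.path)
                elif entry.is_file():
                    yield entry  # carries cached size/mtime via entry.stat()
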
43fcc52291 Replace pathlib.glob() with os.scandir() in fs.py 2022-03-29 22:35:38 -05:00
50f5db1543 Update fs to support DirEntry on get_file() 2022-03-29 22:32:36 -05:00
a5b0ccdd02 Improve performance of Directories.get_state() 2022-03-29 21:48:14 -05:00
143147cb8e Remove Cocoa specific and other unused code 2022-03-28 00:47:46 -05:00
ebb81d9f03 Remove pathlib function added in Python 3.9 2022-03-28 00:06:32 -05:00
da9f8b2b9d Squashed commit of the following:
commit 8b15fe9a502ebf4841c6529e7098cef03a6a5e6f
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 23:48:15 2022 -0500

    Finish up changes to copy_or_move

commit 21f6a32cf3186a400af8f30e67ad2743dc9a49bd
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Thu Mar 17 23:56:52 2022 -0500

    Migrate from hscommon.path to pathlib
    - Part one, this gets all hscommon and core tests passing
    - App appears to be able to load directories and complete scans, need further testing
    - app.py copy_or_move needs some additional work
2022-03-27 23:50:03 -05:00
5ed5eddde6 Add polib back to requirements.txt 2022-03-27 22:35:34 -05:00
9f40e4e786 Squashed commit of the following:
commit 5eb515f666bfa1ff06c2e96bdc351a4b7456580e
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 22:19:39 2022 -0500

    Add fallback to md5 if xxhash not available

    Mainly here for the case when distributions have not packaged python3-xxhash.

commit 51b18d4c84
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sat Mar 19 15:25:46 2022 -0500

    Switch file hashing to xxhash instead of md5

    - Improves performance significantly in some cases
    - Add xxhash to requirements.txt and sort requirements
    - Rename md5 based members to digest
    - Update all tests to use new member names and hashing methods
    - Update hash db code to upgrade schema

    NOTE: May consider supporting multiple hashing algorithms in the future.
2022-03-27 22:27:13 -05:00
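
The fallback could look roughly like this (a sketch, not the exact dupeGuru code; xxhash.xxh128 stands in for whatever digest the project settled on):

    # Prefer xxhash for speed; fall back to hashlib.md5 when the xxhash
    # package is unavailable (e.g. not packaged by a distribution).
    try:
        import xxhash
        hasher = xxhash.xxh128
    except ImportError:
        import hashlib
        hasher = hashlib.md5

    def file_digest(path, chunk_size=1024 * 1024):
        h = hasher()
        with open(path, "rb") as fp:
            for chunk in iter(lambda: fp.read(chunk_size), b""):
                h.update(chunk)
        return h.digest()
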
86bf9b39d0 Add update check function and call from about
- Implement an update check against the GitHub releases via the API
- Add semantic-version dependency
- Add automatic check when opening about dialog
2022-03-27 21:13:27 -05:00
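
One way such a check can be written (a sketch; the URL is GitHub's public releases API and the comparison uses the semantic-version package named above):

    import json
    import urllib.request

    import semantic_version

    RELEASES_URL = "https://api.github.com/repos/arsenetar/dupeguru/releases/latest"

    def update_available(current="4.3.0"):
        # Ask GitHub for the latest release tag and compare semantically.
        with urllib.request.urlopen(RELEASES_URL, timeout=10) as resp:
            latest = json.load(resp)["tag_name"].lstrip("v")
        return semantic_version.Version(latest) > semantic_version.Version(current)
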
c0be0aecbd Minor documentation update 2022-03-27 21:04:37 -05:00
c408873d20 Update changelog 2022-03-25 23:37:46 -05:00
bbcdfbf698 Add vscode extension recommendation 2022-03-21 22:27:16 -05:00
8cee1a9467 Fix internal links in CONTRIBUTING.md 2022-03-21 22:19:58 -05:00
448d33dcb6 Add workflow yml validation settings
- Add yml validation to project for vscode
- Allow .vscode/settings.json
- Apply formatting to workflow files
2022-03-21 22:18:22 -05:00
8d414cadac Add initial partial CONTRIBUTING.md
- Adopt a CONTRIBUTING.md format similar to that used by atom/atom.
- Add label section as replacement to wiki
- Add style guide section
- Setup basic document structure

TODO:
- Migrate some existing wiki information here where applicable.
- Migrate some existing help information here.
- Finish up remaining sections.
2022-03-21 22:04:45 -05:00
f902ee889a Add configuration for isort to pyproject.toml 2022-03-21 00:25:36 -05:00
bc89e71935 Update .gitignore
- Pull from github/gitignore to cover some things better
- Organize remaining items
- Remove a few no longer relevant items
2022-03-20 23:25:01 -05:00
17b83c8001 Move polib to setup_requires instead of install_requires 2022-03-20 22:48:03 -05:00
0f845ee67a Update min python version in Makefile 2022-03-20 01:23:01 -05:00
d40e32a143 Update transifex config & pull latest updates
- Update transifex configuration to new format
- Pull translation updates
2022-03-19 20:21:14 -05:00
1bc206e62d Bump version to 4.2.1 2022-03-19 19:02:41 -05:00
106a0feaba Add sponsor information 2022-03-19 17:46:12 -05:00
984e0c4094 Fix help path for local files and some help doc updates 2022-03-19 17:43:11 -05:00
9321e811d7 Enforce minimum Windows version ref #983 2022-03-19 17:01:54 -05:00
a64fcbfb5c Fix deprecation warning from sqlite 2022-03-19 17:01:53 -05:00
cff07a12d6 Black formatter changes 2022-03-19 17:01:53 -05:00
Alfonso Montero
b9c7832c4a Apply @arsenetar's proposed change to fix errors on window change event. Solves #937. (#980) 2022-03-15 20:47:48 -05:00
b9dfeac2f3 Drop Python 3.6 Support 2022-03-15 05:10:41 -05:00
efc99eee96 Merge pull request #978 from glubsy/fix_zoom_scrollbar
Fix image viewer scrollbar zoom
2022-03-14 20:43:40 -05:00
glubsy
ff7733bb73 Fix image viewer
When zooming in or out, the computed value might be a float instead
of an int, which is what QScrollBar expects for its setValue method.
Simply casting to int should be enough here.
2022-03-12 22:36:17 +01:00
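
The whole fix amounts to one cast; a self-contained PyQt5 illustration (values are made up):

    from PyQt5.QtWidgets import QApplication, QScrollBar

    app = QApplication([])
    bar = QScrollBar()
    bar.setRange(0, 1000)
    zoomed = 333 * 1.5            # zoom math yields a float (499.5)
    bar.setValue(int(zoomed))     # QScrollBar.setValue() expects an int
    print(bar.value())
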
4b2fbe87ea Default to English on unsupported system language Fix #976
- Add check for supported language to system locale detection
- Fall back to English when the system locale is not supported
2022-03-12 04:36:13 -06:00
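
The fallback logic might look like this sketch (the SUPPORTED set is an illustrative subset, not dupeGuru's full language list):

    import locale

    SUPPORTED = {"en", "fr", "de", "ru", "uk", "tr", "ms"}  # illustrative

    def detect_language():
        syslang = locale.getdefaultlocale()[0] or "en_US"  # e.g. "fr_FR"
        lang = syslang.split("_")[0].lower()
        # Fall back to English when the system locale is unsupported.
        return lang if lang in SUPPORTED else "en"
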
9e4b41feb5 Fix BASE_PATH for frozen macOS app 2022-03-09 06:50:41 -06:00
cbfa8720f1 Update imports for objc module 2022-03-09 05:01:12 -06:00
a02c5e5b9b Add built modules as artifacts 2022-03-04 01:14:01 -06:00
35e6ffd6af Fix macOS packaging issue 2022-02-09 22:33:41 -06:00
e957f840da Fix python version check in makefile, close #971 2022-02-09 21:59:35 -06:00
85e22089bd Black formatting changes 2022-02-09 21:49:51 -06:00
b7d68b4458 Update debian control template depends 2022-02-09 21:45:45 -06:00
8f440603ee Add Python 3.10 to tox.ini 2022-01-25 10:39:52 -06:00
5d8e559ca3 Fix issue introduced in fix for #900 2022-01-25 10:39:08 -06:00
2c11eecf97 Update version and changelog to 4.2.0 2022-01-24 22:28:40 -06:00
02803f738b Update translation files including Malay 2022-01-24 21:05:33 -06:00
db27e6a645 Add Malay to language selection 2022-01-24 21:02:57 -06:00
c9c35cc60d Add translation source file for dark style change. 2022-01-24 19:33:42 -06:00
880205dbc8 Fix python 3.10 in default action 2022-01-24 19:30:42 -06:00
6456e64328 Update python versions for CI/CD
- Update python versions for Default action
- Set python versions for sonarcloud
2022-01-24 19:27:29 -06:00
f6a0c0cc6d Add initial dark style for use in Windows
- Other platforms can achieve this with the OS theme so not enabled for them at this time.
- Adds preference in display options to use dark style, default is false.
2022-01-24 19:14:30 -06:00
eb57d269fc Update translation source files 2021-11-23 21:11:30 -06:00
34f41dc522 Merge pull request #942 from Dobatymo/hash-cache
Implement hash cache for md5 hash based on sqlite
2021-11-23 21:08:22 -06:00
Dobatymo
77460045c4 clean up abstraction 2021-10-29 15:24:47 +08:00
Dobatymo
9753afba74 change FilesDB to singleton class
move hash calculation back into Files class
clear cache now clears hash cache in addition to picture cache
2021-10-29 15:12:40 +08:00
Dobatymo
1ea108fc2b changed cache filename 2021-10-29 15:12:40 +08:00
Dobatymo
2f02a6010d implement hash cache for md5 hash based on sqlite 2021-10-29 15:12:40 +08:00
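
The concept behind these commits, sketched with sqlite3 (table name and keying are assumptions, not the actual schema):

    import sqlite3

    class FilesDB:
        # Module-level singleton caching (path, size, mtime) -> digest so
        # unchanged files are not re-hashed between runs.
        def connect(self, db_path):
            self.conn = sqlite3.connect(db_path)
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS hashes "
                "(path TEXT PRIMARY KEY, size INTEGER, mtime REAL, digest BLOB)"
            )

        def get(self, path, size, mtime):
            row = self.conn.execute(
                "SELECT digest FROM hashes WHERE path=? AND size=? AND mtime=?",
                (str(path), size, mtime),
            ).fetchone()
            return row[0] if row else None

        def put(self, path, size, mtime, digest):
            self.conn.execute(
                "REPLACE INTO hashes (path, size, mtime, digest) VALUES (?, ?, ?, ?)",
                (str(path), size, mtime, digest),
            )
            self.conn.commit()

    filesdb = FilesDB()  # shared instance, used as a singleton
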
b80489fd66 Update translation source files 2021-09-15 20:15:09 -05:00
1d60e124ee Update invoke_custom_command to run for all selected items 2021-09-02 20:48:25 -05:00
e22d7d2fc9 Remove filtering of 0 size files in engine
File size can already be filtered at a higher level, and some users
may want to see zero-length files. Fix #321.
2021-08-28 18:16:22 -05:00
0a0694e095 Expand fix for #630 to fix #551 2021-08-28 17:29:25 -05:00
3da9d5d869 Update documentation files, add multi-language doc build
- Update links in documentation, and some errors
- Remove non-existent page
- Update build to build all languages with --alldoc flag
- Fix one minor debugging change introduced in package.py
2021-08-28 17:07:18 -05:00
78fb052d77 Add more progress details to getmatches, ref #700 2021-08-28 04:58:22 -05:00
9805cba10d Use different message for direct delete success, close #904 2021-08-28 04:27:34 -05:00
4c3dfe2f1f Provide more feedback during scans
- Add output for number of collected files / folders
- Update to allow indeterminate progress bar
- Remove unused hscommon\jobprogress\qt.py
2021-08-28 04:05:07 -05:00
b0baa5bfd6 Add windows position handling at open, fix #653
- Move offscreen windows back on screen
- Restore maximized state without impacting restored size
- Fullscreen comes back on primary screen, needs further work to support
  restore on other screens
2021-08-27 23:26:19 -05:00
22996ee914 Merge pull request #935 from chchia/master
resize preference dialog file size box
2021-08-27 21:57:03 -05:00
chchia
31ec9c667f resize preference dialog file size box 2021-08-28 10:28:06 +08:00
3045361243 Add preference to ignore large files, close #430 2021-08-27 05:35:54 -05:00
809116c764 Fix CodeQL Alerts
- Cast int to Py_ssize_t for multiplication
2021-08-26 03:43:31 -05:00
83f401595d Minor Updates
- Cleanup extension modules in setup.py to use correct namespaces
- Update build.py to leverage setup.py for modules
- Roll mutagen required version back to 1.44.0 to support more distros
- Change build.py and sphinxgen.py to use pathlib
- Remove hsaudiotag from package list for debian and arch
2021-08-26 03:29:24 -05:00
814d145366 Updates to setup files
- Include additional non-python files in MANIFEST.in (package_data in
  setup.cfg was not including the files)
- Update requirements in setup.cfg
2021-08-25 04:10:38 -05:00
efb76c7686 Add OS and Python Information to error dialog 2021-08-25 02:05:18 -05:00
47dbe805bb More cleanup and fixed a flake8 build issue 2021-08-25 01:11:24 -05:00
f11fccc889 More cleanups
- Cleanup columns.py and tables
- Other misc cleanups
- Remove text_field.py from qtlib as it is not used
- Remove unused variables from image_viewer method
2021-08-25 00:46:33 -05:00
2e13c4ccb5 Update internationalization files 2021-08-24 03:54:54 -05:00
da72ffd1fd Add ability to use non-native dialog for directories
- Add preference for native dialogs
- Add non-native directory selection to allow selecting multiple folders
  fixes #874 when using non-native.
2021-08-24 03:52:43 -05:00
2c9437bef4 Fix #897 2021-08-24 03:13:03 -05:00
f9085386a6 First pass code cleanup in qt/qtlib 2021-08-24 00:12:23 -05:00
d576a7043c Code cleanups in core and other affected files 2021-08-21 18:02:02 -05:00
1ef5f56158 Code cleanups in hscommon & external effects 2021-08-21 16:56:27 -05:00
f9316de244 Code cleanups in hscommon\tests 2021-08-21 16:25:33 -05:00
0189c29f47 Misc cleanups in core/tests 2021-08-21 03:52:09 -05:00
b4fa1d68f0 Add check for python version to build.py, close #589 2021-08-20 23:49:20 -05:00
16df882481 Update requirements.txt for previous change 2021-08-19 00:17:46 -05:00
58c04ff9ad Switch from hsaudiotag to mutagen, close #440
- This opens up the ability to support more tags and audio information
- Also makes progress on #333
2021-08-19 00:14:26 -05:00
6b8f85e39a Reveal in Explorer / Finder, close #895 2021-08-18 20:51:45 -05:00
2fff1a3436 Add ability to load results at start, closes #902
- Add ability to load a .dupeguru file at start by passing it as the first argument
- Add file association to .dupeguru file in windows at install
2021-08-18 19:24:14 -05:00
a685524dd5 Add files for more standardized build tools
- Prior investigation into linux packaging (not using pyinstaller) suggested
that having setuptools files could make packaging easier and more amenable to automation
- Add setup.cfg and setup.py as initial starting point
- Add MANIFEST.in (at least temporarily)

Currently with the python build module this almost works for the main application.
It does not include all the extra data files right now.
2021-08-18 04:12:38 -05:00
74918e2c56 Attempt to fix apt-get failure 2021-08-18 03:07:47 -05:00
18895d983b Fix syntax error in codeql-analysis.yml 2021-08-18 03:04:44 -05:00
fe720208ea Add minimum custom build for codeql cpp 2021-08-18 02:49:20 -05:00
091d9e9239 Create codeql-analysis.yml
Test out codeql
2021-08-18 02:33:40 -05:00
5a4958cff9 Update translation .pot files 2021-08-17 21:18:47 -05:00
be10b462fc Add portable mode
If settings.ini is present next to the executable, will run in portable mode.
This results in settings, data, and cache all being in same folder as dupeGuru.
2021-08-17 21:12:32 -05:00
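
Detection can be as simple as this sketch (the real lookup may differ):

    import os
    import sys

    def is_portable_mode():
        # A settings.ini next to the executable signals portable mode:
        # settings, data, and cache then stay in that folder.
        exe_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
        return os.path.exists(os.path.join(exe_dir, "settings.ini"))
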
d62b13bcdb Removing travis
- All CI is now covered by Github Actions
- Remove .travis.yml
- Remove tox-travis in requirements-extra.txt
2021-08-17 18:16:20 -05:00
06eca11f0b Remove format check from lint job 2021-08-17 00:52:14 -05:00
2879f18e0d Run linting and formatting check in parallel before test 2021-08-17 00:50:41 -05:00
3ee21771f9 Fix workflow file format 2021-08-17 00:33:54 -05:00
c0ba6fb57a Test out github actions
Add a workflow to test
2021-08-17 00:31:15 -05:00
bc942b8263 Add black format check to tox runs 2021-08-15 04:10:46 -05:00
ffe6b7047c Format all files with black correcting line length 2021-08-15 04:10:18 -05:00
9446f37fad Remove flake8 E731 Errors
Note: black formatting is now applying correctly as well.
2021-08-15 03:53:43 -05:00
af19660c18 Update flake8 and black configuration
- Update black to now use 120 lines
- Update flake8 to use recommended settings for black integration
2021-08-15 03:32:31 -05:00
99ad297906 Change preferences to use spinboxes where applicable
- Change LineEdit to Spinbox for minimum file size 0-1,000,000KB
- Change LineEdit to Spinbox for big file size 0-1,000,000MB
2021-08-15 02:11:42 -05:00
e11f996dfc Merge pull request #908 from glubsy/hash_sample_optimization
Hash sample optimization
2021-08-13 23:41:17 -05:00
glubsy
e95306e58f Fix flake 8 2021-08-14 02:52:00 +02:00
glubsy
891a875990 Cache constant expression
Perhaps the python byte code is already optimized, but just in case it is not, keep the constant expression pre-computed.
2021-08-13 21:33:21 +02:00
glubsy
545a5a75fb Fix for older python versions
The "walrus" operator is only available in python 3.8 and later. Fall back to more traditional notation.
2021-08-13 20:56:33 +02:00
glubsy
7b764f183e Avoid partially hashing small files
Computing 3 hash samples for files less than 3MiB (3 * CHUNK_SIZE) is not efficient since spans of later samples would overlap a previous one.
Therefore we can simply return the hash of the entire small file instead.
2021-08-13 20:47:01 +02:00
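
The guard described above, sketched with md5 and a hypothetical CHUNK_SIZE of 1 MiB (the three sample offsets are illustrative):

    import hashlib

    CHUNK_SIZE = 1024 * 1024  # 1 MiB

    def partial_digest(path, size):
        h = hashlib.md5()
        if size <= 3 * CHUNK_SIZE:
            # Samples would overlap on files under 3 * CHUNK_SIZE, so just
            # hash the whole small file instead.
            with open(path, "rb") as fp:
                h.update(fp.read())
            return h.digest()
        with open(path, "rb") as fp:
            for offset in (0, size // 2, size - CHUNK_SIZE):  # start, middle, end
                fp.seek(offset)
                h.update(fp.read(CHUNK_SIZE))
        return h.digest()
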
fdc8a17d26 Update .travis.yml
- Windows test uses 3.9.6 now
- Indentation changes
2021-08-07 19:35:57 -05:00
cb3bbbec6e Upgrade Requirement Minimums
- Upgrade requirements to specify more current minimums
- Remove compatibility code from sphinxgen for old versions
- Upgrade pyinstaller to a minimum version that works with latest macOS
2021-08-07 19:28:41 -05:00
c51a82a2ce Fix Issues from Translation Update
- Add Qtlib to transifex config
- Pull latest qtlib translations
- Fix flake8 error
- Remove code for manual translation import, use transifex-client instead
2021-08-06 22:21:35 -05:00
0cd8f5e948 Update translation pot files 2021-08-06 21:41:52 -05:00
9c09607c08 Add Turkish & Updates from Transifex
- Pull updates from Transifex
- Add Turkish
- Sort language lists in code
- Remove old locale conversion code as it appears to work correctly on
Windows without different conversions.
2021-08-06 21:41:52 -05:00
3bd342770c Update configurations
- Enable Unicode for NSIS Installer
- Update transifex config to new project
2021-08-06 21:41:52 -05:00
14b456dcf9 Merge pull request #927 from glubsy/fix_directories_tests
Fix Directories regex test
2021-08-06 20:08:27 -05:00
glubsy
3dccb686e2 Fix Directories regex test
The entire path to the file would match unless another path separator is added.
2021-08-06 17:18:23 +02:00
0db66baace Merge pull request #907 from glubsy/missing_renamed_regex
Missing renamed regex
2021-08-03 22:26:08 -05:00
e3828ae2ca Merge pull request #911 from glubsy/fix_757_fix_regression
Fix infinite recursion
2021-06-22 22:44:12 -05:00
glubsy
23c59787e5 Fix infinite recursion
Force the Results to update its internal __dupes list whenever at least one group has re-prioritized and changed its dupes/ref.
2021-06-23 05:36:10 +02:00
2f8d603251 Merge pull request #910 from glubsy/757_fix
Fix refs appearing in dupes-only view
2021-06-22 21:54:49 -05:00
glubsy
a51f263632 Fix refs appearing in dupes-only view
* Some refs appeared in the dupes-only view after a re-prioritization was done a second time.
* It seems the core.Results.__dupes list was not properly updated whenever core.app.Dupeguru.reprioritize_groups() -> core.Results.sort_dupes() was called.
When a re-prioritization is done, some refs became dupe, and some dupes became ref in their place. So we need to update the new state of the internal list of dupes kept by the Results object, instead of relying on the outdated cached one.
* Fix #757.
2021-06-22 22:57:57 +02:00
glubsy
718ca5b313 Remove unused import 2021-06-22 02:41:33 +02:00
glubsy
277bc3fbb8 Add unit tests for hash sample optimization
* Instead of keeping md5 samples separate, merge them as one hash computed from the various selected chunks we picked.
* We don't need to keep a boolean to see whether or not the user chose to optimize; we can simply compare the value of the threshold, since 0 means no optimization is currently active.
2021-06-21 22:44:05 +02:00
glubsy
e07dfd5955 Add partial hashes optimization for big files
* Big files above the user selected threshold can be partially hashed in 3 places.
* If the user is willing to take the risk, we consider files with identical md5samples as being identical.
2021-06-21 19:03:21 +02:00
4641bd6ec9 Merge pull request #905 from glubsy/fix_863
Fix exception when deleting while in delta view
2021-06-19 20:29:47 -05:00
glubsy
a6f83ad3d7 Fix missing regexp after rename
* Doing a full match should be safer to avoid partial results which would result in overly aggressive filtering.
* Add new tests to test suite to cover this issue.
* Fixes #903.
2021-06-19 02:00:25 +02:00
glubsy
ab8750eedb Fix partial regex match yielding false positive 2021-06-17 03:49:59 +02:00
glubsy
22033211d6 Fix exception when deleting while in delta view 2021-05-31 23:49:21 +02:00
0b46ca2222 Merge pull request #879 from glubsy/fix_unicode
Fix stripping (japanese) unicode characters
2021-05-25 19:11:19 -05:00
72e0f76242 Merge pull request #898 from AlttiRi/master
Change reference background color #894
2021-05-25 19:10:31 -05:00
[Alt'tiRi]
65c1d463f8 Change reference background color #894 2021-05-22 02:52:41 +03:00
e6c791ab0a Merge pull request #884 from samusz/master
Small typo
2021-05-09 23:32:32 -05:00
Sacha Muszlak
78f5088101 Merge pull request #1 from samusz/samusz-patch-1
typo correction
2021-05-07 09:41:47 +02:00
Sacha Muszlak
095df5eb95 typo correction 2021-05-07 09:40:08 +02:00
glubsy
f1ae478433 Fix including character at the border 2021-04-29 05:29:35 +02:00
glubsy
c4dcfd3d4b Fix stripping (japanese) unicode characters
* Accents are getting removed from Unicode characters to generate similar "words".
* Non-latin characters which cannot be processed that way (e.g. japanese, greek, russian, etc.) should not be filtered out at all, otherwise files are erroneously skipped or detected as dupes if only some characters make it past the filter.
* Starting from an arbitrary unicode codepoint (converted to decimal), above which we know it is pointless to try any sort of processing, we leave the characters as-is.
* Fix #878.
2021-04-29 05:15:34 +02:00
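
A sketch of the idea: strip accents only below a cutoff codepoint and keep higher (e.g. CJK) characters untouched; the cutoff shown is illustrative, not the commit's exact value:

    import unicodedata

    CUTOFF = 0x2E80  # illustrative: roughly where CJK-related blocks begin

    def normalize_word(word):
        out = []
        for ch in word:
            if ord(ch) >= CUTOFF:
                out.append(ch)  # leave e.g. Japanese characters as-is
            else:
                # Decompose, then drop combining marks to remove accents.
                nfkd = unicodedata.normalize("NFKD", ch)
                out.append("".join(c for c in nfkd if not unicodedata.combining(c)))
        return "".join(out)
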
0840104edf Merge pull request #873 from glubsy/fix_857
Fix 857
2021-04-20 20:05:05 -05:00
glubsy
6b4b436251 Fix crash on shutdown
* Fixes "'DetailsPanel' object has no attribute '_table'" error on shutdown if the Results table is updated (item removed) while the Details Dialog is shown as a floating window.
* It seems that QApplication.quit() triggers some sort of refresh on the floating QDockWidget, which in turn makes calls to the underlying model that is possibly being destroyed, ie. there might be a race condition here.
* Closing or hiding the QDockWidget before the call to quit() is a workaround. Similarly, this is already done in the quitTriggered() method anyway.
* This fixes #857.
2021-04-16 17:54:49 +02:00
glubsy
d18b8c10ec Remove redundant assignment
The "app" field is already set in the parent class.
2021-04-15 18:03:00 +02:00
255 changed files with 12442 additions and 9638 deletions

.github/FUNDING.yml (new file, 13 lines)

@@ -0,0 +1,13 @@
# These are supported funding model platforms
github: arsenetar
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']

.github/workflows/codeql-analysis.yml (new file, 50 lines)

@@ -0,0 +1,50 @@
name: "CodeQL"

on:
  push:
    branches: [master]
  pull_request:
    # The branches below must be a subset of the branches above
    branches: [master]
  schedule:
    - cron: "24 20 * * 2"

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      security-events: write
    strategy:
      fail-fast: false
      matrix:
        language: ["cpp", "python"]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      # Initializes the CodeQL tools for scanning.
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v1
        with:
          languages: ${{ matrix.language }}
          # If you wish to specify custom queries, you can do so here or in a config file.
          # By default, queries listed here will override any specified in a config file.
          # Prefix the list here with "+" to use these queries and those in the config file.
          # queries: ./path/to/local/query, your-org/your-repo/queries@main
      - if: matrix.language == 'cpp'
        name: Build Cpp
        run: |
          sudo apt-get update
          sudo apt-get install python3-pyqt5
          make modules
      - if: matrix.language == 'python'
        name: Autobuild
        uses: github/codeql-action/autobuild@v1
      # Analysis
      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v1

.github/workflows/default.yml (new file, 84 lines)

@@ -0,0 +1,84 @@
# Workflow lints, and checks format in parallel then runs tests on all platforms
name: Default CI/CD

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 3.10
        uses: actions/setup-python@v2
        with:
          python-version: "3.10"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt -r requirements-extra.txt
      - name: Lint with flake8
        run: |
          flake8 .
  format:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 3.10
        uses: actions/setup-python@v2
        with:
          python-version: "3.10"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt -r requirements-extra.txt
      - name: Check format with black
        run: |
          black .
  test:
    needs: [lint, format]
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: [3.7, 3.8, 3.9, "3.10"]
        exclude:
          - os: macos-latest
            python-version: 3.7
          - os: macos-latest
            python-version: 3.8
          - os: macos-latest
            python-version: 3.9
          - os: windows-latest
            python-version: 3.7
          - os: windows-latest
            python-version: 3.8
          - os: windows-latest
            python-version: 3.9
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt -r requirements-extra.txt
      - name: Build python modules
        run: |
          python build.py --modules
      - name: Run tests
        run: |
          pytest core hscommon
      - name: Upload Artifacts
        if: matrix.os == 'ubuntu-latest'
        uses: actions/upload-artifact@v3
        with:
          name: modules ${{ matrix.python-version }}
          path: ${{ github.workspace }}/**/*.so

.gitignore (123 lines changed)

@@ -1,28 +1,111 @@
-.DS_Store
-__pycache__
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
 *.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
 *.mo
-*.waf*
-.lock-waf*
-.tox
-/tags
-build
-dist
-env*
-/deps
-cocoa/autogen
-/run.py
-/cocoa/*/Info.plist
-/cocoa/*/build
+#*.pot
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+
+# Environments
+.env
+.venv
+env*/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# macOS
+.DS_Store
+
+# Visual Studio Code
+.vscode/*
+!.vscode/settings.json
+#!.vscode/tasks.json
+#!.vscode/launch.json
+!.vscode/extensions.json
+!.vscode/*.code-snippets
+
+# Local History for Visual Studio Code
+.history/
+
+# Built Visual Studio Code Extensions
+*.vsix
+
+# dupeGuru Specific
 /qt/*_rc.py
 /help/*/conf.py
 /help/*/changelog.rst
-/transifex
-*.pyd
-*.exe
-*.spec
-.vscode
+cocoa/autogen
+/cocoa/*/Info.plist
+/cocoa/*/build
+*.waf*
+.lock-waf*
+/tags

.sonarcloud.properties (new file, 1 line)

@@ -0,0 +1 @@
sonar.python.version=3.7, 3.8, 3.9, 3.10

.travis.yml (deleted, 27 lines)

@@ -1,27 +0,0 @@
sudo: false
language: python
install:
  - pip3 install -r requirements.txt -r requirements-extra.txt
script: tox
matrix:
  include:
    - os: "linux"
      dist: "xenial"
      python: "3.6"
    - os: "linux"
      dist: "xenial"
      python: "3.7"
    - os: "linux"
      dist: "focal"
      python: "3.8"
    - os: "linux"
      dist: "focal"
      python: "3.9"
    - os: "windows"
      language: shell
      python: "3.8"
      env: "PATH=/c/python38:/c/python38/Scripts:$PATH"
      before_install:
        - choco install python --version=3.8.6
        - cp /c/python38/python.exe /c/python38/python3.exe
      script: tox -e py38

.tx/config (modified)

@@ -1,19 +1,19 @@
 [main]
 host = https://www.transifex.com

-[dupeguru.core]
-file_filter = locale/<lang>/LC_MESSAGES/core.po
-source_file = locale/core.pot
-source_lang = en
-type = PO
-
-[dupeguru.columns]
+[o:voltaicideas:p:dupeguru-1:r:columns]
 file_filter = locale/<lang>/LC_MESSAGES/columns.po
 source_file = locale/columns.pot
 source_lang = en
 type = PO

-[dupeguru.ui]
+[o:voltaicideas:p:dupeguru-1:r:core]
+file_filter = locale/<lang>/LC_MESSAGES/core.po
+source_file = locale/core.pot
+source_lang = en
+type = PO
+
+[o:voltaicideas:p:dupeguru-1:r:ui]
 file_filter = locale/<lang>/LC_MESSAGES/ui.po
 source_file = locale/ui.pot
 source_lang = en

.vscode/extensions.json (new file, 10 lines)

@@ -0,0 +1,10 @@
{
// List of extensions which should be recommended for users of this workspace.
"recommendations": [
"redhat.vscode-yaml",
"ms-python.vscode-pylance",
"ms-python.python"
],
// List of extensions recommended by VS Code that should not be recommended for users of this workspace.
"unwantedRecommendations": []
}

.vscode/settings.json (new file, 12 lines)

@@ -0,0 +1,12 @@
{
"python.formatting.provider": "black",
"cSpell.words": [
"Dupras",
"hscommon"
],
"python.languageServer": "Pylance",
"yaml.schemaStore.enable": true,
"yaml.schemas": {
"https://json.schemastore.org/github-workflow.json": ".github/workflows/*.yml"
}
}

CONTRIBUTING.md (new file, 88 lines)

@@ -0,0 +1,88 @@
# Contributing to dupeGuru
The following is a set of guidelines and information for contributing to dupeGuru.
#### Table of Contents
[Things to Know Before Starting](#things-to-know-before-starting)
[Ways to Contribute](#ways-to-contribute)
* [Reporting Bugs](#reporting-bugs)
* [Suggesting Enhancements](#suggesting-enhancements)
* [Localization](#localization)
* [Code Contribution](#code-contribution)
* [Pull Requests](#pull-requests)
[Style Guides](#style-guides)
* [Git Commit Messages](#git-commit-messages)
* [Python Style Guide](#python-style-guide)
* [Documentation Style Guide](#documentation-style-guide)
[Additional Notes](#additional-notes)
* [Issue and Pull Request Labels](#issue-and-pull-request-labels)
## Things to Know Before Starting
**TODO**
## Ways to contribute
### Reporting Bugs
**TODO**
### Suggesting Enhancements
**TODO**
### Localization
**TODO**
### Code Contribution
**TODO**
### Pull Requests
Please follow these steps to have your contribution considered by the maintainers:
1. Keep Pull Request specific to one feature or bug.
2. Follow the [style guides](#style-guides)
3. After you submit your pull request, verify that all [status checks](https://help.github.com/articles/about-status-checks/) are passing <details><summary>What if the status checks are failing?</summary>If a status check is failing, and you believe that the failure is unrelated to your change, please leave a comment on the pull request explaining why you believe the failure is unrelated. A maintainer will re-run the status check for you. If we conclude that the failure was a false positive, then we will open an issue to track that problem with our status check suite.</details>
While the prerequisites above must be satisfied prior to having your pull request reviewed, the reviewer(s) may ask you to complete additional design work, tests, or other changes before your pull request can be ultimately accepted.
## Style Guides
### Git Commit Messages
- Use the present tense ("Add feature" not "Added feature")
- Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
- Limit the first line to 72 characters or less
- Reference issues and pull requests liberally after the first line
### Python Style Guide
- All files are formatted with [Black](https://github.com/psf/black)
- Follow [PEP 8](https://peps.python.org/pep-0008/) as much as practical
- Pass [flake8](https://flake8.pycqa.org/en/latest/) linting
- Include [PEP 484](https://peps.python.org/pep-0484/) type hints (new code)
### Documentation Style Guide
**TODO**
## Additional Notes
### Issue and Pull Request Labels
This section lists and describes the various labels used with issues and pull requests. Each of the labels is listed with a search link as well.
#### Issue Type and Status
| Label name | Search | Description |
|------------|--------|-------------|
| `enhancement` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aenhancement) | Feature requests and enhancements. |
| `bug` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Abug) | Bug reports. |
| `duplicate` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aduplicate) | Issue is a duplicate of existing issue. |
| `needs-reproduction` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aneeds-reproduction) | A bug that has not been able to be reproduced. |
| `needs-information` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aneeds-information) | More information needs to be collected about these problems or feature requests (e.g. steps to reproduce). |
| `blocked` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Ablocked) | Issue blocked by other issues. |
| `beginner` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Abeginner) | Less complex issues for users who want to start contributing. |
#### Category Labels
| Label name | Search | Description |
|------------|--------|-------------|
| `3rd party` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3A%223rd%20party%22) | Related to a 3rd party dependency. |
| `crash` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Acrash) | Related to crashes (complete, or unhandled). |
| `documentation` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Adocumentation) | Related to any documentation. |
| `linux` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3linux) | Related to running on Linux. |
| `mac` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Amac) | Related to running on macOS. |
| `performance` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aperformance) | Related to the performance. |
| `ui` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aui)| Related to the visual design. |
| `windows` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Awindows) | Related to running on Windows. |
#### Pull Request Labels
None at this time, if the volume of Pull Requests increase labels may be added to manage.

MANIFEST.in (new file, 5 lines)

@@ -0,0 +1,5 @@
recursive-include core *.h
recursive-include core *.m
include run.py
graft locale
graft help

Makefile (modified)

@@ -1,7 +1,7 @@
 PYTHON ?= python3
 PYTHON_VERSION_MINOR := $(shell ${PYTHON} -c "import sys; print(sys.version_info.minor)")
 PYRCC5 ?= pyrcc5
-REQ_MINOR_VERSION = 6
+REQ_MINOR_VERSION = 7
 PREFIX ?= /usr/local

 # Window compatability via Msys2
@@ -35,7 +35,7 @@ endif
 # Our build scripts are not very "make like" yet and perform their task in a bundle. For now, we
 # use one of each file to act as a representative, a target, of these groups.
-packages = hscommon qtlib core qt
+packages = hscommon core qt
 localedirs = $(wildcard locale/*/LC_MESSAGES)
 pofiles = $(wildcard locale/*/LC_MESSAGES/*.po)
 mofiles = $(patsubst %.po,%.mo,$(pofiles))
@@ -53,7 +53,7 @@ pyc: | env
 	${VENV_PYTHON} -m compileall ${packages}

 reqs:
-ifneq ($(shell test $(PYTHON_VERSION_MINOR) -gt $(REQ_MINOR_VERSION); echo $$?),0)
+ifneq ($(shell test $(PYTHON_VERSION_MINOR) -ge $(REQ_MINOR_VERSION); echo $$?),0)
 	$(error "Python 3.${REQ_MINOR_VERSION}+ required. Aborting.")
 endif
 ifndef NO_VENV

README.md (modified)

@@ -1,16 +1,12 @@
 # dupeGuru

 [dupeGuru][dupeguru] is a cross-platform (Linux, OS X, Windows) GUI tool to find duplicate files in
-a system. It is written mostly in Python 3 and has the peculiarity of using
-[multiple GUI toolkits][cross-toolkit], all using the same core Python code. On OS X, the UI layer
-is written in Objective-C and uses Cocoa. On Linux, it is written in Python and uses Qt5.
-
-The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa
+a system. It is written mostly in Python 3 and uses [qt](https://www.qt.io/) for the UI.

 ## Current status
 Still looking for additional help especially with regards to:
-* OSX maintenance: reproducing bugs & cocoa version, building package with Cocoa UI.
-* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package.
+* OSX maintenance: reproducing bugs, packaging verification.
+* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package, rpm package.
 * Translations: updating missing strings, transifex project at https://www.transifex.com/voltaicideas/dupeguru-1
 * Documentation: keeping it up-to-date.
@@ -26,7 +22,6 @@ This folder contains the source for dupeGuru. Its documentation is in `help`, bu
 * help: Help document, written for Sphinx.
 * locale: .po files for localization.
 * hscommon: A collection of helpers used across HS applications.
-* qtlib: A collection of helpers used across Qt UI codebases of HS applications.

 ## How to build dupeGuru from source
@@ -36,19 +31,17 @@ For windows instructions see the [Windows Instructions](Windows.md).
 For macos instructions (qt version) see the [macOS Instructions](macos.md).

 ### Prerequisites
-* [Python 3.6+][python]
+* [Python 3.7+][python]
 * PyQt5

 ### System Setup
 When running in a linux based environment the following system packages or equivalents are needed to build:
 * python3-pyqt5
 * pyqt5-dev-tools (on some systems, see note)
-* python3-wheel (for hsaudiotag3k)
 * python3-venv (only if using a virtual environment)
 * python3-dev
 * build-essential

 Note: On some linux systems pyrcc5 is not put on the path when installing python3-pyqt5, this will cause some issues with the resource files (and icons). These systems should have a respective pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked with `which pyrcc5`. Debian based systems need the extra package, and Arch does not.

 To create packages the following are also needed:

Windows.md (modified)

@@ -2,7 +2,7 @@
 ### Prerequisites

-- [Python 3.6+][python]
+- [Python 3.7+][python]
 - [Visual Studio 2019][vs] or [Visual Studio Build Tools 2019][vsBuildTools] with the Windows 10 SDK
 - [nsis][nsis] (for installer creation)
 - [msys2][msys2] (for using makefile method)
@@ -16,7 +16,7 @@ After installing python it is recommended to update setuptools before compiling
 More details on setting up python for compiling packages on windows can be found on the [python wiki][pythonWindowsCompilers] Take note of the required vc++ versions.

 ### With build.py (preferred)
-To build with a different python version 3.6 vs 3.8 or 32 bit vs 64 bit specify that version instead of -3.8 to the `py` command below. If you want to build additional versions while keeping all virtual environments setup use a different location for each virtual environment.
+To build with a different python version 3.7 vs 3.8 or 32 bit vs 64 bit specify that version instead of -3.8 to the `py` command below. If you want to build additional versions while keeping all virtual environments setup use a different location for each virtual environment.

     $ cd <dupeGuru directory>
     $ py -3.8 -m venv .\env

build.py (147 lines changed)

@@ -4,19 +4,17 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-import os
-import os.path as op
+from pathlib import Path
+import sys
 from optparse import OptionParser
 import shutil
-from pathlib import Path
+from multiprocessing import Pool

-from setuptools import setup, Extension
+from setuptools import sandbox

 from hscommon import sphinxgen
 from hscommon.build import (
     add_to_pythonpath,
     print_and_do,
-    move_all,
     fix_qt_resource_file,
 )
 from hscommon import loc
@@ -31,12 +29,9 @@ def parse_args():
         dest="clean",
         help="Clean build folder before building",
     )
-    parser.add_option(
-        "--doc", action="store_true", dest="doc", help="Build only the help file"
-    )
-    parser.add_option(
-        "--loc", action="store_true", dest="loc", help="Build only localization"
-    )
+    parser.add_option("--doc", action="store_true", dest="doc", help="Build only the help file (en)")
+    parser.add_option("--alldoc", action="store_true", dest="all_doc", help="Build only the help file in all languages")
+    parser.add_option("--loc", action="store_true", dest="loc", help="Build only localization")
     parser.add_option(
         "--updatepot",
         action="store_true",
@@ -61,26 +56,20 @@ def parse_args():
         dest="modules",
         help="Build the python modules.",
     )
-    parser.add_option(
-        "--importpo",
-        action="store_true",
-        dest="importpo",
-        help="Import all PO files downloaded from transifex.",
-    )
     (options, args) = parser.parse_args()
     return options

-def build_help():
-    print("Generating Help")
-    current_path = op.abspath(".")
-    help_basepath = op.join(current_path, "help", "en")
-    help_destpath = op.join(current_path, "build", "help")
-    changelog_path = op.join(current_path, "help", "changelog")
+def build_one_help(language):
+    print(f"Generating Help in {language}")
+    current_path = Path(".").absolute()
+    changelog_path = current_path.joinpath("help", "changelog")
     tixurl = "https://github.com/arsenetar/dupeguru/issues/{}"
-    confrepl = {"language": "en"}
-    changelogtmpl = op.join(current_path, "help", "changelog.tmpl")
-    conftmpl = op.join(current_path, "help", "conf.tmpl")
+    changelogtmpl = current_path.joinpath("help", "changelog.tmpl")
+    conftmpl = current_path.joinpath("help", "conf.tmpl")
+    help_basepath = current_path.joinpath("help", language)
+    help_destpath = current_path.joinpath("build", "help", language)
+    confrepl = {"language": language}
     sphinxgen.gen(
         help_basepath,
         help_destpath,
@@ -92,103 +81,44 @@ def build_help():
     )

-def build_qt_localizations():
-    loc.compile_all_po(op.join("qtlib", "locale"))
-    loc.merge_locale_dir(op.join("qtlib", "locale"), "locale")
+def build_help():
+    languages = ["en", "de", "fr", "hy", "ru", "uk"]
+    # Running with Pools as for some reason sphinx seems to cross contaminate the output otherwise
+    with Pool(len(languages)) as p:
+        p.map(build_one_help, languages)

 def build_localizations():
     loc.compile_all_po("locale")
-    build_qt_localizations()
-    locale_dest = op.join("build", "locale")
-    if op.exists(locale_dest):
+    locale_dest = Path("build", "locale")
+    if locale_dest.exists():
         shutil.rmtree(locale_dest)
-    shutil.copytree(
-        "locale", locale_dest, ignore=shutil.ignore_patterns("*.po", "*.pot")
-    )
+    shutil.copytree("locale", locale_dest, ignore=shutil.ignore_patterns("*.po", "*.pot"))

 def build_updatepot():
     print("Building .pot files from source files")
     print("Building core.pot")
-    loc.generate_pot(["core"], op.join("locale", "core.pot"), ["tr"])
+    loc.generate_pot(["core"], Path("locale", "core.pot"), ["tr"])
     print("Building columns.pot")
-    loc.generate_pot(["core"], op.join("locale", "columns.pot"), ["coltr"])
+    loc.generate_pot(["core"], Path("locale", "columns.pot"), ["coltr"])
     print("Building ui.pot")
-    # When we're not under OS X, we don't want to overwrite ui.pot because it contains Cocoa locs
-    # We want to merge the generated pot with the old pot in the most preserving way possible.
-    ui_packages = ["qt", op.join("cocoa", "inter")]
-    loc.generate_pot(ui_packages, op.join("locale", "ui.pot"), ["tr"], merge=True)
-    print("Building qtlib.pot")
-    loc.generate_pot(["qtlib"], op.join("qtlib", "locale", "qtlib.pot"), ["tr"])
+    loc.generate_pot(["qt"], Path("locale", "ui.pot"), ["tr"], merge=True)

 def build_mergepot():
     print("Updating .po files using .pot files")
     loc.merge_pots_into_pos("locale")
-    loc.merge_pots_into_pos(op.join("qtlib", "locale"))
-    # loc.merge_pots_into_pos(op.join("cocoalib", "locale"))

 def build_normpo():
     loc.normalize_all_pos("locale")
-    loc.normalize_all_pos(op.join("qtlib", "locale"))
-    # loc.normalize_all_pos(op.join("cocoalib", "locale"))

-def build_importpo():
-    basePath = Path.cwd()
-    # expect a folder named transifex with all the .po files from the exports
-    translationsPath = basePath.joinpath("transifex")
-    # locations where the translation files go
-    qtlibPath = basePath.joinpath("qtlib", "locale")
-    localePath = basePath.joinpath("locale")
-    for translation in translationsPath.iterdir():
-        # transifex files are named resource_lang.po so split on first '_'
-        parts = translation.stem.split("_", 1)
-        resource = parts[0]
-        language = parts[1]
-        # make sure qtlib resources go to dedicated folder
-        if resource == "qtlib":
-            outputPath = qtlibPath
-        else:
-            outputPath = localePath
-        outputFolder = outputPath.joinpath(language, "LC_MESSAGES")
-        # create the language folder if it is new
-        if not outputFolder.exists():
-            outputFolder.mkdir(parents=True)
-        # copy the po file over
-        shutil.copy(translation, outputFolder.joinpath(resource + ".po"))
-    # normalize files after complete
-    build_normpo()

 def build_pe_modules():
     print("Building PE Modules")
-    exts = [
-        Extension(
-            "_block",
-            [
-                op.join("core", "pe", "modules", "block.c"),
-                op.join("core", "pe", "modules", "common.c"),
-            ],
-        ),
-        Extension(
-            "_cache",
-            [
-                op.join("core", "pe", "modules", "cache.c"),
-                op.join("core", "pe", "modules", "common.c"),
-            ],
-        ),
-    ]
-    exts.append(Extension("_block_qt", [op.join("qt", "pe", "modules", "block.c")]))
-    setup(
-        script_args=["build_ext", "--inplace"],
-        ext_modules=exts,
-    )
-    move_all("_block_qt*", op.join("qt", "pe"))
-    move_all("_block*", op.join("core", "pe"))
-    move_all("_cache*", op.join("core", "pe"))
+    # Leverage setup.py to build modules
+    sandbox.run_setup("setup.py", ["build_ext", "--inplace"])

 def build_normal():
@@ -199,21 +129,22 @@ def build_normal():
     print("Building localizations")
     build_localizations()
     print("Building Qt stuff")
-    print_and_do(
-        "pyrcc5 {0} > {1}".format(op.join("qt", "dg.qrc"), op.join("qt", "dg_rc.py"))
-    )
-    fix_qt_resource_file(op.join("qt", "dg_rc.py"))
+    print_and_do("pyrcc5 {} > {}".format(Path("qt", "dg.qrc"), Path("qt", "dg_rc.py")))
+    fix_qt_resource_file(Path("qt", "dg_rc.py"))
     build_help()

 def main():
+    if sys.version_info < (3, 7):
+        sys.exit("Python < 3.7 is unsupported.")
     options = parse_args()
-    if options.clean:
-        if op.exists("build"):
-            shutil.rmtree("build")
-    if not op.exists("build"):
-        os.mkdir("build")
+    if options.clean and Path("build").exists():
+        shutil.rmtree("build")
+    if not Path("build").exists():
+        Path("build").mkdir()
     if options.doc:
+        build_one_help("en")
+    elif options.all_doc:
         build_help()
     elif options.loc:
         build_localizations()
@@ -225,8 +156,6 @@ def main():
         build_normpo()
     elif options.modules:
         build_pe_modules()
-    elif options.importpo:
-        build_importpo()
     else:
         build_normal()

core/__init__.py (modified)

@@ -1,2 +1,2 @@
-__version__ = "4.1.1"
+__version__ = "4.3.0"
 __appname__ = "dupeGuru"

core/app.py (modified)

@@ -4,37 +4,39 @@
# which should be included with this package. The terms are also available at # which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
import cProfile
import datetime
import os import os
import os.path as op import os.path as op
import logging import logging
import subprocess import subprocess
import re import re
import shutil import shutil
from pathlib import Path
from send2trash import send2trash from send2trash import send2trash
from hscommon.jobprogress import job from hscommon.jobprogress import job
from hscommon.notify import Broadcaster from hscommon.notify import Broadcaster
from hscommon.path import Path
from hscommon.conflict import smart_move, smart_copy from hscommon.conflict import smart_move, smart_copy
from hscommon.gui.progress_window import ProgressWindow from hscommon.gui.progress_window import ProgressWindow
from hscommon.util import delete_if_empty, first, escape, nonone, allsame from hscommon.util import delete_if_empty, first, escape, nonone, allsame
from hscommon.trans import tr from hscommon.trans import tr
from hscommon import desktop from hscommon import desktop
from . import se, me, pe from core import se, me, pe
from .pe.photo import get_delta_dimensions from core.pe.photo import get_delta_dimensions
from .util import cmp_value, fix_surrogate_encoding from core.util import cmp_value, fix_surrogate_encoding
from . import directories, results, export, fs, prioritize from core import directories, results, export, fs, prioritize
from .ignore import IgnoreList from core.ignore import IgnoreList
from .exclude import ExcludeDict as ExcludeList from core.exclude import ExcludeDict as ExcludeList
from .scanner import ScanType from core.scanner import ScanType
from .gui.deletion_options import DeletionOptions from core.gui.deletion_options import DeletionOptions
from .gui.details_panel import DetailsPanel from core.gui.details_panel import DetailsPanel
from .gui.directory_tree import DirectoryTree from core.gui.directory_tree import DirectoryTree
from .gui.ignore_list_dialog import IgnoreListDialog from core.gui.ignore_list_dialog import IgnoreListDialog
from .gui.exclude_list_dialog import ExcludeListDialogCore from core.gui.exclude_list_dialog import ExcludeListDialogCore
from .gui.problem_dialog import ProblemDialog from core.gui.problem_dialog import ProblemDialog
from .gui.stats_label import StatsLabel from core.gui.stats_label import StatsLabel
HAD_FIRST_LAUNCH_PREFERENCE = "HadFirstLaunch" HAD_FIRST_LAUNCH_PREFERENCE = "HadFirstLaunch"
DEBUG_MODE_PREFERENCE = "DebugMode" DEBUG_MODE_PREFERENCE = "DebugMode"
@@ -48,31 +50,31 @@ MSG_MANY_FILES_TO_OPEN = tr(
class DestType: class DestType:
Direct = 0 DIRECT = 0
Relative = 1 RELATIVE = 1
Absolute = 2 ABSOLUTE = 2
class JobType: class JobType:
Scan = "job_scan" SCAN = "job_scan"
Load = "job_load" LOAD = "job_load"
Move = "job_move" MOVE = "job_move"
Copy = "job_copy" COPY = "job_copy"
Delete = "job_delete" DELETE = "job_delete"
class AppMode: class AppMode:
Standard = 0 STANDARD = 0
Music = 1 MUSIC = 1
Picture = 2 PICTURE = 2
JOBID2TITLE = { JOBID2TITLE = {
JobType.Scan: tr("Scanning for duplicates"), JobType.SCAN: tr("Scanning for duplicates"),
JobType.Load: tr("Loading"), JobType.LOAD: tr("Loading"),
JobType.Move: tr("Moving"), JobType.MOVE: tr("Moving"),
JobType.Copy: tr("Copying"), JobType.COPY: tr("Copying"),
JobType.Delete: tr("Sending to Trash"), JobType.DELETE: tr("Sending to Trash"),
} }
@@ -126,20 +128,20 @@ class DupeGuru(Broadcaster):
     PICTURE_CACHE_TYPE = "sqlite"  # set to 'shelve' for a ShelveCache

-    def __init__(self, view):
+    def __init__(self, view, portable=False):
         if view.get_default(DEBUG_MODE_PREFERENCE):
             logging.getLogger().setLevel(logging.DEBUG)
             logging.debug("Debug mode enabled")
         Broadcaster.__init__(self)
         self.view = view
-        self.appdata = desktop.special_folder_path(
-            desktop.SpecialFolder.AppData, appname=self.NAME
-        )
+        self.appdata = desktop.special_folder_path(desktop.SpecialFolder.APPDATA, portable=portable)
         if not op.exists(self.appdata):
             os.makedirs(self.appdata)
-        self.app_mode = AppMode.Standard
+        self.app_mode = AppMode.STANDARD
         self.discarded_file_count = 0
         self.exclude_list = ExcludeList()
+        hash_cache_file = op.join(self.appdata, "hash_cache.db")
+        fs.filesdb.connect(hash_cache_file)
         self.directories = directories.Directories(self.exclude_list)
         self.results = results.Results(self)
         self.ignore_list = IgnoreList()
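The fs.filesdb calls added above introduce a persistent hash cache kept in appdata and reused across scans (committed after a scan, cleared and closed elsewhere in this diff). A minimal sketch of the idea, assuming a plain SQLite table keyed by path and mtime; the names and schema here are illustrative, not dupeGuru's actual fs.filesdb:

import sqlite3

class FileHashCache:
    # Illustrative sketch: cache file digests keyed by (path, mtime).
    def connect(self, db_path):
        self.con = sqlite3.connect(db_path)
        self.con.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, mtime REAL, digest BLOB)")

    def get(self, path, mtime):
        row = self.con.execute(
            "SELECT digest FROM files WHERE path = ? AND mtime = ?", (str(path), mtime)
        ).fetchone()
        return row[0] if row else None  # a miss means the file is new or changed

    def put(self, path, mtime, digest):
        self.con.execute("REPLACE INTO files VALUES (?, ?, ?)", (str(path), mtime, digest))

    def commit(self):
        self.con.commit()

    def clear(self):
        self.con.execute("DELETE FROM files")

    def close(self):
        self.con.close()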
@@ -150,7 +152,7 @@ class DupeGuru(Broadcaster):
"escape_filter_regexp": True, "escape_filter_regexp": True,
"clean_empty_dirs": False, "clean_empty_dirs": False,
"ignore_hardlink_matches": False, "ignore_hardlink_matches": False,
"copymove_dest_type": DestType.Relative, "copymove_dest_type": DestType.RELATIVE,
"picture_cache_type": self.PICTURE_CACHE_TYPE, "picture_cache_type": self.PICTURE_CACHE_TYPE,
} }
self.selected_dupes = [] self.selected_dupes = []
@@ -171,9 +173,9 @@ class DupeGuru(Broadcaster):
     def _recreate_result_table(self):
         if self.result_table is not None:
             self.result_table.disconnect()
-        if self.app_mode == AppMode.Picture:
+        if self.app_mode == AppMode.PICTURE:
             self.result_table = pe.result_table.ResultTable(self)
-        elif self.app_mode == AppMode.Music:
+        elif self.app_mode == AppMode.MUSIC:
             self.result_table = me.result_table.ResultTable(self)
         else:
             self.result_table = se.result_table.ResultTable(self)
@@ -182,20 +184,14 @@ class DupeGuru(Broadcaster):
     def _get_picture_cache_path(self):
         cache_type = self.options["picture_cache_type"]
-        cache_name = (
-            "cached_pictures.shelve" if cache_type == "shelve" else "cached_pictures.db"
-        )
+        cache_name = "cached_pictures.shelve" if cache_type == "shelve" else "cached_pictures.db"
         return op.join(self.appdata, cache_name)

     def _get_dupe_sort_key(self, dupe, get_group, key, delta):
-        if self.app_mode in (AppMode.Music, AppMode.Picture):
-            if key == "folder_path":
-                dupe_folder_path = getattr(
-                    dupe, "display_folder_path", dupe.folder_path
-                )
-                return str(dupe_folder_path).lower()
-        if self.app_mode == AppMode.Picture:
-            if delta and key == "dimensions":
-                r = cmp_value(dupe, key)
-                ref_value = cmp_value(get_group().ref, key)
-                return get_delta_dimensions(r, ref_value)
+        if self.app_mode in (AppMode.MUSIC, AppMode.PICTURE) and key == "folder_path":
+            dupe_folder_path = getattr(dupe, "display_folder_path", dupe.folder_path)
+            return str(dupe_folder_path).lower()
+        if self.app_mode == AppMode.PICTURE and delta and key == "dimensions":
+            r = cmp_value(dupe, key)
+            ref_value = cmp_value(get_group().ref, key)
+            return get_delta_dimensions(r, ref_value)
@@ -218,11 +214,8 @@ class DupeGuru(Broadcaster):
         return result

     def _get_group_sort_key(self, group, key):
-        if self.app_mode in (AppMode.Music, AppMode.Picture):
-            if key == "folder_path":
-                dupe_folder_path = getattr(
-                    group.ref, "display_folder_path", group.ref.folder_path
-                )
-                return str(dupe_folder_path).lower()
+        if self.app_mode in (AppMode.MUSIC, AppMode.PICTURE) and key == "folder_path":
+            dupe_folder_path = getattr(group.ref, "display_folder_path", group.ref.folder_path)
+            return str(dupe_folder_path).lower()
         if key == "percentage":
             return group.percentage
@@ -235,9 +228,7 @@ class DupeGuru(Broadcaster):
     def _do_delete(self, j, link_deleted, use_hardlinks, direct_deletion):
         def op(dupe):
             j.add_progress()
-            return self._do_delete_dupe(
-                dupe, link_deleted, use_hardlinks, direct_deletion
-            )
+            return self._do_delete_dupe(dupe, link_deleted, use_hardlinks, direct_deletion)

         j.start_job(self.results.mark_count)
         self.results.perform_on_marked(op, True)
@@ -259,7 +250,7 @@ class DupeGuru(Broadcaster):
             ref = group.ref
             linkfunc = os.link if use_hardlinks else os.symlink
             linkfunc(str(ref.path), str_path)
-        self.clean_empty_dirs(dupe.path.parent())
+        self.clean_empty_dirs(dupe.path.parent)

     def _create_file(self, path):
         # We add fs.Folder to fileclasses in case the file we're loading contains folder paths.
@@ -273,15 +264,11 @@ class DupeGuru(Broadcaster):
         try:
             f._read_all_info(attrnames=self.METADATA_TO_READ)
             return f
-        except EnvironmentError:
+        except OSError:
             return None

     def _get_export_data(self):
-        columns = [
-            col
-            for col in self.result_table.columns.ordered_columns
-            if col.visible and col.name != "marked"
-        ]
+        columns = [col for col in self.result_table._columns.ordered_columns if col.visible and col.name != "marked"]
         colnames = [col.display for col in columns]
         rows = []
         for group_id, group in enumerate(self.results.groups):
@@ -293,11 +280,7 @@ class DupeGuru(Broadcaster):
         return colnames, rows

     def _results_changed(self):
-        self.selected_dupes = [
-            d
-            for d in self.selected_dupes
-            if self.results.get_group_of_duplicate(d) is not None
-        ]
+        self.selected_dupes = [d for d in self.selected_dupes if self.results.get_group_of_duplicate(d) is not None]
         self.notify("results_changed")

     def _start_job(self, jobid, func, args=()):
@@ -312,34 +295,36 @@ class DupeGuru(Broadcaster):
         self.view.show_message(msg)

     def _job_completed(self, jobid):
-        if jobid == JobType.Scan:
+        if jobid == JobType.SCAN:
             self._results_changed()
+            fs.filesdb.commit()
             if not self.results.groups:
                 self.view.show_message(tr("No duplicates found."))
             else:
                 self.view.show_results_window()
-        if jobid in {JobType.Move, JobType.Delete}:
+        if jobid in {JobType.MOVE, JobType.DELETE}:
             self._results_changed()
-        if jobid == JobType.Load:
+        if jobid == JobType.LOAD:
             self._recreate_result_table()
             self._results_changed()
             self.view.show_results_window()
-        if jobid in {JobType.Copy, JobType.Move, JobType.Delete}:
+        if jobid in {JobType.COPY, JobType.MOVE, JobType.DELETE}:
             if self.results.problems:
                 self.problem_dialog.refresh()
                 self.view.show_problem_dialog()
             else:
-                msg = {
-                    JobType.Copy: tr("All marked files were copied successfully."),
-                    JobType.Move: tr("All marked files were moved successfully."),
-                    JobType.Delete: tr(
-                        "All marked files were successfully sent to Trash."
-                    ),
-                }[jobid]
+                if jobid == JobType.COPY:
+                    msg = tr("All marked files were copied successfully.")
+                elif jobid == JobType.MOVE:
+                    msg = tr("All marked files were moved successfully.")
+                elif jobid == JobType.DELETE and self.deletion_options.direct:
+                    msg = tr("All marked files were deleted successfully.")
+                else:
+                    msg = tr("All marked files were successfully sent to Trash.")
                 self.view.show_message(msg)

     def _job_error(self, jobid, err):
-        if jobid == JobType.Load:
+        if jobid == JobType.LOAD:
             msg = tr("Could not load file: {}").format(err)
             self.view.show_message(msg)
             return False
@@ -369,17 +354,17 @@ class DupeGuru(Broadcaster):
     # --- Protected
     def _get_fileclasses(self):
-        if self.app_mode == AppMode.Picture:
+        if self.app_mode == AppMode.PICTURE:
             return [pe.photo.PLAT_SPECIFIC_PHOTO_CLASS]
-        elif self.app_mode == AppMode.Music:
+        elif self.app_mode == AppMode.MUSIC:
             return [me.fs.MusicFile]
         else:
             return [se.fs.File]

     def _prioritization_categories(self):
-        if self.app_mode == AppMode.Picture:
+        if self.app_mode == AppMode.PICTURE:
             return pe.prioritize.all_categories()
-        elif self.app_mode == AppMode.Music:
+        elif self.app_mode == AppMode.MUSIC:
             return me.prioritize.all_categories()
         else:
             return prioritize.all_categories()
@@ -401,41 +386,38 @@ class DupeGuru(Broadcaster):
self.view.show_message(tr("'{}' does not exist.").format(d)) self.view.show_message(tr("'{}' does not exist.").format(d))
def add_selected_to_ignore_list(self): def add_selected_to_ignore_list(self):
"""Adds :attr:`selected_dupes` to :attr:`ignore_list`. """Adds :attr:`selected_dupes` to :attr:`ignore_list`."""
"""
dupes = self.without_ref(self.selected_dupes) dupes = self.without_ref(self.selected_dupes)
if not dupes: if not dupes:
self.view.show_message(MSG_NO_SELECTED_DUPES) self.view.show_message(MSG_NO_SELECTED_DUPES)
return return
msg = tr( msg = tr("All selected %d matches are going to be ignored in all subsequent scans. Continue?")
"All selected %d matches are going to be ignored in all subsequent scans. Continue?"
)
if not self.view.ask_yes_no(msg % len(dupes)): if not self.view.ask_yes_no(msg % len(dupes)):
return return
for dupe in dupes: for dupe in dupes:
g = self.results.get_group_of_duplicate(dupe) g = self.results.get_group_of_duplicate(dupe)
for other in g: for other in g:
if other is not dupe: if other is not dupe:
self.ignore_list.Ignore(str(other.path), str(dupe.path)) self.ignore_list.ignore(str(other.path), str(dupe.path))
self.remove_duplicates(dupes) self.remove_duplicates(dupes)
self.ignore_list_dialog.refresh() self.ignore_list_dialog.refresh()
def apply_filter(self, filter): def apply_filter(self, result_filter):
"""Apply a filter ``filter`` to the results so that it shows only dupe groups that match it. """Apply a filter ``filter`` to the results so that it shows only dupe groups that match it.
:param str filter: filter to apply :param str filter: filter to apply
""" """
self.results.apply_filter(None) self.results.apply_filter(None)
if self.options["escape_filter_regexp"]: if self.options["escape_filter_regexp"]:
filter = escape(filter, set("()[]\\.|+?^")) result_filter = escape(result_filter, set("()[]\\.|+?^"))
filter = escape(filter, "*", ".") result_filter = escape(result_filter, "*", ".")
self.results.apply_filter(filter) self.results.apply_filter(result_filter)
self._results_changed() self._results_changed()
def clean_empty_dirs(self, path): def clean_empty_dirs(self, path):
if self.options["clean_empty_dirs"]: if self.options["clean_empty_dirs"]:
while delete_if_empty(path, [".DS_Store"]): while delete_if_empty(path, [".DS_Store"]):
path = path.parent() path = path.parent
def clear_picture_cache(self): def clear_picture_cache(self):
try: try:
@@ -443,27 +425,30 @@ class DupeGuru(Broadcaster):
         except FileNotFoundError:
             pass  # we don't care

+    def clear_hash_cache(self):
+        fs.filesdb.clear()
+
     def copy_or_move(self, dupe, copy: bool, destination: str, dest_type: DestType):
         source_path = dupe.path
-        location_path = first(p for p in self.directories if dupe.path in p)
+        location_path = first(p for p in self.directories if p in dupe.path.parents)
         dest_path = Path(destination)
-        if dest_type in {DestType.Relative, DestType.Absolute}:
+        if dest_type in {DestType.RELATIVE, DestType.ABSOLUTE}:
             # no filename, no windows drive letter
-            source_base = source_path.remove_drive_letter().parent()
-            if dest_type == DestType.Relative:
-                source_base = source_base[location_path:]
-            dest_path = dest_path[source_base]
+            source_base = source_path.relative_to(source_path.anchor).parent
+            if dest_type == DestType.RELATIVE:
+                source_base = source_base.relative_to(location_path.relative_to(location_path.anchor))
+            dest_path = dest_path.joinpath(source_base)
         if not dest_path.exists():
-            dest_path.makedirs()
+            dest_path.mkdir(parents=True)
         # Add filename to dest_path. For file move/copy, it's not required, but for folders, yes.
-        dest_path = dest_path[source_path.name]
+        dest_path = dest_path.joinpath(source_path.name)
         logging.debug("Copy/Move operation from '%s' to '%s'", source_path, dest_path)
         # Raises an EnvironmentError if there's a problem
         if copy:
             smart_copy(source_path, dest_path)
         else:
             smart_move(source_path, dest_path)
-        self.clean_empty_dirs(source_path.parent())
+        self.clean_empty_dirs(source_path.parent)
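The rewrite above is part of the migration from hscommon.path to pathlib: the old slicing operators (source_base[location_path:], dest_path[source_base]) become explicit relative_to()/joinpath() calls. The RELATIVE destination computation in isolation, as a sketch using only pathlib:

from pathlib import Path

def relative_dest(source_path: Path, location_path: Path, destination: Path) -> Path:
    # Strip the drive letter / root ("anchor") so the path can be re-anchored elsewhere.
    source_base = source_path.relative_to(source_path.anchor).parent
    # Keep only the part below the scanned directory.
    source_base = source_base.relative_to(location_path.relative_to(location_path.anchor))
    return destination.joinpath(source_base, source_path.name)

# relative_dest(Path("/data/photos/2021/img.jpg"), Path("/data/photos"), Path("/backup"))
# -> Path("/backup/2021/img.jpg")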

     def copy_or_move_marked(self, copy):
         """Start an async move (or copy) job on marked duplicates.
@@ -483,16 +468,17 @@ class DupeGuru(Broadcaster):
             self.view.show_message(MSG_NO_MARKED_DUPES)
             return
         destination = self.view.select_dest_folder(
-            tr("Select a directory to copy marked files to") if copy
-            else tr("Select a directory to move marked files to"))
+            tr("Select a directory to copy marked files to")
+            if copy
+            else tr("Select a directory to move marked files to")
+        )
         if destination:
             desttype = self.options["copymove_dest_type"]
-            jobid = JobType.Copy if copy else JobType.Move
+            jobid = JobType.COPY if copy else JobType.MOVE
             self._start_job(jobid, do)

     def delete_marked(self):
-        """Start an async job to send marked duplicates to the trash.
-        """
+        """Start an async job to send marked duplicates to the trash."""
         if not self.results.mark_count:
             self.view.show_message(MSG_NO_MARKED_DUPES)
             return
@@ -504,7 +490,7 @@ class DupeGuru(Broadcaster):
             self.deletion_options.direct,
         ]
         logging.debug("Starting deletion job with args %r", args)
-        self._start_job(JobType.Delete, self._do_delete, args=args)
+        self._start_job(JobType.DELETE, self._do_delete, args=args)

     def export_to_xhtml(self):
         """Export current results to XHTML.
@@ -523,9 +509,7 @@ class DupeGuru(Broadcaster):
         The columns and their order in the resulting CSV file is determined in the same way as in
         :meth:`export_to_xhtml`.
         """
-        dest_file = self.view.select_dest_file(
-            tr("Select a destination for your exported CSV"), "csv"
-        )
+        dest_file = self.view.select_dest_file(tr("Select a destination for your exported CSV"), "csv")
         if dest_file:
             colnames, rows = self._get_export_data()
             try:
@@ -542,9 +526,7 @@ class DupeGuru(Broadcaster):
         try:
             return dupe.get_display_info(group, delta)
         except Exception as e:
-            logging.warning(
-                "Exception (type: %s) on GetDisplayInfo for %s: %s",
-                type(e), str(dupe.path), str(e))
+            logging.warning("Exception (type: %s) on GetDisplayInfo for %s: %s", type(e), str(dupe.path), str(e))
             return empty_data()

     def invoke_custom_command(self):
@@ -556,28 +538,30 @@ class DupeGuru(Broadcaster):
""" """
cmd = self.view.get_default("CustomCommand") cmd = self.view.get_default("CustomCommand")
if not cmd: if not cmd:
msg = tr( msg = tr("You have no custom command set up. Set it up in your preferences.")
"You have no custom command set up. Set it up in your preferences."
)
self.view.show_message(msg) self.view.show_message(msg)
return return
if not self.selected_dupes: if not self.selected_dupes:
return return
dupe = self.selected_dupes[0] dupes = self.selected_dupes
group = self.results.get_group_of_duplicate(dupe) refs = [self.results.get_group_of_duplicate(dupe).ref for dupe in dupes]
ref = group.ref for dupe, ref in zip(dupes, refs):
cmd = cmd.replace("%d", str(dupe.path)) dupe_cmd = cmd.replace("%d", str(dupe.path))
cmd = cmd.replace("%r", str(ref.path)) dupe_cmd = dupe_cmd.replace("%r", str(ref.path))
match = re.match(r'"([^"]+)"(.*)', cmd) match = re.match(r'"([^"]+)"(.*)', dupe_cmd)
if match is not None: if match is not None:
# This code here is because subprocess. Popen doesn't seem to accept, under Windows, # This code here is because subprocess. Popen doesn't seem to accept, under Windows,
# executable paths with spaces in it, *even* when they're enclosed in "". So this is # executable paths with spaces in it, *even* when they're enclosed in "". So this is
# a workaround to make the damn thing work. # a workaround to make the damn thing work.
exepath, args = match.groups() exepath, args = match.groups()
path, exename = op.split(exepath) path, exename = op.split(exepath)
subprocess.Popen(exename + args, shell=True, cwd=path) p = subprocess.Popen(exename + args, shell=True, cwd=path, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output = p.stdout.read()
logging.info("Custom command %s %s: %s", exename, args, output)
else: else:
subprocess.Popen(cmd, shell=True) p = subprocess.Popen(dupe_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output = p.stdout.read()
logging.info("Custom command %s: %s", dupe_cmd, output)

     def load(self):
         """Load directory selection and ignore list from files in appdata.
@@ -610,7 +594,7 @@ class DupeGuru(Broadcaster):
         def do(j):
             self.results.load_from_xml(filename, self._get_file, j)

-        self._start_job(JobType.Load, do)
+        self._start_job(JobType.LOAD, do)

     def make_selected_reference(self):
         """Promote :attr:`selected_dupes` to reference position within their respective groups.
@@ -623,8 +607,7 @@ class DupeGuru(Broadcaster):
         changed_groups = set()
         for dupe in dupes:
             g = self.results.get_group_of_duplicate(dupe)
-            if g not in changed_groups:
-                if self.results.make_ref(dupe):
-                    changed_groups.add(g)
+            if g not in changed_groups and self.results.make_ref(dupe):
+                changed_groups.add(g)
         # It's not always obvious to users what this action does, so to make it a bit clearer,
         # we change our selection to the ref of all changed groups. However, we also want to keep
@@ -634,9 +617,7 @@ class DupeGuru(Broadcaster):
         if not self.result_table.power_marker:
             if changed_groups:
                 self.selected_dupes = [
-                    d
-                    for d in self.selected_dupes
-                    if self.results.get_group_of_duplicate(d).ref is d
+                    d for d in self.selected_dupes if self.results.get_group_of_duplicate(d).ref is d
                 ]
             self.notify("results_changed")
         else:
@@ -648,20 +629,17 @@ class DupeGuru(Broadcaster):
self.notify("results_changed_but_keep_selection") self.notify("results_changed_but_keep_selection")
def mark_all(self): def mark_all(self):
"""Set all dupes in the results as marked. """Set all dupes in the results as marked."""
"""
self.results.mark_all() self.results.mark_all()
self.notify("marking_changed") self.notify("marking_changed")
def mark_none(self): def mark_none(self):
"""Set all dupes in the results as unmarked. """Set all dupes in the results as unmarked."""
"""
self.results.mark_none() self.results.mark_none()
self.notify("marking_changed") self.notify("marking_changed")
def mark_invert(self): def mark_invert(self):
"""Invert the marked state of all dupes in the results. """Invert the marked state of all dupes in the results."""
"""
self.results.mark_invert() self.results.mark_invert()
self.notify("marking_changed") self.notify("marking_changed")
@@ -679,18 +657,15 @@ class DupeGuru(Broadcaster):
self.notify("marking_changed") self.notify("marking_changed")
def open_selected(self): def open_selected(self):
"""Open :attr:`selected_dupes` with their associated application. """Open :attr:`selected_dupes` with their associated application."""
""" if len(self.selected_dupes) > 10 and not self.view.ask_yes_no(MSG_MANY_FILES_TO_OPEN):
if len(self.selected_dupes) > 10:
if not self.view.ask_yes_no(MSG_MANY_FILES_TO_OPEN):
return return
for dupe in self.selected_dupes: for dupe in self.selected_dupes:
desktop.open_path(dupe.path) desktop.open_path(dupe.path)
def purge_ignore_list(self): def purge_ignore_list(self):
"""Remove files that don't exist from :attr:`ignore_list`. """Remove files that don't exist from :attr:`ignore_list`."""
""" self.ignore_list.filter(lambda f, s: op.exists(f) and op.exists(s))
self.ignore_list.Filter(lambda f, s: op.exists(f) and op.exists(s))
self.ignore_list_dialog.refresh() self.ignore_list_dialog.refresh()
def remove_directories(self, indexes): def remove_directories(self, indexes):
@@ -719,8 +694,7 @@ class DupeGuru(Broadcaster):
self.notify("results_changed_but_keep_selection") self.notify("results_changed_but_keep_selection")
def remove_marked(self): def remove_marked(self):
"""Removed marked duplicates from the results (without touching the files themselves). """Removed marked duplicates from the results (without touching the files themselves)."""
"""
if not self.results.mark_count: if not self.results.mark_count:
self.view.show_message(MSG_NO_MARKED_DUPES) self.view.show_message(MSG_NO_MARKED_DUPES)
return return
@@ -731,8 +705,7 @@ class DupeGuru(Broadcaster):
         self._results_changed()

     def remove_selected(self):
-        """Removed :attr:`selected_dupes` from the results (without touching the files themselves).
-        """
+        """Removed :attr:`selected_dupes` from the results (without touching the files themselves)."""
         dupes = self.without_ref(self.selected_dupes)
         if not dupes:
             self.view.show_message(MSG_NO_SELECTED_DUPES)
@@ -770,10 +743,10 @@ class DupeGuru(Broadcaster):
         for group in self.results.groups:
             if group.prioritize(key_func=sort_key):
                 count += 1
+        if count:
+            self.results.refresh_required = True
         self._results_changed()
-        msg = tr("{} duplicate groups were changed by the re-prioritization.").format(
-            count
-        )
+        msg = tr("{} duplicate groups were changed by the re-prioritization.").format(count)
         self.view.show_message(msg)

     def reveal_selected(self):
@@ -790,6 +763,9 @@ class DupeGuru(Broadcaster):
             self.exclude_list.save_to_xml(p)
         self.notify("save_session")

+    def close(self):
+        fs.filesdb.close()
+
     def save_as(self, filename):
         """Save results in ``filename``.
@@ -810,44 +786,44 @@ class DupeGuru(Broadcaster):
         except OSError as e:
             self.view.show_message(tr("Couldn't write to file: {}").format(str(e)))

-    def start_scanning(self):
+    def start_scanning(self, profile_scan=False):
         """Starts an async job to scan for duplicates.

         Scans folders selected in :attr:`directories` and put the results in :attr:`results`
         """
         scanner = self.SCANNER_CLASS()
         if not self.directories.has_any_file():
-            self.view.show_message(
-                tr("The selected directories contain no scannable file.")
-            )
+            self.view.show_message(tr("The selected directories contain no scannable file."))
             return
         # Send relevant options down to the scanner instance
         for k, v in self.options.items():
             if hasattr(scanner, k):
                 setattr(scanner, k, v)
-        if self.app_mode == AppMode.Picture:
+        if self.app_mode == AppMode.PICTURE:
             scanner.cache_path = self._get_picture_cache_path()
         self.results.groups = []
         self._recreate_result_table()
         self._results_changed()

         def do(j):
+            if profile_scan:
+                pr = cProfile.Profile()
+                pr.enable()
             j.set_progress(0, tr("Collecting files to scan"))
-            if scanner.scan_type == ScanType.Folders:
-                files = list(
-                    self.directories.get_folders(folderclass=se.fs.Folder, j=j)
-                )
+            if scanner.scan_type == ScanType.FOLDERS:
+                files = list(self.directories.get_folders(folderclass=se.fs.Folder, j=j))
             else:
-                files = list(
-                    self.directories.get_files(fileclasses=self.fileclasses, j=j)
-                )
+                files = list(self.directories.get_files(fileclasses=self.fileclasses, j=j))
             if self.options["ignore_hardlink_matches"]:
                 files = self._remove_hardlink_dupes(files)
             logging.info("Scanning %d files" % len(files))
             self.results.groups = scanner.get_dupe_groups(files, self.ignore_list, j)
             self.discarded_file_count = scanner.discarded_file_count
+            if profile_scan:
+                pr.disable()
+                pr.dump_stats(op.join(self.appdata, f"{datetime.datetime.now():%Y-%m-%d_%H-%M-%S}.profile"))

-        self._start_job(JobType.Scan, do)
+        self._start_job(JobType.SCAN, do)
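With the new profile_scan flag, the whole scan job is wrapped in cProfile and the stats are dumped to a timestamped .profile file in appdata. The same pattern in isolation; the dump can be inspected later with pstats:

import cProfile
import datetime
import os.path as op
import pstats

def run_profiled(func, stats_dir="."):
    pr = cProfile.Profile()
    pr.enable()
    try:
        return func()
    finally:
        pr.disable()
        pr.dump_stats(op.join(stats_dir, f"{datetime.datetime.now():%Y-%m-%d_%H-%M-%S}.profile"))

# Later, inspect the hot spots:
# pstats.Stats("2022-06-30_23-00-00.profile").sort_stats("cumulative").print_stats(20)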

     def toggle_selected_mark_state(self):
         selected = self.without_ref(self.selected_dupes)
@@ -862,13 +838,8 @@ class DupeGuru(Broadcaster):
self.notify("marking_changed") self.notify("marking_changed")
def without_ref(self, dupes): def without_ref(self, dupes):
"""Returns ``dupes`` with all reference elements removed. """Returns ``dupes`` with all reference elements removed."""
""" return [dupe for dupe in dupes if self.results.get_group_of_duplicate(dupe).ref is not dupe]
return [
dupe
for dupe in dupes
if self.results.get_group_of_duplicate(dupe).ref is not dupe
]
def get_default(self, key, fallback_value=None): def get_default(self, key, fallback_value=None):
result = nonone(self.view.get_default(key), fallback_value) result = nonone(self.view.get_default(key), fallback_value)
@@ -897,18 +868,18 @@ class DupeGuru(Broadcaster):
     @property
     def SCANNER_CLASS(self):
-        if self.app_mode == AppMode.Picture:
+        if self.app_mode == AppMode.PICTURE:
             return pe.scanner.ScannerPE
-        elif self.app_mode == AppMode.Music:
+        elif self.app_mode == AppMode.MUSIC:
             return me.scanner.ScannerME
         else:
             return se.scanner.ScannerSE

     @property
     def METADATA_TO_READ(self):
-        if self.app_mode == AppMode.Picture:
+        if self.app_mode == AppMode.PICTURE:
             return ["size", "mtime", "dimensions", "exif_timestamp"]
-        elif self.app_mode == AppMode.Music:
+        elif self.app_mode == AppMode.MUSIC:
             return [
                 "size",
                 "mtime",
View File
@@ -7,12 +7,13 @@
 import os
 from xml.etree import ElementTree as ET
 import logging
+from pathlib import Path

 from hscommon.jobprogress import job
-from hscommon.path import Path
 from hscommon.util import FileOrPath
+from hscommon.trans import tr

-from . import fs
+from core import fs

 __all__ = [
     "Directories",
@@ -30,9 +31,9 @@ class DirectoryState:
     * DirectoryState.Excluded: Don't scan this folder
     """

-    Normal = 0
-    Reference = 1
-    Excluded = 2
+    NORMAL = 0
+    REFERENCE = 1
+    EXCLUDED = 2

 class AlreadyThereError(Exception):
@@ -62,7 +63,7 @@ class Directories:
     def __contains__(self, path):
         for p in self._dirs:
-            if path in p:
+            if path == p or p in path.parents:
                 return True
         return False
@@ -82,76 +83,64 @@ class Directories:
         # We iterate even if we only have one item here
         for denied_path_re in self._exclude_list.compiled:
             if denied_path_re.match(str(path.name)):
-                return DirectoryState.Excluded
+                return DirectoryState.EXCLUDED
         # return # We still use the old logic to force state on hidden dirs
         # Override this in subclasses to specify the state of some special folders.
         if path.name.startswith("."):
-            return DirectoryState.Excluded
+            return DirectoryState.EXCLUDED

     def _get_files(self, from_path, fileclasses, j):
-        for root, dirs, files in os.walk(str(from_path)):
-            j.check_if_cancelled()
-            rootPath = Path(root)
-            state = self.get_state(rootPath)
-            if state == DirectoryState.Excluded:
-                # Recursively get files from folders with lots of subfolder is expensive. However, there
-                # might be a subfolder in this path that is not excluded. What we want to do is to skim
-                # through self.states and see if we must continue, or we can stop right here to save time
-                if not any(p[: len(rootPath)] == rootPath for p in self.states):
-                    del dirs[:]
-            try:
-                if state != DirectoryState.Excluded:
-                    # Old logic
-                    if self._exclude_list is None or not self._exclude_list.mark_count:
-                        found_files = [fs.get_file(rootPath + f, fileclasses=fileclasses) for f in files]
-                    else:
-                        found_files = []
-                        # print(f"len of files: {len(files)} {files}")
-                        for f in files:
-                            found = False
-                            for expr in self._exclude_list.compiled_files:
-                                if expr.match(f):
-                                    found = True
-                                    break
-                            if not found:
-                                for expr in self._exclude_list.compiled_paths:
-                                    if expr.match(root + os.sep + f):
-                                        found = True
-                                        break
-                            if not found:
-                                found_files.append(fs.get_file(rootPath + f, fileclasses=fileclasses))
-                    found_files = [f for f in found_files if f is not None]
-                    # In some cases, directories can be considered as files by dupeGuru, which is
-                    # why we have this line below. In fact, there only one case: Bundle files under
-                    # OS X... In other situations, this forloop will do nothing.
-                    for d in dirs[:]:
-                        f = fs.get_file(rootPath + d, fileclasses=fileclasses)
-                        if f is not None:
-                            found_files.append(f)
-                            dirs.remove(d)
-                logging.debug(
-                    "Collected %d files in folder %s",
-                    len(found_files),
-                    str(rootPath),
-                )
-                for file in found_files:
-                    file.is_ref = state == DirectoryState.Reference
-                    yield file
-            except (EnvironmentError, fs.InvalidPath):
-                pass
+        try:
+            with os.scandir(from_path) as iter:
+                root_path = Path(from_path)
+                state = self.get_state(root_path)
+                # if we have no un-excluded dirs under this directory skip going deeper
+                skip_dirs = state == DirectoryState.EXCLUDED and not any(
+                    p.parts[: len(root_path.parts)] == root_path.parts for p in self.states
+                )
+                count = 0
+                for item in iter:
+                    j.check_if_cancelled()
+                    try:
+                        if item.is_dir():
+                            if skip_dirs:
+                                continue
+                            yield from self._get_files(item.path, fileclasses, j)
+                            continue
+                        elif state == DirectoryState.EXCLUDED:
+                            continue
+                        # File excluding or not
+                        if (
+                            self._exclude_list is None
+                            or not self._exclude_list.mark_count
+                            or not self._exclude_list.is_excluded(str(from_path), item.name)
+                        ):
+                            file = fs.get_file(item, fileclasses=fileclasses)
+                            if file:
+                                file.is_ref = state == DirectoryState.REFERENCE
+                                count += 1
+                                yield file
+                    except (OSError, fs.InvalidPath):
+                        pass
+                logging.debug(
+                    "Collected %d files in folder %s",
+                    count,
+                    str(root_path),
+                )
+        except OSError:
+            pass

     def _get_folders(self, from_folder, j):
         j.check_if_cancelled()
         try:
             for subfolder in from_folder.subfolders:
-                for folder in self._get_folders(subfolder, j):
-                    yield folder
+                yield from self._get_folders(subfolder, j)
             state = self.get_state(from_folder.path)
-            if state != DirectoryState.Excluded:
-                from_folder.is_ref = state == DirectoryState.Reference
+            if state != DirectoryState.EXCLUDED:
+                from_folder.is_ref = state == DirectoryState.REFERENCE
                 logging.debug("Yielding Folder %r state: %d", from_folder, state)
                 yield from_folder
-        except (EnvironmentError, fs.InvalidPath):
+        except (OSError, fs.InvalidPath):
             pass

     # ---Public
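The switch from os.walk() to a recursive os.scandir() generator is what removes the extra stat() calls: each os.DirEntry carries type information gathered during the directory read itself, so is_dir() usually costs no additional system call. The traversal shape in isolation, stripped of dupeGuru's state and exclusion handling:

import os

def iter_files(root):
    # Sketch: yield file paths under root, reusing DirEntry's cached type info.
    try:
        with os.scandir(root) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    yield from iter_files(entry.path)
                elif entry.is_file():
                    yield entry.path
    except OSError:
        pass  # unreadable directory: skip it, as the real code does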
@@ -169,7 +158,7 @@ class Directories:
             raise AlreadyThereError()
         if not path.exists():
             raise InvalidPathError()
-        self._dirs = [p for p in self._dirs if p not in path]
+        self._dirs = [p for p in self._dirs if path not in p.parents]
         self._dirs.append(path)

     @staticmethod
@@ -180,10 +169,10 @@ class Directories:
         :rtype: list of Path
         """
         try:
-            subpaths = [p for p in path.listdir() if p.isdir()]
+            subpaths = [p for p in path.glob("*") if p.is_dir()]
             subpaths.sort(key=lambda x: x.name.lower())
             return subpaths
-        except EnvironmentError:
+        except OSError:
             return []

     def get_files(self, fileclasses=None, j=job.nulljob):
@@ -193,8 +182,12 @@ class Directories:
""" """
if fileclasses is None: if fileclasses is None:
fileclasses = [fs.File] fileclasses = [fs.File]
file_count = 0
for path in self._dirs: for path in self._dirs:
for file in self._get_files(path, fileclasses=fileclasses, j=j): for file in self._get_files(path, fileclasses=fileclasses, j=j):
file_count += 1
if type(j) != job.NullJob:
j.set_progress(-1, tr("Collected {} files to scan").format(file_count))
yield file yield file
def get_folders(self, folderclass=None, j=job.nulljob): def get_folders(self, folderclass=None, j=job.nulljob):
@@ -204,9 +197,13 @@ class Directories:
""" """
if folderclass is None: if folderclass is None:
folderclass = fs.Folder folderclass = fs.Folder
folder_count = 0
for path in self._dirs: for path in self._dirs:
from_folder = folderclass(path) from_folder = folderclass(path)
for folder in self._get_folders(from_folder, j): for folder in self._get_folders(from_folder, j):
folder_count += 1
if type(j) != job.NullJob:
j.set_progress(-1, tr("Collected {} folders to scan").format(folder_count))
yield folder yield folder
def get_state(self, path): def get_state(self, path):
@@ -217,19 +214,16 @@ class Directories:
         # direct match? easy result.
         if path in self.states:
             return self.states[path]
-        state = self._default_state_for_path(path) or DirectoryState.Normal
+        state = self._default_state_for_path(path) or DirectoryState.NORMAL
         # Save non-default states in cache, necessary for _get_files()
-        if state != DirectoryState.Normal:
+        if state != DirectoryState.NORMAL:
             self.states[path] = state
             return state

-        prevlen = 0
-        # we loop through the states to find the longest matching prefix
-        for p, s in self.states.items():
-            if p.is_parent_of(path) and len(p) > prevlen:
-                prevlen = len(p)
-                state = s
+        # find the longest parent path that is in states and return that state if found
+        # NOTE: path.parents is ordered longest to shortest
+        for parent_path in path.parents:
+            # if the parent has a state in cache, return that state
+            if parent_path in self.states:
+                return self.states[parent_path]
         return state
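The new lookup relies on pathlib ordering Path.parents from the longest prefix down, so the first cached ancestor hit is automatically the most specific state, replacing the old manual longest-prefix scan. For example:

from pathlib import Path

states = {Path("/music"): 1, Path("/music/incoming"): 2}

def state_for(path, default=0):
    if path in states:
        return states[path]
    # parents iterates /music/incoming/a, /music/incoming, /music, /
    for parent in path.parents:
        if parent in states:
            return states[parent]
    return default

assert state_for(Path("/music/incoming/a/b.mp3")) == 2  # nearest ancestor wins
assert state_for(Path("/music/x.mp3")) == 1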

     def has_any_file(self):
@@ -298,6 +292,6 @@ class Directories:
         if self.get_state(path) == state:
             return
         for iter_path in list(self.states.keys()):
-            if path.is_parent_of(iter_path):
+            if path in iter_path.parents:
                 del self.states[iter_path]
         self.states[path] = state
View File
@@ -17,17 +17,31 @@ from hscommon.util import flatten, multi_replace
 from hscommon.trans import tr
 from hscommon.jobprogress import job

-(WEIGHT_WORDS, MATCH_SIMILAR_WORDS, NO_FIELD_ORDER,) = range(3)
+(
+    WEIGHT_WORDS,
+    MATCH_SIMILAR_WORDS,
+    NO_FIELD_ORDER,
+) = range(3)

 JOB_REFRESH_RATE = 100
+PROGRESS_MESSAGE = tr("%d matches found from %d groups")

 def getwords(s):
     # We decompose the string so that ascii letters with accents can be part of the word.
     s = normalize("NFD", s)
     s = multi_replace(s, "-_&+():;\\[]{}.,<>/?~!@#$*", " ").lower()
+    # logging.debug(f"DEBUG chars for: {s}\n"
+    #               f"{[c for c in s if ord(c) != 32]}\n"
+    #               f"{[ord(c) for c in s if ord(c) != 32]}")
+    # HACK We shouldn't ignore non-ascii characters altogether. Any Unicode char
+    # above common european characters that cannot be "sanitized" (ie. stripped
+    # of their accents, etc.) are preserved as is. The arbitrary limit is
+    # obtained from this one: ord("\u037e") GREEK QUESTION MARK
     s = "".join(
-        c for c in s if c in string.ascii_letters + string.digits + string.whitespace
+        c
+        for c in s
+        if (ord(c) <= 894 and c in string.ascii_letters + string.digits + string.whitespace) or ord(c) > 894
     )
     return [_f for _f in s.split(" ") if _f]  # remove empty elements
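The reworked filter keeps any character whose code point is above 894 (the cutoff taken from U+037E, GREEK QUESTION MARK) instead of discarding all non-ASCII. After NFD decomposition, combining accents fall below the cutoff and get stripped, while e.g. CJK text survives intact:

getwords("Café, Crème!")   # -> ['cafe', 'creme']  (accents decomposed, then dropped)
getwords("東京 photos")     # -> ['東京', 'photos']  (ord('東') > 894, preserved as-is)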
@@ -93,20 +107,18 @@ def compare_fields(first, second, flags=()):
         # We don't want to remove field directly in the list. We must work on a copy.
         second = second[:]
         for field1 in first:
-            max = 0
+            max_score = 0
             matched_field = None
             for field2 in second:
                 r = compare(field1, field2, flags)
-                if r > max:
-                    max = r
+                if r > max_score:
+                    max_score = r
                     matched_field = field2
-            results.append(max)
+            results.append(max_score)
             if matched_field:
                 second.remove(matched_field)
     else:
-        results = [
-            compare(field1, field2, flags) for field1, field2 in zip(first, second)
-        ]
+        results = [compare(field1, field2, flags) for field1, field2 in zip(first, second)]
     return min(results) if results else 0
@@ -119,9 +131,7 @@ def build_word_dict(objects, j=job.nulljob):
     The result will be a dict with words as keys, lists of objects as values.
     """
     result = defaultdict(set)
-    for object in j.iter_with_progress(
-        objects, "Prepared %d/%d files", JOB_REFRESH_RATE
-    ):
+    for object in j.iter_with_progress(objects, "Prepared %d/%d files", JOB_REFRESH_RATE):
         for word in unpack_fields(object.words):
             result[word].add(object)
     return result
@@ -156,9 +166,7 @@ def reduce_common_words(word_dict, threshold):
     The exception to this removal are the objects where all the words of the object are common.
     Because if we remove them, we will miss some duplicates!
     """
-    uncommon_words = set(
-        word for word, objects in word_dict.items() if len(objects) < threshold
-    )
+    uncommon_words = {word for word, objects in word_dict.items() if len(objects) < threshold}
     for word, objects in list(word_dict.items()):
         if len(objects) < threshold:
             continue
@@ -241,10 +249,11 @@ def getmatches(
         match_flags.append(MATCH_SIMILAR_WORDS)
     if no_field_order:
         match_flags.append(NO_FIELD_ORDER)
-    j.start_job(len(word_dict), tr("0 matches found"))
+    j.start_job(len(word_dict), PROGRESS_MESSAGE % (0, 0))
     compared = defaultdict(set)
     result = []
     try:
+        word_count = 0
         # This whole 'popping' thing is there to avoid taking too much memory at the same time.
         while word_dict:
             items = word_dict.popitem()[1]
@@ -259,41 +268,50 @@ def getmatches(
                     result.append(m)
                     if len(result) >= LIMIT:
                         return result
-            j.add_progress(desc=tr("%d matches found") % len(result))
+            word_count += 1
+            j.add_progress(desc=PROGRESS_MESSAGE % (len(result), word_count))
     except MemoryError:
         # This is the place where the memory usage is at its peak during the scan.
         # Just continue the process with an incomplete list of matches.
         del compared  # This should give us enough room to call logging.
-        logging.warning(
-            "Memory Overflow. Matches: %d. Word dict: %d"
-            % (len(result), len(word_dict))
-        )
+        logging.warning("Memory Overflow. Matches: %d. Word dict: %d" % (len(result), len(word_dict)))
         return result
     return result

-def getmatches_by_contents(files, j=job.nulljob):
+def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
     """Returns a list of :class:`Match` within ``files`` if their contents is the same.

+    :param bigsize: The size in bytes over which we consider files big enough to
+        justify taking samples of the file for hashing. If 0, compute digest as usual.
     :param j: A :ref:`job progress instance <jobs>`.
     """
     size2files = defaultdict(set)
     for f in files:
-        size2files[f.size].add(f)
+        if f.size:
+            size2files[f.size].add(f)
     del files
     possible_matches = [files for files in size2files.values() if len(files) > 1]
     del size2files
     result = []
-    j.start_job(len(possible_matches), tr("0 matches found"))
+    j.start_job(len(possible_matches), PROGRESS_MESSAGE % (0, 0))
+    group_count = 0
     for group in possible_matches:
         for first, second in itertools.combinations(group, 2):
             if first.is_ref and second.is_ref:
                 continue  # Don't spend time comparing two ref pics together.
-            if first.md5partial == second.md5partial:
-                if first.md5 == second.md5:
-                    result.append(Match(first, second, 100))
-        j.add_progress(desc=tr("%d matches found") % len(result))
+            if first.size == 0 and second.size == 0:
+                # skip hashing for zero length files
+                result.append(Match(first, second, 100))
+                continue
+            if first.digest_partial == second.digest_partial:
+                if bigsize > 0 and first.size > bigsize:
+                    if first.digest_samples == second.digest_samples:
+                        result.append(Match(first, second, 100))
+                else:
+                    if first.digest == second.digest:
+                        result.append(Match(first, second, 100))
+        group_count += 1
+        j.add_progress(desc=PROGRESS_MESSAGE % (len(result), group_count))
     return result
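For files larger than bigsize, equality is decided on digest_samples, a digest computed from a few chunks of the file rather than its whole contents, trading a tiny false-match risk for far less I/O on huge files. A sketch of what sample-based digesting can look like; the chunk choice and hash function here are illustrative, the real offsets live in dupeGuru's fs module:

import hashlib

def digest_samples(path, chunk_size=1024 * 1024):
    # Sketch: hash the start, middle and end of the file instead of all of it.
    h = hashlib.md5()
    with open(path, "rb") as fp:
        fp.seek(0, 2)
        size = fp.tell()
        for offset in (0, size // 2, max(size - chunk_size, 0)):
            fp.seek(offset)
            h.update(fp.read(chunk_size))
    return h.digest()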
@@ -391,18 +409,13 @@ class Group:
         You can call this after the duplicate scanning process to free a bit of memory.
         """
-        discarded = set(
-            m
-            for m in self.matches
-            if not all(obj in self.unordered for obj in [m.first, m.second])
-        )
+        discarded = {m for m in self.matches if not all(obj in self.unordered for obj in [m.first, m.second])}
         self.matches -= discarded
         self.candidates = defaultdict(set)
         return discarded

     def get_match_of(self, item):
-        """Returns the match pair between ``item`` and :attr:`ref`.
-        """
+        """Returns the match pair between ``item`` and :attr:`ref`."""
         if item is self.ref:
             return
         for m in self._get_matches_for_ref():
@@ -418,8 +431,7 @@ class Group:
""" """
# tie_breaker(ref, dupe) --> True if dupe should be ref # tie_breaker(ref, dupe) --> True if dupe should be ref
# Returns True if anything changed during prioritization. # Returns True if anything changed during prioritization.
master_key_func = lambda x: (-x.is_ref, key_func(x)) new_order = sorted(self.ordered, key=lambda x: (-x.is_ref, key_func(x)))
new_order = sorted(self.ordered, key=master_key_func)
changed = new_order != self.ordered changed = new_order != self.ordered
self.ordered = new_order self.ordered = new_order
if tie_breaker is None: if tie_breaker is None:
@@ -442,19 +454,16 @@ class Group:
             self.unordered.remove(item)
             self._percentage = None
             self._matches_for_ref = None
-            if (len(self) > 1) and any(
-                not getattr(item, "is_ref", False) for item in self
-            ):
+            if (len(self) > 1) and any(not getattr(item, "is_ref", False) for item in self):
                 if discard_matches:
-                    self.matches = set(m for m in self.matches if item not in m)
+                    self.matches = {m for m in self.matches if item not in m}
             else:
                 self._clear()
         except ValueError:
             pass

     def switch_ref(self, with_dupe):
-        """Make the :attr:`ref` dupe of the group switch position with ``with_dupe``.
-        """
+        """Make the :attr:`ref` dupe of the group switch position with ``with_dupe``."""
         if self.ref.is_ref:
             return False
         try:
@@ -473,9 +482,7 @@ class Group:
         if self._percentage is None:
             if self.dupes:
                 matches = self._get_matches_for_ref()
-                self._percentage = sum(match.percentage for match in matches) // len(
-                    matches
-                )
+                self._percentage = sum(match.percentage for match in matches) // len(matches)
             else:
                 self._percentage = 0
         return self._percentage
@@ -522,7 +529,7 @@ def get_groups(matches):
         del dupe2group
         del matches
         # should free enough memory to continue
-        logging.warning("Memory Overflow. Groups: {0}".format(len(groups)))
+        logging.warning(f"Memory Overflow. Groups: {len(groups)}")
     # Now that we have a group, we have to discard groups' matches and see if there're any "orphan"
     # matches, that is, matches that were candidate in a group but that none of their 2 files were
     # accepted in the group. With these orphan groups, it's safe to build additional groups
@@ -530,12 +537,8 @@ def get_groups(matches):
     orphan_matches = []
     for group in groups:
         orphan_matches += {
-            m
-            for m in group.discard_matches()
-            if not any(obj in matched_files for obj in [m.first, m.second])
+            m for m in group.discard_matches() if not any(obj in matched_files for obj in [m.first, m.second])
         }
     if groups and orphan_matches:
-        groups += get_groups(
-            orphan_matches
-        )  # no job, as it isn't supposed to take a long time
+        groups += get_groups(orphan_matches)  # no job, as it isn't supposed to take a long time
     return groups
View File
@@ -2,8 +2,9 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .markable import Markable
+from core.markable import Markable
 from xml.etree import ElementTree as ET

 # TODO: perhaps use regex module for better Unicode support? https://pypi.org/project/regex/
 # also https://pypi.org/project/re2/
 # TODO update the Result list with newly added regexes if possible
@@ -15,7 +16,8 @@ from hscommon.util import FileOrPath
 from hscommon.plat import ISWINDOWS
 import time

-default_regexes = [r"^thumbs\.db$", # Obsolete after WindowsXP
+default_regexes = [
+    r"^thumbs\.db$",  # Obsolete after WindowsXP
     r"^desktop\.ini$",  # Windows metadata
     r"^\.DS_Store$",  # MacOS metadata
     r"^\.Trash\-.*",  # Linux trash directories
@@ -34,6 +36,7 @@ def timer(func):
         end = time.perf_counter_ns()
         print(f"DEBUG: func {func.__name__!r} took {end - start} ns.")
         return value
+
     return wrapper_timer
@@ -45,11 +48,13 @@ def memoize(func):
         if args not in func.cache:
             func.cache[args] = func(*args)
         return func.cache[args]
+
     return _memoize

 class AlreadyThereException(Exception):
     """Expression already in the list"""
+
     def __init__(self, arg="Expression is already in excluded list."):
         super().__init__(arg)
@@ -81,7 +86,7 @@ class ExcludeList(Markable):
         yield self.is_marked(regex), regex

     def __contains__(self, item):
-        return self.isExcluded(item)
+        return self.has_entry(item)

     def __len__(self):
         """Returns the total number of regexes regardless of mark status."""
@@ -145,10 +150,7 @@ class ExcludeList(Markable):
     # @timer
     @memoize
     def _do_compile(self, expr):
-        try:
-            return re.compile(expr)
-        except Exception as e:
-            raise(e)
+        return re.compile(expr)

     # @timer
     # @memoize  # probably not worth memoizing this one if we memoize the above
@@ -169,11 +171,11 @@ class ExcludeList(Markable):
     def build_compiled_caches(self, union=False):
         if not union:
-            self._cached_compiled_files =\
-                [x for x in self._excluded_compiled if not has_sep(x.pattern)]
-            self._cached_compiled_paths =\
-                [x for x in self._excluded_compiled if has_sep(x.pattern)]
+            self._cached_compiled_files = [x for x in self._excluded_compiled if not has_sep(x.pattern)]
+            self._cached_compiled_paths = [x for x in self._excluded_compiled if has_sep(x.pattern)]
+            self._dirty = False
             return
         marked_count = [x for marked, x in self if marked]
         # If there is no item, the compiled Pattern will be '' and match everything!
         if not marked_count:
@@ -183,28 +185,25 @@ class ExcludeList(Markable):
         else:
             # HACK returned as a tuple to get a free iterator and keep interface
             # the same regardless of whether the client asked for union or not
-            self._cached_compiled_union_all =\
-                (re.compile('|'.join(marked_count)),)
+            self._cached_compiled_union_all = (re.compile("|".join(marked_count)),)
             files_marked = [x for x in marked_count if not has_sep(x)]
             if not files_marked:
                 self._cached_compiled_union_files = tuple()
             else:
-                self._cached_compiled_union_files =\
-                    (re.compile('|'.join(files_marked)),)
+                self._cached_compiled_union_files = (re.compile("|".join(files_marked)),)
             paths_marked = [x for x in marked_count if has_sep(x)]
             if not paths_marked:
                 self._cached_compiled_union_paths = tuple()
             else:
-                self._cached_compiled_union_paths =\
-                    (re.compile('|'.join(paths_marked)),)
+                self._cached_compiled_union_paths = (re.compile("|".join(paths_marked)),)
+        self._dirty = False

     @property
     def compiled(self):
         """Should be used by other classes to retrieve the up-to-date list of patterns."""
         if self._use_union:
             if self._dirty:
-                self.build_compiled_caches(True)
-                self._dirty = False
+                self.build_compiled_caches(self._use_union)
             return self._cached_compiled_union_all
         return self._excluded_compiled
@@ -215,29 +214,25 @@ class ExcludeList(Markable):
         The interface should be expected to be a generator, even if it returns only
         one item (one Pattern in the union case)."""
         if self._dirty:
-            self.build_compiled_caches(True if self._use_union else False)
-            self._dirty = False
-        return self._cached_compiled_union_files if self._use_union\
-            else self._cached_compiled_files
+            self.build_compiled_caches(self._use_union)
+        return self._cached_compiled_union_files if self._use_union else self._cached_compiled_files

     @property
     def compiled_paths(self):
         """Returns patterns with only separators in them, for more precise filtering."""
         if self._dirty:
-            self.build_compiled_caches(True if self._use_union else False)
-            self._dirty = False
-        return self._cached_compiled_union_paths if self._use_union\
-            else self._cached_compiled_paths
+            self.build_compiled_caches(self._use_union)
+        return self._cached_compiled_union_paths if self._use_union else self._cached_compiled_paths

     # ---Public
     def add(self, regex, forced=False):
         """This interface should throw exceptions if there is an error during
         regex compilation"""
-        if self.isExcluded(regex):
+        if self.has_entry(regex):
             # This exception should never be ignored
             raise AlreadyThereException()
         if regex in forbidden_regexes:
-            raise Exception("Forbidden (dangerous) expression.")
+            raise ValueError("Forbidden (dangerous) expression.")

         iscompilable, exception, compiled = self.compile_re(regex)
         if not iscompilable and not forced:
@@ -256,12 +251,27 @@ class ExcludeList(Markable):
"""Returns the number of marked regexes only.""" """Returns the number of marked regexes only."""
return len([x for marked, x in self if marked]) return len([x for marked, x in self if marked])
def isExcluded(self, regex): def has_entry(self, regex):
for item in self._excluded: for item in self._excluded:
if regex == item[0]: if regex == item[0]:
return True return True
return False return False
def is_excluded(self, dirname, filename):
"""Return True if the file or the absolute path to file is supposed to be
filtered out, False otherwise."""
matched = False
for expr in self.compiled_files:
if expr.fullmatch(filename):
matched = True
break
if not matched:
for expr in self.compiled_paths:
if expr.fullmatch(dirname + sep + filename):
matched = True
break
return matched
def remove(self, regex): def remove(self, regex):
for item in self._excluded: for item in self._excluded:
if item[0] == regex: if item[0] == regex:
@@ -280,13 +290,14 @@ class ExcludeList(Markable):
                 was_marked = self.is_marked(regex)
                 is_compilable, exception, compiled = self.compile_re(newregex)
                 # We overwrite the found entry
-                self._excluded[self._excluded.index(item)] =\
-                    [newregex, is_compilable, exception, compiled]
+                self._excluded[self._excluded.index(item)] = [newregex, is_compilable, exception, compiled]
                 self._remove_compiled(regex)
                 break
         if not found:
             return
-        if is_compilable and was_marked:
-            # Not marked by default when added, add it back
-            self.mark(newregex)
+        if is_compilable:
+            self._add_compiled(newregex)
+            if was_marked:
+                # Not marked by default when added, add it back
+                self.mark(newregex)
@@ -300,7 +311,7 @@ class ExcludeList(Markable):
             if regex not in default_regexes:
                 self.unmark(regex)
         for default_regex in default_regexes:
-            if not self.isExcluded(default_regex):
+            if not self.has_entry(default_regex):
                 self.add(default_regex)
                 self.mark(default_regex)
@@ -326,8 +337,10 @@ class ExcludeList(Markable):
# "forced" avoids compilation exceptions and adds anyway # "forced" avoids compilation exceptions and adds anyway
self.add(regex_string, forced=True) self.add(regex_string, forced=True)
except AlreadyThereException: except AlreadyThereException:
logging.error(f"Regex \"{regex_string}\" \ logging.error(
loaded from XML was already present in the list.") f'Regex "{regex_string}" \
loaded from XML was already present in the list.'
)
continue continue
if exclude_item.get("marked") == "y": if exclude_item.get("marked") == "y":
marked.add(regex_string) marked.add(regex_string)
@@ -352,6 +365,7 @@ loaded from XML was already present in the list.")
 class ExcludeDict(ExcludeList):
     """Exclusion list holding a set of regular expressions as keys, the compiled
     Pattern, compilation error and compilable boolean as values."""
+
     # Implementation around a dictionary instead of a list, which implies
     # keeping the index of each string-key as its sub-element and keeping it updated
     # whenever insert/remove is done.
@@ -399,9 +413,9 @@ class ExcludeDict(ExcludeList):
         if self._use_union:
             return
         try:
-            self._excluded_compiled.add(self._excluded[regex]["compiled"])
+            self._excluded_compiled.add(self._excluded.get(regex).get("compiled"))
         except Exception as e:
-            logging.warning(f"Exception while adding regex {regex} to compiled set: {e}")
+            logging.error(f"Exception while adding regex {regex} to compiled set: {e}")
             return

     def is_compilable(self, regex):
@@ -418,14 +432,9 @@ class ExcludeDict(ExcludeList):
         # and other indices should be pushed by one
         for value in self._excluded.values():
             value["index"] += 1
-        self._excluded[regex] = {
-            "index": 0,
-            "compilable": iscompilable,
-            "error": exception,
-            "compiled": compiled
-        }
+        self._excluded[regex] = {"index": 0, "compilable": iscompilable, "error": exception, "compiled": compiled}

-    def isExcluded(self, regex):
+    def has_entry(self, regex):
         if regex in self._excluded.keys():
             return True
         return False
@@ -451,13 +460,15 @@ class ExcludeDict(ExcludeList):
         previous = self._excluded.pop(regex)
         iscompilable, error, compiled = self.compile_re(newregex)
         self._excluded[newregex] = {
-            "index": previous["index"],
+            "index": previous.get("index"),
             "compilable": iscompilable,
             "error": error,
-            "compiled": compiled
+            "compiled": compiled,
         }
         self._remove_compiled(regex)
-        if was_marked and iscompilable:
-            self.mark(newregex)
+        if iscompilable:
+            self._add_compiled(newregex)
+            if was_marked:
+                self.mark(newregex)

     def save_to_xml(self, outfile):
@@ -492,8 +503,11 @@ def ordered_keys(_dict):
 if ISWINDOWS:
-    def has_sep(x):
-        return '\\' + sep in x
+
+    def has_sep(regexp):
+        return "\\" + sep in regexp
+
 else:
-    def has_sep(x):
-        return sep in x
+
+    def has_sep(regexp):
+        return sep in regexp
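For reference, a minimal sketch (not part of the diff; the patterns and paths are illustrative, assuming a POSIX separator) of what the split caches buy: filename-only patterns are matched against the basename alone, while patterns containing a separator are matched against the full path.

import re
from os import sep

patterns = [r".*\.tmp", r".*/build/.*"]
compiled_files = [re.compile(p) for p in patterns if sep not in p]
compiled_paths = [re.compile(p) for p in patterns if sep in p]

def is_excluded(dirname, filename):
    # Cheap check against the basename first; full path only if needed
    if any(expr.fullmatch(filename) for expr in compiled_files):
        return True
    return any(expr.fullmatch(dirname + sep + filename) for expr in compiled_paths)

print(is_excluded("/home/user", "cache.tmp"))   # True, via the filename pattern
print(is_excluded("/home/user/build", "a.c"))   # True, via the path pattern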
@@ -131,15 +131,11 @@ def export_to_xhtml(colnames, rows):
             indented = "indented"
         filename = row[1]
         cells = "".join(CELL_TEMPLATE.format(value=value) for value in row[2:])
-        rendered_rows.append(
-            ROW_TEMPLATE.format(indented=indented, filename=filename, cells=cells)
-        )
+        rendered_rows.append(ROW_TEMPLATE.format(indented=indented, filename=filename, cells=cells))
         previous_group_id = row[0]
     rendered_rows = "".join(rendered_rows)
     # The main template can't use format because the css code uses {}
-    content = MAIN_TEMPLATE.replace("$colheaders", colheaders).replace(
-        "$rows", rendered_rows
-    )
+    content = MAIN_TEMPLATE.replace("$colheaders", colheaders).replace("$rows", rendered_rows)
     folder = mkdtemp()
     destpath = op.join(folder, "export.htm")
     fp = open(destpath, "wt", encoding="utf-8")
@@ -11,11 +11,27 @@
 # resulting needless complexity and memory usage. It's been a while since I wanted to do that fork,
 # and I'm doing it now.

-import hashlib
-import logging
+import os
+from math import floor
+import logging
+import sqlite3
+from threading import Lock
+from typing import Any, AnyStr, Union, Callable
+from pathlib import Path

 from hscommon.util import nonone, get_file_ext

+hasher: Callable
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
+
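Worth noting (an observation, not part of the diff): xxhash.xxh128 and hashlib.md5 expose the same constructor/update()/digest() interface, which is what lets the rest of the module stay agnostic about which backend was imported.

h = hasher()                 # either xxhash.xxh128() or hashlib.md5()
h.update(b"first chunk")
h.update(b"second chunk")
digest = h.digest()          # bytes, stored directly in the BLOB columns below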
 __all__ = [
     "File",
     "Folder",
@@ -30,6 +46,14 @@ __all__ = [
 NOT_SET = object()

+# The goal here is to not run out of memory on really big files. However, the chunk
+# size has to be large enough so that the python loop isn't too costly in terms of
+# CPU.
+CHUNK_SIZE = 1024 * 1024  # 1 MiB
+
+# Minimum size below which partial hashing is not used
+MIN_FILE_SIZE = 3 * CHUNK_SIZE  # 3MiB, because we take 3 samples
+
 class FSError(Exception):
     cls_message = "An error has occured on '{name}' in '{parent}'"
@@ -69,28 +93,112 @@ class OperationError(FSError):
     cls_message = "Operation on '{name}' failed."


-class File:
-    """Represents a file and holds metadata to be used for scanning.
-    """
-
-    INITIAL_INFO = {
-        "size": 0,
-        "mtime": 0,
-        "md5": "",
-        "md5partial": "",
-    }
+class FilesDB:
+    schema_version = 1
+    schema_version_description = "Changed from md5 to xxhash if available."
+
+    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, digest BLOB, digest_partial BLOB, digest_samples BLOB)"
+    drop_table_query = "DROP TABLE IF EXISTS files;"
+    select_query = "SELECT {key} FROM files WHERE path=:path AND size=:size and mtime_ns=:mtime_ns"
+    insert_query = """
+        INSERT INTO files (path, size, mtime_ns, entry_dt, {key}) VALUES (:path, :size, :mtime_ns, datetime('now'), :value)
+        ON CONFLICT(path) DO UPDATE SET size=:size, mtime_ns=:mtime_ns, entry_dt=datetime('now'), {key}=:value;
+    """
+
+    def __init__(self):
+        self.conn = None
+        self.cur = None
+        self.lock = None
+
+    def connect(self, path: Union[AnyStr, os.PathLike]) -> None:
+        self.conn = sqlite3.connect(path, check_same_thread=False)
+        self.cur = self.conn.cursor()
+        self.lock = Lock()
+        self._check_upgrade()
+
+    def _check_upgrade(self) -> None:
+        with self.lock:
+            has_schema = self.cur.execute(
+                "SELECT NAME FROM sqlite_master WHERE type='table' AND name='schema_version'"
+            ).fetchall()
+            version = None
+            if has_schema:
+                version = self.cur.execute("SELECT version FROM schema_version ORDER BY version DESC").fetchone()[0]
+            else:
+                self.cur.execute("CREATE TABLE schema_version (version int PRIMARY KEY, description TEXT)")
+            if version != self.schema_version:
+                self.cur.execute(self.drop_table_query)
+                self.cur.execute(
+                    "INSERT OR REPLACE INTO schema_version VALUES (:version, :description)",
+                    {"version": self.schema_version, "description": self.schema_version_description},
+                )
+            self.cur.execute(self.create_table_query)
+            self.conn.commit()
+
+    def clear(self) -> None:
+        with self.lock:
+            self.cur.execute(self.drop_table_query)
+            self.cur.execute(self.create_table_query)
+
+    def get(self, path: Path, key: str) -> Union[bytes, None]:
+        stat = path.stat()
+        size = stat.st_size
+        mtime_ns = stat.st_mtime_ns
+        with self.lock:
+            self.cur.execute(self.select_query.format(key=key), {"path": str(path), "size": size, "mtime_ns": mtime_ns})
+            result = self.cur.fetchone()
+        if result:
+            return result[0]
+        return None
+
+    def put(self, path: Path, key: str, value: Any) -> None:
+        stat = path.stat()
+        size = stat.st_size
+        mtime_ns = stat.st_mtime_ns
+        with self.lock:
+            self.cur.execute(
+                self.insert_query.format(key=key),
+                {"path": str(path), "size": size, "mtime_ns": mtime_ns, "value": value},
+            )
+
+    def commit(self) -> None:
+        with self.lock:
+            self.conn.commit()
+
+    def close(self) -> None:
+        with self.lock:
+            self.cur.close()
+            self.conn.close()
+
+
+filesdb = FilesDB()  # Singleton
+
+
+class File:
+    """Represents a file and holds metadata to be used for scanning."""
+
+    INITIAL_INFO = {"size": 0, "mtime": 0, "digest": b"", "digest_partial": b"", "digest_samples": b""}

     # Slots for File make us save quite a bit of memory. In a memory test I've made with a lot of
     # files, I saved 35% memory usage with "unread" files (no _read_info() call) and gains become
     # even greater when we take into account read attributes (70%!). Yeah, it's worth it.
     __slots__ = ("path", "is_ref", "words") + tuple(INITIAL_INFO.keys())

     def __init__(self, path):
-        self.path = path
         for attrname in self.INITIAL_INFO:
             setattr(self, attrname, NOT_SET)
+        if type(path) is os.DirEntry:
+            self.path = Path(path.path)
+            self.size = nonone(path.stat().st_size, 0)
+            self.mtime = nonone(path.stat().st_mtime, 0)
+        else:
+            self.path = path

     def __repr__(self):
-        return "<{} {}>".format(self.__class__.__name__, str(self.path))
+        return f"<{self.__class__.__name__} {str(self.path)}>"

     def __getattribute__(self, attrname):
         result = object.__getattribute__(self, attrname)
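A rough usage sketch for the cache above (the in-memory database and the file path are hypothetical; the real application connects to a database file in its appdata directory). Because get() and put() key on path plus size and mtime_ns, a file that changes on disk simply misses the cache and gets re-hashed.

from pathlib import Path

filesdb.connect(":memory:")
p = Path("some_file.bin")             # hypothetical; must exist, since get() stats it
digest = filesdb.get(p, "digest")     # None on a cold cache
if digest is None:
    digest = hasher(p.read_bytes()).digest()
    filesdb.put(p, "digest", digest)
filesdb.commit()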
@@ -98,51 +206,93 @@ class File:
         try:
             self._read_info(attrname)
         except Exception as e:
-            logging.warning(
-                "An error '%s' was raised while decoding '%s'", e, repr(self.path)
-            )
+            logging.warning("An error '%s' was raised while decoding '%s'", e, repr(self.path))
         result = object.__getattribute__(self, attrname)
         if result is NOT_SET:
             result = self.INITIAL_INFO[attrname]
         return result
-    # This offset is where we should start reading the file to get a partial md5
-    # For audio file, it should be where audio data starts
-    def _get_md5partial_offset_and_size(self):
-        return (0x4000, 0x4000)  # 16Kb
-
-    def _read_info(self, field):
-        if field in ("size", "mtime"):
-            stats = self.path.stat()
-            self.size = nonone(stats.st_size, 0)
-            self.mtime = nonone(stats.st_mtime, 0)
-        elif field == "md5partial":
-            try:
-                fp = self.path.open("rb")
-                offset, size = self._get_md5partial_offset_and_size()
-                fp.seek(offset)
-                partialdata = fp.read(size)
-                md5 = hashlib.md5(partialdata)
-                self.md5partial = md5.digest()
-                fp.close()
-            except Exception:
-                pass
-        elif field == "md5":
-            try:
-                fp = self.path.open("rb")
-                md5 = hashlib.md5()
-                # The goal here is to not run out of memory on really big files. However, the chunk
-                # size has to be large enough so that the python loop isn't too costly in terms of
-                # CPU.
-                CHUNK_SIZE = 1024 * 1024  # 1 mb
-                filedata = fp.read(CHUNK_SIZE)
-                while filedata:
-                    md5.update(filedata)
-                    filedata = fp.read(CHUNK_SIZE)
-                self.md5 = md5.digest()
-                fp.close()
-            except Exception:
-                pass
+    def _calc_digest(self):
+        # type: () -> bytes
+        with self.path.open("rb") as fp:
+            file_hash = hasher()
+            # The goal here is to not run out of memory on really big files. However, the chunk
+            # size has to be large enough so that the python loop isn't too costly in terms of
+            # CPU.
+            CHUNK_SIZE = 1024 * 1024  # 1 mb
+            filedata = fp.read(CHUNK_SIZE)
+            while filedata:
+                file_hash.update(filedata)
+                filedata = fp.read(CHUNK_SIZE)
+            return file_hash.digest()
+
+    def _calc_digest_partial(self):
+        # type: () -> bytes
+        # This offset is where we should start reading the file to get a partial hash
+        # For audio file, it should be where audio data starts
+        offset, size = (0x4000, 0x4000)
+        with self.path.open("rb") as fp:
+            fp.seek(offset)
+            partial_data = fp.read(size)
+            return hasher(partial_data).digest()
+
+    def _calc_digest_samples(self) -> bytes:
+        size = self.size
+        with self.path.open("rb") as fp:
+            # Chunk at 25% of the file
+            fp.seek(floor(size * 25 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash = hasher(file_data)
+
+            # Chunk at 60% of the file
+            fp.seek(floor(size * 60 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+
+            # Last chunk of the file
+            fp.seek(-CHUNK_SIZE, 2)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+        return file_hash.digest()
+
+    def _read_info(self, field):
+        # print(f"_read_info({field}) for {self}")
+        if field in ("size", "mtime"):
+            stats = self.path.stat()
+            self.size = nonone(stats.st_size, 0)
+            self.mtime = nonone(stats.st_mtime, 0)
+        elif field == "digest_partial":
+            try:
+                self.digest_partial = filesdb.get(self.path, "digest_partial")
+                if self.digest_partial is None:
+                    self.digest_partial = self._calc_digest_partial()
+                    filesdb.put(self.path, "digest_partial", self.digest_partial)
+            except Exception as e:
+                logging.warning("Couldn't get digest_partial for %s: %s", self.path, e)
+        elif field == "digest":
+            try:
+                self.digest = filesdb.get(self.path, "digest")
+                if self.digest is None:
+                    self.digest = self._calc_digest()
+                    filesdb.put(self.path, "digest", self.digest)
+            except Exception as e:
+                logging.warning("Couldn't get digest for %s: %s", self.path, e)
+        elif field == "digest_samples":
+            size = self.size
+            # Might as well hash such small files entirely.
+            if size <= MIN_FILE_SIZE:
+                setattr(self, field, self.digest)
+                return
+            try:
+                self.digest_samples = filesdb.get(self.path, "digest_samples")
+                if self.digest_samples is None:
+                    self.digest_samples = self._calc_digest_samples()
+                    filesdb.put(self.path, "digest_samples", self.digest_samples)
+            except Exception as e:
+                logging.warning(f"Couldn't get digest_samples for {self.path}: {e}")
     def _read_all_info(self, attrnames=None):
         """Cache all possible info.
@@ -157,27 +307,25 @@ class File:
     # --- Public
     @classmethod
     def can_handle(cls, path):
-        """Returns whether this file wrapper class can handle ``path``.
-        """
-        return not path.islink() and path.isfile()
+        """Returns whether this file wrapper class can handle ``path``."""
+        return not path.is_symlink() and path.is_file()

     def rename(self, newname):
         if newname == self.name:
             return
-        destpath = self.path.parent()[newname]
+        destpath = self.path.parent.joinpath(newname)
         if destpath.exists():
-            raise AlreadyExistsError(newname, self.path.parent())
+            raise AlreadyExistsError(newname, self.path.parent)
         try:
             self.path.rename(destpath)
-        except EnvironmentError:
+        except OSError:
             raise OperationError(self)
         if not destpath.exists():
             raise OperationError(self)
         self.path = destpath

     def get_display_info(self, group, delta):
-        """Returns a display-ready dict of dupe's data.
-        """
+        """Returns a display-ready dict of dupe's data."""
         raise NotImplementedError()

     # --- Properties
@@ -191,19 +339,20 @@ class File:
     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent


 class Folder(File):
     """A wrapper around a folder path.

-    It has the size/md5 info of a File, but it's value are the sum of its subitems.
+    It has the size/digest info of a File, but its value is the sum of its subitems.
     """

     __slots__ = File.__slots__ + ("_subfolders",)

     def __init__(self, path):
         File.__init__(self, path)
+        self.size = NOT_SET
         self._subfolders = None

     def _all_items(self):
@@ -212,37 +361,37 @@ class Folder(File):
         return folders + files
     def _read_info(self, field):
+        # print(f"_read_info({field}) for Folder {self}")
         if field in {"size", "mtime"}:
             size = sum((f.size for f in self._all_items()), 0)
             self.size = size
             stats = self.path.stat()
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field in {"md5", "md5partial"}:
+        elif field in {"digest", "digest_partial", "digest_samples"}:
             # What's sensitive here is that we must make sure that subfiles'
-            # md5 are always added up in the same order, but we also want a
-            # different md5 if a file gets moved in a different subdirectory.
+            # digest are always added up in the same order, but we also want a
+            # different digest if a file gets moved in a different subdirectory.

-            def get_dir_md5_concat():
+            def get_dir_digest_concat():
                 items = self._all_items()
                 items.sort(key=lambda f: f.path)
-                md5s = [getattr(f, field) for f in items]
-                return b"".join(md5s)
+                digests = [getattr(f, field) for f in items]
+                return b"".join(digests)

-            md5 = hashlib.md5(get_dir_md5_concat())
-            digest = md5.digest()
+            digest = hasher(get_dir_digest_concat()).digest()
             setattr(self, field, digest)

     @property
     def subfolders(self):
         if self._subfolders is None:
-            subfolders = [
-                p for p in self.path.listdir() if not p.islink() and p.isdir()
-            ]
+            with os.scandir(self.path) as iter:
+                subfolders = [p for p in iter if not p.is_symlink() and p.is_dir()]
             self._subfolders = [self.__class__(p) for p in subfolders]
         return self._subfolders

     @classmethod
     def can_handle(cls, path):
-        return not path.islink() and path.isdir()
+        return not path.is_symlink() and path.is_dir()
 def get_file(path, fileclasses=[File]):
@@ -267,10 +416,11 @@ def get_files(path, fileclasses=[File]):
     assert all(issubclass(fileclass, File) for fileclass in fileclasses)
     try:
         result = []
-        for path in path.listdir():
-            file = get_file(path, fileclasses=fileclasses)
-            if file is not None:
-                result.append(file)
+        with os.scandir(path) as iter:
+            for item in iter:
+                file = get_file(item, fileclasses=fileclasses)
+                if file is not None:
+                    result.append(file)
         return result
-    except EnvironmentError:
+    except OSError:
         raise InvalidPath(path)
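The switch from listdir()/walk() to os.scandir() throughout this file is what the changelog's speedup refers to: each DirEntry carries stat information obtained from the directory listing, so type checks don't need extra syscalls. A minimal illustration (not from the diff):

import os

with os.scandir(".") as entries:
    for entry in entries:
        # is_file()/is_symlink() use the cached directory-entry data on most
        # platforms instead of issuing a second stat() call per file
        if entry.is_file(follow_symlinks=False):
            print(entry.name, entry.stat().st_size)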
@@ -15,16 +15,21 @@ class DupeGuruGUIObject(Listener):
         self.app = app

     def directories_changed(self):
+        # Implemented in child classes
         pass

     def dupes_selected(self):
+        # Implemented in child classes
         pass

     def marking_changed(self):
+        # Implemented in child classes
         pass

     def results_changed(self):
+        # Implemented in child classes
         pass

     def results_changed_but_keep_selection(self):
+        # Implemented in child classes
         pass
@@ -29,8 +29,7 @@ class DeletionOptionsView:
""" """
def update_msg(self, msg: str): def update_msg(self, msg: str):
"""Update the dialog's prompt with ``str``. """Update the dialog's prompt with ``str``."""
"""
def show(self): def show(self):
"""Show the dialog in a modal fashion. """Show the dialog in a modal fashion.
@@ -39,8 +38,7 @@ class DeletionOptionsView:
""" """
def set_hardlink_option_enabled(self, is_enabled: bool): def set_hardlink_option_enabled(self, is_enabled: bool):
"""Enable or disable the widget controlling :attr:`DeletionOptions.use_hardlinks`. """Enable or disable the widget controlling :attr:`DeletionOptions.use_hardlinks`."""
"""
class DeletionOptions(GUIObject): class DeletionOptions(GUIObject):
@@ -75,8 +73,7 @@ class DeletionOptions(GUIObject):
         return self.view.show()

     def supports_links(self):
-        """Returns whether our platform supports symlinks.
-        """
+        """Returns whether our platform supports symlinks."""
         # When on a platform that doesn't implement it, calling os.symlink() (with the wrong number
         # of arguments) raises NotImplementedError, which allows us to gracefully check for the
         # feature.
@@ -7,7 +7,7 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.gui.base import GUIObject
-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject


 class DetailsPanel(GUIObject, DupeGuruGUIObject):
@@ -32,9 +32,7 @@ class DetailsPanel(GUIObject, DupeGuruGUIObject):
         # we don't want the two sides of the table to display the stats for the same file
         ref = group.ref if group is not None and group.ref is not dupe else None
         data2 = self.app.get_display_info(ref, group, False)
-        columns = self.app.result_table.COLUMNS[
-            1:
-        ]  # first column is the 'marked' column
+        columns = self.app.result_table.COLUMNS[1:]  # first column is the 'marked' column
         self._table = [(c.display, data1[c.name], data2[c.name]) for c in columns]

     # --- Public
@@ -46,5 +44,4 @@ class DetailsPanel(GUIObject, DupeGuruGUIObject):
     # --- Event Handlers
     def dupes_selected(self):
-        self._refresh()
-        self.view.refresh()
+        self._view_updated()
@@ -8,10 +8,10 @@
 from hscommon.gui.tree import Tree, Node

-from ..directories import DirectoryState
-from .base import DupeGuruGUIObject
+from core.directories import DirectoryState
+from core.gui.base import DupeGuruGUIObject

-STATE_ORDER = [DirectoryState.Normal, DirectoryState.Reference, DirectoryState.Excluded]
+STATE_ORDER = [DirectoryState.NORMAL, DirectoryState.REFERENCE, DirectoryState.EXCLUDED]

 # Lazily loads children
@@ -36,9 +36,7 @@ class DirectoryNode(Node):
         self._loaded = True

     def update_all_states(self):
-        self._state = STATE_ORDER.index(
-            self._tree.app.directories.get_state(self._directory_path)
-        )
+        self._state = STATE_ORDER.index(self._tree.app.directories.get_state(self._directory_path))
         for node in self:
             node.update_all_states()
@@ -88,9 +86,9 @@ class DirectoryTree(Tree, DupeGuruGUIObject):
         else:
             # All selected nodes or on second-or-more level, exclude them.
             nodes = self.selected_nodes
-            newstate = DirectoryState.Excluded
-            if all(node.state == DirectoryState.Excluded for node in nodes):
-                newstate = DirectoryState.Normal
+            newstate = DirectoryState.EXCLUDED
+            if all(node.state == DirectoryState.EXCLUDED for node in nodes):
+                newstate = DirectoryState.NORMAL
             for node in nodes:
                 node.state = newstate
@@ -105,5 +103,4 @@ class DirectoryTree(Tree, DupeGuruGUIObject):
     # --- Event Handlers
     def directories_changed(self):
-        self._refresh()
-        self.view.refresh()
+        self._view_updated()
@@ -5,8 +5,9 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-# from hscommon.trans import tr
-from .exclude_list_table import ExcludeListTable
+from core.gui.exclude_list_table import ExcludeListTable
+from core.exclude import has_sep
+from os import sep

 import logging
@@ -30,9 +31,10 @@ class ExcludeListDialogCore:
         self.refresh()

     def rename_selected(self, newregex):
-        """Renames the selected regex to ``newregex``.
+        """Rename the selected regex to ``newregex``.

-        If there's more than one selected row, the first one is used.
+        If there is more than one selected row, the first one is used.

         :param str newregex: The regex to rename the row's regex to.
+        :return bool: true if success, false if error.
         """
         try:
             r = self.exclude_list_table.selected_rows[0]
@@ -44,25 +46,42 @@ class ExcludeListDialogCore:
             return False

     def add(self, regex):
-        try:
-            self.exclude_list.add(regex)
-        except Exception as e:
-            raise(e)
+        self.exclude_list.add(regex)
         self.exclude_list.mark(regex)
         self.exclude_list_table.add(regex)

     def test_string(self, test_string):
-        """Sets property on row to highlight if its regex matches test_string supplied."""
+        """Set the highlight property on each row when its regex matches the
+        test_string supplied. Return True if any row matched."""
         matched = False
         for row in self.exclude_list_table.rows:
             compiled_regex = self.exclude_list.get_compiled(row.regex)
-            if compiled_regex and compiled_regex.match(test_string):
-                matched = True
+            if self.is_match(test_string, compiled_regex):
                 row.highlight = True
+                matched = True
             else:
                 row.highlight = False
         return matched

+    def is_match(self, test_string, compiled_regex):
+        # This method is like an inverted version of ExcludeList.is_excluded()
+        if not compiled_regex:
+            return False
+        matched = False
+        # Test only the filename portion of the path
+        if not has_sep(compiled_regex.pattern) and sep in test_string:
+            filename = test_string.rsplit(sep, 1)[1]
+            if compiled_regex.fullmatch(filename):
+                matched = True
+            return matched
+        # Test the entire path + filename
+        if compiled_regex.fullmatch(test_string):
+            matched = True
+        return matched
+
     def reset_rows_highlight(self):
         for row in self.exclude_list_table.rows:
             row.highlight = False
@@ -2,23 +2,21 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject

 from hscommon.gui.table import GUITable, Row
 from hscommon.gui.column import Column, Columns
 from hscommon.trans import trget

 tr = trget("ui")


 class ExcludeListTable(GUITable, DupeGuruGUIObject):
-    COLUMNS = [
-        Column("marked", ""),
-        Column("regex", tr("Regular Expressions"))
-    ]
+    COLUMNS = [Column("marked", ""), Column("regex", tr("Regular Expressions"))]

     def __init__(self, exclude_list_dialog, app):
         GUITable.__init__(self)
         DupeGuruGUIObject.__init__(self, app)
-        self.columns = Columns(self)
+        self._columns = Columns(self)
         self.dialog = exclude_list_dialog

     def rename_selected(self, newname):
@@ -36,7 +34,7 @@ class ExcludeListTable(GUITable, DupeGuruGUIObject):
         return ExcludeListRow(self, self.dialog.exclude_list.is_marked(regex), regex), 0

     def _do_delete(self):
-        self.dalog.exclude_list.remove(self.selected_row.regex)
+        self.dialog.exclude_list.remove(self.selected_row.regex)

     # --- Override
     def add(self, regex):
@@ -6,7 +6,7 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.trans import tr
-from .ignore_list_table import IgnoreListTable
+from core.gui.ignore_list_table import IgnoreListTable


 class IgnoreListDialog:
@@ -22,11 +22,9 @@ class IgnoreListDialog:
     def clear(self):
         if not self.ignore_list:
             return
-        msg = tr(
-            "Do you really want to remove all %d items from the ignore list?"
-        ) % len(self.ignore_list)
+        msg = tr("Do you really want to remove all %d items from the ignore list?") % len(self.ignore_list)
         if self.app.view.ask_yes_no(msg):
-            self.ignore_list.Clear()
+            self.ignore_list.clear()
             self.refresh()

     def refresh(self):
@@ -22,7 +22,7 @@ class IgnoreListTable(GUITable):
     def __init__(self, ignore_list_dialog):
         GUITable.__init__(self)
-        self.columns = Columns(self)
+        self._columns = Columns(self)
         self.view = None
         self.dialog = ignore_list_dialog
@@ -8,7 +8,7 @@
 from hscommon import desktop

-from .problem_table import ProblemTable
+from core.gui.problem_table import ProblemTable


 class ProblemDialog:
@@ -21,7 +21,7 @@ class ProblemTable(GUITable):
     def __init__(self, problem_dialog):
         GUITable.__init__(self)
-        self.columns = Columns(self)
+        self._columns = Columns(self)
         self.dialog = problem_dialog

     # --- Override
@@ -11,7 +11,7 @@ from operator import attrgetter
 from hscommon.gui.table import GUITable, Row
 from hscommon.gui.column import Columns

-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject


 class DupeRow(Row):
@@ -41,11 +41,11 @@ class DupeRow(Row):
         # table.DELTA_COLUMNS are always "delta"
         self._delta_columns = self.table.DELTA_COLUMNS.copy()
         dupe_info = self.data
+        if self._group.ref is None:
+            return False
         ref_info = self._group.ref.get_display_info(group=self._group, delta=False)
         for key, value in dupe_info.items():
-            if (key not in self._delta_columns) and (
-                ref_info[key].lower() != value.lower()
-            ):
+            if (key not in self._delta_columns) and (ref_info[key].lower() != value.lower()):
                 self._delta_columns.add(key)
         return column_name in self._delta_columns
@@ -82,7 +82,7 @@ class ResultTable(GUITable, DupeGuruGUIObject):
     def __init__(self, app):
         GUITable.__init__(self)
         DupeGuruGUIObject.__init__(self, app)
-        self.columns = Columns(self, prefaccess=app, savename="ResultTable")
+        self._columns = Columns(self, prefaccess=app, savename="ResultTable")
         self._power_marker = False
         self._delta_values = False
         self._sort_descriptors = ("name", True)
@@ -190,4 +190,4 @@ class ResultTable(GUITable, DupeGuruGUIObject):
         self.view.refresh()

     def save_session(self):
-        self.columns.save_columns()
+        self._columns.save_columns()
@@ -6,7 +6,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject


 class StatsLabel(DupeGuruGUIObject):
@@ -20,8 +20,7 @@ class IgnoreList:
     # ---Override
     def __init__(self):
-        self._ignored = {}
-        self._count = 0
+        self.clear()

     def __iter__(self):
         for first, seconds in self._ignored.items():
@@ -32,7 +31,7 @@ class IgnoreList:
         return self._count

     # ---Public
-    def AreIgnored(self, first, second):
+    def are_ignored(self, first, second):
         def do_check(first, second):
             try:
                 matches = self._ignored[first]
@@ -42,23 +41,23 @@ class IgnoreList:
         return do_check(first, second) or do_check(second, first)

-    def Clear(self):
+    def clear(self):
         self._ignored = {}
         self._count = 0

-    def Filter(self, func):
+    def filter(self, func):
         """Applies a filter on all ignored items, and remove all matches where func(first,second)
         doesn't return True.
         """
         filtered = IgnoreList()
         for first, second in self:
             if func(first, second):
-                filtered.Ignore(first, second)
+                filtered.ignore(first, second)
         self._ignored = filtered._ignored
         self._count = filtered._count

-    def Ignore(self, first, second):
-        if self.AreIgnored(first, second):
+    def ignore(self, first, second):
+        if self.are_ignored(first, second):
             return
         try:
             matches = self._ignored[first]
@@ -88,8 +87,7 @@ class IgnoreList:
         except KeyError:
             return False

-        if not inner(first, second):
-            if not inner(second, first):
-                raise ValueError()
+        if not inner(first, second) and not inner(second, first):
+            raise ValueError()

     def load_from_xml(self, infile):
@@ -110,7 +108,7 @@ class IgnoreList:
         for sfn in subfile_elems:
             subfile_path = sfn.get("path")
             if subfile_path:
-                self.Ignore(file_path, subfile_path)
+                self.ignore(file_path, subfile_path)

     def save_to_xml(self, outfile):
         """Create a XML file that can be used by load_from_xml.
@@ -17,9 +17,11 @@ class Markable:
     # in self.__marked, and is not affected by __inverted. Thus, self.mark while __inverted
     # is True will launch _DidUnmark.
     def _did_mark(self, o):
+        # Implemented in child classes
         pass

     def _did_unmark(self, o):
+        # Implemented in child classes
         pass

     def _get_markable_count(self):
@@ -1 +1 @@
-from . import fs, prioritize, result_table, scanner  # noqa
+from core.me import fs, prioritize, result_table, scanner  # noqa
@@ -6,7 +6,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from hsaudiotag import auto
+import mutagen
 from hscommon.util import get_file_ext, format_size, format_time

 from core.util import format_timestamp, format_perc, format_words, format_dupe_count
@@ -26,6 +26,9 @@ TAG_FIELDS = {
"comment", "comment",
} }
# This is a temporary workaround for migration from hsaudiotag for the can_handle method
SUPPORTED_EXTS = {"mp3", "wma", "m4a", "m4p", "ogg", "flac", "aif", "aiff", "aifc"}
class MusicFile(fs.File): class MusicFile(fs.File):
INITIAL_INFO = fs.File.INITIAL_INFO.copy() INITIAL_INFO = fs.File.INITIAL_INFO.copy()
@@ -50,7 +53,7 @@ class MusicFile(fs.File):
     def can_handle(cls, path):
         if not fs.File.can_handle(path):
             return False
-        return get_file_ext(path.name) in auto.EXT2CLASS
+        return get_file_ext(path.name) in SUPPORTED_EXTS
     def get_display_info(self, group, delta):
         size = self.size
@@ -94,22 +97,19 @@ class MusicFile(fs.File):
"dupe_count": format_dupe_count(dupe_count), "dupe_count": format_dupe_count(dupe_count),
} }
def _get_md5partial_offset_and_size(self):
f = auto.File(str(self.path))
return (f.audio_offset, f.audio_size)
def _read_info(self, field): def _read_info(self, field):
fs.File._read_info(self, field) fs.File._read_info(self, field)
if field in TAG_FIELDS: if field in TAG_FIELDS:
f = auto.File(str(self.path)) # The various conversions here are to make this look like the previous implementation
self.audiosize = f.audio_size file = mutagen.File(str(self.path), easy=True)
self.bitrate = f.bitrate self.audiosize = self.path.stat().st_size
self.duration = f.duration self.bitrate = file.info.bitrate / 1000
self.samplerate = f.sample_rate self.duration = file.info.length
self.artist = f.artist self.samplerate = file.info.sample_rate
self.album = f.album self.artist = ", ".join(file.tags.get("artist") or [])
self.title = f.title self.album = ", ".join(file.tags.get("album") or [])
self.genre = f.genre self.title = ", ".join(file.tags.get("title") or [])
self.comment = f.comment self.genre = ", ".join(file.tags.get("genre") or [])
self.year = f.year self.comment = ", ".join(file.tags.get("comment") or [""])
self.track = f.track self.year = ", ".join(file.tags.get("date") or [])
self.track = (file.tags.get("tracknumber") or [""])[0]
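For context, a minimal sketch (hypothetical file name) of the mutagen "easy" interface the new code relies on: tags come back as lists of strings under generic keys such as "artist" and "date", and stream properties live on .info.

import mutagen

audio = mutagen.File("song.mp3", easy=True)    # returns None for unsupported files
if audio is not None and audio.tags is not None:
    print(audio.info.length)                   # duration in seconds
    print(audio.info.bitrate / 1000)           # kbps, matching the conversion above
    print(audio.tags.get("artist") or [])      # e.g. ["Some Artist"]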
@@ -17,9 +17,9 @@ class ScannerME(ScannerBase):
     @staticmethod
     def get_scan_options():
         return [
-            ScanOption(ScanType.Filename, tr("Filename")),
-            ScanOption(ScanType.Fields, tr("Filename - Fields")),
-            ScanOption(ScanType.FieldsNoOrder, tr("Filename - Fields (No Order)")),
-            ScanOption(ScanType.Tag, tr("Tags")),
-            ScanOption(ScanType.Contents, tr("Contents")),
+            ScanOption(ScanType.FILENAME, tr("Filename")),
+            ScanOption(ScanType.FIELDS, tr("Filename - Fields")),
+            ScanOption(ScanType.FIELDSNOORDER, tr("Filename - Fields (No Order)")),
+            ScanOption(ScanType.TAG, tr("Tags")),
+            ScanOption(ScanType.CONTENTS, tr("Contents")),
         ]
@@ -1,8 +1,7 @@
-from . import (  # noqa
+from core.pe import (  # noqa
     block,
     cache,
     exif,
-    iphoto_plist,
     matchblock,
     matchexif,
     photo,
@@ -6,7 +6,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from ._block import NoBlocksError, DifferentBlockCountError, avgdiff, getblocks2  # NOQA
+from core.pe._block import NoBlocksError, DifferentBlockCountError, avgdiff, getblocks2  # NOQA

 # Converted to C
 # def getblock(image):
core/pe/block.pyi (new file)
@@ -0,0 +1,13 @@
+from typing import Tuple, List, Union, Sequence
+
+_block = Tuple[int, int, int]
+
+class NoBlocksError(Exception): ...  # noqa: E302, E701
+class DifferentBlockCountError(Exception): ...  # noqa E701
+
+def getblock(image: object) -> Union[_block, None]: ...  # noqa: E302
+def getblocks2(image: object, block_count_per_side: int) -> Union[List[_block], None]: ...
+def diff(first: _block, second: _block) -> int: ...
+def avgdiff(  # noqa: E302
+    first: Sequence[_block], second: Sequence[_block], limit: int = 768, min_iterations: int = 1
+) -> Union[int, None]: ...
@@ -4,7 +4,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from ._cache import string_to_colors  # noqa
+from core.pe._cache import string_to_colors  # noqa


 def colors_to_string(colors):
@@ -13,7 +13,7 @@ def colors_to_string(colors):
     [(0,100,255)] --> 0064ff
     [(1,2,3),(4,5,6)] --> 010203040506
     """
-    return "".join("%02x%02x%02x" % (r, g, b) for r, g, b in colors)
+    return "".join("{:02x}{:02x}{:02x}".format(r, g, b) for r, g, b in colors)

 # This function is an important bottleneck of dupeGuru PE. It has been converted to C.
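The docstring's examples double as a quick sanity check of the hex round-trip (illustrative):

assert colors_to_string([(0, 100, 255)]) == "0064ff"
assert colors_to_string([(1, 2, 3), (4, 5, 6)]) == "010203040506"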
core/pe/cache.pyi (new file)
@@ -0,0 +1,6 @@
+from typing import Union, Tuple, List
+
+_block = Tuple[int, int, int]
+
+def colors_to_string(colors: List[_block]) -> str: ...  # noqa: E302
+def string_to_colors(s: str) -> Union[List[_block], None]: ...
@@ -10,11 +10,11 @@ import shelve
 import tempfile
 from collections import namedtuple

-from .cache import string_to_colors, colors_to_string
+from core.pe.cache import string_to_colors, colors_to_string


 def wrap_path(path):
-    return "path:{}".format(path)
+    return f"path:{path}"


 def unwrap_path(key):
@@ -22,7 +22,7 @@ def unwrap_path(key):
 def wrap_id(path):
-    return "id:{}".format(path)
+    return f"id:{path}"


 def unwrap_id(key):
@@ -33,8 +33,7 @@ CacheRow = namedtuple("CacheRow", "id path blocks mtime")
 class ShelveCache:
-    """A class to cache picture blocks in a shelve backend.
-    """
+    """A class to cache picture blocks in a shelve backend."""

     def __init__(self, db=None, readonly=False):
         self.istmp = db is None
@@ -81,9 +80,7 @@ class ShelveCache:
         self.shelve[wrap_id(rowid)] = wrap_path(path_str)

     def _compute_maxid(self):
-        return max(
-            (unwrap_id(k) for k in self.shelve if k.startswith("id:")), default=1
-        )
+        return max((unwrap_id(k) for k in self.shelve if k.startswith("id:")), default=1)

     def _get_new_id(self):
         self.maxid += 1
@@ -9,12 +9,11 @@ import os.path as op
 import logging
 import sqlite3 as sqlite

-from .cache import string_to_colors, colors_to_string
+from core.pe.cache import string_to_colors, colors_to_string


 class SqliteCache:
-    """A class to cache picture blocks in a sqlite backend.
-    """
+    """A class to cache picture blocks in a sqlite backend."""

     def __init__(self, db=":memory:", readonly=False):
         # readonly is not used in the sqlite version of the cache
@@ -71,18 +70,14 @@ class SqliteCache:
         except sqlite.OperationalError:
             logging.warning("Picture cache could not set value for key %r", path_str)
         except sqlite.DatabaseError as e:
-            logging.warning(
-                "DatabaseError while setting value for key %r: %s", path_str, str(e)
-            )
+            logging.warning("DatabaseError while setting value for key %r: %s", path_str, str(e))

     def _create_con(self, second_try=False):
         def create_tables():
             logging.debug("Creating picture cache tables.")
             self.con.execute("drop table if exists pictures")
             self.con.execute("drop index if exists idx_path")
-            self.con.execute(
-                "create table pictures(path TEXT, mtime INTEGER, blocks TEXT)"
-            )
+            self.con.execute("create table pictures(path TEXT, mtime INTEGER, blocks TEXT)")
             self.con.execute("create index idx_path on pictures (path)")

         self.con = sqlite.connect(self.dbname, isolation_level=None)
@@ -93,9 +88,7 @@ class SqliteCache:
         except sqlite.DatabaseError as e:  # corrupted db
             if second_try:
                 raise  # Something really strange is happening
-            logging.warning(
-                "Could not create picture cache because of an error: %s", str(e)
-            )
+            logging.warning("Could not create picture cache because of an error: %s", str(e))
             self.con.close()
             os.remove(self.dbname)
             self._create_con(second_try=True)
@@ -125,9 +118,7 @@ class SqliteCache:
             raise ValueError(path)

     def get_multiple(self, rowids):
-        sql = "select rowid, blocks from pictures where rowid in (%s)" % ",".join(
-            map(str, rowids)
-        )
+        sql = "select rowid, blocks from pictures where rowid in (%s)" % ",".join(map(str, rowids))
         cur = self.con.execute(sql)
         return ((rowid, string_to_colors(blocks)) for rowid, blocks in cur)
@@ -148,7 +139,5 @@ class SqliteCache:
                 continue
             todelete.append(rowid)
         if todelete:
-            sql = "delete from pictures where rowid in (%s)" % ",".join(
-                map(str, todelete)
-            )
+            sql = "delete from pictures where rowid in (%s)" % ",".join(map(str, todelete))
             self.con.execute(sql)
@@ -193,8 +193,8 @@ class TIFF_file:
         self.s2nfunc = s2n_intel if self.endian == INTEL_ENDIAN else s2n_motorola

     def s2n(self, offset, length, signed=0, debug=False):
-        slice = self.data[offset : offset + length]
-        val = self.s2nfunc(slice)
+        data_slice = self.data[offset : offset + length]
+        val = self.s2nfunc(data_slice)
         # Sign extension ?
         if signed:
             msb = 1 << (8 * length - 1)
@@ -206,7 +206,7 @@ class TIFF_file:
"Slice for offset %d length %d: %r and value: %d", "Slice for offset %d length %d: %r and value: %d",
offset, offset,
length, length,
slice, data_slice,
val, val,
) )
return val return val
@@ -236,10 +236,10 @@ class TIFF_file:
         for i in range(entries):
             entry = ifd + 2 + 12 * i
             tag = self.s2n(entry, 2)
-            type = self.s2n(entry + 2, 2)
-            if not 1 <= type <= 10:
+            entry_type = self.s2n(entry + 2, 2)
+            if not 1 <= entry_type <= 10:
                 continue  # not handled
-            typelen = [1, 1, 2, 4, 8, 1, 1, 2, 4, 8][type - 1]
+            typelen = [1, 1, 2, 4, 8, 1, 1, 2, 4, 8][entry_type - 1]
             count = self.s2n(entry + 4, 4)
             if count > MAX_COUNT:
                 logging.debug("Probably corrupt. Aborting.")
@@ -247,25 +247,23 @@ class TIFF_file:
             offset = entry + 8
             if count * typelen > 4:
                 offset = self.s2n(offset, 4)
-            if type == 2:
+            if entry_type == 2:
                 # Special case: nul-terminated ASCII string
                 values = str(self.data[offset : offset + count - 1], encoding="latin-1")
             else:
                 values = []
-                signed = type == 6 or type >= 8
-                for j in range(count):
-                    if type in {5, 10}:
+                signed = entry_type == 6 or entry_type >= 8
+                for _ in range(count):
+                    if entry_type in {5, 10}:
                         # The type is either 5 or 10
-                        value_j = Fraction(
-                            self.s2n(offset, 4, signed), self.s2n(offset + 4, 4, signed)
-                        )
+                        value_j = Fraction(self.s2n(offset, 4, signed), self.s2n(offset + 4, 4, signed))
                     else:
                         # Not a fraction
                         value_j = self.s2n(offset, typelen, signed)
                     values.append(value_j)
                     offset = offset + typelen
             # Now "values" is either a string or an array
-            a.append((tag, type, values))
+            a.append((tag, entry_type, values))
         return a
@@ -296,13 +294,11 @@ def get_fields(fp):
logging.debug("Exif header length: %d bytes", length) logging.debug("Exif header length: %d bytes", length)
data = fp.read(length - 8) data = fp.read(length - 8)
data_format = data[0] data_format = data[0]
logging.debug( logging.debug("%s format", {INTEL_ENDIAN: "Intel", MOTOROLA_ENDIAN: "Motorola"}[data_format])
"%s format", {INTEL_ENDIAN: "Intel", MOTOROLA_ENDIAN: "Motorola"}[data_format]
)
T = TIFF_file(data) T = TIFF_file(data)
# There may be more than one IFD per file, but we only read the first one because others are # There may be more than one IFD per file, but we only read the first one because others are
# most likely thumbnails. # most likely thumbnails.
main_IFD_offset = T.first_IFD() main_ifd_offset = T.first_IFD()
result = {} result = {}
def add_tag_to_result(tag, values): def add_tag_to_result(tag, values):
@@ -314,8 +310,8 @@ def get_fields(fp):
             return  # don't overwrite data
         result[stag] = values

-    logging.debug("IFD at offset %d", main_IFD_offset)
-    IFD = T.dump_IFD(main_IFD_offset)
+    logging.debug("IFD at offset %d", main_ifd_offset)
+    IFD = T.dump_IFD(main_ifd_offset)
     exif_off = gps_off = 0
     for tag, type, values in IFD:
         if tag == 0x8769:
@@ -1,33 +0,0 @@
-# Created By: Virgil Dupras
-# Created On: 2014-03-15
-# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
-#
-# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
-# which should be included with this package. The terms are also available at
-# http://www.gnu.org/licenses/gpl-3.0.html
-
-import plistlib
-
-
-class IPhotoPlistParser(plistlib._PlistParser):
-    """A parser for iPhoto plists.
-
-    iPhoto plists tend to be malformed, so we have to subclass the built-in parser to be a bit more
-    lenient.
-    """
-
-    def __init__(self):
-        plistlib._PlistParser.__init__(self, use_builtin_types=True, dict_type=dict)
-        # For debugging purposes, we remember the last bit of data to be analyzed so that we can
-        # log it in case of an exception
-        self.lastdata = ""
-
-    def get_data(self):
-        self.lastdata = plistlib._PlistParser.get_data(self)
-        return self.lastdata
-
-    def end_integer(self):
-        try:
-            self.add_object(int(self.get_data()))
-        except ValueError:
-            self.add_object(0)
@@ -15,7 +15,7 @@ from hscommon.trans import tr
 from hscommon.jobprogress import job

 from core.engine import Match
-from .block import avgdiff, DifferentBlockCountError, NoBlocksError
+from core.pe.block import avgdiff, DifferentBlockCountError, NoBlocksError

 # OPTIMIZATION NOTES:
 # The bottleneck of the matching phase is CPU, which is why we use multiprocessing. However, another
@@ -51,11 +51,11 @@ except Exception:
 def get_cache(cache_path, readonly=False):
     if cache_path.endswith("shelve"):
-        from .cache_shelve import ShelveCache
+        from core.pe.cache_shelve import ShelveCache

         return ShelveCache(cache_path, readonly=readonly)
     else:
-        from .cache_sqlite import SqliteCache
+        from core.pe.cache_sqlite import SqliteCache

         return SqliteCache(cache_path, readonly=readonly)
@@ -87,7 +87,7 @@ def prepare_pictures(pictures, cache_path, with_dimensions, j=job.nulljob):
blocks = picture.get_blocks(BLOCK_COUNT_PER_SIDE) blocks = picture.get_blocks(BLOCK_COUNT_PER_SIDE)
cache[picture.unicode_path] = blocks cache[picture.unicode_path] = blocks
prepared.append(picture) prepared.append(picture)
except (IOError, ValueError) as e: except (OSError, ValueError) as e:
logging.warning(str(e)) logging.warning(str(e))
except MemoryError: except MemoryError:
logging.warning( logging.warning(
@@ -95,9 +95,7 @@ def prepare_pictures(pictures, cache_path, with_dimensions, j=job.nulljob):
picture.unicode_path, picture.unicode_path,
picture.size, picture.size,
) )
if ( if picture.size < 10 * 1024 * 1024: # We're really running out of memory
picture.size < 10 * 1024 * 1024
): # We're really running out of memory
raise raise
except MemoryError: except MemoryError:
logging.warning("Ran out of memory while preparing pictures") logging.warning("Ran out of memory while preparing pictures")
@@ -106,9 +104,7 @@ def prepare_pictures(pictures, cache_path, with_dimensions, j=job.nulljob):
def get_chunks(pictures): def get_chunks(pictures):
min_chunk_count = ( min_chunk_count = multiprocessing.cpu_count() * 2 # have enough chunks to feed all subprocesses
multiprocessing.cpu_count() * 2
) # have enough chunks to feed all subprocesses
chunk_count = len(pictures) // DEFAULT_CHUNK_SIZE chunk_count = len(pictures) // DEFAULT_CHUNK_SIZE
chunk_count = max(min_chunk_count, chunk_count) chunk_count = max(min_chunk_count, chunk_count)
chunk_size = (len(pictures) // chunk_count) + 1 chunk_size = (len(pictures) // chunk_count) + 1
@@ -185,9 +181,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljo
j.set_progress(comparison_count, progress_msg) j.set_progress(comparison_count, progress_msg)
j = j.start_subjob([3, 7]) j = j.start_subjob([3, 7])
pictures = prepare_pictures( pictures = prepare_pictures(pictures, cache_path, with_dimensions=not match_scaled, j=j)
pictures, cache_path, with_dimensions=not match_scaled, j=j
)
j = j.start_subjob([9, 1], tr("Preparing for matching")) j = j.start_subjob([9, 1], tr("Preparing for matching"))
cache = get_cache(cache_path) cache = get_cache(cache_path)
id2picture = {} id2picture = {}
@@ -231,12 +225,8 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljo
chunks, chunks,
pictures, pictures,
) # some wiggle room for the next statements ) # some wiggle room for the next statements
logging.warning( logging.warning("Ran out of memory when scanning! We had %d matches.", len(matches))
"Ran out of memory when scanning! We had %d matches.", len(matches) del matches[-len(matches) // 3 :] # some wiggle room to ensure we don't run out of memory again.
)
del matches[
-len(matches) // 3 :
] # some wiggle room to ensure we don't run out of memory again.
pool.close() pool.close()
result = [] result = []
myiter = j.iter_with_progress( myiter = j.iter_with_progress(
@@ -248,7 +238,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljo
for ref_id, other_id, percentage in myiter: for ref_id, other_id, percentage in myiter:
ref = id2picture[ref_id] ref = id2picture[ref_id]
other = id2picture[other_id] other = id2picture[other_id]
if percentage == 100 and ref.md5 != other.md5: if percentage == 100 and ref.digest != other.digest:
percentage = 99 percentage = 99
if percentage >= threshold: if percentage >= threshold:
ref.dimensions # pre-read dimensions for display in results ref.dimensions # pre-read dimensions for display in results
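Reviewer note: two behavioral points in this file are easy to miss among the reflows. First, get_chunks() aims for at least cpu_count() * 2 chunks so every worker in the multiprocessing pool stays fed; second, a 100% block-level match whose full-content digests disagree is demoted to 99%, so it can still pass a user threshold but is never reported as an exact duplicate. A sketch of the chunking arithmetic (the real DEFAULT_CHUNK_SIZE value is not shown in this diff, so the figure below is invented):

    import multiprocessing

    DEFAULT_CHUNK_SIZE = 50  # hypothetical value, for illustration only

    def chunk_layout(picture_count):
        # At least two chunks per core, so no subprocess starves while others work.
        min_chunk_count = multiprocessing.cpu_count() * 2
        chunk_count = max(min_chunk_count, picture_count // DEFAULT_CHUNK_SIZE)
        chunk_size = (picture_count // chunk_count) + 1
        return chunk_count, chunk_size

    # On an 8-core machine, 1000 pictures -> (20, 51): 20 chunks of up to 51 pictures.
    print(chunk_layout(1000))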

View File

@@ -2,9 +2,9 @@
  * Created On: 2010-01-30
  * Copyright 2014 Hardcoded Software (http://www.hardcoded.net)
  *
- * This software is licensed under the "BSD" License as described in the "LICENSE" file,
- * which should be included with this package. The terms are also available at
- * http://www.hardcoded.net/licenses/bsd_license
+ * This software is licensed under the "BSD" License as described in the
+ * "LICENSE" file, which should be included with this package. The terms are
+ * also available at http://www.hardcoded.net/licenses/bsd_license
  */

 #include "common.h"
@@ -17,8 +17,7 @@ static PyObject *DifferentBlockCountError;
 /* Returns a 3 sized tuple containing the mean color of 'image'.
  * image: a PIL image or crop.
  */
-static PyObject* getblock(PyObject *image)
-{
+static PyObject *getblock(PyObject *image) {
     int i, totr, totg, totb;
     Py_ssize_t pixel_count;
     PyObject *ppixels;
@@ -65,8 +64,7 @@ static PyObject* getblock(PyObject *image)
 /* Returns the difference between the first block and the second.
  * It returns an absolute sum of the 3 differences (RGB).
  */
-static int diff(PyObject *first, PyObject *second)
-{
+static int diff(PyObject *first, PyObject *second) {
     int r1, g1, b1, r2, b2, g2;
     PyObject *pr, *pg, *pb;
     pr = PySequence_ITEM(first, 0);
@@ -101,8 +99,7 @@ If it is 10, for example, 100 blocks will be returns (10 width, 10 height). The
 necessarely cover square areas. The area covered by each block will be proportional to the image\n\
 itself.\n");

-static PyObject* block_getblocks2(PyObject *self, PyObject *args)
-{
+static PyObject *block_getblocks2(PyObject *self, PyObject *args) {
     int block_count_per_side, width, height, block_width, block_height, ih;
     PyObject *image;
     PyObject *pimage_size, *pwidth, *pheight;
@@ -128,7 +125,7 @@ static PyObject* block_getblocks2(PyObject *self, PyObject *args)
     block_width = max(width / block_count_per_side, 1);
     block_height = max(height / block_count_per_side, 1);

-    result = PyList_New(block_count_per_side * block_count_per_side);
+    result = PyList_New((Py_ssize_t)block_count_per_side * block_count_per_side);
     if (result == NULL) {
         return NULL;
     }
@@ -174,14 +171,14 @@ PyDoc_STRVAR(block_avgdiff_doc,
 If the result surpasses limit, limit + 1 is returned, except if less than min_iterations\n\
 iterations have been made in the blocks.\n");

-static PyObject* block_avgdiff(PyObject *self, PyObject *args)
-{
+static PyObject *block_avgdiff(PyObject *self, PyObject *args) {
     PyObject *first, *second;
     int limit, min_iterations;
     Py_ssize_t count;
     int sum, i, result;

-    if (!PyArg_ParseTuple(args, "OOii", &first, &second, &limit, &min_iterations)) {
+    if (!PyArg_ParseTuple(args, "OOii", &first, &second, &limit,
+                          &min_iterations)) {
         return NULL;
     }
@@ -206,7 +203,8 @@ static PyObject* block_avgdiff(PyObject *self, PyObject *args)
         sum += diff(item1, item2);
         Py_DECREF(item1);
         Py_DECREF(item2);
-        if ((sum > limit*iteration_count) && (iteration_count >= min_iterations)) {
+        if ((sum > limit * iteration_count) &&
+            (iteration_count >= min_iterations)) {
             return PyLong_FromLong(limit + 1);
         }
     }
@@ -224,8 +222,7 @@ static PyMethodDef BlockMethods[] = {
     {NULL, NULL, 0, NULL} /* Sentinel */
 };

-static struct PyModuleDef BlockDef = {
-    PyModuleDef_HEAD_INIT,
+static struct PyModuleDef BlockDef = {PyModuleDef_HEAD_INIT,
                                       "_block",
                                       NULL,
                                       -1,
@@ -233,12 +230,9 @@ static struct PyModuleDef BlockDef = {
                                       NULL,
                                       NULL,
                                       NULL,
-                                      NULL
-};
+                                      NULL};

-PyObject *
-PyInit__block(void)
-{
+PyObject *PyInit__block(void) {
     PyObject *m = PyModule_Create(&BlockDef);
     if (m == NULL) {
         return NULL;
@@ -246,7 +240,8 @@ PyInit__block(void)
     NoBlocksError = PyErr_NewException("_block.NoBlocksError", NULL, NULL);
     PyModule_AddObject(m, "NoBlocksError", NoBlocksError);

-    DifferentBlockCountError = PyErr_NewException("_block.DifferentBlockCountError", NULL, NULL);
+    DifferentBlockCountError =
+        PyErr_NewException("_block.DifferentBlockCountError", NULL, NULL);
     PyModule_AddObject(m, "DifferentBlockCountError", DifferentBlockCountError);

     return m;

View File

@@ -10,6 +10,8 @@

 #include "common.h"
 #import <Foundation/Foundation.h>
+#import <CoreGraphics/CoreGraphics.h>
+#import <ImageIO/ImageIO.h>

 #define RADIANS( degrees ) ( degrees * M_PI / 180 )

View File

@@ -9,7 +9,7 @@ from hscommon.util import get_file_ext, format_size
 from core.util import format_timestamp, format_perc, format_dupe_count
 from core import fs
-from . import exif
+from core.pe import exif

 # This global value is set by the platform-specific subclasser of the Photo base class
 PLAT_SPECIFIC_PHOTO_CLASS = None

View File

@@ -8,7 +8,7 @@ from hscommon.trans import tr
 from core.scanner import Scanner, ScanType, ScanOption

-from . import matchblock, matchexif
+from core.pe import matchblock, matchexif


 class ScannerPE(Scanner):
@@ -18,12 +18,12 @@ class ScannerPE(Scanner):
     @staticmethod
     def get_scan_options():
         return [
-            ScanOption(ScanType.FuzzyBlock, tr("Contents")),
-            ScanOption(ScanType.ExifTimestamp, tr("EXIF Timestamp")),
+            ScanOption(ScanType.FUZZYBLOCK, tr("Contents")),
+            ScanOption(ScanType.EXIFTIMESTAMP, tr("EXIF Timestamp")),
         ]

     def _getmatches(self, files, j):
-        if self.scan_type == ScanType.FuzzyBlock:
+        if self.scan_type == ScanType.FUZZYBLOCK:
             return matchblock.getmatches(
                 files,
                 cache_path=self.cache_path,
@@ -31,7 +31,7 @@ class ScannerPE(Scanner):
                 match_scaled=self.match_scaled,
                 j=j,
             )
-        elif self.scan_type == ScanType.ExifTimestamp:
+        elif self.scan_type == ScanType.EXIFTIMESTAMP:
             return matchexif.getmatches(files, self.match_scaled, j)
         else:
-            raise Exception("Invalid scan type")
+            raise ValueError("Invalid scan type")

View File

@@ -43,7 +43,7 @@ class Criterion:
     @property
     def display(self):
-        return "{} ({})".format(self.category.NAME, self.display_value)
+        return f"{self.category.NAME} ({self.display_value})"


 class ValueListCategory(CriterionCategory):
@@ -82,10 +82,12 @@ class FolderCategory(ValueListCategory):
     def sort_key(self, dupe, crit_value):
         value = self.extract_value(dupe)
-        if value[: len(crit_value)] == crit_value:
-            return 0
-        else:
-            return 1
+        # This is instead of using is_relative_to() which was added in py 3.9
+        try:
+            value.relative_to(crit_value)
+        except ValueError:
+            return 1
+        return 0


 class FilenameCategory(CriterionCategory):
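Reviewer note: the try/except in FolderCategory.sort_key keeps dupeGuru compatible with Python 3.7 and 3.8, where pathlib's is_relative_to() does not exist yet; relative_to() raises ValueError when a path is not under the given prefix, which yields the same boolean answer. A quick sketch of the equivalence (the paths here are arbitrary examples):

    from pathlib import PurePath

    def is_under(path, prefix):
        # Pre-3.9 replacement for PurePath.is_relative_to()
        try:
            PurePath(path).relative_to(prefix)
            return True
        except ValueError:
            return False

    assert is_under("/photos/2021/img.jpg", "/photos")  # inside the prefix
    assert not is_under("/music/song.mp3", "/photos")   # outside the prefix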

View File

@@ -17,8 +17,8 @@ from hscommon.conflict import get_conflicted_name
 from hscommon.util import flatten, nonone, FileOrPath, format_size
 from hscommon.trans import tr

-from . import engine
-from .markable import Markable
+from core import engine
+from core.markable import Markable


 class Results(Markable):
@@ -52,6 +52,7 @@ class Results(Markable):
         self.app = app
         self.problems = []  # (dupe, error_msg)
         self.is_modified = False
+        self.refresh_required = False

     def _did_mark(self, dupe):
         self.__marked_size += dupe.size
@@ -94,8 +95,9 @@ class Results(Markable):

     # ---Private
     def __get_dupe_list(self):
-        if self.__dupes is None:
+        if self.__dupes is None or self.refresh_required:
             self.__dupes = flatten(group.dupes for group in self.groups)
+            self.refresh_required = False
             if None in self.__dupes:
                 # This is debug logging to try to figure out #44
                 logging.warning(
@@ -104,9 +106,7 @@ class Results(Markable):
                     self.groups,
                 )
             if self.__filtered_dupes:
-                self.__dupes = [
-                    dupe for dupe in self.__dupes if dupe in self.__filtered_dupes
-                ]
+                self.__dupes = [dupe for dupe in self.__dupes if dupe in self.__filtered_dupes]
             sd = self.__dupes_sort_descriptor
             if sd:
                 self.sort_dupes(sd[0], sd[1], sd[2])
@@ -125,18 +125,10 @@
             total_count = self.__total_count
             total_size = self.__total_size
         else:
-            mark_count = len(
-                [dupe for dupe in self.__filtered_dupes if self.is_marked(dupe)]
-            )
-            marked_size = sum(
-                dupe.size for dupe in self.__filtered_dupes if self.is_marked(dupe)
-            )
-            total_count = len(
-                [dupe for dupe in self.__filtered_dupes if self.is_markable(dupe)]
-            )
-            total_size = sum(
-                dupe.size for dupe in self.__filtered_dupes if self.is_markable(dupe)
-            )
+            mark_count = len([dupe for dupe in self.__filtered_dupes if self.is_marked(dupe)])
+            marked_size = sum(dupe.size for dupe in self.__filtered_dupes if self.is_marked(dupe))
+            total_count = len([dupe for dupe in self.__filtered_dupes if self.is_markable(dupe)])
+            total_size = sum(dupe.size for dupe in self.__filtered_dupes if self.is_markable(dupe))
         if self.mark_inverted:
             marked_size = self.__total_size - marked_size
         result = tr("%d / %d (%s / %s) duplicates marked.") % (
@@ -199,11 +191,7 @@
             self.__filters.append(filter_str)
             if self.__filtered_dupes is None:
                 self.__filtered_dupes = flatten(g[:] for g in self.groups)
-            self.__filtered_dupes = set(
-                dupe
-                for dupe in self.__filtered_dupes
-                if filter_re.search(str(dupe.path))
-            )
+            self.__filtered_dupes = {dupe for dupe in self.__filtered_dupes if filter_re.search(str(dupe.path))}
             filtered_groups = set()
             for dupe in self.__filtered_dupes:
                 filtered_groups.add(self.get_group_of_duplicate(dupe))
@@ -215,8 +203,7 @@
         self.__dupes = None

     def get_group_of_duplicate(self, dupe):
-        """Returns :class:`~core.engine.Group` in which ``dupe`` belongs.
-        """
+        """Returns :class:`~core.engine.Group` in which ``dupe`` belongs."""
         try:
             return self.__group_of_duplicate[dupe]
         except (TypeError, KeyError):
@@ -282,8 +269,7 @@
         self.is_modified = False

     def make_ref(self, dupe):
-        """Make ``dupe`` take the :attr:`~core.engine.Group.ref` position of its group.
-        """
+        """Make ``dupe`` take the :attr:`~core.engine.Group.ref` position of its group."""
         g = self.get_group_of_duplicate(dupe)
         r = g.ref
         if not g.switch_ref(dupe):
@@ -315,7 +301,7 @@
             try:
                 func(dupe)
                 to_remove.append(dupe)
-            except (EnvironmentError, UnicodeEncodeError) as e:
+            except (OSError, UnicodeEncodeError) as e:
                 self.problems.append((dupe, str(e)))
         if remove_from_results:
             self.remove_duplicates(to_remove)
@@ -388,8 +374,8 @@
         try:
             do_write(outfile)
-        except IOError as e:
-            # If our IOError is because dest is already a directory, we want to handle that. 21 is
+        except OSError as e:
+            # If our OSError is because dest is already a directory, we want to handle that. 21 is
             # the code we get on OS X and Linux, 13 is what we get on Windows.
             if e.errno in {21, 13}:
                 p = str(outfile)
@@ -410,10 +396,10 @@
         """
         if not self.__dupes:
             self.__get_dupe_list()
-        keyfunc = lambda d: self.app._get_dupe_sort_key(
-            d, lambda: self.get_group_of_duplicate(d), key, delta
-        )
-        self.__dupes.sort(key=keyfunc, reverse=not asc)
+        self.__dupes.sort(
+            key=lambda d: self.app._get_dupe_sort_key(d, lambda: self.get_group_of_duplicate(d), key, delta),
+            reverse=not asc,
+        )
         self.__dupes_sort_descriptor = (key, asc, delta)

     def sort_groups(self, key, asc=True):
@@ -424,8 +410,7 @@
         :param str key: key attribute name to sort with.
         :param bool asc: If false, sorting is reversed.
         """
-        keyfunc = lambda g: self.app._get_group_sort_key(g, key)
-        self.groups.sort(key=keyfunc, reverse=not asc)
+        self.groups.sort(key=lambda g: self.app._get_group_sort_key(g, key), reverse=not asc)
         self.__groups_sort_descriptor = (key, asc)

     # ---Properties
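Reviewer note: the refresh_required flag added above gives Results a cheap way to invalidate its lazily built dupe list; __get_dupe_list() now rebuilds only when the cache was never built or was explicitly flagged stale. A minimal sketch of the pattern (LazyDupeList is a made-up stand-in, not the real Results class):

    class LazyDupeList:
        def __init__(self, groups):
            self.groups = groups
            self.refresh_required = False
            self._dupes = None  # built on first access

        @property
        def dupes(self):
            # Rebuild only when never built or explicitly invalidated.
            if self._dupes is None or self.refresh_required:
                self._dupes = [d for g in self.groups for d in g]
                self.refresh_required = False
            return self._dupes

    groups = [["a1", "a2"], ["b1"]]
    view = LazyDupeList(groups)
    assert view.dupes == ["a1", "a2", "b1"]
    groups[1].append("b2")        # mutate the underlying data...
    view.refresh_required = True  # ...and flag the cached list as stale
    assert view.dupes == ["a1", "a2", "b1", "b2"]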

View File

@@ -13,7 +13,7 @@ from hscommon.jobprogress import job
 from hscommon.util import dedupe, rem_file_ext, get_file_ext
 from hscommon.trans import tr

-from . import engine
+from core import engine

 # It's quite ugly to have scan types from all editions all put in the same class, but because there's
 # there will be some nasty bugs popping up (ScanType is used in core when in should exclusively be
@@ -21,16 +21,16 @@ from . import engine


 class ScanType:
-    Filename = 0
-    Fields = 1
-    FieldsNoOrder = 2
-    Tag = 3
-    Folders = 4
-    Contents = 5
+    FILENAME = 0
+    FIELDS = 1
+    FIELDSNOORDER = 2
+    TAG = 3
+    FOLDERS = 4
+    CONTENTS = 5

     # PE
-    FuzzyBlock = 10
-    ExifTimestamp = 11
+    FUZZYBLOCK = 10
+    EXIFTIMESTAMP = 11


 ScanOption = namedtuple("ScanOption", "scan_type label")
@@ -77,30 +77,37 @@ class Scanner:
         self.discarded_file_count = 0

     def _getmatches(self, files, j):
-        if self.size_threshold or self.scan_type in {
-            ScanType.Contents,
-            ScanType.Folders,
-        }:
+        if (
+            self.size_threshold
+            or self.large_size_threshold
+            or self.scan_type
+            in {
+                ScanType.CONTENTS,
+                ScanType.FOLDERS,
+            }
+        ):
             j = j.start_subjob([2, 8])
             for f in j.iter_with_progress(files, tr("Read size of %d/%d files")):
                 f.size  # pre-read, makes a smoother progress if read here (especially for bundles)
         if self.size_threshold:
             files = [f for f in files if f.size >= self.size_threshold]
-        if self.scan_type in {ScanType.Contents, ScanType.Folders}:
-            return engine.getmatches_by_contents(files, j=j)
+        if self.large_size_threshold:
+            files = [f for f in files if f.size <= self.large_size_threshold]
+        if self.scan_type in {ScanType.CONTENTS, ScanType.FOLDERS}:
+            return engine.getmatches_by_contents(files, bigsize=self.big_file_size_threshold, j=j)
         else:
             j = j.start_subjob([2, 8])
             kw = {}
             kw["match_similar_words"] = self.match_similar_words
             kw["weight_words"] = self.word_weighting
             kw["min_match_percentage"] = self.min_match_percentage
-            if self.scan_type == ScanType.FieldsNoOrder:
-                self.scan_type = ScanType.Fields
+            if self.scan_type == ScanType.FIELDSNOORDER:
+                self.scan_type = ScanType.FIELDS
                 kw["no_field_order"] = True
             func = {
-                ScanType.Filename: lambda f: engine.getwords(rem_file_ext(f.name)),
-                ScanType.Fields: lambda f: engine.getfields(rem_file_ext(f.name)),
-                ScanType.Tag: lambda f: [
+                ScanType.FILENAME: lambda f: engine.getwords(rem_file_ext(f.name)),
+                ScanType.FIELDS: lambda f: engine.getfields(rem_file_ext(f.name)),
+                ScanType.TAG: lambda f: [
                     engine.getwords(str(getattr(f, attrname)))
                     for attrname in SCANNABLE_TAGS
                     if attrname in self.scanned_tags
@@ -127,7 +134,7 @@ class Scanner:
             return False
         if is_same_with_digit(refname, dupename):
             return True
-        return len(dupe.path) > len(ref.path)
+        return len(dupe.path.parts) > len(ref.path.parts)

     @staticmethod
     def get_scan_options():
@@ -150,49 +157,33 @@ class Scanner:
         # "duplicated duplicates if you will). Then, we also don't want mixed file kinds if the
         # option isn't enabled, we want matches for which both files exist and, lastly, we don't
         # want matches with both files as ref.
-        if self.scan_type == ScanType.Folders and matches:
+        if self.scan_type == ScanType.FOLDERS and matches:
             allpath = {m.first.path for m in matches}
             allpath |= {m.second.path for m in matches}
             sortedpaths = sorted(allpath)
             toremove = set()
             last_parent_path = sortedpaths[0]
             for p in sortedpaths[1:]:
-                if p in last_parent_path:
+                if last_parent_path in p.parents:
                     toremove.add(p)
                 else:
                     last_parent_path = p
-            matches = [
-                m
-                for m in matches
-                if m.first.path not in toremove or m.second.path not in toremove
-            ]
+            matches = [m for m in matches if m.first.path not in toremove or m.second.path not in toremove]
         if not self.mix_file_kind:
-            matches = [
-                m
-                for m in matches
-                if get_file_ext(m.first.name) == get_file_ext(m.second.name)
-            ]
-        matches = [
-            m for m in matches if m.first.path.exists() and m.second.path.exists()
-        ]
+            matches = [m for m in matches if get_file_ext(m.first.name) == get_file_ext(m.second.name)]
+        matches = [m for m in matches if m.first.path.exists() and m.second.path.exists()]
         matches = [m for m in matches if not (m.first.is_ref and m.second.is_ref)]
         if ignore_list:
-            matches = [
-                m
-                for m in matches
-                if not ignore_list.AreIgnored(str(m.first.path), str(m.second.path))
-            ]
+            matches = [m for m in matches if not ignore_list.are_ignored(str(m.first.path), str(m.second.path))]
         logging.info("Grouping matches")
         groups = engine.get_groups(matches)
         if self.scan_type in {
-            ScanType.Filename,
-            ScanType.Fields,
-            ScanType.FieldsNoOrder,
-            ScanType.Tag,
+            ScanType.FILENAME,
+            ScanType.FIELDS,
+            ScanType.FIELDSNOORDER,
+            ScanType.TAG,
         }:
-            matched_files = dedupe(
-                [m.first for m in matches] + [m.second for m in matches]
-            )
+            matched_files = dedupe([m.first for m in matches] + [m.second for m in matches])
             self.discarded_file_count = len(matched_files) - sum(len(g) for g in groups)
         else:
             # Ticket #195
@@ -215,7 +206,9 @@ class Scanner:
     match_similar_words = False
     min_match_percentage = 80
     mix_file_kind = True
-    scan_type = ScanType.Filename
+    scan_type = ScanType.FILENAME
     scanned_tags = {"artist", "title"}
     size_threshold = 0
+    large_size_threshold = 0
+    big_file_size_threshold = 0
     word_weighting = False
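Reviewer note: with the new large_size_threshold preference, _getmatches() now trims the candidate list from both ends; files below size_threshold and above large_size_threshold are dropped before matching, and a threshold of 0 disables the corresponding bound. A small sketch of the double filter (the threshold names follow the diff; FakeFile and its sizes are invented):

    def filter_by_size(files, size_threshold=0, large_size_threshold=0):
        # Both bounds are inclusive; 0 means "no bound on this side".
        if size_threshold:
            files = [f for f in files if f.size >= size_threshold]
        if large_size_threshold:
            files = [f for f in files if f.size <= large_size_threshold]
        return files

    class FakeFile:
        def __init__(self, size):
            self.size = size

    files = [FakeFile(s) for s in (100, 5_000, 2_000_000)]
    kept = filter_by_size(files, size_threshold=1_000, large_size_threshold=1_000_000)
    assert [f.size for f in kept] == [5_000]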

View File

@@ -1 +1 @@
-from . import fs, result_table, scanner  # noqa
+from core.se import fs, result_table, scanner  # noqa

View File

@@ -13,7 +13,7 @@ class ScannerSE(ScannerBase):
     @staticmethod
     def get_scan_options():
         return [
-            ScanOption(ScanType.Filename, tr("Filename")),
-            ScanOption(ScanType.Contents, tr("Contents")),
-            ScanOption(ScanType.Folders, tr("Folders")),
+            ScanOption(ScanType.FILENAME, tr("Filename")),
+            ScanOption(ScanType.CONTENTS, tr("Contents")),
+            ScanOption(ScanType.FOLDERS, tr("Folders")),
         ]

View File

@@ -7,31 +7,30 @@
 import os
 import os.path as op
 import logging
+import tempfile

 import pytest

-from hscommon.path import Path
+from pathlib import Path
 import hscommon.conflict
 import hscommon.util
 from hscommon.testutil import eq_, log_calls
 from hscommon.jobprogress.job import Job

-from .base import TestApp
-from .results_test import GetTestGroups
-from .. import app, fs, engine
-from ..scanner import ScanType
+from core.tests.base import TestApp
+from core.tests.results_test import GetTestGroups
+from core import app, fs, engine
+from core.scanner import ScanType


 def add_fake_files_to_directories(directories, files):
     directories.get_files = lambda j=None: iter(files)
-    directories._dirs.append("this is just so Scan() doesnt return 3")
+    directories._dirs.append("this is just so Scan() doesn't return 3")


 class TestCaseDupeGuru:
     def test_apply_filter_calls_results_apply_filter(self, monkeypatch):
         dgapp = TestApp().app
-        monkeypatch.setattr(
-            dgapp.results, "apply_filter", log_calls(dgapp.results.apply_filter)
-        )
+        monkeypatch.setattr(dgapp.results, "apply_filter", log_calls(dgapp.results.apply_filter))
         dgapp.apply_filter("foo")
         eq_(2, len(dgapp.results.apply_filter.calls))
         call = dgapp.results.apply_filter.calls[0]
@@ -41,15 +40,11 @@ class TestCaseDupeGuru:
     def test_apply_filter_escapes_regexp(self, monkeypatch):
         dgapp = TestApp().app
-        monkeypatch.setattr(
-            dgapp.results, "apply_filter", log_calls(dgapp.results.apply_filter)
-        )
+        monkeypatch.setattr(dgapp.results, "apply_filter", log_calls(dgapp.results.apply_filter))
         dgapp.apply_filter("()[]\\.|+?^abc")
         call = dgapp.results.apply_filter.calls[1]
         eq_("\\(\\)\\[\\]\\\\\\.\\|\\+\\?\\^abc", call["filter_str"])
-        dgapp.apply_filter(
-            "(*)"
-        )  # In "simple mode", we want the * to behave as a wilcard
+        dgapp.apply_filter("(*)")  # In "simple mode", we want the * to behave as a wildcard
         call = dgapp.results.apply_filter.calls[3]
         eq_(r"\(.*\)", call["filter_str"])
         dgapp.options["escape_filter_regexp"] = False
@@ -62,7 +57,7 @@ class TestCaseDupeGuru:
         # for this unit is pathetic. What's done is done. My approach now is to add tests for
         # every change I want to make. The blowup was caused by a missing import.
         p = Path(str(tmpdir))
-        p["foo"].open("w").close()
+        p.joinpath("foo").touch()
         monkeypatch.setattr(
             hscommon.conflict,
             "smart_copy",
@@ -70,40 +65,39 @@ class TestCaseDupeGuru:
         )
         # XXX This monkeypatch is temporary. will be fixed in a better monkeypatcher.
         monkeypatch.setattr(app, "smart_copy", hscommon.conflict.smart_copy)
-        monkeypatch.setattr(
-            os, "makedirs", lambda path: None
-        )  # We don't want the test to create that fake directory
+        monkeypatch.setattr(os, "makedirs", lambda path: None)  # We don't want the test to create that fake directory
        dgapp = TestApp().app
         dgapp.directories.add_path(p)
         [f] = dgapp.directories.get_files()
-        dgapp.copy_or_move(f, True, "some_destination", 0)
+        with tempfile.TemporaryDirectory() as tmp_dir:
+            dgapp.copy_or_move(f, True, tmp_dir, 0)
         eq_(1, len(hscommon.conflict.smart_copy.calls))
         call = hscommon.conflict.smart_copy.calls[0]
-        eq_(call["dest_path"], op.join("some_destination", "foo"))
+        eq_(call["dest_path"], Path(tmp_dir, "foo"))
         eq_(call["source_path"], f.path)

     def test_copy_or_move_clean_empty_dirs(self, tmpdir, monkeypatch):
         tmppath = Path(str(tmpdir))
-        sourcepath = tmppath["source"]
+        sourcepath = tmppath.joinpath("source")
         sourcepath.mkdir()
-        sourcepath["myfile"].open("w")
+        sourcepath.joinpath("myfile").touch()
         app = TestApp().app
         app.directories.add_path(tmppath)
         [myfile] = app.directories.get_files()
         monkeypatch.setattr(app, "clean_empty_dirs", log_calls(lambda path: None))
-        app.copy_or_move(myfile, False, tmppath["dest"], 0)
+        app.copy_or_move(myfile, False, tmppath.joinpath("dest"), 0)
         calls = app.clean_empty_dirs.calls
         eq_(1, len(calls))
         eq_(sourcepath, calls[0]["path"])

-    def test_Scan_with_objects_evaluating_to_false(self):
+    def test_scan_with_objects_evaluating_to_false(self):
         class FakeFile(fs.File):
             def __bool__(self):
                 return False

         # At some point, any() was used in a wrong way that made Scan() wrongly return 1
         app = TestApp().app
-        f1, f2 = [FakeFile("foo") for i in range(2)]
+        f1, f2 = (FakeFile("foo") for _ in range(2))
         f1.is_ref, f2.is_ref = (False, False)
         assert not (bool(f1) and bool(f2))
         add_fake_files_to_directories(app.directories, [f1, f2])
@@ -114,11 +108,11 @@ class TestCaseDupeGuru:
         # If the ignore_hardlink_matches option is set, don't match files hardlinking to the same
         # inode.
         tmppath = Path(str(tmpdir))
-        tmppath["myfile"].open("w").write("foo")
-        os.link(str(tmppath["myfile"]), str(tmppath["hardlink"]))
+        tmppath.joinpath("myfile").open("wt").write("foo")
+        os.link(str(tmppath.joinpath("myfile")), str(tmppath.joinpath("hardlink")))
         app = TestApp().app
         app.directories.add_path(tmppath)
-        app.options["scan_type"] = ScanType.Contents
+        app.options["scan_type"] = ScanType.CONTENTS
         app.options["ignore_hardlink_matches"] = True
         app.start_scanning()
         eq_(len(app.results.groups), 0)
@@ -132,7 +126,7 @@ class TestCaseDupeGuru:
         assert not dgapp.result_table.rename_selected("foo")  # no crash


-class TestCaseDupeGuru_clean_empty_dirs:
+class TestCaseDupeGuruCleanEmptyDirs:
     @pytest.fixture
     def do_setup(self, request):
         monkeypatch = request.getfixturevalue("monkeypatch")
@@ -161,7 +155,7 @@ class TestCaseDupeGuru_clean_empty_dirs:
         # delete_if_empty must be recursively called up in the path until it returns False
         @log_calls
         def mock_delete_if_empty(path, files_to_delete=[]):
-            return len(path) > 1
+            return len(path.parts) > 1

         monkeypatch.setattr(hscommon.util, "delete_if_empty", mock_delete_if_empty)
         # XXX This monkeypatch is temporary. will be fixed in a better monkeypatcher.
@@ -188,11 +182,11 @@ class TestCaseDupeGuruWithResults:
         self.rtable.refresh()
         tmpdir = request.getfixturevalue("tmpdir")
         tmppath = Path(str(tmpdir))
-        tmppath["foo"].mkdir()
-        tmppath["bar"].mkdir()
+        tmppath.joinpath("foo").mkdir()
+        tmppath.joinpath("bar").mkdir()
         self.app.directories.add_path(tmppath)

-    def test_GetObjects(self, do_setup):
+    def test_get_objects(self, do_setup):
         objects = self.objects
         groups = self.groups
         r = self.rtable[0]
@@ -205,7 +199,7 @@ class TestCaseDupeGuruWithResults:
         assert r._group is groups[1]
         assert r._dupe is objects[4]

-    def test_GetObjects_after_sort(self, do_setup):
+    def test_get_objects_after_sort(self, do_setup):
         objects = self.objects
         groups = self.groups[:]  # we need an un-sorted reference
         self.rtable.sort("name", False)
@@ -220,7 +214,7 @@ class TestCaseDupeGuruWithResults:
         # The first 2 dupes have been removed. The 3rd one is a ref. it stays there, in first pos.
         eq_(self.rtable.selected_indexes, [1])  # no exception

-    def test_selectResultNodePaths(self, do_setup):
+    def test_select_result_node_paths(self, do_setup):
         app = self.app
         objects = self.objects
         self.rtable.select([1, 2])
@@ -228,7 +222,7 @@ class TestCaseDupeGuruWithResults:
         assert app.selected_dupes[0] is objects[1]
         assert app.selected_dupes[1] is objects[2]

-    def test_selectResultNodePaths_with_ref(self, do_setup):
+    def test_select_result_node_paths_with_ref(self, do_setup):
         app = self.app
         objects = self.objects
         self.rtable.select([1, 2, 3])
@@ -237,7 +231,7 @@ class TestCaseDupeGuruWithResults:
         assert app.selected_dupes[1] is objects[2]
         assert app.selected_dupes[2] is self.groups[1].ref

-    def test_selectResultNodePaths_after_sort(self, do_setup):
+    def test_select_result_node_paths_after_sort(self, do_setup):
         app = self.app
         objects = self.objects
         groups = self.groups[:]  # To keep the old order in memory
@@ -264,7 +258,7 @@ class TestCaseDupeGuruWithResults:
         app.remove_selected()
         eq_(self.rtable.selected_indexes, [])  # no exception

-    def test_selectPowerMarkerRows_after_sort(self, do_setup):
+    def test_select_powermarker_rows_after_sort(self, do_setup):
         app = self.app
         objects = self.objects
         self.rtable.power_marker = True
@@ -303,7 +297,7 @@ class TestCaseDupeGuruWithResults:
         app.toggle_selected_mark_state()
         eq_(app.results.mark_count, 0)

-    def test_refreshDetailsWithSelected(self, do_setup):
+    def test_refresh_details_with_selected(self, do_setup):
         self.rtable.select([1, 4])
         eq_(self.dpanel.row(0), ("Filename", "bar bleh", "foo bar"))
         self.dpanel.view.check_gui_calls(["refresh"])
@@ -311,7 +305,7 @@ class TestCaseDupeGuruWithResults:
         eq_(self.dpanel.row(0), ("Filename", "---", "---"))
         self.dpanel.view.check_gui_calls(["refresh"])

-    def test_makeSelectedReference(self, do_setup):
+    def test_make_selected_reference(self, do_setup):
         app = self.app
         objects = self.objects
         groups = self.groups
@@ -320,9 +314,7 @@ class TestCaseDupeGuruWithResults:
         assert groups[0].ref is objects[1]
         assert groups[1].ref is objects[4]

-    def test_makeSelectedReference_by_selecting_two_dupes_in_the_same_group(
-        self, do_setup
-    ):
+    def test_make_selected_reference_by_selecting_two_dupes_in_the_same_group(self, do_setup):
         app = self.app
         objects = self.objects
         groups = self.groups
@@ -332,7 +324,7 @@ class TestCaseDupeGuruWithResults:
         assert groups[0].ref is objects[1]
         assert groups[1].ref is objects[4]

-    def test_removeSelected(self, do_setup):
+    def test_remove_selected(self, do_setup):
         app = self.app
         self.rtable.select([1, 4])
         app.remove_selected()
@@ -340,7 +332,7 @@ class TestCaseDupeGuruWithResults:
         app.remove_selected()
         eq_(len(app.results.dupes), 0)

-    def test_addDirectory_simple(self, do_setup):
+    def test_add_directory_simple(self, do_setup):
         # There's already a directory in self.app, so adding another once makes 2 of em
         app = self.app
         # any other path that isn't a parent or child of the already added path
@@ -348,7 +340,7 @@ class TestCaseDupeGuruWithResults:
         app.add_directory(otherpath)
         eq_(len(app.directories), 2)

-    def test_addDirectory_already_there(self, do_setup):
+    def test_add_directory_already_there(self, do_setup):
         app = self.app
         otherpath = Path(op.dirname(__file__))
         app.add_directory(otherpath)
@@ -356,7 +348,7 @@ class TestCaseDupeGuruWithResults:
         eq_(len(app.view.messages), 1)
         assert "already" in app.view.messages[0]

-    def test_addDirectory_does_not_exist(self, do_setup):
+    def test_add_directory_does_not_exist(self, do_setup):
         app = self.app
         app.add_directory("/does_not_exist")
         eq_(len(app.view.messages), 1)
@@ -372,30 +364,30 @@ class TestCaseDupeGuruWithResults:
         # BOTH the ref and the other dupe should have been added
         eq_(len(app.ignore_list), 3)

-    def test_purgeIgnoreList(self, do_setup, tmpdir):
+    def test_purge_ignorelist(self, do_setup, tmpdir):
         app = self.app
         p1 = str(tmpdir.join("file1"))
         p2 = str(tmpdir.join("file2"))
         open(p1, "w").close()
         open(p2, "w").close()
         dne = "/does_not_exist"
-        app.ignore_list.Ignore(dne, p1)
-        app.ignore_list.Ignore(p2, dne)
-        app.ignore_list.Ignore(p1, p2)
+        app.ignore_list.ignore(dne, p1)
+        app.ignore_list.ignore(p2, dne)
+        app.ignore_list.ignore(p1, p2)
         app.purge_ignore_list()
         eq_(1, len(app.ignore_list))
-        assert app.ignore_list.AreIgnored(p1, p2)
-        assert not app.ignore_list.AreIgnored(dne, p1)
+        assert app.ignore_list.are_ignored(p1, p2)
+        assert not app.ignore_list.are_ignored(dne, p1)

     def test_only_unicode_is_added_to_ignore_list(self, do_setup):
-        def FakeIgnore(first, second):
+        def fake_ignore(first, second):
             if not isinstance(first, str):
                 self.fail()
             if not isinstance(second, str):
                 self.fail()

         app = self.app
-        app.ignore_list.Ignore = FakeIgnore
+        app.ignore_list.ignore = fake_ignore
         self.rtable.select([4])
         app.add_selected_to_ignore_list()
@@ -404,9 +396,7 @@ class TestCaseDupeGuruWithResults:
         # results table.
         app = self.app
         app.JOB = Job(1, lambda *args, **kw: False)  # Cancels the task
-        add_fake_files_to_directories(
-            app.directories, self.objects
-        )  # We want the scan to at least start
+        add_fake_files_to_directories(app.directories, self.objects)  # We want the scan to at least start
         app.start_scanning()  # will be cancelled immediately
         eq_(len(app.result_table), 0)
@@ -431,17 +421,14 @@ class TestCaseDupeGuruWithResults:
         # don't crash


-class TestCaseDupeGuru_renameSelected:
+class TestCaseDupeGuruRenameSelected:
     @pytest.fixture
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        fp = open(str(p["foo bar 1"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 2"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 3"]), mode="w")
-        fp.close()
+        p.joinpath("foo bar 1").touch()
+        p.joinpath("foo bar 2").touch()
+        p.joinpath("foo bar 3").touch()
         files = fs.get_files(p)
         for f in files:
             f.is_ref = False
@@ -463,7 +450,7 @@ class TestCaseDupeGuru_renameSelected:
         g = self.groups[0]
         self.rtable.select([1])
         assert app.rename_selected("renamed")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" in names
         assert "foo bar 2" not in names
         eq_(g.dupes[0].name, "renamed")
@@ -476,7 +463,7 @@ class TestCaseDupeGuru_renameSelected:
         assert not app.rename_selected("renamed")
         msg = logging.warning.calls[0]["msg"]
         eq_("dupeGuru Warning: list index out of range", msg)
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" not in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")
@@ -489,7 +476,7 @@ class TestCaseDupeGuru_renameSelected:
         assert not app.rename_selected("foo bar 1")
         msg = logging.warning.calls[0]["msg"]
         assert msg.startswith("dupeGuru Warning: 'foo bar 1' already exists in")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "foo bar 1" in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")
@@ -500,9 +487,9 @@ class TestAppWithDirectoriesInTree:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        p["sub1"].mkdir()
-        p["sub2"].mkdir()
-        p["sub3"].mkdir()
+        p.joinpath("sub1").mkdir()
+        p.joinpath("sub2").mkdir()
+        p.joinpath("sub3").mkdir()
         app = TestApp()
         self.app = app.app
         self.dtree = app.dtree
@@ -514,7 +501,6 @@ class TestAppWithDirectoriesInTree:
         # refreshed.
         node = self.dtree[0]
         eq_(len(node), 3)  # a len() call is required for subnodes to be loaded
-        subnode = node[0]
         node.state = 1  # the state property is a state index
         node = self.dtree[0]
         eq_(len(node), 3)
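Reviewer note: most of the churn in this test module is the mechanical hscommon.path to pathlib migration. The recurring substitutions, collected as a rough cheat-sheet (this summary is mine, not part of the diff):

    from pathlib import Path

    p = Path("/tmp/example")

    # hscommon.path style  ->  pathlib style
    # p["foo"]             ->  p.joinpath("foo")  (or p / "foo")
    # p.parent()           ->  p.parent           (a property, not a method)
    # p.listdir()          ->  p.glob("*")
    # len(p)               ->  len(p.parts)
    child = p.joinpath("foo")
    assert child.parent == p
    assert len(p.parts) >= 1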

View File

@@ -5,17 +5,16 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.testutil import TestApp as TestAppBase, CallLogger, eq_, with_app  # noqa
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.util import get_file_ext, format_size
 from hscommon.gui.column import Column
 from hscommon.jobprogress.job import nulljob, JobCancelled

-from .. import engine
-from .. import prioritize
-from ..engine import getwords
-from ..app import DupeGuru as DupeGuruBase
-from ..gui.result_table import ResultTable as ResultTableBase
-from ..gui.prioritize_dialog import PrioritizeDialog
+from core import engine, prioritize
+from core.engine import getwords
+from core.app import DupeGuru as DupeGuruBase
+from core.gui.result_table import ResultTable as ResultTableBase
+from core.gui.prioritize_dialog import PrioritizeDialog


 class DupeGuruView:
@@ -86,8 +85,9 @@ class NamedObject:
             folder = "basepath"
         self._folder = Path(folder)
         self.size = size
-        self.md5partial = name
-        self.md5 = name
+        self.digest_partial = name
+        self.digest = name
+        self.digest_samples = name
         if with_words:
             self.words = getwords(name)
         self.is_ref = False
@@ -110,11 +110,11 @@

     @property
     def path(self):
-        return self._folder[self.name]
+        return self._folder.joinpath(self.name)

     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent

     @property
     def extension(self):
@@ -139,9 +139,7 @@ def GetTestGroups():
     matches = engine.getmatches(objects)  # we should have 5 matches
     groups = engine.get_groups(matches)  # We should have 2 groups
     for g in groups:
-        g.prioritize(
-            lambda x: objects.index(x)
-        )  # We want the dupes to be in the same order as the list is
+        g.prioritize(lambda x: objects.index(x))  # We want the dupes to be in the same order as the list is
     groups.sort(key=len, reverse=True)  # We want the group with 3 members to be first.
     return (objects, matches, groups)
@@ -152,8 +150,8 @@ class TestApp(TestAppBase):
     def __init__(self):
         def link_gui(gui):
             gui.view = self.make_logger()
-            if hasattr(gui, "columns"):  # tables
-                gui.columns.view = self.make_logger()
+            if hasattr(gui, "_columns"):  # tables
+                gui._columns.view = self.make_logger()
             return gui

         TestAppBase.__init__(self)
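Reviewer note: NamedObject is a test double, so it only needs attribute names matching what the engine reads; md5 and md5partial become digest and digest_partial here, and digest_samples appears to be the third tier that pairs with the bigsize argument threaded through getmatches_by_contents above. A rough sketch of why a stub like this suffices (FileStub and its comments are illustrative, not the real API):

    class FileStub:
        # The engine only compares these attributes for equality, so a test
        # double can store any sentinel value under the expected names.
        def __init__(self, name):
            self.digest_partial = name  # quick first-pass comparison
            self.digest = name          # full-content comparison
            self.digest_samples = name  # sampled comparison, presumably for files over the big-file threshold

    a, b = FileStub("same"), FileStub("same")
    assert a.digest == b.digest  # stands in for "files have identical contents"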

View File

@@ -9,14 +9,12 @@ from pytest import raises, skip
 from hscommon.testutil import eq_

 try:
-    from ..pe.block import avgdiff, getblocks2, NoBlocksError, DifferentBlockCountError
+    from core.pe.block import avgdiff, getblocks2, NoBlocksError, DifferentBlockCountError
 except ImportError:
     skip("Can't import the block module, probably hasn't been compiled.")


-def my_avgdiff(
-    first, second, limit=768, min_iter=3
-):  # this is so I don't have to re-write every call
+def my_avgdiff(first, second, limit=768, min_iter=3):  # this is so I don't have to re-write every call
     return avgdiff(first, second, limit, min_iter)
@@ -75,99 +73,6 @@ class TestCasegetblock:
         eq_((meanred, meangreen, meanblue), b)


-# class TCdiff(unittest.TestCase):
-#     def test_diff(self):
-#         b1 = (10, 20, 30)
-#         b2 = (1, 2, 3)
-#         eq_(9 + 18 + 27, diff(b1, b2))
-#
-#     def test_diff_negative(self):
-#         b1 = (10, 20, 30)
-#         b2 = (1, 2, 3)
-#         eq_(9 + 18 + 27, diff(b2, b1))
-#
-#     def test_diff_mixed_positive_and_negative(self):
-#         b1 = (1, 5, 10)
-#         b2 = (10, 1, 15)
-#         eq_(9 + 4 + 5, diff(b1, b2))
-#
-# class TCgetblocks(unittest.TestCase):
-#     def test_empty_image(self):
-#         im = empty()
-#         blocks = getblocks(im, 1)
-#         eq_(0, len(blocks))
-#
-#     def test_one_block_image(self):
-#         im = four_pixels()
-#         blocks = getblocks2(im, 1)
-#         eq_(1, len(blocks))
-#         block = blocks[0]
-#         meanred = (0xff + 0x80) // 4
-#         meangreen = (0x80 + 0x40) // 4
-#         meanblue = (0xff + 0x80) // 4
-#         eq_((meanred, meangreen, meanblue), block)
-#
-#     def test_not_enough_height_to_fit_a_block(self):
-#         im = FakeImage((2, 1), [BLACK, BLACK])
-#         blocks = getblocks(im, 2)
-#         eq_(0, len(blocks))
-#
-#     def xtest_dont_include_leftovers(self):
-#         # this test is disabled because getblocks is not used and getblock in cdeffed
-#         pixels = [
-#             RED,(0, 0x80, 0xff), BLACK,
-#             (0x80, 0, 0),(0, 0x40, 0x80), BLACK,
-#             BLACK, BLACK, BLACK
-#         ]
-#         im = FakeImage((3, 3), pixels)
-#         blocks = getblocks(im, 2)
-#         block = blocks[0]
-#         #Because the block is smaller than the image, only blocksize must be considered.
-#         meanred = (0xff + 0x80) // 4
-#         meangreen = (0x80 + 0x40) // 4
-#         meanblue = (0xff + 0x80) // 4
-#         eq_((meanred, meangreen, meanblue), block)
-#
-#     def xtest_two_blocks(self):
-#         # this test is disabled because getblocks is not used and getblock in cdeffed
-#         pixels = [BLACK for i in xrange(4 * 2)]
-#         pixels[0] = RED
-#         pixels[1] = (0, 0x80, 0xff)
-#         pixels[4] = (0x80, 0, 0)
-#         pixels[5] = (0, 0x40, 0x80)
-#         im = FakeImage((4, 2), pixels)
-#         blocks = getblocks(im, 2)
-#         eq_(2, len(blocks))
-#         block = blocks[0]
-#         #Because the block is smaller than the image, only blocksize must be considered.
-#         meanred = (0xff + 0x80) // 4
-#         meangreen = (0x80 + 0x40) // 4
-#         meanblue = (0xff + 0x80) // 4
-#         eq_((meanred, meangreen, meanblue), block)
-#         eq_(BLACK, blocks[1])
-#
-#     def test_four_blocks(self):
-#         pixels = [BLACK for i in xrange(4 * 4)]
-#         pixels[0] = RED
-#         pixels[1] = (0, 0x80, 0xff)
-#         pixels[4] = (0x80, 0, 0)
-#         pixels[5] = (0, 0x40, 0x80)
-#         im = FakeImage((4, 4), pixels)
-#         blocks = getblocks2(im, 2)
-#         eq_(4, len(blocks))
-#         block = blocks[0]
-#         #Because the block is smaller than the image, only blocksize must be considered.
-#         meanred = (0xff + 0x80) // 4
-#         meangreen = (0x80 + 0x40) // 4
-#         meanblue = (0xff + 0x80) // 4
-#         eq_((meanred, meangreen, meanblue), block)
-#         eq_(BLACK, blocks[1])
-#         eq_(BLACK, blocks[2])
-#         eq_(BLACK, blocks[3])
-#

 class TestCasegetblocks2:
     def test_empty_image(self):
         im = empty()
@@ -272,8 +177,8 @@ class TestCaseavgdiff:
     def test_return_at_least_1_at_the_slightest_difference(self):
         ref = (0, 0, 0)
         b1 = (1, 0, 0)
-        blocks1 = [ref for i in range(250)]
-        blocks2 = [ref for i in range(250)]
+        blocks1 = [ref for _ in range(250)]
+        blocks2 = [ref for _ in range(250)]
         blocks2[0] = b1
         eq_(1, my_avgdiff(blocks1, blocks2))
@@ -282,41 +187,3 @@
         blocks1 = [ref, ref]
         blocks2 = [ref, ref]
         eq_(0, my_avgdiff(blocks1, blocks2))
-
-
-# class TCmaxdiff(unittest.TestCase):
-#     def test_empty(self):
-#         self.assertRaises(NoBlocksError, maxdiff,[],[])
-#
-#     def test_two_blocks(self):
-#         b1 = (5, 10, 15)
-#         b2 = (255, 250, 245)
-#         b3 = (0, 0, 0)
-#         b4 = (255, 0, 255)
-#         blocks1 = [b1, b2]
-#         blocks2 = [b3, b4]
-#         expected1 = 5 + 10 + 15
-#         expected2 = 0 + 250 + 10
-#         expected = max(expected1, expected2)
-#         eq_(expected, maxdiff(blocks1, blocks2))
-#
-#     def test_blocks_not_the_same_size(self):
-#         b = (0, 0, 0)
-#         self.assertRaises(DifferentBlockCountError, maxdiff,[b, b],[b])
-#
-#     def test_first_arg_is_empty_but_not_second(self):
-#         #Don't return 0 (as when the 2 lists are empty), raise!
-#         b = (0, 0, 0)
-#         self.assertRaises(DifferentBlockCountError, maxdiff,[],[b])
-#
-#     def test_limit(self):
-#         b1 = (5, 10, 15)
-#         b2 = (255, 250, 245)
-#         b3 = (0, 0, 0)
-#         b4 = (255, 0, 255)
-#         blocks1 = [b1, b2]
-#         blocks2 = [b3, b4]
-#         expected1 = 5 + 10 + 15
-#         expected2 = 0 + 250 + 10
-#         eq_(expected1, maxdiff(blocks1, blocks2, expected1 - 1))
-#

View File

@@ -10,14 +10,14 @@
 from hscommon.testutil import eq_

 try:
-    from ..pe.cache import colors_to_string, string_to_colors
-    from ..pe.cache_sqlite import SqliteCache
-    from ..pe.cache_shelve import ShelveCache
+    from core.pe.cache import colors_to_string, string_to_colors
+    from core.pe.cache_sqlite import SqliteCache
+    from core.pe.cache_shelve import ShelveCache
 except ImportError:
     skip("Can't import the cache module, probably hasn't been compiled.")


-class TestCasecolors_to_string:
+class TestCaseColorsToString:
     def test_no_color(self):
         eq_("", colors_to_string([]))
@@ -30,7 +30,7 @@ class TestCasecolors_to_string:
         eq_("000102030405", colors_to_string([(0, 1, 2), (3, 4, 5)]))


-class TestCasestring_to_colors:
+class TestCaseStringToColors:
     def test_empty(self):
         eq_([], string_to_colors(""))

View File

@@ -10,45 +10,39 @@ import tempfile
 import shutil
 from pytest import raises
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.testutil import eq_
 from hscommon.plat import ISWINDOWS
-from ..fs import File
-from ..directories import (
+from core.fs import File
+from core.directories import (
     Directories,
     DirectoryState,
     AlreadyThereError,
     InvalidPathError,
 )
-from ..exclude import ExcludeList, ExcludeDict
+from core.exclude import ExcludeList, ExcludeDict
 def create_fake_fs(rootpath):
     # We have it as a separate function because other units are using it.
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
     rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
-    fp = rootpath["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
+    with rootpath.joinpath("file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("file3.test").open("wt") as fp:
+        fp.write("123")
+    with rootpath.joinpath("dir1", "file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("dir2", "file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("dir3", "file3.test").open("wt") as fp:
+        fp.write("123")
     return rootpath
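Every hunk in this file follows the same mechanical translation from the old hscommon.path wrapper to the standard library: bracket indexing becomes joinpath(), the parent() method becomes the parent property, and open()/close() pairs become with-blocks. A small standalone sketch of the new idioms (the paths here are throwaway examples):

import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
f = root.joinpath("dir1", "file1.test")      # was: root["dir1"]["file1.test"]
f.parent.mkdir(parents=True, exist_ok=True)
with f.open("wt") as fp:                     # was: fp = ...open("w"); fp.write(...); fp.close()
    fp.write("1")
assert f.parent == root / "dir1"             # .parent is a property, not a method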
@@ -60,11 +54,10 @@ def setup_module(module):
     # and another with a more complex structure.
     testpath = Path(tempfile.mkdtemp())
     module.testpath = testpath
-    rootpath = testpath["onefile"]
+    rootpath = testpath.joinpath("onefile")
     rootpath.mkdir()
-    fp = rootpath["test.txt"].open("w")
-    fp.write("test_data")
-    fp.close()
+    with rootpath.joinpath("test.txt").open("wt") as fp:
+        fp.write("test_data")
     create_fake_fs(testpath)
@@ -80,41 +73,41 @@ def test_empty():
 def test_add_path():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(1, len(d))
     assert p in d
-    assert (p["foobar"]) in d
-    assert p.parent() not in d
-    p = testpath["fs"]
+    assert (p.joinpath("foobar")) in d
+    assert p.parent not in d
+    p = testpath.joinpath("fs")
     d.add_path(p)
     eq_(2, len(d))
     assert p in d
-def test_AddPath_when_path_is_already_there():
+def test_add_path_when_path_is_already_there():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     with raises(AlreadyThereError):
         d.add_path(p)
     with raises(AlreadyThereError):
-        d.add_path(p["foobar"])
+        d.add_path(p.joinpath("foobar"))
     eq_(1, len(d))
 def test_add_path_containing_paths_already_there():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(1, len(d))
     d.add_path(testpath)
     eq_(len(d), 1)
     eq_(d[0], testpath)
-def test_AddPath_non_latin(tmpdir):
+def test_add_path_non_latin(tmpdir):
     p = Path(str(tmpdir))
-    to_add = p["unicode\u201a"]
+    to_add = p.joinpath("unicode\u201a")
     os.mkdir(str(to_add))
     d = Directories()
     try:
@@ -125,61 +118,61 @@ def test_AddPath_non_latin(tmpdir):
 def test_del():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     try:
         del d[1]
         assert False
     except IndexError:
         pass
-    d.add_path(testpath["fs"])
+    d.add_path(testpath.joinpath("fs"))
     del d[1]
     eq_(1, len(d))
 def test_states():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
-    eq_(DirectoryState.Normal, d.get_state(p))
-    d.set_state(p, DirectoryState.Reference)
-    eq_(DirectoryState.Reference, d.get_state(p))
-    eq_(DirectoryState.Reference, d.get_state(p["dir1"]))
+    eq_(DirectoryState.NORMAL, d.get_state(p))
+    d.set_state(p, DirectoryState.REFERENCE)
+    eq_(DirectoryState.REFERENCE, d.get_state(p))
+    eq_(DirectoryState.REFERENCE, d.get_state(p.joinpath("dir1")))
     eq_(1, len(d.states))
     eq_(p, list(d.states.keys())[0])
-    eq_(DirectoryState.Reference, d.states[p])
+    eq_(DirectoryState.REFERENCE, d.states[p])
 def test_get_state_with_path_not_there():
     # When the path's not there, just return DirectoryState.Normal
     d = Directories()
-    d.add_path(testpath["onefile"])
-    eq_(d.get_state(testpath), DirectoryState.Normal)
+    d.add_path(testpath.joinpath("onefile"))
+    eq_(d.get_state(testpath), DirectoryState.NORMAL)
 def test_states_overwritten_when_larger_directory_eat_smaller_ones():
     # ref #248
     # When setting the state of a folder, we overwrite previously set states for subfolders.
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
-    d.set_state(p, DirectoryState.Excluded)
+    d.set_state(p, DirectoryState.EXCLUDED)
     d.add_path(testpath)
-    d.set_state(testpath, DirectoryState.Reference)
-    eq_(d.get_state(p), DirectoryState.Reference)
-    eq_(d.get_state(p["dir1"]), DirectoryState.Reference)
-    eq_(d.get_state(testpath), DirectoryState.Reference)
+    d.set_state(testpath, DirectoryState.REFERENCE)
+    eq_(d.get_state(p), DirectoryState.REFERENCE)
+    eq_(d.get_state(p.joinpath("dir1")), DirectoryState.REFERENCE)
+    eq_(d.get_state(testpath), DirectoryState.REFERENCE)
 def test_get_files():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.Reference)
-    d.set_state(p["dir2"], DirectoryState.Excluded)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     files = list(d.get_files())
     eq_(5, len(files))
     for f in files:
-        if f.path.parent() == p["dir1"]:
+        if f.path.parent == p.joinpath("dir1"):
             assert f.is_ref
         else:
             assert not f.is_ref
@@ -193,7 +186,7 @@ def test_get_files_with_folders():
             return True
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = list(d.get_files(fileclasses=[FakeFile]))
     # We have the 3 root files and the 3 root dirs
@@ -202,25 +195,25 @@ def test_get_files_with_folders():
 def test_get_folders():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.Reference)
-    d.set_state(p["dir2"], DirectoryState.Excluded)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     folders = list(d.get_folders())
     eq_(len(folders), 3)
     ref = [f for f in folders if f.is_ref]
     not_ref = [f for f in folders if not f.is_ref]
     eq_(len(ref), 1)
-    eq_(ref[0].path, p["dir1"])
+    eq_(ref[0].path, p.joinpath("dir1"))
     eq_(len(not_ref), 2)
     eq_(ref[0].size, 1)
 def test_get_files_with_inherited_exclusion():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
-    d.set_state(p, DirectoryState.Excluded)
+    d.set_state(p, DirectoryState.EXCLUDED)
     eq_([], list(d.get_files()))
@@ -233,14 +226,14 @@ def test_save_and_load(tmpdir):
     p2.mkdir()
     d1.add_path(p1)
     d1.add_path(p2)
-    d1.set_state(p1, DirectoryState.Reference)
-    d1.set_state(p1["dir1"], DirectoryState.Excluded)
+    d1.set_state(p1, DirectoryState.REFERENCE)
+    d1.set_state(p1.joinpath("dir1"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     d1.save_to_file(tmpxml)
     d2.load_from_file(tmpxml)
     eq_(2, len(d2))
-    eq_(DirectoryState.Reference, d2.get_state(p1))
-    eq_(DirectoryState.Excluded, d2.get_state(p1["dir1"]))
+    eq_(DirectoryState.REFERENCE, d2.get_state(p1))
+    eq_(DirectoryState.EXCLUDED, d2.get_state(p1.joinpath("dir1")))
 def test_invalid_path():
@@ -254,7 +247,12 @@ def test_invalid_path():
 def test_set_state_on_invalid_path():
     d = Directories()
     try:
-        d.set_state(Path("foobar",), DirectoryState.Normal)
+        d.set_state(
+            Path(
+                "foobar",
+            ),
+            DirectoryState.NORMAL,
+        )
     except LookupError:
         assert False
@@ -263,7 +261,7 @@ def test_load_from_file_with_invalid_path(tmpdir):
     # This test simulates a load from file resulting in a
    # InvalidPath raise. Other directories must be loaded.
     d1 = Directories()
-    d1.add_path(testpath["onefile"])
+    d1.add_path(testpath.joinpath("onefile"))
     # Will raise InvalidPath upon loading
     p = Path(str(tmpdir.join("toremove")))
     p.mkdir()
@@ -278,11 +276,11 @@ def test_load_from_file_with_invalid_path(tmpdir):
 def test_unicode_save(tmpdir):
     d = Directories()
-    p1 = Path(str(tmpdir))["hello\xe9"]
+    p1 = Path(str(tmpdir), "hello\xe9")
     p1.mkdir()
-    p1["foo\xe9"].mkdir()
+    p1.joinpath("foo\xe9").mkdir()
     d.add_path(p1)
-    d.set_state(p1["foo\xe9"], DirectoryState.Excluded)
+    d.set_state(p1.joinpath("foo\xe9"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     try:
         d.save_to_file(tmpxml)
@@ -292,12 +290,12 @@ def test_unicode_save(tmpdir):
 def test_get_files_refreshes_its_directories():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = d.get_files()
     eq_(6, len(list(files)))
     time.sleep(1)
-    os.remove(str(p["dir1"]["file1.test"]))
+    os.remove(str(p.joinpath("dir1", "file1.test")))
     files = d.get_files()
     eq_(5, len(list(files)))
@@ -306,54 +304,56 @@ def test_get_files_does_not_choke_on_non_existing_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
     d.add_path(p)
-    p.rmtree()
+    shutil.rmtree(str(p))
     eq_([], list(d.get_files()))
 def test_get_state_returns_excluded_by_default_for_hidden_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
-    hidden_dir_path = p[".foo"]
-    p[".foo"].mkdir()
+    hidden_dir_path = p.joinpath(".foo")
+    p.joinpath(".foo").mkdir()
     d.add_path(p)
-    eq_(d.get_state(hidden_dir_path), DirectoryState.Excluded)
+    eq_(d.get_state(hidden_dir_path), DirectoryState.EXCLUDED)
     # But it can be overriden
-    d.set_state(hidden_dir_path, DirectoryState.Normal)
-    eq_(d.get_state(hidden_dir_path), DirectoryState.Normal)
+    d.set_state(hidden_dir_path, DirectoryState.NORMAL)
+    eq_(d.get_state(hidden_dir_path), DirectoryState.NORMAL)
 def test_default_path_state_override(tmpdir):
     # It's possible for a subclass to override the default state of a path
     class MyDirectories(Directories):
         def _default_state_for_path(self, path):
-            if "foobar" in path:
-                return DirectoryState.Excluded
+            if "foobar" in path.parts:
+                return DirectoryState.EXCLUDED
     d = MyDirectories()
     p1 = Path(str(tmpdir))
-    p1["foobar"].mkdir()
-    p1["foobar/somefile"].open("w").close()
-    p1["foobaz"].mkdir()
-    p1["foobaz/somefile"].open("w").close()
+    p1.joinpath("foobar").mkdir()
+    p1.joinpath("foobar/somefile").touch()
+    p1.joinpath("foobaz").mkdir()
+    p1.joinpath("foobaz/somefile").touch()
     d.add_path(p1)
-    eq_(d.get_state(p1["foobaz"]), DirectoryState.Normal)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.Excluded)
+    eq_(d.get_state(p1.joinpath("foobaz")), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.EXCLUDED)
     eq_(len(list(d.get_files())), 1)  # only the 'foobaz' file is there
     # However, the default state can be changed
-    d.set_state(p1["foobar"], DirectoryState.Normal)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.Normal)
+    d.set_state(p1.joinpath("foobar"), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.NORMAL)
     eq_(len(list(d.get_files())), 2)
-class TestExcludeList():
+class TestExcludeList:
     def setup_method(self, method):
         self.d = Directories(exclude_list=ExcludeList(union_regex=False))
     def get_files_and_expect_num_result(self, num_result):
         """Calls get_files(), get the filenames only, print for debugging.
         num_result is how many files are expected as a result."""
-        print(f"EXCLUDED REGEX: paths {self.d._exclude_list.compiled_paths} \
-files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled}")
+        print(
+            f"EXCLUDED REGEX: paths {self.d._exclude_list.compiled_paths} \
+files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled}"
+        )
         files = list(self.d.get_files())
         files = [file.name for file in files]
         print(f"FINAL FILES {files}")
@@ -365,42 +365,42 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         self.d._exclude_list.add(regex)
         self.d._exclude_list.mark(regex)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
         self.d.add_path(p1)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.Excluded)
-        # By default, subdirs should be excluded too, but this can be overriden separately
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Excluded)
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.Normal)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Normal)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
+        # By default, subdirs should be excluded too, but this can be overridden separately
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
     def test_exclude_refined(self, tmpdir):
         regex1 = r"^\$Recycle\.Bin$"
         self.d._exclude_list.add(regex1)
         self.d._exclude_list.mark(regex1)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somefile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["somesubdirfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"]["unwanted_subdirfile.gif"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"].mkdir()
-        p1["$Recycle.Bin"]["subdar"]["somesubdarfile.jpeg"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"]["unwanted_subdarfile.png"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somefile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "somesubdirfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "subdir", "unwanted_subdirfile.gif").touch()
+        p1.joinpath("$Recycle.Bin", "subdar").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdar", "somesubdarfile.jpeg").touch()
+        p1.joinpath("$Recycle.Bin", "subdar", "unwanted_subdarfile.png").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))
         # Filter should set the default state to Excluded
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.Excluded)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
         # The subdir should inherit its parent state
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Excluded)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.Excluded)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
         # Override a child path's state
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.Normal)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Normal)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # Parent should keep its default state, and the other child too
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.Excluded)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.Excluded)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         # only the 2 files directly under the Normal directory
@@ -412,8 +412,8 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert "somesubdirfile.png" in files
         assert "unwanted_subdirfile.gif" in files
         # Overriding the parent should enable all children
-        self.d.set_state(p1["$Recycle.Bin"], DirectoryState.Normal)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.Normal)
+        self.d.set_state(p1.joinpath("$Recycle.Bin"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.NORMAL)
         # all files there
         files = self.get_files_and_expect_num_result(6)
         assert "somefile.png" in files
@@ -437,7 +437,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert self.d._exclude_list.error(regex3) is None
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         # Directory shouldn't change its state here, unless explicitely done by user
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Normal)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         files = self.get_files_and_expect_num_result(5)
         assert "unwanted_subdirfile.gif" not in files
         assert "unwanted_subdarfile.png" in files
@@ -446,15 +446,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex4 = r".*subdir$"
         self.d._exclude_list.rename(regex3, regex4)
         assert self.d._exclude_list.error(regex4) is None
-        p1["$Recycle.Bin"]["subdar"]["file_ending_with_subdir"].open("w").close()
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Excluded)
+        p1.joinpath("$Recycle.Bin", "subdar", "file_ending_with_subdir").touch()
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
         files = self.get_files_and_expect_num_result(4)
         assert "file_ending_with_subdir" not in files
         assert "somesubdarfile.jpeg" in files
         assert "somesubdirfile.png" not in files
         assert "unwanted_subdirfile.gif" not in files
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.Normal)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Normal)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         files = self.get_files_and_expect_num_result(6)
         assert "file_ending_with_subdir" not in files
@@ -464,31 +464,54 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex5 = r".*subdir.*"
         self.d._exclude_list.rename(regex4, regex5)
         # Files containing substring should be filtered
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.Normal)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # The path should not match, only the filename, the "subdir" in the directory name shouldn't matter
-        p1["$Recycle.Bin"]["subdir"]["file_which_shouldnt_match"].open("w").close()
+        p1.joinpath("$Recycle.Bin", "subdir", "file_which_shouldnt_match").touch()
         files = self.get_files_and_expect_num_result(5)
         assert "somesubdirfile.png" not in files
         assert "unwanted_subdirfile.gif" not in files
         assert "file_ending_with_subdir" not in files
         assert "file_which_shouldnt_match" in files
+        # This should match the directory only
+        regex6 = r".*/.*subdir.*/.*"
+        if ISWINDOWS:
+            regex6 = r".*\\.*subdir.*\\.*"
+        assert os.sep in regex6
+        self.d._exclude_list.rename(regex5, regex6)
+        self.d._exclude_list.remove(regex1)
+        eq_(len(self.d._exclude_list.compiled), 1)
+        assert regex1 not in self.d._exclude_list
+        assert regex5 not in self.d._exclude_list
+        assert self.d._exclude_list.error(regex6) is None
+        assert regex6 in self.d._exclude_list
+        # This still should not be affected
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
+        files = self.get_files_and_expect_num_result(5)
+        # These files are under the "/subdir" directory
+        assert "somesubdirfile.png" not in files
+        assert "unwanted_subdirfile.gif" not in files
+        # This file under "subdar" directory should not be filtered out
+        assert "file_ending_with_subdir" in files
+        # This file is in a directory that should be filtered out
+        assert "file_which_shouldnt_match" not in files
     def test_japanese_unicode(self, tmpdir):
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somerecycledfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["過去白濁物語~]_カラー.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["思叫物語"].mkdir()
-        p1["$Recycle.Bin"]["思叫物語"]["なししろ会う前"].open("w").close()
-        p1["$Recycle.Bin"]["思叫物語"]["堂~ロ"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somerecycledfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "過去白濁物語~]_カラー.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "思叫物語").mkdir()
+        p1.joinpath("$Recycle.Bin", "思叫物語", "なししろ会う前").touch()
+        p1.joinpath("$Recycle.Bin", "思叫物語", "堂~ロ").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))
         regex3 = r".*物語.*"
         self.d._exclude_list.add(regex3)
         self.d._exclude_list.mark(regex3)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["思叫物語"]), DirectoryState.Excluded)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "思叫物語")), DirectoryState.EXCLUDED)
         files = self.get_files_and_expect_num_result(2)
         assert "過去白濁物語~]_カラー.jpg" not in files
         assert "なししろ会う前" not in files
@@ -497,7 +520,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex4 = r".*物語$"
         self.d._exclude_list.rename(regex3, regex4)
         assert self.d._exclude_list.error(regex4) is None
-        self.d.set_state(p1["$Recycle.Bin"]["思叫物語"], DirectoryState.Normal)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "思叫物語"), DirectoryState.NORMAL)
         files = self.get_files_and_expect_num_result(5)
         assert "過去白濁物語~]_カラー.jpg" in files
         assert "なししろ会う前" in files
@@ -509,15 +532,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         self.d._exclude_list.add(regex)
         self.d._exclude_list.mark(regex)
         p1 = Path(str(tmpdir))
-        p1["foobar"].mkdir()
-        p1["foobar"][".hidden_file.txt"].open("w").close()
-        p1["foobar"][".hidden_dir"].mkdir()
-        p1["foobar"][".hidden_dir"]["foobar.jpg"].open("w").close()
-        p1["foobar"][".hidden_dir"][".hidden_subfile.png"].open("w").close()
-        self.d.add_path(p1["foobar"])
+        p1.joinpath("foobar").mkdir()
+        p1.joinpath("foobar", ".hidden_file.txt").touch()
+        p1.joinpath("foobar", ".hidden_dir").mkdir()
+        p1.joinpath("foobar", ".hidden_dir", "foobar.jpg").touch()
+        p1.joinpath("foobar", ".hidden_dir", ".hidden_subfile.png").touch()
+        self.d.add_path(p1.joinpath("foobar"))
         # It should not inherit its parent's state originally
-        eq_(self.d.get_state(p1["foobar"][".hidden_dir"]), DirectoryState.Excluded)
-        self.d.set_state(p1["foobar"][".hidden_dir"], DirectoryState.Normal)
+        eq_(self.d.get_state(p1.joinpath("foobar", ".hidden_dir")), DirectoryState.EXCLUDED)
+        self.d.set_state(p1.joinpath("foobar", ".hidden_dir"), DirectoryState.NORMAL)
         # The files should still be filtered
         files = self.get_files_and_expect_num_result(1)
         eq_(len(self.d._exclude_list.compiled_paths), 0)
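The final assertion above (compiled_paths stays empty) leans on the split these tests keep probing: marked patterns containing the platform path separator are matched against whole paths, all others only against bare filenames. A hedged sketch of that dispatch, with illustrative names rather than the actual ExcludeList code:

import os
import re

def split_marked_patterns(marked):
    # Patterns mentioning the separator go to "paths" (matched against the
    # full path); the rest go to "files" (matched against the name only).
    sep = r"\\" if os.sep == "\\" else "/"
    paths = [re.compile(p) for p in marked if sep in p]
    files = [re.compile(p) for p in marked if sep not in p]
    return paths, files

# POSIX-style example:
paths_re, files_re = split_marked_patterns([r".*/one.*", r".*two.*"])
assert len(paths_re) == 1 and len(files_re) == 1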


@@ -10,9 +10,9 @@ from hscommon.jobprogress import job
 from hscommon.util import first
 from hscommon.testutil import eq_, log_calls
-from .base import NamedObject
-from .. import engine
-from ..engine import (
+from core.tests.base import NamedObject
+from core import engine
+from core.engine import (
     get_match,
     getwords,
     Group,
@@ -69,6 +69,10 @@ class TestCasegetwords:
         eq_(["a", "b", "c", "d"], getwords("a b c d"))
         eq_(["a", "b", "c", "d"], getwords(" a b c d "))
+    def test_unicode(self):
+        eq_(["e", "c", "0", "a", "o", "u", "e", "u"], getwords("é ç 0 à ö û è ¤ ù"))
+        eq_(["02", "君のこころは輝いてるかい?", "国木田花丸", "solo", "ver"], getwords("02 君のこころは輝いてるかい? 国木田花丸 Solo Ver"))
     def test_splitter_chars(self):
         eq_(
             [chr(i) for i in range(ord("a"), ord("z") + 1)],
@@ -85,7 +89,7 @@ class TestCasegetwords:
         eq_(["foo", "bar"], getwords("FOO BAR"))
     def test_decompose_unicode(self):
-        eq_(getwords("foo\xe9bar"), ["fooebar"])
+        eq_(["fooebar"], getwords("foo\xe9bar"))
 class TestCasegetfields:
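test_decompose_unicode pins getwords to NFD-style decomposition: accented letters reduce to their base letter once the combining marks are stripped. The standard-library version of that step looks like this (a sketch of the behaviour the test asserts, not necessarily getwords' exact code):

import unicodedata

def strip_accents(s):
    # NFD splits "\xe9" into "e" plus a combining acute accent; dropping the
    # combining marks leaves the base letters, so "foo\xe9bar" -> "fooebar".
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

assert strip_accents("foo\xe9bar") == "fooebar"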
@@ -99,10 +103,9 @@ class TestCasegetfields:
         expected = [["a", "bc", "def"]]
         actual = getfields(" - a bc def")
         eq_(expected, actual)
-        expected = [["bc", "def"]]
-class TestCaseunpack_fields:
+class TestCaseUnpackFields:
     def test_with_fields(self):
         expected = ["a", "b", "c", "d", "e", "f"]
         actual = unpack_fields([["a"], ["b", "c"], ["d", "e", "f"]])
@@ -173,9 +176,7 @@ class TestCaseWordCompareWithFields:
     def test_simple(self):
         eq_(
             67,
-            compare_fields(
-                [["a", "b"], ["c", "d", "e"]], [["a", "b"], ["c", "d", "f"]]
-            ),
+            compare_fields([["a", "b"], ["c", "d", "e"]], [["a", "b"], ["c", "d", "f"]]),
         )
     def test_empty(self):
@@ -216,24 +217,24 @@ class TestCaseWordCompareWithFields:
         eq_([["c", "d", "f"], ["a", "b"]], second)
-class TestCasebuild_word_dict:
+class TestCaseBuildWordDict:
     def test_with_standard_words(self):
-        itemList = [NamedObject("foo bar", True)]
-        itemList.append(NamedObject("bar baz", True))
-        itemList.append(NamedObject("baz bleh foo", True))
-        d = build_word_dict(itemList)
+        item_list = [NamedObject("foo bar", True)]
+        item_list.append(NamedObject("bar baz", True))
+        item_list.append(NamedObject("baz bleh foo", True))
+        d = build_word_dict(item_list)
         eq_(4, len(d))
         eq_(2, len(d["foo"]))
-        assert itemList[0] in d["foo"]
-        assert itemList[2] in d["foo"]
+        assert item_list[0] in d["foo"]
+        assert item_list[2] in d["foo"]
         eq_(2, len(d["bar"]))
-        assert itemList[0] in d["bar"]
-        assert itemList[1] in d["bar"]
+        assert item_list[0] in d["bar"]
+        assert item_list[1] in d["bar"]
         eq_(2, len(d["baz"]))
-        assert itemList[1] in d["baz"]
-        assert itemList[2] in d["baz"]
+        assert item_list[1] in d["baz"]
+        assert item_list[2] in d["baz"]
         eq_(1, len(d["bleh"]))
-        assert itemList[2] in d["bleh"]
+        assert item_list[2] in d["bleh"]
     def test_unpack_fields(self):
         o = NamedObject("")
@@ -261,31 +262,29 @@ class TestCasebuild_word_dict:
         j = job.Job(1, do_progress)
         self.log = []
         s = "foo bar"
-        build_word_dict(
-            [NamedObject(s, True), NamedObject(s, True), NamedObject(s, True)], j
-        )
+        build_word_dict([NamedObject(s, True), NamedObject(s, True), NamedObject(s, True)], j)
         # We don't have intermediate log because iter_with_progress is called with every > 1
         eq_(0, self.log[0])
         eq_(100, self.log[1])
-class TestCasemerge_similar_words:
+class TestCaseMergeSimilarWords:
     def test_some_similar_words(self):
         d = {
-            "foobar": set([1]),
-            "foobar1": set([2]),
-            "foobar2": set([3]),
+            "foobar": {1},
+            "foobar1": {2},
+            "foobar2": {3},
         }
         merge_similar_words(d)
         eq_(1, len(d))
         eq_(3, len(d["foobar"]))
-class TestCasereduce_common_words:
+class TestCaseReduceCommonWords:
     def test_typical(self):
         d = {
-            "foo": set([NamedObject("foo bar", True) for i in range(50)]),
-            "bar": set([NamedObject("foo bar", True) for i in range(49)]),
+            "foo": {NamedObject("foo bar", True) for _ in range(50)},
+            "bar": {NamedObject("foo bar", True) for _ in range(49)},
         }
         reduce_common_words(d, 50)
         assert "foo" not in d
@@ -293,11 +292,8 @@ class TestCasereduce_common_words:
     def test_dont_remove_objects_with_only_common_words(self):
         d = {
-            "common": set(
-                [NamedObject("common uncommon", True) for i in range(50)]
-                + [NamedObject("common", True)]
-            ),
-            "uncommon": set([NamedObject("common uncommon", True)]),
+            "common": set([NamedObject("common uncommon", True) for _ in range(50)] + [NamedObject("common", True)]),
+            "uncommon": {NamedObject("common uncommon", True)},
         }
         reduce_common_words(d, 50)
         eq_(1, len(d["common"]))
@@ -305,23 +301,20 @@ class TestCasereduce_common_words:
     def test_values_still_are_set_instances(self):
         d = {
-            "common": set(
-                [NamedObject("common uncommon", True) for i in range(50)]
-                + [NamedObject("common", True)]
-            ),
-            "uncommon": set([NamedObject("common uncommon", True)]),
+            "common": set([NamedObject("common uncommon", True) for _ in range(50)] + [NamedObject("common", True)]),
+            "uncommon": {NamedObject("common uncommon", True)},
         }
         reduce_common_words(d, 50)
         assert isinstance(d["common"], set)
         assert isinstance(d["uncommon"], set)
-    def test_dont_raise_KeyError_when_a_word_has_been_removed(self):
+    def test_dont_raise_keyerror_when_a_word_has_been_removed(self):
         # If a word has been removed by the reduce, an object in a subsequent common word that
         # contains the word that has been removed would cause a KeyError.
         d = {
-            "foo": set([NamedObject("foo bar baz", True) for i in range(50)]),
-            "bar": set([NamedObject("foo bar baz", True) for i in range(50)]),
-            "baz": set([NamedObject("foo bar baz", True) for i in range(49)]),
+            "foo": {NamedObject("foo bar baz", True) for _ in range(50)},
+            "bar": {NamedObject("foo bar baz", True) for _ in range(50)},
+            "baz": {NamedObject("foo bar baz", True) for _ in range(49)},
         }
         try:
             reduce_common_words(d, 50)
@@ -335,7 +328,7 @@ class TestCasereduce_common_words:
             o.words = [["foo", "bar"], ["baz"]]
             return o
-        d = {"foo": set([create_it() for i in range(50)])}
+        d = {"foo": {create_it() for _ in range(50)}}
         try:
             reduce_common_words(d, 50)
         except TypeError:
@@ -348,13 +341,9 @@ class TestCasereduce_common_words:
         # would not stay in 'bar' because 'foo' is not a common word anymore.
         only_common = NamedObject("foo bar", True)
         d = {
-            "foo": set(
-                [NamedObject("foo bar baz", True) for i in range(49)] + [only_common]
-            ),
-            "bar": set(
-                [NamedObject("foo bar baz", True) for i in range(49)] + [only_common]
-            ),
-            "baz": set([NamedObject("foo bar baz", True) for i in range(49)]),
+            "foo": set([NamedObject("foo bar baz", True) for _ in range(49)] + [only_common]),
+            "bar": set([NamedObject("foo bar baz", True) for _ in range(49)] + [only_common]),
+            "baz": {NamedObject("foo bar baz", True) for _ in range(49)},
         }
         reduce_common_words(d, 50)
         eq_(1, len(d["foo"]))
@@ -362,7 +351,7 @@ class TestCasereduce_common_words:
         eq_(49, len(d["baz"]))
-class TestCaseget_match:
+class TestCaseGetMatch:
     def test_simple(self):
         o1 = NamedObject("foo bar", True)
         o2 = NamedObject("bar bleh", True)
@@ -382,9 +371,7 @@ class TestCaseget_match:
         assert object() not in m
     def test_word_weight(self):
-        m = get_match(
-            NamedObject("foo bar", True), NamedObject("bar bleh", True), (WEIGHT_WORDS,)
-        )
+        m = get_match(NamedObject("foo bar", True), NamedObject("bar bleh", True), (WEIGHT_WORDS,))
         eq_(m.percentage, int((6.0 / 13.0) * 100))
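As a sanity check on that expected value: the 6 and 13 line up with the shared word "bar" counting its three letters once per side (3 + 3 = 6), while the denominator is the combined letter count of all four words in "foo bar" and "bar bleh" (3 + 3 + 3 + 4 = 13), so int((6.0 / 13.0) * 100) = 46.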
@@ -393,12 +380,12 @@ class TestCaseGetMatches:
         eq_(getmatches([]), [])
     def test_simple(self):
-        itemList = [
+        item_list = [
             NamedObject("foo bar"),
             NamedObject("bar bleh"),
             NamedObject("a b c foo"),
         ]
-        r = getmatches(itemList)
+        r = getmatches(item_list)
         eq_(2, len(r))
         m = first(m for m in r if m.percentage == 50)  # "foo bar" and "bar bleh"
         assert_match(m, "foo bar", "bar bleh")
@@ -406,40 +393,40 @@ class TestCaseGetMatches:
         assert_match(m, "foo bar", "a b c foo")
     def test_null_and_unrelated_objects(self):
-        itemList = [
+        item_list = [
             NamedObject("foo bar"),
             NamedObject("bar bleh"),
             NamedObject(""),
             NamedObject("unrelated object"),
         ]
-        r = getmatches(itemList)
+        r = getmatches(item_list)
         eq_(len(r), 1)
         m = r[0]
         eq_(m.percentage, 50)
         assert_match(m, "foo bar", "bar bleh")
     def test_twice_the_same_word(self):
-        itemList = [NamedObject("foo foo bar"), NamedObject("bar bleh")]
-        r = getmatches(itemList)
+        item_list = [NamedObject("foo foo bar"), NamedObject("bar bleh")]
+        r = getmatches(item_list)
         eq_(1, len(r))
     def test_twice_the_same_word_when_preworded(self):
-        itemList = [NamedObject("foo foo bar", True), NamedObject("bar bleh", True)]
-        r = getmatches(itemList)
+        item_list = [NamedObject("foo foo bar", True), NamedObject("bar bleh", True)]
+        r = getmatches(item_list)
         eq_(1, len(r))
     def test_two_words_match(self):
-        itemList = [NamedObject("foo bar"), NamedObject("foo bar bleh")]
-        r = getmatches(itemList)
+        item_list = [NamedObject("foo bar"), NamedObject("foo bar bleh")]
+        r = getmatches(item_list)
         eq_(1, len(r))
     def test_match_files_with_only_common_words(self):
         # If a word occurs more than 50 times, it is excluded from the matching process
         # The problem with the common_word_threshold is that the files containing only common
         # words will never be matched together. We *should* match them.
-        # This test assumes that the common word threashold const is 50
-        itemList = [NamedObject("foo") for i in range(50)]
-        r = getmatches(itemList)
+        # This test assumes that the common word threshold const is 50
+        item_list = [NamedObject("foo") for _ in range(50)]
+        r = getmatches(item_list)
         eq_(1225, len(r))
     def test_use_words_already_there_if_there(self):
@@ -462,28 +449,28 @@ class TestCaseGetMatches:
         eq_(100, self.log[-1])
     def test_weight_words(self):
-        itemList = [NamedObject("foo bar"), NamedObject("bar bleh")]
-        m = getmatches(itemList, weight_words=True)[0]
+        item_list = [NamedObject("foo bar"), NamedObject("bar bleh")]
+        m = getmatches(item_list, weight_words=True)[0]
         eq_(int((6.0 / 13.0) * 100), m.percentage)
     def test_similar_word(self):
-        itemList = [NamedObject("foobar"), NamedObject("foobars")]
-        eq_(len(getmatches(itemList, match_similar_words=True)), 1)
-        eq_(getmatches(itemList, match_similar_words=True)[0].percentage, 100)
-        itemList = [NamedObject("foobar"), NamedObject("foo")]
-        eq_(len(getmatches(itemList, match_similar_words=True)), 0)  # too far
-        itemList = [NamedObject("bizkit"), NamedObject("bizket")]
-        eq_(len(getmatches(itemList, match_similar_words=True)), 1)
-        itemList = [NamedObject("foobar"), NamedObject("foosbar")]
-        eq_(len(getmatches(itemList, match_similar_words=True)), 1)
+        item_list = [NamedObject("foobar"), NamedObject("foobars")]
+        eq_(len(getmatches(item_list, match_similar_words=True)), 1)
+        eq_(getmatches(item_list, match_similar_words=True)[0].percentage, 100)
+        item_list = [NamedObject("foobar"), NamedObject("foo")]
+        eq_(len(getmatches(item_list, match_similar_words=True)), 0)  # too far
+        item_list = [NamedObject("bizkit"), NamedObject("bizket")]
+        eq_(len(getmatches(item_list, match_similar_words=True)), 1)
+        item_list = [NamedObject("foobar"), NamedObject("foosbar")]
+        eq_(len(getmatches(item_list, match_similar_words=True)), 1)
     def test_single_object_with_similar_words(self):
-        itemList = [NamedObject("foo foos")]
-        eq_(len(getmatches(itemList, match_similar_words=True)), 0)
+        item_list = [NamedObject("foo foos")]
+        eq_(len(getmatches(item_list, match_similar_words=True)), 0)
     def test_double_words_get_counted_only_once(self):
-        itemList = [NamedObject("foo bar foo bleh"), NamedObject("foo bar bleh bar")]
-        m = getmatches(itemList)[0]
+        item_list = [NamedObject("foo bar foo bleh"), NamedObject("foo bar bleh bar")]
+        m = getmatches(item_list)[0]
         eq_(75, m.percentage)
     def test_with_fields(self):
@@ -503,13 +490,13 @@ class TestCaseGetMatches:
         eq_(m.percentage, 50)
     def test_only_match_similar_when_the_option_is_set(self):
-        itemList = [NamedObject("foobar"), NamedObject("foobars")]
-        eq_(len(getmatches(itemList, match_similar_words=False)), 0)
+        item_list = [NamedObject("foobar"), NamedObject("foobars")]
+        eq_(len(getmatches(item_list, match_similar_words=False)), 0)
     def test_dont_recurse_do_match(self):
         # with nosetests, the stack is increased. The number has to be high enough not to be failing falsely
         sys.setrecursionlimit(200)
-        files = [NamedObject("foo bar") for i in range(201)]
+        files = [NamedObject("foo bar") for _ in range(201)]
         try:
             getmatches(files)
         except RuntimeError:
@@ -518,38 +505,60 @@ class TestCaseGetMatches:
         sys.setrecursionlimit(1000)
     def test_min_match_percentage(self):
-        itemList = [
+        item_list = [
             NamedObject("foo bar"),
             NamedObject("bar bleh"),
             NamedObject("a b c foo"),
         ]
-        r = getmatches(itemList, min_match_percentage=50)
+        r = getmatches(item_list, min_match_percentage=50)
         eq_(1, len(r))  # Only "foo bar" / "bar bleh" should match
-    def test_MemoryError(self, monkeypatch):
+    def test_memory_error(self, monkeypatch):
         @log_calls
         def mocked_match(first, second, flags):
             if len(mocked_match.calls) > 42:
                 raise MemoryError()
             return Match(first, second, 0)
-        objects = [NamedObject() for i in range(10)]  # results in 45 matches
+        objects = [NamedObject() for _ in range(10)]  # results in 45 matches
         monkeypatch.setattr(engine, "get_match", mocked_match)
         try:
             r = getmatches(objects)
         except MemoryError:
-            self.fail("MemorryError must be handled")
+            self.fail("MemoryError must be handled")
         eq_(42, len(r))
 class TestCaseGetMatchesByContents:
-    def test_dont_compare_empty_files(self):
-        o1, o2 = no(size=0), no(size=0)
-        assert not getmatches_by_contents([o1, o2])
+    def test_big_file_partial_hashing(self):
+        smallsize = 1
+        bigsize = 100 * 1024 * 1024  # 100MB
+        f = [
+            no("bigfoo", size=bigsize),
+            no("bigbar", size=bigsize),
+            no("smallfoo", size=smallsize),
+            no("smallbar", size=smallsize),
+        ]
+        f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+        f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+        f[2].digest = f[2].digest_partial = "bleh"
+        f[3].digest = f[3].digest_partial = "bleh"
+        r = getmatches_by_contents(f, bigsize=bigsize)
+        eq_(len(r), 2)
+        # User disabled optimization for big files, compute digests as usual
+        r = getmatches_by_contents(f, bigsize=0)
+        eq_(len(r), 2)
+        # Other file is now slightly different, digest_partial is still the same
+        f[1].digest = f[1].digest_samples = "foobardiff"
+        r = getmatches_by_contents(f, bigsize=bigsize)
+        # Successfully filter it out
+        eq_(len(r), 1)
+        r = getmatches_by_contents(f, bigsize=0)
+        eq_(len(r), 1)
 class TestCaseGroup:
-    def test_empy(self):
+    def test_empty(self):
         g = Group()
         eq_(None, g.ref)
         eq_([], g.dupes)
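The rewritten contents test above encodes the big-file optimization this release series added: for pairs at or above the bigsize threshold, cheap partial digests are compared first and sampled digests confirm the match, while below the threshold (or with bigsize=0) the full digest is used. A rough sketch of that decision order, reusing the attribute names from the test (the real getmatches_by_contents also pre-groups candidates, e.g. by size; this is illustrative only):

def contents_match_sketch(f1, f2, bigsize):
    # Illustrative decision order only, not dupeGuru's actual implementation.
    if f1.size != f2.size:
        return False  # different sizes can never be duplicates
    if bigsize and f1.size >= bigsize:
        if f1.digest_partial != f2.digest_partial:
            return False  # first-chunk hashes differ: cheap early exit
        return f1.digest_samples == f2.digest_samples  # confirm with sampled hashes
    return f1.digest == f2.digest  # small files: full digest comparison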
@@ -723,8 +732,7 @@ class TestCaseGroup:
         # if the ref has the same key as one or more of the dupe, run the tie_breaker func among them
         g = get_test_group()
         o1, o2, o3 = g.ordered
-        tie_breaker = lambda ref, dupe: dupe is o3
-        g.prioritize(lambda x: 0, tie_breaker)
+        g.prioritize(lambda x: 0, lambda ref, dupe: dupe is o3)
         assert g.ref is o3
     def test_prioritize_with_tie_breaker_runs_on_all_dupes(self):
@@ -735,8 +743,7 @@ class TestCaseGroup:
         o1.foo = 1
         o2.foo = 2
         o3.foo = 3
-        tie_breaker = lambda ref, dupe: dupe.foo > ref.foo
-        g.prioritize(lambda x: 0, tie_breaker)
+        g.prioritize(lambda x: 0, lambda ref, dupe: dupe.foo > ref.foo)
         assert g.ref is o3
     def test_prioritize_with_tie_breaker_runs_only_on_tie_dupes(self):
@@ -749,9 +756,7 @@ class TestCaseGroup:
         o1.bar = 1
         o2.bar = 2
         o3.bar = 3
-        key_func = lambda x: -x.foo
-        tie_breaker = lambda ref, dupe: dupe.bar > ref.bar
-        g.prioritize(key_func, tie_breaker)
+        g.prioritize(lambda x: -x.foo, lambda ref, dupe: dupe.bar > ref.bar)
         assert g.ref is o2
     def test_prioritize_with_ref_dupe(self):
@@ -792,14 +797,14 @@ class TestCaseGroup:
         eq_(0, len(g.candidates))
-class TestCaseget_groups:
+class TestCaseGetGroups:
     def test_empty(self):
         r = get_groups([])
         eq_([], r)
     def test_simple(self):
-        itemList = [NamedObject("foo bar"), NamedObject("bar bleh")]
-        matches = getmatches(itemList)
+        item_list = [NamedObject("foo bar"), NamedObject("bar bleh")]
+        matches = getmatches(item_list)
         m = matches[0]
         r = get_groups(matches)
         eq_(1, len(r))
@@ -809,15 +814,15 @@ class TestCaseget_groups:
     def test_group_with_multiple_matches(self):
         # This results in 3 matches
-        itemList = [NamedObject("foo"), NamedObject("foo"), NamedObject("foo")]
-        matches = getmatches(itemList)
+        item_list = [NamedObject("foo"), NamedObject("foo"), NamedObject("foo")]
+        matches = getmatches(item_list)
         r = get_groups(matches)
         eq_(1, len(r))
         g = r[0]
         eq_(3, len(g))
     def test_must_choose_a_group(self):
-        itemList = [
+        item_list = [
             NamedObject("a b"),
             NamedObject("a b"),
             NamedObject("b c"),
@@ -826,13 +831,13 @@ class TestCaseget_groups:
         ]
         # There will be 2 groups here: group "a b" and group "c d"
         # "b c" can go either of them, but not both.
-        matches = getmatches(itemList)
+        matches = getmatches(item_list)
         r = get_groups(matches)
         eq_(2, len(r))
         eq_(5, len(r[0]) + len(r[1]))
     def test_should_all_go_in_the_same_group(self):
-        itemList = [
+        item_list = [
             NamedObject("a b"),
             NamedObject("a b"),
             NamedObject("a b"),
@@ -840,7 +845,7 @@ class TestCaseget_groups:
         ]
         # There will be 2 groups here: group "a b" and group "c d"
         # "b c" can fit in both, but it must be in only one of them
-        matches = getmatches(itemList)
+        matches = getmatches(item_list)
         r = get_groups(matches)
         eq_(1, len(r))
@@ -859,8 +864,8 @@ class TestCaseget_groups:
         assert o3 in g
     def test_four_sized_group(self):
-        itemList = [NamedObject("foobar") for i in range(4)]
-        m = getmatches(itemList)
+        item_list = [NamedObject("foobar") for _ in range(4)]
+        m = getmatches(item_list)
         r = get_groups(m)
         eq_(1, len(r))
         eq_(4, len(r[0]))
@@ -879,13 +884,11 @@ class TestCaseget_groups:
         # If, with a (A, B, C, D) set, all match with A, but C and D don't match with B and that the
         # (A, B) match is the highest (thus resulting in an (A, B) group), still match C and D
         # in a separate group instead of discarding them.
-        A, B, C, D = [NamedObject() for _ in range(4)]
+        A, B, C, D = (NamedObject() for _ in range(4))
         m1 = Match(A, B, 90)  # This is the strongest "A" match
         m2 = Match(A, C, 80)  # Because C doesn't match with B, it won't be in the group
         m3 = Match(A, D, 80)  # Same thing for D
-        m4 = Match(
-            C, D, 70
-        )  # However, because C and D match, they should have their own group.
+        m4 = Match(C, D, 70)  # However, because C and D match, they should have their own group.
         groups = get_groups([m1, m2, m3, m4])
         eq_(len(groups), 2)
         g1, g2 = groups


@@ -5,16 +5,13 @@
 # http://www.gnu.org/licenses/gpl-3.0.html
 import io
-# import os.path as op
 from xml.etree import ElementTree as ET
-# from pytest import raises
 from hscommon.testutil import eq_
 from hscommon.plat import ISWINDOWS
-from .base import DupeGuru
-from ..exclude import ExcludeList, ExcludeDict, default_regexes, AlreadyThereException
+from core.tests.base import DupeGuru
+from core.exclude import ExcludeList, ExcludeDict, default_regexes, AlreadyThereException
 from re import error
@@ -104,7 +101,7 @@ class TestCaseListEmpty:
         regex1 = r"one"
         regex2 = r"two"
         self.exclude_list.add(regex1)
-        assert(regex1 in self.exclude_list)
+        assert regex1 in self.exclude_list
         self.exclude_list.add(regex2)
         self.exclude_list.mark(regex1)
         self.exclude_list.mark(regex2)
@@ -113,7 +110,7 @@ class TestCaseListEmpty:
         compiled_files = [x for x in self.exclude_list.compiled_files]
         eq_(len(compiled_files), 2)
         self.exclude_list.remove(regex2)
-        assert(regex2 not in self.exclude_list)
+        assert regex2 not in self.exclude_list
         eq_(len(self.exclude_list), 1)
     def test_add_duplicate(self):
@@ -143,11 +140,7 @@ class TestCaseListEmpty:
     def test_force_add_not_compilable(self):
         """Used when loading from XML for example"""
         regex = r"one))"
-        try:
-            self.exclude_list.add(regex, forced=True)
-        except Exception as e:
-            # Should not get an exception here unless it's a duplicate regex
-            raise e
+        self.exclude_list.add(regex, forced=True)
         marked = self.exclude_list.mark(regex)
         eq_(marked, False)  # can't be marked since not compilable
         eq_(len(self.exclude_list), 1)
@@ -188,6 +181,28 @@ class TestCaseListEmpty:
         self.exclude_list.rename(regex_renamed_compilable, regex_compilable)
         eq_(self.exclude_list.is_marked(regex_compilable), True)
+    def test_rename_regex_file_to_path(self):
+        regex = r".*/one.*"
+        if ISWINDOWS:
+            regex = r".*\\one.*"
+        regex2 = r".*one.*"
+        self.exclude_list.add(regex)
+        self.exclude_list.mark(regex)
+        compiled_re = [x.pattern for x in self.exclude_list._excluded_compiled]
+        files_re = [x.pattern for x in self.exclude_list.compiled_files]
+        paths_re = [x.pattern for x in self.exclude_list.compiled_paths]
+        assert regex in compiled_re
+        assert regex not in files_re
+        assert regex in paths_re
+        self.exclude_list.rename(regex, regex2)
+        compiled_re = [x.pattern for x in self.exclude_list._excluded_compiled]
+        files_re = [x.pattern for x in self.exclude_list.compiled_files]
+        paths_re = [x.pattern for x in self.exclude_list.compiled_paths]
+        assert regex not in compiled_re
+        assert regex2 in compiled_re
+        assert regex2 in files_re
+        assert regex2 not in paths_re
     def test_restore_default(self):
         """Only unmark previously added regexes and mark the pre-defined ones"""
         regex = r"one"
@@ -209,25 +224,162 @@ class TestCaseListEmpty:
                 found = True
             if not found:
                 raise (Exception(f"Default RE {re} not found in compiled list."))
+            continue
         eq_(len(default_regexes), len(self.exclude_list.compiled))
class TestCaseListEmptyUnion(TestCaseListEmpty):
"""Same but with union regex"""
def setup_method(self, method):
self.app = DupeGuru()
self.app.exclude_list = ExcludeList(union_regex=True)
self.exclude_list = self.app.exclude_list
def test_add_mark_and_remove_regex(self):
regex1 = r"one"
regex2 = r"two"
self.exclude_list.add(regex1)
assert regex1 in self.exclude_list
self.exclude_list.add(regex2)
self.exclude_list.mark(regex1)
self.exclude_list.mark(regex2)
eq_(len(self.exclude_list), 2)
eq_(len(self.exclude_list.compiled), 1)
compiled_files = [x for x in self.exclude_list.compiled_files]
eq_(len(compiled_files), 1) # Two patterns joined together into one
assert "|" in compiled_files[0].pattern
self.exclude_list.remove(regex2)
assert regex2 not in self.exclude_list
eq_(len(self.exclude_list), 1)
def test_rename_regex_file_to_path(self):
regex = r".*/one.*"
if ISWINDOWS:
regex = r".*\\one.*"
regex2 = r".*one.*"
self.exclude_list.add(regex)
self.exclude_list.mark(regex)
eq_(len([x for x in self.exclude_list]), 1)
compiled_re = [x.pattern for x in self.exclude_list.compiled]
files_re = [x.pattern for x in self.exclude_list.compiled_files]
paths_re = [x.pattern for x in self.exclude_list.compiled_paths]
assert regex in compiled_re
assert regex not in files_re
assert regex in paths_re
self.exclude_list.rename(regex, regex2)
eq_(len([x for x in self.exclude_list]), 1)
compiled_re = [x.pattern for x in self.exclude_list.compiled]
files_re = [x.pattern for x in self.exclude_list.compiled_files]
paths_re = [x.pattern for x in self.exclude_list.compiled_paths]
assert regex not in compiled_re
assert regex2 in compiled_re
assert regex2 in files_re
assert regex2 not in paths_re
def test_restore_default(self):
"""Only unmark previously added regexes and mark the pre-defined ones"""
regex = r"one"
self.exclude_list.add(regex)
self.exclude_list.mark(regex)
self.exclude_list.restore_defaults()
eq_(len(default_regexes), self.exclude_list.marked_count)
# added regex shouldn't be marked
eq_(self.exclude_list.is_marked(regex), False)
# added regex shouldn't be in compiled list either
compiled = [x for x in self.exclude_list.compiled]
assert regex not in compiled
# Need to escape both to get the same strings after compilation
compiled_escaped = {x.encode("unicode-escape").decode() for x in compiled[0].pattern.split("|")}
default_escaped = {x.encode("unicode-escape").decode() for x in default_regexes}
assert compiled_escaped == default_escaped
eq_(len(default_regexes), len(compiled[0].pattern.split("|")))
class TestCaseDictEmpty(TestCaseListEmpty): class TestCaseDictEmpty(TestCaseListEmpty):
"""Same, but with dictionary implementation""" """Same, but with dictionary implementation"""
def setup_method(self, method): def setup_method(self, method):
self.app = DupeGuru() self.app = DupeGuru()
self.app.exclude_list = ExcludeDict(union_regex=False) self.app.exclude_list = ExcludeDict(union_regex=False)
self.exclude_list = self.app.exclude_list self.exclude_list = self.app.exclude_list
class TestCaseDictEmptyUnion(TestCaseDictEmpty):
"""Same, but with union regex"""
def setup_method(self, method):
self.app = DupeGuru()
self.app.exclude_list = ExcludeDict(union_regex=True)
self.exclude_list = self.app.exclude_list
def test_add_mark_and_remove_regex(self):
regex1 = r"one"
regex2 = r"two"
self.exclude_list.add(regex1)
assert regex1 in self.exclude_list
self.exclude_list.add(regex2)
self.exclude_list.mark(regex1)
self.exclude_list.mark(regex2)
eq_(len(self.exclude_list), 2)
eq_(len(self.exclude_list.compiled), 1)
compiled_files = [x for x in self.exclude_list.compiled_files]
# two patterns joined into one
eq_(len(compiled_files), 1)
self.exclude_list.remove(regex2)
assert regex2 not in self.exclude_list
eq_(len(self.exclude_list), 1)
def test_rename_regex_file_to_path(self):
regex = r".*/one.*"
if ISWINDOWS:
regex = r".*\\one.*"
regex2 = r".*one.*"
self.exclude_list.add(regex)
self.exclude_list.mark(regex)
marked_re = [x for marked, x in self.exclude_list if marked]
eq_(len(marked_re), 1)
compiled_re = [x.pattern for x in self.exclude_list.compiled]
files_re = [x.pattern for x in self.exclude_list.compiled_files]
paths_re = [x.pattern for x in self.exclude_list.compiled_paths]
assert regex in compiled_re
assert regex not in files_re
assert regex in paths_re
self.exclude_list.rename(regex, regex2)
compiled_re = [x.pattern for x in self.exclude_list.compiled]
files_re = [x.pattern for x in self.exclude_list.compiled_files]
paths_re = [x.pattern for x in self.exclude_list.compiled_paths]
assert regex not in compiled_re
assert regex2 in compiled_re
assert regex2 in files_re
assert regex2 not in paths_re
def test_restore_default(self):
"""Only unmark previously added regexes and mark the pre-defined ones"""
regex = r"one"
self.exclude_list.add(regex)
self.exclude_list.mark(regex)
self.exclude_list.restore_defaults()
eq_(len(default_regexes), self.exclude_list.marked_count)
# added regex shouldn't be marked
eq_(self.exclude_list.is_marked(regex), False)
# added regex shouldn't be in compiled list either
compiled = [x for x in self.exclude_list.compiled]
assert regex not in compiled
# Need to escape both to get the same strings after compilation
compiled_escaped = {x.encode("unicode-escape").decode() for x in compiled[0].pattern.split("|")}
default_escaped = {x.encode("unicode-escape").decode() for x in default_regexes}
assert compiled_escaped == default_escaped
eq_(len(default_regexes), len(compiled[0].pattern.split("|")))
def split_union(pattern_object): def split_union(pattern_object):
"""Returns list of strings for each union pattern""" """Returns list of strings for each union pattern"""
return [x for x in pattern_object.pattern.split("|")] return [x for x in pattern_object.pattern.split("|")]
class TestCaseCompiledList(): class TestCaseCompiledList:
"""Test consistency between union or and separate versions.""" """Test consistency between union or and separate versions."""
def setup_method(self, method): def setup_method(self, method):
self.e_separate = ExcludeList(union_regex=False) self.e_separate = ExcludeList(union_regex=False)
self.e_separate.restore_defaults() self.e_separate.restore_defaults()
@@ -275,6 +427,7 @@ class TestCaseCompiledList():
class TestCaseCompiledDict(TestCaseCompiledList): class TestCaseCompiledDict(TestCaseCompiledList):
"""Test the dictionary version""" """Test the dictionary version"""
def setup_method(self, method): def setup_method(self, method):
self.e_separate = ExcludeDict(union_regex=False) self.e_separate = ExcludeDict(union_regex=False)
self.e_separate.restore_defaults() self.e_separate.restore_defaults()
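The union-regex mode these new test classes exercise boils down to joining every marked pattern into one alternation, so a single compiled regex is matched per candidate string instead of one regex per pattern. A minimal sketch of the idea, using only the standard library (compile_union is an illustrative helper, not the actual ExcludeList API):

import re

def compile_union(patterns):
    # One compiled alternation instead of N separate regexes:
    # a single search() call decides whether any pattern matches.
    return re.compile("|".join(patterns))

combined = compile_union([r"one", r"two"])
assert combined.search("onetwothree") is not None
assert "|" in combined.pattern  # the same property the tests above assert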

View File

@@ -6,13 +6,48 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
-import hashlib
+import typing
+from os import urandom
-from hscommon.path import Path
+from pathlib import Path
from hscommon.testutil import eq_
from core.tests.directories_test import create_fake_fs
-from .. import fs
+from core import fs

+hasher: typing.Callable
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
+
+
+def create_fake_fs_with_random_data(rootpath):
+    rootpath = rootpath.joinpath("fs")
+    rootpath.mkdir()
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
+    data1 = urandom(200 * 1024)  # 200KiB
+    data2 = urandom(1024 * 1024)  # 1MiB
+    data3 = urandom(10 * 1024 * 1024)  # 10MiB
+    with rootpath.joinpath("file1.test").open("wb") as fp:
+        fp.write(data1)
+    with rootpath.joinpath("file2.test").open("wb") as fp:
+        fp.write(data2)
+    with rootpath.joinpath("file3.test").open("wb") as fp:
+        fp.write(data3)
+    with rootpath.joinpath("dir1", "file1.test").open("wb") as fp:
+        fp.write(data1)
+    with rootpath.joinpath("dir2", "file2.test").open("wb") as fp:
+        fp.write(data2)
+    with rootpath.joinpath("dir3", "file3.test").open("wb") as fp:
+        fp.write(data3)
+    return rootpath

def test_size_aggregates_subfiles(tmpdir):

@@ -21,24 +56,54 @@ def test_size_aggregates_subfiles(tmpdir):
    eq_(b.size, 12)

-def test_md5_aggregate_subfiles_sorted(tmpdir):
-    # dir.allfiles can return child in any order. Thus, bundle.md5 must aggregate
-    # all files' md5 it contains, but it must make sure that it does so in the
-    # same order everytime.
-    p = create_fake_fs(Path(str(tmpdir)))
+def test_digest_aggregate_subfiles_sorted(tmpdir):
+    # dir.allfiles can return child in any order. Thus, bundle.digest must aggregate
+    # all files' digests it contains, but it must make sure that it does so in the
+    # same order everytime.
+    p = create_fake_fs_with_random_data(Path(str(tmpdir)))
    b = fs.Folder(p)
-    md51 = fs.File(p["dir1"]["file1.test"]).md5
-    md52 = fs.File(p["dir2"]["file2.test"]).md5
-    md53 = fs.File(p["dir3"]["file3.test"]).md5
-    md54 = fs.File(p["file1.test"]).md5
-    md55 = fs.File(p["file2.test"]).md5
-    md56 = fs.File(p["file3.test"]).md5
-    # The expected md5 is the md5 of md5s for folders and the direct md5 for files
-    folder_md51 = hashlib.md5(md51).digest()
-    folder_md52 = hashlib.md5(md52).digest()
-    folder_md53 = hashlib.md5(md53).digest()
-    md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56)
-    eq_(b.md5, md5.digest())
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest
+    digest4 = fs.File(p.joinpath("file1.test")).digest
+    digest5 = fs.File(p.joinpath("file2.test")).digest
+    digest6 = fs.File(p.joinpath("file3.test")).digest
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest, digest)
+
+def test_partial_digest_aggregate_subfile_sorted(tmpdir):
+    p = create_fake_fs_with_random_data(Path(str(tmpdir)))
+    b = fs.Folder(p)
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_partial
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_partial
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_partial
+    digest4 = fs.File(p.joinpath("file1.test")).digest_partial
+    digest5 = fs.File(p.joinpath("file2.test")).digest_partial
+    digest6 = fs.File(p.joinpath("file3.test")).digest_partial
+    # The expected digest is the hash of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_partial, digest)
+
+    digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_samples
+    digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_samples
+    digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_samples
+    digest4 = fs.File(p.joinpath("file1.test")).digest_samples
+    digest5 = fs.File(p.joinpath("file2.test")).digest_samples
+    digest6 = fs.File(p.joinpath("file3.test")).digest_samples
+    # The expected digest is the digest of digests for folders and the direct digest for files
+    folder_digest1 = hasher(digest1).digest()
+    folder_digest2 = hasher(digest2).digest()
+    folder_digest3 = hasher(digest3).digest()
+    digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
+    eq_(b.digest_samples, digest)

def test_has_file_attrs(tmpdir):
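The aggregation rule these tests encode: a folder's digest hashes each subfolder digest once, then hashes the concatenation of those with the direct file digests, always in the same order. A rough sketch under the same xxhash-with-md5-fallback used above (folder_digest is an illustrative helper, not the real fs.Folder implementation):

try:
    import xxhash
    hasher = xxhash.xxh128
except ImportError:
    import hashlib
    hasher = hashlib.md5

def folder_digest(subfolder_digests, file_digests):
    # Hash each subfolder digest once, then hash everything concatenated.
    # Sorting keeps the result independent of directory traversal order.
    parts = [hasher(d).digest() for d in sorted(subfolder_digests)]
    parts += sorted(file_digests)
    return hasher(b"".join(parts)).digest()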

View File

@@ -10,60 +10,60 @@ from xml.etree import ElementTree as ET
from pytest import raises
from hscommon.testutil import eq_

-from ..ignore import IgnoreList
+from core.ignore import IgnoreList

def test_empty():
    il = IgnoreList()
    eq_(0, len(il))
-    assert not il.AreIgnored("foo", "bar")
+    assert not il.are_ignored("foo", "bar")

def test_simple():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    assert il.AreIgnored("foo", "bar")
-    assert il.AreIgnored("bar", "foo")
-    assert not il.AreIgnored("foo", "bleh")
-    assert not il.AreIgnored("bleh", "bar")
+    il.ignore("foo", "bar")
+    assert il.are_ignored("foo", "bar")
+    assert il.are_ignored("bar", "foo")
+    assert not il.are_ignored("foo", "bleh")
+    assert not il.are_ignored("bleh", "bar")
    eq_(1, len(il))

def test_multiple():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("foo", "bleh")
-    il.Ignore("bleh", "bar")
-    il.Ignore("aybabtu", "bleh")
-    assert il.AreIgnored("foo", "bar")
-    assert il.AreIgnored("bar", "foo")
-    assert il.AreIgnored("foo", "bleh")
-    assert il.AreIgnored("bleh", "bar")
-    assert not il.AreIgnored("aybabtu", "bar")
+    il.ignore("foo", "bar")
+    il.ignore("foo", "bleh")
+    il.ignore("bleh", "bar")
+    il.ignore("aybabtu", "bleh")
+    assert il.are_ignored("foo", "bar")
+    assert il.are_ignored("bar", "foo")
+    assert il.are_ignored("foo", "bleh")
+    assert il.are_ignored("bleh", "bar")
+    assert not il.are_ignored("aybabtu", "bar")
    eq_(4, len(il))

def test_clear():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Clear()
-    assert not il.AreIgnored("foo", "bar")
-    assert not il.AreIgnored("bar", "foo")
+    il.ignore("foo", "bar")
+    il.clear()
+    assert not il.are_ignored("foo", "bar")
+    assert not il.are_ignored("bar", "foo")
    eq_(0, len(il))

def test_add_same_twice():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("bar", "foo")
+    il.ignore("foo", "bar")
+    il.ignore("bar", "foo")
    eq_(1, len(il))

def test_save_to_xml():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("foo", "bleh")
-    il.Ignore("bleh", "bar")
+    il.ignore("foo", "bar")
+    il.ignore("foo", "bleh")
+    il.ignore("bleh", "bar")
    f = io.BytesIO()
    il.save_to_xml(f)
    f.seek(0)

@@ -73,50 +73,46 @@ def test_save_to_xml():
    eq_(len(root), 2)
    eq_(len([c for c in root if c.tag == "file"]), 2)
    f1, f2 = root[:]
-    subchildren = [c for c in f1 if c.tag == "file"] + [
-        c for c in f2 if c.tag == "file"
-    ]
+    subchildren = [c for c in f1 if c.tag == "file"] + [c for c in f2 if c.tag == "file"]
    eq_(len(subchildren), 3)

-def test_SaveThenLoad():
+def test_save_then_load():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("foo", "bleh")
-    il.Ignore("bleh", "bar")
-    il.Ignore("\u00e9", "bar")
+    il.ignore("foo", "bar")
+    il.ignore("foo", "bleh")
+    il.ignore("bleh", "bar")
+    il.ignore("\u00e9", "bar")
    f = io.BytesIO()
    il.save_to_xml(f)
    f.seek(0)
    il = IgnoreList()
    il.load_from_xml(f)
    eq_(4, len(il))
-    assert il.AreIgnored("\u00e9", "bar")
+    assert il.are_ignored("\u00e9", "bar")

-def test_LoadXML_with_empty_file_tags():
+def test_load_xml_with_empty_file_tags():
    f = io.BytesIO()
-    f.write(
-        b'<?xml version="1.0" encoding="utf-8"?><ignore_list><file><file/></file></ignore_list>'
-    )
+    f.write(b'<?xml version="1.0" encoding="utf-8"?><ignore_list><file><file/></file></ignore_list>')
    f.seek(0)
    il = IgnoreList()
    il.load_from_xml(f)
    eq_(0, len(il))

-def test_AreIgnore_works_when_a_child_is_a_key_somewhere_else():
+def test_are_ignore_works_when_a_child_is_a_key_somewhere_else():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("bar", "baz")
-    assert il.AreIgnored("bar", "foo")
+    il.ignore("foo", "bar")
+    il.ignore("bar", "baz")
+    assert il.are_ignored("bar", "foo")

def test_no_dupes_when_a_child_is_a_key_somewhere_else():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("bar", "baz")
-    il.Ignore("bar", "foo")
+    il.ignore("foo", "bar")
+    il.ignore("bar", "baz")
+    il.ignore("bar", "foo")
    eq_(2, len(il))

@@ -125,7 +121,7 @@ def test_iterate():
    il = IgnoreList()
    expected = [("foo", "bar"), ("bar", "baz"), ("foo", "baz")]
    for i in expected:
-        il.Ignore(i[0], i[1])
+        il.ignore(i[0], i[1])
    for i in il:
        expected.remove(i)  # No exception should be raised
    assert not expected  # expected should be empty

@@ -133,18 +129,18 @@ def test_iterate():
def test_filter():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("bar", "baz")
-    il.Ignore("foo", "baz")
-    il.Filter(lambda f, s: f == "bar")
+    il.ignore("foo", "bar")
+    il.ignore("bar", "baz")
+    il.ignore("foo", "baz")
+    il.filter(lambda f, s: f == "bar")
    eq_(1, len(il))
-    assert not il.AreIgnored("foo", "bar")
-    assert il.AreIgnored("bar", "baz")
+    assert not il.are_ignored("foo", "bar")
+    assert il.are_ignored("bar", "baz")

def test_save_with_non_ascii_items():
    il = IgnoreList()
-    il.Ignore("\xac", "\xbf")
+    il.ignore("\xac", "\xbf")
    f = io.BytesIO()
    try:
        il.save_to_xml(f)

@@ -155,29 +151,29 @@ def test_save_with_non_ascii_items():
def test_len():
    il = IgnoreList()
    eq_(0, len(il))
-    il.Ignore("foo", "bar")
+    il.ignore("foo", "bar")
    eq_(1, len(il))

def test_nonzero():
    il = IgnoreList()
    assert not il
-    il.Ignore("foo", "bar")
+    il.ignore("foo", "bar")
    assert il

def test_remove():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("foo", "baz")
+    il.ignore("foo", "bar")
+    il.ignore("foo", "baz")
    il.remove("bar", "foo")
    eq_(len(il), 1)
-    assert not il.AreIgnored("foo", "bar")
+    assert not il.are_ignored("foo", "bar")

def test_remove_non_existant():
    il = IgnoreList()
-    il.Ignore("foo", "bar")
-    il.Ignore("foo", "baz")
+    il.ignore("foo", "bar")
+    il.ignore("foo", "baz")
    with raises(ValueError):
        il.remove("foo", "bleh")

View File

@@ -6,7 +6,7 @@
from hscommon.testutil import eq_

-from ..markable import MarkableList, Markable
+from core.markable import MarkableList, Markable

def gen():

View File

@@ -9,8 +9,8 @@
import os.path as op
from itertools import combinations

-from .base import TestApp, NamedObject, with_app, eq_
-from ..engine import Group, Match
+from core.tests.base import TestApp, NamedObject, with_app, eq_
+from core.engine import Group, Match

no = NamedObject

View File

@@ -6,7 +6,7 @@
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html

-from .base import TestApp, GetTestGroups
+from core.tests.base import TestApp, GetTestGroups

def app_with_results():

View File

@@ -12,10 +12,9 @@ from xml.etree import ElementTree as ET
from pytest import raises
from hscommon.testutil import eq_
from hscommon.util import first

-from .. import engine
-from .base import NamedObject, GetTestGroups, DupeGuru
-from ..results import Results
+from core import engine
+from core.tests.base import NamedObject, GetTestGroups, DupeGuru
+from core.results import Results

class TestCaseResultsEmpty:

@@ -117,9 +116,7 @@ class TestCaseResultsWithSomeGroups:
        assert d is g.ref

    def test_sort_groups(self):
-        self.results.make_ref(
-            self.objects[1]
-        )  # We want to make the 1024 sized object to go ref.
+        self.results.make_ref(self.objects[1])  # We want to make the 1024 sized object to go ref.
        g1, g2 = self.groups
        self.results.sort_groups("size")
        assert self.results.groups[0] is g2

@@ -129,9 +126,7 @@ class TestCaseResultsWithSomeGroups:
        assert self.results.groups[1] is g2

    def test_set_groups_when_sorted(self):
-        self.results.make_ref(
-            self.objects[1]
-        )  # We want to make the 1024 sized object to go ref.
+        self.results.make_ref(self.objects[1])  # We want to make the 1024 sized object to go ref.
        self.results.sort_groups("size")
        objects, matches, groups = GetTestGroups()
        g1, g2 = groups

@@ -341,7 +336,7 @@ class TestCaseResultsMarkings:
        def log_object(o):
            log.append(o)
            if o is self.objects[1]:
-                raise EnvironmentError("foobar")
+                raise OSError("foobar")

        log = []
        self.results.mark_all()

@@ -406,7 +401,7 @@ class TestCaseResultsMarkings:
        self.results.make_ref(d)
        eq_("0 / 3 (0.00 B / 3.00 B) duplicates marked.", self.results.stat_line)

-    def test_SaveXML(self):
+    def test_save_xml(self):
        self.results.mark(self.objects[1])
        self.results.mark_invert()
        f = io.BytesIO()

@@ -423,7 +418,7 @@ class TestCaseResultsMarkings:
        eq_("n", d1.get("marked"))
        eq_("y", d2.get("marked"))

-    def test_LoadXML(self):
+    def test_load_xml(self):
        def get_file(path):
            return [f for f in self.objects if str(f.path) == path][0]

@@ -451,7 +446,7 @@ class TestCaseResultsXML:
        self.results.groups = self.groups

    def get_file(self, path):  # use this as a callback for load_from_xml
-        return [o for o in self.objects if o.path == path][0]
+        return [o for o in self.objects if str(o.path) == path][0]

    def test_save_to_xml(self):
        self.objects[0].is_ref = True

@@ -468,7 +463,7 @@ class TestCaseResultsXML:
        eq_(6, len(g1))
        eq_(3, len([c for c in g1 if c.tag == "file"]))
        eq_(3, len([c for c in g1 if c.tag == "match"]))
-        d1, d2, d3 = [c for c in g1 if c.tag == "file"]
+        d1, d2, d3 = (c for c in g1 if c.tag == "file")
        eq_(op.join("basepath", "foo bar"), d1.get("path"))
        eq_(op.join("basepath", "bar bleh"), d2.get("path"))
        eq_(op.join("basepath", "foo bleh"), d3.get("path"))

@@ -481,7 +476,7 @@ class TestCaseResultsXML:
        eq_(3, len(g2))
        eq_(2, len([c for c in g2 if c.tag == "file"]))
        eq_(1, len([c for c in g2 if c.tag == "match"]))
-        d1, d2 = [c for c in g2 if c.tag == "file"]
+        d1, d2 = (c for c in g2 if c.tag == "file")
        eq_(op.join("basepath", "ibabtu"), d1.get("path"))
        eq_(op.join("basepath", "ibabtu"), d2.get("path"))
        eq_("n", d1.get("is_ref"))

@@ -489,7 +484,7 @@ class TestCaseResultsXML:
        eq_("ibabtu", d1.get("words"))
        eq_("ibabtu", d2.get("words"))

-    def test_LoadXML(self):
+    def test_load_xml(self):
        def get_file(path):
            return [f for f in self.objects if str(f.path) == path][0]

@@ -521,7 +516,7 @@ class TestCaseResultsXML:
        eq_(["ibabtu"], g2[0].words)
        eq_(["ibabtu"], g2[1].words)

-    def test_LoadXML_with_filename(self, tmpdir):
+    def test_load_xml_with_filename(self, tmpdir):
        def get_file(path):
            return [f for f in self.objects if str(f.path) == path][0]

@@ -533,7 +528,7 @@ class TestCaseResultsXML:
        r.load_from_xml(filename, get_file)
        eq_(2, len(r.groups))

-    def test_LoadXML_with_some_files_that_dont_exist_anymore(self):
+    def test_load_xml_with_some_files_that_dont_exist_anymore(self):
        def get_file(path):
            if path.endswith("ibabtu 2"):
                return None

@@ -549,7 +544,7 @@ class TestCaseResultsXML:
        eq_(1, len(r.groups))
        eq_(3, len(r.groups[0]))

-    def test_LoadXML_missing_attributes_and_bogus_elements(self):
+    def test_load_xml_missing_attributes_and_bogus_elements(self):
        def get_file(path):
            return [f for f in self.objects if str(f.path) == path][0]

@@ -601,9 +596,7 @@ class TestCaseResultsXML:
        matches = engine.getmatches(objects)  # we should have 5 matches
        groups = engine.get_groups(matches)  # We should have 2 groups
        for g in groups:
-            g.prioritize(
-                lambda x: objects.index(x)
-            )  # We want the dupes to be in the same order as the list is
+            g.prioritize(lambda x: objects.index(x))  # We want the dupes to be in the same order as the list is
        app = DupeGuru()
        results = Results(app)
        results.groups = groups

@@ -807,9 +800,7 @@ class TestCaseResultsFilter:
        # Now the stats should display *2* markable dupes (instead of 1)
        expected = "0 / 2 (0.00 B / 2.00 B) duplicates marked. filter: foo"
        eq_(expected, self.results.stat_line)
-        self.results.apply_filter(
-            None
-        )  # Now let's make sure our unfiltered results aren't fucked up
+        self.results.apply_filter(None)  # Now let's make sure our unfiltered results aren't fucked up
        expected = "0 / 3 (0.00 B / 3.00 B) duplicates marked."
        eq_(expected, self.results.stat_line)

View File

@@ -7,14 +7,14 @@
import pytest
from hscommon.jobprogress import job

-from hscommon.path import Path
+from pathlib import Path
from hscommon.testutil import eq_

-from .. import fs
-from ..engine import getwords, Match
-from ..ignore import IgnoreList
-from ..scanner import Scanner, ScanType
-from ..me.scanner import ScannerME
+from core import fs
+from core.engine import getwords, Match
+from core.ignore import IgnoreList
+from core.scanner import Scanner, ScanType
+from core.me.scanner import ScannerME

class NamedObject:

@@ -22,14 +22,14 @@ class NamedObject:
        if path is None:
            path = Path(name)
        else:
-            path = Path(path)[name]
+            path = Path(path, name)
        self.name = name
        self.size = size
        self.path = path
        self.words = getwords(name)

    def __repr__(self):
-        return "<NamedObject %r %r>" % (self.name, self.path)
+        return "<NamedObject {!r} {!r}>".format(self.name, self.path)

no = NamedObject

@@ -52,10 +52,13 @@ def test_empty(fake_fileexists):
def test_default_settings(fake_fileexists):
    s = Scanner()
    eq_(s.min_match_percentage, 80)
-    eq_(s.scan_type, ScanType.Filename)
+    eq_(s.scan_type, ScanType.FILENAME)
    eq_(s.mix_file_kind, True)
    eq_(s.word_weighting, False)
    eq_(s.match_similar_words, False)
+    eq_(s.size_threshold, 0)
+    eq_(s.large_size_threshold, 0)
+    eq_(s.big_file_size_threshold, 0)

def test_simple_with_default_settings(fake_fileexists):

@@ -97,7 +100,7 @@ def test_trim_all_ref_groups(fake_fileexists):
    eq_(s.discarded_file_count, 0)

-def test_priorize(fake_fileexists):
+def test_prioritize(fake_fileexists):
    s = Scanner()
    f = [
        no("foo", path="p1"),

@@ -118,36 +121,109 @@ def test_priorize(fake_fileexists):
def test_content_scan(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Contents
+    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = "foobar"
-    f[1].md5 = f[1].md5partial = "foobar"
-    f[2].md5 = f[2].md5partial = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[1].digest_samples = "bleh"
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)
    eq_(len(r[0]), 2)
-    eq_(s.discarded_file_count, 0)  # don't count the different md5 as discarded!
+    eq_(s.discarded_file_count, 0)  # don't count the different digest as discarded!

def test_content_scan_compare_sizes_first(fake_fileexists):
    class MyFile(no):
        @property
-        def md5(file):
+        def digest(self):
            raise AssertionError()

    s = Scanner()
-    s.scan_type = ScanType.Contents
+    s.scan_type = ScanType.CONTENTS
    f = [MyFile("foo", 1), MyFile("bar", 2)]
    eq_(len(s.get_dupe_groups(f)), 0)
+def test_ignore_file_size(fake_fileexists):
+    s = Scanner()
+    s.scan_type = ScanType.CONTENTS
+    small_size = 10  # 10KB
+    s.size_threshold = 0
+    large_size = 100 * 1024 * 1024  # 100MB
+    s.large_size_threshold = 0
+    f = [
+        no("smallignore1", small_size - 1),
+        no("smallignore2", small_size - 1),
+        no("small1", small_size),
+        no("small2", small_size),
+        no("large1", large_size),
+        no("large2", large_size),
+        no("largeignore1", large_size + 1),
+        no("largeignore2", large_size + 1),
+    ]
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "smallignore"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "smallignore"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "small"
+    f[3].digest = f[3].digest_partial = f[3].digest_samples = "small"
+    f[4].digest = f[4].digest_partial = f[4].digest_samples = "large"
+    f[5].digest = f[5].digest_partial = f[5].digest_samples = "large"
+    f[6].digest = f[6].digest_partial = f[6].digest_samples = "largeignore"
+    f[7].digest = f[7].digest_partial = f[7].digest_samples = "largeignore"
+
+    r = s.get_dupe_groups(f)
+    # No ignores
+    eq_(len(r), 4)
+    # Ignore smaller
+    s.size_threshold = small_size
+    r = s.get_dupe_groups(f)
+    eq_(len(r), 3)
+    # Ignore larger
+    s.size_threshold = 0
+    s.large_size_threshold = large_size
+    r = s.get_dupe_groups(f)
+    eq_(len(r), 3)
+    # Ignore both
+    s.size_threshold = small_size
+    r = s.get_dupe_groups(f)
+    eq_(len(r), 2)

+def test_big_file_partial_hashes(fake_fileexists):
+    s = Scanner()
+    s.scan_type = ScanType.CONTENTS
+
+    smallsize = 1
+    bigsize = 100 * 1024 * 1024  # 100MB
+    s.big_file_size_threshold = bigsize
+
+    f = [no("bigfoo", bigsize), no("bigbar", bigsize), no("smallfoo", smallsize), no("smallbar", smallsize)]
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = "bleh"
+    f[3].digest = f[3].digest_partial = "bleh"
+    r = s.get_dupe_groups(f)
+    eq_(len(r), 2)
+
+    # digest_partial is still the same, but the file is actually different
+    f[1].digest = f[1].digest_samples = "difffoobar"
+    # here we compare the full digests, as the user disabled the optimization
+    s.big_file_size_threshold = 0
+    r = s.get_dupe_groups(f)
+    eq_(len(r), 1)
+
+    # here we should compare the digest_samples, and see they are different
+    s.big_file_size_threshold = bigsize
+    r = s.get_dupe_groups(f)
+    eq_(len(r), 1)
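What the big-file threshold buys, in miniature: above the threshold, the scanner compares a digest of sampled chunks instead of hashing entire files. A hedged sketch of that selection logic (digest_for_compare is a hypothetical helper, not the Scanner internals):

def digest_for_compare(f, big_file_size_threshold):
    # For very large files, compare a digest of sampled chunks rather
    # than a digest of the full contents; fall back to the full digest
    # when the optimization is disabled (threshold == 0).
    if big_file_size_threshold and f.size >= big_file_size_threshold:
        return f.digest_samples
    return f.digest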
def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Contents
+    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar"), no("bleh")]
-    f[0].md5 = f[0].md5partial = "foobar"
-    f[1].md5 = f[1].md5partial = "foobar"
-    f[2].md5 = f[2].md5partial = "bleh"
+    f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
+    f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
+    f[2].digest = f[2].digest_partial = f[2].digest_samples = "bleh"
    s.min_match_percentage = 101
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)

@@ -158,17 +234,18 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
    eq_(len(r[0]), 2)

-def test_content_scan_doesnt_put_md5_in_words_at_the_end(fake_fileexists):
+def test_content_scan_doesnt_put_digest_in_words_at_the_end(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Contents
+    s.scan_type = ScanType.CONTENTS
    f = [no("foo"), no("bar")]
-    f[0].md5 = f[
-        0
-    ].md5partial = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
-    f[1].md5 = f[
-        1
-    ].md5partial = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[0].digest = f[0].digest_partial = f[
+        0
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
+    f[1].digest = f[1].digest_partial = f[
+        1
+    ].digest_samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
    r = s.get_dupe_groups(f)
+    # FIXME looks like we are missing something here?
    r[0]

@@ -229,7 +306,7 @@ def test_similar_words(fake_fileexists):
def test_fields(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Fields
+    s.scan_type = ScanType.FIELDS
    f = [no("The White Stripes - Little Ghost"), no("The White Stripes - Little Acorn")]
    r = s.get_dupe_groups(f)
    eq_(len(r), 0)

@@ -237,7 +314,7 @@ def test_fields(fake_fileexists):
def test_fields_no_order(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.FieldsNoOrder
+    s.scan_type = ScanType.FIELDSNOORDER
    f = [no("The White Stripes - Little Ghost"), no("Little Ghost - The White Stripes")]
    r = s.get_dupe_groups(f)
    eq_(len(r), 1)

@@ -245,7 +322,7 @@ def test_fields_no_order(fake_fileexists):
def test_tag_scan(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Tag
+    s.scan_type = ScanType.TAG
    o1 = no("foo")
    o2 = no("bar")
    o1.artist = "The White Stripes"

@@ -258,8 +335,8 @@ def test_tag_scan(fake_fileexists):
def test_tag_with_album_scan(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Tag
-    s.scanned_tags = set(["artist", "album", "title"])
+    s.scan_type = ScanType.TAG
+    s.scanned_tags = {"artist", "album", "title"}
    o1 = no("foo")
    o2 = no("bar")
    o3 = no("bleh")

@@ -278,8 +355,8 @@ def test_tag_with_album_scan(fake_fileexists):
def test_that_dash_in_tags_dont_create_new_fields(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Tag
-    s.scanned_tags = set(["artist", "album", "title"])
+    s.scan_type = ScanType.TAG
+    s.scanned_tags = {"artist", "album", "title"}
    s.min_match_percentage = 50
    o1 = no("foo")
    o2 = no("bar")

@@ -295,8 +372,8 @@ def test_that_dash_in_tags_dont_create_new_fields(fake_fileexists):
def test_tag_scan_with_different_scanned(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Tag
-    s.scanned_tags = set(["track", "year"])
+    s.scan_type = ScanType.TAG
+    s.scanned_tags = {"track", "year"}
    o1 = no("foo")
    o2 = no("bar")
    o1.artist = "The White Stripes"

@@ -313,8 +390,8 @@ def test_tag_scan_with_different_scanned(fake_fileexists):
def test_tag_scan_only_scans_existing_tags(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Tag
-    s.scanned_tags = set(["artist", "foo"])
+    s.scan_type = ScanType.TAG
+    s.scanned_tags = {"artist", "foo"}
    o1 = no("foo")
    o2 = no("bar")
    o1.artist = "The White Stripes"

@@ -327,8 +404,8 @@ def test_tag_scan_only_scans_existing_tags(fake_fileexists):
def test_tag_scan_converts_to_str(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Tag
-    s.scanned_tags = set(["track"])
+    s.scan_type = ScanType.TAG
+    s.scanned_tags = {"track"}
    o1 = no("foo")
    o2 = no("bar")
    o1.track = 42

@@ -342,8 +419,8 @@ def test_tag_scan_converts_to_str(fake_fileexists):
def test_tag_scan_non_ascii(fake_fileexists):
    s = Scanner()
-    s.scan_type = ScanType.Tag
-    s.scanned_tags = set(["title"])
+    s.scan_type = ScanType.TAG
+    s.scanned_tags = {"title"}
    o1 = no("foo")
    o2 = no("bar")
    o1.title = "foobar\u00e9"

@@ -364,8 +441,8 @@ def test_ignore_list(fake_fileexists):
    f2.path = Path("dir2/foobar")
    f3.path = Path("dir3/foobar")
    ignore_list = IgnoreList()
-    ignore_list.Ignore(str(f1.path), str(f2.path))
-    ignore_list.Ignore(str(f1.path), str(f3.path))
+    ignore_list.ignore(str(f1.path), str(f2.path))
+    ignore_list.ignore(str(f1.path), str(f3.path))
    r = s.get_dupe_groups([f1, f2, f3], ignore_list=ignore_list)
    eq_(len(r), 1)
    g = r[0]

@@ -388,8 +465,8 @@ def test_ignore_list_checks_for_unicode(fake_fileexists):
    f2.path = Path("foo2\u00e9")
    f3.path = Path("foo3\u00e9")
    ignore_list = IgnoreList()
-    ignore_list.Ignore(str(f1.path), str(f2.path))
-    ignore_list.Ignore(str(f1.path), str(f3.path))
+    ignore_list.ignore(str(f1.path), str(f2.path))
+    ignore_list.ignore(str(f1.path), str(f3.path))
    r = s.get_dupe_groups([f1, f2, f3], ignore_list=ignore_list)
    eq_(len(r), 1)
    g = r[0]

@@ -493,14 +570,16 @@ def test_dont_group_files_that_dont_exist(tmpdir):
    # In this test, we have to delete one of the files between the get_matches() part and the
    # get_groups() part.
    s = Scanner()
-    s.scan_type = ScanType.Contents
+    s.scan_type = ScanType.CONTENTS
    p = Path(str(tmpdir))
-    p["file1"].open("w").write("foo")
-    p["file2"].open("w").write("foo")
+    with p.joinpath("file1").open("w") as fp:
+        fp.write("foo")
+    with p.joinpath("file2").open("w") as fp:
+        fp.write("foo")
    file1, file2 = fs.get_files(p)

    def getmatches(*args, **kw):
-        file2.path.remove()
+        file2.path.unlink()
        return [Match(file1, file2, 100)]

    s._getmatches = getmatches
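The pathlib migration running through this file follows a consistent mapping from the old hscommon.path idioms, sketched here for reference (the paths are placeholders):

from pathlib import Path

p = Path("/tmp/example")      # hscommon: Path("/tmp/example")
child = p.joinpath("file1")   # hscommon: p["file1"]
child = Path(p, "file1")      # equivalent constructor form, as used in NamedObject
# child.unlink()              # hscommon: child.remove()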
@@ -512,23 +591,23 @@ def test_folder_scan_exclude_subfolder_matches(fake_fileexists):
    # when doing a Folders scan type, don't include matches for folders whose parent folder already
    # match.
    s = Scanner()
-    s.scan_type = ScanType.Folders
+    s.scan_type = ScanType.FOLDERS
    topf1 = no("top folder 1", size=42)
-    topf1.md5 = topf1.md5partial = b"some_md5_1"
+    topf1.digest = topf1.digest_partial = topf1.digest_samples = b"some_digest__1"
    topf1.path = Path("/topf1")
    topf2 = no("top folder 2", size=42)
-    topf2.md5 = topf2.md5partial = b"some_md5_1"
+    topf2.digest = topf2.digest_partial = topf2.digest_samples = b"some_digest__1"
    topf2.path = Path("/topf2")
    subf1 = no("sub folder 1", size=41)
-    subf1.md5 = subf1.md5partial = b"some_md5_2"
+    subf1.digest = subf1.digest_partial = subf1.digest_samples = b"some_digest__2"
    subf1.path = Path("/topf1/sub")
    subf2 = no("sub folder 2", size=41)
-    subf2.md5 = subf2.md5partial = b"some_md5_2"
+    subf2.digest = subf2.digest_partial = subf2.digest_samples = b"some_digest__2"
    subf2.path = Path("/topf2/sub")
    eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2])), 1)  # only top folders
    # however, if another folder matches a subfolder, keep in in the matches
    otherf = no("other folder", size=41)
-    otherf.md5 = otherf.md5partial = b"some_md5_2"
+    otherf.digest = otherf.digest_partial = otherf.digest_samples = b"some_digest__2"
    otherf.path = Path("/otherfolder")
    eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2, otherf])), 2)

@@ -547,21 +626,21 @@ def test_dont_count_ref_files_as_discarded(fake_fileexists):
    # However, this causes problems in "discarded" counting and we make sure here that we don't
    # report discarded matches in exact duplicate scans.
    s = Scanner()
-    s.scan_type = ScanType.Contents
+    s.scan_type = ScanType.CONTENTS
    o1 = no("foo", path="p1")
    o2 = no("foo", path="p2")
    o3 = no("foo", path="p3")
-    o1.md5 = o1.md5partial = "foobar"
-    o2.md5 = o2.md5partial = "foobar"
-    o3.md5 = o3.md5partial = "foobar"
+    o1.digest = o1.digest_partial = o1.digest_samples = "foobar"
+    o2.digest = o2.digest_partial = o2.digest_samples = "foobar"
+    o3.digest = o3.digest_partial = o3.digest_samples = "foobar"
    o1.is_ref = True
    o2.is_ref = True
    eq_(len(s.get_dupe_groups([o1, o2, o3])), 1)
    eq_(s.discarded_file_count, 0)

-def test_priorize_me(fake_fileexists):
-    # in ScannerME, bitrate goes first (right after is_ref) in priorization
+def test_prioritize_me(fake_fileexists):
+    # in ScannerME, bitrate goes first (right after is_ref) in prioritization
    s = ScannerME()
    o1, o2 = no("foo", path="p1"), no("foo", path="p2")
    o1.bitrate = 1

View File

@@ -5,6 +5,14 @@
# http://www.gnu.org/licenses/gpl-3.0.html

import time
+import sys
+import os
+import urllib.request
+import urllib.error
+import json
+import semantic_version
+import logging
+from typing import Union

from hscommon.util import format_time_decimal

@@ -58,3 +66,38 @@ def fix_surrogate_encoding(s, encoding="utf-8"):
        return s.encode(encoding, "replace").decode(encoding)
    else:
        return s

+def executable_folder():
+    return os.path.dirname(os.path.abspath(sys.argv[0]))
+
+def check_for_update(current_version: str, include_prerelease: bool = False) -> Union[None, dict]:
+    request = urllib.request.Request(
+        "https://api.github.com/repos/arsenetar/dupeguru/releases",
+        headers={"Accept": "application/vnd.github.v3+json"},
+    )
+    try:
+        with urllib.request.urlopen(request) as response:
+            if response.status != 200:
+                logging.warn(f"Error retriving updates. Status: {response.status}")
+                return None
+            try:
+                response_json = json.loads(response.read())
+            except json.JSONDecodeError as ex:
+                logging.warn(f"Error parsing updates. {ex.msg}")
+                return None
+    except urllib.error.URLError as ex:
+        logging.warn(f"Error retriving updates. {ex.reason}")
+        return None
+    new_version = semantic_version.Version(current_version)
+    new_url = None
+    for release in response_json:
+        release_version = semantic_version.Version(release["name"])
+        if new_version < release_version and (include_prerelease or not release_version.prerelease):
+            new_version = release_version
+            new_url = release["html_url"]
+    if new_url is not None:
+        return {"version": new_version, "url": new_url}
+    else:
+        return None
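For context, here is how the new helper would typically be called (a hypothetical caller; the about-box wiring is not part of this hunk):

result = check_for_update("4.2.1", include_prerelease=False)
if result is not None:
    print(f"Update available: {result['version']} at {result['url']}")
# A None return means either no newer release exists, or the check failed
# and the error was already logged.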

View File

@@ -1,3 +1,55 @@
+=== 4.3.0 (2022-07-01)
+
+* Redirect stdout from custom command to the log files (#1008)
+* Update translations
+* Fix typo in debian control file (#989)
+* Add option to profile scans
+* Update fs.py to optimize stat() calls
+* Fix error when deleting after scan (#988)
+* Update directory scanning to use os.scandir() and DirEntry objects
+* Improve performance of Directories.get_state()
+* Migrate from hscommon.path to pathlib
+* Switch file hashing to xxhash with fallback to md5
+* Add update check feature to about box
+
+=== 4.2.1 (2022-03-25)
+
+* Default to English on unsupported system language (#976)
+* Fix image viewer zoom datatype issue (#978)
+* Fix errors from window change event (#937, #980)
+* Fix deprecation warning from SQLite
+* Enforce minimum Windows version in installer (#983)
+* Fix help path for local files
+* Drop python 3.6 support
+* VS Code project settings added, yaml validation for GitHub actions
+
+=== 4.2.0 (2022-01-24)
+
+* Add Malay and Turkish
+* Add dark style for windows (#900)
+* Add caching md5 file hashes (#942)
+* Add feature to partially hash large files, with user adjustable preference (#908)
+* Add portable mode (store settings next to executable)
+* Add file association for .dupeguru files on windows
+* Add ability to pass .dupeguru file to load on startup (#902)
+* Add ability to reveal in explorer/finder (#895)
+* Switch audio tag processing from hsaudiotag to mutagen (#440)
+* Add ability to use Qt dialogs instead of native OS dialogs for some file selection operations
+* Add OS and Python details to error dialog to assist in troubleshooting
+* Add preference to ignore large files with threshold (#430)
+* Fix error on close from DetailsPanel (#857, #873)
+* Change reference background color (#894, #898)
+* Remove stripping of unicode characters when matching names (#879)
+* Fix exception when deleting in delta view (#863, #905)
+* Fix dupes only view not updating after re-prioritize results (#757, #910, #911)
+* Fix ability to drag'n'drop file/folder with certain characters in name (#897)
+* Fix window position opening partially offscreen (#653)
+* Fix TypeError in photo mode (#551)
+* Change message for when files are deleted directly (#904)
+* Add more feedback during scan (#700)
+* Add Python version check to build.py (#589)
+* General code cleanups
+* Improvements to using standardized build tooling
+* Moved CI/CD to github actions, added codeql, SonarCloud
+
=== 4.1.1 (2021-03-21)

* Add Japanese
View File

@@ -1,7 +1,7 @@
Häufig gestellte Fragen
==========================

-.. topic:: What is |appname|?
+.. topic:: What is dupeGuru?

.. only:: edition_se

@@ -25,7 +25,7 @@ Häufig gestellte Fragen
.. topic:: Was sind die Demo-Einschränkungen von dupeGuru?

-    Keine, |appname| ist `Fairware <http://open.hardcoded.net/about/>`_.
+    Keine, dupeGuru ist `Fairware <http://open.hardcoded.net/about/>`_.

.. topic:: Die Markierungsbox einer Datei, die ich löschen möchte, ist deaktiviert. Was muss ich tun?

View File

@@ -1,21 +1,13 @@
-|appname| Hilfe
+dupeGuru Hilfe
===============

.. only:: edition_se

-    Dieses Dokument ist auch auf `Englisch <http://www.hardcoded.net/dupeguru/help/en/>`__ und `Französisch <http://www.hardcoded.net/dupeguru/help/fr/>`__ verfügbar.
+    Dieses Dokument ist auch auf `Englisch <http://dupeguru.voltaicideas.net/help/en/>`__ und `Französisch <http://dupeguru.voltaicideas.net/help/fr/>`__ verfügbar.

-.. only:: edition_me
-
-    Dieses Dokument ist auch auf `Englisch <http://www.hardcoded.net/dupeguru/help/en/>`__ und `Französisch <http://www.hardcoded.net/dupeguru_me/help/fr/>`__ verfügbar.
-
-.. only:: edition_pe
-
-    Dieses Dokument ist auch auf `Englisch <http://www.hardcoded.net/dupeguru/help/en/>`__ und `Französisch <http://www.hardcoded.net/dupeguru_pe/help/fr/>`__ verfügbar.

.. only:: edition_se or edition_me

-    |appname| ist ein Tool zum Auffinden von Duplikaten auf Ihrem Computer. Es kann entweder Dateinamen oder Inhalte scannen. Der Dateiname-Scan stellt einen lockeren Suchalgorithmus zur Verfügung, der sogar Duplikate findet, die nicht den exakten selben Namen haben.
+    dupeGuru ist ein Tool zum Auffinden von Duplikaten auf Ihrem Computer. Es kann entweder Dateinamen oder Inhalte scannen. Der Dateiname-Scan stellt einen lockeren Suchalgorithmus zur Verfügung, der sogar Duplikate findet, die nicht den exakten selben Namen haben.

.. only:: edition_pe

@@ -23,7 +15,7 @@
Obwohl dupeGuru auch leicht ohne Dokumentation genutzt werden kann, ist es sinnvoll die Hilfe zu lesen. Wenn Sie nach einer Führung für den ersten Duplikatscan suchen, werfen Sie einen Blick auf die :doc:`Schnellstart <quick_start>` Sektion

-Es ist eine gute Idee |appname| aktuell zu halten. Sie können die neueste Version auf der `homepage`_ finden.
+Es ist eine gute Idee dupeGuru aktuell zu halten. Sie können die neueste Version auf http://dupeguru.voltaicideas.net finden.

Inhalte:

View File

@@ -12,7 +12,7 @@ a community around this project.
So, whatever your skills, if you're interested in contributing to dupeGuru, please do so. Normally,
this documentation should be enough to get you started, but if it isn't, then **please**,
-`let me know`_ because it's a problem that I'm committed to fix. If there's any situation where you'd
+open a discussion at https://github.com/arsenetar/dupeguru/discussions. If there's any situation where you'd
wish to contribute but some doubt you're having prevent you from going forward, please contact me.
I'd much prefer to spend the time figuring out with you whether (and how) you can contribute than
taking the chance of missing that opportunity.

@@ -82,10 +82,9 @@ agree on what should be added to the documentation.
dupeGuru. For more information about how to do that, you can refer to the `translator guide`_.

.. _been open source: https://www.hardcoded.net/articles/free-as-in-speech-fair-as-in-trade
-.. _let me know: mailto:hsoft@hardcoded.net
-.. _Source code repository: https://github.com/hsoft/dupeguru
-.. _Issue Tracker: https://github.com/hsoft/dupeguru/issues
-.. _Issue labels meaning: https://github.com/hsoft/dupeguru/wiki/issue-labels
+.. _Source code repository: https://github.com/arsenetar/dupeguru
+.. _Issue Tracker: https://github.com/arsenetar/issues
+.. _Issue labels meaning: https://github.com/arsenetar/wiki/issue-labels
.. _Sphinx: http://sphinx-doc.org/
.. _reST: http://en.wikipedia.org/wiki/ReStructuredText
-.. _translator guide: https://github.com/hsoft/dupeguru/wiki/Translator-Guide
+.. _translator guide: https://github.com/arsenetar/wiki/Translator-Guide

View File

@@ -1,12 +0,0 @@
-hscommon.jobprogress.qt
-=======================
-
-.. automodule:: hscommon.jobprogress.qt
-
-.. autosummary::
-
-    Progress
-
-.. autoclass:: Progress
-    :members:

View File

@@ -151,8 +151,6 @@ delete files" option that is offered to you when you activate Send to Trash.
files to the Trash, but delete them immediately. In some cases, for example on network storage
(NAS), this has been known to work when normal deletion didn't.

-If this fail, `HS forums`_ might be of some help.

Why is Picture mode's contents scan so slow?
--------------------------------------------

@@ -178,7 +176,6 @@ Preferences are stored elsewhere:
* Linux: ``~/.config/Hardcoded Software/dupeGuru.conf``
* Mac OS X: In the built-in ``defaults`` system, as ``com.hardcoded-software.dupeguru``

-.. _HS forums: https://forum.hardcoded.net/
-.. _Github: https://github.com/hsoft/dupeguru
-.. _open an issue: https://github.com/hsoft/dupeguru/wiki/issue-labels
+.. _Github: https://github.com/arsenetar/dupeguru
+.. _open an issue: https://github.com/arsenetar/dupeguru/wiki/issue-labels

View File

@@ -3,11 +3,11 @@ dupeGuru help
This help document is also available in these languages:

-* `French <http://www.hardcoded.net/dupeguru/help/fr>`__
-* `German <http://www.hardcoded.net/dupeguru/help/de>`__
-* `Armenian <http://www.hardcoded.net/dupeguru/help/hy>`__
-* `Russian <http://www.hardcoded.net/dupeguru/help/ru>`__
-* `Ukrainian <http://www.hardcoded.net/dupeguru/help/uk>`__
+* `French <http://dupeguru.voltaicideas.net/help/fr>`__
+* `German <http://dupeguru.voltaicideas.net/help/de>`__
+* `Armenian <http://dupeguru.voltaicideas.net/help/hy>`__
+* `Russian <http://dupeguru.voltaicideas.net/help/ru>`__
+* `Ukrainian <http://dupeguru.voltaicideas.net/help/uk>`__

dupeGuru is a tool to find duplicate files on your computer. It has three
modes, Standard, Music and Picture, with each mode having its own scan types

@@ -42,4 +42,4 @@ Indices and tables
* :ref:`genindex`
* :ref:`search`

-.. _homepage: https://www.hardcoded.net/dupeguru
+.. _homepage: https://dupeguru.voltaicideas.net/

View File

@@ -3,7 +3,7 @@ Foire aux questions
.. contents:: .. contents::
Qu'est-ce que |appname|? Qu'est-ce que dupeGuru?
------------------------ ------------------------
.. only:: edition_se .. only:: edition_se

View File

@@ -1,21 +1,13 @@
Aide |appname| Aide dupeGuru
=============== ===============
.. only:: edition_se .. only:: edition_se
Ce document est aussi disponible en `anglais <http://www.hardcoded.net/dupeguru/help/en/>`__, en `allemand <http://www.hardcoded.net/dupeguru/help/de/>`__ et en `arménien <http://www.hardcoded.net/dupeguru/help/hy/>`__. Ce document est aussi disponible en `anglais <http://dupeguru.voltaicideas.net/help/en/>`__, en `allemand <http://dupeguru.voltaicideas.net/help/de/>`__ et en `arménien <http://dupeguru.voltaicideas.net/help/hy/>`__.
.. only:: edition_me
Ce document est aussi disponible en `anglais <http://www.hardcoded.net/dupeguru_me/help/en/>`__, en `allemand <http://www.hardcoded.net/dupeguru_me/help/de/>`__ et en `arménien <http://www.hardcoded.net/dupeguru_me/help/hy/>`__.
.. only:: edition_pe
Ce document est aussi disponible en `anglais <http://www.hardcoded.net/dupeguru_pe/help/en/>`__, en `allemand <http://www.hardcoded.net/dupeguru_pe/help/de/>`__ et en `arménien <http://www.hardcoded.net/dupeguru_pe/help/hy/>`__.
.. only:: edition_se or edition_me .. only:: edition_se or edition_me
|appname| est un outil pour trouver des doublons parmi vos fichiers. Il peut comparer soit les noms de fichiers, soit le contenu. Le comparateur de nom de fichier peut trouver des doublons même si les noms ne sont pas exactement pareils. dupeGuru est un outil pour trouver des doublons parmi vos fichiers. Il peut comparer soit les noms de fichiers, soit le contenu. Le comparateur de nom de fichier peut trouver des doublons même si les noms ne sont pas exactement pareils.
.. only:: edition_pe .. only:: edition_pe
@@ -23,7 +15,7 @@ Aide |appname|
Bien que dupeGuru puisse être utilisé sans lire l'aide, une telle lecture vous permettra de bien comprendre comment l'application fonctionne. Pour un guide rapide pour une première utilisation, référez vous à la section :doc:`Démarrage Rapide <quick_start>`. Bien que dupeGuru puisse être utilisé sans lire l'aide, une telle lecture vous permettra de bien comprendre comment l'application fonctionne. Pour un guide rapide pour une première utilisation, référez vous à la section :doc:`Démarrage Rapide <quick_start>`.
C'est toujours une bonne idée de garder |appname| à jour. Vous pouvez télécharger la dernière version sur sa `page web`_. C'est toujours une bonne idée de garder dupeGuru à jour. Vous pouvez télécharger la dernière version sur sa http://dupeguru.voltaicideas.net.
Contents: Contents:

View File

@@ -1,7 +1,7 @@
Հաճախ Տրվող Հարցեր Հաճախ Տրվող Հարցեր
========================== ==========================
.. topic:: Ի՞նչ է |appname|-ը: .. topic:: Ի՞նչ է dupeGuru-ը:
.. only:: edition_se .. only:: edition_se

View File

@@ -1,21 +1,13 @@
|appname| help dupeGuru help
=============== ===============
.. only:: edition_se .. only:: edition_se
Այս փաստաթուղթը հասանելի է նաև՝ `Ֆրանսերեն <http://www.hardcoded.net/dupeguru/help/fr/>`__ և `Գերմաներեն <http://www.hardcoded.net/dupeguru/help/de/>`__. Այս փաստաթուղթը հասանելի է նաև՝ `Ֆրանսերեն <http://dupeguru.voltaicideas.net/help/fr/>`__ և `Գերմաներեն <http://dupeguru.voltaicideas.net/help/de/>`__.
.. only:: edition_me
Այս փաստաթուղթը հասանելի է նաև՝ `Ֆրանսերեն <http://www.hardcoded.net/dupeguru_me/help/fr/>`__ և `Գերմաներեն <http://www.hardcoded.net/dupeguru_me/help/de/>`__.
.. only:: edition_pe
Այս փաստաթուղթը հասանելի է նաև՝ `Ֆրանսերեն <http://www.hardcoded.net/dupeguru_pe/help/fr/>`__ և `Գերմաներեն <http://www.hardcoded.net/dupeguru_pe/help/de/>`__.
.. only:: edition_se or edition_me .. only:: edition_se or edition_me
|appname| ծրագիր է՝ գտնելու կրկնօրինակ ունեցող ֆայլեր Ձեր համակարգչում: Այն կարող է անգամ ստուգել ֆայլի անունները կան բովանդակությունը: Ֆայլի անվան ստուգման հնարավորությունները ոչ ճշգրիտ համընկման ալգորիթմով, որը կարող է գտնել ֆայլի անվան կրկնօրինակներ, անգամ եթե դրանք նույնը չեն: dupeGuru ծրագիր է՝ գտնելու կրկնօրինակ ունեցող ֆայլեր Ձեր համակարգչում: Այն կարող է անգամ ստուգել ֆայլի անունները կան բովանդակությունը: Ֆայլի անվան ստուգման հնարավորությունները ոչ ճշգրիտ համընկման ալգորիթմով, որը կարող է գտնել ֆայլի անվան կրկնօրինակներ, անգամ եթե դրանք նույնը չեն:
.. only:: edition_pe .. only:: edition_pe
@@ -23,7 +15,7 @@
Չնայած dupeGuru-ն կարող է հեշտությամբ օգտագործվել առանց օգնության, այնուհանդերձ եթե կարդաք այս ֆայլը, այն մեծապես կօգնի Ձեզ ընկալելու ծրագրի աշխատանքը: Եթե Դուք նայում եք ձեռնարկը կրկնօրինակների առաջին ստուգման համար, ապա կարող եք ընտրել :doc:`Արագ Սկիզբ <quick_start>` հատվածը: Չնայած dupeGuru-ն կարող է հեշտությամբ օգտագործվել առանց օգնության, այնուհանդերձ եթե կարդաք այս ֆայլը, այն մեծապես կօգնի Ձեզ ընկալելու ծրագրի աշխատանքը: Եթե Դուք նայում եք ձեռնարկը կրկնօրինակների առաջին ստուգման համար, ապա կարող եք ընտրել :doc:`Արագ Սկիզբ <quick_start>` հատվածը:
Շատ լավ միտք է պահելու |appname| թարմացված: Կարող եք բեռնել վեբ կայքի համապատասխան էջից `homepage`_: Շատ լավ միտք է պահելու dupeGuru թարմացված: Կարող եք բեռնել վեբ կայքի համապատասխան էջից http://dupeguru.voltaicideas.net:
Պարունակությունը. Պարունակությունը.

View File

@@ -1,7 +1,7 @@
Часто задаваемые вопросы Часто задаваемые вопросы
========================== ==========================
.. topic:: Что такое |appname|? .. topic:: Что такое dupeGuru?
.. only:: edition_se .. only:: edition_se

View File

@@ -1,21 +1,11 @@
|appname| help dupeGuru help
=============== ===============
.. only:: edition_se Этот документ также доступна на `французском <http://dupeguru.voltaicideas.net/help/fr/>`__, `немецком <http://dupeguru.voltaicideas.net/help/de/>`__ и `армянский <http://dupeguru.voltaicideas.net/help/hy/>`__.
Этот документ также доступна на `французском <http://www.hardcoded.net/dupeguru/help/fr/>`__, `немецком <http://www.hardcoded.net/dupeguru/help/de/>`__ и `армянский <http://www.hardcoded.net/dupeguru/help/hy/>`__.
.. only:: edition_me
Этот документ также доступна на `французском <http://www.hardcoded.net/dupeguru_me/help/fr/>`__, `немецкий <http://www.hardcoded.net/dupeguru_me/help/de/>`__ и `армянский <http://www.hardcoded.net/dupeguru_me/help/hy/>`__.
.. only:: edition_pe
Этот документ также доступна на `французском <http://www.hardcoded.net/dupeguru_pe/help/fr/>`__, `немецкий <http://www.hardcoded.net/dupeguru_pe/help/de/>`__ и `армянский <http://www.hardcoded.net/dupeguru_pe/help/hy/>`__.
.. only:: edition_se or edition_me .. only:: edition_se or edition_me
|appname| есть инструмент для поиска дубликатов файлов на вашем компьютере. Он может сканировать либо имен файлов или содержимого.Имя файла функций сканирования нечеткого соответствия алгоритма, который позволяет найти одинаковые имена файлов, даже если они не совсем то же самое. dupeGuru есть инструмент для поиска дубликатов файлов на вашем компьютере. Он может сканировать либо имен файлов или содержимого.Имя файла функций сканирования нечеткого соответствия алгоритма, который позволяет найти одинаковые имена файлов, даже если они не совсем то же самое.
.. only:: edition_pe .. only:: edition_pe
@@ -23,7 +13,7 @@
Хотя dupeGuru может быть легко использована без документации, чтение этого файла поможет вам освоить его. Если вы ищете руководство для вашей первой дублировать сканирования, вы можете взглянуть на раздел :doc:`Быстрый <quick_start>` Начало. Хотя dupeGuru может быть легко использована без документации, чтение этого файла поможет вам освоить его. Если вы ищете руководство для вашей первой дублировать сканирования, вы можете взглянуть на раздел :doc:`Быстрый <quick_start>` Начало.
Это хорошая идея, чтобы сохранить |appname| обновлен. Вы можете скачать последнюю версию на своей `homepage`_. Это хорошая идея, чтобы сохранить dupeGuru обновлен. Вы можете скачать последнюю версию на своей http://dupeguru.voltaicideas.net.
Содержание: Содержание:
.. toctree:: .. toctree::

View File

@@ -1,7 +1,7 @@
Часті питання Часті питання
========================== ==========================
.. topic:: Що таке |appname|? .. topic:: Що таке dupeGuru?
.. only:: edition_se .. only:: edition_se

View File

@@ -1,21 +1,13 @@
|appname| help dupeGuru help
=============== ===============
.. only:: edition_se .. only:: edition_se
Цей документ також доступна на `французькому <http://www.hardcoded.net/dupeguru/help/fr/>`__, `німецький <http://www.hardcoded.net/dupeguru/help/de/>`__ і `Вірменський <http://www.hardcoded.net/dupeguru/help/hy/>`__. Цей документ також доступна на `французькому <http://dupeguru.voltaicideas.net/help/fr/>`__, `німецький <http://dupeguru.voltaicideas.net/help/de/>`__ і `Вірменський <http://dupeguru.voltaicideas.net/help/hy/>`__.
.. only:: edition_me
Цей документ також доступна на `французькому <http://www.hardcoded.net/dupeguru_me/help/fr/>`__, `німецький <http://www.hardcoded.net/dupeguru_me/help/de/>`__ і `Вірменський <http://www.hardcoded.net/dupeguru_me/help/hy/>`__.
.. only:: edition_pe
Цей документ також доступна на `французькому <http://www.hardcoded.net/dupeguru_pe/help/fr/>`__, `німецький <http://www.hardcoded.net/dupeguru_pe/help/de/>`__ і `Вірменський <http://www.hardcoded.net/dupeguru_pe/help/hy/>`__.
.. only:: edition_se or edition_me .. only:: edition_se or edition_me
|appname| це інструмент для пошуку дублікатів файлів на вашому комп'ютері. Він може сканувати або імен файлів або вмісту. Файл функцій сканування нечіткого відповідності алгоритму, який дозволяє знайти однакові імена файлів, навіть якщо вони не зовсім те ж саме. dupeGuru це інструмент для пошуку дублікатів файлів на вашому комп'ютері. Він може сканувати або імен файлів або вмісту. Файл функцій сканування нечіткого відповідності алгоритму, який дозволяє знайти однакові імена файлів, навіть якщо вони не зовсім те ж саме.
.. only:: edition_pe .. only:: edition_pe
@@ -23,7 +15,7 @@
Хоча dupeGuru може бути легко використана без документації, читання цього файлу допоможе вам освоїти його. Якщо ви шукаєте керівництво для вашої першої дублювати сканування, ви можете поглянути на: :doc:`Quick Start <quick_start>` Хоча dupeGuru може бути легко використана без документації, читання цього файлу допоможе вам освоїти його. Якщо ви шукаєте керівництво для вашої першої дублювати сканування, ви можете поглянути на: :doc:`Quick Start <quick_start>`
Це гарна ідея, щоб зберегти |appname| оновлено. Ви можете завантажити останню версію на своєму `homepage`_. Це гарна ідея, щоб зберегти dupeGuru оновлено. Ви можете завантажити останню версію на своєму http://dupeguru.voltaicideas.net.
Contents: Contents:

hscommon/.gitignore vendored
View File

@@ -1,5 +0,0 @@
*.pyc
*.mo
*.so
.DS_Store
/docs_html

View File

@@ -9,6 +9,7 @@
"""This module is a collection of function to help in HS apps build process. """This module is a collection of function to help in HS apps build process.
""" """
from argparse import ArgumentParser
import os import os
import sys import sys
import os.path as op import os.path as op
@@ -20,24 +21,19 @@ import re
import importlib import importlib
from datetime import datetime from datetime import datetime
import glob import glob
import sysconfig from typing import Any, AnyStr, Callable, Dict, List, Union
import modulefinder
from setuptools import setup, Extension from hscommon.plat import ISWINDOWS
from .plat import ISWINDOWS
from .util import ensure_folder, delete_files_with_pattern
def print_and_do(cmd): def print_and_do(cmd: str) -> int:
"""Prints ``cmd`` and executes it in the shell. """Prints ``cmd`` and executes it in the shell."""
"""
print(cmd) print(cmd)
p = Popen(cmd, shell=True) p = Popen(cmd, shell=True)
return p.wait() return p.wait()
def _perform(src, dst, action, actionname): def _perform(src: os.PathLike, dst: os.PathLike, action: Callable, actionname: str) -> None:
if not op.lexists(src): if not op.lexists(src):
print("Copying %s failed: it doesn't exist." % src) print("Copying %s failed: it doesn't exist." % src)
return return
@@ -46,34 +42,26 @@ def _perform(src, dst, action, actionname):
shutil.rmtree(dst) shutil.rmtree(dst)
else: else:
os.remove(dst) os.remove(dst)
print("%s %s --> %s" % (actionname, src, dst)) print("{} {} --> {}".format(actionname, src, dst))
action(src, dst) action(src, dst)
def copy_file_or_folder(src, dst): def copy_file_or_folder(src: os.PathLike, dst: os.PathLike) -> None:
if op.isdir(src): if op.isdir(src):
shutil.copytree(src, dst, symlinks=True) shutil.copytree(src, dst, symlinks=True)
else: else:
shutil.copy(src, dst) shutil.copy(src, dst)
def move(src, dst): def move(src: os.PathLike, dst: os.PathLike) -> None:
_perform(src, dst, os.rename, "Moving") _perform(src, dst, os.rename, "Moving")
def copy(src, dst): def copy(src: os.PathLike, dst: os.PathLike) -> None:
_perform(src, dst, copy_file_or_folder, "Copying") _perform(src, dst, copy_file_or_folder, "Copying")
def symlink(src, dst): def _perform_on_all(pattern: AnyStr, dst: os.PathLike, action: Callable) -> None:
_perform(src, dst, os.symlink, "Symlinking")
def hardlink(src, dst):
_perform(src, dst, os.link, "Hardlinking")
def _perform_on_all(pattern, dst, action):
# pattern is a glob pattern, example "folder/foo*". The file is moved directly into dst; no folder # pattern is a glob pattern, example "folder/foo*". The file is moved directly into dst; no folder
# structure from src is kept. # structure from src is kept.
filenames = glob.glob(pattern) filenames = glob.glob(pattern)
@@ -82,44 +70,35 @@ def _perform_on_all(pattern, dst, action):
action(fn, destpath) action(fn, destpath)
def move_all(pattern, dst): def move_all(pattern: AnyStr, dst: os.PathLike) -> None:
_perform_on_all(pattern, dst, move) _perform_on_all(pattern, dst, move)
def copy_all(pattern, dst): def copy_all(pattern: AnyStr, dst: os.PathLike) -> None:
_perform_on_all(pattern, dst, copy) _perform_on_all(pattern, dst, copy)
def ensure_empty_folder(path): def filereplace(filename: os.PathLike, outfilename: Union[os.PathLike, None] = None, **kwargs) -> None:
"""Make sure that the path exists and that it's an empty folder. """Reads `filename`, replaces all {variables} in kwargs, and writes the result to `outfilename`."""
"""
if op.exists(path):
shutil.rmtree(path)
os.mkdir(path)
def filereplace(filename, outfilename=None, **kwargs):
"""Reads `filename`, replaces all {variables} in kwargs, and writes the result to `outfilename`.
"""
if outfilename is None: if outfilename is None:
outfilename = filename outfilename = filename
fp = open(filename, "rt", encoding="utf-8") fp = open(filename, encoding="utf-8")
contents = fp.read() contents = fp.read()
fp.close() fp.close()
# We can't use str.format() because in some files, there might be {} characters that mess with it. # We can't use str.format() because in some files, there might be {} characters that mess with it.
for key, item in kwargs.items(): for key, item in kwargs.items():
contents = contents.replace("{{{}}}".format(key), item) contents = contents.replace(f"{{{key}}}", item)
fp = open(outfilename, "wt", encoding="utf-8") fp = open(outfilename, "wt", encoding="utf-8")
fp.write(contents) fp.write(contents)
fp.close() fp.close()
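A brief usage sketch (file names hypothetical) of filereplace(); since replacement is a plain str.replace() per key rather than str.format(), stray braces elsewhere in the file are left alone:

# Sketch: version.txt.in is assumed to contain a line like
#   __version__ = "{version}"
from hscommon.build import filereplace

filereplace("version.txt.in", "version.txt", version="4.3.0")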
def get_module_version(modulename): def get_module_version(modulename: str) -> str:
mod = importlib.import_module(modulename) mod = importlib.import_module(modulename)
return mod.__version__ return mod.__version__
def setup_package_argparser(parser): def setup_package_argparser(parser: ArgumentParser):
parser.add_argument( parser.add_argument(
"--sign", "--sign",
dest="sign_identity", dest="sign_identity",
@@ -146,15 +125,13 @@ def setup_package_argparser(parser):
# `args` come from an ArgumentParser updated with setup_package_argparser() # `args` come from an ArgumentParser updated with setup_package_argparser()
def package_cocoa_app_in_dmg(app_path, destfolder, args): def package_cocoa_app_in_dmg(app_path: os.PathLike, destfolder: os.PathLike, args) -> None:
# Rather than signing our app in XCode during the build phase, we sign it during the package # Rather than signing our app in XCode during the build phase, we sign it during the package
# phase because running the app before packaging can modify it and we want to be sure to have # phase because running the app before packaging can modify it and we want to be sure to have
# a valid signature. # a valid signature.
if args.sign_identity: if args.sign_identity:
sign_identity = "Developer ID Application: {}".format(args.sign_identity) sign_identity = f"Developer ID Application: {args.sign_identity}"
result = print_and_do( result = print_and_do(f'codesign --force --deep --sign "{sign_identity}" "{app_path}"')
'codesign --force --deep --sign "{}" "{}"'.format(sign_identity, app_path)
)
if result != 0: if result != 0:
print("ERROR: Signing failed. Aborting packaging.") print("ERROR: Signing failed. Aborting packaging.")
return return
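For orientation (paths hypothetical), a sketch of how a packaging script would combine setup_package_argparser() with the signing-aware DMG builder shown here:

# Sketch: wire the --sign option into a parser, then package.
from argparse import ArgumentParser
from hscommon.build import setup_package_argparser, package_cocoa_app_in_dmg

parser = ArgumentParser()
setup_package_argparser(parser)
args = parser.parse_args()
package_cocoa_app_in_dmg("build/dupeGuru.app", "dist", args)  # hypothetical paths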
@@ -164,51 +141,33 @@ def package_cocoa_app_in_dmg(app_path, destfolder, args):
build_dmg(app_path, destfolder) build_dmg(app_path, destfolder)
def build_dmg(app_path, destfolder): def build_dmg(app_path: os.PathLike, destfolder: os.PathLike) -> None:
"""Builds a DMG volume with application at ``app_path`` and puts it in ``dest_path``. """Builds a DMG volume with application at ``app_path`` and puts it in ``dest_path``.
The name of the resulting DMG volume is determined by the app's name and version. The name of the resulting DMG volume is determined by the app's name and version.
""" """
print(repr(op.join(app_path, "Contents", "Info.plist"))) print(repr(op.join(app_path, "Contents", "Info.plist")))
plist = plistlib.readPlist(op.join(app_path, "Contents", "Info.plist")) with open(op.join(app_path, "Contents", "Info.plist"), "rb") as fp:
plist = plistlib.load(fp)
workpath = tempfile.mkdtemp() workpath = tempfile.mkdtemp()
dmgpath = op.join(workpath, plist["CFBundleName"]) dmgpath = op.join(workpath, plist["CFBundleName"])
os.mkdir(dmgpath) os.mkdir(dmgpath)
print_and_do('cp -R "%s" "%s"' % (app_path, dmgpath)) print_and_do('cp -R "{}" "{}"'.format(app_path, dmgpath))
print_and_do('ln -s /Applications "%s"' % op.join(dmgpath, "Applications")) print_and_do('ln -s /Applications "%s"' % op.join(dmgpath, "Applications"))
dmgname = "%s_osx_%s.dmg" % ( dmgname = "{}_osx_{}.dmg".format(
plist["CFBundleName"].lower().replace(" ", "_"), plist["CFBundleName"].lower().replace(" ", "_"),
plist["CFBundleVersion"].replace(".", "_"), plist["CFBundleVersion"].replace(".", "_"),
) )
print("Building %s" % dmgname) print("Building %s" % dmgname)
# UDBZ = bzip compression. UDZO (zip compression) was used before, but it compresses much less. # UDBZ = bzip compression. UDZO (zip compression) was used before, but it compresses much less.
print_and_do( print_and_do(
'hdiutil create "%s" -format UDBZ -nocrossdev -srcdir "%s"' 'hdiutil create "{}" -format UDBZ -nocrossdev -srcdir "{}"'.format(op.join(destfolder, dmgname), dmgpath)
% (op.join(destfolder, dmgname), dmgpath)
) )
print("Build Complete") print("Build Complete")
def copy_sysconfig_files_for_embed(destpath): def add_to_pythonpath(path: os.PathLike) -> None:
# This normally shouldn't be needed for Python 3.3+. """Adds ``path`` to both ``PYTHONPATH`` env and ``sys.path``."""
makefile = sysconfig.get_makefile_filename()
configh = sysconfig.get_config_h_filename()
shutil.copy(makefile, destpath)
shutil.copy(configh, destpath)
with open(op.join(destpath, "site.py"), "w") as fp:
fp.write(
"""
import os.path as op
from distutils import sysconfig
sysconfig.get_makefile_filename = lambda: op.join(op.dirname(__file__), 'Makefile')
sysconfig.get_config_h_filename = lambda: op.join(op.dirname(__file__), 'pyconfig.h')
"""
)
def add_to_pythonpath(path):
"""Adds ``path`` to both ``PYTHONPATH`` env and ``sys.path``.
"""
abspath = op.abspath(path) abspath = op.abspath(path)
pythonpath = os.environ.get("PYTHONPATH", "") pythonpath = os.environ.get("PYTHONPATH", "")
pathsep = ";" if ISWINDOWS else ":" pathsep = ";" if ISWINDOWS else ":"
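The diff elides the rest of add_to_pythonpath(); a minimal sketch of the likely remainder, under the assumption that it prepends the absolute path to both the environment variable and sys.path:

# Assumed continuation, reconstructed from the visible context:
import os
import os.path as op
import sys

def add_to_pythonpath_sketch(path: str) -> None:
    abspath = op.abspath(path)
    pythonpath = os.environ.get("PYTHONPATH", "")
    pathsep = ";" if os.name == "nt" else ":"
    os.environ["PYTHONPATH"] = pathsep.join(filter(None, [abspath, pythonpath]))
    sys.path.insert(1, abspath)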
@@ -220,7 +179,12 @@ def add_to_pythonpath(path):
# This is a method to hack around those freakingly tricky data inclusion/exclusion rules # This is a method to hack around those freakingly tricky data inclusion/exclusion rules
# in setuptools. We copy the packages *without data* in a build folder and then build the plugin # in setuptools. We copy the packages *without data* in a build folder and then build the plugin
# from there. # from there.
def copy_packages(packages_names, dest, create_links=False, extra_ignores=None): def copy_packages(
packages_names: List[str],
dest: os.PathLike,
create_links: bool = False,
extra_ignores: Union[List[str], None] = None,
) -> None:
"""Copy python packages ``packages_names`` to ``dest``, spurious data. """Copy python packages ``packages_names`` to ``dest``, spurious data.
Copy will happen without tests, testdata, mercurial data or C extension module source with it. Copy will happen without tests, testdata, mercurial data or C extension module source with it.
@@ -231,9 +195,7 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
create_links = False create_links = False
if not extra_ignores: if not extra_ignores:
extra_ignores = [] extra_ignores = []
ignore = shutil.ignore_patterns( ignore = shutil.ignore_patterns(".hg*", "tests", "testdata", "modules", "docs", "locale", *extra_ignores)
".hg*", "tests", "testdata", "modules", "docs", "locale", *extra_ignores
)
for package_name in packages_names: for package_name in packages_names:
if op.exists(package_name): if op.exists(package_name):
source_path = package_name source_path = package_name
@@ -249,7 +211,7 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
os.unlink(dest_path) os.unlink(dest_path)
else: else:
shutil.rmtree(dest_path) shutil.rmtree(dest_path)
print("Copying package at {0} to {1}".format(source_path, dest_path)) print(f"Copying package at {source_path} to {dest_path}")
if create_links: if create_links:
os.symlink(op.abspath(source_path), dest_path) os.symlink(op.abspath(source_path), dest_path)
else: else:
@@ -259,28 +221,14 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
shutil.copy(source_path, dest_path) shutil.copy(source_path, dest_path)
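A usage sketch (package names and ignore patterns hypothetical) of copy_packages() staging sources into a build folder:

# Sketch: copy pure-Python packages into build/, skipping tests, docs,
# locale data and anything matching the extra ignore patterns.
from hscommon.build import copy_packages

copy_packages(["core", "hscommon"], "build", extra_ignores=["*.po"])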
def copy_qt_plugins(folder_names, dest): # This is only for Windows
from PyQt5.QtCore import QLibraryInfo
qt_plugin_dir = QLibraryInfo.location(QLibraryInfo.PluginsPath)
def ignore(path, names):
if path == qt_plugin_dir:
return [n for n in names if n not in folder_names]
else:
return [n for n in names if not n.endswith(".dll")]
shutil.copytree(qt_plugin_dir, dest, ignore=ignore)
def build_debian_changelog( def build_debian_changelog(
changelogpath, changelogpath: os.PathLike,
destfile, destfile: os.PathLike,
pkgname, pkgname: str,
from_version=None, from_version: Union[str, None] = None,
distribution="precise", distribution: str = "precise",
fix_version=None, fix_version: Union[str, None] = None,
): ) -> None:
"""Builds a debian changelog out of a YAML changelog. """Builds a debian changelog out of a YAML changelog.
Use fix_version to patch the top changelog to that version (if, for example, there was a Use fix_version to patch the top changelog to that version (if, for example, there was a
@@ -333,7 +281,7 @@ def build_debian_changelog(
re_changelog_header = re.compile(r"=== ([\d.b]*) \(([\d\-]*)\)") re_changelog_header = re.compile(r"=== ([\d.b]*) \(([\d\-]*)\)")
def read_changelog_file(filename): def read_changelog_file(filename: os.PathLike) -> List[Dict[str, Any]]:
def iter_by_three(it): def iter_by_three(it):
while True: while True:
try: try:
@@ -344,10 +292,9 @@ def read_changelog_file(filename):
return return
yield version, date, description yield version, date, description
with open(filename, "rt", encoding="utf-8") as fp: with open(filename, encoding="utf-8") as fp:
contents = fp.read() contents = fp.read()
splitted = re_changelog_header.split(contents)[1:] # the first item is empty splitted = re_changelog_header.split(contents)[1:] # the first item is empty
# splitted = [version1, date1, desc1, version2, date2, ...]
result = [] result = []
for version, date_str, description in iter_by_three(iter(splitted)): for version, date_str, description in iter_by_three(iter(splitted)):
date = datetime.strptime(date_str, "%Y-%m-%d").date() date = datetime.strptime(date_str, "%Y-%m-%d").date()
@@ -361,195 +308,7 @@ def read_changelog_file(filename):
return result return result
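A worked example of that parsing step (changelog text hypothetical): the regex split yields a flat version/date/description sequence that iter_by_three() regroups:

import re

re_changelog_header = re.compile(r"=== ([\d.b]*) \(([\d\-]*)\)")
contents = "=== 4.3.0 (2022-06-30)\n* stuff\n=== 4.2.1 (2022-03-25)\n* fixes\n"
splitted = re_changelog_header.split(contents)[1:]  # first item is empty
assert splitted == ["4.3.0", "2022-06-30", "\n* stuff\n", "4.2.1", "2022-03-25", "\n* fixes\n"]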
class OSXAppStructure: def fix_qt_resource_file(path: os.PathLike) -> None:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Contents")
self.macos = op.join(self.contents, "MacOS")
self.resources = op.join(self.contents, "Resources")
self.frameworks = op.join(self.contents, "Frameworks")
self.infoplist = op.join(self.contents, "Info.plist")
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.macos)
os.mkdir(self.resources)
os.mkdir(self.frameworks)
copy(infoplist, self.infoplist)
open(op.join(self.contents, "PkgInfo"), "wt").write("APPLxxxx")
def copy_executable(self, executable):
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.macos, self.executablename)
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_frameworks(self, *frameworks):
for path in frameworks:
framework_dest = op.join(self.frameworks, op.basename(path))
copy(path, framework_dest)
def create_osx_app_structure(
dest,
executable,
infoplist,
resources=None,
frameworks=None,
symlink_resources=False,
):
# `dest`: A path to the destination .app folder
# `executable`: the path of the executable file that goes in "MacOS"
# `infoplist`: The path to your Info.plist file.
# `resources`: A list of paths of files or folders going in the "Resources" folder.
# `frameworks`: Same as above for "Frameworks".
# `symlink_resources`: If True, will symlink resources into the structure instead of copying them.
app = OSXAppStructure(dest, infoplist)
app.create()
app.copy_executable(executable)
app.copy_resources(*resources, use_symlinks=symlink_resources)
app.copy_frameworks(*frameworks)
class OSXFrameworkStructure:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Versions", "A")
self.resources = op.join(self.contents, "Resources")
self.headers = op.join(self.contents, "Headers")
self.infoplist = op.join(self.resources, "Info.plist")
self._update_executable_path()
def _update_executable_path(self):
if not op.exists(self.infoplist):
self.executablename = self.executablepath = None
return
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.contents, self.executablename)
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.contents)
os.mkdir(self.resources)
os.mkdir(self.headers)
copy(infoplist, self.infoplist)
self._update_executable_path()
def create_symlinks(self):
# Only call this after create() and copy_executable()
rel = lambda path: op.relpath(path, self.dest)
os.symlink("A", op.join(self.dest, "Versions", "Current"))
os.symlink(rel(self.executablepath), op.join(self.dest, self.executablename))
os.symlink(rel(self.headers), op.join(self.dest, "Headers"))
os.symlink(rel(self.resources), op.join(self.dest, "Resources"))
def copy_executable(self, executable):
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_headers(self, *headers, use_symlinks=False):
for path in headers:
header_dest = op.join(self.headers, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), header_dest)
def copy_embeddable_python_dylib(dst):
runtime = op.join(
sysconfig.get_config_var("PYTHONFRAMEWORKPREFIX"),
sysconfig.get_config_var("LDLIBRARY"),
)
filedest = op.join(dst, "Python")
shutil.copy(runtime, filedest)
os.chmod(filedest, 0o774) # We need write permission to use install_name_tool
cmd = "install_name_tool -id @rpath/Python %s" % filedest
print_and_do(cmd)
def collect_stdlib_dependencies(script, dest_folder, extra_deps=None):
sysprefix = sys.prefix # could be a virtualenv
basesysprefix = sys.base_prefix # seems to be path to non-virtual sys
real_lib_prefix = sysconfig.get_config_var(
"LIBDEST"
) # leaving this in case it is needed
def is_stdlib_path(path):
# A module path is only a stdlib path if it's in either sys.prefix or
# sysconfig.get_config_var('prefix') (the 2 are different if we are in a virtualenv) and if
# there's no "site-package" in the path.
if not path:
return False
if "site-package" in path:
return False
if not (
path.startswith(sysprefix)
or path.startswith(basesysprefix)
or path.startswith(real_lib_prefix)
):
return False
return True
ensure_folder(dest_folder)
mf = modulefinder.ModuleFinder()
mf.run_script(script)
modpaths = [mod.__file__ for mod in mf.modules.values()]
modpaths = filter(is_stdlib_path, modpaths)
for p in modpaths:
if p.startswith(real_lib_prefix):
relpath = op.relpath(p, real_lib_prefix)
elif p.startswith(sysprefix):
relpath = op.relpath(p, sysprefix)
assert relpath.startswith(
"lib/python3."
) # we want to get rid of that lib/python3.x part
relpath = relpath[len("lib/python3.X/") :]
elif p.startswith(basesysprefix):
relpath = op.relpath(p, basesysprefix)
assert relpath.startswith("lib/python3.")
relpath = relpath[len("lib/python3.X/") :]
else:
raise AssertionError()
if relpath.startswith(
"lib-dynload"
): # We copy .so files in lib-dynload directly in our dest
relpath = relpath[len("lib-dynload/") :]
if relpath.startswith("encodings") or relpath.startswith("distutils"):
# We force their inclusion later.
continue
dest_path = op.join(dest_folder, relpath)
ensure_folder(op.dirname(dest_path))
copy(p, dest_path)
# stringprep is used by encodings.
# We use real_lib_prefix with distutils because virtualenv messes with it and we need to refer
# to the original distutils folder.
FORCED_INCLUSION = [
"encodings",
"stringprep",
op.join(real_lib_prefix, "distutils"),
]
if extra_deps:
FORCED_INCLUSION += extra_deps
copy_packages(FORCED_INCLUSION, dest_folder)
# There's a couple of rather big exe files in the distutils folder that we absolutely don't
# need. Remove them.
delete_files_with_pattern(op.join(dest_folder, "distutils"), "*.exe")
# And, finally, create an empty "site.py" that Python needs around on startup.
open(op.join(dest_folder, "site.py"), "w").close()
def fix_qt_resource_file(path):
# pyrcc5 under Windows, if the locale is non-english, can produce a source file with a date # pyrcc5 under Windows, if the locale is non-english, can produce a source file with a date
# containing accented characters. If it does, the encoding is wrong and it prevents the file # containing accented characters. If it does, the encoding is wrong and it prevents the file
# from being correctly frozen by cx_freeze. To work around that, we open the file, strip all # from being correctly frozen by cx_freeze. To work around that, we open the file, strip all
@@ -560,23 +319,3 @@ def fix_qt_resource_file(path):
lines = [line for line in lines if not line.startswith(b"#")] lines = [line for line in lines if not line.startswith(b"#")]
with open(path, "wb") as fp: with open(path, "wb") as fp:
fp.write(b"\n".join(lines)) fp.write(b"\n".join(lines))
def build_cocoa_ext(
extname, dest, source_files, extra_frameworks=(), extra_includes=()
):
extra_link_args = ["-framework", "CoreFoundation", "-framework", "Foundation"]
for extra in extra_frameworks:
extra_link_args += ["-framework", extra]
ext = Extension(
extname,
source_files,
extra_link_args=extra_link_args,
include_dirs=extra_includes,
)
setup(script_args=["build_ext", "--inplace"], ext_modules=[ext])
# Our problem here is to get the fully qualified filename of the resulting .so but I couldn't
# find a documented way to do so. The only thing I could find is this below :(
fn = ext._file_name
assert op.exists(fn)
move(fn, op.join(dest, fn))
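To round out the build helpers, a usage sketch (generated file path hypothetical) of the comment-stripping workaround described above:

# Sketch: sanitize a pyrcc5-generated resource module before freezing.
from hscommon.build import fix_qt_resource_file

fix_qt_resource_file("qt/dg_rc.py")  # hypothetical generated file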

View File

@@ -1,31 +0,0 @@
# Copyright 2016 Virgil Dupras
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import argparse
from setuptools import setup, Extension
def get_parser():
parser = argparse.ArgumentParser(description="Build an arbitrary Python extension.")
parser.add_argument(
"source_files", nargs="+", help="List of source files to compile"
)
parser.add_argument("name", nargs=1, help="Name of the resulting extension")
return parser
def main():
args = get_parser().parse_args()
print("Building {}...".format(args.name[0]))
ext = Extension(args.name[0], args.source_files)
setup(
script_args=["build_ext", "--inplace"], ext_modules=[ext],
)
if __name__ == "__main__":
main()

View File

@@ -14,7 +14,8 @@ import re
import os import os
import shutil import shutil
from .path import Path, pathify from pathlib import Path
from typing import Callable, List
# This matches [123], but not [12] (3 digits being the minimum). # This matches [123], but not [12] (3 digits being the minimum).
# It also matches [1234], [12345], etc. # It also matches [1234], [12345], etc.
@@ -22,7 +23,7 @@ from .path import Path, pathify
re_conflict = re.compile(r"^\[\d{3}\d*\] ") re_conflict = re.compile(r"^\[\d{3}\d*\] ")
def get_conflicted_name(other_names, name): def get_conflicted_name(other_names: List[str], name: str) -> str:
"""Returns name with a ``[000]`` number in front of it. """Returns name with a ``[000]`` number in front of it.
The number between brackets depends on how many conflicted filenames The number between brackets depends on how many conflicted filenames
@@ -39,7 +40,7 @@ def get_conflicted_name(other_names, name):
i += 1 i += 1
def get_unconflicted_name(name): def get_unconflicted_name(name: str) -> str:
"""Returns ``name`` without ``[]`` brackets. """Returns ``name`` without ``[]`` brackets.
Brackets which, of course, might have been added by :func:`get_conflicted_name`. Brackets which, of course, might have been added by :func:`get_conflicted_name`.
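A small round-trip example (file names hypothetical) of the two helpers above; the first free 3-digit slot is picked, then stripped back off:

from hscommon.conflict import get_conflicted_name, get_unconflicted_name

taken = ["report.txt", "[000] report.txt"]
name = get_conflicted_name(taken, "report.txt")
assert name == "[001] report.txt"
assert get_unconflicted_name(name) == "report.txt"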
@@ -47,38 +48,33 @@ def get_unconflicted_name(name):
return re_conflict.sub("", name, 1) return re_conflict.sub("", name, 1)
def is_conflicted(name): def is_conflicted(name: str) -> bool:
"""Returns whether ``name`` is prepended with a bracketed number. """Returns whether ``name`` is prepended with a bracketed number."""
"""
return re_conflict.match(name) is not None return re_conflict.match(name) is not None
@pathify def _smart_move_or_copy(operation: Callable, source_path: Path, dest_path: Path) -> None:
def _smart_move_or_copy(operation, source_path: Path, dest_path: Path): """Use move() or copy() to move or copy a file with conflict management."""
"""Use move() or copy() to move or copy a file with conflict management. if dest_path.is_dir() and not source_path.is_dir():
""" dest_path = dest_path.joinpath(source_path.name)
if dest_path.isdir() and not source_path.isdir():
dest_path = dest_path[source_path.name]
if dest_path.exists(): if dest_path.exists():
filename = dest_path.name filename = dest_path.name
dest_dir_path = dest_path.parent() dest_dir_path = dest_path.parent
newname = get_conflicted_name(os.listdir(str(dest_dir_path)), filename) newname = get_conflicted_name(os.listdir(str(dest_dir_path)), filename)
dest_path = dest_dir_path[newname] dest_path = dest_dir_path.joinpath(newname)
operation(str(source_path), str(dest_path)) operation(str(source_path), str(dest_path))
def smart_move(source_path, dest_path): def smart_move(source_path: Path, dest_path: Path) -> None:
"""Same as :func:`smart_copy`, but it moves files instead. """Same as :func:`smart_copy`, but it moves files instead."""
"""
_smart_move_or_copy(shutil.move, source_path, dest_path) _smart_move_or_copy(shutil.move, source_path, dest_path)
def smart_copy(source_path, dest_path): def smart_copy(source_path: Path, dest_path: Path) -> None:
"""Copies ``source_path`` to ``dest_path``, recursively and with conflict resolution. """Copies ``source_path`` to ``dest_path``, recursively and with conflict resolution."""
"""
try: try:
_smart_move_or_copy(shutil.copy, source_path, dest_path) _smart_move_or_copy(shutil.copy, source_path, dest_path)
except IOError as e: except OSError as e:
if e.errno in { if e.errno in {
21, 21,
13, 13,
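A usage sketch (paths hypothetical) of the conflict-aware copy:

# Sketch: if music/song.mp3 already exists, the copy lands as
# music/[000] song.mp3 instead of overwriting.
from pathlib import Path
from hscommon.conflict import smart_copy

smart_copy(Path("downloads/song.mp3"), Path("music"))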

View File

@@ -1,23 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-04-19
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import sys
import traceback
# Taken from http://bzimmer.ziclix.com/2008/12/17/python-thread-dumps/
def stacktraces():
code = []
for threadId, stack in sys._current_frames().items():
code.append("\n# ThreadID: %s" % threadId)
for filename, lineno, name, line in traceback.extract_stack(stack):
code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
if line:
code.append(" %s" % (line.strip()))
return "\n".join(code)

View File

@@ -6,34 +6,33 @@
# which should be included with this package. The terms are also available at # which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
from enum import Enum
from os import PathLike
import os.path as op import os.path as op
import logging import logging
class SpecialFolder: class SpecialFolder(Enum):
AppData = 1 APPDATA = 1
Cache = 2 CACHE = 2
def open_url(url): def open_url(url: str) -> None:
"""Open ``url`` with the default browser. """Open ``url`` with the default browser."""
"""
_open_url(url) _open_url(url)
def open_path(path): def open_path(path: PathLike) -> None:
"""Open ``path`` with its associated application. """Open ``path`` with its associated application."""
"""
_open_path(str(path)) _open_path(str(path))
def reveal_path(path): def reveal_path(path: PathLike) -> None:
"""Open the folder containing ``path`` with the default file browser. """Open the folder containing ``path`` with the default file browser."""
"""
_reveal_path(str(path)) _reveal_path(str(path))
def special_folder_path(special_folder, appname=None): def special_folder_path(special_folder: SpecialFolder, portable: bool = False) -> str:
"""Returns the path of ``special_folder``. """Returns the path of ``special_folder``.
``special_folder`` is a SpecialFolder.* const. The result is the special folder for the current ``special_folder`` is a SpecialFolder.* const. The result is the special folder for the current
@@ -41,64 +40,58 @@ def special_folder_path(special_folder, appname=None):
You can override the application name with ``appname``. This argument is ignored under Qt. You can override the application name with ``appname``. This argument is ignored under Qt.
""" """
return _special_folder_path(special_folder, appname) return _special_folder_path(special_folder, portable=portable)
try:
# Normally, we would simply do "from cocoa import proxy", but due to a bug in pytest (currently
# at v2.4.2), our test suite is broken when we do that. This below is a workaround until that
# bug is fixed.
import cocoa
if not hasattr(cocoa, "proxy"):
raise ImportError()
proxy = cocoa.proxy
_open_url = proxy.openURL_
_open_path = proxy.openPath_
_reveal_path = proxy.revealPath_
def _special_folder_path(special_folder, appname=None):
if special_folder == SpecialFolder.Cache:
base = proxy.getCachePath()
else:
base = proxy.getAppdataPath()
if not appname:
appname = proxy.bundleInfo_("CFBundleName")
return op.join(base, appname)
except ImportError:
try: try:
from PyQt5.QtCore import QUrl, QStandardPaths from PyQt5.QtCore import QUrl, QStandardPaths
from PyQt5.QtGui import QDesktopServices from PyQt5.QtGui import QDesktopServices
from qt.util import get_appdata
from core.util import executable_folder
from hscommon.plat import ISWINDOWS, ISOSX
import subprocess
def _open_url(url): def _open_url(url: str) -> None:
QDesktopServices.openUrl(QUrl(url)) QDesktopServices.openUrl(QUrl(url))
def _open_path(path): def _open_path(path: str) -> None:
url = QUrl.fromLocalFile(str(path)) url = QUrl.fromLocalFile(str(path))
QDesktopServices.openUrl(url) QDesktopServices.openUrl(url)
def _reveal_path(path): def _reveal_path(path: str) -> None:
if ISWINDOWS:
subprocess.run(["explorer", "/select,", op.abspath(path)])
elif ISOSX:
subprocess.run(["open", "-R", op.abspath(path)])
else:
_open_path(op.dirname(str(path))) _open_path(op.dirname(str(path)))
def _special_folder_path(special_folder, appname=None): def _special_folder_path(special_folder: SpecialFolder, portable: bool = False) -> str:
if special_folder == SpecialFolder.Cache: if special_folder == SpecialFolder.CACHE:
qtfolder = QStandardPaths.CacheLocation if ISWINDOWS and portable:
folder = op.join(executable_folder(), "cache")
else: else:
qtfolder = QStandardPaths.DataLocation folder = QStandardPaths.standardLocations(QStandardPaths.CacheLocation)[0]
return QStandardPaths.standardLocations(qtfolder)[0] else:
folder = get_appdata(portable)
return folder
except ImportError: except ImportError:
# We're either running tests, and these functions don't matter much, or we're in a really # We're either running tests, and these functions don't matter much, or we're in a really
# weird situation. Let's just have dummy fallbacks. # weird situation. Let's just have dummy fallbacks.
logging.warning("Can't setup desktop functions!") logging.warning("Can't setup desktop functions!")
def _open_path(path): def _open_url(url: str) -> None:
# Dummy for tests
pass pass
def _reveal_path(path): def _open_path(path: str) -> None:
# Dummy for tests
pass pass
def _special_folder_path(special_folder, appname=None): def _reveal_path(path: str) -> None:
# Dummy for tests
pass
def _special_folder_path(special_folder: SpecialFolder, portable: bool = False) -> str:
return "/tmp" return "/tmp"

View File

@@ -1,218 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-08-05
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
from sys import maxsize as INF
from math import sqrt
VERY_SMALL = 0.0000001
class Point:
def __init__(self, x, y):
self.x = x
self.y = y
def __repr__(self):
return "<Point {:2.2f}, {:2.2f}>".format(*self)
def __iter__(self):
yield self.x
yield self.y
def distance_to(self, other):
return Line(self, other).length()
class Line:
def __init__(self, p1, p2):
self.p1 = p1
self.p2 = p2
def __repr__(self):
return "<Line {}, {}>".format(*self)
def __iter__(self):
yield self.p1
yield self.p2
def dx(self):
return self.p2.x - self.p1.x
def dy(self):
return self.p2.y - self.p1.y
def length(self):
return sqrt(self.dx() ** 2 + self.dy() ** 2)
def slope(self):
if self.dx() == 0:
return INF if self.dy() > 0 else -INF
else:
return self.dy() / self.dx()
def intersection_point(self, other):
# with help from http://paulbourke.net/geometry/lineline2d/
if abs(self.slope() - other.slope()) < VERY_SMALL:
# parallel. Even if coincident, we return nothing
return None
A, B = self
C, D = other
denom = (D.y - C.y) * (B.x - A.x) - (D.x - C.x) * (B.y - A.y)
if denom == 0:
return None
numera = (D.x - C.x) * (A.y - C.y) - (D.y - C.y) * (A.x - C.x)
numerb = (B.x - A.x) * (A.y - C.y) - (B.y - A.y) * (A.x - C.x)
mua = numera / denom
mub = numerb / denom
if (0 <= mua <= 1) and (0 <= mub <= 1):
x = A.x + mua * (B.x - A.x)
y = A.y + mua * (B.y - A.y)
return Point(x, y)
else:
return None
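A quick inline check of the intersection formula above (written with plain tuples, so it does not depend on the removed module): the diagonals of the unit square cross at (0.5, 0.5):

A, B, C, D = (0, 0), (1, 1), (0, 1), (1, 0)
denom = (D[1] - C[1]) * (B[0] - A[0]) - (D[0] - C[0]) * (B[1] - A[1])
numera = (D[0] - C[0]) * (A[1] - C[1]) - (D[1] - C[1]) * (A[0] - C[0])
mua = numera / denom  # 0.5: halfway along the first diagonal
assert (A[0] + mua * (B[0] - A[0]), A[1] + mua * (B[1] - A[1])) == (0.5, 0.5)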
class Rect:
def __init__(self, x, y, w, h):
self.x = x
self.y = y
self.w = w
self.h = h
def __iter__(self):
yield self.x
yield self.y
yield self.w
yield self.h
def __repr__(self):
return "<Rect {:2.2f}, {:2.2f}, {:2.2f}, {:2.2f}>".format(*self)
@classmethod
def from_center(cls, center, width, height):
x = center.x - width / 2
y = center.y - height / 2
return cls(x, y, width, height)
@classmethod
def from_corners(cls, pt1, pt2):
x1, y1 = pt1
x2, y2 = pt2
return cls(min(x1, x2), min(y1, y2), abs(x1 - x2), abs(y1 - y2))
def center(self):
return Point(self.x + self.w / 2, self.y + self.h / 2)
def contains_point(self, point):
x, y = point
(x1, y1), (x2, y2) = self.corners()
return (x1 <= x <= x2) and (y1 <= y <= y2)
def contains_rect(self, rect):
pt1, pt2 = rect.corners()
return self.contains_point(pt1) and self.contains_point(pt2)
def corners(self):
return Point(self.x, self.y), Point(self.x + self.w, self.y + self.h)
def intersects(self, other):
r1pt1, r1pt2 = self.corners()
r2pt1, r2pt2 = other.corners()
if r1pt1.x < r2pt1.x:
xinter = r1pt2.x >= r2pt1.x
else:
xinter = r2pt2.x >= r1pt1.x
if not xinter:
return False
if r1pt1.y < r2pt1.y:
yinter = r1pt2.y >= r2pt1.y
else:
yinter = r2pt2.y >= r1pt1.y
return yinter
def lines(self):
pt1, pt4 = self.corners()
pt2 = Point(pt4.x, pt1.y)
pt3 = Point(pt1.x, pt4.y)
l1 = Line(pt1, pt2)
l2 = Line(pt2, pt4)
l3 = Line(pt3, pt4)
l4 = Line(pt1, pt3)
return l1, l2, l3, l4
def scaled_rect(self, dx, dy):
"""Returns a rect that has the same borders at self, but grown/shrunk by dx/dy on each side.
"""
x, y, w, h = self
x -= dx
y -= dy
w += dx * 2
h += dy * 2
return Rect(x, y, w, h)
def united(self, other):
"""Returns the bounding rectangle of this rectangle and `other`.
"""
# ul=upper left lr=lower right
ulcorner1, lrcorner1 = self.corners()
ulcorner2, lrcorner2 = other.corners()
corner1 = Point(min(ulcorner1.x, ulcorner2.x), min(ulcorner1.y, ulcorner2.y))
corner2 = Point(max(lrcorner1.x, lrcorner2.x), max(lrcorner1.y, lrcorner2.y))
return Rect.from_corners(corner1, corner2)
# --- Properties
@property
def top(self):
return self.y
@top.setter
def top(self, value):
self.y = value
@property
def bottom(self):
return self.y + self.h
@bottom.setter
def bottom(self, value):
self.y = value - self.h
@property
def left(self):
return self.x
@left.setter
def left(self, value):
self.x = value
@property
def right(self):
return self.x + self.w
@right.setter
def right(self, value):
self.x = value - self.w
@property
def width(self):
return self.w
@width.setter
def width(self, value):
self.w = value
@property
def height(self):
return self.h
@height.setter
def height(self, value):
self.h = value

View File

@@ -36,11 +36,11 @@ class GUIObject:
``multibind`` flag to ``True`` and the safeguard will be disabled. ``multibind`` flag to ``True`` and the safeguard will be disabled.
""" """
def __init__(self, multibind=False): def __init__(self, multibind: bool = False) -> None:
self._view = None self._view = None
self._multibind = multibind self._multibind = multibind
def _view_updated(self): def _view_updated(self) -> None:
"""(Virtual) Called after :attr:`view` has been set. """(Virtual) Called after :attr:`view` has been set.
Doing nothing by default, this method is called after :attr:`view` has been set (it isn't Doing nothing by default, this method is called after :attr:`view` has been set (it isn't
@@ -48,7 +48,7 @@ class GUIObject:
(which is often the whole of the initialization code). (which is often the whole of the initialization code).
""" """
def has_view(self): def has_view(self) -> bool:
return (self._view is not None) and (not isinstance(self._view, NoopGUI)) return (self._view is not None) and (not isinstance(self._view, NoopGUI))
@property @property
@@ -67,7 +67,7 @@ class GUIObject:
return self._view return self._view
@view.setter @view.setter
def view(self, value): def view(self, value) -> None:
if self._view is None and value is None: if self._view is None and value is None:
# Initial view assignment # Initial view assignment
return return
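A minimal binding sketch (view class hypothetical) of the GUIObject contract above:

# Sketch: assigning .view triggers _view_updated(); a second bind without
# multibind=True would trip the safeguard.
from hscommon.gui.base import GUIObject

class DummyView:
    pass

model = GUIObject()
model.view = DummyView()
assert model.has_view()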

View File

@@ -7,8 +7,10 @@
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
import copy import copy
from typing import Any, List, Tuple, Union
from .base import GUIObject from hscommon.gui.base import GUIObject
from hscommon.gui.table import GUITable
class Column: class Column:
@@ -17,7 +19,7 @@ class Column:
These attributes are then used to correctly configure the column on the "view" side. These attributes are then used to correctly configure the column on the "view" side.
""" """
def __init__(self, name, display="", visible=True, optional=False): def __init__(self, name: str, display: str = "", visible: bool = True, optional: bool = False) -> None:
#: "programmatical" (not for display) name. Used as a reference in a couple of place, such #: "programmatical" (not for display) name. Used as a reference in a couple of place, such
#: as :meth:`Columns.column_by_name`. #: as :meth:`Columns.column_by_name`.
self.name = name self.name = name
@@ -52,14 +54,14 @@ class ColumnsView:
callbacks. callbacks.
""" """
def restore_columns(self): def restore_columns(self) -> None:
"""Update all columns according to the model. """Update all columns according to the model.
When this is called, our view has to update the title, order and visibility of all When this is called, our view has to update the title, order and visibility of all
columns. columns.
""" """
def set_column_visible(self, colname, visible): def set_column_visible(self, colname: str, visible: bool) -> None:
"""Update visibility of column ``colname``. """Update visibility of column ``colname``.
Called when the user toggles the visibility of a column, we must update the column Called when the user toggles the visibility of a column, we must update the column
@@ -73,15 +75,14 @@ class PrefAccessInterface:
*Not actually used in the code. For documentation purposes only.* *Not actually used in the code. For documentation purposes only.*
""" """
def get_default(self, key, fallback_value): def get_default(self, key: str, fallback_value: Union[Any, None]) -> Any:
"""Retrieve the value for ``key`` in the currently running app's preference store. """Retrieve the value for ``key`` in the currently running app's preference store.
If the key doesn't exist, return ``fallback_value``. If the key doesn't exist, return ``fallback_value``.
""" """
def set_default(self, key, value): def set_default(self, key: str, value: Any) -> None:
"""Set the value ``value`` for ``key`` in the currently running app's preference store. """Set the value ``value`` for ``key`` in the currently running app's preference store."""
"""
class Columns(GUIObject): class Columns(GUIObject):
@@ -105,71 +106,65 @@ class Columns(GUIObject):
have that same prefix. have that same prefix.
""" """
def __init__(self, table, prefaccess=None, savename=None): def __init__(self, table: GUITable, prefaccess=None, savename: Union[str, None] = None):
GUIObject.__init__(self) GUIObject.__init__(self)
self.table = table self.table = table
self.prefaccess = prefaccess self.prefaccess = prefaccess
self.savename = savename self.savename = savename
# We use copy here for test isolation. If we don't, changing a column affects all tests. # We use copy here for test isolation. If we don't, changing a column affects all tests.
self.column_list = list(map(copy.copy, table.COLUMNS)) self.column_list: List[Column] = list(map(copy.copy, table.COLUMNS))
for i, column in enumerate(self.column_list): for i, column in enumerate(self.column_list):
column.logical_index = i column.logical_index = i
column.ordered_index = i column.ordered_index = i
self.coldata = {col.name: col for col in self.column_list} self.coldata = {col.name: col for col in self.column_list}
# --- Private # --- Private
def _get_colname_attr(self, colname, attrname, default): def _get_colname_attr(self, colname: str, attrname: str, default: Any) -> Any:
try: try:
return getattr(self.coldata[colname], attrname) return getattr(self.coldata[colname], attrname)
except KeyError: except KeyError:
return default return default
def _set_colname_attr(self, colname, attrname, value): def _set_colname_attr(self, colname: str, attrname: str, value: Any) -> None:
try: try:
col = self.coldata[colname] col = self.coldata[colname]
setattr(col, attrname, value) setattr(col, attrname, value)
except KeyError: except KeyError:
pass pass
def _optional_columns(self): def _optional_columns(self) -> List[Column]:
return [c for c in self.column_list if c.optional] return [c for c in self.column_list if c.optional]
# --- Override # --- Override
def _view_updated(self): def _view_updated(self) -> None:
self.restore_columns() self.restore_columns()
     # --- Public
-    def column_by_index(self, index):
-        """Return the :class:`Column` having the :attr:`~Column.logical_index` ``index``.
-        """
+    def column_by_index(self, index: int):
+        """Return the :class:`Column` having the :attr:`~Column.logical_index` ``index``."""
         return self.column_list[index]

-    def column_by_name(self, name):
-        """Return the :class:`Column` having the :attr:`~Column.name` ``name``.
-        """
+    def column_by_name(self, name: str):
+        """Return the :class:`Column` having the :attr:`~Column.name` ``name``."""
         return self.coldata[name]

-    def columns_count(self):
-        """Returns the number of columns in our set.
-        """
+    def columns_count(self) -> int:
+        """Returns the number of columns in our set."""
         return len(self.column_list)

-    def column_display(self, colname):
-        """Returns display name for column named ``colname``, or ``''`` if there's none.
-        """
+    def column_display(self, colname: str) -> str:
+        """Returns display name for column named ``colname``, or ``''`` if there's none."""
         return self._get_colname_attr(colname, "display", "")

-    def column_is_visible(self, colname):
-        """Returns visibility for column named ``colname``, or ``True`` if there's none.
-        """
+    def column_is_visible(self, colname: str) -> bool:
+        """Returns visibility for column named ``colname``, or ``True`` if there's none."""
         return self._get_colname_attr(colname, "visible", True)

-    def column_width(self, colname):
-        """Returns width for column named ``colname``, or ``0`` if there's none.
-        """
+    def column_width(self, colname: str) -> int:
+        """Returns width for column named ``colname``, or ``0`` if there's none."""
         return self._get_colname_attr(colname, "width", 0)

-    def columns_to_right(self, colname):
+    def columns_to_right(self, colname: str) -> List[str]:
         """Returns the list of all columns to the right of ``colname``.

         "right" meaning "having a higher :attr:`Column.ordered_index`" in our left-to-right
@@ -177,13 +172,9 @@ class Columns(GUIObject):
         """
         column = self.coldata[colname]
         index = column.ordered_index
-        return [
-            col.name
-            for col in self.column_list
-            if (col.visible and col.ordered_index > index)
-        ]
+        return [col.name for col in self.column_list if (col.visible and col.ordered_index > index)]

-    def menu_items(self):
+    def menu_items(self) -> List[Tuple[str, bool]]:
         """Returns a list of items convenient for quick visibility menu generation.

         Returns a list of ``(display_name, is_marked)`` items for each optional column in the
@@ -195,7 +186,7 @@ class Columns(GUIObject):
         """
         return [(c.display, c.visible) for c in self._optional_columns()]

-    def move_column(self, colname, index):
+    def move_column(self, colname: str, index: int) -> None:
         """Moves column ``colname`` to ``index``.

         The column will be placed just in front of the column currently having that index, or to the
@@ -206,23 +197,20 @@ class Columns(GUIObject):
         colnames.insert(index, colname)
         self.set_column_order(colnames)

-    def reset_to_defaults(self):
-        """Reset all columns' width and visibility to their default values.
-        """
+    def reset_to_defaults(self) -> None:
+        """Reset all columns' width and visibility to their default values."""
         self.set_column_order([col.name for col in self.column_list])
         for col in self._optional_columns():
             col.visible = col.default_visible
             col.width = col.default_width
         self.view.restore_columns()

-    def resize_column(self, colname, newwidth):
-        """Set column ``colname``'s width to ``newwidth``.
-        """
+    def resize_column(self, colname: str, newwidth: int) -> None:
+        """Set column ``colname``'s width to ``newwidth``."""
         self._set_colname_attr(colname, "width", newwidth)

-    def restore_columns(self):
-        """Restore's column persistent attributes from the last :meth:`save_columns`.
-        """
+    def restore_columns(self) -> None:
+        """Restores column persistent attributes from the last :meth:`save_columns`."""
         if not (self.prefaccess and self.savename and self.coldata):
             if (not self.savename) and (self.coldata):
                 # This is a table that will not have its coldata saved/restored. we should
@@ -230,7 +218,7 @@ class Columns(GUIObject):
             self.view.restore_columns()
             return
         for col in self.column_list:
-            pref_name = "{}.Columns.{}".format(self.savename, col.name)
+            pref_name = f"{self.savename}.Columns.{col.name}"
             coldata = self.prefaccess.get_default(pref_name, fallback_value={})
             if "index" in coldata:
                 col.ordered_index = coldata["index"]
@@ -240,19 +228,19 @@ class Columns(GUIObject):
                 col.visible = coldata["visible"]
         self.view.restore_columns()

-    def save_columns(self):
-        """Save column attributes in persistent storage for restoration in :meth:`restore_columns`.
-        """
+    def save_columns(self) -> None:
+        """Save column attributes in persistent storage for restoration in :meth:`restore_columns`."""
         if not (self.prefaccess and self.savename and self.coldata):
             return
         for col in self.column_list:
-            pref_name = "{}.Columns.{}".format(self.savename, col.name)
+            pref_name = f"{self.savename}.Columns.{col.name}"
             coldata = {"index": col.ordered_index, "width": col.width}
             if col.optional:
                 coldata["visible"] = col.visible
             self.prefaccess.set_default(pref_name, coldata)

-    def set_column_order(self, colnames):
+    # TODO annotate colnames
+    def set_column_order(self, colnames) -> None:
         """Change the columns order so it matches the order in ``colnames``.

         :param colnames: A list of column names in the desired order.
@@ -262,19 +250,17 @@ class Columns(GUIObject):
             col = self.coldata[colname]
             col.ordered_index = i

-    def set_column_visible(self, colname, visible):
-        """Set the visibility of column ``colname``.
-        """
+    def set_column_visible(self, colname: str, visible: bool) -> None:
+        """Set the visibility of column ``colname``."""
         self.table.save_edits()  # the table on the GUI side will stop editing when the columns change
         self._set_colname_attr(colname, "visible", visible)
         self.view.set_column_visible(colname, visible)

-    def set_default_width(self, colname, width):
-        """Set the default width or column ``colname``.
-        """
+    def set_default_width(self, colname: str, width: int) -> None:
+        """Set the default width of column ``colname``."""
         self._set_colname_attr(colname, "default_width", width)

-    def toggle_menu_item(self, index):
+    def toggle_menu_item(self, index: int) -> bool:
         """Toggles the visibility of an optional column.

         You know, that optional column menu you've generated in :meth:`menu_items`? Well, ``index``
@@ -288,15 +274,11 @@ class Columns(GUIObject):
     # --- Properties
     @property
-    def ordered_columns(self):
-        """List of :class:`Column` in visible order.
-        """
-        return [
-            col for col in sorted(self.column_list, key=lambda col: col.ordered_index)
-        ]
+    def ordered_columns(self) -> List[Column]:
+        """List of :class:`Column` in visible order."""
+        return [col for col in sorted(self.column_list, key=lambda col: col.ordered_index)]

     @property
-    def colnames(self):
-        """List of column names in visible order.
-        """
+    def colnames(self) -> List[str]:
+        """List of column names in visible order."""
         return [col.name for col in self.ordered_columns]
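Taken together, menu_items(), toggle_menu_item() and the save/restore pair are what a GUI layer uses to drive the optional-column visibility menu. A rough sketch of that flow, where `table`, `settings` (anything exposing get_default/set_default) and `menu` are hypothetical stand-ins, not part of this diff:

    # Sketch only: `table`, `settings` and `menu` are hypothetical stand-ins.
    columns = Columns(table, prefaccess=settings, savename="ResultsTable")
    columns.restore_columns()  # re-apply saved "<savename>.Columns.<name>" entries

    # One checkable menu entry per optional column; toggling flips visibility.
    for i, (display, checked) in enumerate(columns.menu_items()):
        menu.add_checkable(display, checked, on_click=lambda i=i: columns.toggle_menu_item(i))

    columns.save_columns()  # typically called when the window closes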

View File

@@ -4,9 +4,10 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from ..jobprogress.performer import ThreadedJobPerformer
-from .base import GUIObject
-from .text_field import TextField
+from typing import Callable, Tuple, Union
+
+from hscommon.jobprogress.performer import ThreadedJobPerformer
+from hscommon.gui.base import GUIObject
+from hscommon.gui.text_field import TextField


 class ProgressWindowView:
@@ -20,15 +21,13 @@ class ProgressWindowView:
     It's also expected to call :meth:`ProgressWindow.cancel` when the cancel button is clicked.
     """

-    def show(self):
-        """Show the dialog.
-        """
+    def show(self) -> None:
+        """Show the dialog."""

-    def close(self):
-        """Close the dialog.
-        """
+    def close(self) -> None:
+        """Close the dialog."""

-    def set_progress(self, progress):
+    def set_progress(self, progress: int) -> None:
         """Set the progress of the progress bar to ``progress``.

         Not all jobs are equally responsive on their job progress report and it is recommended that
@@ -62,7 +61,11 @@ class ProgressWindow(GUIObject, ThreadedJobPerformer):
     called as if the job terminated normally.
     """

-    def __init__(self, finish_func, error_func=None):
+    def __init__(
+        self,
+        finish_func: Callable[[Union[str, None]], None],
+        error_func: Callable[[Union[str, None], Exception], bool] = None,
+    ) -> None:
         # finish_func(jobid) is the function that is called when a job is completed.
         GUIObject.__init__(self)
         ThreadedJobPerformer.__init__(self)
@@ -73,18 +76,17 @@ class ProgressWindow(GUIObject, ThreadedJobPerformer):
         #: :class:`.TextField`. It contains the job textual update that the function might yield
         #: during its course.
         self.progressdesc_textfield = TextField()
-        self.jobid = None
+        self.jobid: Union[str, None] = None

-    def cancel(self):
-        """Call for a user-initiated job cancellation.
-        """
+    def cancel(self) -> None:
+        """Call for a user-initiated job cancellation."""
         # The UI is sometimes a bit buggy and calls cancel() on self.view.close(). We just want to
         # make sure that this doesn't lead us to think that the user actually cancelled the task, so
         # we verify that the job is still running.
         if self._job_running:
             self.job_cancelled = True

-    def pulse(self):
+    def pulse(self) -> None:
         """Update progress reports in the GUI.

         Call this regularly from the GUI main run loop. The values might change before
@@ -114,7 +116,7 @@ class ProgressWindow(GUIObject, ThreadedJobPerformer):
         self.progressdesc_textfield.text = last_desc
         self.view.set_progress(last_progress)

-    def run(self, jobid, title, target, args=()):
+    def run(self, jobid: str, title: str, target: Callable, args: Tuple = ()):
         """Starts a threaded job.

         The ``target`` function will be sent, as its first argument, a :class:`.Job` instance which
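From the signatures above, a caller passes run() a target that takes a Job as its first argument, then polls pulse() from the GUI event loop. A minimal sketch, where `window_view`, `timer` and `paths` are hypothetical stand-ins, and Job.set_progress() is assumed to accept an optional description string:

    # Sketch only: `window_view`, `timer` and `paths` are stand-ins.
    def long_task(job, paths):
        for i, path in enumerate(paths):
            job.set_progress(100 * i // len(paths), f"Processing {path}")

    pw = ProgressWindow(finish_func=lambda jobid: print("finished:", jobid))
    pw.view = window_view
    pw.run("scan", "Scanning files", long_task, args=(paths,))
    timer.call_every(0.05, pw.pulse)  # keep the progress bar and label fresh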

View File

@@ -8,7 +8,7 @@
 from collections.abc import Sequence, MutableSequence

-from .base import GUIObject
+from hscommon.gui.base import GUIObject


 class Selectable(Sequence):
@@ -27,9 +27,7 @@ class Selectable(Sequence):
             self._selected_indexes = []
         if not self._selected_indexes:
             return
-        self._selected_indexes = [
-            index for index in self._selected_indexes if index < len(self)
-        ]
+        self._selected_indexes = [index for index in self._selected_indexes if index < len(self)]
         if not self._selected_indexes:
             self._selected_indexes = [len(self) - 1]
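The reflowed comprehension is the heart of the selection-pruning step: indexes past the new end of the sequence are dropped, and if none survive, the selection falls back to the last remaining row. In isolation the rule looks like this:

    # Standalone illustration of the pruning rule (not the real class):
    selected = [3, 8, 9]
    length = 5                   # the sequence just shrank to 5 rows
    selected = [i for i in selected if i < length]
    if not selected:
        selected = [length - 1]  # fall back to the last row
    print(selected)              # -> [3]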

View File

@@ -8,9 +8,10 @@
 from collections.abc import MutableSequence
 from collections import namedtuple
+from typing import Any, List, Tuple, Union

-from .base import GUIObject
-from .selectable_list import Selectable
+from hscommon.gui.base import GUIObject
+from hscommon.gui.selectable_list import Selectable

 # We used to directly subclass list, but it caused problems at some point with deepcopy
@@ -27,12 +28,16 @@ class Table(MutableSequence, Selectable):
     Subclasses :class:`.Selectable`.
     """

-    def __init__(self):
-        Selectable.__init__(self)
-        self._rows = []
-        self._header = None
-        self._footer = None
+    # Should be List[Column], but have circular import...
+    COLUMNS: List = []
+
+    def __init__(self) -> None:
+        Selectable.__init__(self)
+        self._rows: List["Row"] = []
+        self._header: Union["Row", None] = None
+        self._footer: Union["Row", None] = None

+    # TODO type hint for key
     def __delitem__(self, key):
         self._rows.__delitem__(key)
         if self._header is not None and ((not self) or (self[0] is not self._header)):
@@ -41,16 +46,18 @@ class Table(MutableSequence, Selectable):
             self._footer = None
         self._check_selection_range()

-    def __getitem__(self, key):
+    # TODO type hint for key
+    def __getitem__(self, key) -> Any:
         return self._rows.__getitem__(key)

-    def __len__(self):
+    def __len__(self) -> int:
         return len(self._rows)

-    def __setitem__(self, key, value):
+    # TODO type hint for key
+    def __setitem__(self, key, value: Any) -> None:
         self._rows.__setitem__(key, value)

-    def append(self, item):
+    def append(self, item: "Row") -> None:
         """Appends ``item`` at the end of the table.

         If there's a footer, the item is inserted before it.
@@ -60,7 +67,7 @@ class Table(MutableSequence, Selectable):
         else:
             self._rows.append(item)

-    def insert(self, index, item):
+    def insert(self, index: int, item: "Row") -> None:
         """Inserts ``item`` at ``index`` in the table.

         If there's a header, will make sure we don't insert before it, and if there's a footer, will
@@ -72,7 +79,7 @@ class Table(MutableSequence, Selectable):
             index = len(self) - 1
         self._rows.insert(index, item)

-    def remove(self, row):
+    def remove(self, row: "Row") -> None:
         """Removes ``row`` from table.

         If ``row`` is a header or footer, that header or footer will be set to ``None``.
@@ -84,7 +91,7 @@ class Table(MutableSequence, Selectable):
         self._rows.remove(row)
         self._check_selection_range()

-    def sort_by(self, column_name, desc=False):
+    def sort_by(self, column_name: str, desc: bool = False) -> None:
         """Sort table by ``column_name``.

         Sort key for each row is computed from :meth:`Row.sort_key_for_column`.
@@ -97,8 +104,7 @@ class Table(MutableSequence, Selectable):
             self._rows.pop(0)
         if self._footer is not None:
             self._rows.pop()
-        key = lambda row: row.sort_key_for_column(column_name)
-        self._rows.sort(key=key, reverse=desc)
+        self._rows.sort(key=lambda row: row.sort_key_for_column(column_name), reverse=desc)
         if self._header is not None:
             self._rows.insert(0, self._header)
         if self._footer is not None:
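The pop/sort/re-insert sequence above is what keeps a header pinned to index 0 and a footer pinned to the end while only the middle rows are reordered. A small sketch against Table, using a made-up NameRow object that provides just the sort_key_for_column() hook the sort needs:

    from hscommon.gui.table import Table

    class NameRow:  # made-up row-like object; only what sort_by() needs
        def __init__(self, name):
            self.name = name

        def sort_key_for_column(self, column_name):
            return getattr(self, column_name)

    t = Table()
    t.header = NameRow("Name")    # pinned to index 0
    t.append(NameRow("zoe"))
    t.append(NameRow("adam"))
    t.footer = NameRow("Total")   # pinned to the end
    t.sort_by("name")             # only "zoe"/"adam" are reordered
    assert [r.name for r in t] == ["Name", "adam", "zoe", "Total"]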
@@ -106,7 +112,7 @@ class Table(MutableSequence, Selectable):

     # --- Properties
     @property
-    def footer(self):
+    def footer(self) -> Union["Row", None]:
         """If set, a row that always stays at the bottom of the table.

         :class:`Row`. *get/set*.
@@ -129,7 +135,7 @@ class Table(MutableSequence, Selectable):
         return self._footer

     @footer.setter
-    def footer(self, value):
+    def footer(self, value: Union["Row", None]) -> None:
         if self._footer is not None:
             self._rows.pop()
         if value is not None:
@@ -137,7 +143,7 @@ class Table(MutableSequence, Selectable):
         self._footer = value

     @property
-    def header(self):
+    def header(self) -> Union["Row", None]:
         """If set, a row that always stays at the top of the table.

         See :attr:`footer` for details.
@@ -145,7 +151,7 @@ class Table(MutableSequence, Selectable):
         return self._header

     @header.setter
-    def header(self, value):
+    def header(self, value: Union["Row", None]) -> None:
         if self._header is not None:
             self._rows.pop(0)
         if value is not None:
@@ -153,7 +159,7 @@ class Table(MutableSequence, Selectable):
         self._header = value

     @property
-    def row_count(self):
+    def row_count(self) -> int:
         """Number of rows in the table (without counting header and footer).

         *int*. *read-only*.
@@ -166,7 +172,7 @@ class Table(MutableSequence, Selectable):
         return result

     @property
-    def rows(self):
+    def rows(self) -> List["Row"]:
         """List of rows in the table, excluding header and footer.

         List of :class:`Row`. *read-only*.
@@ -180,7 +186,7 @@ class Table(MutableSequence, Selectable):
         return self[start:end]

     @property
-    def selected_row(self):
+    def selected_row(self) -> "Row":
         """Selected row according to :attr:`Selectable.selected_index`.

         :class:`Row`. *get/set*.
@@ -191,14 +197,14 @@ class Table(MutableSequence, Selectable):
         return self[self.selected_index] if self.selected_index is not None else None

     @selected_row.setter
-    def selected_row(self, value):
+    def selected_row(self, value: int) -> None:
         try:
             self.selected_index = self.index(value)
         except ValueError:
             pass

     @property
-    def selected_rows(self):
+    def selected_rows(self) -> List["Row"]:
         """List of selected rows based on :attr:`.selected_indexes`.

         List of :class:`Row`. *read-only*.
@@ -220,20 +226,20 @@ class GUITableView:
     Whenever the user changes the selection, we expect the view to call :meth:`Table.select`.
     """

-    def refresh(self):
+    def refresh(self) -> None:
         """Refreshes the contents of the table widget.

         Ensures that the contents of the table widget are synced with the model. This includes
         selection.
         """

-    def start_editing(self):
+    def start_editing(self) -> None:
         """Start editing the currently selected row.

         Begin whatever inline editing support that the view supports.
         """

-    def stop_editing(self):
+    def stop_editing(self) -> None:
         """Stop editing if there's an inline editing in effect.

         There's no "aborting" implied in this call, so it's appropriate to send whatever the user
@@ -261,34 +267,33 @@ class GUITable(Table, GUIObject):
     :class:`GUITableView`.
     """

-    def __init__(self):
+    def __init__(self) -> None:
         GUIObject.__init__(self)
         Table.__init__(self)
         #: The row being currently edited by the user. ``None`` if no edit is taking place.
-        self.edited = None
-        self._sort_descriptor = None
+        self.edited: Union["Row", None] = None
+        self._sort_descriptor: Union[SortDescriptor, None] = None

     # --- Virtual
-    def _do_add(self):
+    def _do_add(self) -> Tuple["Row", int]:
         """(Virtual) Creates a new row, adds it in the table.

         Returns ``(row, insert_index)``.
         """
         raise NotImplementedError()

-    def _do_delete(self):
-        """(Virtual) Delete the selected rows.
-        """
+    def _do_delete(self) -> None:
+        """(Virtual) Delete the selected rows."""
         pass

-    def _fill(self):
+    def _fill(self) -> None:
         """(Virtual/Required) Fills the table with all the rows that this table is supposed to have.

         Called by :meth:`refresh`. Does nothing by default.
         """
         pass

-    def _is_edited_new(self):
+    def _is_edited_new(self) -> bool:
         """(Virtual) Returns whether the currently edited row should be considered "new".

         This is used in :meth:`cancel_edits` to know whether the cancellation of the edit means a
@@ -317,7 +322,7 @@ class GUITable(Table, GUIObject):
             self.select([len(self) - 1])

     # --- Public
-    def add(self):
+    def add(self) -> None:
         """Add a new row in edit mode.

         Requires :meth:`_do_add` to be implemented. The newly added row will be selected and in edit
@@ -336,7 +341,7 @@ class GUITable(Table, GUIObject):
         self.edited = row
         self.view.start_editing()

-    def can_edit_cell(self, column_name, row_index):
+    def can_edit_cell(self, column_name: str, row_index: int) -> bool:
         """Returns whether the cell at ``row_index`` and ``column_name`` can be edited.

         A row is, by default, editable as soon as it has an attr with the same name as `column`.
@@ -348,7 +353,7 @@ class GUITable(Table, GUIObject):
         row = self[row_index]
         return row.can_edit_cell(column_name)

-    def cancel_edits(self):
+    def cancel_edits(self) -> None:
         """Cancels the current edit operation.

         If there's an :attr:`edited` row, it will be re-initialized (with :meth:`Row.load`).
@@ -366,7 +371,7 @@ class GUITable(Table, GUIObject):
         self.edited = None
         self.view.refresh()

-    def delete(self):
+    def delete(self) -> None:
         """Delete the currently selected rows.

         Requires :meth:`_do_delete` for this to have any effect on the model. Cancels editing if
@@ -379,7 +384,7 @@ class GUITable(Table, GUIObject):
         if self:
             self._do_delete()

-    def refresh(self, refresh_view=True):
+    def refresh(self, refresh_view: bool = True) -> None:
         """Empty the table and re-create its rows.

         :meth:`_fill` is called after we emptied the table to create our rows. Previous sort order
@@ -401,7 +406,7 @@ class GUITable(Table, GUIObject):
         if refresh_view:
             self.view.refresh()

-    def save_edits(self):
+    def save_edits(self) -> None:
         """Commit user edits to the model.

         This is done by calling :meth:`Row.save`.
@@ -412,7 +417,7 @@ class GUITable(Table, GUIObject):
         self.edited = None
         row.save()

-    def sort_by(self, column_name, desc=False):
+    def sort_by(self, column_name: str, desc: bool = False) -> None:
         """Sort table by ``column_name``.

         Overrides :meth:`Table.sort_by`. After having performed sorting, calls
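add(), save_edits(), cancel_edits() and delete() above form the edit lifecycle around the _do_add/_do_delete/_fill hooks. A minimal subclass sketch; `people` is an illustrative model list, `Person` is a made-up model class, and PersonRow is a made-up Row subclass along the lines of the sketch after the Row section below:

    from hscommon.gui.table import GUITable

    class PeopleTable(GUITable):  # illustrative subclass, not dupeGuru code
        def __init__(self, people):
            GUITable.__init__(self)
            self.people = people

        def _fill(self):  # called by refresh() after the table is emptied
            for person in self.people:
                self.append(PersonRow(self, person))

        def _do_add(self):  # called by add(); returns (row, insert_index)
            person = Person("New person")
            self.people.append(person)
            return PersonRow(self, person), len(self)

        def _do_delete(self):  # called by delete() on the selected rows
            for row in self.selected_rows:
                self.people.remove(row.person)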
@@ -452,18 +457,18 @@ class Row:
     Of course, this is only default behavior. This can be overridden.
     """

-    def __init__(self, table):
-        super(Row, self).__init__()
+    def __init__(self, table: GUITable) -> None:
+        super().__init__()
         self.table = table

-    def _edit(self):
+    def _edit(self) -> None:
         if self.table.edited is self:
             return
         assert self.table.edited is None
         self.table.edited = self

     # --- Virtual
-    def can_edit(self):
+    def can_edit(self) -> bool:
         """(Virtual) Whether the whole row can be edited.

         By default, always returns ``True``. This is for the *whole* row. For individual cells, it's
@@ -471,7 +476,7 @@ class Row:
         """
         return True

-    def load(self):
+    def load(self) -> None:
         """(Virtual/Required) Loads up values from the model to be presented in the table.

         Usually, our model instances contain values that are not quite ready for display. If you
@@ -480,7 +485,7 @@ class Row:
         """
         raise NotImplementedError()

-    def save(self):
+    def save(self) -> None:
         """(Virtual/Required) Saves user edits into your model.

         If your table is editable, this is called when the user commits his changes. Usually, these
@@ -489,7 +494,7 @@ class Row:
         """
         raise NotImplementedError()

-    def sort_key_for_column(self, column_name):
+    def sort_key_for_column(self, column_name: str) -> Any:
         """(Virtual) Return the value that is to be used to sort by column ``column_name``.

         By default, looks for an attribute with the same name as ``column_name``, but with an
@@ -502,7 +507,7 @@ class Row:
         return getattr(self, column_name)

     # --- Public
-    def can_edit_cell(self, column_name):
+    def can_edit_cell(self, column_name: str) -> bool:
         """Returns whether cell for column ``column_name`` can be edited.

         By default, the check is done in many steps:
@@ -532,7 +537,7 @@ class Row:
             return False
         return bool(getattr(prop, "fset", None))

-    def get_cell_value(self, attrname):
+    def get_cell_value(self, attrname: str) -> Any:
         """Get cell value for ``attrname``.

         By default, does a simple ``getattr()``, but it is used to allow subclasses to have
@@ -542,7 +547,7 @@ class Row:
             attrname = "from_"
         return getattr(self, attrname)

-    def set_cell_value(self, attrname, value):
+    def set_cell_value(self, attrname: str, value: Any) -> None:
         """Set cell value to ``value`` for ``attrname``.

         By default, does a simple ``setattr()``, but it is used to allow subclasses to have
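Following the virtual methods above, a concrete Row pulls model values into display attributes in load() and writes edits back in save(). A hedged sketch with a made-up Person model:

    from hscommon.gui.table import Row

    class Person:  # minimal stand-in model object
        def __init__(self, name):
            self.name = name

    class PersonRow(Row):  # illustrative Row subclass, not dupeGuru code
        def __init__(self, table, person):
            Row.__init__(self, table)
            self.person = person
            self.load()

        def load(self):  # model -> display attribute
            self.name = self.person.name

        def save(self):  # commit the user's edit back to the model
            self.person.name = self.name

        def sort_key_for_column(self, column_name):
            # case-insensitive sort for this example's string column
            return getattr(self, column_name).lower()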

Some files were not shown because too many files have changed in this diff.