mirror of https://github.com/arsenetar/dupeguru.git synced 2024-10-31 22:05:58 +00:00

Compare commits


124 Commits

Author SHA1 Message Date
Alexander Gee
8f197ea7e1
feat: Create longest and shortest path criteria (#1242)
* Create longest and shortest path criteria
2024-08-23 18:31:46 -05:00
3a97ba941a
ci: Merge artifacts
- Merge the resulting artifacts
- Use only the .so files from build
2024-05-11 01:21:58 -07:00
e3bcf9d686
chore: Update VS Code configuration 2024-05-11 00:12:19 -07:00
a81069be61
fix: Photo matching fixes
- Correct bad query introduced in rotation matching
- Promote get_orientation from "private" on photo class
- Fix prepare_pictures to only generate the needed blocks, add a check for missing blocks when rotation matching is true
- Fix cache test inputs to match schema
2024-05-11 00:11:27 -07:00
08154815d0
ci: Upgrade to latest actions & python versions 2024-02-19 14:39:04 -08:00
a95a9db08b
ci: Fix version for tx-push workflow 2024-02-19 14:22:55 -08:00
3d866cec9a
ci: Add action to push translation source to transifex 2024-02-19 14:17:40 -08:00
253dfd897c
chore: Pull existing language translations from Transifex 2024-02-19 13:35:35 -08:00
6e87f53f91
chore: Regenerate translation source files 2024-02-19 13:26:07 -08:00
95e04c4d82
ci: Update .pre-commit-config.yaml
Update .pre-commit-config.yaml to use the latest versions of black,
flake8, etc.
2024-02-19 11:35:39 -08:00
e3a612a704
fix: Correct change that broke black formatting 2024-02-19 10:43:39 -08:00
53d5ac06bf
fix: Linting Errors & VS Code config
- Add additional settings to VS Code for formatter changes in plugins
- Fix black formatting
- Fix flake8 errors due to long lines
- Fix flake8 errors due to type comparisons
2024-02-19 10:32:13 -08:00
13dd00c798
fix: Replace use of imp with importlib
Original PR and information found at #1187
2024-02-19 09:48:54 -08:00
Luca Falavigna
9f22835f73
Use errno.EISDIR and errno.EACCES instead of hardcoding values (#1196) 2024-02-19 09:38:24 -06:00
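The motivation behind this change: raw errno integers are platform-specific (the hardcoded values do not hold on GNU Hurd, which #1196 addresses), so comparisons should use the errno module's symbolic constants. A minimal sketch of the pattern; the function and handling below are illustrative, not the project's actual code:

```python
import errno
import os

def try_delete(path):
    try:
        os.remove(path)
    except OSError as e:
        if e.errno == errno.EISDIR:    # instead of the magic number 21
            os.rmdir(path)
        elif e.errno == errno.EACCES:  # instead of the magic number 13
            raise PermissionError(f"cannot delete {path}")
        else:
            raise
```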
Bruno Cabral
85a4557525
match all orientations (#1127)
* match all orientations

* use rotation as option

---------

Co-authored-by: Andrew Senetar <arsenetar@gmail.com>
Co-authored-by: Luke <byunghun.hyun26@gmail.com>
2024-02-19 09:19:33 -06:00
70d956b4f8
Merge pull request #1197 from dktrkranz/Hurd
Use isolation_level=None mode for GNU Hurd
2024-02-19 09:15:10 -06:00
Luca Falavigna
007404f46a Use isolation_level=None mode for GNU Hurd 2024-02-12 18:02:13 +01:00
4385b50825
Merge pull request #1143 from cebtenzzre/fix-tox
Update python versions in tox.ini
2023-12-01 22:08:42 -06:00
4ef1d24351
Merge pull request #1142 from cebtenzzre/fix-requirements
Update requirements.txt
2023-12-01 22:08:05 -06:00
03be82c0b0
Merge pull request #1144 from Cebtenzzre/fix-arch-package
Do not package send2trash on Arch Linux
2023-07-17 21:54:22 -05:00
Cebtenzzre
332b814c00 Do not package send2trash on Arch Linux
send2trash is provided by the python-send2trash package in the 'extra'
repository.
2023-07-16 19:04:11 -04:00
Cebtenzzre
f56bef67e1 Update python versions in tox.ini
Python 3.6 support was already dropped in commit b9dfeac2 ("Drop Python
3.6 Support"). Don't attempt to run tests with it if it is installed.

Python 3.11 is currently supported, so add it to the list.
2023-07-16 18:49:36 -04:00
Cebtenzzre
8160fe4fcc Update requirements.txt
sphinx 7.0.0 was released on April 29th.
2023-07-16 18:38:07 -04:00
9ad84ade29
Merge pull request #1130 from lukehyun/master
RE: Rewrote some of the Korean translation to be more understandable
2023-06-08 20:10:33 -05:00
18f32fda19
chore(translations): Synchronize translations 2023-06-08 19:14:57 -05:00
99ec4e0f27
fix: Minor cleanups and fixes
- Update NullJob to subclass Job
- Remove unnecessary size pre-read in _getMatches() as file sizes are
  already loaded during file scan via stat call
- Skip ref check if contents scan as the scan already prevents this from
  happening, some of the other scans do things differently and need to
  be reviewed before removing this post step completely
- Add guard on partial hashing to just hash the whole file if smaller
  than the offset and size and use the value for both the partial digest
  and digest
2023-06-08 01:14:52 -05:00
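A rough sketch of the partial-hashing guard described in the commit above. The offset/size constants and the md5 choice are illustrative; dupeGuru's real values and hasher live in core.fs:

```python
import hashlib
import os

PARTIAL_OFFSET = 0x4000  # illustrative constants
PARTIAL_SIZE = 0x4000

def compute_digests(path):
    size = os.path.getsize(path)
    with open(path, "rb") as fp:
        if size <= PARTIAL_OFFSET + PARTIAL_SIZE:
            # File is smaller than offset + size: hash the whole file once
            # and reuse the value as both the partial digest and the digest.
            digest = hashlib.md5(fp.read()).digest()
            return digest, digest
        fp.seek(PARTIAL_OFFSET)
        partial = hashlib.md5(fp.read(PARTIAL_SIZE)).digest()
        fp.seek(0)
        full = hashlib.md5(fp.read()).digest()
        return partial, full
```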
Luke
fe0e4bef91 RE: Rewrote some of the Korean translation to be more understandable
I have updated my fork and moved my changes from before.
2023-06-01 11:40:20 +10:00
322d29a996
Merge pull request #1121 from arsenetar/as/upgrade-deps
feat: Upgrade dependencies
2023-04-27 02:35:48 -05:00
c5a71f61b8
feat: Upgrade dependencies
Upgrade required dependency versions for most dependencies.  Add maximum
version to most dependencies as well.
2023-04-27 02:25:22 -05:00
10405ad063
fix(build): Clean prior qt/dg_rc.py file before build
Since calls without pyrcc5 may result in a broken file, clean the prior
qt/dg_rc.py file before calling pyrcc5.  This makes troubleshooting
pyrcc5 issues more straightforward.

Fix #1103
2023-04-27 01:36:32 -05:00
a257dbf0d5
fix(win): Shorten file description
Fix #1119
2023-04-27 01:22:06 -05:00
Daniel Chalmers
7a4506ece3
Github -> GitHub (#1115)
Co-authored-by: Andrew Senetar <arsenetar@gmail.com>
2023-04-27 00:54:39 -05:00
aade6593ac
feat: Update translations from transifex 2023-04-27 00:49:03 -05:00
6d8b86b7eb
fix(core): Remove old directory state logic
- Remove code forcing the exclusion of `.` directories by default, the
  new default exclusion filters do this by default
- Change default state code to always return a value
2023-02-27 17:58:15 -06:00
e41c91623c
Merge pull request #1049 from Dobatymo/colors-bytes
serialize/deserialize colors to/from bytes instead of strings
2023-01-26 21:24:20 -06:00
46521c8af1
feat: Add migration for picture cache db
- Add migration (just delete db and change to new schema) for picture
  cache following the same sort of strategy as the file digest cache
- Rename mtime column to mtime_ns to match file cache for consistency
2023-01-13 00:05:47 -06:00
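The "migration" here is deliberately blunt, mirroring the file digest cache strategy: if the stored schema version doesn't match, throw the database away and recreate it. A sketch under assumed names (the table layout and version constant are illustrative):

```python
import os
import sqlite3

SCHEMA_VERSION = 2  # illustrative

def connect_picture_cache(db_path):
    conn = sqlite3.connect(db_path)
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version != SCHEMA_VERSION:
        # Old schema: just delete the db and start over.
        conn.close()
        os.remove(db_path)
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE pictures (path TEXT PRIMARY KEY, mtime_ns INTEGER, blocks BLOB)")
        conn.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
    return conn
```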
549eb7f153
chore: Add vscode launch.json 2023-01-12 23:51:05 -06:00
8125e3ec97
chore: Add rulers to vscode settings, format 2023-01-12 23:30:35 -06:00
8c5e18b980
Merge remote-tracking branch 'upstream/master' into colors-bytes 2023-01-12 00:14:17 -06:00
d81759f77f
fix: Specify maximum python version for deb
Specify maximum supported python version so attempts to install are met
with better errors.
2023-01-11 23:53:02 -06:00
c57042fdd2
fix: Resolve issue with mock object for core test
Last change introduced a new method on the fs.File object that the test
object did not have. Add a similar method to the test object.
2023-01-11 23:20:40 -06:00
057be0294a
fix: Prevent exception during existence check
- Add "safe" existence check to files which catches OSErrors that may
  occur when trying to stat files
- Use "safe" existence check during final existence check
2023-01-11 23:07:06 -06:00
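The shape of that "safe" check, sketched; hscommon's actual helper may differ:

```python
from pathlib import Path

def exists_safe(path: Path) -> bool:
    # Path.exists() re-raises OSErrors other than ENOENT/ENOTDIR (for
    # example EACCES, or errors from stale network mounts); here we only
    # ever want a yes/no answer.
    try:
        return path.exists()
    except OSError:
        return False
```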
81daddd072
refactor: Improve digest cache db method performance
- Remove lock on read operations, only needed for write operations
- Change to use context manager for sqlite connection
- Remove long lived cursor object and use short lived cursors instead

Fixes #1080
2023-01-11 00:58:29 -06:00
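A sketch of the resulting access pattern (names and schema are illustrative): sqlite handles concurrent reads fine, so only writes take the lock; the connection is used as a context manager so each write is its own committed transaction; and cursors are created per call instead of kept around:

```python
import sqlite3
import threading

class DigestCache:
    def __init__(self, db_path):
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self.lock = threading.Lock()

    def get(self, path):
        # Read: no lock, short-lived cursor via execute().
        row = self.conn.execute(
            "SELECT digest FROM files WHERE path = ?", (str(path),)
        ).fetchone()
        return row[0] if row else None

    def put(self, path, digest):
        # Write: lock held only for the mutation; "with conn" commits.
        with self.lock, self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO files (path, digest) VALUES (?, ?)",
                (str(path), digest),
            )
```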
1e651a1603
Merge pull request #1089 from arsenetar/as/pre-commit
feat: Add pre-commit, include python 3.11 in tests
2023-01-09 23:18:13 -06:00
78f4145910
chore: Remove unused qtlib.pot file 2023-01-09 23:02:19 -06:00
46d1afb566
chore: Apply whitespace fixes from hooks
- Remove trailing whitespace
- Correct single newline at end of files (skip for json)
- Update to formatting in a few places due to black
2023-01-09 22:58:08 -06:00
a5e31f15f0
Merge pull request #1088 from arsenetar/as/remove-shelve
feat: Remove shelve picture cache
2023-01-09 22:48:37 -06:00
0cf6c9a1a2
ci: Update to include python 3.11 & pre-commit 2023-01-09 22:44:10 -06:00
6db2fa2be6
fix: Correct flake8 config
- Add exclude pattern for flake8 when running with pre-commit as it does
  not fully honor the exclude paths.
- Cleanup exclude paths for flake8 in tox.ini
- Re-enable line length check and correct three affected files
2023-01-09 22:35:12 -06:00
2dd2a801cc
feat: Add pre-commit and commitlint 2023-01-09 21:53:22 -06:00
83f5e80427
feat: Remove shelve picture cache
- Remove shelve picture cache as it has had a fair number of historical
  issues.  The original issue for which it was added should be long
  resolved.  Additionally, this allows further consolidation of the
  various cache code and potentially dbs in the future.
- Remove all related preferences and related code for changing cache
  backend between sqlite and shelve.
2023-01-06 00:35:23 -06:00
091cae0cc6
feat: Add confirmation dialog when canceling job
- Implement a confirmation dialog for cancellation of jobs, required
  changing from QProgressDialog to QDialog to keep cleaner.
- Update ui translation source file

Close #1033, #515
2023-01-06 00:06:55 -06:00
e30a135451
feat: Add additional scan time options
- Add option to include a file existence check at the end of a scan.
  Skipping it speeds up the end-of-scan operation considerably; however,
  if the user has removed or moved files since starting the scan, there
  could be later errors when interacting with results.  Defaults to the
  existing behavior of including the check, until it can be verified that
  later dialogs and actions handle non-existent items better.
- Add option to ignore differences in mtime when checking hash cache.
  Option is present in advanced tab of preferences.  Closes #1022.
- Regenerate pot files for translations
2023-01-05 23:01:16 -06:00
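The second option changes the cache-hit test roughly like this; the helper and column names are assumptions:

```python
def cached_digest(db, path, current_mtime_ns, ignore_mtime=False):
    row = db.get(path)  # hypothetical lookup returning (digest, mtime_ns)
    if row is None:
        return None
    digest, stored_mtime_ns = row
    if not ignore_mtime and stored_mtime_ns != current_mtime_ns:
        return None  # file changed since it was hashed: force a rehash
    return digest  # with rehash_ignore_mtime, trust the stored digest
```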
1db93fd142
Merge pull request #1069 from eugenesan/master
Add webp image format support
2022-12-06 05:50:36 -06:00
48862b6414
Merge pull request #1036 from dktrkranz/desktopfile
Add Keywords tag to desktop file
2022-12-06 05:48:50 -06:00
Eugene San (eugenesan)
c920412856 Add webp image format support 2022-11-24 13:53:27 -07:00
4448b999ab
fix: Add W503 to flake8 extend-ignore
For some reason flake8 is now throwing W503, which should be disabled by
default; adding it to extend-ignore fixes the issue, so doing that for now.
2022-09-28 07:05:46 -05:00
af1ae33598
Merge pull request #1042 from fascox/patch-1
Update core.po for `it`
2022-09-28 06:52:52 -05:00
265d10b261
Merge pull request #1026 from muath-ye/patch-1
Update columns.po for `ar`
2022-09-28 06:46:50 -05:00
Dobatymo
f1153c85c0 serialize/deserialize colors to/from bytes instead of strings
it's a tiny bit faster and saves a bit of memory
2022-09-27 17:47:38 +08:00
Fabio Scognamiglio
1eee3fd7e4
Update core.po
fix misspelled translation
2022-09-10 13:29:04 +02:00
Luca Falavigna
1827827fdf Add Keywords tag to desktop file 2022-08-31 14:57:16 +00:00
Muath Alsowadi
db174d4e63
Update columns.po 2022-08-07 09:32:33 +03:00
1f1dfa88dc
Update version & changelog for 4.3.1 release 2022-07-07 22:06:06 -05:00
916c5204cf
Update translations from transifex 2022-07-07 21:57:59 -05:00
71af825b37
Move try/except of cache db to get() and put()
- Move the try/except of cache db calls to the calls themselves.
- Add some additional information to logging statements on cache db
  exception to improve troubleshooting.
2022-07-07 21:52:22 -05:00
97f490b8b7
Fix typo in engine.py 2022-07-07 19:06:35 -05:00
d369bcddd7
Updates from investigation of #1015
- Add protection for empty hash digests in comparison of non-zero size
  files
- Bump version to 4.3.1-dev for identification
2022-07-07 19:00:09 -05:00
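The protection amounts to refusing to treat two empty (never-computed) digests as a match for non-zero-size files. Sketched with assumed attribute names:

```python
def digests_match(f1, f2) -> bool:
    if f1.size != f2.size:
        return False
    if f1.size > 0 and (not f1.digest or not f2.digest):
        # A missing digest on a non-empty file means "unknown",
        # not "equal to another missing digest".
        return False
    return f1.digest == f2.digest
```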
360dceca7b
Update to version 4.3.0, update changelog 2022-06-30 23:27:14 -05:00
92b27801c3
Update translations, remove iphoto_plist.py 2022-06-30 23:03:40 -05:00
Marcus Yanello
b9aabb8545
Redirect stdout from custom command to the log files (#1008)
Send the logs for the custom command subprocess to the logs
Closes #1007
2022-06-13 21:04:40 -05:00
d5eeab4a17
Additional type hints in hscommon 2022-05-11 00:50:34 -05:00
7865e4aeac
Type hinting hscommon & cleanup 2022-05-09 23:36:39 -05:00
58863b1728
Change to use a real temporary directory for test
app_test was not using a real temporary location originally
2022-05-09 01:46:42 -05:00
e382683f66
Replace all relative imports 2022-05-09 01:40:08 -05:00
f7ed1c801c
Add type hinting to desktop.py 2022-05-09 01:15:25 -05:00
f587c7b5d8
Removed unused code in hscommon/util
Also added type hints throughout
2022-05-09 00:47:57 -05:00
40ff40bea8
Move create_qsettings() out of preferences
- Load order was impacting translations
- Fix by moving create_qsettings() for now
2022-05-08 20:33:31 -05:00
7a44c72a0a
Complete removal of qtlib locale files 2022-05-08 19:52:25 -05:00
66aff9f74e
Update pot files
This "moves" the translation points from qtlib.pot to ui.pot.
Needs further updates to propagate across.
2022-05-08 19:28:37 -05:00
5451f55219
Move qtlib localization files to top level 2022-05-08 19:23:13 -05:00
36280b01e6
Finish moving all qtlib py files to qt 2022-05-08 19:22:08 -05:00
18359c3ea6
Start flattening Qtlib into qt
- Remove app.py from qtlib (unused)
- Remove .gitignore from qtlib (unnecessary)
- Move contents of preferences.py in qtlib to qt, clean up references
- Simplify language dropdown code
2022-05-08 18:51:10 -05:00
0a4e61edf5
Additional cleanup per mypy
- Add Callable type to hasher (should really be more specific...)
- Add type hint to COLUMNS in qtlib/table.py
- Use Qt.ItemFlag.ItemIsEnabled instead of Qt.itemIsEnabled in qtlib/table.py
2022-04-30 05:16:46 -05:00
d73a85b82e
Add type hints for compiled modules 2022-04-30 05:11:54 -05:00
81c593399e
Format changes with black 2022-04-27 20:59:20 -05:00
6a732a79a8
Remove old tx config 2022-04-27 20:58:30 -05:00
63dd4d4561
Apply pyupgrade changes 2022-04-27 20:53:12 -05:00
e0061d7bc1
Fix #989, typo in debian control file 2022-04-02 16:43:19 -05:00
c5818b1d1f
Add option to profile scans
- Add preference for profiling scans
- Move debug options to tab in preferences
- Add label with clickable link to debug output (appdata) to debug tab in preferences
- Update translation source files
2022-03-31 00:16:37 -05:00
a470a8de25
Update fs.py to optimize stat() calls
- Update to get size and mtime at time of class creation when os.DirEntry is used for initialization.
- Folders still calculate size later for folder scans.
- Ref #962, #959
2022-03-30 22:58:01 -05:00
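The gist of the optimization, sketched with illustrative attribute names: os.scandir() hands back os.DirEntry objects whose stat() results are cached from the directory listing on most platforms, so size and mtime can be captured at construction time without another stat() later:

```python
import os
from pathlib import Path

class FileEntry:
    def __init__(self, source):
        if isinstance(source, os.DirEntry):
            st = source.stat()  # usually served from scandir's cache
            self.path = Path(source.path)
            self.size = st.st_size
            self.mtime = st.st_mtime
        else:
            self.path = Path(source)
            self.size = None   # folders resolve their size later,
            self.mtime = None  # during the folder scan
```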
a37b5b0eeb
Fix #988 2022-03-30 01:06:51 -05:00
efd500ecc1
Update directory scanning to use os.scandir()
- Change to use os.scandir() instead of os.walk() to leverage DirEntry objects.
- Avoids extra calls to stat() on files during fs.can_handle()
- See 3x speed improvement on Windows in some cases
2022-03-29 23:37:56 -05:00
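For contrast with os.walk(), a minimal sketch of the os.scandir() pattern: each DirEntry answers is_dir() and stat() from data gathered during the directory read, avoiding the per-file stat() that fs.can_handle() used to trigger:

```python
import os

def iter_files(root):
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from iter_files(entry.path)  # recurse into subdirectory
            else:
                yield entry  # entry.stat() is cheap here
```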
43fcc52291
Replace pathlib.glob() with os.scandir() in fs.py 2022-03-29 22:35:38 -05:00
50f5db1543
Update fs to support DirEntry on get_file() 2022-03-29 22:32:36 -05:00
a5b0ccdd02
Improve performance of Directories.get_state() 2022-03-29 21:48:14 -05:00
143147cb8e
Remove Cocoa specific and other unused code 2022-03-28 00:47:46 -05:00
ebb81d9f03
Remove pathlib function added in Python 3.9 2022-03-28 00:06:32 -05:00
da9f8b2b9d
Squashed commit of the following:
commit 8b15fe9a502ebf4841c6529e7098cef03a6a5e6f
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 23:48:15 2022 -0500

    Finish up changes to copy_or_move

commit 21f6a32cf3186a400af8f30e67ad2743dc9a49bd
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Thu Mar 17 23:56:52 2022 -0500

    Migrate from hscommon.path to pathlib
    - Part one, this gets all hscommon and core tests passing
    - App appears to be able to load directories and complete scans, need further testing
    - app.py copy_or_move needs some additional work
2022-03-27 23:50:03 -05:00
5ed5eddde6
Add polib back to requirements.txt 2022-03-27 22:35:34 -05:00
9f40e4e786
Squashed commit of the following:
commit 5eb515f666bfa1ff06c2e96bdc351a4b7456580e
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sun Mar 27 22:19:39 2022 -0500

    Add fallback to md5 if xxhash not available

    Mainly here for the case when distributions have not packaged python3-xxhash.

commit 51b18d4c84
Author: Andrew Senetar <arsenetar@gmail.com>
Date:   Sat Mar 19 15:25:46 2022 -0500

    Switch file hashing to xxhash instead of md5

    - Improves performance significantly in some cases
    - Add xxhash to requirements.txt and sort requirements
    - Rename md5 based members to digest
    - Update all tests to use new member names and hashing methods
    - Update hash db code to upgrade schema

    NOTE: May consider supporting multiple hashing algorithms in the future.
2022-03-27 22:27:13 -05:00
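The fallback described in the first commit is essentially an import guard. Which xxhash variant dupeGuru picks is not shown here, so xxh128 below is an assumption:

```python
try:
    import xxhash
    hasher = xxhash.xxh128   # assumed variant; much faster than md5 in some cases
except ImportError:
    import hashlib
    hasher = hashlib.md5     # distributions without python3-xxhash still work

def file_digest(path):
    h = hasher()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(1024 * 1024), b""):
            h.update(chunk)
    return h.digest()
```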
86bf9b39d0
Add update check function and call from about
- Implement an update check against the GitHub releases API
- Add semantic-version dependency
- Add automatic check when opening about dialog
2022-03-27 21:13:27 -05:00
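A hedged sketch of such a check. The endpoint is GitHub's public releases API; the function name and tag handling are illustrative, not dupeGuru's exact code:

```python
import json
import urllib.request

import semantic_version  # the dependency added by this commit

LATEST_URL = "https://api.github.com/repos/arsenetar/dupeguru/releases/latest"

def update_available(current: str) -> bool:
    with urllib.request.urlopen(LATEST_URL, timeout=10) as resp:
        tag = json.load(resp)["tag_name"]
    latest = semantic_version.Version(tag.lstrip("v"))
    return latest > semantic_version.Version(current)
```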
c0be0aecbd
Minor documentation update 2022-03-27 21:04:37 -05:00
c408873d20
Update changelog 2022-03-25 23:37:46 -05:00
bbcdfbf698
Add vscode extension recommendation 2022-03-21 22:27:16 -05:00
8cee1a9467
Fix internal links in CONTRIBUTING.md 2022-03-21 22:19:58 -05:00
448d33dcb6
Add workflow yml validation settings
- Add yml validation to project for vscode
- Allow .vscode/settings.json
- Apply formatting to workflow files
2022-03-21 22:18:22 -05:00
8d414cadac
Add initial partial CONTRIBUTING.md
- Adopt a CONTRIBUTING.md format similar to that used by atom/atom.
- Add label section as replacement to wiki
- Add style guide section
- Setup basic document structure

TODO:
- Migrate some existing wiki information here where applicable.
- Migrate some existing help information here.
- Finish up remaining sections.
2022-03-21 22:04:45 -05:00
f902ee889a
Add configuration for isort to pyproject.toml 2022-03-21 00:25:36 -05:00
bc89e71935
Update .gitignore
- Pull from github/gitignore to cover some things better
- Organize remaining items
- Remove a few no longer relevant items
2022-03-20 23:25:01 -05:00
17b83c8001
Move polib to setup_requires instead of install_requires 2022-03-20 22:48:03 -05:00
0f845ee67a
Update min python version in Makefile 2022-03-20 01:23:01 -05:00
d40e32a143
Update transifex config & pull latest updates
- Update transifex configuration to new format
- Pull translation updates
2022-03-19 20:21:14 -05:00
1bc206e62d
Bump version to 4.2.1 2022-03-19 19:02:41 -05:00
106a0feaba
Add sponsor information 2022-03-19 17:46:12 -05:00
984e0c4094
Fix help path for local files and some help doc updates 2022-03-19 17:43:11 -05:00
9321e811d7
Enforce minimum Windows version ref #983 2022-03-19 17:01:54 -05:00
a64fcbfb5c
Fix deprecation warning from sqlite 2022-03-19 17:01:53 -05:00
cff07a12d6
Black formatter changes 2022-03-19 17:01:53 -05:00
Alfonso Montero
b9c7832c4a
Apply @arsenetar's proposed change to fix for errors on window change event. Solves #937. (#980) 2022-03-15 20:47:48 -05:00
b9dfeac2f3
Drop Python 3.6 Support 2022-03-15 05:10:41 -05:00
efc99eee96
Merge pull request #978 from glubsy/fix_zoom_scrollbar
Fix image viewer scrollbar zoom
2022-03-14 20:43:40 -05:00
glubsy
ff7733bb73 Fix image viewer
When zooming in or out, the computed value might be a float instead of
the int that QScrollBar expects for its setValue method. Simply casting
to int is enough here.
2022-03-12 22:36:17 +01:00
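The fix boils down to one cast; the surrounding zoom arithmetic below follows the common Qt image-viewer pattern and is illustrative:

```python
def adjust_scrollbar(scrollbar, factor: float) -> None:
    value = factor * scrollbar.value() + (factor - 1) * scrollbar.pageStep() / 2
    scrollbar.setValue(int(value))  # setValue() requires an int, not a float
```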
4b2fbe87ea
Default to English on unsupported system language Fix #976
- Add check for supported language to system locale detection
- Fall-back to English when not a supported locale
2022-03-12 04:36:13 -06:00
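The fallback logic is a small membership test; the supported-language set below is a made-up subset:

```python
from PyQt5.QtCore import QLocale

SUPPORTED_LANGUAGES = {"en", "fr", "de"}  # illustrative subset

def default_language() -> str:
    system_lang = QLocale.system().name()[:2]  # e.g. "pt" from "pt_BR"
    return system_lang if system_lang in SUPPORTED_LANGUAGES else "en"
```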
291 changed files with 8184 additions and 8611 deletions

.github/FUNDING.yml (new file)

@@ -0,0 +1,13 @@
# These are supported funding model platforms
github: arsenetar
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
lfx_crowdfunding: # Replace with a single LFX Crowdfunding project-name e.g., cloud-foundry
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']

(CodeQL workflow)

@@ -2,12 +2,12 @@ name: "CodeQL"
 on:
   push:
-    branches: [ master ]
+    branches: [master]
   pull_request:
     # The branches below must be a subset of the branches above
-    branches: [ master ]
+    branches: [master]
   schedule:
-    - cron: '24 20 * * 2'
+    - cron: "24 20 * * 2"

 jobs:
   analyze:
@@ -21,30 +21,30 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        language: [ 'cpp', 'python' ]
+        language: ["cpp", "python"]
     steps:
       - name: Checkout repository
         uses: actions/checkout@v2
       # Initializes the CodeQL tools for scanning.
       - name: Initialize CodeQL
         uses: github/codeql-action/init@v1
         with:
           languages: ${{ matrix.language }}
           # If you wish to specify custom queries, you can do so here or in a config file.
           # By default, queries listed here will override any specified in a config file.
           # Prefix the list here with "+" to use these queries and those in the config file.
           # queries: ./path/to/local/query, your-org/your-repo/queries@main
       - if: matrix.language == 'cpp'
         name: Build Cpp
         run: |
           sudo apt-get update
           sudo apt-get install python3-pyqt5
           make modules
       - if: matrix.language == 'python'
         name: Autobuild
         uses: github/codeql-action/autobuild@v1
       # Analysis
       - name: Perform CodeQL Analysis
         uses: github/codeql-action/analyze@v1

(Default CI/CD workflow)

@@ -4,81 +4,62 @@ name: Default CI/CD
 on:
   push:
-    branches: [ master ]
   pull_request:
-    branches: [ master ]
+    branches: [master]

 jobs:
-  lint:
+  pre-commit:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
-      - name: Set up Python 3.10
-        uses: actions/setup-python@v2
+      - uses: actions/checkout@v4
+      - name: Set up Python 3.12
+        uses: actions/setup-python@v5
         with:
-          python-version: '3.10'
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install -r requirements.txt -r requirements-extra.txt
-      - name: Lint with flake8
-        run: |
-          flake8 .
-  format:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v2
-      - name: Set up Python 3.10
-        uses: actions/setup-python@v2
-        with:
-          python-version: '3.10'
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install -r requirements.txt -r requirements-extra.txt
-      - name: Check format with black
-        run: |
-          black .
+          python-version: "3.12"
+      - uses: pre-commit/action@v3.0.1
   test:
-    needs: [lint, format]
+    needs: [pre-commit]
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        os: [ubuntu-latest, macos-latest, windows-latest]
-        python-version: [3.6, 3.7, 3.8, 3.9, '3.10']
-        exclude:
-          - os: macos-latest
-            python-version: 3.6
-          - os: macos-latest
-            python-version: 3.7
-          - os: macos-latest
-            python-version: 3.8
-          - os: windows-latest
-            python-version: 3.6
-          - os: windows-latest
-            python-version: 3.7
-          - os: windows-latest
-            python-version: 3.8
+        os: [ubuntu-latest]
+        python-version: [3.7, 3.8, 3.9, "3.10", "3.11", "3.12"]
+        include:
+          - os: windows-latest
+            python-version: "3.12"
+          - os: macos-latest
+            python-version: "3.12"
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v2
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
+          pip install setuptools
           pip install -r requirements.txt -r requirements-extra.txt
       - name: Build python modules
         run: |
           python build.py --modules
       - name: Run tests
         run: |
           pytest core hscommon
       - name: Upload Artifacts
         if: matrix.os == 'ubuntu-latest'
-        uses: actions/upload-artifact@v3
+        uses: actions/upload-artifact@v4
         with:
           name: modules ${{ matrix.python-version }}
-          path: ${{ github.workspace }}/**/*.so
+          path: build/**/*.so
+  merge-artifacts:
+    needs: [test]
+    runs-on: ubuntu-latest
+    steps:
+      - name: Merge Artifacts
+        uses: actions/upload-artifact/merge@v4
+        with:
+          name: modules
+          pattern: modules*
+          delete-merged: true

.github/workflows/tx-push.yml (new file)

@@ -0,0 +1,26 @@
# Push translation source to Transifex
name: Transifex Sync

on:
  push:
    branches:
      - master
    paths:
      - locale/*.pot

env:
  TX_VERSION: "v1.6.10"

jobs:
  push-source:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Get Transifex Client
        run: |
          curl -o- https://raw.githubusercontent.com/transifex/cli/master/install.sh | bash -s -- $TX_VERSION
      - name: Update & Push Translation Sources
        env:
          TX_TOKEN: ${{ secrets.TX_TOKEN }}
        run: |
          ./tx push -s --use-git-timestamps

.gitignore

@@ -1,30 +1,111 @@
-.DS_Store
-__pycache__
-*.egg-info
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
 *.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
 *.mo
-*.waf*
-.lock-waf*
-.tox
-/tags
-*.eggs
-build
-dist
-env*
-/deps
-cocoa/autogen
-/run.py
-/cocoa/*/Info.plist
-/cocoa/*/build
+#*.pot
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+
+# Environments
+.env
+.venv
+env*/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# macOS
+.DS_Store
+
+# Visual Studio Code
+.vscode/*
+!.vscode/settings.json
+!.vscode/tasks.json
+!.vscode/launch.json
+!.vscode/extensions.json
+!.vscode/*.code-snippets
+
+# Local History for Visual Studio Code
+.history/
+
+# Built Visual Studio Code Extensions
+*.vsix
+
+# dupeGuru Specific
 /qt/*_rc.py
 /help/*/conf.py
 /help/*/changelog.rst
-/transifex
-*.pyd
-*.exe
-*.spec
-.vscode
+cocoa/autogen
+/cocoa/*/Info.plist
+/cocoa/*/build
+
+*.waf*
+.lock-waf*
+/tags

.pre-commit-config.yaml (new file)

@@ -0,0 +1,24 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: check-yaml
      - id: check-toml
      - id: end-of-file-fixer
        exclude: ".*.json"
      - id: trailing-whitespace
  - repo: https://github.com/psf/black
    rev: 24.2.0
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/flake8
    rev: 7.0.0
    hooks:
      - id: flake8
        exclude: ^(.tox|env|build|dist|help|qt/dg_rc.py|pkg).*
  - repo: https://github.com/alessandrojcm/commitlint-pre-commit-hook
    rev: v9.11.0
    hooks:
      - id: commitlint
        stages: [commit-msg]
        additional_dependencies: ["@commitlint/config-conventional"]

sonar-project.properties

@@ -1 +1 @@
-sonar.python.version=3.6, 3.7, 3.8, 3.9, 3.10
+sonar.python.version=3.7, 3.8, 3.9, 3.10, 3.11

.tx/config

@@ -1,26 +1,20 @@
 [main]
 host = https://www.transifex.com

-[dupeguru-1.core]
-file_filter = locale/<lang>/LC_MESSAGES/core.po
-source_file = locale/core.pot
-source_lang = en
-type = PO
-
-[dupeguru-1.columns]
+[o:voltaicideas:p:dupeguru-1:r:columns]
 file_filter = locale/<lang>/LC_MESSAGES/columns.po
 source_file = locale/columns.pot
 source_lang = en
 type = PO

-[dupeguru-1.ui]
+[o:voltaicideas:p:dupeguru-1:r:core]
+file_filter = locale/<lang>/LC_MESSAGES/core.po
+source_file = locale/core.pot
+source_lang = en
+type = PO
+
+[o:voltaicideas:p:dupeguru-1:r:ui]
 file_filter = locale/<lang>/LC_MESSAGES/ui.po
 source_file = locale/ui.pot
 source_lang = en
 type = PO
-
-[dupeguru-1.qtlib]
-file_filter = qtlib/locale/<lang>/LC_MESSAGES/qtlib.po
-source_file = qtlib/locale/qtlib.pot
-source_lang = en
-type = PO

.vscode/extensions.json (new file)

@@ -0,0 +1,12 @@
{
  // List of extensions which should be recommended for users of this workspace.
  "recommendations": [
    "redhat.vscode-yaml",
    "ms-python.vscode-pylance",
    "ms-python.python",
    "ms-python.black-formatter",
  ],
  // List of extensions recommended by VS Code that should not be recommended for
  // users of this workspace.
  "unwantedRecommendations": []
}

.vscode/launch.json (new file)

@@ -0,0 +1,17 @@
{
  // Use IntelliSense to learn about possible attributes.
  // Hover to view descriptions of existing attributes.
  // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
  "version": "0.2.0",
  "configurations": [
    {
      "name": "DupuGuru",
      "type": "debugpy",
      "request": "launch",
      "program": "run.py",
      "console": "integratedTerminal",
      "subProcess": true,
      "justMyCode": false
    },
  ]
}

.vscode/settings.json (new file)

@@ -0,0 +1,17 @@
{
  "cSpell.words": [
    "Dupras",
    "hscommon"
  ],
  "editor.rulers": [
    88,
    120
  ],
  "python.languageServer": "Pylance",
  "yaml.schemaStore.enable": true,
  "[python]": {
    "editor.formatOnSave": true,
    "editor.defaultFormatter": "ms-python.black-formatter"
  },
  "python.testing.pytestEnabled": true
}

CONTRIBUTING.md (new file)

@@ -0,0 +1,88 @@
# Contributing to dupeGuru
The following is a set of guidelines and information for contributing to dupeGuru.
#### Table of Contents
[Things to Know Before Starting](#things-to-know-before-starting)
[Ways to Contribute](#ways-to-contribute)
* [Reporting Bugs](#reporting-bugs)
* [Suggesting Enhancements](#suggesting-enhancements)
* [Localization](#localization)
* [Code Contribution](#code-contribution)
* [Pull Requests](#pull-requests)
[Style Guides](#style-guides)
* [Git Commit Messages](#git-commit-messages)
* [Python Style Guide](#python-style-guide)
* [Documentation Style Guide](#documentation-style-guide)
[Additional Notes](#additional-notes)
* [Issue and Pull Request Labels](#issue-and-pull-request-labels)
## Things to Know Before Starting
**TODO**
## Ways to contribute
### Reporting Bugs
**TODO**
### Suggesting Enhancements
**TODO**
### Localization
**TODO**
### Code Contribution
**TODO**
### Pull Requests
Please follow these steps to have your contribution considered by the maintainers:
1. Keep Pull Request specific to one feature or bug.
2. Follow the [style guides](#style-guides)
3. After you submit your pull request, verify that all [status checks](https://help.github.com/articles/about-status-checks/) are passing <details><summary>What if the status checks are failing?</summary>If a status check is failing, and you believe that the failure is unrelated to your change, please leave a comment on the pull request explaining why you believe the failure is unrelated. A maintainer will re-run the status check for you. If we conclude that the failure was a false positive, then we will open an issue to track that problem with our status check suite.</details>
While the prerequisites above must be satisfied prior to having your pull request reviewed, the reviewer(s) may ask you to complete additional design work, tests, or other changes before your pull request can be ultimately accepted.
## Style Guides
### Git Commit Messages
- Use the present tense ("Add feature" not "Added feature")
- Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
- Limit the first line to 72 characters or less
- Reference issues and pull requests liberally after the first line
### Python Style Guide
- All files are formatted with [Black](https://github.com/psf/black)
- Follow [PEP 8](https://peps.python.org/pep-0008/) as much as practical
- Pass [flake8](https://flake8.pycqa.org/en/latest/) linting
- Include [PEP 484](https://peps.python.org/pep-0484/) type hints (new code)
### Documentation Style Guide
**TODO**
## Additional Notes
### Issue and Pull Request Labels
This section lists and describes the various labels used with issues and pull requests. Each of the labels is listed with a search link as well.
#### Issue Type and Status
| Label name | Search | Description |
|------------|--------|-------------|
| `enhancement` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aenhancement) | Feature requests and enhancements. |
| `bug` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Abug) | Bug reports. |
| `duplicate` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aduplicate) | Issue is a duplicate of existing issue. |
| `needs-reproduction` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aneeds-reproduction) | A bug that has not been able to be reproduced. |
| `needs-information` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aneeds-information) | More information needs to be collected about these problems or feature requests (e.g. steps to reproduce). |
| `blocked` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Ablocked) | Issue blocked by other issues. |
| `beginner` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Abeginner) | Less complex issues for users who want to start contributing. |
#### Category Labels
| Label name | Search | Description |
|------------|--------|-------------|
| `3rd party` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3A%223rd%20party%22) | Related to a 3rd party dependency. |
| `crash` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Acrash) | Related to crashes (complete, or unhandled). |
| `documentation` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Adocumentation) | Related to any documentation. |
| `linux` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Alinux) | Related to running on Linux. |
| `mac` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Amac) | Related to running on macOS. |
| `performance` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aperformance) | Related to the performance. |
| `ui` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Aui)| Related to the visual design. |
| `windows` | [search](https://github.com/arsenetar/dupeguru/issues?q=is%3Aopen+is%3Aissue+label%3Awindows) | Related to running on Windows. |
#### Pull Request Labels
None at this time; if the volume of pull requests increases, labels may be added to manage them.

(GPL license file)

@@ -619,4 +619,3 @@ Program, unless a warranty or assumption of liability accompanies a
 copy of the Program in return for a fee.

 END OF TERMS AND CONDITIONS
-

MANIFEST.in

@@ -3,4 +3,3 @@ recursive-include core *.m
 include run.py
 graft locale
 graft help
-graft qtlib/locale

Makefile

@@ -1,7 +1,7 @@
 PYTHON ?= python3
 PYTHON_VERSION_MINOR := $(shell ${PYTHON} -c "import sys; print(sys.version_info.minor)")
 PYRCC5 ?= pyrcc5
-REQ_MINOR_VERSION = 6
+REQ_MINOR_VERSION = 7
 PREFIX ?= /usr/local

 # Window compatability via Msys2
@@ -35,7 +35,7 @@ endif
 # Our build scripts are not very "make like" yet and perform their task in a bundle. For now, we
 # use one of each file to act as a representative, a target, of these groups.
-packages = hscommon qtlib core qt
+packages = hscommon core qt
 localedirs = $(wildcard locale/*/LC_MESSAGES)
 pofiles = $(wildcard locale/*/LC_MESSAGES/*.po)
 mofiles = $(patsubst %.po,%.mo,$(pofiles))

README.md

@@ -1,16 +1,12 @@
 # dupeGuru

 [dupeGuru][dupeguru] is a cross-platform (Linux, OS X, Windows) GUI tool to find duplicate files in
-a system. It is written mostly in Python 3 and has the peculiarity of using
-[multiple GUI toolkits][cross-toolkit], all using the same core Python code. On OS X, the UI layer
-is written in Objective-C and uses Cocoa. On Linux, it is written in Python and uses Qt5.
-
-The Cocoa UI of dupeGuru is hosted in a separate repo: https://github.com/arsenetar/dupeguru-cocoa
+a system. It is written mostly in Python 3 and uses [qt](https://www.qt.io/) for the UI.

 ## Current status
 Still looking for additional help especially with regards to:
-* OSX maintenance: reproducing bugs & cocoa version, building package with Cocoa UI.
-* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package.
+* OSX maintenance: reproducing bugs, packaging verification.
+* Linux maintenance: reproducing bugs, maintaining PPA repository, Debian package, rpm package.
 * Translations: updating missing strings, transifex project at https://www.transifex.com/voltaicideas/dupeguru-1
 * Documentation: keeping it up-to-date.
@@ -26,7 +22,6 @@ This folder contains the source for dupeGuru. Its documentation is in `help`, bu
 * help: Help document, written for Sphinx.
 * locale: .po files for localization.
 * hscommon: A collection of helpers used across HS applications.
-* qtlib: A collection of helpers used across Qt UI codebases of HS applications.

 ## How to build dupeGuru from source
@@ -36,19 +31,17 @@ For windows instructions see the [Windows Instructions](Windows.md).
 For macos instructions (qt version) see the [macOS Instructions](macos.md).

 ### Prerequisites
-* [Python 3.6+][python]
+* [Python 3.7+][python]
 * PyQt5

 ### System Setup
 When running in a linux based environment the following system packages or equivalents are needed to build:
 * python3-pyqt5
 * pyqt5-dev-tools (on some systems, see note)
-* python3-wheel (for hsaudiotag3k)
 * python3-venv (only if using a virtual environment)
 * python3-dev
 * build-essential

 Note: On some linux systems pyrcc5 is not put on the path when installing python3-pyqt5, this will cause some issues with the resource files (and icons). These systems should have a respective pyqt5-dev-tools package, which should also be installed. The presence of pyrcc5 can be checked with `which pyrcc5`. Debian based systems need the extra package, and Arch does not.

 To create packages the following are also needed:

Windows.md

@@ -2,7 +2,7 @@
 ### Prerequisites

-- [Python 3.6+][python]
+- [Python 3.7+][python]
 - [Visual Studio 2019][vs] or [Visual Studio Build Tools 2019][vsBuildTools] with the Windows 10 SDK
 - [nsis][nsis] (for installer creation)
 - [msys2][msys2] (for using makefile method)
@@ -16,7 +16,7 @@ After installing python it is recommended to update setuptools before compiling
 More details on setting up python for compiling packages on windows can be found on the [python wiki][pythonWindowsCompilers] Take note of the required vc++ versions.

 ### With build.py (preferred)
-To build with a different python version 3.6 vs 3.8 or 32 bit vs 64 bit specify that version instead of -3.8 to the `py` command below. If you want to build additional versions while keeping all virtual environments setup use a different location for each virtual environment.
+To build with a different python version 3.7 vs 3.8 or 32 bit vs 64 bit specify that version instead of -3.8 to the `py` command below. If you want to build additional versions while keeping all virtual environments setup use a different location for each virtual environment.

     $ cd <dupeGuru directory>
     $ py -3.8 -m venv .\env

build.py

@@ -61,7 +61,7 @@ def parse_args():

 def build_one_help(language):
-    print("Generating Help in {}".format(language))
+    print(f"Generating Help in {language}")
     current_path = Path(".").absolute()
     changelog_path = current_path.joinpath("help", "changelog")
     tixurl = "https://github.com/arsenetar/dupeguru/issues/{}"
@@ -88,14 +88,8 @@ def build_help():
     p.map(build_one_help, languages)

-def build_qt_localizations():
-    loc.compile_all_po(Path("qtlib", "locale"))
-    loc.merge_locale_dir(Path("qtlib", "locale"), "locale")
-
-
 def build_localizations():
     loc.compile_all_po("locale")
-    build_qt_localizations()
     locale_dest = Path("build", "locale")
     if locale_dest.exists():
         shutil.rmtree(locale_dest)
@@ -109,25 +103,16 @@ def build_updatepot():
     print("Building columns.pot")
     loc.generate_pot(["core"], Path("locale", "columns.pot"), ["coltr"])
     print("Building ui.pot")
-    # When we're not under OS X, we don't want to overwrite ui.pot because it contains Cocoa locs
-    # We want to merge the generated pot with the old pot in the most preserving way possible.
-    ui_packages = ["qt", Path("cocoa", "inter")]
-    loc.generate_pot(ui_packages, Path("locale", "ui.pot"), ["tr"], merge=True)
-    print("Building qtlib.pot")
-    loc.generate_pot(["qtlib"], Path("qtlib", "locale", "qtlib.pot"), ["tr"])
+    loc.generate_pot(["qt"], Path("locale", "ui.pot"), ["tr"], merge=True)

 def build_mergepot():
     print("Updating .po files using .pot files")
     loc.merge_pots_into_pos("locale")
-    loc.merge_pots_into_pos(Path("qtlib", "locale"))
-    # loc.merge_pots_into_pos(Path("cocoalib", "locale"))

 def build_normpo():
     loc.normalize_all_pos("locale")
-    loc.normalize_all_pos(Path("qtlib", "locale"))
-    # loc.normalize_all_pos(Path("cocoalib", "locale"))

 def build_pe_modules():
@@ -144,14 +129,15 @@ def build_normal():
     print("Building localizations")
     build_localizations()
     print("Building Qt stuff")
-    print_and_do("pyrcc5 {0} > {1}".format(Path("qt", "dg.qrc"), Path("qt", "dg_rc.py")))
+    Path("qt", "dg_rc.py").unlink(missing_ok=True)
+    print_and_do("pyrcc5 {} > {}".format(Path("qt", "dg.qrc"), Path("qt", "dg_rc.py")))
     fix_qt_resource_file(Path("qt", "dg_rc.py"))
     build_help()

 def main():
-    if sys.version_info < (3, 6):
-        sys.exit("Python < 3.6 is unsupported.")
+    if sys.version_info < (3, 7):
+        sys.exit("Python < 3.7 is unsupported.")
     options = parse_args()
     if options.clean and Path("build").exists():
         shutil.rmtree("build")

commitlint.config.js (new file)

@@ -0,0 +1,17 @@
const Configuration = {
  /*
   * Resolve and load @commitlint/config-conventional from node_modules.
   * Referenced packages must be installed
   */
  extends: ['@commitlint/config-conventional'],
  /*
   * Any rules defined here will override rules from @commitlint/config-conventional
   */
  rules: {
    'header-max-length': [2, 'always', 72],
    'subject-case': [2, 'always', 'sentence-case'],
    'scope-enum': [2, 'always'],
  },
};

module.exports = Configuration;

(version file)

@@ -1,2 +1,2 @@
-__version__ = "4.2.0"
+__version__ = "4.3.1"
 __appname__ = "dupeGuru"

View File

@ -4,37 +4,39 @@
# which should be included with this package. The terms are also available at # which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
import cProfile
import datetime
import os import os
import os.path as op import os.path as op
import logging import logging
import subprocess import subprocess
import re import re
import shutil import shutil
from pathlib import Path
from send2trash import send2trash from send2trash import send2trash
from hscommon.jobprogress import job from hscommon.jobprogress import job
from hscommon.notify import Broadcaster from hscommon.notify import Broadcaster
from hscommon.path import Path
from hscommon.conflict import smart_move, smart_copy from hscommon.conflict import smart_move, smart_copy
from hscommon.gui.progress_window import ProgressWindow from hscommon.gui.progress_window import ProgressWindow
from hscommon.util import delete_if_empty, first, escape, nonone, allsame from hscommon.util import delete_if_empty, first, escape, nonone, allsame
from hscommon.trans import tr from hscommon.trans import tr
from hscommon import desktop from hscommon import desktop
from . import se, me, pe from core import se, me, pe
from .pe.photo import get_delta_dimensions from core.pe.photo import get_delta_dimensions
from .util import cmp_value, fix_surrogate_encoding from core.util import cmp_value, fix_surrogate_encoding
from . import directories, results, export, fs, prioritize from core import directories, results, export, fs, prioritize
from .ignore import IgnoreList from core.ignore import IgnoreList
from .exclude import ExcludeDict as ExcludeList from core.exclude import ExcludeDict as ExcludeList
from .scanner import ScanType from core.scanner import ScanType
from .gui.deletion_options import DeletionOptions from core.gui.deletion_options import DeletionOptions
from .gui.details_panel import DetailsPanel from core.gui.details_panel import DetailsPanel
from .gui.directory_tree import DirectoryTree from core.gui.directory_tree import DirectoryTree
from .gui.ignore_list_dialog import IgnoreListDialog from core.gui.ignore_list_dialog import IgnoreListDialog
from .gui.exclude_list_dialog import ExcludeListDialogCore from core.gui.exclude_list_dialog import ExcludeListDialogCore
from .gui.problem_dialog import ProblemDialog from core.gui.problem_dialog import ProblemDialog
from .gui.stats_label import StatsLabel from core.gui.stats_label import StatsLabel
HAD_FIRST_LAUNCH_PREFERENCE = "HadFirstLaunch" HAD_FIRST_LAUNCH_PREFERENCE = "HadFirstLaunch"
DEBUG_MODE_PREFERENCE = "DebugMode" DEBUG_MODE_PREFERENCE = "DebugMode"
@ -124,15 +126,13 @@ class DupeGuru(Broadcaster):
NAME = PROMPT_NAME = "dupeGuru" NAME = PROMPT_NAME = "dupeGuru"
PICTURE_CACHE_TYPE = "sqlite" # set to 'shelve' for a ShelveCache
def __init__(self, view, portable=False): def __init__(self, view, portable=False):
if view.get_default(DEBUG_MODE_PREFERENCE): if view.get_default(DEBUG_MODE_PREFERENCE):
logging.getLogger().setLevel(logging.DEBUG) logging.getLogger().setLevel(logging.DEBUG)
logging.debug("Debug mode enabled") logging.debug("Debug mode enabled")
Broadcaster.__init__(self) Broadcaster.__init__(self)
self.view = view self.view = view
self.appdata = desktop.special_folder_path(desktop.SpecialFolder.APPDATA, appname=self.NAME, portable=portable) self.appdata = desktop.special_folder_path(desktop.SpecialFolder.APPDATA, portable=portable)
if not op.exists(self.appdata): if not op.exists(self.appdata):
os.makedirs(self.appdata) os.makedirs(self.appdata)
self.app_mode = AppMode.STANDARD self.app_mode = AppMode.STANDARD
@ -151,7 +151,8 @@ class DupeGuru(Broadcaster):
"clean_empty_dirs": False, "clean_empty_dirs": False,
"ignore_hardlink_matches": False, "ignore_hardlink_matches": False,
"copymove_dest_type": DestType.RELATIVE, "copymove_dest_type": DestType.RELATIVE,
"picture_cache_type": self.PICTURE_CACHE_TYPE, "include_exists_check": True,
"rehash_ignore_mtime": False,
} }
self.selected_dupes = [] self.selected_dupes = []
self.details_panel = DetailsPanel(self) self.details_panel = DetailsPanel(self)
@ -181,8 +182,7 @@ class DupeGuru(Broadcaster):
self.view.create_results_window() self.view.create_results_window()
def _get_picture_cache_path(self): def _get_picture_cache_path(self):
cache_type = self.options["picture_cache_type"] cache_name = "cached_pictures.db"
cache_name = "cached_pictures.shelve" if cache_type == "shelve" else "cached_pictures.db"
return op.join(self.appdata, cache_name) return op.join(self.appdata, cache_name)
def _get_dupe_sort_key(self, dupe, get_group, key, delta): def _get_dupe_sort_key(self, dupe, get_group, key, delta):
@ -248,7 +248,7 @@ class DupeGuru(Broadcaster):
ref = group.ref ref = group.ref
linkfunc = os.link if use_hardlinks else os.symlink linkfunc = os.link if use_hardlinks else os.symlink
linkfunc(str(ref.path), str_path) linkfunc(str(ref.path), str_path)
self.clean_empty_dirs(dupe.path.parent()) self.clean_empty_dirs(dupe.path.parent)
def _create_file(self, path): def _create_file(self, path):
# We add fs.Folder to fileclasses in case the file we're loading contains folder paths. # We add fs.Folder to fileclasses in case the file we're loading contains folder paths.
@ -262,7 +262,7 @@ class DupeGuru(Broadcaster):
try: try:
f._read_all_info(attrnames=self.METADATA_TO_READ) f._read_all_info(attrnames=self.METADATA_TO_READ)
return f return f
except EnvironmentError: except OSError:
return None return None
def _get_export_data(self): def _get_export_data(self):
@ -415,7 +415,7 @@ class DupeGuru(Broadcaster):
def clean_empty_dirs(self, path): def clean_empty_dirs(self, path):
if self.options["clean_empty_dirs"]: if self.options["clean_empty_dirs"]:
while delete_if_empty(path, [".DS_Store"]): while delete_if_empty(path, [".DS_Store"]):
path = path.parent() path = path.parent
def clear_picture_cache(self): def clear_picture_cache(self):
try: try:
@ -428,25 +428,25 @@ class DupeGuru(Broadcaster):
def copy_or_move(self, dupe, copy: bool, destination: str, dest_type: DestType): def copy_or_move(self, dupe, copy: bool, destination: str, dest_type: DestType):
source_path = dupe.path source_path = dupe.path
location_path = first(p for p in self.directories if dupe.path in p) location_path = first(p for p in self.directories if p in dupe.path.parents)
dest_path = Path(destination) dest_path = Path(destination)
if dest_type in {DestType.RELATIVE, DestType.ABSOLUTE}: if dest_type in {DestType.RELATIVE, DestType.ABSOLUTE}:
# no filename, no windows drive letter # no filename, no windows drive letter
source_base = source_path.remove_drive_letter().parent() source_base = source_path.relative_to(source_path.anchor).parent
if dest_type == DestType.RELATIVE: if dest_type == DestType.RELATIVE:
source_base = source_base[location_path:] source_base = source_base.relative_to(location_path.relative_to(location_path.anchor))
dest_path = dest_path[source_base] dest_path = dest_path.joinpath(source_base)
if not dest_path.exists(): if not dest_path.exists():
dest_path.makedirs() dest_path.mkdir(parents=True)
# Add filename to dest_path. For file move/copy, it's not required, but for folders, yes. # Add filename to dest_path. For file move/copy, it's not required, but for folders, yes.
dest_path = dest_path[source_path.name] dest_path = dest_path.joinpath(source_path.name)
logging.debug("Copy/Move operation from '%s' to '%s'", source_path, dest_path) logging.debug("Copy/Move operation from '%s' to '%s'", source_path, dest_path)
# Raises an EnvironmentError if there's a problem # Raises an EnvironmentError if there's a problem
if copy: if copy:
smart_copy(source_path, dest_path) smart_copy(source_path, dest_path)
else: else:
smart_move(source_path, dest_path) smart_move(source_path, dest_path)
self.clean_empty_dirs(source_path.parent()) self.clean_empty_dirs(source_path.parent)
def copy_or_move_marked(self, copy): def copy_or_move_marked(self, copy):
"""Start an async move (or copy) job on marked duplicates. """Start an async move (or copy) job on marked duplicates.
@ -553,9 +553,15 @@ class DupeGuru(Broadcaster):
# a workaround to make the damn thing work. # a workaround to make the damn thing work.
exepath, args = match.groups() exepath, args = match.groups()
path, exename = op.split(exepath) path, exename = op.split(exepath)
subprocess.Popen(exename + args, shell=True, cwd=path) p = subprocess.Popen(
exename + args, shell=True, cwd=path, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
)
output = p.stdout.read()
logging.info("Custom command %s %s: %s", exename, args, output)
else: else:
subprocess.Popen(dupe_cmd, shell=True) p = subprocess.Popen(dupe_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output = p.stdout.read()
logging.info("Custom command %s: %s", dupe_cmd, output)
def load(self): def load(self):
"""Load directory selection and ignore list from files in appdata. """Load directory selection and ignore list from files in appdata.
@ -780,12 +786,13 @@ class DupeGuru(Broadcaster):
         except OSError as e:
             self.view.show_message(tr("Couldn't write to file: {}").format(str(e)))

-    def start_scanning(self):
+    def start_scanning(self, profile_scan=False):
         """Starts an async job to scan for duplicates.

         Scans folders selected in :attr:`directories` and put the results in :attr:`results`
         """
         scanner = self.SCANNER_CLASS()
+        fs.filesdb.ignore_mtime = self.options["rehash_ignore_mtime"] is True
         if not self.directories.has_any_file():
             self.view.show_message(tr("The selected directories contain no scannable file."))
             return

@@ -800,6 +807,9 @@ class DupeGuru(Broadcaster):
         self._results_changed()

         def do(j):
+            if profile_scan:
+                pr = cProfile.Profile()
+                pr.enable()
             j.set_progress(0, tr("Collecting files to scan"))
             if scanner.scan_type == ScanType.FOLDERS:
                 files = list(self.directories.get_folders(folderclass=se.fs.Folder, j=j))

@@ -810,6 +820,9 @@ class DupeGuru(Broadcaster):
             logging.info("Scanning %d files" % len(files))
             self.results.groups = scanner.get_dupe_groups(files, self.ignore_list, j)
             self.discarded_file_count = scanner.discarded_file_count
+            if profile_scan:
+                pr.disable()
+                pr.dump_stats(op.join(self.appdata, f"{datetime.datetime.now():%Y-%m-%d_%H-%M-%S}.profile"))

         self._start_job(JobType.SCAN, do)
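
The profile_scan flag wraps the scan in cProfile and dumps a timestamped .profile file into appdata. A standalone sketch of the same pattern (the workload and file name here are illustrative):

import cProfile
import datetime

pr = cProfile.Profile()
pr.enable()
sum(i * i for i in range(100000))  # stand-in for the scan work
pr.disable()
pr.dump_stats(f"{datetime.datetime.now():%Y-%m-%d_%H-%M-%S}.profile")
# Inspect later with: python -m pstats <file>, then "sort cumtime" / "stats 10"
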


@@ -7,13 +7,13 @@
 import os
 from xml.etree import ElementTree as ET
 import logging
+from pathlib import Path

 from hscommon.jobprogress import job
-from hscommon.path import Path
 from hscommon.util import FileOrPath
 from hscommon.trans import tr

-from . import fs
+from core import fs

 __all__ = [
     "Directories",

@@ -63,7 +63,7 @@ class Directories:
     def __contains__(self, path):
         for p in self._dirs:
-            if path in p:
+            if path == p or p in path.parents:
                 return True
         return False
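
With pathlib, containment is now tested against Path.parents instead of hscommon's custom operator. A quick illustration (paths are made up):

from pathlib import Path

root = Path("/data/photos")
child = Path("/data/photos/2021/img.jpg")
print(root in child.parents)                    # True: child lives under root
print(child == root or root in child.parents)   # the exact test used above
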
@@ -84,64 +84,64 @@ class Directories:
         for denied_path_re in self._exclude_list.compiled:
             if denied_path_re.match(str(path.name)):
                 return DirectoryState.EXCLUDED
-        # return # We still use the old logic to force state on hidden dirs
+            return DirectoryState.NORMAL
         # Override this in subclasses to specify the state of some special folders.
         if path.name.startswith("."):
             return DirectoryState.EXCLUDED
+        return DirectoryState.NORMAL

     def _get_files(self, from_path, fileclasses, j):
-        for root, dirs, files in os.walk(str(from_path)):
-            j.check_if_cancelled()
-            root_path = Path(root)
-            state = self.get_state(root_path)
-            if state == DirectoryState.EXCLUDED and not any(p[: len(root_path)] == root_path for p in self.states):
-                # Recursively get files from folders with lots of subfolder is expensive. However, there
-                # might be a subfolder in this path that is not excluded. What we want to do is to skim
-                # through self.states and see if we must continue, or we can stop right here to save time
-                del dirs[:]
-            try:
-                if state != DirectoryState.EXCLUDED:
-                    # Old logic
-                    if self._exclude_list is None or not self._exclude_list.mark_count:
-                        found_files = [fs.get_file(root_path + f, fileclasses=fileclasses) for f in files]
-                    else:
-                        found_files = []
-                        # print(f"len of files: {len(files)} {files}")
-                        for f in files:
-                            if not self._exclude_list.is_excluded(root, f):
-                                found_files.append(fs.get_file(root_path + f, fileclasses=fileclasses))
-                    found_files = [f for f in found_files if f is not None]
-                    # In some cases, directories can be considered as files by dupeGuru, which is
-                    # why we have this line below. In fact, there only one case: Bundle files under
-                    # OS X... In other situations, this forloop will do nothing.
-                    for d in dirs[:]:
-                        f = fs.get_file(root_path + d, fileclasses=fileclasses)
-                        if f is not None:
-                            found_files.append(f)
-                            dirs.remove(d)
-                logging.debug(
-                    "Collected %d files in folder %s",
-                    len(found_files),
-                    str(root_path),
-                )
-                for file in found_files:
-                    file.is_ref = state == DirectoryState.REFERENCE
-                    yield file
-            except (EnvironmentError, fs.InvalidPath):
-                pass
+        try:
+            with os.scandir(from_path) as iter:
+                root_path = Path(from_path)
+                state = self.get_state(root_path)
+                # if we have no un-excluded dirs under this directory skip going deeper
+                skip_dirs = state == DirectoryState.EXCLUDED and not any(
+                    p.parts[: len(root_path.parts)] == root_path.parts for p in self.states
+                )
+                count = 0
+                for item in iter:
+                    j.check_if_cancelled()
+                    try:
+                        if item.is_dir():
+                            if skip_dirs:
+                                continue
+                            yield from self._get_files(item.path, fileclasses, j)
+                            continue
+                        elif state == DirectoryState.EXCLUDED:
+                            continue
+                        # File excluding or not
+                        if (
+                            self._exclude_list is None
+                            or not self._exclude_list.mark_count
+                            or not self._exclude_list.is_excluded(str(from_path), item.name)
+                        ):
+                            file = fs.get_file(item, fileclasses=fileclasses)
+                            if file:
+                                file.is_ref = state == DirectoryState.REFERENCE
+                                count += 1
+                                yield file
+                    except (OSError, fs.InvalidPath):
+                        pass
+                logging.debug(
+                    "Collected %d files in folder %s",
+                    count,
+                    str(root_path),
+                )
+        except OSError:
+            pass

     def _get_folders(self, from_folder, j):
         j.check_if_cancelled()
         try:
             for subfolder in from_folder.subfolders:
-                for folder in self._get_folders(subfolder, j):
-                    yield folder
+                yield from self._get_folders(subfolder, j)
             state = self.get_state(from_folder.path)
             if state != DirectoryState.EXCLUDED:
                 from_folder.is_ref = state == DirectoryState.REFERENCE
                 logging.debug("Yielding Folder %r state: %d", from_folder, state)
                 yield from_folder
-        except (EnvironmentError, fs.InvalidPath):
+        except (OSError, fs.InvalidPath):
             pass
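
The rewrite swaps os.walk for a recursive os.scandir traversal, which avoids materializing per-directory file lists and reuses the stat information cached on each DirEntry. A stripped-down sketch of the pattern, without the exclusion logic:

import os

def iter_files(path):
    # Recursively yield DirEntry objects for regular files under path.
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    yield from iter_files(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry
            except OSError:
                pass  # unreadable entries are simply skipped

for f in iter_files("."):
    print(f.path)
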
     # ---Public

@@ -159,7 +159,7 @@ class Directories:
             raise AlreadyThereError()
         if not path.exists():
             raise InvalidPathError()
-        self._dirs = [p for p in self._dirs if p not in path]
+        self._dirs = [p for p in self._dirs if path not in p.parents]
         self._dirs.append(path)

     @staticmethod
@@ -170,10 +170,10 @@ class Directories:
         :rtype: list of Path
         """
         try:
-            subpaths = [p for p in path.listdir() if p.isdir()]
+            subpaths = [p for p in path.glob("*") if p.is_dir()]
             subpaths.sort(key=lambda x: x.name.lower())
             return subpaths
-        except EnvironmentError:
+        except OSError:
             return []
     def get_files(self, fileclasses=None, j=job.nulljob):

@@ -187,7 +187,7 @@ class Directories:
         for path in self._dirs:
             for file in self._get_files(path, fileclasses=fileclasses, j=j):
                 file_count += 1
-                if type(j) != job.NullJob:
+                if not isinstance(j, job.NullJob):
                     j.set_progress(-1, tr("Collected {} files to scan").format(file_count))
                 yield file
@@ -203,7 +203,7 @@ class Directories:
             from_folder = folderclass(path)
             for folder in self._get_folders(from_folder, j):
                 folder_count += 1
-                if type(j) != job.NullJob:
+                if not isinstance(j, job.NullJob):
                     j.set_progress(-1, tr("Collected {} folders to scan").format(folder_count))
                 yield folder
@@ -215,19 +215,16 @@ class Directories:
         # direct match? easy result.
         if path in self.states:
             return self.states[path]
-        state = self._default_state_for_path(path) or DirectoryState.NORMAL
+        state = self._default_state_for_path(path)
         # Save non-default states in cache, necessary for _get_files()
         if state != DirectoryState.NORMAL:
             self.states[path] = state
             return state
-        prevlen = 0
-        # we loop through the states to find the longest matching prefix
-        # if the parent has a state in cache, return that state
-        for p, s in self.states.items():
-            if p.is_parent_of(path) and len(p) > prevlen:
-                prevlen = len(p)
-                state = s
+        # find the longest parent path that is in states and return that state if found
+        # NOTE: path.parents is ordered longest to shortest
+        for parent_path in path.parents:
+            if parent_path in self.states:
+                return self.states[parent_path]
         return state
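
Path.parents iterates from the immediate parent up to the filesystem root, so the first cache hit is automatically the longest (most specific) matching prefix, replacing the manual prevlen bookkeeping. For example, on POSIX:

from pathlib import Path

path = Path("/a/b/c/d")
print(list(path.parents))
# [PosixPath('/a/b/c'), PosixPath('/a/b'), PosixPath('/a'), PosixPath('/')]
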
     def has_any_file(self):

@@ -296,6 +293,6 @@ class Directories:
         if self.get_state(path) == state:
             return
         for iter_path in list(self.states.keys()):
-            if path.is_parent_of(iter_path):
+            if path in iter_path.parents:
                 del self.states[iter_path]
         self.states[path] = state


@@ -166,7 +166,7 @@ def reduce_common_words(word_dict, threshold):
     The exception to this removal are the objects where all the words of the object are common.
     Because if we remove them, we will miss some duplicates!
     """
-    uncommon_words = set(word for word, objects in word_dict.items() if len(objects) < threshold)
+    uncommon_words = {word for word, objects in word_dict.items() if len(objects) < threshold}
     for word, objects in list(word_dict.items()):
         if len(objects) < threshold:
             continue
@@ -283,7 +283,7 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
     """Returns a list of :class:`Match` within ``files`` if their contents is the same.

     :param bigsize: The size in bytes over which we consider files big enough to
-                    justify taking samples of md5. If 0, compute md5 as usual.
+                    justify taking samples of the file for hashing. If 0, compute digest as usual.
     :param j: A :ref:`job progress instance <jobs>`.
     """
     size2files = defaultdict(set)

@@ -300,15 +300,16 @@ def getmatches_by_contents(files, bigsize=0, j=job.nulljob):
             if first.is_ref and second.is_ref:
                 continue  # Don't spend time comparing two ref pics together.
             if first.size == 0 and second.size == 0:
-                # skip md5 for zero length files
+                # skip hashing for zero length files
                 result.append(Match(first, second, 100))
                 continue
-            if first.md5partial == second.md5partial:
+            # if digests are the same (and not None) then files match
+            if first.digest_partial is not None and first.digest_partial == second.digest_partial:
                 if bigsize > 0 and first.size > bigsize:
-                    if first.md5samples == second.md5samples:
+                    if first.digest_samples is not None and first.digest_samples == second.digest_samples:
                         result.append(Match(first, second, 100))
                 else:
-                    if first.md5 == second.md5:
+                    if first.digest is not None and first.digest == second.digest:
                         result.append(Match(first, second, 100))
             group_count += 1
             j.add_progress(desc=PROGRESS_MESSAGE % (len(result), group_count))
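
The content scan is a cascade: a cheap partial digest first, then either sampled digests (for files above bigsize) or one full digest. A self-contained sketch of that decision tree, with in-memory bytes standing in for files and MD5 standing in for whichever hasher is active:

import hashlib

CHUNK = 1024  # sample size for the sketch (the real code uses 1 MiB chunks)

def sample_digest(data):
    # three samples: 25% in, 60% in, and the tail -- mirrors the cascade above
    h = hashlib.md5()
    for off in (len(data) // 4, len(data) * 60 // 100, max(len(data) - CHUNK, 0)):
        h.update(data[off:off + CHUNK])
    return h.digest()

def files_match(a, b, bigsize):
    if hashlib.md5(a[0x4000:0x8000]).digest() != hashlib.md5(b[0x4000:0x8000]).digest():
        return False  # partial digest differs: cheap reject
    if bigsize and len(a) > bigsize:
        return sample_digest(a) == sample_digest(b)
    return hashlib.md5(a).digest() == hashlib.md5(b).digest()

print(files_match(b"x" * 100000, b"x" * 100000, bigsize=50000))  # True
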
@@ -409,7 +410,7 @@ class Group:
         You can call this after the duplicate scanning process to free a bit of memory.
         """
-        discarded = set(m for m in self.matches if not all(obj in self.unordered for obj in [m.first, m.second]))
+        discarded = {m for m in self.matches if not all(obj in self.unordered for obj in [m.first, m.second])}
         self.matches -= discarded
         self.candidates = defaultdict(set)
         return discarded
@@ -456,7 +457,7 @@ class Group:
             self._matches_for_ref = None
             if (len(self) > 1) and any(not getattr(item, "is_ref", False) for item in self):
                 if discard_matches:
-                    self.matches = set(m for m in self.matches if item not in m)
+                    self.matches = {m for m in self.matches if item not in m}
             else:
                 self._clear()
         except ValueError:
@@ -529,7 +530,7 @@ def get_groups(matches):
             del dupe2group
             del matches
             # should free enough memory to continue
-            logging.warning("Memory Overflow. Groups: {0}".format(len(groups)))
+            logging.warning(f"Memory Overflow. Groups: {len(groups)}")
     # Now that we have a group, we have to discard groups' matches and see if there're any "orphan"
     # matches, that is, matches that were candidate in a group but that none of their 2 files were
     # accepted in the group. With these orphan groups, it's safe to build additional groups


@@ -2,7 +2,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .markable import Markable
+from core.markable import Markable
 from xml.etree import ElementTree as ET

 # TODO: perhaps use regex module for better Unicode support? https://pypi.org/project/regex/


@@ -11,16 +11,28 @@
 # resulting needless complexity and memory usage. It's been a while since I wanted to do that fork,
 # and I'm doing it now.

-import hashlib
+import os
 from math import floor
 import logging
 import sqlite3
+from sys import platform
 from threading import Lock
-from typing import Any
+from typing import Any, AnyStr, Union, Callable

-from hscommon.path import Path
+from pathlib import Path
 from hscommon.util import nonone, get_file_ext

+hasher: Callable
+try:
+    import xxhash
+
+    hasher = xxhash.xxh128
+except ImportError:
+    import hashlib
+
+    hasher = hashlib.md5
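
The module now prefers xxHash, a fast non-cryptographic hash that is fine here because digests only compare file contents, and falls back to MD5 when the optional xxhash package is missing. Both backends expose the same hashlib-style interface, so the rest of the module is unchanged:

import hashlib

try:
    import xxhash  # optional dependency; may not be installed
    hasher = xxhash.xxh128
except ImportError:
    hasher = hashlib.md5

h = hasher(b"hello ")
h.update(b"world")
print(h.digest())  # a 16-byte digest with either backend
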
 __all__ = [
     "File",
     "Folder",

@@ -40,9 +52,12 @@ NOT_SET = object()
 # CPU.
 CHUNK_SIZE = 1024 * 1024  # 1 MiB

-# Minimum size below which partial hashes don't need to be computed
+# Minimum size below which partial hashing is not used
 MIN_FILE_SIZE = 3 * CHUNK_SIZE  # 3MiB, because we take 3 samples

+# Partial hashing offset and size
+PARTIAL_OFFSET_SIZE = (0x4000, 0x4000)
+

 class FSError(Exception):
     cls_message = "An error has occured on '{name}' in '{parent}'"
@@ -83,75 +98,101 @@ class OperationError(FSError):
 class FilesDB:
+    schema_version = 1
+    schema_version_description = "Changed from md5 to xxhash if available."

-    create_table_query = "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, entry_dt DATETIME, md5 BLOB, md5partial BLOB)"
-    drop_table_query = "DROP TABLE files;"
+    create_table_query = """CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER,
+    entry_dt DATETIME, digest BLOB, digest_partial BLOB, digest_samples BLOB)"""
+    drop_table_query = "DROP TABLE IF EXISTS files;"
     select_query = "SELECT {key} FROM files WHERE path=:path AND size=:size and mtime_ns=:mtime_ns"
+    select_query_ignore_mtime = "SELECT {key} FROM files WHERE path=:path AND size=:size"
     insert_query = """
-        INSERT INTO files (path, size, mtime_ns, entry_dt, {key}) VALUES (:path, :size, :mtime_ns, datetime('now'), :value)
+        INSERT INTO files (path, size, mtime_ns, entry_dt, {key})
+        VALUES (:path, :size, :mtime_ns, datetime('now'), :value)
         ON CONFLICT(path) DO UPDATE SET size=:size, mtime_ns=:mtime_ns, entry_dt=datetime('now'), {key}=:value;
     """

+    ignore_mtime = False
+
     def __init__(self):
         self.conn = None
-        self.cur = None
         self.lock = None

-    def connect(self, path):
-        # type: (str, ) -> None
-        self.conn = sqlite3.connect(path, check_same_thread=False)
-        self.cur = self.conn.cursor()
-        self.cur.execute(self.create_table_query)
+    def connect(self, path: Union[AnyStr, os.PathLike]) -> None:
+        if platform.startswith("gnu0"):
+            self.conn = sqlite3.connect(path, check_same_thread=False, isolation_level=None)
+        else:
+            self.conn = sqlite3.connect(path, check_same_thread=False)
         self.lock = Lock()
+        self._check_upgrade()

+    def _check_upgrade(self) -> None:
+        with self.lock, self.conn as conn:
+            has_schema = conn.execute(
+                "SELECT NAME FROM sqlite_master WHERE type='table' AND name='schema_version'"
+            ).fetchall()
+            version = None
+            if has_schema:
+                version = conn.execute("SELECT version FROM schema_version ORDER BY version DESC").fetchone()[0]
+            else:
+                conn.execute("CREATE TABLE schema_version (version int PRIMARY KEY, description TEXT)")
+            if version != self.schema_version:
+                conn.execute(self.drop_table_query)
+                conn.execute(
+                    "INSERT OR REPLACE INTO schema_version VALUES (:version, :description)",
+                    {"version": self.schema_version, "description": self.schema_version_description},
+                )
+                conn.execute(self.create_table_query)

-    def clear(self):
-        # type: () -> None
-        with self.lock:
-            self.cur.execute(self.drop_table_query)
-            self.cur.execute(self.create_table_query)
+    def clear(self) -> None:
+        with self.lock, self.conn as conn:
+            conn.execute(self.drop_table_query)
+            conn.execute(self.create_table_query)

-    def get(self, path, key):
-        # type: (Path, str) -> bytes
+    def get(self, path: Path, key: str) -> Union[bytes, None]:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns
-        with self.lock:
-            self.cur.execute(self.select_query.format(key=key), {"path": str(path), "size": size, "mtime_ns": mtime_ns})
-            result = self.cur.fetchone()
-        if result:
-            return result[0]
+        try:
+            with self.conn as conn:
+                if self.ignore_mtime:
+                    cursor = conn.execute(
+                        self.select_query_ignore_mtime.format(key=key), {"path": str(path), "size": size}
+                    )
+                else:
+                    cursor = conn.execute(
+                        self.select_query.format(key=key),
+                        {"path": str(path), "size": size, "mtime_ns": mtime_ns},
+                    )
+                result = cursor.fetchone()
+                cursor.close()
+                if result:
+                    return result[0]
+        except Exception as ex:
+            logging.warning(f"Couldn't get {key} for {path} w/{size}, {mtime_ns}: {ex}")
         return None

-    def put(self, path, key, value):
-        # type: (Path, str, Any) -> None
+    def put(self, path: Path, key: str, value: Any) -> None:
         stat = path.stat()
         size = stat.st_size
         mtime_ns = stat.st_mtime_ns
-        with self.lock:
-            self.cur.execute(
-                self.insert_query.format(key=key),
-                {"path": str(path), "size": size, "mtime_ns": mtime_ns, "value": value},
-            )
+        try:
+            with self.lock, self.conn as conn:
+                conn.execute(
+                    self.insert_query.format(key=key),
+                    {"path": str(path), "size": size, "mtime_ns": mtime_ns, "value": value},
+                )
+        except Exception as ex:
+            logging.warning(f"Couldn't put {key} for {path} w/{size}, {mtime_ns}: {ex}")

-    def commit(self):
-        # type: () -> None
+    def commit(self) -> None:
         with self.lock:
             self.conn.commit()

-    def close(self):
-        # type: () -> None
+    def close(self) -> None:
         with self.lock:
-            self.cur.close()
             self.conn.close()
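
Both caches now record their layout in a one-row schema_version table and rebuild on mismatch, so a cache written by an older dupeGuru is silently discarded instead of misread. A minimal standalone sketch of the pattern (table names here are illustrative):

import sqlite3

SCHEMA_VERSION = 1
conn = sqlite3.connect(":memory:")
has = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name='schema_version'"
).fetchall()
version = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()[0] if has else None
if version != SCHEMA_VERSION:
    conn.execute("DROP TABLE IF EXISTS files")  # stale layout: rebuild from scratch
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INT PRIMARY KEY)")
    conn.execute("INSERT OR REPLACE INTO schema_version VALUES (?)", (SCHEMA_VERSION,))
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, digest BLOB)")
print("schema at version", SCHEMA_VERSION)
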
@@ -161,19 +202,26 @@ filesdb = FilesDB()  # Singleton
 class File:
     """Represents a file and holds metadata to be used for scanning."""

-    INITIAL_INFO = {"size": 0, "mtime": 0, "md5": b"", "md5partial": b"", "md5samples": b""}
+    INITIAL_INFO = {"size": 0, "mtime": 0, "digest": b"", "digest_partial": b"", "digest_samples": b""}
     # Slots for File make us save quite a bit of memory. In a memory test I've made with a lot of
     # files, I saved 35% memory usage with "unread" files (no _read_info() call) and gains become
     # even greater when we take into account read attributes (70%!). Yeah, it's worth it.
-    __slots__ = ("path", "is_ref", "words") + tuple(INITIAL_INFO.keys())
+    __slots__ = ("path", "unicode_path", "is_ref", "words") + tuple(INITIAL_INFO.keys())

     def __init__(self, path):
-        self.path = path
         for attrname in self.INITIAL_INFO:
             setattr(self, attrname, NOT_SET)
+        if type(path) is os.DirEntry:
+            self.path = Path(path.path)
+            self.size = nonone(path.stat().st_size, 0)
+            self.mtime = nonone(path.stat().st_mtime, 0)
+        else:
+            self.path = path
+        if self.path:
+            self.unicode_path = str(self.path)

     def __repr__(self):
-        return "<{} {}>".format(self.__class__.__name__, str(self.path))
+        return f"<{self.__class__.__name__} {str(self.path)}>"

     def __getattribute__(self, attrname):
         result = object.__getattribute__(self, attrname)
@@ -187,32 +235,46 @@ class File:
             result = self.INITIAL_INFO[attrname]
         return result

-    def _calc_md5(self):
+    def _calc_digest(self):
         # type: () -> bytes
         with self.path.open("rb") as fp:
-            md5 = hashlib.md5()
+            file_hash = hasher()
             # The goal here is to not run out of memory on really big files. However, the chunk
             # size has to be large enough so that the python loop isn't too costly in terms of
             # CPU.
             CHUNK_SIZE = 1024 * 1024  # 1 mb
             filedata = fp.read(CHUNK_SIZE)
             while filedata:
-                md5.update(filedata)
+                file_hash.update(filedata)
                 filedata = fp.read(CHUNK_SIZE)
-            return md5.digest()
+            return file_hash.digest()

-    def _calc_md5partial(self):
+    def _calc_digest_partial(self):
         # type: () -> bytes
-        # This offset is where we should start reading the file to get a partial md5
-        # For audio file, it should be where audio data starts
-        offset, size = (0x4000, 0x4000)
         with self.path.open("rb") as fp:
-            fp.seek(offset)
-            partialdata = fp.read(size)
-        return hashlib.md5(partialdata).digest()
+            fp.seek(PARTIAL_OFFSET_SIZE[0])
+            partial_data = fp.read(PARTIAL_OFFSET_SIZE[1])
+        return hasher(partial_data).digest()

+    def _calc_digest_samples(self) -> bytes:
+        size = self.size
+        with self.path.open("rb") as fp:
+            # Chunk at 25% of the file
+            fp.seek(floor(size * 25 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash = hasher(file_data)
+
+            # Chunk at 60% of the file
+            fp.seek(floor(size * 60 / 100), 0)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+
+            # Last chunk of the file
+            fp.seek(-CHUNK_SIZE, 2)
+            file_data = fp.read(CHUNK_SIZE)
+            file_hash.update(file_data)
+        return file_hash.digest()
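
The three sample points (25% in, 60% in, and the final chunk via fp.seek(-CHUNK_SIZE, 2), i.e. seeking from the end) mean a multi-gigabyte file is fingerprinted by reading only 3 MiB. A toy illustration of the offsets:

from math import floor

CHUNK_SIZE = 1024 * 1024
size = 3 * 1024**3  # pretend 3 GiB file
offsets = [floor(size * 25 / 100), floor(size * 60 / 100), size - CHUNK_SIZE]
print([f"{o / size:.0%}" for o in offsets])   # ['25%', '60%', '100%'] (roughly)
print(f"bytes read: {3 * CHUNK_SIZE} of {size}")
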
     def _read_info(self, field):
         # print(f"_read_info({field}) for {self}")

@@ -220,48 +282,30 @@ class File:
             stats = self.path.stat()
             self.size = nonone(stats.st_size, 0)
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field == "md5partial":
-            try:
-                self.md5partial = filesdb.get(self.path, "md5partial")
-                if self.md5partial is None:
-                    self.md5partial = self._calc_md5partial()
-                    filesdb.put(self.path, "md5partial", self.md5partial)
-            except Exception as e:
-                logging.warning("Couldn't get md5partial for %s: %s", self.path, e)
-        elif field == "md5":
-            try:
-                self.md5 = filesdb.get(self.path, "md5")
-                if self.md5 is None:
-                    self.md5 = self._calc_md5()
-                    filesdb.put(self.path, "md5", self.md5)
-            except Exception as e:
-                logging.warning("Couldn't get md5 for %s: %s", self.path, e)
-        elif field == "md5samples":
-            try:
-                with self.path.open("rb") as fp:
-                    size = self.size
-                    # Might as well hash such small files entirely.
-                    if size <= MIN_FILE_SIZE:
-                        setattr(self, field, self.md5)
-                        return
-                    # Chunk at 25% of the file
-                    fp.seek(floor(size * 25 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5 = hashlib.md5(filedata)
-                    # Chunk at 60% of the file
-                    fp.seek(floor(size * 60 / 100), 0)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-                    # Last chunk of the file
-                    fp.seek(-CHUNK_SIZE, 2)
-                    filedata = fp.read(CHUNK_SIZE)
-                    md5.update(filedata)
-                    setattr(self, field, md5.digest())
-            except Exception as e:
-                logging.error(f"Error computing md5samples: {e}")
+        elif field == "digest_partial":
+            self.digest_partial = filesdb.get(self.path, "digest_partial")
+            if self.digest_partial is None:
+                # If file is smaller than partial requirements just use the full digest
+                if self.size < PARTIAL_OFFSET_SIZE[0] + PARTIAL_OFFSET_SIZE[1]:
+                    self.digest_partial = self.digest
+                else:
+                    self.digest_partial = self._calc_digest_partial()
+                filesdb.put(self.path, "digest_partial", self.digest_partial)
+        elif field == "digest":
+            self.digest = filesdb.get(self.path, "digest")
+            if self.digest is None:
+                self.digest = self._calc_digest()
+                filesdb.put(self.path, "digest", self.digest)
+        elif field == "digest_samples":
+            size = self.size
+            # Might as well hash such small files entirely.
+            if size <= MIN_FILE_SIZE:
+                self.digest_samples = self.digest
+                return
+            self.digest_samples = filesdb.get(self.path, "digest_samples")
+            if self.digest_samples is None:
+                self.digest_samples = self._calc_digest_samples()
+                filesdb.put(self.path, "digest_samples", self.digest_samples)
     def _read_all_info(self, attrnames=None):
         """Cache all possible info.

@@ -277,17 +321,25 @@ class File:
     @classmethod
     def can_handle(cls, path):
         """Returns whether this file wrapper class can handle ``path``."""
-        return not path.islink() and path.isfile()
+        return not path.is_symlink() and path.is_file()
+
+    def exists(self) -> bool:
+        """Safely check if the underlying file exists, treat error as non-existent"""
+        try:
+            return self.path.exists()
+        except OSError as ex:
+            logging.warning(f"Checking {self.path} raised: {ex}")
+            return False

     def rename(self, newname):
         if newname == self.name:
             return
-        destpath = self.path.parent()[newname]
+        destpath = self.path.parent.joinpath(newname)
         if destpath.exists():
-            raise AlreadyExistsError(newname, self.path.parent())
+            raise AlreadyExistsError(newname, self.path.parent)
         try:
             self.path.rename(destpath)
-        except EnvironmentError:
+        except OSError:
             raise OperationError(self)
         if not destpath.exists():
             raise OperationError(self)
@@ -308,19 +360,20 @@ class File:
     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent


 class Folder(File):
     """A wrapper around a folder path.

-    It has the size/md5 info of a File, but its value is the sum of its subitems.
+    It has the size/digest info of a File, but its value is the sum of its subitems.
     """

     __slots__ = File.__slots__ + ("_subfolders",)

     def __init__(self, path):
         File.__init__(self, path)
+        self.size = NOT_SET
         self._subfolders = None

     def _all_items(self):

@@ -335,31 +388,31 @@ class Folder(File):
             self.size = size
             stats = self.path.stat()
             self.mtime = nonone(stats.st_mtime, 0)
-        elif field in {"md5", "md5partial", "md5samples"}:
+        elif field in {"digest", "digest_partial", "digest_samples"}:
             # What's sensitive here is that we must make sure that subfiles'
-            # md5 are always added up in the same order, but we also want a
-            # different md5 if a file gets moved in a different subdirectory.
+            # digest are always added up in the same order, but we also want a
+            # different digest if a file gets moved in a different subdirectory.

-            def get_dir_md5_concat():
+            def get_dir_digest_concat():
                 items = self._all_items()
                 items.sort(key=lambda f: f.path)
-                md5s = [getattr(f, field) for f in items]
-                return b"".join(md5s)
+                digests = [getattr(f, field) for f in items]
+                return b"".join(digests)

-            md5 = hashlib.md5(get_dir_md5_concat())
-            digest = md5.digest()
+            digest = hasher(get_dir_digest_concat()).digest()
             setattr(self, field, digest)

     @property
     def subfolders(self):
         if self._subfolders is None:
-            subfolders = [p for p in self.path.listdir() if not p.islink() and p.isdir()]
+            with os.scandir(self.path) as iter:
+                subfolders = [p for p in iter if not p.is_symlink() and p.is_dir()]
             self._subfolders = [self.__class__(p) for p in subfolders]
         return self._subfolders

     @classmethod
     def can_handle(cls, path):
-        return not path.islink() and path.isdir()
+        return not path.is_symlink() and path.is_dir()
 def get_file(path, fileclasses=[File]):

@@ -384,10 +437,11 @@ def get_files(path, fileclasses=[File]):
     assert all(issubclass(fileclass, File) for fileclass in fileclasses)
     try:
         result = []
-        for path in path.listdir():
-            file = get_file(path, fileclasses=fileclasses)
-            if file is not None:
-                result.append(file)
+        with os.scandir(path) as iter:
+            for item in iter:
+                file = get_file(item, fileclasses=fileclasses)
+                if file is not None:
+                    result.append(file)
         return result
-    except EnvironmentError:
+    except OSError:
         raise InvalidPath(path)


@@ -7,7 +7,7 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.gui.base import GUIObject
-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject


 class DetailsPanel(GUIObject, DupeGuruGUIObject):


@@ -8,8 +8,8 @@
 from hscommon.gui.tree import Tree, Node

-from ..directories import DirectoryState
-from .base import DupeGuruGUIObject
+from core.directories import DirectoryState
+from core.gui.base import DupeGuruGUIObject

 STATE_ORDER = [DirectoryState.NORMAL, DirectoryState.REFERENCE, DirectoryState.EXCLUDED]


@@ -5,7 +5,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .exclude_list_table import ExcludeListTable
+from core.gui.exclude_list_table import ExcludeListTable
 from core.exclude import has_sep
 from os import sep
 import logging


@@ -2,7 +2,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject
 from hscommon.gui.table import GUITable, Row
 from hscommon.gui.column import Column, Columns
 from hscommon.trans import trget


@@ -6,7 +6,7 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.trans import tr
-from .ignore_list_table import IgnoreListTable
+from core.gui.ignore_list_table import IgnoreListTable


 class IgnoreListDialog:


@@ -8,7 +8,7 @@
 from hscommon import desktop

-from .problem_table import ProblemTable
+from core.gui.problem_table import ProblemTable


 class ProblemDialog:


@@ -11,7 +11,7 @@ from operator import attrgetter
 from hscommon.gui.table import GUITable, Row
 from hscommon.gui.column import Columns

-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject


 class DupeRow(Row):


@@ -6,7 +6,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .base import DupeGuruGUIObject
+from core.gui.base import DupeGuruGUIObject


 class StatsLabel(DupeGuruGUIObject):


@@ -1 +1 @@
-from . import fs, prioritize, result_table, scanner  # noqa
+from core.me import fs, prioritize, result_table, scanner  # noqa


@@ -97,11 +97,6 @@ class MusicFile(fs.File):
             "dupe_count": format_dupe_count(dupe_count),
         }

-    def _get_md5partial_offset_and_size(self):
-        # No longer calculating the offset and audio size, just whole file
-        size = self.path.stat().st_size
-        return (0, size)
-
     def _read_info(self, field):
         fs.File._read_info(self, field)
         if field in TAG_FIELDS:


@@ -1,8 +1,7 @@
-from . import (  # noqa
+from core.pe import (  # noqa
     block,
     cache,
     exif,
-    iphoto_plist,
     matchblock,
     matchexif,
     photo,


@@ -6,7 +6,7 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from ._block import NoBlocksError, DifferentBlockCountError, avgdiff, getblocks2  # NOQA
+from core.pe._block import NoBlocksError, DifferentBlockCountError, avgdiff, getblocks2  # NOQA

 # Converted to C
 # def getblock(image):

core/pe/block.pyi (new file)

@@ -0,0 +1,13 @@
+from typing import Tuple, List, Union, Sequence
+
+_block = Tuple[int, int, int]
+
+class NoBlocksError(Exception): ...  # noqa: E302, E701
+class DifferentBlockCountError(Exception): ...  # noqa E701
+
+def getblock(image: object) -> Union[_block, None]: ...  # noqa: E302
+def getblocks2(image: object, block_count_per_side: int) -> Union[List[_block], None]: ...
+def diff(first: _block, second: _block) -> int: ...
+def avgdiff(  # noqa: E302
+    first: Sequence[_block], second: Sequence[_block], limit: int = 768, min_iterations: int = 1
+) -> Union[int, None]: ...


@@ -4,24 +4,13 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from ._cache import string_to_colors  # noqa
+from core.pe._cache import bytes_to_colors  # noqa


-def colors_to_string(colors):
-    """Transform the 3 sized tuples 'colors' into a hex string.
+def colors_to_bytes(colors):
+    """Transform the 3 sized tuples 'colors' into a bytes string.

-    [(0,100,255)] --> 0064ff
-    [(1,2,3),(4,5,6)] --> 010203040506
+    [(0,100,255)] --> b'\x00d\xff'
+    [(1,2,3),(4,5,6)] --> b'\x01\x02\x03\x04\x05\x06'
     """
-    return "".join("%02x%02x%02x" % (r, g, b) for r, g, b in colors)
-
-
-# This function is an important bottleneck of dupeGuru PE. It has been converted to C.
-# def string_to_colors(s):
-#     """Transform the string 's' in a list of 3 sized tuples.
-#     """
-#     result = []
-#     for i in xrange(0, len(s), 6):
-#         number = int(s[i:i+6], 16)
-#         result.append((number >> 16, (number >> 8) & 0xff, number & 0xff))
-#     return result
+    return b"".join(map(bytes, colors))
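
Storing raw bytes instead of hex text halves the cache footprint: 3 bytes per color instead of 6 hex characters. map(bytes, colors) works because bytes((r, g, b)) packs a 3-tuple of small ints into 3 raw bytes. A quick round trip, with a pure-Python stand-in for the C bytes_to_colors helper:

def colors_to_bytes(colors):
    return b"".join(map(bytes, colors))

def bytes_to_colors(s):
    # illustration only; the real decoder lives in the C _cache module
    return [tuple(s[i:i + 3]) for i in range(0, len(s), 3)]

packed = colors_to_bytes([(0, 100, 255), (1, 2, 3)])
print(packed)                   # b'\x00d\xff\x01\x02\x03'
print(bytes_to_colors(packed))  # [(0, 100, 255), (1, 2, 3)]
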

core/pe/cache.pyi (new file)

@@ -0,0 +1,6 @@
+from typing import Union, Tuple, List
+
+_block = Tuple[int, int, int]
+
+def colors_to_bytes(colors: List[_block]) -> bytes: ...  # noqa: E302
+def bytes_to_colors(s: bytes) -> Union[List[_block], None]: ...


@@ -1,141 +0,0 @@ core/pe/cache_shelve.py (file deleted)
# Copyright 2016 Virgil Dupras
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import os
import os.path as op
import shelve
import tempfile
from collections import namedtuple
from .cache import string_to_colors, colors_to_string
def wrap_path(path):
return "path:{}".format(path)
def unwrap_path(key):
return key[5:]
def wrap_id(path):
return "id:{}".format(path)
def unwrap_id(key):
return int(key[3:])
CacheRow = namedtuple("CacheRow", "id path blocks mtime")
class ShelveCache:
"""A class to cache picture blocks in a shelve backend."""
def __init__(self, db=None, readonly=False):
self.istmp = db is None
if self.istmp:
self.dtmp = tempfile.mkdtemp()
self.ftmp = db = op.join(self.dtmp, "tmpdb")
flag = "r" if readonly else "c"
self.shelve = shelve.open(db, flag)
self.maxid = self._compute_maxid()
def __contains__(self, key):
return wrap_path(key) in self.shelve
def __delitem__(self, key):
row = self.shelve[wrap_path(key)]
del self.shelve[wrap_path(key)]
del self.shelve[wrap_id(row.id)]
def __getitem__(self, key):
if isinstance(key, int):
skey = self.shelve[wrap_id(key)]
else:
skey = wrap_path(key)
return string_to_colors(self.shelve[skey].blocks)
def __iter__(self):
return (unwrap_path(k) for k in self.shelve if k.startswith("path:"))
def __len__(self):
return sum(1 for k in self.shelve if k.startswith("path:"))
def __setitem__(self, path_str, blocks):
blocks = colors_to_string(blocks)
if op.exists(path_str):
mtime = int(os.stat(path_str).st_mtime)
else:
mtime = 0
if path_str in self:
rowid = self.shelve[wrap_path(path_str)].id
else:
rowid = self._get_new_id()
row = CacheRow(rowid, path_str, blocks, mtime)
self.shelve[wrap_path(path_str)] = row
self.shelve[wrap_id(rowid)] = wrap_path(path_str)
def _compute_maxid(self):
return max((unwrap_id(k) for k in self.shelve if k.startswith("id:")), default=1)
def _get_new_id(self):
self.maxid += 1
return self.maxid
def clear(self):
self.shelve.clear()
def close(self):
if self.shelve is not None:
self.shelve.close()
if self.istmp:
os.remove(self.ftmp)
os.rmdir(self.dtmp)
self.shelve = None
def filter(self, func):
to_delete = [key for key in self if not func(key)]
for key in to_delete:
del self[key]
def get_id(self, path):
if path in self:
return self.shelve[wrap_path(path)].id
else:
raise ValueError(path)
def get_multiple(self, rowids):
for rowid in rowids:
try:
skey = self.shelve[wrap_id(rowid)]
except KeyError:
continue
yield (rowid, string_to_colors(self.shelve[skey].blocks))
def purge_outdated(self):
"""Go through the cache and purge outdated records.
A record is outdated if the picture doesn't exist or if its mtime is greater than the one in
the db.
"""
todelete = []
for path in self:
row = self.shelve[wrap_path(path)]
if row.mtime and op.exists(path):
picture_mtime = os.stat(path).st_mtime
if int(picture_mtime) <= row.mtime:
# not outdated
continue
todelete.append(path)
for path in todelete:
try:
del self[path]
except KeyError:
# I have no idea why a KeyError sometimes happen, but it does, as we can see in
# #402 and #439. I don't think it hurts to silently ignore the error, so that's
# what we do
pass


@@ -9,12 +9,24 @@ import os.path as op
 import logging
 import sqlite3 as sqlite

-from .cache import string_to_colors, colors_to_string
+from core.pe.cache import bytes_to_colors, colors_to_bytes


 class SqliteCache:
     """A class to cache picture blocks in a sqlite backend."""

+    schema_version = 2
+    schema_version_description = "Added blocks for all 8 orientations."
+
+    create_table_query = (
+        "CREATE TABLE IF NOT EXISTS "
+        "pictures(path TEXT, mtime_ns INTEGER, blocks BLOB, blocks2 BLOB, blocks3 BLOB, "
+        "blocks4 BLOB, blocks5 BLOB, blocks6 BLOB, blocks7 BLOB, blocks8 BLOB)"
+    )
+    create_index_query = "CREATE INDEX IF NOT EXISTS idx_path on pictures (path)"
+    drop_table_query = "DROP TABLE IF EXISTS pictures"
+    drop_index_query = "DROP INDEX IF EXISTS idx_path"
+
     def __init__(self, db=":memory:", readonly=False):
         # readonly is not used in the sqlite version of the cache
         self.dbname = db
@@ -35,12 +47,20 @@ class SqliteCache:
     # Optimized
     def __getitem__(self, key):
         if isinstance(key, int):
-            sql = "select blocks from pictures where rowid = ?"
+            sql = (
+                "select blocks, blocks2, blocks3, blocks4, blocks5, blocks6, blocks7, blocks8 "
+                "from pictures "
+                "where rowid = ?"
+            )
         else:
-            sql = "select blocks from pictures where path = ?"
-        result = self.con.execute(sql, [key]).fetchone()
-        if result:
-            result = string_to_colors(result[0])
+            sql = (
+                "select blocks, blocks2, blocks3, blocks4, blocks5, blocks6, blocks7, blocks8 "
+                "from pictures "
+                "where path = ?"
+            )
+        blocks = self.con.execute(sql, [key]).fetchone()
+        if blocks:
+            result = [bytes_to_colors(block) for block in blocks]
             return result
         else:
             raise KeyError(key)
@@ -56,35 +76,33 @@ class SqliteCache:
         return result[0][0]

     def __setitem__(self, path_str, blocks):
-        blocks = colors_to_string(blocks)
+        blocks = [colors_to_bytes(block) for block in blocks]
         if op.exists(path_str):
             mtime = int(os.stat(path_str).st_mtime)
         else:
             mtime = 0
         if path_str in self:
-            sql = "update pictures set blocks = ?, mtime = ? where path = ?"
+            sql = (
+                "update pictures set blocks = ?, blocks2 = ?, blocks3 = ?, blocks4 = ?, blocks5 = ?, blocks6 = ?, "
+                "blocks7 = ?, blocks8 = ?, mtime_ns = ? "
+                "where path = ?"
+            )
         else:
-            sql = "insert into pictures(blocks,mtime,path) values(?,?,?)"
+            sql = (
+                "insert into pictures(blocks,blocks2,blocks3,blocks4,blocks5,blocks6,blocks7,blocks8,mtime_ns,path) "
+                "values(?,?,?,?,?,?,?,?,?,?)"
+            )
         try:
-            self.con.execute(sql, [blocks, mtime, path_str])
+            self.con.execute(sql, blocks + [mtime, path_str])
         except sqlite.OperationalError:
             logging.warning("Picture cache could not set value for key %r", path_str)
         except sqlite.DatabaseError as e:
             logging.warning("DatabaseError while setting value for key %r: %s", path_str, str(e))
     def _create_con(self, second_try=False):
-        def create_tables():
-            logging.debug("Creating picture cache tables.")
-            self.con.execute("drop table if exists pictures")
-            self.con.execute("drop index if exists idx_path")
-            self.con.execute("create table pictures(path TEXT, mtime INTEGER, blocks TEXT)")
-            self.con.execute("create index idx_path on pictures (path)")
-
-        self.con = sqlite.connect(self.dbname, isolation_level=None)
         try:
-            self.con.execute("select path, mtime, blocks from pictures where 1=2")
-        except sqlite.OperationalError:  # new db
-            create_tables()
+            self.con = sqlite.connect(self.dbname, isolation_level=None)
+            self._check_upgrade()
         except sqlite.DatabaseError as e:  # corrupted db
             if second_try:
                 raise  # Something really strange is happening

@@ -93,6 +111,25 @@ class SqliteCache:
             os.remove(self.dbname)
             self._create_con(second_try=True)
+    def _check_upgrade(self) -> None:
+        with self.con as conn:
+            has_schema = conn.execute(
+                "SELECT NAME FROM sqlite_master WHERE type='table' AND name='schema_version'"
+            ).fetchall()
+            version = None
+            if has_schema:
+                version = conn.execute("SELECT version FROM schema_version ORDER BY version DESC").fetchone()[0]
+            else:
+                conn.execute("CREATE TABLE schema_version (version int PRIMARY KEY, description TEXT)")
+            if version != self.schema_version:
+                conn.execute(self.drop_table_query)
+                conn.execute(
+                    "INSERT OR REPLACE INTO schema_version VALUES (:version, :description)",
+                    {"version": self.schema_version, "description": self.schema_version_description},
+                )
+                conn.execute(self.create_table_query)
+                conn.execute(self.create_index_query)
     def clear(self):
         self.close()
         if self.dbname != ":memory:":

@@ -118,9 +155,28 @@ class SqliteCache:
         raise ValueError(path)

     def get_multiple(self, rowids):
-        sql = "select rowid, blocks from pictures where rowid in (%s)" % ",".join(map(str, rowids))
+        ids = ",".join(map(str, rowids))
+        sql = (
+            "select rowid, blocks, blocks2, blocks3, blocks4, blocks5, blocks6, blocks7, blocks8 "
+            f"from pictures where rowid in ({ids})"
+        )
         cur = self.con.execute(sql)
-        return ((rowid, string_to_colors(blocks)) for rowid, blocks in cur)
+        return (
+            (
+                rowid,
+                [
+                    bytes_to_colors(blocks),
+                    bytes_to_colors(blocks2),
+                    bytes_to_colors(blocks3),
+                    bytes_to_colors(blocks4),
+                    bytes_to_colors(blocks5),
+                    bytes_to_colors(blocks6),
+                    bytes_to_colors(blocks7),
+                    bytes_to_colors(blocks8),
+                ],
+            )
+            for rowid, blocks, blocks2, blocks3, blocks4, blocks5, blocks6, blocks7, blocks8 in cur
+        )
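
Each cached row now carries one block list per EXIF orientation (1 through 8), so get_multiple yields (rowid, [blocks_for_orientation1, ..., blocks_for_orientation8]). A hypothetical consumer, assuming cache is a connected SqliteCache and ids holds known rowids:

for rowid, all_blocks in cache.get_multiple(ids):
    normal = all_blocks[0]      # orientation 1, i.e. no rotation
    rotated = all_blocks[5]     # orientation 6 in EXIF terms (90-degree rotation)
    print(rowid, len(normal) if normal else 0)
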
     def purge_outdated(self):
         """Go through the cache and purge outdated records.

@@ -129,12 +185,12 @@ class SqliteCache:
         the db.
         """
         todelete = []
-        sql = "select rowid, path, mtime from pictures"
+        sql = "select rowid, path, mtime_ns from pictures"
         cur = self.con.execute(sql)
-        for rowid, path_str, mtime in cur:
-            if mtime and op.exists(path_str):
+        for rowid, path_str, mtime_ns in cur:
+            if mtime_ns and op.exists(path_str):
                 picture_mtime = os.stat(path_str).st_mtime
-                if int(picture_mtime) <= mtime:
+                if int(picture_mtime) <= mtime_ns:
                     # not outdated
                     continue
                 todelete.append(rowid)


@@ -1,33 +0,0 @@ core/pe/iphoto_plist.py (file deleted)
# Created By: Virgil Dupras
# Created On: 2014-03-15
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import plistlib
class IPhotoPlistParser(plistlib._PlistParser):
"""A parser for iPhoto plists.
iPhoto plists tend to be malformed, so we have to subclass the built-in parser to be a bit more
lenient.
"""
def __init__(self):
plistlib._PlistParser.__init__(self, use_builtin_types=True, dict_type=dict)
# For debugging purposes, we remember the last bit of data to be analyzed so that we can
# log it in case of an exception
self.lastdata = ""
def get_data(self):
self.lastdata = plistlib._PlistParser.get_data(self)
return self.lastdata
def end_integer(self):
try:
self.add_object(int(self.get_data()))
except ValueError:
self.add_object(0)


@@ -15,7 +15,8 @@ from hscommon.trans import tr
 from hscommon.jobprogress import job

 from core.engine import Match
-from .block import avgdiff, DifferentBlockCountError, NoBlocksError
+from core.pe.block import avgdiff, DifferentBlockCountError, NoBlocksError
+from core.pe.cache_sqlite import SqliteCache

 # OPTIMIZATION NOTES:
 # The bottleneck of the matching phase is CPU, which is why we use multiprocessing. However, another

@@ -27,7 +28,7 @@ from .block import avgdiff, DifferentBlockCountError, NoBlocksError
 # to files in other chunks. So chunkifying doesn't save us any actual comparison, but the advantage
 # is that instead of reading blocks from disk number_of_files**2 times, we read it
 # number_of_files*number_of_chunks times.
-# Determining the right chunk size is tricky, bceause if it's too big, too many blocks will be in
+# Determining the right chunk size is tricky, because if it's too big, too many blocks will be in
 # memory at the same time and we might end up with memory trashing, which is awfully slow. So,
 # because our *real* bottleneck is CPU, the chunk size must simply be enough so that the CPU isn't
 # starved by Disk IOs.
@@ -50,17 +51,10 @@ except Exception:
 def get_cache(cache_path, readonly=False):
-    if cache_path.endswith("shelve"):
-        from .cache_shelve import ShelveCache
-
-        return ShelveCache(cache_path, readonly=readonly)
-    else:
-        from .cache_sqlite import SqliteCache
-
-        return SqliteCache(cache_path, readonly=readonly)
+    return SqliteCache(cache_path, readonly=readonly)
-def prepare_pictures(pictures, cache_path, with_dimensions, j=job.nulljob):
+def prepare_pictures(pictures, cache_path, with_dimensions, match_rotated, j=job.nulljob):
     # The MemoryError handlers in there use logging without first caring about whether or not
     # there is enough memory left to carry on the operation because it is assumed that the
     # MemoryError happens when trying to read an image file, which is freed from memory by the
@@ -78,16 +72,21 @@ def prepare_pictures(pictures, cache_path, with_dimensions, j=job.nulljob):
             # entry in iPhoto library.
             logging.warning("We have a picture with a null path here")
             continue
-        picture.unicode_path = str(picture.path)
         logging.debug("Analyzing picture at %s", picture.unicode_path)
         if with_dimensions:
             picture.dimensions  # pre-read dimensions
         try:
-            if picture.unicode_path not in cache:
-                blocks = picture.get_blocks(BLOCK_COUNT_PER_SIDE)
+            if picture.unicode_path not in cache or (
+                match_rotated and any(block == [] for block in cache[picture.unicode_path])
+            ):
+                if match_rotated:
+                    blocks = [picture.get_blocks(BLOCK_COUNT_PER_SIDE, orientation) for orientation in range(1, 9)]
+                else:
+                    blocks = [[]] * 8
+                    blocks[max(picture.get_orientation() - 1, 0)] = picture.get_blocks(BLOCK_COUNT_PER_SIDE)
                 cache[picture.unicode_path] = blocks
             prepared.append(picture)
-        except (IOError, ValueError) as e:
+        except (OSError, ValueError) as e:
             logging.warning(str(e))
         except MemoryError:
             logging.warning(
@@ -125,13 +124,13 @@ def get_match(first, second, percentage):
     return Match(first, second, percentage)


-def async_compare(ref_ids, other_ids, dbname, threshold, picinfo):
+def async_compare(ref_ids, other_ids, dbname, threshold, picinfo, match_rotated=False):
     # The list of ids in ref_ids have to be compared to the list of ids in other_ids. other_ids
     # can be None. In this case, ref_ids has to be compared with itself
     # picinfo is a dictionary {pic_id: (dimensions, is_ref)}
     cache = get_cache(dbname, readonly=True)
     limit = 100 - threshold
-    ref_pairs = list(cache.get_multiple(ref_ids))
+    ref_pairs = list(cache.get_multiple(ref_ids))  # (rowid, [b, b2, ..., b8])
     if other_ids is not None:
         other_pairs = list(cache.get_multiple(other_ids))
         comparisons_to_do = [(r, o) for r in ref_pairs for o in other_pairs]

@@ -144,22 +143,35 @@ def async_compare(ref_ids, other_ids, dbname, threshold, picinfo):
         if ref_is_ref and other_is_ref:
             continue
         if ref_dimensions != other_dimensions:
-            continue
-        try:
-            diff = avgdiff(ref_blocks, other_blocks, limit, MIN_ITERATIONS)
-            percentage = 100 - diff
-        except (DifferentBlockCountError, NoBlocksError):
-            percentage = 0
-        if percentage >= threshold:
-            results.append((ref_id, other_id, percentage))
+            if match_rotated:
+                rotated_ref_dimensions = (ref_dimensions[1], ref_dimensions[0])
+                if rotated_ref_dimensions != other_dimensions:
+                    continue
+            else:
+                continue
+
+        orientation_range = 1
+        if match_rotated:
+            orientation_range = 8
+        for orientation_ref in range(orientation_range):
+            try:
+                diff = avgdiff(ref_blocks[orientation_ref], other_blocks[0], limit, MIN_ITERATIONS)
+                percentage = 100 - diff
+            except (DifferentBlockCountError, NoBlocksError):
+                percentage = 0
+            if percentage >= threshold:
+                results.append((ref_id, other_id, percentage))
+                break
     cache.close()
     return results
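
Rotation matching compares the other picture's unrotated blocks against each of the reference's eight orientation variants and stops at the first hit. A toy version of that control flow, where compare() is a stand-in for avgdiff:

def first_matching_orientation(ref_variants, other, threshold, compare):
    # ref_variants: one block list per EXIF orientation; other: orientation-1 blocks
    for idx, variant in enumerate(ref_variants):
        score = compare(variant, other)
        if score >= threshold:
            return idx + 1  # EXIF orientations are numbered 1-8
    return None

compare = lambda a, b: 100 if a == b else 0
variants = [["x"], ["y"], ["z"]] + [[]] * 5
print(first_matching_orientation(variants, ["y"], 100, compare))  # 2
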
-def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljob):
+def getmatches(pictures, cache_path, threshold, match_scaled=False, match_rotated=False, j=job.nulljob):
     def get_picinfo(p):
         if match_scaled:
-            return (None, p.is_ref)
+            return ((None, None), p.is_ref)
         else:
             return (p.dimensions, p.is_ref)
@@ -181,7 +193,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljo
         j.set_progress(comparison_count, progress_msg)

     j = j.start_subjob([3, 7])
-    pictures = prepare_pictures(pictures, cache_path, with_dimensions=not match_scaled, j=j)
+    pictures = prepare_pictures(pictures, cache_path, not match_scaled, match_rotated, j=j)
     j = j.start_subjob([9, 1], tr("Preparing for matching"))
     cache = get_cache(cache_path)
     id2picture = {}
@ -211,7 +223,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljo
picinfo.update({p.cache_id: get_picinfo(p) for p in other_chunk}) picinfo.update({p.cache_id: get_picinfo(p) for p in other_chunk})
else: else:
other_ids = None other_ids = None
args = (ref_ids, other_ids, cache_path, threshold, picinfo) args = (ref_ids, other_ids, cache_path, threshold, picinfo, match_rotated)
async_results.append(pool.apply_async(async_compare, args)) async_results.append(pool.apply_async(async_compare, args))
collect_results() collect_results()
collect_results(collect_all=True) collect_results(collect_all=True)
@ -238,7 +250,7 @@ def getmatches(pictures, cache_path, threshold, match_scaled=False, j=job.nulljo
for ref_id, other_id, percentage in myiter: for ref_id, other_id, percentage in myiter:
ref = id2picture[ref_id] ref = id2picture[ref_id]
other = id2picture[other_id] other = id2picture[other_id]
if percentage == 100 and ref.md5 != other.md5: if percentage == 100 and ref.digest != other.digest:
percentage = 99 percentage = 99
if percentage >= threshold: if percentage >= threshold:
ref.dimensions # pre-read dimensions for display in results ref.dimensions # pre-read dimensions for display in results
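
Taken together, these hunks change a picture's cache entry from one block list to a list of eight block lists (one per EXIF orientation), and make the comparison try each cached orientation of the reference picture until one clears the threshold. A standalone sketch of that loop, using the real names from core.pe.block but otherwise illustrative:

from core.pe.block import avgdiff, DifferentBlockCountError, NoBlocksError

MIN_ITERATIONS = 3  # the constant matchblock.py uses

def rotated_match_percentage(ref_entry, other_entry, threshold, match_rotated=True):
    # ref_entry/other_entry: one cached value each, i.e. a list of 8 block
    # lists indexed by EXIF orientation - 1 (empty for orientations that
    # were never computed).
    limit = 100 - threshold
    for orientation in range(8 if match_rotated else 1):
        try:
            percentage = 100 - avgdiff(ref_entry[orientation], other_entry[0], limit, MIN_ITERATIONS)
        except (DifferentBlockCountError, NoBlocksError):
            percentage = 0
        if percentage >= threshold:
            return percentage  # first matching orientation wins, like the break above
    return 0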

View File

@@ -2,94 +2,68 @@
  * Created On: 2010-01-30
  * Copyright 2014 Hardcoded Software (http://www.hardcoded.net)
  *
- * This software is licensed under the "BSD" License as described in the "LICENSE" file,
- * which should be included with this package. The terms are also available at
- * http://www.hardcoded.net/licenses/bsd_license
+ * This software is licensed under the "BSD" License as described in the
+ * "LICENSE" file, which should be included with this package. The terms are
+ * also available at http://www.hardcoded.net/licenses/bsd_license
  */

 #include "common.h"

-/* I know that there strtol out there, but it requires a pointer to
- * a char, which would in turn require me to buffer my chars around,
- * making the whole process slower.
- */
-static long
-xchar_to_long(char c)
-{
-    if ((c >= 48) && (c <= 57)) { /* 0-9 */
-        return c - 48;
-    }
-    else if ((c >= 65) && (c <= 70)) { /* A-F */
-        return c - 55;
-    }
-    else if ((c >= 97) && (c <= 102)) { /* a-f */
-        return c - 87;
-    }
-    return 0;
-}
-
-static PyObject*
-cache_string_to_colors(PyObject *self, PyObject *args)
-{
-    char *s;
-    Py_ssize_t char_count, color_count, i;
-    PyObject *result;
-
-    if (!PyArg_ParseTuple(args, "s#", &s, &char_count)) {
-        return NULL;
-    }
-
-    color_count = (char_count / 6);
-    result = PyList_New(color_count);
-    if (result == NULL) {
-        return NULL;
-    }
-
-    for (i=0; i<color_count; i++) {
-        long r, g, b;
-        Py_ssize_t ci;
-        PyObject *color_tuple;
-
-        ci = i * 6;
-        r = (xchar_to_long(s[ci]) << 4) + xchar_to_long(s[ci+1]);
-        g = (xchar_to_long(s[ci+2]) << 4) + xchar_to_long(s[ci+3]);
-        b = (xchar_to_long(s[ci+4]) << 4) + xchar_to_long(s[ci+5]);
-
-        color_tuple = inttuple(3, r, g, b);
-        if (color_tuple == NULL) {
-            Py_DECREF(result);
-            return NULL;
-        }
-        PyList_SET_ITEM(result, i, color_tuple);
-    }
-
-    return result;
+static PyObject *cache_bytes_to_colors(PyObject *self, PyObject *args) {
+    char *y;
+    Py_ssize_t char_count, i, color_count;
+    PyObject *result;
+    unsigned long r, g, b;
+    Py_ssize_t ci;
+    PyObject *color_tuple;
+
+    if (!PyArg_ParseTuple(args, "y#", &y, &char_count)) {
+        return NULL;
+    }
+
+    color_count = char_count / 3;
+    result = PyList_New(color_count);
+    if (result == NULL) {
+        return NULL;
+    }
+
+    for (i = 0; i < color_count; i++) {
+        ci = i * 3;
+        r = (unsigned char)y[ci];
+        g = (unsigned char)y[ci + 1];
+        b = (unsigned char)y[ci + 2];
+
+        color_tuple = inttuple(3, r, g, b);
+        if (color_tuple == NULL) {
+            Py_DECREF(result);
+            return NULL;
+        }
+        PyList_SET_ITEM(result, i, color_tuple);
+    }
+
+    return result;
 }

 static PyMethodDef CacheMethods[] = {
-    {"string_to_colors", cache_string_to_colors, METH_VARARGS,
-     "Transform the string 's' in a list of 3 sized tuples."},
+    {"bytes_to_colors", cache_bytes_to_colors, METH_VARARGS,
+     "Transform the bytes 's' into a list of 3 sized tuples."},
     {NULL, NULL, 0, NULL} /* Sentinel */
 };

-static struct PyModuleDef CacheDef = {
-    PyModuleDef_HEAD_INIT,
-    "_cache",
-    NULL,
-    -1,
-    CacheMethods,
-    NULL,
-    NULL,
-    NULL,
-    NULL
-};
+static struct PyModuleDef CacheDef = {PyModuleDef_HEAD_INIT,
+                                      "_cache",
+                                      NULL,
+                                      -1,
+                                      CacheMethods,
+                                      NULL,
+                                      NULL,
+                                      NULL,
+                                      NULL};

-PyObject *
-PyInit__cache(void)
-{
-    PyObject *m = PyModule_Create(&CacheDef);
-    if (m == NULL) {
-        return NULL;
-    }
-    return m;
+PyObject *PyInit__cache(void) {
+    PyObject *m = PyModule_Create(&CacheDef);
+    if (m == NULL) {
+        return NULL;
+    }
+    return m;
 }
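
The new storage format is simply three raw bytes per color instead of six hex characters, which halves the cached payload and drops the hex parsing. A pure-Python equivalent of the new cache_bytes_to_colors (a sketch for reference; the shipped implementation is the C code above):

def bytes_to_colors_py(y: bytes):
    color_count = len(y) // 3  # 3 raw bytes per (r, g, b); a trailing partial color is dropped
    return [(y[i * 3], y[i * 3 + 1], y[i * 3 + 2]) for i in range(color_count)]

assert bytes_to_colors_py(b"\x0a\x14\x1e") == [(10, 20, 30)]
assert bytes_to_colors_py(b"\x01") == []  # incomplete color, as the updated tests check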

View File

@@ -32,7 +32,7 @@ PyObject* inttuple(int n, ...)
     result = PyTuple_New(n);

     for (i=0; i<n; i++) {
-        pnumber = PyLong_FromLong(va_arg(numbers, long));
+        pnumber = PyLong_FromUnsignedLong(va_arg(numbers, long));
         if (pnumber == NULL) {
             Py_DECREF(result);
             return NULL;

View File

@@ -9,7 +9,7 @@ from hscommon.util import get_file_ext, format_size
 from core.util import format_timestamp, format_perc, format_dupe_count
 from core import fs

-from . import exif
+from core.pe import exif

 # This global value is set by the platform-specific subclasser of the Photo base class
 PLAT_SPECIFIC_PHOTO_CLASS = None
@@ -29,7 +29,7 @@ class Photo(fs.File):
     __slots__ = fs.File.__slots__ + tuple(INITIAL_INFO.keys())

     # These extensions are supported on all platforms
-    HANDLED_EXTS = {"png", "jpg", "jpeg", "gif", "bmp", "tiff", "tif"}
+    HANDLED_EXTS = {"png", "jpg", "jpeg", "gif", "bmp", "tiff", "tif", "webp"}

     def _plat_get_dimensions(self):
         raise NotImplementedError()
@@ -37,7 +37,7 @@ class Photo(fs.File):
     def _plat_get_blocks(self, block_count_per_side, orientation):
         raise NotImplementedError()

-    def _get_orientation(self):
+    def get_orientation(self):
         if not hasattr(self, "_cached_orientation"):
             try:
                 with self.path.open("rb") as fp:
@@ -95,10 +95,13 @@ class Photo(fs.File):
         fs.File._read_info(self, field)
         if field == "dimensions":
             self.dimensions = self._plat_get_dimensions()
-            if self._get_orientation() in {5, 6, 7, 8}:
+            if self.get_orientation() in {5, 6, 7, 8}:
                 self.dimensions = (self.dimensions[1], self.dimensions[0])
         elif field == "exif_timestamp":
             self.exif_timestamp = self._get_exif_timestamp()

-    def get_blocks(self, block_count_per_side):
-        return self._plat_get_blocks(block_count_per_side, self._get_orientation())
+    def get_blocks(self, block_count_per_side, orientation: int = None):
+        if orientation is None:
+            return self._plat_get_blocks(block_count_per_side, self.get_orientation())
+        else:
+            return self._plat_get_blocks(block_count_per_side, orientation)
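
A short usage sketch of the now-optional orientation argument (assumptions: 15 stands in for matchblock's BLOCK_COUNT_PER_SIDE, and SomePlatformPhoto is a hypothetical concrete Photo subclass). EXIF orientation codes run 1 through 8; codes 5 through 8 are the transposed ones, which is why _read_info swaps width and height for them:

photo = SomePlatformPhoto("/pictures/img.jpg")  # hypothetical instance
blocks = photo.get_blocks(15)  # as before: uses the photo's own EXIF orientation
all_blocks = [photo.get_blocks(15, o) for o in range(1, 9)]  # what prepare_pictures now caches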

View File

@@ -8,12 +8,13 @@ from hscommon.trans import tr
 from core.scanner import Scanner, ScanType, ScanOption

-from . import matchblock, matchexif
+from core.pe import matchblock, matchexif


 class ScannerPE(Scanner):
     cache_path = None
     match_scaled = False
+    match_rotated = False

     @staticmethod
     def get_scan_options():
@@ -29,6 +30,7 @@ class ScannerPE(Scanner):
                 cache_path=self.cache_path,
                 threshold=self.min_match_percentage,
                 match_scaled=self.match_scaled,
+                match_rotated=self.match_rotated,
                 j=j,
             )
         elif self.scan_type == ScanType.EXIFTIMESTAMP:
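
The new flag is plain attribute plumbing from the scanner into matchblock.getmatches. A sketch of how a front-end might enable it (attribute names from this diff; the cache path is made up):

scanner = ScannerPE()
scanner.cache_path = "/tmp/dupeguru_pe.cache"  # made-up location
scanner.match_scaled = False
scanner.match_rotated = True  # also try the 8 EXIF orientations when comparing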

View File

@@ -43,7 +43,7 @@ class Criterion:
     @property
     def display(self):
-        return "{} ({})".format(self.category.NAME, self.display_value)
+        return f"{self.category.NAME} ({self.display_value})"


 class ValueListCategory(CriterionCategory):
@@ -82,10 +82,12 @@ class FolderCategory(ValueListCategory):
     def sort_key(self, dupe, crit_value):
         value = self.extract_value(dupe)
-        if value[: len(crit_value)] == crit_value:
-            return 0
-        else:
+        # This is instead of using is_relative_to() which was added in py 3.9
+        try:
+            value.relative_to(crit_value)
+        except ValueError:
             return 1
+        return 0


 class FilenameCategory(CriterionCategory):
@@ -94,6 +96,8 @@ class FilenameCategory(CriterionCategory):
     DOESNT_END_WITH_NUMBER = 1
     LONGEST = 2
     SHORTEST = 3
+    LONGEST_PATH = 4
+    SHORTEST_PATH = 5

     def format_criterion_value(self, value):
         return {
@@ -101,6 +105,8 @@ class FilenameCategory(CriterionCategory):
             self.DOESNT_END_WITH_NUMBER: tr("Doesn't end with number"),
             self.LONGEST: tr("Longest"),
             self.SHORTEST: tr("Shortest"),
+            self.LONGEST_PATH: tr("Longest Path"),
+            self.SHORTEST_PATH: tr("Shortest Path"),
         }[value]

     def extract_value(self, dupe):
@@ -114,6 +120,10 @@ class FilenameCategory(CriterionCategory):
                 return 0 if ends_with_digit else 1
             else:
                 return 1 if ends_with_digit else 0
+        elif crit_value == self.LONGEST_PATH:
+            return len(str(dupe.folder_path)) * -1
+        elif crit_value == self.SHORTEST_PATH:
+            return len(str(dupe.folder_path))
         else:
             value = len(value)
             if crit_value == self.LONGEST:
@@ -128,6 +138,8 @@ class FilenameCategory(CriterionCategory):
             self.DOESNT_END_WITH_NUMBER,
             self.LONGEST,
             self.SHORTEST,
+            self.LONGEST_PATH,
+            self.SHORTEST_PATH,
         ]
     ]
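
The FolderCategory change replaces slicing (which pathlib paths don't support) with relative_to() as a containment test; the code comment explains why is_relative_to() itself isn't used, since it only arrived in Python 3.9. A compact illustration of the resulting sort key (hypothetical paths):

from pathlib import Path

def folder_sort_key(value: Path, crit_value: Path) -> int:
    # 0 sorts first: the dupe whose folder is under crit_value wins
    try:
        value.relative_to(crit_value)
    except ValueError:
        return 1
    return 0

print(folder_sort_key(Path("/photos/2020/trip"), Path("/photos")))  # 0
print(folder_sort_key(Path("/backup/photos"), Path("/photos")))     # 1

The new LONGEST_PATH criterion reuses the same ascending sort: negating the length turns it into longest-first.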

View File

@@ -10,6 +10,7 @@ import logging
 import re
 import os
 import os.path as op
+from errno import EISDIR, EACCES

 from xml.etree import ElementTree as ET

 from hscommon.jobprogress.job import nulljob
@@ -17,8 +18,8 @@ from hscommon.conflict import get_conflicted_name
 from hscommon.util import flatten, nonone, FileOrPath, format_size
 from hscommon.trans import tr

-from . import engine
-from .markable import Markable
+from core import engine
+from core.markable import Markable


 class Results(Markable):
@@ -191,7 +192,7 @@ class Results(Markable):
         self.__filters.append(filter_str)
         if self.__filtered_dupes is None:
             self.__filtered_dupes = flatten(g[:] for g in self.groups)
-        self.__filtered_dupes = set(dupe for dupe in self.__filtered_dupes if filter_re.search(str(dupe.path)))
+        self.__filtered_dupes = {dupe for dupe in self.__filtered_dupes if filter_re.search(str(dupe.path))}
         filtered_groups = set()
         for dupe in self.__filtered_dupes:
             filtered_groups.add(self.get_group_of_duplicate(dupe))
@@ -301,7 +302,7 @@ class Results(Markable):
             try:
                 func(dupe)
                 to_remove.append(dupe)
-            except (EnvironmentError, UnicodeEncodeError) as e:
+            except (OSError, UnicodeEncodeError) as e:
                 self.problems.append((dupe, str(e)))
         if remove_from_results:
             self.remove_duplicates(to_remove)
@@ -374,10 +375,10 @@ class Results(Markable):
         try:
             do_write(outfile)
-        except IOError as e:
-            # If our IOError is because dest is already a directory, we want to handle that. 21 is
-            # the code we get on OS X and Linux, 13 is what we get on Windows.
-            if e.errno in {21, 13}:
+        except OSError as e:
+            # If our OSError is because dest is already a directory, we want to handle that. 21 is
+            # the code we get on OS X and Linux (EISDIR), 13 is what we get on Windows (EACCES).
+            if e.errno in (EISDIR, EACCES):
                 p = str(outfile)
                 dirname, basename = op.split(p)
                 otherfiles = os.listdir(dirname)
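
Naming the codes makes the intent auditable: errno.EISDIR is 21 on Linux and macOS, and errno.EACCES is 13, which is what Windows raises for the same mistake. A minimal reproduction of the condition being handled, on a POSIX system (sketch):

import errno

try:
    open("/tmp", "w")  # a directory where a writable file was expected
except OSError as e:
    print(e.errno in (errno.EISDIR, errno.EACCES))  # True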

View File

@@ -13,7 +13,7 @@ from hscommon.jobprogress import job
 from hscommon.util import dedupe, rem_file_ext, get_file_ext
 from hscommon.trans import tr

-from . import engine
+from core import engine

 # It's quite ugly to have scan types from all editions all put in the same class, but because there's
 # there will be some nasty bugs popping up (ScanType is used in core when in should exclusively be
@@ -87,8 +87,6 @@ class Scanner:
             }
         ):
             j = j.start_subjob([2, 8])
-            for f in j.iter_with_progress(files, tr("Read size of %d/%d files")):
-                f.size  # pre-read, makes a smoother progress if read here (especially for bundles)
         if self.size_threshold:
             files = [f for f in files if f.size >= self.size_threshold]
         if self.large_size_threshold:
@@ -134,7 +132,7 @@ class Scanner:
             return False
         if is_same_with_digit(refname, dupename):
             return True
-        return len(dupe.path) > len(ref.path)
+        return len(dupe.path.parts) > len(ref.path.parts)

     @staticmethod
     def get_scan_options():
@@ -164,15 +162,18 @@ class Scanner:
             toremove = set()
             last_parent_path = sortedpaths[0]
             for p in sortedpaths[1:]:
-                if p in last_parent_path:
+                if last_parent_path in p.parents:
                     toremove.add(p)
                 else:
                     last_parent_path = p
             matches = [m for m in matches if m.first.path not in toremove or m.second.path not in toremove]
         if not self.mix_file_kind:
             matches = [m for m in matches if get_file_ext(m.first.name) == get_file_ext(m.second.name)]
-        matches = [m for m in matches if m.first.path.exists() and m.second.path.exists()]
-        matches = [m for m in matches if not (m.first.is_ref and m.second.is_ref)]
+        if self.include_exists_check:
+            matches = [m for m in matches if m.first.exists() and m.second.exists()]
+        # Contents already handles ref checks, other scan types might not catch during scan
+        if self.scan_type != ScanType.CONTENTS:
+            matches = [m for m in matches if not (m.first.is_ref and m.second.is_ref)]
         if ignore_list:
             matches = [m for m in matches if not ignore_list.are_ignored(str(m.first.path), str(m.second.path))]
         logging.info("Grouping matches")
@@ -212,3 +213,4 @@ class Scanner:
     large_size_threshold = 0
     big_file_size_threshold = 0
     word_weighting = False
+    include_exists_check = True
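
The pruning loop now leans on pathlib semantics: `last_parent_path in p.parents` is true exactly when p lies somewhere below last_parent_path, replacing the old hscommon containment operator. A quick check of the idiom (hypothetical paths):

from pathlib import Path

p = Path("/data/photos/2021/trip")
print(Path("/data/photos") in p.parents)  # True: an ancestor directory
print(Path("/data/videos") in p.parents)  # False: not on the path
print(p in p.parents)                     # False: a path is not its own ancestor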

View File

@@ -1 +1 @@
-from . import fs, result_table, scanner  # noqa
+from core.se import fs, result_table, scanner  # noqa

View File

@@ -7,18 +7,19 @@
 import os
 import os.path as op
 import logging
+import tempfile

 import pytest

-from hscommon.path import Path
+from pathlib import Path
 import hscommon.conflict
 import hscommon.util
 from hscommon.testutil import eq_, log_calls
 from hscommon.jobprogress.job import Job

-from .base import TestApp
-from .results_test import GetTestGroups
-from .. import app, fs, engine
-from ..scanner import ScanType
+from core.tests.base import TestApp
+from core.tests.results_test import GetTestGroups
+from core import app, fs, engine
+from core.scanner import ScanType


 def add_fake_files_to_directories(directories, files):
@@ -56,7 +57,7 @@ class TestCaseDupeGuru:
         # for this unit is pathetic. What's done is done. My approach now is to add tests for
         # every change I want to make. The blowup was caused by a missing import.
         p = Path(str(tmpdir))
-        p["foo"].open("w").close()
+        p.joinpath("foo").touch()
         monkeypatch.setattr(
             hscommon.conflict,
             "smart_copy",
@@ -68,22 +69,23 @@ class TestCaseDupeGuru:
         dgapp = TestApp().app
         dgapp.directories.add_path(p)
         [f] = dgapp.directories.get_files()
-        dgapp.copy_or_move(f, True, "some_destination", 0)
-        eq_(1, len(hscommon.conflict.smart_copy.calls))
-        call = hscommon.conflict.smart_copy.calls[0]
-        eq_(call["dest_path"], op.join("some_destination", "foo"))
-        eq_(call["source_path"], f.path)
+        with tempfile.TemporaryDirectory() as tmp_dir:
+            dgapp.copy_or_move(f, True, tmp_dir, 0)
+            eq_(1, len(hscommon.conflict.smart_copy.calls))
+            call = hscommon.conflict.smart_copy.calls[0]
+            eq_(call["dest_path"], Path(tmp_dir, "foo"))
+            eq_(call["source_path"], f.path)

     def test_copy_or_move_clean_empty_dirs(self, tmpdir, monkeypatch):
         tmppath = Path(str(tmpdir))
-        sourcepath = tmppath["source"]
+        sourcepath = tmppath.joinpath("source")
         sourcepath.mkdir()
-        sourcepath["myfile"].open("w")
+        sourcepath.joinpath("myfile").touch()
         app = TestApp().app
         app.directories.add_path(tmppath)
         [myfile] = app.directories.get_files()
         monkeypatch.setattr(app, "clean_empty_dirs", log_calls(lambda path: None))
-        app.copy_or_move(myfile, False, tmppath["dest"], 0)
+        app.copy_or_move(myfile, False, tmppath.joinpath("dest"), 0)
         calls = app.clean_empty_dirs.calls
         eq_(1, len(calls))
         eq_(sourcepath, calls[0]["path"])
@@ -95,7 +97,7 @@ class TestCaseDupeGuru:
         # At some point, any() was used in a wrong way that made Scan() wrongly return 1
         app = TestApp().app
-        f1, f2 = [FakeFile("foo") for _ in range(2)]
+        f1, f2 = (FakeFile("foo") for _ in range(2))
         f1.is_ref, f2.is_ref = (False, False)
         assert not (bool(f1) and bool(f2))
         add_fake_files_to_directories(app.directories, [f1, f2])
@@ -106,8 +108,8 @@ class TestCaseDupeGuru:
         # If the ignore_hardlink_matches option is set, don't match files hardlinking to the same
         # inode.
         tmppath = Path(str(tmpdir))
-        tmppath["myfile"].open("w").write("foo")
-        os.link(str(tmppath["myfile"]), str(tmppath["hardlink"]))
+        tmppath.joinpath("myfile").open("wt").write("foo")
+        os.link(str(tmppath.joinpath("myfile")), str(tmppath.joinpath("hardlink")))
         app = TestApp().app
         app.directories.add_path(tmppath)
         app.options["scan_type"] = ScanType.CONTENTS
@@ -153,7 +155,7 @@ class TestCaseDupeGuruCleanEmptyDirs:
         # delete_if_empty must be recursively called up in the path until it returns False
         @log_calls
         def mock_delete_if_empty(path, files_to_delete=[]):
-            return len(path) > 1
+            return len(path.parts) > 1

         monkeypatch.setattr(hscommon.util, "delete_if_empty", mock_delete_if_empty)
         # XXX This monkeypatch is temporary. will be fixed in a better monkeypatcher.
@@ -180,8 +182,8 @@ class TestCaseDupeGuruWithResults:
         self.rtable.refresh()
         tmpdir = request.getfixturevalue("tmpdir")
         tmppath = Path(str(tmpdir))
-        tmppath["foo"].mkdir()
-        tmppath["bar"].mkdir()
+        tmppath.joinpath("foo").mkdir()
+        tmppath.joinpath("bar").mkdir()
         self.app.directories.add_path(tmppath)

     def test_get_objects(self, do_setup):
@@ -424,12 +426,9 @@ class TestCaseDupeGuruRenameSelected:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        fp = open(str(p["foo bar 1"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 2"]), mode="w")
-        fp.close()
-        fp = open(str(p["foo bar 3"]), mode="w")
-        fp.close()
+        p.joinpath("foo bar 1").touch()
+        p.joinpath("foo bar 2").touch()
+        p.joinpath("foo bar 3").touch()
         files = fs.get_files(p)
         for f in files:
             f.is_ref = False
@@ -451,7 +450,7 @@ class TestCaseDupeGuruRenameSelected:
         g = self.groups[0]
         self.rtable.select([1])
         assert app.rename_selected("renamed")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" in names
         assert "foo bar 2" not in names
         eq_(g.dupes[0].name, "renamed")
@@ -464,7 +463,7 @@
         assert not app.rename_selected("renamed")
         msg = logging.warning.calls[0]["msg"]
         eq_("dupeGuru Warning: list index out of range", msg)
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "renamed" not in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")
@@ -477,7 +476,7 @@
         assert not app.rename_selected("foo bar 1")
         msg = logging.warning.calls[0]["msg"]
         assert msg.startswith("dupeGuru Warning: 'foo bar 1' already exists in")
-        names = [p.name for p in self.p.listdir()]
+        names = [p.name for p in self.p.glob("*")]
         assert "foo bar 1" in names
         assert "foo bar 2" in names
         eq_(g.dupes[0].name, "foo bar 2")
@@ -488,9 +487,9 @@ class TestAppWithDirectoriesInTree:
     def do_setup(self, request):
         tmpdir = request.getfixturevalue("tmpdir")
         p = Path(str(tmpdir))
-        p["sub1"].mkdir()
-        p["sub2"].mkdir()
-        p["sub3"].mkdir()
+        p.joinpath("sub1").mkdir()
+        p.joinpath("sub2").mkdir()
+        p.joinpath("sub3").mkdir()
         app = TestApp()
         self.app = app.app
         self.dtree = app.dtree
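
Almost all of the churn above is the mechanical hscommon.path.Path to pathlib.Path migration. The recurring substitutions, collected in one place (sketch):

from pathlib import Path

p = Path("/tmp/example")
child = p.joinpath("foo")     # was: p["foo"]
child.touch()                 # was: p["foo"].open("w").close()
parent = child.parent         # was: child.parent()  (a method in hscommon)
entries = list(p.glob("*"))   # was: p.listdir()
depth = len(p.parts)          # was: len(p)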

View File

@@ -5,17 +5,16 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 from hscommon.testutil import TestApp as TestAppBase, CallLogger, eq_, with_app  # noqa
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.util import get_file_ext, format_size
 from hscommon.gui.column import Column
 from hscommon.jobprogress.job import nulljob, JobCancelled

-from .. import engine
-from .. import prioritize
-from ..engine import getwords
-from ..app import DupeGuru as DupeGuruBase
-from ..gui.result_table import ResultTable as ResultTableBase
-from ..gui.prioritize_dialog import PrioritizeDialog
+from core import engine, prioritize
+from core.engine import getwords
+from core.app import DupeGuru as DupeGuruBase
+from core.gui.result_table import ResultTable as ResultTableBase
+from core.gui.prioritize_dialog import PrioritizeDialog


 class DupeGuruView:
@@ -86,9 +85,9 @@ class NamedObject:
             folder = "basepath"
         self._folder = Path(folder)
         self.size = size
-        self.md5partial = name
-        self.md5 = name
-        self.md5samples = name
+        self.digest_partial = name
+        self.digest = name
+        self.digest_samples = name
         if with_words:
             self.words = getwords(name)
         self.is_ref = False
@@ -111,11 +110,11 @@ class NamedObject:

     @property
     def path(self):
-        return self._folder[self.name]
+        return self._folder.joinpath(self.name)

     @property
     def folder_path(self):
-        return self.path.parent()
+        return self.path.parent

     @property
     def extension(self):

View File

@@ -9,7 +9,7 @@ from pytest import raises, skip
 from hscommon.testutil import eq_

 try:
-    from ..pe.block import avgdiff, getblocks2, NoBlocksError, DifferentBlockCountError
+    from core.pe.block import avgdiff, getblocks2, NoBlocksError, DifferentBlockCountError
 except ImportError:
     skip("Can't import the block module, probably hasn't been compiled.")

View File

@@ -10,41 +10,41 @@ from pytest import raises, skip
 from hscommon.testutil import eq_

 try:
-    from ..pe.cache import colors_to_string, string_to_colors
-    from ..pe.cache_sqlite import SqliteCache
-    from ..pe.cache_shelve import ShelveCache
+    from core.pe.cache import colors_to_bytes, bytes_to_colors
+    from core.pe.cache_sqlite import SqliteCache
 except ImportError:
     skip("Can't import the cache module, probably hasn't been compiled.")


 class TestCaseColorsToString:
     def test_no_color(self):
-        eq_("", colors_to_string([]))
+        eq_(b"", colors_to_bytes([]))

     def test_single_color(self):
-        eq_("000000", colors_to_string([(0, 0, 0)]))
-        eq_("010101", colors_to_string([(1, 1, 1)]))
-        eq_("0a141e", colors_to_string([(10, 20, 30)]))
+        eq_(b"\x00\x00\x00", colors_to_bytes([(0, 0, 0)]))
+        eq_(b"\x01\x01\x01", colors_to_bytes([(1, 1, 1)]))
+        eq_(b"\x0a\x14\x1e", colors_to_bytes([(10, 20, 30)]))

     def test_two_colors(self):
-        eq_("000102030405", colors_to_string([(0, 1, 2), (3, 4, 5)]))
+        eq_(b"\x00\x01\x02\x03\x04\x05", colors_to_bytes([(0, 1, 2), (3, 4, 5)]))


 class TestCaseStringToColors:
     def test_empty(self):
-        eq_([], string_to_colors(""))
+        eq_([], bytes_to_colors(b""))

     def test_single_color(self):
-        eq_([(0, 0, 0)], string_to_colors("000000"))
-        eq_([(2, 3, 4)], string_to_colors("020304"))
-        eq_([(10, 20, 30)], string_to_colors("0a141e"))
+        eq_([(0, 0, 0)], bytes_to_colors(b"\x00\x00\x00"))
+        eq_([(2, 3, 4)], bytes_to_colors(b"\x02\x03\x04"))
+        eq_([(10, 20, 30)], bytes_to_colors(b"\x0a\x14\x1e"))

     def test_two_colors(self):
-        eq_([(10, 20, 30), (40, 50, 60)], string_to_colors("0a141e28323c"))
+        eq_([(10, 20, 30), (40, 50, 60)], bytes_to_colors(b"\x0a\x14\x1e\x28\x32\x3c"))

     def test_incomplete_color(self):
         # don't return anything if it's not a complete color
-        eq_([], string_to_colors("102"))
+        eq_([], bytes_to_colors(b"\x01"))
+        eq_([(1, 2, 3)], bytes_to_colors(b"\x01\x02\x03\x04"))


 class BaseTestCaseCache:
@@ -59,13 +59,13 @@ class BaseTestCaseCache:
     def test_set_then_retrieve_blocks(self):
         c = self.get_cache()
-        b = [(0, 0, 0), (1, 2, 3)]
+        b = [[(0, 0, 0), (1, 2, 3)]] * 8
         c["foo"] = b
         eq_(b, c["foo"])

     def test_delitem(self):
         c = self.get_cache()
-        c["foo"] = ""
+        c["foo"] = [[]] * 8
         del c["foo"]
         assert "foo" not in c
         with raises(KeyError):
@@ -74,16 +74,16 @@ class BaseTestCaseCache:
     def test_persistance(self, tmpdir):
         DBNAME = tmpdir.join("hstest.db")
         c = self.get_cache(str(DBNAME))
-        c["foo"] = [(1, 2, 3)]
+        c["foo"] = [[(1, 2, 3)]] * 8
         del c
         c = self.get_cache(str(DBNAME))
-        eq_([(1, 2, 3)], c["foo"])
+        eq_([[(1, 2, 3)]] * 8, c["foo"])

     def test_filter(self):
         c = self.get_cache()
-        c["foo"] = ""
-        c["bar"] = ""
-        c["baz"] = ""
+        c["foo"] = [[]] * 8
+        c["bar"] = [[]] * 8
+        c["baz"] = [[]] * 8
         c.filter(lambda p: p != "bar")  # only 'bar' is removed
         eq_(2, len(c))
         assert "foo" in c
@@ -92,9 +92,9 @@ class BaseTestCaseCache:
     def test_clear(self):
         c = self.get_cache()
-        c["foo"] = ""
-        c["bar"] = ""
-        c["baz"] = ""
+        c["foo"] = [[]] * 8
+        c["bar"] = [[]] * 8
+        c["baz"] = [[]] * 8
         c.clear()
         eq_(0, len(c))
         assert "foo" not in c
@@ -104,7 +104,7 @@ class BaseTestCaseCache:
     def test_by_id(self):
         # it's possible to use the cache by referring to the files by their row_id
         c = self.get_cache()
-        b = [(0, 0, 0), (1, 2, 3)]
+        b = [[(0, 0, 0), (1, 2, 3)]] * 8
         c["foo"] = b
         foo_id = c.get_id("foo")
         eq_(c[foo_id], b)
@@ -127,15 +127,10 @@ class TestCaseSqliteCache(BaseTestCaseCache):
             fp.write("invalid sqlite content")
         fp.close()
         c = self.get_cache(dbname)  # should not raise a DatabaseError
-        c["foo"] = [(1, 2, 3)]
+        c["foo"] = [[(1, 2, 3)]] * 8
         del c
         c = self.get_cache(dbname)
-        eq_(c["foo"], [(1, 2, 3)])
-
-
-class TestCaseShelveCache(BaseTestCaseCache):
-    def get_cache(self, dbname=None):
-        return ShelveCache(dbname)
+        eq_(c["foo"], [[(1, 2, 3)]] * 8)


 class TestCaseCacheSQLEscape:
@@ -157,7 +152,7 @@ class TestCaseCacheSQLEscape:
     def test_delitem(self):
         c = self.get_cache()
-        c["foo'bar"] = []
+        c["foo'bar"] = [[]] * 8
         try:
             del c["foo'bar"]
         except KeyError:
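
The recurring `[[]] * 8` in these updates encodes the new cache contract from the matchblock changes: a cached value is always a list of eight block lists, one per EXIF orientation, empty where nothing was computed. A sketch (assuming, as these tests do, that SqliteCache() without a dbname is an in-memory cache):

c = SqliteCache()
c["foo.jpg"] = [[(0, 0, 0), (1, 2, 3)]] * 8  # same blocks for all 8 orientations
assert len(c["foo.jpg"]) == 8
c["bar.jpg"] = [[]] * 8  # valid placeholder: 8 orientations, no blocks yet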

View File

@@ -10,45 +10,39 @@ import tempfile
 import shutil

 from pytest import raises
-from hscommon.path import Path
+from pathlib import Path
 from hscommon.testutil import eq_
 from hscommon.plat import ISWINDOWS

-from ..fs import File
-from ..directories import (
+from core.fs import File
+from core.directories import (
     Directories,
     DirectoryState,
     AlreadyThereError,
     InvalidPathError,
 )
-from ..exclude import ExcludeList, ExcludeDict
+from core.exclude import ExcludeList, ExcludeDict


 def create_fake_fs(rootpath):
     # We have it as a separate function because other units are using it.
-    rootpath = rootpath["fs"]
+    rootpath = rootpath.joinpath("fs")
     rootpath.mkdir()
-    rootpath["dir1"].mkdir()
-    rootpath["dir2"].mkdir()
-    rootpath["dir3"].mkdir()
-    fp = rootpath["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
-    fp = rootpath["dir1"]["file1.test"].open("w")
-    fp.write("1")
-    fp.close()
-    fp = rootpath["dir2"]["file2.test"].open("w")
-    fp.write("12")
-    fp.close()
-    fp = rootpath["dir3"]["file3.test"].open("w")
-    fp.write("123")
-    fp.close()
+    rootpath.joinpath("dir1").mkdir()
+    rootpath.joinpath("dir2").mkdir()
+    rootpath.joinpath("dir3").mkdir()
+    with rootpath.joinpath("file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("file3.test").open("wt") as fp:
+        fp.write("123")
+    with rootpath.joinpath("dir1", "file1.test").open("wt") as fp:
+        fp.write("1")
+    with rootpath.joinpath("dir2", "file2.test").open("wt") as fp:
+        fp.write("12")
+    with rootpath.joinpath("dir3", "file3.test").open("wt") as fp:
+        fp.write("123")
     return rootpath
@@ -60,11 +54,10 @@ def setup_module(module):
     # and another with a more complex structure.
     testpath = Path(tempfile.mkdtemp())
     module.testpath = testpath
-    rootpath = testpath["onefile"]
+    rootpath = testpath.joinpath("onefile")
     rootpath.mkdir()
-    fp = rootpath["test.txt"].open("w")
-    fp.write("test_data")
-    fp.close()
+    with rootpath.joinpath("test.txt").open("wt") as fp:
+        fp.write("test_data")
     create_fake_fs(testpath)
@@ -80,13 +73,13 @@ def test_empty():

 def test_add_path():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(1, len(d))
     assert p in d
-    assert (p["foobar"]) in d
-    assert p.parent() not in d
-    p = testpath["fs"]
+    assert (p.joinpath("foobar")) in d
+    assert p.parent not in d
+    p = testpath.joinpath("fs")
     d.add_path(p)
     eq_(2, len(d))
     assert p in d
@@ -94,18 +87,18 @@ def test_add_path():

 def test_add_path_when_path_is_already_there():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     with raises(AlreadyThereError):
         d.add_path(p)
     with raises(AlreadyThereError):
-        d.add_path(p["foobar"])
+        d.add_path(p.joinpath("foobar"))
     eq_(1, len(d))


 def test_add_path_containing_paths_already_there():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(1, len(d))
     d.add_path(testpath)
     eq_(len(d), 1)
@@ -114,7 +107,7 @@ def test_add_path_containing_paths_already_there():

 def test_add_path_non_latin(tmpdir):
     p = Path(str(tmpdir))
-    to_add = p["unicode\u201a"]
+    to_add = p.joinpath("unicode\u201a")
     os.mkdir(str(to_add))
     d = Directories()
     try:
@@ -125,25 +118,25 @@ def test_add_path_non_latin(tmpdir):

 def test_del():
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     try:
         del d[1]
         assert False
     except IndexError:
         pass
-    d.add_path(testpath["fs"])
+    d.add_path(testpath.joinpath("fs"))
     del d[1]
     eq_(1, len(d))


 def test_states():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     eq_(DirectoryState.NORMAL, d.get_state(p))
     d.set_state(p, DirectoryState.REFERENCE)
     eq_(DirectoryState.REFERENCE, d.get_state(p))
-    eq_(DirectoryState.REFERENCE, d.get_state(p["dir1"]))
+    eq_(DirectoryState.REFERENCE, d.get_state(p.joinpath("dir1")))
     eq_(1, len(d.states))
     eq_(p, list(d.states.keys())[0])
     eq_(DirectoryState.REFERENCE, d.states[p])
@@ -152,7 +145,7 @@ def test_states():
 def test_get_state_with_path_not_there():
     # When the path's not there, just return DirectoryState.Normal
     d = Directories()
-    d.add_path(testpath["onefile"])
+    d.add_path(testpath.joinpath("onefile"))
     eq_(d.get_state(testpath), DirectoryState.NORMAL)
@@ -160,26 +153,26 @@ def test_states_overwritten_when_larger_directory_eat_smaller_ones():
     # ref #248
     # When setting the state of a folder, we overwrite previously set states for subfolders.
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     d.add_path(testpath)
     d.set_state(testpath, DirectoryState.REFERENCE)
     eq_(d.get_state(p), DirectoryState.REFERENCE)
-    eq_(d.get_state(p["dir1"]), DirectoryState.REFERENCE)
+    eq_(d.get_state(p.joinpath("dir1")), DirectoryState.REFERENCE)
     eq_(d.get_state(testpath), DirectoryState.REFERENCE)


 def test_get_files():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     files = list(d.get_files())
     eq_(5, len(files))
     for f in files:
-        if f.path.parent() == p["dir1"]:
+        if f.path.parent == p.joinpath("dir1"):
             assert f.is_ref
         else:
             assert not f.is_ref
@@ -193,7 +186,7 @@ def test_get_files_with_folders():
             return True

     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = list(d.get_files(fileclasses=[FakeFile]))
     # We have the 3 root files and the 3 root dirs
@@ -202,23 +195,23 @@ def test_get_files_with_folders():

 def test_get_folders():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
-    d.set_state(p["dir1"], DirectoryState.REFERENCE)
-    d.set_state(p["dir2"], DirectoryState.EXCLUDED)
+    d.set_state(p.joinpath("dir1"), DirectoryState.REFERENCE)
+    d.set_state(p.joinpath("dir2"), DirectoryState.EXCLUDED)
     folders = list(d.get_folders())
     eq_(len(folders), 3)
     ref = [f for f in folders if f.is_ref]
     not_ref = [f for f in folders if not f.is_ref]
     eq_(len(ref), 1)
-    eq_(ref[0].path, p["dir1"])
+    eq_(ref[0].path, p.joinpath("dir1"))
     eq_(len(not_ref), 2)
     eq_(ref[0].size, 1)


 def test_get_files_with_inherited_exclusion():
     d = Directories()
-    p = testpath["onefile"]
+    p = testpath.joinpath("onefile")
     d.add_path(p)
     d.set_state(p, DirectoryState.EXCLUDED)
     eq_([], list(d.get_files()))
@@ -234,13 +227,13 @@ def test_save_and_load(tmpdir):
     d1.add_path(p1)
     d1.add_path(p2)
     d1.set_state(p1, DirectoryState.REFERENCE)
-    d1.set_state(p1["dir1"], DirectoryState.EXCLUDED)
+    d1.set_state(p1.joinpath("dir1"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     d1.save_to_file(tmpxml)
     d2.load_from_file(tmpxml)
     eq_(2, len(d2))
     eq_(DirectoryState.REFERENCE, d2.get_state(p1))
-    eq_(DirectoryState.EXCLUDED, d2.get_state(p1["dir1"]))
+    eq_(DirectoryState.EXCLUDED, d2.get_state(p1.joinpath("dir1")))


 def test_invalid_path():
@@ -268,7 +261,7 @@ def test_load_from_file_with_invalid_path(tmpdir):
     # This test simulates a load from file resulting in a
     # InvalidPath raise. Other directories must be loaded.
     d1 = Directories()
-    d1.add_path(testpath["onefile"])
+    d1.add_path(testpath.joinpath("onefile"))
     # Will raise InvalidPath upon loading
     p = Path(str(tmpdir.join("toremove")))
     p.mkdir()
@@ -283,11 +276,11 @@ def test_load_from_file_with_invalid_path(tmpdir):

 def test_unicode_save(tmpdir):
     d = Directories()
-    p1 = Path(str(tmpdir))["hello\xe9"]
+    p1 = Path(str(tmpdir), "hello\xe9")
     p1.mkdir()
-    p1["foo\xe9"].mkdir()
+    p1.joinpath("foo\xe9").mkdir()
     d.add_path(p1)
-    d.set_state(p1["foo\xe9"], DirectoryState.EXCLUDED)
+    d.set_state(p1.joinpath("foo\xe9"), DirectoryState.EXCLUDED)
     tmpxml = str(tmpdir.join("directories_testunit.xml"))
     try:
         d.save_to_file(tmpxml)
@@ -297,12 +290,12 @@ def test_unicode_save(tmpdir):

 def test_get_files_refreshes_its_directories():
     d = Directories()
-    p = testpath["fs"]
+    p = testpath.joinpath("fs")
     d.add_path(p)
     files = d.get_files()
     eq_(6, len(list(files)))
     time.sleep(1)
-    os.remove(str(p["dir1"]["file1.test"]))
+    os.remove(str(p.joinpath("dir1", "file1.test")))
     files = d.get_files()
     eq_(5, len(list(files)))
@@ -311,15 +304,15 @@ def test_get_files_does_not_choke_on_non_existing_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
     d.add_path(p)
-    p.rmtree()
+    shutil.rmtree(str(p))
     eq_([], list(d.get_files()))


 def test_get_state_returns_excluded_by_default_for_hidden_directories(tmpdir):
     d = Directories()
     p = Path(str(tmpdir))
-    hidden_dir_path = p[".foo"]
-    p[".foo"].mkdir()
+    hidden_dir_path = p.joinpath(".foo")
+    p.joinpath(".foo").mkdir()
     d.add_path(p)
     eq_(d.get_state(hidden_dir_path), DirectoryState.EXCLUDED)
     # But it can be overriden
@@ -331,22 +324,23 @@ def test_default_path_state_override(tmpdir):
     # It's possible for a subclass to override the default state of a path
     class MyDirectories(Directories):
         def _default_state_for_path(self, path):
-            if "foobar" in path:
+            if "foobar" in path.parts:
                 return DirectoryState.EXCLUDED
+            return DirectoryState.NORMAL

     d = MyDirectories()
     p1 = Path(str(tmpdir))
-    p1["foobar"].mkdir()
-    p1["foobar/somefile"].open("w").close()
-    p1["foobaz"].mkdir()
-    p1["foobaz/somefile"].open("w").close()
+    p1.joinpath("foobar").mkdir()
+    p1.joinpath("foobar/somefile").touch()
+    p1.joinpath("foobaz").mkdir()
+    p1.joinpath("foobaz/somefile").touch()
     d.add_path(p1)
-    eq_(d.get_state(p1["foobaz"]), DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.EXCLUDED)
+    eq_(d.get_state(p1.joinpath("foobaz")), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.EXCLUDED)
     eq_(len(list(d.get_files())), 1)  # only the 'foobaz' file is there
     # However, the default state can be changed
-    d.set_state(p1["foobar"], DirectoryState.NORMAL)
-    eq_(d.get_state(p1["foobar"]), DirectoryState.NORMAL)
+    d.set_state(p1.joinpath("foobar"), DirectoryState.NORMAL)
+    eq_(d.get_state(p1.joinpath("foobar")), DirectoryState.NORMAL)
     eq_(len(list(d.get_files())), 2)
@@ -372,42 +366,42 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         self.d._exclude_list.add(regex)
         self.d._exclude_list.mark(regex)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
         self.d.add_path(p1)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
         # By default, subdirs should be excluded too, but this can be overridden separately
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)

     def test_exclude_refined(self, tmpdir):
         regex1 = r"^\$Recycle\.Bin$"
         self.d._exclude_list.add(regex1)
         self.d._exclude_list.mark(regex1)
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somefile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["somesubdirfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"]["unwanted_subdirfile.gif"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"].mkdir()
-        p1["$Recycle.Bin"]["subdar"]["somesubdarfile.jpeg"].open("w").close()
-        p1["$Recycle.Bin"]["subdar"]["unwanted_subdarfile.png"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somefile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "somesubdirfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "subdir", "unwanted_subdirfile.gif").touch()
+        p1.joinpath("$Recycle.Bin", "subdar").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdar", "somesubdarfile.jpeg").touch()
+        p1.joinpath("$Recycle.Bin", "subdar", "unwanted_subdarfile.png").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))

         # Filter should set the default state to Excluded
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
         # The subdir should inherit its parent state
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
         # Override a child path's state
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # Parent should keep its default state, and the other child too
-        eq_(self.d.get_state(p1["$Recycle.Bin"]), DirectoryState.EXCLUDED)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin")), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.EXCLUDED)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         # only the 2 files directly under the Normal directory
@@ -419,8 +413,8 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert "somesubdirfile.png" in files
         assert "unwanted_subdirfile.gif" in files
         # Overriding the parent should enable all children
-        self.d.set_state(p1["$Recycle.Bin"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdar"]), DirectoryState.NORMAL)
+        self.d.set_state(p1.joinpath("$Recycle.Bin"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdar")), DirectoryState.NORMAL)
         # all files there
         files = self.get_files_and_expect_num_result(6)
         assert "somefile.png" in files
@@ -444,7 +438,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert self.d._exclude_list.error(regex3) is None
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         # Directory shouldn't change its state here, unless explicitely done by user
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         files = self.get_files_and_expect_num_result(5)
         assert "unwanted_subdirfile.gif" not in files
         assert "unwanted_subdarfile.png" in files
@@ -453,15 +447,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex4 = r".*subdir$"
         self.d._exclude_list.rename(regex3, regex4)
         assert self.d._exclude_list.error(regex4) is None
-        p1["$Recycle.Bin"]["subdar"]["file_ending_with_subdir"].open("w").close()
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.EXCLUDED)
+        p1.joinpath("$Recycle.Bin", "subdar", "file_ending_with_subdir").touch()
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.EXCLUDED)
         files = self.get_files_and_expect_num_result(4)
         assert "file_ending_with_subdir" not in files
         assert "somesubdarfile.jpeg" in files
         assert "somesubdirfile.png" not in files
         assert "unwanted_subdirfile.gif" not in files
-        self.d.set_state(p1["$Recycle.Bin"]["subdir"], DirectoryState.NORMAL)
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "subdir"), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
         files = self.get_files_and_expect_num_result(6)
         assert "file_ending_with_subdir" not in files
@@ -471,9 +465,9 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex5 = r".*subdir.*"
         self.d._exclude_list.rename(regex4, regex5)
         # Files containing substring should be filtered
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         # The path should not match, only the filename, the "subdir" in the directory name shouldn't matter
-        p1["$Recycle.Bin"]["subdir"]["file_which_shouldnt_match"].open("w").close()
+        p1.joinpath("$Recycle.Bin", "subdir", "file_which_shouldnt_match").touch()
         files = self.get_files_and_expect_num_result(5)
         assert "somesubdirfile.png" not in files
         assert "unwanted_subdirfile.gif" not in files
@@ -493,7 +487,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         assert self.d._exclude_list.error(regex6) is None
         assert regex6 in self.d._exclude_list
         # This still should not be affected
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["subdir"]), DirectoryState.NORMAL)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "subdir")), DirectoryState.NORMAL)
         files = self.get_files_and_expect_num_result(5)
         # These files are under the "/subdir" directory
         assert "somesubdirfile.png" not in files
@@ -505,20 +499,20 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
     def test_japanese_unicode(self, tmpdir):
         p1 = Path(str(tmpdir))
-        p1["$Recycle.Bin"].mkdir()
-        p1["$Recycle.Bin"]["somerecycledfile.png"].open("w").close()
-        p1["$Recycle.Bin"]["some_unwanted_file.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["subdir"].mkdir()
-        p1["$Recycle.Bin"]["subdir"]["過去白濁物語~]_カラー.jpg"].open("w").close()
-        p1["$Recycle.Bin"]["思叫物語"].mkdir()
-        p1["$Recycle.Bin"]["思叫物語"]["なししろ会う前"].open("w").close()
-        p1["$Recycle.Bin"]["思叫物語"]["堂~ロ"].open("w").close()
-        self.d.add_path(p1["$Recycle.Bin"])
+        p1.joinpath("$Recycle.Bin").mkdir()
+        p1.joinpath("$Recycle.Bin", "somerecycledfile.png").touch()
+        p1.joinpath("$Recycle.Bin", "some_unwanted_file.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "subdir").mkdir()
+        p1.joinpath("$Recycle.Bin", "subdir", "過去白濁物語~]_カラー.jpg").touch()
+        p1.joinpath("$Recycle.Bin", "思叫物語").mkdir()
+        p1.joinpath("$Recycle.Bin", "思叫物語", "なししろ会う前").touch()
+        p1.joinpath("$Recycle.Bin", "思叫物語", "堂~ロ").touch()
+        self.d.add_path(p1.joinpath("$Recycle.Bin"))
         regex3 = r".*物語.*"
         self.d._exclude_list.add(regex3)
         self.d._exclude_list.mark(regex3)
         # print(f"get_folders(): {[x for x in self.d.get_folders()]}")
-        eq_(self.d.get_state(p1["$Recycle.Bin"]["思叫物語"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("$Recycle.Bin", "思叫物語")), DirectoryState.EXCLUDED)
         files = self.get_files_and_expect_num_result(2)
         assert "過去白濁物語~]_カラー.jpg" not in files
         assert "なししろ会う前" not in files
@@ -527,7 +521,7 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         regex4 = r".*物語$"
         self.d._exclude_list.rename(regex3, regex4)
         assert self.d._exclude_list.error(regex4) is None
-        self.d.set_state(p1["$Recycle.Bin"]["思叫物語"], DirectoryState.NORMAL)
+        self.d.set_state(p1.joinpath("$Recycle.Bin", "思叫物語"), DirectoryState.NORMAL)
         files = self.get_files_and_expect_num_result(5)
         assert "過去白濁物語~]_カラー.jpg" in files
         assert "なししろ会う前" in files
@@ -539,15 +533,15 @@ files: {self.d._exclude_list.compiled_files} all: {self.d._exclude_list.compiled
         self.d._exclude_list.add(regex)
         self.d._exclude_list.mark(regex)
         p1 = Path(str(tmpdir))
-        p1["foobar"].mkdir()
-        p1["foobar"][".hidden_file.txt"].open("w").close()
-        p1["foobar"][".hidden_dir"].mkdir()
-        p1["foobar"][".hidden_dir"]["foobar.jpg"].open("w").close()
-        p1["foobar"][".hidden_dir"][".hidden_subfile.png"].open("w").close()
-        self.d.add_path(p1["foobar"])
+        p1.joinpath("foobar").mkdir()
+        p1.joinpath("foobar", ".hidden_file.txt").touch()
+        p1.joinpath("foobar", ".hidden_dir").mkdir()
+        p1.joinpath("foobar", ".hidden_dir", "foobar.jpg").touch()
+        p1.joinpath("foobar", ".hidden_dir", ".hidden_subfile.png").touch()
+        self.d.add_path(p1.joinpath("foobar"))
         # It should not inherit its parent's state originally
-        eq_(self.d.get_state(p1["foobar"][".hidden_dir"]), DirectoryState.EXCLUDED)
+        eq_(self.d.get_state(p1.joinpath("foobar", ".hidden_dir")), DirectoryState.EXCLUDED)
self.d.set_state(p1["foobar"][".hidden_dir"], DirectoryState.NORMAL) self.d.set_state(p1.joinpath("foobar", ".hidden_dir"), DirectoryState.NORMAL)
# The files should still be filtered # The files should still be filtered
files = self.get_files_and_expect_num_result(1) files = self.get_files_and_expect_num_result(1)
eq_(len(self.d._exclude_list.compiled_paths), 0) eq_(len(self.d._exclude_list.compiled_paths), 0)
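These exclusion tests hinge on the split the exclude list makes between filename patterns and path patterns, visible above as compiled_files and compiled_paths. A minimal sketch of that dispatch, assuming a pattern counts as a path pattern when it mentions a path separator (the function names here are illustrative, not the real API):

import re

def compile_exclusions(marked_regexes):
    # Patterns that mention a path separator are matched against the whole
    # path; everything else is matched against the filename alone.
    compiled_paths, compiled_files = [], []
    for pattern in marked_regexes:
        compiled = re.compile(pattern)
        if "/" in pattern or "\\" in pattern:
            compiled_paths.append(compiled)
        else:
            compiled_files.append(compiled)
    return compiled_paths, compiled_files

def is_excluded(path, name, compiled_paths, compiled_files):
    return any(r.fullmatch(name) for r in compiled_files) or any(
        r.fullmatch(str(path)) for r in compiled_paths
    )

This is consistent with the last assertion above: a filename-only regex leaves compiled_paths empty.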


@ -10,9 +10,9 @@ from hscommon.jobprogress import job
from hscommon.util import first from hscommon.util import first
from hscommon.testutil import eq_, log_calls from hscommon.testutil import eq_, log_calls
from .base import NamedObject from core.tests.base import NamedObject
from .. import engine from core import engine
from ..engine import ( from core.engine import (
get_match, get_match,
getwords, getwords,
Group, Group,
@ -71,7 +71,10 @@ class TestCasegetwords:
def test_unicode(self): def test_unicode(self):
eq_(["e", "c", "0", "a", "o", "u", "e", "u"], getwords("é ç 0 à ö û è ¤ ù")) eq_(["e", "c", "0", "a", "o", "u", "e", "u"], getwords("é ç 0 à ö û è ¤ ù"))
eq_(["02", "君のこころは輝いてるかい?", "国木田花丸", "solo", "ver"], getwords("02 君のこころは輝いてるかい? 国木田花丸 Solo Ver")) eq_(
["02", "君のこころは輝いてるかい?", "国木田花丸", "solo", "ver"],
getwords("02 君のこころは輝いてるかい? 国木田花丸 Solo Ver"),
)
def test_splitter_chars(self): def test_splitter_chars(self):
eq_( eq_(
@ -271,9 +274,9 @@ class TestCaseBuildWordDict:
class TestCaseMergeSimilarWords: class TestCaseMergeSimilarWords:
def test_some_similar_words(self): def test_some_similar_words(self):
d = { d = {
"foobar": set([1]), "foobar": {1},
"foobar1": set([2]), "foobar1": {2},
"foobar2": set([3]), "foobar2": {3},
} }
merge_similar_words(d) merge_similar_words(d)
eq_(1, len(d)) eq_(1, len(d))
@ -283,8 +286,8 @@ class TestCaseMergeSimilarWords:
class TestCaseReduceCommonWords: class TestCaseReduceCommonWords:
def test_typical(self): def test_typical(self):
d = { d = {
"foo": set([NamedObject("foo bar", True) for _ in range(50)]), "foo": {NamedObject("foo bar", True) for _ in range(50)},
"bar": set([NamedObject("foo bar", True) for _ in range(49)]), "bar": {NamedObject("foo bar", True) for _ in range(49)},
} }
reduce_common_words(d, 50) reduce_common_words(d, 50)
assert "foo" not in d assert "foo" not in d
@ -293,7 +296,7 @@ class TestCaseReduceCommonWords:
def test_dont_remove_objects_with_only_common_words(self): def test_dont_remove_objects_with_only_common_words(self):
d = { d = {
"common": set([NamedObject("common uncommon", True) for _ in range(50)] + [NamedObject("common", True)]), "common": set([NamedObject("common uncommon", True) for _ in range(50)] + [NamedObject("common", True)]),
"uncommon": set([NamedObject("common uncommon", True)]), "uncommon": {NamedObject("common uncommon", True)},
} }
reduce_common_words(d, 50) reduce_common_words(d, 50)
eq_(1, len(d["common"])) eq_(1, len(d["common"]))
@ -302,7 +305,7 @@ class TestCaseReduceCommonWords:
def test_values_still_are_set_instances(self): def test_values_still_are_set_instances(self):
d = { d = {
"common": set([NamedObject("common uncommon", True) for _ in range(50)] + [NamedObject("common", True)]), "common": set([NamedObject("common uncommon", True) for _ in range(50)] + [NamedObject("common", True)]),
"uncommon": set([NamedObject("common uncommon", True)]), "uncommon": {NamedObject("common uncommon", True)},
} }
reduce_common_words(d, 50) reduce_common_words(d, 50)
assert isinstance(d["common"], set) assert isinstance(d["common"], set)
@ -312,9 +315,9 @@ class TestCaseReduceCommonWords:
# If a word has been removed by the reduce, an object in a subsequent common word that # If a word has been removed by the reduce, an object in a subsequent common word that
# contains the word that has been removed would cause a KeyError. # contains the word that has been removed would cause a KeyError.
d = { d = {
"foo": set([NamedObject("foo bar baz", True) for _ in range(50)]), "foo": {NamedObject("foo bar baz", True) for _ in range(50)},
"bar": set([NamedObject("foo bar baz", True) for _ in range(50)]), "bar": {NamedObject("foo bar baz", True) for _ in range(50)},
"baz": set([NamedObject("foo bar baz", True) for _ in range(49)]), "baz": {NamedObject("foo bar baz", True) for _ in range(49)},
} }
try: try:
reduce_common_words(d, 50) reduce_common_words(d, 50)
@ -328,7 +331,7 @@ class TestCaseReduceCommonWords:
o.words = [["foo", "bar"], ["baz"]] o.words = [["foo", "bar"], ["baz"]]
return o return o
d = {"foo": set([create_it() for _ in range(50)])} d = {"foo": {create_it() for _ in range(50)}}
try: try:
reduce_common_words(d, 50) reduce_common_words(d, 50)
except TypeError: except TypeError:
@ -343,7 +346,7 @@ class TestCaseReduceCommonWords:
d = { d = {
"foo": set([NamedObject("foo bar baz", True) for _ in range(49)] + [only_common]), "foo": set([NamedObject("foo bar baz", True) for _ in range(49)] + [only_common]),
"bar": set([NamedObject("foo bar baz", True) for _ in range(49)] + [only_common]), "bar": set([NamedObject("foo bar baz", True) for _ in range(49)] + [only_common]),
"baz": set([NamedObject("foo bar baz", True) for _ in range(49)]), "baz": {NamedObject("foo bar baz", True) for _ in range(49)},
} }
reduce_common_words(d, 50) reduce_common_words(d, 50)
eq_(1, len(d["foo"])) eq_(1, len(d["foo"]))
@ -530,7 +533,7 @@ class TestCaseGetMatches:
class TestCaseGetMatchesByContents: class TestCaseGetMatchesByContents:
def test_big_file_partial_hashes(self): def test_big_file_partial_hashing(self):
smallsize = 1 smallsize = 1
bigsize = 100 * 1024 * 1024 # 100MB bigsize = 100 * 1024 * 1024 # 100MB
f = [ f = [
@ -539,17 +542,17 @@ class TestCaseGetMatchesByContents:
no("smallfoo", size=smallsize), no("smallfoo", size=smallsize),
no("smallbar", size=smallsize), no("smallbar", size=smallsize),
] ]
f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar" f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar" f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
f[2].md5 = f[2].md5partial = "bleh" f[2].digest = f[2].digest_partial = "bleh"
f[3].md5 = f[3].md5partial = "bleh" f[3].digest = f[3].digest_partial = "bleh"
r = getmatches_by_contents(f, bigsize=bigsize) r = getmatches_by_contents(f, bigsize=bigsize)
eq_(len(r), 2) eq_(len(r), 2)
# User disabled optimization for big files, compute hashes as usual # User disabled optimization for big files, compute digests as usual
r = getmatches_by_contents(f, bigsize=0) r = getmatches_by_contents(f, bigsize=0)
eq_(len(r), 2) eq_(len(r), 2)
# Other file is now slightly different, md5partial is still the same # Other file is now slightly different, digest_partial is still the same
f[1].md5 = f[1].md5samples = "foobardiff" f[1].digest = f[1].digest_samples = "foobardiff"
r = getmatches_by_contents(f, bigsize=bigsize) r = getmatches_by_contents(f, bigsize=bigsize)
# Successfully filter it out # Successfully filter it out
eq_(len(r), 1) eq_(len(r), 1)
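A sketch of the tiered comparison this test exercises, assuming file objects expose size, digest_partial, digest_samples and digest as above (this mirrors the test's semantics, not the engine's actual code):

def appear_identical(a, b, bigsize):
    if a.size != b.size:
        return False
    if a.digest_partial != b.digest_partial:
        return False  # cheap first-chunk check rules most pairs out
    if bigsize and a.size > bigsize:
        # big files: a digest over sampled chunks stands in for a full read
        return a.digest_samples == b.digest_samples
    return a.digest == b.digest  # small files (or bigsize == 0): hash everything

With bigsize=0 the samples shortcut is skipped, which is why the "foobardiff" file above is only filtered out once sampled digests are consulted.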
@ -884,7 +887,7 @@ class TestCaseGetGroups:
# If, with a (A, B, C, D) set, all match with A, but C and D don't match with B and that the # If, with a (A, B, C, D) set, all match with A, but C and D don't match with B and that the
# (A, B) match is the highest (thus resulting in an (A, B) group), still match C and D # (A, B) match is the highest (thus resulting in an (A, B) group), still match C and D
# in a separate group instead of discarding them. # in a separate group instead of discarding them.
A, B, C, D = [NamedObject() for _ in range(4)] A, B, C, D = (NamedObject() for _ in range(4))
m1 = Match(A, B, 90) # This is the strongest "A" match m1 = Match(A, B, 90) # This is the strongest "A" match
m2 = Match(A, C, 80) # Because C doesn't match with B, it won't be in the group m2 = Match(A, C, 80) # Because C doesn't match with B, it won't be in the group
m3 = Match(A, D, 80) # Same thing for D m3 = Match(A, D, 80) # Same thing for D


@ -10,8 +10,8 @@ from xml.etree import ElementTree as ET
from hscommon.testutil import eq_ from hscommon.testutil import eq_
from hscommon.plat import ISWINDOWS from hscommon.plat import ISWINDOWS
from .base import DupeGuru from core.tests.base import DupeGuru
from ..exclude import ExcludeList, ExcludeDict, default_regexes, AlreadyThereException from core.exclude import ExcludeList, ExcludeDict, default_regexes, AlreadyThereException
from re import error from re import error
@ -289,8 +289,8 @@ class TestCaseListEmptyUnion(TestCaseListEmpty):
compiled = [x for x in self.exclude_list.compiled] compiled = [x for x in self.exclude_list.compiled]
assert regex not in compiled assert regex not in compiled
# Need to escape both to get the same strings after compilation # Need to escape both to get the same strings after compilation
compiled_escaped = set([x.encode("unicode-escape").decode() for x in compiled[0].pattern.split("|")]) compiled_escaped = {x.encode("unicode-escape").decode() for x in compiled[0].pattern.split("|")}
default_escaped = set([x.encode("unicode-escape").decode() for x in default_regexes]) default_escaped = {x.encode("unicode-escape").decode() for x in default_regexes}
assert compiled_escaped == default_escaped assert compiled_escaped == default_escaped
eq_(len(default_regexes), len(compiled[0].pattern.split("|"))) eq_(len(default_regexes), len(compiled[0].pattern.split("|")))
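The unicode-escape round trip above normalizes both sides so that, for example, a literal tab character and the two-character escape sequence \t compare equal after compilation. A quick illustration:

a = "foo\tbar".encode("unicode-escape").decode()  # literal tab, re-escaped
b = r"foo\tbar"  # already in escaped form
assert a == b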
@ -366,8 +366,8 @@ class TestCaseDictEmptyUnion(TestCaseDictEmpty):
compiled = [x for x in self.exclude_list.compiled] compiled = [x for x in self.exclude_list.compiled]
assert regex not in compiled assert regex not in compiled
# Need to escape both to get the same strings after compilation # Need to escape both to get the same strings after compilation
compiled_escaped = set([x.encode("unicode-escape").decode() for x in compiled[0].pattern.split("|")]) compiled_escaped = {x.encode("unicode-escape").decode() for x in compiled[0].pattern.split("|")}
default_escaped = set([x.encode("unicode-escape").decode() for x in default_regexes]) default_escaped = {x.encode("unicode-escape").decode() for x in default_regexes}
assert compiled_escaped == default_escaped assert compiled_escaped == default_escaped
eq_(len(default_regexes), len(compiled[0].pattern.split("|"))) eq_(len(default_regexes), len(compiled[0].pattern.split("|")))


@ -6,43 +6,47 @@
# which should be included with this package. The terms are also available at # which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
import hashlib import typing
from os import urandom from os import urandom
from hscommon.path import Path from pathlib import Path
from hscommon.testutil import eq_ from hscommon.testutil import eq_
from core.tests.directories_test import create_fake_fs from core.tests.directories_test import create_fake_fs
from .. import fs from core import fs
hasher: typing.Callable
try:
import xxhash
hasher = xxhash.xxh128
except ImportError:
import hashlib
hasher = hashlib.md5
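Since xxhash mirrors hashlib's interface, the fallback above is transparent to the tests that follow; a small demonstration (illustrative only):

h = hasher(b"hello")
h.update(b" world")
print(h.hexdigest())  # xxh128 hex digest if xxhash is installed, md5 otherwise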
def create_fake_fs_with_random_data(rootpath): def create_fake_fs_with_random_data(rootpath):
rootpath = rootpath["fs"] rootpath = rootpath.joinpath("fs")
rootpath.mkdir() rootpath.mkdir()
rootpath["dir1"].mkdir() rootpath.joinpath("dir1").mkdir()
rootpath["dir2"].mkdir() rootpath.joinpath("dir2").mkdir()
rootpath["dir3"].mkdir() rootpath.joinpath("dir3").mkdir()
fp = rootpath["file1.test"].open("wb")
data1 = urandom(200 * 1024) # 200KiB data1 = urandom(200 * 1024) # 200KiB
data2 = urandom(1024 * 1024) # 1MiB data2 = urandom(1024 * 1024) # 1MiB
data3 = urandom(10 * 1024 * 1024) # 10MiB data3 = urandom(10 * 1024 * 1024) # 10MiB
fp.write(data1) with rootpath.joinpath("file1.test").open("wb") as fp:
fp.close() fp.write(data1)
fp = rootpath["file2.test"].open("wb") with rootpath.joinpath("file2.test").open("wb") as fp:
fp.write(data2) fp.write(data2)
fp.close() with rootpath.joinpath("file3.test").open("wb") as fp:
fp = rootpath["file3.test"].open("wb") fp.write(data3)
fp.write(data3) with rootpath.joinpath("dir1", "file1.test").open("wb") as fp:
fp.close() fp.write(data1)
fp = rootpath["dir1"]["file1.test"].open("wb") with rootpath.joinpath("dir2", "file2.test").open("wb") as fp:
fp.write(data1) fp.write(data2)
fp.close() with rootpath.joinpath("dir3", "file3.test").open("wb") as fp:
fp = rootpath["dir2"]["file2.test"].open("wb") fp.write(data3)
fp.write(data2)
fp.close()
fp = rootpath["dir3"]["file3.test"].open("wb")
fp.write(data3)
fp.close()
return rootpath return rootpath
@ -52,54 +56,54 @@ def test_size_aggregates_subfiles(tmpdir):
eq_(b.size, 12) eq_(b.size, 12)
def test_md5_aggregate_subfiles_sorted(tmpdir): def test_digest_aggregate_subfiles_sorted(tmpdir):
# dir.allfiles can return child in any order. Thus, bundle.md5 must aggregate # dir.allfiles can return child in any order. Thus, bundle.digest must aggregate
# all files' md5 it contains, but it must make sure that it does so in the # all files' digests it contains, but it must make sure that it does so in the
# same order every time. # same order every time.
p = create_fake_fs_with_random_data(Path(str(tmpdir))) p = create_fake_fs_with_random_data(Path(str(tmpdir)))
b = fs.Folder(p) b = fs.Folder(p)
md51 = fs.File(p["dir1"]["file1.test"]).md5 digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest
md52 = fs.File(p["dir2"]["file2.test"]).md5 digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest
md53 = fs.File(p["dir3"]["file3.test"]).md5 digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest
md54 = fs.File(p["file1.test"]).md5 digest4 = fs.File(p.joinpath("file1.test")).digest
md55 = fs.File(p["file2.test"]).md5 digest5 = fs.File(p.joinpath("file2.test")).digest
md56 = fs.File(p["file3.test"]).md5 digest6 = fs.File(p.joinpath("file3.test")).digest
# The expected md5 is the md5 of md5s for folders and the direct md5 for files # The expected digest is the hash of digests for folders and the direct digest for files
folder_md51 = hashlib.md5(md51).digest() folder_digest1 = hasher(digest1).digest()
folder_md52 = hashlib.md5(md52).digest() folder_digest2 = hasher(digest2).digest()
folder_md53 = hashlib.md5(md53).digest() folder_digest3 = hasher(digest3).digest()
md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56) digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
eq_(b.md5, md5.digest()) eq_(b.digest, digest)
def test_partial_md5_aggregate_subfile_sorted(tmpdir): def test_partial_digest_aggregate_subfile_sorted(tmpdir):
p = create_fake_fs_with_random_data(Path(str(tmpdir))) p = create_fake_fs_with_random_data(Path(str(tmpdir)))
b = fs.Folder(p) b = fs.Folder(p)
md51 = fs.File(p["dir1"]["file1.test"]).md5partial digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_partial
md52 = fs.File(p["dir2"]["file2.test"]).md5partial digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_partial
md53 = fs.File(p["dir3"]["file3.test"]).md5partial digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_partial
md54 = fs.File(p["file1.test"]).md5partial digest4 = fs.File(p.joinpath("file1.test")).digest_partial
md55 = fs.File(p["file2.test"]).md5partial digest5 = fs.File(p.joinpath("file2.test")).digest_partial
md56 = fs.File(p["file3.test"]).md5partial digest6 = fs.File(p.joinpath("file3.test")).digest_partial
# The expected md5 is the md5 of md5s for folders and the direct md5 for files # The expected digest is the hash of digests for folders and the direct digest for files
folder_md51 = hashlib.md5(md51).digest() folder_digest1 = hasher(digest1).digest()
folder_md52 = hashlib.md5(md52).digest() folder_digest2 = hasher(digest2).digest()
folder_md53 = hashlib.md5(md53).digest() folder_digest3 = hasher(digest3).digest()
md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56) digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
eq_(b.md5partial, md5.digest()) eq_(b.digest_partial, digest)
md51 = fs.File(p["dir1"]["file1.test"]).md5samples digest1 = fs.File(p.joinpath("dir1", "file1.test")).digest_samples
md52 = fs.File(p["dir2"]["file2.test"]).md5samples digest2 = fs.File(p.joinpath("dir2", "file2.test")).digest_samples
md53 = fs.File(p["dir3"]["file3.test"]).md5samples digest3 = fs.File(p.joinpath("dir3", "file3.test")).digest_samples
md54 = fs.File(p["file1.test"]).md5samples digest4 = fs.File(p.joinpath("file1.test")).digest_samples
md55 = fs.File(p["file2.test"]).md5samples digest5 = fs.File(p.joinpath("file2.test")).digest_samples
md56 = fs.File(p["file3.test"]).md5samples digest6 = fs.File(p.joinpath("file3.test")).digest_samples
# The expected md5 is the md5 of md5s for folders and the direct md5 for files # The expected digest is the digest of digests for folders and the direct digest for files
folder_md51 = hashlib.md5(md51).digest() folder_digest1 = hasher(digest1).digest()
folder_md52 = hashlib.md5(md52).digest() folder_digest2 = hasher(digest2).digest()
folder_md53 = hashlib.md5(md53).digest() folder_digest3 = hasher(digest3).digest()
md5 = hashlib.md5(folder_md51 + folder_md52 + folder_md53 + md54 + md55 + md56) digest = hasher(folder_digest1 + folder_digest2 + folder_digest3 + digest4 + digest5 + digest6).digest()
eq_(b.md5samples, md5.digest()) eq_(b.digest_samples, digest)
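The arithmetic in these tests implies a recursive aggregation: a folder's digest is the hash of its children's digests taken in a stable sorted order, where a folder child contributes its own aggregated digest. A sketch under a hypothetical children/is_folder/digest interface (the real fs.Folder does not expose these names):

def folder_digest(entry):
    # Sort so the aggregate is stable even if the filesystem returns
    # children in arbitrary order.
    parts = []
    for child in sorted(entry.children, key=lambda c: c.name):
        if child.is_folder:
            parts.append(folder_digest(child))  # already a hasher(...).digest()
        else:
            parts.append(child.digest)
    return hasher(b"".join(parts)).digest()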
def test_has_file_attrs(tmpdir): def test_has_file_attrs(tmpdir):


@ -10,7 +10,7 @@ from xml.etree import ElementTree as ET
from pytest import raises from pytest import raises
from hscommon.testutil import eq_ from hscommon.testutil import eq_
from ..ignore import IgnoreList from core.ignore import IgnoreList
def test_empty(): def test_empty():


@ -6,7 +6,7 @@
from hscommon.testutil import eq_ from hscommon.testutil import eq_
from ..markable import MarkableList, Markable from core.markable import MarkableList, Markable
def gen(): def gen():


@ -9,8 +9,8 @@
import os.path as op import os.path as op
from itertools import combinations from itertools import combinations
from .base import TestApp, NamedObject, with_app, eq_ from core.tests.base import TestApp, NamedObject, with_app, eq_
from ..engine import Group, Match from core.engine import Group, Match
no = NamedObject no = NamedObject


@ -6,7 +6,7 @@
# which should be included with this package. The terms are also available at # which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
from .base import TestApp, GetTestGroups from core.tests.base import TestApp, GetTestGroups
def app_with_results(): def app_with_results():


@ -12,10 +12,9 @@ from xml.etree import ElementTree as ET
from pytest import raises from pytest import raises
from hscommon.testutil import eq_ from hscommon.testutil import eq_
from hscommon.util import first from hscommon.util import first
from core import engine
from .. import engine from core.tests.base import NamedObject, GetTestGroups, DupeGuru
from .base import NamedObject, GetTestGroups, DupeGuru from core.results import Results
from ..results import Results
class TestCaseResultsEmpty: class TestCaseResultsEmpty:
@ -337,7 +336,7 @@ class TestCaseResultsMarkings:
def log_object(o): def log_object(o):
log.append(o) log.append(o)
if o is self.objects[1]: if o is self.objects[1]:
raise EnvironmentError("foobar") raise OSError("foobar")
log = [] log = []
self.results.mark_all() self.results.mark_all()
@ -447,7 +446,7 @@ class TestCaseResultsXML:
self.results.groups = self.groups self.results.groups = self.groups
def get_file(self, path): # use this as a callback for load_from_xml def get_file(self, path): # use this as a callback for load_from_xml
return [o for o in self.objects if o.path == path][0] return [o for o in self.objects if str(o.path) == path][0]
def test_save_to_xml(self): def test_save_to_xml(self):
self.objects[0].is_ref = True self.objects[0].is_ref = True
@ -464,7 +463,7 @@ class TestCaseResultsXML:
eq_(6, len(g1)) eq_(6, len(g1))
eq_(3, len([c for c in g1 if c.tag == "file"])) eq_(3, len([c for c in g1 if c.tag == "file"]))
eq_(3, len([c for c in g1 if c.tag == "match"])) eq_(3, len([c for c in g1 if c.tag == "match"]))
d1, d2, d3 = [c for c in g1 if c.tag == "file"] d1, d2, d3 = (c for c in g1 if c.tag == "file")
eq_(op.join("basepath", "foo bar"), d1.get("path")) eq_(op.join("basepath", "foo bar"), d1.get("path"))
eq_(op.join("basepath", "bar bleh"), d2.get("path")) eq_(op.join("basepath", "bar bleh"), d2.get("path"))
eq_(op.join("basepath", "foo bleh"), d3.get("path")) eq_(op.join("basepath", "foo bleh"), d3.get("path"))
@ -477,7 +476,7 @@ class TestCaseResultsXML:
eq_(3, len(g2)) eq_(3, len(g2))
eq_(2, len([c for c in g2 if c.tag == "file"])) eq_(2, len([c for c in g2 if c.tag == "file"]))
eq_(1, len([c for c in g2 if c.tag == "match"])) eq_(1, len([c for c in g2 if c.tag == "match"]))
d1, d2 = [c for c in g2 if c.tag == "file"] d1, d2 = (c for c in g2 if c.tag == "file")
eq_(op.join("basepath", "ibabtu"), d1.get("path")) eq_(op.join("basepath", "ibabtu"), d1.get("path"))
eq_(op.join("basepath", "ibabtu"), d2.get("path")) eq_(op.join("basepath", "ibabtu"), d2.get("path"))
eq_("n", d1.get("is_ref")) eq_("n", d1.get("is_ref"))


@ -7,29 +7,33 @@
import pytest import pytest
from hscommon.jobprogress import job from hscommon.jobprogress import job
from hscommon.path import Path from pathlib import Path
from hscommon.testutil import eq_ from hscommon.testutil import eq_
from .. import fs from core import fs
from ..engine import getwords, Match from core.engine import getwords, Match
from ..ignore import IgnoreList from core.ignore import IgnoreList
from ..scanner import Scanner, ScanType from core.scanner import Scanner, ScanType
from ..me.scanner import ScannerME from core.me.scanner import ScannerME
# TODO update this to be able to inherit from fs.File
class NamedObject: class NamedObject:
def __init__(self, name="foobar", size=1, path=None): def __init__(self, name="foobar", size=1, path=None):
if path is None: if path is None:
path = Path(name) path = Path(name)
else: else:
path = Path(path)[name] path = Path(path, name)
self.name = name self.name = name
self.size = size self.size = size
self.path = path self.path = path
self.words = getwords(name) self.words = getwords(name)
def __repr__(self): def __repr__(self):
return "<NamedObject %r %r>" % (self.name, self.path) return "<NamedObject {!r} {!r}>".format(self.name, self.path)
def exists(self):
return self.path.exists()
no = NamedObject no = NamedObject
@ -123,19 +127,19 @@ def test_content_scan(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.CONTENTS s.scan_type = ScanType.CONTENTS
f = [no("foo"), no("bar"), no("bleh")] f = [no("foo"), no("bar"), no("bleh")]
f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar" f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar" f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
f[2].md5 = f[2].md5partial = f[1].md5samples = "bleh" f[2].digest = f[2].digest_partial = f[1].digest_samples = "bleh"
r = s.get_dupe_groups(f) r = s.get_dupe_groups(f)
eq_(len(r), 1) eq_(len(r), 1)
eq_(len(r[0]), 2) eq_(len(r[0]), 2)
eq_(s.discarded_file_count, 0) # don't count the different md5 as discarded! eq_(s.discarded_file_count, 0) # don't count the different digest as discarded!
def test_content_scan_compare_sizes_first(fake_fileexists): def test_content_scan_compare_sizes_first(fake_fileexists):
class MyFile(no): class MyFile(no):
@property @property
def md5(self): def digest(self):
raise AssertionError() raise AssertionError()
s = Scanner() s = Scanner()
@ -161,14 +165,14 @@ def test_ignore_file_size(fake_fileexists):
no("largeignore1", large_size + 1), no("largeignore1", large_size + 1),
no("largeignore2", large_size + 1), no("largeignore2", large_size + 1),
] ]
f[0].md5 = f[0].md5partial = f[0].md5samples = "smallignore" f[0].digest = f[0].digest_partial = f[0].digest_samples = "smallignore"
f[1].md5 = f[1].md5partial = f[1].md5samples = "smallignore" f[1].digest = f[1].digest_partial = f[1].digest_samples = "smallignore"
f[2].md5 = f[2].md5partial = f[2].md5samples = "small" f[2].digest = f[2].digest_partial = f[2].digest_samples = "small"
f[3].md5 = f[3].md5partial = f[3].md5samples = "small" f[3].digest = f[3].digest_partial = f[3].digest_samples = "small"
f[4].md5 = f[4].md5partial = f[4].md5samples = "large" f[4].digest = f[4].digest_partial = f[4].digest_samples = "large"
f[5].md5 = f[5].md5partial = f[5].md5samples = "large" f[5].digest = f[5].digest_partial = f[5].digest_samples = "large"
f[6].md5 = f[6].md5partial = f[6].md5samples = "largeignore" f[6].digest = f[6].digest_partial = f[6].digest_samples = "largeignore"
f[7].md5 = f[7].md5partial = f[7].md5samples = "largeignore" f[7].digest = f[7].digest_partial = f[7].digest_samples = "largeignore"
r = s.get_dupe_groups(f) r = s.get_dupe_groups(f)
# No ignores # No ignores
@ -197,21 +201,21 @@ def test_big_file_partial_hashes(fake_fileexists):
s.big_file_size_threshold = bigsize s.big_file_size_threshold = bigsize
f = [no("bigfoo", bigsize), no("bigbar", bigsize), no("smallfoo", smallsize), no("smallbar", smallsize)] f = [no("bigfoo", bigsize), no("bigbar", bigsize), no("smallfoo", smallsize), no("smallbar", smallsize)]
f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar" f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar" f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
f[2].md5 = f[2].md5partial = "bleh" f[2].digest = f[2].digest_partial = "bleh"
f[3].md5 = f[3].md5partial = "bleh" f[3].digest = f[3].digest_partial = "bleh"
r = s.get_dupe_groups(f) r = s.get_dupe_groups(f)
eq_(len(r), 2) eq_(len(r), 2)
# md5partial is still the same, but the file is actually different # digest_partial is still the same, but the file is actually different
f[1].md5 = f[1].md5samples = "difffoobar" f[1].digest = f[1].digest_samples = "difffoobar"
# here we compare the full md5s, as the user disabled the optimization # here we compare the full digests, as the user disabled the optimization
s.big_file_size_threshold = 0 s.big_file_size_threshold = 0
r = s.get_dupe_groups(f) r = s.get_dupe_groups(f)
eq_(len(r), 1) eq_(len(r), 1)
# here we should compare the md5samples, and see they are different # here we should compare the digest_samples, and see they are different
s.big_file_size_threshold = bigsize s.big_file_size_threshold = bigsize
r = s.get_dupe_groups(f) r = s.get_dupe_groups(f)
eq_(len(r), 1) eq_(len(r), 1)
@ -221,9 +225,9 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.CONTENTS s.scan_type = ScanType.CONTENTS
f = [no("foo"), no("bar"), no("bleh")] f = [no("foo"), no("bar"), no("bleh")]
f[0].md5 = f[0].md5partial = f[0].md5samples = "foobar" f[0].digest = f[0].digest_partial = f[0].digest_samples = "foobar"
f[1].md5 = f[1].md5partial = f[1].md5samples = "foobar" f[1].digest = f[1].digest_partial = f[1].digest_samples = "foobar"
f[2].md5 = f[2].md5partial = f[2].md5samples = "bleh" f[2].digest = f[2].digest_partial = f[2].digest_samples = "bleh"
s.min_match_percentage = 101 s.min_match_percentage = 101
r = s.get_dupe_groups(f) r = s.get_dupe_groups(f)
eq_(len(r), 1) eq_(len(r), 1)
@ -234,12 +238,16 @@ def test_min_match_perc_doesnt_matter_for_content_scan(fake_fileexists):
eq_(len(r[0]), 2) eq_(len(r[0]), 2)
def test_content_scan_doesnt_put_md5_in_words_at_the_end(fake_fileexists): def test_content_scan_doesnt_put_digest_in_words_at_the_end(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.CONTENTS s.scan_type = ScanType.CONTENTS
f = [no("foo"), no("bar")] f = [no("foo"), no("bar")]
f[0].md5 = f[0].md5partial = f[0].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" f[0].digest = f[0].digest_partial = f[0].digest_samples = (
f[1].md5 = f[1].md5partial = f[1].md5samples = "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
)
f[1].digest = f[1].digest_partial = f[1].digest_samples = (
"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
)
r = s.get_dupe_groups(f) r = s.get_dupe_groups(f)
# FIXME looks like we are missing something here? # FIXME looks like we are missing something here?
r[0] r[0]
@ -332,7 +340,7 @@ def test_tag_scan(fake_fileexists):
def test_tag_with_album_scan(fake_fileexists): def test_tag_with_album_scan(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.TAG s.scan_type = ScanType.TAG
s.scanned_tags = set(["artist", "album", "title"]) s.scanned_tags = {"artist", "album", "title"}
o1 = no("foo") o1 = no("foo")
o2 = no("bar") o2 = no("bar")
o3 = no("bleh") o3 = no("bleh")
@ -352,7 +360,7 @@ def test_tag_with_album_scan(fake_fileexists):
def test_that_dash_in_tags_dont_create_new_fields(fake_fileexists): def test_that_dash_in_tags_dont_create_new_fields(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.TAG s.scan_type = ScanType.TAG
s.scanned_tags = set(["artist", "album", "title"]) s.scanned_tags = {"artist", "album", "title"}
s.min_match_percentage = 50 s.min_match_percentage = 50
o1 = no("foo") o1 = no("foo")
o2 = no("bar") o2 = no("bar")
@ -369,7 +377,7 @@ def test_that_dash_in_tags_dont_create_new_fields(fake_fileexists):
def test_tag_scan_with_different_scanned(fake_fileexists): def test_tag_scan_with_different_scanned(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.TAG s.scan_type = ScanType.TAG
s.scanned_tags = set(["track", "year"]) s.scanned_tags = {"track", "year"}
o1 = no("foo") o1 = no("foo")
o2 = no("bar") o2 = no("bar")
o1.artist = "The White Stripes" o1.artist = "The White Stripes"
@ -387,7 +395,7 @@ def test_tag_scan_with_different_scanned(fake_fileexists):
def test_tag_scan_only_scans_existing_tags(fake_fileexists): def test_tag_scan_only_scans_existing_tags(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.TAG s.scan_type = ScanType.TAG
s.scanned_tags = set(["artist", "foo"]) s.scanned_tags = {"artist", "foo"}
o1 = no("foo") o1 = no("foo")
o2 = no("bar") o2 = no("bar")
o1.artist = "The White Stripes" o1.artist = "The White Stripes"
@ -401,7 +409,7 @@ def test_tag_scan_only_scans_existing_tags(fake_fileexists):
def test_tag_scan_converts_to_str(fake_fileexists): def test_tag_scan_converts_to_str(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.TAG s.scan_type = ScanType.TAG
s.scanned_tags = set(["track"]) s.scanned_tags = {"track"}
o1 = no("foo") o1 = no("foo")
o2 = no("bar") o2 = no("bar")
o1.track = 42 o1.track = 42
@ -416,7 +424,7 @@ def test_tag_scan_converts_to_str(fake_fileexists):
def test_tag_scan_non_ascii(fake_fileexists): def test_tag_scan_non_ascii(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.TAG s.scan_type = ScanType.TAG
s.scanned_tags = set(["title"]) s.scanned_tags = {"title"}
o1 = no("foo") o1 = no("foo")
o2 = no("bar") o2 = no("bar")
o1.title = "foobar\u00e9" o1.title = "foobar\u00e9"
@ -568,12 +576,14 @@ def test_dont_group_files_that_dont_exist(tmpdir):
s = Scanner() s = Scanner()
s.scan_type = ScanType.CONTENTS s.scan_type = ScanType.CONTENTS
p = Path(str(tmpdir)) p = Path(str(tmpdir))
p["file1"].open("w").write("foo") with p.joinpath("file1").open("w") as fp:
p["file2"].open("w").write("foo") fp.write("foo")
with p.joinpath("file2").open("w") as fp:
fp.write("foo")
file1, file2 = fs.get_files(p) file1, file2 = fs.get_files(p)
def getmatches(*args, **kw): def getmatches(*args, **kw):
file2.path.remove() file2.path.unlink()
return [Match(file1, file2, 100)] return [Match(file1, file2, 100)]
s._getmatches = getmatches s._getmatches = getmatches
@ -587,21 +597,21 @@ def test_folder_scan_exclude_subfolder_matches(fake_fileexists):
s = Scanner() s = Scanner()
s.scan_type = ScanType.FOLDERS s.scan_type = ScanType.FOLDERS
topf1 = no("top folder 1", size=42) topf1 = no("top folder 1", size=42)
topf1.md5 = topf1.md5partial = topf1.md5samples = b"some_md5_1" topf1.digest = topf1.digest_partial = topf1.digest_samples = b"some_digest__1"
topf1.path = Path("/topf1") topf1.path = Path("/topf1")
topf2 = no("top folder 2", size=42) topf2 = no("top folder 2", size=42)
topf2.md5 = topf2.md5partial = topf2.md5samples = b"some_md5_1" topf2.digest = topf2.digest_partial = topf2.digest_samples = b"some_digest__1"
topf2.path = Path("/topf2") topf2.path = Path("/topf2")
subf1 = no("sub folder 1", size=41) subf1 = no("sub folder 1", size=41)
subf1.md5 = subf1.md5partial = subf1.md5samples = b"some_md5_2" subf1.digest = subf1.digest_partial = subf1.digest_samples = b"some_digest__2"
subf1.path = Path("/topf1/sub") subf1.path = Path("/topf1/sub")
subf2 = no("sub folder 2", size=41) subf2 = no("sub folder 2", size=41)
subf2.md5 = subf2.md5partial = subf2.md5samples = b"some_md5_2" subf2.digest = subf2.digest_partial = subf2.digest_samples = b"some_digest__2"
subf2.path = Path("/topf2/sub") subf2.path = Path("/topf2/sub")
eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2])), 1) # only top folders eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2])), 1) # only top folders
# however, if another folder matches a subfolder, keep it in the matches # however, if another folder matches a subfolder, keep it in the matches
otherf = no("other folder", size=41) otherf = no("other folder", size=41)
otherf.md5 = otherf.md5partial = otherf.md5samples = b"some_md5_2" otherf.digest = otherf.digest_partial = otherf.digest_samples = b"some_digest__2"
otherf.path = Path("/otherfolder") otherf.path = Path("/otherfolder")
eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2, otherf])), 2) eq_(len(s.get_dupe_groups([topf1, topf2, subf1, subf2, otherf])), 2)
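A sketch of the subfolder rule this test pins down: a folder match is dropped when the parents of both sides already match each other, since the parent match subsumes it (filter_subfolder_matches is a hypothetical helper, not the scanner's code):

def filter_subfolder_matches(matches):
    matched_pairs = {(m.first.path, m.second.path) for m in matches}
    kept = []
    for m in matches:
        parents = (m.first.path.parent, m.second.path.parent)
        if parents in matched_pairs or parents[::-1] in matched_pairs:
            continue  # /topf1/sub vs /topf2/sub is covered by /topf1 vs /topf2
        kept.append(m)
    return kept

This keeps the otherf match above, because /otherfolder's parent matches nothing.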
@ -624,9 +634,9 @@ def test_dont_count_ref_files_as_discarded(fake_fileexists):
o1 = no("foo", path="p1") o1 = no("foo", path="p1")
o2 = no("foo", path="p2") o2 = no("foo", path="p2")
o3 = no("foo", path="p3") o3 = no("foo", path="p3")
o1.md5 = o1.md5partial = o1.md5samples = "foobar" o1.digest = o1.digest_partial = o1.digest_samples = "foobar"
o2.md5 = o2.md5partial = o2.md5samples = "foobar" o2.digest = o2.digest_partial = o2.digest_samples = "foobar"
o3.md5 = o3.md5partial = o3.md5samples = "foobar" o3.digest = o3.digest_partial = o3.digest_samples = "foobar"
o1.is_ref = True o1.is_ref = True
o2.is_ref = True o2.is_ref = True
eq_(len(s.get_dupe_groups([o1, o2, o3])), 1) eq_(len(s.get_dupe_groups([o1, o2, o3])), 1)


@ -7,6 +7,12 @@
import time import time
import sys import sys
import os import os
import urllib.request
import urllib.error
import json
import semantic_version
import logging
from typing import Union
from hscommon.util import format_time_decimal from hscommon.util import format_time_decimal
@ -64,3 +70,34 @@ def fix_surrogate_encoding(s, encoding="utf-8"):
def executable_folder(): def executable_folder():
return os.path.dirname(os.path.abspath(sys.argv[0])) return os.path.dirname(os.path.abspath(sys.argv[0]))
def check_for_update(current_version: str, include_prerelease: bool = False) -> Union[None, dict]:
request = urllib.request.Request(
"https://api.github.com/repos/arsenetar/dupeguru/releases",
headers={"Accept": "application/vnd.github.v3+json"},
)
try:
with urllib.request.urlopen(request) as response:
if response.status != 200:
logging.warning(f"Error retrieving updates. Status: {response.status}")
return None
try:
response_json = json.loads(response.read())
except json.JSONDecodeError as ex:
logging.warning(f"Error parsing updates. {ex.msg}")
return None
except urllib.error.URLError as ex:
logging.warning(f"Error retrieving updates. {ex.reason}")
return None
new_version = semantic_version.Version(current_version)
new_url = None
for release in response_json:
release_version = semantic_version.Version(release["name"])
if new_version < release_version and (include_prerelease or not release_version.prerelease):
new_version = release_version
new_url = release["html_url"]
if new_url is not None:
return {"version": new_version, "url": new_url}
else:
return None
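A hedged usage sketch for check_for_update, assuming the caller knows its own version string (the output handling is illustrative):

latest = check_for_update("4.2.0", include_prerelease=False)
if latest is not None:
    print(f"Update available: {latest['version']} at {latest['url']}")
else:
    print("No newer release found (or the check failed; see the log).")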


@ -1,3 +1,31 @@
=== 4.3.1 (2022-07-08)
* Fix issue where cache db exceptions could prevent files being hashed (#1015)
* Add extra guard for non-zero length files without digests to prevent false duplicates
* Update Italian translations
=== 4.3.0 (2022-07-01)
* Redirect stdout from custom command to the log files (#1008)
* Update translations
* Fix typo in debian control file (#989)
* Add option to profile scans
* Update fs.py to optimize stat() calls
* Fix error when deleting after scan (#988)
* Update directory scanning to use os.scandir() and DirEntry objects
* Improve performance of Directories.get_state()
* Migrate from hscommon.path to pathlib
* Switch file hashing to xxhash with fallback to md5
* Add update check feature to about box
=== 4.2.1 (2022-03-25)
* Default to English on unsupported system language (#976)
* Fix image viewer zoom datatype issue (#978)
* Fix errors from window change event (#937, #980)
* Fix deprecation warning from SQLite
* Enforce minimum Windows version in installer (#983)
* Fix help path for local files
* Drop python 3.6 support
* VS Code project settings added, yaml validation for GitHub actions
=== 4.2.0 (2021-01-24) === 4.2.0 (2021-01-24)
* Add Malay and Turkish * Add Malay and Turkish


@ -12,7 +12,7 @@ a community around this project.
So, whatever your skills, if you're interested in contributing to dupeGuru, please do so. Normally, So, whatever your skills, if you're interested in contributing to dupeGuru, please do so. Normally,
this documentation should be enough to get you started, but if it isn't, then **please**, this documentation should be enough to get you started, but if it isn't, then **please**,
`let me know`_ because it's a problem that I'm committed to fix. If there's any situation where you'd open a discussion at https://github.com/arsenetar/dupeguru/discussions. If there's any situation where you'd
wish to contribute but some doubt you're having prevents you from going forward, please contact me. wish to contribute but some doubt you're having prevents you from going forward, please contact me.
I'd much prefer to spend the time figuring out with you whether (and how) you can contribute than I'd much prefer to spend the time figuring out with you whether (and how) you can contribute than
taking the chance of missing that opportunity. taking the chance of missing that opportunity.
@ -24,7 +24,7 @@ Development process
* `Issue Tracker`_ * `Issue Tracker`_
* `Issue labels meaning`_ * `Issue labels meaning`_
dupeGuru's source code is on Github and thus managed in a Git repository. At all times, you should dupeGuru's source code is on GitHub and thus managed in a Git repository. At all times, you should
be able to build from source a fresh checkout of the ``master`` branch using instructions from the be able to build from source a fresh checkout of the ``master`` branch using instructions from the
``README.md`` file at the root of this project. If you can't, it's a bug. Please report it. ``README.md`` file at the root of this project. If you can't, it's a bug. Please report it.
@ -61,7 +61,7 @@ It's the same thing with feature requests. Description of a feature request, whe
already been given to how such a feature would fit in the current design, are precious to developers already been given to how such a feature would fit in the current design, are precious to developers
and help them figure out a clear roadmap for the project. and help them figure out a clear roadmap for the project.
So, even if you're not a developer, you can always open a Github account and create/comment issues. So, even if you're not a developer, you can always open a GitHub account and create/comment issues.
Your contribution will be much appreciated. Your contribution will be much appreciated.
**Documentation**. This is a bit trickier because dupeGuru's documentation is written with a rather **Documentation**. This is a bit trickier because dupeGuru's documentation is written with a rather
@ -82,10 +82,9 @@ agree on what should be added to the documentation.
dupeGuru. For more information about how to do that, you can refer to the `translator guide`_. dupeGuru. For more information about how to do that, you can refer to the `translator guide`_.
.. _been open source: https://www.hardcoded.net/articles/free-as-in-speech-fair-as-in-trade .. _been open source: https://www.hardcoded.net/articles/free-as-in-speech-fair-as-in-trade
.. _let me know: mailto:hsoft@hardcoded.net
.. _Source code repository: https://github.com/arsenetar/dupeguru .. _Source code repository: https://github.com/arsenetar/dupeguru
.. _Issue Tracker: https://github.com/hsoft/arsenetar/issues .. _Issue Tracker: https://github.com/arsenetar/dupeguru/issues
.. _Issue labels meaning: https://github.com/hsoft/arsenetar/wiki/issue-labels .. _Issue labels meaning: https://github.com/arsenetar/dupeguru/wiki/issue-labels
.. _Sphinx: http://sphinx-doc.org/ .. _Sphinx: http://sphinx-doc.org/
.. _reST: http://en.wikipedia.org/wiki/ReStructuredText .. _reST: http://en.wikipedia.org/wiki/ReStructuredText
.. _translator guide: https://github.com/hsoft/arsenetar/wiki/Translator-Guide .. _translator guide: https://github.com/arsenetar/dupeguru/wiki/Translator-Guide


@ -15,4 +15,3 @@ hscommon.gui.progress_window
.. autoclass:: ProgressWindowView .. autoclass:: ProgressWindowView
:members: :members:
:private-members: :private-members:


@ -15,4 +15,3 @@ hscommon.gui.tree
.. autoclass:: Node .. autoclass:: Node
:members: :members:
:private-members: :private-members:


@ -13,4 +13,3 @@ hscommon
util util
jobprogress/* jobprogress/*
gui/* gui/*


@ -14,4 +14,3 @@ hscommon.jobprogress.job
.. autoclass:: NullJob .. autoclass:: NullJob
:members: :members:


@ -9,4 +9,3 @@ hscommon.jobprogress.performer
.. autoclass:: ThreadedJobPerformer .. autoclass:: ThreadedJobPerformer
:members: :members:


@ -30,8 +30,8 @@ that makes sure that you will **always** keep at least one member of the duplica
How can I report a bug or suggest a feature? How can I report a bug or suggest a feature?
------------------------------------------- -------------------------------------------
dupeGuru is hosted on `Github`_ and it's also where issues are tracked. The best way to report a dupeGuru is hosted on `GitHub`_ and it's also where issues are tracked. The best way to report a
bug or suggest a feature is to sign up on Github and `open an issue`_. bug or suggest a feature is to sign up on GitHub and `open an issue`_.
The mark box of a file I want to delete is disabled. What must I do? The mark box of a file I want to delete is disabled. What must I do?
-------------------------------------------------------------------- --------------------------------------------------------------------
@ -176,6 +176,5 @@ Preferences are stored elsewhere:
* Linux: ``~/.config/Hardcoded Software/dupeGuru.conf`` * Linux: ``~/.config/Hardcoded Software/dupeGuru.conf``
* Mac OS X: In the built-in ``defaults`` system, as ``com.hardcoded-software.dupeguru`` * Mac OS X: In the built-in ``defaults`` system, as ``com.hardcoded-software.dupeguru``
.. _Github: https://github.com/arsenetar/dupeguru .. _GitHub: https://github.com/arsenetar/dupeguru
.. _open an issue: https://github.com/arsenetar/dupeguru/wiki/issue-labels .. _open an issue: https://github.com/arsenetar/dupeguru/wiki/issue-labels


@ -14,6 +14,10 @@ Preferences
If you check this box, pictures of different dimensions will be allowed in the same If you check this box, pictures of different dimensions will be allowed in the same
duplicate group. duplicate group.
**Match pictures of different rotations:**
If you check this box, pictures of different rotations will be allowed in the same
duplicate group.
.. _filter-hardness: .. _filter-hardness:
**Filter Hardness:** **Filter Hardness:**


@ -12,4 +12,3 @@
* Եթե համոզված եք, որ կրկնօրինակը արդյունքներում կա, ապա սեղմեք **Խմբագրել-->Նշել բոլորը**, և ապա **Գործողություններ-->Ուղարկել Նշվածը Աղբարկղ**: * Եթե համոզված եք, որ կրկնօրինակը արդյունքներում կա, ապա սեղմեք **Խմբագրել-->Նշել բոլորը**, և ապա **Գործողություններ-->Ուղարկել Նշվածը Աղբարկղ**:
Սա միայն բազային ստուգում է: Կան բազմաթիվ կարգավորումներ, որոնք հնարավորություն են տալիս նշելու տարբեր արդյունքներ և մի քանի եղանակներ արդյունքների փոփոխման: Մանրամասների համար կարդացեք Օգնության ֆայլը: Սա միայն բազային ստուգում է: Կան բազմաթիվ կարգավորումներ, որոնք հնարավորություն են տալիս նշելու տարբեր արդյունքներ և մի քանի եղանակներ արդյունքների փոփոխման: Մանրամասների համար կարդացեք Օգնության ֆայլը:


@ -23,4 +23,3 @@ dupeGuru-ը փորձում է որոշել, թե որ կրկնօրինակներ
մեծագույն ֆայլը և եթե երկու կամ ավելի ֆայլեր ունեն նույն չափը, ապա մեկը ունի ֆայլի անուն, որը մեծագույն ֆայլը և եթե երկու կամ ավելի ֆայլեր ունեն նույն չափը, ապա մեկը ունի ֆայլի անուն, որը
չի ավարտվում թվով, կօգտագործվի: Երբ փաստարկի արդյունքը կապված է, կարգը, որի սխալները չի ավարտվում թվով, կօգտագործվի: Երբ փաստարկի արդյունքը կապված է, կարգը, որի սխալները
նախկինում էին, խումբը պետք է օգտագործվի: նախկինում էին, խումբը պետք է օգտագործվի:


@ -114,4 +114,3 @@
Якщо все це не так, `контакт УГ підтримки <http://www.hardcoded.net/support>`_, ми зрозуміти це. Якщо все це не так, `контакт УГ підтримки <http://www.hardcoded.net/support>`_, ми зрозуміти це.
.. todo:: This FAQ question is outdated, see English version. .. todo:: This FAQ question is outdated, see English version.

hscommon/.gitignore

@ -1,5 +0,0 @@
*.pyc
*.mo
*.so
.DS_Store
/docs_html


@ -9,6 +9,7 @@
"""This module is a collection of function to help in HS apps build process. """This module is a collection of function to help in HS apps build process.
""" """
from argparse import ArgumentParser
import os import os
import sys import sys
import os.path as op import os.path as op
@ -20,23 +21,19 @@ import re
import importlib import importlib
from datetime import datetime from datetime import datetime
import glob import glob
import sysconfig from typing import Any, AnyStr, Callable, Dict, List, Union
import modulefinder
from setuptools import setup, Extension from hscommon.plat import ISWINDOWS
from .plat import ISWINDOWS
from .util import ensure_folder, delete_files_with_pattern
def print_and_do(cmd): def print_and_do(cmd: str) -> int:
"""Prints ``cmd`` and executes it in the shell.""" """Prints ``cmd`` and executes it in the shell."""
print(cmd) print(cmd)
p = Popen(cmd, shell=True) p = Popen(cmd, shell=True)
return p.wait() return p.wait()
def _perform(src, dst, action, actionname): def _perform(src: os.PathLike, dst: os.PathLike, action: Callable, actionname: str) -> None:
if not op.lexists(src): if not op.lexists(src):
print("Copying %s failed: it doesn't exist." % src) print("Copying %s failed: it doesn't exist." % src)
return return
@ -45,34 +42,26 @@ def _perform(src, dst, action, actionname):
shutil.rmtree(dst) shutil.rmtree(dst)
else: else:
os.remove(dst) os.remove(dst)
print("%s %s --> %s" % (actionname, src, dst)) print("{} {} --> {}".format(actionname, src, dst))
action(src, dst) action(src, dst)
def copy_file_or_folder(src, dst): def copy_file_or_folder(src: os.PathLike, dst: os.PathLike) -> None:
if op.isdir(src): if op.isdir(src):
shutil.copytree(src, dst, symlinks=True) shutil.copytree(src, dst, symlinks=True)
else: else:
shutil.copy(src, dst) shutil.copy(src, dst)
def move(src, dst): def move(src: os.PathLike, dst: os.PathLike) -> None:
_perform(src, dst, os.rename, "Moving") _perform(src, dst, os.rename, "Moving")
def copy(src, dst): def copy(src: os.PathLike, dst: os.PathLike) -> None:
_perform(src, dst, copy_file_or_folder, "Copying") _perform(src, dst, copy_file_or_folder, "Copying")
def symlink(src, dst): def _perform_on_all(pattern: AnyStr, dst: os.PathLike, action: Callable) -> None:
_perform(src, dst, os.symlink, "Symlinking")
def hardlink(src, dst):
_perform(src, dst, os.link, "Hardlinking")
def _perform_on_all(pattern, dst, action):
# pattern is a glob pattern, example "folder/foo*". The file is moved directly in dst, no folder # pattern is a glob pattern, example "folder/foo*". The file is moved directly in dst, no folder
# structure from src is kept. # structure from src is kept.
filenames = glob.glob(pattern) filenames = glob.glob(pattern)
@ -81,42 +70,35 @@ def _perform_on_all(pattern, dst, action):
action(fn, destpath) action(fn, destpath)
def move_all(pattern, dst): def move_all(pattern: AnyStr, dst: os.PathLike) -> None:
_perform_on_all(pattern, dst, move) _perform_on_all(pattern, dst, move)
def copy_all(pattern, dst): def copy_all(pattern: AnyStr, dst: os.PathLike) -> None:
_perform_on_all(pattern, dst, copy) _perform_on_all(pattern, dst, copy)
def ensure_empty_folder(path): def filereplace(filename: os.PathLike, outfilename: Union[os.PathLike, None] = None, **kwargs) -> None:
"""Make sure that the path exists and that it's an empty folder."""
if op.exists(path):
shutil.rmtree(path)
os.mkdir(path)
def filereplace(filename, outfilename=None, **kwargs):
"""Reads `filename`, replaces all {variables} in kwargs, and writes the result to `outfilename`.""" """Reads `filename`, replaces all {variables} in kwargs, and writes the result to `outfilename`."""
if outfilename is None: if outfilename is None:
outfilename = filename outfilename = filename
fp = open(filename, "rt", encoding="utf-8") fp = open(filename, encoding="utf-8")
contents = fp.read() contents = fp.read()
fp.close() fp.close()
# We can't use str.format() because in some files, there might be {} characters that mess with it. # We can't use str.format() because in some files, there might be {} characters that mess with it.
for key, item in kwargs.items(): for key, item in kwargs.items():
contents = contents.replace("{{{}}}".format(key), item) contents = contents.replace(f"{{{key}}}", item)
fp = open(outfilename, "wt", encoding="utf-8") fp = open(outfilename, "wt", encoding="utf-8")
fp.write(contents) fp.write(contents)
fp.close() fp.close()
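For illustration, filereplace substitutes only the exact {name} placeholders passed as keyword arguments; any other braces in the file survive untouched, which is the point of avoiding str.format(). The file names below are hypothetical:

# template.txt contains: version: {version} ({date}) {untouched}
filereplace("template.txt", "out.txt", version="4.3.1", date="2022-07-08")
# out.txt now reads: version: 4.3.1 (2022-07-08) {untouched}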
def get_module_version(modulename): def get_module_version(modulename: str) -> str:
mod = importlib.import_module(modulename) mod = importlib.import_module(modulename)
return mod.__version__ return mod.__version__
def setup_package_argparser(parser): def setup_package_argparser(parser: ArgumentParser):
parser.add_argument( parser.add_argument(
"--sign", "--sign",
dest="sign_identity", dest="sign_identity",
@ -143,13 +125,13 @@ def setup_package_argparser(parser):
# `args` come from an ArgumentParser updated with setup_package_argparser() # `args` come from an ArgumentParser updated with setup_package_argparser()
def package_cocoa_app_in_dmg(app_path, destfolder, args): def package_cocoa_app_in_dmg(app_path: os.PathLike, destfolder: os.PathLike, args) -> None:
# Rather than signing our app in XCode during the build phase, we sign it during the package # Rather than signing our app in XCode during the build phase, we sign it during the package
# phase because running the app before packaging can modify it and we want to be sure to have # phase because running the app before packaging can modify it and we want to be sure to have
# a valid signature. # a valid signature.
if args.sign_identity: if args.sign_identity:
sign_identity = "Developer ID Application: {}".format(args.sign_identity) sign_identity = f"Developer ID Application: {args.sign_identity}"
result = print_and_do('codesign --force --deep --sign "{}" "{}"'.format(sign_identity, app_path)) result = print_and_do(f'codesign --force --deep --sign "{sign_identity}" "{app_path}"')
if result != 0: if result != 0:
print("ERROR: Signing failed. Aborting packaging.") print("ERROR: Signing failed. Aborting packaging.")
return return
@ -159,46 +141,32 @@ def package_cocoa_app_in_dmg(app_path, destfolder, args):
build_dmg(app_path, destfolder) build_dmg(app_path, destfolder)
def build_dmg(app_path, destfolder): def build_dmg(app_path: os.PathLike, destfolder: os.PathLike) -> None:
"""Builds a DMG volume with application at ``app_path`` and puts it in ``dest_path``. """Builds a DMG volume with application at ``app_path`` and puts it in ``dest_path``.
The name of the resulting DMG volume is determined by the app's name and version. The name of the resulting DMG volume is determined by the app's name and version.
""" """
print(repr(op.join(app_path, "Contents", "Info.plist"))) print(repr(op.join(app_path, "Contents", "Info.plist")))
plist = plistlib.readPlist(op.join(app_path, "Contents", "Info.plist")) with open(op.join(app_path, "Contents", "Info.plist"), "rb") as fp:
plist = plistlib.load(fp)
workpath = tempfile.mkdtemp() workpath = tempfile.mkdtemp()
dmgpath = op.join(workpath, plist["CFBundleName"]) dmgpath = op.join(workpath, plist["CFBundleName"])
os.mkdir(dmgpath) os.mkdir(dmgpath)
print_and_do('cp -R "%s" "%s"' % (app_path, dmgpath)) print_and_do('cp -R "{}" "{}"'.format(app_path, dmgpath))
print_and_do('ln -s /Applications "%s"' % op.join(dmgpath, "Applications")) print_and_do('ln -s /Applications "%s"' % op.join(dmgpath, "Applications"))
dmgname = "%s_osx_%s.dmg" % ( dmgname = "{}_osx_{}.dmg".format(
plist["CFBundleName"].lower().replace(" ", "_"), plist["CFBundleName"].lower().replace(" ", "_"),
plist["CFBundleVersion"].replace(".", "_"), plist["CFBundleVersion"].replace(".", "_"),
) )
print("Building %s" % dmgname) print("Building %s" % dmgname)
# UDBZ = bzip compression. UDZO (zip compression) was used before, but it compresses much less. # UDBZ = bzip compression. UDZO (zip compression) was used before, but it compresses much less.
print_and_do('hdiutil create "%s" -format UDBZ -nocrossdev -srcdir "%s"' % (op.join(destfolder, dmgname), dmgpath)) print_and_do(
'hdiutil create "{}" -format UDBZ -nocrossdev -srcdir "{}"'.format(op.join(destfolder, dmgname), dmgpath)
)
print("Build Complete") print("Build Complete")
def copy_sysconfig_files_for_embed(destpath): def add_to_pythonpath(path: os.PathLike) -> None:
# This normally shouldn't be needed for Python 3.3+.
makefile = sysconfig.get_makefile_filename()
configh = sysconfig.get_config_h_filename()
shutil.copy(makefile, destpath)
shutil.copy(configh, destpath)
with open(op.join(destpath, "site.py"), "w") as fp:
fp.write(
"""
import os.path as op
from distutils import sysconfig
sysconfig.get_makefile_filename = lambda: op.join(op.dirname(__file__), 'Makefile')
sysconfig.get_config_h_filename = lambda: op.join(op.dirname(__file__), 'pyconfig.h')
"""
)
def add_to_pythonpath(path):
"""Adds ``path`` to both ``PYTHONPATH`` env and ``sys.path``.""" """Adds ``path`` to both ``PYTHONPATH`` env and ``sys.path``."""
abspath = op.abspath(path) abspath = op.abspath(path)
pythonpath = os.environ.get("PYTHONPATH", "") pythonpath = os.environ.get("PYTHONPATH", "")
@ -211,7 +179,12 @@ def add_to_pythonpath(path):
# This is a method to hack around those freakingly tricky data inclusion/exclusion rules # This is a method to hack around those freakingly tricky data inclusion/exclusion rules
# in setuptools. We copy the packages *without data* in a build folder and then build the plugin # in setuptools. We copy the packages *without data* in a build folder and then build the plugin
# from there. # from there.
def copy_packages(packages_names, dest, create_links=False, extra_ignores=None): def copy_packages(
packages_names: List[str],
dest: os.PathLike,
create_links: bool = False,
extra_ignores: Union[List[str], None] = None,
) -> None:
"""Copy python packages ``packages_names`` to ``dest``, spurious data. """Copy python packages ``packages_names`` to ``dest``, spurious data.
Copy will happen without tests, testdata, mercurial data or C extension module source with it. Copy will happen without tests, testdata, mercurial data or C extension module source with it.
@ -238,7 +211,7 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
os.unlink(dest_path) os.unlink(dest_path)
else: else:
shutil.rmtree(dest_path) shutil.rmtree(dest_path)
print("Copying package at {0} to {1}".format(source_path, dest_path)) print(f"Copying package at {source_path} to {dest_path}")
if create_links: if create_links:
os.symlink(op.abspath(source_path), dest_path) os.symlink(op.abspath(source_path), dest_path)
else: else:
@ -248,28 +221,14 @@ def copy_packages(packages_names, dest, create_links=False, extra_ignores=None):
shutil.copy(source_path, dest_path) shutil.copy(source_path, dest_path)
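A possible invocation of `copy_packages` (package names, destination and ignore patterns are illustrative only):

```python
# Hypothetical usage: stage the "core" and "qt" packages into build/,
# as symlinks for faster iterative builds, skipping .po files.
copy_packages(["core", "qt"], "build", create_links=True, extra_ignores=["*.po"])
```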
def copy_qt_plugins(folder_names, dest): # This is only for Windows
from PyQt5.QtCore import QLibraryInfo
qt_plugin_dir = QLibraryInfo.location(QLibraryInfo.PluginsPath)
def ignore(path, names):
if path == qt_plugin_dir:
return [n for n in names if n not in folder_names]
else:
return [n for n in names if not n.endswith(".dll")]
shutil.copytree(qt_plugin_dir, dest, ignore=ignore)
def build_debian_changelog( def build_debian_changelog(
changelogpath, changelogpath: os.PathLike,
destfile, destfile: os.PathLike,
pkgname, pkgname: str,
from_version=None, from_version: Union[str, None] = None,
distribution="precise", distribution: str = "precise",
fix_version=None, fix_version: Union[str, None] = None,
): ) -> None:
"""Builds a debian changelog out of a YAML changelog. """Builds a debian changelog out of a YAML changelog.
Use fix_version to patch the top changelog to that version (if, for example, there was a Use fix_version to patch the top changelog to that version (if, for example, there was a
@ -322,7 +281,7 @@ def build_debian_changelog(
re_changelog_header = re.compile(r"=== ([\d.b]*) \(([\d\-]*)\)") re_changelog_header = re.compile(r"=== ([\d.b]*) \(([\d\-]*)\)")
def read_changelog_file(filename): def read_changelog_file(filename: os.PathLike) -> List[Dict[str, Any]]:
def iter_by_three(it): def iter_by_three(it):
while True: while True:
try: try:
@ -333,7 +292,7 @@ def read_changelog_file(filename):
return return
yield version, date, description yield version, date, description
with open(filename, "rt", encoding="utf-8") as fp: with open(filename, encoding="utf-8") as fp:
contents = fp.read() contents = fp.read()
splitted = re_changelog_header.split(contents)[1:] # the first item is empty splitted = re_changelog_header.split(contents)[1:] # the first item is empty
result = [] result = []
@ -349,184 +308,7 @@ def read_changelog_file(filename):
return result return result
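For clarity, here is what the header regex yields on a tiny made-up changelog; `re.split()` with two capturing groups produces flat (version, date, description) runs, which `iter_by_three()` then regroups:

```python
import re

re_changelog_header = re.compile(r"=== ([\d.b]*) \(([\d\-]*)\)")
contents = "=== 4.3.1 (2022-07-08)\n* A fix\n=== 4.3.0 (2022-06-30)\n* A feature\n"
parts = re_changelog_header.split(contents)[1:]  # the first item is empty
# parts == ['4.3.1', '2022-07-08', '\n* A fix\n',
#           '4.3.0', '2022-06-30', '\n* A feature\n']
```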
class OSXAppStructure: def fix_qt_resource_file(path: os.PathLike) -> None:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Contents")
self.macos = op.join(self.contents, "MacOS")
self.resources = op.join(self.contents, "Resources")
self.frameworks = op.join(self.contents, "Frameworks")
self.infoplist = op.join(self.contents, "Info.plist")
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.macos)
os.mkdir(self.resources)
os.mkdir(self.frameworks)
copy(infoplist, self.infoplist)
open(op.join(self.contents, "PkgInfo"), "wt").write("APPLxxxx")
def copy_executable(self, executable):
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.macos, self.executablename)
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_frameworks(self, *frameworks):
for path in frameworks:
framework_dest = op.join(self.frameworks, op.basename(path))
copy(path, framework_dest)
def create_osx_app_structure(
dest,
executable,
infoplist,
resources=None,
frameworks=None,
symlink_resources=False,
):
# `dest`: A path to the destination .app folder
# `executable`: the path of the executable file that goes in "MacOS"
# `infoplist`: The path to your Info.plist file.
# `resources`: A list of paths of files or folders going in the "Resources" folder.
# `frameworks`: Same as above for "Frameworks".
# `symlink_resources`: If True, will symlink resources into the structure instead of copying them.
app = OSXAppStructure(dest)
app.create(infoplist)
app.copy_executable(executable)
app.copy_resources(*resources, use_symlinks=symlink_resources)
app.copy_frameworks(*frameworks)
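A hedged example call, mirroring the parameter comments above (all paths hypothetical):

```python
create_osx_app_structure(
    "build/dupeGuru.app",
    "build/dupeGuru",                   # executable, ends up in Contents/MacOS
    "build/Info.plist",
    resources=["build/dupeguru.icns"],
    frameworks=["build/Python"],
    symlink_resources=True,             # symlink rather than copy, handy for dev builds
)
```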
class OSXFrameworkStructure:
def __init__(self, dest):
self.dest = dest
self.contents = op.join(dest, "Versions", "A")
self.resources = op.join(self.contents, "Resources")
self.headers = op.join(self.contents, "Headers")
self.infoplist = op.join(self.resources, "Info.plist")
self._update_executable_path()
def _update_executable_path(self):
if not op.exists(self.infoplist):
self.executablename = self.executablepath = None
return
info = plistlib.readPlist(self.infoplist)
self.executablename = info["CFBundleExecutable"]
self.executablepath = op.join(self.contents, self.executablename)
def create(self, infoplist):
ensure_empty_folder(self.dest)
os.makedirs(self.contents)
os.mkdir(self.resources)
os.mkdir(self.headers)
copy(infoplist, self.infoplist)
self._update_executable_path()
def create_symlinks(self):
# Only call this after create() and copy_executable()
os.symlink("A", op.join(self.dest, "Versions", "Current"))
os.symlink(op.relpath(self.executablepath, self.dest), op.join(self.dest, self.executablename))
os.symlink(op.relpath(self.headers, self.dest), op.join(self.dest, "Headers"))
os.symlink(op.relpath(self.resources, self.dest), op.join(self.dest, "Resources"))
def copy_executable(self, executable):
copy(executable, self.executablepath)
def copy_resources(self, *resources, use_symlinks=False):
for path in resources:
resource_dest = op.join(self.resources, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), resource_dest)
def copy_headers(self, *headers, use_symlinks=False):
for path in headers:
header_dest = op.join(self.headers, op.basename(path))
action = symlink if use_symlinks else copy
action(op.abspath(path), header_dest)
def copy_embeddable_python_dylib(dst):
runtime = op.join(
sysconfig.get_config_var("PYTHONFRAMEWORKPREFIX"),
sysconfig.get_config_var("LDLIBRARY"),
)
filedest = op.join(dst, "Python")
shutil.copy(runtime, filedest)
os.chmod(filedest, 0o774) # We need write permission to use install_name_tool
cmd = "install_name_tool -id @rpath/Python %s" % filedest
print_and_do(cmd)
def collect_stdlib_dependencies(script, dest_folder, extra_deps=None):
sysprefix = sys.prefix # could be a virtualenv
basesysprefix = sys.base_prefix # seems to be path to non-virtual sys
real_lib_prefix = sysconfig.get_config_var("LIBDEST") # leaving this in case it is needed
def is_stdlib_path(path):
# A module path is only a stdlib path if it's in either sys.prefix or
# sysconfig.get_config_var('prefix') (the 2 are different if we are in a virtualenv) and if
# there's no "site-package" in the path.
if not path:
return False
if "site-package" in path:
return False
if not (path.startswith(sysprefix) or path.startswith(basesysprefix) or path.startswith(real_lib_prefix)):
return False
return True
ensure_folder(dest_folder)
mf = modulefinder.ModuleFinder()
mf.run_script(script)
modpaths = [mod.__file__ for mod in mf.modules.values()]
modpaths = filter(is_stdlib_path, modpaths)
for p in modpaths:
if p.startswith(real_lib_prefix):
relpath = op.relpath(p, real_lib_prefix)
elif p.startswith(sysprefix):
relpath = op.relpath(p, sysprefix)
assert relpath.startswith("lib/python3.") # we want to get rid of that lib/python3.x part
relpath = relpath[len("lib/python3.X/") :]
elif p.startswith(basesysprefix):
relpath = op.relpath(p, basesysprefix)
assert relpath.startswith("lib/python3.")
relpath = relpath[len("lib/python3.X/") :]
else:
raise AssertionError()
if relpath.startswith("lib-dynload"): # We copy .so files in lib-dynload directly in our dest
relpath = relpath[len("lib-dynload/") :]
if relpath.startswith("encodings") or relpath.startswith("distutils"):
# We force their inclusion later.
continue
dest_path = op.join(dest_folder, relpath)
ensure_folder(op.dirname(dest_path))
copy(p, dest_path)
# stringprep is used by encodings.
# We use real_lib_prefix with distutils because virtualenv messes with it and we need to refer
# to the original distutils folder.
FORCED_INCLUSION = [
"encodings",
"stringprep",
op.join(real_lib_prefix, "distutils"),
]
if extra_deps:
FORCED_INCLUSION += extra_deps
copy_packages(FORCED_INCLUSION, dest_folder)
# There are a couple of rather big exe files in the distutils folder that we absolutely don't
# need. Remove them.
delete_files_with_pattern(op.join(dest_folder, "distutils"), "*.exe")
# And, finally, create an empty "site.py" that Python needs around on startup.
open(op.join(dest_folder, "site.py"), "w").close()
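A sketch of how this collector might be driven (entry script and extra dependency are made up):

```python
# Hypothetical usage: copy every stdlib module reachable from run.py into
# build/py, plus the forced inclusions, ready for embedding.
collect_stdlib_dependencies("run.py", "build/py", extra_deps=["xml"])
```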
def fix_qt_resource_file(path):
# pyrcc5 under Windows, if the locale is non-english, can produce a source file with a date # pyrcc5 under Windows, if the locale is non-english, can produce a source file with a date
# containing accented characters. If it does, the encoding is wrong and it prevents the file # containing accented characters. If it does, the encoding is wrong and it prevents the file
# from being correctly frozen by cx_freeze. To work around that, we open the file, strip all # from being correctly frozen by cx_freeze. To work around that, we open the file, strip all
@ -537,21 +319,3 @@ def fix_qt_resource_file(path):
lines = [line for line in lines if not line.startswith(b"#")] lines = [line for line in lines if not line.startswith(b"#")]
with open(path, "wb") as fp: with open(path, "wb") as fp:
fp.write(b"\n".join(lines)) fp.write(b"\n".join(lines))
def build_cocoa_ext(extname, dest, source_files, extra_frameworks=(), extra_includes=()):
extra_link_args = ["-framework", "CoreFoundation", "-framework", "Foundation"]
for extra in extra_frameworks:
extra_link_args += ["-framework", extra]
ext = Extension(
extname,
source_files,
extra_link_args=extra_link_args,
include_dirs=extra_includes,
)
setup(script_args=["build_ext", "--inplace"], ext_modules=[ext])
# Our problem here is to get the fully qualified filename of the resulting .so but I couldn't
# find a documented way to do so. The only thing I could find is this below :(
fn = ext._file_name
assert op.exists(fn)
move(fn, op.join(dest, fn))
@ -1,30 +0,0 @@
# Copyright 2016 Virgil Dupras
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import argparse
from setuptools import setup, Extension
def get_parser():
parser = argparse.ArgumentParser(description="Build an arbitrary Python extension.")
parser.add_argument("source_files", nargs="+", help="List of source files to compile")
parser.add_argument("name", nargs=1, help="Name of the resulting extension")
return parser
def main():
args = get_parser().parse_args()
print("Building {}...".format(args.name[0]))
ext = Extension(args.name[0], args.source_files)
setup(
script_args=["build_ext", "--inplace"],
ext_modules=[ext],
)
if __name__ == "__main__":
main()
@ -14,7 +14,9 @@ import re
import os import os
import shutil import shutil
from .path import Path, pathify from errno import EISDIR, EACCES
from pathlib import Path
from typing import Callable, List
# This matches [123], but not [12] (3 digits being the minimum). # This matches [123], but not [12] (3 digits being the minimum).
# It also matches [1234] [12345] etc.. # It also matches [1234] [12345] etc..
@ -22,7 +24,7 @@ from .path import Path, pathify
re_conflict = re.compile(r"^\[\d{3}\d*\] ") re_conflict = re.compile(r"^\[\d{3}\d*\] ")
def get_conflicted_name(other_names, name): def get_conflicted_name(other_names: List[str], name: str) -> str:
"""Returns name with a ``[000]`` number in front of it. """Returns name with a ``[000]`` number in front of it.
The number between brackets depends on how many conflicted filenames The number between brackets depends on how many conflicted filenames
@ -39,7 +41,7 @@ def get_conflicted_name(other_names, name):
i += 1 i += 1
def get_unconflicted_name(name): def get_unconflicted_name(name: str) -> str:
"""Returns ``name`` without ``[]`` brackets. """Returns ``name`` without ``[]`` brackets.
Brackets which, of course, might have been added by func:`get_conflicted_name`. Brackets which, of course, might have been added by func:`get_conflicted_name`.
@ -47,38 +49,35 @@ def get_unconflicted_name(name):
return re_conflict.sub("", name, 1) return re_conflict.sub("", name, 1)
def is_conflicted(name): def is_conflicted(name: str) -> bool:
"""Returns whether ``name`` is prepended with a bracketed number.""" """Returns whether ``name`` is prepended with a bracketed number."""
return re_conflict.match(name) is not None return re_conflict.match(name) is not None
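Worked examples for the three helpers above (results derived from the regex and the numbering scheme described in the docstrings):

```python
get_conflicted_name(["foo.txt", "[000] foo.txt"], "foo.txt")  # -> '[001] foo.txt'
get_unconflicted_name("[001] foo.txt")                        # -> 'foo.txt'
is_conflicted("[001] foo.txt")                                # -> True
is_conflicted("[01] foo.txt")                                 # -> False (3 digits minimum)
```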
@pathify def _smart_move_or_copy(operation: Callable, source_path: Path, dest_path: Path) -> None:
def _smart_move_or_copy(operation, source_path: Path, dest_path: Path):
"""Use move() or copy() to move and copy file with the conflict management.""" """Use move() or copy() to move and copy file with the conflict management."""
if dest_path.isdir() and not source_path.isdir(): if dest_path.is_dir() and not source_path.is_dir():
dest_path = dest_path[source_path.name] dest_path = dest_path.joinpath(source_path.name)
if dest_path.exists(): if dest_path.exists():
filename = dest_path.name filename = dest_path.name
dest_dir_path = dest_path.parent() dest_dir_path = dest_path.parent
newname = get_conflicted_name(os.listdir(str(dest_dir_path)), filename) newname = get_conflicted_name(os.listdir(str(dest_dir_path)), filename)
dest_path = dest_dir_path[newname] dest_path = dest_dir_path.joinpath(newname)
operation(str(source_path), str(dest_path)) operation(str(source_path), str(dest_path))
def smart_move(source_path, dest_path): def smart_move(source_path: Path, dest_path: Path) -> None:
"""Same as :func:`smart_copy`, but it moves files instead.""" """Same as :func:`smart_copy`, but it moves files instead."""
_smart_move_or_copy(shutil.move, source_path, dest_path) _smart_move_or_copy(shutil.move, source_path, dest_path)
def smart_copy(source_path, dest_path): def smart_copy(source_path: Path, dest_path: Path) -> None:
"""Copies ``source_path`` to ``dest_path``, recursively and with conflict resolution.""" """Copies ``source_path`` to ``dest_path``, recursively and with conflict resolution."""
try: try:
_smart_move_or_copy(shutil.copy, source_path, dest_path) _smart_move_or_copy(shutil.copy, source_path, dest_path)
except IOError as e: except OSError as e:
if e.errno in { # It's a directory, code is 21 on OS X / Linux (EISDIR) and 13 on Windows (EACCES)
21, if e.errno in (EISDIR, EACCES):
13,
}: # it's a directory, code is 21 on OS X / Linux and 13 on Windows
_smart_move_or_copy(shutil.copytree, source_path, dest_path) _smart_move_or_copy(shutil.copytree, source_path, dest_path)
else: else:
raise raise
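A hedged usage sketch, assuming this module is importable as hscommon.conflict:

```python
from pathlib import Path
from hscommon.conflict import smart_copy, smart_move

# Copying into an existing folder resolves name clashes via
# get_conflicted_name(); a directory source makes shutil.copy raise
# EISDIR/EACCES, which triggers the shutil.copytree fallback above.
smart_copy(Path("src/report.txt"), Path("dest"))
smart_move(Path("src/old_photos"), Path("archive"))
```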
@ -1,23 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-04-19
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
import sys
import traceback
# Taken from http://bzimmer.ziclix.com/2008/12/17/python-thread-dumps/
def stacktraces():
code = []
for thread_id, stack in sys._current_frames().items():
code.append("\n# ThreadID: %s" % thread_id)
for filename, lineno, name, line in traceback.extract_stack(stack):
code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
if line:
code.append(" %s" % (line.strip()))
return "\n".join(code)
@ -6,31 +6,33 @@
# which should be included with this package. The terms are also available at # which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
from enum import Enum
from os import PathLike
import os.path as op import os.path as op
import logging import logging
class SpecialFolder: class SpecialFolder(Enum):
APPDATA = 1 APPDATA = 1
CACHE = 2 CACHE = 2
def open_url(url): def open_url(url: str) -> None:
"""Open ``url`` with the default browser.""" """Open ``url`` with the default browser."""
_open_url(url) _open_url(url)
def open_path(path): def open_path(path: PathLike) -> None:
"""Open ``path`` with its associated application.""" """Open ``path`` with its associated application."""
_open_path(str(path)) _open_path(str(path))
def reveal_path(path): def reveal_path(path: PathLike) -> None:
"""Open the folder containing ``path`` with the default file browser.""" """Open the folder containing ``path`` with the default file browser."""
_reveal_path(str(path)) _reveal_path(str(path))
def special_folder_path(special_folder, appname=None, portable=False): def special_folder_path(special_folder: SpecialFolder, portable: bool = False) -> str:
"""Returns the path of ``special_folder``. """Returns the path of ``special_folder``.
``special_folder`` is a SpecialFolder.* const. The result is the special folder for the current ``special_folder`` is a SpecialFolder.* const. The result is the special folder for the current
@ -38,77 +40,58 @@ def special_folder_path(special_folder, appname=None, portable=False):
You can override the application name with ``appname``. This argument is ignored under Qt. You can override the application name with ``appname``. This argument is ignored under Qt.
""" """
return _special_folder_path(special_folder, appname, portable=portable) return _special_folder_path(special_folder, portable=portable)
try: try:
# Normally, we would simply do "from cocoa import proxy", but due to a bug in pytest (currently from PyQt5.QtCore import QUrl, QStandardPaths
# at v2.4.2), our test suite is broken when we do that. This below is a workaround until that from PyQt5.QtGui import QDesktopServices
# bug is fixed. from qt.util import get_appdata
import cocoa from core.util import executable_folder
from hscommon.plat import ISWINDOWS, ISOSX
import subprocess
if not hasattr(cocoa, "proxy"): def _open_url(url: str) -> None:
raise ImportError() QDesktopServices.openUrl(QUrl(url))
proxy = cocoa.proxy
_open_url = proxy.openURL_
_open_path = proxy.openPath_
_reveal_path = proxy.revealPath_
def _special_folder_path(special_folder, appname=None, portable=False): def _open_path(path: str) -> None:
if special_folder == SpecialFolder.CACHE: url = QUrl.fromLocalFile(str(path))
base = proxy.getCachePath() QDesktopServices.openUrl(url)
def _reveal_path(path: str) -> None:
if ISWINDOWS:
subprocess.run(["explorer", "/select,", op.abspath(path)])
elif ISOSX:
subprocess.run(["open", "-R", op.abspath(path)])
else: else:
base = proxy.getAppdataPath() _open_path(op.dirname(str(path)))
if not appname:
appname = proxy.bundleInfo_("CFBundleName") def _special_folder_path(special_folder: SpecialFolder, portable: bool = False) -> str:
return op.join(base, appname) if special_folder == SpecialFolder.CACHE:
if ISWINDOWS and portable:
folder = op.join(executable_folder(), "cache")
else:
folder = QStandardPaths.standardLocations(QStandardPaths.CacheLocation)[0]
else:
folder = get_appdata(portable)
return folder
except ImportError: except ImportError:
try: # We're either running tests, and these functions don't matter much or we're in a really
from PyQt5.QtCore import QUrl, QStandardPaths # weird situation. Let's just have dummy fallbacks.
from PyQt5.QtGui import QDesktopServices logging.warning("Can't setup desktop functions!")
from qtlib.util import get_appdata
from core.util import executable_folder
from hscommon.plat import ISWINDOWS, ISOSX
import subprocess
def _open_url(url): def _open_url(url: str) -> None:
QDesktopServices.openUrl(QUrl(url)) # Dummy for tests
pass
def _open_path(path): def _open_path(path: str) -> None:
url = QUrl.fromLocalFile(str(path)) # Dummy for tests
QDesktopServices.openUrl(url) pass
def _reveal_path(path): def _reveal_path(path: str) -> None:
if ISWINDOWS: # Dummy for tests
subprocess.run(["explorer", "/select,", op.abspath(path)]) pass
elif ISOSX:
subprocess.run(["open", "-R", op.abspath(path)])
else:
_open_path(op.dirname(str(path)))
def _special_folder_path(special_folder, appname=None, portable=False): def _special_folder_path(special_folder: SpecialFolder, portable: bool = False) -> str:
if special_folder == SpecialFolder.CACHE: return "/tmp"
if ISWINDOWS and portable:
folder = op.join(executable_folder(), "cache")
else:
folder = QStandardPaths.standardLocations(QStandardPaths.CacheLocation)[0]
else:
folder = get_appdata(portable)
return folder
except ImportError:
# We're either running tests, and these functions don't matter much or we're in a really
# weird situation. Let's just have dummy fallbacks.
logging.warning("Can't setup desktop functions!")
def _open_path(path):
# Dummy for tests
pass
def _reveal_path(path):
# Dummy for tests
pass
def _special_folder_path(special_folder, appname=None, portable=False):
return "/tmp"
@ -1,216 +0,0 @@
# Created By: Virgil Dupras
# Created On: 2011-08-05
# Copyright 2015 Hardcoded Software (http://www.hardcoded.net)
#
# This software is licensed under the "GPLv3" License as described in the "LICENSE" file,
# which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html
from sys import maxsize as INF
from math import sqrt
VERY_SMALL = 0.0000001
class Point:
def __init__(self, x, y):
self.x = x
self.y = y
def __repr__(self):
return "<Point {:2.2f}, {:2.2f}>".format(*self)
def __iter__(self):
yield self.x
yield self.y
def distance_to(self, other):
return Line(self, other).length()
class Line:
def __init__(self, p1, p2):
self.p1 = p1
self.p2 = p2
def __repr__(self):
return "<Line {}, {}>".format(*self)
def __iter__(self):
yield self.p1
yield self.p2
def dx(self):
return self.p2.x - self.p1.x
def dy(self):
return self.p2.y - self.p1.y
def length(self):
return sqrt(self.dx() ** 2 + self.dy() ** 2)
def slope(self):
if self.dx() == 0:
return INF if self.dy() > 0 else -INF
else:
return self.dy() / self.dx()
def intersection_point(self, other):
# with help from http://paulbourke.net/geometry/lineline2d/
if abs(self.slope() - other.slope()) < VERY_SMALL:
# parallel. Even if coincident, we return nothing
return None
A, B = self
C, D = other
denom = (D.y - C.y) * (B.x - A.x) - (D.x - C.x) * (B.y - A.y)
if denom == 0:
return None
numera = (D.x - C.x) * (A.y - C.y) - (D.y - C.y) * (A.x - C.x)
numerb = (B.x - A.x) * (A.y - C.y) - (B.y - A.y) * (A.x - C.x)
mua = numera / denom
mub = numerb / denom
if (0 <= mua <= 1) and (0 <= mub <= 1):
x = A.x + mua * (B.x - A.x)
y = A.y + mua * (B.y - A.y)
return Point(x, y)
else:
return None
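A quick check of the formula, using the classes above: the diagonals of the unit square meet at (0.5, 0.5), and parallel segments yield None:

```python
l1 = Line(Point(0, 0), Point(1, 1))
l2 = Line(Point(0, 1), Point(1, 0))
l1.intersection_point(l2)                              # <Point 0.50, 0.50>
l1.intersection_point(Line(Point(0, 1), Point(1, 2)))  # None: equal slopes
```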
class Rect:
def __init__(self, x, y, w, h):
self.x = x
self.y = y
self.w = w
self.h = h
def __iter__(self):
yield self.x
yield self.y
yield self.w
yield self.h
def __repr__(self):
return "<Rect {:2.2f}, {:2.2f}, {:2.2f}, {:2.2f}>".format(*self)
@classmethod
def from_center(cls, center, width, height):
x = center.x - width / 2
y = center.y - height / 2
return cls(x, y, width, height)
@classmethod
def from_corners(cls, pt1, pt2):
x1, y1 = pt1
x2, y2 = pt2
return cls(min(x1, x2), min(y1, y2), abs(x1 - x2), abs(y1 - y2))
def center(self):
return Point(self.x + self.w / 2, self.y + self.h / 2)
def contains_point(self, point):
x, y = point
(x1, y1), (x2, y2) = self.corners()
return (x1 <= x <= x2) and (y1 <= y <= y2)
def contains_rect(self, rect):
pt1, pt2 = rect.corners()
return self.contains_point(pt1) and self.contains_point(pt2)
def corners(self):
return Point(self.x, self.y), Point(self.x + self.w, self.y + self.h)
def intersects(self, other):
r1pt1, r1pt2 = self.corners()
r2pt1, r2pt2 = other.corners()
if r1pt1.x < r2pt1.x:
xinter = r1pt2.x >= r2pt1.x
else:
xinter = r2pt2.x >= r1pt1.x
if not xinter:
return False
if r1pt1.y < r2pt1.y:
yinter = r1pt2.y >= r2pt1.y
else:
yinter = r2pt2.y >= r1pt1.y
return yinter
def lines(self):
pt1, pt4 = self.corners()
pt2 = Point(pt4.x, pt1.y)
pt3 = Point(pt1.x, pt4.y)
l1 = Line(pt1, pt2)
l2 = Line(pt2, pt4)
l3 = Line(pt3, pt4)
l4 = Line(pt1, pt3)
return l1, l2, l3, l4
def scaled_rect(self, dx, dy):
"""Returns a rect that has the same borders at self, but grown/shrunk by dx/dy on each side."""
x, y, w, h = self
x -= dx
y -= dy
w += dx * 2
h += dy * 2
return Rect(x, y, w, h)
def united(self, other):
"""Returns the bounding rectangle of this rectangle and `other`."""
# ul=upper left lr=lower right
ulcorner1, lrcorner1 = self.corners()
ulcorner2, lrcorner2 = other.corners()
corner1 = Point(min(ulcorner1.x, ulcorner2.x), min(ulcorner1.y, ulcorner2.y))
corner2 = Point(max(lrcorner1.x, lrcorner2.x), max(lrcorner1.y, lrcorner2.y))
return Rect.from_corners(corner1, corner2)
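A small sanity check of the rectangle math above, with corners worked out by hand:

```python
r1 = Rect(0, 0, 2, 2)
r2 = Rect(3, 1, 2, 2)
r1.intersects(r2)  # False: r1 ends at x=2, r2 starts at x=3
r1.united(r2)      # <Rect 0.00, 0.00, 5.00, 3.00>, the common bounding box
```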
# --- Properties
@property
def top(self):
return self.y
@top.setter
def top(self, value):
self.y = value
@property
def bottom(self):
return self.y + self.h
@bottom.setter
def bottom(self, value):
self.y = value - self.h
@property
def left(self):
return self.x
@left.setter
def left(self, value):
self.x = value
@property
def right(self):
return self.x + self.w
@right.setter
def right(self, value):
self.x = value - self.w
@property
def width(self):
return self.w
@width.setter
def width(self, value):
self.w = value
@property
def height(self):
return self.h
@height.setter
def height(self, value):
self.h = value
@ -36,11 +36,11 @@ class GUIObject:
``multibind`` flag to ``True`` and the safeguard will be disabled. ``multibind`` flag to ``True`` and the safeguard will be disabled.
""" """
def __init__(self, multibind=False): def __init__(self, multibind: bool = False) -> None:
self._view = None self._view = None
self._multibind = multibind self._multibind = multibind
def _view_updated(self): def _view_updated(self) -> None:
"""(Virtual) Called after :attr:`view` has been set. """(Virtual) Called after :attr:`view` has been set.
Doing nothing by default, this method is called after :attr:`view` has been set (it isn't Doing nothing by default, this method is called after :attr:`view` has been set (it isn't
@ -48,7 +48,7 @@ class GUIObject:
(which is often the whole of the initialization code). (which is often the whole of the initialization code).
""" """
def has_view(self): def has_view(self) -> bool:
return (self._view is not None) and (not isinstance(self._view, NoopGUI)) return (self._view is not None) and (not isinstance(self._view, NoopGUI))
@property @property
@ -67,7 +67,7 @@ class GUIObject:
return self._view return self._view
@view.setter @view.setter
def view(self, value): def view(self, value) -> None:
if self._view is None and value is None: if self._view is None and value is None:
# Initial view assignment # Initial view assignment
return return
@ -7,8 +7,10 @@
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
import copy import copy
from typing import Any, List, Tuple, Union
from .base import GUIObject from hscommon.gui.base import GUIObject
from hscommon.gui.table import GUITable
class Column: class Column:
@ -17,7 +19,7 @@ class Column:
These attributes are then used to correctly configure the column on the "view" side. These attributes are then used to correctly configure the column on the "view" side.
""" """
def __init__(self, name, display="", visible=True, optional=False): def __init__(self, name: str, display: str = "", visible: bool = True, optional: bool = False) -> None:
#: "programmatical" (not for display) name. Used as a reference in a couple of place, such #: "programmatical" (not for display) name. Used as a reference in a couple of place, such
#: as :meth:`Columns.column_by_name`. #: as :meth:`Columns.column_by_name`.
self.name = name self.name = name
@ -52,14 +54,14 @@ class ColumnsView:
callbacks. callbacks.
""" """
def restore_columns(self): def restore_columns(self) -> None:
"""Update all columns according to the model. """Update all columns according to the model.
When this is called, our view has to update the columns' title, order and visibility of all When this is called, our view has to update the columns' title, order and visibility of all
columns. columns.
""" """
def set_column_visible(self, colname, visible): def set_column_visible(self, colname: str, visible: bool) -> None:
"""Update visibility of column ``colname``. """Update visibility of column ``colname``.
Called when the user toggles the visibility of a column, we must update the column Called when the user toggles the visibility of a column, we must update the column
@ -73,13 +75,13 @@ class PrefAccessInterface:
*Not actually used in the code. For documentation purposes only.* *Not actually used in the code. For documentation purposes only.*
""" """
def get_default(self, key, fallback_value): def get_default(self, key: str, fallback_value: Union[Any, None]) -> Any:
"""Retrieve the value for ``key`` in the currently running app's preference store. """Retrieve the value for ``key`` in the currently running app's preference store.
If the key doesn't exist, return ``fallback_value``. If the key doesn't exist, return ``fallback_value``.
""" """
def set_default(self, key, value): def set_default(self, key: str, value: Any) -> None:
"""Set the value ``value`` for ``key`` in the currently running app's preference store.""" """Set the value ``value`` for ``key`` in the currently running app's preference store."""
@ -104,65 +106,65 @@ class Columns(GUIObject):
have that same prefix. have that same prefix.
""" """
def __init__(self, table, prefaccess=None, savename=None): def __init__(self, table: GUITable, prefaccess=None, savename: Union[str, None] = None):
GUIObject.__init__(self) GUIObject.__init__(self)
self.table = table self.table = table
self.prefaccess = prefaccess self.prefaccess = prefaccess
self.savename = savename self.savename = savename
# We use copy here for test isolation. If we don't, changing a column affects all tests. # We use copy here for test isolation. If we don't, changing a column affects all tests.
self.column_list = list(map(copy.copy, table.COLUMNS)) self.column_list: List[Column] = list(map(copy.copy, table.COLUMNS))
for i, column in enumerate(self.column_list): for i, column in enumerate(self.column_list):
column.logical_index = i column.logical_index = i
column.ordered_index = i column.ordered_index = i
self.coldata = {col.name: col for col in self.column_list} self.coldata = {col.name: col for col in self.column_list}
# --- Private # --- Private
def _get_colname_attr(self, colname, attrname, default): def _get_colname_attr(self, colname: str, attrname: str, default: Any) -> Any:
try: try:
return getattr(self.coldata[colname], attrname) return getattr(self.coldata[colname], attrname)
except KeyError: except KeyError:
return default return default
def _set_colname_attr(self, colname, attrname, value): def _set_colname_attr(self, colname: str, attrname: str, value: Any) -> None:
try: try:
col = self.coldata[colname] col = self.coldata[colname]
setattr(col, attrname, value) setattr(col, attrname, value)
except KeyError: except KeyError:
pass pass
def _optional_columns(self): def _optional_columns(self) -> List[Column]:
return [c for c in self.column_list if c.optional] return [c for c in self.column_list if c.optional]
# --- Override # --- Override
def _view_updated(self): def _view_updated(self) -> None:
self.restore_columns() self.restore_columns()
# --- Public # --- Public
def column_by_index(self, index): def column_by_index(self, index: int):
"""Return the :class:`Column` having the :attr:`~Column.logical_index` ``index``.""" """Return the :class:`Column` having the :attr:`~Column.logical_index` ``index``."""
return self.column_list[index] return self.column_list[index]
def column_by_name(self, name): def column_by_name(self, name: str):
"""Return the :class:`Column` having the :attr:`~Column.name` ``name``.""" """Return the :class:`Column` having the :attr:`~Column.name` ``name``."""
return self.coldata[name] return self.coldata[name]
def columns_count(self): def columns_count(self) -> int:
"""Returns the number of columns in our set.""" """Returns the number of columns in our set."""
return len(self.column_list) return len(self.column_list)
def column_display(self, colname): def column_display(self, colname: str) -> str:
"""Returns display name for column named ``colname``, or ``''`` if there's none.""" """Returns display name for column named ``colname``, or ``''`` if there's none."""
return self._get_colname_attr(colname, "display", "") return self._get_colname_attr(colname, "display", "")
def column_is_visible(self, colname): def column_is_visible(self, colname: str) -> bool:
"""Returns visibility for column named ``colname``, or ``True`` if there's none.""" """Returns visibility for column named ``colname``, or ``True`` if there's none."""
return self._get_colname_attr(colname, "visible", True) return self._get_colname_attr(colname, "visible", True)
def column_width(self, colname): def column_width(self, colname: str) -> int:
"""Returns width for column named ``colname``, or ``0`` if there's none.""" """Returns width for column named ``colname``, or ``0`` if there's none."""
return self._get_colname_attr(colname, "width", 0) return self._get_colname_attr(colname, "width", 0)
def columns_to_right(self, colname): def columns_to_right(self, colname: str) -> List[str]:
"""Returns the list of all columns to the right of ``colname``. """Returns the list of all columns to the right of ``colname``.
"right" meaning "having a higher :attr:`Column.ordered_index`" in our left-to-right "right" meaning "having a higher :attr:`Column.ordered_index`" in our left-to-right
@ -172,7 +174,7 @@ class Columns(GUIObject):
index = column.ordered_index index = column.ordered_index
return [col.name for col in self.column_list if (col.visible and col.ordered_index > index)] return [col.name for col in self.column_list if (col.visible and col.ordered_index > index)]
def menu_items(self): def menu_items(self) -> List[Tuple[str, bool]]:
"""Returns a list of items convenient for quick visibility menu generation. """Returns a list of items convenient for quick visibility menu generation.
Returns a list of ``(display_name, is_marked)`` items for each optional column in the Returns a list of ``(display_name, is_marked)`` items for each optional column in the
@ -184,7 +186,7 @@ class Columns(GUIObject):
""" """
return [(c.display, c.visible) for c in self._optional_columns()] return [(c.display, c.visible) for c in self._optional_columns()]
def move_column(self, colname, index): def move_column(self, colname: str, index: int) -> None:
"""Moves column ``colname`` to ``index``. """Moves column ``colname`` to ``index``.
The column will be placed just in front of the column currently having that index, or to the The column will be placed just in front of the column currently having that index, or to the
@ -195,7 +197,7 @@ class Columns(GUIObject):
colnames.insert(index, colname) colnames.insert(index, colname)
self.set_column_order(colnames) self.set_column_order(colnames)
def reset_to_defaults(self): def reset_to_defaults(self) -> None:
"""Reset all columns' width and visibility to their default values.""" """Reset all columns' width and visibility to their default values."""
self.set_column_order([col.name for col in self.column_list]) self.set_column_order([col.name for col in self.column_list])
for col in self._optional_columns(): for col in self._optional_columns():
@ -203,11 +205,11 @@ class Columns(GUIObject):
col.width = col.default_width col.width = col.default_width
self.view.restore_columns() self.view.restore_columns()
def resize_column(self, colname, newwidth): def resize_column(self, colname: str, newwidth: int) -> None:
"""Set column ``colname``'s width to ``newwidth``.""" """Set column ``colname``'s width to ``newwidth``."""
self._set_colname_attr(colname, "width", newwidth) self._set_colname_attr(colname, "width", newwidth)
def restore_columns(self): def restore_columns(self) -> None:
"""Restore's column persistent attributes from the last :meth:`save_columns`.""" """Restore's column persistent attributes from the last :meth:`save_columns`."""
if not (self.prefaccess and self.savename and self.coldata): if not (self.prefaccess and self.savename and self.coldata):
if (not self.savename) and (self.coldata): if (not self.savename) and (self.coldata):
@ -216,7 +218,7 @@ class Columns(GUIObject):
self.view.restore_columns() self.view.restore_columns()
return return
for col in self.column_list: for col in self.column_list:
pref_name = "{}.Columns.{}".format(self.savename, col.name) pref_name = f"{self.savename}.Columns.{col.name}"
coldata = self.prefaccess.get_default(pref_name, fallback_value={}) coldata = self.prefaccess.get_default(pref_name, fallback_value={})
if "index" in coldata: if "index" in coldata:
col.ordered_index = coldata["index"] col.ordered_index = coldata["index"]
@ -226,18 +228,19 @@ class Columns(GUIObject):
col.visible = coldata["visible"] col.visible = coldata["visible"]
self.view.restore_columns() self.view.restore_columns()
def save_columns(self): def save_columns(self) -> None:
"""Save column attributes in persistent storage for restoration in :meth:`restore_columns`.""" """Save column attributes in persistent storage for restoration in :meth:`restore_columns`."""
if not (self.prefaccess and self.savename and self.coldata): if not (self.prefaccess and self.savename and self.coldata):
return return
for col in self.column_list: for col in self.column_list:
pref_name = "{}.Columns.{}".format(self.savename, col.name) pref_name = f"{self.savename}.Columns.{col.name}"
coldata = {"index": col.ordered_index, "width": col.width} coldata = {"index": col.ordered_index, "width": col.width}
if col.optional: if col.optional:
coldata["visible"] = col.visible coldata["visible"] = col.visible
self.prefaccess.set_default(pref_name, coldata) self.prefaccess.set_default(pref_name, coldata)
def set_column_order(self, colnames): # TODO annotate colnames
def set_column_order(self, colnames) -> None:
"""Change the columns order so it matches the order in ``colnames``. """Change the columns order so it matches the order in ``colnames``.
:param colnames: A list of column names in the desired order. :param colnames: A list of column names in the desired order.
@ -247,17 +250,17 @@ class Columns(GUIObject):
col = self.coldata[colname] col = self.coldata[colname]
col.ordered_index = i col.ordered_index = i
def set_column_visible(self, colname, visible): def set_column_visible(self, colname: str, visible: bool) -> None:
"""Set the visibility of column ``colname``.""" """Set the visibility of column ``colname``."""
self.table.save_edits() # the table on the GUI side will stop editing when the columns change self.table.save_edits() # the table on the GUI side will stop editing when the columns change
self._set_colname_attr(colname, "visible", visible) self._set_colname_attr(colname, "visible", visible)
self.view.set_column_visible(colname, visible) self.view.set_column_visible(colname, visible)
def set_default_width(self, colname, width): def set_default_width(self, colname: str, width: int) -> None:
"""Set the default width or column ``colname``.""" """Set the default width or column ``colname``."""
self._set_colname_attr(colname, "default_width", width) self._set_colname_attr(colname, "default_width", width)
def toggle_menu_item(self, index): def toggle_menu_item(self, index: int) -> bool:
"""Toggles the visibility of an optional column. """Toggles the visibility of an optional column.
You know, that optional column menu you've generated in :meth:`menu_items`? Well, ``index`` You know, that optional column menu you've generated in :meth:`menu_items`? Well, ``index``
@ -271,11 +274,11 @@ class Columns(GUIObject):
# --- Properties # --- Properties
@property @property
def ordered_columns(self): def ordered_columns(self) -> List[Column]:
"""List of :class:`Column` in visible order.""" """List of :class:`Column` in visible order."""
return [col for col in sorted(self.column_list, key=lambda col: col.ordered_index)] return [col for col in sorted(self.column_list, key=lambda col: col.ordered_index)]
@property @property
def colnames(self): def colnames(self) -> List[str]:
"""List of column names in visible order.""" """List of column names in visible order."""
return [col.name for col in self.ordered_columns] return [col.name for col in self.ordered_columns]
@ -4,9 +4,10 @@
# which should be included with this package. The terms are also available at # which should be included with this package. The terms are also available at
# http://www.gnu.org/licenses/gpl-3.0.html # http://www.gnu.org/licenses/gpl-3.0.html
from ..jobprogress.performer import ThreadedJobPerformer from typing import Callable, Tuple, Union
from .base import GUIObject from hscommon.jobprogress.performer import ThreadedJobPerformer
from .text_field import TextField from hscommon.gui.base import GUIObject
from hscommon.gui.text_field import TextField
class ProgressWindowView: class ProgressWindowView:
@ -20,13 +21,13 @@ class ProgressWindowView:
It's also expected to call :meth:`ProgressWindow.cancel` when the cancel button is clicked. It's also expected to call :meth:`ProgressWindow.cancel` when the cancel button is clicked.
""" """
def show(self): def show(self) -> None:
"""Show the dialog.""" """Show the dialog."""
def close(self): def close(self) -> None:
"""Close the dialog.""" """Close the dialog."""
def set_progress(self, progress): def set_progress(self, progress: int) -> None:
"""Set the progress of the progress bar to ``progress``. """Set the progress of the progress bar to ``progress``.
Not all jobs are equally responsive on their job progress report and it is recommended that Not all jobs are equally responsive on their job progress report and it is recommended that
@ -60,7 +61,11 @@ class ProgressWindow(GUIObject, ThreadedJobPerformer):
called as if the job terminated normally. called as if the job terminated normally.
""" """
def __init__(self, finish_func, error_func=None): def __init__(
self,
finish_func: Callable[[Union[str, None]], None],
error_func: Callable[[Union[str, None], Exception], bool] = None,
) -> None:
# finish_func(jobid) is the function that is called when a job is completed. # finish_func(jobid) is the function that is called when a job is completed.
GUIObject.__init__(self) GUIObject.__init__(self)
ThreadedJobPerformer.__init__(self) ThreadedJobPerformer.__init__(self)
@ -71,9 +76,9 @@ class ProgressWindow(GUIObject, ThreadedJobPerformer):
#: :class:`.TextField`. It contains the job textual update that the function might yield #: :class:`.TextField`. It contains the job textual update that the function might yield
#: during its course. #: during its course.
self.progressdesc_textfield = TextField() self.progressdesc_textfield = TextField()
self.jobid = None self.jobid: Union[str, None] = None
def cancel(self): def cancel(self) -> None:
"""Call for a user-initiated job cancellation.""" """Call for a user-initiated job cancellation."""
# The UI is sometimes a bit buggy and calls cancel() on self.view.close(). We just want to # The UI is sometimes a bit buggy and calls cancel() on self.view.close(). We just want to
# make sure that this doesn't lead us to think that the user actually cancelled the task, so # make sure that this doesn't lead us to think that the user actually cancelled the task, so
@ -81,7 +86,7 @@ class ProgressWindow(GUIObject, ThreadedJobPerformer):
if self._job_running: if self._job_running:
self.job_cancelled = True self.job_cancelled = True
def pulse(self): def pulse(self) -> None:
"""Update progress reports in the GUI. """Update progress reports in the GUI.
Call this regularly from the GUI main run loop. The values might change before Call this regularly from the GUI main run loop. The values might change before
@ -111,7 +116,7 @@ class ProgressWindow(GUIObject, ThreadedJobPerformer):
self.progressdesc_textfield.text = last_desc self.progressdesc_textfield.text = last_desc
self.view.set_progress(last_progress) self.view.set_progress(last_progress)
def run(self, jobid, title, target, args=()): def run(self, jobid: str, title: str, target: Callable, args: Tuple = ()):
"""Starts a threaded job. """Starts a threaded job.
The ``target`` function will be sent, as its first argument, a :class:`.Job` instance which The ``target`` function will be sent, as its first argument, a :class:`.Job` instance which
@ -8,7 +8,7 @@
from collections.abc import Sequence, MutableSequence from collections.abc import Sequence, MutableSequence
from .base import GUIObject from hscommon.gui.base import GUIObject
class Selectable(Sequence): class Selectable(Sequence):
@ -8,9 +8,10 @@
from collections.abc import MutableSequence from collections.abc import MutableSequence
from collections import namedtuple from collections import namedtuple
from typing import Any, List, Tuple, Union
from .base import GUIObject from hscommon.gui.base import GUIObject
from .selectable_list import Selectable from hscommon.gui.selectable_list import Selectable
# We used to directly subclass list, but it caused problems at some point with deepcopy # We used to directly subclass list, but it caused problems at some point with deepcopy
@ -27,12 +28,16 @@ class Table(MutableSequence, Selectable):
Subclasses :class:`.Selectable`. Subclasses :class:`.Selectable`.
""" """
def __init__(self): # Should be List[Column], but that would be a circular import...
Selectable.__init__(self) COLUMNS: List = []
self._rows = []
self._header = None
self._footer = None
def __init__(self) -> None:
Selectable.__init__(self)
self._rows: List["Row"] = []
self._header: Union["Row", None] = None
self._footer: Union["Row", None] = None
# TODO type hint for key
def __delitem__(self, key): def __delitem__(self, key):
self._rows.__delitem__(key) self._rows.__delitem__(key)
if self._header is not None and ((not self) or (self[0] is not self._header)): if self._header is not None and ((not self) or (self[0] is not self._header)):
@ -41,16 +46,18 @@ class Table(MutableSequence, Selectable):
self._footer = None self._footer = None
self._check_selection_range() self._check_selection_range()
def __getitem__(self, key): # TODO type hint for key
def __getitem__(self, key) -> Any:
return self._rows.__getitem__(key) return self._rows.__getitem__(key)
def __len__(self): def __len__(self) -> int:
return len(self._rows) return len(self._rows)
def __setitem__(self, key, value): # TODO type hint for key
def __setitem__(self, key, value: Any) -> None:
self._rows.__setitem__(key, value) self._rows.__setitem__(key, value)
def append(self, item): def append(self, item: "Row") -> None:
"""Appends ``item`` at the end of the table. """Appends ``item`` at the end of the table.
If there's a footer, the item is inserted before it. If there's a footer, the item is inserted before it.
@ -60,7 +67,7 @@ class Table(MutableSequence, Selectable):
else: else:
self._rows.append(item) self._rows.append(item)
def insert(self, index, item): def insert(self, index: int, item: "Row") -> None:
"""Inserts ``item`` at ``index`` in the table. """Inserts ``item`` at ``index`` in the table.
If there's a header, will make sure we don't insert before it, and if there's a footer, will If there's a header, will make sure we don't insert before it, and if there's a footer, will
@ -72,7 +79,7 @@ class Table(MutableSequence, Selectable):
index = len(self) - 1 index = len(self) - 1
self._rows.insert(index, item) self._rows.insert(index, item)
def remove(self, row): def remove(self, row: "Row") -> None:
"""Removes ``row`` from table. """Removes ``row`` from table.
If ``row`` is a header or footer, that header or footer will be set to ``None``. If ``row`` is a header or footer, that header or footer will be set to ``None``.
@ -84,7 +91,7 @@ class Table(MutableSequence, Selectable):
self._rows.remove(row) self._rows.remove(row)
self._check_selection_range() self._check_selection_range()
def sort_by(self, column_name, desc=False): def sort_by(self, column_name: str, desc: bool = False) -> None:
"""Sort table by ``column_name``. """Sort table by ``column_name``.
Sort key for each row is computed from :meth:`Row.sort_key_for_column`. Sort key for each row is computed from :meth:`Row.sort_key_for_column`.
@ -105,7 +112,7 @@ class Table(MutableSequence, Selectable):
# --- Properties # --- Properties
@property @property
def footer(self): def footer(self) -> Union["Row", None]:
"""If set, a row that always stay at the bottom of the table. """If set, a row that always stay at the bottom of the table.
:class:`Row`. *get/set*. :class:`Row`. *get/set*.
@ -128,7 +135,7 @@ class Table(MutableSequence, Selectable):
return self._footer return self._footer
@footer.setter @footer.setter
def footer(self, value): def footer(self, value: Union["Row", None]) -> None:
if self._footer is not None: if self._footer is not None:
self._rows.pop() self._rows.pop()
if value is not None: if value is not None:
@ -136,7 +143,7 @@ class Table(MutableSequence, Selectable):
self._footer = value self._footer = value
@property @property
def header(self): def header(self) -> Union["Row", None]:
"""If set, a row that always stay at the bottom of the table. """If set, a row that always stay at the bottom of the table.
See :attr:`footer` for details. See :attr:`footer` for details.
@ -144,7 +151,7 @@ class Table(MutableSequence, Selectable):
return self._header return self._header
@header.setter @header.setter
def header(self, value): def header(self, value: Union["Row", None]) -> None:
if self._header is not None: if self._header is not None:
self._rows.pop(0) self._rows.pop(0)
if value is not None: if value is not None:
@ -152,7 +159,7 @@ class Table(MutableSequence, Selectable):
self._header = value self._header = value
@property @property
def row_count(self): def row_count(self) -> int:
"""Number or rows in the table (without counting header and footer). """Number or rows in the table (without counting header and footer).
*int*. *read-only*. *int*. *read-only*.
@ -165,7 +172,7 @@ class Table(MutableSequence, Selectable):
return result return result
@property @property
def rows(self): def rows(self) -> List["Row"]:
"""List of rows in the table, excluding header and footer. """List of rows in the table, excluding header and footer.
List of :class:`Row`. *read-only*. List of :class:`Row`. *read-only*.
@@ -179,7 +186,7 @@ class Table(MutableSequence, Selectable):
         return self[start:end]

     @property
-    def selected_row(self):
+    def selected_row(self) -> "Row":
         """Selected row according to :attr:`Selectable.selected_index`.

         :class:`Row`. *get/set*.
@@ -190,14 +197,14 @@ class Table(MutableSequence, Selectable):
         return self[self.selected_index] if self.selected_index is not None else None

     @selected_row.setter
-    def selected_row(self, value):
+    def selected_row(self, value: int) -> None:
         try:
             self.selected_index = self.index(value)
         except ValueError:
             pass

     @property
-    def selected_rows(self):
+    def selected_rows(self) -> List["Row"]:
         """List of selected rows based on :attr:`.selected_indexes`.

         List of :class:`Row`. *read-only*.
@@ -219,20 +226,20 @@ class GUITableView:
     Whenever the user changes the selection, we expect the view to call :meth:`Table.select`.
     """

-    def refresh(self):
+    def refresh(self) -> None:
         """Refreshes the contents of the table widget.

         Ensures that the contents of the table widget is synced with the model. This includes
         selection.
         """

-    def start_editing(self):
+    def start_editing(self) -> None:
         """Start editing the currently selected row.

         Begin whatever inline editing support that the view supports.
         """

-    def stop_editing(self):
+    def stop_editing(self) -> None:
         """Stop editing if there's an inline editing in effect.

         There's no "aborting" implied in this call, so it's appropriate to send whatever the user
@@ -260,33 +267,33 @@ class GUITable(Table, GUIObject):
     :class:`GUITableView`.
     """

-    def __init__(self):
+    def __init__(self) -> None:
         GUIObject.__init__(self)
         Table.__init__(self)
         #: The row being currently edited by the user. ``None`` if no edit is taking place.
-        self.edited = None
-        self._sort_descriptor = None
+        self.edited: Union["Row", None] = None
+        self._sort_descriptor: Union[SortDescriptor, None] = None

     # --- Virtual
-    def _do_add(self):
+    def _do_add(self) -> Tuple["Row", int]:
         """(Virtual) Creates a new row, adds it in the table.

         Returns ``(row, insert_index)``.
         """
         raise NotImplementedError()

-    def _do_delete(self):
+    def _do_delete(self) -> None:
         """(Virtual) Delete the selected rows."""
         pass

-    def _fill(self):
+    def _fill(self) -> None:
         """(Virtual/Required) Fills the table with all the rows that this table is supposed to have.

         Called by :meth:`refresh`. Does nothing by default.
         """
         pass

-    def _is_edited_new(self):
+    def _is_edited_new(self) -> bool:
         """(Virtual) Returns whether the currently edited row should be considered "new".

         This is used in :meth:`cancel_edits` to know whether the cancellation of the edit means a
@@ -315,7 +322,7 @@ class GUITable(Table, GUIObject):
             self.select([len(self) - 1])

     # --- Public
-    def add(self):
+    def add(self) -> None:
         """Add a new row in edit mode.

         Requires :meth:`do_add` to be implemented. The newly added row will be selected and in edit
@@ -334,7 +341,7 @@ class GUITable(Table, GUIObject):
         self.edited = row
         self.view.start_editing()

-    def can_edit_cell(self, column_name, row_index):
+    def can_edit_cell(self, column_name: str, row_index: int) -> bool:
         """Returns whether the cell at ``row_index`` and ``column_name`` can be edited.

         A row is, by default, editable as soon as it has an attr with the same name as `column`.
@@ -346,7 +353,7 @@ class GUITable(Table, GUIObject):
         row = self[row_index]
         return row.can_edit_cell(column_name)

-    def cancel_edits(self):
+    def cancel_edits(self) -> None:
         """Cancels the current edit operation.

         If there's an :attr:`edited` row, it will be re-initialized (with :meth:`Row.load`).
@@ -364,7 +371,7 @@ class GUITable(Table, GUIObject):
         self.edited = None
         self.view.refresh()

-    def delete(self):
+    def delete(self) -> None:
         """Delete the currently selected rows.

         Requires :meth:`_do_delete` for this to have any effect on the model. Cancels editing if
@@ -377,7 +384,7 @@ class GUITable(Table, GUIObject):
         if self:
             self._do_delete()

-    def refresh(self, refresh_view=True):
+    def refresh(self, refresh_view: bool = True) -> None:
         """Empty the table and re-create its rows.

         :meth:`_fill` is called after we emptied the table to create our rows. Previous sort order
@@ -399,7 +406,7 @@ class GUITable(Table, GUIObject):
         if refresh_view:
             self.view.refresh()

-    def save_edits(self):
+    def save_edits(self) -> None:
         """Commit user edits to the model.

         This is done by calling :meth:`Row.save`.
@@ -410,7 +417,7 @@ class GUITable(Table, GUIObject):
         self.edited = None
         row.save()

-    def sort_by(self, column_name, desc=False):
+    def sort_by(self, column_name: str, desc: bool = False) -> None:
         """Sort table by ``column_name``.

         Overrides :meth:`Table.sort_by`. After having performed sorting, calls
@@ -450,18 +457,18 @@ class Row:
     Of course, this is only default behavior. This can be overriden.
     """

-    def __init__(self, table):
-        super(Row, self).__init__()
+    def __init__(self, table: GUITable) -> None:
+        super().__init__()
         self.table = table

-    def _edit(self):
+    def _edit(self) -> None:
         if self.table.edited is self:
             return
         assert self.table.edited is None
         self.table.edited = self

     # --- Virtual
-    def can_edit(self):
+    def can_edit(self) -> bool:
         """(Virtual) Whether the whole row can be edited.

         By default, always returns ``True``. This is for the *whole* row. For individual cells, it's
@@ -469,7 +476,7 @@ class Row:
         """
         return True

-    def load(self):
+    def load(self) -> None:
         """(Virtual/Required) Loads up values from the model to be presented in the table.

         Usually, our model instances contain values that are not quite ready for display. If you
@@ -478,7 +485,7 @@ class Row:
         """
         raise NotImplementedError()

-    def save(self):
+    def save(self) -> None:
         """(Virtual/Required) Saves user edits into your model.

         If your table is editable, this is called when the user commits his changes. Usually, these
@@ -487,7 +494,7 @@ class Row:
         """
         raise NotImplementedError()

-    def sort_key_for_column(self, column_name):
+    def sort_key_for_column(self, column_name: str) -> Any:
         """(Virtual) Return the value that is to be used to sort by column ``column_name``.

         By default, looks for an attribute with the same name as ``column_name``, but with an
@@ -500,7 +507,7 @@ class Row:
         return getattr(self, column_name)

     # --- Public
-    def can_edit_cell(self, column_name):
+    def can_edit_cell(self, column_name: str) -> bool:
         """Returns whether cell for column ``column_name`` can be edited.

         By the default, the check is done in many steps:
@@ -530,7 +537,7 @@ class Row:
             return False
         return bool(getattr(prop, "fset", None))

-    def get_cell_value(self, attrname):
+    def get_cell_value(self, attrname: str) -> Any:
         """Get cell value for ``attrname``.

         By default, does a simple ``getattr()``, but it is used to allow subclasses to have
@@ -540,7 +547,7 @@ class Row:
             attrname = "from_"
         return getattr(self, attrname)

-    def set_cell_value(self, attrname, value):
+    def set_cell_value(self, attrname: str, value: Any) -> None:
         """Set cell value to ``value`` for ``attrname``.

         By default, does a simple ``setattr()``, but it is used to allow subclasses to have
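The annotations above only touch signatures; the control flow of the table classes is unchanged. For orientation, a rough sketch of how a concrete table is built on this API (not part of the diff; the FruitTable/FruitRow names and the list-of-dicts model are hypothetical, while _fill, load, and save are the virtual methods documented above):

```python
from hscommon.gui.table import GUITable, Row

class FruitRow(Row):
    def __init__(self, table, model):
        Row.__init__(self, table)
        self.model = model  # underlying model object (hypothetical: a dict)
        self.load()

    def load(self):
        # Copy model values into display attributes (called on refresh/cancel).
        self.name = self.model["name"]

    def save(self):
        # Commit edited display attributes back into the model.
        self.model["name"] = self.name

class FruitTable(GUITable):
    def __init__(self, fruits):
        GUITable.__init__(self)
        self.fruits = fruits  # hypothetical model: a list of dicts

    def _fill(self):
        # Called by refresh() after the table has been emptied.
        for fruit in self.fruits:
            self.append(FruitRow(self, fruit))
```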

hscommon/gui/text_field.py

@@ -5,8 +5,8 @@
 # which should be included with this package. The terms are also available at
 # http://www.gnu.org/licenses/gpl-3.0.html

-from .base import GUIObject
-from ..util import nonone
+from hscommon.gui.base import GUIObject
+from hscommon.util import nonone


 class TextFieldView:

hscommon/gui/tree.py

@@ -6,7 +6,7 @@

 from collections.abc import MutableSequence

-from .base import GUIObject
+from hscommon.gui.base import GUIObject


 class Node(MutableSequence):
@@ -77,8 +77,7 @@ class Node(MutableSequence):
         if include_self and predicate(self):
             yield self
         for child in self:
-            for found in child.findall(predicate, include_self=True):
-                yield found
+            yield from child.findall(predicate, include_self=True)

     def get_node(self, index_path):
         """Returns the node at ``index_path``.

hscommon/jobprogress/job.py

@@ -7,6 +7,9 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

+from typing import Any, Callable, Generator, List, Union
+
+
 class JobCancelled(Exception):
     "The user has cancelled the job"
@@ -36,7 +39,7 @@ class Job:
     """

     # ---Magic functions
-    def __init__(self, job_proportions, callback):
+    def __init__(self, job_proportions: Union[List[int], int], callback: Callable) -> None:
         """Initialize the Job with 'jobcount' jobs. Start every job with
         start_job(). Every time the job progress is updated, 'callback' is called
         'callback' takes a 'progress' int param, and a optional 'desc'
@@ -55,12 +58,12 @@ class Job:
         self._currmax = 1

     # ---Private
-    def _subjob_callback(self, progress, desc=""):
+    def _subjob_callback(self, progress: int, desc: str = "") -> bool:
         """This is the callback passed to children jobs."""
         self.set_progress(progress, desc)
         return True  # if JobCancelled has to be raised, it will be at the highest level

-    def _do_update(self, desc):
+    def _do_update(self, desc: str) -> None:
         """Calls the callback function with a % progress as a parameter.

         The parameter is a int in the 0-100 range.
@@ -78,13 +81,16 @@ class Job:
             raise JobCancelled()

     # ---Public
-    def add_progress(self, progress=1, desc=""):
+    def add_progress(self, progress: int = 1, desc: str = "") -> None:
         self.set_progress(self._progress + progress, desc)

-    def check_if_cancelled(self):
+    def check_if_cancelled(self) -> None:
         self._do_update("")

-    def iter_with_progress(self, iterable, desc_format=None, every=1, count=None):
+    # TODO type hint iterable
+    def iter_with_progress(
+        self, iterable, desc_format: Union[str, None] = None, every: int = 1, count: Union[int, None] = None
+    ) -> Generator[Any, None, None]:
         """Iterate through ``iterable`` while automatically adding progress.

         WARNING: We need our iterable's length. If ``iterable`` is not a sequence (that is,
@@ -107,7 +113,7 @@ class Job:
             desc = desc_format % (count, count)
         self.set_progress(100, desc)

-    def start_job(self, max_progress=100, desc=""):
+    def start_job(self, max_progress: int = 100, desc: str = "") -> None:
         """Begin work on the next job. You must not call start_job more than
         'jobcount' (in __init__) times.
         'max' is the job units you are to perform.
@@ -122,7 +128,7 @@ class Job:
         self._currmax = max(1, max_progress)
         self._do_update(desc)

-    def start_subjob(self, job_proportions, desc=""):
+    def start_subjob(self, job_proportions: Union[List[int], int], desc: str = "") -> "Job":
         """Starts a sub job. Use this when you want to split a job into
         multiple smaller jobs. Pretty handy when starting a process where you
         know how many subjobs you will have, but don't know the work unit count
@@ -132,7 +138,7 @@ class Job:
         self.start_job(100, desc)
         return Job(job_proportions, self._subjob_callback)

-    def set_progress(self, progress, desc=""):
+    def set_progress(self, progress: int, desc: str = "") -> None:
         """Sets the progress of the current job to 'progress', and call the
         callback
         """
@@ -142,30 +148,27 @@ class Job:
         self._do_update(desc)


-class NullJob:
-    def __init__(self, *args, **kwargs):
+class NullJob(Job):
+    def __init__(self, *args, **kwargs) -> None:
         # Null job does nothing
         pass

-    def add_progress(self, *args, **kwargs):
+    def add_progress(self, *args, **kwargs) -> None:
         # Null job does nothing
         pass

-    def check_if_cancelled(self):
+    def check_if_cancelled(self) -> None:
         # Null job does nothing
         pass

-    def iter_with_progress(self, sequence, *args, **kwargs):
-        return iter(sequence)
-
-    def start_job(self, *args, **kwargs):
+    def start_job(self, *args, **kwargs) -> None:
         # Null job does nothing
         pass

-    def start_subjob(self, *args, **kwargs):
+    def start_subjob(self, *args, **kwargs) -> "NullJob":
         return NullJob()

-    def set_progress(self, *args, **kwargs):
+    def set_progress(self, *args, **kwargs) -> None:
         # Null job does nothing
         pass
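For orientation, a small driver for the Job API annotated above, a sketch under the semantics the diff shows: the callback receives a progress int in the 0-100 range plus an optional desc, and cancellation surfaces as JobCancelled on a later update.

```python
from hscommon.jobprogress.job import Job, JobCancelled

def on_progress(progress, desc=""):
    print(progress, desc)
    return True  # a falsy return requests cancellation on the next update

job = Job(1, on_progress)  # a single job, reporting through on_progress
job.start_job(max_progress=5, desc="counting")
try:
    for i in range(5):
        job.set_progress(i + 1)  # translated to a 0-100 percentage internally
except JobCancelled:
    pass  # raised only when the callback returned a falsy value
```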

hscommon/jobprogress/performer.py

@@ -8,8 +8,9 @@

 from threading import Thread
 import sys
+from typing import Callable, Tuple, Union

-from .job import Job, JobInProgressError, JobCancelled
+from hscommon.jobprogress.job import Job, JobInProgressError, JobCancelled


 class ThreadedJobPerformer:
@@ -28,15 +29,15 @@ class ThreadedJobPerformer:
     last_error = None

     # --- Protected
-    def create_job(self):
+    def create_job(self) -> Job:
         if self._job_running:
             raise JobInProgressError()
-        self.last_progress = -1
+        self.last_progress: Union[int, None] = -1
         self.last_desc = ""
         self.job_cancelled = False
         return Job(1, self._update_progress)

-    def _async_run(self, *args):
+    def _async_run(self, *args) -> None:
         target = args[0]
         args = tuple(args[1:])
         self._job_running = True
@@ -52,7 +53,7 @@ class ThreadedJobPerformer:
             self._job_running = False
             self.last_progress = None

-    def reraise_if_error(self):
+    def reraise_if_error(self) -> None:
         """Reraises the error that happened in the thread if any.

         Call this after the caller of run_threaded detected that self._job_running returned to False
@@ -60,13 +61,13 @@ class ThreadedJobPerformer:
         if self.last_error is not None:
             raise self.last_error.with_traceback(self.last_traceback)

-    def _update_progress(self, newprogress, newdesc=""):
+    def _update_progress(self, newprogress: int, newdesc: str = "") -> bool:
         self.last_progress = newprogress
         if newdesc:
             self.last_desc = newdesc
         return not self.job_cancelled

-    def run_threaded(self, target, args=()):
+    def run_threaded(self, target: Callable, args: Tuple = ()) -> None:
         if self._job_running:
             raise JobInProgressError()
         args = (target,) + args
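A rough usage sketch of ThreadedJobPerformer. Assumptions not shown in the diff: the caller obtains the job from create_job() and passes it to the target itself, and a real GUI would poll last_progress/last_desc on a timer rather than busy-wait as done here.

```python
from hscommon.jobprogress.performer import ThreadedJobPerformer

def long_task(job):
    job.start_job(max_progress=100, desc="working")
    for i in range(100):
        job.set_progress(i + 1)

performer = ThreadedJobPerformer()
job = performer.create_job()  # also resets last_progress/job_cancelled
performer.run_threaded(long_task, args=(job,))
while performer._job_running:  # last_progress becomes None when the thread ends
    pass
performer.reraise_if_error()   # re-raises any exception caught in the worker thread
```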

hscommon/loc.py

@@ -1,38 +1,25 @@
 import os
 import os.path as op
 import shutil
-import re
 import tempfile
+from typing import Any, List

 import polib

-from . import pygettext
-from .util import modified_after, dedupe, ensure_folder
-from .build import print_and_do, ensure_empty_folder
+from hscommon import pygettext

 LC_MESSAGES = "LC_MESSAGES"

-# There isn't a 1-on-1 exact fit between .po language codes and cocoa ones
-PO2COCOA = {
-    "pl_PL": "pl",
-    "pt_BR": "pt-BR",
-    "zh_CN": "zh-Hans",
-}
-
-COCOA2PO = {v: k for k, v in PO2COCOA.items()}
-
-STRING_EXT = ".strings"
-

-def get_langs(folder):
+def get_langs(folder: str) -> List[str]:
     return [name for name in os.listdir(folder) if op.isdir(op.join(folder, name))]


-def files_with_ext(folder, ext):
+def files_with_ext(folder: str, ext: str) -> List[str]:
     return [op.join(folder, fn) for fn in os.listdir(folder) if fn.endswith(ext)]


-def generate_pot(folders, outpath, keywords, merge=False):
+def generate_pot(folders: List[str], outpath: str, keywords: Any, merge: bool = False) -> None:
     if merge and not op.exists(outpath):
         merge = False
     if merge:
@@ -53,7 +40,7 @@ def generate_pot(folders, outpath, keywords, merge=False):
         print("Exception while removing temporary folder %s\n", genpath)


-def compile_all_po(base_folder):
+def compile_all_po(base_folder: str) -> None:
     langs = get_langs(base_folder)
     for lang in langs:
         pofolder = op.join(base_folder, lang, LC_MESSAGES)
@@ -63,7 +50,7 @@ def compile_all_po(base_folder):
             p.save_as_mofile(pofile[:-3] + ".mo")


-def merge_locale_dir(target, mergeinto):
+def merge_locale_dir(target: str, mergeinto: str) -> None:
     langs = get_langs(target)
     for lang in langs:
         if not op.exists(op.join(mergeinto, lang)):
@@ -74,7 +61,7 @@ def merge_locale_dir(target, mergeinto):
             shutil.copy(mofile, op.join(mergeinto, lang, LC_MESSAGES))


-def merge_pots_into_pos(folder):
+def merge_pots_into_pos(folder: str) -> None:
     # We're going to take all pot files in `folder` and for each lang, merge it with the po file
     # with the same name.
     potfiles = files_with_ext(folder, ".pot")
@@ -87,7 +74,7 @@ def merge_pots_into_pos(folder):
         po.save()


-def merge_po_and_preserve(source, dest):
+def merge_po_and_preserve(source: str, dest: str) -> None:
     # Merges source entries into dest, but keep old entries intact
     sourcepo = polib.pofile(source)
     destpo = polib.pofile(dest)
@@ -99,7 +86,7 @@ def merge_po_and_preserve(source, dest):
     destpo.save()


-def normalize_all_pos(base_folder):
+def normalize_all_pos(base_folder: str) -> None:
     """Normalize the format of .po files in base_folder.

     When getting POs from external sources, such as Transifex, we end up with spurious diffs because
@@ -116,118 +103,3 @@ def normalize_all_pos(base_folder):
         for pofile in pofiles:
             p = polib.pofile(pofile)
             p.save()
-
-
-# --- Cocoa
-def all_lproj_paths(folder):
-    return files_with_ext(folder, ".lproj")
-
-
-def escape_cocoa_strings(s):
-    return s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
-
-
-def unescape_cocoa_strings(s):
-    return s.replace("\\\\", "\\").replace('\\"', '"').replace("\\n", "\n")
-
-
-def strings2pot(target, dest):
-    with open(target, "rt", encoding="utf-8") as fp:
-        contents = fp.read()
-    # We're reading an en.lproj file. We only care about the righthand part of the translation.
-    re_trans = re.compile(r'".*" = "(.*)";')
-    strings = re_trans.findall(contents)
-    if op.exists(dest):
-        po = polib.pofile(dest)
-    else:
-        po = polib.POFile()
-    for s in dedupe(strings):
-        s = unescape_cocoa_strings(s)
-        entry = po.find(s)
-        if entry is None:
-            entry = polib.POEntry(msgid=s)
-            po.append(entry)
-        # we don't know or care about a line number so we put 0
-        entry.occurrences.append((target, "0"))
-        entry.occurrences = dedupe(entry.occurrences)
-    po.save(dest)
-
-
-def allstrings2pot(lprojpath, dest, excludes=None):
-    allstrings = files_with_ext(lprojpath, STRING_EXT)
-    if excludes:
-        allstrings = [p for p in allstrings if op.splitext(op.basename(p))[0] not in excludes]
-    for strings_path in allstrings:
-        strings2pot(strings_path, dest)
-
-
-def po2strings(pofile, en_strings, dest):
-    # Takes en_strings and replace all righthand parts of "foo" = "bar"; entries with translations
-    # in pofile, then puts the result in dest.
-    po = polib.pofile(pofile)
-    if not modified_after(pofile, dest):
-        return
-    ensure_folder(op.dirname(dest))
-    print("Creating {} from {}".format(dest, pofile))
-    with open(en_strings, "rt", encoding="utf-8") as fp:
-        contents = fp.read()
-    re_trans = re.compile(r'(?<= = ").*(?=";\n)')
-
-    def repl(match):
-        s = match.group(0)
-        unescaped = unescape_cocoa_strings(s)
-        entry = po.find(unescaped)
-        if entry is None:
-            print("WARNING: Could not find entry '{}' in .po file".format(s))
-            return s
-        trans = entry.msgstr
-        return escape_cocoa_strings(trans) if trans else s
-
-    contents = re_trans.sub(repl, contents)
-    with open(dest, "wt", encoding="utf-8") as fp:
-        fp.write(contents)
-
-
-def generate_cocoa_strings_from_code(code_folder, dest_folder):
-    # Uses the "genstrings" command to generate strings file from all .m files in "code_folder".
-    # The strings file (their name depends on the localization table used in the source) will be
-    # placed in "dest_folder".
-    # genstrings produces utf-16 files with comments. After having generated the files, we convert
-    # them to utf-8 and remove the comments.
-    ensure_empty_folder(dest_folder)
-    print_and_do('genstrings -o "{}" `find "{}" -name *.m | xargs`'.format(dest_folder, code_folder))
-    for stringsfile in os.listdir(dest_folder):
-        stringspath = op.join(dest_folder, stringsfile)
-        with open(stringspath, "rt", encoding="utf-16") as fp:
-            content = fp.read()
-        content = re.sub(r"/\*.*?\*/", "", content)
-        content = re.sub(r"\n{2,}", "\n", content)
-        # I have no idea why, but genstrings seems to have problems with "%" character in strings
-        # and inserts (number)$ after it. Find these bogus inserts and remove them.
-        content = re.sub(r"%\d\$", "%", content)
-        with open(stringspath, "wt", encoding="utf-8") as fp:
-            fp.write(content)
-
-
-def generate_cocoa_strings_from_xib(xib_folder):
-    xibs = [op.join(xib_folder, fn) for fn in os.listdir(xib_folder) if fn.endswith(".xib")]
-    for xib in xibs:
-        dest = xib.replace(".xib", STRING_EXT)
-        print_and_do("ibtool {} --generate-strings-file {}".format(xib, dest))
-        print_and_do("iconv -f utf-16 -t utf-8 {0} | tee {0}".format(dest))
-
-
-def localize_stringsfile(stringsfile, dest_root_folder):
-    stringsfile_name = op.basename(stringsfile)
-    for lang in get_langs("locale"):
-        pofile = op.join("locale", lang, "LC_MESSAGES", "ui.po")
-        cocoa_lang = PO2COCOA.get(lang, lang)
-        dest_lproj = op.join(dest_root_folder, cocoa_lang + ".lproj")
-        ensure_folder(dest_lproj)
-        po2strings(pofile, stringsfile, op.join(dest_lproj, stringsfile_name))
-
-
-def localize_all_stringsfiles(src_folder, dest_root_folder):
-    stringsfiles = [op.join(src_folder, fn) for fn in os.listdir(src_folder) if fn.endswith(STRING_EXT)]
-    for path in stringsfiles:
-        localize_stringsfile(path, dest_root_folder)
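The normalization that the remaining normalize_all_pos relies on is simply polib's stable output format: loading a .po file and saving it back rewrites it in polib's canonical layout. A minimal sketch of that step (the file path is hypothetical):

```python
import polib

po = polib.pofile("locale/fr/LC_MESSAGES/ui.po")  # hypothetical path
po.save()  # rewrites the file in polib's canonical format, removing spurious diffs
```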

hscommon/notify.py

@@ -13,6 +13,7 @@ the method with the same name as the broadcasted message is called on the listen
 """

 from collections import defaultdict
+from typing import Callable, DefaultDict, List


 class Broadcaster:
@@ -21,10 +22,10 @@ class Broadcaster:
     def __init__(self):
         self.listeners = set()

-    def add_listener(self, listener):
+    def add_listener(self, listener: "Listener") -> None:
         self.listeners.add(listener)

-    def notify(self, msg):
+    def notify(self, msg: str) -> None:
         """Notify all connected listeners of ``msg``.

         That means that each listeners will have their method with the same name as ``msg`` called.
@@ -33,18 +34,18 @@ class Broadcaster:
             if listener in self.listeners:  # disconnected during notification
                 listener.dispatch(msg)

-    def remove_listener(self, listener):
+    def remove_listener(self, listener: "Listener") -> None:
         self.listeners.discard(listener)


 class Listener:
     """A listener is initialized with the broadcaster it's going to listen to. Initially, it is not connected."""

-    def __init__(self, broadcaster):
+    def __init__(self, broadcaster: Broadcaster) -> None:
         self.broadcaster = broadcaster
-        self._bound_notifications = defaultdict(list)
+        self._bound_notifications: DefaultDict[str, List[Callable]] = defaultdict(list)

-    def bind_messages(self, messages, func):
+    def bind_messages(self, messages: str, func: Callable) -> None:
         """Binds multiple message to the same function.

         Often, we perform the same thing on multiple messages. Instead of having the same function
@@ -54,15 +55,15 @@ class Listener:
         for message in messages:
             self._bound_notifications[message].append(func)

-    def connect(self):
+    def connect(self) -> None:
         """Connects the listener to its broadcaster."""
         self.broadcaster.add_listener(self)

-    def disconnect(self):
+    def disconnect(self) -> None:
         """Disconnects the listener from its broadcaster."""
         self.broadcaster.remove_listener(self)

-    def dispatch(self, msg):
+    def dispatch(self, msg: str) -> None:
         if msg in self._bound_notifications:
             for func in self._bound_notifications[msg]:
                 func()
@@ -74,14 +75,14 @@ class Listener:
 class Repeater(Broadcaster, Listener):
     REPEATED_NOTIFICATIONS = None

-    def __init__(self, broadcaster):
+    def __init__(self, broadcaster: Broadcaster) -> None:
         Broadcaster.__init__(self)
         Listener.__init__(self, broadcaster)

-    def _repeat_message(self, msg):
+    def _repeat_message(self, msg: str) -> None:
         if not self.REPEATED_NOTIFICATIONS or msg in self.REPEATED_NOTIFICATIONS:
             self.notify(msg)

-    def dispatch(self, msg):
+    def dispatch(self, msg: str) -> None:
         Listener.dispatch(self, msg)
         self._repeat_message(msg)
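A compact sketch of the Broadcaster/Listener protocol annotated above (the Document/StatusBar names are hypothetical; per the module docstring, dispatching a message calls the listener method with the same name):

```python
from hscommon.notify import Broadcaster, Listener

class Document(Broadcaster):
    def save(self):
        self.notify("document_saved")  # calls document_saved() on connected listeners

class StatusBar(Listener):
    def document_saved(self):
        print("saved")

doc = Document()
status = StatusBar(doc)
status.connect()  # a listener receives nothing until it connects
doc.save()        # prints "saved"
```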

hscommon/path.py

@@ -7,208 +7,9 @@
 # http://www.gnu.org/licenses/gpl-3.0.html

 import logging
-import os
-import os.path as op
-import shutil
-import sys

-from itertools import takewhile
 from functools import wraps
 from inspect import signature
+from pathlib import Path

-
-class Path(tuple):
-    """A handy class to work with paths.
-
-    We subclass ``tuple``, each element of the tuple represents an element of the path.
-
-    * ``Path('/foo/bar/baz')[1]`` --> ``'bar'``
-    * ``Path('/foo/bar/baz')[1:2]`` --> ``Path('bar/baz')``
-    * ``Path('/foo/bar')['baz']`` --> ``Path('/foo/bar/baz')``
-    * ``str(Path('/foo/bar/baz'))`` --> ``'/foo/bar/baz'``
-    """
-
-    # Saves a little bit of memory usage
-    __slots__ = ()
-
-    def __new__(cls, value, separator=None):
-        def unicode_if_needed(s):
-            if isinstance(s, str):
-                return s
-            else:
-                try:
-                    return str(s, sys.getfilesystemencoding())
-                except UnicodeDecodeError:
-                    logging.warning("Could not decode %r", s)
-                    raise
-
-        if isinstance(value, Path):
-            return value
-        if not separator:
-            separator = os.sep
-        if isinstance(value, bytes):
-            value = unicode_if_needed(value)
-        if isinstance(value, str):
-            if value:
-                if (separator not in value) and ("/" in value):
-                    separator = "/"
-                value = value.split(separator)
-            else:
-                value = ()
-        else:
-            if any(isinstance(x, bytes) for x in value):
-                value = [unicode_if_needed(x) for x in value]
-            # value is a tuple/list
-            if any(separator in x for x in value):
-                # We have a component with a separator in it. Let's rejoin it, and generate another path.
-                return Path(separator.join(value), separator)
-        if (len(value) > 1) and (not value[-1]):
-            value = value[
-                :-1
-            ]  # We never want a path to end with a '' (because Path() can be called with a trailing slash ending path)
-        return tuple.__new__(cls, value)
-
-    def __add__(self, other):
-        other = Path(other)
-        if other and (not other[0]):
-            other = other[1:]
-        return Path(tuple.__add__(self, other))
-
-    def __contains__(self, item):
-        if isinstance(item, Path):
-            return item[: len(self)] == self
-        else:
-            return tuple.__contains__(self, item)
-
-    def __eq__(self, other):
-        return tuple.__eq__(self, Path(other))
-
-    def __getitem__(self, key):
-        if isinstance(key, slice):
-            if isinstance(key.start, Path):
-                equal_elems = list(takewhile(lambda pair: pair[0] == pair[1], zip(self, key.start)))
-                key = slice(len(equal_elems), key.stop, key.step)
-            if isinstance(key.stop, Path):
-                equal_elems = list(
-                    takewhile(
-                        lambda pair: pair[0] == pair[1],
-                        zip(reversed(self), reversed(key.stop)),
-                    )
-                )
-                stop = -len(equal_elems) if equal_elems else None
-                key = slice(key.start, stop, key.step)
-            return Path(tuple.__getitem__(self, key))
-        elif isinstance(key, (str, Path)):
-            return self + key
-        else:
-            return tuple.__getitem__(self, key)
-
-    def __hash__(self):
-        return tuple.__hash__(self)
-
-    def __ne__(self, other):
-        return not self.__eq__(other)
-
-    def __radd__(self, other):
-        return Path(other) + self
-
-    def __str__(self):
-        if len(self) == 1:
-            first = self[0]
-            if (len(first) == 2) and (first[1] == ":"):  # Windows drive letter
-                return first + "\\"
-            elif not len(first):  # root directory
-                return "/"
-        return os.sep.join(self)
-
-    def has_drive_letter(self):
-        if not self:
-            return False
-        first = self[0]
-        return (len(first) == 2) and (first[1] == ":")
-
-    def is_parent_of(self, other):
-        """Whether ``other`` is a subpath of ``self``.
-
-        Almost the same as ``other in self``, but it's a bit more self-explicative and when
-        ``other == self``, returns False.
-        """
-        if other == self:
-            return False
-        else:
-            return other in self
-
-    def remove_drive_letter(self):
-        if self.has_drive_letter():
-            return self[1:]
-        else:
-            return self
-
-    def tobytes(self):
-        return str(self).encode(sys.getfilesystemencoding())
-
-    def parent(self):
-        """Returns the parent path.
-
-        ``Path('/foo/bar/baz').parent()`` --> ``Path('/foo/bar')``
-        """
-        return self[:-1]
-
-    @property
-    def name(self):
-        """Last element of the path (filename), with extension.
-
-        ``Path('/foo/bar/baz').name`` --> ``'baz'``
-        """
-        return self[-1]
-
-    # OS method wrappers
-    def exists(self):
-        return op.exists(str(self))
-
-    def copy(self, dest_path):
-        return shutil.copy(str(self), str(dest_path))
-
-    def copytree(self, dest_path, *args, **kwargs):
-        return shutil.copytree(str(self), str(dest_path), *args, **kwargs)
-
-    def isdir(self):
-        return op.isdir(str(self))
-
-    def isfile(self):
-        return op.isfile(str(self))
-
-    def islink(self):
-        return op.islink(str(self))
-
-    def listdir(self):
-        return [self[name] for name in os.listdir(str(self))]
-
-    def mkdir(self, *args, **kwargs):
-        return os.mkdir(str(self), *args, **kwargs)
-
-    def makedirs(self, *args, **kwargs):
-        return os.makedirs(str(self), *args, **kwargs)
-
-    def move(self, dest_path):
-        return shutil.move(str(self), str(dest_path))
-
-    def open(self, *args, **kwargs):
-        return open(str(self), *args, **kwargs)
-
-    def remove(self):
-        return os.remove(str(self))
-
-    def rename(self, dest_path):
-        return os.rename(str(self), str(dest_path))
-
-    def rmdir(self):
-        return os.rmdir(str(self))
-
-    def rmtree(self):
-        return shutil.rmtree(str(self))
-
-    def stat(self):
-        return os.stat(str(self))
-

 def pathify(f):
@@ -246,7 +47,7 @@ def log_io_error(func):
     def wrapper(path, *args, **kwargs):
         try:
             return func(path, *args, **kwargs)
-        except (IOError, OSError) as e:
+        except OSError as e:
             msg = 'Error "{0}" during operation "{1}" on "{2}": "{3}"'
             classname = e.__class__.__name__
             funcname = func.__name__
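Two notes on the hunks above. First, IOError has been an alias of OSError since Python 3.3, so narrowing the except clause does not change behavior. Second, the removed tuple-based Path maps onto pathlib roughly as follows (a sketch for orientation, not part of the diff):

```python
from pathlib import Path

p = Path("/foo/bar/baz")
str(p)      # '/foo/bar/baz'
p.name      # 'baz'            (was the old .name property)
p.parent    # Path('/foo/bar') (was the old parent() method)
p / "qux"   # join             (was Path('/foo/bar')['baz'])
p.exists()  # the old OS wrappers (exists, stat, mkdir, ...) map onto pathlib methods
```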

hscommon/pygettext.py

@@ -15,7 +15,8 @@
 #

 import os
-import imp
+import importlib.machinery
+import importlib.util
 import sys
 import glob
 import token
@@ -110,7 +111,7 @@ def _visit_pyfiles(list, dirname, names):
     # get extension for python source files
     if "_py_ext" not in globals():
         global _py_ext
-        _py_ext = [triple[0] for triple in imp.get_suffixes() if triple[2] == imp.PY_SOURCE][0]
+        _py_ext = importlib.machinery.SOURCE_SUFFIXES[0]

     # don't recurse into CVS directories
     if "CVS" in names:
@@ -120,45 +121,6 @@ def _visit_pyfiles(list, dirname, names):
     list.extend([os.path.join(dirname, file) for file in names if os.path.splitext(file)[1] == _py_ext])


-def _get_modpkg_path(dotted_name, pathlist=None):
-    """Get the filesystem path for a module or a package.
-
-    Return the file system path to a file for a module, and to a directory for
-    a package. Return None if the name is not found, or is a builtin or
-    extension module.
-    """
-    # split off top-most name
-    parts = dotted_name.split(".", 1)
-
-    if len(parts) > 1:
-        # we have a dotted path, import top-level package
-        try:
-            file, pathname, description = imp.find_module(parts[0], pathlist)
-            if file:
-                file.close()
-        except ImportError:
-            return None
-
-        # check if it's indeed a package
-        if description[2] == imp.PKG_DIRECTORY:
-            # recursively handle the remaining name parts
-            pathname = _get_modpkg_path(parts[1], [pathname])
-        else:
-            pathname = None
-    else:
-        # plain name
-        try:
-            file, pathname, description = imp.find_module(dotted_name, pathlist)
-            if file:
-                file.close()
-            if description[2] not in [imp.PY_SOURCE, imp.PKG_DIRECTORY]:
-                pathname = None
-        except ImportError:
-            pathname = None
-
-    return pathname
-
-
 def getFilesForName(name):
     """Get a list of module files for a filename, a module or package name,
     or a directory.
@@ -173,7 +135,11 @@ def getFilesForName(name):
         return file_list

     # try to find module or package
-    name = _get_modpkg_path(name)
+    try:
+        spec = importlib.util.find_spec(name)
+        name = spec.origin
+    except ImportError:
+        name = None
     if not name:
         return []
@@ -374,7 +340,7 @@ def main(source_files, outpath, keywords=None):
         fp = open(options.excludefilename, encoding="utf-8")
         options.toexclude = fp.readlines()
         fp.close()
-    except IOError:
+    except OSError:
         print(
             "Can't read --exclude-file: %s" % options.excludefilename,
             file=sys.stderr,
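The importlib replacement resolves a module's source file through its import spec. A standalone illustration of the same lookup (note that find_spec() returns None when the name cannot be found, so a None check is worth keeping in standalone code):

```python
import importlib.util

spec = importlib.util.find_spec("json")
if spec is not None:
    print(spec.origin)  # filesystem path to the module source, e.g. .../json/__init__.py
```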

hscommon/sphinxgen.py

@@ -6,8 +6,9 @@

 from pathlib import Path
 import re
+from typing import Callable, Dict, Union

-from .build import read_changelog_file, filereplace
+from hscommon.build import read_changelog_file, filereplace
 from sphinx.cmd.build import build_main as sphinx_build

 CHANGELOG_FORMAT = """
@@ -18,25 +19,25 @@ CHANGELOG_FORMAT = """
 """


-def tixgen(tixurl):
+def tixgen(tixurl: str) -> Callable[[str], str]:
     """This is a filter *generator*. tixurl is a url pattern for the tix with a {0} placeholder
     for the tix #
     """
     urlpattern = tixurl.format("\\1")  # will be replaced buy the content of the first group in re
     R = re.compile(r"#(\d+)")
-    repl = "`#\\1 <{}>`__".format(urlpattern)
+    repl = f"`#\\1 <{urlpattern}>`__"
     return lambda text: R.sub(repl, text)


 def gen(
-    basepath,
-    destpath,
-    changelogpath,
-    tixurl,
-    confrepl=None,
-    confpath=None,
-    changelogtmpl=None,
-):
+    basepath: Path,
+    destpath: Path,
+    changelogpath: Path,
+    tixurl: str,
+    confrepl: Union[Dict[str, str], None] = None,
+    confpath: Union[Path, None] = None,
+    changelogtmpl: Union[Path, None] = None,
+) -> None:
     """Generate sphinx docs with all bells and whistles.

     basepath: The base sphinx source path.
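To make the f-string rewrite concrete, here is tixgen's substitution in action (the issue-tracker URL is hypothetical):

```python
import re

tixurl = "https://example.com/issues/{0}"
urlpattern = tixurl.format("\\1")  # -> 'https://example.com/issues/\\1'
R = re.compile(r"#(\d+)")
repl = f"`#\\1 <{urlpattern}>`__"
print(R.sub(repl, "Fixed #123"))
# Fixed `#123 <https://example.com/issues/123>`__
```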

Some files were not shown because too many files have changed in this diff.