If settings.ini is present next to the executable, will run in portable mode.
This results in settings, data, and cache all being in same folder as dupeGuru.
Computing 3 hash samples for files less than 3MiB (3 * CHUNK_SIZE) is not efficient since spans of later samples would overlap a previous one.
Therefore we can simply return the hash of the entire small file instead.
* Some refs appeared in the dupes-only view after a re-prioritization was done a second time.
* It seems the core.Results.__dupes list was not properly updated whenever core.app.Dupeguru.reprioritize_groups() -> core.Results.sort_dupes() was called.
When a re-prioritization is done, some refs became dupe, and some dupes became ref in their place. So we need to update the new state of the internal list of dupes kept by the Results object, instead of relying on the outdated cached one.
* Fix#757.
* Instead of keeping md5 samples separate, merge them as one hash computed from the various selected chunks we picked.
* We don't need to keep a boolean to see whether or not the user chose to optimize; we can simply compare the value of the threshold, since 0 means no optimization currently active.
* Big files above the user selected threshold can be partially hashed in 3 places.
* If the user is willing to take the risk, we consider files with identical md5samples as being identical.
* Doing a full match should be safer to avoid partial results which would result in overly aggressive filtering.
* Add new tests to test suite to cover this issue.
* Fixes#903.
* Accents are getting removed from Unicode characters to generate similar "words".
* Non-latin characters which cannot be processed that way (eg. japanese, greek, russian, etc.) should not be filtered out at all otherwise files are erroneously skipped or detected as dupes if only some characters make it passed the filter.
* Starting from an arbitrary unicode codepoint (converted to decimal), above which we know it is pointless to try any sort of processing, we leave the characters as is.
* Fix#878.
Removing prioritizations one-by-one can be tedious. This commit enables
extended selection in the prioritizations list. Multiple items can be
selected with conventional methods, such as holding down Ctrl or Shift
key and clicking the items or holding down the left mouse button and
hovering the cursor over the list. All items also can be selected with
Ctrl+A.
Multiple items drag-n-drop is also possible.
To avoid confusion, the selection in the prioritizations list is cleared
after the items are removed or drag-n-dropped.
Signed-off-by: Sergey Zhuravlevich <sergey@zhur.xyz>
Adding criteria to the prioritizations list one-by-one can be tedious.
This commit enables extended selection in the criteria list and
implements adding multiple items. Multiple criteria can be selected with
conventional methods, such as holding down Ctrl or Shift keys and
clicking the items or holding down the left mouse button and hovering
the cursor over the list. All items also can be selected with Ctrl+A.
Signed-off-by: Sergey Zhuravlevich <sergey@zhur.xyz>
* Add testing feature to Exclusion dialog to allow users to test regexes against an arbitrary string.
* Fixed test suites.
* Improve comments and help dialog box.
* Concatenating regexes into one Pattern might yield better performance under (un)certain conditions.
* Filenames are tested against regexes with no os.sep in them. This may or may not be what we want to do.
And alternative would be to test against the whole (absolute) path of each file, which would filter more agressively.
* It seems that matchblock.getmatches() returns too early and the (multi-)processes become zombies
* This is a workaround which seems to work by sleeping for one second and avoid zombie processes
- Currently some incompatibility in the hscommon tests, commented out
the ones with issues temporarily
- Also updated some deprecation warnings, still more to do
* results.py doesn't set the proper type for dupes at the line
"file = get_file(path)" so we add it on top
* Perhap it could have been added to _get_fileclasses() in core.app.py too
but I have not tested it
* In standard mode, for folder comparison, dupe type is wrongly set as core.fs.Folder
while it should be core.se.fs.Folder.
* Catching the NotImplementedError exception redirects to the appropriate handler
* This is only a temporary workaround until a better fix is implemented
- Add changes from OSX build to local hscommon/build.py
- Update package.py & srcpkg.sh
- Remove invalid submodule references
- Update srcpkg.sh to use xz
- Update package.py pyinstaller configuration
- Call PyInstaller inline
- Add --noconfirm option to be more script friendly
- Add UCRT Redist location to path should fix#545 as now all the dlls
are included
- Format all files with black
- Update tox.ini flake8 arguments to be compatible
- Add black to requirements-extra.txt
- Reduce ignored flake8 rules and fix a few violations
* Fix main version (Don't use spaces and capitals in versions!)
* Change debian changelog format in hscommon
* Fix build cleanup
* Switch to XZ compression
* Update build instructions
* Build single package for both Debian/Ubuntu
* Update packaging
These methods were previously called recursively and it seemed to cause
problems in some cases. The recursive nature of these functions not
bringing any notable advantage and `os.walk()` being of better style
anyway, I removed that recursive nature.
Hopefully fixes#421
We now get less progress feedback, but in exchange, our progress job is
simpler. Previously, our progress bar would often get wonky towards the
end of the scan and I didn't have the energy to debug that.
Besides, people don't care about that level of progress feedback.
(breaks PE temporarily)
Adds a Standard/Music Application Mode button to SE and thus adds the
ability to run ME scan types in SE. When in Music mode, the
Music-specific results window, details panel and preferences panel will
show up.
All preferences except scan_type become shared between app modes
(changing the pref in a mode changes it in the other mode).
Results Window and Details Panel are now re-created at each scan
operation because they could change their type between two runs.
Preferences panel is instantiated on the fly and discarded after close.
This is a very big merge operation and I'm trying to touch as little
code as possible, sometimes at the cost of elegance. I try to minimize
the breakage that this change brings.
It's now `DupeGuru` that holds it and passes it to `get_dupe_groups()`,
the only place where it's actually used in `Scanner`.
This will make the SE/ME/PE merge easier by allowing us to instantiate
the Scanner on-the-fly since it doesn't hold state anymore.
I have a fix to make in it and it's really silly to pretend that this
lib is of any use to anybody outside HS apps. Bringing it back here will
make things more simple.
When the "Replace with links" option is not enabled, the choice of
symlink or hardlink is irrelevant and causes confusion. Implemented core
mechanism for controlling the enabled state of that option. Also
implemented the Qt interface for it. Cocoa-part is still to be done.
I used this opportunity to greatly enhance documentation of this part of
the code. I'm beginning to like documenting...
Ref #247.