dupeguru

mirror of https://github.com/arsenetar/dupeguru.git synced 2024-11-01 06:15:58 +00:00

Author	SHA1	Message	Date
Andrew Senetar	057be0294a	fix: Prevent exception during existence check - Add "safe" existence check to files which catches OSErrors that may occur when trying to stat files - Use "safe" existence check during final existence check	2023-01-11 23:07:06 -06:00
Andrew Senetar	81daddd072	refactor: Improve digest cache db method performance - Remove lock on read operations, only needed for write operations - Change to use context manager for sqlite connection - Remove long lived cursor object and use short lived cursors instead Fixes #1080	2023-01-11 00:58:29 -06:00
Andrew Senetar	6db2fa2be6	fix: Correct flake8 config - Add exclude pattern for flake8 when running with pre-commit as it does not fully honor the exclude paths. - Cleanup exclude paths for flake8 in tox.ini - Re-enable line length check and correct three affected files	2023-01-09 22:35:12 -06:00
Andrew Senetar	e30a135451	feat: Add additional scan time options - Add option to include file existence check at end of scan, speeds up end of scan operation time considerably, however if user has removed or moved files since starting a scan there could be later errors when interacting with results. Defaults to existing behavior of including the check, until it can be verified later dialogs and actions handle non-existent items better. - Add option to ignore differences in mtime when checking hash cache. Option is present in advanced tab of preferences. Closes #1022. - Regenerate pot files for translations	2023-01-05 23:01:16 -06:00
Andrew Senetar	71af825b37	Move try/except of cache db to get() and put() - Move the try/except of cache db calls to the calls themselves. - Add some additional information to logging statements on cache db exception to improve troubleshooting.	2022-07-07 21:52:22 -05:00
Andrew Senetar	0a4e61edf5	Additional cleanup per mypy - Add Callable type to hasher (should really be more specific...) - Add type hint to COLUMNS in qtlib/table.py - Use Qt.ItemFlag.ItemIsEnabled instead of Qt.itemIsEnabled in qtlib/table.py	2022-04-30 05:16:46 -05:00
Andrew Senetar	63dd4d4561	Apply pyupgrade changes	2022-04-27 20:53:12 -05:00
Andrew Senetar	a470a8de25	Update fs.py to optimize stat() calls - Update to get size and mtime at time of class creation when os.DirEntry is used for initialization. - Folders still calculate size later for folder scans. - Ref #962, #959	2022-03-30 22:58:01 -05:00
Andrew Senetar	efd500ecc1	Update directory scanning to use os.scandir() - Change to use os.scandir() instead of os.walk() to leverage DirEntry objects. - Avoids extra calls to stat() on files during fs.can_handle() - See 3x speed improvement on Windows in some cases	2022-03-29 23:37:56 -05:00
Andrew Senetar	43fcc52291	Replace pathlib.glob() with os.scandir() in fs.py	2022-03-29 22:35:38 -05:00
Andrew Senetar	50f5db1543	Update fs to support DirEntry on get_file()	2022-03-29 22:32:36 -05:00
Andrew Senetar	da9f8b2b9d	Squashed commit of the following: commit 8b15fe9a502ebf4841c6529e7098cef03a6a5e6f Author: Andrew Senetar <arsenetar@gmail.com> Date: Sun Mar 27 23:48:15 2022 -0500 Finish up changes to copy_or_move commit 21f6a32cf3186a400af8f30e67ad2743dc9a49bd Author: Andrew Senetar <arsenetar@gmail.com> Date: Thu Mar 17 23:56:52 2022 -0500 Migrate from hscommon.path to pathlib - Part one, this gets all hscommon and core tests passing - App appears to be able to load directories and complete scans, need further testing - app.py copy_or_move needs some additional work	2022-03-27 23:50:03 -05:00
Andrew Senetar	9f40e4e786	Squashed commit of the following: commit 5eb515f666bfa1ff06c2e96bdc351a4b7456580e Author: Andrew Senetar <arsenetar@gmail.com> Date: Sun Mar 27 22:19:39 2022 -0500 Add fallback to md5 if xxhash not available Mainly here for the case when distributions have not packaged python3-xxhash. commit 51b18d4c84 Author: Andrew Senetar <arsenetar@gmail.com> Date: Sat Mar 19 15:25:46 2022 -0500 Switch file hashing to xxhash instead of md5 - Improves performance significantly in some cases - Add xxhash to requirements.txt and sort requirements - Rename md5 based members to digest - Update all tests to use new member names and hashing methods - Update hash db code to upgrade schema NOTE: May consider supporting multiple hashing algorithms in the future.	2022-03-27 22:27:13 -05:00
Dobatymo	9753afba74	change FilesDB to singleton class move hash calculation back in to Files class clear cache now clears hash cache in addition to picture cache	2021-10-29 15:12:40 +08:00
Dobatymo	2f02a6010d	implement hash cache for md5 hash based on sqlite	2021-10-29 15:12:40 +08:00
Andrew Senetar	d576a7043c	Code cleanups in core and other affected files	2021-08-21 18:02:02 -05:00
Andrew Senetar	ffe6b7047c	Format all files with black correcting line length	2021-08-15 04:10:18 -05:00
glubsy	e95306e58f	Fix flake 8	2021-08-14 02:52:00 +02:00
glubsy	891a875990	Cache constant expression Perhaps the python byte code is already optimized, but just in case it is not, pre-compute the constant expression.	2021-08-13 21:33:21 +02:00
glubsy	545a5a75fb	Fix for older python versions The "walrus" operator is only available in python 3.8 and later. Fall back to more traditional notation.	2021-08-13 20:56:33 +02:00
glubsy	7b764f183e	Avoid partially hashing small files Computing 3 hash samples for files less than 3MiB (3 * CHUNK_SIZE) is not efficient since spans of later samples would overlap a previous one. Therefore we can simply return the hash of the entire small file instead.	2021-08-13 20:47:01 +02:00
glubsy	718ca5b313	Remove unused import	2021-06-22 02:41:33 +02:00
glubsy	277bc3fbb8	Add unit tests for hash sample optimization * Instead of keeping md5 samples separate, merge them as one hash computed from the various selected chunks we picked. * We don't need to keep a boolean to see whether or not the user chose to optimize; we can simply compare the value of the threshold, since 0 means no optimization currently active.	2021-06-21 22:44:05 +02:00
glubsy	e07dfd5955	Add partial hashes optimization for big files * Big files above the user selected threshold can be partially hashed in 3 places. * If the user is willing to take the risk, we consider files with identical md5samples as being identical.	2021-06-21 19:03:21 +02:00
Andrew Senetar	7ba8aa3514	Format files with black - Format all files with black - Update tox.ini flake8 arguments to be compatible - Add black to requirements-extra.txt - Reduce ignored flake8 rules and fix a few violations	2019-12-31 20:16:27 -06:00
Virgil Dupras	334f4dd2ae	Increase md5 reading buffer to 1mb This makes md5 computing faster without using too much memory.	2016-06-08 12:23:10 -04:00
Virgil Dupras	e7076bc3bd	Change license from BSD to GPLv3 See http://www.hardcoded.net/archive2014#2014-12-28 for context	2015-01-03 16:33:16 -05:00
Virgil Dupras	fc16ea8c49	Change copyright year to 2015	2015-01-03 16:30:57 -05:00
Virgil Dupras	2166a0996c	Added tox configuration ... and fixed pep8 warnings. There's a lot of them that are still ignored, but that's because it's too much of a step to take at once.	2014-10-13 15:08:59 -04:00
Virgil Dupras	ca709a60cf	Updated copyright year to 2014	2014-04-19 12:19:11 -04:00
Virgil Dupras	10dbfa9b38	Refactoring: Path API compatibility with pathlib Refactored dupeGuru to make hscommon.path's API a bit closer to pathlib's API. It's not 100% compatible yet, but it's much better than before. This is more of a hscommon refactoring than a dupeguru one, but since dupeGuru is the main user of Path, it was the driver behind the refactoring. This refactoring also sees the introduction of @pathify, which ensures Path arguments. Previously, we were often unsure of whether the caller of a function was passing a Path or a str. This problem is now solved and this allows us to remove hscommon.io, an ill-conceived attempt to solve that same ambiguity problem. Fixes #235.	2013-11-16 12:06:16 -05:00
Virgil Dupras	be8efea081	Fixed folder scanning in SE, which was completely broken Oops	2013-08-18 20:50:31 -04:00
Virgil Dupras	7e8f9036d8	Began serious code documentation effort Enabled the autodoc Sphinx extension and started adding docstrings to classes, methods, etc.. It's quickly becoming quite interesting...	2013-08-18 18:36:09 -04:00
Virgil Dupras	7891fb5396	Refactoring: Moved some code from app.DupeGuru to fs.File. Moved DupeGuru._get_display_info() to File.get_display_info(). This method used none of the app's global state or methods and had nothing to do there.	2013-07-14 17:43:58 -04:00
Virgil Dupras	4a8ce9b6c4	Updated copyright year to 2013.	2013-04-28 10:35:51 -04:00
Virgil Dupras	df30a31782	Refactoring: Began to phase out the use of hscommon.io in favor of Path methods.	2012-08-09 10:53:24 -04:00
Virgil Dupras	1171705921	Made core.fs.File slotted to save a lot of memory usage.	2012-05-29 17:39:54 -04:00
Virgil Dupras	657f6743c2	Changed copyright year to 2012	2012-03-15 14:28:40 -04:00
Virgil Dupras	56207f4dbb	[#161 state:fixed] Fixed folder sorting.	2011-06-15 11:58:33 -04:00
Virgil Dupras	0b20b35ffb	Fixed copying operations for folders which didn't work.	2011-04-14 12:55:50 +02:00
Virgil Dupras	279d44b7f3	[#89 state:fixed] Added a Folders scan type in dgse. --HG-- rename : core_se/tests/fs_test.py => core/tests/fs_test.py	2011-04-12 13:22:29 +02:00
Virgil Dupras	0fea59007c	Updated copyright year to 2011.	2011-04-12 10:04:01 +02:00
Virgil Dupras	eefe464fba	Replaced dependencies from hsutil to hscommon.	2011-01-11 13:36:05 +01:00
Virgil Dupras	33c0ba808c	Changed references to what has already been moved from hsutil to hscommon (io, path, testutil).	2011-01-11 11:59:53 +01:00
Virgil Dupras	4886982d43	Re-licensed to BSD	2010-09-30 12:17:41 +02:00
Virgil Dupras	565c990687	[#101 state:fixed] Remove the Creation Time column.	2010-08-13 09:26:38 +02:00
Virgil Dupras	854d194f88	Converted to py3k. There's probably some bugs still. So far, I managed to run dupeGuru SE under pyobjc and qt.	2010-08-11 16:39:06 +02:00
Virgil Dupras	b372974437	[#84 state:hold] Added debug logging to fs.get_files() to eventually figure out the cause of this bug.	2010-02-05 17:55:47 +01:00
Virgil Dupras	9f006ec08a	[#75 state:fixed] md5 hashes are now computed incrementally.	2010-01-13 08:59:44 +01:00
Virgil Dupras	d62ff40bed	Removed svn keywords.	2010-01-02 16:52:18 +01:00
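
Several of the commits above (9f006ec08a, 334f4dd2ae, e07dfd5955, 277bc3fbb8, 7b764f183e, 9f40e4e786) describe how file digests are computed: incrementally, with a 1 MB read buffer, optionally from three samples of a big file merged into one hash, and with xxhash falling back to md5. The following is a rough sketch of that idea, not the project's actual code. The names `full_digest`, `sample_digest`, and `new_hasher` are hypothetical, and the sample offsets (start, middle, end) are an assumption, since the log does not say where the three samples are taken.

```python
import hashlib
import os

try:
    # Optional third-party package; the log mentions falling back to md5
    # when it is not installed.
    import xxhash

    new_hasher = xxhash.xxh128
except ImportError:
    new_hasher = hashlib.md5

CHUNK_SIZE = 1024 * 1024  # 1 MiB read buffer


def full_digest(path, chunk_size=CHUNK_SIZE):
    """Hash the whole file incrementally, one chunk at a time."""
    hasher = new_hasher()
    with open(path, "rb") as fp:
        # iter() with a sentinel avoids the walrus operator (Python 3.8+).
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.digest()


def sample_digest(path, chunk_size=CHUNK_SIZE):
    """Hash three samples of a big file, merged into a single digest.

    Files smaller than three chunks are hashed in full, because the
    samples would overlap. Offsets used here are an assumption.
    """
    size = os.path.getsize(path)
    if size < 3 * chunk_size:
        return full_digest(path, chunk_size)
    hasher = new_hasher()
    with open(path, "rb") as fp:
        for offset in (0, (size - chunk_size) // 2, size - chunk_size):
            fp.seek(offset)
            hasher.update(fp.read(chunk_size))
    return hasher.digest()
```

Two files that differ only in a region no sample covers produce the same `sample_digest`, which is the risk commit e07dfd5955 refers to when it says the user must be "willing to take the risk".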

52 Commits
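
Commits efd500ecc1, a470a8de25, and 057be0294a describe scanning directories with `os.scandir()` so that size and mtime come from the `DirEntry` objects, and a "safe" existence check that catches `OSError`. A minimal sketch of both ideas, using only the standard library, might look like this. The function names `iter_files` and `safe_exists` are hypothetical and do not reflect the project's real API.

```python
import os


def iter_files(root):
    """Yield (path, size, mtime) for every regular file under root.

    os.scandir() returns DirEntry objects whose stat() result is cached
    (and on Windows mostly filled in by the directory listing itself),
    which avoids extra stat() calls per file.
    """
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from iter_files(entry.path)
            elif entry.is_file(follow_symlinks=False):
                info = entry.stat(follow_symlinks=False)
                yield entry.path, info.st_size, info.st_mtime


def safe_exists(path):
    """Return False instead of raising when the path cannot be stat'ed."""
    try:
        os.stat(path)
        return True
    except OSError:
        # Covers missing files as well as permission or I/O errors.
        return False
```

A final existence pass over scan results could then be written as `[p for p, _, _ in results if safe_exists(p)]`, which drops files removed or moved since the scan started.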