Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: TAG 2017/18 Sprint 2
Fix Version/s: None
Component/s: Filetracker
Labels: None
Description
Investigate options for compressing filetracker contents (different compression algorithms, compression/decompression logic, etc.). Also check whether the Szkopuł filetracker contains a significant number of duplicate files.
Gain (a slightly misleading name) is the ratio of the size after compression or deduplication to the original size, expressed as a percentage, so lower is better.
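As a quick check of that definition, here is a minimal sketch that recomputes the gain from the sizes reported in this ticket. The function name `gain_percent` is hypothetical, and rounding down to a whole percent is an assumption that happens to reproduce the reported figures.

```python
def gain_percent(size_after, size_before):
    """Size after processing as a whole percentage of the original size.

    Lower is better. Rounding down is an assumption, not something
    the ticket states.
    """
    if size_before == 0:
        raise ValueError("original size must be non-zero")
    return 100 * size_after // size_before


# Sizes copied from the results in this ticket.
print(gain_percent(638640713188, 1101947332565))  # deduplication -> 57
print(gain_percent(236547224239, 589263912900))   # gzip -> 40
print(gain_percent(19607237016, 64855014576))     # xz -> 30
```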
Deduplication (SHA):
('Highest count:', 14794) - one file has 14794 exact copies
('Total size:', 1101947332565)
('Dedup size:', 638640713188)
('Gain:', 57, '%')
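Figures like the ones above could be gathered with a script along these lines. This is only a sketch, not the script actually used for the ticket: it assumes the content is reachable as a plain directory tree, picks SHA-256 (the ticket only says "SHA"), and the names `file_digest` and `dedup_stats` are hypothetical.

```python
import hashlib
import os
from collections import Counter


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedup_stats(root):
    """Walk `root` and report how much space exact duplicates take."""
    counts = Counter()   # digest -> number of files with that content
    sizes = {}           # digest -> size of one copy
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            digest = file_digest(path)
            counts[digest] += 1
            sizes[digest] = size
            total += size
    dedup = sum(sizes.values())  # one copy per distinct content
    return {
        "highest_count": max(counts.values(), default=0),
        "total_size": total,
        "dedup_size": dedup,
        "gain_percent": 100 * dedup // total if total else 0,
    }
```

Hashing every file means reading the whole store once, so on a data set of this size the run is bound by disk throughput rather than by the hash function.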
Gzip compression (50% of data analyzed):
('Original size:', 589263912900)
('Compressed size:', 236547224239)
('Gain:', 40, '%')
Xz compression (slower, 30-40% analyzed):
('Original size:', 64855014576)
('Compressed size:', 19607237016)
('Gain:', 30, '%')
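The gzip and xz measurements can be approximated with Python's standard `gzip` and `lzma` modules. The sketch below is an assumption about how such a measurement might look, not the method used for the numbers above: it compresses each file separately with default settings, reads each file fully into memory, and `compression_stats` is a hypothetical helper name.

```python
import gzip
import lzma


def compression_stats(paths, compress):
    """Sum original and compressed sizes over the given files.

    `compress` is any bytes -> bytes function, for example
    gzip.compress or lzma.compress (which produces .xz by default).
    Each file is compressed on its own, as it would be if stored
    individually.
    """
    original = 0
    compressed = 0
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()  # fine for a sketch; stream large files instead
        original += len(data)
        compressed += len(compress(data))
    return original, compressed
```

For example, `compression_stats(paths, gzip.compress)` and `compression_stats(paths, lzma.compress)` give the two pairs of sizes to compare. Per-file compression pays a fixed header cost for every file, so very small files can come out larger than they went in.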