Feature Name
LessFS
Summary
LessFS is a filesystem deduplication project. It aims to reduce disk usage by storing each identical filesystem block only once and using pointers to that block for every copy. This method of storage is becoming popular in enterprise solutions, particularly for reducing the size of disk backups and minimising virtual machine storage.
Owner
- Name: Duncan Innes
- Email: duncan AT innes DOT net
Current status
- Targeted release: ?
- Last updated: 2010-11-12
- Percentage of completion: 0%
Detailed Description
Data deduplication is often used for backup purposes and for virtual machine image storage. lessfs determines whether data is redundant by calculating a unique (192-bit) Tiger hash of each block of data that is written. When lessfs has determined that a block of data needs to be stored, it first compresses the block with LZO or QuickLZ compression. The combination of these two techniques results in a very high overall compression rate for many types of data. Multimedia files such as mp3, avi or jpg files are already compressed, so lessfs cannot reduce them further when they are stored only once on the filesystem.
http://www.lessfs.com/wordpress/?page_id=50
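The hash-then-compress approach described above can be illustrated with a minimal sketch. It is not the lessfs implementation: the DedupStore class and the 4096-byte BLOCK_SIZE are hypothetical, BLAKE2b with a 192-bit digest stands in for the Tiger hash (which Python's hashlib does not guarantee), and zlib stands in for LZO/QuickLZ.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


class DedupStore:
    """Toy in-memory block store: one compressed copy per unique block."""

    def __init__(self):
        self.blocks = {}  # digest -> compressed block data
        self.files = {}   # file name -> ordered list of block digests

    def write(self, name, data):
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            # 24-byte (192-bit) digest; BLAKE2b is a stand-in for Tiger.
            digest = hashlib.blake2b(block, digest_size=24).digest()
            if digest not in self.blocks:
                # Only blocks not seen before are compressed and stored;
                # zlib is a stand-in for LZO/QuickLZ.
                self.blocks[digest] = zlib.compress(block)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        # Rebuild the file by following the stored digests (the "pointers").
        return b"".join(
            zlib.decompress(self.blocks[d]) for d in self.files[name]
        )

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())
```

Writing the same data under two names adds a second list of digests but no new blocks, which is where the space saving for backups and virtual machine images comes from.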
Benefit to Fedora
This will bring an as yet unavailable enterprise tool to Fedora. Storage is becoming the biggest consumer of energy in the datacentre, and de-duplication will help to reduce those power and cost requirements. Inclusion of LessFS (even as a technology preview) will improve the coverage of Fedora and help to push forward an open source method of de-duplication.
Scope
LessFS adds functionality that allows de-duplicated filesystems.
How To Test
A Package Review Request is currently sitting in Bugzilla (https://bugzilla.redhat.com/show_bug.cgi?id=530473) but appears to have stalled.
Once the package is installed, a filesystem should be created. Testing of the filesystem will involve placing multiple copies of similar and identical files in the filesystem. Files should ideally be greater than one block in size. Editing of the files, moving, making further copies etc. should all be seamless.
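The test steps above could be scripted roughly as follows. This is only a sketch: the exercise_filesystem helper, the file names and the default sizes are hypothetical, and the mount point must be supplied by the tester. It checks that copies, an edit and a move behave as on any other filesystem; it does not measure how much space de-duplication saves.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def exercise_filesystem(mount_point, copies=5, size=1024 * 1024):
    """Create identical files, edit one, move one, and verify contents.

    mount_point is assumed to be a directory on the filesystem under test.
    copies must be at least 2; size should be a multiple of 256 and larger
    than one filesystem block.
    """
    root = Path(mount_point)
    payload = bytes(range(256)) * (size // 256)
    expected = hashlib.sha256(payload).hexdigest()

    original = root / "original.bin"
    original.write_bytes(payload)

    # Place several identical copies on the filesystem.
    duplicates = []
    for i in range(copies):
        dup = root / f"copy_{i}.bin"
        shutil.copyfile(original, dup)
        duplicates.append(dup)

    # Edit one copy; the shared blocks of the other files must not change.
    with open(duplicates[0], "ab") as fh:
        fh.write(b"changed")

    # Move another copy within the filesystem.
    moved = root / "moved.bin"
    shutil.move(str(duplicates[1]), str(moved))

    return {
        "original_intact": sha256_of(original) == expected,
        "edited_differs": sha256_of(duplicates[0]) != expected,
        "moved_intact": sha256_of(moved) == expected,
        "others_intact": all(sha256_of(p) == expected for p in duplicates[2:]),
    }
```

Every value in the returned dictionary should be True. Disk usage before and after the run can be compared separately to see the effect of de-duplication.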
In my view, this package is not aimed at filesystems requiring maximum read/write speed, but is better suited to filesystems with a low rate of change and high capacity requirements.
User Experience
De-duplication will be noticeable to target users by greatly reducing the disk space requirements for backups to disk and for virtual machine storage. Greater reductions are seen where many images/backups share a common data set.
Dependencies
Contingency Plan
None necessary - this is a new feature and does not change any current part of Fedora.