Features/HekaFS

From FedoraProject

HekaFS

Summary

A "cloud ready" version of GlusterFS that adds authentication/authorization, encryption, and multi-tenancy.

Owner

Current status

Detailed Description

HekaFS is intended to be a version of an existing distributed/parallel filesystem that's suitable for deployment by a provider as a permanent, shared service. It could also be used as infrastructure for hosting virtual-machine images, and in fact the underlying technology (GlusterFS) is often used in this role currently. Users can already deploy this class of file system privately in the cloud, within their own virtual machines, but they pay both a performance and an administrative cost for doing so. Running servers natively and doing the configuration/administration only once could be a compelling option for anyone building a Fedora-based cloud, but requires some additional features. Specifically, HekaFS provides:

All of these features can be implemented in a modular way, so that users can deploy only those they deem necessary or appropriate for their situation.

The long-term plan for HekaFS includes performance/scalability enhancements and multi-site replication, but those are not part of the current proposal.

Benefit to Fedora

Best-in-class cloud storage with a full and familiar POSIX API, high performance, and strong security. This functionality can be used either as infrastructure for the cloud itself, or as a service providing additional functionality directly to users.

Scope

The scope of work for this feature mostly consists of the HekaFS package. Some of the code is also part of the existing glusterfs package, either in the form of HekaFS-related patches or whole features (e.g. SSL-based authentication and transport encryption).

How To Test

Because HekaFS is a distributed file system, testing requires at least two, and ideally four or more, server nodes. Since the specific goal of HekaFS is to provide various kinds of protection between tenants, it's best to have at least two client nodes mounting as different tenants. All of these nodes must have a current and compatible version of the glusterfs package installed. They can be virtual machines for testing purposes.
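The node requirements above can be written down as a small pre-flight check. This is only an illustrative sketch: the `Layout` class, the `check_layout` function, and the node and tenant names are hypothetical and are not part of HekaFS or GlusterFS.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Layout:
    """Hypothetical description of a HekaFS test bed (not a HekaFS API)."""
    servers: list[str] = field(default_factory=list)
    # Maps each client node name to the tenant it mounts as.
    client_tenants: dict[str, str] = field(default_factory=dict)


def check_layout(layout: Layout) -> list[str]:
    """Return messages describing where the layout falls short of the text above."""
    messages = []
    if len(layout.servers) < 2:
        messages.append("error: at least two server nodes are required")
    elif len(layout.servers) < 4:
        messages.append("warning: four or more server nodes are preferable")
    if len(layout.client_tenants) < 2:
        messages.append("warning: at least two client nodes are recommended")
    if len(set(layout.client_tenants.values())) < 2:
        messages.append("warning: clients should mount as different tenants")
    return messages
```

For example, `check_layout(Layout(servers=["s1", "s2"], client_tenants={"c1": "tenantA", "c2": "tenantB"}))` returns only the warning about preferring four or more servers.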

Configuration is mostly as described in the official manual and web screenshots. At a minimum, you'll need to create a pool/cluster and a volume first. Testing from that point is largely feature-dependent. Referring to the above feature list:

User Experience

File systems are notoriously "invisible" to users when they work. The real "user" experience for HekaFS is actually the experience of the cloud provider or cloud tenant (account holder) as they configure their respective parts of the system. This experience includes the following aspects.

Dependencies

The only major dependency is on a compatibly-patched version of glusterfs. For encryption, there is also a dependency on OpenSSL.

Contingency Plan

None necessary. The existing glusterfs functionality is not affected.

Documentation

The existing management manual needs to be expanded to cover setup for both forms of encryption (network and storage). There is also a separate document describing the setup of SSL authentication/encryption within the management code itself, which needs to become part of a more general installation manual.

Release Notes

HekaFS 0.7 enhances the feature set of GlusterFS with multi-tenancy, security, and management features.

HekaFS deployment requires knowledge of how to set up OpenSSL keys and certificates to facilitate authentication at both the management and I/O levels.
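As a rough idea of the OpenSSL work involved, the sketch below builds the standard `openssl` command lines for creating a private key and a self-signed certificate. It is an assumption-laden illustration, not the HekaFS procedure: the file names, the common name, the key size, and the helper functions `openssl_commands` and `generate` are all hypothetical, and where HekaFS expects keys and certificates to be installed is not covered here.

```python
import shutil
import subprocess


def openssl_commands(name: str, days: int = 365) -> list[list[str]]:
    """Build openssl argv lists for a key and a self-signed certificate.

    The file names (<name>.key, <name>.pem) and the CN are illustrative
    choices, not locations or identities mandated by HekaFS.
    """
    key, cert = f"{name}.key", f"{name}.pem"
    return [
        # Generate a 2048-bit RSA private key.
        ["openssl", "genrsa", "-out", key, "2048"],
        # Issue a self-signed X.509 certificate for that key.
        ["openssl", "req", "-new", "-x509", "-key", key,
         "-out", cert, "-days", str(days), "-subj", f"/CN={name}"],
    ]


def generate(name: str) -> None:
    """Run the commands in the current directory; requires the openssl binary."""
    if shutil.which("openssl") is None:
        raise RuntimeError("openssl not found on PATH")
    for argv in openssl_commands(name):
        subprocess.run(argv, check=True)
```

Calling `generate("tenant1")` would write `tenant1.key` and `tenant1.pem` to the current directory. A real deployment would more likely use certificates signed by a common authority rather than self-signed ones.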

Network and storage encryption are both optional, and incur a significant performance penalty if used.

Quota/billing support is under active development within GlusterFS, and will not be available for this release of HekaFS.

Enhanced local file distribution/replication and wide-area replication are planned as eventual features of HekaFS, but are not in this release.

Comments and Discussion