LVM/liblvm: Difference between revisions

Revision as of 16:50, 8 May 2009

Introduction / Problem Description

LVM is currently being used by other software projects. These projects interface with LVM by calling the LVM commandline either by invoking a shell or calling the string-based liblvmcmd. These interface methods are problematic for the following reasons:

Interacting with LVM requires creating and parsing command line strings
Error handling is problematic
High level CLI functionality may not meet the needs of all consumers
CLI is complex, often leading to improper use or confusion
The command-line utilities have to be called through a system()-style interface, which is error-prone: every value must be validated, type-checked, and escaped before it is placed in the command string.
Results have to be parsed out of the command output, which requires running the tools in the C locale so that the output format is predictable.
If a command-line tool fails, the calling application only learns what happened by parsing error messages. Because the call runs with LANG=C, the message is in English and may not be useful to a user working in another language. The application may also need to understand what went wrong in order to propose a solution, and it may need to run LVM commands without user interaction (for example, during a kickstart install).
Any change in the format of informational output or error messages breaks the parsing.
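
To make the parsing burden concrete, here is a minimal Python sketch of what a consumer has to do today to list physical volumes through the CLI. The function names (build_pvs_command, parse_report, list_pvs) and the sample output are illustrative assumptions, not code taken from any of the consumers; only the pvs options themselves appear in this document.

```python
import os
import subprocess

PVS_FIELDS = ["pv_name", "vg_name", "dev_size"]
SEPARATOR = ":"


def build_pvs_command(fields=PVS_FIELDS, separator=SEPARATOR):
    """Build the argv list for a machine-readable `pvs` call."""
    return [
        "pvs", "--noheadings", "--units", "b", "--nosuffix",
        "--separator", separator, "--options", ",".join(fields),
    ]


def parse_report(text, fields=PVS_FIELDS, separator=SEPARATOR):
    """Turn report output into a list of dicts, one per non-empty line."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        values = line.split(separator)
        if len(values) != len(fields):
            # Any change in the output format surfaces here as a parse error.
            raise ValueError("unexpected pvs output: %r" % line)
        rows.append(dict(zip(fields, values)))
    return rows


def list_pvs():
    """Run pvs under the C locale and parse its output (needs LVM and root)."""
    env = dict(os.environ, LANG="C", LC_ALL="C")
    result = subprocess.run(build_pvs_command(), env=env, check=True,
                            capture_output=True, text=True)
    return parse_report(result.stdout)


# Hypothetical sample output, not captured from a real system.
SAMPLE = "  /dev/sda2:vg0:53687091200\n  /dev/sdb1::10737418240\n"
rows = parse_report(SAMPLE)
```

Every value comes back as a string, an empty vg_name has to be interpreted by the caller, and the only failure signal is a parse error or a non-zero exit status. These are the kinds of issues a real API is meant to remove.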

libLVM proposes to create a real API for use by application programs to overcome at least some of the aforementioned limitations.

libLVM features will be driven by the needs of the known primary consumers of the API, which are anaconda, system-config-storage, libvirt, and others as detailed in Appendix A. Note that some requests from existing LVM consumers made in the context of libLVM are really requests for LVM design changes or new functionality. While important to meeting the needs of libLVM consumers, many of these issues are orthogonal to libLVM and so will be listed separately, and will most likely not appear in the initial release.

One of the main drivers of libLVM is the anaconda storage rewrite, and specifically, system-config-storage. Much of the contents of the initial release of libLVM centers around supporting this effort.

Architecture / Requirements

The architecture of libLVM is currently evolving along an object-based design. A CLI architecture was considered along with an object design, with pros / cons of each approach outlined. As of 12/3/2008, it was clear that most current stakeholders prefer the object based design.

The key objectives of the architecture are:

Object based. This means handles to PV, VG, and LV objects are returned to the caller, and a get/set paradigm on the objects is used.
Thread-safe.
Fine-grained error handling, but minimize error code maintenance. Current direction is to try to use errno values, and not define libLVM-specific error codes. (This would mean APIs would return '0' for success.)
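
The get/set paradigm and the errno convention described above can be illustrated with a small Python model. This is only a sketch: the VolumeGroup class, its method names, and the validation rule are hypothetical and are not the libLVM API, which is defined in the API Proposal.

```python
import errno


class VolumeGroup:
    """Hypothetical stand-in for a VG handle; not the libLVM API."""

    def __init__(self, name, extent_size):
        self._name = name
        self._extent_size = extent_size

    def get_name(self):
        return self._name

    def get_extent_size(self):
        return self._extent_size

    def set_extent_size(self, size):
        """Return 0 on success, or a standard errno value on failure."""
        # Assumed rule for this sketch only: the size is a positive integer.
        if not isinstance(size, int) or size <= 0:
            return errno.EINVAL
        self._extent_size = size
        return 0


vg = VolumeGroup("vg0", 4 * 1024 * 1024)
rc_ok = vg.set_extent_size(8 * 1024 * 1024)   # accepted, returns 0
rc_bad = vg.set_extent_size(0)                # rejected, returns errno.EINVAL
```

Reusing errno values means callers can pass a failure code to existing facilities such as os.strerror() (or strerror() in C), and the library does not have to maintain its own table of error codes.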

The current architecture is outlined in an API Proposal.

libLVM code is being gradually integrated into the upstream LVM2 project (see liblvm for more details).

libLVM Release content

The functionality of libLVM will be divided into releases, with the first release targeting the functionality most widely used by existing consumers.

The first release of libLVM will contain the equivalent of the following CLI commands:

pvs: pv_name,vg_name,pv_size,pv_free,pv_attr,pv_fmt,pv_uuid,vg_extent_size, dev_size
vgs: vg_name,vg_attr,vg_size,vg_extent_size,vg_free_count,max_lv,max_pv,vg_uuid
lvs: lv_name,vg_name,stripes,stripesize,lv_attr,lv_uuid,devices,origin,snap_percent,seg_start,seg_size,vg_extent_size,lv_size,vg_free_count,vg_attr
"pvcreate -ff -y -f pvname"
"pvremove -ff -y pvname"
"vgcreate -v -An -s pesize vgname pvname"
"vgremove -f vgname"
"lvcreate -v -L lvsize -n lvname -An vgname"
"lvremove -f -v"
"vgchange -ay -v"
"vgchange -an -v"
"vgscan -v"
"vgmknodes -v"
"vgextend vgname pvname"
"vgreduce vgname pvname"
--config option with devices filters for various commands

In addition to the above functionality used by existing consumers, the following LVM RFEs were noted in discussions with existing LVM CLI consumers as items of interest. The first release of libLVM may or may not contain any of these items, as they are open to debate and may not be feasible in the timeframe of the first libLVM release.

Allow duplicate volumes to be activated for virtualized guest image manipulation (http://bugzilla.redhat.com/show_bug.cgi?id=207470)
Provide an interface for efficient scanning of disks for LVM metadata. LVM should take as input one or more devices to scan and not try to figure out the set of devices as it does today in its dev-cache subsystem. In this new mode of operation, with LVM's dev-cache disabled, it will be the application's responsibility to handle any errors or incomplete information that results from limiting LVM to a set of devices. The following BZs relate to the scanning problem: (5.3 bz ON_QA) http://bugzilla.redhat.com/show_bug.cgi?id=461771, http://bugzilla.redhat.com/show_bug.cgi?id=277271, http://bugzilla.redhat.com/show_bug.cgi?id=464877, http://bugzilla.redhat.com/show_bug.cgi?id=464724
Cloning volumes: http://bugzilla.redhat.com/show_bug.cgi?id=409031
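
Until a dedicated scanning interface exists, a consumer can approximate "scan only these devices" with the --config option and a devices filter, as mentioned in the first-release list. The Python sketch below builds such a configuration string. The helper names (device_filter_config, scan_command) are assumptions made for illustration, and the path check is deliberately conservative so that no regular-expression escaping is needed.

```python
import re

# Accept only plain /dev paths so nothing needs escaping in the filter regex.
_SAFE_DEVICE = re.compile(r"/dev/[A-Za-z0-9_/-]+")


def device_filter_config(devices):
    """Build a --config string that accepts only the given device paths."""
    patterns = []
    for dev in devices:
        if not _SAFE_DEVICE.fullmatch(dev):
            raise ValueError("refusing unusual device path: %r" % dev)
        patterns.append('"a|^%s$|"' % dev)   # accept exactly this device
    patterns.append('"r|.*|"')               # reject everything else
    return "devices { filter = [ %s ] }" % ", ".join(patterns)


def scan_command(devices):
    """Return an argv list for vgscan limited to the given devices."""
    return ["vgscan", "--config", device_filter_config(devices)]


config = device_filter_config(["/dev/sda1", "/dev/sdb1"])
```

This only narrows what LVM looks at. As noted above, the application is still responsible for handling errors or incomplete information that result from restricting the device set.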

The remaining, lowest-priority LVM functionality is listed below. It is used by some consumers but has been deferred to a later release of libLVM.

lvcreate --snapshot --name lvname --size lvsize origin_path
lvresize -An -L lvsize -v lvname
vgcreate --physicalextentsize pesize -c clustered vgname
vgchange -c clustered vgname
lvchange -an path; lvchange -ay path
lvreduce -f -L size lvpath
lvextend -L lvsize path
/etc/lvm/lvm.conf: get/set locking_type

Deliverables

Deliverables will be a shared object library (liblvm.so) with matching header file (lvm.h). Documentation will be included and may be in the form of header file comments, coding examples, man pages, and/or web pages.

Schedule

The schedule for the first release of liblvm has moved from Fedora 11 to Fedora 12: http://fedoraproject.org/wiki/Releases/12/Schedule

We will target as many of the aforementioned features as possible for this first release, with an emphasis on defining the LVM objects and their attributes, querying the system for LVM information, and providing the equivalent of the pvs, vgs, and lvs commands.

Milestones in the project are as follows:

Project Plan
11/18/2008 - 11/26/2008: Draft project plan
12/02/2008 - 12/02/2008: Draft functional spec due
11/26/2008 - 12/12/2008: Plan review; update project plan and functional spec
12/12/2008 - 12/16/2008: Updated plan due with final functional spec
12/16/2008 - 12/22/2008: Final plan approval; monthly status review
01/01/2009 - 08/01/2009: libLVM development
03/01/2009: Milestone - initial skeleton liblvm build
06/01/2009: Milestone - liblvm build containing vg_read/vg_release (open/close) and object attribute querying
06/07/2009: Monthly status review

Implementation tasks will each focus on a specific portion of libLVM and will be broken into increments of no more than two weeks. Tasks will begin after plan approval, tentatively starting 1/5/2009.

Implementation Tasks
12/07/2008 - 12/13/2008: Initial library build and initialization w/unit test infrastructure
12/14/2008 - 12/20/2008: configuration (/etc/lvm/lvm.conf) API
12/21/2008 - 01/03/2009: Cleanup / misc (vacation for most people)
01/04/2009 - 01/10/2009: reporting commands cleanup
01/11/2009 - 01/17/2009: vg_read review/test; udev integration discussion; init code cleanup patch (is_long_lived)
01/18/2009 - 01/24/2009: vgs initial implementation; vg_read review
01/20/2009 - 01/20/2009: Milestone; F11 Alpha
01/25/2009 - 01/31/2009: vgs, vg_read, unlock_vg, init_locking
02/01/2009 - 02/07/2009: vgs, pvs, orphan locking
02/15/2009 - 02/21/2009: Unit test existing functionality
05/01/2009 - 05/14/2009: Finish vg_read patch review and initial object attribute implementation
xx/xx/2009 - xx/xx/2009: pvcreate / pvremove
xx/xx/2009 - xx/xx/2009: lvcreate
xx/xx/2009 - xx/xx/2009: vgcreate / vgremove
xx/xx/2009 - xx/xx/2009: lvremove
xx/xx/2009 - xx/xx/2009: vgchange -ay, -an
xx/xx/2009 - xx/xx/2009: vgextend
xx/xx/2009 - xx/xx/2009: vgreduce
xx/xx/2009 - xx/xx/2009: Improve error codes / handling
08/01/2009 - 08/30/2009: Final unit testing

Responsibilities

The following people are identified as having a significant role in libLVM.

Alasdair Kergon (agk@redhat.com): libLVM design review, signoff
Thomas Woerner (twoerner@redhat.com): system-config-storage requirements, libLVM coding, unit testing, design
Petr Rockai (prockai@redhat.com): libLVM coding, unit testing, design
Dave Wysochanski (dwysocha@redhat.com): libLVM coding, unit testing, design
Dave Lehman (dlehman@redhat.com): anaconda storage project requirements, anaconda signoff
Peter Jones (pjones@redhat.com): anaconda requirements input
David Zeuthen (davidz@redhat.com): Device-kit disks requirements, signoff
Tom Coughlan (coughlan@redhat.com): libLVM planning, milestone signoff

Outstanding Issues

Performance implications of the object model with a large number (1000) of volumes, which requires many transactions (Heinz and Thomas have discussed this; it needs more discussion, with specific operations and scenarios outlined)
Translation of LVM error messages (twoerner to investigate transifex for translation of error messages)
Require 'force' parameter to API commands
Using cmd->mem dm_pool for memory allocation of handles prevents them from being individually freed.
Creating a vg handle with read permission then later needing write permission requires a new API or ability to free handles. Currently we cannot free objects unless we free the whole command structure. Alternatives seem to be fixing the memory freeing and using a repeated vg_read, vg_close sequence or providing an API that converts read access to write. NOTE: I believe Milan has fixed this with his vg_release patches - thanks Milan!
Should the API deprecate use of PVs or will they be necessary for future LVM work?
Signal handling for liblvm calls. How do we handle application signal handlers? Should we install our own signal handler in each liblvm call?

Risk Analysis / Mitigation

The key risks are:

Refactoring of existing LVM tool code. We will mitigate this risk with upstream LVM nightly tests.
Object-based locking. The object design of the API will break up CLI operations into smaller functional chunks, and locks will be tied to handles which may change the frequency and duration of locking. This may have specific risks to clustered LVM. Mitigation should include some form of clustered LVM regression tests done on a monthly basis during the key development period (either upstream nightly tests or RHTS).

Appendix A: Per consumer LVM functionality usage/needs

Device-kit-disks / udev (email discussions)

dm/LVM tools must export information about dm/LVM devices in <KEY>=<value> format, to be imported into udev database (perhaps addition to 'vol_id' or separate 'lvm_id' prog?), BZ438604 - Add env-style reporting to devmapper + dmsetup
at least basic device information in sysfs
no device nodes/links created in /dev, if udev is active
proper userspace events from the kernel if something changes

Anaconda (mostly code extraction)

"pvs --noheadings --units b --nosuffix --options pv_name,vg_name,dev_size"
"pvremove -ff -y -v pvname"
"pvcreate -ff -y -v pvname"
"pvcreate -ff -y -v node"
"vgs --noheadings --units b --nosuffix --options vg_name,vg_size,vg_extent_size,vg_free"
"vgcreate -v -An -s pesize vgname"
"vgremove -v vgname"
"vgscan -v"
"vgmknodes -v"
"vgchange -ay -v"
"vgchange -an -v"
"lvs --noheadings --units b --nosuffix --separator --options vg_name,lv_name,attr"
"lvdisplay -C --units b vg_name,lv_name,lv_size,origin"
"lvcreate -v -L lvsize -n lvname -An vgname"
"lvremove -f -v"
"lvresize -An -L lvsize -v"

system-config-storage

All functionality as listed for anaconda.
Allow duplicate volumes to be activated for virtualized guest image manipulation.

- https://bugzilla.redhat.com/show_bug.cgi?id=207470

Provide an interface for efficient scanning of disks for LVM metadata

(Needs more detail / bz)

libvirt (mostly code extraction)

"vgchange -ay", "vgchange -an"
lvs --separator , --noheadings --units b --unbuffered --nosuffix --options "lv_name,uuid,devices,seg_size,vg_extent_size" VGNAME
pvs --noheadings -o pv_name,vg_name
vgcreate
pvcreate
vgs --separator : --noheadings --units b --unbuffered --nosuffix --options "vg_size,vg_free" VGNAME
vgremove -f
pvremove
lvcreate --name LVNAME -L SIZE
lvremove -f
vgscan
Cloning volumes: https://bugzilla.redhat.com/show_bug.cgi?id=409031

conga (rhel5 code extraction)

pvs --options pv_name,vg_name,pv_size,pv_free,pv_attr,pv_fmt,pv_uuid,vg_extent_size
lvs --units b --options lv_name,vg_name,stripes,stripesize,lv_attr,lv_uuid,devices,origin,snap_percent,seg_start,seg_size,vg_extent_size,lv_size,vg_free_count,vg_attr
pvdisplay -c
vgs -o vg_name,vg_attr,vg_size,vg_extent_size,vg_free_count,max_lv,max_pv,vg_uuid
lvdisplay -c --units b
lvs -o lv_name,vg_name,origin
pvcreate -y -f path
pvremove -y -f path
vgcreate --physicalextentsize pesize -c clustered vgname
vgremove vgname
vgextend vgname pv_path
vgreduce vgname pv_path
vgchange -c clustered vgname
lvcreate --name lvname --size lvsize vgname
lvcreate --snapshot --name lvname --size lvsize origin_path
lvchange -an path; lvchange -ay path
lvremove --force path
lvreduce -f -L size lvpath
lvextend -L lvsize path
/etc/lvm/lvm.conf: get/set locking_type

