Archive:PackagingDrafts/OCaml

= OCaml Packaging Guidelines =

This document describes the conventions and customs for packaging OCaml modules in Fedora. It is not intended to cover every situation, but to codify the practices that have served the Fedora OCaml community well.

= Naming =

The base OCaml compiler is called ocaml.

OCaml modules, libraries and syntax extensions should be named ocaml-foo. Examples include: ocaml-extlib, ocaml-ssl.

This naming does not apply to applications written in OCaml, which can be given their normal name. Examples include: mldonkey, virt-top, cduce.

Rationale: this is how they are named in other distros (Debian, PLD) and this is consistent with perl / php / python naming.

= Packaging libraries =

An example specfile for an imaginary OCaml library called foolib.

== Main package ==
In order to allow OCaml scripts and the toplevel to use a library, the main package should contain only files matching:
 * *.cma (contains the bytecode)
 * *.cmi (contains the compiled signature)
 * *.so (if present, contains OCaml <-> C stubs)
 * META (the findlib description)
 * *.so.owner (if present, used by findlib)
 * a license file (if present) marked %doc


Files matching *.cmo are not normally included. There are two exceptions where *.cmo files may be included:
 * If the file is needed for linking (like gtkInit.cmo in lablgtk or std_exit.cmo in OCaml itself), then it must be included so that the library can be linked properly.
 * If the *.cmo file is a camlp4 preprocessor (like Camlp4OCamlPrinter.cmo in OCaml), then it must be included, because otherwise the syntax extension would not be available.
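
As a rough illustration of the file list above, a %files section for the main package might look like the sketch below. The package name foolib, the LICENSE file name, and the install locations under %{_libdir}/ocaml are assumptions made for the example, not requirements of this guideline; adjust them to match what upstream actually installs.

```spec
# Main package of the hypothetical ocaml-foolib library.
# Assumed layout: library files in _libdir/ocaml/foolib and
# C stubs in _libdir/ocaml/stublibs.
%files
%defattr(-,root,root,-)
%doc LICENSE
%dir %{_libdir}/ocaml/foolib
%{_libdir}/ocaml/foolib/META
%{_libdir}/ocaml/foolib/*.cma
%{_libdir}/ocaml/foolib/*.cmi
%{_libdir}/ocaml/stublibs/*.so
%{_libdir}/ocaml/stublibs/*.so.owner
```

The two stublibs lines apply only when the library ships C stubs; a pure OCaml library would omit them.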

If the package contains *.so files, then they should have rpaths removed, as per Fedora packaging guidelines.
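
One common way to remove rpaths is the chrpath tool, run at the end of %install. The fragment below is a sketch only: it assumes the stub libraries are installed in %{_libdir}/ocaml/stublibs and that chrpath is pulled in at build time.

```spec
# chrpath is needed at build time to edit the stub libraries.
BuildRequires: chrpath

%install
# ... upstream install commands go here ...

# Delete any rpath recorded in the C stub libraries.
# The stublibs path is an assumption; check where upstream installs them.
chrpath --delete $RPM_BUILD_ROOT%{_libdir}/ocaml/stublibs/*.so
```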

The packager should check the META file (see the Findlib users guide, "Writing META files": http://www.ocaml-programming.de/packages/documentation/findlib/guide-html/x131.html). If there is no META file, then the packager should create one, include it in the package, and send it to the upstream maintainer.

Rationale: OCaml does not support dynamic linking of binaries. Even if it did, the current module hash system for expressing strict typing requirements means that almost any conceivable change to a library would require the binary to be recompiled. OCaml scripts are the closest we come to dynamic linking, inasmuch as they do not usually depend on a specific version of a library (although this only works because the scripts are recompiled each time they run).

== -devel subpackage ==
The -devel subpackage of a library should contain all other files required to allow development with the library. Normally these would be:


 * *.a (contains the compiled machine code)
 * *.cmxa (describes the compiled machine code)
 * *.cmx (if present, allows cross-module optimizations)
 * *.mli (contains the signature of the library)


Files matching *.o are not normally included. There is one exception: if the file is needed for linking (like gtkInit.cmx and gtkInit.o in lablgtk, or std_exit.cmx and std_exit.o in OCaml itself), then it should be included.

Files matching *.ml are not normally included. The exception is a file that describes a module signature for which there is no corresponding .mli file; such a .ml file should be included. (Note that Debian is more permissive and often distributes *.ml files, allowing the programmer to peek at the implementation of a module.)

Documentation, examples and other articles which are useful to the developer may be included in the -devel sub-package. The license file (which is in the main package) does not need to be included again in the -devel subpackage.

If the -devel subpackage would only contain documentation files, then the packager may at their discretion place the documentation files in the main package and not have a -devel subpackage at all.

The -devel subpackage should require the exact name-version-release of the main package (as per Fedora policy). It should also require any C libraries required for development, and sometimes this means an explicit 'Requires' is needed. For example, ocaml-pcre-devel needs an explicit 'Requires: pcre-devel' to make it usable for development.
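
The -devel rules above could be written along the following lines. This is a sketch under stated assumptions: foolib is the imaginary library, libfoo-devel is a hypothetical C development package standing in for a real dependency such as pcre-devel, and the README file name and install paths are illustrative only.

```spec
%package devel
Summary: Development files for %{name}
Group: Development/Libraries
# Tie the -devel subpackage to the exact build of the main package.
Requires: %{name} = %{version}-%{release}
# Hypothetical C dependency needed to develop against the stubs.
Requires: libfoo-devel

%description devel
The %{name}-devel package contains libraries and signature files for
developing applications that use %{name}.

%files devel
%defattr(-,root,root,-)
%doc README
%{_libdir}/ocaml/foolib/*.a
%{_libdir}/ocaml/foolib/*.cmxa
%{_libdir}/ocaml/foolib/*.cmx
%{_libdir}/ocaml/foolib/*.mli
```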

Rationale for inclusion of all cmx files: [*.cmx files] are needed even for modules included in .cmxa libraries in order to enable cross-module optimizations (inlining, constant propagation and direct function calls). The .o files are not needed. [From a private email from Alain Frisch]

== -doc subpackage ==
If the documentation files are very large they may be placed in a separate -doc subpackage, as per normal Fedora guidelines.

== -data subpackage ==
If the package contains excessively large data files, they may be placed in a separate -data subpackage, as per normal Fedora guidelines.

== Requires and provides ==
For each module that library A uses from another library B, library A must have a Requires of the form ocaml(Modulename) = MD5hash. Similarly, for each module that library A may provide to other libraries, library A must have a Provides of the same form.

A library must depend on the precise version of the OCaml compiler, for example: ocaml(runtime) = 3.10.0

There are two scripts in the base ocaml package which automatically calculate the right Requires and Provides for a library. To use them, just add the following to the spec file:

%define _use_internal_dependency_generator 0
%define __find_requires /usr/lib/rpm/ocaml-find-requires.sh
%define __find_provides /usr/lib/rpm/ocaml-find-provides.sh

Rationale: OCaml does not offer binary compatibility between releases of the compiler (even between bugfix releases). Furthermore, the module system uses a hash over the interface and some internals of a module, which means that a library or program must be linked against the identical modules it was compiled with. The Requires and Provides lines express the module name and hash so that RPM enforces the same requirements as the OCaml linker itself. Please see the further reading at the end of this page for more details.

= Packaging binaries =

The rules for packaging OCaml binaries are not significantly different from packaging ordinary programs (see Packaging/Guidelines).

However, if the OCaml package also contains a library, then you should follow the rules above for packaging libraries as well.

== Stripping binaries ==
Binaries should be stripped, as per ordinary Fedora packaging guidelines.

There is one exception where a binary should not be stripped. If the package was compiled with ocamlc -custom, then the binary contains bytecode which strip will remove, rendering the binary inoperable. It is easy to test for this: if, after stripping, any attempt to run the binary results in the message "No bytecode file specified", then the binary was compiled this way and should not be stripped.

Rationale: http://bugs.debian.org/256900
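
If a package does ship such a binary, one way to stop rpmbuild from stripping it is to override the strip macro near the top of the spec file. This is a sketch of a commonly used idiom rather than an official recipe. It disables stripping for every binary in the package, so it is only appropriate when that is acceptable.

```spec
# Binary is built with ocamlc -custom: stripping would remove the bytecode.
# Replace strip with a no-op and skip the debuginfo subpackage,
# whose generation would otherwise strip the binary as well.
%define __strip /bin/true
%define debug_package %{nil}
```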

== Providing best possible binaries ==
The packager should attempt to ship native code compiled binaries in preference to bytecode compiled binaries, where this is possible.

= Bytecode-only architectures =

The OCaml native code compiler (ocamlopt) contains code generators for popular architectures, but not for every architecture that Fedora might support. On such architectures, the spec file should still build bytecode libraries and binaries.

To test for presence of the native compiler, do:

%define opt %(test -x %{_bindir}/ocamlopt && echo 1 || echo 0)

then define conditional sections in %build, %install and %files if necessary. For example:

%build
make byte
%if %opt
make opt
%endif

To test that your spec file will work on such an architecture, temporarily remove or rename /usr/bin/ocamlopt and /usr/bin/ocamlopt.opt while building.

Rationale: Debian packaging policy section 2.3 does the same thing.
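
The same conditional can guard the file list. The sketch below shows a -devel file list for the imaginary foolib library in which the native-code files are listed only when ocamlopt is present; the install paths are assumptions for illustration.

```spec
%define opt %(test -x %{_bindir}/ocamlopt && echo 1 || echo 0)

%files devel
%defattr(-,root,root,-)
# Signatures are available on every architecture.
%{_libdir}/ocaml/foolib/*.mli
%if %opt
# These files exist only when ocamlopt built the native library.
%{_libdir}/ocaml/foolib/*.a
%{_libdir}/ocaml/foolib/*.cmxa
%{_libdir}/ocaml/foolib/*.cmx
%endif
```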

= Unnecessary files =

The following files should not normally be distributed:


 * *.cmo object files. Exception: see above.
 * *.o for corresponding *.cmx. Exception: see above.
 * *.ml sources. Exception: see above.

= Security issues in OCaml libraries =

If a security issue arises in an OCaml library, then all libraries and binaries which depend on it must be recompiled.

OCaml scripts do not need to be changed (unless resolving the security issue requires changing the public interface to the library and the script is broken by the change). This is because OCaml scripts are recompiled each time they run.

= Further reading =


 * http://pkg-ocaml-maint.alioth.debian.org/ocaml_packaging_policy.txt - Debian packaging policy document.
 * http://docs.pld-linux.org/ocaml.html
 * http://lists.debian.org/debian-ocaml-maint/2005/01/threads.html#00042 - Thread on ABI compatibility of different versions of OCaml.
 * https://www.redhat.com/archives/fedora-devel-list/2007-May/msg01234.html - Explains lack of dynamic linking in upstream.
 * https://www.redhat.com/archives/fedora-devel-list/2007-May/msg01280.html - Proposal to include MD5 sums in RPM deps.
 * https://bugzilla.redhat.com/show_bug.cgi?id=433783 - Common rpmlint errors and warnings in OCaml packages.
