Fedora OpenGrok instance
OpenGrok is a source code search and cross-reference engine written in Java, built on Exuberant Ctags for source code analysis and Apache Lucene for indexing. It has CLI, GUI and web interfaces. It can extract and index information from many file types, including ELF binaries and most programming languages, such as Shell, Python, Perl, C, C++ and Java.
OpenGrok supports most common version control systems, including Subversion, Mercurial, RCS and CVS, and lets the user browse multiple revisions of a file.
Benefit to Fedora
Developers and maintainers will be able to find inspiration and educate themselves by inspecting existing source code for clever programming constructs.
When a certain programming construct is found to be problematic, its occurrences throughout the Fedora code can be spotted easily. This is especially useful for Security Response when dealing with code that is reused verbatim in multiple packages (think
This is the total size of all OpenGrok data. "data" is the index and cross-referenced source, and "src" is the packages. Note that what is currently indexed is the resulting built packages, not source packages. Apart from binaries, this also includes the -debuginfo packages with the source code.
$ du -sh --apparent-size *
91G data
52G src
$
The actual space requirement may be around 25% higher, since this figure does not count the unused space in the last block of each file. Given the number of files, this can waste a considerable amount of space. With 4096-byte blocks, the files occupied more space than their actual sizes even after compression. A block size of 1024 bytes seems optimal.
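To illustrate why block size matters with many small files, here is a minimal Python sketch that compares apparent size with the space taken when every file is rounded up to whole blocks. The file sizes in sample_sizes are hypothetical values chosen for illustration, not measurements from this instance, and the model ignores filesystem metadata, compression and any tail-packing a filesystem might do.

```python
def allocated_size(file_size, block_size):
    """Bytes occupied when a file is stored in whole blocks (rounds up)."""
    blocks = -(-file_size // block_size)  # ceiling division on integers
    return blocks * block_size


def slack_overhead(file_sizes, block_size):
    """Extra space lost to partially filled last blocks, as a fraction
    of the apparent (summed) file size."""
    apparent = sum(file_sizes)
    allocated = sum(allocated_size(size, block_size) for size in file_sizes)
    return (allocated - apparent) / apparent


# Hypothetical file sizes in bytes (assumption, for illustration only).
sample_sizes = [300, 1500, 2500, 5000, 12000]

for block_size in (1024, 4096):
    overhead = slack_overhead(sample_sizes, block_size)
    print(f"{block_size}-byte blocks: {overhead:.1%} overhead")
```

The smaller the typical file is relative to the block size, the larger the share of each last block that goes unused, which is why a tree made up mostly of small source files benefits from smaller blocks.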
For the curious, after compression the actual disk usage looks like this:
/dev/mapper/norkiavg-opengrok on /.compressed/opengrok type ext3 (rw,noatime,acl)
fuse on /var/lib/opengrok type fuse (rw,nosuid,nodev,allow_other,default_permissions)
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/norkiavg-opengrok   79G   53G   23G  71% /.compressed/opengrok
Interoperation with Fedora infrastructure
OpenGrok could index make prep'ed CVS working directories. In this case the full source code would be indexed, as well as all the bits taken care of by the maintainer (SPEC file and patches). History from CVS would be kept for these.
PkgDB and FAS
If CVS were used as described above, maintainer names in the CVS file history view could be configured to link to an arbitrary place, such as PkgDB or FAS.
Alternatively (with regard to CVS preps), buildroots from Koji could be used as a source for files to index.
I (LubomirKundrak) set up a testing instance at  . Please note that it runs on low-end hardware with some compromises made, such as compressing the storage, which hurts performance considerably. That said, please do not abuse it, and expect replies to take 5-10 seconds for single-keyword searches that have not been run before.
The software used is from the repository at  . As the Java Packaging Guidelines were approved this Thursday, expect it to hit Fedora soon.