From Fedora Project Wiki

Revision as of 17:09, 4 September 2013

From the project site: "Hive is a data warehouse system for Hadoop that facilitates easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop compatible file systems. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL."

The Fedora Big Data SIG is investigating what is required to package the latest version of Hive for Fedora, now that Hadoop 2.x has been packaged. Hive depends heavily on Hadoop, but the project itself is not Maven-based; it is built with Ant and Ivy. As a result, the xmvn tooling described in Packaging:Java does not directly apply to the Hive build. In many ways this is a simplification rather than a challenge, since a local file-system Ivy resolver is relatively easy to configure.
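
As a rough illustration of what such a resolver might look like, the following Python sketch generates a minimal Ivy settings file with a single file-system resolver. The resolver name `fedora-local`, the jar directory `/usr/share/java`, and the unversioned `[artifact].[ext]` pattern are assumptions made for the example, not settings taken from the Hive build. How the generated file would be wired into Hive's Ant build is left open.

```python
import xml.etree.ElementTree as ET


def build_ivysettings(jar_dir="/usr/share/java", resolver_name="fedora-local"):
    """Return the text of a minimal ivysettings.xml using one filesystem resolver.

    jar_dir and resolver_name are illustrative defaults (assumptions),
    not values used by the Hive project.
    """
    root = ET.Element("ivysettings")
    # Make the local resolver the default for every module lookup.
    ET.SubElement(root, "settings", defaultResolver=resolver_name)
    resolvers = ET.SubElement(root, "resolvers")
    filesystem = ET.SubElement(resolvers, "filesystem", name=resolver_name)
    # Assumed layout: unversioned jars such as <jar_dir>/avro.jar.
    ET.SubElement(filesystem, "artifact", pattern=f"{jar_dir}/[artifact].[ext]")
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_ivysettings())
```

Because the pattern omits `[revision]`, a resolver configured this way would accept whatever version is installed, which is one reason version compatibility has to be checked separately.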

Static analysis derived from the build (Ant provides nothing comparable to the Maven dependency plugin) shows a group of dependencies that are currently missing from Fedora and that block building Hive against only Fedora-installed versions. Many other dependencies are available but are not necessarily version-compatible. However, as in the Changes/Hadoop outline, those mismatches can hopefully be mitigated in the Hive source where possible.

avro-1.7.1.jar avro-ipc-1.7.1.jar avro-mapred-1.7.1.jar

commons-httpclient-3.0.1.jar (Jakarta???) commons-httpclient-3.1.jar

datanucleus-connectionpool-2.0.3.jar datanucleus-core-2.0.3.jar datanucleus-enhancer-2.0.3.jar datanucleus-rdbms-2.0.3.jar

ftplet-api-1.0.0.jar

ftpserver-core-1.0.0.jar ftpserver-deprecated-1.0.0-M2.jar

geronimo-j2ee-management_1.1_spec-1.0.1.jar

high-scale-lib-1.1.1.jar

httpclient-4.1.3.jar httpcore-4.1.3.jar

javolution-5.5.1.jar

kahadb-5.5.0.jar

libfb303-0.9.0.jar* libthrift-0.8.0.jar* libthrift-0.9.0.jar*

metrics-core-2.1.2.jar

pig-0.10.1.jar (dep on pig?)

stax-api-1.0-2.jar

tempus-fugit-1.1.jar
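
A check of this kind can be partly automated. The Python sketch below splits jar file names like those listed above into artifact and version, then reports which artifacts have no matching jar under a given directory. The helper names `split_jar_name` and `missing_artifacts` are hypothetical, and the default directory `/usr/share/java` and the matching rule (either `artifact.jar` or `artifact-version.jar` anywhere below it) are assumptions. It does not attempt to judge version compatibility.

```python
import re
from pathlib import Path

# The artifact is everything before the first "-<digit>"; the rest is the version.
JAR_RE = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def split_jar_name(name):
    """Split 'avro-ipc-1.7.1.jar' into ('avro-ipc', '1.7.1')."""
    match = JAR_RE.match(name.rstrip("*"))  # tolerate the trailing '*' markers
    if match is None:
        raise ValueError(f"unrecognised jar name: {name!r}")
    return match.group("artifact"), match.group("version")


def missing_artifacts(jar_names, java_dir="/usr/share/java"):
    """Return the jar names with no candidate jar below java_dir.

    Assumption: an installed jar is named either <artifact>.jar or
    <artifact>-<version>.jar somewhere under java_dir.
    """
    root = Path(java_dir)
    installed = {path.name for path in root.rglob("*.jar")}
    missing = []
    for name in jar_names:
        artifact, version = split_jar_name(name)
        candidates = {f"{artifact}.jar", f"{artifact}-{version}.jar"}
        if not candidates & installed:
            missing.append(name)
    return missing


if __name__ == "__main__":
    wanted = ["avro-1.7.1.jar", "javolution-5.5.1.jar", "libthrift-0.9.0.jar*"]
    for jar in missing_artifacts(wanted):
        print("missing:", jar)
```

A name match only shows that some version is installed; whether that version is usable by Hive still has to be verified by building against it.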