GSOC 2014/Student Application tonyo112/Afterthought
Please describe your proposal in detail. Include:
An overview of your proposal
“Afterthought” is a tool that executes a dynamically loaded set of instructions, taking a URI as input. It has great potential for simplifying tasks that need automation, and it spares users from following tiresome manuals.
The need you believe it fulfills
Installation and configuration are usually integral steps of using software. Libraries and applications may be distributed as packages or as archives of different types, and in the latter case it is common to use the magic “./configure; make; make install” combination, which usually succeeds only if all the dependencies are preinstalled. Otherwise, one has to install all the required libraries and library headers, and in most cases the full dependency list can be discovered only empirically, by repeating the procedure multiple times. Package managers (like “yum” and “apt-get”) partially solve the dependency problem, but they lack flexibility in the post-install configuration of a given package. Moreover, specific installation steps may change from version to version (this applies to both packages and tarballs), which produces headaches both for users, who have to follow a manual every time (if one even exists), and for developers, who have to provide these user-friendly instructions.
The “Afterthought” tool aims to solve these problems. The setup procedure for an application is identified by a single URI, which points to the installation and configuration instructions. “Afterthought” fetches the latest routine for the desired operation from this URI and then executes all the necessary steps to give the user a properly configured application, without any user participation.
$ afterthought https://brand-new-app.net/install
...will install, configure and launch the latest version of “brand-new-app” with all the dependencies.
This use case is far from the only one for “Afterthought”. Any job that can be represented as a sequence of instructions (as a shell script, for example) may be automated with “Afterthought”, from installing and wiring together a bundle of Rails, AngularJS and MongoDB to changing the color of the command line prompt.
Any relevant experience you have
In general, I have strong skills in systems programming, networking and security. I used Ruby and its frameworks, such as Ramaze and Ruby on Rails, when our security team participated in (and also organized) various Capture the Flag security competitions. I also have strong skills in C/C++. At the moment I am actively learning Go; I really like the language so far, and it seems like the right tool for the project.
How you intend to implement your proposal
“Afterthought” is a client-side tool, so there is a wide variety of technologies and programming languages in which it could be implemented. The majority of client-side utilities nowadays are written in C/C++, but I suggest writing “Afterthought” in the Go programming language. Extra performance is not a requirement in our case, and Go's automatic memory management will let us focus on an elegant and maintainable solution rather than on catching memory leaks. Also, I have a strong desire to popularize this wonderful open-source language, which is being developed quite rapidly and shows promising results in terms of performance, concurrency and safety.
To describe the format of the script to be executed, we could start with a simple JSON file that defines some metadata (the target version of the client, the last update time, etc.) followed by the script itself. Different use cases need to be demonstrated, so I am planning to set up a server (using the Ruby on Rails framework, or a more lightweight counterpart like Camping or Sinatra) with various usage examples.
Numerous tests need to be written at the very beginning to make sure that nothing breaks during refactoring and when new features are added. Go ships a lightweight testing framework as a built-in package, and I believe it will be sufficient for our purposes.
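As a rough illustration of the JSON idea, a job file could be decoded as sketched below. The field names (client_version, updated_at, script) and the Job and ParseJob identifiers are assumptions made for this example only; the actual format is yet to be designed.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Job is a hypothetical job layout: some metadata plus the script itself.
// The field names are placeholders, not a settled format.
type Job struct {
	ClientVersion string    `json:"client_version"` // target version of the client
	UpdatedAt     time.Time `json:"updated_at"`     // last update time (RFC 3339)
	Script        string    `json:"script"`         // the script to execute
}

// ParseJob decodes a job file and rejects one that carries no script.
func ParseJob(data []byte) (*Job, error) {
	var job Job
	if err := json.Unmarshal(data, &job); err != nil {
		return nil, fmt.Errorf("decode job: %w", err)
	}
	if job.Script == "" {
		return nil, errors.New("job has no script")
	}
	return &job, nil
}

func main() {
	raw := []byte(`{
		"client_version": "0.1",
		"updated_at": "2014-03-20T12:00:00Z",
		"script": "echo hello"
	}`)
	job, err := ParseJob(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("client version:", job.ClientVersion)
	fmt.Println("script:", job.Script)
}
```

Keeping the metadata next to the script lets the client refuse a job that targets a newer client version before running anything.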
Challenges and possible solutions
“Afterthought” relies on a URI provided by the user, which raises several security concerns. First of all, malicious code can be disguised behind shortened URLs (like http://goo.gl/sPgwF0). A possible solution is to disable redirects by default. Another option is to follow the chain of redirects to the final destination and then ask the user whether they really want to execute the script from that location.
Another problem is URLs that look genuine at first glance (http://github.com.co/user/project). One solution is a whitelist of host names and URLs (using wildcards like *.github.com or host.com/projects/*). Another approach uses public-key certificates and chains of trust, similar to what web browsers use: the script would be signed, and the client would first check the signature's validity by tracing the certificate chain up to one of the root CAs.
Using HTTPS by default protects against man-in-the-middle attacks and response tampering, which could otherwise lead to malicious code execution even for genuine URLs.
The script to be executed may contain instructions that are specific to a given environment. Also, applications used by a script may have different semantics (such as different meanings of command line arguments) on different operating systems. To get around this problem, we could require the client to send all the relevant information (operating system, kernel information, CPU architecture, etc.), so that the server can reply with the corresponding script.
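The first option, refusing redirects by default, could look roughly like this in Go. This is a minimal sketch: NoRedirectClient, IsRedirect and Probe are hypothetical names, and the local test server only stands in for a URL shortener so the example runs without touching the network.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// NoRedirectClient returns an HTTP client that never follows redirects.
// Returning http.ErrUseLastResponse makes the client hand back the 3xx
// response itself instead of requesting the redirect target.
func NoRedirectClient() *http.Client {
	return &http.Client{
		Timeout: 10 * time.Second,
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// IsRedirect reports whether an HTTP status code is in the 3xx range.
func IsRedirect(status int) bool { return status >= 300 && status < 400 }

// Probe requests rawURL without following redirects and reports where the
// server tried to send us, if anywhere.
func Probe(rawURL string) (target string, redirected bool, err error) {
	resp, err := NoRedirectClient().Get(rawURL)
	if err != nil {
		return "", false, err
	}
	defer resp.Body.Close()
	if IsRedirect(resp.StatusCode) {
		return resp.Header.Get("Location"), true, nil
	}
	return "", false, nil
}

func main() {
	// Local stand-in for a URL shortener (an assumption of this sketch).
	mux := http.NewServeMux()
	mux.HandleFunc("/short", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/install", http.StatusFound)
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	target, redirected, err := Probe(srv.URL + "/short")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if redirected {
		fmt.Println("refusing to follow redirect to", target)
	}
}
```

The second option could reuse Probe in a loop, collecting each Location value and prompting the user once the chain ends.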
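The whitelist idea could be sketched as follows. The pattern syntax here is an assumption of this example: a pattern is a host, optionally with a leading "*." wildcard, optionally followed by a path that may end in "/*". The function names are hypothetical as well.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// matchHost accepts an exact host, or any subdomain for a "*.example.com"
// pattern. The bare domain itself is not matched by the wildcard form.
func matchHost(pattern, host string) bool {
	if strings.HasPrefix(pattern, "*.") {
		return strings.HasSuffix(host, pattern[1:]) // keeps the leading dot
	}
	return pattern == host
}

// matchPath accepts any path for an empty pattern, a prefix for a pattern
// ending in "/*", and an exact path otherwise.
func matchPath(pattern, p string) bool {
	switch {
	case pattern == "":
		return true
	case strings.HasSuffix(pattern, "/*"):
		return strings.HasPrefix(p, strings.TrimSuffix(pattern, "*"))
	default:
		return pattern == p
	}
}

// Allowed reports whether rawURL matches at least one whitelist pattern.
func Allowed(rawURL string, patterns []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil || u.Hostname() == "" {
		return false
	}
	host := strings.ToLower(u.Hostname())
	cleaned := path.Clean("/" + u.Path) // resolves "../" before matching
	for _, pat := range patterns {
		hostPat, pathPat := pat, ""
		if i := strings.Index(pat, "/"); i >= 0 {
			hostPat, pathPat = pat[:i], pat[i:]
		}
		if matchHost(strings.ToLower(hostPat), host) && matchPath(pathPat, cleaned) {
			return true
		}
	}
	return false
}

func main() {
	whitelist := []string{"*.github.com", "host.com/projects/*"}
	for _, u := range []string{
		"https://gist.github.com/user/project",
		"http://github.com.co/user/project",
		"https://host.com/projects/demo",
	} {
		fmt.Println(u, "allowed:", Allowed(u, whitelist))
	}
}
```

Because the match is anchored on the end of the host name, a look-alike such as github.com.co does not pass the *.github.com pattern.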
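One way to make HTTPS the default is to reject plain HTTP unless the user explicitly opts in. The sketch below assumes a hypothetical allowInsecure switch (for example, set by a command line flag); neither the function name nor the switch is part of any existing design.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// RequireHTTPS parses rawURL and accepts it only if it uses HTTPS.
// Plain HTTP is accepted only when allowInsecure is set explicitly.
func RequireHTTPS(rawURL string, allowInsecure bool) (*url.URL, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	switch u.Scheme {
	case "https":
		return u, nil
	case "http":
		if allowInsecure {
			return u, nil
		}
		return nil, errors.New("plain HTTP refused: response could be tampered with in transit")
	default:
		return nil, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	for _, raw := range []string{
		"https://brand-new-app.net/install",
		"http://brand-new-app.net/install",
	} {
		if _, err := RequireHTTPS(raw, false); err != nil {
			fmt.Println(raw, "->", err)
			continue
		}
		fmt.Println(raw, "-> ok")
	}
}
```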
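A minimal sketch of how the client could report its platform is shown below. The query parameter names (os, arch) and the Platform and JobURL identifiers are assumptions of this example; kernel information is left out because gathering it is platform-specific.

```go
package main

import (
	"fmt"
	"net/url"
	"runtime"
)

// Platform holds the environment details the client reports to the server.
type Platform struct {
	OS   string // e.g. "linux", "darwin", "windows"
	Arch string // e.g. "amd64", "arm"
}

// CurrentPlatform describes the machine the client is running on, using
// the values the Go runtime was built for.
func CurrentPlatform() Platform {
	return Platform{OS: runtime.GOOS, Arch: runtime.GOARCH}
}

// JobURL appends the platform details to the job URI as query parameters,
// so the server can pick the matching script variant.
func JobURL(base string, p Platform) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("os", p.OS)
	q.Set("arch", p.Arch)
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}

func main() {
	full, err := JobURL("https://brand-new-app.net/install", CurrentPlatform())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(full)
}
```

Sending the details as request headers instead would keep the URI unchanged; query parameters are used here only because they are easy to inspect.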
A rough timeline for your progress
- May 19 – May 31: [End of the semester at university] document use cases, set up the use-case server, think about the architecture
- June 1 – June 13: first tests and prototype
- June 14 – June 22: come up with a definite job (script) format, implement the client-server interaction
- June 23 – June 30: [Exams at university]
- July 1 – July 13: work through security issues
- July 14 – July 31: work through platform specificity issues
- August 1 – August 10: finalize the core implementation, make a package
- August 11 – August 18: [Start of an internship] polish code, tests, documentation
Any other details you feel we should consider
My internship starts at the beginning of August. I hope the workload will not be too high at first, and I believe I can postpone the start to a later date, such as August 11.
Have you communicated with a potential mentor? If so, who?
I discussed the idea and possible challenges/solutions with Soumya Deb.