
GSoC 2016 Project Ideas

kain88-de edited this page Mar 2, 2016 · 35 revisions
Google Summer of Code 2016

A list of project ideas for Google Summer of Code 2016, each with a short description. Please note that not every project we suggest here will take you all summer. Some will be short (e.g., adding the new TNG trajectory format).

The project ideas can be roughly categorized as

  1. New analysis functionality
  2. Increasing performance
  3. New input formats
  4. Increase platform availability
  5. Increase ease-of-use
  6. Improve the library core

Or work on your own idea! Get in contact with us to propose an idea, and we will work with you to flesh it out into a full project. Raise an issue in the Issue Tracker or contact us via the developer Google group.


New analysis functionality

Implement a general dimension reduction algorithm

Difficulty: Hard

Mentors: Max, Richard, Manuel

MDAnalysis already comes with a range of standard analysis tools but currently lacks a general dimension reduction algorithm that can select an arbitrary number of dimensions of interest. Three common general techniques are

There are Python implementations of all of these algorithms, but none of them currently works with MDAnalysis out of the box. The existing implementations operate on normal numpy arrays that hold a complete trajectory in memory, whereas MDAnalysis never loads the whole trajectory and instead reads only one frame at a time. This approach allows MDAnalysis to handle very large systems on a normal laptop or workstation.
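To illustrate what a frame-at-a-time algorithm can look like, here is a minimal sketch of principal component analysis that accumulates the covariance matrix incrementally (a Welford-style update) instead of holding the whole trajectory in memory. PCA is used only as an example technique; the function name `streaming_pca` and its arguments are hypothetical and not part of MDAnalysis.

```python
import numpy as np


def streaming_pca(frames, n_components):
    """PCA over an iterable of (n_atoms, 3) coordinate frames.

    Only one frame is held at a time; the mean and the scatter matrix
    are updated incrementally. Hypothetical helper, not MDAnalysis API.
    """
    n_frames = 0
    mean = None
    scatter = None  # running sum of outer products of deviations
    for frame in frames:
        # Flatten to a 3N vector; the dtype conversion also copies the data.
        x = np.array(frame, dtype=np.float64).ravel()
        if mean is None:
            mean = np.zeros_like(x)
            scatter = np.zeros((x.size, x.size))
        n_frames += 1
        delta = x - mean            # deviation from the old mean
        mean += delta / n_frames    # update the running mean
        scatter += np.outer(delta, x - mean)  # Welford-style update
    cov = scatter / (n_frames - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order], mean
```

With MDAnalysis, the `frames` iterable could be produced by looping over a trajectory and reading the positions of an atom selection at each step. Note that the covariance matrix itself still grows as (3N)², so a real implementation would have to restrict the atom selection or use a different decomposition.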

Of course, you can also suggest another dimension reduction algorithm that you would like to implement.

Increasing performance

Improve distance search

Difficulty: Hard

Mentors: Max, Richard, Manuel

Work with domain-decomposition algorithms to improve our distance search algorithms (cell grids) and/or implement distance search on GPUs using CUDA/OpenCL.
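As a rough illustration of the cell-grid idea, the sketch below bins coordinates into cubic cells with an edge length equal to the cutoff, so that each atom only needs to be compared with atoms in its own and the 26 neighbouring cells. It is pure Python/numpy, ignores periodic boundary conditions, and the function name `cell_list_pairs` is hypothetical; it does not reflect how MDAnalysis implements its distance search.

```python
from collections import defaultdict
from itertools import product

import numpy as np


def cell_list_pairs(coords, cutoff):
    """Return sorted (i, j) pairs with i < j and distance <= cutoff.

    Sketch of a cell-grid search without periodic boundaries.
    """
    coords = np.asarray(coords, dtype=np.float64)
    # Assign every point to a cubic cell of edge length `cutoff`.
    cell_idx = np.floor((coords - coords.min(axis=0)) / cutoff).astype(int)
    cells = defaultdict(list)
    for i, c in enumerate(map(tuple, cell_idx.tolist())):
        cells[c].append(i)

    offsets = list(product((-1, 0, 1), repeat=3))
    pairs = []
    for c, members in cells.items():
        for off in offsets:
            neighbour = (c[0] + off[0], c[1] + off[1], c[2] + off[2])
            if neighbour not in cells:
                continue
            for i in members:
                for j in cells[neighbour]:
                    # i < j makes sure every pair is reported exactly once.
                    if i < j and np.linalg.norm(coords[i] - coords[j]) <= cutoff:
                        pairs.append((i, j))
    return sorted(pairs)
```

For roughly uniform densities the number of distance evaluations grows linearly with the number of atoms rather than quadratically; a project implementation would additionally need periodic boundaries and a compiled or GPU inner loop.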

New input formats

Add new MD-Formats

Difficulty: Medium

Mentors: Max, Richard, Manuel

One of the strengths of MDAnalysis is its ability to support a wide range of different MD formats. However, we are still missing some, such as the new TNG file format from Gromacs or H5MD. Alternatively, you can also add a format that you personally want to use in MDAnalysis. This project will familiarize you with working with and connecting different APIs, and it will give you insight into how modern portable data storage file formats work.

Random Walk Trajectory Backend

Difficulty: Hard

Mentors: Max, Richard, Manuel

To check whether a new analysis method works as intended, it is often a good idea to apply it to a random walk in different simple energy landscapes (flat, harmonic well, double well). In this project you would develop a 'Reader' that produces random trajectories.

Langevin dynamics in an energy landscape is close to the conformational dynamics of proteins, see [1]. As a first step you could implement an integrator for Langevin dynamics and later have the trajectory 'Reader' use the integrator to generate the trajectory dynamically.

[1] Robert Zwanzig. Nonequilibrium statistical mechanics. Oxford University Press, 2001
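A minimal sketch of the idea, assuming the simpler overdamped limit of Langevin dynamics (Brownian dynamics) integrated with the Euler-Maruyama scheme, could look as follows. The generator yields one frame at a time, similar in spirit to how a trajectory reader hands out frames. All names (`brownian_frames`, `harmonic_force`, `double_well_force`) and the reduced units are assumptions made for illustration; a full project would likely use an integrator that includes inertia.

```python
import numpy as np


def harmonic_force(x, k=1.0):
    # U(x) = 0.5 * k * x**2, so F = -k * x
    return -k * x


def double_well_force(x, a=1.0):
    # U(x) = a * (x**2 - 1)**2, so F = -4 * a * x * (x**2 - 1)
    return -4.0 * a * x * (x**2 - 1.0)


def brownian_frames(force, x0, n_frames, dt=1e-3, kT=1.0, friction=1.0, seed=None):
    """Yield `n_frames` coordinate arrays from overdamped Langevin dynamics.

    Euler-Maruyama step: x += F(x) * dt / friction + sqrt(2 kT dt / friction) * xi
    with xi drawn from a standard normal distribution.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=np.float64)
    noise_scale = np.sqrt(2.0 * kT * dt / friction)
    for _ in range(n_frames):
        x = x + force(x) * dt / friction + noise_scale * rng.standard_normal(x.shape)
        yield x.copy()
```

For example, `for frame in brownian_frames(double_well_force, np.zeros((10, 3)), 1000, seed=42)` produces 1000 frames of 10 independent particles, each moving in a double well along every coordinate. Because the frames are generated lazily, a trajectory of any length can be analysed without storing it.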

Increase platform availability

Help port MDAnalysis to Python 3

Difficulty: Easy

Mentors: Max, Richard, Manuel

Python 3 is being adopted by a growing number of users, and Unix distributions are starting to switch to it. MDAnalysis currently cannot run under Python 3, mostly because of its C/Cython extensions. We are moving our C extensions to Cython, which supports Python 2 and 3 from a single source. See also #260.

Still missing here are the DCD trajectory readers. There is incomplete work on making the DCD reader compatible with both Python 2 and 3. In this project you would complete it, either by finishing the existing work or by rewriting the DCD interface in Cython.

The second part of this project is to remove all other incompatibilities with Python 3 that we currently have. The goal is for our test suite to pass on Python 3.

Increase ease-of-use

Create a command line interface for MDAnalysis tasks

Difficulty: Medium

Mentors: Max, Richard, Manuel

Currently MDAnalysis exists only as a framework; however, making common tasks available via the command line would make certain workflows easier. As an example, the conversion of trajectories between formats could take the form:

mda convert --topology adk.psf -i adk_dims.dcd -o adk_dims.xtc

This project would involve creating a template for these command line utilities to follow and implementing a foolproof user interface for navigating them using a popular command line parsing library.
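One possible shape for such a template, sketched here with the standard library's `argparse` and its sub-commands, is shown below. The option names follow the `mda convert` example above; the function names (`build_parser`, `run_convert`, `main`) are hypothetical, and the body of `run_convert` assumes a recent MDAnalysis release in which `Universe`, `Writer(filename, n_atoms=...)` and `writer.write(atomgroup)` are available.

```python
import argparse


def run_convert(args):
    """Convert a trajectory; MDAnalysis is imported lazily (assumed API)."""
    import MDAnalysis as mda

    u = mda.Universe(args.topology, args.input)
    with mda.Writer(args.output, n_atoms=u.atoms.n_atoms) as writer:
        for ts in u.trajectory:
            writer.write(u.atoms)


def build_parser():
    parser = argparse.ArgumentParser(
        prog="mda", description="Command line front end for MDAnalysis tasks"
    )
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True  # fail with a usage message if no task is given

    convert = subparsers.add_parser(
        "convert", help="convert a trajectory between formats"
    )
    convert.add_argument("--topology", required=True, help="topology file")
    convert.add_argument("-i", "--input", required=True, help="input trajectory")
    convert.add_argument("-o", "--output", required=True, help="output trajectory")
    # Each sub-command registers the function that implements it.
    convert.set_defaults(func=run_convert)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    args.func(args)
```

Calling `main(["convert", "--topology", "adk.psf", "-i", "adk_dims.dcd", "-o", "adk_dims.xtc"])` would dispatch to `run_convert`. New tasks follow the same pattern: add a sub-parser, declare its arguments, and register a handler with `set_defaults`.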

Improve the library core

Switch from pure ndarrays to unit-aware ndarrays

Difficulty: Hard

Mentors: Max, Richard, Manuel

MDAnalysis uses Ångström and picoseconds as its default units. Our Reader/Writer objects are only aware of units to the extent that they convert other MD formats to our default units. However, coordinates can also be read in their native units, which can make it hard to remember what units the coordinates of an AtomGroup have. To fix this, you would switch from pure numpy arrays to a unit-aware numpy ndarray wrapper. See Issue #596.
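To make the idea concrete, here is a deliberately small sketch of a wrapper that stores a numpy array together with its length unit and converts explicitly when units differ. The class name `LengthArray`, the unit labels and the restriction to Ångström and nanometres are assumptions made for illustration; an actual project would more likely build on an existing units library or on a proper ndarray subclass.

```python
import numpy as np

# Conversion factors to Ångström; only these two units are handled here.
_TO_ANGSTROM = {"angstrom": 1.0, "nm": 10.0}


class LengthArray:
    """Hypothetical wrapper that pairs coordinate values with a length unit."""

    def __init__(self, values, unit):
        if unit not in _TO_ANGSTROM:
            raise ValueError("unknown length unit: {!r}".format(unit))
        self.values = np.asarray(values, dtype=np.float64)
        self.unit = unit

    def to(self, unit):
        """Return a new LengthArray expressed in `unit`."""
        if unit not in _TO_ANGSTROM:
            raise ValueError("unknown length unit: {!r}".format(unit))
        factor = _TO_ANGSTROM[self.unit] / _TO_ANGSTROM[unit]
        return LengthArray(self.values * factor, unit)

    def __add__(self, other):
        if not isinstance(other, LengthArray):
            return NotImplemented  # refuse to add unit-less data silently
        # Convert the right operand to our unit before adding.
        return LengthArray(self.values + other.to(self.unit).values, self.unit)

    def __repr__(self):
        return "LengthArray({!r}, unit={!r})".format(self.values.tolist(), self.unit)
```

With such a wrapper, adding coordinates in nanometres to coordinates in Ångström gives a correctly converted result, and adding a bare array raises a `TypeError` instead of silently mixing units.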
