Julia for Bioinformatics and Computational Biology

The two-language problem describes a situation in which a software developer prototypes or writes software in a high-level programming language (e.g. Python, R, Perl, Ruby, or MATLAB) but must rewrite parts of the program in a lower-level language (e.g. C) to achieve acceptable performance. Such lower-level languages are generally considered more difficult and more time-consuming to develop in. The two-language problem hinders science and scientists in several ways, including longer development and testing times for biologists, which in turn creates resistance to developing and maintaining high-quality, professional software. Julia, a modern language for technical computing, can solve this problem for scientists: it is a dynamic, high-level language with an ease of use comparable to Python's and performance comparable to that of C. Critically, a scientist needs to write in only one language to create software that is performant yet easy to create, understand, and modify. This is possible because of the way Julia's features are designed to work together: a type system, type inference, multiple dispatch of methods, and a JIT compiler combine to generate compiled code specialized for the concrete types involved, while the programmer continues to write in a high-level, dynamic style.
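As a minimal sketch of how these features interact, consider the short Julia example below. The types `DNASeq` and `RNASeq` and the functions `complement_table` and `reverse_complement` are hypothetical names invented for this illustration; they are not the Bio.jl API.

```julia
# Illustrative types only; these are NOT the Bio.jl sequence types.
struct DNASeq
    data::String
end

struct RNASeq
    data::String
end

# One generic function with a method per sequence type.
# Julia selects the method from the type of the argument (multiple dispatch).
complement_table(::DNASeq) = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C')
complement_table(::RNASeq) = Dict('A' => 'U', 'U' => 'A', 'C' => 'G', 'G' => 'C')

# Written once in a generic style. On first call with a given concrete type,
# the JIT compiler infers the types and compiles a specialized version.
function reverse_complement(seq::T) where {T}
    table = complement_table(seq)
    # Reverse the sequence, map each base to its complement, rebuild the same type.
    return T(String([table[c] for c in reverse(seq.data)]))
end

println(reverse_complement(DNASeq("ATGC")).data)  # prints GCAT
println(reverse_complement(RNASeq("AUGC")).data)  # prints GCAU
```

The same `reverse_complement` definition serves both sequence types without any annotations in its body. In the Julia REPL, the code generated for one specialization can be inspected with `@code_typed reverse_complement(DNASeq("ATGC"))`.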

BioJulia was set up to create a community for life scientists who want to use the language for biological work. Its package Bio.jl provides core bioinformatics infrastructure, with data types and methods essential for the majority of typical bioinformatics tasks (including sequence manipulation and alignment, phylogenetics, and protein structure manipulation). The community is committed to providing a friendly environment for biologists getting to grips with coding and software development, and to good, open software carpentry practices.