
Scientific Computing on JRuby

github.com/prasunanand

Objective

● A scientific library is memory intensive, and speed counts. How do we use JRuby effectively to create a great tool/gem?

● A general-purpose GPU library for Ruby that can be used by industry in production and by academia for research.

Ruby Science Foundation

● SciRuby has been trying to push Ruby for scientific computing.

● Popular Rubygems:

1. NMatrix

2. Daru

3. Mixed_models

4. IRuby notebook

NMatrix

NMatrix is SciRuby's numerical matrix core, implementing dense matrices as well as two types of sparse matrices (linked-list-based and Yale/CSR).

It currently relies on ATLAS/CBLAS/CLAPACK and standard LAPACK for several of its linear algebra operations.
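A minimal usage sketch (based on the documented NMatrix API from sciruby/nmatrix; exact method availability depends on the installed version):

    require 'nmatrix'

    # 2x2 dense matrix of doubles
    a = NMatrix.new([2, 2], [1.0, 2.0, 3.0, 4.0], dtype: :float64)

    b = a + a      # elementwise addition
    c = a.dot(a)   # matrix multiplication (BLAS gemm on MRI)
    puts a.det     # determinant, backed by LAPACK routines on MRI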

Daru

Mixed_models

Nyaplot

Why nya?

Contributors wanted

● IRC #sciruby

● Slack channel #sciruby

● Google group #sciruby

Known for performance

● JRuby is 10 times faster than CRuby.

● With Truffle, it is around 40 times faster than CRuby.

Say hello
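The "Say hello" demo is not reproduced in this dump; as a stand-in, a minimal example of JRuby's Java interop (standard java_import usage), which the rest of the talk builds on:

    require 'java'
    java_import 'java.util.ArrayList'

    list = ArrayList.new            # a real java.util.ArrayList instance
    list.add('Hello from JRuby!')
    puts list.get(0)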

NMatrix for JRuby

● There was no unified interface for SciRuby gems: MDArray.

● MDArray is a great gem for linear algebra.

● However, every gem that used NMatrix as a dependency would need to be reimplemented with MDArray.

● Hence, the effort went into optimising NMatrix itself.

● MDArray used Parallel Colt, which is deprecated.

NMatrix for JRuby

● Parallelism: no Global Interpreter Lock, unlike MRI (see the sketch below).

● Easy deployment (Warbler gem).
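A minimal sketch of what the missing GIL buys (plain Ruby threads; the slicing scheme is illustrative, not NMatrix code). On JRuby, Ruby threads are JVM threads, so this elementwise pass genuinely uses all cores; on MRI the GIL serialises it:

    data = Array.new(1_000_000) { rand }
    nthreads = 4
    slice = data.size / nthreads

    threads = nthreads.times.map do |t|
      Thread.new do
        # each thread transforms its own slice of the flat array
        (t * slice...(t + 1) * slice).each { |i| data[i] = Math.sin(data[i]) }
      end
    end
    threads.each(&:join)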

How NMatrix works

● N-dimensional NMatrix

● 2-dimensional NMatrix

N-dimensional NMatrix

N-dimensional matrices are stored as a one-dimensional array.
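A sketch of what that storage scheme implies. The row-major stride arithmetic here is an assumption for illustration, and the helper name is hypothetical:

    # Map an N-dimensional index onto the flat storage array.
    shape   = [2, 3, 4]
    strides = shape.each_index.map { |k| shape[(k + 1)..-1].inject(1, :*) }
    # => [12, 4, 1]

    def flat_index(idx, strides)
      idx.zip(strides).sum { |i, s| i * s }
    end

    storage = Array.new(shape.inject(:*), 0.0)
    storage[flat_index([1, 2, 3], strides)] = 42.0  # element at [1, 2, 3]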

Elementwise operations

● Iterate through the elements.

● Access the array, apply the operation, and return the result.

● [:add, :subtract, :sin, :gamma]
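A minimal sketch of that loop (not NMatrix's actual internals):

    # Apply a binary elementwise operation over two flat storage arrays.
    def elementwise(a, b, op)
      a.each_index.map { |i| a[i].send(op, b[i]) }
    end

    elementwise([1.0, 2.0], [3.0, 4.0], :+)      # => [4.0, 6.0]

    # Unary ops like :sin walk the flat array the same way.
    [0.0, Math::PI / 2].map { |x| Math.sin(x) }  # => [0.0, 1.0]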

Determinants and factorization

● Two-dimensional matrix operations.

● In NMatrix-MRI, BLAS level-3 and LAPACK routines are implemented using their respective libraries.

● NMatrix-JRuby depends on Java functions.

Mixed models

● After NMatrix for doubles was ready, I tested it with mixed_models.

Challenges

● Autoboxing and multiple data types

● Minimising copying of data

● Handling large arrays

Autoboxing

● :float64 => double only

● Strict dtypes => create the data type in Java, don't guess

● Errors => that can't be reproduced :P

[0.11, 0.05, 0.34, 0.14] + [0.21, 0.05, 0.14, 0.14] = [0, 0, 0, 0]

([0.11, 0.05, 0.34, 0.14] + 5) + ([0.21, 0.05, 0.14, 0.14] + 5) - 10 = [0.32, 0.1, 0.48, 0.28]
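A plain-Ruby rendering of the workaround above: the direct elementwise add came back as zeros, but shifting both operands by a constant, adding, and subtracting the combined shift recovered the correct sums.

    a = [0.11, 0.05, 0.34, 0.14]
    b = [0.21, 0.05, 0.14, 0.14]

    # What the direct add should give; the autoboxing bug returned zeros.
    direct = a.zip(b).map { |x, y| x + y }

    # Workaround from the slide: (a + 5) + (b + 5) - 10, elementwise.
    shifted = a.zip(b).map { |x, y| (x + 5) + (y + 5) - 10 }
    # => [0.32, 0.1, 0.48, 0.28]  (up to float rounding)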

Minimise copying of data

● Make sure you know when you are making copies of data.

Handling large arrays

● Array size

● Accessing elements

● Chaining to Java methods

● Speed and memory required

Ruby Code

    require 'java'
    require 'benchmark'

    # Allocate two 15000x15000 Java double arrays; fill b from Ruby.
    b = Java::double[15_000, 15_000].new
    c = Java::double[15_000, 15_000].new

    index = 0
    puts Benchmark.measure {
      (0...15_000).each do |i|
        (0...15_000).each do |j|
          b[i][j] = index
          index += 1
        end
      end
    }
    # 43.260000   3.250000  46.510000 ( 39.606356)

    # Copy b into c, element by element, from Ruby.
    index = 0
    puts Benchmark.measure {
      (0...15_000).each do |i|
        (0...15_000).each do |j|
          c[i][j] = b[i][j]
          index += 1
        end
      end
    }
    # 67.790000   0.070000  67.860000 ( 65.126546)
    # RAM consumed => 5.4GB

Java Code

    public class MatrixGenerator {
      static int row = 15_000, col = 15_000;
      static double[][] b = new double[row][col];
      static double[][] c = new double[row][col];

      // Fill b with a running index.
      public static void test1() {
        for (int index = 0, i = 0; i < row; i++) {
          for (int j = 0; j < col; j++) {
            b[i][j] = index;
            index++;
          }
        }
      }

      // Copy b into c.
      public static void test2() {
        for (int i = 0; i < row; i++) {
          for (int j = 0; j < col; j++) {
            c[i][j] = b[i][j];
          }
        }
      }
    }

    puts Benchmark.measure { MatrixGenerator.test1 }
    # 0.032000  0.001000  0.032000 ( 0.031000)

    puts Benchmark.measure { MatrixGenerator.test2 }
    # 0.034000  0.001000  0.034000 ( 0.033000)
    # RAM consumed => 300MB

Results

Improvements (moving the loops into Java avoids crossing the Ruby/Java boundary and boxing every double on each access):

● About 1000 times the speed

● About 10 times less memory

Benchmarking NMatrix functionalities

System specifications

● CPU: AMD FX-8350 Octacore, 4.2 GHz

● RAM: 16GB

Benchmark plots (one per operation): addition, subtraction, gamma, matrix multiplication, determinant, factorization.

Benchmark conclusions

● NMatrix-JRuby is dramatically faster for elementwise operations on N-dimensional matrices.

● NMatrix-MRI is faster for 2-dimensional matrix multiplication, determinant calculation, and factorization.

Improvements

● Make NMatrix-JRuby faster than NMatrix-MRI by using BLAS level-3 and LAPACK routines.

● How?

● Why not JBlas?

Future Work

● Add support for the complex dtype.

● Convert NMatrix-JRuby enumerators to Java code.

● Add sparse-matrix support.

Am I done?

Nope!

Enter GPU

A General-Purpose GPU library

● Combine the beauty of Ruby with transparent GPU processing.

● This will work both on client computers and on servers that use Nvidia Tesla and Intel Xeon Phi solutions.

● Developer activity and support for the current projects is mixed at best, and they are tough to use: they involve writing kernels and require a lot of effort in buffer/RAM optimisation.

ArrayFire-rb

● Wraps the ArrayFire library.

Using ArrayFire
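This slide presumably showed live code; as a stand-in, a hedged sketch modelled on the arrayfire-rb README. The class name and constructor signature are assumptions and may differ between versions:

    require 'arrayfire'

    # Af_Array.new(ndims, shape, elements) -- signature assumed.
    a = ArrayFire::Af_Array.new(2, [2, 2], [1.0, 2.0, 3.0, 4.0])
    b = ArrayFire::Af_Array.new(2, [2, 2], [5.0, 6.0, 7.0, 8.0])

    c = a + b   # elementwise add, executed on the GPU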

MRI

● C extension.

● Architecture is inspired by NMatrix and NArray.

● The C++ function is placed in a namespace (e.g., namespace af { }) or is declared static if possible. The C function receives the prefix af_, e.g., af_multiply() (this function also happens to be static).

● C macros are capitalized and generally have the prefix AF_, as with AF_DTYPE().

● C functions (and macros, for consistency) are placed within extern "C" { } blocks to turn off C++ name mangling.

● C macros (in extern blocks) may represent C++ constants (which are always defined in namespace af {} or a child thereof).

JRuby

● The approach is the same as for NMatrix-JRuby.

● Java Native Interface (JNI)

● Builds on ArrayFire-Java.

Benchmarking ArrayFire

System specifications

● CPU: AMD FX Octacore, 4.2 GHz

● RAM: 16GB

● GPU: Nvidia GTX 750 Ti

● GPU RAM: 4GB GDDR5

Benchmark plots (one per operation): matrix addition, matrix multiplication, matrix determinant, factorization.

Transparency

● Integrate with NArray.

● Integrate with NMatrix.

● Integrate with Rails.

Applications

● Endless possibilities ;)

● Bioinformatics

● Integrating TensorFlow

● Image processing

● Computational fluid dynamics

Conclusion

Useful links

● https://github.com/sciruby/nmatrix

● https://github.com/arrayfire/arrayfire-rb

● https://github.com/prasunanand/arrayfire-rb/tree/temp

Acknowledgements

1. Pjotr Prins

2. Charles Nutter

3. John Woods

4. Alexej Gossmann

5. Sameer Deshmukh

6. Pradeep Garigipati

Thank You

GitHub: prasunanand
Twitter: @prasun_anand
Blog: prasunanand.com