On Artificial Neural Networks, Big Data, And Programming

This post was inspired by this Stack Overflow question, which uses the time taken to compute a nice math problem, the sexy prime numbers, to measure the performance of several programming languages. While it is not a world-class benchmark, I found it attractive, simple and meaningful. The question is *outdated* now (more than three years old), though, and I have one objection to the test itself: it counts the time spent printing to the console as part of each language's performance. That skews the results significantly, since different REPL environments and editors handle large amounts of console output very differently, as the Clojure numbers further down show.

About The Test
Hence, I've altered the code to measure the time taken to finish the computation and write the output to a file. I've used the same method of writing to a file for the three languages of interest: Ruby, Clojure and Scala. Clojure and Scala use the same underlying Java file-writing functionality, and my Ruby code writes the file in the same style. This removes the variance in console performance, and it is still meaningful to count the file write as part of the metric, since it is a standard I/O operation and an intrinsic feature of each language.
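
As a rough illustration of that measurement, here is a minimal Ruby sketch that times a block which converts data to a string and writes it to a file. The helper name `elapsed_ms`, the stand-in workload and the use of a temporary file are assumptions made for this example; the benchmark code in this post uses `Time.now` and a fixed output file instead of the monotonic clock used here.

```ruby
require "tempfile"

# Hypothetical helper (not from the original post): runs the block and
# returns the elapsed wall-clock time in milliseconds, using a monotonic clock.
def elapsed_ms
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000.0
end

# Stand-in workload; the real benchmark computes sexy primes instead.
data = (1..10_000).to_a

Tempfile.create("timing-sketch") do |f|
  ms = elapsed_ms do
    f.print(data.to_s) # convert to a string first, then write to the file
    f.flush
  end
  puts "string conversion + file write: #{ms.round(2)} ms"
end
```

Keeping the write inside the timed block is the point: the file I/O is counted, but no console output is.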

And to what extent can console output skew test results?
The following shows test results for Clojure run in the REPL and again inside the LightTable editor:

sexy prime numbers	10k	30k	60k	100k
Clojure 1.7 optimized (REPL)	100 ms	721 ms	2740 ms	6887 ms
Clojure 1.7 optimized (LightTable editor)	80 ms	542 ms	1918 ms	5391 ms

You can see that the difference is actually huge: LightTable is roughly 20 to 30% faster, depending on the input size. So which one should I use to compare against the other languages, and which editor should the other languages use? You get the picture.
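
To make the size of that gap concrete, the following Ruby sketch computes the relative difference for each column of the table above. The timings are copied from the table; the helper name `percent_faster` is made up for this example.

```ruby
# Timings in ms copied from the table above, keyed by upper limit.
repl       = { "10k" => 100, "30k" => 721, "60k" => 2740, "100k" => 6887 }
lighttable = { "10k" => 80,  "30k" => 542, "60k" => 1918, "100k" => 5391 }

# Hypothetical helper: how much faster `fast` is than `slow`,
# expressed as a percentage of `slow`.
def percent_faster(slow, fast)
  ((slow - fast) * 100.0 / slow).round(1)
end

gaps = repl.map { |size, ms| [size, percent_faster(ms, lighttable[size])] }.to_h
gaps.each { |size, pct| puts "#{size}: LightTable is #{pct}% faster" }
```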

The Problem

A sexy prime pair is simply two prime numbers that differ from each other by 6. So the pairs (5,11), (7,13), (11,17), (13,19) and (17,23) are sexy primes. My goal here is to measure the performance of what I think are the three most famous languages right now: Ruby, Clojure and Scala.
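
The definition can be checked with a few lines of Ruby. This is only a small sketch using plain trial division; the names `prime?` and `sexy_pairs` are chosen for this example and are not the benchmark code itself.

```ruby
# Trial division: n is prime if it is greater than 1 and no m in 2...n divides it.
def prime?(n)
  n > 1 && (2...n).all? { |m| n % m != 0 }
end

# Hypothetical helper: all pairs [low, low + 6] where both members are prime
# and the larger member does not exceed `limit`.
def sexy_pairs(limit)
  (2..limit - 6).select { |low| prime?(low) && prime?(low + 6) }
                .map { |low| [low, low + 6] }
end

p sexy_pairs(23)
# prints [[5, 11], [7, 13], [11, 17], [13, 19], [17, 23]]
```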

Platform

Hardware : a MacBook Pro, i7, 16GB RAM

Operating system: OSX Yosemite 10.10.5

Languages : Ruby 2.2.2p95 vs Clojure 1.7 vs Scala 2.11.7

Java : Java(TM) SE Runtime Environment (build 1.8.0_60-b27)

Source Code

Note that in all three languages I convert the result to a string before writing it to the file.

Ruby:

# Trial division: n is prime if no m in 2...n divides it.
def is_prime?(n)
  (2...n).all? { |m| n % m != 0 }
end

# All pairs [i - 6, i] up to x where both members are prime.
def sexy_primes(x)
  (9..x).map do |i|
    [i - 6, i]
  end.select do |pair|
    pair.all? { |n| is_prime?(n) }
  end
end

output_file = "ruby-sexy-primes-test-no-100.txt"
a = Time.now
File.open(output_file, 'wb') do |f|
  f.print(sexy_primes(100 * 1000))
end
b = Time.now
puts "#{(b - a) * 1000} mils"

Clojure:

(ns user (:import (java.io BufferedWriter FileWriter)))

(defn is-prime? [n]
  (every? #(> (mod n %) 0)
          (range 2 n)))

(defn sexy-primes [m]
  (for [x (range 11 (inc m))
        :let [z (list (- x 6) x)]
        :when (every? #(is-prime? %) z)]
    z))

(defn spit-it [file-name data]
  (with-open [wtr (BufferedWriter. (FileWriter. file-name))]
    (.write wtr data)))

(let [a (System/currentTimeMillis)]
  (spit-it "clojure-sexy-primes-test-no-100.txt" (pr-str (sexy-primes (* 100 1000))))
  (let [b (System/currentTimeMillis)] (println (- b a) "mils")))

Clojure optimized is-prime?:

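;; Note: returns false for n = 2, which is harmless here because the
;; benchmark only calls it with n >= 5.
(defn ^:static is-prime? [^long n]
  (loop [i (long 2)]
    (if (= (rem n i) 0)
      false
      (if (>= (inc i) n)
        true
        (recur (inc i))))))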

Scala:

import java.io._

def isPrime(n: Int) =
  (2 until n) forall { n % _ != 0 }

def sexyPrimes(n: Int) =
  (11 to n) map { i => List(i-6, i) } filter { _ forall(isPrime(_)) }

val writer = new PrintWriter(new File("scala-sexy-primes-test-100.txt"))
val a = System.currentTimeMillis()
writer.write(sexyPrimes(100*1000).toString)
writer.close()
val b = System.currentTimeMillis()
println((b-a).toString + " mils")

Scala optimized isPrime (the same idea as in the Clojure optimization):

import scala.annotation.tailrec

// Not required, but the compiler reports an error
// if the tail-call optimization cannot be applied
@tailrec
def isPrime(n: Int, i: Int = 2): Boolean =
  if (i == n) true
  else if (n % i != 0) isPrime(n, i + 1)
  else false
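
Both optimized versions replace the range-plus-predicate form with an explicit counting loop. For comparison, the same idea could be sketched in Ruby as follows. Note that `is_prime_loop?` is a hypothetical variant written for this illustration; it was not part of the benchmark and no timings were taken for it.

```ruby
# Baseline from this post: builds a range and tests every candidate divisor via a block.
def is_prime?(n)
  (2...n).all? { |m| n % m != 0 }
end

# Hypothetical loop-based variant (not benchmarked in this post):
# the same trial division, but with a plain counter instead of a range and a block.
def is_prime_loop?(n)
  i = 2
  while i < n
    return false if n % i == 0
    i += 1
  end
  true
end

# Both versions should agree on every input the benchmark uses.
puts (5..1000).all? { |n| is_prime?(n) == is_prime_loop?(n) }
```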

Results

Test results when writing to a file:

sexy prime numbers	10k	30k	60k	100k
Clojure 1.7	470 ms	3384 ms	12866 ms	32505 ms
Clojure 1.7 optimized	67 ms	564 ms	1834 ms	5149 ms
Scala 2.11.7	66 ms	474 ms	1747 ms	4556 ms
Scala 2.11.7 optimized	22 ms	144 ms	572 ms	1459 ms
Ruby 2.2.2p95	593.66 ms	4436.85 ms	16443.66 ms	43520.99 ms
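
To read the table in relative terms, this short Ruby sketch divides each 100k timing by the fastest one. The numbers are copied from the table above; the variable names are made up for this example.

```ruby
# 100k timings in ms, copied from the table above.
ms_100k = {
  "Clojure 1.7"            => 32505,
  "Clojure 1.7 optimized"  => 5149,
  "Scala 2.11.7"           => 4556,
  "Scala 2.11.7 optimized" => 1459,
  "Ruby 2.2.2p95"          => 43520.99
}

fastest = ms_100k.values.min

# How many times slower each entry is than the fastest one.
slowdown = ms_100k.transform_values { |ms| (ms / fastest.to_f).round(2) }
slowdown.each { |name, x| puts format("%-24s %6.2fx", name, x) }
```

At 100k, this puts unoptimized Ruby at roughly 30 times the optimized Scala time, and optimized Clojure at roughly 3.5 times.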

Clearly, Scala's performance stands out. I have yet to learn more about Scala and its features, though. One important factor to consider is that this is a CPU-bound test. In an I/O-bound test I would expect the gap between Clojure and Scala to narrow considerably, since static type optimizations probably matter much less there than in CPU-bound calculations. Consider using Akka for concurrency, for example: it is a JVM-based toolkit that both Clojure and Scala can use effectively. However, I would love to do some testing on that and get some numbers.

The popularity of Clojure in scientific computing and AI is due to the fact that Clojure is well suited to programming symbolic systems and mathematical programming, among other needs, while relying on external optimized C/Fortran libraries such as LAPACK and BLAS, which both Scala and Clojure would need in these scenarios anyway. Thus you get the speed along with Clojure's innate strengths for AI (text processing and symbolic systems), which is not something to dismiss lightly. On the other hand, type inference in Scala might be of help here; that is something I need to take a closer look at.

Ruby's results are shocking, to say the least: they are slower than Ruby 1.9.3. I don't know what went wrong there. This needs further investigation.

Thursday, September 3, 2015

A CPU Bound Performance Comparison Between Clojure 1.7.0, Scala 2.11.7, and Ruby 2.2.2p95