[x 1] (identity x)
[arr (object-array (range 1000000))] (reduce + 0 arr)
[arr (object-array (range 1000000))] (reduce + 0N arr)
[arr (long-array (range 1000000))] (areduce arr i sum 0 (+ sum i))
[arr (long-array (range 1000000))] (areduce arr i sum 0N (+ sum i))
[coll []] (instance? clojure.lang.IPersistentVector coll)
[coll (list 1 2 3)] (instance? clojure.lang.ISeq coll)
[coll [1 2 3]] (instance? clojure.lang.ISeq coll)
[coll (list 1 2 3)] (first coll)
[coll (list 1 2 3)] (rest coll)
[] (list)
[] (list 1 2 3)
[] []
[] [1 2 3]
[coll [1 2 3]] (transient coll)
[coll [1 2 3]] (nth coll 0)
[coll [1 2 3]] (conj coll 4)
[coll [1 2 3]] (seq coll)
[coll (seq [1 2 3])] (first coll)
[coll (seq [1 2 3])] (rest coll)
[coll (seq [1 2 3])] (next coll)
[] (reduce conj [] (range 40000))
[] (persistent! (reduce conj! (transient []) (range 40000)))
[coll (into [] (range (+ 32768 32)))] (conj coll :foo)
[coll (into [] (range 40000))] (assoc coll 123 :foo)
[coll (into [] (range (+ 32768 33)))] (pop coll)
[coll (take 100000 (iterate inc 0))] (reduce + 0 coll)
[coll (range 1000000)] (reduce + 0 coll)
[coll (into [] (range 1000000))] (reduce + 0 coll)
[coll (into [] (range 1000000))] (apply + coll)
[coll {:foo 1, :bar 2}] (get coll :foo)
[coll {:foo 1, :bar 2}] (:foo coll)
[coll (Foo. 1 2)] (:bar coll)
[coll {:foo 1, :bar 2}] (assoc coll :baz 3)
[coll {:foo 1, :bar 2}] (assoc coll :foo 2)
[key :f0] (hash key)
[coll {:foo 1, :bar 2}] (loop [i 0 m coll] (if (< i 100000) (recur (inc i) (assoc m :foo 2)) m))
[coll map1] (:f0 coll)
[coll map1] (get coll :f0)
[coll map1] (assoc coll :g0 32)
[coll {}] (assoc coll :f0 1)
[] #{}
[] #{1 2 3}
[coll #{1 2 3}] (conj coll 4)
[coll (range 500000)] (reduce + coll)
[s "{:foo [1 2 3]}"] (read-string s)
[m {:foo [1 2 {:bar {3 :a, 4 #{:c :b :d :e}}}]}] (pr-str m)
[r (range 1000000)] (last r)
[r (ints-seq 1000000)] (last r)

(identity x) notes: It is a good bet that most of this measured time is the per-loop overhead in criterium of updating two primitive long loop locals, and saving the return value in a Java array, although the version of criterium used may actually try to estimate and remove that overhead from the time values it reports. This same overhead is present in all of the benchmarks, but much less noticeable when the expression takes longer to evaluate.
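
To make the overhead described above concrete, here is a minimal sketch of a timing loop with the same shape: two primitive long loop locals and a Java array that receives each return value. This is not criterium's actual code. The name `time-per-call` and its arguments are hypothetical and used only for illustration.

```clojure
(defn time-per-call
  "Returns the average nanoseconds per call of the zero-argument function f
  over n iterations. Hypothetical helper, not part of criterium."
  [f ^long n]
  (let [results (object-array 1)          ; holds the most recent return value
        start   (System/nanoTime)]
    (loop [i 0                            ; first primitive long loop local
           remaining n]                   ; second primitive long loop local
      (when (pos? remaining)
        (aset results 0 (f))              ; save the return value in a Java array
        (recur (inc i) (dec remaining))))
    (/ (double (- (System/nanoTime) start)) (double n))))

;; For a trivial expression such as (identity x), the loop bookkeeping above
;; can be a large share of the measured time.
(let [x 1]
  (time-per-call (fn [] (identity x)) 1000000))
```

The absolute number this returns will vary by JVM and machine. The point is only that the loop and the array store are measured along with the expression.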

(reduce + 0N arr) notes: TBD exactly which commit(s) between Clojure 1.3-beta1 and 1.3-beta2 led to the much better performance afterwards.

(areduce arr i sum 0 (+ sum i)) notes: TBD which commit(s) between Clojure 1.3-beta3 and 1.3 led to the better performance on 32-bit JVMs afterwards.

(areduce arr i sum 0N (+ sum i)) notes: TBD exactly which commit(s) between Clojure 1.3-beta1 and 1.3-beta2 led to the much better performance afterwards. Likely it was the same changes that led to the better performance for (reduce + 0N arr) above.

[coll (list 1 2 3)] (first coll) notes: The spikes of longer times in this graph are most likely due to significant variation among the 3 runs made for each data point. With several more runs, at least one would probably have been as low as the neighboring results on the same JVM. I don't know the reasons for the significant run-to-run variation.

[coll (list 1 2 3)] (rest coll) notes: See previous graph for notes on the spikes in the graph.

[] (list 1 2 3) notes: I find it odd that several of the JVMs tested have a noticeable increase in run time between Clojure 1.5-beta1 and 1.5-beta2, but not the others. TBD: which commit(s) led to this change?

[coll (seq [1 2 3])] (first coll) notes: See under the graph of [coll (list 1 2 3)] (first coll) for notes on the spikes in the graph.

[coll (seq [1 2 3])] (next coll) notes: See under the graph of [coll (list 1 2 3)] (first coll) for notes on the spikes in the graph.

[] (reduce conj [] (range 40000)) notes: TBD why only the 64-bit Apple JDK 1.6 is significantly slower than the others.

[key :f0] (hash key) notes: TBD exactly which commit(s) between Clojure 1.4-alpha1 and 1.4-alpha2 led to the worse performance afterwards. The slowdown appears to have been undone, or offset by some other improvement, in 1.5-beta2.

clj-1.6-beta1 is when a slower but higher quality hash function was introduced. The speedup in clj-1.6-rc2 may be due to caching the hash calculation for keywords (TBD).

[coll map1] (:f0 coll) notes: map1 has 30 key/value pairs where all keys are keywords. The increase in time from Clojure 1.4-alpha1 to 1.4-alpha2 is likely due to the same change as for the [key :f0] (hash key) benchmark above.

[coll map1] (get coll :f0) notes: The previous graph's notes are relevant here, too.

[coll map1] (assoc coll :g0 32) notes: This benchmark shows the biggest run time increase when the slower but higher-quality hashing was introduced in clj-1.6-beta1. While some of this extra time is due directly to the slower hash function calculations, most of it is due to the way the PersistentHashMap data structure is implemented: the hash function changes cause the shape of the PersistentHashMap 'tree' to be significantly different after the hash change. If we had a different benchmark (TBD) that totaled the run time of assoc'ing several new key/value pairs into the original map, say 10, I would expect the run times before and after the hash change to be much more similar to each other.
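
The TBD benchmark suggested above could look something like the following sketch. The real definition of map1 is not shown here, so `example-map` (30 keyword keys, :f0 through :f29) and `new-keys` (:g0 through :g9) are assumptions chosen to resemble it, and `assoc-all` is a hypothetical helper.

```clojure
;; Assumed stand-in for map1: 30 key/value pairs with keyword keys.
(def example-map
  (into {} (map (fn [i] [(keyword (str "f" i)) i]) (range 30))))

;; Ten keys that are not yet present in example-map (hypothetical choice).
(def new-keys
  (mapv (fn [i] (keyword (str "g" i))) (range 10)))

(defn assoc-all
  "assoc each key in ks into m with the value 32, one key at a time."
  [m ks]
  (reduce (fn [acc k] (assoc acc k 32)) m ks))

;; The expression to benchmark, in the same [bindings] expr form as the list above:
;; [coll example-map, ks new-keys] (assoc-all coll ks)
(count (assoc-all example-map new-keys))
;; => 40
```

Totaling ten assocs would spread any one-time cost of changing the root node type across several operations, instead of letting a single assoc carry all of it.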

In particular, with clj-1.6-alpha3 and earlier, the PersistentHashMap has an ArrayNode as its root node, with most of the key/value pairs stored directly inside it. Calling assoc on this map with key :g0 returns another PersistentHashMap whose root ArrayNode differs only slightly from the original. With clj-1.6-beta1 through clj-1.6, the PersistentHashMap has a BitmapIndexedNode as its root node, and assoc on this data structure returns a PersistentHashMap with an ArrayNode in place of the BitmapIndexedNode, which takes significantly more computation time to initialize. It runs the code in these lines to create the ArrayNode from the BitmapIndexedNode.
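
One way to check which root node type a given map has is to read the `root` field of PersistentHashMap by reflection. This is a sketch for exploration only: the field is an implementation detail that may change between Clojure versions, `root-node-type` is a hypothetical helper, and `example-map` is an assumed stand-in for map1. Which node type you see depends on the Clojure version and on how the keys hash.

```clojure
(defn root-node-type
  "Returns the simple class name of the root node of m, or nil if m is not
  a PersistentHashMap or has no root node. Relies on a non-public field."
  [m]
  (when (instance? clojure.lang.PersistentHashMap m)
    (let [^Class c clojure.lang.PersistentHashMap
          ^java.lang.reflect.Field field (.getDeclaredField c "root")]
      (.setAccessible field true)          ; the field is not public
      (when-let [root (.get field m)]
        (.getSimpleName (class root))))))

;; Assumed stand-in for map1: 30 key/value pairs with keyword keys.
(def example-map
  (into {} (map (fn [i] [(keyword (str "f" i)) i]) (range 30))))

;; Compare the root node type before and after the assoc from the benchmark.
[(root-node-type example-map)
 (root-node-type (assoc example-map :g0 32))]
```

Running this under different Clojure versions should show whether the assoc keeps the same root node type or replaces a BitmapIndexedNode with an ArrayNode.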