Array addition vs Array concat

Array#+ vs Array#concat

@dezmo and I were privately discussing the benefits and speed of Array#+ versus using Array#concat.

Backstory ...

The impetus for the tests came from the comparison of …

def get_instances(ents)
  ents.grep(Sketchup::Group) + ents.grep(Sketchup::ComponentInstance)
end

… which creates a new 3rd array, … to the following which does not create a 3rd array:

def get_instances(ents)
  ents.grep(Sketchup::Group).concat(ents.grep(Sketchup::ComponentInstance))
end

The results of the test pleasantly surpassed expectations, so I thought I would share the results.

The test code:

GC.start

a2 = ["text"]
a1 = []
t = Time.now
100000.times{
  a1 += a2
}
t1 = Time.now
puts "Array+: #{t1-t} secs"

b1 = []
t = Time.now
100000.times{
  b1.concat(a2)
}
t1 = Time.now
puts "Concat: #{t1-t} secs"

Results: (on my i7-8750H CPU @ 2.20GHz machine)

Array+: 6.3678292
Concat: 0.0107127

The first time I ran it (on my machine,) Array#+ was 9.8 secs! It seems to be much longer the first time it is run, but afterward quickly settles down into an average from 5.1-5.8 secs.

Array#concat is between at 0.0103-0.0107 secs. (Yes that is 1 hundredth of a second!)

Conclusion:

Array#+ is a minimum of 500 times slower than Array#concat !

BUT WAIT ! It gets better …

I then made the a2 array much more complex with multiple members of various types, ie …

a2 = ["text",1,Math::PI,{:key =>"value"},:symbolic]

… and reran the test. The results were even more surprising:

Array+: 56.0732861 secs
Concat: 0.0175667 secs

So array addition took almost a full minute.
:drum:
… but array concatenation was still under 2 hundredths of a second!

That’s like 3200 times faster. And that’s crazy good.

5 Likes

:thinking:
I went back to the Backstory…
I Created a simple cuboid group and similar component. Copied them a bunch of times. So there is a 1001 groups + 1001 comp instances in the model.
(Windows 10 / SU2021.0.339, i7-8850H CPU @ 2.60GHz )

def get_instances_p(ents)
  ents.grep(Sketchup::Group) + ents.grep(Sketchup::ComponentInstance)
end

def get_instances_c(ents)
  ents.grep(Sketchup::Group).concat(ents.grep(Sketchup::ComponentInstance))
end

model = Sketchup.active_model
ents = model.entities
instances_c = []
instances_p = []

t = Time.now
10000.times{
  instances_c = get_instances_c(ents)
}
t1 = Time.now
puts "Concat: #{t1-t} secs (#{instances_c.size})"

t = Time.now
10000.times{
 instances_p = get_instances_p(ents)
}
t1 = Time.now
puts "Array+: #{t1-t} secs (#{instances_p.size})"

Results:

1st

Concat: 7.4057384 secs (2002)
Array+: 7.4500132 secs (2002)

2nd

Concat: 7.3715121 secs (2002)
Array+: 7.4838291 secs (2002)

Here the addition or concat happening once, with “big” arrays, but that does not really make visible difference of performance. Still a concat looks a little faster.

1 Like

That is an interesting test.

In my mind it is indicating that reference assignment and method calling overhead is the great majority of the time being spent.

I was wondering about this in the first test where the addition had an assignment, but the concat test did not.

Main reason for the difference is that + will create a new array combining the content of the two. While .concat modifies the receiver. Less data is copied.

Hmmm … I said this in the Backstory block. But per the 2nd test by @dezmo, I am not so sure.

Granted in a long loop, as the built up array gets bigger and bigger, there is more and more items to copy into the new 3rd array for #+, whereas for #concat the same amount of items are copied into the receiver array in each iteration.

1 Like

Okay, I understand now why they look the same. (Missed this the first time I read it.)
This test doesn’t accumulate the receiver array like the following would …

model = Sketchup.active_model
ents  = model.entities
insts = ents.grep(Sketchup::ComponentInstance)

instances_c = []
instances_p = []

t = Time.now
10000.times {
  instances_c.concat(insts)
}
t1 = Time.now
puts "Concat: #{t1-t} secs (#{instances_c.size})"

t = Time.now
10000.times {
  instances_p = instances_p + insts
}
t1 = Time.now
puts "Array+: #{t1-t} secs #(#{instances_p.size})"

Yes I agree. We would not see much difference when calling once without an accumulation of the results.

1 Like

Yes, I think this is the key word here!


Off topic

:thinking: Maybe there is a similar behaviour Array#- vs ???

In the case of subtraction ...

In the case of subtraction, the receiver array is going to getting smaller each iteration so the loops should get faster each time through.

Perhaps compare against a1.delete_if {|e| a2.include?(e) }
Kind of what #- and #difference does, except #delete_if works upon the receiver (no extra array created.)

1 Like

:bulb: Do you want more? :wink:

a2 = ["text"]
b1 = []
t = Time.now
10000000.times{
  b1.concat(a2)
}
t1 = Time.now
puts "Conc: #{t1-t} secs"

c1 = []
t = Time.now
10000000.times{
  c1.push(a2[0])
}
t1 = Time.now
puts "Push: #{t1-t} secs"
Conc: 1.2401043 secs
Push: 1.1350859 secs

Yea, this will apply if a2 contains only one element…but still!! :beers:

Worth using the Benchmark feature in Ruby for these tests. Presents the results in somewhat more readable format (also separate user and and system time).

Another option, create a SpeedUp profiling test, which can be used for both benchmarking and profiling: bitmap-2-mesh/PR_HeightMap_generate.rb at master · thomthom/bitmap-2-mesh · GitHub

@dezmo, well it is not a proper compare for concat an array vs pashing a single item.

That’s right. It is a “special” case of this one (and it is more comparable, I hope):

a2 = ["text",1,Math::PI,{:key =>"value"},:symbolic]
b1 = []
t = Time.now
100000.times{
  b1.concat(a2)
}
t1 = Time.now
puts "Conc: #{t1-t} secs"

c1 = []
t = Time.now
100000.times{
  c1.push(a2)
}
c1.flatten!
t1 = Time.now
puts "Push: #{t1-t} secs"

Conc: 0.0141143 secs
Push: 0.0431016 secs

Push the array in a loop and then flatten at the end still much faster than +.

I suppose expanding the a2 arg in the push call with Ruby’s splat operator would be a little slower, ie:

c1 = []
t = Time.now
100000.times{
  c1.push(*a2)
}

?

Off topic

I’ll add it to my (long enough) list of studies … :blush:



Wow. No. It is similar!

a2 = ["text",1,Math::PI,{:key =>"value"},:symbolic]
b1 = []
t = Time.now
100000.times{
  b1.concat(a2)
}
t1 = Time.now
puts "Conc: #{t1-t} secs"

c1 = []
t = Time.now
100000.times{
  c1.push(*a2)
}
# c1.flatten!
t1 = Time.now
puts "Push: #{t1-t} secs"
c1 == b1

Conc: 0.0141545 secs
Push: 0.0137346 secs
=> true
Conc: 0.0136714 secs
Push: 0.0134551 secs
Conc: 0.0139562 secs
Push: 0.0141386 secs
1 Like