@simeonschaub
Collaborator

In my benchmarks, Julia 1.7 seems to be smart enough to eliminate the bounds check on its own:

```julia
julia> function _sum(A)
           s = 0.0
           for a in A
               s += a
           end
           s
       end
_sum (generic function with 1 method)

julia> function _sum2(A)
           s = 0.0
           @inbounds @simd for a in A
               s += a
           end
           s
       end
_sum2 (generic function with 1 method)

julia> function _sum3(A)
           s = 0.0
           @simd for a in A
               s += a
           end
           s
       end
_sum3 (generic function with 1 method)

julia> using BenchmarkTools

julia> a = rand(10^7);

julia> @benchmark _sum($a)
BenchmarkTools.Trial: 425 samples with 1 evaluation.
 Range (min … max):  10.595 ms …  13.903 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     11.753 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   11.770 ms ± 177.003 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                             █▂▁
  ▅▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁███▅▅▅▄▄▄▁▁▄▄▆▁▁▄ ▆
  10.6 ms       Histogram: log(frequency) by time      12.2 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark _sum2($a)
BenchmarkTools.Trial: 1024 samples with 1 evaluation.
 Range (min … max):  4.323 ms …   8.279 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.850 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.877 ms ± 207.137 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                       █▂▄▅▂▁▁▂▃
  ▂▁▂▂▁▂▁▁▂▁▂▂▂▂▂▁▂▁▂▁██████████▇▆▄▃▃▃▂▂▂▂▂▃▂▂▂▂▂▁▁▁▁▁▂▁▁▁▁▁▂ ▃
  4.32 ms         Histogram: frequency by time        5.55 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark _sum3($a)
BenchmarkTools.Trial: 988 samples with 1 evaluation.
 Range (min … max):  4.117 ms …   8.226 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.935 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.049 ms ± 579.887 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

              ▄█▂
  ▃▃▃▂▁▃▃▃▃▄▅▆███▆▆▅▅▅▄▃▃▂▂▂▂▁▂▂▂▁▁▂▁▁▁▂▂▂▁▂▂▁▁▁▁▁▂▁▂▂▂▂▂▂▂▂▂ ▃
  4.12 ms         Histogram: frequency by time        7.63 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.
```
@simeonschaub
Collaborator Author

> There is no array accessing (e.g. `A[2]`) in this piece of code, so `_sum2` and `_sum3` are essentially the same and `@inbounds` is actually redundant in `_sum2`.

Just because you don't see it doesn't mean this code doesn't end up calling `A[i]`. :) It might not have been the best example though, because that call is already annotated as `@inbounds`: https://github.com/JuliaLang/julia/blob/2d6a84ecedc02a58d0a4f64d898ae5124ad35c2b/base/array.jl#L895.
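To make the hidden indexing concrete, here is a rough sketch of what a `for a in A` loop lowers to through the iteration protocol. The function name `sum_via_iterate` is made up for illustration; the point is only that each `iterate` call on an `Array` is where the indexed load happens, inside Base rather than in user code.

```julia
# Roughly what `for a in A; s += a; end` expands to (illustrative sketch).
function sum_via_iterate(A)
    s = 0.0
    next = iterate(A)                # returns (element, state) or nothing
    while next !== nothing
        a, state = next
        s += a
        next = iterate(A, state)     # for Array, Base does the indexed load here
    end
    return s
end

sum_via_iterate([1.0, 2.0, 3.0])
```

So an `@inbounds` written around the loop in user code has nothing left to remove in this case, since the indexing lives in the `iterate` method.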

> The programming language should allow users to control this instead of being 'smart' enough to eliminate the bounds check on its own.

I definitely disagree with this. In the majority of cases, Julia's compiler should have all the information it needs to eliminate bounds checks. `@inbounds` should really be thought of more as a band-aid for cases where the compiler isn't yet smart enough. It's very easy to use incorrectly, and the problem is that you might not get a segfault immediately. Instead, out-of-bounds accesses can also lead to silent memory corruption and hard-to-track bugs down the road. That's why, ideally, we'd like to get rid of `@inbounds` altogether.
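For cases where `@inbounds` is still needed today, one common way to keep it safe is to validate the whole index range once and only then skip the per-access checks. The following is a minimal sketch of that pattern; `sum_range` and its arguments are hypothetical names, not something from this PR.

```julia
# Sum A[lo:hi], checking the range once instead of on every access.
function sum_range(A::AbstractVector, lo::Integer, hi::Integer)
    # Throws a BoundsError up front if lo:hi is not entirely inside A.
    checkbounds(A, lo:hi)
    s = zero(eltype(A))
    # Safe only because the range was validated just above.
    @inbounds for i in lo:hi
        s += A[i]
    end
    return s
end

sum_range([1.0, 2.0, 3.0, 4.0], 2, 3)
```

If the `checkbounds` call were dropped, a bad `lo` or `hi` would read arbitrary memory instead of raising an error, which is exactly the failure mode described above.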

Observe for example that even with explicit indexing, LLVM can still eliminate bounds checks (unfortunately only on Julia master right now):

```julia
julia> function _sum(A)
           s = 0.0
           for i in eachindex(A)
               s += A[i]
           end
           s
       end
_sum (generic function with 1 method)

julia> function _sum2(A)
           s = 0.0
           @inbounds @simd for i in eachindex(A)
               s += A[i]
           end
           s
       end
_sum2 (generic function with 1 method)

julia> function _sum3(A)
           s = 0.0
           @simd for i in eachindex(A)
               s += A[i]
           end
           s
       end
_sum3 (generic function with 1 method)

julia> @benchmark _sum($a)
BenchmarkTools.Trial: 591 samples with 1 evaluation.
 Range (min … max):  8.116 ms …   9.745 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     8.422 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.447 ms ± 134.948 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                 ▃▁▁▇▆█▄█▄▅▄   ▃▂
  ▂▁▁▂▁▁▃▂▃▃▅▂▇█▇█████████████████▅▃▆▆▅▄▃▃▃▄▂▁▁▁▃▁▂▁▃▃▂▂▁▂▁▁▃ ▄
  8.12 ms         Histogram: frequency by time        8.91 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark _sum2($a)
BenchmarkTools.Trial: 1174 samples with 1 evaluation.
 Range (min … max):  3.880 ms …   5.134 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.109 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.241 ms ± 279.201 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

      ▁▄▆▆█▆▇▄▁▁
  ▁▂▄▅██████████▆▆▅▄▃▄▂▃▂▄▃▄▃▅▄▂▄▃▄▄▃▄▄▄▄▃▃▃▄▄▃▃▄▄▃▂▃▄▃▂▂▃▃▂▂ ▃
  3.88 ms         Histogram: frequency by time        4.92 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark _sum3($a)
BenchmarkTools.Trial: 1188 samples with 1 evaluation.
 Range (min … max):  3.781 ms …   5.362 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.064 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.191 ms ± 286.753 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▂▂▅█▅▃▆▃
  ▂▃▃▄▄▇▇██████████▆▄▅▄▃▄▅▄▃▄▄▄▃▃▅▄▄▄▃▅▄▄▅▃▄▃▄▅▆▅▅▄▃▃▃▃▃▂▂▁▂▂ ▄
  3.78 ms         Histogram: frequency by time        4.94 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.
```
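Timings alone only suggest that the check is gone. A more direct way to look is to inspect the generated LLVM IR. The sketch below captures the IR as a string and searches it for a bounds-error call. The function name `sum_indexed` is illustrative, and the substring heuristic is an assumption about how the runtime's error function is named, so treat the result as a hint rather than proof; what it prints depends on the Julia version.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

function sum_indexed(A)
    s = 0.0
    for i in eachindex(A)
        s += A[i]
    end
    return s
end

# Capture the optimized LLVM IR for a Vector{Float64} argument.
ir = sprint() do io
    code_llvm(io, sum_indexed, (Vector{Float64},); debuginfo=:none)
end

# Assumption: a surviving bounds check appears as a call to a runtime
# function whose name contains "bounds_error".
has_bounds_check = occursin("bounds_error", ir)
println("bounds-error call present: ", has_bounds_check)
```

In the REPL, `@code_llvm debuginfo=:none sum_indexed(a)` shows the same IR for reading by eye.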
