Evaluate how feasible it would be to combine loads of consecutive fields into a single ldp instead of loading each field separately at its point of use. This cannot be handled as a peephole optimization; an analysis is needed in an earlier phase to look for consecutive field loads and, when it finds them, combine them into a single load.
class Body { public double x, y, z, vx, vy, vz, mass; }
...
foreach (var b in bodies) {
    b.x += dt * b.vx;
    b.y += dt * b.vy;
    b.z += dt * b.vz;
}
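The code below is generated for the loop body, which multiplies and adds doubles. The marked instructions could be combined as follows:
#1 (loads of x and y at [x3,#8] and [x3,#16]) can be combined into ldp dX, dY, [x3, #8]
#2 (loads of vx and vy at [x3,#32] and [x3,#40]) can be combined into ldp dX, dY, [x3, #32]
#3 (stores of x and y at [x3,#8] and [x3,#16]) can be combined into stp dX, dY, [x3, #8]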
G_M56457_IG05: ;; offset=0114H
D37D7C43 ubfiz x3, x2, #3, #32
91004063 add x3, x3, #16
F8636803 ldr x3, [x0, x3]
FD400470 ldr d16, [x3,#8] ; <-- #1
FD401071 ldr d17, [x3,#32] ; <-- #2
1E710811 fmul d17, d0, d17
1E712A10 fadd d16, d16, d17
FD000470 str d16, [x3,#8] ; <-- #3
FD400870 ldr d16, [x3,#16] ; <-- #1
FD401471 ldr d17, [x3,#40] ; <-- #2
1E710811 fmul d17, d0, d17
1E712A10 fadd d16, d16, d17
FD000870 str d16, [x3,#16] ; <-- #3
FD400C70 ldr d16, [x3,#24]
FD401871 ldr d17, [x3,#48]
1E710811 fmul d17, d0, d17
1E712A10 fadd d16, d16, d17
FD000C70 str d16, [x3,#24]
11000442 add w2, w2, #1
6B02003F cmp w1, w2
54FFFD8C bgt G_M56457_IG05
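The reason a peephole pass falls short can be seen in the listing above: the two loads marked #1 are separated by four other instructions, including a store to [x3,#8]. As a rough illustration only, the C# sketch below models the memory accesses of one loop iteration and looks for same-base, adjacent-offset pairs with no conflicting access in between. The names Access, AccessKind, PairFinder, FindPairs, and LoopBodyAccesses are hypothetical and are not JIT types or APIs. The model ignores register data dependencies and conservatively treats accesses off different base registers as possibly aliasing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one memory access; not a type from the JIT.
public enum AccessKind { Load, Store }

public readonly record struct Access(AccessKind Kind, string BaseReg, int Offset, int Size);

public static class PairFinder
{
    // The nine 8-byte accesses of one loop iteration, in listing order.
    public static List<Access> LoopBodyAccesses() => new List<Access>
    {
        new Access(AccessKind.Load,  "x3",  8, 8), // x   (#1)
        new Access(AccessKind.Load,  "x3", 32, 8), // vx  (#2)
        new Access(AccessKind.Store, "x3",  8, 8), // x   (#3)
        new Access(AccessKind.Load,  "x3", 16, 8), // y   (#1)
        new Access(AccessKind.Load,  "x3", 40, 8), // vy  (#2)
        new Access(AccessKind.Store, "x3", 16, 8), // y   (#3)
        new Access(AccessKind.Load,  "x3", 24, 8), // z
        new Access(AccessKind.Load,  "x3", 48, 8), // vz
        new Access(AccessKind.Store, "x3", 24, 8), // z
    };

    // Returns index pairs (i, j), i < j, of same-kind, same-base, same-size
    // accesses at adjacent offsets that could in principle become one ldp/stp.
    public static List<(int First, int Second)> FindPairs(IReadOnlyList<Access> accesses)
    {
        var pairs = new List<(int First, int Second)>();
        var used = new bool[accesses.Count];
        for (int i = 0; i < accesses.Count; i++)
        {
            if (used[i]) continue;
            for (int j = i + 1; j < accesses.Count; j++)
            {
                if (used[j]) continue;
                Access a = accesses[i], b = accesses[j];
                bool adjacent = a.Kind == b.Kind && a.BaseReg == b.BaseReg
                    && a.Size == b.Size && Math.Abs(a.Offset - b.Offset) == a.Size;
                if (adjacent && !HasConflictBetween(accesses, i, j))
                {
                    pairs.Add((i, j));
                    used[i] = used[j] = true;
                    break;
                }
            }
        }
        return pairs;
    }

    private static bool HasConflictBetween(IReadOnlyList<Access> accesses, int i, int j)
    {
        Access first = accesses[i], second = accesses[j];
        for (int k = i + 1; k < j; k++)
        {
            Access mid = accesses[k];
            if (first.Kind == AccessKind.Load)
            {
                // The second load is hoisted up to the first, so it must not
                // cross a store that may write the second location.
                if (mid.Kind == AccessKind.Store && MayOverlap(mid, second)) return true;
            }
            else if (MayOverlap(mid, first))
            {
                // The first store is sunk down to the second, so it must not
                // cross any access that may touch the first location.
                return true;
            }
        }
        return false;
    }

    // Conservative: different base registers are assumed to possibly alias.
    private static bool MayOverlap(Access a, Access b) =>
        a.BaseReg != b.BaseReg
        || (a.Offset < b.Offset + b.Size && b.Offset < a.Offset + a.Size);
}
```

Under these assumptions, PairFinder.FindPairs(PairFinder.LoopBodyAccesses()) reports the index pairs (0, 3), (1, 4), and (2, 5), which correspond to #1, #2, and #3. The accesses to z and vz at offsets 24 and 48 stay unpaired because no neighbouring field remains to pair them with.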
Reference: https://godbolt.org/z/9jY5hYnoa
category:implementation
theme:codegen
skill-level:intermediate
cost:medium
impact:medium