Replace parallel row iteration with sequential loops in Crop/Convolution Processors#3113
Replace parallel row iteration with sequential loops in Crop/Convolution Processors#3113JimBobSquarePants wants to merge 12 commits intomainfrom
Conversation
antonfirsov
left a comment
There was a problem hiding this comment.
I think this should get rid of the operation types, they only scatter the code, which doesn't help readability.
|
|
||
| for (int y = interest.Top; y < interest.Bottom; y++) | ||
| { | ||
| operation.Invoke(y, span); |
There was a problem hiding this comment.
Is there any reason for Convolution2DRowOperation isolation? IMO removing it/moving it to a private method would simplify code.
There was a problem hiding this comment.
None other than I want to make minimal changes.
| in operation); | ||
| for (int y = bounds.Top; y < bounds.Bottom; y++) | ||
| { | ||
| operation.Invoke(y); |
There was a problem hiding this comment.
The operation type is clearly unneeded here.
There was a problem hiding this comment.
I just wanted to do the minimal amount of work required to fix the issues for now.
| MemoryAllocator allocator = this.Configuration.MemoryAllocator; | ||
|
|
There was a problem hiding this comment.
| MemoryAllocator allocator = this.Configuration.MemoryAllocator; |
There was a problem hiding this comment.
Would it make sense to enable warnings that help detecting unused locals or dead code?
There was a problem hiding this comment.
It was weird I missed that. I'll see what I can enable. Fixed anyway. I didn't use your suggestion as there was an addition rogue using to remove.
Prerequisites
Description
See #3111 for relevant details.
Changes to how convolution and crop operations are parallelized. The main focus is on improving performance by removing unnecessary parallelization in memory-bandwidth-bound operations, which can actually degrade performance due to cache contention.
Additionally, it updates the gamma correction logic in the Bokeh blur processor for improved performance. Bokeh and Median blur have been left parallel as they contain significant per-task work.
Relevant reference images have also been updated to reflect minor changes in the output due to earlier clamping.