Implement an autotuning approach to the vectorization decision
Tasks:
- Generate a standalone C++ file for benchmarking a given sum factorization kernel
- Implement a way to do JIT compilation of benchmark programs (codepy?)
- Asynchronize the cost function evaluation (instead of using `min`)
- Add a `pickle`-cache for benchmarking results
- Add an `autotune` cost model implementation
- Think of an interface how benchmark programs communicate their measurements to the code generator
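
As a starting point for discussion, here is a minimal sketch of how the cache and the asynchronous cost evaluation might fit together. Everything in it is an assumption for illustration: the names `BenchmarkCache` and `autotune`, the choice of a SHA-256 hash of the generated source as the cache key, and the `measure` callable, which stands in for "compile and run the benchmark program, return a runtime". It does not use codepy and does not settle how measurements are reported back.

```python
import hashlib
import os
import pickle
import tempfile
from concurrent.futures import ThreadPoolExecutor


class BenchmarkCache:
    """Hypothetical pickle-backed cache: kernel source string -> measured cost."""

    def __init__(self, path):
        self.path = path
        self.results = {}
        # Reuse measurements from earlier code generator runs, if any.
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.results = pickle.load(f)

    @staticmethod
    def key(source):
        # Assumption: the generated benchmark source identifies a measurement.
        return hashlib.sha256(source.encode("utf-8")).hexdigest()

    def lookup(self, source, measure):
        # Only call the (expensive) measurement on a cache miss.
        k = self.key(source)
        if k not in self.results:
            self.results[k] = measure(source)
        return self.results[k]

    def save(self):
        # Write to a temporary file first, then rename it over the cache file,
        # so an interrupted run cannot leave a truncated pickle behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump(self.results, f)
        os.replace(tmp, self.path)


def autotune(candidates, measure, cache, max_workers=4):
    """Measure all candidates concurrently and return (best_source, best_cost)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        costs = list(pool.map(lambda src: cache.lookup(src, measure), candidates))
    cache.save()
    best = min(range(len(candidates)), key=costs.__getitem__)
    return candidates[best], costs[best]
```

A thread pool is used here on the assumption that `measure` spends its time waiting on an external compiler and benchmark process. If the same candidate appears twice in one call it may be measured twice, which is harmless but wasteful. Running several benchmarks at once can also distort timings, so `max_workers=1` for the measurement step may be the safer default.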