Vectorize and optimize simple C++ algorithm with NEON instructions for the Freescale i.MX6 ARM Development Board
$30-250 USD
Completed
Posted about 8 years ago
Paid on delivery
We have a short C++ code snippet (fewer than 50 lines to optimize) that runs on a Freescale i.MX6 ARM development board. Your task is to optimize the run time of the given implementation using C++, NEON, and ASM optimization techniques. The data is aligned and padded, so full 128-bit vector operations can be used. The operations are 32-bit floating point, and the sizes of the memory chunks to be operated on are known at compile time, which should enable further optimization for our specific setup.
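To illustrate the kind of transformation we have in mind, here is a minimal sketch using NEON intrinsics. The actual snippet is only delivered in the .zip file, so the kernel below (y[i] += a * x[i]) and the name axpy are purely hypothetical stand-ins. The sketch assumes the buffer length N is a compile-time constant padded to a multiple of 4 floats, and it falls back to a scalar loop when NEON is not available.

```cpp
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical kernel (not the real snippet): y[i] += a * x[i].
// N is known at compile time and padded to a multiple of 4 floats,
// so every iteration can process one full 128-bit vector.
template <std::size_t N>
void axpy(float a, const float* x, float* y) {
    static_assert(N % 4 == 0, "N must be padded to a multiple of 4 floats");
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const float32x4_t va = vdupq_n_f32(a);       // broadcast a to all 4 lanes
    for (std::size_t i = 0; i < N; i += 4) {
        const float32x4_t vx = vld1q_f32(x + i); // load 4 floats from x
        float32x4_t vy = vld1q_f32(y + i);       // load 4 floats from y
        vy = vmlaq_f32(vy, va, vx);              // vy = vy + va * vx
        vst1q_f32(y + i, vy);                    // store 4 results back to y
    }
#else
    // Scalar fallback so the sketch also builds on non-NEON hosts.
    for (std::size_t i = 0; i < N; ++i) {
        y[i] += a * x[i];
    }
#endif
}
```

Calling axpy<1024>(2.0f, x, y) would process 1024 floats with a trip count the compiler can fully see, which gives it room to unroll. On a 32-bit ARM target such as the i.MX6, GCC typically needs -mfpu=neon for the intrinsics to be available; the exact flags for our Makefile may differ.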
We expect a minimum speedup of a factor of 2. For each additional factor of 1.0 in speedup, we will pay an extra 20%.
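As a rough sketch of how the speedup and the resulting bonus could be computed from timings: the helper names below are hypothetical, and the bonus rule assumes that only whole additional factors beyond 2x count (3x gives 20%, 4x gives 40%). That rounding is an assumption made for illustration, not a settled term.

```cpp
#include <chrono>
#include <cmath>

// Runs fn `reps` times and returns the average wall-clock seconds per call.
template <typename F>
double seconds_per_call(F&& fn, int reps) {
    const auto t0 = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        fn();
    }
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count() / reps;
}

// Speedup factor: baseline time divided by optimized time.
inline double speedup(double baseline_s, double optimized_s) {
    return baseline_s / optimized_s;
}

// ASSUMPTION: only whole additional factors beyond 2x earn a bonus.
// Returns the extra payment in percent of the base price.
inline int bonus_percent(double s) {
    if (s < 2.0) {
        return 0; // below the required minimum speedup
    }
    return 20 * static_cast<int>(std::floor(s - 2.0));
}
```

When timing real kernels, the results should be written to memory that is read afterwards, so that the compiler cannot remove the measured work.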
We will provide SSH access to a Boundary Devices development board with the GCC toolchain for testing and measurements. The code fragment will be delivered in a .zip file and has no dependencies (Makefile included).
Constraints:
1. You must be fluent in English or German.
2. You must be familiar with programming NEON and ASM.
3. You should have experience with optimizing calculations for ARM.
4. Please deliver your results with measurements, and a short description.
Hello. I have experience with the ARM NEON instruction set. I am not sure how much it would speed things up, but I will try my best. It would be great if we could discuss the requirements. I hope you will get in touch. Thanks!