    • Debugging on Intel is straightforward; refer to "Using the Intel® OpenCL SDK Debugger"

      • Pass the path of the CL file when you build the CL program

        • Note: Use the original path, not the path of a copy.

          • Example: if a post-build step copies your CL file, pass the path of the original file, not the path of the copy that is loaded while debugging.
        • Note: The path must be a full (absolute) path.

      • Enable Intel OpenCL SDK debugging under Tools > Intel SDK

        • Note: Leaving the work item at its default of 0,0,0 is enough.
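
The path rules above can be sketched as a small helper that produces the options string for clBuildProgram. This is only a sketch under assumptions: the name `debug_build_options` is hypothetical, and the `-g -s "<path>"` option form is assumed from the Intel OpenCL SDK debugger documentation, so check it against your SDK version.

```python
import os


def debug_build_options(original_cl_path):
    """Return a build-options string for debugging a CL file.

    Assumption: the Intel debugger expects `-g -s "<full path>"`.
    """
    # abspath only makes the path absolute. The caller must still pass
    # the original source file, not a copy made by a post-build step.
    full_path = os.path.abspath(original_cl_path)
    return '-g -s "{}"'.format(full_path)
```

The returned string would be passed as the options argument of clBuildProgram.
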
  • OpenCL kernels need a warm-up

    • Run any other kernel code first (even from a different application); it speeds up your main CL kernel code

    • On AMD, the magic number is two: run a dummy kernel twice

      • Test results (software/CPU baseline: 160ms):

        • Intel:

          • 1st time: setup 700ms, effect 160ms
        • AMD:

          • 1st time

            • setup 6000ms

            • effect 16ms

          • 2nd time

            • setup 160ms

            • effect 16ms (it sometimes drops to 0ms)

      • Pre-loading the *.cl file can save some time, but there is no way to reduce the clBuildProgram time. (The program caches the previous result, even if you reset the arguments twice.)

      • Running the same kernel code twice:

        • After the second run (whether or not it is the same program), the CL kernel is cached and runs very fast.

        • On the second run, the CL setup code is faster than on the first (observed on an AMD GPU)

        • Note: IVB (Ivy Bridge) does not have this issue.
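
The warm-up measurements above can be reproduced with a small timing harness. This is a minimal sketch, not the code used for the numbers in these notes: `run_kernel` is a hypothetical callable standing in for whatever enqueues your kernel and blocks until it finishes, and the default of two warm-up runs simply mirrors the "run twice" observation for AMD.

```python
import time


def time_runs(run_kernel, runs=3):
    """Call run_kernel `runs` times and return each duration in ms."""
    durations_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        run_kernel()  # must block until the kernel has finished
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def warm_up_then_time(run_kernel, warm_up_runs=2, timed_runs=3):
    """Time the warm-up runs separately from the runs that follow."""
    warm_up_ms = time_runs(run_kernel, warm_up_runs)
    timed_ms = time_runs(run_kernel, timed_runs)
    return warm_up_ms, timed_ms


if __name__ == "__main__":
    calls = {"n": 0}

    def fake_kernel():
        # Stand-in for a real kernel: the first call is made slower
        # on purpose to imitate a cold start.
        calls["n"] += 1
        time.sleep(0.05 if calls["n"] == 1 else 0.001)

    warm, timed = warm_up_then_time(fake_kernel)
    print("warm-up runs (ms):", warm)
    print("timed runs (ms):", timed)
```

Comparing the first entry of the warm-up list with the timed list shows how much of the cost is one-time setup on a given device.
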

