- Related code survey findings:
    - Debugging on Intel is easy; refer to "Using the Intel® OpenCL SDK Debugger".
        - Add the source file path when you build the CL program.
            - Note: Use the original path, not the path of a copy. Example: if a post-build step copies your CL file, pass the original file's path, not the path of the copied CL code.
            - Note: The path must be a full (absolute) path.
        - Enable Intel OpenCL SDK debugging in the tool, under Intel SDK.
            - Note: Leaving the work item at the default 0,0,0 is enough.
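The two path notes above can be sketched as a small helper that assembles the build options string. This is a minimal sketch, assuming the `-g -s "<full path>"` debug options described in Intel's OpenCL SDK debugger documentation. The helper name `make_debug_build_options` is hypothetical, and no OpenCL call is made, so the snippet compiles without the OpenCL headers.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `-g -s "<cl_path>"` into out.
 * Returns 0 on success, -1 if the path is missing, not absolute, or too long. */
static int make_debug_build_options(const char *cl_path, char *out, size_t out_size)
{
    if (cl_path == NULL || out == NULL || out_size == 0)
        return -1;

    size_t len = strlen(cl_path);
    /* Treat a leading slash or backslash, or a drive prefix such as C:/, as absolute. */
    int is_abs = (len >= 1 && (cl_path[0] == '/' || cl_path[0] == '\\')) ||
                 (len >= 3 && cl_path[1] == ':' &&
                  (cl_path[2] == '\\' || cl_path[2] == '/'));
    if (!is_abs)
        return -1;

    /* Quote the path so that directories containing spaces survive option parsing. */
    int n = snprintf(out, out_size, "-g -s \"%s\"", cl_path);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting string would be passed as the `options` argument of `clBuildProgram`, built from the path of the original `.cl` file rather than a post-build copy.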
        
- OpenCL kernels need a warm-up.
    - Run any other kernel code first (even from a different application); it speeds up your main CL kernel code.
    - AMD's magic number is to run a dummy kernel "twice".
        - Test results, setup time vs. "effect" (run) time; the SW/CPU baseline is 160 ms:
            - Intel: 1st run setup 700 ms, effect 160 ms
            - AMD:
                - 1st run: setup 6000 ms, effect 16 ms
                - 2nd run: setup 160 ms, effect 16 ms (sometimes drops to 0 ms)
                
        - Pre-loading the *.cl file can save some time, but there is no way to reduce the clBuildProgram time. (The program caches its previous result, even if you set the arguments again.)
        - Running the same kernel code twice:
            - From the second run on, the CL kernel is cached and runs very fast, whether or not it is launched from the same program.
            - On the second run, the CL setup code is faster than on the first run (observed on an AMD GPU).
            - Note: IVB (Ivy Bridge) does not have this issue.
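The warm-up observations above suggest discarding the first run or two before timing anything. The following is a minimal sketch, assuming the kernel launch is wrapped in a callback. The names `run_fn`, `time_with_warmup`, and `count_call` are hypothetical. A real callback would enqueue the kernel and call `clFinish` before returning; `count_call` is only a stand-in that counts invocations.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one complete launch of the work to be timed. */
typedef void (*run_fn)(void *user_data);

/* Wall-clock time in milliseconds (C11 timespec_get). */
static double now_ms(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

/* Runs `run` warmup_runs times untimed, then timed_runs times timed.
 * Returns the average time of the timed runs in ms, or -1.0 on bad input. */
static double time_with_warmup(run_fn run, void *user_data,
                               int warmup_runs, int timed_runs)
{
    if (run == NULL || warmup_runs < 0 || timed_runs <= 0)
        return -1.0;

    for (int i = 0; i < warmup_runs; ++i)
        run(user_data);            /* untimed: absorbs the one-time setup cost */

    double start = now_ms();
    for (int i = 0; i < timed_runs; ++i)
        run(user_data);
    return (now_ms() - start) / timed_runs;
}

/* Stand-in for a kernel launch: increments the int that user_data points to. */
static void count_call(void *user_data)
{
    ++*(int *)user_data;
}
```

Calling `time_with_warmup(count_call, &calls, 2, 3)` performs two untimed runs, matching the "run twice" note, followed by three timed runs. The warm-up count is a parameter because the notes report different behaviour on Intel and AMD.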
        
- References:
    - Basic OpenCL concepts and a sample program: OpenCL 教學(一) (OpenCL Tutorial, Part 1) - Hotball's Hive
    - Advanced concepts and clarifying commentary on OpenCL: Richard's blog: OpenCL 介紹 (Introduction to OpenCL)
