Intel® C++ Compiler Cilk Language Extensions (PDF download)
Resource description
Intel® C++ Compiler Cilk Language Extensions
1. Introduction
1.1 Intended Audience
1.2 Prerequisites
1.3 Typographic Conventions
1.4 Additional Resources and Information
2. Getting Started
2.1 Building and Running a Cilk Example
2.1.1 Building qsort
2.1.2 Running qsort
2.1.3 Observing Speedup on a Multicore System
2.2 Converting a C++ Program
2.2.1 Start with a Serial Program
2.2.2 Add Parallelism Using _Cilk_spawn
2.2.3 Build, Execute, and Test
3. Building, Running, and Debugging Cilk Programs
3.1 Setting the Number of Workers
3.1.1 Environment Variable
3.1.2 Program Control
3.2 Serialization
3.2.1 How to Create a Serialization
3.3 Debugging Strategies
4. Cilk Language Features
5. Cilk Keywords
5.1 cilk_spawn
5.2 cilk_sync
5.3 cilk_for
5.3.1 Serial/Parallel Structure of cilk_for
5.3.2 Serial/Parallel Structure When Spawning Within a Serial Loop
5.3.3 cilk_for Body
5.3.4 cilk_for Type Requirements
5.3.5 cilk_for Restrictions
5.3.6 cilk_for Grain Size
5.4 Preprocessor Macros
6. Cilk Execution Model
6.1 Strands
6.2 Work and Span
6.3 Mapping Strands to Workers
6.4 Exception Handling
7. Reducers
7.1 Using Reducers: A Simple Example
7.2 How Reducers Work
7.3 Safety and Performance Considerations
7.3.1 Safety
7.3.2 Determinism
7.3.3 Performance
7.4 Reducer Library
7.5 Using Reducers: Another Example
7.5.1 String Reducer
7.5.2 List Reducer (with a User-Defined Type)
7.5.3 Reducers in Recursive Functions
8. Operating System Considerations
8.1 Using Other Tools with Cilk Programs
8.2 General Interaction with Operating System Threads
8.3 Microsoft Foundation Classes and Cilk Programs
9. Cilk Runtime System API
9.1 __cilkrts_set_param
9.2 __cilkrts_get_nworkers
9.3 __cilkrts_get_worker_number
9.4 __cilkrts_get_total_workers
10. Understanding Race Conditions
10.1 Data Races
10.2 Benign Races
10.3 Resolving Data Races
10.3.1 Fix a Bug in the Program
10.3.2 Use Local Variables Instead of Global Variables
10.3.3 Restructure the Code
10.3.4 Change the Algorithm
10.3.5 Use a Reducer
10.3.6 Use Locks
11. Considerations When Using Locks
11.1 Determinacy Races Caused by Locks
11.2 Deadlocks
11.3 Effect of Lock Contention on Parallelism
11.4 Locks Held Across Strand Boundaries
12. Performance Considerations for Cilk Programs
12.1 Granularity
12.2 Optimize the Serial Program First
12.3 Timing Programs and Program Segments
12.4 Common Performance Pitfalls
12.5 Cache Efficiency and Memory Bandwidth
12.6 False Sharing
12.7 Memory Allocation Bottlenecks
Appendix A. How to Write a New Reducer
Components of a Reducer
The Identity Value
The Monoid
Writing a Reducer: A "Holder" Example
Appendix B. Further Reading
General Cilk Information
Serial Semantics
Examples
Race Conditions