- Authors: David A. Patterson and John L. Hennessy
- Language: English
- Published: 2014
- Pages: 793
- Format: pdf
- Size: 29 MB
CONTENTS
1 Computer Abstractions and Technology 2
1.1 Introduction 3
1.2 Eight Great Ideas in Computer Architecture 11
1.3 Below Your Program 13
1.4 Under the Covers 16
1.5 Technologies for Building Processors and Memory 24
1.6 Performance 28
1.7 The Power Wall 40
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors 43
1.9 Real Stuff: Benchmarking the Intel Core i7 46
1.10 Fallacies and Pitfalls 49
1.11 Concluding Remarks 52
1.12 Historical Perspective and Further Reading 54
1.13 Exercises 54
2 Instructions: Language of the Computer 60
2.1 Introduction 62
2.2 Operations of the Computer Hardware 63
2.3 Operands of the Computer Hardware 66
2.4 Signed and Unsigned Numbers 73
2.5 Representing Instructions in the Computer 80
2.6 Logical Operations 87
2.7 Instructions for Making Decisions 90
2.8 Supporting Procedures in Computer Hardware 96
2.9 Communicating with People 106
2.10 MIPS Addressing for 32-Bit Immediates and Addresses 111
2.11 Parallelism and Instructions: Synchronization 121
2.12 Translating and Starting a Program 123
2.13 A C Sort Example to Put It All Together 132
2.14 Arrays versus Pointers 141
2.15 Advanced Material: Compiling C and Interpreting Java 145
2.16 Real Stuff: ARMv7 (32-bit) Instructions 145
2.17 Real Stuff: x86 Instructions 149
2.18 Real Stuff: ARMv8 (64-bit) Instructions 158
2.19 Fallacies and Pitfalls 159
2.20 Concluding Remarks 161
2.21 Historical Perspective and Further Reading 163
2.22 Exercises 164
3 Arithmetic for Computers 176
3.1 Introduction 178
3.2 Addition and Subtraction 178
3.3 Multiplication 183
3.4 Division 189
3.5 Floating Point 196
3.6 Parallelism and Computer Arithmetic: Subword Parallelism 222
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86 224
3.8 Going Faster: Subword Parallelism and Matrix Multiply 225
3.9 Fallacies and Pitfalls 229
3.10 Concluding Remarks 232
3.11 Historical Perspective and Further Reading 236
3.12 Exercises 237
4 The Processor 242
4.1 Introduction 244
4.2 Logic Design Conventions 248
4.3 Building a Datapath 251
4.4 A Simple Implementation Scheme 259
4.5 An Overview of Pipelining 272
4.6 Pipelined Datapath and Control 286
4.7 Data Hazards: Forwarding versus Stalling 303
4.8 Control Hazards 316
4.9 Exceptions 325
4.10 Parallelism via Instructions 332
4.11 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Pipelines 344
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply 351
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations 354
4.14 Fallacies and Pitfalls 355
4.15 Concluding Remarks 356
4.16 Historical Perspective and Further Reading 357
4.17 Exercises 357
5 Large and Fast: Exploiting Memory Hierarchy 372
5.1 Introduction 374
5.2 Memory Technologies 378
5.3 The Basics of Caches 383
5.4 Measuring and Improving Cache Performance 398
5.5 Dependable Memory Hierarchy 418
5.6 Virtual Machines 424
5.7 Virtual Memory 427
5.8 A Common Framework for Memory Hierarchy 454
5.9 Using a Finite-State Machine to Control a Simple Cache 461
5.10 Parallelism and Memory Hierarchies: Cache Coherence 466
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 470
5.12 Advanced Material: Implementing Cache Controllers 470
5.13 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies 471
5.14 Going Faster: Cache Blocking and Matrix Multiply 475
5.15 Fallacies and Pitfalls 478
5.16 Concluding Remarks 482
5.17 Historical Perspective and Further Reading 483
5.18 Exercises 483
6 Parallel Processors from Client to Cloud 500
6.1 Introduction 502
6.2 The Difficulty of Creating Parallel Processing Programs 504
6.3 SISD, MIMD, SIMD, SPMD, and Vector 509
6.4 Hardware Multithreading 516
6.5 Multicore and Other Shared Memory Multiprocessors 519
6.6 Introduction to Graphics Processing Units 524
6.7 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors 531
6.8 Introduction to Multiprocessor Network Topologies 536
6.9 Communicating to the Outside World: Cluster Networking 539
6.10 Multiprocessor Benchmarks and Performance Models 540
6.11 Real Stuff: Benchmarking Intel Core i7 versus NVIDIA Tesla GPU 550
6.12 Going Faster: Multiple Processors and Matrix Multiply 555
6.13 Fallacies and Pitfalls 558
6.14 Concluding Remarks 560
6.15 Historical Perspective and Further Reading 563
6.16 Exercises 563