I am a Professor in the Computer Science Division of the EECS Department at the University of California, Berkeley. My main research areas are computer architecture, VLSI design, parallel programming and operating system design. I am Director of the new ASPIRE lab tackling the challenge of improving computational efficiency now that transistor scaling is ending. ASPIRE builds upon the earlier success of the Par Lab, whose goal was to make parallel programming accessible to most programmers. I also lead the Architecture Group at the International Computer Science Institute, am an Associate Director at the Berkeley Wireless Research Center, and hold a joint appointment with the Lawrence Berkeley National Laboratory. Previously at MIT, I led the SCALE group, investigating advanced architectures for energy-efficient high-performance computing.

Active Research Projects
(still under construction)

The ASPIRE Lab

ASPIRE is a new 5-year research project that recognizes the shift from transistor-scaling-driven performance improvements to a post-scaling world where whole-stack co-design is the key to improved efficiency. Building on the success of the soon-to-be-completed Par Lab project, it uses deep hardware and software co-tuning to achieve the highest possible performance and energy efficiency for future mobile and rack computing systems.

The Parallel Computing Laboratory (Par Lab)

With the end of sequential processor performance scaling, multicore processors provide the only path to increased performance and energy efficiency in all platforms from mobile to warehouse-scale computers. The Par Lab was created by a team of Berkeley researchers with the ambitious goal of enabling "most programmers to be productive writing efficient, correct, portable SW for 100+ cores & scale as cores increase every 2 years".

Graph Algorithm Platform

Graph algorithms are becoming increasingly important, from warehouse-scale computers reasoning about vast amounts of data for analytics and recommendation applications to mobile clients running recognition and machine-learning applications. Unfortunately, graph algorithms execute inefficiently on current platforms, either shared-memory systems or distributed clusters. The Berkeley Graph Algorithm Platform (GAP) Project is a Par Lab project that spans the entire stack, aiming to accelerate graph algorithms through software optimization and hardware acceleration.
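
As a concrete example of the kind of kernel GAP targets, here is a minimal sketch (illustrative Scala, not GAP code) of a level-synchronous breadth-first search over a graph stored in compressed sparse row (CSR) form; the data-dependent, low-locality memory accesses in the inner loop are what make such workloads run inefficiently on current platforms:

    // Toy level-synchronous BFS over a CSR graph; illustrative only, not GAP code.
    object ToyBFS {
      // CSR layout: the neighbors of vertex v sit at indices offsets(v) until offsets(v + 1).
      final case class CSRGraph(offsets: Array[Int], neighbors: Array[Int]) {
        def numVertices: Int = offsets.length - 1
      }

      // Returns the BFS parent of every vertex (-1 if unreached).
      def bfs(g: CSRGraph, source: Int): Array[Int] = {
        val parent = Array.fill(g.numVertices)(-1)
        parent(source) = source
        var frontier = Array(source)
        while (frontier.nonEmpty) {
          val next = scala.collection.mutable.ArrayBuffer[Int]()
          for (u <- frontier; i <- g.offsets(u) until g.offsets(u + 1)) {
            val v = g.neighbors(i)        // irregular, data-dependent access
            if (parent(v) == -1) {
              parent(v) = u
              next += v
            }
          }
          frontier = next.toArray
        }
        parent
      }

      def main(args: Array[String]): Unit = {
        // Path graph 0-1-2-3, undirected, so each edge is stored in both directions.
        val g = CSRGraph(Array(0, 1, 3, 5, 6), Array(1, 0, 2, 1, 3, 2))
        println(bfs(g, 0).mkString(" "))  // prints: 0 0 1 2
      }
    }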

A Liquid Thread Environment

Applications built by composing different parallel libraries perform poorly when those libraries interfere with one another by obliviously using the same physical cores, leading to destructive resource oversubscription. Lithe was developed in Par Lab as a low-level substrate that provides basic primitives and a standard interface for composing parallel libraries efficiently. Lithe can be inserted underneath the runtimes of legacy parallel libraries to provide bolt-on composability without needing to change existing application code.
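
The sketch below (illustrative Scala, not the actual Lithe API, which is a C substrate with its own callback interface) shows the failure mode Lithe is designed to eliminate: two libraries written in isolation each size a worker pool to the whole machine, so composing them creates twice as many software threads as hardware contexts:

    // Toy illustration of oblivious composition; not Lithe code.
    import java.util.concurrent.Executors

    object ObliviousComposition {
      def main(args: Array[String]): Unit = {
        val nCores = Runtime.getRuntime.availableProcessors()

        // "Library A" and "Library B" each assume they own the machine,
        // so together they create 2 * nCores workers for nCores cores.
        val libA = Executors.newFixedThreadPool(nCores)
        val libB = Executors.newFixedThreadPool(nCores)
        (1 to 4 * nCores).foreach { _ =>
          libA.submit(new Runnable { def run(): Unit = Thread.sleep(10) })
          libB.submit(new Runnable { def run(): Unit = Thread.sleep(10) })
        }
        println(s"$nCores cores, ${2 * nCores} worker threads competing for them")

        // Under a Lithe-style substrate, a parent scheduler would instead grant
        // each library runtime a disjoint share of hardware contexts, so no
        // core is ever oversubscribed.
        libA.shutdown(); libB.shutdown()
      }
    }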

Tessellation OS

Tessellation is a manycore OS developed within Par Lab and targeted at the resource management challenges of emerging client devices. Tessellation is built on two central ideas: Space-Time Partitioning and Two-Level Scheduling.

The RISC-V Instruction Set Architecture

RISC-V is a new instruction set architecture (ISA) developed at UC Berkeley as part of Par Lab. RISC-V is designed to be a realistic, clean, and open ISA that is easy to extend for research or subset for education. A wide variety of implementations have been produced, including silicon fabrications and FPGA emulations, and RISC-V is being used in a number of classes. A full set of software tools for the architecture is also under development and is being prepared for open distribution.
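
To give a flavor of the base encoding's regularity, the sketch below (a toy Scala decoder, not part of the RISC-V toolchain) extracts the fixed fields of a 32-bit RV32I R-type instruction; every R-type operation uses exactly these bit positions, which is part of what makes the ISA easy to extend or subset:

    // Toy decoder for the fixed fields of an RV32I R-type instruction word.
    object RTypeDecode {
      private def field(inst: Long, hi: Int, lo: Int): Long =
        (inst >> lo) & ((1L << (hi - lo + 1)) - 1)

      def decode(inst: Long): String = {
        val opcode = field(inst, 6, 0)    // bits  6..0
        val rd     = field(inst, 11, 7)   // bits 11..7
        val funct3 = field(inst, 14, 12)  // bits 14..12
        val rs1    = field(inst, 19, 15)  // bits 19..15
        val rs2    = field(inst, 24, 20)  // bits 24..20
        val funct7 = field(inst, 31, 25)  // bits 31..25
        f"opcode=0x$opcode%02x rd=x$rd rs1=x$rs1 rs2=x$rs2 funct3=$funct3 funct7=$funct7"
      }

      def main(args: Array[String]): Unit =
        println(decode(0x002081B3L))  // 0x002081B3 encodes "add x3, x1, x2"
    }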

Resiliency for Extreme Energy Efficiency

Manycore hardware designs have the potential to achieve maximum energy efficiency when operated across a broad range of supply voltages, spanning from nominal down to near the transistor threshold. We are working on new circuit and architectural techniques that enable parallel processors to work across this broad supply range while tolerating technology variability and providing immunity to both soft and hard errors.

Constructing Hardware in a Scala Embedded Language

Chisel is a new open-source hardware construction language developed at UC Berkeley that supports advanced hardware design using highly parameterized generators and layered domain-specific hardware languages. Chisel is embedded in the Scala programming language, which raises the level of hardware design abstraction by providing concepts including object orientation, functional programming, parameterized types, and type inference.
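
To give a feel for the language, here is a minimal parameterized counter generator written in present-day Chisel 3 syntax (the original Par Lab-era Chisel 2 syntax differed in details); the bit width is an ordinary Scala constructor argument, so one class describes a whole family of hardware modules:

    // A width-parameterized counter: a tiny example of a Chisel generator.
    import chisel3._

    class UpCounter(width: Int) extends Module {
      val io = IO(new Bundle {
        val enable = Input(Bool())
        val count  = Output(UInt(width.W))
      })
      val reg = RegInit(0.U(width.W))       // register reset to zero
      when (io.enable) { reg := reg + 1.U } // increment when enabled
      io.count := reg
    }

Because the generator is just Scala, instantiating an 8-bit or a 64-bit counter is simply new UpCounter(8) or new UpCounter(64), and the full power of Scala (functional combinators, object orientation, type parameters) is available when building larger generators.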

Monolithically Integrated CMOS Photonics

In collaboration with MIT, the University of Colorado at Boulder, and Micron Technology, we are exploring the use of silicon photonics to provide high-bandwidth, energy-efficient links between processors and memory.

DEGAS: Dynamic Exascale Global Address Space Programming Environments

The Dynamic Exascale Global Address Space (DEGAS) project will develop the next generation of programming models, runtime systems, and tools to meet the challenges of exascale systems.

DHOSA: Defending Against Hostile Operating Systems

The DHOSA research project focuses on building systems that will remain secure even when the operating system is compromised or hostile. DHOSA is a collaborative effort among researchers from Harvard, Stony Brook, UC Berkeley, the University of Illinois at Urbana-Champaign, and the University of Virginia.

Earlier Projects at UC Berkeley

RAMP: Research Accelerator for Multi-Processors

The RAMP project was a multi-university effort to develop new techniques for efficient FPGA-based emulation of novel parallel architectures, thereby overcoming the multicore simulation bottlenecks facing computer architecture researchers. At Berkeley, prototypes included the 1,008-processor RAMP Blue system and the RAMP Gold manycore emulator.

Earlier Projects from the MIT SCALE Group

The Scale Vector-Thread Microprocessor

The Scale microprocessor introduced a new architectural paradigm, vector-threading, which combines the benefits of vector and threaded execution. The vector-thread unit can smoothly morph its control structure from vector-style to threaded-style execution.

Transactional Memory

In many dynamic thread-parallel applications, lock management is a source of significant programming complexity as well as space and time overhead. We are investigating practical microarchitectures for implementing transactional memory, which provides atomicity with a programming model that is much simpler than locks while also reducing space and time overheads.
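
The sketch below makes the programming-model difference concrete; it uses the ScalaSTM library's software transactions (the scala-stm artifact must be on the classpath) purely as a software analogue of the hardware mechanisms we study, contrasting a lock-based transfer, where the programmer must impose a global lock order to avoid deadlock, with an atomic block, where the system detects and resolves conflicts:

    // Lock-based vs. transactional atomicity; ScalaSTM stands in for hardware TM.
    import scala.concurrent.stm._

    object Transfer {
      // Lock-based version: correctness depends on a global lock order (by id).
      final class LockAccount(val id: Int, var balance: Long)
      def lockTransfer(from: LockAccount, to: LockAccount, amt: Long): Unit = {
        val (first, second) = if (from.id < to.id) (from, to) else (to, from)
        first.synchronized { second.synchronized {
          from.balance -= amt
          to.balance   += amt
        }}
      }

      // Transactional version: just declare the block atomic; conflicting
      // transfers are detected and serialized by the runtime.
      final class TxnAccount(initial: Long) { val balance = Ref(initial) }
      def txnTransfer(from: TxnAccount, to: TxnAccount, amt: Long): Unit =
        atomic { implicit txn =>
          from.balance() = from.balance() - amt
          to.balance()   = to.balance() + amt
        }

      def main(args: Array[String]): Unit = {
        val a = new TxnAccount(100); val b = new TxnAccount(0)
        txnTransfer(a, b, 40)
        println((a.balance.single(), b.balance.single()))  // (60, 40)
      }
    }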

Low-power Microprocessor Design

We have been developing techniques that combine new circuit designs and microarchitectural algorithms to reduce both switching and leakage power in components that dominate energy consumption, including flip-flops, caches, datapaths, and register files.

Energy-Exposed Instruction Sets

Modern ISAs, whether RISC or VLIW, expose to software only those implementation properties that affect performance. In this project, we are developing new energy-exposed hardware-software interfaces that also give software fine-grained control over energy consumption.

Mondriaan Memory Protection

Mondriaan memory protection (MMP) is a fine-grained protection scheme that allows multiple protection domains to flexibly share memory and export protected services. In contrast to earlier page-based systems, MMP allows arbitrary permissions control at the granularity of individual words.
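
A rough functional model of that idea appears below (a toy Scala sketch; MMP's real design uses compressed permission tables and a hardware lookaside buffer rather than a hash map), with one permission entry per 32-bit word per protection domain, which is exactly the granularity a page-based scheme cannot express:

    // Toy word-granularity protection domain; illustrative only.
    object WordGrainProtection {
      object Perm extends Enumeration { val NoAccess, ReadOnly, ReadWrite = Value }

      final class Domain {
        private val table = scala.collection.mutable.Map[Long, Perm.Value]()
        private def word(addr: Long): Long = addr >> 2   // 4-byte words

        // Grant a permission over an arbitrary word-aligned byte range.
        def grant(base: Long, bytes: Long, p: Perm.Value): Unit =
          for (w <- word(base) until word(base + bytes)) table(w) = p

        def canRead(addr: Long): Boolean =
          table.getOrElse(word(addr), Perm.NoAccess) != Perm.NoAccess
        def canWrite(addr: Long): Boolean =
          table.getOrElse(word(addr), Perm.NoAccess) == Perm.ReadWrite
      }

      def main(args: Array[String]): Unit = {
        val d = new Domain
        d.grant(0x1000, 16, Perm.ReadWrite)  // four words, read-write
        d.grant(0x1010, 4, Perm.ReadOnly)    // the very next word, read-only
        println((d.canWrite(0x100c), d.canWrite(0x1010), d.canRead(0x1010)))
        // prints (true, false, true): permissions change at a word boundary
      }
    }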

Highly Parallel Memory Systems

We are investigating techniques for building high-performance, low-power memory subsystems for highly parallel architectures.

Mobile Computing Systems

Within the context of MIT Project Oxygen, several projects examine the energy and performance of complete mobile wireless systems.

Heads and Tails: Efficient Variable-Length Instruction Encoding

Existing variable-length instruction formats provide higher code densities than fixed-length formats, but are ill-suited to pipelined or parallel instruction fetch and decode. Heads-and-Tails is a new variable-length instruction format that supports parallel fetch and decode of multiple instructions per cycle, allowing both high code density and rapid execution for high-performance embedded processors.

Early Projects

IRAM: Intelligent RAM

The Berkeley IRAM project sought to understand the entire spectrum of issues involved in designing general-purpose computer systems that integrate a processor and DRAM onto a single chip, from circuits, VLSI design, and architecture to compilers and operating systems.

PHiPAC: Portable High-Performance ANSI C

PHiPAC was the first autotuning project, automatically generating a high-performance general matrix-multiply (GEMM) routine by using parameterized code generators and empirical search to produce fast code for any platform. Autotuners are now standard in high-performance library development.
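
The sketch below shows the autotuning idea in miniature (an illustrative Scala toy; PHiPAC itself generated and searched over ANSI C kernels): a blocked matrix-multiply kernel is parameterized by its block size, and an empirical search times each candidate on the machine at hand and keeps the fastest:

    // Miniature autotuner: time several blocked GEMM variants, keep the best.
    object MiniAutotune {
      type Matrix = Array[Array[Double]]

      // Blocked C += A * B, parameterized by block size blk.
      def blockedGemm(a: Matrix, b: Matrix, c: Matrix, blk: Int): Unit = {
        val n = a.length
        var ii = 0
        while (ii < n) {
          var kk = 0
          while (kk < n) {
            var i = ii
            while (i < math.min(ii + blk, n)) {
              var k = kk
              while (k < math.min(kk + blk, n)) {
                val aik = a(i)(k)
                var j = 0
                while (j < n) { c(i)(j) += aik * b(k)(j); j += 1 }
                k += 1
              }
              i += 1
            }
            kk += blk
          }
          ii += blk
        }
      }

      def time(body: => Unit): Long = {
        val t0 = System.nanoTime(); body; System.nanoTime() - t0
      }

      def main(args: Array[String]): Unit = {
        val n = 256
        def mat() = Array.fill(n, n)(scala.util.Random.nextDouble())
        val a = mat(); val b = mat()
        // Empirical search over the tuning parameter, on this machine.
        val best = Seq(8, 16, 32, 64, 128).minBy { blk =>
          val c = Array.ofDim[Double](n, n)
          time(blockedGemm(a, b, c, blk))
        }
        println(s"best block size on this machine: $best")
      }
    }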

The T0 Vector Microprocessor

T0 (Torrent-0) was the first single-chip vector microprocessor. T0 was designed for multimedia, human-interface, neural network, and other digital signal processing tasks. T0 includes a MIPS-II-compatible 32-bit integer RISC core, a 1KB instruction cache, a high-performance fixed-point vector coprocessor, a 128-bit-wide external memory interface, and a byte-serial host interface. T0 formed the basis of the SPERT-II workstation accelerator.

SPACE: Symbolic Processing in Associative Computing Elements

In the PADMAVATI prototype system, a hierarchy of packaging technologies cascades multiple SPACE chips to form an associative processor array with 170,496 36-bit processors. Primary applications for SPACE are AI algorithms that require fast searching and processing within large, rapidly changing data structures.

Krste Asanović
Professor
Computer Science Division
EECS Department
579 Soda Hall, MC #1776
University of California
Berkeley, CA 94720-1776
email: krste at eecs dot berkeley dot edu
(I don't do social networks, so please don't ask.)
phone: 510-642-6506 (don't phone, use email!)
fax: 510-643-1534
office hours: Mondays 5-6pm
579 Soda Hall
(email to confirm)
Administrative Support:
Roxana Infante
563 Soda Hall
phone: 510-643-1455
email: parlab-admin at eecs dot berkeley dot edu

Tammy Johnson
565 Soda Hall
phone: 510-643-4816
email: parlab-admin at eecs dot berkeley dot edu
Grant Administrator:
Lauren Mitchell
617 Soda Hall
phone: 510-642-3417
email: lbailey at cs dot berkeley dot edu