In early 2015 I was honored to be invited to develop and present a graduate course on Virtual Machines at UC Berkeley. The result is CS294: Virtual Machines and Managed Runtimes, which was presented in the Fall of 2015.
This page contains the materials from that course. All materials are Copyright © Oracle and Mario Wolczko, 2015-6, except as noted below. The materials can be used non-commercially under the following Creative Commons license:
CS294: Virtual Machines and Managed Runtimes by Mario Wolczko is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Caveat: a few items are still to be completed.
The original web page advertising the course can be found here (of historical interest only, as it was written before the course was prepared and hence is not totally accurate).
Students are required to have a strong background in systems programming in C and machine-level operation (assembly and machine code), and a working knowledge of Java. Basic knowledge of compiler internals is recommended but not required.
The widespread adoption of FORTRAN in the 1950s and 1960s spawned a plethora of high-level programming languages compiled directly to machine code, some of which still thrive (e.g., FORTRAN itself, as well as C and C++). However, in the 1970s and 1980s a different approach to execution gained popularity, in which a layer of software continually mediates between the high-level program and the machine. Most often called Virtual Machines, this approach initially gained traction with the Pascal P-machine and Smalltalk-80's bytecode machine, and received a huge boost with the emergence of Java and the JVM in the mid-1990s. Though Virtual Machines now dominate high-level language implementation, they have a reputation for being much slower than traditionally compiled languages.
However, when coupled with dynamic compilation techniques, Virtual Machines can provide performance comparable to direct compilation while offering machine independent binary distribution, advanced memory management, better security, interactive program development and many other advantages. The objective of this seminar is to explore the design and construction of virtual machines by studying the history of the field, analyzing landmark systems and by hands-on construction and modification.
The presentation will take a mostly chronological approach, starting with early techniques and progressing through to the state of the art. Each week we will learn about the preeminent problems of a given era and how those problems were overcome. In the labs we will reprise some of these accomplishments through a graded series of exercises in which we build components of a virtual machine for an invented language. The initial exercises will implement basic techniques in C; later we will switch to a virtual machine framework known as Truffle which will provide sophisticated components that we can assemble and customize into a larger system.
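As a flavor of those initial C exercises, here is a minimal tree-walking evaluator for a toy expression language. Everything here (node kinds, function names, the language itself) is invented for illustration; it is not the actual Feeny AST used in the labs.

```c
#include <assert.h>
#include <stdlib.h>

/* Node kinds for a toy expression language (illustrative only). */
typedef enum { LIT, ADD, MUL } Kind;

typedef struct Node {
    Kind kind;
    int value;              /* used when kind == LIT        */
    struct Node *lhs, *rhs; /* used when kind == ADD or MUL */
} Node;

/* Recursive tree-walking evaluator: every node is re-examined on
   each visit, which is what makes naive AST interpretation slow. */
int eval(const Node *n) {
    switch (n->kind) {
    case LIT: return n->value;
    case ADD: return eval(n->lhs) + eval(n->rhs);
    case MUL: return eval(n->lhs) * eval(n->rhs);
    }
    abort(); /* unreachable for well-formed trees */
}

/* Convenience constructors for building test trees. */
Node *lit(int v) {
    Node *n = malloc(sizeof *n);
    n->kind = LIT; n->value = v; n->lhs = n->rhs = NULL;
    return n;
}

Node *bin(Kind k, Node *l, Node *r) {
    Node *n = malloc(sizeof *n);
    n->kind = k; n->value = 0; n->lhs = l; n->rhs = r;
    return n;
}
```

For example, `eval(bin(ADD, lit(2), bin(MUL, lit(3), lit(4))))` walks the tree for `2 + 3 * 4` and returns 14.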
Topics covered will include:
- Virtual machines and managed runtimes: taxonomy, characteristics, history
- Interpretation of abstract syntax trees and bytecodes; threaded code and other performance techniques
- Automatic memory management (reference counting, tracing collection)
- Basic just-in-time compilation
- Advanced garbage collection (generational and concurrent techniques)
- Dynamic optimizing compilation
  - adaptive feedback-driven techniques
  - dynamic deoptimization
- Trace compilation; metatracing
- Metacircular VMs
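To make the interpretation topics concrete, here is a sketch of the classic switch-dispatched bytecode loop for a tiny stack machine. The opcodes and encoding are invented for illustration; they are not the Feeny bytecode set used in the course.

```c
#include <assert.h>

/* Opcodes for a toy stack machine (illustrative only). Operands
   are encoded inline in the instruction stream after the opcode. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Classic fetch-decode-execute loop using switch dispatch; threaded
   code replaces the switch with direct jumps between handlers. */
int run(const int *code) {
    int stack[64];
    int sp = 0; /* index of the next free stack slot */
    for (int pc = 0; ; ) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}
```

Running the program `{OP_PUSH, 2, OP_PUSH, 3, OP_PUSH, 4, OP_MUL, OP_ADD, OP_HALT}` computes `2 + 3 * 4` and returns 14.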
Each lecture will be on Friday from 2pm to 5pm, with breaks.
Weekly reading will be assigned. Some papers will be key to a continued understanding of the lectures; for these there will be class discussions, and possibly written or verbal summaries required. All assigned reading and supporting material is listed in the course bibliography.
The labs will consist of a series of construction/modification exercises in which components of a VM are implemented or enhanced, to reinforce the corresponding lecture material. Most exercises will be in C, and some in Java. Labs will be done either solo, or in pairs.
Source code — TBD
TBD: make available source code to model answers, test harnesses, etc.
The Berkeley lectures were recorded and the videos uploaded to YouTube. Note that it is raw, unedited amateur video — don’t expect professional production standards.
The neat boundaries between topics are not reflected in the videos: sometimes a topic ran over the allotted time and was finished the following week, so you may have to hunt in the videos to find a specific topic.
In the following table I have listed the lectures with links to the videos, the accompanying slides and the lab exercises. The exercises are placed next to the relevant material; in the actual course, they sometimes lagged by a week or two (typically, exercises were dispensed weekly; students were given two weeks for #5 and #9, and three weeks for #7).
| Week | Video | Slides | Topics | Exercises | Reading |
| --- | --- | --- | --- | --- | --- |
| 1 | #1 | 1 Intro; 2 AST interpretation | Course overview; introduction to VMs: terminology, taxonomy; some examples to illustrate the variety and architectures of VMs: JVM, VirtualBox, System/370 VM, Virtual PC, Rosetta, Dynamo, Crusoe; applications and advantages of VMs. Introduction to the Feeny teaching language. Introduction to interpretation using ASTs. | 1 Feeny introduction (review); 2 Feeny AST interpreter (review) | Smith & Nair (2005), Chapter 1 |
| 2 | #2 | 3 VM anatomy | Anatomy of a VM, using the JVM as an exemplar: state (threads, heap, classes); class files; bytecodes; primitives and intrinsics; native libraries and JNI. Bytecode and bytecode interpretation: abstract machines; encodings; compiling to bytecode; interpreting bytecode; control flow: branches, call and return, method dispatch; serialization; performance. | 3 Feeny bytecode interpreter; 4 Feeny bytecode compiler (reviews) | – |
| 3 | #3 | 5 JVM bytecode ISA | A tour of the JVM bytecode ISA: arithmetic, loads and stores, object creation and access, type checks, stack management, type conversion, control transfer, method invocation and return, detour: vtable dispatch, synchronization, exceptions, verification, types. Abstract interpretation; stack type maps. | See last slide. | Gosling (1995); JVM Spec chapter 3 |
| 4 | #4 (caution: last few minutes NSFW) | Introduction | This session was a discussion of the Deutsch-Schiffman Smalltalk implementation and associated paper, with L Peter Deutsch and Allan Schiffman present to answer questions. | – | Deutsch & Schiffman (1984) |
| 5 | #5 | 6 Dynamic languages; 7 Memory management part 1 | Dynamic languages: dispatch and storage challenges; boxing; object tables; primitives and tagging; call target caches. Memory management: allocation (static, stack, heap); list-based allocation; fragmentation; performance considerations; bump allocation; deallocation; GC; liveness; reachability; reference counting (presentation of remaining slides left to next week). | 5 Write a copying collector | – |
| 6 | #6 | 7B Mem. mgmt part 2; 7C Debugging hints | This week our guest via Skype was David Ungar, to discuss his many contributions to VM technology. After that we concluded the first part of memory management (not on video, alas — we forgot to turn the camera on!): tracing collection; marking and mark stacks; sweeping; heap parsing; compaction; sliding compaction; forwarding pointer compaction; using a temporary object table; threaded compaction; copying collection; tri-color abstraction; incremental collection; Baker's algorithm; barriers; generational collection; scavenging; remembered sets and card marking; inexact collection. Debugging hints for the collector exercise. | – | Ungar (1984); Ungar & Smith (1987) |
| 7 | #7 | 8 Self internals | A tour of Self VM internals: language recap; bytecodes; memory organization (heap, tags, maps, object layouts; use of C++ objects; oops); allocation; scavenging; root-finding; programming primitives and _Define; heap scanning. | 6 Tagging | – |
| 8 | #8 | 9 Advanced interpretation; 10 Dynamic compilation | Advanced interpretation: threading; indirect threading; stack caching; multiple dispatch tables; superinstructions; selective inlining; interpreter generation; the HotSpot interpreter(s). Dynamic compilation: design choices (compilation unit, when, optimization level); Just-In-Time compilation (including rant); simple speed metrics; example using expression language; managing compiled code: code cache; flushing; finding active methods; stack scanning; finding roots; safe points. | 7 Write a JIT | Ertl & Gregg (2004) |
| 9 | #9 | 11 Dynamic optimization | Our guest this week was Cliff Click, with Bits of advice for the VM writer followed by audience questions. Dynamic optimization: inline caches, monomorphic and polymorphic (PICs); optimizing primitives; customization; uncommon branches; splitting. | – | Any of: Paleczny, Vick & Click (2001); Click, Tene & Wolf (2005); Click & Rose (2002); Click & Paleczny (1995) |
| 10 | #10 | 12 Advanced optimization | Inlining; extended example putting it all together: customization, inlining, splitting, primitive handling, uncommon branches; tiered compilation; prediction; counters; decay; PICs and counters; recompilation (deciding on the top level; gathering a call tree; inlining); class hierarchy analysis; escape analysis; dynamic deoptimization; deopt points; deoptimizing the active frame and other frames; on-stack replacement; dependencies. | 8 Inline caches | Hölzle, Chambers & Ungar (1992) |
| 11 | #11.1, #11.2 | 13 Mem. mgmt part 3 | Our guest this week was Lars Bak, who spoke about How Dart Learned From Past Object-Oriented Systems and then engaged in an extended Q&A. Memory management part 3: conservative (aka inexact) collection; GC metrics (throughput, overhead, footprint, pause times, MMU and BMU, energy). | – | Inside V8 |
| 12 | #12.1, #12.2 | 14 Metacircularity | This week's guest via Skype was Carl Friedrich Bolz, talking about The meta-tracing approach to virtual machine construction. Metacircular VMs: motivation; metacircular interpreters; compiling to C (Squeak); metacircularity — with performance; Jikes; Klein; Maxine (template JIT compiler; snippets; inspector). | – | Gal, Probst & Franz (2006); Bolz et al. (2009) |
| 13 | #13 | 15 Truffle and Graal | A new approach: motivation; partial evaluation of specialized AST interpreters; example and pseudocode; node replacement; compiling ASTs; choice of implementation language; annotations and processors; Truffle: types; children; specializations; frames; root nodes; control flow (if, call and return); AST inlining; profiling; assumptions; objects. | 9 Truffle implementation of Feeny | Würthinger et al. (2013) |
| 14 | #14 | 16 Truffle part 2 | This week our guest was Thomas Würthinger, who gave an extended demo of Truffle. Truffle part 2: objects; shapes and transitions; inline caches in Truffle. | – | Humer et al. (2014); Würthinger et al. (2012) |
| 15 | #15 | 17 Concurrency; 18 Concurrent GC | Our guest this week was Michael Van De Vanter, talking about … need to get his slides. Also (in a packed final session): concurrency: motivation; scheduling; synchronization; HotSpot fast locks; biased locking; code management. Concurrent GC: stacks; allocation; terminology; parallel collection; marking; work stealing; copying; sweeping; concurrent collection; challenges; tricolor invariants; color of mutator and allocator; SATB; Baker's algorithm. The Great Feeny Competition of 2015 — results and awards. Wrap-up: orphaned topics; next steps; thanks. | – | Van De Vanter (2015); Seaton, Van De Vanter & Haupt (2014) |
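As a taste of the "primitives and tagging" material (and lab exercise 6), here is one common low-bit tagging scheme sketched in C. The representation shown is an assumption chosen for illustration; it does not reproduce the Self VM's exact scheme or the lab's solution.

```c
#include <assert.h>
#include <stdint.h>

/* A tagged word ("oop"): heap pointers are word-aligned, so their
   low bit is always 0; small integers are stored shifted left one
   bit with the low bit set to 1. (This particular layout is a
   hypothetical example of the general technique.) */
typedef uintptr_t oop;

int is_int(oop v)          { return (v & 1) != 0; }

oop tag_int(intptr_t i)    { return ((uintptr_t)i << 1) | 1; }

/* Assumes the usual arithmetic right shift on signed integers,
   which holds on mainstream compilers and targets. */
intptr_t untag_int(oop v)  { return (intptr_t)v >> 1; }
```

With this layout the interpreter can test `v & 1` to decide, in one instruction, whether a value is an immediate integer or a reference that the garbage collector must trace.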
The Whole Shebang
Here’s a zip file containing all the slides, exercises and associated material. If you are a bona fide educator and wish to get the original files (in Keynote format), send me an email — the animations are smoother in the original, and you’ll probably want to change things for your own style and delivery. (Don’t forget the attribution required by the license, and the other license requirements.) If you say "pretty please", I may even be willing to give you a copy in PowerPoint or anything else that can be exported from Keynote ;-).
There aren’t many courses on the web that cover similar material, but here are the ones I’ve found (all done by former colleagues!):
- Gregor Richards’ Waterloo courses on VMs for Dynamic Languages and Automatic memory management and GC .
- Phil McGachey’s Harvard course on Managed Environments .
- Michael Haupt and Andreas Sewe taught a course on Virtual Machines at TU Darmstadt in 2010. Michael sent me a copy of his slides; I do not believe they are available on the web.
- Lars Bak taught a VM course in Aarhus but I don’t think the material is on the web.
All slides are Copyright © Oracle and Mario Wolczko, 2015-6, except those by Patrick Li (Feeny introduction and associated exercises), and the embedded graphics in the Deutsch-Schiffman introduction (taken from various web pages, mostly Wikipedia). Excerpts from papers and books (and the Hitch Hiker’s Guide to the Galaxy) have been attributed in the slides (let me know if I missed an attribution). Copyright on the YouTube videos is probably some mashup of all the people who appear, UC, and the recordist, Patrick Li. You should assume all rights are reserved except those which were surrendered or granted as part of the YouTube upload process.
I’d like to express my thanks to the following:
- Patrick Li, my T.A. for the course. Patrick devised the Feeny language used in the exercises, wrote the Lab exercises and the model answers, and did all the grading,
- Prof. Jonathan Bachrach for the invitation to give the course,
- The guest speakers (in order of appearance): Peter Deutsch, Allan Schiffman, David Ungar, Cliff Click, Lars Bak, Carl Friedrich Bolz, Thomas Würthinger and Michael Van De Vanter,
- My management at Oracle Labs for supporting this effort,
- Michael Haupt for sharing his VM course material,
- Christian Wimmer for assistance with Truffle, and
- All the students who participated, for their patience, enthusiasm, attention, questions, efforts and feedback.
Mario Wolczko
March 2016