Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Dynamic Language VMs: Inside Ruby - Lourens Naudé, Friday, 4 December 2009

description

The only efficient way to make the most of something is to understand its mechanics: a pilot has deep knowledge of many scientific factors and their effects on a plane. Why do so many developers fly blind? We'll take a peek into the Ruby 1.9 VM's internals with DTrace and observe the effect of some core components on the memory, IO and CPU subsystems. No prior knowledge of virtual machines or interpreters is assumed.

Interpreter-specific subjects touched upon:

* Source to runtime : loading files, parsing to nodes and eval
* VM : symbol table, method cache, frames, method dispatch and optimizations
* Object model : core types, modules and variables
* Closures : blocks and procedures
* POSIX, IO and contexts : signals, system calls and Thread / Fiber switches
* Garbage collection : heap space, alloc / dealloc and GC patterns

Transcript of Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Page 1: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Dynamic Language VMs
Inside Ruby - Lourens Naudé

Friday, 4 December 2009

Page 2: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Background

• Freelance Ruby/C/Systems Developer
• http://github.com/methodmissing
• Contractor at Trade2Win Ltd.
• Realtime Forex / Autotrading Platform

Page 3: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Process

Front-end (parsing)

Semantics

Back-end (runtime)

Page 4: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Roadmap

• Source to Nodes and the AST
• VM : Symbol table, caches, opcode dispatch and optimizations
• Object Model : Objects, methods and variables
• Garbage Collection
• Contexts : Threads, GVL, Fibers

Page 5: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Source to AST

Page 6: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Lexical Analysis

• Converts source code to a token stream
• Token identification (keyword_class, keyword_module etc.)
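
The token stream can be inspected from Ruby itself. A minimal sketch using the standard library's Ripper lexer; note that Ripper reports keywords with the generic type :on_kw rather than the parser's internal keyword_class style token names, and the sample source string is an assumption for illustration:

```ruby
require 'ripper'

# Tokenize a small class definition. Each entry looks like
# [[line, column], token_type, token_text, lexer_state].
source = "class Foo; end"
tokens = Ripper.lex(source)

# Keep only the type and text of each token for readability.
summary = tokens.map { |(_pos, type, text, _state)| [type, text] }
summary.each { |type, text| puts "#{type.to_s.ljust(14)} #{text.inspect}" }
```

Whitespace and punctuation show up as tokens too, which is why a later parsing stage is needed to impose structure.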

Page 7: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Grammar

• Describes program syntax structure
• The semantics of a program are defined by its syntax
• Production rules : name and use case
• Object#block_arg(&block)
• Object#opt_block_arg(arg1, arg2, &block)

Page 8: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Abstract Syntax Tree
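
One way to look at a syntax tree without reading parse.y is Ripper's S-expression output. This is a sketch only: Ripper's tree is a developer-facing view and not the node structure the VM compiles from, and the expression parsed here is an arbitrary example.

```ruby
require 'ripper'

# Parse an expression into a nested-array syntax tree (an S-expression).
tree = Ripper.sexp("1 + 2")

# The root is :program, followed by the list of top-level statements.
root_type, statements = tree
binary = statements.first   # [:binary, left_operand, operator, right_operand]

p tree
```

Each leaf carries its source position (line and column), which is the kind of callsite annotation an AST needs for error messages.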

Page 9: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

VM Architecture

• Reuse of some 1.8.x series architecture : parsing, AST nodes, Object, GC etc.

• Introduces a code generation phase to convert the AST to instruction sequences for better optimization hooks and faster runtime

• No speedup for inherited MRI features such as string processing etc.

Page 10: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

AST Annotations

• Represents grammar
• Sometimes referred to as an annotated AST
• Annotations / attributes attach semantics to nodes
• Literals, values, statements, callsite info ( file and line number )
• Can be augmented with semantic analysis

Page 11: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

AST Transformation

• Removes AST noise
• Refactor to features that map closer to machine instructions
• Usually yields more AST nodes, but reduces overall complexity

Page 12: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Intermediate Tree Nodes

• Minimal subset required for code generation
• Expressions and assignments
• Method calls, arguments and return values
• Conditional jumps - if/else, iterators
• Unconditional jumps - exceptions, retry, catch/throw

Page 13: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Code Generation

• Converts AST to code segments - a linear instruction set
• Selection : Which tree sections to rewrite ?
• AST Node -> instruction ordering
• Narrow tree scope considers only small subsets of the AST to reduce the inherent complexity of code generation
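
The generated instruction sequence can be printed from a running interpreter. A minimal sketch, assuming CRuby 1.9 or later (RubyVM is specific to CRuby); the exact listing differs between Ruby versions, so no output is reproduced here:

```ruby
# Compile a source string to a YARV instruction sequence (CRuby only).
iseq = RubyVM::InstructionSequence.compile("a = 1 + 2")

# A human-readable listing of the linear instructions.
listing = iseq.disasm
puts listing

# The compiled sequence can also be executed directly.
result = iseq.eval
```

The listing is flat: the tree shape of the source is gone and what remains is an ordered run of stack operations ending in a leave instruction.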

Page 14: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Codegen Workflow

• Preprocessing : AST node refactorings ( YARV doesn’t do this )
• Codegen : Nodes to instruction sequences
• Postprocessing : Generated instruction sequences replaced with optimal ones - compiled instruction sequences and peephole optimization

• Pre and Postprocessing phases may benefit from multiple passes

Page 15: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

VM Internals

Page 16: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Symbol (Hash) Table

• Access to int/char indexed values in almost constant time with a hash table
• Lookup of methods, ivars, global vars, encodings, VM instructions etc.
• Table defaults to 11 bins and a maximum of 5 entries per bin. The bin count can increase.
• Sequential lookup inside bins, thus lookups slow down at a density of > 5
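
The bin and density behaviour described above can be sketched in plain Ruby. BinTable and all of its methods are hypothetical names for illustration; this is not the API of CRuby's C hash table, and the growth factor is an assumption:

```ruby
# Toy chained hash table using the slide's numbers: 11 bins, density 5.
class BinTable
  INITIAL_BINS = 11   # default bin count from the slide
  MAX_DENSITY  = 5    # average entries per bin before the table grows

  attr_reader :bins, :size

  def initialize
    @bins = Array.new(INITIAL_BINS) { [] }
    @size = 0
  end

  def []=(key, value)
    bin = bin_for(key)
    pair = bin.find { |k, _| k == key }   # sequential scan inside the bin
    if pair
      pair[1] = value
    else
      bin << [key, value]
      @size += 1
      grow if @size > @bins.length * MAX_DENSITY
    end
  end

  def [](key)
    pair = bin_for(key).find { |k, _| k == key }
    pair && pair[1]
  end

  private

  def bin_for(key)
    @bins[key.hash % @bins.length]
  end

  # Roughly double the bin count (assumed factor) and redistribute entries.
  def grow
    entries = @bins.flatten(1)
    @bins = Array.new(@bins.length * 2 + 1) { [] }
    entries.each { |k, v| bin_for(k) << [k, v] }
  end
end

table = BinTable.new
100.times { |i| table[i] = i * i }
```

Growing the bin count keeps each bin short, which is what keeps the sequential scan inside a bin cheap.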

Page 17: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Symbols - VMNS :-)

• An entity with both a String and a Number representation
• It does NOT contain a String or Number, it simply points to a hash table entry
• The developer identifies it by name, the VM identifies it by its numeric representation
• Immutable (4 bytes per Symbol) for performance benefits
• DNS analogy : developers prefer named entities, the runtime prefers numerical representations
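
The "one table entry per name" property is observable from Ruby. A small sketch; the symbol name is arbitrary and String.new is used only to sidestep any deduplication of string literals:

```ruby
# Two occurrences of the same symbol literal are the same object...
same_symbol = :codebits.equal?(:codebits)

# ...whereas two equal strings are normally distinct objects.
a = String.new("codebits")
b = String.new("codebits")
same_string = a.equal?(b)

# Converting a string to a symbol resolves the name to the existing entry.
interned = a.to_sym.equal?(:codebits)
```

Comparing two symbols is therefore an identity check on a number, not a character-by-character string comparison.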

Page 18: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

VM Opcodes

• Stateless functions that operate on a Stack Machine
• 79 instructions as of Dec 4, 2009
• Notation : instruction / opcode / operands

Page 19: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Instruction Categories

• variable : get or set local variable
• put : push an object onto the stack
• stack : pop from stack, empty the stack
• setting : is a given variable defined ?
• class/module : define a class / module
• method/iterator : invoking methods, calling blocks
• exception :
• jump : control flow branching
• optimization : replaces calls to +, <<, * etc. with specialized instructions in some cases

Page 20: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Pure Stack Machine

• 2 instruction types
• Move / copy value(s) between top of stack and elsewhere
• Operate on the top stack element(s)
• SP : top of stack pointer
• BP : beginning of stack pointer

Page 21: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Stack Machine

• Put 3 strings on the stack, “a”, “b” and “c”
• Fetch the top 3 stack elements and create an Array from them
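
The two steps above can be sketched as a toy interpreter. The instruction names mirror YARV's putstring and newarray, but the run method and the [opcode, operand] encoding are assumptions for illustration, not YARV itself:

```ruby
# Toy stack machine: each instruction is an [opcode, operand] pair.
def run(instructions)
  stack = []
  instructions.each do |opcode, operand|
    case opcode
    when :putstring
      stack.push(operand.dup)          # push a fresh string onto the stack
    when :newarray
      stack.push(stack.pop(operand))   # pop N elements, push one Array
    else
      raise ArgumentError, "unknown opcode #{opcode}"
    end
  end
  stack
end

program = [
  [:putstring, "a"],
  [:putstring, "b"],
  [:putstring, "c"],
  [:newarray, 3],
]
result = run(program)
```

After the run the stack holds a single element, the three-element Array, which is how operands and results flow without named registers.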

Page 22: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Instruction Sequence

• Flat instruction sequences structure is much faster than traversing tree nodes, but instruction dispatch from this pipeline can be a bottleneck

• Ability to optimize simple instructions is very important
• Native code / language extensions are usually only a small subset of the hot path
• Native DB socket layer VS multi-model ORM in Ruby
• Direct Threaded Dispatch : fastest way to the next VM instruction
• Switch Dispatch : slower, but more portable

Page 23: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Switch Dispatch

• Most portable, but much slower due to excessive CPU branch mispredictions
• Executes more native instructions per opcode dispatch
• On average 50% slower than Threaded Dispatch

Page 24: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Direct Threaded Dispatch

• Represents an instruction by the address of the routine that implements it
• Jumps to the address of the current instruction and bumps the PC
• Requires first class labels and some GCC help, hence a portability concern
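
The difference between the two dispatch styles can be approximated in Ruby. This is an analogy only: real direct threading relies on C labels as values, which Ruby cannot express, so stored lambdas stand in for routine addresses. All names and the three-opcode instruction set are hypothetical.

```ruby
# Switch-style: every step re-examines the opcode in one central `case`.
def run_switch(code)
  stack = []
  pc = 0
  while pc < code.length
    op, arg = code[pc]
    case op
    when :push then stack.push(arg)
    when :add  then stack.push(stack.pop + stack.pop)
    when :mul  then stack.push(stack.pop * stack.pop)
    end
    pc += 1
  end
  stack.last
end

# Threaded-style analogy: each opcode is resolved to its handler once,
# so the loop only calls the stored routine and bumps the PC.
HANDLERS = {
  push: ->(stack, arg) { stack.push(arg) },
  add:  ->(stack, _)   { stack.push(stack.pop + stack.pop) },
  mul:  ->(stack, _)   { stack.push(stack.pop * stack.pop) },
}.freeze

def resolve_handlers(code)
  code.map { |op, arg| [HANDLERS.fetch(op), arg] }
end

def run_threaded(resolved)
  stack = []
  pc = 0
  while pc < resolved.length
    handler, arg = resolved[pc]
    handler.call(stack, arg)
    pc += 1
  end
  stack.last
end

code = [[:push, 2], [:push, 3], [:add], [:push, 4], [:mul]]   # (2 + 3) * 4
switch_result   = run_switch(code)
threaded_result = run_threaded(resolve_handlers(code))
```

Both loops compute the same value; the threaded variant moves the opcode decoding out of the hot loop, which is the effect direct threading achieves at the machine level.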

Page 25: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

VM Versioning

• Each VM instance has a state counter used to scope caches to the current VM state

• Lazy cache invalidation: bumping the version value avoids any cache expiry overhead

• Expired on : constant definition, constant removal, method definition, method removal and method cache changes (covered later)
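
Lazy invalidation through a state counter can be sketched as follows. VersionedCache, bump! and fetch are hypothetical names; CRuby keeps its counter and caches in C, so this only models the idea:

```ruby
# Cache entries remember the VM state they were created under.
class VersionedCache
  attr_reader :vm_state

  def initialize
    @vm_state = 0
    @entries = {}
  end

  # Called on events such as method or constant (re)definition.
  def bump!
    @vm_state += 1
  end

  def fetch(key)
    entry = @entries[key]
    # A stale entry is simply ignored; nothing is swept eagerly.
    return entry[:value] if entry && entry[:version] == @vm_state
    value = yield
    @entries[key] = { version: @vm_state, value: value }
    value
  end
end

cache = VersionedCache.new
lookups = 0
cache.fetch(:foo) { lookups += 1; :impl_v1 }           # miss, block runs
cache.fetch(:foo) { lookups += 1; :impl_v1 }           # hit, block skipped
cache.bump!                                            # e.g. a method was redefined
latest = cache.fetch(:foo) { lookups += 1; :impl_v2 }  # stale, block runs again
```

Invalidation costs a single increment no matter how many entries exist; the price is paid later, one miss per entry.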

Page 26: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Common Optimizations*

• Constant folding
• Constant propagation
• Dead code elimination
• Subexpression elimination
• Method in-lining

Page 27: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Static Analysis Notes

• Examining source code without execution
• Dynamic analysis : Runtime introspection
• Cannot assume much beyond literals in Ruby ...
• Constants can be redefined
• Open classes imply methods can be redefined at any time
• Object#method_missing
• Methods don't have an explicit return type

Page 28: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Constant Folding

• Compile time constant expression evaluation
• Strength reductions : replace operations with cheaper ones
• Null sequences : operations that can be removed
• Very hard to pull off due to the dynamic nature of the Ruby spec
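
The three rewrites above can be sketched over a hypothetical mini-AST of nested arrays. The sketch assumes Integer#+ and Integer#* have not been redefined, which is exactly the assumption a real Ruby compiler cannot make freely:

```ruby
# Expressions are nested arrays: [:+, left, right] or [:*, left, right].
# Leaves are Integers (literals) or Symbols (variable names).
def fold(node)
  return node unless node.is_a?(Array)
  op = node[0]
  left = fold(node[1])
  right = fold(node[2])

  # Constant expression: evaluate it at "compile time".
  if left.is_a?(Integer) && right.is_a?(Integer)
    return left.public_send(op, right)
  end

  # Null sequences: x + 0 and x * 1 can be removed.
  return left if (op == :+ && right == 0) || (op == :* && right == 1)

  # Strength reduction: x * 2 becomes the cheaper x + x.
  return [:+, left, left] if op == :* && right == 2

  [op, left, right]
end

folded_constant = fold([:+, 1, [:*, 2, 3]])
reduced         = fold([:*, :x, [:+, 1, 1]])
removed         = fold([:+, :x, 0])
```

If a program reopens Integer and changes +, every one of these rewrites silently changes behaviour, which is why the slide calls the optimization very hard in Ruby.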

Page 29: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Code Elimination

• Remove code segments without data flow
• Works very well with static analysis, but tricky to pull off in Ruby

Page 30: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Subexpression elimination

• Expression reuse by extracting to a temporary variable
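
A before / after pair makes the rewrite concrete. The method names and the arithmetic are made up for illustration; the transformation is done by hand here, not by the VM:

```ruby
# Before: w * scale is evaluated twice.
def area_before(w, h, scale)
  (w * scale) * (h * scale) + (w * scale)
end

# After: the common subexpression is computed once and reused.
def area_after(w, h, scale)
  scaled_w = w * scale
  scaled_w * (h * scale) + scaled_w
end

before = area_before(2, 3, 4)
after  = area_after(2, 3, 4)
```

The rewrite is only safe when w * scale has no side effects and returns the same value both times, which a Ruby compiler cannot prove in general.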

Page 31: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Constant Propagation

• Replace a literal variable reference with its value

Page 32: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

In-lining

• Replaces a method call with its body to reduce function call overhead
• Very efficient in iterator contexts
• Opportunity for further optimization
• Not a silver bullet - excessive in-lining can overload the instruction cache
• Some cases change semantics

Page 33: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Cloning

• Copies a method to replace a common call pattern
• Identified with static analysis, thus of limited use to Ruby

Page 34: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Peephole Optimization

• Replace generated instruction sequences with more efficient ones
• Benefit is directly proportional to the quality of the code generated
• Removes useless flow control
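
A peephole pass looks at a small window of adjacent instructions at a time. The sketch below uses illustrative opcode names and two made-up rules; it does not reproduce YARV's actual rule set:

```ruby
# Instructions are [opcode, operand] pairs.
def peephole(code)
  out = []
  code.each_with_index do |insn, i|
    op, arg = insn

    # A value pushed and immediately popped has no effect: drop both.
    if op == :pop && out.last && out.last[0] == :putobject
      out.pop
      next
    end

    # A jump to the very next instruction is useless flow control.
    following = code[i + 1]
    next if op == :jump && following && following[0] == :label && following[1] == arg

    out << insn
  end
  out
end

code = [
  [:putobject, 1], [:pop],
  [:jump, :L1], [:label, :L1],
  [:putobject, 2], [:leave],
]
optimized = peephole(code)
```

Six instructions shrink to three without any knowledge of the program beyond the two-instruction window.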

Page 35: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Object Model

Page 36: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Object Requirements

• Identity : unique identifier to represent the object at runtime
• Stateful : ability to maintain state
• Methods : exposes methods to change / query object state

Page 37: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Base Object Structure

• VALUE : pointer type that represents the address of a language structure
• A pointer cast dereferences a VALUE to an object structure
• RBASIC(obj)->flags; /* ((struct RBasic *)obj)->flags */
• Flags : frozen, marked, tainted etc.

Page 38: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Classes / modules

• Symbol tables for methods, class and instance variables
• Class / module distinction through flags
• RCLASS(a_str)->ptr->super #=> Object
• RCLASS(a_fixnum)->ptr->super #=> Integer

Page 39: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Immediates

• Small enough to fit in a VALUE
• No runtime casting overheads
• nil = 4
• true = 2
• false = 0
• Symbols
• Fixnums <= 30 bits
• Float and Bignum are full heap objects, hence poor floating point benchmarks
• RFLOAT(float_obj)->float_value #=> a double

Page 40: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Object Layout

• Assuming a 32bit architecture ....
• sizeof(VALUE) is 4 bytes
• Objects are even - multiples of 4
• Symbols are even - the ID shifted left by 8 bits, tagged with a low byte of 0x0e
• Integers are odd
• The special constants false, true and nil are <= 4
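
The tagging scheme can be modelled with plain integers. The constants below follow the values quoted for 32bit CRuby 1.9 (later versions use different values); the helper names are hypothetical and operate on numbers, not on real VALUEs:

```ruby
# Tag values as on the slides (32bit CRuby 1.9).
QFALSE = 0
QTRUE  = 2
QNIL   = 4
FIXNUM_FLAG = 0x01

def fixnum_to_value(n)
  (n << 1) | FIXNUM_FLAG        # shift left, set the low bit: always odd
end

def value_to_fixnum(value)
  value >> 1                    # arithmetic shift restores the integer
end

def classify(value)
  return :fixnum if (value & FIXNUM_FLAG) == 1
  return :special_constant if [QFALSE, QTRUE, QNIL].include?(value)
  return :heap_object if (value % 4).zero?   # aligned pointer
  :other_immediate                           # e.g. a symbol
end

encoded = fixnum_to_value(5)
decoded = value_to_fixnum(encoded)
```

Because heap pointers are aligned, their low bits are always zero, which leaves those bits free to distinguish immediates without a memory access.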

Page 41: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Mutable Objects

• Mutable Strings and Arrays require the ability to shrink / grow capacity
• Allocates slightly more memory than is required to represent object data in order to avoid malloc, realloc and memmove operations in common cases.
• Capacity for short strings and small arrays : “str” and %w(s t r)

Page 42: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Shared Objects

• Literal declarations of Arrays and Strings are shared amongst instances
• Avoids duplicates with this “copy-on-write” (COW) scheme
• An attempt to modify creates a copy of the object, and modifies the copy
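
Copy-on-write can be sketched with a wrapper that shares one buffer until the first write. CowString and its methods are hypothetical; CRuby implements sharing inside its C string and array structures, not with a Ruby class like this:

```ruby
# Several handles share one buffer until one of them writes.
class CowString
  def initialize(buffer)
    @buffer = buffer            # shared, treated as read-only
    @owned = false
  end

  def share
    CowString.new(@buffer)      # no copy: both handles use one buffer
  end

  def <<(text)
    unless @owned
      @buffer = @buffer.dup     # first write: take a private copy
      @owned = true
    end
    @buffer << text
    self
  end

  def to_s
    @buffer
  end

  def same_buffer?(other)
    @buffer.equal?(other.to_s)
  end
end

literal = CowString.new("codebits".freeze)
copy = literal.share
shared_before = copy.same_buffer?(literal)
copy << " 2009"
shared_after = copy.same_buffer?(literal)
```

Readers never pay for a copy; only the first writer does, and the original buffer stays untouched.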

Page 43: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Object Method Dispatch

• Loose typing and open classes mean that method calls can never be reduced to a single CALL instruction

• Method dispatch in OO languages requires methods to be searched for, on the object itself, superclasses etc.
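
The search path is visible through Ruby's reflection API. The classes and module below are made up for the example:

```ruby
module Greeting
  def greet
    "hello from Greeting"
  end
end

class Base
  def describe
    "Base#describe"
  end
end

class Child < Base
  include Greeting
end

# The order in which a method call on a Child instance is searched.
chain = Child.ancestors.take(3)

# Where each method is eventually found along that chain.
greet_owner    = Child.instance_method(:greet).owner
describe_owner = Child.instance_method(:describe).owner
```

Neither method lives on Child itself, so an uncached call has to walk past Child before it finds an implementation.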

Page 44: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Call VS Send

• object.__send__(:method)
• We don’t call functions / routines, rather send a command or query message to an object
• Ruby methods always return a value, thus RPC style messaging
• Method cache is like a router
• Method redefinition clears the method cache / router
• “Routing” overhead for subsequent method calls
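
The router analogy can be sketched as a global cache in front of the superclass walk. Klass, Router and their members are hypothetical stand-ins; the real method cache is a C structure inside the VM:

```ruby
# A class model: a name, a superclass and a table of method bodies.
Klass = Struct.new(:name, :superclass, :method_table)

class Router
  attr_reader :misses

  def initialize
    @cache = {}
    @misses = 0
  end

  def lookup(klass, selector)
    key = [klass.name, selector]
    return @cache[key] if @cache.key?(key)

    @misses += 1                                   # slow path: walk superclasses
    k = klass
    k = k.superclass until k.nil? || k.method_table.key?(selector)
    @cache[key] = k && k.method_table[selector]
  end

  # Any (re)definition throws away every cached route.
  def define(klass, selector, body)
    klass.method_table[selector] = body
    @cache.clear
  end
end

object = Klass.new(:Object, nil, { to_s: :object_to_s })
string = Klass.new(:String, object, {})
router = Router.new

first  = router.lookup(string, :to_s)   # miss: found on Object
second = router.lookup(string, :to_s)   # hit: served from the cache
router.define(string, :to_s, :string_to_s)
third  = router.lookup(string, :to_s)   # miss again after the redefinition
```

Clearing everything is crude but cheap to implement; the cost shows up as a wave of slow lookups right after each definition.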

Page 45: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Cache - before include

Page 46: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Cache - after include

• ALL methods on ALL classes invoked since VM startup are expired
• DON’T extend / include in a request / response cycle
• Rails busts the method cache multiple times on boot

Page 47: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Method cache - Warm

• Average 95% hit rate

Page 48: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Instance Variables

• Optimization : the first 3 ivars are embedded in the object, i.e. no symbol table lookups required
• Index table per class VS a symbol table per object on MRI 1.8
• Index table is shared by all instances of the same class
• Saves on the memory footprint of a table per instance
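
The shared index table can be modelled as follows. IvarLayout and Instance are hypothetical names; the sketch ignores the embedding of the first three ivars and only shows the name-to-slot mapping being shared:

```ruby
# One layout per class: ivar name => slot number.
class IvarLayout
  def initialize
    @index = {}
  end

  def slot_for(name)
    @index[name] ||= @index.size   # assign the next free slot on first use
  end

  def slot_of(name)
    @index[name]                   # lookup only, never assigns
  end
end

# Instances store values only; the names live in the shared layout.
class Instance
  def initialize(layout)
    @layout = layout
    @slots = []
  end

  def set(name, value)
    @slots[@layout.slot_for(name)] = value
  end

  def get(name)
    slot = @layout.slot_of(name)
    slot && @slots[slot]
  end
end

layout = IvarLayout.new
a = Instance.new(layout)
b = Instance.new(layout)
a.set(:@x, 1)
b.set(:@y, 2)
b.set(:@x, 3)
```

Each instance pays for an array of values; the cost of the names is paid once per class.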

Page 49: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Garbage Collection

Page 50: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Process Memory Layout

• Code segment : executable code, read only area
• Stack segment : stack storage, addressed with stack pointers
• Heap : stretch of memory available for program / developer use

Page 51: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

malloc / free layout

• Free chunks == the free list
• Linear search overhead to find free chunks

Page 52: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

a better layout

• Free chunks indexed by size intervals
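
The difference between the two layouts can be sketched by counting search steps. Both classes are hypothetical models in which a chunk is just its size in bytes, and the 64 byte bin width is an assumption:

```ruby
# Single free list: first fit by walking every chunk in order.
class LinearFreeList
  attr_reader :steps

  def initialize(chunks)
    @chunks = chunks.dup
    @steps = 0
  end

  def allocate(size)
    @chunks.each_with_index do |chunk, i|
      @steps += 1
      return @chunks.delete_at(i) if chunk >= size
    end
    nil
  end
end

# Free chunks indexed by size interval.
class BinnedFreeList
  BIN_WIDTH = 64   # assumed size interval per bin

  attr_reader :steps

  def initialize(chunks)
    @bins = Hash.new { |hash, key| hash[key] = [] }
    chunks.each { |chunk| @bins[chunk / BIN_WIDTH] << chunk }
    @steps = 0
  end

  def allocate(size)
    # First bin in which every chunk is guaranteed to be large enough.
    start = (size + BIN_WIDTH - 1) / BIN_WIDTH
    @bins.keys.sort.each do |bin|
      next if bin < start
      @steps += 1
      return @bins[bin].pop unless @bins[bin].empty?
    end
    nil
  end
end

chunks = [16, 32, 48, 64, 96, 512]
linear = LinearFreeList.new(chunks)
binned = BinnedFreeList.new(chunks)
from_linear = linear.allocate(500)
from_binned = binned.allocate(500)
```

Both allocators hand out the same 512 byte chunk, but the binned one skips every chunk that is known to be too small.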

Page 53: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Garbage Collection

• Objects allocated explicitly on the heap
• Automatically reclaim memory chunks not accessible from the root set
• Root set : C stack, global vars, global constants (accessible without pointer scanning)
• Unreachable hooks : variable assignment (nil), method return etc.
• Stop the World : halts execution to reclaim memory, very disruptive when in the hot path
• Incremental : some collection actions occur for each allocation, smoother and suitable for realtime requirements

Page 54: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

GC Algorithms

• Most scripting languages implement one of the following
• Mark and Sweep : identifies reachable chunks and assumes the remainder is garbage (concerned with garbage)
• Stop and Copy : 2 heap spaces, copies reachable chunks to the new active heap area (concerned with live chunks)

Page 55: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

GC Issues

• Memory fragmentation
• Dangling pointers
• Memory leaks from incomplete recycling (circular garbage and conservative GC)
• Bursty allocation
• Knowledge of pointer and chunk layouts required

Page 56: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Ruby heap layout

• Multiple heaps, referenced through the heap list
• Heaps are freed when empty, i.e. IF all slots are tagged free
• Ballpark : Rails allocates 4 to 6 heaps on startup

Page 57: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Per heap slots layout

• Each slot references a single object
• 10 000 slots per Ruby heap
• Threshold of 4096 free slots per heap
• Free list points to the next free slot
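
Slot usage can be observed from a running process with ObjectSpace.count_objects, which is specific to CRuby. The numbers depend on the Ruby version and the program, so none are shown here:

```ruby
counts = ObjectSpace.count_objects

total_slots = counts[:TOTAL]   # every slot across all Ruby heaps
free_slots  = counts[:FREE]    # slots currently not holding an object
used_slots  = total_slots - free_slots

puts "slots: #{total_slots}, free: #{free_slots}, in use: #{used_slots}"
```

Watching the free count shrink between two calls gives a rough feel for how quickly a piece of code consumes slots.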

Page 58: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Heaps and slots layout

Page 59: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Pointer Layout

• Pointer layout of both the program data area and heap is self describing
• RVALUE union can accommodate any Ruby object; Ruby frames, global variable structures etc. are well defined
• 20 bytes (32bit arch) of Ruby heap space are required to represent a slot

Page 60: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Ruby Heap VS OS Heap

• Slot points to the actual object data, on the OS / system heap
• A 20 byte (32bit arch) slot can reference e.g. a 2MB chunk on the system heap
• RVALUE union can accommodate any Ruby object; Ruby frames, global variable structures etc. are well defined
• 20 bytes of Ruby heap space are required to represent a slot

Page 61: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

CRuby: Mark and Sweep

• Conservative : cannot determine with certainty if a given value is a pointer or not, and assumes it’s in use
• Two phase implementation
• Mark phase : marks all objects reachable from the current program context
• Sweep phase : iterates through the object space and frees all objects not marked + unmarks the marked ones
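
The two phases can be sketched over a toy heap. Slot, mark and collect are hypothetical names; the model is precise rather than conservative, since every reference is known to be a reference:

```ruby
# Each slot has a name, outgoing references and a mark bit.
Slot = Struct.new(:name, :refs, :marked)

# Mark phase: recursively flag everything reachable from a slot.
def mark(slot)
  return if slot.marked
  slot.marked = true
  slot.refs.each { |child| mark(child) }
end

def collect(heap, roots)
  roots.each { |root| mark(root) }
  # Sweep phase: unmarked slots are garbage, survivors get unmarked.
  live, garbage = heap.partition(&:marked)
  live.each { |slot| slot.marked = false }
  [live, garbage]
end

a = Slot.new(:a, [], false)
b = Slot.new(:b, [a], false)
c = Slot.new(:c, [], false)
d = Slot.new(:d, [c], false)
c.refs << d                      # c and d reference each other, unreachable from b

live, garbage = collect([a, b, c, d], [b])
live_names    = live.map(&:name)
garbage_names = garbage.map(&:name)
```

The cycle between c and d is collected because reachability is judged from the roots, not from whether anything at all points at a slot.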

Page 62: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Pros and Cons

• Pauses program execution
• Work is proportional to the heap size
• Prone to memory fragmentation (no compaction)
• Recursive
• Every 8MB allocated triggers GC
• 8m malloc calls also trigger GC
• Frees all* memory that can be freed

Page 63: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Source representation

Page 64: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Objectspace

Page 65: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Objectspace - marked

Page 66: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Objectspace after sweep

Page 67: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Generational GC

• Vast majority of objects are short lived ( 80% + )
• Expensive to continuously account for long lived objects
• Partition objects by age and collect short lived ones more frequently OR
• Restrict GC to the most recently modified slots
• Perform a full GC only when the younger generation fails to meet current memory requirements

Page 68: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Context Switches

Page 69: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Threading

• First CRuby to support native OS Threads
• Ruby thread == pthread
• Scheduling, synchronization and creation are delegated to syscalls, which implies a user / kernel space context switch
• Can use multiple CPU cores - NOT at the same time though
• No parallel execution - Global VM Lock (GVL)
• ... although MacRuby doesn’t have a GVL

Page 70: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Global VM Lock (GVL)

• The thread that owns the GVL is allowed to execute
• Blocking operations should release the GVL so as not to block the process
• Also released during scheduling
• Allows for easy C extensions - the author doesn’t have to be concerned with synchronization
• The Kernel is better suited to load balancing multiple processes than most developers can squeeze from a single process
• Unconstrained Threading is a weapon of mass destruction
• The performance effect on existing apps that rely on user space threads from MRI 1.8 may be significant
• Unix pipes are often the best scheduler ....
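
That blocking operations give up the GVL can be observed with a few sleeping threads. A small sketch; the thread count and sleep duration are arbitrary, and sleep stands in for any blocking call that releases the lock:

```ruby
# Four threads that each block for 0.2 seconds. Sleeping releases the GVL,
# so the waits overlap instead of adding up to 0.8 seconds.
started = Time.now
threads = 4.times.map { Thread.new { sleep 0.2 } }
threads.each(&:join)
elapsed = Time.now - started

puts "4 x 0.2s of blocking took #{elapsed.round(2)}s"
```

The same experiment with a CPU bound loop in each thread would show no such overlap, since only one thread can run Ruby code at a time.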

Page 71: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Releasing the GVL

• Internal API exposed to release the GVL
• Blocking function : slow system call / computation
• Unblock function : called on Thread interrupt
• Dangerous territory - look for alternatives first
• Cannot access Ruby VALUEs in blocking functions
• No exception handling

Page 72: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Blocking VM Operations

• IO : potentially blocking reads / writes
• DNS resolution / connects : often have a lot more handshake overhead
• Expensive Bignum computations blocked 1.8 interpreters
• File locking
• Process.waitpid

Page 73: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Fibers

• Coroutines for lightweight concurrency (4k stack size)
• Very fast user space context switches
• Cooperative scheduling required - also not concurrent
• Common use cases being generators or blocking IO e.g. Neverblock
• Fiber.yield pauses the activation record, which keeps context across multiple calls
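
The generator use case can be sketched with a Fiber that produces Fibonacci numbers on demand; the sequence is just an example payload:

```ruby
fib = Fiber.new do
  a, b = 0, 1
  loop do
    Fiber.yield a      # hand a value back to the caller and pause here
    a, b = b, a + b    # resumes from this point on the next call
  end
end

first_six = 6.times.map { fib.resume }
```

The local variables a and b survive between calls to resume, which is the "keeps context across multiple calls" behaviour from the slide.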

Page 74: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

The Road Ahead

• MVM : Multiple Virtual Machines
• Shared process space, cannot share state
• Distribute VMs across multiple cores
• Message passing / channel API for inter VM communication
• Many Ruby deployments are not thread safe - MVM is better suited for this use case
• Thread safe framework does not guarantee a thread safe application ...

Page 75: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Questions ?

Page 76: Dynamic Language VMs - Inside Ruby (Sapo Codebits 2009)

Thanks for Listening !

@methodmissing
http://github.com/methodmissing
http://www.methodmissing.com
