Pynto Design ============ This document is intended to describe the data structures and some of the high level and optimizations. It does not describe the implementations or algorithms in detail. See algorithms.txt for the latter. See walkthrough.txt for a more concrete example. Goals ===== Pynto's goals are 100% compatibility with interpreted Python across a wide variety of platforms, not to mention incredible speed ups. This is very hard to obtain, especially because the Python interpreter has several hardcoded things. Optimizations ============= The majority of Pynto's optimizations focus on type improvements --- knowing what type an object is can enable to optimize the operations performed on it to great effect. This is not an easy task because Python's semantics are incredibly flexible. Pynto focuses on several types of optimization 1. Type Determination --- sometimes it is possible to say statically what the type of a given object will be. Pynto determines this to the best of its ability and takes advantage of that. For example: def f(list): i = 0 for item in list: print "%d: %s" % (i, item) i = i + 1 In this function, the variable 'i' always contains an integer (ignoring overflow issues which may cause it to contain a larger item). Pynto can thus optimize 'a' to be stored in a simple word counter rather than an object. 2. Type Specialization --- function parmeters will always introduce ambiguity. Sometimes it is possible to generate several versions of a function --- each version will make a different assumption. For example, we might make a version of the function 'f' above that assumes that list is a python list object. Then, when we are calling 'f' with a list object, we can use that specialized version; alternatively, we can check in a more general version of 'f' whether 'list' is a list object and call the specialized version dynamically. An important instance of this occurs for class methods --- we can create versions of the methods which assume that the self pointer is of the correct class type (or one derived from it by single inheritance) and which optimize attribute lookups and other expensive operations. 3. Structural Speculation --- virtually nothing can be taken for granted in Python. For example, examine the following module: def f(a): for item in a: g(a) def g(a): print a One might assume that the function 'f' called the function 'g' above; however, this is not necessarily true. To see this, examine the following module which imports the preceding example under the name 'example': import example def h(a): print "ha ha!" example.g = h example.f( (1, 2) ) Scary, huh? The fact remains, however, that few people would ever do this, and if they do, they deserve the slow performance that results. Pynto thus optimizes around this by making certain assumptions --- such as that global module function definitions won't change. If those assumptions turn out to be violated, we revert to byte code or a more conservative version of an existing function. In the above example, the change can be detected by checking in the __setattr__ method of the module if the changed item is a global function definition. The KTree ========= Pynto maintains awareness of everything that it has operated on; as more things are compiled, it may go back and optimize previously compiled functions because new information is available. The more you put through Pynto the better the results will be due to improved type determination. This information is compiled into Pynto's main data structure, the Knowledge Tree, or ktree for short. The ktree is a sort of symbol table, used to determine what to expect to see when the user dereferences a pointer of a given type or loads a global variable. Each knowledge node (knode) in the tree corresponds to a Python object. For example, module objects are represented in the tree. Within the module object are class objects, function objects, and other items that receive knodes. Not all attributes of a module receive a knode: only those items that we think are likely to keep their current value. Any generated code which relies on that value, making the assumption (for example) that it knows which function os.path.join is, will be carefully marked so that if os.path.join changes value it can be thrown out. It is possible to add knodes for global and other variables that do not have a constant value; we can also add knodes that simply assert the type of the structure. Like an assumed value, we can revert code that depends on this assumption if necessary. KNodes ------ There are five kinds of knodes: modules, classes, functions, constants, and instances. Except for modules, each has certain unique properties in addition to the set of children on the ktree. Classes correspond to all types. At start up time we create a Class knode for each built-in type. As we parse modules, user defined classes also receive class knodes. Each class knode keeps a separate list of knode children that correspond to attributes that only exist on instances of those classes, as opposed to existing on the class object itself. Functions keep a separate list of knode children corresponding to formal parameters to the function; they also keep the original Python codeobject as well as various jitted versions of it which may make different assumptions about the types of the parameters. Constants simply keep a constant value. Instances keep a list of other Class knodes indicating what set of types the given item may have. IR == Pynto's intermediate representation is made up of three components: blocks, instructions, and values. Blocks correspond to basic blocks and embody the control flow of the program; instructions correspond to a single action and have values as operands; values are the components that can be manipulated by instructions, such as constants or local variables. Currently, all Pynto values correspond to pointers. Eventually I would like to allow them to represent scalar values like integers. Pynto's IR is an SSA IR, meaning that each value can only be written by at most one instruction (in the case of constants, no instruction writes to the value). "Write" in this case means to modify the pointer or scalar value, not the contents pointed at by it. Blocks ------ Basic blocks are defined in any compiler textbook. Pynto's block structure contains a list of instructions, successors, and predeccessors (sp). Each instruction in the list except for the last one must have at most one successor, and it is assumed to fall through to the next instruction. The final instruction will have the same number of successors as the basic block and it decides which one to travel to. Instructions ------------ Instructions consist of an opcode, a set of values, and some flags indicating how the values are used. These values are the parameters of the instruction and are the same values discussed above in the section on code nodes. Values can be modified by an instruction in a few ways: Read: this means that the incoming pointer value is read, but not the data it points at. ReadObject: this means that the incoming pointer is used to access some fields of the object it points at in a read-only fashion. WriteObject: this means that the incoming pointer is used to modify some fields of the objects it points at. This implies that the incoming pointer value is important. Write: this means that the pointer value is changed, but not any of the fields it points at. WriteNew: this means that the pointer value is changed, and the object generated is guaranteed to be unique (i.e., this is a constructor). Note that ``ReadObject`` and ``WriteObject`` both imply a ``Read``, and ``WriteNew`` is a subset of ``Write``. Generally any given value is associated with one of the above modes for a given instruction: sometimes, however, an value may be used by the same instruction multiple times in different capacities. Pynto's IR is an SSA IR, however, meaning that every operand is "written" exactly once. No instruction may both read and write a single operand. Instructions like "a += 3" are implemented with two values: one value represents the incoming value of 'a', and one represents the outgoing value of 'a'. Note that they will commonly have the same pointer value at runtime, but not necessarily. Values ------ There are three kinds of values: constants, temporaries, and the undefined value (a singleton). Constant values represent constants embedded in the instruction stream. Temporaries represent local variables and other unknown values manipulated by the instructions. There is no way for an instruction to directly manipulate a global variable; instead, there are instructions that load and store global values. Cell and free variables are handled by having a temporary that points at the cell, and then using instructions to dereference it to load and store its current value. Program Flow ============ Basic Pass ---------- Adding a module is first added to the system provokes a number of phases. First, we do a walk of the module and add knodes for each function, class, etc. A small amount of initial processing takes place: for functions, we construct an initial copy of the IR from the byte codes, making no assumptions about the types of parameters. One thing worth nothing is that for class methods we always compile two versions: one that is purely generic with no assumptions made about the types of parameters and one that assumes that the first parameter (the 'self' parameter!) is an instance of the class in question. The latter version is what is used when an instance of the class is created. Refinement Pass --------------- Once all the knodes are processed, we enter the ``refine`` stage. Refinement can be thought of as a local optimization. It locates unreachable blocks and instructions, propagates constants when possible, and in particular handles knode dereferences. A knode dereference occurs most commonly with the LoadGlobal instruction, which loads a global by name --- if the global is found on the ktree with a constant value (such as a function pointer) then we will remove the LoadGlobal instruction and instead simply replace the target operand with a constant valued operand that points at the function. This also affects LoadAttr instructions where the ptr we are loading from is a constant or of a known type. Once the IR exits the refinment stage it is basically in a reasonable form for execution: some of the most glaring inefficiencies of python have been hidden. However, because we make no assumptions (other than for 'self') about the types of our operands, we are sometimes quite limited in how efficient this can be. Memory Management ----------------- Before the memory management phase, we don't make any effort to track ref counts or otherwise worry about the lifetime of data. The memory management phase introduces inc and dec ref instructions for each temporary as they go in and out of scope. For now, we assume that all WRITEs of a temporary increase the ref count of the corresponding object. This has a downside: we cannot remove unneeded and extraneous ref counts. On the other hand, its simpler than the alternative, which is to assume that some instructions inc the ref counts and others don't, depending on some sort of flag. Code Generation ---------------