Python syntax and semantics

The syntax of the Python programming language is the set of rules that defines how a Python program will be written and interpreted (by both the runtime system and by human readers). Python was designed to be a highly readable language. It aims toward an uncluttered visual layout, uses English keywords frequently where other languages use punctuation, and has notably fewer syntactic constructions than many structured languages such as C, Perl, or Pascal.

Indentation

Python uses whitespace to delimit program blocks, following the off-side rule. Sometimes termed "the whitespace thing", this uncommon block-marking convention is a feature that many programmers otherwise unfamiliar with Python have heard of. Python borrows the feature from its predecessor ABC: instead of punctuation or keywords, it uses indentation to indicate the run of a block.

In so-called "free-format" languages, which use the block structure derived from ALGOL, blocks of code are set off with braces ({ }) or keywords. In most coding conventions for these languages, programmers indent the code within a block to set it off visually from the surrounding code.

Consider a function, foo, which is passed a single parameter, x, and if the parameter is 0 will call bar and baz, otherwise it will call qux, passing x, and also call itself recursively, passing x-1 as the parameter. Here are implementations of this function in both C and Python:

foo function in C with K&R indent style:

    void foo(int x)
    {
        if (x == 0) {
            bar();
            baz();
        } else {
            qux(x);
            foo(x - 1);
        }
    }

foo function in Python:

    def foo(x):
        if x == 0:
            bar()
            baz()
        else:
            qux(x)
            foo(x - 1)

Some users have drawn an unflattering comparison of Python with the column-oriented style used on punched-card Fortran systems. To most Python programmers, "the whitespace thing" simply mandates a convention that programmers in ALGOL-style languages follow anyway. Moreover, in free-form syntax, since indentation is ignored, good indentation cannot be enforced by an interpreter or compiler. Incorrectly indented code can be understood by a human reader differently than by a compiler or interpreter. For example:

Misleading indentation in C:

    for (i = 0; i < 20; ++i)
        a();
        b();
        c();

This code is intended to call functions a(), b(), and c() 20 times. However, the interpreted code block is just {a();}. The code calls a() 20 times, and then calls b() and c() one time each. Novice programmers frequently misread samples like this.
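
For comparison, the same loop can be sketched in Python, where the indentation itself decides which statements belong to the loop body. The functions a, b and c below are hypothetical stand-ins that only count their own calls.

```python
# Hypothetical stand-ins for a(), b() and c(): each counts its calls.
calls = {"a": 0, "b": 0, "c": 0}

def a():
    calls["a"] += 1

def b():
    calls["b"] += 1

def c():
    calls["c"] += 1

# All three calls are indented under the loop, so all three repeat.
for i in range(20):
    a()
    b()
    c()

# Dedenting b() and c() takes them out of the loop body.
for i in range(20):
    a()
b()
c()
```

After both loops have run, a has been called 40 times and b and c 21 times each: what the reader sees is what the interpreter executes.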

Both space characters and tab characters are currently accepted as forms of indentation in Python. Since many tools do not visually distinguish them, mixing spaces and tabs can create bugs that take specific efforts to find (a perennial suggestion among Python users has been removing tabs as block markers—except, of course, among those Python users who propound removing spaces instead). Moreover, formatting routines which remove whitespace—for instance, many Internet forums—can completely destroy the syntax of a Python program, whereas a program in a bracketed language would merely become more difficult to read.

Because whitespace is syntactically significant, it is not generally possible for a text editor or code prettifier to automatically correct the indentation in Python code as can be done with C or Lisp code. Many popular code editors handle Python's indentation conventions seamlessly, sometimes after a configuration option is enabled.

Data structures

Since Python is a dynamically typed language, Python "values," not variables, carry type. This has implications for many aspects of the way the language functions.

All variables in Python hold references to objects, and these references are passed to functions; a function cannot change the value a variable references in its calling function. Some people (including Guido van Rossum himself) have called this parameter-passing scheme "Call by object reference."
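
The difference between rebinding a name and mutating the object it refers to can be illustrated with a short sketch. The function names rebind and mutate are chosen here for illustration only.

```python
def rebind(items):
    # Assignment rebinds the local name only; the caller's variable
    # still refers to the original object.
    items = ["replaced"]

def mutate(items):
    # Mutation acts on the shared object, so the caller sees the change.
    items.append("added")

data = ["original"]
rebind(data)
after_rebind = list(data)    # snapshot: still ["original"]
mutate(data)
after_mutate = list(data)    # snapshot: ["original", "added"]
```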

Among dynamically typed languages, Python is moderately type-checked. Implicit conversion is defined for numeric types, so one may validly multiply a complex number by a long integer (for instance) without explicit casting. However, there is no implicit conversion between (e.g.) numbers and strings; a string is an invalid argument to a mathematical function expecting a number.
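
A brief sketch of both behaviours follows; the values are arbitrary. Mixed numeric types combine without casts, while a string passed to a mathematical function is rejected rather than converted.

```python
import math

# Numeric types are converted implicitly: int * complex gives complex.
product = 10**30 * (1 + 2j)

# Strings are never converted to numbers implicitly.
try:
    math.sqrt("16")
    outcome = "converted"
except TypeError:
    outcome = "TypeError"

# An explicit conversion is required instead.
root = math.sqrt(float("16"))
```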

Base types

Python has a broad range of basic data types. Alongside conventional integer and floating point arithmetic, it transparently supports arbitrary-precision arithmetic, complex numbers, and decimal floating point numbers.
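
As an illustration of these numeric types, with arbitrarily chosen values:

```python
from decimal import Decimal

# Integers grow to arbitrary precision as needed.
big = 2 ** 200

# Complex numbers are built in; j denotes the imaginary unit.
z = (1 + 2j) * (3 - 1j)

# Binary floating point cannot represent 0.1 exactly; Decimal can.
binary_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")
```

Here z is 5+5j, binary_sum differs slightly from 0.3, and decimal_sum is exactly Decimal("0.3").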

Python supports a wide variety of string operations. Strings in Python are immutable, so a string operation such as a substitution of characters, that in other programming languages might alter a string in place, returns a new string in Python. Performance considerations sometimes push for using special techniques in programs that modify strings intensively, such as joining character arrays into strings only as needed.
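
The following sketch shows both points: a substitution returns a new string, and pieces are collected in a list and joined once at the end. The sample text is arbitrary.

```python
original = "spam and eggs"

# replace() returns a new string; the original is left untouched.
changed = original.replace("eggs", "toast")

# Collect pieces in a list and join once, rather than concatenating
# repeatedly inside the loop.
pieces = []
for word in original.split():
    pieces.append(word.upper())
shouted = " ".join(pieces)
```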

Collection types

One of the very useful aspects of Python is the concept of "collection" (or "container") types. In general a collection is an object that contains other objects in a way that is easily referenced or "indexed". Collections come in two basic forms: "sequences" and "mappings".

The ordered sequential types are lists (dynamic arrays), tuples, and strings. All sequences are indexed positionally (0 through "length" − 1) and all but strings can contain any type of object, including multiple types in the same sequence. Both strings and tuples are immutable, making them perfect candidates for dictionary keys (see below). Lists, on the other hand, are mutable; elements can be inserted, deleted, modified, appended, or sorted in-place.

On the other side of the collections coin are mappings, which are unordered types implemented in the form of "dictionaries" that "map" a set of immutable keys to corresponding elements, much like a mathematical function. The keys in a dictionary must be of an immutable Python type such as an integer or a string. For example, one could define a dictionary having a string "toast" mapped to the integer 42 or vice versa. The mapping is implemented with a hash function, which makes lookups fast but is also the reason a dictionary has no defined order and the reason mutable objects (e.g. other dictionaries or lists) cannot be used as keys. Dictionaries are also central to the internals of the language, as they reside at the core of all Python objects and classes: the mapping between variable names (strings) and the values which the names reference is stored as a dictionary (see Object system). Since these dictionaries are directly accessible (via an object's __dict__ attribute), metaprogramming is a straightforward and natural process in Python.
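
A small sketch of these rules follows. The class Dish is hypothetical and exists only to show the attribute dictionary.

```python
menu = {"toast": 42, 42: "toast"}

# A list is mutable and therefore unhashable, so it cannot be a key.
try:
    menu[["spam", "eggs"]] = 1
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# A tuple with the same contents is immutable and works as a key.
menu[("spam", "eggs")] = 1

class Dish(object):
    """Hypothetical class, used only to show the attribute dictionary."""
    def __init__(self, name):
        self.name = name

dish = Dish("spam")
# Instance attributes live in a dictionary that can be edited directly.
dish.__dict__["price"] = 3
```

After the last line, dish.price is 3, even though no ordinary attribute assignment was written.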

A set collection type was added to the core language in version 2.4. A set is an unindexed, unordered collection that contains no duplicates, and implements set theoretic operations such as union, intersection, difference, symmetric difference, and subset testing. There are two types of sets: set and frozenset, the only difference being that set is mutable and frozenset is immutable. Elements in a set must be hashable and immutable. Thus, for example, a frozenset can be an element of a regular set whereas the opposite is not true.
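
The set operations and the hashability rule can be sketched as follows, with arbitrary sample contents:

```python
breakfast = set(["spam", "eggs", "toast"])
lunch = set(["spam", "beans"])

union = breakfast | lunch
common = breakfast & lunch
only_breakfast = breakfast - lunch
either_not_both = breakfast ^ lunch
is_subset = set(["spam"]) <= breakfast

# A frozenset is hashable, so it can be an element of another set...
menus = set([frozenset(breakfast)])

# ...whereas a mutable set cannot.
try:
    menus.add(lunch)
    set_in_set = True
except TypeError:
    set_in_set = False
```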

Python also provides extensive collection manipulating abilities such as built in containment checking and a generic iteration protocol.

Object system

In Python, everything is an object, even classes. Classes, as objects, have a class, which is known as their metaclass. Python also supports multiple inheritance and mixins.

The language supports extensive introspection of types and classes. Types can be read and compared—types are instances of type. The attributes of an object can be extracted as a dictionary.

Operators can be overloaded in Python by defining special member functions—for instance, defining __add__ on a class permits one to use the + operator on members of that class.
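
As a minimal sketch of operator overloading and introspection, assuming a hypothetical two-dimensional Vector class:

```python
class Vector(object):
    """Hypothetical two-dimensional vector, for illustration only."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for the expression: self + other
        return Vector(self.x + other.x, self.y + other.y)

total = Vector(1, 2) + Vector(3, 4)

# Introspection: the class of an instance, the class of a class,
# and the attributes of an object as a dictionary.
instance_type = type(total)
class_type = type(Vector)
attributes = vars(total)
```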


Comparison operators

The basic comparison operators such as ==, <, >=, and so forth, are used on all manner of values. Numbers, strings, sequences, and mappings can all be compared. Although disparate types (such as a str and an int) are defined to have a consistent relative ordering, this is considered a historical design quirk, and will no longer be allowed in Python 3000.

Chained comparison expressions such as a < b < c have roughly the meaning that they have in mathematics, rather than the unusual meaning found in C and similar languages. The terms are evaluated and compared in order. The operation has short-circuit semantics, meaning that evaluation is guaranteed to stop as soon as a verdict is clear: if a < b is false, c is never evaluated as the expression cannot possibly be true anymore.

For expressions without side effects, a < b < c is equivalent to a < b and b < c. However, there is a substantial difference when the expressions have side effects. a < f(x) < b will evaluate f(x) exactly once, whereas a < f(x) and f(x) < b will evaluate it twice if the value of a is less than f(x) and once otherwise.
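
The difference in the number of evaluations can be observed with a function that records its calls. The function f below is a stand-in with a deliberate side effect.

```python
calls = []

def f(x):
    # Record each call so the number of evaluations can be counted.
    calls.append(x)
    return x

chained = 0 < f(5) < 10
chained_calls = len(calls)       # f ran once

del calls[:]
expanded = 0 < f(5) and f(5) < 10
expanded_calls = len(calls)      # f ran twice

# Short-circuit: the first comparison fails, so f is never called.
del calls[:]
short = 1 < 0 < f(5)
short_calls = len(calls)
```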

Logical operators

Python 2.2 and earlier does not have an explicit boolean type. In all versions of Python, boolean operators treat zero values or empty values such as "", 0, None, 0.0, [], and {} as false, while in general treating non-empty, non-zero values as true. In Python 2.2.1 the boolean constants True and False were added to the language (as names for 1 and 0; since Python 2.3 they are instances of bool, a subclass of int). The binary comparison operators such as == and > return either True or False.

The boolean operators and and or use minimal evaluation. For example, y == 0 or x/y > 100 will never raise a divide-by-zero exception. Note that these operators return the value of the last operand evaluated, rather than True or False. Thus the expression (4 and 5) evaluates to 5, and (4 or 5) evaluates to 4.
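
These rules can be checked directly. The variable names and the default-value idiom in the last line are illustrative additions.

```python
x, y = 10, 0

# y == 0 is true, so x / y is never evaluated and nothing is raised.
safe = (y == 0 or x / y > 100)

# and/or return one of their operands, not necessarily True or False.
both = 4 and 5
either = 4 or 5

# A common idiom that relies on this: fall back to a default value.
name = "" or "anonymous"
```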

Functional programming

Another strength of Python is the availability of a functional programming style, which makes working with lists and other collections much more straightforward.

List comprehensions

One such construction is the list comprehension, as seen here in calculating the first five powers of two:

    powers_of_two = [2**n for n in range(1, 6)]

The Quicksort algorithm can be expressed elegantly using list comprehensions:

    def qsort(L):
        if L == []:
            return []
        pivot = L[0]
        return (qsort([x for x in L[1:] if x < pivot]) +
                [pivot] +
                qsort([x for x in L[1:] if x >= pivot]))

Although execution of this functional form of the Quicksort algorithm is less space-efficient than other forms that alter the sequence in-place, it is often cited as an example of the expressive power of list comprehensions.

First-class functions

In Python, functions are first-class objects that can be created and passed around dynamically.

Python's limited support for anonymous functions is the lambda construct. Since full anonymous functions are not available, named functions are the primary way of using functions in Python. Lambdas are limited to containing expressions rather than statements, although control flow can still be implemented, less elegantly, within a lambda by using short-circuiting (see IBM developerWorks, "Functional Programming in Python").
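
A short sketch of both uses follows: a lambda passed as an argument, and control flow expressed through short-circuiting. The names words and sign are hypothetical.

```python
# A lambda is a single expression; it is often passed as an argument.
words = ["toast", "Spam", "eggs"]
ordered = sorted(words, key=lambda w: w.lower())

# Control flow through short-circuiting: and/or stand in for if/else.
# This breaks if a "true" branch yields a false value, which is one
# reason the form is considered inelegant.
sign = lambda n: (n < 0 and "negative") or (n > 0 and "positive") or "zero"
```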

Closures

Python has had support for lexical closures since version 2.2. Here's an example:

    def derivative(f, dx):
        """Return a function that approximates the derivative of f
        using an interval of dx, which should be appropriately small.
        """
        def function(x):
            return (f(x + dx) - f(x)) / dx
        return function

Python's syntax, though, sometimes leads programmers of other languages to think that closures are not supported. Variable scope in Python is implicitly determined by the scope in which one assigns a value to the variable, so it's not possible to assign to a variable from an outer scope. The following example illustrates the issue:

    def makeCounter():
        num = 0
        def count():
            # Assignment used! This 'num' is local to the count function;
            # it is not the same 'num' that was used in makeCounter's scope,
            # so calling count() fails.
            num += 1
            return num
        return count

This can be solved by using mutation instead of assignment (see Ivan, "Closures in Python (part 2)"). However, the preferred approach when reassignment is required is to use a class.
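
Both workarounds can be sketched briefly. The names make_counter and Counter are chosen here for illustration; the one-element list is just one possible mutable container.

```python
def make_counter():
    # Mutation approach: the inner function never assigns to the name
    # num; it only changes the contents of the list it refers to.
    num = [0]
    def count():
        num[0] += 1
        return num[0]
    return count

class Counter(object):
    # Class approach: the state is an ordinary instance attribute.
    def __init__(self):
        self.num = 0

    def __call__(self):
        self.num += 1
        return self.num

count = make_counter()
first, second = count(), count()

counter = Counter()
third, fourth = counter(), counter()
```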

Generators

Introduced in Python 2.2 as an optional feature and finalized in version 2.3, generators are Python's mechanism for lazy evaluation of a function that would otherwise return a space-prohibitive or computationally intensive list.

This is an example to lazily generate the prime numbers:

    from itertools import count

    def generate_primes(stop_at=0):
        primes = []
        for n in count(2):
            if 0 < stop_at < n:
                return  # raises the StopIteration exception
            composite = False
            for p in primes:
                if not n % p:
                    composite = True
                    break
                elif p**2 > n:
                    break
            if not composite:
                primes.append(n)
                yield n

To use this function simply call, e.g.:

    for i in generate_primes():  # iterate over ALL primes
        if i > 100:
            break
        print(i)

The definition of a generator appears identical to that of a function, except the keyword yield is used in place of return. However, a generator is an object with persistent state, which can repeatedly enter and leave the same scope. A generator call can then be used in place of a list, or other structure whose elements will be iterated over. Whenever the for loop in the example requires the next item, the generator is called, and yields the next item.

Generators don't have to be infinite like the prime-number example above. When a generator terminates, an internal exception is raised which indicates to any calling context that there are no more values. A for loop or other iteration will then terminate.
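
A finite generator and its termination can be sketched as follows. The generator countdown is a hypothetical example, and next() is used to drive it by hand.

```python
def countdown(start):
    # Hypothetical finite generator: yields start, start - 1, ..., 1.
    while start > 0:
        yield start
        start -= 1

# Ordinary iteration stops silently when the generator is exhausted.
collected = list(countdown(3))

# Driving the generator by hand shows the exception at exhaustion.
gen = countdown(1)
only_value = next(gen)
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
```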

Generator expressions

Introduced in Python 2.4, generator expressions are the lazy evaluation equivalent of list comprehensions. Using the prime number generator provided in the above section, we might define a lazy, but not quite infinite collection.

    from itertools import islice

    first_million_primes = (i for i in generate_primes() if i < 1000000)
    # Index 1999 is the 2000th element, since counting starts at 0.
    two_thousandth_prime = next(islice(first_million_primes, 1999, 2000))

Most of the memory and time needed to generate this many primes will not be used until the needed element is actually accessed. Unfortunately, you cannot perform simple indexing and slicing of generators, but must use the itertools module or "roll your own" loops. In contrast, a list comprehension is functionally equivalent, but is "greedy" in performing all the work:

    first_million_primes = [i for i in generate_primes(2000000) if i < 1000000]
    # Index 1999 is the 2000th element, since counting starts at 0.
    two_thousandth_prime = first_million_primes[1999]

The list comprehension will immediately create a large list (with 78498 items in the example, while transiently also building a list of the primes under two million), even if most elements are never accessed. The generator expression is more parsimonious.

Object-oriented programming

Python supports most object-oriented programming techniques. It allows polymorphism, not only within a class hierarchy but also by duck typing: any object can be used for any type, and it will work so long as it has the proper methods and attributes. Everything in Python is an object, including classes, functions, numbers and modules. Python also has support for metaclasses, an advanced tool for enhancing classes' functionality. Naturally, inheritance, including multiple inheritance, is supported. It has limited support for private variables using name mangling; see the "Classes" section of the tutorial for details.

Many Python users don't feel the need for private variables, though. The slogan "We're all consenting adults here" is used to describe this attitude. Some consider information hiding to be unpythonic, in that it suggests that the class in question contains unaesthetic or ill-planned internals. However, the strongest argument for name mangling is prevention of unpredictable breakage of programs: introducing a new public variable in a superclass can break subclasses if they don't use "private" variables.
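
The breakage argument can be illustrated with a sketch in which a base class and a subclass both use an attribute called __token. The classes are hypothetical. Because each name is mangled with its own class name, the two attributes do not collide.

```python
class Base(object):
    def __init__(self):
        self.__token = "base"     # stored as _Base__token

    def base_token(self):
        return self.__token

class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__token = "derived"  # stored as _Derived__token

    def derived_token(self):
        return self.__token

obj = Derived()
```

The mangled names remain reachable from outside (for example obj._Base__token), which is why the support for privacy is described as limited.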

From the tutorial: "As is true for modules, classes in Python do not put an absolute barrier between definition and user, but rather rely on the politeness of the user not to 'break into the definition.'"

OOP doctrines such as the use of accessor methods to read data members are not enforced in Python. Just as Python offers functional-programming constructs but does not attempt to demand referential transparency, it offers an object system but does not demand OOP behavior. Moreover, it is always possible to redefine the class using "properties" so that when a certain variable is set or retrieved in calling code, it really invokes a function call, so that spam.eggs = toast might really invoke spam.set_eggs(toast). This nullifies the practical advantage of accessor functions, and it remains OOP because the property eggs becomes a legitimate part of the object's interface: it need not reflect an implementation detail.
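
A minimal sketch of such a property follows, reusing the spam.eggs naming of the paragraph above. The class Spam and its log attribute are assumptions made for the example.

```python
class Spam(object):
    """Hypothetical class; the names follow the spam.eggs example."""
    def __init__(self):
        self._eggs = None
        self.log = []

    def get_eggs(self):
        return self._eggs

    def set_eggs(self, value):
        # Runs on every assignment to spam.eggs.
        self.log.append(value)
        self._eggs = value

    eggs = property(get_eggs, set_eggs)

spam = Spam()
toast = "toast"
spam.eggs = toast   # really calls spam.set_eggs(toast)
```

Calling code keeps the plain attribute syntax, so a class can start with a simple attribute and introduce the accessor later without changing its users.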

In version 2.2 of Python, "new-style" classes were introduced. With new-style classes, objects and types were unified, allowing the subclassing of types. Even entirely new types can be defined, complete with custom behavior for infix operators. This allows for many radical things to be done syntactically within Python. A new method resolution order for multiple inheritance was also adopted with Python 2.3. It is also possible to run custom code while accessing or setting attributes, though the details of those techniques have evolved between Python versions.
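
The method resolution order of a new-style class can be inspected through its __mro__ attribute. The diamond-shaped hierarchy below is a hypothetical example.

```python
class A(object):
    def who(self):
        return "A"

class B(A):
    pass

class C(A):
    def who(self):
        return "C"

class D(B, C):
    pass

# The linearised order in which attribute lookups search the classes.
order = [cls.__name__ for cls in D.__mro__]
found = D().who()
```

Here the order is D, B, C, A, object, so the lookup of who reaches C before A.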

Exceptions

Python supports (and extensively uses) exception handling as a means of testing for error conditions and other "exceptional" events in a program. Indeed, it is even possible to trap the exception caused by a syntax error.
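
Trapping a syntax error can be sketched as below. The sketch assumes the faulty source is available as a string and compiled at run time, since a syntax error in the file being executed would prevent the try block itself from being parsed.

```python
source = "def broken(:\n    pass\n"

try:
    # compile() parses the text at run time, so the SyntaxError is
    # raised here, inside the try block, where it can be caught.
    compile(source, "<example>", "exec")
    caught = None
except SyntaxError as err:
    caught = err
```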

Python style calls for the use of exceptions whenever an error condition might arise. Rather than testing for access to a file or resource before actually using it, it is conventional in Python to just go ahead and try to use it, catching the exception if access is rejected.

Exceptions can also be used as a more general means of non-local transfer of control, even when an error is not at issue. For instance, the Mailman mailing list software, written in Python, uses exceptions to jump out of deeply-nested message-handling logic when a decision has been made to reject a message or hold it for moderator approval.

Exceptions are often, especially in threaded situations, used as an alternative to the if-block. A commonly-invoked motto is EAFP, or "It is Easier to Ask for Forgiveness than Permission." In this first code sample, there is an explicit check for the attribute (i.e., "asks permission"):

    if hasattr(spam, 'eggs'):
        ham = spam.eggs
    else:
        handle_error()

This second sample follows the EAFP paradigm:

    try:
        ham = spam.eggs
    except AttributeError:
        handle_error()

These two code samples have the same effect, although there will be performance differences. When spam has the attribute eggs, the EAFP sample will run faster. When spam does not have the attribute eggs (the "exceptional" case), the EAFP sample will run slower. The Python profiler can be used in specific cases to determine performance characteristics. If exceptional cases are rare, then the EAFP version will have better average performance than the alternative. In addition, it avoids the whole class of time-of-check-to-time-of-use (TOCTTOU) vulnerabilities and other race conditions (see "EAFP v. LBYL", python-list mailing list), and is compatible with duck typing.

Comments and docstrings

Python has two ways to annotate Python code. One is by using comments to indicate what some part of the code does.

    import sys

    def getline():
        return sys.stdin.readline()  # Get one line and return it

Comments begin with the hash character ("#") and are terminated by the end of line. Python does not support comments that span more than one line. The other way is to use docstrings (documentation strings): a docstring is a string that is located alone, without assignment, as the first line within a module, class, method or function. Such strings can be delimited with " or ' for single-line strings, or may span multiple lines if delimited with either """ or ''', which is Python's notation for specifying multi-line strings. However, the style guide for the language specifies that triple double quotes (""") are preferred for both single and multi-line docstrings.

Single line docstring:

    def getline():
        """Get one line from stdin and return it."""
        return sys.stdin.readline()

Multi-line docstring:

    def getline():
        """Get one line
        from stdin and
        return it."""
        return sys.stdin.readline()

Docstrings can be as large as the programmer wants and contain line breaks. In contrast with comments, docstrings are themselves Python objects and are part of the interpreted code that Python runs. That means that a running program can retrieve its own docstrings and manipulate that information. But the normal usage is to give other programmers information about how to invoke the object being documented in the docstring.

There are tools available that can extract the docstrings to generate an API documentation from the code. Docstring documentation can also be accessed from the interpreter with the help() function, or from the shell with the pydoc command.

The doctest standard module uses interactions copied from Python shell sessions into docstrings, to create tests.
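
Both ideas can be sketched together: a running program reading a docstring, and doctest executing the examples embedded in it. The function double is hypothetical, and the parser and runner are driven explicitly here so that the result can be inspected. In everyday use doctest.testmod() is the more common entry point.

```python
import doctest

def double(x):
    """Return twice the argument.

    >>> double(21)
    42
    >>> double("ab")
    'abab'
    """
    return 2 * x

# The docstring is an ordinary attribute of the function object.
summary = double.__doc__.splitlines()[0]

# Extract the interactive examples from the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(double.__doc__, {"double": double},
                          "double", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
```

The returned results record how many examples were attempted and how many failed.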

Decorators

A decorator is a Python object that can be called with a single argument, and that modifies functions or methods. Python decorators were inspired in part by Java annotations, and have a similar syntax; the decorator syntax is pure syntactic sugar, using @ as the keyword:

    @viking_chorus
    def menu_item():
        print("spam")

is equivalent to

    def menu_item():
        print("spam")
    menu_item = viking_chorus(menu_item)

Decorators are a form of metaprogramming; they enhance the action of the function or method they decorate. For example, in the above sample, viking_chorus might cause menu_item to be run 8 times for each time it is called:

    def viking_chorus(myfunc):
        def inner_func(*args, **kwargs):
            for i in range(8):
                myfunc(*args, **kwargs)
        return inner_func

Canonical uses of function decorators are for creating class methods or static methods, adding function attributes, tracing, setting pre- and postconditions, and synchronisation (see "Python 2.4 Decorators: Reducing code duplication and consolidating knowledge", Dr. Dobb's), but they can be used for far more besides, including tail recursion elimination (see "New Tail Recursion Decorator", ASPN: Python Cookbook), memoization and even improving the writing of decorators (see "The decorator module").
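
As one illustration of such a use, memoization can be sketched as a decorator. The names memoize, fib and call_count are hypothetical, and this is only one of several ways to write it; the cache here handles positional, hashable arguments only.

```python
import functools

def memoize(func):
    """Cache results of func, keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)   # keep the wrapped function's name and docstring
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

call_count = [0]

@memoize
def fib(n):
    """Return the n-th Fibonacci number."""
    call_count[0] += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
```

Because each distinct argument is computed only once, the body of fib runs 31 times for fib(30) rather than over a million times.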

Decorators can be chained by placing several on adjacent lines:

    @invincible
    @favourite_colour("Blue")
    def black_knight():
        pass

is equivalent to

    def black_knight():
        pass
    black_knight = invincible(favourite_colour("Blue")(black_knight))

In the above example, the favourite_colour decorator can take an argument (or arguments, depending on its function definition). Decorator functions that do that must return yet another decorator that takes one argument, the function to be decorated:

    def favourite_colour(colour):
        def yet_another_decorator(func):
            def wrapper():
                print(colour)
                func()
            return wrapper
        return yet_another_decorator

This would then decorate the black_knight function such that the colour, "Blue", would be printed prior to the black_knight function running.

At present, decorators apply to functions and methods, but not to classes. Decorating a (dummy) __new__ method can modify a class, however (see "Charming Python: Decorators make magic easy; A look at the newest Python facility for metaprogramming", IBM developerWorks). Class decorators will be supported in Python 3.0 (see "PEP 3129 - Class Decorators", Python Enhancement Proposals).

Despite the name, Python decorators are not an implementation of the decorator pattern. The decorator pattern is a design pattern used in statically typed object-oriented programming languages to allow functionality to be added to objects at run time; Python decorators add functionality to functions and methods at definition time, and thus are a higher-level construct than decorator-pattern classes. The decorator pattern itself is trivially implementable in Python, because the language is duck typed, and so is not usually considered as such.

Easter Eggs

Users of curly bracket programming languages, such as C or Java, sometimes expect or wish Python to follow a block-delimiter convention. Brace-delimited block syntax has been repeatedly requested, and consistently rejected by core developers. The Python interpreter contains an easter egg that summarizes its developers' feelings on this issue. The code from __future__ import braces raises the exception SyntaxError: not a chance.
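
The easter egg can be observed without aborting a program by compiling the statement at run time and catching the error. The sketch below assumes the CPython interpreter; other implementations may word the message differently.

```python
try:
    # compile() is used so that the error surfaces at run time, where
    # it can be caught, rather than when this file itself is parsed.
    compile("from __future__ import braces", "<example>", "exec")
    message = None
except SyntaxError as err:
    message = err.msg
```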

Another hidden message, The Zen of Python (a summary of Python philosophy), is displayed when trying to import this.


External links

* Python tutorial - Tutorial written by the author of Python, Guido van Rossum.

Wikimedia Foundation. 2010.
