First Sunday of Advent: Built-in Functions

This post is part of my series on Traversing the Python Standard Library.

Built-in functions are included in the standard library and are always available with no need for imports. There are
69 (nice) of them. I’ve rearranged them from the Python documentation order (alphabetical) into groups. This list was
short enough for a detailed breakdown; the following days will be more of a summary.

Highlights

These are the things I did not know or had forgotten:

  • open() takes an errors argument that you can set to surrogateescape, which allows you to process and even
    roundtrip encoding errors in files.
  • iter() can take two arguments and then calls the first one repeatedly until its result equals the second argument,
    good for chunked reading.
  • round(), when it could round either way, chooses the even number, so both round(1.5) and round(2.5) are 2.
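
Here is a small sketch of the round() and surrogateescape highlights. The temporary file and the sample bytes are made up for illustration; any file with bytes that are not valid UTF-8 should behave the same way.

```python
import os
import tempfile

# round() uses "round half to even" when the value is exactly halfway.
halfway = [round(0.5), round(1.5), round(2.5), round(3.5)]
print(halfway)  # [0, 2, 2, 4]

# surrogateescape: undecodable bytes survive a decode/encode round trip.
raw = b"caf\xe9 ok"  # \xe9 is Latin-1 here, not valid UTF-8
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

with open(path, encoding="utf-8", errors="surrogateescape") as f:
    text = f.read()  # no UnicodeDecodeError; the bad byte becomes '\udce9'

# Encoding with the same error handler restores the original byte.
roundtripped = text.encode("utf-8", errors="surrogateescape")
os.remove(path)
print(roundtripped == raw)  # True
```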

Favourites

My favourites among the built-ins:

  • I love all() and any() and throw them at iterables all the time, particularly at generators.
  • The introduction of breakpoint was theoretically a godsend, but my fingers are so used to typing
    import pdb; pdb.set_trace() that I’m still breaking (ha.) them of the habit.
  • dir() returns a list of attributes of its argument (or all names in your scope if called without argument). Both of
    these modes are extremely useful when debugging. It uses __dir__ where possible, and otherwise __dict__. There are
    absolutely no guarantees or even attempts at completeness, eg it does not include metaclass attributes when called
    on a class.
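
A minimal sketch of these favourites in action. The Point class and the sample numbers are hypothetical and only exist to have something to inspect.

```python
numbers = [3, 8, 15, 22]

# all() and any() short-circuit, so a generator is only consumed as far as needed.
all_positive = all(n > 0 for n in numbers)
any_over_20 = any(n > 20 for n in numbers)

# any() stops at the first truthy item; the rest of the generator is untouched.
gen = (n for n in numbers)
found_even = any(n % 2 == 0 for n in gen)
leftover = list(gen)  # whatever comes after the first even number


class Point:  # hypothetical class for illustration
    def __init__(self, x, y):
        self.x = x
        self.y = y


# dir() on an object: drop the underscore names to see the interesting part.
public_attrs = [name for name in dir(Point(1, 2)) if not name.startswith("_")]

print(all_positive, any_over_20, found_even)  # True True True
print(leftover)      # [15, 22]
print(public_attrs)  # ['x', 'y']

# breakpoint()  # uncomment to drop into the debugger at this point
```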

Types and casting

Types and casting is pretty intuitive in Python, but it’s easy to forget some of the built-in types (bytearray, for
instance, when you don’t typically work on bytes).

  • bool() tests the truth value of its argument. It’s a subclass of int and cannot be subclassed further.
  • callable() tells us if the argument has a __call__ method, ie is a class or a function or a lambda function (or
    probably something I’m forgetting). It also returns True for async functions, so it cannot tell you whether calling
    the object gives you a result or a coroutine.
  • bytearray() objects can only be created with this constructor, which returns a mutable array of bytes. It can be
    used with a string + encoding, or a buffer, or an iterable of numbers (all of these initialise the bytearray to the
    values given). Unintuitive: it can also be called with a number and will return a nulled bytearray of that length.
    bytes is the same thing, just immutable.
  • dict, list, set, frozenset, tuple: cast input, typically iterables
  • range([start, ]stop) and slice([start, ]stop) are powerful and slice in particular is underused.
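
The bytearray constructor variants and slice objects are easier to see than to describe, so here is a short sketch. The fetch coroutine is a hypothetical placeholder and is never awaited.

```python
# bytearray: three ways to construct it, plus in-place mutation.
from_text = bytearray("hi", "utf-8")
from_ints = bytearray([104, 105])
zeroed = bytearray(3)    # a number gives that many null bytes
from_text[0] = ord("H")  # mutable, unlike bytes

# slice objects let you name a slice and reuse it.
first_three = slice(0, 3)
head = "abcdef"[first_three]
every_other = list(range(10))[slice(None, None, 2)]


async def fetch():  # hypothetical coroutine function
    return 1


# callable() is True for functions, lambdas, classes and coroutine functions.
checks = [callable(len), callable(lambda: 0), callable(int), callable(fetch), callable(42)]

print(from_text, from_ints, zeroed)  # bytearray(b'Hi') bytearray(b'hi') bytearray(b'\x00\x00\x00')
print(head, every_other)             # abc [0, 2, 4, 6, 8]
print(checks)                        # [True, True, True, True, False]
```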

Math and numbers

Doing math with Python is reasonably pleasant even without waiting for NumPy or SciPy to install. We’ll come back to
this in a couple of days when I stare at the statistics standard library module.

  • I always forget that abs exists (and uses __abs__()) and doesn’t need to be imported from math.
  • divmod() is exactly that: a // b, a % b
  • Why use pow() when you can use **? For integer operands, you can also pass a third mod argument, which is more
    efficient than doing the technically equivalent a ** b % c.
  • bin() converts numbers to binary number strings, hex() does the same for hexadecimals, oct for octals.
  • int() and float() create numbers (for objects via __int__() and __float__()). int can take a base argument.
  • complex() creates a complex number, either from a string or from two numbers (on objects, it uses __complex__(),
    then float()/__float__(), which in turn falls back on __index__()).
  • hash() returns a number, optionally using __hash__()
  • max() and min() are pretty straightforward. You can supply a key= named argument just like for list sorting, and
    you can provide a default= named argument that will be returned on empty iterables. Both of these take either an
    iterable or just a lot of arguments to be compared.
  • sum() has an optional start named argument that is easy to forget.
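
A few of the numeric built-ins above, sketched with made-up values:

```python
# divmod returns the quotient and the remainder in one call.
quotient, remainder = divmod(17, 5)

# Three-argument pow does modular exponentiation without building the huge power.
fast = pow(7, 128, 13)
slow = 7 ** 128 % 13  # same result, but computes 7 ** 128 first

# Number strings in other bases, and back again via int(text, base).
as_bin, as_hex, as_oct = bin(10), hex(255), oct(8)
back = int("ff", 16)

# key= works like in sorting; default= avoids a ValueError on empty iterables.
words = ["pear", "fig", "banana"]
longest = max(words, key=len)
nothing = max([], default=None)

# sum() with a start value (passed positionally here).
total = sum([1, 2, 3], 10)

print(quotient, remainder)     # 3 2
print(fast == slow)            # True
print(as_bin, as_hex, as_oct)  # 0b1010 0xff 0o10
print(back, longest, nothing, total)  # 255 banana None 16
```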

Stringy things

Everything is a string if you squint hard enough.

  • str() is ""
  • repr() calls __repr__(), and ascii() does the same, but escapes all non-ascii characters.
  • chr() and ord() to convert between numbers and characters
  • format(value, format_spec) calls value.__format__(format_spec), so it applies the same format specs you know from
    str.format() and f-strings, but to a single value. That value can be all sorts of things, not just a string.
  • input() prompts for STDIN. I mostly use the inquirer library when looking at input(), because parsing user input
    🙄.
  • open() turns files into file objects. Use as context manager. The mode can be any of rwax (read, write, append,
    exclusive create) combined with any of bt (binary or text mode), and + (open for reading and writing), default is
    rt. The buffering argument can change how files are buffered: by default, interactive text files (terminals) use
    line buffering, and all other files try to use the underlying device block size or io.DEFAULT_BUFFER_SIZE. The
    encoding argument is a life saver and you’ll know it when you need it. When it stops saving your life, use the
    errors argument and set it to surrogateescape.
  • print() prints all its non-named arguments to its file= named argument, defaulting to sys.stdout. Set sep= and
    end= for fun.
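
To make the string helpers a bit more concrete, here is a short sketch. The values are arbitrary, and io.StringIO only stands in for a real file so the output can be inspected.

```python
import io

# format(value, spec) applies a format spec to a single value.
price = format(1234.5, ",.2f")
padded = format(42, "08b")

# repr() vs ascii(): ascii() escapes anything outside ASCII.
plain = repr("café")
escaped = ascii("café")

# chr() and ord() are inverses of each other.
code_point = ord("A")
letter = chr(code_point + 1)

# print() writes to any file-like object; sep= and end= control the joining.
buffer = io.StringIO()
print("a", "b", "c", sep="-", end="!\n", file=buffer)
printed = buffer.getvalue()

print(price, padded)       # 1,234.50 00101010
print(plain, escaped)      # 'café' 'caf\xe9'
print(code_point, letter)  # 65 B
print(repr(printed))       # 'a-b-c!\n'
```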

List things

Lists and iterables are probably why I like Python most in everyday life, particularly thinking back to my clunky
experiences with Java. The biggest hurdle here is the fact that JavaScript iterating works so similarly but never
quite the same. So whenever I switch between the two, I get to take a tour of the respective language docs.

  • len() is the most basic list thing. It raises a TypeError on generators, and the len(list(generator)) workaround
    will eat your generator alive.
  • enumerate() goes through an iterable and yields both the next element and its index (optionally starting the index
    count at the second argument). The index comes FIRST which I will never remember. (I’ve added it to my Anki deck,
    though).
  • filter(function, iterable) is the same as (a for a in iterable if function(a)). I will never remember that the
    filter function comes first.
  • map(function, iterable) gives you an iterator that applies the first argument to each of the second, and yields the
    result. Guess what I will never remember. (I never use map, because list comprehension is always there for me.)
  • zip(*iterables) steps through all iterables given to it at the same time. If they have different lengths, everything
    past the length of the shortest iterable is discarded.
  • reversed() is what you use when you don’t want to be eDgY and cool and use [::-1]. Uses __reversed__() if
    present, and otherwise falls back on __len__() and __getitem__(). Unlike sorted, it does not take a key function.
  • sorted() returns a sorted version of the provided iterable, optionally using key=, optionally reverse=True. Use
    list.sort() for in-place sorting and sorted() on non-lists and to create new objects.
  • next() retrieves the next object from an iterator. I think all the places where I’ve needed next() were hacky or
    just plain bad life choices.
  • iter(), when given one argument, creates an iterator object from it. Boring, and not often useful. Much more
    interesting is the version with two arguments: The first one is a callable, and the second one is the stopping value
    (“sentinel”). Iterating through the result will call the first argument until it returns the second one, making for an
    easy way to build block-based iteration:
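from functools import partial

# Read the file in 64-byte blocks until f.read() returns b'' (end of file).
with open('mydata.db', 'rb') as f:
    for block in iter(partial(f.read, 64), b''):
        process_block(block)  # process_block stands for your own handler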
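
The argument orders in this section are the part I keep forgetting, so here is a sketch that puts them side by side. The fruit list is made up for illustration.

```python
fruits = ["apple", "banana", "cherry"]

# enumerate: index first, element second; the second argument sets the start.
numbered = list(enumerate(fruits, 1))

# zip stops at the shortest input.
pairs = list(zip(fruits, [10, 20]))

# filter and map take the function first; the comprehension is the equivalent.
long_names = list(filter(lambda name: len(name) > 5, fruits))
same_thing = [name for name in fruits if len(name) > 5]
shouted = list(map(str.upper, fruits))

# sorted takes key= and reverse=; reversed takes neither.
by_length_desc = sorted(fruits, key=len, reverse=True)
backwards = list(reversed(fruits))

# next() with a default avoids StopIteration when nothing matches.
first_with_z = next((name for name in fruits if "z" in name), None)

print(numbered)        # [(1, 'apple'), (2, 'banana'), (3, 'cherry')]
print(pairs)           # [('apple', 10), ('banana', 20)]
print(long_names)      # ['banana', 'cherry']
print(by_length_desc)  # ['banana', 'cherry', 'apple']
print(first_with_z)    # None
```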

Debugging

To take the optimistic view: Dynamic languages make for great debugging skills.

  • globals(), locals() and vars(obj) are nice for debugging, please please please do not use them beyond that.
  • help() is something I should use more, but since it’s hard to predict which library provides good help strings, I
    usually don’t bother – when I hit this level of confusion, I go read the source.
  • id() is great to see if you’ve got a shallow copy problem.
  • isinstance() and issubclass() do exactly what’s on the tin. Remember that the class to be checked against can be a
    tuple of classes for both of these. type(argument) is good for debugging, but try to use isinstance or issubclass
    in code.
  • type(name, bases, dict) is completely different from type(obj). It allows dynamic class creation, which you will
    know when you need it.
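
A sketch of the id() trick, the tuple form of isinstance() and issubclass(), and three-argument type(). The Duck class is hypothetical and only shows the mechanics.

```python
import copy

# id() exposes shared inner objects after a shallow copy.
original = [[1, 2], [3, 4]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)
shallow_shares_inner = id(shallow[0]) == id(original[0])
deep_shares_inner = id(deep[0]) == id(original[0])

# isinstance and issubclass accept a tuple of classes.
is_number = isinstance(3.5, (int, float))
bool_is_str_or_int = issubclass(bool, (str, int))

# Three-argument type() builds a class at runtime: name, bases, namespace.
Duck = type("Duck", (object,), {"sound": "quack", "speak": lambda self: self.sound})
said = Duck().speak()

print(shallow_shares_inner, deep_shares_inner)  # True False
print(is_number, bool_is_str_or_int)            # True True
print(type(Duck()).__name__, said)              # Duck quack
```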

Magic

And all the other built-ins that didn’t fit in the categories above, and that provide dark magic not to be provoked
lightly.

  • super() is the height of magic, but points towards the extremely useful __mro__ attribute of classes or types.
    For more information and recipes, read Hettinger’s evergreen “super() considered super”.
  • @classmethod turns the decorated function into a class method. You typically use cls instead of self for the
    first argument.
  • @staticmethod transforms a method into one that does not receive the implicit first self or cls argument. Like
    classmethods, these can be called on the class and the object both.
  • property(getter, setter, deleter) creates a property. Use as decorator on def x(), then use @x.setter and
    @x.deleter if you need special access handling.
  • delattr (basically del), getattr, hasattr and setattr make dynamic attribute access way too tempting and fun
  • compile() turns a string, bytes or an AST object into a code object, which you can then exec() or eval() (the
    latter returns the result). Its arguments control which __future__ features apply, whether the code may contain
    top level async code, and the optimization level (none / remove asserts / remove asserts and docstrings). You never
    want this, and if you do, you probably want ast.parse(). All of these raise auditing events.
  • __import__ is invoked by import. You nearly always want to use importlib.import_module when eyeing __import__.
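
A compact sketch of the tamer parts of this section. The Temperature class is a hypothetical example, not something from the standard library.

```python
import importlib


class Temperature:  # hypothetical example class
    def __init__(self, celsius):
        self._celsius = celsius

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # cls is the class itself, so subclasses get their own type back.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_freezing(celsius):
        # No implicit self or cls here.
        return celsius <= 0


boiling = Temperature.from_fahrenheit(212)
mro_names = [klass.__name__ for klass in Temperature.__mro__]

# getattr with a default instead of hasattr followed by attribute access.
missing = getattr(boiling, "kelvin", None)

# compile() produces a code object that eval() can run.
code = compile("6 * 7", "<example>", "eval")
answer = eval(code)

# importlib.import_module instead of __import__.
math_module = importlib.import_module("math")

print(boiling.celsius, Temperature.is_freezing(-5))  # 100.0 True
print(mro_names, missing, answer)  # ['Temperature', 'object'] None 42
print(math_module.sqrt(16))        # 4.0
```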
