GitHub - uhbhy/Advanced-Python: A deep-dive reference guide to advanced Python — internals, language features, ecosystem, concurrency, and production-grade patterns.

Advanced Python Mastery: From Competent to Expert

Depth is the priority. This guide assumes you can write clean Python — now we'll explain why it works, how it breaks, and how professionals wield it at scale.

1. Python's Identity

What Problem Does Python Actually Solve?

Most languages optimize for execution speed or type safety. Python optimized for something different: the speed of human thought → working code. The core insight of Python's design is that developer time is more valuable than CPU time for most problems.

This isn't just a philosophy statement. It has concrete engineering consequences:

A data scientist can prototype an ML pipeline in an afternoon
A sysadmin can automate infra tasks without a build system
A researcher can explore data interactively without compilation

Python solved the "scripting problem" that existed before it: shell scripts were too fragile, Perl was too write-only, C was too ceremonious. Python hit a sweet spot that no language had before.

Misconception to kill early: "Python is a scripting language." Python is a general-purpose language that happens to be excellent for scripting. Instagram's backend, Dropbox's desktop client, YouTube's original codebase, and CERN's physics pipelines all run on Python.

Where Python Genuinely Outperforms Other Languages

Python wins clearly when:

Prototyping and iteration speed matter more than runtime performance. A working REST API can often be stood up in hours in Python, where a more ceremonious stack may take considerably longer.
The ecosystem is unmatched. For data and ML work, no other language matches the quality of NumPy, pandas, PyTorch, and scikit-learn. This is a network-effect moat built over 20+ years.
Interoperability with C/Fortran/CUDA is seamless. NumPy is really C with a Python interface. PyTorch is C++/CUDA. Python acts as the "glue language" orchestrating high-performance compute without you ever touching the fast code.
Interactive exploration. Jupyter notebooks changed how science and data analysis work. Python's REPL and dynamic nature are designed for exploration.
Readability as a team asset. In large organizations, code is read 10x more than it's written. Python's syntax minimizes cognitive load.

Python loses clearly when:

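Raw CPU performance is required. A tight numerical loop in Go or C will be 10–100x faster than CPython. The cause is interpreter overhead and dynamic typing (covered below), not the GIL: the GIL limits multi-threaded parallelism, not single-threaded speed.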
Concurrency at massive scale. Go's goroutines and Erlang's actor model scale to millions of concurrent tasks more naturally than Python.
Mobile development. Swift and Kotlin dominate iOS/Android. Python has no native story here.
Type safety at compile time. TypeScript, Rust, and Haskell catch entire classes of bugs before runtime that Python only catches when the code runs.
Memory footprint. Python objects are fat. On 64-bit CPython, a small int consumes 28 bytes. A C int typically consumes 4 bytes.
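
The footprint gap is easy to measure. A minimal sketch, assuming 64-bit CPython (exact sizes vary by build and version), that compares one int object against the raw C int stored by the standard array module:

```python
import sys
from array import array

py_int_bytes = sys.getsizeof(1)       # one full int object: header plus value
c_int_bytes = array("i").itemsize     # one raw C int as stored inside an array

print(py_int_bytes, c_int_bytes)      # typically 28 and 4 on 64-bit CPython
```

This is also why array, NumPy, and similar containers save so much memory: they store raw values instead of one object per element.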

Why Python Dominates Data, ML, and Automation

This isn't just popularity — there are structural reasons:

First-mover advantage with quality libraries. NumPy (2006) and SciPy established Python as the scientific computing language before any competitor got traction. When machine learning exploded after 2012, researchers used Python because the numerical stack was already there.
MATLAB migration path. Academic researchers moved from MATLAB to Python because NumPy's syntax is deliberately similar, and Python is free.
The "glue language" advantage. Python can call any C/C++/CUDA library via ctypes, cffi, or Cython. So when someone writes a blazing-fast neural network in CUDA, they wrap it in a Python API. The entire AI/ML stack is actually mostly C++ and CUDA under the hood — Python is just the interface.
Scripting and automation felt natural. DevOps, data engineers, and sysadmins reached for Python because it was already installed everywhere and didn't require a build step.

The Zen of Python — What It Actually Means

Run import this in Python to see the full Zen. Here are the principles that actually change how you write code:

import this
# The Zen of Python, by Tim Peters

"Beautiful is better than ugly." This isn't aesthetics. It means: if your code looks ugly, it's probably also unclear, fragile, or over-engineered. Ugliness is a code smell.

"Explicit is better than implicit." Django's magic is convenient but makes debugging harder. FastAPI's explicit dependency injection is more verbose but debuggable. In Python, the idiomatic preference is to be explicit about what you're doing, even if it takes an extra line.

# Implicit — where does 'user' come from?
def get_profile():
    return Profile.objects.get(user=current_user)

# Explicit — caller passes the user
def get_profile(user: User) -> Profile:
    return Profile.objects.get(user=user)

"Simple is better than complex. Complex is better than complicated." There's a distinction here: complex means "having multiple interacting parts" (sometimes unavoidable). Complicated means "unnecessarily hard to understand." Favor simplicity; accept complexity when necessary; never tolerate complicated.

"There should be one— and preferably only one —obvious way to do it." This is Python's answer to Perl's "there is more than one way to do it." Python nudges you toward consensus patterns. The for item in iterable loop is the Pythonic way. There's no need for while (i < len(arr)) { arr[i++] } style iteration.

"If the implementation is hard to explain, it's a bad idea." This is the most underrated principle. If you can't explain your design to a colleague in 2 minutes, refactor. Clever code that no one can understand is a liability.

2. Under the Hood — How Python Actually Works

CPython, PyPy, Jython — Why Implementations Matter

Python the language is a specification — a description of what valid Python code means. Python the implementation is the actual software that runs it. There are several:

CPython (the default)

Written in C, maintained by the CPython core developers
The reference implementation — what "Python" almost always means
Uses a bytecode interpreter with a Global Interpreter Lock
Every Python version you download from python.org is CPython

PyPy

A Python interpreter written in Python (via a toolchain called RPython)
Uses Just-In-Time (JIT) compilation — it compiles hot code paths to machine code at runtime
Can be 5–10x faster than CPython for CPU-bound code
Compatibility: lags CPython by a few minor versions (Python 3.10 at the time of writing), and some C extension modules don't work
Use it when: you have CPU-intensive pure-Python code and can't easily reach for NumPy or Cython

Jython

Python running on the Java Virtual Machine (JVM)
Python code can directly call Java libraries and vice versa
Largely obsolete now, mostly used in enterprise Java contexts
Currently stuck at Python 2.7 compatibility

> Interview flag: "Which Python implementation is fastest?" PyPy for pure Python code. CPython with NumPy/PyTorch (which are C extensions) for numerical work because PyPy's C extension support is incomplete.

The Global Interpreter Lock (GIL)

The GIL is the most important architectural decision in CPython, and the most misunderstood.

What is it? A mutex (mutual exclusion lock) that ensures only one thread executes Python bytecode at a time, even on multi-core hardware.

Why does it exist? CPython uses reference counting for memory management (explained below). Every Python object maintains a count of how many references point to it. Incrementing and decrementing this count from multiple threads simultaneously without locks would cause race conditions and corrupt memory. The GIL is the simplest way to make reference counting thread-safe.

What it prevents: True parallel execution of Python code across CPU cores. If you have 8 cores and 8 threads running Python, only 1 thread runs at a time.

import threading
import time

counter = 0

def increment():
    global counter
    for _ in range(1_000_000):
        counter += 1  # This is NOT atomic, but the GIL prevents the worst races

t1 = threading.Thread(target=increment)
t2 = threading.Thread(target=increment)

start = time.time()
t1.start(); t2.start()
t1.join(); t2.join()
print(f"Counter: {counter}")  # Not necessarily 2,000,000 — the GIL doesn't make += atomic!
print(f"Time: {time.time() - start:.2f}s")

The nuance most developers miss: The GIL is released during I/O operations. When a thread is waiting for a network response, reading a file, or sleeping, the GIL is released and other threads can run. This is why threading works well for I/O-bound code but not CPU-bound code.

Thread 1: [network call → GIL released] ←→ [got result → GIL acquired]
Thread 2:                [GIL acquired → runs Python code → GIL released]
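
The overlap in the diagram can be observed directly. A minimal sketch, assuming time.sleep as a stand-in for a blocking network call (fake_io is a hypothetical helper, not part of any library):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(seconds):
    # Sleeping releases the GIL, much like waiting on a socket would
    time.sleep(seconds)
    return seconds

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, [0.2] * 4))
elapsed = time.perf_counter() - start

# Four 0.2s waits overlap, so the total is close to 0.2s rather than 0.8s
print(f"{elapsed:.2f}s for {len(results)} tasks")
```

Replacing the sleep with a pure-Python computation would remove most of the speedup, because the threads would then compete for the GIL.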

How professionals work around it:

Use multiprocessing for CPU-bound parallelism. Each process has its own Python interpreter and its own GIL. True parallelism across cores.
Use asyncio for I/O-bound concurrency. Single-threaded, event-driven. No GIL issues because only one coroutine runs at a time anyway.
Use C extensions. NumPy, pandas, and PyTorch release the GIL during many numerical operations. np.dot(a, b) runs in C and can use multiple cores when NumPy is linked against a multi-threaded BLAS library.
Use PyPy for faster single-threaded execution through its JIT. It still has a GIL, so it speeds up each thread rather than adding parallelism.
Try the free-threaded build. Python 3.13+ ships an experimental "no-GIL" build (PEP 703). This is a major architectural change in progress.
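
For the asyncio option, a minimal sketch of single-threaded I/O concurrency. It assumes asyncio.sleep as a placeholder for a real awaitable network call, and fake_request is a hypothetical name:

```python
import asyncio
import time

async def fake_request(delay):
    # await hands control back to the event loop while this task "waits"
    await asyncio.sleep(delay)
    return delay

async def main():
    # gather schedules all five coroutines and waits for all of them
    return await asyncio.gather(*(fake_request(0.2) for _ in range(5)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(f"{len(results)} requests in {elapsed:.2f}s")  # close to 0.2s, not 1.0s
```

Only one coroutine runs at any instant, so there is no lock contention; the gain comes purely from overlapping the waits.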

Memory Management: Reference Counting and the Garbage Collector

Python uses a two-layer memory system:

Layer 1: Reference Counting

Every Python object has an ob_refcnt field (readable via sys.getrefcount()). When you assign x = obj, the refcount goes up. When x goes out of scope, it goes down. When it hits zero, the memory is immediately freed.

import sys

x = [1, 2, 3]
print(sys.getrefcount(x))  # 2 (x + the getrefcount argument itself)

y = x
print(sys.getrefcount(x))  # 3 (x, y, and getrefcount argument)

del y
print(sys.getrefcount(x))  # 2 again

This is deterministic: you know exactly when memory is freed. In Java, C#, or Go, the garbage collector runs whenever it wants. In CPython, when the refcount hits zero, the finalizer (__del__) runs immediately.
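
A small sketch of that determinism, assuming CPython (other implementations may delay finalization). Resource is a hypothetical class used only for illustration:

```python
events = []

class Resource:
    def __del__(self):
        # Runs as soon as the last reference disappears (CPython behavior)
        events.append("finalized")

r = Resource()
del r                      # refcount drops to zero here
events.append("after del")

print(events)
```

On CPython the finalizer entry lands before "after del", which shows that cleanup happened at the del statement rather than at some later collection.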

The weakness: Reference counting can't handle cycles.

a = []
b = []
a.append(b)  # a references b
b.append(a)  # b references a

del a
del b
# The names are gone, but each list still holds a reference to the other,
# so both refcounts stay above zero. Reference counting alone never frees them.

Layer 2: Cyclic Garbage Collector

The cyclic garbage collector (controlled through the gc module) runs when allocation counts cross configurable thresholds, and it detects and breaks reference cycles. It uses a generational collection algorithm:

Objects that survive multiple GC cycles are promoted to older "generations"
Older generations are collected less frequently (expensive objects you've been using a while are likely still needed)
New objects (generation 0) are collected most often

import gc

# Disable GC for performance-critical sections (if you know you have no cycles)
gc.disable()
# ... performance critical code ...
gc.enable()
gc.collect()  # Manual collection

# Debug: see what the GC is tracking
gc.set_debug(gc.DEBUG_LEAK)
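
To watch the collector reclaim a cycle, here is a minimal sketch. Node is a hypothetical class (lists cannot be weakly referenced, so a class instance is used), and automatic collection is paused so the result does not depend on timing:

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

gc.disable()                      # pause automatic collection for a stable demo
a, b = Node(), Node()
a.other, b.other = b, a           # build a two-object reference cycle
probe = weakref.ref(a)            # lets us check liveness without keeping a alive

del a, b                          # refcounts stay above zero because of the cycle
alive_before = probe() is not None

gc.collect()                      # the cyclic collector finds and frees the pair
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

The weak reference is the key design choice here: a normal reference would itself keep the object alive and hide the effect.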

Memory Pools (pymalloc)

Python doesn't call malloc/free for every object. It maintains its own memory pool called pymalloc for small objects (≤512 bytes). This dramatically reduces allocation overhead and memory fragmentation.

Object-specific pools also exist: small integers (-5 to 256) and interned strings are cached and reused. This is why:

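a = 256
b = 256
print(a is b)  # True: same object from the small-int cache (CPython detail)

# Build 257 at runtime; two literal 257s in one file may be merged by the compiler
a = int("257")
b = int("257")
print(a is b)  # False in CPython: two distinct objects outside the cache range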

> Interview flag: This is why you should use == for value equality and is only for identity checks (like is None).

How Python Executes Code: Bytecode and the Interpreter Loop

When you run python script.py, here's what actually happens:

Source code (.py)
      ↓
  Lexer/Tokenizer     # Splits text into tokens (keywords, identifiers, operators)
      ↓
    Parser            # Builds an Abstract Syntax Tree (AST)
      ↓
  Code Generator      # Compiles AST to bytecode
      ↓
  .pyc file cache     # Bytecode saved to __pycache__/ (imported modules only, not the script you run)
      ↓
CPython VM            # Interpreter loop executes bytecode instructions
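
The same stages are exposed in the standard library, so the pipeline can be walked by hand. A minimal sketch, where the source string and the "<demo>" filename are arbitrary placeholders:

```python
import ast

source = "result = 2 + 3 * 4"

tree = ast.parse(source)                          # tokenize + parse into an AST
code = compile(tree, filename="<demo>", mode="exec")  # AST to a code object (bytecode)

namespace = {}
exec(code, namespace)                             # the VM executes the bytecode

first_statement = type(tree.body[0]).__name__     # kind of the top-level AST node
print(first_statement, namespace["result"])
```

Tools such as linters and formatters stop after the ast.parse step, which is why they can analyze code without running it.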

Bytecode is a platform-independent intermediate representation, similar in spirit to JVM bytecode, though its format changes between CPython versions and it is interpreted rather than compiled to machine code. You can inspect it:

import dis

def add(a, b):
    return a + b

dis.dis(add)
# Output (CPython 3.11/3.12; opcodes and offsets vary between versions):
#   2           0 RESUME                   0
#   3           2 LOAD_FAST                0 (a)
#               4 LOAD_FAST                1 (b)
#               6 BINARY_OP                0 (+)
#              10 RETURN_VALUE

Each bytecode instruction is one opcode + optional argument. The CPython VM is a giant switch statement (in C) that handles each opcode.

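Why .pyc files? Compiling source to bytecode takes time. Python caches the bytecode of imported modules in __pycache__/*.pyc so it doesn't recompile unchanged files. The filename includes the interpreter tag (for example, cpython-312), and the file's header records the source's modification time and size, so the cache is invalidated automatically when the source changes. Hash-based invalidation is available as an opt-in (PEP 552).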
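
The cache location can be computed without importing anything. A small sketch, where "mymodule.py" is a hypothetical file name that does not need to exist:

```python
import importlib.util
import sys

# Where the interpreter would store bytecode for this source file
cache_path = importlib.util.cache_from_source("mymodule.py")

# The tag embedded in the filename, e.g. "cpython-312" (depends on your interpreter)
cache_tag = sys.implementation.cache_tag

print(cache_path)
```

Because the tag is part of the filename, several Python versions can share one source tree without overwriting each other's cached bytecode.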

> Difference from JVM: Java compiles to bytecode that the JVM then JIT-compiles to machine code. CPython interprets the bytecode directly (standard builds have no JIT enabled; Python 3.13 adds an experimental, opt-in one). This is a major reason Python is slower than Java for CPU-bound tasks.

Dynamic Typing Internals: What Happens When You Assign a Variable

This is a fundamental mental model shift from statically typed languages.

In C:

int x = 5;  // x is a memory location that holds an int

The variable x IS a box that contains an integer.

In Python:

x = 5  # x is a label pointing to an object

The variable x is a name in a namespace (a dictionary) that refers to an object. The object 5 exists independently. x just points to it.

x = 5       # x → <int object 5>
x = "hello" # x → <str object "hello">  (the int object still exists if anything else references it)
x = [1,2,3] # x → <list object>
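
The practical consequence of the "names are labels" model is aliasing. A short sketch that contrasts mutating a shared object with rebinding a name:

```python
a = [1, 2, 3]
b = a                      # b is a second name for the same list object
b.append(4)                # mutation is visible through both names

shared = a is b            # True: one object, two names
a_after_mutation = list(a) # snapshot of what a sees

b = [1, 2, 3, 4]           # rebinding: b now names a new, equal list
still_shared = a is b      # False: equal values, different objects

print(shared, a_after_mutation, still_shared)
```

Assignment never copies in Python; use list(a), a.copy(), or the copy module when an independent object is needed.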

What a Python object actually is (in C): Every Python object has at minimum:

ob_refcnt: reference count
ob_type: pointer to the type object (which defines behavior)
Type-specific data (the actual value)

# You can see the type machinery:
x = 42
print(type(x))          # <class 'int'>
print(type(x).__mro__)  # Method Resolution Order
print(x.__class__)      # <class 'int'>

Why dynamic typing costs performance: When Python executes a + b, it must:

Look up the type of a
Find the __add__ method on that type
Check if b is a compatible type
Call the C function that does the actual addition

A C compiler does this at compile time once. Python does it at runtime every time. This is the core reason Python is "slow."
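
The dispatch steps above can be reproduced by hand. This is a simplified sketch: the real interpreter works through C-level type slots and has extra rules for subclasses, but the fallback from the left operand to the reflected method is the same:

```python
a, b = 1, 2.0

# Steps 1-3: look up __add__ on type(a) and ask whether it accepts b
forward = type(a).__add__(a, b)       # int cannot add a float: NotImplemented

# Fallback: try the reflected method on the right operand's type
reflected = type(b).__radd__(b, a)    # float knows how to add an int

result = a + b                        # the operator performs both steps for you
print(forward, reflected, result)
```

All of this happens on every evaluation of a + b, which is the runtime cost the paragraph above describes.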

Decorators — The Deep Mechanics

A decorator is syntactic sugar for a higher-order function. @decorator on a function is exactly equivalent to func = decorator(func).

def my_decorator(func):
    def wrapper(*args, **kwargs):
        print("Before")
        result = func(*args, **kwargs)
        print("After")
        return result
    return wrapper

@my_decorator
def greet(name):
    print(f"Hello, {name}")

# This is EXACTLY equivalent to:
# greet = my_decorator(greet)

The functools.wraps issue: Without it, the decorated function loses its identity:

import functools

def my_decorator(func):
    @functools.wraps(func)  # Copies __name__, __doc__, __module__, etc.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper
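
To make the identity loss concrete, here is a side-by-side sketch. The decorator names plain and preserving, and the functions add and mul, are hypothetical:

```python
import functools

def plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def preserving(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def add(a, b):
    """Add two numbers."""
    return a + b

@preserving
def mul(a, b):
    """Multiply two numbers."""
    return a * b

print(add.__name__, add.__doc__)   # metadata of the inner wrapper
print(mul.__name__, mul.__doc__)   # metadata of the original function
```

functools.wraps also sets __wrapped__ on the wrapper, which tools like inspect.signature use to find the original function.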

Decorators with arguments require one extra layer of nesting:

import functools
import time

def retry(max_attempts=3, delay=1.0):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_attempts - 1:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator

@retry(max_attempts=5, delay=0.5)
def unstable_api_call():
    ...

Class-based decorators (when you need state):

class RateLimit:
    def __init__(self, calls_per_second):
        self.calls_per_second = calls_per_second
        self.last_called = 0.0

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            elapsed = time.time() - self.last_called
            wait = 1.0 / self.calls_per_second - elapsed
            if wait > 0:
                time.sleep(wait)
            self.last_called = time.time()
            return func(*args, **kwargs)
        return wrapper

@RateLimit(calls_per_second=10)
def call_api():
    ...

Real-world decorator patterns professionals use:

# Caching (memoization)
import functools
import inspect
import time
from functools import lru_cache, cache

@cache  # Python 3.9+, unbounded cache
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# Timing/profiling
def timer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"{func.__name__} took {end - start:.4f}s")
        return result
    return wrapper

# Authentication in web frameworks
def require_auth(func):
    @functools.wraps(func)
    def wrapper(request, *args, **kwargs):
        if not request.user.is_authenticated:
            raise PermissionError("Authentication required")
        return func(request, *args, **kwargs)
    return wrapper

# Type validation
def validate_types(**type_map):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            sig = inspect.signature(func)
            bound = sig.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                if name in type_map and not isinstance(value, type_map[name]):
                    raise TypeError(f"{name} must be {type_map[name].__name__}")
            return func(*args, **kwargs)
        return wrapper
    return decorator

@validate_types(user_id=int, name=str)
def create_user(user_id, name):
    ...

> Interview flag: "How do decorators work under the hood?" and "How do you write a decorator that takes arguments?" are common senior Python interview questions.

Generators and Iterators — Lazy Evaluation

Python's iteration protocol is built on two methods:

__iter__: returns an iterator object
__next__: returns the next value, raises StopIteration when exhausted

class CountUp:
    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

for n in CountUp(1, 5):
    print(n)  # 1 2 3 4 5

Generators are syntactic sugar for this pattern using yield:

def count_up(start, end):
    current = start
    while current <= end:
        yield current  # Suspends here, returns value to caller
        current += 1   # Resumes here on next() call

What yield actually does: When Python hits yield, it:

Returns the yielded value to the caller
Suspends the function's execution frame (saves all local state)
On the next next() call, resumes from exactly where it left off

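This is fundamentally different from return, which discards the frame. A generator keeps its frame alive between calls, which makes it a resumable function (a simple form of coroutine).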
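
The suspend and resume cycle can be observed with inspect.getgeneratorstate. A minimal sketch that steps a two-value generator by hand:

```python
import inspect

def count_up(start, end):
    current = start
    while current <= end:
        yield current
        current += 1

gen = count_up(1, 2)
states = [inspect.getgeneratorstate(gen)]      # nothing has run yet

first = next(gen)                              # runs up to the first yield
states.append(inspect.getgeneratorstate(gen))

second = next(gen)                             # resumes right after the yield
try:
    next(gen)                                  # loop ends, StopIteration is raised
except StopIteration:
    pass
states.append(inspect.getgeneratorstate(gen))

print(first, second, states)
```

A for loop does exactly this under the hood: it calls next() repeatedly and stops quietly when StopIteration is raised.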

Memory efficiency — the killer use case:

# Bad: loads entire file into memory
def read_large_file_bad(path):
    with open(path) as f:
        return f.readlines()  # 10GB file = 10GB in memory

# Good: yields one line at a time
def read_large_file(path):
    with open(path) as f:
        for line in f:
            yield line.strip()

# Process a 10GB file with constant memory usage
for line in read_large_file("huge_log.txt"):
    process(line)

Generator pipelines — composing lazy transformations:

import json

def read_records(path):
    with open(path) as f:
        for line in f:
            yield line.strip()

def parse_json(records):
    for record in records:
        yield json.loads(record)

def filter_active(records):
    for record in records:
        if record.get("status") == "active":
            yield record

def extract_emails(records):
    for record in records:
        yield record["email"]

# Composing the pipeline — no intermediate lists created
pipeline = extract_emails(
    filter_active(
        parse_json(
            read_records("users.jsonl")
        )
    )
)

# Nothing executes until you iterate
for email in pipeline:
    send_email(email)

yield from — delegating to a sub-generator:

def flatten(nested):
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # Recursively yields from sub-generator
        else:
            yield item

list(flatten([1, [2, [3, 4]], 5]))  # [1, 2, 3, 4, 5]

Generator expressions — like list comprehensions but lazy:

# List comprehension — all values computed immediately
squares_list = [x**2 for x in range(1_000_000)]  # ~8MB for the list's pointer array alone; the int objects add more

# Generator expression — values computed on demand
squares_gen = (x**2 for x in range(1_000_000))   # Uses ~200 bytes

# Both support:
sum(x**2 for x in range(1_000_000))  # sum() accepts any iterable
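
The size difference can be checked with sys.getsizeof. A small sketch (exact byte counts vary by Python version, and getsizeof reports only the container, not the ints it points to):

```python
import sys

n = 100_000
squares_list = [x**2 for x in range(n)]
squares_gen = (x**2 for x in range(n))

list_bytes = sys.getsizeof(squares_list)   # grows with n
gen_bytes = sys.getsizeof(squares_gen)     # fixed, independent of n

total = sum(squares_gen)                   # consumes the generator
leftover = list(squares_gen)               # nothing left on a second pass

print(list_bytes, gen_bytes, leftover)
```

The trade-off is that a generator is single-use: once consumed it stays empty, so keep a list when the values are needed more than once.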

Context Managers — The `with` Statement

A context manager guarantees that setup and teardown code always runs, even if an exception occurs. It implements __enter__ and __exit__.

class DatabaseConnection:
    def __init__(self, dsn):
        self.dsn = dsn
        self.conn = None

    def __enter__(self):
        self.conn = connect(self.dsn)
        return self.conn  # Value assigned to 'as' variable

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self.conn:
            if exc_type is not None:
                self.conn.rollback()  # Exception occurred, rollback
            else:
                self.conn.commit()
            self.conn.close()
        return False  # Don't suppress the exception (return True to suppress)

with DatabaseConnection("postgresql://...") as conn:
    conn.execute("INSERT INTO ...")
# Connection always closed, even if exception raised
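
To see the guarantee without a real database, here is a sketch with a hypothetical Tracked class that records the order of calls on both the success path and the failure path:

```python
class Tracked:
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        outcome = "clean" if exc_type is None else exc_type.__name__
        self.events.append(f"exit ({outcome})")
        return False                      # do not suppress the exception

ok = Tracked()
with ok:
    ok.events.append("body")

failed = Tracked()
try:
    with failed:
        raise ValueError("boom")
except ValueError:
    failed.events.append("propagated")    # the error still reaches the caller

print(ok.events)
print(failed.events)
```

In both runs __exit__ executes before control leaves the with statement; returning False is what lets the ValueError continue to the except clause.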

contextlib.contextmanager — write context managers as generators:

import time
from contextlib import contextmanager

@contextmanager
def timer(label):
    start = time.perf_counter()
    try:
        yield  # Code inside 'with' block runs here
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.3f}s")

with timer("database query"):
    results = db.execute("SELECT ...")

contextlib.suppress — a built-in context manager for ignoring exceptions:

import os
from contextlib import suppress

with suppress(FileNotFoundError):
    os.remove("maybe_exists.txt")
# Equivalent to try/except FileNotFoundError: pass

contextlib.ExitStack — dynamic number of context managers:

from contextlib import ExitStack

files = ["a.txt", "b.txt", "c.txt"]
with ExitStack() as stack:
    handles = [stack.enter_context(open(f)) for f in files]
    # All files closed when block exits, even if some failed to open

Metaclasses — The Machinery Behind Classes

A metaclass is to a class what a class is to an instance. It controls how classes themselves are created.

The type hierarchy:

Everything is an object in Python:
- int, str, list are objects of type 'type'
- User-defined classes are objects of type 'type' (or a custom metaclass)
- Instances are objects of their class

42 is an instance of int
int is an instance of type
type is an instance of type (circular — type is its own metaclass)

# These are equivalent:
class MyClass:
    x = 10

MyClass = type("MyClass", (object,), {"x": 10})
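
The hierarchy can be verified interactively. A short sketch using a hypothetical Point class built with the three-argument form of type():

```python
Point = type("Point", (object,), {"x": 10})   # same effect as a class statement
p = Point()

checks = {
    "instance is created by its class": type(p) is Point,
    "class is created by type": type(Point) is type,
    "built-in class is created by type": type(int) is type,
    "type is its own metaclass": type(type) is type,
    "type is also an object": isinstance(type, object),
}

for label, passed in checks.items():
    print(f"{label}: {passed}")
```

Every line prints True, which is the circular arrangement described above: type sits at the top of the "instance of" chain, while object sits at the top of the inheritance chain.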

Writing a metaclass:

class SingletonMeta(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Database(metaclass=SingletonMeta):
    def __init__(self):
        self.connection = connect()

db1 = Database()
db2 = Database()
print(db1 is db2)  # True

How Django uses metaclasses: Django's ORM is powered by metaclasses. When you write:

class User(models.Model):
    name = models.CharField(max_length=100)
    email = models.EmailField()

The ModelBase metaclass inspects the class body, collects the field definitions, builds the model's _meta options, and registers the model in the app registry. None of this happens in __init__; it happens at class creation time via the metaclass. The database schema itself is created separately, by migrations.

When to use metaclasses: Metaclasses fit problems that require control over class creation itself:

Automatically registering subclasses in a registry
Enforcing that certain methods are implemented (before ABCs existed)
Adding or modifying class attributes at definition time
ORM field introspection (Django, SQLAlchemy)

Tim Peters's rule of thumb about metaclasses: "If you wonder whether you need them, you don't." For most use cases, class decorators or __init_subclass__ are simpler.

# Modern alternative to metaclass for subclass registration:
class Plugin:
    _registry = {}

    def __init_subclass__(cls, plugin_name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if plugin_name:
            Plugin._registry[plugin_name] = cls

class CSVPlugin(Plugin, plugin_name="csv"):
    pass

class JSONPlugin(Plugin, plugin_name="json"):
    pass

print(Plugin._registry)  # {'csv': <class '__main__.CSVPlugin'>, 'json': <class '__main__.JSONPlugin'>}

Descriptors — The Machinery Behind Properties

Descriptors are objects that define __get__, __set__, or __delete__. They control attribute access.

class Descriptor:
    def __set_name__(self, owner, name):
        self.name = name  # Called when class is defined

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # Accessing from class, not instance
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

    def __delete__(self, obj):
        del obj.__dict__[self.name]

@property is a descriptor. When you write:

class Circle:
    def __init__(self, radius):
        self._radius = radius

    @property
    def area(self):
        return 3.14159 * self._radius ** 2

    @area.setter
    def area(self, value):
        self._radius = (value / 3.14159) ** 0.5

property is a built-in descriptor class. @property creates a property object stored as a class attribute. When you access circle.area, Python calls the descriptor's __get__ method.
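
That lookup can be performed manually. A minimal sketch with a hypothetical Square class, showing that attribute access and an explicit __get__ call give the same result:

```python
class Square:
    def __init__(self, side):
        self.side = side

    @property
    def area(self):
        return self.side ** 2

sq = Square(3)

descriptor = Square.__dict__["area"]            # the property object on the class
via_attribute = sq.area                         # normal attribute access
via_protocol = descriptor.__get__(sq, Square)   # what that access does internally

print(type(descriptor).__name__, via_attribute, via_protocol)
```

Note that "area" never appears in sq.__dict__: the value is computed on each access by the descriptor stored on the class.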

Validated attribute descriptor (a real-world pattern):

class Validated:
    def __set_name__(self, owner, name):
        self.private_name = f"_{name}"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name, None)

    def __set__(self, obj, value):
        value = self.validate(value)
        setattr(obj, self.private_name, value)

    def validate(self, value):
        raise NotImplementedError

class PositiveFloat(Validated):
    def validate(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError(f"Expected number, got {type(value)}")
        if value <= 0:
            raise ValueError(f"Expected positive, got {value}")
        return float(value)

class Product:
    price = PositiveFloat()
    weight = PositiveFloat()

    def __init__(self, price, weight):
        self.price = price     # Calls PositiveFloat.__set__
        self.weight = weight

p = Product(19.99, 0.5)
p.price = -5  # Raises ValueError

@classmethod and @staticmethod are also descriptors:

classmethod.__get__ returns a bound method with the class as first argument
staticmethod.__get__ returns the plain function with no binding
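
The binding rules can be checked by calling __get__ directly on the raw class attributes. A sketch with a hypothetical Demo class:

```python
class Demo:
    @classmethod
    def make(cls):
        return cls

    @staticmethod
    def helper():
        return "plain"

    def method(self):
        return self

obj = Demo()

# Fetch the raw descriptors from the class dict, bypassing attribute lookup
bound_to_class = Demo.__dict__["make"].__get__(None, Demo)
plain_function = Demo.__dict__["helper"].__get__(None, Demo)
bound_to_instance = Demo.__dict__["method"].__get__(obj, Demo)

print(bound_to_class() is Demo)        # class was supplied as first argument
print(plain_function())                # no argument was injected
print(bound_to_instance() is obj)      # instance was supplied as first argument
```

Plain functions are descriptors too, which is how obj.method becomes a bound method in the first place.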

*args, **kwargs, and Argument Unpacking — Full Depth

def func(a, b, *args, keyword_only, **kwargs):
    print(f"a={a}, b={b}")
    print(f"args={args}")           # Tuple of extra positional args
    print(f"keyword_only={keyword_only}")  # Must be passed as keyword
    print(f"kwargs={kwargs}")       # Dict of extra keyword args

func(1, 2, 3, 4, keyword_only="hello", x=10, y=20)
# a=1, b=2
# args=(3, 4)
# keyword_only=hello
# kwargs={'x': 10, 'y': 20}

Positional-only parameters (Python 3.8+, the / separator):

def point(x, y, /, label=""):
    # x and y CANNOT be passed as keywords
    return f"({x}, {y}) {label}"

point(1, 2, label="origin")  # OK
point(x=1, y=2)  # TypeError: x is positional-only

This is how built-in functions like len() work — you can't call len(obj=mylist).

Unpacking into function calls:

args = [1, 2, 3]
kwargs = {"sep": "-", "end": "\n"}
print(*args, **kwargs)  # Equivalent to print(1, 2, 3, sep="-", end="\n")

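# Merging dicts: ** unpacking inside a dict literal works on Python 3.5+
defaults = {"timeout": 30, "retries": 3}
overrides = {"timeout": 60}
config = {**defaults, **overrides}  # {'timeout': 60, 'retries': 3}
config = defaults | overrides       # Same result with the | operator (Python 3.9+)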

Forwarding arguments pattern:

def with_logging(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        return func(*args, **kwargs)  # Forward all args unchanged
    return wrapper

Closures and Scope — The LEGB Rule

Python resolves names in this order: Local → Enclosing → Global → Built-in

x = "global"

def outer():
    x = "enclosing"

    def inner():
        x = "local"
        print(x)  # "local" — found in Local scope

    inner()

outer()

Closures capture variables from the enclosing scope:

def make_multiplier(n):
    def multiplier(x):
        return x * n  # n is captured from enclosing scope — this is a closure
    return multiplier

double = make_multiplier(2)
triple = make_multiplier(3)
print(double(5))   # 10
print(triple(5))   # 15

multiplier is a closure — it "closes over" the variable n.
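
The captured variable is stored in a cell object attached to the function, and it can be inspected. A short sketch that repeats make_multiplier so it runs on its own:

```python
def make_multiplier(n):
    def multiplier(x):
        return x * n
    return multiplier

double = make_multiplier(2)

free_names = double.__code__.co_freevars          # names the function closes over
captured = double.__closure__[0].cell_contents    # current value held in the cell

print(free_names, captured, double(5))
```

The cell holds a reference to the variable rather than a frozen copy of its value, which is what makes nonlocal rebinding and the late-binding gotcha possible.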

The nonlocal keyword:

def make_counter():
    count = 0

    def increment():
        nonlocal count  # Required to rebind, not just read
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2

Without nonlocal, the assignment in count += 1 makes count a local variable of increment. The read half of += then finds that local unassigned, and Python raises UnboundLocalError.
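
A sketch of that failure, using a hypothetical make_broken_counter that omits the nonlocal declaration:

```python
def make_broken_counter():
    count = 0

    def increment():
        count += 1        # assignment makes count local; the read then fails
        return count

    return increment

broken = make_broken_counter()
try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
```

The error is raised at call time, not at definition time, because the compiler only decides the variable's scope up front; the missing value is discovered when the line runs.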

Classic closure gotcha — late binding:

# Bug: all functions capture the SAME variable i
funcs = [lambda: i for i in range(5)]
print([f() for f in funcs])  # [4, 4, 4, 4, 4] — all see i=4

# Fix: use default argument to capture value at creation time
funcs = [lambda i=i: i for i in range(5)]
print([f() for f in funcs])  # [0, 1, 2, 3, 4]

Django, Flask, and Click all rely on closures to register functions with state captured at definition time (URL routes, command registrations, etc.).

Dunder Methods — The Complete Picture

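Dunder (double underscore) methods define how your objects behave with Python's built-in operations. This is how Python achieves operator overloading: every operator maps to a named method looked up on the object's type. (C++ offers a comparable mechanism; Java has no user-defined operator overloading.)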

Object Lifecycle

class MyClass:
    def __new__(cls, *args, **kwargs):
        # Called to CREATE the instance (before __init__)
        # Useful for singletons, immutable types, metaclass tricks
        instance = super().__new__(cls)
        return instance

    def __init__(self, value):
        # Called to INITIALIZE the instance
        self.value = value

    def __del__(self):
        # Called when refcount hits zero (destructor)
        # Don't rely on this for cleanup — use context managers instead
        print("Object deleted")

Representation

    def __repr__(self):
        # For developers: eval(repr(obj)) should ideally recreate obj
        return f"MyClass(value={self.value!r})"

    def __str__(self):
        # For end users: human-readable string
        return f"MyClass with value {self.value}"

    def __format__(self, format_spec):
        # Controls f"{obj:spec}" formatting
        return format(self.value, format_spec)

Comparison Operators

from functools import total_ordering

@total_ordering  # Define __eq__ and one of lt/le/gt/ge, get the rest free
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def __eq__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented  # Let Python try the other operand
        return self.celsius == other.celsius

    def __lt__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented
        return self.celsius < other.celsius

Container Protocol

class Grid:
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

    def __contains__(self, item):
        return item in self.data

    def __iter__(self):
        return iter(self.data)

Arithmetic Operators

class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

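    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented  # Let Python try other.__radd__ or raise TypeError
        return Vector(self.x + other.x, self.y + other.y)

    def __radd__(self, other):
        # Called for other + self when other's __add__ returns NotImplemented
        if other == 0:
            return self  # Lets sum(vectors) work with its default start value of 0
        return NotImplemented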

    def __iadd__(self, other):
        # In-place addition: self += other
        self.x += other.x
        self.y += other.y
        return self

    def __mul__(self, scalar):
        return Vector(self.x * scalar, self.y * scalar)

    def __rmul__(self, scalar):
        return self.__mul__(scalar)

    def __neg__(self):
        return Vector(-self.x, -self.y)

    def __abs__(self):
        return (self.x**2 + self.y**2) ** 0.5

Callable and Attribute Access

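class LazyLoader:
    def __init__(self, delegate):
        # Set in __init__ so normal lookup finds it; if _delegate were missing,
        # __getattr__ below would recurse forever looking it up
        self._delegate = delegate

    def __call__(self, *args, **kwargs):
        # Makes instances callable: obj()
        # execute is not defined here, so it resolves on the delegate via __getattr__
        return self.execute(*args, **kwargs)

    def __getattr__(self, name):
        # Called ONLY when normal attribute lookup fails
        # Good for proxy objects, lazy loading
        return getattr(self._delegate, name)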

    def __getattribute__(self, name):
        # Called for EVERY attribute access (be careful — easily infinite loops)
        return super().__getattribute__(name)

    def __setattr__(self, name, value):
        # Called for every attribute assignment
        super().__setattr__(name, value)
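The difference between __getattr__ and __getattribute__ is easiest to see with a small proxy. This is a minimal sketch; LoggingProxy and its forwarded list are illustrative names, not an established API.

```python
class LoggingProxy:
    """Forwards unknown attributes to a wrapped object and records each forward."""

    def __init__(self, target):
        self._target = target
        self.forwarded = []

    def __getattr__(self, name):
        # Only reached when normal lookup on the proxy fails
        if name.startswith("_"):
            # Guards against infinite recursion if _target is not set yet
            raise AttributeError(name)
        self.forwarded.append(name)
        return getattr(self._target, name)

proxy = LoggingProxy([3, 1, 2])
proxy.sort()            # not defined on the proxy, so forwarded to the list
print(proxy.forwarded)  # ['sort']  (reading .forwarded itself was not forwarded)
print(proxy._target)    # [1, 2, 3]
```

Attributes that exist on the proxy (forwarded, _target) never reach __getattr__, which is why this hook is much safer to override than __getattribute__.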

Context Manager

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.cleanup()
        return False  # Don't suppress exceptions
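The return value of __exit__ decides whether an exception propagates. The sketch below uses a toy Suppress class to show both outcomes; the standard library already offers contextlib.suppress for real use, so treat this purely as an illustration.

```python
class Suppress:
    """Toy context manager that swallows one exception type."""

    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.events.append("exit")
        # True suppresses the exception; False (or None) lets it propagate
        return exc_type is not None and issubclass(exc_type, self.exc_type)

with Suppress(KeyError) as cm:
    {}["missing"]  # KeyError is raised, then swallowed by __exit__

propagated = False
try:
    with Suppress(KeyError) as cm2:
        1 / 0      # ZeroDivisionError is not a KeyError, so it propagates
except ZeroDivisionError:
    propagated = True

print(cm.events, cm2.events, propagated)
# ['enter', 'exit'] ['enter', 'exit'] True
```

In both cases __exit__ runs, which is the guarantee that makes context managers a better cleanup tool than __del__.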

`__slots__` — Memory Optimization

By default, Python stores instance attributes in a __dict__ (a dictionary). This is flexible but uses memory. __slots__ replaces the dict with a fixed set of attributes:

class PointWithSlots:
    __slots__ = ("x", "y")  # No __dict__ created

    def __init__(self, x, y):
        self.x, self.y = x, y

class PointWithoutSlots:
    def __init__(self, x, y):
        self.x, self.y = x, y

import sys
p_slots = PointWithSlots(1, 2)
p_no_slots = PointWithoutSlots(1, 2)
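print(sys.getsizeof(p_slots))     # ~48 bytes on 64-bit CPython
# getsizeof() is shallow, so add the separate __dict__ for a fair comparison
print(sys.getsizeof(p_no_slots) + sys.getsizeof(p_no_slots.__dict__))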

# Real savings on millions of instances
# Also: slots instances are slightly faster to access
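Since sys.getsizeof only reports a shallow size, a more honest comparison measures what many instances actually allocate. The following is a rough sketch using tracemalloc; the class names and the instance count are arbitrary, and exact byte counts vary by Python version, so only the relative result is checked.

```python
import tracemalloc

class WithSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y

class WithDict:
    def __init__(self, x, y):
        self.x, self.y = x, y

def allocated_bytes(cls, n=50_000):
    # Measure memory still held after building n instances
    tracemalloc.start()
    objs = [cls(i, i) for i in range(n)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current

slots_bytes = allocated_bytes(WithSlots)
dict_bytes = allocated_bytes(WithDict)
print(slots_bytes < dict_bytes)  # True

# The trade-off: the attribute set is fixed
rejected = False
try:
    WithSlots(1, 2).z = 3
except AttributeError:
    rejected = True
print(rejected)                  # True
```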

Type Hints and the `typing` Module

Type hints are not enforced at runtime in CPython. The interpreter stores them as metadata for tools (mypy, pyright, IDEs) and for libraries that choose to read them, such as dataclasses and Pydantic. At scale, they're transformative.

from typing import Optional, Union, List, Dict, Tuple, Set
from typing import Callable, Iterator, Generator, AsyncGenerator
from typing import TypeVar, Generic, Protocol, TypedDict
from collections.abc import Sequence, Mapping

# Basic annotations
def greet(name: str) -> str:
    return f"Hello, {name}"

# Optional (can be None)
def find_user(user_id: int) -> Optional[User]:
    ...

# Union types (Python 3.10+: int | str instead of Union[int, str])
def process(value: int | str) -> None:
    ...

# Generic collections
def first_item(items: list[str]) -> str | None:
    return items[0] if items else None

# TypeVar for generic functions
T = TypeVar("T")

def first(items: list[T]) -> T | None:
    return items[0] if items else None
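A quick way to convince yourself that annotations are metadata rather than checks is to pass the "wrong" type and then inspect what Python stored. The function double is a made-up example.

```python
from typing import get_type_hints

def double(n: int) -> int:
    return n * 2

# No runtime enforcement: a str slips through and is repeated instead of doubled
print(double("ab"))            # abab

# The hints are plain metadata that tools and libraries can read
print(get_type_hints(double))  # {'n': <class 'int'>, 'return': <class 'int'>}
```

A static checker such as mypy or pyright would flag the double("ab") call before the code ever runs.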

TypedDict — typed dictionaries:

from typing import TypedDict

class UserConfig(TypedDict):
    name: str
    age: int
    email: str | None

def create_user(config: UserConfig) -> User:
    return User(config["name"], config["age"])

Protocol — structural subtyping (duck typing with type checking):

from typing import Protocol

class Drawable(Protocol):
    def draw(self) -> None: ...
    def get_bounds(self) -> tuple[int, int, int, int]: ...

# Any class with these methods satisfies Drawable,
# without needing to inherit from it
class Circle:
    def draw(self) -> None:
        pygame.draw.circle(...)

    def get_bounds(self) -> tuple[int, int, int, int]:
        return (...)

def render_all(shapes: list[Drawable]) -> None:
    for shape in shapes:
        shape.draw()

render_all([Circle(), Rectangle()])  # Type checks correctly

dataclasses — the modern alternative to boilerplate classes:

from dataclasses import dataclass

@dataclass(frozen=True)  # Immutable; hashable as long as every field is hashable
class Point:
    x: float
    y: float
    label: str = ""
    tags: tuple[str, ...] = ()  # A list field here would make hash(point) raise TypeError

    def distance_to(self, other: "Point") -> float:
        return ((self.x - other.x)**2 + (self.y - other.y)**2) ** 0.5

# Automatically generates __init__, __repr__, __eq__, __hash__
p1 = Point(0.0, 0.0)
p2 = Point(3.0, 4.0)
print(p1.distance_to(p2))  # 5.0

Annotated — attaching metadata to type hints:

from typing import Annotated

# Used by Pydantic, FastAPI, etc. to attach validation rules
Positive = Annotated[float, "must be positive"]
UserId = Annotated[int, "user primary key"]

def get_user(user_id: UserId) -> User:
    ...
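Annotated metadata is invisible to normal type resolution unless a library asks for it. This sketch shows how a framework could read it back with typing.get_type_hints; the get_user body and its return value are placeholders.

```python
from typing import Annotated, get_type_hints

UserId = Annotated[int, "user primary key"]

def get_user(user_id: UserId) -> str:
    return f"user-{user_id}"  # Placeholder body for illustration

# include_extras=True keeps the Annotated wrapper and its metadata
hints = get_type_hints(get_user, include_extras=True)
print(hints["user_id"].__metadata__)        # ('user primary key',)

# Without it, Annotated is stripped and only the underlying type remains
print(get_type_hints(get_user)["user_id"])  # <class 'int'>
```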

4. Python's Ecosystem

Data & Analysis

NumPy — The Foundation of Scientific Python

NumPy provides the ndarray: a typed, dense, multi-dimensional array stored contiguously in memory. Operations are vectorized — they run in C or Fortran, not in the Python interpreter loop.

import numpy as np

# The key mental model: operate on arrays, not loops
arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# BAD — Python loop, slow
result = []
for x in arr:
    result.append(x ** 2)

# GOOD — vectorized, runs in C
result = arr ** 2  # [1., 4., 9., 16., 25.]

# Broadcasting — operations on arrays of different shapes
matrix = np.ones((3, 4))
row = np.array([1, 2, 3, 4])
matrix + row  # Broadcasts row across each row of matrix

# Advanced indexing
arr = np.arange(100).reshape(10, 10)
arr[arr > 50]              # Boolean indexing
arr[[0, 3, 7], [1, 2, 5]]  # Fancy indexing — select specific (row, col) pairs

Why NumPy is fast:

Memory layout is contiguous (cache-friendly)
Operations are implemented in C/Fortran
NumPy can release the GIL for numerical operations
BLAS/LAPACK libraries for linear algebra (same libraries MATLAB uses)

Pandas — Tabular Data Manipulation

Pandas DataFrame is an in-memory table with labeled rows and columns, built on NumPy.

import pandas as pd

# Reading data
df = pd.read_csv("sales.csv")
df = pd.read_parquet("data.parquet")  # Parquet is preferred for large data

# Key operations
df["revenue"] = df["price"] * df["quantity"]  # Vectorized column math
df_filtered = df[df["region"] == "APAC"]       # Boolean filtering
df_grouped = df.groupby("product")["revenue"].sum()  # Aggregation

# The method chain pattern (Pandas style)
result = (
    df
    .query("region == 'APAC' and revenue > 1000")
    .assign(margin=lambda x: x["revenue"] - x["cost"])
    .groupby("product")
    .agg({"revenue": "sum", "margin": "mean"})
    .sort_values("revenue", ascending=False)
    .head(10)
)

Common performance pitfalls:

# BAD — iterrows() is the slowest way to process DataFrame rows
for idx, row in df.iterrows():
    df.at[idx, "label"] = classify(row["value"])

# BETTER: apply() still calls classify once per value in Python, but skips iterrows() overhead
df["label"] = df["value"].apply(classify)

# BEST — fully vectorized with numpy/pandas operations
df["label"] = np.where(df["value"] > threshold, "high", "low")

# CRITICAL: Avoid chained indexing when assigning
df["col"]["row"] = value      # BAD: may write to a temporary copy (SettingWithCopyWarning)
df.loc["row", "col"] = value  # GOOD: a single indexing operation on df itself

Polars — Why It's Gaining on Pandas

Polars is a DataFrame library written in Rust, with a Python API. Key advantages:

Lazy evaluation — build a query plan, execute once (like Spark)
True parallelism — no GIL in Rust, uses all CPU cores
Memory efficient — Apache Arrow columnar format
Often much faster than pandas; the commonly quoted 10–100x depends heavily on the workload, so benchmark your own

import polars as pl

# Lazy API — like a query plan
result = (
    pl.scan_csv("huge_file.csv")          # Doesn't load file yet
    .filter(pl.col("region") == "APAC")
    .with_columns(
        (pl.col("price") * pl.col("qty")).alias("revenue")
    )
    .group_by("product")
    .agg(pl.col("revenue").sum())
    .sort("revenue", descending=True)
    .collect()  # Execute the entire plan optimally
)

When to choose Polars over Pandas:

Large datasets (>1GB) where Pandas is slow
Multi-core hardware you want to use fully
New greenfield projects where you have no Pandas muscle memory
When you need consistent, predictable performance

Web & APIs

FastAPI — Modern Async API Framework

FastAPI uses Python type hints to automatically generate request validation, serialization, and OpenAPI documentation.

from fastapi import FastAPI, HTTPException, Depends
from pydantic import BaseModel, ConfigDict
from sqlalchemy.orm import Session  # Assumes SQLAlchemy; User and get_db are defined elsewhere

app = FastAPI()

class UserCreate(BaseModel):
    name: str
    email: str
    age: int | None = None

class UserResponse(BaseModel):
    # Pydantic v2 style (v1 used `class Config: orm_mode = True`)
    model_config = ConfigDict(from_attributes=True)  # Allow creating from ORM objects

    id: int
    name: str
    email: str

@app.post("/users", response_model=UserResponse, status_code=201)
async def create_user(user: UserCreate, db: Session = Depends(get_db)):
    # user is already validated by Pydantic
    # db is injected by FastAPI's dependency system
    db_user = User(**user.model_dump())
    db.add(db_user)
    db.commit()
    db.refresh(db_user)
    return db_user  # Pydantic converts ORM object to response model

@app.get("/users/{user_id}", response_model=UserResponse)
async def get_user(user_id: int, db: Session = Depends(get_db)):
    user = db.get(User, user_id)
    if user is None:
        raise HTTPException(status_code=404, detail="User not found")
    return user

FastAPI's dependency injection is one of its killer features:

# Dependencies can be async, can depend on other dependencies, can have teardown
async def get_db():
    db = SessionLocal()
    try:
        yield db          # Setup: provide value to endpoint
    finally:
        db.close()        # Teardown: always runs

async def get_current_user(
    token: str = Depends(oauth2_scheme),
    db: Session = Depends(get_db)
) -> User:
    user = verify_token(token, db)
    if not user:
        raise HTTPException(status_code=401)
    return user

@app.get("/me")
async def get_me(current_user: User = Depends(get_current_user)):
    return current_user
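The yield-based setup/teardown shape can be imitated with the standard library alone. The sketch below is not FastAPI's resolver (which also handles async, caching, and nested dependencies); call_with_dependency is a hypothetical helper and a dict stands in for the database session.

```python
from contextlib import contextmanager

events = []

def get_db():
    # Same shape as the dependency above, with a dict standing in for a session
    db = {"open": True}
    events.append("setup")
    try:
        yield db
    finally:
        db["open"] = False
        events.append("teardown")

def call_with_dependency(handler, dependency):
    # Hypothetical mini-resolver: run setup, call the handler, then run teardown
    with contextmanager(dependency)() as value:
        return handler(value)

def endpoint(db):
    events.append("handler")
    return db["open"]

result = call_with_dependency(endpoint, get_db)
print(result, events)  # True ['setup', 'handler', 'teardown']
```

The handler sees an open resource, and the finally block runs afterwards even if the handler raises.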

Flask vs FastAPI vs Django

	Flask	FastAPI	Django
Philosophy	Minimal, explicit	Modern, typed	Batteries-included
Async	Async views since 2.0 (Quart for ASGI)	Native	Partial (3.1+)
ORM	None built-in	None built-in	Built-in
Admin UI	None	None	Built-in
Best for	Simple APIs, microservices	High-perf APIs, ML serving	Full web apps
Learning curve	Low	Medium	High

Machine Learning & AI

The Stack

scikit-learn: Classical ML (regression, classification, clustering, preprocessing). The API (fit, transform, predict) became the industry standard interface.
PyTorch: Deep learning research and production. Dynamic computational graphs, imperative execution. Preferred by researchers.
TensorFlow: Deep learning, especially at Google scale. TF1 used static graphs; TF2 runs eagerly by default and builds graphs via tf.function. Preferred in some production settings.
Hugging Face Transformers: Pre-trained transformer models (BERT, GPT, T5, etc.) with a unified API.

# scikit-learn pipeline pattern
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", GradientBoostingClassifier(n_estimators=100))
])

scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring="roc_auc")
print(f"AUC: {scores.mean():.3f} ± {scores.std():.3f}")

# PyTorch training loop
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.network = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, output_dim)
        )

    def forward(self, x):
        return self.network(x)

model = MLP(128, 256, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for batch_x, batch_y in dataloader:
    optimizer.zero_grad()
    logits = model(batch_x)
    loss = criterion(logits, batch_y)
    loss.backward()
    optimizer.step()

Concurrency & Performance Libraries

asyncio

import asyncio
import httpx

async def fetch(client, url):
    response = await client.get(url)
    return response.json()

async def fetch_all(urls):
    async with httpx.AsyncClient() as client:
        tasks = [fetch(client, url) for url in urls]
        return await asyncio.gather(*tasks)

# Fetch 100 URLs concurrently (not in parallel — on one thread)
results = asyncio.run(fetch_all(urls))

Celery — Distributed Task Queue

from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task(bind=True, max_retries=3)
def send_email(self, user_id: int):
    try:
        user = User.objects.get(id=user_id)
        email_service.send(user.email, "Welcome!")
    except Exception as exc:
        raise self.retry(exc=exc, countdown=60)

# From your web handler:
send_email.delay(user.id)  # Enqueues task, returns immediately

Testing — Pytest

import pytest
from unittest.mock import MagicMock, patch

# Fixtures — reusable test setup
@pytest.fixture
def db_session():
    session = create_test_session()
    yield session
    session.rollback()
    session.close()

@pytest.fixture
def user_service(db_session):
    return UserService(db=db_session)

# Parametrize — run test with multiple inputs
@pytest.mark.parametrize("email,is_valid", [
    ("user@example.com", True),
    ("not-an-email", False),
    ("", False),
    ("a@b.c", True),
])
def test_email_validation(email, is_valid):
    assert validate_email(email) == is_valid

# Mocking external dependencies
def test_send_welcome_email(user_service):
    with patch("myapp.email.send") as mock_send:
        user_service.create_user("Alice", "alice@example.com")
        mock_send.assert_called_once_with(
            to="alice@example.com",
            subject="Welcome!"
        )

# Testing exceptions
def test_create_duplicate_user(user_service, db_session):
    user_service.create_user("Alice", "alice@example.com")
    with pytest.raises(DuplicateEmailError, match="alice@example.com"):
        user_service.create_user("Alice2", "alice@example.com")

5. Concurrency & Performance — The Hard Parts

Threading vs. Multiprocessing vs. asyncio

This is one of the most common points of confusion. The right mental model:

Problem type          → Solution
──────────────────────────────────────────────
I/O-bound (async)     → asyncio (single thread, cooperative)
I/O-bound (sync libs) → threading (multiple threads, GIL released during I/O)
CPU-bound             → multiprocessing (multiple processes, separate GILs)
Mixed, complex        → concurrent.futures (unified API over threads/processes)
Long-running tasks    → Celery/RQ (distributed task queue)

Threading — when to use:

import threading
import queue
import requests

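def download_worker(url_queue, result_queue):
    while True:
        url = url_queue.get()
        if url is None:  # Poison pill to stop worker
            break
        try:
            result = requests.get(url, timeout=10)
            result_queue.put((url, result.status_code))
        except requests.RequestException as exc:
            result_queue.put((url, exc))  # Report the failure instead of killing the thread
        finally:
            url_queue.task_done()  # Always mark done, or url_queue.join() hangs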

urls = ["https://example.com"] * 100
url_queue = queue.Queue()
result_queue = queue.Queue()

# Spawn worker threads
threads = [
    threading.Thread(target=download_worker, args=(url_queue, result_queue))
    for _ in range(10)
]
for t in threads:
    t.start()

# Enqueue work
for url in urls:
    url_queue.put(url)

# Wait for completion
url_queue.join()

# Shutdown workers
for _ in threads:
    url_queue.put(None)
for t in threads:
    t.join()  # Wait for every worker to exit

Multiprocessing — CPU-bound work:

from multiprocessing import Pool
import os

def cpu_intensive(n):
    # Imagine this is a real computation
    return sum(i * i for i in range(n))

if __name__ == "__main__":  # Required where workers are spawned (Windows, macOS)
    with Pool(processes=os.cpu_count()) as pool:
        results = pool.map(cpu_intensive, [10**6] * 8)
        # Each call runs in a separate process with its own GIL

concurrent.futures — the clean, unified API:

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor, as_completed

# Thread pool for I/O-bound work
with ThreadPoolExecutor(max_workers=20) as executor:
    futures = {executor.submit(requests.get, url): url for url in urls}
    for future in as_completed(futures):
        url = futures[future]
        try:
            response = future.result()
            print(f"{url}: {response.status_code}")
        except Exception as e:
            print(f"{url} failed: {e}")

# Process pool for CPU-bound work
with ProcessPoolExecutor(max_workers=os.cpu_count()) as executor:
    results = list(executor.map(cpu_intensive, data_chunks))

How asyncio Actually Works

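asyncio is a cooperative multitasking system. Only one coroutine runs at a time. Coroutines yield control voluntarily via await, so a coroutine that never awaits (or that calls blocking code) stalls every other task on the loop.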

Event Loop
│
├── Task 1: fetch("url_a")
│     │
│     ├── await http_get() → IO registered, task suspended
│     │
│     └── (resumes when IO completes)
│
├── Task 2: fetch("url_b")
│     │
│     ├── await http_get() → IO registered, task suspended
│     │
│     └── (resumes when IO completes)
│
└── [Polling OS for IO completion events, resuming ready tasks]
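The diagram can be checked with a stopwatch. In this sketch, fake_io is a made-up stand-in for a network call, with asyncio.sleep playing the role of the I/O wait; three 0.1-second waits overlap on a single thread.

```python
import asyncio
import time

async def fake_io(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # Suspends this task; the loop runs the others
    return name

async def main() -> tuple[list[str], float]:
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_io("a", 0.1), fake_io("b", 0.1), fake_io("c", 0.1)
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results)            # ['a', 'b', 'c']
print(f"{elapsed:.2f}s")  # Roughly 0.10s rather than 0.30s
```

Replacing asyncio.sleep with time.sleep would make the total about 0.3 seconds, because a blocking call never hands control back to the loop.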

Async patterns:

import asyncio

async def producer(queue):
    for i in range(10):
        await asyncio.sleep(0.1)  # Simulates async work
        await queue.put(i)
    await queue.put(None)  # Sentinel

async def consumer(queue, name):
    while True:
        item = await queue.get()
        if item is None:
            await queue.put(None)  # Pass poison pill to other consumers
            break
        print(f"{name} processing {item}")
        await asyncio.sleep(0.2)

async def main():
    queue = asyncio.Queue(maxsize=5)
    await asyncio.gather(
        producer(queue),
        consumer(queue, "consumer-1"),
        consumer(queue, "consumer-2"),
    )

asyncio.run(main())

asyncio.gather vs asyncio.wait vs asyncio.TaskGroup:

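# gather: run all, collect results in order (on Python 3.11+, prefer TaskGroup)
results = await asyncio.gather(task1(), task2(), task3())

# wait: more control over cancellation/timeout
# Takes Tasks, not bare coroutines (passing coroutines raises TypeError on 3.11+)
tasks = [asyncio.create_task(task1()), asyncio.create_task(task2())]
done, pending = await asyncio.wait(
    tasks,
    timeout=5.0,
    return_when=asyncio.FIRST_COMPLETED
)
for task in pending:
    task.cancel()  # wait() does not cancel unfinished tasks for you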

# TaskGroup (Python 3.11+) — structured concurrency, cleaner error handling
async with asyncio.TaskGroup() as tg:
    task1 = tg.create_task(fetch(url1))
    task2 = tg.create_task(fetch(url2))
# Both tasks complete before exiting the block
results = [task1.result(), task2.result()]
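Error handling is where these APIs differ most. As one illustration, gather can collect failures alongside results when return_exceptions=True is passed; the coroutines ok and boom are invented for the example.

```python
import asyncio

async def ok(value: int) -> int:
    await asyncio.sleep(0)
    return value

async def boom() -> int:
    await asyncio.sleep(0)
    raise ValueError("boom")

async def main() -> list:
    # Exceptions are returned in place of results instead of being raised
    return await asyncio.gather(ok(1), boom(), ok(3), return_exceptions=True)

results = asyncio.run(main())
values = [r for r in results if not isinstance(r, Exception)]
errors = [r for r in results if isinstance(r, Exception)]
print(values, [type(e).__name__ for e in errors])  # [1, 3] ['ValueError']
```

Without return_exceptions=True, the first exception propagates out of the await while the remaining awaitables keep running in the background, which is one of the rough edges TaskGroup was designed to remove.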

Profiling and Identifying Bottlenecks

Never optimize without measuring. The bottleneck is rarely where you think it is.

# cProfile — function-level timing
import cProfile
import pstats

with cProfile.Profile() as pr:
    your_function()

stats = pstats.Stats(pr)
stats.sort_stats("cumulative")
stats.print_stats(20)  # Top 20 functions by cumulative time

# From command line
python -m cProfile -s cumulative your_script.py | head -30

# Visualize with snakeviz
pip install snakeviz
python -m cProfile -o profile.stats your_script.py
snakeviz profile.stats

# line_profiler — line-by-line timing (install: pip install line_profiler)
from line_profiler import LineProfiler

profiler = LineProfiler()
profiler.add_function(slow_function)
profiler.runcall(slow_function, arg1, arg2)
profiler.print_stats()

# Or use @profile decorator with kernprof:
# kernprof -l -v script.py

# memory_profiler — memory usage per line
from memory_profiler import profile

@profile
def memory_heavy_function():
    data = [i for i in range(1_000_000)]  # Peaks here
    filtered = [x for x in data if x % 2 == 0]
    return filtered
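For a quick comparison of two small implementations, the standard library's timeit is often enough before reaching for the profilers above. This is a minimal sketch; the two functions and the repeat counts are arbitrary choices, and no particular timing is implied.

```python
import timeit

def squares_loop(n: int) -> list[int]:
    out = []
    for i in range(n):
        out.append(i * i)
    return out

def squares_comprehension(n: int) -> list[int]:
    return [i * i for i in range(n)]

# Check both versions agree before comparing their speed
same_result = squares_loop(1000) == squares_comprehension(1000)

# repeat() returns one total per run; the minimum is the least noisy estimate
loop_time = min(timeit.repeat(lambda: squares_loop(1000), number=200, repeat=5))
comp_time = min(timeit.repeat(lambda: squares_comprehension(1000), number=200, repeat=5))
print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```

Taking the minimum of several repeats filters out interference from other processes, which tends to matter more than the difference being measured.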

When to Drop Into C: Cython, Numba, ctypes

Numba — JIT compilation for numerical Python:

from numba import jit
import numpy as np

@jit(nopython=True)  # Compile to machine code, no Python interpreter
def monte_carlo_pi(n):
    inside = 0
    for _ in range(n):
        x = np.random.random()
        y = np.random.random()
        if x**2 + y**2 <= 1.0:
            inside += 1
    return 4.0 * inside / n

# First call compiles; subsequent calls run as native machine code
result = monte_carlo_pi(10_000_000)  # Typically far faster than the pure-Python loop; measure on your machine

Cython — write Python-like code that compiles to C:

# montecarlo.pyx
import numpy as np
cimport numpy as np

def sum_array(np.ndarray[double, ndim=1] arr):
    cdef double total = 0.0
    cdef Py_ssize_t i  # Index type sized for arrays of any length
    for i in range(arr.shape[0]):
        total += arr[i]
    return total
# After: cython montecarlo.pyx, gcc to compile → Python extension

Decision matrix:

Scenario	Tool
Tight numerical loops	Numba `@jit`
Complex algorithm needing C speed	Cython
Call existing C library	ctypes or cffi
Call C++ library	pybind11
Array operations	NumPy (already C)

Why Python Is "Slow" — And When It Doesn't Matter

Python is slow because:

Dynamic dispatch: Every operation (a + b) looks up the method at runtime
Boxed types: 42 is a 28-byte object, not a 4-byte int
Interpreter overhead: The CPython loop processes one bytecode at a time
GIL: Threads cannot run Python bytecode on multiple cores in the default CPython build (3.13 adds an experimental free-threaded build)

But "slow" is contextual:

Waiting on a database query? Python spends about 0.1 ms; the DB takes 50 ms. Python speed is irrelevant.
Training a neural network? PyTorch's CUDA kernels run on GPU. Python glues them together.
Processing 1TB of data? You're using Spark/Dask — Python is just the driver.
Writing a CLI tool? Startup time matters, but it's still sub-second.

Python is slow at: pure CPU-bound computation in tight loops. Python is fast enough at: nearly everything else, especially when the actual work is in C/Rust/CUDA libraries.
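Two of the causes listed above, boxed types and dynamic dispatch, can be observed directly from the interpreter. This is a small sketch; the exact byte count and instruction names depend on the CPython version and platform.

```python
import dis
import sys

# Boxed types: even a small int is a full heap object
print(sys.getsizeof(42))  # 28 on typical 64-bit CPython builds

def add(a, b):
    return a + b

# Dynamic dispatch: `+` compiles to one generic instruction that works out
# the operand types at runtime
ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)

# The same bytecode serves ints, strings, and lists
print(add(1, 2), add("a", "b"), add([1], [2]))  # 3 ab [1, 2]
```

The generic binary instruction is what libraries like NumPy and Numba sidestep by fixing the types ahead of time.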

6. Software Engineering in Python — Writing Code That Scales

Professional Project Structure

myproject/
├── src/
│   └── myproject/
│       ├── __init__.py
│       ├── core/
│       │   ├── __init__.py
│       │   ├── models.py
│       │   └── services.py
│       ├── api/
│       │   ├── __init__.py
│       │   ├── routes.py
│       │   └── schemas.py
│       ├── db/
│       │   ├── __init__.py
│       │   ├── session.py
│       │   └── migrations/
│       └── utils/
│           ├── __init__.py
│           └── helpers.py
├── tests/
│   ├── conftest.py          # Shared fixtures
│   ├── unit/
│   │   ├── test_services.py
│   │   └── test_models.py
│   └── integration/
│       └── test_api.py
├── scripts/
│   └── seed_db.py
├── pyproject.toml
├── README.md
├── .env.example
└── Makefile

The src/ layout (putting source in src/myproject/ rather than just myproject/) prevents the package from being importable without installation, which catches packaging bugs early.

Advanced OOP: Composition vs. Inheritance

The inheritance trap: Deep inheritance hierarchies become unmanageable. Python supports multiple inheritance of classes (Java allows it only for interfaces), which adds complexity. Prefer composition.

# BAD — inheritance hierarchy that doesn't scale
class Animal:
    def breathe(self): ...

class Mammal(Animal):
    def feed_young(self): ...

class Dog(Mammal):
    def bark(self): ...

class RobotDog(Dog):  # What about the Mammal methods that don't apply?
    pass

# GOOD: composition, where Dog delegates to small behavior objects
class Walker:
    def walk(self): ...

class Barker:
    def bark(self): ...

class Dog:
    def __init__(self):
        self._walker = Walker()
        self._barker = Barker()

    def walk(self): return self._walker.walk()
    def bark(self): return self._barker.bark()

Abstract Base Classes — enforcing interfaces:

from abc import ABC, abstractmethod

import pandas as pd  # Used by CSVProcessor below

class DataProcessor(ABC):
    @abstractmethod
    def read(self, source: str) -> list[dict]:
        """Read data from source."""
        ...

    @abstractmethod
    def write(self, data: list[dict], destination: str) -> None:
        """Write processed data to destination."""
        ...

    def process(self, source: str, destination: str) -> None:
        # Template method pattern
        data = self.read(source)
        processed = self.transform(data)
        self.write(processed, destination)

    def transform(self, data: list[dict]) -> list[dict]:
        return data  # Default: no transformation

class CSVProcessor(DataProcessor):
    def read(self, source: str) -> list[dict]:
        return pd.read_csv(source).to_dict("records")

    def write(self, data: list[dict], destination: str) -> None:
        pd.DataFrame(data).to_csv(destination, index=False)

Design Patterns in Python — What Applies

Patterns that are natural in Python:

# Strategy pattern — using callables/functions instead of classes
def sort_by_name(items):
    return sorted(items, key=lambda x: x.name)

def sort_by_date(items):
    return sorted(items, key=lambda x: x.created_at)

def display_items(items, sort_strategy=sort_by_name):
    for item in sort_strategy(items):
        print(item)
# Functions as strategies — no abstract class needed

# Observer pattern — using events/callbacks
from collections.abc import Callable

class EventEmitter:
    def __init__(self):
        self._listeners: dict[str, list[Callable]] = {}

    def on(self, event: str, callback: Callable):
        self._listeners.setdefault(event, []).append(callback)

    def emit(self, event: str, *args, **kwargs):
        for callback in self._listeners.get(event, []):
            callback(*args, **kwargs)

# Factory pattern — using dict dispatch instead of if/elif chains
PROCESSORS = {
    "csv": CSVProcessor,
    "json": JSONProcessor,
    "parquet": ParquetProcessor,
}

def get_processor(format: str) -> DataProcessor:
    if format not in PROCESSORS:
        raise ValueError(f"Unknown format: {format}")
    return PROCESSORS[format]()

Patterns that are anti-patterns in Python:

Singleton via metaclass: Python modules are singletons. If you need a shared state, put it in a module-level variable.
Abstract factory with boilerplate classes: Functions returning callables are simpler.
Builder pattern via chained setters: Use dataclasses or __init__ with defaults.
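To make the builder point concrete, here is a minimal sketch of a dataclass doing the job a builder with chained setters would do. The Request class and its fields are hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Request:
    url: str
    method: str = "GET"
    timeout: float = 10.0
    retries: int = 3

# Keyword arguments with defaults cover what chained setters would do
req = Request("https://example.com/api", timeout=2.5)

# replace() derives a modified copy, the role of builder.with_x(...).build()
post = replace(req, method="POST", retries=0)

print(req)
print(post)
```

Because the instance is frozen, the original request cannot be mutated by accident, which is the main guarantee a builder's final build() step usually provides.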

Logging, Error Handling, and Observability

import structlog  # pip install structlog — structured logging

# Configure structured logging
structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.add_log_level,
        structlog.processors.format_exc_info,  # Renders exc_info=True as a traceback string
        structlog.processors.JSONRenderer(),
    ]
)

logger = structlog.get_logger()

# Context-rich logging
logger = logger.bind(service="user-service", env="production")

def create_user(user_data: dict) -> User:
    log = logger.bind(email=user_data["email"])
    log.info("creating_user")
    try:
        user = User(**user_data)
        db.add(user)
        db.commit()
        log.info("user_created", user_id=user.id)
        return user
    except IntegrityError as e:
        db.rollback()
        log.warning("duplicate_email", error=str(e))
        raise DuplicateEmailError(user_data["email"]) from e
    except Exception as e:
        db.rollback()
        log.error("user_creation_failed", error=str(e), exc_info=True)
        raise

Custom exception hierarchy:

class AppError(Exception):
    """Base class for all application errors."""
    def __init__(self, message: str, code: str = "UNKNOWN"):
        self.message = message
        self.code = code
        super().__init__(message)

class ValidationError(AppError):
    def __init__(self, field: str, message: str):
        super().__init__(message, code="VALIDATION_ERROR")
        self.field = field

class NotFoundError(AppError):
    def __init__(self, resource: str, id: int | str):
        super().__init__(f"{resource} {id} not found", code="NOT_FOUND")
        self.resource = resource
        self.id = id

Configuration Management

# settings.py — using pydantic-settings
from pydantic_settings import BaseSettings
from pydantic import Field

class Settings(BaseSettings):
    # Reads from environment variables or .env file
    database_url: str = Field(..., description="PostgreSQL connection URL")
    redis_url: str = "redis://localhost:6379/0"
    secret_key: str = Field(..., min_length=32)
    debug: bool = False
    max_connections: int = 10
    allowed_hosts: list[str] = ["localhost"]

    model_config = {
        "env_file": ".env",
        "env_file_encoding": "utf-8",
        "case_sensitive": False,
    }

# Singleton pattern for settings
from functools import lru_cache

@lru_cache(maxsize=1)
def get_settings() -> Settings:
    return Settings()

# Usage — FastAPI dependency injection
from fastapi import Depends

@app.get("/health")
def health_check(settings: Settings = Depends(get_settings)):
    return {"debug": settings.debug}
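The lru_cache trick means the environment is read once per process. The following sketch swaps the pydantic-settings class for a plain dataclass so it runs with the standard library only; the APP_DEBUG variable is a made-up name.

```python
import os
from dataclasses import dataclass
from functools import lru_cache

@dataclass(frozen=True)
class Settings:
    # Stand-in for the pydantic-settings class above
    debug: bool

@lru_cache(maxsize=1)
def get_settings() -> Settings:
    return Settings(debug=os.environ.get("APP_DEBUG", "0") == "1")

os.environ["APP_DEBUG"] = "0"
first = get_settings()
os.environ["APP_DEBUG"] = "1"  # Changes after the first call are not seen
second = get_settings()
print(first is second, second.debug)  # True False

get_settings.cache_clear()  # For example in tests, to force a re-read
refreshed = get_settings()
print(refreshed.debug)      # True
```

The cache_clear() call is worth remembering in test suites that modify environment variables between cases.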

Packaging and Publishing a Library

pyproject.toml is the modern standard (replacing setup.py):

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "mypackage"
version = "0.1.0"
description = "A well-packaged Python library"
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.11"
dependencies = [
    "pydantic>=2.0",
    "httpx>=0.25",
]

[project.optional-dependencies]
dev = [
    "pytest>=7.0",
    "pytest-cov",
    "mypy",
    "ruff",
]

[project.scripts]
myapp = "mypackage.cli:main"  # Creates a CLI command on install

[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "--cov=src/mypackage --cov-report=term-missing"

[tool.ruff]
line-length = 100

[tool.ruff.lint]
select = ["E", "F", "I", "N", "UP"]

[tool.mypy]
strict = true

7. Python in the Real World

Data Engineers vs. Backend Developers vs. ML Engineers

Data engineers focus on pipelines: moving, transforming, and storing data at scale.

# A data pipeline with Prefect
import pandas as pd
from prefect import flow, task
from prefect.tasks import task_input_hash
from datetime import timedelta

# `engine` below is assumed to be a SQLAlchemy engine created elsewhere

@task(cache_key_fn=task_input_hash, cache_expiration=timedelta(hours=1))
def extract(source_url: str) -> pd.DataFrame:
    return pd.read_parquet(source_url)

@task
def transform(df: pd.DataFrame) -> pd.DataFrame:
    return (
        df
        .dropna(subset=["user_id", "event_type"])
        .assign(event_date=lambda x: pd.to_datetime(x["timestamp"]).dt.date)
        .groupby(["user_id", "event_date", "event_type"])
        .size()
        .reset_index(name="count")
    )

@task
def load(df: pd.DataFrame, table: str):
    df.to_sql(table, engine, if_exists="append", index=False)

@flow(name="user-events-pipeline")
def pipeline(date: str):
    raw = extract(f"s3://data-lake/events/{date}.parquet")
    cleaned = transform(raw)
    load(cleaned, "user_event_counts")

Backend engineers focus on APIs and services: request handling, database access, authentication, and reliability.

ML engineers bridge data science and production: they take a notebook prototype and make it reliable, scalable, and monitored.

# ML model serving with FastAPI
import joblib
from fastapi import FastAPI
from pydantic import BaseModel
import numpy as np

app = FastAPI()
model = joblib.load("model.pkl")
preprocessor = joblib.load("preprocessor.pkl")

class PredictionRequest(BaseModel):
    features: list[float]

class PredictionResponse(BaseModel):
    prediction: float
    confidence: float

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    # Plain `def`: FastAPI runs it in a threadpool, so CPU-bound inference
    # does not block the event loop
    X = np.array(request.features).reshape(1, -1)
    X_processed = preprocessor.transform(X)
    prediction = model.predict(X_processed)[0]
    proba = model.predict_proba(X_processed)[0]
    return PredictionResponse(
        prediction=float(prediction),
        confidence=float(max(proba))
    )

8. What Separates Good Python from Great Python

Idiomatic Python — Writing Code That Looks Like Python

Iteration:

# Not idiomatic
i = 0
while i < len(items):
    process(items[i])
    i += 1

# Idiomatic
for item in items:
    process(item)

# With index when needed
for i, item in enumerate(items):
    print(f"{i}: {item}")

# Iterating two sequences together
for a, b in zip(list1, list2):
    compare(a, b)

# Reverse iteration
for item in reversed(items):
    process(item)

Comprehensions:

# Not idiomatic
result = []
for x in data:
    if condition(x):
        result.append(transform(x))

# Idiomatic
result = [transform(x) for x in data if condition(x)]

# Dict comprehension
word_lengths = {word: len(word) for word in words}

# Set comprehension
unique_domains = {email.split("@")[1] for email in emails}

Unpacking:

# Not idiomatic
first = items[0]
rest = items[1:]

# Idiomatic
first, *rest = items
first, second, *_, last = items  # _ is convention for "don't care"

# Swap without temp variable
a, b = b, a

# Unpacking in for loops
for key, value in dictionary.items():
    process(key, value)

Default values:

# Not idiomatic
if key in dictionary:
    value = dictionary[key]
else:
    value = default

# Idiomatic
value = dictionary.get(key, default)

# Mutable default anti-pattern (classic bug)
def append_to(element, target=[]):  # BAD — list is created once, shared
    target.append(element)
    return target

def append_to(element, target=None):  # GOOD — None as sentinel
    if target is None:
        target = []
    target.append(element)
    return target
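
To make the shared-default bug visible, here is a small runnable demonstration. The function names `append_shared` and `append_fresh` are hypothetical renamings of the two versions above, so that both can exist side by side.

```python
def append_shared(element, target=[]):  # default list is created once, at def time
    target.append(element)
    return target


def append_fresh(element, target=None):  # None sentinel: a new list per call
    if target is None:
        target = []
    target.append(element)
    return target


first = append_shared(1)
second = append_shared(2)
print(first, second, first is second)  # both names point at the same list

fresh_a = append_fresh(1)
fresh_b = append_fresh(2)
print(fresh_a, fresh_b, fresh_a is fresh_b)  # two independent lists

# The default value is stored on the function object itself.
print(append_shared.__defaults__)
```

The last line shows why the bug happens: the default is an attribute of the function, evaluated once when `def` runs, not once per call.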

String formatting:

name, age = "Alice", 30

# Older styles (still valid, but noisier)
"Hello, %s, you are %d years old" % (name, age)
"Hello, {}, you are {} years old".format(name, age)

# Idiomatic (Python 3.6+)
f"Hello, {name}, you are {age} years old"
f"Pi is approximately {3.14159:.2f}"
f"{age:>10.2f}"  # Right-aligned, width 10, 2 decimal places

Common Anti-Patterns Intermediate Developers Fall Into

1. Using bare except:

# BAD — catches SystemExit, KeyboardInterrupt, everything
try:
    risky_operation()
except:
    pass

# GOOD — explicit about what you're catching
try:
    risky_operation()
except (ValueError, TypeError) as e:
    logger.warning(f"Expected error: {e}")
except Exception as e:
    logger.error(f"Unexpected error: {e}", exc_info=True)
    raise

2. Not using context managers for resources:

# BAD — file might not be closed if exception occurs
f = open("file.txt")
data = f.read()
f.close()

# GOOD
with open("file.txt") as f:
    data = f.read()

3. Comparing to True/False/None with ==:

# BAD
if x == True: ...
if x == None: ...

# GOOD
if x: ...
if x is None: ...   # None is a singleton, always use 'is'
if x is not None: ...
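
One reason `is None` matters: `==` calls `__eq__`, which any class can override, while `is` compares identity and cannot be overridden. The sketch below uses a deliberately contrived, hypothetical class `AlwaysEqual` to show the difference.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


x = AlwaysEqual()

eq_none = (x == None)  # noqa: E711 - calls AlwaysEqual.__eq__, which returns True
is_none = (x is None)  # identity check, cannot be overridden

print(eq_none, is_none)
```

Here `x == None` is true even though `x` is clearly not `None`, which is exactly the kind of surprise the identity check avoids.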

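4. Re-raising without explicit exception chaining:

# WEAK: Python 3 still attaches the original exception implicitly (__context__),
# but the traceback reads "During handling of the above exception, another
# exception occurred", which looks like a second bug inside the handler
try:
    risky()
except Exception:
    raise RuntimeError("Failed")

# GOOD: explicit chaining sets __cause__, and the traceback states that the
# original exception "was the direct cause" of the new one
try:
    risky()
except Exception as e:
    raise RuntimeError("Failed") from e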
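
To see what each form actually records on the exception object, the following sketch inspects the chaining attributes directly. The helper names (`risky`, `implicit`, `explicit`, `capture`) are hypothetical and exist only for this demonstration.

```python
def risky():
    raise ValueError("bad input")


def implicit():
    try:
        risky()
    except Exception:
        raise RuntimeError("Failed")  # no 'from': implicit chaining only


def explicit():
    try:
        risky()
    except Exception as e:
        raise RuntimeError("Failed") from e  # explicit chaining


def capture(func):
    """Call func and return the RuntimeError it raises."""
    try:
        func()
    except RuntimeError as err:
        return err


implicit_err = capture(implicit)
explicit_err = capture(explicit)

# Implicit: original is in __context__, __cause__ stays None.
print(type(implicit_err.__context__).__name__, implicit_err.__cause__)

# Explicit: original is in __cause__, and __suppress_context__ is set.
print(type(explicit_err.__cause__).__name__, explicit_err.__suppress_context__)
```

In both cases the original `ValueError` is reachable; `from e` changes how the relationship is recorded and how the traceback is worded.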

5. Modifying a list while iterating it:

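# BAD: each removal shifts later elements left, so the loop skips the element right after it
# (this particular input only ends up correct because no two even numbers are adjacent)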
items = [1, 2, 3, 4, 5]
for item in items:
    if item % 2 == 0:
        items.remove(item)  # Modifies list during iteration!

# GOOD — filter to new list
items = [item for item in items if item % 2 != 0]
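
The skipping becomes visible as soon as two elements that should be removed sit next to each other. A short sketch with a hypothetical input list:

```python
# Mutating during iteration: the element after each removal is never visited.
mutated = [2, 4, 6, 7]
for item in mutated:
    if item % 2 == 0:
        mutated.remove(item)
print(mutated)  # 4 survives even though it is even

# Filtering into a new list visits every element exactly once.
original = [2, 4, 6, 7]
filtered = [item for item in original if item % 2 != 0]
print(filtered)

# If other references must see the change, assign back into the same list.
original[:] = filtered
print(original)
```

Slice assignment (`original[:] = ...`) keeps the same list object, which matters when another part of the program holds a reference to it.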

6. String concatenation in a loop:

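# BAD: can degrade to O(n²), because each += may copy the whole string built so far
# (CPython sometimes optimizes this in place, but that is an implementation detail)
result = ""
for part in parts:
    result += part

# GOOD: O(n), because join computes the total length once and copies each part once
result = "".join(parts)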

7. Not using pathlib:

# Older style (still works, but more verbose)
import os
path = os.path.join(base_dir, "data", "file.csv")
os.makedirs(os.path.dirname(path), exist_ok=True)

# Modern
from pathlib import Path
path = Path(base_dir) / "data" / "file.csv"
path.parent.mkdir(parents=True, exist_ok=True)

How Senior Engineers Read Complex Python Code

Step 1: Start with the public API — __init__.py, README, docstrings on public functions. Understand what the library does before how it does it.

Step 2: Run it. Use the REPL. import library; dir(library) to see what's exported. help(library.function) to read docstrings.

Step 3: Find the entry point. For a web framework, trace a request through middleware → routing → handler. For an ORM, trace Model.objects.filter() from call to SQL.

Step 4: Use dis.dis(), inspect.getsource(), and type() liberally:

import inspect
import dis

# See the actual source of a function
print(inspect.getsource(some_function))

# See what methods an object has
print([m for m in dir(obj) if not m.startswith('_')])

# Trace method resolution
print(type(obj).__mro__)

# See bytecode for any function
dis.dis(complex_function)

Step 5: Read the tests. Well-written tests are the best documentation. They show you how the library is meant to be used and what edge cases the authors thought about.
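
As a concrete run of Steps 2 and 4, the sketch below applies the same tools to the standard library's `json` module and to a tiny hypothetical function, `add_one`. It assumes a normal CPython install where the standard library's source files are available.

```python
import dis
import inspect
import json
from collections import OrderedDict

# What does the function accept? Signatures are faster to read than source.
signature = inspect.signature(json.dumps)
print(signature)

# The actual source (works for pure-Python code, not for C extensions).
source = inspect.getsource(json.dumps)
print(source.splitlines()[0])

# The public surface of a module.
public_names = [name for name in dir(json) if not name.startswith("_")]
print(public_names)

# Method resolution order of a class.
mro = OrderedDict.__mro__
print(mro)


def add_one(x):
    return x + 1


# Bytecode as data rather than printed text.
opnames = [ins.opname for ins in dis.get_instructions(add_one)]
print(opnames)
```

`dis.get_instructions` returns the same information `dis.dis` prints, but as objects you can filter or search, which helps on large functions.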

What Else Separates Expert Python Developers

They understand the data model. Everything in Python (int, str, function, class) is an object. Types are objects. Modules are objects. This is deeply consistent and once internalized, Python stops surprising you.
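
A few lines are enough to check that claim interactively. This is only an illustrative sketch; `greet` is a hypothetical function.

```python
import math


def greet(name):
    return f"Hello, {name}"


# Instances, functions, classes, modules, and None are all objects.
things = [42, "text", greet, int, math, None]
all_objects = all(isinstance(thing, object) for thing in things)
print(all_objects)

# Functions are objects: they can carry attributes and be bound to other names.
greet.calls = 0
alias = greet
alias_result = alias("Ada")
print(alias is greet, alias_result, greet.calls)

# Classes are objects too, and their type is `type` (which is its own type).
print(type(int) is type, type(type) is type)
```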

They profile before optimizing. They've been burned by optimizing the wrong thing. They reach for cProfile before rewriting anything.
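
A minimal sketch of that habit, using only `cProfile` and `pstats` from the standard library. The functions `slow_sum_of_squares`, `fast_constant`, and `workload` are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant(n):
    return n + 1


def workload():
    slow_sum_of_squares(200_000)
    for _ in range(1000):
        fast_constant(1)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Write the report to a string so it can be logged or inspected.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)  # top 5 entries by cumulative time
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time shows which call tree dominates the run. Here the single call to `slow_sum_of_squares` should outweigh the thousand calls to `fast_constant`, which is the kind of result that is easy to guess wrong without measuring.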

They know when Python is the wrong tool. A real Python expert knows when to write a C extension, use Go for a high-throughput microservice, or use a compiled language for a performance-critical module.

They use the standard library. collections.defaultdict, collections.Counter, itertools.chain, itertools.groupby, functools.reduce, contextlib.suppress — the standard library has highly optimized, well-tested implementations of patterns that developers often re-implement poorly.

from collections import Counter
from itertools import chain, groupby, islice

# Count occurrences
word_counts = Counter(words)
most_common = word_counts.most_common(10)

# Group by key (groupby only groups consecutive items, so sort by the same key first)
data.sort(key=lambda x: x["department"])
for dept, employees in groupby(data, key=lambda x: x["department"]):
    print(f"{dept}: {list(employees)}")

# Chaining iterables without creating new lists
all_items = chain(list1, list2, list3)

# Taking first N items from any iterator
first_ten = list(islice(large_generator, 10))
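
For a version that runs as-is, the sketch below repeats these tools on small made-up data and adds `defaultdict` and `contextlib.suppress`, which the paragraph above mentions. The sample `words`, `employees`, and `settings` values are hypothetical.

```python
from collections import Counter, defaultdict
from contextlib import suppress
from itertools import chain, groupby, islice

words = ["spam", "eggs", "spam", "ham", "spam", "eggs"]
top_two = Counter(words).most_common(2)
print(top_two)

employees = [
    {"name": "Ana", "department": "ops"},
    {"name": "Bo", "department": "eng"},
    {"name": "Cy", "department": "ops"},
]

# defaultdict groups in one pass and needs no sorting.
by_dept = defaultdict(list)
for employee in employees:
    by_dept[employee["department"]].append(employee["name"])

# groupby gives the same grouping, but only on input sorted by the key.
sorted_employees = sorted(employees, key=lambda e: e["department"])
grouped = {
    dept: [e["name"] for e in group]
    for dept, group in groupby(sorted_employees, key=lambda e: e["department"])
}
print(dict(by_dept), grouped)

# suppress replaces a try/except/pass for one specific, expected exception.
settings = {"debug": True}
with suppress(KeyError):
    del settings["missing"]

# chain and islice stay lazy: nothing is copied into an intermediate list.
first_three = list(islice(chain([1, 2], [3, 4], [5]), 3))
print(first_three)
```

When the data is not already sorted, `defaultdict(list)` is usually the simpler choice; `groupby` pays off on streams that arrive ordered by the key.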

They write for maintainability, not cleverness. Code written to impress is a maintenance burden. Code written to be understood by a tired version of yourself six months later is professional work.

# Clever: terse, hard to read, and subtly wrong (a falsy match such as 0 or "" falls through to the default)
result = next((x for x in items if pred(x)), None) or default_factory()

# Clear — immediately understood
match = next((x for x in items if pred(x)), None)
result = match if match is not None else default_factory()
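
The two versions are not just different in style; they can return different results. A small sketch with hypothetical `items`, `pred`, and `default_factory` shows the case where the match itself is falsy:

```python
def default_factory():
    return -1  # hypothetical fallback value


def pred(x):
    return x < 3


items = [0, 5, 10]  # the first match is 0, which is falsy

# One-liner: `or` discards the legitimate match 0 and falls back to the default.
clever = next((x for x in items if pred(x)), None) or default_factory()

# Explicit None check: only a missing match triggers the default.
match = next((x for x in items if pred(x)), None)
clear = match if match is not None else default_factory()

print(clever, clear)
```

The explicit version is longer by one line and correct for every value except a match that is literally `None`.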

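They treat type annotations as documentation. In a large codebase, a return annotation such as -> Optional[User] (or -> User | None on Python 3.10+) often tells you more than a docstring does, because it cannot silently drift out of date once a type checker runs in CI. It tells you, your IDE, and your CI pipeline exactly what to expect.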
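
A minimal sketch of what such an annotation buys you. The `User` class, the `_USERS` store, and both functions are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    id: int
    name: str


_USERS = {1: User(1, "Alice")}  # hypothetical in-memory store


def find_user(user_id: int) -> Optional[User]:
    """The signature alone says: this may return None, so handle it."""
    return _USERS.get(user_id)


def display_name(user_id: int) -> str:
    user = find_user(user_id)
    if user is None:  # without this check, a type checker flags `user.name`
        return "<unknown>"
    return user.name


print(display_name(1), display_name(2))
```

Removing the `if user is None` branch still runs for known IDs, but a checker such as mypy would report the possible `None` access before the code ever ships.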

They automate quality. They run ruff (linting + formatting), mypy (type checking), and pytest (tests) in pre-commit hooks and CI. They don't rely on willpower to maintain quality — they make quality automatic.

Quick Reference: Key Interview Topics

Topic	What They're Testing
GIL	Concurrency understanding, tradeoffs
`*args`/`**kwargs`	Argument handling, decorator forwarding
Generator vs list	Memory efficiency, lazy evaluation
Decorator internals	Higher-order functions, closures
`__enter__`/`__exit__`	Resource management, exception safety
`__slots__`	Memory optimization awareness
`@property` → descriptor	Understanding the data model
`asyncio` event loop	Async I/O understanding
Metaclass vs class decorator	Advanced class creation
LEGB / `nonlocal`	Scope and closure model
`is` vs `==`	Object identity vs equality
Mutable default args	Common bug, Python internals
`multiprocessing` vs `threading`	Concurrency model, GIL

Python mastery is not about knowing every method on every class. It's about having the right mental model: understanding that everything is an object, that names are references, that the interpreter loop executes bytecode, and that the standard library and ecosystem are your leverage. Write explicit, readable code. Profile before optimizing. Prefer composition over inheritance. And read the source code of the libraries you depend on; they're the best Python curriculum that exists.