Now you may know the main keypoints of the python development. We will talk about some miscellaneous in python practice. Including concurrecy & parallelism, build-in modules, and collaboration, production.
Concurrency is when computer does many different things seeming at the same time. Parallelism is actually doing many different things at the same time.
- Use subprocess to Manage Child Processes
Python has many libraries for running and managing child processes. The best and simplest choice for managing child processes is to use the
subprocess build-in module. The
Popen constructor starts the process. The
communicate method reads the child process’s output and wait for ternimination.
Use subprocess module to run child processes in parallel and manage their input and output. Use
timeout parameter with
communicate to avoid deadlocks.
- Use Threads for Blocking I/O, Avoid for Parallelism
The standard implementation of Python, CPython, enforces coherence with a mechanism call Global Interpreter Lock (GIL). The GIL prevents CPython from being affected by preemptive multithreading. It means that python threads can’t run bytecode in parallel.
There are ways to get CPython to utilize multiple cores, but it doesn’t work with the standard
Thread class. But for blocking I/O tasks, Python can run seemingly in parallel. Because python use system calls, which interacting with external environment, rather block GIL.
- Use Lock to Prevent Data Races in Threads
Though GIL here, but it will not protect you. You’re still responsible for protecting against data races between the threads in your programs. Your programs will corrupt their data stuctures if you allow multiple threads to modify the same objects without locks.
Thread build-in module is the standard mutual exclusion lock implementation.
- Use Queue to Coordinate Work Between Threads
Queue class in
queue build-in module provides all of the functions you need to solve the problems. But you need also be aware of the many problems in building concurrent piplines: busy waiting, stopping workers, and memory explosion.
- Consider Coroutines to Run Many Functions
There’re 3 big problems with threads.
- require special tools to coordinate.
- Threads require a lot of memory, about 8 MB per executing thread.
- Threads are costly to start.
You can use coroutines to work around all these issues. They’re implemented as an extension to generators. The cost of starting a generator is less than 1KB of memory.
Syntax in python2 seems not so elegant. you need to include an additional loop at the delegation point.
- Consider concurrent.futures for True Parallelism
concurrent.futures module provide
ProcessPollExecutor class. The advanced parts of the
multiprocessing module should be avoid because they are so complex.
- Define Function Decorators with functools.wraps
functools.wraps helpping your handle with some meta information of functions like
func.__name__ and debuggers.
- Consider contextlib and with Statements for Reusable try/finally Behavior
You can defining a new class with the sepical method
__exit__. But it’s too heavy to do this.
contextlib build-in module provides a
contextmanager decorator that make it easy to use your own functions in with statements.
- Use datetime instead of time for Local Clocks
Avoid using the time module for translating between different time zones. Use the datetime built-in module along with the
pytz module to reliably convert between times in different time zones. And always represent time in UTC and do covnersions to local time as the final step before presentation.
Use Built-in Algorithms and Data Structures
- Ordered Dictionary
- Default Dictionary
- Heap Queue
Use decimal When Precision Is Paramout
- Write Docstr for Every Func, Class, Module
- Use Packages to Organize Modules and Provide Stable APIs
- Define a Root Exception
- Know How to Break Circular Dependencies
- Use Virtual Environments
- Consider Module-Scoped Code to Configure Deployment Environments
- Test Everything
- Consider Interactive Debugging with pdb
- Profile Before Optimizing
- User tracemalloc to Understand Memory Usage and Leaks