Basics of Multiprocessing and Using the multiprocessing Module in Python


Multiprocessing in Python runs work in several processes in parallel so that a program can use more than one CPU core. Threads share a single interpreter and its Global Interpreter Lock (GIL), whereas each process runs its own interpreter with its own GIL, which makes multiprocessing well suited to CPU-bound tasks. Python's multiprocessing module provides tools for creating and managing these processes.

What is Multiprocessing?

Multiprocessing lets a program perform multiple tasks at the same time by running them in separate processes. Each process has its own memory space, so data changed in one process is not visible to the others unless it is passed explicitly. This suits tasks that require heavy computation or data isolation.
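
The separate memory spaces can be seen in a minimal sketch like the one below. The names counter, increment, and run_child are hypothetical and chosen only for illustration; the point is that a change made in the child does not reach the parent.

```python
import multiprocessing

counter = 0  # module-level variable; each process works on its own copy


def increment():
    """Runs in the child process and changes only the child's copy."""
    global counter
    counter += 1
    print(f"Child sees counter = {counter}")


def run_child():
    """Start one child process, wait for it, and return its exit code."""
    child = multiprocessing.Process(target=increment)
    child.start()
    child.join()
    return child.exitcode


if __name__ == "__main__":
    run_child()
    # The parent's copy was never touched by the child.
    print(f"Parent still sees counter = {counter}")
```

The child reports a counter of 1 while the parent still reports 0, which is why results have to be sent back explicitly, for example through a Queue or a Pipe.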

Features of the multiprocessing Module

  • Creating processes using the Process class.
  • Sharing data between processes using Queue and Pipe.
  • Synchronization with Lock, Semaphore, etc.
  • Managing pools of worker processes using the Pool class.
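
Of the features listed above, Pipe is the one the examples in this article do not cover. As a rough sketch (the worker and ask_child functions are hypothetical helpers, not part of the module), a two-way exchange over a Pipe might look like this:

```python
import multiprocessing


def worker(conn):
    """Receive one message, send back an upper-cased reply, then close."""
    message = conn.recv()
    conn.send(message.upper())
    conn.close()


def ask_child(text):
    """Send text to a child process over a Pipe and return its reply."""
    # Pipe() returns two connected endpoints; by default both can send and receive.
    parent_conn, child_conn = multiprocessing.Pipe()
    child = multiprocessing.Process(target=worker, args=(child_conn,))
    child.start()

    parent_conn.send(text)
    reply = parent_conn.recv()  # blocks until the child answers

    child.join()
    return reply


if __name__ == "__main__":
    print(ask_child("hello"))
```

A Pipe connects exactly two endpoints, so it fits a simple parent-and-child conversation; a Queue is usually the easier choice when several producers or consumers are involved.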

Example 1: Creating and Starting Processes

Here is a simple example of creating and running multiple processes.

    import multiprocessing
    import time

    def print_numbers():
        for i in range(5):
            print(f"Number: {i}")
            time.sleep(1)

    def print_letters():
        for letter in "ABCDE":
            print(f"Letter: {letter}")
            time.sleep(1)

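    # The guard keeps child processes from re-running this block on platforms
    # that start them by importing the module again (for example, Windows).
    if __name__ == "__main__":
        # Create one process per function; nothing runs until start() is called.
        process1 = multiprocessing.Process(target=print_numbers)
        process2 = multiprocessing.Process(target=print_letters)

        # Launch both processes so that they run concurrently.
        process1.start()
        process2.start()

        # Wait for both processes to finish before continuing.
        process1.join()
        process2.join()

        print("Processes have completed execution")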
        

Output (the two processes run concurrently, so the exact interleaving may vary between runs):

    Number: 0
    Letter: A
    Number: 1
    Letter: B
    ...
    Processes have completed execution
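
Target functions often need arguments. The sketch below, which uses the hypothetical functions greet and run_all, passes them through the args parameter of Process and checks each exitcode after join(); an exit code of 0 means the process finished without an error.

```python
import multiprocessing


def greet(name, times):
    """Print a greeting a fixed number of times."""
    for _ in range(times):
        print(f"Hello, {name}")


def run_all(names):
    """Start one process per name, wait for all of them, return exit codes."""
    processes = [
        multiprocessing.Process(target=greet, args=(name, 2)) for name in names
    ]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return [process.exitcode for process in processes]


if __name__ == "__main__":
    print(run_all(["Ada", "Linus"]))
```

Starting every process before joining any of them is what lets them run side by side; calling join() right after each start() would run them one after another.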
        

Example 2: Using a Queue to Share Data Between Processes

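The Queue class lets processes exchange data safely: one process puts items on the queue and another takes them off. Here the producer signals the end of the data with a sentinel value (None).

    import multiprocessing

    SENTINEL = None  # marks the end of the data

    def producer(queue):
        for item in range(5):
            print(f"Producing: {item}")
            queue.put(item)
        queue.put(SENTINEL)  # tell the consumer that nothing more will arrive

    def consumer(queue):
        # queue.empty() is not reliable across processes, so block on get()
        # and stop when the sentinel arrives.
        while True:
            item = queue.get()
            if item is SENTINEL:
                break
            print(f"Consuming: {item}")

    if __name__ == "__main__":
        queue = multiprocessing.Queue()

        producer_process = multiprocessing.Process(target=producer, args=(queue,))
        consumer_process = multiprocessing.Process(target=consumer, args=(queue,))

        # Start both so the consumer drains the queue while the producer fills it.
        producer_process.start()
        consumer_process.start()

        producer_process.join()
        consumer_process.join()

Output (the Producing and Consuming lines may interleave differently between runs):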

    Producing: 0
    Producing: 1
    Producing: 2
    Producing: 3
    Producing: 4
    Consuming: 0
    Consuming: 1
    Consuming: 2
    Consuming: 3
    Consuming: 4
        

Example 3: Using a Pool of Worker Processes

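The Pool class manages a fixed number of worker processes and distributes tasks among them. Its map method applies a function to every item of an iterable and returns the results in the same order as the inputs.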

    import multiprocessing

    def square(number):
        return number * number

    if __name__ == "__main__":
        numbers = [1, 2, 3, 4, 5]

        with multiprocessing.Pool(processes=3) as pool:
            results = pool.map(square, numbers)

        print(f"Squared Numbers: {results}")
        

Output:

    Squared Numbers: [1, 4, 9, 16, 25]
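
The map method passes a single argument to the function. When each task needs several arguments, one option is Pool.starmap, which unpacks each tuple into positional arguments. The following is a small sketch; power and compute_powers are hypothetical names used only for this illustration.

```python
import multiprocessing


def power(base, exponent):
    """Raise base to exponent; runs inside a worker process."""
    return base ** exponent


def compute_powers(pairs, workers=2):
    """Evaluate power(base, exponent) for every pair using a worker pool."""
    with multiprocessing.Pool(processes=workers) as pool:
        # starmap unpacks each tuple, so (2, 3) becomes power(2, 3).
        return pool.starmap(power, pairs)


if __name__ == "__main__":
    print(compute_powers([(2, 3), (3, 2), (10, 0)]))
```

Like map, starmap blocks until every task has finished and keeps the results in input order.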
        

Advantages of Multiprocessing

  • Utilizes multiple CPU cores for better performance.
  • Each process has its own memory space, avoiding data conflicts.
  • Ideal for CPU-bound tasks like heavy computations.

Challenges of Multiprocessing

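  • Processes have higher memory and startup overhead than threads.
  • Inter-process communication (IPC) can be complex, because data must be passed explicitly and is serialized (pickled) when sent through a Queue, Pipe, or Pool.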
  • Debugging multiple processes can be challenging.
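
The IPC challenge shows up as soon as several processes must update the same piece of state. As a hedged sketch (add_many and count_in_parallel are hypothetical helpers), a shared integer created with multiprocessing.Value can be protected by a Lock so that increments from different processes are not lost:

```python
import multiprocessing


def add_many(total, lock, count):
    """Increment the shared counter count times, one locked step at a time."""
    for _ in range(count):
        with lock:  # only one process may update total at a time
            total.value += 1


def count_in_parallel(workers=4, count=1000):
    """Run several processes that all increment one shared integer."""
    total = multiprocessing.Value("i", 0)  # "i" is a C signed int in shared memory
    lock = multiprocessing.Lock()

    processes = [
        multiprocessing.Process(target=add_many, args=(total, lock, count))
        for _ in range(workers)
    ]
    for process in processes:
        process.start()
    for process in processes:
        process.join()

    return total.value


if __name__ == "__main__":
    print(count_in_parallel())
```

The read-modify-write in total.value += 1 is not atomic, so without the lock two processes could read the same old value and one update would be lost. Taking the lock on every increment also serializes the workers, which is part of the cost of sharing state between processes.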

Conclusion

The multiprocessing module is a practical way to run tasks in parallel in Python, especially CPU-intensive ones. With Process, Queue, and Pool you can spread work across the cores of a modern multi-core processor while keeping each task's data isolated.




