Working with .npy and .npz Formats in NumPy

In NumPy, the .npy and .npz file formats are the standard way to store arrays on disk in binary form. The .npy format holds a single array, while .npz is a ZIP archive that holds multiple arrays, either uncompressed or compressed. Both formats preserve each array's data type (including byte order), shape, and memory layout (C or Fortran order).

1. Saving and Loading Data with .npy Format

The .npy format is designed for storing a single NumPy array. You can save an array to a file using the np.save function and load it back using the np.load function.

Saving Data to .npy File

    import numpy as np
    
    # Creating a sample array
    data = np.array([1.5, 2.3, 3.8, 4.1, 5.2])
    
    # Saving the array to a .npy file
    np.save('data.npy', data)
    print("Data saved to data.npy.")
        

In the above code, we create a 1D array and save it to a file named data.npy.

Result

After running the code, the array is saved as the binary file data.npy in the current working directory. Because the file is binary, it is not readable in a text editor; it consists of a short header describing the dtype and shape, followed by the raw array data.
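
One detail that is easy to miss: if the filename passed to np.save does not already end in .npy, NumPy appends the extension. The following is a minimal sketch of that behavior; the temporary directory and the extension-less name "data" are used only for illustration.

```python
import os
import tempfile

import numpy as np

data = np.array([1.5, 2.3, 3.8, 4.1, 5.2])

with tempfile.TemporaryDirectory() as tmp_dir:
    # The name is given without an extension on purpose
    np.save(os.path.join(tmp_dir, "data"), data)
    saved_files = os.listdir(tmp_dir)

print(saved_files)  # ['data.npy']
```

np.savez and np.savez_compressed do the same with the .npz extension.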

Loading Data from .npy File

    # Loading the saved data
    loaded_data = np.load('data.npy')
    
    # Displaying the loaded data
    print("Loaded data:", loaded_data)
        

The array can be loaded back into a variable using the load function, which reads the binary data and reconstructs the original array.

Result

The output will be:

    Loaded data: [1.5 2.3 3.8 4.1 5.2]
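
To check that the data type and shape really survive the round trip, a small sketch along these lines can help. The 2D int32 array and the file name roundtrip.npy are illustrative assumptions, not part of the example above.

```python
import os
import tempfile

import numpy as np

# Hypothetical 2D integer array, chosen only to illustrate the round trip
original = np.arange(6, dtype=np.int32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "roundtrip.npy")
    np.save(path, original)
    restored = np.load(path)

# dtype and shape come back exactly as they were saved
print(restored.dtype, restored.shape)      # int32 (2, 3)
print(np.array_equal(original, restored))  # True
```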
        

2. Saving and Loading Data with .npz Format

The .npz format allows you to store multiple NumPy arrays in a single archive, either uncompressed or compressed. This is useful when several related arrays belong together and should travel as one file.

Saving Multiple Arrays to .npz File

    # Creating multiple arrays
    data1 = np.array([1.5, 2.3, 3.8])
    data2 = np.array([4.1, 5.2, 6.3])
    
    # Saving the arrays to a .npz file
    np.savez('multiple_data.npz', array1=data1, array2=data2)
    print("Multiple arrays saved to multiple_data.npz.")
        

In this example, we create two arrays, data1 and data2, and save them into a single multiple_data.npz file. Each array is saved with a specified name: array1 and array2.

Result

The arrays are stored together in the multiple_data.npz file. The file is a ZIP archive in which each array is kept as its own .npy entry (here array1.npy and array2.npy); np.savez does not compress the contents.
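
The example above names each array with a keyword argument. As a brief sketch of what happens when arrays are passed positionally instead, NumPy falls back to the default keys arr_0, arr_1, and so on. The variable names and the file name positional.npz are made up for this illustration.

```python
import os
import tempfile

import numpy as np

first = np.array([1.5, 2.3, 3.8])
second = np.array([4.1, 5.2, 6.3])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "positional.npz")
    # No keyword names are given, so NumPy assigns default keys
    np.savez(path, first, second)
    with np.load(path) as archive:
        default_keys = list(archive.files)
        restored_first = archive["arr_0"]

print(default_keys)  # ['arr_0', 'arr_1']
```

Keyword names are usually the better choice, since keys such as arr_0 say nothing about what the array contains.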

Loading Multiple Arrays from .npz File

    # Loading the multiple arrays from the .npz file
    loaded_data = np.load('multiple_data.npz')
    
    # Accessing individual arrays by name
    array1 = loaded_data['array1']
    array2 = loaded_data['array2']
    
    # Displaying the arrays
    print("Array 1:", array1)
    print("Array 2:", array2)
        

The arrays in the multiple_data.npz file can be accessed by their respective names using dictionary-like syntax. In this case, we access array1 and array2.

Result

The output will be:

    Array 1: [1.5 2.3 3.8]
    Array 2: [4.1 5.2 6.3]
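
When the array names are not known in advance, the object returned by np.load for an .npz file exposes them through its files attribute. The sketch below also opens the archive in a with-block so that the underlying file is closed afterwards; it recreates the archive in a temporary directory so that it runs on its own.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "multiple_data.npz")
    np.savez(
        path,
        array1=np.array([1.5, 2.3, 3.8]),
        array2=np.array([4.1, 5.2, 6.3]),
    )

    # The with-block closes the underlying file when it exits
    with np.load(path) as archive:
        names = sorted(archive.files)
        # Each array is read from disk when it is indexed by name
        arrays = {name: archive[name] for name in names}

for name, values in arrays.items():
    print(name, values)
```

Copying the arrays into a plain dictionary inside the with-block keeps them usable after the archive has been closed.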
        

3. Compressed .npz Format

The np.savez function always writes an uncompressed archive and has no compression option. To reduce the size of the resulting .npz file, use the separate np.savez_compressed function, which accepts the same arguments.

Saving Data with Compression

    # Saving multiple arrays with compression
    np.savez_compressed('compressed_data.npz', array1=data1, array2=data2)
    print("Arrays saved with compression in compressed_data.npz.")
        

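The np.savez_compressed function creates a compressed .npz file, which is usually smaller than the uncompressed equivalent. How much smaller depends on the data: repetitive values compress well, while random-looking values may barely shrink. Compression and decompression also add some CPU overhead when writing and reading the file.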
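
To get a feel for the size difference, the two functions can be compared on the same data. This is only a sketch: the all-zeros array is a deliberately favorable, hypothetical case, and the file names are placeholders.

```python
import os
import tempfile

import numpy as np

# Highly repetitive data (all zeros) compresses very well; real data may not
repetitive = np.zeros((1000, 1000))

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "plain.npz")
    packed_path = os.path.join(tmp_dir, "packed.npz")

    np.savez(plain_path, values=repetitive)
    np.savez_compressed(packed_path, values=repetitive)

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

print("uncompressed bytes:", plain_size)
print("compressed bytes:", packed_size)
```

Running the same comparison on your own arrays is the most reliable way to decide whether compression is worth the extra time.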

Loading Compressed Data

    # Loading the compressed data
    compressed_data = np.load('compressed_data.npz')
    
    # Accessing the arrays
    array1_compressed = compressed_data['array1']
    array2_compressed = compressed_data['array2']
    
    # Displaying the arrays
    print("Array 1 (Compressed):", array1_compressed)
    print("Array 2 (Compressed):", array2_compressed)
        

Result

The output is the same as before. np.load reads compressed and uncompressed .npz files in the same way and decompresses the arrays transparently:

    Array 1 (Compressed): [1.5 2.3 3.8]
    Array 2 (Compressed): [4.1 5.2 6.3]
        

4. Summary of .npy and .npz File Formats

Here is a quick summary of the .npy and .npz formats in NumPy:

  • .npy: Stores a single NumPy array in binary form, preserving its shape, dtype, and memory layout. Written with np.save and read with np.load.
  • .npz: Stores multiple NumPy arrays in one ZIP archive, with the array names as keys. Written with np.savez and read with np.load.
  • Compressed .npz: Written with np.savez_compressed. Usually reduces file size, at the cost of extra CPU time when saving and loading.

Conclusion

The .npy and .npz formats offer a convenient way to store and manage NumPy arrays. The .npy format suits single arrays, while the .npz format keeps multiple named arrays in one file, with optional compression through np.savez_compressed to save disk space. Because both formats record the dtype and shape alongside the raw data, arrays can be restored exactly as they were saved.




