NumPy
While Python’s built-in data structures, such as list, are powerful and flexible,
sometimes you need more, especially when working with large datasets or doing
scientific and numerical computing.
NumPy is a popular external library that introduces the array data structure, which looks and feels a lot like a Python list but is much faster for numerical work: an array holds elements of a single data type, and operations on it run over all elements at once in compiled code.
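As a rough illustration of the difference, the sketch below doubles every value in a list and in an array. The variable names and sample data are made up for this example; the point is that arithmetic on an array applies to every element at once, while a list needs an explicit loop or comprehension.

```python
import numpy as np

# Hypothetical sample data for illustration
values = [1.0, 2.0, 3.0, 4.0]

# With a list, arithmetic needs an explicit loop or comprehension
doubled_list = [v * 2 for v in values]

# With an array, the operation applies to every element at once
array = np.array(values)
doubled_array = array * 2

print(doubled_list)   # [2.0, 4.0, 6.0, 8.0]
print(doubled_array)  # [2. 4. 6. 8.]

# Note: `values * 2` on a list repeats the list instead of doubling the values
print(values * 2)     # [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
```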
Installation
pip, the package manager that ships with most Python installations, installs and manages packages
from PyPI. To install NumPy, run:
pip install numpy
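To check that the installation worked, you can import the package and print its version. This is only a quick sanity check, and the version string you see depends on your environment.

```python
import numpy as np

# Prints the installed NumPy version string (the value varies by environment)
print(np.__version__)
```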
Example
Let’s say you have a grayscale image stored as a 2D NumPy array, with pixel values between 0 and 1. Even within that range, the average brightness can vary from image to image: one image might be mostly dark, another very bright. This variation can affect how well a model learns, since it might pick up on brightness differences instead of actual shapes or patterns.
Z-score normalization subtracts the image’s mean from every pixel and divides the result by the standard deviation, so the mean becomes 0 and the standard deviation becomes 1. This removes brightness bias and makes the data more stable and reliable for learning.
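In symbols, writing $x$ for a pixel value, $\mu$ for the mean of all pixels in the image, and $\sigma$ for their standard deviation (these symbol names are chosen here for illustration), each normalized pixel $z$ is:

$$
z = \frac{x - \mu}{\sigma}
$$

Subtracting $\mu$ moves the mean to 0, and dividing by $\sigma$ rescales the standard deviation to 1.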
import numpy as np
from numpy import typing as npt


# Simulate an image with random pixel values in [0, 1)
image: npt.NDArray[np.float64] = np.random.rand(1000, 1000)
print("Raw:\n", image)


# Z-score normalization: subtract the mean, then divide by the standard deviation
mean: float = float(np.mean(image))
std: float = float(np.std(image))
normalized_image: npt.NDArray[np.float64] = (image - mean) / std
print("Normalized:\n", normalized_image)


print("Mean after normalization:", np.mean(normalized_image))
# Mean after normalization: -2.503490748040349e-16 (approximately 0; exact digits vary per run)
print("Std after normalization:", np.std(normalized_image))
# Std after normalization: 0.9999999999999998 (approximately 1; exact digits vary per run)
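
One edge case the example does not cover: if every pixel has the same value, the standard deviation is 0 and the division produces NaN values. Below is a minimal sketch of a reusable helper that guards against this. The function name `zscore_normalize` and the `eps` parameter are hypothetical, chosen for this illustration, and adding a small constant to the denominator is only one common way to handle the case.

```python
import numpy as np
from numpy import typing as npt


def zscore_normalize(
    image: npt.NDArray[np.float64], eps: float = 1e-8
) -> npt.NDArray[np.float64]:
    """Return a new array with mean ~0 and standard deviation ~1.

    `eps` (an assumed small constant) keeps the division defined
    when all pixels are identical and the standard deviation is 0.
    """
    mean = float(np.mean(image))
    std = float(np.std(image))
    return (image - mean) / (std + eps)


# Usage on a random image, similar to the example above
image = np.random.rand(100, 100)
normalized = zscore_normalize(image)
print(np.isclose(np.mean(normalized), 0.0))  # True
print(np.isclose(np.std(normalized), 1.0))   # True

# A constant image maps to all zeros instead of NaN values
flat = np.full((4, 4), 0.5)
print(zscore_normalize(flat))
```

Because `eps` is tiny compared with the standard deviation of a typical image, the result for ordinary inputs is practically the same as the plain `(image - mean) / std` version.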