Statistical & Complex Number Functions in NumPy
NumPy provides a robust suite of statistical and complex number functions that are essential for data analysis, signal processing, and scientific computation. These tools are optimized to handle large arrays efficiently and support element-wise as well as axis-based operations.
From summarizing datasets with statistics like mean, median, and standard deviation to manipulating and analyzing complex numbers, NumPy simplifies high-performance mathematical computing.
- Statistical Functions: Includes sum(), mean(), median(), std(), min(), max(), and more.
- NaN-Aware Variants: Functions like nanmean() and nansum() safely handle missing data.
- Complex Functions: Work with real and imaginary parts using real(), imag(), angle(), conj(), and more.
Whether you're computing statistics or working with imaginary numbers, NumPy provides reliable, fast, and readable tools for numerical operations.
Common Statistical Functions in NumPy
Statistical functions in NumPy are used to compute summary statistics from arrays, such as sums, averages, medians, minima, maxima, variances, percentiles, and standard deviations. These functions are optimized for multi-dimensional arrays and support axis-based computations.
import numpy as np
data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
print("Sum:", np.sum(data)) # 45
print("Sum (axis=0):", np.sum(data, axis=0)) # [12 15 18]
print("Mean:", np.mean(data)) # 5.0
print("Mean (axis=1):", np.mean(data, axis=1)) # [2. 5. 8.]
print("Median:", np.median(data)) # 5.0
print("Min:", np.min(data)) # 1
print("Max:", np.max(data)) # 9
print("Argmin:", np.argmin(data)) # 0 (index of min value in flattened array)
print("Argmax:", np.argmax(data)) # 8 (index of max value in flattened array)
print("Variance:", np.var(data)) # 6.666666666666667
print("Standard Deviation:", np.std(data)) # 2.581988897471611
print("25th Percentile:", np.percentile(data, 25))  # 3.0
print("Quantile (0.75):", np.quantile(data, 0.75))  # 7.0
How It Works:
- np.sum(): Adds all elements or sums along an axis.
- np.mean(): Calculates the average value.
- np.median(): Finds the middle value when data is sorted.
- np.min() / np.max(): Finds minimum and maximum values.
- np.argmin() / np.argmax(): Returns indices of the minimum and maximum values in the flattened array (i.e., the array is treated as a 1D array regardless of its original shape).
- np.var(): Computes the variance, the average squared deviation from the mean. By default this is the population variance (ddof=0).
- np.std(): Computes the standard deviation, the square root of the variance, expressed in the same units as the data.
- np.percentile(): Finds the value below which a given percentage of the data falls. For example, the 25th percentile is the value below which 25% of the data lies, which helps describe how the data is distributed on a 0 to 100 scale.
- np.quantile(): Generalizes percentiles by computing the value below which a given fraction (between 0 and 1) of data falls. Percentiles are quantiles expressed as percentages (e.g., 0.25 quantile = 25th percentile).
Output
Sum: 45
Sum (axis=0): [12 15 18]
Mean: 5.0
Mean (axis=1): [2. 5. 8.]
Median: 5.0
Min: 1
Max: 9
Argmin: 0
Argmax: 8
Variance: 6.666666666666667
Standard Deviation: 2.581988897471611
25th Percentile: 3.0
Quantile (0.75): 7.0
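The variance and standard deviation printed above are population statistics, because np.var() and np.std() divide by N unless told otherwise. The sketch below, which reuses the same 3x3 array, shows how the ddof parameter switches to the sample version (dividing by N - 1) and confirms that a percentile and the matching quantile return the same value. The variable names are illustrative only.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Default (ddof=0): population variance, divides by N = 9.
pop_var = np.var(data)

# ddof=1: sample variance and standard deviation, divide by N - 1 = 8.
sample_var = np.var(data, ddof=1)
sample_std = np.std(data, ddof=1)

print("Population variance:", pop_var)
print("Sample variance:", sample_var)  # 7.5
print("Sample std:", sample_std)

# The 25th percentile and the 0.25 quantile are the same quantity.
p25 = np.percentile(data, 25)
q25 = np.quantile(data, 0.25)
print("25th percentile:", p25)  # 3.0
print("0.25 quantile:", q25)    # 3.0
```

Use ddof=1 when the array is a sample drawn from a larger population and you want an unbiased estimate of the variance.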
💡 Tip: Use the axis parameter to aggregate along rows (axis=1) or columns (axis=0), depending on your data shape.
To learn more about axes in NumPy, read the tutorial Axis in NumPy.
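As a rough illustration of the axis tip, the sketch below aggregates the same 3x3 array along each axis. It also uses two things not shown earlier in this tutorial: the keepdims argument, which keeps the reduced axis so the result can be subtracted from the original array, and np.unravel_index(), which converts the flat index returned by np.argmax() into a (row, column) pair.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# axis=0 collapses the rows: one result per column.
col_means = np.mean(data, axis=0)
# axis=1 collapses the columns: one result per row.
row_means = np.mean(data, axis=1)
print(col_means)  # [4. 5. 6.]
print(row_means)  # [2. 5. 8.]

# keepdims=True keeps the reduced axis with length 1, giving shape (3, 1),
# so the row means broadcast cleanly against the (3, 3) array.
row_means_2d = np.mean(data, axis=1, keepdims=True)
centered = data - row_means_2d  # every row becomes [-1, 0, 1]
print(row_means_2d.shape)  # (3, 1)

# argmax returns an index into the flattened array;
# unravel_index maps it back to (row, column) coordinates.
flat_idx = np.argmax(data)
row, col = np.unravel_index(flat_idx, data.shape)
print(flat_idx, (int(row), int(col)))  # 8 (2, 2)
```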
Handling NaNs in NumPy
Real-world datasets often contain NaN (Not a Number) values representing missing or undefined data. NumPy provides special aggregator functions that gracefully handle these NaN values by ignoring them during computation.
import numpy as np
data = np.array([1, 2, np.nan, 4, 5])
print("Check NaNs:", np.isnan(data)) # [False False True False False]
print("Sum with NaNs:", np.sum(data)) # nan
print("Sum ignoring NaNs:", np.nansum(data)) # 12.0
print("Mean with NaNs:", np.mean(data)) # nan
print("Mean ignoring NaNs:", np.nanmean(data)) # 3.0
How It Works:
- np.isnan(): Returns a boolean array indicating where NaN values are present.
- np.nansum(): Sum of array elements, ignoring NaNs.
- np.nanmean(): Mean of elements, excluding NaNs.
- np.nanmin() and np.nanmax(): Minimum and maximum ignoring NaNs.
- np.nanstd(): Standard deviation ignoring NaNs.
Output
Check NaNs: [False False True False False]
Sum with NaNs: nan
Sum ignoring NaNs: 12.0
Mean with NaNs: nan
Mean ignoring NaNs: 3.0
💡 Tip: Use the nan* functions to aggregate datasets that contain missing values. The standard functions do not raise an error; they return nan as soon as any element is NaN.
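The example above only exercises np.nansum() and np.nanmean(). A minimal sketch of the other NaN-aware functions on the same array follows. It also uses np.nanmedian(), which is not listed above, and shows boolean masking with np.isnan() as an alternative way to drop missing values. Variable names are illustrative.

```python
import numpy as np

data = np.array([1, 2, np.nan, 4, 5])

# Count the missing values before aggregating.
nan_count = int(np.isnan(data).sum())
print("NaN count:", nan_count)  # 1

lowest = np.nanmin(data)     # minimum of the non-NaN values
highest = np.nanmax(data)    # maximum of the non-NaN values
median = np.nanmedian(data)  # median of the non-NaN values
spread = np.nanstd(data)     # population std of the non-NaN values
print("Min:", lowest)     # 1.0
print("Max:", highest)    # 5.0
print("Median:", median)  # 3.0
print("Std:", spread)

# Alternative: filter the NaNs out once, then use the regular functions.
clean = data[~np.isnan(data)]
print(clean)  # [1. 2. 4. 5.]
```

Masking once can be convenient when several statistics are needed from the same cleaned data, while the nan* functions are handy when the array shape must be preserved for axis-based computations.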
Complex Number Functions in NumPy
NumPy provides full support for complex numbers and includes functions to work with their components, magnitude, phase, and logical checks. Complex arrays can be created using Python’s j suffix or by specifying a complex data type.
These tools allow you to efficiently manipulate complex values and determine whether elements are purely real or complex.
import numpy as np
# Creating a complex array
z = np.array([1 + 2j, 3 + 0j, 5])
# Real and imaginary parts
print("Real Part:", np.real(z)) # [1. 3. 5.]
print("Imaginary Part:", np.imag(z)) # [2. 0. 0.]
# Conjugate
print("Conjugate:", np.conj(z)) # [1.-2.j 3.-0.j 5.-0.j]
# Magnitude (modulus)
print("Magnitude:", np.abs(z)) # [2.236 3. 5. ]
# Phase angle
print("Angle:", np.angle(z)) # [1.107 0. 0. ]
# Check if values are real or complex
print("Is Real:", np.isreal(z)) # [False True True]
print("Is Complex:", np.iscomplex(z)) # [ True False False]
How It Works:
- np.real(): Extracts the real part of each complex number.
- np.imag(): Extracts the imaginary part.
- np.conj(): Returns the complex conjugate.
- np.abs(): Computes the magnitude (distance from origin in complex plane).
- np.angle(): Returns the angle (in radians) from the real axis.
- np.isreal(): Returns True for elements with zero imaginary part.
- np.iscomplex(): Returns True for elements with a non-zero imaginary part.
Output
Real Part: [1. 3. 5.]
Imaginary Part: [2. 0. 0.]
Conjugate: [1.-2.j 3.-0.j 5.-0.j]
Magnitude: [2.23606798 3. 5. ]
Angle: [1.10714872 0. 0. ]
Is Real: [False True True]
Is Complex: [ True False False]
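Magnitude and angle together form the polar representation of a complex number. As a quick sanity check, the sketch below rebuilds the original array from np.abs() and np.angle(), and verifies that multiplying a number by its conjugate gives the squared magnitude. The deg=True argument of np.angle() returns degrees instead of radians. Variable names are illustrative.

```python
import numpy as np

z = np.array([1 + 2j, 3 + 0j, 5])

r = np.abs(z)                      # magnitude of each element
theta = np.angle(z)                # phase in radians
theta_deg = np.angle(z, deg=True)  # phase in degrees
print("Angle (degrees):", theta_deg)

# Polar form: z = r * e^(i * theta)
rebuilt = r * np.exp(1j * theta)
print(np.allclose(rebuilt, z))  # True

# A number times its conjugate equals the squared magnitude.
print(np.allclose(z * np.conj(z), r ** 2))  # True
```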
💡 Tip: An array with a complex data type stores every element as a complex number, even when the imaginary part is zero. np.isreal() and np.iscomplex() test each element's value; to check the array's data type instead, use np.isrealobj() or np.iscomplexobj().
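To make the difference between values and data types concrete, here is a small sketch. It creates an array by specifying a complex dtype, then compares the element-wise check np.isreal() with the dtype-level checks np.isrealobj() and np.iscomplexobj(). The array names a and b are illustrative.

```python
import numpy as np

# Complex dtype, but every imaginary part is zero.
a = np.array([1, 2, 3], dtype=complex)
print(a.dtype)  # complex128

# Element-wise check on values: all imaginary parts are zero.
print(np.isreal(a).all())  # True

# Dtype-level check: the array itself is still complex.
print(np.iscomplexobj(a))  # True
print(np.isrealobj(a))     # False

# A plain float array is not a complex object.
b = np.array([1.0, 2.0, 3.0])
print(np.iscomplexobj(b))  # False

# When the imaginary parts are known to be zero, .real gives a float array.
real_values = a.real
print(real_values.dtype)  # float64
```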
Frequently Asked Questions
How do I calculate the mean and median in NumPy?
Use np.mean() and np.median() to find the average and median values of elements in an array. They accept an optional axis parameter for multi-dimensional arrays.
How can I handle NaN values when computing statistics in NumPy?
Use np.nanmean(), np.nansum(), or np.nanstd() which ignore NaNs during calculations, useful for datasets with missing or incomplete data.
How do I compute standard deviation and variance in NumPy?
Use np.std() for standard deviation and np.var() for variance. Both functions support the axis parameter to specify the dimension of calculation.
What functions are available for complex number operations in NumPy?
NumPy offers np.real() and np.imag() for extracting the real and imaginary parts, np.conj() for the complex conjugate, np.abs() for the magnitude, and np.angle() for the phase angle.
How do I check if elements in a NumPy array are complex or real?
Use np.iscomplex() and np.isreal(). np.iscomplex() returns a boolean array that is True where the imaginary part is non-zero, and np.isreal() returns one that is True where the imaginary part is zero.
What's Next?
Up next, we’ll explore Floating-Point Number Functions in NumPy: key tools for handling numerical precision, rounding, and special values like NaNs and infinities in scientific computing.