Finding Euclidean distance using Scikit-Learn in Python

In this article, we will learn to find the Euclidean distance using the Scikit-Learn library in Python. Euclidean distance measures the straight-line distance between two points in space and is widely used in machine learning algorithms, particularly clustering.

What is Euclidean Distance?

The Euclidean distance formula calculates the straight-line distance between two points in n-dimensional space:

For two 2D points $(x_1, y_1)$ and $(x_2, y_2)$:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

The formula extends to n dimensions by adding one squared difference per coordinate.
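As a minimal sketch of the formula itself, the helper below computes the distance in plain Python for points of any dimension. The function name `euclidean` and the sample points are illustrative assumptions, not part of Scikit-Learn.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

d2 = euclidean((0, 0), (3, 4))        # 2D: differences 3 and 4
d3 = euclidean((1, 2, 3), (4, 6, 3))  # 3D: differences 3, 4 and 0
print(d2, d3)
```

Both calls return 5.0, since each pair of points differs by 3 and 4 along two axes.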

Scikit-Learn provides the euclidean_distances() function to calculate these distances efficiently for arrays of points.
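The short sketch below shows two variations of the call that may be useful: omitting the second argument, which compares the rows of one array with each other, and passing `squared=True`, which returns squared distances. The sample points are made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

# Illustrative 2D points lying on one line through the origin
points = np.array([[0.0, 0.0],
                   [3.0, 4.0],
                   [6.0, 8.0]])

# With the second argument omitted, distances are computed among the rows of `points`
within = euclidean_distances(points)

# squared=True skips the final square root and returns squared distances
squared = euclidean_distances(points, squared=True)

print(within.shape)  # (3, 3)
print(within)
print(squared)
```

The result is always a 2D array with one row per point in the first argument and one column per point in the second.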

Method 1: Distance from Points to Origin

Calculate the Euclidean distance from multiple points to the origin (0,0,0):

# importing euclidean_distances function from scikit-learn module
from sklearn.metrics.pairwise import euclidean_distances
# importing NumPy module with an alias name
import numpy as np

# input NumPy array with 3D points
input_array = np.array([[3.5, 1.5, 5],
                       [1, 4, 2],
                       [6, 3, 10]])

# calculating the euclidean distance between points and origin (0,0,0)
result_distance = euclidean_distances(input_array, [[0, 0, 0]])

# printing the resultant euclidean distance
print("Euclidean distances from origin:")
print(result_distance)

Output:

Euclidean distances from origin:
[[ 6.28490254]
 [ 4.58257569]
 [12.04159458]]

Each row shows the distance from the corresponding point to the origin.
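Because the distance to the origin is simply the length of each point's vector, the result can be cross-checked with NumPy alone. This is a sketch of an alternative check, not a replacement for the Scikit-Learn call; `np.linalg.norm` returns a flat array rather than a column.

```python
import numpy as np

# Same 3D points as in the example above
input_array = np.array([[3.5, 1.5, 5],
                        [1, 4, 2],
                        [6, 3, 10]])

# The Euclidean norm of each row is its distance from (0, 0, 0)
norms = np.linalg.norm(input_array, axis=1)
print(norms)
```

The three values match the column returned by `euclidean_distances()`, up to floating-point rounding.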

Method 2: Distance Between Two Arrays

Calculate pairwise distances between points in two different arrays:

# importing euclidean_distances function from scikit-learn module
from sklearn.metrics.pairwise import euclidean_distances
# importing numpy library with an alias name
import numpy as np

# input numpy array 1 
input_array_1 = np.array([[3.5, 1.5, 5],
                         [1, 4, 2],
                         [6, 3, 10]])

# input numpy array 2
input_array_2 = np.array([[5, 4, 2],
                         [4, 3, 1],
                         [8.5, 2, 6]])

# calculating the euclidean distance between input_array_1 and input_array_2
result_distance = euclidean_distances(input_array_1, input_array_2)

# printing the resultant euclidean distance
print("Pairwise Euclidean distances:")
print(result_distance)

Output:

Pairwise Euclidean distances:
[[4.18330013 4.30116263 5.12347538]
 [4.         3.31662479 8.7321246 ]
 [8.1240384  9.21954446 4.82182538]]

The output is a 3×3 matrix where element [i,j] represents the distance between point i from the first array and point j from the second array.

Understanding the Output Matrix

Rows are points from the first array and columns are points from the second array (distances rounded to three decimals):

Array 1 point      [5, 4, 2]    [4, 3, 1]    [8.5, 2, 6]
[3.5, 1.5, 5]      4.183        4.301        5.123
[1, 4, 2]          4.000        3.317        8.732
[6, 3, 10]         8.124        9.220        4.822
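To see where each entry of the matrix comes from, the sketch below rebuilds it with NumPy broadcasting and then checks a single entry by hand. The variable names `manual` and `entry_0_2` are illustrative.

```python
import numpy as np

input_array_1 = np.array([[3.5, 1.5, 5],
                          [1, 4, 2],
                          [6, 3, 10]])
input_array_2 = np.array([[5, 4, 2],
                          [4, 3, 1],
                          [8.5, 2, 6]])

# diff[i, j] holds the coordinate-wise difference between
# point i of the first array and point j of the second array
diff = input_array_1[:, None, :] - input_array_2[None, :, :]
manual = np.sqrt((diff ** 2).sum(axis=-1))

# A single entry: first point of array 1 versus third point of array 2
entry_0_2 = np.linalg.norm(input_array_1[0] - input_array_2[2])

print(manual)
print(entry_0_2)
```

Here `entry_0_2` corresponds to row 0, column 2 of the matrix, which is about 5.123 in the table above.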

Applications in Machine Learning

Euclidean distance is fundamental in clustering algorithms like K-means, where it determines cluster membership: each data point is assigned to the cluster whose centroid is nearest. Points that are close to each other are treated as similar and tend to end up in the same cluster.
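The assignment step described above can be sketched with `euclidean_distances()` and `argmin`. The points and centroids below are hypothetical values chosen for illustration, and this shows only one assignment step, not a full K-means implementation.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

# Hypothetical 2D data points and two hypothetical cluster centroids
points = np.array([[1.0, 1.0],
                   [1.5, 2.0],
                   [8.0, 8.0],
                   [9.0, 9.5]])
centroids = np.array([[1.0, 1.5],
                      [8.5, 9.0]])

# distances[i, j] is the distance from point i to centroid j
distances = euclidean_distances(points, centroids)

# Each point is assigned to the index of its nearest centroid
labels = distances.argmin(axis=1)
print(labels)
```

A full K-means run would alternate this step with recomputing each centroid as the mean of its assigned points until the assignments stop changing.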

Conclusion

Scikit-Learn's euclidean_distances() function efficiently computes distances between arrays of points. Use it to measure distances from a set of points to a reference point such as the origin, or to build pairwise distance matrices between two sets of points in machine learning applications.

Updated on: 2026-03-27T00:13:59+05:30
