How to Create an Ogive Graph in Python?
An ogive, also known as a cumulative frequency curve, is a graph of the empirical cumulative distribution of a dataset. It shows how many observations fall at or below each value, which makes it easy to see where the data are concentrated and to read off medians, quartiles, and other percentiles. In Python, NumPy can compute the cumulative frequencies and Matplotlib can plot them; Pandas offers similar cumulative-sum tools.
What is an Ogive Graph?
An ogive is a line graph that plots the cumulative frequency of the data up to each value or class boundary. The curve starts at zero and rises, never decreasing, to the total frequency. When the data cluster around a central value the curve takes an S shape; evenly spread data produce a nearly straight line.
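To make the idea of a running total concrete, here is a minimal sketch that uses only the standard library. The data values are made up for illustration and do not come from the examples in this article.

```python
from collections import Counter
from itertools import accumulate

# Illustrative data (an assumption for this sketch)
data = [2, 3, 3, 5, 5, 5, 8]

counts = Counter(data)                       # frequency of each distinct value
values = sorted(counts)                      # distinct values in ascending order
frequencies = [counts[v] for v in values]    # frequencies in the same order
cumulative = list(accumulate(frequencies))   # running total of the frequencies

for v, c in zip(values, cumulative):
    print(f"values <= {v}: {c}")
```

The last running total always equals the number of observations, which is why an ogive ends at the total frequency.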
Syntax
# Calculate cumulative frequency
freq, bins = np.histogram(data, bins=bins)
cumulative_freq = np.cumsum(freq)
# Plot ogive
plt.plot(bins[1:], cumulative_freq, '-o')
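The np.histogram() function returns the frequency of each bin together with the bin edges, and np.cumsum() turns those frequencies into running totals. Because there is one more edge than there are bins, bins[1:] selects the upper edge of each bin. The '-o' format string in plt.plot() draws a solid line with circle markers.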
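The syntax above can be wrapped in a small helper. The sketch below is one possible way to do it: the name ogive_points and the sample data are hypothetical, and pairing a leading zero with the lowest bin edge is a design choice made here so that the curve starts at zero.

```python
import numpy as np

def ogive_points(data, bins):
    """Return (x, y) points for an ogive.

    Hypothetical helper: x holds every bin edge and y holds the cumulative
    count at each edge, with a leading zero for the lowest edge.
    """
    freq, edges = np.histogram(data, bins=bins)
    cumulative = np.concatenate(([0], np.cumsum(freq)))
    return edges, cumulative

# Illustrative data (an assumption for this sketch)
sample = [1, 2, 2, 3, 3, 3, 4]
x, y = ogive_points(sample, bins=[1, 2, 3, 4, 5])
print(x)
print(y)
```

The returned arrays have the same length, so they can be passed straight to plt.plot(x, y, '-o').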
Example 1: Dice Rolls Distribution
Here's an example that creates an ogive graph to visualize the cumulative frequency distribution of dice rolls:
import numpy as np
import matplotlib.pyplot as plt
# List of dice rolls
rolls = [1, 2, 3, 4, 5, 6, 3, 6, 2, 5, 1, 6, 4, 2, 3, 5, 1, 4, 6, 3]
# Calculate the cumulative frequency
bins = np.arange(0, 8, 1)
freq, bins = np.histogram(rolls, bins=bins)
cumulative_freq = np.cumsum(freq)
# Create the ogive graph
plt.figure(figsize=(8, 6))
plt.plot(bins[1:], cumulative_freq, '-o', linewidth=2, markersize=6)
plt.xlabel('Dice Values')
plt.ylabel('Cumulative Frequency')
plt.title('Ogive Graph of Dice Rolls')
plt.grid(True, alpha=0.3)
plt.show()
print("Dice values:", bins[1:])
print("Cumulative frequencies:", cumulative_freq)
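Dice values: [1 2 3 4 5 6 7]
Cumulative frequencies: [ 0  3  6 10 13 16 20]
Each plotted x-value is the upper edge of a one-unit bin, so the point at 1 counts the rolls below 1 (none) and the point at 7 covers all 20 rolls.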
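Because dice rolls are whole numbers, it can be more natural to count how many rolls are less than or equal to each face. The following sketch is an alternative to the binned approach above, not a replacement for it; it uses np.bincount, and the names faces and at_most are introduced here for illustration.

```python
import numpy as np

rolls = [1, 2, 3, 4, 5, 6, 3, 6, 2, 5, 1, 6, 4, 2, 3, 5, 1, 4, 6, 3]

faces = np.arange(1, 7)                        # die faces 1 through 6
counts = np.bincount(rolls, minlength=7)[1:]   # frequency of each face; index 0 is dropped
at_most = np.cumsum(counts)                    # number of rolls <= each face

for face, total in zip(faces, at_most):
    print(f"rolls <= {face}: {total}")
```

Plotting at_most against faces gives a curve whose x-axis really is the dice value, ending at 20 for face 6.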
Example 2: Random Data Distribution
This example creates an ogive graph for a larger dataset of random numbers:
import numpy as np
import matplotlib.pyplot as plt
# Set random seed for reproducibility
np.random.seed(42)
# Generate random data
data = np.random.randint(0, 100, 500)
# Calculate the cumulative frequency
bins = np.arange(0, 110, 10)
freq, bins = np.histogram(data, bins=bins)
cumulative_freq = np.cumsum(freq)
# Create the ogive graph
plt.figure(figsize=(10, 6))
plt.plot(bins[1:], cumulative_freq, '-o', color='red', linewidth=2, markersize=6)
plt.xlabel('Data Values')
plt.ylabel('Cumulative Frequency')
plt.title('Ogive Graph of Random Data (0-100)')
plt.grid(True, alpha=0.3)
plt.xticks(bins[1:])
plt.show()
print("Bin ranges:", bins[1:])
print("Cumulative frequencies:", cumulative_freq)
Bin ranges: [ 10  20  30  40  50  60  70  80  90 100]
Cumulative frequencies: [ 47  98 154 199 252 303 352 404 447 500]
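A common variant is the percentage ogive, which rescales the cumulative counts so the curve ends at 100. The sketch below reuses the same seed and bins as the example above; the name cumulative_pct is introduced here for illustration.

```python
import numpy as np

np.random.seed(42)
data = np.random.randint(0, 100, 500)

bins = np.arange(0, 110, 10)
freq, edges = np.histogram(data, bins=bins)

# Divide the running totals by the number of counted observations
cumulative_pct = np.cumsum(freq) / freq.sum() * 100

for upper, pct in zip(edges[1:], cumulative_pct):
    print(f"below {upper}: {pct:.1f}%")
```

Passing edges[1:] and cumulative_pct to plt.plot() draws the same shape as the count-based ogive, with a y-axis that can be read directly as percentiles.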
Key Components of an Ogive
- X-axis: Represents data values or class boundaries
- Y-axis: Shows cumulative frequency
- Curve shape: Steep rises indicate high frequency in that range
- Total height: Equals the total number of observations, provided the bins cover the full range of the data
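The total height only matches the number of observations when every value falls inside the bins, because np.histogram() ignores values outside the bin range. A short sketch with made-up data shows the difference:

```python
import numpy as np

# Illustrative data (an assumption for this sketch); 25 and 31 lie above the last edge
data = np.array([3, 7, 12, 18, 25, 31])
bins = np.arange(0, 21, 5)   # edges 0, 5, 10, 15, 20

freq, edges = np.histogram(data, bins=bins)
cumulative_freq = np.cumsum(freq)

counted = cumulative_freq[-1]
print(f"counted {counted} of {len(data)} observations")

# Widening the bins to span the data restores the full total
full_bins = np.arange(0, 36, 5)   # edges 0, 5, ..., 35
full_freq, _ = np.histogram(data, bins=full_bins)
print(f"counted {np.cumsum(full_freq)[-1]} of {len(data)} observations")
```

Comparing the final cumulative value with len(data) is a quick check that no observations were silently dropped.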
Customizing Your Ogive
import numpy as np
import matplotlib.pyplot as plt
# Sample data
scores = [65, 72, 78, 85, 91, 68, 77, 82, 89, 94, 71, 76, 83, 88, 92]
# Calculate cumulative frequency
bins = np.arange(60, 101, 5) # 5-point intervals
freq, bins = np.histogram(scores, bins=bins)
cumulative_freq = np.cumsum(freq)
# Create customized ogive
plt.figure(figsize=(10, 6))
plt.plot(bins[1:], cumulative_freq, '-o',
         color='darkblue', linewidth=3, markersize=8,
         markerfacecolor='lightblue', markeredgecolor='darkblue')
plt.xlabel('Test Scores', fontsize=12, fontweight='bold')
plt.ylabel('Cumulative Frequency', fontsize=12, fontweight='bold')
plt.title('Student Test Scores - Cumulative Distribution', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3, linestyle='--')
plt.xticks(bins[1:])
# Add percentage labels
for i, freq_val in enumerate(cumulative_freq):
    percentage = (freq_val / len(scores)) * 100
    plt.annotate(f'{percentage:.1f}%',
                 (bins[i+1], freq_val),
                 textcoords="offset points",
                 xytext=(0,10), ha='center')
plt.tight_layout()
plt.show()
[Graph displays with percentage annotations showing cumulative distribution of test scores]
Interpreting Ogive Graphs
| Curve Feature | Interpretation | Example |
|---|---|---|
| Steep rise | High frequency in that range | Many students scored 80-85 |
| Gradual slope | Low frequency in that range | Few students scored 90-95 |
| Horizontal line | No data points in that range | No scores between 95-100 |
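Reading a quartile off an ogive amounts to finding the x-value where the curve reaches a given share of the total. The sketch below estimates this by linear interpolation between the plotted points with np.interp. It reuses the test scores from the customization example, but the bin range (65 to 95) and the helper name value_at_fraction are assumptions made for this sketch.

```python
import numpy as np

scores = [65, 72, 78, 85, 91, 68, 77, 82, 89, 94, 71, 76, 83, 88, 92]

# Bins chosen so that no bin is empty (an assumption of this sketch), which
# keeps the cumulative counts strictly increasing, as np.interp expects.
edges = np.arange(65, 96, 5)
freq, edges = np.histogram(scores, bins=edges)
cumulative = np.concatenate(([0], np.cumsum(freq)))   # leading 0 pairs with the lowest edge

def value_at_fraction(fraction):
    """Estimate the score below which `fraction` of the observations fall."""
    target = fraction * cumulative[-1]
    return float(np.interp(target, cumulative, edges))

q1, median, q3 = (value_at_fraction(f) for f in (0.25, 0.5, 0.75))
print(f"Q1 = {q1}, median = {median}, Q3 = {q3}")
```

These are grouped-data estimates, so they can differ slightly from statistics computed on the raw values: the interpolated median here is 81.25, while the middle value of the 15 sorted scores is 82.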
Conclusion
Ogive graphs visualize cumulative distributions, and in Python they take only a few lines of NumPy and Matplotlib. They show where the data are concentrated and make it easy to estimate the median, quartiles, and other percentiles. Customization options such as colors, markers, and annotations make the graph easier to read for statistical analysis.
