Python Articles

Page 145 of 855

Drop duplicate rows in PySpark DataFrame

Devesh Chauhan
Updated on 27-Mar-2026 582 Views

PySpark is the Python API for Apache Spark, a distributed computing engine for processing large-scale data. Unlike pandas DataFrames, PySpark DataFrames partition their data across a cluster and follow a strict schema for optimized processing. In this article, we'll explore different methods to drop duplicate rows from PySpark DataFrames using the distinct() and dropDuplicates() functions. Installation: Install PySpark using pip: pip install pyspark. Creating a PySpark DataFrame: First, let's create a sample DataFrame with duplicate rows to demonstrate the deduplication methods: from pyspark.sql import SparkSession import pandas as ...

Drop columns in DataFrame by label Names or by Index Positions

Devesh Chauhan
Updated on 27-Mar-2026 267 Views

A pandas DataFrame is a 2D data structure for storing tabular data. When working with DataFrames, you often need to remove unwanted columns. This can be done by specifying column names or their index positions using the drop() method. In this tutorial, we'll explore different methods to drop columns from a pandas DataFrame, including dropping by names, index positions, and ranges. Creating the Sample DataFrame: Let's start by creating a sample DataFrame to work with: import pandas as pd dataset = { "Employee ID": ["CIR45", "CIR12", "CIR18", "CIR50", "CIR28"], ...

Drop a list of rows from a Pandas DataFrame

Devesh Chauhan
Updated on 27-Mar-2026 580 Views

The pandas library in Python is widely used for representing data in tabular structures called DataFrames. In data analysis, you often need to remove specific rows from a DataFrame. This article demonstrates three methods for dropping multiple rows from a pandas DataFrame. Creating a Sample DataFrame: Let's start by creating a DataFrame with student marks data: import pandas as pd dataset = { "Aman": [98, 92, 88, 90, 91], "Raj": [78, 62, 90, 71, 45], "Saloni": [82, ...

How to Locate Elements using Selenium Python?

Saba Hilal
Updated on 27-Mar-2026 1K+ Views

Selenium is a web automation tool that can be used with Python to locate and extract elements from web pages. This is particularly useful for web scraping, testing, and automating browser interactions. In this tutorial, we'll explore different methods to locate HTML elements using Selenium with Python. Setting Up Selenium: Before locating elements, you need to set up Selenium with a WebDriver. Here's a basic setup: from selenium import webdriver from selenium.webdriver.common.by import By from selenium.webdriver.chrome.service import Service import time # Setup Chrome driver driver = webdriver.Chrome() driver.get("https://example.com") time.sleep(2) # Always close ...

How to iterate through a nested List in Python?

Saba Hilal
Updated on 27-Mar-2026 6K+ Views

A nested list in Python is a list that contains other lists as elements. Iterating through nested lists requires different approaches depending on the structure and your specific needs. What is a Nested List? Here are common examples of nested lists: # List with mixed data types people = [["Alice", 25, ["New York", "NY"]], ["Bob", 30, ["Los Angeles", "CA"]], ["Carol", 28, ["Chicago", "IL"]]] # 3-dimensional nested list matrix = [ ...
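
As a minimal sketch of two common approaches, the snippet below unpacks a list of fixed-shape records in a for loop and flattens an arbitrarily deep list with a recursive generator. The `people` data is the sample shown above; the helper name `flatten` is hypothetical and introduced only for illustration.

```python
# Sample data taken from the teaser above.
people = [["Alice", 25, ["New York", "NY"]],
          ["Bob", 30, ["Los Angeles", "CA"]],
          ["Carol", 28, ["Chicago", "IL"]]]


def flatten(items):
    """Yield every non-list element of an arbitrarily nested list (hypothetical helper)."""
    for item in items:
        if isinstance(item, list):
            # Recurse into sub-lists instead of yielding them whole.
            yield from flatten(item)
        else:
            yield item


# Approach 1: when every record has the same shape, unpack it directly.
names_and_cities = []
for name, age, (city, state) in people:
    names_and_cities.append((name, city))

# Approach 2: when the depth is unknown, flatten recursively.
flat = list(flatten(people))

print(names_and_cities)
print(flat)
```

Unpacking is the more readable choice when the structure is known in advance; the recursive generator trades that readability for generality.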

How to invert the elements of a boolean array in Python?

Saba Hilal
Updated on 27-Mar-2026 1K+ Views

Boolean array inversion is a common operation when working with data that contains True/False values. NumPy offers several ways to invert boolean arrays: the np.invert() function, the bitwise operator ~, and np.logical_not(). Using NumPy's invert() Function: The np.invert() function performs a bitwise NOT, which on boolean arrays is equivalent to a logical NOT: import numpy as np # Create a boolean array covid_negative = np.array([True, False, True, False, True]) print("Original array:", covid_negative) # Invert using np.invert() covid_positive = np.invert(covid_negative) print("Inverted array:", covid_positive) Original array: [ True False True False True] Inverted array: [False ...
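
The other two approaches named above can be sketched the same way. This is a small illustration, assuming NumPy is installed; the `ints` array is an added example that is not part of the original article, included to show where the bitwise and logical forms stop agreeing.

```python
import numpy as np

# Same sample array as in the np.invert() example.
covid_negative = np.array([True, False, True, False, True])

# The ~ operator is shorthand for np.invert().
inverted_tilde = ~covid_negative

# np.logical_not() computes an element-wise logical NOT.
inverted_logical = np.logical_not(covid_negative)

# On boolean arrays all three approaches give the same result.
all_agree = (np.array_equal(inverted_tilde, inverted_logical)
             and np.array_equal(inverted_tilde, np.invert(covid_negative)))
print("All approaches agree:", all_agree)

# Caveat (added example): on integer arrays the results differ.
ints = np.array([0, 1, 2])
bitwise = ~ints                 # flips every bit of each integer
logical = np.logical_not(ints)  # True only where the value is zero
print(bitwise, logical)
```

For arrays that might not have a boolean dtype, np.logical_not() is the safer choice because it always returns True/False values.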

How to Make a Bell Curve in Python?

Saba Hilal
Updated on 27-Mar-2026 2K+ Views

A bell curve (normal distribution) is a fundamental concept in statistics that appears when many independent random observations are summed or averaged. Python's Plotly library provides tools for creating these visualizations. This article demonstrates three practical methods to create bell curves using different datasets. Understanding Bell Curves: The normal distribution emerges naturally when averaging many observations. For example, rolling two dice and summing their values already gives a roughly bell-shaped (strictly, triangular) pattern: the sum of 7 occurs most frequently, while extreme values (2 or 12) are rare. Example 1: Bell Curve from Dice Roll Simulation. Let's simulate 2000 dice rolls ...
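
The dice example can be checked with a short simulation. The sketch below is not the article's Plotly code: it uses only the standard library and prints a text histogram instead of drawing a chart. The fixed seed and the scale of one `#` per 10 rolls are assumptions made for repeatability and readability.

```python
import random
from collections import Counter

random.seed(0)  # assumption: fixed seed so repeated runs give the same counts

NUM_ROLLS = 2000

# Roll two six-sided dice NUM_ROLLS times and record each sum.
sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(NUM_ROLLS)]
counts = Counter(sums)

# Text histogram: one '#' per 10 occurrences of each possible sum (2..12).
for total in range(2, 13):
    bar = "#" * (counts[total] // 10)
    print(f"{total:2d} {counts[total]:4d} {bar}")
```

The bars peak around 7 and taper toward 2 and 12. Summing more dice per roll makes the shape smoother and closer to a true normal curve.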

What are the limitations of Python?

Md Waqar Tabish
Updated on 27-Mar-2026 9K+ Views

Python is a popular and widely used programming language known for its simplicity, flexibility, and productivity. It excels in web development, data science, automation, and machine learning. However, like any programming language, Python has certain limitations that developers should consider when choosing it for their projects. Performance and Speed Limitations: Python is usually run by an interpreter: CPython, the reference implementation, compiles source code to bytecode and executes it on a virtual machine at runtime. For CPU-bound code, this is typically much slower than ahead-of-time compiled languages like C or C++. import time # Python's interpreted nature makes operations slower start = time.time() result = sum(range(1000000)) end = ...

Positive and negative indices in Python?

Md Waqar Tabish
Updated on 27-Mar-2026 6K+ Views

Python sequences like lists, tuples, and strings support two types of indexing: positive indexing (starting from 0) and negative indexing (starting from -1). This tutorial explains both approaches with practical examples. What Are Sequence Indexes? Indexing allows us to access individual elements in Python sequence data types. There are two types. Positive indexing: starts from 0 and increases to n-1 (where n is the total number of elements). Negative indexing: starts from -1 (the last element) and moves backwards to -n. List: [10, 20, 30, 40, 50] ...
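
The relationship between the two schemes can be sketched with the sample list above: for a sequence of length n, index -k refers to the same element as index n - k. The variable names below are illustrative only.

```python
values = [10, 20, 30, 40, 50]
n = len(values)

first = values[0]    # positive indexing starts at 0
last = values[-1]    # negative indexing starts at -1 (the last element)

# For every k from 1 to n, values[-k] and values[n - k] are the same element.
pairs_match = all(values[-k] == values[n - k] for k in range(1, n + 1))

# The same rules apply to other sequence types such as strings and tuples.
word = "python"
last_char = word[-1]
second_last_item = (1, 2, 3)[-2]

print(first, last, pairs_match)
print(last_char, second_last_item)
```

Negative indices are most useful for reaching the end of a sequence without computing its length first.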

What are the different types of Python data analysis libraries used?

Md Waqar Tabish
Updated on 27-Mar-2026 382 Views

Python has established itself as a leading language for data science, regularly ranking at or near the top of industry surveys. Its success comes from combining an easy-to-learn, object-oriented syntax with specialized libraries for every data science task, from mathematical computations to data visualization. Core Data Science Libraries. NumPy: NumPy (Numerical Python) forms the foundation of Python's data science ecosystem. It provides efficient arrays and mathematical functions for numerical computing: import numpy as np # Creating arrays and basic operations data = np.array([1, 2, 3, 4, 5]) print("Array:", data) print("Mean:", np.mean(data)) print("Standard deviation:", np.std(data)) ...
