In Linux and Unix-based operating systems, every running process gets assigned a unique process identification number (PID). The PID allows administrators to target specific processes in order to inspect, manage, control or terminate them.
This comprehensive guide will cover multiple methods for finding and working with Linux processes using the PID and other attributes.
An Overview of Linux Processes
Before diving into the details of process management, let's quickly recap how Linux processes work under the hood.
Process Lifecycle
A process in Linux goes through several key states during its lifecycle:
- Created – The initial state when a new process is spawned. Memory resources may not be allocated yet.
- Running – The process is actively executing instructions on the CPU.
- Waiting – Temporarily paused waiting for an I/O operation like disk access to complete.
- Terminated – Process finished its job and is now exiting by releasing resources.

Processes can transition between these states many times during their lifespan.
The Linux kernel tracks these state changes and schedules processes across CPU cores.
Process Attributes
Key attributes of a Linux process you can inspect include:
- PID – Unique process ID number assigned at creation. Used to identify and control processes.
- PPID – Parent process ID that spawned this one.
- User/Group ID – Associated user and group owner.
- Priority – Kernel assigned priority influencing scheduler and CPU time. Ranges from -20 (highest) to 19 (lowest).
- Threads – Number of threads being executed within the process.
- Environment – Environment variables from the process's context.
- Open Files – File descriptors pointing to open files and sockets.
- Memory – Virtual, shared, and physical memory usage statistics.
- CPU Time – Total user and system CPU time consumed so far.
- Command – Path and arguments for the executable binary being run.
There are dozens of other metadata points tracked as well. These attributes are critical for managing processes.
Next, let's explore tools that expose these process details.
Finding the PID of a Process
There are several simple commands for querying active processes and displaying their PID:
ps
The ps (process status) command shows snapshot information about currently running processes. To display all processes with their PID and other details:
ps aux
The output includes the PID, user, CPU usage percentage, full executable command line, and more.
You can pipe the output of ps through grep to filter on specific criteria like user or command name:
ps aux | grep '[f]irefox'
Wrapping the first letter in brackets is a common trick that keeps grep from matching its own process in the results.
Common flags for ps include:
- a – Show processes for all users
- u – Display the process owner
- x – Also shows processes not attached to a terminal
pstree
pstree visually shows running processes as a tree so you can understand the parent-child relationships:
pstree
Output:
systemd─┬─NetworkManager─┬─dhclient
        │                └─2*[{NetworkManager}]
        ├─accounts-daemon─┬─{gdbus}
        │                 └─{gmain}
        ├─atd
        ├─cron
        ├─dbus-daemon
It's very helpful for identifying cascading process dependencies.
pidof
pidof simply returns the PID given the name of a process:
pidof firefox
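The output of pidof substitutes neatly into other commands. A quick sketch, using a throwaway sleep process as the target (with a real target such as firefox, this reduces to `kill $(pidof firefox)`):

```shell
# Start a disposable background process to act as the target
sleep 300 &

# Look up its PID by name...
pidof sleep

# ...and feed that straight into kill
kill $(pidof sleep)
```

Note that pidof returns every matching PID, so the kill above terminates all processes named sleep, not just the one started here.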
top & htop
The top command shows dynamic real-time information about active processes ordered by highest CPU/memory usage:
top
htop is an enhanced interactive version of top for easier browsing and killing processes.
pgrep
pgrep searches running processes by name or other attributes and returns PIDs:
pgrep -u user1 bash
This returns PIDs of bash processes owned by user1.
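A couple of other pgrep flags worth knowing (the process names here are illustrative):

```shell
# -l prints the process name next to each PID
pgrep -l sh

# -c counts matches instead of listing them; here, processes
# owned by the current user
pgrep -c -u "$(id -un)"
```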
Getting Additional Process Information
Beyond the PID, you can view extended details about a running process to help troubleshoot issues or identify resource consumption.
Fetch Full Details with ps
The a, u, and x flags together make ps show every process on the system in a detailed, user-oriented format:
ps aux
Commonly inspected columns include:
- %CPU – CPU utilization percentage
- %MEM – Percent of total physical RAM used
- VSZ – Virtual memory size in KB
- RSS – Resident set size, the non-swapped physical memory used
- TTY – Controlling terminal (tty)
- STAT – Process state code (R = running, S = sleeping, etc)
- START – When the process was started
- TIME – Total CPU time consumed so far
- COMMAND – Full command/arguments used to launch
This exposes more detail about each process to aid troubleshooting.
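For ad-hoc triage, GNU ps also accepts a custom column list and sort order via -eo and --sort. For example, a sketch showing the five busiest processes by CPU:

```shell
# Custom columns, sorted by CPU usage descending; head keeps the
# header line plus the top five processes
ps -eo pid,user,%cpu,%mem,stat,comm --sort=-%cpu | head -n 6
```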
Inspect Open Files with lsof
lsof stands for "list open files" and shows which files (and network sockets) are open by each process:
lsof -p 6348
Sample output:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mysqld 6348 mysql cwd DIR 253,1 12288 32786 /usr/
mysqld 6348 mysql mem REG 253,1 18841344 64424570 /usr/libexec/mysqld
This helps determine if a process is still actively reading/writing files or network connections.
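lsof can also filter down to network activity alone. The -i flag selects sockets, and -a ANDs it with the -p filter (PID 6348 continues the mysqld example above; the port in the second command is likewise illustrative):

```shell
# Only the network sockets held open by PID 6348
lsof -a -i -p 6348

# Or work backwards: which process is listening on TCP port 3306?
lsof -iTCP:3306 -sTCP:LISTEN
```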
View Environmental Variables
Each process executes within an isolated context that includes environment variables, user permissions, working directory, resource limits, and more.
View a process's full environmental details via the /proc virtual file system:
cat /proc/{PID}/environ
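The entries in environ are NUL-separated, so cat prints them mashed together on one line. Piping through tr makes them readable (PID 1234 is a placeholder):

```shell
# Convert NUL separators to newlines, one VAR=value per line
tr '\0' '\n' < /proc/1234/environ
```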
See Memory Maps
Processes allocate different memory regions for code, data, shared libraries, etc. The full memory maps of a process are visible under /proc:
cat /proc/{PID}/maps
This helps identify memory leaks or fragmentation issues if a process's resident set unexpectedly grows over time.
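For a quick summary without reading the full maps, /proc/{PID}/status exposes the headline memory counters (again, 1234 is a placeholder PID):

```shell
# Virtual size, resident set size, and thread count at a glance
grep -E 'VmSize|VmRSS|Threads' /proc/1234/status
```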
Controlling Processes
Now that you can find PIDs and process details, let's discuss methods for controlling them.
Kill Commands to Terminate Processes
You can terminate Linux processes using the kill command and related tools. Here are some common options:
kill
Syntax:
kill [signal] PID
Common signals:
- -15 – SIGTERM gracefully stops the process.
- -9 – SIGKILL immediately terminates the process. Risks data loss.
Force kill process 1215:
kill -9 1215
killall
killall terminates by process name instead of PID:
killall dockerd
pkill
pkill is similar, but matches processes by pgrep-style patterns and can also target processes owned by a specific user or attached to a given terminal.
xkill
xkill provides a graphical interface. Click on a window to terminate the owning process.
Shutting Down Cleanly
The best way to terminate processes is by gracefully shutting them down to avoid data loss or corruption.
- First send SIGTERM (15) to allow the process to finish its work and release resources cleanly.
- After some timeout, follow up with SIGKILL (9) to force termination if it is still running.
Here is a common pattern:
# Graceful shutdown attempt
kill -15 1234
sleep 30
# Still running after 30 seconds? Force kill.
kill -9 1234
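The fixed 30-second sleep can be tightened into a poll. kill -0 sends no signal at all; it only reports whether the PID still exists. A sketch wrapping the pattern as a shell function (the 10-second timeout is an arbitrary choice):

```shell
# Ask a process to stop, escalating to SIGKILL only if it ignores SIGTERM
graceful_stop() {
    pid="$1"
    kill -15 "$pid" 2>/dev/null
    i=0
    while [ "$i" -lt 10 ]; do
        # kill -0 delivers nothing; it just tests whether the PID is alive
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done
    kill -9 "$pid" 2>/dev/null   # still running after 10s: force it
}
```

Invoke it as `graceful_stop 1234`.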
Pausing Processes
Instead of terminating processes, you can use signals to pause and resume execution:
# Freeze execution
kill -STOP 1345
# Resume running
kill -CONT 1345
This keeps the process active but suspended, useful in some cases for troubleshooting or checkpoints.
Managing Process Resources
Beyond controlling process state, Linux offers advanced control over resources consumed by processes including CPU, memory, network bandwidth and disk I/O.
Set Process Priority
The kernel scheduler determines which processes get access to CPU cores and for how much time. This is influenced by the configured priority value.
Each process has a nice value ranging from -20 (highest priority) to +19 (lowest). Processes with a higher nice value are given less CPU time by the scheduler; lowering a nice value below zero requires root privileges.
Check and alter priorities:
# Check
ps -eo pid,args,nice,comm
# Set
renice +5 -p 2140
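renice adjusts a process that is already running; to start a job de-prioritized from the outset, wrap it in nice (the script name is a placeholder):

```shell
# Launch at nice value 10 so interactive work stays responsive
nice -n 10 ./batch_job.sh
```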
Limit Resource Usage
Cgroups allow grouping processes under specific limits for better isolation and control.
For example, bind Docker containers or high-traffic web servers to a maximum CPU quota, or restrict total memory usage for a batch video-processing pipeline.
Common resources to apply quotas and throttles include:
- CPUs
- RAM
- Network bandwidth
- Disk I/O
This ensures runaway processes don't overload your systems by consuming too many shared resources.
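On systemd-based hosts, systemd-run is a convenient front end to cgroups: it launches a command inside a transient scope with resource properties attached. A sketch (the command and limits are illustrative, and the exact property names available depend on your systemd and cgroup version):

```shell
# Cap the job at half a CPU core and 500 MB of RAM via a transient cgroup scope
systemd-run --scope -p CPUQuota=50% -p MemoryMax=500M ./encode_batch.sh
```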
Architecting Long-Running Processes
There are various architectural patterns for implementing long-running Linux processes depending on the use case:
Daemons
Daemons are system-level background services launched at boot to handle critical infrastructure functionality: network services, databases, web servers, etc.
Run daemons as systemd services so they are supervised and start across reboots.
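A minimal sketch of such a unit file (the service name and binary path are hypothetical):

```ini
# /etc/systemd/system/myworker.service
[Unit]
Description=Example long-running worker
After=network.target

[Service]
ExecStart=/usr/local/bin/myworker
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now myworker` so it starts at boot and is restarted automatically on failure.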
Cron Jobs
Scripted jobs that need to run on a schedule work well as cron tasks. This avoids keeping a long-running process resident just to wait for the clock.
Fire off time-based background crunching tasks without consuming resources in between.
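A sketch of a crontab entry (edit with `crontab -e`; the script path is hypothetical):

```
# min hour day month weekday  command
30  2    *   *     *         /usr/local/bin/nightly_cleanup.sh
```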
Message Queues
Implement long-running processing using messaging queues like RabbitMQ or Kafka instead of direct client/server.
Producers publish job messages to a durable queue. Separate worker consumers pull tasks asynchronously. This decouples the distributed components for more resilience.
Containers
Docker containers provide isolated environments for processes. This prevents interference from other users and workloads on the same hosts.
Also simplifies clustering apps across hosts.
Linux Process Management Gotchas
Here are some common pitfalls and best practices when architecting Linux process workflows:
- Plan out worker process startup, restart policies and graceful shutdown handling from the beginning.
- Select an appropriate communication mechanism like HTTP, messages, files or databases between detached components.
- Validate all processes clean up child processes and threads correctly to prevent zombies.
- Enable core dumps by default via ulimit for critical processes.
- Collect resource usage metrics on processes before hitting limits in production.
- Profile CPU, memory, disk and network at a thread level.
- Stress test process watchdog components that monitor health.
- Mind security by restricting users and applying principle of least privilege to processes.
- Don't share code or configurations between production process instances.
- Document architectures with process flow diagrams.
Getting Linux process management right ensures your services stay available and stable.
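Zombie processes show up in ps with state Z. A quick check for them, printing each zombie's PID and the parent that has failed to reap it:

```shell
# List zombie (defunct) processes: PID, parent PID, state, and name
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'
```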
Monitoring Linux Processes at Scale
While interactive ps and top commands work fine on individual servers, they don't scale across large environments with thousands of containers or distributed services.
Here are common tools for monitoring Linux processes across fleets of servers:
System Resource Monitors
Tools like:
- Nagios – Alert when key process health metrics cross thresholds.
- Datadog – Graph and correlate metrics collected across hosts.
- Prometheus – Pull and aggregate custom app-level process metrics.
Help track usage, saturation and errors at scale.
Log Aggregators
Centralized logging systems like the ELK stack enable searching through process stdout/stderr logs across environments using metadata like process ID.
APMs
Application performance management (APM) tools hook into app process code paths to trace requests end-to-end through complex systems.
Gather fine-grained performance metrics and detect latency outliers.
Tracing
Distributed tracing follows a specific thread of execution across networked microservices processes.
Understand how change impacts interconnected systems.
Processes in Docker Containers
Docker containers provide portable isolated environments for Linux processes to run without conflicting with other host users or workloads.
Containers have their own:
- Filesystem
- Process list
- Network stack
- Resource usage quotas
This removes many traditional process coordination headaches.
However Docker introduces new operational challenges like:
- Increased density of processes per host.
- Ephemeral container lifespans.
- Networked communication between containers.
Make sure to integrate containers with host-level DevOps tools for visibility. Track deployment events, logs, and metrics at a container level.
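Docker exposes its own hooks into the process views covered earlier. A couple of sketches ("web" is a placeholder container name):

```shell
# ps-style listing of the processes inside a container
docker top web

# One-shot CPU/memory/network snapshot for a container
docker stats --no-stream web
```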
Troubleshooting High Process Usage
Sometimes Linux systems may experience degraded performance, instability, or outages due to processes overconsuming resources from contention, leaks or other issues.
Here is a methodology for troubleshooting processes:
Profile Baseline Usage
First capture a profile of standard process activity across CPUs, memory, disk, network when the system is healthy. This establishes a baseline for comparison.
Identify trends by time of day, schedules and usage patterns.
Gather Symptom Data
When incidents strike, record observational data like:
- Was process usage increasing steadily over hours or days, or did it spike suddenly?
- Are multiple processes slowing down or a single runaway process?
- Are system resources hitting limits or saturation?
- What is the impact on dependent services?
Reproduce scenarios that lead to problems if intermittent.
Inspect Dependencies
View process information to narrow down culprits.
- Which PIDs use the most resources? Any child processes?
- What states are heavy processes stuck in?
- Any increase in open files, sockets or network activity?
- Growing memory allocation but not being freed?
Get thread-level CPU and tracing profiles if available.
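The questions above map onto a quick ps triage, for example sorting by resident memory and including per-process thread counts (the nlwp column):

```shell
# Top five memory consumers: PID, parent, resident KB, threads, state, name
ps -eo pid,ppid,rss,nlwp,stat,comm --sort=-rss | head -n 6
```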
Consider External Factors
Look at other factors like:
- Application configuration or provisioning changes.
- Traffic patterns or data volumes.
- Upstream dependencies performance.
- Infrastructure platform problems.
Implement Fixes
Address root cause issues like:
- Configuration tuning or optimizing inefficient code paths
- Refactoring models or queuing
- Load balancing
- Resource quota enforcement via Cgroups.
- Scaling out processes across more hosts.
Then validate monitoring catches problems going forward.
Wrapping Up
As you can see, Linux offers rich interfaces for orchestrating processes – the building blocks of complex operating systems and applications.
Make sure to architect process lifecycles upfront, with scaling, observability and resiliency considered from the beginning.
Dig into the extensive metrics exposed through commands like ps, /proc and lsof for deep visibility into all aspects of process health.
Control and contain misbehaving processes with Linux's strong process management and resource constraint capabilities.
Mature monitoring and tracing tooling integration surfaces process issues instantly even across fleets of distributed servers.
With these Linux process foundations and methodologies in place, you can confidently build and operate large-scale systems reliably.


