Basic Pipeline Pattern:
┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
│  Input  │──▶│ Filter  │──▶│ Process │──▶│ Output  │
│ Source  │   │  Data   │   │  Data   │   │ Format  │
└─────────┘   └─────────┘   └─────────┘   └─────────┘
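
As a minimal sketch of this four-stage pattern, the function below feeds a few made-up records through one command per stage. The name basic_pipeline and the sample data are assumptions for illustration only.

```shell
# Input source -> filter -> process -> output format.
# basic_pipeline and its sample records are hypothetical.
basic_pipeline() {
    printf '%s\n' 'apple 3' 'banana 0' 'cherry 7' |   # Input source
        grep -v ' 0$' |                               # Filter: drop zero counts
        sort -k2,2nr |                                # Process: highest count first
        awk '{printf "%-8s %d\n", $1, $2}'            # Output format: aligned columns
}

basic_pipeline
```

Ending each line with the pipe character lets the command continue on the next line and still leaves room for a trailing comment.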
Multi-stage Processing:
┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
│   Raw   │──▶│  Clean  │──▶│ Extract │──▶│  Sort   │──▶│ Format  │
│  Data   │   │  Data   │   │ Fields  │   │  Data   │   │ Output  │
└─────────┘   └─────────┘   └─────────┘   └─────────┘   └─────────┘
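
One way to keep a longer pipeline readable is to give each stage its own small shell function and then chain the functions. The sketch below assumes a made-up CSV layout (name,team,score); all function names are hypothetical.

```shell
# Each stage of the multi-stage diagram as a small function reading stdin.
# Function names and sample data are hypothetical.
clean_data()     { grep -v '^#' | grep -v '^$'; }               # Drop comments and blank lines
extract_fields() { cut -d, -f1,3; }                             # Keep the name and score columns
sort_data()      { sort -t, -k2,2nr; }                          # Highest score first
format_output()  { awk -F, '{printf "%-6s %s\n", $1, $2}'; }    # Aligned columns

multi_stage() {
    printf '%s\n' '# name,team,score' 'ann,red,70' '' 'bob,blue,90' 'cy,red,80' |
        clean_data | extract_fields | sort_data | format_output
}

multi_stage
```

Because every function reads standard input and writes standard output, a stage can be tested on its own or swapped out without touching the others.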
Parallel Processing:
                  ┌─────────┐
             ┌───▶│Process A│────┐
┌─────────┐       └─────────┘    ▼    ┌─────────┐
│  Input  │                           │ Combine │──▶ Output
│ Source  │       ┌─────────┐    ▲    │ Results │
└─────────┘       │Process B│────┘    └─────────┘
             └───▶└─────────┘
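
The parallel pattern can be sketched with background jobs: both processes read the same input at the same time, and a final step combines their results. This is one possible approach, not the only one; the function name, file names, and sample data below are hypothetical.

```shell
# Fan one input out to two concurrent processes, then combine the results.
# parallel_demo, the temporary file names, and the sample data are hypothetical.
parallel_demo() {
    workdir=$(mktemp -d)
    printf '%s\n' 'error disk' 'info boot' 'error net' 'info login' > "$workdir/input.txt"

    # Process A and Process B run concurrently as background jobs
    grep -c '^error' "$workdir/input.txt" > "$workdir/a.out" &
    grep -c '^info'  "$workdir/input.txt" > "$workdir/b.out" &
    wait    # Block until both background jobs have finished

    # Combine results
    paste -d' ' "$workdir/a.out" "$workdir/b.out" |
        awk '{print "errors=" $1, "infos=" $2}'

    rm -rf "$workdir"
}

parallel_demo
```

The wait call matters: without it, the combine step could read the output files before the background jobs have finished writing them.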
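# Data analysis pipeline
# Note: a comment cannot follow a trailing backslash; ending each line
# with | continues the command and still allows a comment.
cat data.csv |
    grep -v "^#" |       # Remove comments
    cut -d, -f2,4 |      # Extract columns 2 and 4
    grep -v "^$" |       # Remove empty lines
    sort |               # Sort the data so duplicates are adjacent
    uniq -c |            # Count occurrences
    sort -nr |           # Sort by count (descending)
    head -10             # Top 10 results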
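# Log analysis pipeline
# Assumes each line starts with an ISO date (YYYY-MM-DD); for the traditional
# "Mon DD" syslog timestamp, use date '+%b %e' instead.
grep "$(date '+%Y-%m-%d')" /var/log/syslog |   # Today's entries
    grep "ERROR" |                             # Keep error lines
    cut -d' ' -f5- |                           # Keep field 5 onward
    sort |                                     # Group identical messages
    uniq -c |                                  # Count occurrences
    sort -nr                                   # Most frequent first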
# System monitoring pipeline
ps aux |
    grep -v "^USER" |             # Remove header
    awk '{print $4, $11}' |       # Memory % and command
    sort -nr |                    # Sort by memory usage
    head -10 |                    # Top 10 processes
    while read -r mem cmd; do     # Format output
        printf "%.1f%% %s\n" "$mem" "$cmd"
    done
Real-world Log Analysis Pipeline:
cat access.log | grep "404" | cut -d' ' -f1 | sort | uniq -c | sort -nr | head -5
Step 1: Read log       Step 2: Filter 404s    Step 3: Extract IPs
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│192.168.1.1 404  │    │192.168.1.1 404  │    │192.168.1.1      │
│192.168.1.2 200  │───▶│192.168.1.3 404  │───▶│192.168.1.3      │
│192.168.1.3 404  │    │192.168.1.1 404  │    │192.168.1.1      │
│192.168.1.1 404  │    │192.168.1.5 404  │    │192.168.1.5      │
│192.168.1.4 200  │    └─────────────────┘    └─────────────────┘
│192.168.1.5 404  │
└─────────────────┘
Step 4: Sort IPs       Step 5: Count unique   Step 6: Sort by count
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│192.168.1.1      │    │  2 192.168.1.1  │    │  2 192.168.1.1  │
│192.168.1.1      │───▶│  1 192.168.1.3  │───▶│  1 192.168.1.5  │
│192.168.1.3      │    │  1 192.168.1.5  │    │  1 192.168.1.3  │
│192.168.1.5      │    └─────────────────┘    └─────────────────┘
└─────────────────┘
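
To try the walkthrough by hand, the sketch below writes a small sample log to a temporary file and runs the same stages on it. The sample lines use a simplified "IP status" layout chosen to yield the counts shown in the diagram; real access logs carry more fields. The name log_demo is hypothetical, and anchoring the pattern as ' 404$' is a small variation on the one-liner so that a 404 elsewhere on the line would not match.

```shell
# Reproduce the six-step walkthrough on a throwaway sample log.
# log_demo and the sample data are hypothetical.
log_demo() {
    log=$(mktemp)
    printf '%s\n' \
        '192.168.1.1 404' \
        '192.168.1.2 200' \
        '192.168.1.3 404' \
        '192.168.1.1 404' \
        '192.168.1.4 200' \
        '192.168.1.5 404' > "$log"      # Step 1: the log to read

    grep ' 404$' "$log" |     # Step 2: keep only 404 responses
        cut -d' ' -f1 |       # Step 3: extract the IP address
        sort |                # Step 4: put identical IPs next to each other
        uniq -c |             # Step 5: count each IP
        sort -nr |            # Step 6: highest count first
        head -5

    rm -f "$log"
}

log_demo
```

The sort in step 4 is required because uniq only merges adjacent duplicate lines.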
# Clean and analyze text file
cat document.txt |
    tr '[:upper:]' '[:lower:]' |      # Convert to lowercase
    tr -d '[:punct:]' |               # Remove punctuation
    tr ' ' '\n' |                     # One word per line
    grep -v "^$" |                    # Remove empty lines
    sort |                            # Sort words
    uniq -c |                         # Count word frequency
    sort -nr |                        # Sort by frequency
    head -20 > word_frequency.txt     # Save top 20 words
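
The same word-frequency idea can be wrapped in a function that reads standard input, which makes it easy to try on a short sample before pointing it at a real file. In this sketch the name word_freq and the sample sentence are hypothetical, and tr -s '[:space:]' '\n' is a variation that also splits on tabs and collapses repeated whitespace.

```shell
# Word-frequency pipeline as a reusable function reading standard input.
# word_freq and the sample sentence are hypothetical.
word_freq() {
    tr '[:upper:]' '[:lower:]' |     # Convert to lowercase
        tr -d '[:punct:]' |          # Remove punctuation
        tr -s '[:space:]' '\n' |     # One word per line (spaces, tabs, newlines)
        grep -v '^$' |               # Remove empty lines
        sort |                       # Sort words
        uniq -c |                    # Count word frequency
        sort -nr                     # Sort by frequency
}

printf 'The cat saw the dog. The dog ran!\n' | word_freq | head -2
```

For a file, the call would look like word_freq < document.txt | head -20.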