Read and Write CSV Files with Python
A simple guide to reading and writing CSV files in Python using the csv module and pandas, with quick examples for data processing.
CSV files are everywhere. Here’s how to read and write them efficiently with Python.
Built-in CSV Module
No installation needed. Part of Python’s standard library.
Basic Reading
import csv

with open('data.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # ['Name', 'Age', 'City']
Reading with Headers
import csv

with open('data.csv', 'r') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row['Name'], row['Age'])  # Access by column name
Basic Writing
import csv

data = [
    ['Name', 'Age', 'City'],
    ['John', '28', 'New York'],
    ['Sarah', '34', 'Boston']
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
Writing with Headers
import csv

headers = ['Name', 'Age', 'City']
data = [
    {'Name': 'John', 'Age': '28', 'City': 'New York'},
    {'Name': 'Sarah', 'Age': '34', 'City': 'Boston'}
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=headers)
    writer.writeheader()
    writer.writerows(data)
Using Pandas (Recommended)
More powerful for data manipulation.
pip install pandas
Read CSV
import pandas as pd
# Basic read
df = pd.read_csv('data.csv')
print(df.head())
# Skip rows
df = pd.read_csv('data.csv', skiprows=2)
# Select specific columns
df = pd.read_csv('data.csv', usecols=['Name', 'Age'])
# Handle missing values
df = pd.read_csv('data.csv', na_values=['NA', 'null', ''])
# Custom delimiter
df = pd.read_csv('data.tsv', sep='\t')
# No headers
df = pd.read_csv('data.csv', header=None)
Write CSV
import pandas as pd
data = {
    'Name': ['John', 'Sarah', 'Mike'],
    'Age': [28, 34, 45],
    'City': ['New York', 'Boston', 'Chicago']
}
df = pd.DataFrame(data)
# Basic write
df.to_csv('output.csv', index=False)
# Custom delimiter
df.to_csv('output.tsv', sep='\t', index=False)
# Append to existing file
df.to_csv('output.csv', mode='a', header=False, index=False)
# Write specific columns
df[['Name', 'City']].to_csv('output.csv', index=False)
Filter Data
import pandas as pd
df = pd.read_csv('employees.csv')
# Filter rows
young_employees = df[df['Age'] < 30]
young_employees.to_csv('young_employees.csv', index=False)
# Multiple conditions
filtered = df[(df['Age'] > 25) & (df['City'] == 'New York')]
filtered.to_csv('filtered.csv', index=False)
Clean Data
import pandas as pd
df = pd.read_csv('messy_data.csv')
# Remove duplicates
df = df.drop_duplicates()
# Remove empty rows
df = df.dropna()
# Fill missing values
df['Age'] = df['Age'].fillna(0)
# Strip whitespace
df['Name'] = df['Name'].str.strip()
# Save cleaned data
df.to_csv('clean_data.csv', index=False)
Combine CSV Files
import pandas as pd
import glob
# Read all CSV files in directory
csv_files = glob.glob('data/*.csv')
# Combine into one dataframe
dfs = [pd.read_csv(file) for file in csv_files]
combined = pd.concat(dfs, ignore_index=True)
# Save combined data
combined.to_csv('combined.csv', index=False)
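If you later need to know which file a row came from, you can tag each DataFrame before concatenating. A minimal sketch under the same data/*.csv layout as above; the source_file column name and combined_with_source.csv output are just illustrations:

import glob
import pandas as pd

csv_files = glob.glob('data/*.csv')
# Add a column recording the originating file to every row
dfs = [pd.read_csv(file).assign(source_file=file) for file in csv_files]
combined = pd.concat(dfs, ignore_index=True)
combined.to_csv('combined_with_source.csv', index=False)

concat aligns columns by name, so files whose columns are in a different order still combine correctly.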
Transform Data
import pandas as pd
df = pd.read_csv('sales.csv')
# Add new column
df['Total'] = df['Quantity'] * df['Price']
# Rename columns
df = df.rename(columns={'old_name': 'new_name'})
# Sort data
df = df.sort_values('Total', ascending=False)
# Group and aggregate
summary = df.groupby('Product').agg({
    'Quantity': 'sum',
    'Total': 'sum'
})
summary.to_csv('summary.csv')
Handle Large Files
import pandas as pd
# Read in chunks
chunk_size = 10000
for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size):
    # Process each chunk
    processed = chunk[chunk['Age'] > 30]
    processed.to_csv('output.csv', mode='a', header=False, index=False)
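One thing to watch: the loop above appends every chunk without a header, so it assumes output.csv already exists with one. A minimal variant that overwrites the output and writes the header with the first chunk only (the filtered_large.csv name is just an illustration):

import pandas as pd

first_chunk = True
for chunk in pd.read_csv('large_file.csv', chunksize=10000):
    processed = chunk[chunk['Age'] > 30]
    # Overwrite and write the header on the first chunk, then append without it
    processed.to_csv('filtered_large.csv',
                     mode='w' if first_chunk else 'a',
                     header=first_chunk,
                     index=False)
    first_chunk = False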
Convert Data Types
import pandas as pd
df = pd.read_csv('data.csv')
# Convert types during read
df = pd.read_csv('data.csv', dtype={
    'Age': int,
    'Salary': float,
    'Name': str
})
# Convert dates
df = pd.read_csv('data.csv', parse_dates=['Date'])
# Or after reading
df['Date'] = pd.to_datetime(df['Date'])
df['Age'] = df['Age'].astype(int)
df.to_csv('typed_data.csv', index=False)
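One caveat: casting with int (as in the dtype example or astype(int)) raises an error if the column contains missing values. A minimal sketch using pandas' nullable integer dtype instead, assuming the same data.csv:

import pandas as pd

df = pd.read_csv('data.csv')
# 'Int64' (capital I) is pandas' nullable integer dtype, so missing values survive the cast
df['Age'] = df['Age'].astype('Int64')
df.to_csv('typed_data.csv', index=False)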
CSV to JSON
import pandas as pd
df = pd.read_csv('data.csv')
# To JSON
df.to_json('data.json', orient='records', indent=2)
# From JSON back to CSV
df = pd.read_json('data.json')
df.to_csv('converted.csv', index=False)
Handle Different Encodings
import pandas as pd
# Common encodings
df = pd.read_csv('data.csv', encoding='utf-8')
df = pd.read_csv('data.csv', encoding='latin-1')
df = pd.read_csv('data.csv', encoding='cp1252')
# Auto-detect encoding (chardet is a third-party package: pip install chardet)
import chardet

with open('data.csv', 'rb') as file:
    result = chardet.detect(file.read())

encoding = result['encoding']
df = pd.read_csv('data.csv', encoding=encoding)
CSV with Quotes
import csv
# Read with quotes
with open('data.csv', 'r') as file:
    reader = csv.reader(file, quotechar='"')
    for row in reader:
        print(row)

# Write with quotes
with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file, quoting=csv.QUOTE_ALL)
    writer.writerow(['Name', 'Description'])
    writer.writerow(['Product', 'Description with, comma'])
Error Handling
import pandas as pd
try:
    df = pd.read_csv('data.csv')
except FileNotFoundError:
    print("File not found")
except pd.errors.EmptyDataError:
    print("CSV file is empty")
except pd.errors.ParserError:
    print("Error parsing CSV")
else:
    df.to_csv('output.csv', index=False)
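If you would rather skip malformed rows than abort, recent pandas versions (1.3 and later) accept an on_bad_lines argument; a minimal sketch, assuming the same data.csv:

import pandas as pd

# Drop rows that fail to parse instead of raising ParserError (pandas >= 1.3)
df = pd.read_csv('data.csv', on_bad_lines='skip')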
Quick Reference
Reading:
- csv.reader() - Simple row-by-row reading
- csv.DictReader() - Read rows as dictionaries
- pd.read_csv() - Full-featured reading with data manipulation
Writing:
- csv.writer() - Simple row writing
- csv.DictWriter() - Write dictionaries
- df.to_csv() - Write a pandas DataFrame
Key Parameters:
- delimiter / sep - Column separator (default: comma)
- quotechar - Character for quoting fields (default: ")
- header - Row number to use for column names
- skiprows - Number of rows to skip at the start
- encoding - File encoding (utf-8, latin-1, etc.)
- index=False - Don't write row numbers
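A short sketch tying several of these parameters together, assuming a hypothetical semicolon-separated report.csv with two junk lines above the header (the file names are just illustrations):

import pandas as pd

df = pd.read_csv('report.csv',
                 sep=';',              # column separator
                 skiprows=2,           # skip the junk lines before the header
                 header=0,             # first remaining row holds the column names
                 encoding='latin-1')   # file encoding
df.to_csv('report_clean.csv', index=False)  # don't write row numbers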
When to Use What:
- Built-in csv module: Simple reading/writing, no dependencies
- Pandas: Data cleaning, filtering, transformation, analysis
Conclusion
For simple tasks, use Python’s built-in csv module. For anything involving data manipulation, filtering, or transformation, use pandas. Both are fast and reliable for processing CSV files.