To analyze tornado data using Python and GeoPandas, follow these steps:
1. Install Dependencies
Ensure you have the required libraries installed:
pip install geopandas pandas matplotlib seaborn shapely folium scipy
2. Load Tornado Dataset
You'll need a dataset containing tornado information, such as the NOAA Storm Events Database. Load it into a Pandas DataFrame:
import pandas as pd
# Load the tornado dataset (assuming CSV format)
df = pd.read_csv("tornado_data.csv")
# Display basic information
print(df.head())
df.info()  # info() prints its summary directly and returns None
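Before going further, it can help to drop rows that cannot be mapped. The sketch below is one possible cleaning pass, not a required step: the column names 'date', 'latitude', and 'longitude' and the helper clean_tornado_rows are assumptions for illustration, and the sample rows are made up.

```python
import pandas as pd

# Hypothetical sample standing in for tornado_data.csv; the column names
# 'date', 'latitude', and 'longitude' are assumptions, not a fixed schema.
sample = pd.DataFrame({
    "date": ["2011-04-27", "2013-05-20", "not a date", "2019-03-03"],
    "latitude": [33.2, 35.3, 36.1, None],
    "longitude": [-87.5, -97.5, -250.0, -85.4],
})

def clean_tornado_rows(frame):
    """Return only rows with a parseable date and plausible coordinates."""
    out = frame.copy()
    # Unparseable values become NaT / NaN instead of raising an error
    out["date"] = pd.to_datetime(out["date"], errors="coerce")
    out["latitude"] = pd.to_numeric(out["latitude"], errors="coerce")
    out["longitude"] = pd.to_numeric(out["longitude"], errors="coerce")
    valid = (
        out["date"].notna()
        & out["latitude"].between(-90, 90)
        & out["longitude"].between(-180, 180)
    )
    return out[valid].reset_index(drop=True)

cleaned = clean_tornado_rows(sample)
print(f"kept {len(cleaned)} of {len(sample)} rows")
```

Rows with a missing coordinate fail the between() check automatically, so no separate dropna call is needed.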
3. Convert Data to GeoPandas
GeoPandas allows spatial analysis. Convert the tornado data into a GeoDataFrame:
import geopandas as gpd
# Assumes the CSV has numeric 'longitude' and 'latitude' columns.
# points_from_xy builds all points at once, which is much faster than a row-wise apply.
# Convert to a GeoDataFrame in WGS 84 (EPSG:4326, plain longitude/latitude degrees)
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['longitude'], df['latitude']), crs="EPSG:4326")
print(gdf.head())
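EPSG:4326 stores coordinates in degrees, so distance or area calculations usually need a projected CRS first. Here is a minimal sketch, assuming EPSG:5070 (NAD83 / Conus Albers, in meters) is a reasonable choice for the contiguous U.S.; the two points are made-up examples, and the planar distance is only an approximation.

```python
import geopandas as gpd
from shapely.geometry import Point

# Two hypothetical event locations as (longitude, latitude)
pts = gpd.GeoDataFrame(
    {"label": ["A", "B"]},
    geometry=[Point(-87.5, 33.2), Point(-97.5, 35.3)],
    crs="EPSG:4326",
)

# Reproject from degrees to a meter-based CRS (assumed suitable for the lower 48 states)
projected = pts.to_crs("EPSG:5070")

# Planar distance between the two points, converted from meters to kilometers
distance_km = projected.geometry.iloc[0].distance(projected.geometry.iloc[1]) / 1000
print(f"approximate distance: {distance_km:.0f} km")
```

The same to_crs call works on the full gdf if you later need buffers or path lengths in meters.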
4. Load a Base Map (US States or Counties)
To visualize tornado locations, use a shapefile for U.S. states or counties:
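import matplotlib.pyplot as plt
# Load country outlines (naturalearth_lowres contains countries, not individual states).
# Note: gpd.datasets was removed in GeoPandas 1.0; on newer versions, download the
# Natural Earth countries file and pass its local path to gpd.read_file instead.
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Filter for the United States (the dataset names it "United States of America")
us_map = world[world.name == "United States of America"]
# Plot the base map with tornado locations
ax = us_map.plot(figsize=(10, 6), color='lightgrey', edgecolor='black')
gdf.plot(ax=ax, color='red', markersize=5, alpha=0.6)
plt.show()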
5. Analyze Tornado Trends
You can explore patterns based on magnitude, seasonality, or geography:
5.1 Tornado Frequency by Year
import seaborn as sns
import matplotlib.pyplot as plt
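# Assumes a 'date' column that pandas can parse
df['year'] = pd.to_datetime(df['date']).dt.year
# discrete=True gives one bar per year instead of arbitrary bin edges
sns.histplot(df['year'], discrete=True)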
plt.title("Tornado Frequency Over Time")
plt.xlabel("Year")
plt.ylabel("Count")
plt.show()
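The same date column also supports the seasonality view mentioned above. Here is a small sketch that counts events per calendar month; the dates are invented for illustration, and with real data you would start from pd.to_datetime(df['date']).

```python
import pandas as pd

# Hypothetical event dates; a real dataset would supply these
dates = pd.Series(pd.to_datetime([
    "2011-04-27", "2011-04-27", "2011-05-22", "2013-05-20",
    "2013-05-31", "2019-03-03", "2021-12-10",
]))

# Count events per calendar month, keeping months with zero events
monthly_counts = (
    dates.dt.month.value_counts()
    .reindex(range(1, 13), fill_value=0)
)

# Month number (1-12) with the most events
peak_month = int(monthly_counts.idxmax())
print(monthly_counts)
print("busiest month:", peak_month)
```

Calling monthly_counts.plot(kind="bar") would turn this into a quick seasonal chart.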
5.2 Tornado Frequency by State
# Group by state and count occurrences
state_counts = df.groupby('state').size().reset_index(name='count')
# Merge with a states GeoDataFrame (the file name and its 'state_name' column are placeholders)
us_states = gpd.read_file("us_states_shapefile.shp")
# A left merge keeps states with no recorded tornadoes; give them a count of 0
us_states = us_states.merge(state_counts, left_on="state_name", right_on="state", how="left")
us_states['count'] = us_states['count'].fillna(0)
# Plot tornado counts by state
fig, ax = plt.subplots(figsize=(12, 6))
us_states.plot(column='count', cmap='Reds', legend=True, ax=ax, edgecolor='black')
plt.title("Tornado Frequency by State")
plt.show()
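If your dataset has no 'state' column, or its state names do not match the shapefile, a spatial join can assign each point to the polygon that contains it. The sketch below uses two made-up rectangles as stand-ins for state boundaries; the names "West Box" and "East Box" and the event_id column are purely illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Two hypothetical rectangular "states"; real boundaries would come from a shapefile
regions = gpd.GeoDataFrame(
    {"state_name": ["West Box", "East Box"]},
    geometry=[box(-100, 30, -95, 40), box(-95, 30, -90, 40)],
    crs="EPSG:4326",
)

# Hypothetical tornado points; the last one lies outside both rectangles
points = gpd.GeoDataFrame(
    {"event_id": [1, 2, 3, 4]},
    geometry=[Point(-97.5, 35.3), Point(-98.0, 36.0), Point(-92.0, 34.0), Point(-80.0, 35.0)],
    crs="EPSG:4326",
)

# Keep only points that fall within a region, attaching that region's attributes
joined = gpd.sjoin(points, regions, how="inner", predicate="within")

# Count events per region
counts = joined.groupby("state_name").size()
print(counts)
```

Both layers need the same CRS before joining. Points outside every polygon are dropped by the inner join, so compare len(joined) with len(points) to see how many were lost.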
6. Advanced Spatial Analysis
6.1 Kernel Density Estimation (KDE) for Hotspot Detection
import numpy as np
from scipy.stats import gaussian_kde
# Extract tornado coordinates
x, y = gdf.geometry.x, gdf.geometry.y
kde = gaussian_kde(np.vstack([x, y]))
# Evaluate the density on a regular 100 x 100 grid covering the data
xi, yi = np.meshgrid(np.linspace(x.min(), x.max(), 100), np.linspace(y.min(), y.max(), 100))
zi = kde(np.vstack([xi.ravel(), yi.ravel()])).reshape(xi.shape)
# Plot KDE heatmap (the density lives on the grid, so draw it as filled contours)
fig, ax = plt.subplots(figsize=(10, 6))
us_map.plot(ax=ax, color='lightgrey')
ax.contourf(xi, yi, zi, levels=20, cmap="Reds", alpha=0.5)
plt.title("Tornado Hotspot Map")
plt.show()
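Beyond plotting, you can query the density surface directly, for example to locate its peak. This is a minimal sketch on synthetic points (one dense cluster plus scattered background); the cluster center of (-97, 35) and the random seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic data: a tight cluster plus uniform background noise
rng = np.random.default_rng(0)
cluster = rng.normal(loc=[-97.0, 35.0], scale=0.5, size=(200, 2))
background = rng.uniform(low=[-105.0, 30.0], high=[-85.0, 40.0], size=(50, 2))
coords = np.vstack([cluster, background]).T  # shape (2, n): row 0 = lon, row 1 = lat

# Fit the KDE and evaluate it on a 100 x 100 grid spanning the data
kde = gaussian_kde(coords)
lon_grid, lat_grid = np.meshgrid(
    np.linspace(coords[0].min(), coords[0].max(), 100),
    np.linspace(coords[1].min(), coords[1].max(), 100),
)
density = kde(np.vstack([lon_grid.ravel(), lat_grid.ravel()])).reshape(lon_grid.shape)

# Grid cell with the highest estimated density
row, col = np.unravel_index(density.argmax(), density.shape)
peak_lon, peak_lat = lon_grid[row, col], lat_grid[row, col]
print(f"density peak near lon={peak_lon:.2f}, lat={peak_lat:.2f}")
```

Keep in mind that a KDE on raw longitude/latitude treats degrees as equal in both directions, so the result is a rough guide to hotspots, not a precise measurement.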
7. Save and Export Data
You can export processed data for further use:
gdf.to_file("tornado_analysis.geojson", driver="GeoJSON")
Would you like a specific analysis, such as clustering, time series forecasting, or damage assessment? 🚀