Analyze Tornado Data with Python and GeoPandas

 

To analyze tornado data using Python and GeoPandas, follow these steps:

1. Install Dependencies

Ensure you have the required libraries installed:

pip install geopandas pandas numpy scipy matplotlib seaborn shapely folium

2. Load Tornado Dataset

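You'll need a dataset containing tornado records, such as the NOAA Storm Events Database. The examples in this guide assume columns named latitude, longitude, date, and state; rename the columns in your file to match (Storm Events exports, for example, use names such as BEGIN_LAT and BEGIN_LON). Load the file into a Pandas DataFrame: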

import pandas as pd

# Load the tornado dataset (assuming CSV format)
df = pd.read_csv("tornado_data.csv")

# Display basic information
print(df.head())
print(df.info())
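
Real-world storm files often contain missing or malformed coordinates, which would break the geometry step later on. A minimal cleaning sketch follows; it assumes the latitude and longitude column names used in this guide, and the helper name clean_coordinates is hypothetical.

```python
import pandas as pd


def clean_coordinates(df, lat_col="latitude", lon_col="longitude"):
    """Return a copy of df keeping only rows with valid lat/lon values."""
    out = df.copy()
    # Coerce non-numeric entries (e.g. "N/A") to NaN instead of raising
    out[lat_col] = pd.to_numeric(out[lat_col], errors="coerce")
    out[lon_col] = pd.to_numeric(out[lon_col], errors="coerce")
    # Keep rows whose coordinates fall inside the valid geographic range;
    # NaN values fail the comparison and are dropped as well
    valid = out[lat_col].between(-90, 90) & out[lon_col].between(-180, 180)
    return out[valid].reset_index(drop=True)


# Tiny illustrative frame (made-up values, not real tornado records)
sample = pd.DataFrame({
    "latitude": ["35.2", "N/A", "95.0", "41.6"],
    "longitude": ["-97.4", "-90.1", "-88.0", "-93.6"],
})
cleaned = clean_coordinates(sample)
print(cleaned)
```

Running a check like this before building geometries keeps invalid rows from turning into empty or misplaced points.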

3. Convert Data to GeoPandas

GeoPandas allows spatial analysis. Convert the tornado data into a GeoDataFrame:

import geopandas as gpd
from shapely.geometry import Point

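# Drop rows with missing coordinates, then build point geometries.
# points_from_xy is vectorized and much faster than a row-wise apply.
df = df.dropna(subset=['longitude', 'latitude']).copy()
df['geometry'] = gpd.points_from_xy(df['longitude'], df['latitude'])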

# Convert to a GeoDataFrame
gdf = gpd.GeoDataFrame(df, geometry='geometry', crs="EPSG:4326")
print(gdf.head())
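
Once the points live in a GeoDataFrame, spatial operations such as point-in-polygon joins become available. The sketch below is one possible illustration: the coordinates are made up, and the two rectangular regions are hypothetical stand-ins for real state polygons.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Made-up points for illustration only
points_df = pd.DataFrame({
    "event_id": [1, 2, 3],
    "longitude": [-97.5, -96.0, -80.0],
    "latitude": [35.5, 31.0, 40.0],
})
points = gpd.GeoDataFrame(
    points_df,
    geometry=gpd.points_from_xy(points_df["longitude"], points_df["latitude"]),
    crs="EPSG:4326",
)

# Hypothetical rectangular regions standing in for real state polygons
regions = gpd.GeoDataFrame(
    {"region": ["north_box", "south_box"]},
    geometry=[box(-100, 34, -94, 37), box(-100, 29, -94, 33)],
    crs="EPSG:4326",
)

# Inner join: keep only the points that fall within one of the regions
joined = gpd.sjoin(points, regions, how="inner", predicate="within")
counts = joined.groupby("region").size()
print(counts)
```

The same pattern can assign each tornado to a state or county when the source file has no such column, provided both layers share a CRS.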

4. Load a Base Map (US States or Counties)

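To visualize tornado locations, plot them over a base map. The example below uses a country outline; a U.S. states or counties shapefile works the same way:

import matplotlib.pyplot as plt

# Load a world countries layer (Natural Earth low-resolution dataset).
# Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0; on
# newer versions, download the Natural Earth countries file and pass its path.
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))

# Filter for the United States (named "United States of America" in this dataset)
us_map = world[world.name == "United States of America"]

# Plot the base map with tornado locations
ax = us_map.plot(figsize=(10, 6), color='lightgrey', edgecolor='black')
gdf.plot(ax=ax, color='red', markersize=5, alpha=0.6)
plt.show()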
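
When the analysis focuses on the contiguous United States, it can help to filter points to a rough bounding box before plotting. The sketch below shows one way to do that; the box limits are approximate values chosen for illustration, and the helper name within_conus_bbox is hypothetical.

```python
import pandas as pd

# Approximate bounding box for the contiguous United States (an assumption;
# adjust the limits to suit your study area)
LON_MIN, LON_MAX = -125.0, -66.5
LAT_MIN, LAT_MAX = 24.0, 50.0


def within_conus_bbox(df, lat_col="latitude", lon_col="longitude"):
    """Keep rows whose coordinates fall inside the rough bounding box."""
    mask = (
        df[lon_col].between(LON_MIN, LON_MAX)
        & df[lat_col].between(LAT_MIN, LAT_MAX)
    )
    return df[mask]


# Made-up coordinates for illustration only
sample = pd.DataFrame({
    "place": ["plains", "alaska", "hawaii"],
    "latitude": [36.0, 61.2, 21.3],
    "longitude": [-98.0, -149.9, -157.9],
})
kept = within_conus_bbox(sample)
print(kept["place"].tolist())
```

A bounding box is only a coarse filter; clipping against actual boundary polygons is more precise when that matters.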

5. Analyze Tornado Trends

You can explore patterns based on magnitude, seasonality, or geography:

5.1 Tornado Frequency by Year

import seaborn as sns
import matplotlib.pyplot as plt

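# Assumes a 'date' column that pandas can parse
df['year'] = pd.to_datetime(df['date']).dt.year
# discrete=True draws one bar per year, so bin edges never split a year
sns.histplot(df['year'], discrete=True)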
plt.title("Tornado Frequency Over Time")
plt.xlabel("Year")
plt.ylabel("Count")
plt.show()
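
Seasonality can be explored the same way by grouping on the calendar month instead of the year. A minimal sketch, using made-up dates in place of the real date column:

```python
import pandas as pd

# Made-up dates for illustration; real data would come from df['date']
events = pd.DataFrame({
    "date": ["2011-04-27", "2011-04-27", "2011-05-22", "2013-05-20", "2019-12-16"],
})
events["month"] = pd.to_datetime(events["date"]).dt.month

# Count events per calendar month; months with no events are filled with 0
monthly = (
    events["month"]
    .value_counts()
    .reindex(range(1, 13), fill_value=0)
)
print(monthly)
```

Reindexing over all twelve months keeps empty months visible, which matters when the result is plotted as a bar chart.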

5.2 Tornado Frequency by State

# Group by state and count occurrences
state_counts = df.groupby('state').size().reset_index(name='count')

# Merge with a states GeoDataFrame. The file name and the "state_name" column
# are placeholders; a left join keeps states with no recorded tornadoes.
us_states = gpd.read_file("us_states_shapefile.shp")
us_states = us_states.merge(state_counts, how="left", left_on="state_name", right_on="state")
us_states['count'] = us_states['count'].fillna(0)

# Plot tornado counts by state
fig, ax = plt.subplots(figsize=(12, 6))
us_states.plot(column='count', cmap='Reds', legend=True, ax=ax, edgecolor='black')
plt.title("Tornado Frequency by State")
plt.show()
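
The map above shows how many tornadoes each state recorded, not how strong they were. To look at magnitude, one option is to summarize a rating column per state. The sketch below assumes a hypothetical numeric magnitude column (for example an EF rating) and uses made-up records.

```python
import pandas as pd

# Made-up records; 'magnitude' is a hypothetical numeric column (e.g. EF rating)
records = pd.DataFrame({
    "state": ["OK", "OK", "KS", "KS", "KS", "TX"],
    "magnitude": [3, 5, 1, 2, 0, 4],
})

# Per-state event count, mean magnitude, and strongest recorded event
summary = (
    records.groupby("state")["magnitude"]
    .agg(count="count", mean_magnitude="mean", max_magnitude="max")
    .reset_index()
    .sort_values("max_magnitude", ascending=False)
)
print(summary)
```

The resulting table can be merged onto a states GeoDataFrame in the same way as the counts to produce an intensity choropleth.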

6. Advanced Spatial Analysis

6.1 Kernel Density Estimation (KDE) for Hotspot Detection

import numpy as np
from scipy.stats import gaussian_kde

# Extract tornado coordinates (in degrees, so the density is only approximate)
x, y = gdf.geometry.x.to_numpy(), gdf.geometry.y.to_numpy()
kde = gaussian_kde(np.vstack([x, y]))

# Evaluate the density on a regular 100 x 100 grid covering the data
xi, yi = np.meshgrid(np.linspace(x.min(), x.max(), 100), np.linspace(y.min(), y.max(), 100))
zi = kde(np.vstack([xi.ravel(), yi.ravel()])).reshape(xi.shape)

# Plot KDE heatmap over the base map
fig, ax = plt.subplots(figsize=(10, 6))
us_map.plot(ax=ax, color='lightgrey')
ax.pcolormesh(xi, yi, zi, cmap="Reds", alpha=0.5, shading="auto")
plt.title("Tornado Hotspot Map")
plt.show()
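
KDE can be slow on large datasets. A cheaper alternative, which is simple grid binning rather than KDE, is to count points per cell with np.histogram2d and pick the busiest cell. The sketch below uses synthetic points and an arbitrary 1-degree grid, both chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic points: a tight cluster plus scattered background noise
cluster = rng.normal(loc=[-97.5, 35.5], scale=0.3, size=(200, 2))
noise = rng.uniform(low=[-105.0, 30.0], high=[-90.0, 42.0], size=(50, 2))
lon, lat = np.vstack([cluster, noise]).T

# Count points per grid cell (1-degree cells over the sampled extent)
lon_edges = np.arange(-105.0, -89.0, 1.0)
lat_edges = np.arange(30.0, 43.0, 1.0)
counts, _, _ = np.histogram2d(lon, lat, bins=[lon_edges, lat_edges])

# Locate the busiest cell and report its lower-left corner
i, j = np.unravel_index(np.argmax(counts), counts.shape)
hot_lon, hot_lat = lon_edges[i], lat_edges[j]
print(hot_lon, hot_lat, int(counts[i, j]))
```

Grid counts depend on the cell size and where the edges fall, so it is worth trying a couple of resolutions before drawing conclusions.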

7. Save and Export Data

You can export processed data for further use:

gdf.to_file("tornado_analysis.geojson", driver="GeoJSON")
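
After exporting, a quick sanity check on the written file can catch an empty or malformed export. The sketch below reads a GeoJSON file with the standard json module and reports the feature count and geometry types; the helper name summarize_geojson is hypothetical, and the demo writes its own small file rather than relying on tornado_analysis.geojson.

```python
import json
import os
import tempfile


def summarize_geojson(path):
    """Return (feature_count, set_of_geometry_types) for a GeoJSON file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    features = data.get("features", [])
    geom_types = {
        f["geometry"]["type"] for f in features if f.get("geometry")
    }
    return len(features), geom_types


# Self-contained demo with a hand-written FeatureCollection (made-up point)
demo = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"year": 2011},
            "geometry": {"type": "Point", "coordinates": [-97.4, 35.2]},
        }
    ],
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.geojson")
    with open(demo_path, "w", encoding="utf-8") as fh:
        json.dump(demo, fh)
    n_features, geom_types = summarize_geojson(demo_path)

print(n_features, geom_types)
```

Pointing the same helper at the exported file should report one feature per tornado record and a single Point geometry type.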
