The full name of the GHCNd dataset is the Global Historical Climatology Network daily. It is a global observation dataset publicly released by NOAA, with data starting from 1763, containing daily meteorological observation data from a total of 120,000 weather stations worldwide. The monthly summary version of the GHCNd dataset is the Global Historical Climatology Network monthly.
During the weekend, I performed some statistical analysis on this dataset, and I’d like to record the general processing workflow.
First, I synchronized the data. This dataset is stored on NOAA’s own servers as well as on AWS/GCP public datasets. For simplicity, I chose the dataset on AWS. Initially, I used AWS CLI for synchronization, but JuiceFS can also be used.
| |
Subsequently, I performed daily and monthly level data statistical work.
The core approach involves using Polars to calculate the required statistical information within specified time ranges. For example, the daily statistics calculation:
| |
Then I used Matplotlib for visualization:
| Daily Data | Monthly Data | |
|---|---|---|
| Beijing | ![]() | ![]() |
| Shanghai | ![]() | ![]() |
| Tokyo | ![]() | ![]() |
The processing code is open-sourced on GitHub at ringsaturn/ghcn-showcases. Additionally, a static page displaying statistical information for select stations is available on GitHub Pages at ghcn-showcases.
Web Page Preview:
Beijing Display Preview
Shanghai Display Preview
Temperature Comparison of Beijing/Shanghai/Tokyo
Precipitation Comparison of Beijing/Shanghai/Tokyo






