https://www.kaggle.com/datasets/annbengardt/noway-meteorological-data/data
First, the following imports are made in a Jupyter Notebook, a free browser-based environment that ships with the Anaconda distribution and is well suited to data visualization.
NorwayMeteoDemo1.pynb
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
Next, the data is loaded from the dataset mentioned above, which is a CSV file, using Pandas, Python's data analysis library. Two columns are then added to it.
A 14-day moving average of the daily maximum air temperature is added using the rolling method with window set to 14. A date column is added as well: our dataset contains three int64 columns, day, month and year, which we combine to create it.
NorwayMeteoDemo1.pynb
df = pd.read_csv("datasets/weather/NorwayMeteoDataCompleted.csv")
df['moving_max_air_temp_avg'] = df['max(air_temperature P1D)'].rolling(window = 14, min_periods = 1).mean()
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
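Note that min_periods is set to 1 for the moving average. Otherwise the first 13 rows of the new column would be NaN, because by default rolling requires a full window of 14 values before it produces a result. For the same reason, a missing value in the source column would turn every window that contains it into NaN. With min_periods set to 1, the average is computed from however many values are available in the window.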
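The effect of min_periods can be checked on a small series. This is only a sketch: the values below are made up for illustration and are not taken from the dataset, and a window of 3 is used instead of 14 to keep the output short.

```python
import pandas as pd

# Hypothetical toy values, not from the Norway dataset
temps = pd.Series([1.0, 3.0, 5.0, 7.0, 9.0])

# Default: min_periods equals the window size, so the first window - 1 rows are NaN
strict = temps.rolling(window=3).mean()

# min_periods=1: the mean is taken over however many rows are available so far
relaxed = temps.rolling(window=3, min_periods=1).mean()

print(pd.DataFrame({"temps": temps, "strict": strict, "relaxed": relaxed}))
```

The strict column starts with two NaN rows, while the relaxed column has a value in every row.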
Next, we choose which data to display. The demo shows the following.
- Station id: SN69100 (the Værnes - Trondheim Airport weather station)
- Year: 2020
NorwayMeteoDemo1.pynb
filtered_df_2020 = df[
    (df['sourceId'] == 'SN69100') &
    (df['year'] == 2020)
]
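Since the filter on sourceId implies that the file holds several stations, a rolling window computed over the whole frame may straddle two stations at the point where one station's rows end and the next begin. A minimal sketch of a per-station variant follows. It assumes the same column names as above and uses a small made-up frame in place of the CSV; the station id SN00000 and all temperature values are invented.

```python
import pandas as pd

# Hypothetical miniature stand-in for the CSV; SN00000 and all values are invented
df = pd.DataFrame({
    "sourceId": ["SN00000"] * 3 + ["SN69100"] * 3,
    "year": [2020] * 6,
    "month": [1] * 6,
    "day": [1, 2, 3, 1, 2, 3],
    "max(air_temperature P1D)": [10.0, 12.0, 14.0, -2.0, 0.0, 2.0],
})
df["date"] = pd.to_datetime(df[["year", "month", "day"]])

# Sort so each station's rows are in date order, then roll within each station
df = df.sort_values(["sourceId", "date"]).reset_index(drop=True)
df["moving_max_air_temp_avg"] = (
    df.groupby("sourceId")["max(air_temperature P1D)"]
      .transform(lambda s: s.rolling(window=14, min_periods=1).mean())
)

vaernes_2020 = df[(df["sourceId"] == "SN69100") & (df["year"] == 2020)]
print(vaernes_2020[["date", "max(air_temperature P1D)", "moving_max_air_temp_avg"]])
```

With the grouping in place, the first averages for SN69100 are built only from that station's own rows and not from the tail of the station listed before it.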
Next, the data is displayed. The demo draws two line plots with Seaborn in the upper panel: the daily maximum air temperature and its 14-day moving average.
A bar plot of the daily precipitation is added below these two line plots. Note that the function used here is bar and not barplot: barplot belongs to Seaborn, while bar belongs to Matplotlib.
NorwayMeteoDemo1.pynb
sns.set_style('whitegrid') # set the style to 'whitegrid'
# Create a 2x1 subplot
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 10), sharex=True)
# Upper panel: daily maximum and its 14-day moving average, columns referenced by name
sns.lineplot(data=filtered_df_2020, x='date', y='max(air_temperature P1D)',
             label='Daily max air temperature (C)', linewidth=1.5, color='pink', ax=ax1)
sns.lineplot(data=filtered_df_2020, x='date', y='moving_max_air_temp_avg',
             label='14-day moving average Daily max air temperature (C)', color='red', linewidth=1.8, ax=ax1)
# Lower panel: Matplotlib bar plot of daily precipitation
ax2.bar(filtered_df_2020['date'], filtered_df_2020['sum(precipitation_amount P1D)'],
        color='blue', label='Daily sum precipitation (mm)')
ax1.set_title('Værnes - Weather data - 2020')
ax1.set_xlabel('Date of year')
ax1.set_ylabel('Daily max air temperature (C)')
ax2.set_xlabel('Date of year')
ax2.set_ylabel('Daily sum precipitation (mm)')
The code above produces a figure with a 2x1 subplot layout. The upper plot shows the 2020 daily maximum air temperature together with its 14-day moving average, which acts as a smoothing function and brings out the general temperature trend over two-week periods.
The lower plot is a bar plot drawn with Matplotlib rather than Seaborn, since Seaborn's barplot treats the x-axis as categorical by default and so does not suit a date axis (in our dataset, the date column is datetime64).
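As a minimal sketch of the Matplotlib side, the snippet below draws bars against datetime64 values and formats the ticks as month labels. The precipitation values are made up, and the Agg backend is selected only so that the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Invented daily precipitation values (mm), for illustration only
dates = pd.date_range("2020-01-01", periods=90, freq="D")
precip = pd.Series([(i * 7) % 11 for i in range(90)], index=dates, dtype="float64")

fig, ax = plt.subplots(figsize=(10, 4))
bars = ax.bar(precip.index, precip.values, color="blue",
              label="Daily sum precipitation (mm)")

# Matplotlib converts datetime64 values to date numbers,
# so date locators and formatters can be applied to the axis
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
ax.set_ylabel("Daily sum precipitation (mm)")
ax.legend()
```

Each bar sits at its actual date on a continuous time axis, which is what lets the lower plot share its x-axis with the line plots above it.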
Seaborn and Matplotlib offer a ton of plotting functionality. Maybe it could also be of interest for .NET developers to use them more often? My previous article showed how images can be rendered in the backend using Matplotlib and then displayed in a Blazor server-side app, combining Python and .NET. That demo used the Python.NET library for the Python interop with .NET. The screenshot shows the Jupyter Notebook IDE, part of the Anaconda distribution of Python tailored for data analysis and data science. It displays the plots produced by the Python script above.
