
Workshop 2: Pandas, xarray, matplotlib, CartoPy, and Documentation

This notebook introduces the basic functionality of some of the more important data science and analysis packages used in Atmospheric Science.

#### IMPORTS GO HERE ####
# STANDARD LIBRARIES #
import timeit # This is used to time code; we saw it last time
import datetime as dt # This is used to handle dates
# PUBLIC LIBRARIES #
import cartopy.crs as ccrs # This lets us easily work with map projections
import cartopy.feature as cfeature # This adds political/geographic features to maps
import matplotlib.pyplot as plt # This lets us make plots easily
import matplotlib.ticker as mticker # Extends plotting functionality
import numpy as np # This lets us have access to numpy arrays and other tools
import pandas as pd # This gives us access to powerful data manipulation tools
import xarray as xr # Also gives us access to powerful data manipulation tools

The ability to manipulate and visualize data is incredibly important in science. We’re going to introduce some of the packages most commonly used for this.

We already saw some of NumPy in the last workshop. The packages we’re looking at today are built on top of NumPy.

We’re going to look at four packages:

  • Pandas
  • XArray
  • Matplotlib
  • Cartopy

We’re going to look at pandas first. The content of this tutorial is based partially on the official pandas tutorial.

Pandas is based around two different data structures: 1.) The series, 2.) The dataframe

# A series in pandas is a type of 1-D array, similar to a numpy array, except with a few
# interesting extensions in functionality
ex_series = pd.Series(np.random.random(10)*10)
# When we print out a Pandas series we see not only the values of the series
# but also the index used to access data within the array
# This index is customizable and we will explore that feature later
print(ex_series)
# We can access elements of a pandas series using the index
print(f'Indexing a series example: {ex_series[0]:.3f}, {ex_series[9]:.3f}')
# However, we can't use labels that are not in the index to access values in a series
# Uncomment the line below to get an error
# print(ex_series[-1])
# If we want to make use of python's indexing we have to use .iloc[]
print(f'Indexing a series with iloc example:\n{ex_series.iloc[-3:]}')
# We can also access pieces of a series using boolean operators
ex_series_bool_ops = pd.Series(np.arange(0,25,1))
print(f'Using Boolean Operators on a series example:\n{ex_series_bool_ops[ex_series_bool_ops > 20]}')
# We can also apply mathematical operations onto a series
print(f'Applying math to a series:\n{np.sqrt(ex_series_bool_ops).iloc[-5:]}')
# This however doesn't change the underlying series, we would need to
# either save the modified series to a new variable or overwrite our previous one
print(f'The original series is still there:\n{ex_series.iloc[-5:]}')
# The index of a Pandas Series can be set to lots of things, one common index is datetime objects
# The line below generates a series with values corresponding to a week of dates
ex_series_cust_index = pd.Series(np.array([85,96,94,92,96,98,88]),index = np.array([dt.datetime(2024,8,1)+dt.timedelta(days = i) for i in range(7)]))
print(ex_series_cust_index)
# We can either access the values in this series by the index we just created (datetime objects)
# or with .iloc
print(f'Accessing custom index via index: {ex_series_cust_index[dt.datetime(2024,8,4)]}')
print(f'Accessing custom index via iloc: {ex_series_cust_index.iloc[-1]}')
# Pandas Dataframes are a collection of multiple Pandas Series with a common index
# This example generates a data frame via combining multiple series, numpy arrays, or lists
# In a later workshop we'll explore generating a Dataframe from a file
df_size = 5000
df_series_a = pd.Series(np.random.normal(65,10,df_size))
df_series_b = pd.Series(np.random.uniform(0,100,df_size))
# When generating a Dataframe in this way the preferred way is to call the pd.DataFrame()
# function on a dictionary. This lets us access items in the dataframe via the key from
# the dictionary as well as the index (via .iloc)
example_df = pd.DataFrame({'Temp':df_series_a,'RH':df_series_b})
print(example_df[:10])
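# Double quotes on the outside let us use single quotes inside the f-string
# (reusing the same quote type inside an f-string only works on Python 3.12+)
print(f"Accessing dataframe values via Column Name:\n{example_df['Temp']}")
# .loc[] selects rows by index label and includes the end label (rows 2 through 5)
print(f'Accessing dataframe values via Pandas Index:\n{example_df.loc[2:5]}')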
print(f'Accessing dataframe values via Python Index:\n{example_df.iloc[2:5]}')
# With dataframes we can use methods such as .where() and .query() to access
# specific pieces of data within the dataframe
print('Accessing Dataframe via .where():')
print(example_df.where(example_df['Temp'] <= example_df['RH']))
print('Accessing Dataframe via .query():')
print(example_df.query('Temp <= RH'))
# Dataframes can be modified, the values within columns can be changed
# and new columns can be added. One caveat however is that dataframes
# do not like new rows being added, they can get wider but not longer
# The line below adds an empty column for DewPt temperature
example_df['DewPt'] = np.array([np.nan]*df_size)
# We can use the .head() method to look at the first 5 rows of a dataframe
example_df.head()
# Similarly we can use .tail() to look at the last 5 rows
example_df.tail()
# These are the functions we wrote in Workshop 1
# We're going to use these to both calculate
# dew point temperature and add a celsius temperature column
def convert_f_to_c(temp:float) -> float:
        '''
            This function converts temperatures in Fahrenheit to Celsius.

            Parameters:
                - temp (float): The temperature in fahrenheit

            Outputs:
                - conv_temp (float): The temperature in celsius
        '''

        conv_temp = 5*(temp - 32)/9

        return conv_temp

def calculate_gamma(temp,rh):
    '''
        This function calculates the gamma term needed to compute the
        dew point temperature using the Magnus formula

        Parameters:
            - temp (float): The temperature in celsius
            - RH (float): The relative humidity in %
        
        Returns:
            - gamma (float): The value of gamma that we calculated
            - b (float): An empirical constant for the Magnus formula
            - c (float): An empirical constant for the Magnus formula
    '''
    b = 17.625
    c = 243.04
    gamma = np.log(rh/100) + (b*temp)/(c+temp)
    
    return gamma, b, c

def calculate_dewpoint(temp,rh):
    '''
        This function calculates the dew point temperature using the
        Magnus formula assuming that the input temperature is in Fahrenheit

        Parameters:
            - temp (float): The temperature in Fahrenheit
            - RH (float): The relative humidity in %
        
        Outputs:
            - dew_pt_temp (float): The calculated dewpoint temperature in Celsius
    '''
    #Convert input temp to Celsius
    c_temp = convert_f_to_c(temp)
    #Calculate gamma and get the empirical constants
    gamma,b,c = calculate_gamma(c_temp,rh)
    #Put it all together
    dew_pt_temp = (c*gamma) / (b - gamma)

    return dew_pt_temp
# This applies our functions to our example dataframe
example_df['Temp_degC'] = convert_f_to_c(example_df['Temp'])
example_df['DewPt'] = calculate_dewpoint(example_df['Temp'],example_df['RH'])
# We can also easily get statistics on values stored in dataframes
example_df.describe()
# If we're interested in specific columns we can also subset onto just one column or multiple in pandas
# This code below selects just the Temp_degC and DewPt columns and calls .describe() on them
# Notice the double square brackets
example_df[['DewPt','Temp_degC']].describe()
# Pandas also lets us make a few simple plots of data quickly
# and easily. Later we'll explore Matplotlib, the main plotting
# package for Python and what Pandas is using to make these
# simple plots
# The line below makes simple histograms of the data
example_df.hist()
plt.show() # This just hides various messages from plot creation
# The line below makes simple scatter plots of the columns
pd.plotting.scatter_matrix(example_df,color = 'k',figsize = (8,8))
plt.show() # This just hides various messages from plot creation
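The difference between `.where()` and `.query()` is easier to see on a tiny dataframe. The sketch below uses made-up values (the name `small_df` and its numbers are illustrative, not part of the workshop data): `.where()` keeps every row and fills the rows that fail the condition with NaN, while `.query()` returns only the rows that pass.

```python
import numpy as np
import pandas as pd

# Illustrative values only; not part of the workshop dataset
small_df = pd.DataFrame({'Temp': [60.0, 75.0, 50.0, 90.0],
                         'RH':   [80.0, 40.0, 50.0, 95.0]})

# .where() keeps the original shape; rows that fail the test become NaN
masked = small_df.where(small_df['Temp'] <= small_df['RH'])
print(masked)

# .query() drops the rows that fail the test and keeps the original index labels
filtered = small_df.query('Temp <= RH')
print(filtered)

# Dropping the NaN rows from the .where() result recovers the .query() result
print(masked.dropna().equals(filtered))
```

Which one to use depends on whether the rows need to stay aligned with the original dataframe (`.where()`) or whether only the matching rows are needed (`.query()`).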

Pandas has lots of plotting functionality. The visualization section of the pandas documentation describes how to make the various plot types.

The next package we’re going to look at is XArray. XArray is a package designed to make working with N-dimensional data easier, and it is widely used in atmospheric science.

The main reason XArray is used widely in our field is that it works well with netCDF, a file format developed by Unidata (part of UCAR) to store large amounts of data from observations, reanalyses, and climate models easily and efficiently.

We will cover netCDF files in detail in the final workshop. Here we’ll introduce basic ideas of how to use XArray and how to look at data with it.

Just as Pandas has Series and DataFrame objects, XArray has DataArray and Dataset objects.
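To make the relationship between the two objects concrete, here is a minimal sketch that builds both by hand. The coordinate values, variable names (`temp`, `demo_ds`, `air_degC`), and temperatures are all made up for illustration.

```python
import numpy as np
import xarray as xr

# Illustrative coordinates and values; the names below are hypothetical
lat = np.array([40.0, 35.0, 30.0])
lon = np.array([250.0, 255.0])

# A DataArray is one N-dimensional variable with named dimensions and coordinates
temp = xr.DataArray(
    np.array([[280.0, 281.0], [285.0, 286.0], [290.0, 291.0]]),
    dims=('lat', 'lon'),
    coords={'lat': lat, 'lon': lon},
    name='air',
    attrs={'units': 'K'},
)

# A Dataset is a dictionary-like collection of DataArrays that share coordinates
demo_ds = xr.Dataset({'air': temp, 'air_degC': temp - 273.15})
print(demo_ds)

# Pulling a variable out of a Dataset gives back a DataArray
print(type(demo_ds['air']))
print(demo_ds['air'].sel(lat=35.0, lon=255.0).item())
```

This mirrors the pandas picture: a DataArray plays the role of a Series, and a Dataset plays the role of a DataFrame that holds several of them.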

# This line downloads a temperature dataset from NCEP,
# the National Centers for Environmental Prediction
ds = xr.tutorial.load_dataset("air_temperature")
print(1)
# In a notebook environment just typing the name of an XArray
# DataSet or DataArray will pull up this informational panel
ds
# Like pandas variables/coordinates in an XArray dataset are accessed by their name
ds['air']
# data can also be accessed using 'dot' notation
ds.air
# XArray is built on pandas so it inherits a lot of the same features
# This selects all data at 40N, 200E
# Pandas has a similar function that we didn't talk about but is detailed
# in the documentation for pandas
ds.sel(lat=40,lon = 200)
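# We can also use the slice() function to get values within certain ranges
# This selects data over CONUS. Latitude in this dataset is stored from north
# to south, so the slice bounds have to be given in that same order (50 to 22);
# slice(22,50) would return an empty selection
ds.sel(lat=slice(50,22),lon=slice(220,320))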
# We can also use position based indexing
ds.air.data[:,0,0]
# functions called on XArray data can return XArray data-arrays
# This gives us the mean across the entire field for each time step
# Like pandas, unless we save the result back to the DataSet we are
# not overwriting our original data
np.mean(ds.air,axis = (1,2))
# Functions can also easily be applied to XArray DataArrays
# Below is a very simple function to convert temperature units
def convert_K_to_C(temp_data:xr.DataArray) -> xr.DataArray:
    '''
        Converts Kelvin temps stored in an XArray Data Array into
        Degrees Celsius as an XArray Data Array

        Parameters:
            - temp_data (xr.DataArray): The temps in Kelvin to convert
                to Celsius
        
        Outputs:
            - conv_data (xr.DataArray): The temps in Celsius
    '''

    conv_data = temp_data - 273.15

    return conv_data

# This is how easy it is to apply the function to the DataArray
convert_K_to_C(ds.air)
# XArray also has very helpful high level functions for grouping together data
# This line gives us the data grouped by season and takes the mean within each
# season, all in a single line! The seasons come back in alphabetical order:
# DJF, JJA, MAM, SON
season_mean = ds.groupby("time.season").mean()
season_mean
# Like pandas XArray has built in plotting capabilities
season_mean.air.plot(col='season',col_wrap=2)
plt.show()
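Label-based selection with `.sel()` and position-based selection with `.isel()` behave differently enough that a small example helps. The sketch below uses a made-up grid (the name `grid` and its values are hypothetical) and assumes a latitude coordinate stored north to south, which is how some gridded datasets are laid out.

```python
import numpy as np
import xarray as xr

# Hypothetical grid with latitude stored north-to-south (descending)
lat = np.array([60.0, 50.0, 40.0, 30.0, 20.0])
lon = np.array([220.0, 260.0, 300.0])
grid = xr.DataArray(
    np.arange(15.0).reshape(5, 3),
    dims=('lat', 'lon'),
    coords={'lat': lat, 'lon': lon},
)

# Label-based: slice bounds follow the stored order, so north comes first here
north_to_south = grid.sel(lat=slice(50, 30))
print(north_to_south.lat.values)

# Reversing the bounds on a descending coordinate selects nothing
empty = grid.sel(lat=slice(30, 50))
print(empty.lat.size)

# Position-based: .isel() takes integer positions, like .iloc in pandas
first_row = grid.isel(lat=0)
print(first_row.values)

# method='nearest' picks the closest grid point to a label that is not on the grid
nearest = grid.sel(lat=42.0, lon=255.0, method='nearest')
print(float(nearest.lat), float(nearest.lon))
```

Checking the order of a coordinate (for example by printing `grid.lat.values`) before slicing is a quick way to avoid an unexpectedly empty selection.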

This only scratches the surface of XArray. It is an incredibly powerful package and there is a lot you can do with it. I recommend looking through the XArray documentation to get a feel for what is possible.

The final packages we’re going to look at are Matplotlib and Cartopy. These are packages designed to help us visualize data and plot data onto maps of the Earth respectively.

We’ve seen some plotting from Pandas and XArray; both of those use Matplotlib behind the scenes to make their plots. By using Matplotlib directly, however, we gain more features and customizability than either Pandas or XArray provides.

Matplotlib code is verbose compared to Pandas/XArray because it wants you to state explicitly what you want in your plot. We’re going to plot some simple data now and slowly introduce the various features of Matplotlib.

# Generate some simple line data and plot
example_xs = np.linspace(-10,10,20)
example_ys = 2 * example_xs + 4
# These two lines make the plot and show it
plt.plot(example_xs,example_ys)
plt.show()
# This plot is really simple looking but it gets the job done
# it shows that our data is linear. Let's spice up the plot though
# and add some more things.
# The first thing we're going to add are labels for our various axes
# as well as a title
plt.plot(example_xs,example_ys)
plt.xlabel('Example X-Values')
plt.ylabel('Example Y-Values')
plt.title('Example Title')
plt.show()
# This plot is a little hard to read, let's add some grid lines so we 
# can see things like the X and Y intercepts
plt.plot(example_xs,example_ys)
plt.xlabel('Example X-Values')
plt.ylabel('Example Y-Values')
plt.title('Example Title')
plt.grid()
plt.show()
# We can also add other lines to this plot
example_xs2 = np.linspace(-10,10,50)
example_ys2 = 10*np.sin(example_xs2)

plt.plot(example_xs,example_ys)
plt.plot(example_xs2,example_ys2)
plt.xlabel('Example X-Values')
plt.ylabel('Example Y-Values')
plt.title('Example Title')
plt.grid()
plt.show()
# Let's customize our lines and add a legend

plt.plot(example_xs,example_ys,color = 'k',linestyle = ':',linewidth = 3,label = 'Example 1')
plt.plot(example_xs2,example_ys2,color = 'r',linewidth = 3,label = 'Example 2')
plt.xlabel('Example X-Values')
plt.ylabel('Example Y-Values')
plt.title('Example Title')
plt.grid()
plt.legend(loc = 'best')
plt.show()
# We can also make multiple plots on the same figure using the plt.subplots() command
fig,ax = plt.subplots(nrows = 1,ncols = 2,figsize = (8,4)) # this makes a figure with two side-by-side subplots
# Because we're working with subplots we have to interact with the ax variable we just created
# this contains both of the subplots we just made as ax[0] and ax[1]
ax[0].plot(example_xs,example_ys,color = 'k',linestyle = ':',linewidth = 3,label = 'Example 1')
ax[1].plot(example_xs2,example_ys2,color = 'r',linewidth = 3,label = 'Example 2')
# Because the lines were plotted to a specific subplot the other subplot doesn't know they exist
# the legend we add to ax[0] will only have the Example 1 line
ax[0].legend(loc = 'best')
plt.show()
fig,ax = plt.subplots(nrows = 1,ncols = 2,figsize = (8,3)) # this makes a figure with two side-by-side subplots
#make data for the scatter plot
scatter_ex_xs = np.random.normal(0,1,250) # this makes normally distributed data
scatter_ex_ys = np.random.normal(-20,20,250)
hist_ex = np.random.normal(0,2.5,2500)
# the .scatter() method makes a scatter plot
ax[0].scatter(scatter_ex_xs,scatter_ex_ys,color = 'Green',marker = '^',alpha = 0.8)
# the .hist() method makes a histogram
ax[1].hist(hist_ex,color = 'Purple',rwidth = 0.8,bins = np.arange(-10,11,1))
# Because we're working with axes objects we have to use .set_title() instead of plt.title()
ax[0].set_title('Example Scatter Plot')
ax[1].set_title('Example Histogram')
plt.show()
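Most of the `plt.` functions used earlier have an equivalent method on the axes object, usually with a `set_` prefix. As a rough sketch, here is the labeled line plot from above rebuilt entirely through the axes object, with the matching `plt.` call noted next to each line; the variable names `xs`, `ys`, and `line` are chosen just for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

xs = np.linspace(-10, 10, 20)
ys = 2 * xs + 4

fig, ax = plt.subplots(figsize=(5, 4))
line, = ax.plot(xs, ys, color='k', linestyle=':', linewidth=3, label='Example 1')
ax.set_xlabel('Example X-Values')     # plt.xlabel(...)
ax.set_ylabel('Example Y-Values')     # plt.ylabel(...)
ax.set_title('Example Title')         # plt.title(...)
ax.set_xlim(-10, 10)                  # plt.xlim(...)
ax.grid()                             # plt.grid()
ax.legend(loc='best')                 # plt.legend(...)
# In a notebook the figure renders when the cell finishes; in a script call plt.show()
```

Working through the axes object is what makes multi-panel figures manageable, since every call states which subplot it applies to.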

Those plots were relatively simple. A lot of our data has geographic components which makes it more complicated to visualize.

Let’s make some plots of variables on the earth using the data we got earlier.

# This code makes a four-panel plot similar to the seasonal mean one that XArray
# generated for us automatically, notice that it takes decently more than 1 line

# This makes the base of our figure as a collection of sub-plots
fig,ax = plt.subplots(nrows = 2, ncols = 2,figsize = (8,8)) # this makes a 2x2 plot that is 8 inches x 8 inches in size
# These are various parameters for our plot that we want
cmin = 250
cmax = 300

# These are the actual plots
pc0 = ax[0,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[0,:,:],vmin = cmin,vmax = cmax)
pc1 = ax[0,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[1,:,:],vmin = cmin,vmax = cmax)
pc2 = ax[1,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[2,:,:],vmin = cmin,vmax = cmax)
pc3 = ax[1,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[3,:,:],vmin = cmin,vmax = cmax)

# This makes a colorbar axis, and adds it to the plot
cax = fig.add_axes([0.92,0.15,0.03,0.7])
cbar = plt.colorbar(pc0,cax=cax,label = 'Sigma Level 995 Air Temp [Kelvin]')

# This modifies the spacing of our plots
plt.subplots_adjust(hspace = 0.3,wspace = 0.2,top = 0.9,bottom = 0.1)

# This adds axis labels to each of our plots
for row in ax:
    for item in row:
        item.set_xlabel('Longitude [Degrees East]')
        item.set_ylabel('Latitude [Degrees North]')

# Sets the title for the plots
ax[0,0].set_title('DJF')
ax[0,1].set_title('JJA')
ax[1,0].set_title('MAM')
ax[1,1].set_title('SON')

# Call plt.show() to suppress outputs from the final method call
plt.show()
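The four `pcolormesh` calls and the four hard-coded titles can also be written as a loop, which keeps each title tied to the season actually plotted in that panel. The sketch below is one possible way to do this; it uses a small synthetic dataset (random values on a made-up 3 x 4 grid, named `demo_season_mean` here) in place of the NCEP data so that it runs on its own.

```python
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt

# Hypothetical stand-in for the tutorial data: one year of daily values on a tiny grid
time = pd.date_range('2013-01-01', '2013-12-31', freq='D')
lat = np.array([50.0, 40.0, 30.0])
lon = np.array([230.0, 250.0, 270.0, 290.0])
rng = np.random.default_rng(0)
air = xr.DataArray(
    280 + 10 * rng.random((time.size, lat.size, lon.size)),
    dims=('time', 'lat', 'lon'),
    coords={'time': time, 'lat': lat, 'lon': lon},
    name='air',
)
demo_season_mean = air.groupby('time.season').mean()

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))
# axes.flat walks the 2x2 grid row by row; zip pairs each panel with one season label
for panel, season in zip(axes.flat, demo_season_mean.season.values):
    mesh = panel.pcolormesh(
        demo_season_mean.lon.values,
        demo_season_mean.lat.values,
        demo_season_mean.sel(season=season).values,
        vmin=250, vmax=300,
    )
    # The title comes from the data, so it cannot drift out of sync with the panel
    panel.set_title(str(season))
    panel.set_xlabel('Longitude [Degrees East]')
    panel.set_ylabel('Latitude [Degrees North]')

# One colorbar shared by all four panels (they use the same vmin/vmax)
fig.colorbar(mesh, ax=axes, label='Air Temp [Kelvin]')
```

Taking the titles from `season.values` instead of typing them by hand avoids mislabeling a panel if the season order is not what was expected.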

While that took several more lines than the single line we used with XArray, we can now start customizing this plot and adding features that make it more meaningful and useful to look at. The first things we’re going to add are geographic features and borders with CartoPy.

# This code makes a four-panel plot similar to the seasonal mean one that XArray
# generated for us automatically, notice that it takes decently more than 1 line

# This makes the base of our figure as a collection of sub-plots
fig,ax = plt.subplots(nrows = 2, ncols = 2,figsize = (8,8),subplot_kw={'projection':ccrs.PlateCarree()}) # this makes a 2x2 plot that is 8 inches x 8 inches in size
# These are various parameters for our plot that we want
cmin = 250
cmax = 300

# These are the actual plots
pc0 = ax[0,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[0,:,:],vmin = cmin,vmax = cmax)
pc1 = ax[0,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[1,:,:],vmin = cmin,vmax = cmax)
pc2 = ax[1,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[2,:,:],vmin = cmin,vmax = cmax)
pc3 = ax[1,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[3,:,:],vmin = cmin,vmax = cmax)

# This makes a colorbar axis, and adds it to the plot
cax = fig.add_axes([0.92,0.25,0.03,0.5])
cbar = plt.colorbar(pc0,cax=cax,label = 'Sigma Level 995 Air Temp [Kelvin]')

# This modifies the spacing of our plots
plt.subplots_adjust(hspace = -0.5,wspace = 0.2,top = 0.9,bottom = 0.1)

# This adds labels to our various plots, and sets the X/Y ticks
for row in ax:
    for item in row:
        # Adds labels to each plot
        item.set_xlabel('Longitude [Degrees West]')
        item.set_ylabel('Latitude [Degrees North]')
        # Adds ticks to each plot
        gl = item.gridlines(crs = ccrs.PlateCarree(),draw_labels = True,)
        gl.xlines = False
        gl.ylines = False
        gl.top_labels = False
        gl.right_labels = False
        gl.xlocator = mticker.FixedLocator([-160,-120,-80,-40])
        # Adds coastlines
        item.coastlines()
        # Adds country borders
        item.add_feature(cfeature.BORDERS)
        # Adds Country subdivisions (e.g. States/Provinces)
        item.add_feature(cfeature.NaturalEarthFeature(category='cultural', 
    name='admin_1_states_provinces_lines', scale='10m', facecolor='none', edgecolor='k'))


# Sets the title for the plots
ax[0,0].set_title('DJF')
ax[0,1].set_title('JJA')
ax[1,0].set_title('MAM')
ax[1,1].set_title('SON')
# Call plt.show() to suppress outputs from the final method call
plt.show()

This plot is more readable: we can see the air temperature over land vs. ocean, and at specific locations, much more easily. Now let’s zoom in on the US.

# This code makes a four-panel plot similar to the seasonal mean one that XArray
# generated for us automatically, notice that it takes decently more than 1 line

# This makes the base of our figure as a collection of sub-plots
fig,ax = plt.subplots(nrows = 2, ncols = 2,figsize = (8,8),subplot_kw={'projection':ccrs.PlateCarree()}) # this makes a 2x2 plot that is 8 inches x 8 inches in size
# These are various parameters for our plot that we want
cmin = 250
cmax = 300

# These are the actual plots
pc0 = ax[0,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[0,:,:],vmin = cmin,vmax = cmax)
pc1 = ax[0,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[1,:,:],vmin = cmin,vmax = cmax)
pc2 = ax[1,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[2,:,:],vmin = cmin,vmax = cmax)
pc3 = ax[1,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[3,:,:],vmin = cmin,vmax = cmax)

# This makes a colorbar axis, and adds it to the plot
cax = fig.add_axes([0.92,0.25,0.03,0.5])
cbar = plt.colorbar(pc0,cax=cax,label = 'Sigma Level 995 Air Temp [Kelvin]')

# This modifies the spacing of our plots
plt.subplots_adjust(hspace = -0.5,wspace = 0.2,top = 0.9,bottom = 0.1)

# This adds labels to our various plots, and sets the X/Y ticks
for row in ax:
    for item in row:
        # Adds labels to the plots
        item.set_xlabel('Longitude [Degrees West]')
        item.set_ylabel('Latitude [Degrees North]')
        # Adds tickmarks to the plots
        gl = item.gridlines(crs = ccrs.PlateCarree(),draw_labels = True,color = 'k',alpha = 0.3)
        gl.top_labels = False
        gl.right_labels = False
        gl.xlocator = mticker.FixedLocator([-160,-120,-80,-40])
        # Adds coastlines
        item.coastlines()
        # Adds country borders
        item.add_feature(cfeature.BORDERS)
        # Adds Country subdivisions (e.g. States/Provinces)
        item.add_feature(cfeature.NaturalEarthFeature(category='cultural', 
    name='admin_1_states_provinces_lines', scale='10m', facecolor='none', edgecolor='k'))
        # These lines zoom us in onto the US
        item.set_xlim(-130,-60)
        item.set_ylim(25,50)


# Sets the title for the plots
ax[0,0].set_title('DJF')
ax[0,1].set_title('JJA')
ax[1,0].set_title('MAM')
ax[1,1].set_title('SON')

# Call plt.show() to suppress outputs from the final method call
plt.show()

Now let’s change the colormap to make the temperature variations easier to see. Matplotlib has lots of colormaps, as well as the ability to make your own. See the Matplotlib colormap reference for a list of the built-in options.

Try the following colormaps to gauge how they impact the data and its presentation:

  • viridis
  • seismic
  • bwr
  • turbo
  • PiYG

# This code makes a four-panel plot similar to the seasonal mean one that XArray
# generated for us automatically, notice that it takes decently more than 1 line

# This makes the base of our figure as a collection of sub-plots
fig,ax = plt.subplots(nrows = 2, ncols = 2,figsize = (8,8),subplot_kw={'projection':ccrs.PlateCarree()}) # this makes a 2x2 plot that is 8 inches x 8 inches in size
# These are various parameters for our plot that we want
cmin = 250
cmax = 300
cmap = 'turbo'

# These are the actual plots
pc0 = ax[0,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[0,:,:],vmin = cmin,vmax = cmax,cmap = cmap)
pc1 = ax[0,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[1,:,:],vmin = cmin,vmax = cmax,cmap = cmap)
pc2 = ax[1,0].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[2,:,:],vmin = cmin,vmax = cmax,cmap = cmap)
pc3 = ax[1,1].pcolormesh(season_mean.lon,season_mean.lat,season_mean.air[3,:,:],vmin = cmin,vmax = cmax,cmap = cmap)

# This makes a colorbar axis, and adds it to the plot
cax = fig.add_axes([0.92,0.25,0.03,0.5])
cbar = plt.colorbar(pc0,cax=cax,label = 'Sigma Level 995 Air Temp [Kelvin]')

# This modifies the spacing of our plots
plt.subplots_adjust(hspace = -0.5,wspace = 0.2,top = 0.9,bottom = 0.1)

# This adds labels to our various plots, and sets the X/Y ticks
for row in ax:
    for item in row:
        # Adds labels to each plot
        item.set_xlabel('Longitude [Degrees West]')
        item.set_ylabel('Latitude [Degrees North]')
        # Adds grid lines to better orient ourselves E/W & N/S
        gl = item.gridlines(crs = ccrs.PlateCarree(),draw_labels = True,color = 'k',alpha = 0.3)
        gl.top_labels = False
        gl.right_labels = False
        gl.xlocator = mticker.FixedLocator([-160,-120,-80,-40])
        # Adds coastlines
        item.coastlines()
        # Adds country borders
        item.add_feature(cfeature.BORDERS)
        # Adds Country subdivisions (e.g. States/Provinces)
        item.add_feature(cfeature.NaturalEarthFeature(category='cultural', 
    name='admin_1_states_provinces_lines', scale='10m', facecolor='none', edgecolor='k'))
        # These lines zoom us in onto the US
        item.set_xlim(-130,-60)
        item.set_ylim(25,50)


# Sets the title for the plots
ax[0,0].set_title('DJF')
ax[0,1].set_title('JJA')
ax[1,0].set_title('MAM')
ax[1,1].set_title('SON')

# Call plt.show() to suppress outputs from the final method call
plt.show()

In the examples above we have been using the Plate Carree map projection but there are others. The example below initializes the plots with two different map projections to show how this impacts the appearance of the plots. Map projections carry a lot of calculations behind the scenes and can sometimes make plotting slow.

This is only a very small subset of the map projections available in Cartopy; you can even make a custom projection if you want to. Which projection is best depends entirely on your data and what you’re plotting. I recommend using whichever projection your advisor or other group members recommend.

# Initialize the figure
fig = plt.figure(figsize = (12, 7))
# add subplots to the figure with our desired projections
# Mollweide Projection
ax00 = plt.subplot(2,2,1,projection=ccrs.Mollweide())
ax00.coastlines()
ax00.set_title('Mollweide Projection')
# Stereographic
ax01 = plt.subplot(2,2,2,projection=ccrs.Stereographic())
ax01.coastlines()
ax01.set_title('Stereographic Projection')
# Plate Carree
ax10 = plt.subplot(2,2,3,projection=ccrs.PlateCarree())
ax10.coastlines()
ax10.set_title('Plate Carree Projection')
# Interrupted Goode Homolosine
ax11 = plt.subplot(2,2,4,projection=ccrs.InterruptedGoodeHomolosine())
ax11.coastlines()
ax11.set_title('Interrupted Goode Homolosine')

plt.show()

Congrats! You have now covered the basics of Pandas, XArray, Matplotlib, and Cartopy! The next notebook contains a few examples for us to work through together.