Download Climate Change Data (2006-2096) on Daily Scale from NASA NCCS Server with Python - Tutorial
There are many Global Circulation Models (GCMs) with historical and future data for precipitation, maximum temperature and minimum temperature under different emission scenarios. The data is available on a daily timescale from particular servers. In this tutorial we show the main characteristics of the NASA NCCS THREDDS Data Server, which provides the NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP) dataset. This dataset covers two of the four greenhouse gas emissions scenarios. Its data is available from 1950 to 2100, split into historical and future periods, with a spatial resolution of 0.25 degrees (~25 km x 25 km). The tutorial shows the main parts of the web server and Python scripts that locate the closest model cell and download the records group by group.
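As a quick illustration of what the 0.25-degree resolution means in practice, the sketch below computes the index of the grid cell that contains a point using plain arithmetic. It assumes the grid layout used later in the tutorial's script (720 x 1440 cells, with cell centers starting at -89.875 degrees latitude and 0.125 degrees east longitude). The function names are hypothetical and are not part of any NASA tool.

```python
import math

RESOLUTION = 0.25          # degrees, as stated in the dataset description
N_LAT, N_LON = 720, 1440   # assumed grid size: 180 / 0.25 and 360 / 0.25


def to_degrees_east(lon):
    """Convert a longitude in [-180, 180] to the 0-360 degrees-east convention."""
    return lon % 360.0


def cell_index(lat, lon):
    """Return (lat_index, lon_index) of the grid cell that contains the point."""
    lat_idx = min(int(math.floor((lat + 90.0) / RESOLUTION)), N_LAT - 1)
    lon_idx = min(int(math.floor(to_degrees_east(lon) / RESOLUTION)), N_LON - 1)
    return lat_idx, lon_idx


def cell_center(lat_idx, lon_idx):
    """Return the (lat, lon) of a cell center, with lon in degrees east."""
    return (-89.875 + lat_idx * RESOLUTION, 0.125 + lon_idx * RESOLUTION)


# Huancayo, the location used in the tutorial (longitude given as degrees west)
lat_idx, lon_idx = cell_index(-12.06513, -75.20486)
print(lat_idx, lon_idx)                # 311 1139
print(cell_center(lat_idx, lon_idx))   # (-12.125, 284.875)
```

On a regular grid, this arithmetic lookup returns the same cell as a nearest-center search, except for points that lie exactly on a cell boundary.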
Link to the NASA NCCS THREDDS Data Server:
https://dataserver.nccs.nasa.gov/thredds/catalog/bypass/NEX-GDDP/bcsd/catalog.html
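The catalog page above is meant for browsing. The script in this tutorial requests data through the server's `dodsC` endpoint, and its request URLs follow a regular pattern. The sketch below builds such a URL from its parts. The helper name `build_ascii_url` is hypothetical, and the pattern is inferred only from the URL used in the tutorial's script, so other scenarios, variables or models are assumed to follow the same layout and should be checked against the catalog.

```python
# Base path taken from the request URL used in the tutorial's script
BASE = "https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd"


def build_ascii_url(scenario, variable, model, time_range, lat_idx, lon_idx,
                    run="r1i1p1"):
    """Build an ASCII request URL for one grid cell and an inclusive time range.

    Each index range is written as [start:stride:end], as in the tutorial.
    """
    t0, t1 = time_range
    constraint = "%s[%d:1:%d][%d:1:%d][%d:1:%d]" % (
        variable, t0, t1, lat_idx, lat_idx, lon_idx, lon_idx)
    return "%s/%s/%s/%s/%s.ncml.ascii?%s" % (
        BASE, scenario, run, variable, model, constraint)


# The first request of the tutorial: records 0 to 4999 for the Huancayo cell
url = build_ascii_url("rcp45", "pr", "CSIRO-Mk3-6-0", (0, 4999), 311, 1139)
print(url)
```

Keeping the URL construction in one function makes it easier to switch scenario or model without editing a long concatenated string.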
Tutorial
Code
This is the Python code used in the tutorial:
# ## Import required packages
# In a Jupyter notebook, run this magic command first:  %matplotlib inline
import re
import urllib.request
from datetime import datetime, timedelta

import matplotlib.pyplot as plt
import numpy as np

# ## Define GCM cell close to location
def find_nearest_cell(array, value):
    """Return the index of the element of array that is closest to value."""
    idx = (np.abs(array - value)).argmin()
    return idx

# Cell-center coordinates of the 0.25-degree grid
latarray = np.linspace(-89.875, 89.875, 720)
lonarray = np.linspace(0.125, 359.875, 1440)

huancayo = (-12.06513, 360 - 75.20486)  # notice longitude must be in degrees east
celllatindex = find_nearest_cell(latarray, huancayo[0])
celllonindex = find_nearest_cell(lonarray, huancayo[1])
print(celllatindex, celllonindex)

# ## Define begin and end date of GCM simulation
zerodate = datetime(1850, 1, 1)
print(zerodate.isoformat(' '))
begindate = zerodate + timedelta(days=56978.5)
print(begindate.isoformat(' '))
enddate = zerodate + timedelta(days=89850.5)
print(enddate.isoformat(' '))

# ## Get data from the NCCS server by 5000 records
# We divide the request into groups of 5000 records because the server does not
# provide all the records in a single request.
intervals = [[0, 4999], [5000, 9999], [10000, 14999], [15000, 19999],
             [20000, 24999], [25000, 29999], [30000, 34674]]

# empty lists for ppt and time
pptlist = []
daylist = []

for interval in intervals:
    url = ("https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/"
           "rcp45/r1i1p1/pr/CSIRO-Mk3-6-0.ncml.ascii?pr"
           "[%d:1:%d][%d:1:%d][%d:1:%d]"
           % (interval[0], interval[1], celllatindex, celllatindex,
              celllonindex, celllonindex))
    # For historical data (1950 to 2005), replace the path above with:
    # "https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/"
    # "historical/r1i1p1/pr/CSIRO-Mk3-6-0.ncml.ascii?pr"

    with urllib.request.urlopen(url) as fp:
        mystr = fp.read().decode("utf8")
    lines = mystr.split('\n')

    # Find the lines that open the header, the data block and the time block
    breakers = []
    breakerTexts = ['pr[time', 'pr.pr', 'pr.time']
    for index, line in enumerate(lines):
        for text in breakerTexts:
            if text in line:
                breakers.append(index)

    # The header line holds the number of days returned by this request
    dayline = lines[breakers[0]]
    dayline = re.sub(r'\[|\]', ' ', dayline)
    days = int(dayline.split()[4])
    print("Processing interval %s of %d days" % (str(interval), days))

    # Precipitation rate (per second) is converted to mm/day
    for item in range(breakers[1] + 1, breakers[1] + days + 1):
        ppt = float(lines[item].split(',')[1]) * 86400
        pptlist.append(ppt)

    # Time values are days counted from zerodate
    for day in lines[breakers[2] + 1].split(','):
        daylist.append(zerodate + timedelta(days=float(day)))

# Plot the complete series
plt.plot(daylist, pptlist)
plt.gcf().autofmt_xdate()
plt.ylabel('Precipitation (mm/day)')
plt.show()

# Plot a window of 1000 days
plt.plot(daylist[14000:15000], pptlist[14000:15000])
plt.gcf().autofmt_xdate()
plt.ylabel('Precipitation (mm/day)')
plt.show()
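The hard-coded interval list and the parsing logic above can also be written as two small helpers, which makes them easy to check without contacting the server. The following is a minimal sketch: `make_intervals` and `parse_ascii` are hypothetical names, and `SAMPLE` is a made-up three-day response that only imitates the layout the script above expects (a `pr.pr[...]` block with one value per line, followed by a `pr.time[...]` block with comma-separated days). A real server response may differ in detail.

```python
import re
from datetime import datetime, timedelta

SECONDS_PER_DAY = 86400  # factor used in the tutorial to obtain mm/day


def make_intervals(n_records, chunk=5000):
    """Split record indices 0..n_records-1 into inclusive [start, end] pairs."""
    return [[start, min(start + chunk, n_records) - 1]
            for start in range(0, n_records, chunk)]


def parse_ascii(text, variable="pr", zerodate=datetime(1850, 1, 1)):
    """Return (dates, values) parsed from one ASCII response for a single cell."""
    lines = text.split("\n")
    data_tag = "%s.%s[" % (variable, variable)
    time_tag = "%s.time[" % variable
    # Index of the line that opens the data block and the time block
    data_start = next(i for i, l in enumerate(lines) if l.strip().startswith(data_tag))
    time_start = next(i for i, l in enumerate(lines) if l.strip().startswith(time_tag))
    # The first bracketed number in the data header is the record count
    n = int(re.search(r"\[(\d+)\]", lines[data_start]).group(1))
    values = [float(l.split(",")[1])
              for l in lines[data_start + 1:data_start + 1 + n]]
    dates = [zerodate + timedelta(days=float(d))
             for d in lines[time_start + 1].split(",")]
    return dates, values


# Made-up sample in the assumed response layout (not real server output)
SAMPLE = """pr.pr[3][1][1]
[0][0], 1.0E-5
[1][0], 0.0
[2][0], 2.5E-5

pr.time[3]
56978.5, 56979.5, 56980.5
"""

print(make_intervals(34675)[-1])   # [30000, 34674]
dates, values = parse_ascii(SAMPLE)
for date, value in zip(dates, values):
    print(date.isoformat(" "), value * SECONDS_PER_DAY)
```

With these helpers, the download loop reduces to fetching each URL and extending two lists with the result of `parse_ascii`.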