
Many Global Circulation Models (GCMs) provide historical and future data for precipitation, maximum temperature and minimum temperature under different emission scenarios. The data is available on a daily timescale from specific servers. This tutorial shows the main characteristics of the NASA NCCS THREDDS Data Server, which provides the NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP) dataset. The dataset covers two of the four greenhouse gas emission scenarios and spans 1950 to 2100, split into a historical and a future period, with a spatial resolution of 0.25 degrees (~25 km x 25 km). The tutorial covers the main parts of the web server and Python scripts that locate the closest model cell and download the records in successive groups.

Link to the NASA NCCS THREDDS Data Server:

https://dataserver.nccs.nasa.gov/thredds/catalog/bypass/NEX-GDDP/bcsd/catalog.html

Tutorial

Code

This is the Python code used in the tutorial:

# ## Import required packages

# Notebook-only: render plots inline (requires IPython/Jupyter)
get_ipython().run_line_magic('matplotlib', 'inline')
import matplotlib.pyplot as plt
import re
import urllib.request
import numpy as np
from datetime import datetime, timedelta


# ## Define GCM cell close to location

def find_nearest_cell(array,value):
    idx = (np.abs(array-value)).argmin()
    return idx

latarray = np.linspace(-89.875,89.875,720)
lonarray = np.linspace(0.125,359.875,1440)
huancayo = (-12.06513, 360-75.20486) # note: longitude must be in degrees east

celllatindex = find_nearest_cell(latarray,huancayo[0])
celllonindex = find_nearest_cell(lonarray,huancayo[1])
print(celllatindex,celllonindex)
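
As a quick check of the cell lookup above, the following sketch converts a longitude given in the -180 to 180 convention to degrees east and prints the center of the selected cell. The helper name `to_degrees_east` is hypothetical and not part of the tutorial; the grid arrays repeat the 0.25 degree definitions used above.

```python
import numpy as np

def to_degrees_east(lon):
    """Map a longitude in [-180, 180] to the [0, 360) range (hypothetical helper)."""
    return lon % 360

def find_nearest_cell(array, value):
    # Index of the grid coordinate closest to the requested value
    return int(np.abs(array - value).argmin())

# Cell centers of the 0.25 degree grid, as defined in the tutorial
latarray = np.linspace(-89.875, 89.875, 720)
lonarray = np.linspace(0.125, 359.875, 1440)

lat, lon = -12.06513, to_degrees_east(-75.20486)
ilat = find_nearest_cell(latarray, lat)
ilon = find_nearest_cell(lonarray, lon)

# The selected cell center should lie within half a cell (0.125 degrees) of the point
print(ilat, ilon, latarray[ilat], lonarray[ilon])
```

If either difference between the point and the printed cell center exceeded 0.125 degrees, the grid definition or the longitude convention would be worth rechecking.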


# ## Define begin and end date of GCM simulation

zerodate = datetime(1850,1,1)
zerodate.isoformat(' ')

begindate = zerodate + timedelta(days=56978.5)
begindate.isoformat(' ')

enddate = zerodate + timedelta(days=89850.5)
enddate.isoformat(' ')
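
The bare `isoformat` calls above only display a value inside a notebook. A minimal sketch of the same conversion for use in a plain script is shown below; the helper name `days_to_datetime` is hypothetical, and the sketch assumes the time axis counts days on a standard calendar from 1850-01-01, as the tutorial does.

```python
from datetime import datetime, timedelta

ZERODATE = datetime(1850, 1, 1)  # reference date used in the tutorial

def days_to_datetime(days, origin=ZERODATE):
    """Convert a 'days since origin' offset to a datetime (hypothetical helper)."""
    return origin + timedelta(days=float(days))

begindate = days_to_datetime(56978.5)
enddate = days_to_datetime(89850.5)

# print() makes the values visible when run as a script
print(begindate.isoformat(' '))  # 2006-01-01 12:00:00
print(enddate.isoformat(' '))
```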


# ## Get data from the NCCS server in groups of 5000 records

# We split the request into groups of 5000 records because the server does not provide all the records at once.
intervals=[[0,4999],[5000,9999],[10000,14999],[15000,19999],[20000,24999],[25000,29999],[30000,34674]]
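
The hard-coded list above could also be generated. The sketch below is one possible way to do it; `make_intervals` is a hypothetical helper, and the record count of 34675 is inferred from the last index (34674) in the list above rather than read from the server.

```python
def make_intervals(n_records, chunk_size=5000):
    """Return inclusive [start, end] index pairs covering n_records (hypothetical helper)."""
    return [[start, min(start + chunk_size, n_records) - 1]
            for start in range(0, n_records, chunk_size)]

# 34675 records assumed, matching the last index used in the tutorial
intervals = make_intervals(34675)
print(intervals)
```

Changing `chunk_size` adjusts the request size if the server limit differs.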

# empty lists for precipitation and time
pptlist = []
daylist = []

for interval in intervals:
    fp = urllib.request.urlopen("https://dataserver.nccs.nasa.gov/thredds/dodsC/bypass/NEX-GDDP/bcsd/rcp45/r1i1p1/pr/CSIRO-Mk3-6-0.ncml.ascii?pr["+str(interval[0])+":1:"+str(interval[1])+"]["+str(celllatindex)+":1:"+str(celllatindex)+"]["+str(celllonindex)+":1:"+str(celllonindex)+"]")

# For historical data (1950 to 2005), use this request instead:
#    fp = urllib.request.urlopen("https://dataserver.nccs.nasa.gov/thredds/dodsC/\
#bypass/NEX-GDDP/bcsd/historical/r1i1p1/pr/CSIRO-Mk3-6-0.ncml.ascii?pr\
#["+str(interval[0])+":1:"+str(interval[1])+"]\
#["+str(celllatindex)+":1:"+str(celllatindex)+"]\
#["+str(celllonindex)+":1:"+str(celllonindex)+"]")
    
    mybytes = fp.read()

    mystr = mybytes.decode("utf8")
    fp.close()
    
    lines = mystr.split('\n')
    breakers = []
    breakerTexts = ['pr[time','pr.pr','pr.time']
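    # enumerate gives each line's own index; lines.index(line) would return
    # the first matching line if the same text appears more than once
    for index, line in enumerate(lines):
        for text in breakerTexts:
            if text in line:
                breakers.append(index)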
                
    dayline = lines[breakers[0]]
    dayline = re.sub(r'\[|\]',' ',dayline)
    days = int(dayline.split()[4])
    print("Processing interval %s of %d days" % (str(interval), days))
    
    for item in range(breakers[1]+1, breakers[1]+days+1):
        ppt = float(lines[item].split(',')[1])*86400
        pptlist.append(ppt)
        
    for day in lines[breakers[2]+1].split(','):
        daylist.append(zerodate + timedelta(days=float(day)))

plt.plot(daylist,pptlist)
plt.gcf().autofmt_xdate()
plt.ylabel('Precipitation (mm/day)')
plt.show()

plt.plot(daylist[14000:15000],pptlist[14000:15000])
plt.gcf().autofmt_xdate()
plt.ylabel('Precipitation (mm/day)')
plt.show()
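
The parsing step inside the download loop can be tested without a network connection. The sketch below wraps the same logic in a function and runs it on a small synthetic response. The function name `parse_pr_ascii` is hypothetical, and the sample text is an assumption: its layout is inferred from the parsing code above, not copied from a real server response.

```python
import re
from datetime import datetime, timedelta

ZERODATE = datetime(1850, 1, 1)

# Synthetic response; the layout is an assumption inferred from the parsing code
SAMPLE = """Dataset {
    Grid {
     ARRAY:
        Float32 pr[time = 3][lat = 1][lon = 1];
     MAPS:
        Float64 time[time = 3];
    } pr;
} sample;
---------------------------------------------
pr.pr[3][1][1]
[0][0], 1.0E-5
[1][0], 0.0
[2][0], 2.5E-5

pr.time[3]
56978.5, 56979.5, 56980.5
"""

def parse_pr_ascii(text, origin=ZERODATE):
    """Return (dates, precipitation in mm/day) from an ASCII response (hypothetical helper)."""
    lines = text.split('\n')

    def first_index(marker):
        # Index of the first line containing the marker text
        return next(i for i, line in enumerate(lines) if marker in line)

    header = first_index('pr[time')   # declares the number of records
    values = first_index('pr.pr')     # precipitation values start on the next line
    times = first_index('pr.time')    # time values follow on the next line

    ndays = int(re.sub(r'\[|\]', ' ', lines[header]).split()[4])
    # kg m-2 s-1 multiplied by 86400 s/day gives mm/day
    ppt = [float(lines[i].split(',')[1]) * 86400
           for i in range(values + 1, values + ndays + 1)]
    days = [origin + timedelta(days=float(d))
            for d in lines[times + 1].split(',')]
    return days, ppt

days, ppt = parse_pr_ascii(SAMPLE)
for day, value in zip(days, ppt):
    print(day.isoformat(' '), value)
```

Looking up each marker separately means the result does not depend on the order in which the marker lines appear in the response.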