Geospatial Data Gathering, Cleaning and Conversion

For this tutorial, we'll be using a NASA dataset of roughly 50k rows with information about fallen meteorites in the last 40 years. The dataset can be found here. However, we'll use a script to download it. Copy the following code into a file named "api.py":

import csv
import json
import math
import os.path
import pickle

import requests

# Keys every record must have before it is written to the CSV.
REQUIRED_KEYS = {
    "name", "id", "nametype", "recclass", "mass",
    "fall", "year", "reclat", "reclong", "geolocation",
}

# Radius of the sphere the meteorites are projected onto.
RADIUS = 10

if os.path.exists("saved_data"):
    print("Found saved data.")
    with open("saved_data", "rb") as f:
        rows = pickle.load(f)
else:
    print("No saved data found. Pulling data from NASA API.")
    r = requests.get("https://data.nasa.gov/resource/y77d-th95.json?$limit=50000")
    r.raise_for_status()
    rows = r.json()
    with open("saved_data", "wb") as f:
        pickle.dump(rows, f)

with open("json_data.json", "w") as f:
    json.dump(rows, f)

with open("csv_data.csv", "w", newline="", encoding="utf-8") as f:
    csvwriter = csv.writer(f, delimiter=",")

    # Column names come from the first record, plus the computed coordinates.
    header = list(rows[0].keys()) + ["X", "Y", "Z"]
    print(header)
    csvwriter.writerow(header)

    for emp in rows:
        # Skip records with missing or extra fields.
        if set(emp.keys()) != REQUIRED_KEYS:
            continue

        emp["reclat"] = float(emp["reclat"])
        emp["reclong"] = float(emp["reclong"])

        lat = math.radians(emp["reclat"])
        lon = math.radians(emp["reclong"])

        # Convert latitude/longitude to Cartesian coordinates on the sphere.
        emp["X"] = RADIUS * math.cos(lat) * math.cos(lon)
        emp["Y"] = RADIUS * math.cos(lat) * math.sin(lon)
        emp["Z"] = RADIUS * math.sin(lat)

        # Write values in header order so every row lines up with the columns.
        csvwriter.writerow([emp.get(key, "") for key in header])



Make sure that you have the "requests" package installed. To install it, run:

pip install requests

Now, let's pull the data! Run it with:

python api.py

Check your folder: you should now have the data saved in a few formats (JSON, CSV, and a binary pickle file). The script also converted each latitude and longitude to x, y, z coordinates on a sphere of radius 10, since neither of the programs we're using natively supports geospatial coordinates.
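To make the coordinate conversion easier to follow, here is a minimal standalone sketch of the same formulas used in api.py. The function name latlon_to_xyz is hypothetical (it does not appear in the script), and the default radius of 10 simply mirrors the value the script uses.

```python
import math


def latlon_to_xyz(lat_deg, lon_deg, radius=10):
    """Convert latitude/longitude in degrees to x, y, z on a sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z


# The equator at the prime meridian lands on the +X axis,
# and the north pole lands (up to rounding error) on the +Z axis.
print(latlon_to_xyz(0.0, 0.0))
print(latlon_to_xyz(90.0, 0.0))
```

Every converted point sits at distance 10 from the origin, so the meteorites end up on the surface of a globe-sized sphere that a 3D tool can display directly.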

We'll only be using the CSV for this tutorial. Next, choose between the ParaView and Unity tutorials to visualize the data.
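Before opening the CSV in either tool, it can help to confirm that the X, Y, Z columns look sane. The following is an optional sketch, not part of the original script: count_rows_off_sphere is a hypothetical helper, and it assumes the file has X, Y and Z columns on a sphere of radius 10, as written by api.py. The small in-memory sample stands in for the real file.

```python
import csv
import io
import math


def count_rows_off_sphere(csv_file, radius=10.0, tol=1e-6):
    """Return (total, bad): rows read, and rows whose X, Y, Z are off the sphere."""
    total = bad = 0
    for row in csv.DictReader(csv_file):
        total += 1
        x, y, z = float(row["X"]), float(row["Y"]), float(row["Z"])
        if abs(math.sqrt(x * x + y * y + z * z) - radius) > tol:
            bad += 1
    return total, bad


# Usage with the real file (assumes csv_data.csv is in the current folder):
# with open("csv_data.csv", newline="", encoding="utf-8") as f:
#     print(count_rows_off_sphere(f))

# Tiny made-up sample: row A is on the sphere, row B is not.
sample = io.StringIO("name,X,Y,Z\nA,10.0,0.0,0.0\nB,1.0,1.0,1.0\n")
print(count_rows_off_sphere(sample))  # (2, 1)
```

A nonzero second number for the real file would suggest that some rows were written with misaligned columns or unconverted coordinates.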