Geospatial Data Gathering, Cleaning and Conversion
For this tutorial, we'll be using a NASA dataset of ~50k rows with information about fallen meteorites in the last 40 years. The dataset can be found here. Rather than downloading it by hand, however, we'll use a script to fetch it. Copy the following code into a file named "api.py":
import pickle
import os.path
import requests
import json
import csv
import math

# Fields every record must have before it is written to the CSV.
REQUIRED_KEYS = ("reclat", "reclong", "fall", "geolocation", "id",
                 "mass", "name", "nametype", "recclass", "year")

# Reuse the cached download if it exists; otherwise pull the data from the NASA API.
if os.path.exists("./saved_data"):
    print("Found saved data.")
    with open("saved_data", "rb") as f:
        rows = pickle.load(f)
    with open("json_data.json", "w") as f:
        json.dump(rows, f)
else:
    print("No saved data found. Pulling data from NASA API.")
    r = requests.get("https://data.nasa.gov/resource/y77d-th95.json?$limit=50000")
    rows = r.json()
    with open("saved_data", 'wb') as f:
        pickle.dump(rows, f)
    with open("json_data.json", "w") as f:
        json.dump(rows, f)

with open("csv_data.csv", "w", newline='', encoding='utf-8') as f:
    csvwriter = csv.writer(f, delimiter=",")
    count = 0
    for emp in rows:
        # Build the header from the first record and append the computed columns.
        if count == 0:
            header = list(emp.keys())
            header += ["X", "Y", "Z"]
            print(header)
            csvwriter.writerow(header)
            count += 1
        if len(emp.keys()) != 10 or any(key not in emp for key in REQUIRED_KEYS):
            # Skip records with missing or extra fields.
            continue
        elif emp["name"] == "Havana":
            # Debugging branch from the original script: prints this record and skips it.
            print(emp)
            print("mass" not in emp.keys())
        else:
            emp['reclat'] = float(emp['reclat'])
            emp['reclong'] = float(emp['reclong'])
            # Project latitude/longitude (in degrees) onto a sphere of radius 10.
            radius = 10
            lat = emp["reclat"]
            lon = emp["reclong"]
            emp["X"] = radius * math.cos(math.radians(lat)) * math.cos(math.radians(lon))
            emp["Y"] = radius * math.cos(math.radians(lat)) * math.sin(math.radians(lon))
            emp["Z"] = radius * math.sin(math.radians(lat))
            csvwriter.writerow(emp.values())
Make sure that you have the "requests" package installed. To install it, run:
pip install requests
Now, let's pull the data! Run the script with:
python api.py
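Check your folder: you should now have the data saved in three formats (JSON in "json_data.json", CSV in "csv_data.csv", and pickled binary data in "saved_data"). While writing the CSV, the script skips records that are missing any field and converts each latitude and longitude to x, y, z coordinates on a sphere of radius 10, since neither of the programs we're using natively supports geospatial coordinates.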
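To make the conversion easier to follow, here is a minimal standalone sketch of the same spherical-to-Cartesian formula used in "api.py". The function name latlon_to_xyz and its default radius argument are illustrative and do not appear in the script itself.

```python
import math

def latlon_to_xyz(lat_deg, lon_deg, radius=10.0):
    """Map latitude/longitude in degrees to a point on a sphere of the given radius."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z

# The equator at the prime meridian lands on the +X axis,
# and the North Pole lands (up to floating-point error) on the +Z axis.
print(latlon_to_xyz(0.0, 0.0))
print(latlon_to_xyz(90.0, 0.0))
```

With this convention, Z points through the North Pole and X through the point where the equator meets the prime meridian. Depending on the tool, you may need to swap axes if it treats Y as "up".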
We'll only be using the CSV for this tutorial. Next, choose between the ParaView and Unity tutorials to visualize the data.
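Before moving on, it can be worth confirming that the CSV looks sane. The sketch below is an optional check, not part of the original script: it assumes the "X", "Y" and "Z" column names written by "api.py" and counts rows whose coordinates do not lie on the radius-10 sphere. The helper name count_off_sphere and the in-memory sample values are made up for illustration.

```python
import csv
import io
import math

def count_off_sphere(csv_file, radius=10.0, tol=1e-6):
    """Return (total_rows, rows whose X, Y, Z are not at the given radius)."""
    total = 0
    bad = 0
    for row in csv.DictReader(csv_file):
        total += 1
        x, y, z = float(row["X"]), float(row["Y"]), float(row["Z"])
        distance = math.sqrt(x * x + y * y + z * z)
        if not math.isclose(distance, radius, abs_tol=tol):
            bad += 1
    return total, bad

# Tiny in-memory sample with made-up values, standing in for csv_data.csv.
sample = io.StringIO("name,X,Y,Z\nA,10.0,0.0,0.0\nB,1.0,2.0,3.0\n")
print(count_off_sphere(sample))  # -> (2, 1)
```

To run it on the real file, open "csv_data.csv" with newline='' and encoding='utf-8' (the same settings the script used to write it) and pass the file object to the function. If the conversion worked, the second number should be 0.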