USGS Data Mining and Integrity

posted on Jul, 17 2012 @ 01:56 PM
link   
Before I get started I would just like to say - FIRST THREAD! I'm quite excited about it


Okay, that is out of the way. Down to business. After lurking here for a long time, I have noticed that information about earthquakes and volcanoes is used quite often. In light of this, I am creating some tools that will aggregate data about them on a daily (perhaps hourly or half-hourly) basis. I do have some concerns about these tools that I would like to address. Perhaps ATS could give some wonderful insights!

1) Is the USGS a reliable data source (USGS.gov)? I have read threads containing posts about downgrades or information changes on already established earthquakes.
1a) If they do make changes to the data, would you like me to do so as well or keep the original data?

2) Are there other sources available for data mining other than the USGS? I have been looking for frequently updated and reliable XML files for this task.

When all is said and done I hope to deliver a functional site where we can instantly reference any earthquake or volcanic activity. Even view it on Google Earth/Maps right there on the site.

Thoughts? Has it been done and I missed it?



posted on Jul, 17 2012 @ 02:08 PM
link   
KML files here...

It might be worth looking around for free software that accepts KML files, take screenshots of the output, then post them to your site regularly.

If you're lucky you could also develop an ASP-site that would process and present data from the RSS feeds.
edit on 17-7-2012 by XeroOne because: (no reason given)



posted on Jul, 17 2012 @ 02:15 PM
link   
reply to post by XeroOne
 


That's actually what I am doing. This is all automated.


I grab the earthquake and volcano Atom RSS feeds and parse them into valuable data.
Then I store this information in a database. So far all of this is working perfectly.
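For anyone following along, here is a minimal sketch of that parse-and-store step. It is not my production code: the sample entry, the `quakes` table and the function names are made up for illustration, and it assumes each Atom entry carries a GeoRSS `point` with "lat lon".

```python
import sqlite3
import xml.etree.ElementTree as ET

# Namespaces for Atom and GeoRSS; the feed layout below is an assumed example.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}

# Hypothetical sample entry standing in for a downloaded feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <entry>
    <id>urn:example:quake:abc123</id>
    <title>M 4.2, Example Region</title>
    <updated>2012-07-17T13:00:00Z</updated>
    <georss:point>35.10 139.50</georss:point>
  </entry>
</feed>"""


def parse_feed(xml_text):
    """Return one dict per Atom entry, with the GeoRSS 'lat lon' split out."""
    root = ET.fromstring(xml_text)
    quakes = []
    for entry in root.findall("atom:entry", NS):
        lat, lon = entry.findtext("georss:point", namespaces=NS).split()
        quakes.append({
            "id": entry.findtext("atom:id", namespaces=NS),
            "title": entry.findtext("atom:title", namespaces=NS),
            "updated": entry.findtext("atom:updated", namespaces=NS),
            "lat": float(lat),
            "lon": float(lon),
        })
    return quakes


def store(conn, quakes):
    """Insert parsed quakes, skipping IDs that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quakes ("
        "id TEXT PRIMARY KEY, title TEXT, updated TEXT, lat REAL, lon REAL)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO quakes VALUES (:id, :title, :updated, :lat, :lon)",
        quakes,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store(conn, parse_feed(SAMPLE_FEED))
```

Keying the table on the feed's own entry ID means running the same feed through twice does not create duplicates.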

Right now I am automating the process of creating KML files from the data I have in my database. Once that is done, I should be able to apply the KML layer on-the-fly to a Google Earth/Maps object embedded on the site. The KML file will always contain the most recent available data. It's pretty nifty.
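A rough sketch of the KML step, assuming each database row has already been turned into a dict with `title`, `lat` and `lon` keys (those names and the `build_kml` helper are placeholders, not the real code). The one thing worth remembering is that KML wants coordinates as longitude,latitude,altitude.

```python
import xml.etree.ElementTree as ET


def build_kml(quakes):
    """Build a KML document string with one Placemark per quake."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    doc = ET.SubElement(kml, "Document")
    for q in quakes:
        placemark = ET.SubElement(doc, "Placemark")
        ET.SubElement(placemark, "name").text = q["title"]
        point = ET.SubElement(placemark, "Point")
        # KML order is lon,lat,alt (not lat,lon); altitude is left at 0 here.
        ET.SubElement(point, "coordinates").text = f"{q['lon']},{q['lat']},0"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical row, shaped like what a database query might return.
sample = [{"title": "M 4.2, Example Region", "lat": 35.1, "lon": 139.5}]
kml_text = build_kml(sample)
```

Writing `kml_text` to a file on each refresh would give the map layer a single URL that always serves the latest data.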


ETA: As soon as I pay for my server account and bring it online again I'll post the URL so we can beta test this mofo! Oh, and my final dreams for this project are to implement a predictive neural network to "guess" the location and date of future earthquakes - way out of my league right now though lol.
edit on 17-7-2012 by hidden0 because: (no reason given)



posted on Jul, 17 2012 @ 02:28 PM
link   
There are other earthquake reporting agencies besides USGS. Most regions have their own.
Not sure data mining is the correct term?

Yes provide both as they often get downgraded.



posted on Jul, 17 2012 @ 02:32 PM
link   

Originally posted by violet
There are other earthquake reporting agencies besides USGS. Most regions have their own.
Not sure data mining is the correct term?

Yes provide both as they often get downgraded.


I think 'analytics' is the correct term.



posted on Jul, 17 2012 @ 02:36 PM
link   
reply to post by violet
 


I'll think of a method for storing alterations of data separately from their originals. This way we can reference the changes side-by-side, and can perhaps question them in the future. I'm already tracking ~1200 earthquakes, so checking all 1200 for changes may be processor intensive. At the current rate, I am storing about 150 earthquakes per day (all magnitudes and depths). I will have to do some research to see how best to implement this.

As for other sites, I was just curious. I don't want the USGS to be my only source, but I'm not having much luck finding, for example, Japan's equivalent of the USGS. I'll keep digging though. I hope to have quite a list of sources, maybe even with the option of only seeing EQs from specific sources.

ETA: I found the JGS (Japanese Geological Society) so I'm checking around for some XML feeds. They will probably be in Japanese though...hmm...



posted on Jul, 17 2012 @ 08:16 PM
link   
reply to post by hidden0
 



I'll think of a method for storing alterations of data separately from their originals. This way we can reference side-by-side the changes of data, and can perhaps question it in the future.


All it requires is the quake ID: before you apply a change, insert the original record into a revisions table, then flag the updated record as revised.
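As a sketch only (this is not lifted from QVS Data, and the table layout, column names and `apply_update` function are invented for the example), the revisions idea could look like this with SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quakes (
        id TEXT PRIMARY KEY, mag REAL, depth_km REAL, revised INTEGER DEFAULT 0
    );
    CREATE TABLE revisions (
        quake_id TEXT, mag REAL, depth_km REAL,
        saved_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
""")


def apply_update(conn, quake_id, new_mag, new_depth):
    """Insert a new quake, or archive the old values before overwriting them."""
    row = conn.execute(
        "SELECT mag, depth_km FROM quakes WHERE id = ?", (quake_id,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO quakes (id, mag, depth_km) VALUES (?, ?, ?)",
            (quake_id, new_mag, new_depth),
        )
        return "inserted"
    if row == (new_mag, new_depth):
        return "unchanged"
    # Copy the original record into revisions first, then flag the update.
    conn.execute(
        "INSERT INTO revisions (quake_id, mag, depth_km) VALUES (?, ?, ?)",
        (quake_id, row[0], row[1]),
    )
    conn.execute(
        "UPDATE quakes SET mag = ?, depth_km = ?, revised = 1 WHERE id = ?",
        (new_mag, new_depth, quake_id),
    )
    return "revised"
```

Because the lookup is by primary key, comparing each incoming record against the stored one is cheap even with a large table.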


I'm already tracking ~1200 earthquakes, so checking all 1200 for changes may be processor intensive. At the current rate, I am storing about 150 earthquakes per day (all magnitudes and depths). I will have to do some research to see how best to implement this.


I do all of the foregoing in my QVS Data program, but it is not web based. Currently I have 1,000,000+ earthquakes in the database. Changes happen on earthquakes anything up to two years after the event, but no one out there is interested after 7 days or so. If you are going to grab everything then you should add Italy, Spain, Chile, Colombia, Greece, Turkey, Thailand, Kazakhstan, Russia and Indonesia, to mention a few, to get started. You should be looking to pack away about 3000 to 5000 quakes a day. Oh yes, there is Australia as well, and New Zealand - there at least you get an XML file.


As for other sites, I was just curious. I don't want the USGS to be my only source, but I'm not having much luck finding, for example, Japan's equivalent of the USGS. I'll keep digging though. I hope to have quite a list of sources, maybe even with the option of only seeing EQs from specific sources.


Especially for you Japan

There are a number of sites (I suggest you look at the lists in my signature), but the problem is that if you want to depend on RSS feeds - which probably won't give you the updates - many of them do not have RSS/XML data. EMSC is best used by parsing the KML data, which is what I do; otherwise you do not get the ID numbers, and without them it is difficult to track changes. Japan is a pure and very intensive screen scrape (with no IDs) requiring two file downloads for every quake. You also need to look out for nearly identical quakes and devise some way of dealing with them - particularly with Japan, since the lists that you have to scrape are not lists of earthquakes but lists of earthquake reports, and a quake can appear more than once. You have to devise your own ID numbers for these and check to see if they have been posted before. It took me nearly a year to get that working properly.
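One possible way to devise your own ID for sources that have none is sketched below. This is an assumption-laden illustration, not how my program does it: the `synthetic_id` name, the one-minute time bucket and the 0.1 degree rounding are all arbitrary choices you would need to tune per source.

```python
import hashlib
from datetime import datetime


def synthetic_id(origin_time, lat, lon, source="jp"):
    """Derive a repeatable ID from origin time (to the minute) and rounded position."""
    bucket = origin_time.replace(second=0, microsecond=0)
    key = f"{source}|{bucket.isoformat()}|{lat:.1f}|{lon:.1f}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:16]


seen = set()


def is_new_report(origin_time, lat, lon):
    """Return True the first time a quake is seen, False for repeat reports."""
    qid = synthetic_id(origin_time, lat, lon)
    if qid in seen:
        return False
    seen.add(qid)
    return True
```

Be aware that fixed buckets have edges: two reports of the same quake that straddle a minute or a 0.1 degree boundary will still get different IDs, so a tolerance-based comparison against recent quakes is the more robust (and more laborious) route.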


ETA: I found the JGS (Japanese Geological Society) so I'm checking around for some XML feeds. They will probably be in Japanese though...hmm...


You could try Hinet, but I did not want to have to register and their terms are very restrictive.

In addition to all of that, you must be aware of the scales being used if you want to make comparisons. USGS uses a number of different scales, EMSC tends to use mb and Mw, Chile uses ML, and Japan uses ML, but ML is different for each area. Spain tends to use mbLg. Ideally you would convert all the different scales to a rough approximation of Mw and display both the original and the conversion. Then there is energy. I am not interested in numbers, only the energy, so it would have to have energy based on M0 (the starting point of the Mw calculation) to be of interest to me.
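To make the M0 point concrete, here is a small sketch using the standard moment magnitude definition, Mw = (2/3)(log10 M0 - 9.1) with M0 in newton-metres, plus the Gutenberg-Richter energy relation log10 E = 1.5 M + 4.8 (E in joules) as a rough estimate of radiated energy. The function names are mine, and it deliberately does not attempt the mb/ML/mbLg to Mw conversions, since those are empirical and vary by region.

```python
import math


def seismic_moment_nm(mw):
    """Seismic moment M0 in newton-metres for a given moment magnitude Mw."""
    return 10 ** (1.5 * mw + 9.1)


def mw_from_moment(m0_nm):
    """Moment magnitude Mw from seismic moment M0 in newton-metres."""
    return (2.0 / 3.0) * (math.log10(m0_nm) - 9.1)


def radiated_energy_joules(magnitude):
    """Rough radiated energy in joules via the Gutenberg-Richter relation."""
    return 10 ** (1.5 * magnitude + 4.8)


# One full magnitude step multiplies the energy by 10**1.5, roughly 31.6.
step_ratio = radiated_energy_joules(7.0) / radiated_energy_joules(6.0)
```

This is why summing energy rather than counting quakes matters: a single magnitude 7 releases about as much energy as a thousand magnitude 5s.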


aggregate data about them on a daily (perhaps hourly or half-hourly) basis.


Try 5 minutes. Anything longer is not worth bothering with.


If they do make changes to the data, would you like me to do so as well or keep the original data?


If you don't change it, your data is not worth a light - but nearly all the sites out there do not, so you would be in good company! It is only quake nerds like me that are into the nitty-gritty details.


Has it been done and I missed it?


Global Incident Map, Live Earthquake Mashup, RSOE EDIS to mention three, all of which have multiple feeds. There are more out there if you look. Whether these are similar to what you are planning I do not know, but good luck with your quest anyway.



posted on Jul, 18 2012 @ 02:51 AM
link   
reply to post by hidden0
 


I will try to get you the other feeds links from the other sources later. I just don't remember them offhand.

There's Europe, Canada, Australia & NZ to name a few.

There's many other reporting agencies for volcanoes as well.

Look forward to your future posts



posted on Jul, 18 2012 @ 03:18 AM
link   
EMSC European-Mediterranean Seismological Centre
EMSC RSS Feeds

BGS British Geological Survey

GFZ German Research Centre for Geosciences

GNNZ GeoNet New Zealand

GSA Geoscience Australia

NRC Natural Resources Canada

MNSN Mexican National Seismological Network

National Seismic Network of Azerbaijan Republic Center of Seismic Survey

Singapore Seismological Network Meteorological Service




edit on 18-7-2012 by violet because: (no reason given)



posted on Jul, 20 2012 @ 08:04 PM
link   
reply to post by PuterMan
 


Wow you guys rock! Thank you for all of this information, it is very helpful. I'll be taking it all into consideration. Screen scraping is no problem for me. Considering I'm working from scratch, this may take me a while XD

The end result I'm going for is kind of a new way to browse for certain information via Google Earth/Maps vs Google Search. It ends up being geared towards these types of events.

I'll devise code to obtain a broader spectrum of data over time. Hopefully by Wednesday I'll have the site online for some real testing.

Hrm...I have to go, but I have so much to say! This thread is going to be a frequent reference for me now lol.
Thank you Violet and PuterMan, great help!
Be back with more details later!


