Originally posted by Xtraeme
reply to post by IsaacKoi
Hey Isaac, sorry I've been incommunicado.
No problem at all. As I've mentioned to you before, we seem to have taken it in turns to drop out of contact for a while due to pressures in real
life.
I have most of the fold3 data now (minus a few stray images, still can't figure out one pesky bug)
That's great news. That ends the complete dependence on fold3 keeping these images on their website.
I've been holding off running your Reaper myself due to the issues we've talked about before (particularly the absence of a working check that an
image had been downloaded before moving on to the next page, resulting in about 10% of images failing to download on my last attempt).
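To illustrate the kind of check I mean, here is a rough Python sketch. It is not how the Reaper actually works, and every name in it (looks_like_image, fetch_verified, http_fetch) is made up for illustration. The idea is simply to refuse to move on until the bytes that came back look like a real image, retrying a few times before giving up. I've assumed the images are JPEG, PNG or GIF.

```python
import time
import urllib.request
from typing import Callable, Optional


def looks_like_image(data: bytes) -> bool:
    """Return True if the data starts with JPEG, PNG or GIF magic bytes."""
    return (
        data.startswith(b"\xff\xd8\xff")              # JPEG
        or data.startswith(b"\x89PNG\r\n\x1a\n")      # PNG
        or data.startswith((b"GIF87a", b"GIF89a"))    # GIF
    )


def http_fetch(url: str) -> bytes:
    """Plain HTTP download using only the standard library."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_verified(
    url: str,
    fetch: Callable[[str], bytes] = http_fetch,
    retries: int = 3,
    delay: float = 2.0,
) -> Optional[bytes]:
    """Download url, retrying until the result looks like an image.

    Returns the image bytes, or None if every attempt failed, so the
    caller can log the URL instead of silently skipping it.
    """
    for attempt in range(retries):
        try:
            data = fetch(url)
        except OSError:  # urllib's URLError is a subclass of OSError
            data = b""
        if data and looks_like_image(data):
            return data
        if attempt < retries - 1:
            time.sleep(delay)  # brief pause before trying again
    return None
```

A caller would only advance to the next page once fetch_verified returns something other than None, and would keep a list of the URLs that returned None so they can be retried later.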
but magonia.haaan.com never made it into my library.
Well, it sounds like Kandinsky may be able to plug the gap.
When did the site go down?
Very recently. I was contacted about it yesterday by someone involved in running the website - hence my post on here today.
Maybe to prevent this sort of thing from happening again, I could put together a website where people submit UFO domains and then my spider will just
automatically go out and grab them? I'll think on it.
Sounds like a good idea to me, although I'd be concerned about any sort of centralised repository rather than having a few people holding independent
back-ups on their own computers (since people within ufology have a habit of losing interest and appearing to drop off the face of the planet).
I keep meaning to get to grips with HTTrack or one of the other similar pieces of software, but haven't yet got on to the issue of storing mirrors of
websites. (I've done some experimenting with using Adobe Acrobat to create PDF files containing the webpages of various websites and have had very mixed
results. I'll post something in this thread about those attempts some time).
edit on 6-11-2012 by IsaacKoi because: (no reason given)