Social Networking and Data Mining: What It Reveals About You, page 1
Topic started on 14-2-2010 @ 05:20 PM by LadySkadi
Pete Warden is a Colorado-based, British-born ex-Apple engineer who has spent the last six months gathering and analyzing data from more than 215 million public Facebook profile pages. What he’s discovered just might shed more light on the culture of connected America than the 2010 census.

"If you actually look at [Facebook user data] in the aggregate, it's like a painting," Warden told TechNewsDaily. "Each individual data point isn't interesting, but when you step back and look at the trends in millions of profiles, you start to see some pretty interesting pictures emerging."


Facebook Mining

Now, after gathering the data from Facebook’s site using software he designed and honed in the process, and making a first round of enticing observations, he wants to turn the raw data he’s culled over to academia for further analysis. But he also hopes to steer investors and customers to his own software and services for further data gathering and aggregation.

"I'm much better at building the pipeline for processing the data than I am at doing really rigorous stuff with the results that come out at the end," Warden said in a telephone interview. "The patterns that I've blogged about in the U.S. data are very qualitative."


Serious about privacy

But Warden is serious when it comes to people’s privacy concerns, even though all the data being gathered is publicly available on Facebook’s site, and can be found via Google. He says he wants to make the data useful for large-scale data analysis, but not for tracking down individuals.

To that end, Warden has delayed releasing the data for the time being (he initially intended to release it yesterday, Feb. 9), after someone from Facebook contacted him, asking for some time to check the privacy implications.

Once Facebook clears the data for release to the academic world, Warden says he’s ready to pass the task of interpreting all this data on to others and feature their conclusions on his blog more often than his own.
Link


I think that the majority of ATS users are well aware of the data mining via Facebook and other social networking sites, how that can be used for marketing and possible other purposes and how to best protect one's privacy; whether one chooses to abstain from social networking altogether, or whether one confines social networking to close family and friends, etc...

I do have to wonder how much thought the typical person puts into how much information is available to the masses and whether the typical person considers this availability of information a concern?



How to split up the US - Warden's Blog

Warden gathered data on 210 million public Facebook profiles and discovered emerging patterns and detail. His observations, visualized in the map below, show the information by location, with connections drawn between places that share friends.
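
For anyone curious how a "places that share friends" map could be tallied, here is a minimal Python sketch. It is not Warden's actual pipeline, which the article does not describe; the `friendships` list, the `home_city` table, and the function name `city_ties` are all made up for illustration.

```python
from collections import Counter

def city_ties(friendships, home_city):
    """Count friendships between each pair of distinct cities.

    friendships: iterable of (user_a, user_b) pairs (hypothetical data).
    home_city: dict mapping user -> city name (hypothetical data).
    """
    ties = Counter()
    for a, b in friendships:
        city_a, city_b = home_city.get(a), home_city.get(b)
        # Skip users with no known city and friendships within one city.
        if city_a is None or city_b is None or city_a == city_b:
            continue
        # Sort so (Dallas, Tulsa) and (Tulsa, Dallas) count as the same tie.
        ties[tuple(sorted((city_a, city_b)))] += 1
    return ties

# Invented sample data, only to show the shape of the inputs.
home_city = {"ann": "Dallas", "bob": "Tulsa", "cat": "Dallas", "dan": "Boise"}
friendships = [("ann", "bob"), ("cat", "bob"), ("ann", "cat"), ("bob", "dan")]
print(city_ties(friendships, home_city).most_common())
```

The heaviest pairs would be the lines drawn on a map like this one. Note that only aggregate counts between places are kept, not who is friends with whom.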

The descriptions below are just part of the analysis Warden was able to make based on public profile data. Bear in mind, this analysis is amateur and speculative. However, Warden does plan to turn his data mining results over to the pros for further study and use.



Quoted for interest: there is greater detail in the actual Blog descriptions, this is just bare bones...

Stayathomia
Stretching from New York to Minnesota, this belt's defining feature is how near most people are to their friends, implying they don't move far.

Dixie
Probably the least surprising of the groupings, the Old South is known for its strong and shared culture, and the pattern of ties I see backs that up. Like Stayathomia, Dixie towns tend to have links mostly to other nearby cities rather than spanning the country.

Greater Texas
Orbiting around Dallas, the ties of the Gulf Coast towns and Oklahoma and Arkansas make them look more Texan than Southern. Unlike Stayathomia, there's a definite central city to this cluster, otherwise most towns just connect to their immediate neighbors.
God shows up, but always comes in below the Dallas Cowboys for Texas proper, and other local sports teams outside the state.

Mormonia
The only region that's completely surrounded by another cluster, Mormonia mostly consists of Utah towns that are highly connected to each other, with an offshoot in Eastern Idaho. It's worth separating from the rest of the West because of how interwoven the communities are, and how relatively unlikely they are to have friends outside the region.

Nomadic West
The defining feature of this area is how likely even small towns are to be strongly connected to distant cities; it looks like the inhabitants have done a lot of moving around the country. For example, Boise, ID, Bend, OR and Phoenix, AZ all have much wider connections than you'd expect for towns their size.

Socalistan
LA is definitely the center of gravity for this cluster. Almost everywhere in California and Nevada has links to both LA and SF, but LA is usually first. Part of that may be due to the way the cities are split up...Californians outside the super-cities tend to be most connected to other Californians, making almost as tight a cluster as Greater Texas.

Pacifica
The most boring of the clusters, the area around Seattle is disappointingly average. Tightly connected to each other, it doesn't look like Washingtonians are big travelers compared to the rest of the West... Link


As interesting as this amateur study is, I think that it drives home a significant point. Regardless of whether one personally uses social networking and media or not, its effect on you is still relevant, and the data will probably still be used for targeted marketing, at the very least.



ed: Shorten Title


[edit on 14-2-2010 by LadySkadi]


reply posted on 14-2-2010 @ 05:59 PM by SuperSlovak



reply posted on 14-2-2010 @ 08:57 PM by LadySkadi
FACEBOOK: Federal Human Data Mining Program




*I think this video has been shown on ATS before, but never hurts to revisit it*


reply posted on 15-2-2010 @ 12:14 AM by MemoryShock
Great Thread...

But I wanted to briefly suggest that this isn't just related to social networking sites. Microsoft is hashing out a program to glean relevant marketing directives based on all websites visited by a web user...

Detecting Online Commercial Intention

Note the tool input in the upper right. Type in any web address and the likelihood of consumer-motivated intention will be displayed (ATS came back as a 0.84 probability for "Non-Commercial Intention").

And this is just a basic assessment based on website hits and subsequent activity regarding each website.

Who here thinks that they aren't already attempting to piece together personality profiles based on the entirety of one's internet habits?

And would it be irrational to suppose, even for a minute, that social networking sites may perhaps be sharing their 'generalized' data for use within much more complex internet social/consumer models?

Edit To Add -


People Name Detection
Many online queries are either personal names or they contain personal names. For example, "John Smith" is a personal name, and "I am pleased to tell you that Andy Beal will be working as an internet marketing consultant" contains a personal name. This tool detects names of people in a query, which can improve ad relevancy for online marketing. This also helps in understanding a user's Web search intent and could be used to provide more relevant search results to meet the user's needs, such as providing the biography and other attributes of the targeted person to the users.

Same site as above link

[edit on Mon, 15 Feb 2010 00:50:00 -0600 by MemoryShock]



reply posted on 15-2-2010 @ 12:48 AM by LadySkadi
reply to post by MemoryShock

Excellent point MemoryShock and very relevant.

This is not just about social networking sites, as you pointed out and IMO is only going to continue to grow tentacles that will reach into every aspect of online activity.

Let us not forget Google's recent announcement that it would start to track all searches, regardless of whether one is signed into an account, plus its recent partnership with the NSA (though at present that does seem to be intended to protect privacy and information). Nonetheless, all signs point to continued advances in harnessing the public's information.

The money is there, and there are large profits to be made, among other things.


reply posted on 15-2-2010 @ 12:54 AM by LadySkadi
reply to post by maxcobalt

This kind of data collection sounds like an awesome idea as long as it’s fairly generic.

I suppose the question is, will it stay that way?

Of course, marketing campaigns are big business, consumer information is its backbone and there are numerous ways to go about ensuring that companies get the information they need to supply the public with its wants (or some say, decide for the public what they want.) Social networking and online tracking are added components that will be incredibly useful going forward. No doubt.

However, always consider both the positives and negatives... there will be many of both...


ed: sp (doh)

[edit on 15-2-2010 by LadySkadi]


reply posted on 15-2-2010 @ 01:05 AM by Moonsouljah
reply to post by MemoryShock



Great point, as I see a thread about FB every so often where everyone puts FB down while expressing away on ATS.


reply posted on 15-2-2010 @ 01:23 AM by m0r1arty
S+F - This is my favourite subject.

I personally have no problem volunteering my information out to certain sites. It allows me to have opportunities and establish a cultural identity with which I can make friends and enemies.

I do have a problem with who is in control of my data though. God and angels I'm fine with - but sadly they're fictional (IMHO).

When I was back in the UK, data blunders (which still haven't been cleared up) cost millions of people their personal private information. Addresses, next of kin, bank account details, dates of birth...pretty much the whole shebang has been 'lost' on trains and in trashcans and has 'disappeared' from where it should be. So if this is the data we know about (because it's physical), then what of the data that is just copied and passed along without anyone knowing?

I gave a lecture once on Facebook, its benefits and potential problems. It was about a year ago. During the lecture I had the class pick a city that wasn't in our country and asked them to select a male or female candidate. I then hunted through the publicly available information and found a single male who had posted pictures of his house and its equipment, his pet fish, his 'friends', his phone number, favourite football team, date of birth, email and the fact that he was going to a 3-day rock festival 300 miles north of where he was located the following weekend. All of this was projected onto a large display for the group to see. It took about 10 mins.

I put forward the idea that if we were of a criminal nature we could break into his house whilst he was away and steal all of his stuff. I asked the class how viable that option would be and they agreed it was incredibly realistic. I then put forward the idea that since we know who his mother's mother is from his status updates we could get some credit cards planned as we know his date of birth, mother's maiden name and address. We could pick up the cards when we visit his house to break in. Again, it was seen as a viable option should we have a criminal nature. Finally I said, we know his email address. Who thinks that the password will be his pet fish, football team or some other piece of information that he's volunteered?

I didn't go so far as to break into someone's email (as I'd already shown the class that most of their passwords were their kids' or loved ones' names, pets or sports teams), but it left the group with the feeling I was aiming for. To break into a house you have to be there - but he'd given us all we needed to know on how to do it. To break into someone's entire online life you only need an email and password and you can be anywhere.
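
If you want to check your own habits against this, here is a rough Python sketch of a self-audit. It only tests whether a password contains something you have already posted publicly. The function name `leaked_password_hints` and the sample facts are invented for illustration, and a real password audit would need to check far more than this.

```python
def leaked_password_hints(password, public_facts):
    """Return the public facts that appear inside the password.

    Matching ignores case and spaces, so "Real Madrid" matches "realmadrid99".
    public_facts is a hypothetical list of things shared on a profile.
    """
    squashed = password.lower().replace(" ", "")
    hits = []
    for fact in public_facts:
        token = fact.lower().replace(" ", "")
        # Ignore empty entries, which would otherwise match every password.
        if token and token in squashed:
            hits.append(fact)
    return hits

# Invented example: pet name, team and birth year taken from a public profile.
facts = ["Bubbles", "Rovers", "1985"]
print(leaked_password_hints("bubbles1985!", facts))
```

Anything this returns is a piece of the password that a stranger could guess from the profile alone.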

With social media being integrated more and more with third party applications one fault in our security could have all our online identity compromised and we wouldn't even be aware of it until it was too late.

URL shorteners also cause problems. With Twitter constricting us to 140 characters, people use Bit.ly or some equivalent to point people to sites they think are worthy of note. These mask the target URL and are ripe for exploitation by phishers. One click on a link posted to Twitter could lead you to a Facebook-like page whose only purpose is stealing your email and password. Chances are that your Facebook password is the same as your email password, which further compromises security.

I think 2010 will be the year of identity theft and compromised security. No longer do scammers have to write complicated code; they can just use the tools already available to them to help fabricate their ruse. Check YouTube for a bunch of videos telling you how to do this already - it's so easy that 13-year-olds are propagating it.

I agree the long term problems potentially brought by Facebook should be voiced loud and focussed on by government and human rights activists. However I think a basic literacy on how to use the internet is much more important for the short term as common criminals are much more dangerous in an immediate way.

I do think Facebook (and Twitter) are government funded operations that have the potential to go awry, but as long as we hold the leash and not the other way around it can be a useful tool for a great number of things.

Cheers for letting me vent!

-m0r


reply posted on 15-2-2010 @ 06:24 AM by cushycrux
reply to post by LadySkadi



Caution! This Thread is already examined!

www.google.ch...


add:
Try searching google.com for this, without the outer quotes: "site:www.abovetopsecret.com intext:google"

Sorry, I can't post a link, it's too complex. Try it yourself with copy and paste. Be careful, any user can use Google for data mining!
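
For those who would rather have a clickable link, a short Python sketch can build one from the query above. The helper name `google_search_url` is made up here, and the sketch assumes Google's usual `https://www.google.com/search?q=...` address format.

```python
from urllib.parse import urlencode

def google_search_url(query):
    """Build a Google search URL for the given query string."""
    # urlencode percent-escapes the colons and turns the space into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})

print(google_search_url("site:www.abovetopsecret.com intext:google"))
```

Swapping the word after `intext:` searches the same site for any other term.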



[edit on 15-2-2010 by cushycrux]
