posted on Jun, 20 2013 @ 05:06 AM
Originally posted by WhiteAlice
reply to post by BayesLike
I agree with you on the models and patterns, though maybe it's because I am also a biologist (and an accountant, so I'm a literal bean counter) that I see
the two groups as less distinct. Thinking in terms of probabilities is fairly beneficial to the physical sciences because it lends itself to creating
"what if" scenarios that may end up modifying hypotheses or even theories. It just takes 1,000 or more experiments with repeated results to prove…
RAND has written a whole slew of papers on the use of game theory in a pretty wide variety of subjects over the years, ranging from military
strategy to economics and politics. www.rand.org...
I'll check out some of these papers too, more to see what new ideas have been developed over the years. The applications will be fascinating too.
Some disciplines are more accepting of randomness than others. Physical scientists and most of the engineering disciplines (plus most programmers)
are not very open to randomness in an experiment -- anything less than extremely high precision is looked upon as bad technique. They don't deny
randomness exists and can talk about it some, but it's not a default thought pattern. Chemical engineers are a bit different: they are generally
quite open to statistical concepts, and quite a few get very deeply into applied stats. I haven't worked with a lot of biologists, but I would
imagine that biologists must be quite open to randomness in general -- as are psychologists and sociologists.
Where I think a lot of the comp sci community makes a possible error is in assuming that a model is somehow better if it captures all the detail. The
more modern machine learning tools all tend toward extreme flexibility. In many cases, to have a good model, you actually want a somewhat stiff function
for the response, to ensure the response model is not capturing noise. It's the support domains over which the model is to be applied that require
flexibility to isolate. Flexible capture-everything modelling is, in my mind, more a data compression method for future replay.
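To make the stiffness point concrete, here's a rough Python sketch (toy data and numbers I made up, nothing from a real problem): a very flexible high-degree polynomial chases the noise and looks better in-sample, while a stiffer low-degree fit usually holds up better on fresh data from the same process.

import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up): a smooth quadratic trend plus noise
x = np.linspace(0.0, 1.0, 30)
y = 2.0 * x**2 + rng.normal(scale=0.2, size=x.size)

# Very flexible fit: a high-degree polynomial free to chase the noise
flexible = np.polynomial.Polynomial.fit(x, y, deg=15)

# Stiffer fit: a low-degree polynomial forced to smooth over the noise
stiff = np.polynomial.Polynomial.fit(x, y, deg=2)

def rmse(pred, actual):
    return np.sqrt(np.mean((pred - actual) ** 2))

# In-sample, the flexible fit looks "better"...
print("in-sample  flexible:", rmse(flexible(x), y), " stiff:", rmse(stiff(x), y))

# ...but on fresh data from the same process the stiff fit usually wins
x_new = rng.uniform(0.0, 1.0, 200)
y_new = 2.0 * x_new**2 + rng.normal(scale=0.2, size=x_new.size)
print("out-of-sample  flexible:", rmse(flexible(x_new), y_new), " stiff:", rmse(stiff(x_new), y_new))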
As you note elsewhere, when talking about the conditions in 2010 being right for some level of political activity, certain conditions do seem to make
political activity more likely. You are looking at the domain (the support) for political activity and don't expect a prediction function to explain
everything (both activity and non-activity) over the entire spectrum of possible outcomes.
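Roughly what I have in mind, as a toy Python sketch (the condition variables, thresholds, and the little response formula are all hypothetical): first characterize the support, then only attempt a response model inside it, and simply decline to predict outside it.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: rows are cases (say country-years), columns are three
# condition variables scaled to 0..1. All of this is invented for illustration.
conditions = rng.uniform(0.0, 1.0, size=(500, 3))

# Step 1: characterize the SUPPORT -- the region of condition-space where the
# outcome is even plausible. Here it's a crude box rule; in practice this is
# what the contrast work is meant to uncover.
def in_domain(row):
    return (row[0] > 0.6) and (row[2] < 0.4)

# Step 2: a response model only makes sense inside that domain. Outside it we
# return None ("not applicable") instead of forcing a prediction everywhere.
def predict(row):
    if not in_domain(row):
        return None
    return 0.2 + 0.7 * row[1]   # toy response formula, valid only on the domain

inside = np.array([in_domain(r) for r in conditions])
print("fraction of cases inside the domain:", inside.mean())
print("example prediction:", predict(conditions[inside][0]))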
To build a good model and capture the domain, it's often best to compare and contrast the extremes and not look at anything in the messy middle. Because
of the high contrast, we can then find the factors that are relevant and build a tentative model. It then makes sense to go back and look at
the messy middle to see what the model does in those cases. I can't quite express what to do with problems like this in a few words.
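Something like this toy sketch is the general shape of it (Python, with made-up data and a deliberately crude screen): contrast the extremes to flag candidate factors, fit a tentative model on the extremes only, then go back and see how it behaves in the messy middle it never saw.

import numpy as np

rng = np.random.default_rng(2)

# Made-up example: 1000 cases, 20 candidate factors, a continuous response.
# Only factors 3 and 7 actually matter; the rest is noise.
X = rng.normal(size=(1000, 20))
y = 1.5 * X[:, 3] - 1.0 * X[:, 7] + rng.normal(scale=1.0, size=1000)

# Compare and contrast the extremes (top and bottom 10%), skip the messy middle.
lo, hi = np.quantile(y, [0.10, 0.90])
extreme = (y <= lo) | (y >= hi)

# High contrast makes the relevant factors stand out: a crude screen that just
# compares factor means between the two extreme groups.
contrast = np.abs(X[y >= hi].mean(axis=0) - X[y <= lo].mean(axis=0))
candidates = np.argsort(contrast)[::-1][:3]
print("factors flagged by the extremes:", candidates)

# Tentative model fitted on the extremes only...
coef, *_ = np.linalg.lstsq(X[extreme][:, candidates], y[extreme], rcond=None)

# ...then go back and see what it does in the messy middle it never saw.
middle = ~extreme
resid = y[middle] - X[middle][:, candidates] @ coef
print("RMSE in the messy middle:", np.sqrt(np.mean(resid ** 2)))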
Some of the problems I get called in to solve have the same types of features: a full spectrum of responses and often thousands of possible factors.
Only some are relevant. Some of the factors are specific to the individual and some are specific to the environment, and then there are the interactions
between these two groups of factors. A machine learning tool usually tries to deal with whatever someone guesses is relevant or possibly relevant and can't
toss anything out -- which means it is blindly fitting noise. Also, because of their high flexibility, machine learning tools are really incapable of
interpolating through the messy middle if that data isn't included in their training. Somewhat stiff functions interpolate better through gaps in the
domain, and they extrapolate better.
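One last toy sketch on that interpolation point (Python again, invented data with a hole in the middle of the domain): a stiff low-degree polynomial will usually bridge the gap more sensibly than a maximally flexible "replay the training data" model like 1-nearest-neighbour.

import numpy as np

rng = np.random.default_rng(3)

# Invented training data with a hole in the middle of the domain (0.3 to 0.7)
x_train = np.concatenate([rng.uniform(0.0, 0.3, 60), rng.uniform(0.7, 1.0, 60)])
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.1, size=x_train.size)

# Somewhat stiff model: a low-degree polynomial
stiff = np.polynomial.Polynomial.fit(x_train, y_train, deg=3)

# Maximally flexible model: 1-nearest-neighbour, i.e. replay the training data
def one_nn(x_query):
    idx = np.argmin(np.abs(x_train[None, :] - np.asarray(x_query)[:, None]), axis=1)
    return y_train[idx]

# Evaluate both inside the gap the training data never covered
x_gap = np.linspace(0.35, 0.65, 50)
y_true = np.sin(2 * np.pi * x_gap)
print("stiff fit, RMSE in the gap:", np.sqrt(np.mean((stiff(x_gap) - y_true) ** 2)))
print("1-NN,      RMSE in the gap:", np.sqrt(np.mean((one_nn(x_gap) - y_true) ** 2)))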