We've written about BIG data before, and while some reckon it's sexy, you'd better roll up your sleeves because you'll invariably need to do a lot of 'janitorial' (a.k.a. shit) work first!
Ron Sandland recently wrote about the new phenomenon of 'big data' - weighing up the benefits and concerns. Terry Speed reflected on the same issue in a talk earlier this year in Gothenburg, Sweden, noting that this is nothing new to statisticians. So what's all the fuss about? Here's another take on the 'big data' bandwagon.
Information-gap decision theory creates a gap in ecological applications and then fills it May 14, 2014
You may not have heard of Info Gap Decision Theory (IGDT) but don't worry, not many people have.
While the theoretical foundations of IGDT have been well developed and articulated by its architect Yakov Ben-Haim at the Israel Institute of Technology, controversy continues to surround its legitimacy as a credible alternative to existing methodologies.
The issue has again resurfaced with the publication of a letter to the Editor of Ecological Applications by Professor Mark Burgman and Dr. Helen Regan arguing that IGDT is both useful and credible.
Professor David Fox, a one-time IGDT follower, weighs into the debate. His views are expressed below.
In their recent letter to Ecological Applications, Burgman and Regan (2014) provide counterarguments to some of Sniedovich's (2012) severe and mostly harsh criticisms of Ben-Haim's info-gap decision theory (IGDT) (Ben-Haim, 2006). While I have a deep respect for Professor Burgman and Dr. Regan, I believe their unwavering faith in info-gap theory is misplaced. As the title of this note suggests, I agree with Sniedovich (2014) that 'the gap' referred to by Burgman and Regan (2014) is illusory.
For the record, I have worked alongside Ben-Haim, Burgman, Regan, and many others who (myself included) got caught up in a rather unscientific infatuation with a 'new' paradigm some ten years ago. I also plead mea culpa to having co-authored a paper on the application of IGDT to the problem of statistical power analysis (Fox et al., 2007). It was, in essence, a case of a solution in search of a problem. On reflection, the problem we tackled was eminently solvable within existing frameworks - and possibly better handled by those frameworks (see for example Reyes and Ghosh, 2013).
Sniedovich has waged a vigorous campaign against IGDT which, as noted by Burgman and Regan (2014), has at times been "disingenuous" - a case perhaps of what we football-loving Australians refer to as playing the man and not the ball. Nevertheless, Sniedovich has played a pivotal role in stress-testing the theory as well as urging IGDT practitioners to think more carefully about their models and analysis.
Not long after the publication of our own IGDT application paper (Fox et al., 2007), I began to have reservations about the utility or, more correctly, the necessity of the whole approach. To be clear, I don't think there is anything fundamentally wrong with IGDT, but when you strip it of its rather abstruse mathematics, it is essentially little more than the formalisation of a deterministic sensitivity analysis (a fact readily acknowledged by Ben-Haim himself). While I expressed concerns about the use and interpretation of the robustness metric and questioned the ability of IGDT to handle simultaneous (and correlated) uncertainty in more complex multi-parameter models (Fox, 2008), a more fundamental question is "do we need IGDT at all?" I believe not. As noted by Burgman and Regan (2014), there already exists a plethora of 'conventional' tools to deal with uncertainty and, unlike IGDT, these come 'certified' by virtue of their long history of use and acceptance by the broad scientific community.
Outside the isolated pockets of support for IGDT, the theory remains largely unknown. Certainly within statistical circles, no one I've spoken to has heard of Ben-Haim or IGDT. In 2009 I sent a post titled "What is Info-Gap Theory" to Andrew Gelman's blog (http://goo.gl/EPKp3h). Gelman, a highly-credentialed Bayesian statistician at Columbia University and co-author of the popular text "Bayesian Data Analysis" (Gelman et al., 2013), frankly admitted he had never heard of IGDT and, after having looked at some of the material, concluded that the complicated mathematics "appeared to be a distraction from the more important goals of modelling the decision problems directly". Another contributor to the blog noted that "there seems to be interesting sociological questions about how such theories come to be dominant in certain narrow fields", to which Gelman offered the following insight:
Regarding the sociological question, I have a theory, which I believe I mentioned in the rejoinder to my recent Bayesian Analysis article. The theory is that (a) there are a lot of ways to get a good solution to any particular statistical problem, and (b) people will often attribute the success to the method rather than to the analyst. The result is that, first, people in applied fields can become easily convinced of the efficacy of any particular method, if applied by a charismatic practitioner; and, conversely, said practitioner will become even more confident of the virtues of his or her method, once it is endorsed by practical researchers in applied fields.
Ben-Haim is charismatic, articulate, and intelligent. These qualities resonated within the newly conceived Australian Centre of Excellence for Risk Analysis (ACERA) at the University of Melbourne, whose mandate was broadly to provide knowledge, tools, and advice to better manage and understand biosecurity risk. And so it was that IGDT rapidly embedded itself within ACERA as a tool of choice for assessing 'risk', although, to be fair, ACERA project 0705 was commissioned to review the role and treatment of uncertainty in risk assessments (Hayes, 2011). Section 4.4.3 of this comprehensive review examined the role of IGDT in a biosecurity context. In his introduction, Hayes (2011) notes that "IGT is different because it offers a non-probabilistic approach to decision-making under uncertainty", although he later acknowledges that "deterministic models ... have limited utility in a risk assessment context".
For me, the 'IGDT debate' has largely been a technical one that has been dominated by one protagonist and one defendant and a small but loyal bunch of supporters. What appears to be lacking is evidence in the form of case studies where the superiority of actual decisions made on the basis of an IG analysis can be demonstrated when compared to decisions that would have been made had more traditional methods been employed. If this evidence exists and stands the scrutiny of normal scientific review, then I believe IGDT has a rightful role in the risk analyst's toolbox - even if it shares features with or can be subsumed within other, more established paradigms. Mathematicians and statisticians are used to the rebadging of their techniques. Genichi Taguchi cleverly repackaged ANOVA for engineers by using familiar terms such as signal-to-noise ratio in place of Mean Square Error and orthogonal arrays instead of fractional factorial designs, while Multi Criteria Analysis (MCA) is a favoured tool of environmental scientists - otherwise known by its original name of Goal Programming (which interestingly utilises the concept of satisficing, as does IGDT!).
In the end, it doesn't matter what you call it and how it's packaged if it leads to more informed decision-making. If practitioners find Ben-Haim's IGDT and concepts like robustness easier to use and interpret than Wald's maximin criterion - so be it. But I doubt it!
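For readers who have not met either idea, the comparison between info-gap robustness and Wald's maximin can be sketched in a few lines. This is only an illustration and is not taken from any of the papers cited here: the interval uncertainty model, the two decisions ("cautious" and "aggressive"), their reward functions, and all the numbers are hypothetical assumptions. Robustness is computed by brute force as the largest uncertainty horizon alpha for which the worst-case reward still meets a critical requirement, which is nothing more than a deterministic sensitivity analysis swept over alpha.

```python
# Hypothetical decisions: each maps an uncertain parameter u to a reward.
REWARDS = {
    "cautious": lambda u: u + 1.0,          # modest payoff, weak dependence on u
    "aggressive": lambda u: 3.0 * u - 7.0,  # higher nominal payoff, strong dependence on u
}
U_NOMINAL = 5.0   # assumed best estimate of u
R_CRITICAL = 3.0  # assumed minimum acceptable reward


def worst_case(reward, u_nominal, alpha, n_grid=201):
    """Smallest reward on a grid over [u_nominal - alpha, u_nominal + alpha]."""
    lo = u_nominal - alpha
    step = 2.0 * alpha / (n_grid - 1)
    return min(reward(lo + i * step) for i in range(n_grid))


def robustness(reward, u_nominal, r_critical, alpha_max=10.0, n_alpha=1001):
    """Largest alpha on a grid whose worst-case reward still meets r_critical.

    Returns None if the requirement fails even at the nominal estimate.
    """
    best = None
    for k in range(n_alpha):
        alpha = alpha_max * k / (n_alpha - 1)
        if worst_case(reward, u_nominal, alpha) >= r_critical:
            best = alpha
        else:
            # Intervals are nested, so the worst case cannot improve
            # as alpha grows; stop at the first failure.
            break
    return best


def maximin_choice(rewards, u_nominal, alpha):
    """Wald's maximin over one fixed interval: best worst-case decision."""
    return max(rewards, key=lambda name: worst_case(rewards[name], u_nominal, alpha))


if __name__ == "__main__":
    for name, reward in REWARDS.items():
        print(name, "robustness:", robustness(reward, U_NOMINAL, R_CRITICAL))
    for alpha in (0.5, 2.0):
        print("maximin choice at alpha =", alpha, "->",
              maximin_choice(REWARDS, U_NOMINAL, alpha))
```

In this toy setting the "aggressive" decision has the higher reward at the nominal estimate but the smaller robustness, and the maximin choice flips from "aggressive" to "cautious" as the assumed interval widens from 0.5 to 2.0. Both calculations rest on the same worst-case evaluation; they differ only in whether alpha is fixed in advance or swept until the requirement fails.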
Prof. David Fox May 14, 2014
Literature Cited
Ben-Haim, Y. 2006. Info-Gap Decision Theory: Decisions Under Severe Uncertainty. 2nd ed., Academic Press, Oxford, UK.
Burgman, M.A. and Regan, H.M. 2014. Information-gap decision theory fills a gap in ecological applications. Ecological Applications, 24:227-228.
Fox, D.R. 2008. To IG or not to IG? - that is the question. Decision Point, 24:10-11.
Fox, D.R., Ben-Haim, Y., Hayes, K.R., McCarthy, M., Wintle, B. and Dunstan, P. 2007. An Info-Gap Approach to Power and Sample-size calculations. Environmetrics, 18:189-203.
Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A. and Rubin, D.B. 2013. Bayesian Data Analysis, Third edition, Chapman and Hall/CRC.
Hayes, K.R. 2011. Issues in quantitative and qualitative risk modelling with application to import risk assessment. ACERA project (0705). Australian Centre of Excellence for Risk Analysis, University of Melbourne.
Reyes, E.M. and Ghosh, S.K. 2013. Bayesian average error-based approach to sample size calculations for hypothesis testing. J. Biopharmaceutical Statistics, 23:569-588.
Sniedovich, M. 2012. Fooled by local robustness: an applied ecology perspective. Ecological Applications, 22:1421-1427.
Sniedovich, M. 2014. Response to Burgman and Regan: The elephant in the rhetoric on info-gap decision theory. Ecological Applications, 24(1):229-233.