Wednesday, March 3, 2010

Model uncertainty

I am opening a new thread to discuss issues arising from the Grilling Jones post. It has to do with data sharing and the relation between datasets and theoretical models, so please stay on this topic when commenting. The reference is to a sociological study (thanks to Jin W for alerting me!). In case you find it difficult to access the article (free our data!), I reproduce the main findings from its conclusion below.

Young, Cristobal (2009) "Model Uncertainty in Sociological Research: An Application to Religion and Economic Growth," American Sociological Review 74 (June): 380–397.

...In methodological terms, this article illustrates how research findings can contain a great deal of model uncertainty that is not revealed in conventional significance tests. A point estimate and its standard error is not a reliable guide to what the next study is likely to find, even if it uses the same data. This is true even if, as in this case, the research is conducted by a highly respected author and is published in a top journal. Below, I outline a number of specific steps that could help improve the transparency and credibility of statistical research in sociology.

1. Pay greater attention to model uncertainty. The more that researchers (and editors and reviewers) are attuned to the issue of model uncertainty, it seems likely that more sensitivity analyses will be reported. Researchers with results they know are strong will look for ways to signal that information (i.e., to report estimates from a wider range of models). Results that depend on an exact specification, and unravel with sensible model changes, are not reliable findings. When this is openly acknowledged, the extensiveness of sensitivity analysis will, more and more, augment significance tests as the measure of a strong finding.
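
To make the point concrete, here is a minimal sketch (illustrative only, not from Young's article) of this kind of specification sensitivity check; the variable names and the synthetic data are hypothetical stand-ins for the religion-and-growth setting:

```python
# Sketch: re-estimate a focal coefficient under every combination of controls.
# All variable names and data are hypothetical stand-ins, not Young's dataset.
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "church_attendance": rng.normal(size=n),
    "gdp_initial": rng.normal(size=n),
    "education": rng.normal(size=n),
    "investment": rng.normal(size=n),
})
df["growth"] = 0.1 * df["church_attendance"] + 0.5 * df["gdp_initial"] + rng.normal(size=n)

controls = ["gdp_initial", "education", "investment"]

estimates = []
for k in range(len(controls) + 1):
    for subset in itertools.combinations(controls, k):
        formula = "growth ~ church_attendance" + "".join(" + " + c for c in subset)
        fit = smf.ols(formula, data=df).fit()
        estimates.append({"specification": formula,
                          "coef": fit.params["church_attendance"],
                          "se": fit.bse["church_attendance"]})

results = pd.DataFrame(estimates)
# The spread of the focal coefficient across specifications is the model
# uncertainty that a single point estimate and standard error conceal.
print(results.to_string(index=False))
```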

2. Make replication easier. Authors should submit complete replication packages (dataset and statistical code) to journals as a condition of publication, so that skeptical readers can easily interrogate the results themselves (Freese 2007). This is particularly important for methodologically complex papers where it can be quite difficult and time consuming to perform even basic replications from scratch (Glaeser 2006). Asking authors for replication materials often seems confrontational, and authors often do not respond well to their prospective replicators.

In psychology, an audit study found that only 27 percent of authors complied with data requests for replication (Wicherts et al. 2006). Barro's openness in welcoming this replication (readily providing the data and even offering encouragement) seems to be a rare quality. Social science should not have to rely on strong personal integrity of this sort to facilitate replication. The institutional structure that publishes research should also ensure that any publication can be subject to critical inspection.
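
As an illustration of what such a package's code component might contain (hypothetical file and variable names, not Young's materials), a driver script can be as small as this:

```python
# run_replication.py -- sketch of a replication-package driver script.
# All file, directory, and variable names are hypothetical placeholders.
import os

import pandas as pd
import statsmodels.formula.api as smf


def main() -> None:
    # The dataset shipped alongside the code in the package.
    df = pd.read_csv("data/analysis_sample.csv")

    # Re-estimate the published specification from scratch.
    fit = smf.ols("growth ~ church_attendance + gdp_initial + education",
                  data=df).fit()

    # Write the full results so readers can compare with the published table.
    os.makedirs("output", exist_ok=True)
    with open("output/replication_results.txt", "w") as f:
        f.write(fit.summary().as_text())


if __name__ == "__main__":
    main()
```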

3. Establish routine random testing of published results. Pushing the previous point a bit further, Gerber and Malhotra (2006) suggest establishing a formal venue within journals for randomly selected replications of published articles. The idea is to develop a semiregular section titled "Replications," with its own designated editor, in which several of the statistical papers each year are announced as randomly selected for detailed scrutiny, with wide distribution of the data and code, and the range of findings reported in brief form (as in Table 1).

Indeed, this could provide ideal applied exercises for graduate statistics seminars. Even if only a dozen or so departments across the country incorporate it into their classes, this alone would provide a remarkably thorough robustness check. The degree of model variance would quickly become transparent. Moreover, the prospect of such scrutiny would no doubt encourage researchers to preemptively critique their own findings and report more rigorous sensitivity analyses.

4. Encourage pre-specification of model design. One of the problems in statistics today is that authors have no way to credibly signal when they have conducted a true (classical) hypothesis test. Suppose a researcher diligently plans out her model specifications before she sees the data and then simply reports those findings. This researcher would be strategically better off to conduct specification searches to improve her results because readers cannot tell the difference between a true hypothesis test and a data mining exercise.

The situation would be greatly improved if there were some infrastructure to facilitate credible signaling. A research registry could be a partial solution. In medical research, clinical trials must be reported to a registry, giving a detailed account of how they will conduct the study and analyze the data, before beginning the trial.

A social science registry would similarly allow authors to specify their models before the data become available (Neumark 2001). This is feasible for established researchers using materials like time series data or future waves of the major surveys (e.g., NLSY, PSID, and GSS). This will, for the subset of work that is registered, bring us back to a time when model specification had to be carefully planned out in advance. Authors could then report the results of their pre-specified designs (i.e., their true hypothesis tests), as well as search for alternative, potentially better, specifications that can be tested again when the next round of data becomes available.

Because most data already exist, and authors can only credibly pre-specify for future data, this would be a long-term strategy for raising the transparency of statistical research and reducing the information asymmetry between analyst and reader. Thirty years ago, model uncertainty existed but computational limitations created a "veil of ignorance": neither analyst nor reader knew much about how model specification affected the results. Today, authors know (or can learn) much more about the reliability of their estimates (how much results change from model to model) than their readers.

As Hoeting and colleagues (1999:399) argue, it seems clear that in the future, "accounting for model uncertainty will become an integral part of statistical modeling." All of the steps outlined here would go far, as Leamer (1983) humorously put it, to "take the con out of econometrics." (pp. 394-95)

11 comments:

Steve Carson said...

In his amazingly wide-ranging The Black Swan, Nassim Nicholas Taleb comments on the "confirmation bias" (a lesser point among everything he covers, but still significant).

The Confirmation Bias, or Why None of Us are Really Skeptics

Human nature makes us want to confirm our own theories. And anyone who looks likely to overturn them is obviously a threat.

This suggests that an institutional approach to allowing/encouraging replication and falsification would be a huge step forward. Perhaps obvious in hindsight?

This seems especially desirable in climate science, many aspects of which are quite new and unsettled. For example, the theoretical basis for using "ensembles of models" instead of "a model" is unclear, yet it is an important pillar in the attribution of 20th-century climate change to CO2.

Unfortunately, the huge stakes involved have made it that much harder for everyone to take a step back and review the strength of the evidence.

This only increases the desirability of a framework that pushes hard in the other direction, against the natural instincts of the scientists promoting their science and a worthy cause.

P Gosselin said...

A resounding YES to all 4 points!

Hans von Storch said...

Reiner,
I wonder how your comment relates to the situation in climate research. Climate models are not statistical models, but process-based models. Indeed, the meaning of the word "model" differs very strongly from community to community. (See, e.g., Müller, P., and H. von Storch, 2004: Computer Modelling in Atmospheric and Oceanic Sciences - Building Knowledge. Springer Verlag, Berlin - Heidelberg - New York, 304 pp, ISSN 1437-028X)

@ReinerGrundmann said...

Hans
I am afraid I am not the right person to comment on this question. I thought, however, that there were similarities to the paleoclimate controversy.

AnonyMoose said...

Climate models are full of processes, but there are a lot of statistical components. It would be helpful if there were error range calculations carried along with every process step, parameterized adjustment, and random variation. Of course, some other climate work such as paleoclimate study is full of statistical tasks for which these guidelines would be very helpful.
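
A toy sketch of what carrying an error range along with each process step could look like, under the simplest assumptions (independent one-sigma errors combined in quadrature; the numbers are invented):

```python
# Toy sketch: a value paired with a 1-sigma error range, carried through a
# processing step. Assumes independent errors combined in quadrature;
# all numbers are invented.
import math
from dataclasses import dataclass


@dataclass
class Uncertain:
    value: float
    sigma: float

    def __add__(self, other: "Uncertain") -> "Uncertain":
        # Independent errors add in quadrature.
        return Uncertain(self.value + other.value,
                         math.hypot(self.sigma, other.sigma))

    def scale(self, k: float) -> "Uncertain":
        return Uncertain(k * self.value, abs(k) * self.sigma)


measured = Uncertain(240.0, 2.0)      # e.g. an observed quantity
adjustment = Uncertain(3.5, 1.5)      # e.g. a parameterized correction
print(measured + adjustment)          # Uncertain(value=243.5, sigma=2.5)
```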

Anonymous said...

Good article. VERY hard to read with no blank lines in between paragraphs.

@ReinerGrundmann said...

Fixed -- better now?

Unknown said...

When I joined a faculty of geography after being educated in meteorology, I did not understand the models of human geographers. After one year, I realized that their "models" are our "parameterizations". It was fortunate that we had corresponding (or "commensurable") concepts.

Here I understand that the title of the article was written in the dialect of social scientists. In climate modelers' dialect, it reads "Parameterization uncertainty".

eduardo said...

@ 5
AnonyMoose

could you explain a bit more what you mean by "but there are a lot of statistical components" and "random variations"?

If we run a climate simulation with the same model, on the same computer, and with the same initial conditions, we get the same result. The only random component is the choice of initial conditions. Other than that, there is no randomness included in the simulations (not considering weird errors in the parallelization or memory management and the like, which I do not think you were referring to).
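
A toy illustration of this point, with the logistic map standing in for a deterministic model and the only randomness coming from perturbed initial conditions:

```python
# Toy illustration: the model itself is deterministic; an "ensemble" arises
# only from perturbing the initial condition. The logistic map stands in for
# a climate model here, purely for illustration.
import random


def run_model(x0: float, steps: int = 100) -> float:
    x = x0
    for _ in range(steps):
        x = 3.9 * x * (1.0 - x)   # deterministic update rule
    return x


# Same initial condition -> bit-for-bit the same result.
assert run_model(0.2) == run_model(0.2)

# Randomness enters only through the choice of initial conditions.
ensemble = [run_model(0.2 + random.uniform(-1e-6, 1e-6)) for _ in range(10)]
print(min(ensemble), max(ensemble))
```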

Anonymous said...

Hello,

I would like to add some comments as someone who is familiar with software engineering and has significant experience in programming. I have degrees in Physics and Mathematics.

A bedrock of science is reproducibility. If computer-generated output plays a part in science, the code must be made public; it is not enough to give specifications or methods or whatever. What happens if another scientist, using the specifications or methods or whatever, gets different results? What if one of the programs had an error; who made the error? There is a saying in software engineering that the program is its own documentation. Many times seemingly small changes are made by a programmer, and the programmer forgets to document the changes.

A computer program, if it is useful, almost always undergoes changes as long as it is used. Perhaps new features are added or bugs are corrected. Confusion can arise if one publishes a paper using one version of a program and later changes are made to the program. It is not difficult to keep track of different versions.
CVS is a system used in Open Source software that does just that. To use CVS you make your changes, save (commit) the file, and you get a new current version. CVS works by creating diff files that keep track of the changes made from the previous version. One can retrieve version 1, apply a diff file to get version 2, apply another diff file to get version 3, and so on. It is not difficult to recreate all previous versions of a file.

One can also use a versioning system like CVS for data files. The base file of, say, weather station data may be used as version 1, and changes from homogenization or whatever may be applied as later versions. This would allow one to reproduce a changing data file as it stood at a particular time.
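
A loose illustration of the diff-based idea using Python's standard difflib module (CVS stores its diffs differently; this only shows that versions can be reconstructed from stored deltas, with invented station values):

```python
# A loose illustration of the diff-based idea with Python's difflib.
# (CVS stores compact diffs; ndiff deltas carry both sides, so this only
# demonstrates reconstructing versions from stored deltas.)
import difflib

v1 = ["tmax 21.3\n", "tmax 19.8\n", "tmax 20.1\n"]   # invented station values
v2 = ["tmax 21.3\n", "tmax 19.9\n", "tmax 20.1\n"]   # after one adjustment
v3 = ["tmax 21.3\n", "tmax 19.9\n", "tmax 20.4\n"]   # after a later adjustment

# Store the deltas between successive versions...
delta_12 = list(difflib.ndiff(v1, v2))
delta_23 = list(difflib.ndiff(v2, v3))

# ...and reconstruct any version on demand.
assert list(difflib.restore(delta_12, 1)) == v1
assert list(difflib.restore(delta_12, 2)) == v2
assert list(difflib.restore(delta_23, 2)) == v3
print("all versions reconstructed")
```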

Often intermediate files are created. Needless to say, the programs that create them must also be saved.

Finally, one should have a script (or a makefile) that goes through all the steps, including creating intermediate files, to get the results. Open Source projects usually use makefiles. The data files and programs are generally put together in what is called a tarball, along with a README file.
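
A minimal sketch of such a driver script in Python (a makefile would serve equally well; every step and file name below is a hypothetical placeholder):

```python
# run_all.py -- sketch of an end-to-end driver script (a makefile would do
# the same job). All step and file names are hypothetical placeholders.
import subprocess
import sys

STEPS = [
    ["python", "clean_station_data.py"],       # raw data -> intermediate files
    ["python", "homogenize.py"],               # intermediate -> adjusted series
    ["python", "make_figures_and_tables.py"],  # adjusted series -> final output
]


def main() -> None:
    for cmd in STEPS:
        print("running:", " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            sys.exit("step failed: " + " ".join(cmd))


if __name__ == "__main__":
    main()
```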

Not only does something like the above allow results to be easily reproduced, but in my experience such a procedure can save the programmer time. Admittedly there may be a steep learning curve, but if one spends five years developing and using a large software system, it probably pays to automate as much of the work as possible. A large system by definition involves many interacting parts, and it is hard to keep everything in one's head. Documentation on how to generate the results from the data files to the output generally does not work because it may not be kept up to date. The script or makefile is the documentation.

klee12

Zajko said...

This article deals with quantitative social science, but I do see a lot of parallels with climate modeling (though I admit that is something I have no experience with).

The journal publication format has never been well suited to the sort of data and model accountability advocated here. I agree with the need, and am happy that the computer age makes this possible, but I can also understand the resistance.
Random testing by graduate students (and fodder for one's critics)? Preparing a complete replication package in addition to a journal article?
Hmpf... maybe I'll just take my work to another journal.

The pre-specification of model design seems something more particular to social sciences, where it is very easy to mine datasets for statistically significant correlations and use that as a finding.

I have had real trouble with the quantification of uncertainty in climate model predictions. Sure we can check the variance of multiple models, but that seems to only scratch the surface of the possible uncertainty.
As for the historical temperature record of the last century - it would be great to see how sensitive these records (or models if you will) are to various specifications.

I like the preceding discussion of CVS - wouldn't it be handy if such a tool had been used from the outset on climate datasets? (Along with other kinds of wishful thinking about what might have been done had people known how the data would be used.)