Strange Weather gains new Strange Weatherman

12 11 2012

Dear reader,

It is with great pleasure that I welcome to this blog my esteemed friend, colleague, and lobster-slayer Kevin Anchukaitis.

Kevin Anchukaitis coring a tree in a particularly scenic location

Kevin is a fellow paleoclimatologist with origins in dendroclimatology (using trees to infer past climate conditions), a long record of field work in some of the world’s most amazing places (Bhutan, Vietnam, Cambodia, Guatemala), an equally long record of great publications, a healthy obsession with good code, a peerless taste for good wine, and an even better writing style, with which he has previously graced the New York Times “Scientist at Work” blog. It is a privilege and an honor to welcome him as a fellow Strange Weatherman, and I trust you will enjoy his first post as much as I did. Let’s hope for many more to come!

Happy reading,

Julien

PS: It should be said that Strange Weather science is gender-blind, so we will heartily welcome Weatherwomen too. And this reminds us that someone should coin a less sexist term than “weatherman” to designate the TV meteorologists whose job it is to gesticulate in front of a blue screen while blurting out technical terms about the weather.





Climate Informatics

24 09 2012

It’s a good feeling when you are part of a world changing in a positive direction. I just came back from the Climate Informatics workshop in Boulder, CO (always an amazing place to visit, especially in the fall), where I witnessed our field evolving towards smarter approaches to large-scale data analysis. Informatics is a relatively new term, so we spent some time pondering what it means. One definition I like is “the study of the processing, management, and retrieval of information as applied to climate science”. Broadly speaking, the workshop brought together a 50/50 mix of climate scientists on one side and “information folks” on the other, be they statisticians, computer scientists, or members of the machine learning community that swims in between.

It’s always exciting to see smart people take an interest in your field, and there were definitely some aha moments for me about what machine learning could bring to the study of climate variability. The talks are available here in mp4 format.

Before we can bring the power of automation to the study of past climates, however, we must structure the data in a way that enables data mining. This is currently not the case, and it is something the community is working towards. Towards a semantic web for paleoclimatology is a recently submitted paper that explains the problem and proposes some solutions; I hope some of you find it useful, especially those in the paleoclimatology community!
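For the non-specialists among you, here is a minimal sketch of what “structuring the data so it can be mined” might look like in practice. The field names and the query below are illustrative assumptions of mine, not the vocabulary proposed in the paper; the point is simply that once every record carries standardized, machine-readable metadata, questions about the proxy network become one-line programmatic queries.

```python
# Toy example of a machine-readable proxy record. Field names are hypothetical,
# chosen only to illustrate standardized, queryable metadata; they are not the
# schema proposed in the paper.
proxy_record = {
    "archiveType": "tree ring",          # e.g. coral, ice core, speleothem
    "measuredVariable": "ring width",
    "inferredVariable": "temperature",
    "units": "dimensionless index",
    "geo": {"lat": 27.5, "lon": 90.4, "elev_m": 3200.0},
    "timeCoverage": {"start_yr_CE": 1450, "end_yr_CE": 2005},
    "chronologyMethod": "crossdating",
    "reference": "doi:placeholder",
}


def find_records(records, archive_type, min_years):
    """Return all records of a given archive type spanning at least min_years."""
    return [
        r for r in records
        if r["archiveType"] == archive_type
        and (r["timeCoverage"]["end_yr_CE"] - r["timeCoverage"]["start_yr_CE"]) >= min_years
    ]


# e.g. all tree-ring records at least 500 years long:
long_tree_rings = find_records([proxy_record], "tree ring", 500)
```

Multiply that by thousands of records in a common format, and the kind of large-scale, automated analysis discussed at the workshop becomes possible.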

Back to work; hoping to post more than once a year in the future…

E.N.





McShane and Wyner

12 11 2010

An influential new paper on paleoclimate reconstructions of the past millennium has been doing the rounds since August.

Authored by Blake McShane and Abraham Wyner, two professional statisticians with business-school affiliations, the work is remarkable in more than one way:

  1. It is performed by professional statisticians on an up-to-date network of proxy data (that of Mann et al., PNAS 2008), a refreshing change from the studies of armchair skeptics who cherry-pick their proxies so they can get a huge Medieval Warm Period.
  2. It uses modern statistical methods, unlike the antiquated analyses that Climate Audit and other statistical conservatives have wanted us to use for a few centuries now. (Whether these methods lead to correct conclusions is investigated below.)
  3. It makes the very strong claim that climate proxies are essentially useless.

It is the last claim that has justifiably generated a great uproar in my community. For, if it were correct, it would imply that my colleagues and I have been wasting our time all along – we had better go to Penn State or Kellogg to beg for food, as the two geniuses have just put us out of a job. Is that really so?

Before I start criticizing it, here are a few points I really liked:

  • I agree with their conclusion that calibration/verification scores are a poor metric of out-of-sample performance, which is indeed the only way we can judge reliability (“models that perform similarly at predicting the instrumental temperature series […] tell very different stories about the past”). In my opinion, a much more reliable assessment of performance comes from realistic pseudoproxy experiments: the Oracle of paleoclimate reconstructions. Although there are problems with how realistic they are at present, the community is moving fast to improve this.
  • Operational climate reconstruction methods should indeed outperform sophisticated nulls, but caution is required here: estimating key parameters from the data to specify the AR models, for instance, reduces the degree of “independence” between the null models and the climate signal.
  • Any climate reconstruction worth its salt should give credible intervals (or confidence intervals if you insist on being a frequentist… though whenever you ask a non-statistician what a confidence interval is, the answer invariably corresponds to the definition of a credible interval).
  • The authors are really spot-on about the big question: what is the probability that the current temperature (say, averaged over a decade) is unprecedented in the past 2 millennia? And more importantly, what is the probability that the current rate of warming is unprecedented in the past 2 millennia? Bayesian methods give a real advantage here, as their output readily yields an estimate of such probabilities (a minimal sketch of this counting exercise appears after this list), instead of wasting time with frequentist tests, which are almost always ill-posed.
  • Finally, I agree with their main conclusion:

“the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth. Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model.” That being said, it is my postulate that when climate reconstruction methods incorporate the latest advances in the field of statistics, the current warming and its rate will indeed be found to be unprecedented… the case for it just isn’t completely airtight yet.
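As promised above, here is a minimal sketch of the counting exercise those probabilistic questions boil down to once you have a posterior ensemble of reconstructions. Everything below is synthetic and purely illustrative (the ensemble size, the array shapes, and the imposed recent warming are my own assumptions, not McShane and Wyner’s model); the point is only that posterior draws turn “is the current warming unprecedented?” into a matter of counting.

```python
import numpy as np

# Minimal, synthetic sketch. In a real application, `posterior` would be an
# ensemble of temperature histories drawn from a Bayesian reconstruction's
# posterior distribution, with shape (n_draws, n_years). Here it is faked
# (white noise plus an imposed recent warming) purely to show the mechanics.
rng = np.random.default_rng(0)
n_draws, n_years = 1000, 2000
posterior = rng.normal(0.0, 0.3, size=(n_draws, n_years))
posterior[:, -150:] += np.linspace(0.0, 0.8, 150)  # fake recent warming, for the demo

# Decadal averages for each posterior draw
decadal = posterior.reshape(n_draws, -1, 10).mean(axis=2)  # shape (n_draws, n_decades)

# P(most recent decade is the warmest of the past two millennia):
# simply the fraction of posterior draws in which that statement is true.
p_warmest = np.mean(decadal[:, -1] >= decadal.max(axis=1))


def trend(x):
    """Least-squares slope of a series (per time step)."""
    return np.polyfit(np.arange(x.size), x, 1)[0]


# Same question for the *rate* of warming: compare the last century's trend
# with every other non-overlapping century-long trend, draw by draw.
century_trends = np.array(
    [[trend(draw[i:i + 100]) for i in range(0, n_years, 100)] for draw in posterior]
)
p_fastest = np.mean(century_trends[:, -1] >= century_trends.max(axis=1))

print(f"P(last decade is the warmest)          ~ {p_warmest:.2f}")
print(f"P(last century's trend is the fastest) ~ {p_fastest:.2f}")
```

With a real reconstruction, the posterior array would come straight out of the sampler, and these fractions would be direct answers to the questions above, uncertainty included.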

Now for some points of divergence:

  • Pseudoproxies are understood to mean timeseries derived from the output of long integrations of general circulation models, corrupted by noise (Gaussian white or AR(1)) to mimic the imperfect correlation between climate proxies and the climate field they purport to record. The idea, as in much of climate science, is to use these numerical models as a virtual laboratory in which controlled experiments can be run. The climate output gives an “oracle” against which one can quantitatively assess the performance of a method meant to extract a climate signal from a set of noisy timeseries (the pseudoproxies), for varying qualities of the proxy network (as measured by its signal-to-noise ratio or sparsity). McShane and Wyner show a complete lack of familiarity with the literature (always a bad idea when you’re about to make bold claims about someone else’s field) by mistaking pseudoproxies for “nonsense predictors” (random timeseries generated independently of any climate information); the first sketch after this list illustrates the difference. That whole section of their paper is therefore of little relevance to climate problems.
  • Use of the lasso to estimate a regression model. As someone who has spent the better part of the past 2 years working on improving climate reconstruction methodologies (a new paper will be posted here in a few weeks), I am always delighted to see modern statistical methods applied to my field. Yet every good statistician will tell you that the method must be guided by the structure of the problem, and not the other way around. What is the lasso? It is an L1-based method aimed at selecting very few predictors from a list of many. That is extremely useful when you face a problem with many variables but expect only a few of them to actually matter for the thing you want to predict. Is that the case in paleoclimatology? Nope. Climate proxies are all noisy, and one never expects a small subset of them to dominate the rest of the pack; rather, it is their collective expression that is expected to have predictive power. Using the lasso to decimate the set of proxy predictors, and then concluding that the few survivors fail to tell you anything, is like treating whooping cough with chemotherapy, then diagnosing that the patient is much more ill than in the first place and ordering a coffin (the second sketch after this list illustrates the point). For a more rigorous explanation of this methodological flaw, please see the excellent reviews by Martin Tingley and by Peter Craigmile & Bala Rajaratnam.
  • Useless proxies? Given the above, the statement “the proxies do not predict temperature significantly better than random series generated independently of temperature” is positively ridonkulous. Further, many studies have shown that, in fact, proxies do beat nonsense predictors in most cases (they had better!!!). Michael Mann’s recent work (2008, 2009) has systematically reported RE scores against benchmarks obtained from nonsense predictors. I have found the same in the case of ENSO reconstructions.
  • Arrogance. When entering a new field and finding results at odds with those of the vast majority of its investigators, two conclusions are possible: (1) you are a genius and everyone else is blinded by the force of habit; (2) you have the humility to recognize that you might have missed an important point, which may warrant further reading and, possibly, interactions with people in said field. Spooky how McShane & Wyner jumped on (1) without considering (2). I just returned from a very pleasant visit to the statistics department at Carnegie Mellon, where I was happy to see that top-notch statisticians have quite a different attitude towards applications: they start by discussing with colleagues in an applied field before they unroll (or develop) the appropriate machinery to solve the problem. Since temperature reconstructions are now in the statistical spotlight, I can only hope that other statisticians interested in this topic will first seek to make a friend of a climate scientist rather than a foe.
  • Bayesianism: there is no doubt in my mind that the climate reconstruction problem is ideally suited to the Bayesian framework. In fact, Martin Tingley and colleagues wrote a very nice (if somewhat dense) paper explaining how all attempts made to date can be subsumed under the formalism of Bayesian inference. When Bayesians ask me why I am not part of their cult, I can only reply that I wish I were smart enough to understand and use their methods. So the authors’ attempt at a Bayesian multiproxy reconstruction is most welcome, because, as said earlier, it makes it possible to answer climate questions in probabilistic terms, thereby giving a measure of uncertainty about the result. However, I was stunned to see them mention Martin Tingley’s pioneering work on the topic and then proceed to ignore it. The reason why he (and Li et al 2007) “do not use their model to produce temperature reconstructions from actual proxy observations” is that they are a tad more careful than the authors, and prefer to validate their methods on known problems (e.g. pseudoproxies) before recklessly applying them to proxy data and proceeding to fanciful conclusions. However, given that climate scientists have generally been quite reckless themselves in their use of statistical methods (without always subjecting them to extensive testing beforehand), I’ll say it’s a draw. Now that we’re even, let’s put down the gloves and be friends. Two wrongs have never made a right, and I am convinced that the way forward is for statisticians (Bayesians or otherwise) to collaborate with climate scientists to come up with a sensible method, test it on pseudoproxies, and then (and only then) apply it to proxy data. That’s what I will publish in the next few months, anyway… stay tuned 😉
  • 10 PCs: after spending so much time indicting proxies, it is a little surprising to see how tersely the methodology is described. I could not find a justification for their choice of the number of PCs used to compress the design matrices, and I wonder how consequential that choice is. Hopefully this will be clarified in the responses to the flurry of comments that have flooded the AOAS website since the article’s publication.
  • RegEM is misunderstood here once again. Both ridge regression and truncated total least squares (TTLS) are “errors-in-variables” methods aimed at regularizing an ill-conditioned or rank-deficient sample covariance matrix, to obtain what is essentially a penalized maximum likelihood estimate of the underlying covariance matrix of the { proxy + temperature } matrix. The authors erroneously state that Mann et al., PNAS 2008 use ridge regression, while in fact they use TTLS regularization. The error is of little consequence to their results, however.
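First sketch: the difference between pseudoproxies and nonsense predictors, in code. This is a deliberately crude toy, where a synthetic AR(1) series stands in for GCM output and the “reconstruction” is just a calibrated composite; the SNR, network size, and scoring choices are my own illustrative assumptions, not anyone’s published method.

```python
import numpy as np

rng = np.random.default_rng(42)
n_years, n_proxies = 1000, 30

# Stand-in for GCM output: a synthetic AR(1) "true" temperature series.
# (In a real pseudoproxy experiment this would come from a long model simulation.)
truth = np.zeros(n_years)
for t in range(1, n_years):
    truth[t] = 0.7 * truth[t - 1] + rng.normal()

# Pseudoproxies: the true signal plus white noise at a chosen signal-to-noise
# ratio; they do contain climate information, just degraded.
snr = 0.5
pseudoproxies = truth[:, None] + rng.normal(0.0, truth.std() / snr, size=(n_years, n_proxies))

# Nonsense predictors: random series generated with no reference to the truth.
nonsense = rng.normal(0.0, 1.0, size=(n_years, n_proxies))


def calibrate_and_score(predictors, target, n_cal=200):
    """Crude reconstruction: regress the target on the mean of the standardized
    predictors over a calibration period, then score it with the RE statistic
    over the withheld verification period."""
    z = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0)
    composite = z.mean(axis=1)
    cal, ver = slice(-n_cal, None), slice(None, -n_cal)
    slope, intercept = np.polyfit(composite[cal], target[cal], 1)
    recon = slope * composite + intercept
    mse = np.mean((recon[ver] - target[ver]) ** 2)
    mse_clim = np.mean((target[cal].mean() - target[ver]) ** 2)
    return 1.0 - mse / mse_clim  # RE > 0: better than "predict the calibration mean"


print("RE, pseudoproxies:      ", round(calibrate_and_score(pseudoproxies, truth), 2))
print("RE, nonsense predictors:", round(calibrate_and_score(nonsense, truth), 2))
```

Because pseudoproxies contain a degraded version of the true signal, the toy reconstruction built from them verifies well against the oracle, while the one built from nonsense predictors hovers around (or below) zero, which is exactly why the two cannot be used interchangeably.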
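Second sketch: why an L1 penalty is a poor match for the many-weak-predictors structure of proxy networks. Again a toy with synthetic data and arbitrarily chosen penalties (using scikit-learn’s Lasso and Ridge), not a reanalysis of the paper: every “proxy” here carries the same weak signal, so there is no sparse subset for the lasso to find.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(1)
n_years, n_proxies = 150, 100

# Synthetic setup mimicking the structure of the proxy problem: every predictor
# carries a weak, noisy version of the same signal, so there is no small subset
# of "important" variables for an L1 penalty to single out.
signal = rng.normal(size=n_years)
X = 0.3 * signal[:, None] + rng.normal(size=(n_years, n_proxies))
y = signal + 0.3 * rng.normal(size=n_years)

# Crude calibration/verification split
X_cal, X_ver = X[:100], X[100:]
y_cal, y_ver = y[:100], y[100:]

# Penalty strengths chosen arbitrarily, for illustration only
lasso = Lasso(alpha=0.1).fit(X_cal, y_cal)
ridge = Ridge(alpha=10.0).fit(X_cal, y_cal)

print("proxies retained by the lasso:", int(np.sum(lasso.coef_ != 0)), "of", n_proxies)
print("lasso verification R^2:", round(lasso.score(X_ver, y_ver), 2))
print("ridge verification R^2:", round(ridge.score(X_ver, y_ver), 2))
# In runs like this, the lasso typically zeroes out most of the individually weak
# (but collectively informative) predictors and verifies worse than ridge, which
# shrinks all coefficients without discarding any.
```

The design choice matters more than the fancy machinery: a penalty that assumes sparsity answers a question the proxy problem never asked.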

In summary, the article ushers in a new era in research on the climate of the past 2000 years: we now officially have the attention of professional statisticians. I believe that overall this is a good thing, because it means that smart people will start working with us on a topic where their expertise is much needed. But I urge all those who believe they know everything to please consult with a climate scientist before treating us like idiots! Much as it is my responsibility to make sure that another ClimateGate does not happen, may all statisticians who read this feel inspired to give their field a better name than McShane and Wyner have!

PS: Many other commentaries have been offered thus far. For a climate perspective, the reader is referred to a recent refutation by Gavin Schmidt and Michael Mann.





Climate Mammoth

10 04 2010

Former French science minister Claude Allègre is perhaps the most prominent global warming skeptic in my homeland. He is one of the few to have scientific credentials – but unfortunately, not in the right kind of science. Allègre is a specialist in what is called high-temperature geochemistry, where he was noted (and decorated) for his celebrated work on the age of the Earth, for instance. No doubt Allègre knows his stuff, as attested by his publication record and numerous medals. Unfortunately, his climate credentials are a little thinner, which is a problem when you start publishing several books essentially calling the entire climate science community a bunch of idiots, or worse – mobsters. His latest outcry (L’imposture Climatique, “Climate Fraud”) has upset so many of my colleagues that a petition was doing the e-rounds this week, in which the French climate science community asks current Minister Valérie Pécresse to hold an objective and fair debate at the Académie des Sciences. The full story is here, and the debate looks like it will indeed happen soon.

Why is such a debate necessary? Well, Allègre is known to be a bully, and got famous as a minister for calling the French educational system a “mammoth” that needed to lose some fat. Needless to say, this phraseology and a legendary lack of tact (his temper literally got him defenestrated at a political rally in 1968, which old-timer French professors always liked to joke about) did little to garner support for his policies, no matter how necessary they might have been. Still, I’m not one to cast the first stone when it comes to dealing untactfully with opponents, so why should I even mention this?

The problem is that Allègre and his long-time colleague Vincent Courtillot have used their clout at the Académie (of which they are bona fide members) to organize some fake debates on the issue, where they failed to invite people who know anything about climate, or censored their responses. So the people who do know about climate understandably felt left out, and would like a seat in the debate. This would be just a petty academic dispute, were it not for the fact that the French media seem very hungry for Allègre’s presence. This is apparently as much because of his aggressive communication style (which always makes for a heated debate, and therefore a healthy amount of prime-time drama during the news hour) as for his current position in this fake debate. I say current because, apparently, he wrote 20 years ago in a book (Clés pour la géologie): “By burning fossil fuels, man increased the concentration of carbon dioxide in the atmosphere which, for example, has raised the global mean temperature by half a degree in the last century”. Now the tone has changed drastically: we hear the familiar refrain that warming is barely discernible or, when it is (for example, in the melting of the snowcap of Mt Kilimanjaro), that it is simply due to “natural causes”. A strange feeling of déjà vu?

As usual with climate skepticism, we have to go beyond personal motivations and analyze the arguments in their own right. This was done masterfully by the inimitable Ray Pierrehumbert in a 2-part blog post on RealClimate, which ranks among my favorite posts of all time. Part 1 is here, Part 2 there. You would think that the Flat Earth Knights (that’s what they are now called in the climate community, referring to their omission of the Earth’s rotundity in elementary radiation budget calculations) would have stopped embarrassing themselves after top-notch climate scientist Edouard Bard patiently debunked all of their arguments. Alas, far from putting an end to it, that episode seems only to have ushered in an era of growing media attention for Allègre and Courtillot, who tell the skeptics just what they want to hear under a varnish of scientific credibility that few care to question. In Courtillot’s case, scientific misconduct is beyond doubt. Allègre seems to be more subtle, but the fact that he is using his prominence and weight in the media to hijack the debate is troubling. The reason why I blur the line between the two is that they are clearly tag-teaming, with Allègre handling the book (non peer-reviewed) and television PR campaign, while Courtillot is very active in the peer-reviewed literature, with the success that we know. To his credit, Courtillot recently made the news for his association with two publications in the Journal of Atmospheric and Solar-Terrestrial Physics, which aimed at establishing a statistically significant correlation between solar activity and the recent warming. The papers are here and here. A well-argued critique of their methodology, led by statistical climatologist Pascal Yiou, can be found here.

While the peer-reviewed literature is indeed a venue of choice for scientists to debate arguments, it may not be the most transparent to the general public. As a scientist and signatory of the aforementioned online petition, I look forward to a free, open and impartially moderated debate at the Académie des Sciences, where I trust that the considerable knowledge, integrity and intelligence of some of the most noted French climate scientists will give real scientific arguments a fair chance of being heard. Then, let the people decide what to believe, but at least on the basis of sound arguments.

If, as seems unavoidable, our mammoth ends up losing a few tusks in the battle, I hope he will (this time) respect scientific ethics when intervening in a scientific debate where (so far) ignorance and arrogance are his only medals.

J.E.G.

PS: The quixotic claims from Allègre’s latest book are debunked here (in French), and honestly they are so pathetic that I won’t waste my Saturday afternoon on an English translation!





A 5-step algorithm to scientific discovery

18 03 2010

Today I stumbled upon these pearls of wisdom by Paydarfar & Schwartz, thanks to Thierry Huck. Thought I’d share. It is a five-step process, one might say an algorithm, for scientific discovery.

1. Slow down to explore. Discovery is facilitated by an unhurried attitude. We favor a relaxed yet attentive and prepared state of mind that is free of the checklists, deadlines, and other exigencies of the workday schedule. Resist the temptation to settle for quick closure and instead actively search for deviations, inconsistencies, and peculiarities that don’t quite fit. Often hidden among these anomalies are the clues that might challenge prevailing thinking and conventional explanations.

2. Read, but not too much. It is important to master what others have already written. Published works are the forum for scientific discourse and embody the accumulated experience of the research community. But the influence of experts can be powerful and might quash a nascent idea before it can take root. Fledgling ideas need nurturing until their viability can be tested without bias. So think again before abandoning an investigation merely because someone else says it can’t be done or is unimportant.

3. Pursue quality for its own sake. Time spent refining methods and design is almost always rewarded. Rigorous attention to such details helps to avert the premature rejection or acceptance of hypotheses. Sometimes, in the process of perfecting one’s approach, unexpected discoveries can be made. An example of this is the background radiation attributed to the Big Bang, which was identified by Penzias and Wilson while they were pursuing the source of a noisy signal from a radio telescope. Meticulous testing is a key to generating the kind of reliable information that can lead to new breakthroughs.

4. Look at the raw data. There is no substitute for viewing the data at first hand. Take a seat at the bedside and interview the patient yourself; watch the oscilloscope trace; inspect the gel while still wet. Of course, there is no question that further processing of data is essential for their management, analysis, and presentation. The problem is that most of us don’t really understand how automated packaging tools work. Looking at the raw data provides a check against the automated averaging of unusual, subtle, or contradictory phenomena.

5. Cultivate smart friends. Sharing with a buddy can sharpen critical thinking and spark new insights. Finding the right colleague is in itself a process of discovery and requires some luck. Sheer intelligence is not enough; seek a pal whose attributes are also complementary to your own, and you may be rewarded with a new perspective on your work. Being this kind of friend to another is the secret to winning this kind of friendship in return.

All in all, my own process does follow those rules, although I’m still waiting on the major discovery. Sometimes I lament spending too much time on step 1, only to be reminded that all I was doing was step 3. I have been working on some methodological improvements to climate reconstruction techniques for a while, and they are finally panning out… Expect some fireworks soon. Step 2 I definitely follow in bursts – feast or famine. Step 4 I learned during my postdoc, which is when I actually started looking at data and not just model output. Step 5 is an ongoing process. Going for a drink tonight…





Climate Science in Context

9 12 2009

As world leaders and NGOs gather this week in Copenhagen for what many of us hope (with increasingly cautious optimism) to be a turning point in climate negotiations, an excellent recapitulation of some key milestones was published in the New York Times today. It retraces the physical origins of greenhouse effect theories and experimental demonstrations, and the long and inexorable rise of global warming to the top of the international political agenda.

The recent ClimateGate affair is featured there, and put properly into context. One can’t help but marvel at the exquisite timing of this questionably legitimate release of private information – and at the underlying motivations behind it – but in truth I doubt this will affect the negotiations much. Few people were optimistic about their outcome before the ‘scandal’, and for reasons having very little to do with the denial of the unequivocal nature of global warming, which is, as you well know, due to human activities with very high probability (>90%).

My own personal opinion is that April showers bring May flowers, and that once the dust settles, the ClimateGate uproar may have the positive outcome of forcing more transparency in our community, which will be excellent for both scientists and the public. In the age of Web 2.0, open-source code sharing, crowd computing, and decentralized information traveling at the speed of light, hoarding data like it’s the Dark Ages simply doesn’t make any sense. I think the hackneyed argument that releasing the raw data would lead to more bogus studies from ‘independent scientists’ (i.e. fossil fuel-funded think tanks) is moot: the bogus machine is already alive and well (cf. Soon and Baliunas, Allègre, Courtillot and the French refusards, Vice-Count Monckton, the Cato Institute, etc.), and the obfuscation of data and code only gives fodder to the blood-thirsty crowd of armchair skeptics, who are always looking for our slightest mistakes so they can run to the nearest hill and shout that “Global Warming is a hoax”, or some other Inhofery of that ilk. The answer to this is not more secrecy, but better work ethics and complete transparency.
An excellent discussion of the issues raised by the hack can be found here.
In the meantime, I’ll be watching closely what happens in Copenhagen, and I’ve been staying very calm while talking to climate skeptics.

Speaking of putting things in context, I’m quite fond of this video, even if I think the Beavis & Butthead running theme is needlessly inflammatory.

But it’s a pretty different impression that one gets from watching Fox News, innit?





Wake up

16 07 2009

A clear, arresting and entertaining exposition of why we can’t afford to sit on our hands. Very nicely done and, in my professional opinion, very accurate. Some skeptics like Lindzen will scream that we’ve forgotten many negative feedbacks while exaggerating the positive ones. While they’re scrambling to find more imaginative negative feedbacks, we’ll have a few decades to watch videos, starting with this little piece of art:

Wake Up, Freak Out – then Get a Grip from Leo Murray on Vimeo.

While the exact position of the tipping point mentioned isn’t known, it’s becoming increasingly clear that we will hit it sometime in the near future, and once we do it will be very hard – if not impossible – to come back to livable conditions for a very long time. My personal advice, however, is to take the “Freak Out” part out of the equation, because fear has never helped anyone – but awareness does.

While I’m at it, here’s another video that I think every skeptic should watch:

I am curious to hear insightful counter-arguments.

Wake up !