
Shallow evaluations of longtermist organizations

Nuño Sempere

Effective Altruism Forum, June 24, 2021

Abstract

This document reviews a number of organizations in the longtermist ecosystem, and poses and answers a number of questions that would have to be resolved to arrive at a numerical estimate of their impact. My aim was to see how useful a “quantified evaluation” format would be in the longtermist domain. In the end, I did not arrive at GiveWell-style numerical estimates of the impact of each organization, which could be used to compare and rank them. To do this, one would have to resolve and quantify the remaining uncertainties for each organization, and then convert each organization’s impact to a common unit. In the absence of fully quantified evaluations, messier kinds of reasoning have to be used, and are being used, to prioritize among those organizations and among other opportunities in the longtermist space. But the hope is that reasoning and reflection built on top of quantified predictions might prove more reliable than reasoning and reflection alone. In practice, the evaluations below are at a fairly early stage, and I would caution against taking them too seriously or using them in real-world decisions as they are. By my own estimation, of two similar past posts, “2018-2019 Long Term Future Fund Grantees: How did they do?” had 2 significant mistakes, as well as half a dozen minor mistakes, out of 24 grants, whereas “Relative Impact of the First 10 EA Forum Prize Winners” had significant errors in at least 3 of the 10 posts it evaluated. To make the scope of this post more manageable, I mostly did not evaluate organizations included in Larks’ yearly “AI Alignment Literature Review and Charity Comparison” posts, nor meta-organizations.
