We don't have good estimates for how much Social Security will cost

In the US Treasury's annual "Financial Report of the U.S. Government" released in January 2013, the Government Accountability Office reports that the numbers for Social Security are so problematic that it couldn't audit them well enough to even render an opinion, saying: "In addition, GAO issued disclaimers of opinion on the 2012, 2011, and 2010 Statements of Social Insurance (SOSI)". The GAO doesn't attempt to audit projections of future finances, which are even more problematic and may vastly understate worst case costs.

The New York Times recently printed an op-ed by two professors on the topic, "Social Security: It's worse than you think", based on concerns over demographic projections. It states: "What’s not is that the Social Security Administration (SSA) underestimates how long Americans will live and how much the trust funds will need to pay out – to the tune of $800 billion by 2031". The piece was based on results published in a journal article. Their concerns were based on costs that by 2031 were less than 4% higher than SSA estimates. The SSA figures may turn out to be off by far more than that even without taking their concerns into account. The results have been critiqued; this site hasn't researched those demographic projections and takes no position on them yet. It addresses several other major concerns with the SSA's cost projections.

However it is important to note the SSA has a history of not just being wrong on its expected demographic estimates (which is understandable for a long term forecast), but more importantly of being overoptimistic about how much variation to consider for best and worst case scenarios. The 1950 annual SSA report projected US population in 2000 to be worst case 199 million, best case 173 million. In reality it turned out to be 282 million, 42% above their highest projection. Imagine if their real costs turn out to be 42% higher than their current projections.
The 1975 report projected US population in 2050 to reach 308 million (no alternate estimates given). In reality we already reached 309 million by 2010. Their forecast for 2010 was only 276 million. The 1980 report did raise the estimate for 2010 to 290 million and added a highest case estimate, but at 303 million that still turned out to be too low.

The SSA's annual report provides figures for expected future "intermediate costs" and for scenarios they label "high cost" and "low cost". People should not be misled into accepting government forecasts simply because they give a superficial impression of accuracy by providing lots of numbers. The SSA claims to be able to forecast things like yearly GDP growth through 2090 more accurately than 2 or 5 year forecasts from the Congressional Budget Office, the Administration, the Federal Reserve and blue chip private consensus estimates have proven to be over the past few decades. Their estimates indicate they think they can forecast future GDP even more accurately than GDP is measured.

By analogy, imagine someone were to document the most sophisticated weather forecast system existing today. They might provide lots of numbers, but if it gave a forecast for the weather 1 year from today and claimed it would be off by at most 1 degree, we would be wise to be rather skeptical without evidence their methods were realistic. The US needs to plan for the possibility the Social Security system could wind up in far worse shape than it admits, since the estimates should be considered far less certain than they appear. It is important to note that Medicare forecasts are based on some of the same data the Social Security forecasts use, though this page doesn't examine them.

There are some obvious problems with their expected projections. On top of that they have overconfidence in those flawed projections.
Occasionally the details below may be a bit technical for some, but they can be skimmed to get a sense of how many questionable aspects there are to the SSA forecast.

SSA's high cost scenario turns out to be low cost if you adjust for inflation

The US Treasury annual report gives a clue to a huge problem with SSA forecasts when it notes an oddity in them: "assumptions under the high-cost alternative generally worsen the financial outlook. One exception occurs with the CPI assumption". (CPI = Consumer Price Index.) Most forecasts of future government spending and the economy primarily use "real" (inflation adjusted) dollars. This site planned to update a government-wide forecast to use "highest real cost" figures from the SSA. It turned out none exist. Their claimed "high cost" scenario turns out to have a lower real $ cost in the future than the "low cost" scenario. The SSA appears to focus on nominal (non-inflation adjusted) dollars to obscure that obvious major problem with their forecast.

The Social Security annual report gives cost figures for every year through 2090 in nominal $ and as a % of GDP. In both cases the high cost scenario appears on the surface to be "high cost". The nominal $ cost for OASDI (Old Age, Survivors and Disability Insurance), i.e. Social Security, is:

What matters to the public, however, are the inflation adjusted costs. The report gives only one table that provides real $ costs rather than nominal or % GDP figures. It only goes through 2026, but by then the figures already show the "high cost" scenario has in reality become the low cost scenario when adjusted for inflation. They likely found an excuse to stop at 2026, even though their other data runs through 2090, to avoid the problem standing out too obviously.
Their % GDP figures and real GDP growth figures run through 2090 and can be used to derive the real costs for the whole time period, showing the "high cost" scenario turning into the low real $ cost scenario:

That is effectively adjusted for inflation using their estimates of the GDP deflator measure of inflation. If their projections for the CPI measure of inflation were used (as they are in their table of real costs through 2026), the "low cost" scenario would be even more costly compared to the "high cost" scenario. The low cost scenario has a higher population, yet even the cost per capita is higher for the "low cost" scenario:

You might wonder if assumptions are chosen to make it the "highest shortfall cost". That isn't the case. They explain that (1) the effect on taxable payroll due to a greater rate of increase in average wages occurs immediately; and (2) the effect on benefits due to a larger COLA occurs with a lag of about 1 year. As a result of these effects, the higher taxable payrolls have a stronger effect than the higher benefits, which results in lower cost rates. So higher inflation leads to higher revenue compared to costs, and leads to the "trust fund" lasting longer. Despite that, they assign the high cost scenario the high inflation rate.

It is possible they were using a rather useless conception of "high cost" as being "high nominal cost (in at least some ways)" rather than "highest real cost" or "highest real trust fund shortfall" or "highest real cost per capita" or "shortest time till the trust fund runs out". The flawed choice calls into question whether other assumptions may be inappropriately chosen as well. It is questionable whether the "high cost" scenario is truly even a credible "highest nominal cost" scenario or merely "high cost in some ways".

For the sake of argument, assume for a moment their intermediate estimates might have some basis in reality (even though this site already points out that the expected intermediate GDP forecasts are too optimistic, even ignoring the possibility that higher national debt may slow economic growth). Let's look at some of the data forecasts that go into their projections and consider how well they do at deciding how accurate their numbers might be, and how well they select low and high cost alternatives.

SSA nominal GDP forecast doesn't include a realistic "high cost" case

The SSA gives yearly figures through 2090 only for nominal GDP and not real GDP (however their implicit real GDP figures can be derived from their projections of the GDP deflator or real GDP growth). That may be so no one notices the oddity that the "high cost" scenario has the highest nominal GDP and the lowest real GDP. Since they seem to be focused on using nominal values, it is useful to look at their forecast for nominal GDP:

In 2090 the low cost scenario GDP forecast is only 13.5% below the intermediate cost one, and the high cost GDP is 13.5% above the intermediate cost scenario.
It is unrealistic to assume their forecast is accurate enough to choose that little difference for a "high cost" scenario. The annualized growth rate works out to be 4.4% for the low cost scenario, 4.6% for the intermediate scenario and 4.8% for high cost, i.e. only a 0.2% difference between the high cost and the intermediate cost scenarios. That isn't remotely realistic given the ranges for their other estimates, the accuracy of other GDP forecasts, and the accuracy with which GDP is measured.

The CBO published a study in January 2013 of the accuracy of the last few decades of 2 and 5 year GDP forecasts from the CBO, the Administration and blue chip private consensus estimates. The difference between the forecast and actual values for 5 year forecasts of nominal GDP had mean absolute yearly differences of CBO: 0.9%, Administration: 1.0% and blue chip consensus: 0.9%. The forecasts were usually overoptimistic: if you look at the mean yearly difference (where underestimates in some years cancel out overestimates in others), the values are CBO: 0.5%, Administration: 0.6%, and blue chip consensus: 0.6%. Those are merely average errors; the worst cases are further off.

Even if we are optimistic and just use the average 5 year yearly error and assume the SSA might only be off by 0.5% per year (rather than their claimed 0.2%) in either direction through 2090, the graph looks like this:

Rather than differing by 13.5% in 2090, now the low value is 31% lower than the medium value, and the high value 45% higher. If we instead varied the growth rate by 1% per year (like the Administration's absolute error), the low value would be 53% lower and the high value 110% higher.

Their uncertainty estimates aren't internally consistent. Pretend their intermediate estimate accurately predicted future GDP per capita. If the population were larger or smaller than what the intermediate scenario forecasts, the resulting total GDP would be larger or smaller by the percentage difference in population.
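The spreads quoted above for yearly growth rate differences of 0.5% and 1% can be checked with a few lines of compounding arithmetic. The sketch below is only an illustration: it assumes a 78 year horizon (2012 through 2090) and the rounded 4.6% intermediate growth rate mentioned above, and the function name is made up for this example.

```python
def relative_gap(base_rate, shifted_rate, years):
    """Fractional gap between two compounded growth paths after `years`."""
    return ((1 + shifted_rate) / (1 + base_rate)) ** years - 1


YEARS = 2090 - 2012   # assumed horizon: 78 years from the 2012 report
BASE = 0.046          # rounded intermediate annualized nominal GDP growth

for delta in (0.005, 0.010):
    low = relative_gap(BASE, BASE - delta, YEARS)
    high = relative_gap(BASE, BASE + delta, YEARS)
    print(f"+/-{delta:.1%} per year: low {low:+.0%}, high {high:+.0%}")
```

Because the published growth rates are rounded to one decimal place, running the same function with a 0.2% difference will not reproduce the 13.5% figure exactly.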
Their population estimates for 2090 are 19.6% lower in one scenario and 23.2% higher in another (though as noted earlier, in the past their extreme population forecast was off by 42% over a shorter forecast period). That by itself gives a wider range than their actual low and high GDP estimates.

Pretend their expected intermediate forecast for total real GDP will match reality, but the inflation rate might match one of the other scenarios. The graph below shows the nominal GDP resulting from using the intermediate real GDP combined with the 3 different inflation estimates, compared to their forecast high and low values:

The lowest value is now 49% below the intermediate, and the highest 97% above. The scenario with the highest nominal GDP forecast has the lowest inflation rate estimate. If we instead combine the lowest real GDP forecast with the lowest inflation rate, and the highest real GDP forecast with the highest inflation rate, there is a far greater difference in nominal GDP. Below, the lowest value is now 71% less than the intermediate, and the highest 237% more.

Nominal GDP growth has been declining overall for the last 30 years. One way to examine trends is to smooth over minor yearly fluctuations by using moving averages. The graph below shows the moving average of GDP growth for the last 30 years, with horizontal lines showing the annualized growth rates projected for the future in the SSA forecast.

[Figure: Moving Average GDP growth vs. SSA Forecast]

It is possible that the trends could reverse. However, given the steadily falling trends, their choice of growth rate seems questionable, as does having such a narrow range of predictions for the alternative scenarios.

Another problem is that even if they were sure they could forecast growth that accurately given precise historic data, existing data isn't that precise.
The BEA estimates the size of the economy using two techniques: Gross Domestic Product (GDP) and Gross Domestic Income (GDI), which in theory should be equal. In reality the data never precisely balances, which means the error in estimating them must be at least as large as the difference between them. The SSA may base their projections on data other than GDP, but if other data could be combined to come up with a more accurate GDP estimate then some organization like the BEA would be using that approach as well.

A journal paper written by the BEA a few years ago noted a recent difference of 0.2% merely in the measured growth rate trends between the two (5.3% vs. 5.5%): "For example, over the last decade, the average difference in the quarterly levels between GDP and gross domestic income was – 0.2 percentage point, and the average difference in the quarterly growth rates was also – 0.2 percentage point. [...] The mean absolute difference (difference without regard to sign) in the quarterly level was 0.5 percentage point, and the mean absolute difference in the quarterly growth rates was 1.4 percentage points. [..] Trend growth over this period was 5.3 percent for nominal GDP versus 5.5 percent for nominal gross domestic income."

Predictions based on history need to include older data to spot trends. The Center for Economic and Policy Research reported: "The average difference between GDP and GDI, known as the 'statistical discrepancy,' over the years 1947 through the first quarter of 2011 is 0.5 percent of GDP.", and a Federal Reserve working paper from a few years ago noted: "As shown in the upper panel of chart 1, the absolute value of the statistical discrepancy as a fraction of the average of nominal GDP and nominal GDI peaked at 2.1 percent in 1993.
From 1977 to 2001, the fraction averaged 0.8 percent with a standard deviation of 0.9 percent[...]"

A long range forecasting method isn't going to produce exact predictions even with precise data, so the potential error should be assumed to be far larger than the historic differences in measured trends. No one knows what the real measurement error is in the estimates, so the actual size of the economy could be above or below both the GDP and GDI estimates. This graph shows the difference in yearly growth rate between GDP and GDI, which topped 1.5% in a recent year:

Again, another page on this site delves further into the possibility that the intermediate expected GDP projections are too high by themselves.

Real GDP estimates are also questionable.

The average yearly real growth rates for the high and low cost scenarios differ from the intermediate scenario by 0.7%. The Federal Reserve working paper mentioned earlier noted that the difference between the trends in GDI and GDP just from measurement differences was almost that much: "between 1994 and 2000, real GDI grew on average ½ percentage point (annual rate) faster than real GDP".

This time let's look at the Federal Reserve's uncertainty for real GDP estimates. They explain that they choose an error range so that there is a 70% likelihood the real value will fall within the range they project. Just three years out they only estimate that yearly growth will be within plus or minus 1.7% of their estimate, over twice what the SSA is claiming for the yearly growth difference for estimates through 2090.

Much of the slower nominal GDP growth is due to low inflation rates. Real GDP growth has been falling as well, however. Here are moving averages of real GDP growth:

Long term moving averages smooth the trend in real GDP growth a little bit better:

The trends may not continue, or not at a linear rate.
However a worst case forecast should consider the possibility that growth will continue to slow (especially given the potential that rising national debt levels will slow the economy). To provide perspective, the graph below shows a linear projection of yearly real GDP growth trends into the future, compared to SSA's forecasts of yearly GDP growth:

Note: That isn't meant to be a useful forecast approach, since the trend in growth rates is unlikely to be linear; it merely illustrates their apparent attempt to ignore a downward trend.

Regardless of whether the slower growth rate trend will continue in the future, it has existed for the last few decades. The SSA seems to have been ignoring that when producing its forecasts. Long range forecasts from prior years' SSA annual reports show it has been steadily raising its projections for GDP growth over time, rather than reducing them as the growth rate has continued to fall. The graphs below show the projected annual real GDP growth from the SSA annual reports of 1995, 2000, 2005, and 2012 (in 1990 and earlier the SSA was using GNP growth rates rather than GDP). The forecast growth has been increasing for each of their scenarios:

Interest Rates

The Social Security forecast varies depending on the real interest rate it gets from its "trust fund". The higher the interest rate, the longer the trust fund lasts. The "high cost" scenario has the lowest real interest rate, but the highest nominal interest rate. It would seem more appropriate for the high cost scenario to have the lowest nominal interest rate, leading to an even lower real interest rate assumption. SSA forecasts assume the rates will fluctuate a bit over the next few years and then be stable. The SSA site provides a history of its nominal interest rates.
The graph below shows the last few decades of nominal interest rates compared to lines showing the long term rates their forecasts use.

Their real interest rate forecast shows the same pattern of optimism that the trend won't continue:

It is possible the interest rate trend will reverse (in fact it seems likely it will if the level of national debt keeps rising), but there is no guarantee, so they should consider the worst case in which rates continue to fall. A higher interest rate is good for Social Security, but bad for a government-wide forecast since it increases interest expense on the publicly held national debt. The SSA should provide a scenario which is worst case from a Social Security program perspective, but also estimates using assumptions that are worst case from the public's standpoint. They need to consider a scenario where the interest rate rises much higher than their forecasts, in part merely because it has historically been higher and also because a high national debt level (or a higher inflation rate) may lead to a higher interest rate.

They exhibit problems dealing with interest rates and inflation which call into question whether anyone with much financial knowledge ever checked their forecasts. A document which describes their economic assumptions says: "The real interest rate (real effective annual yield) on the special public debt obligations issuable to the trust funds for a given year is defined as the nominal effective annual yield less the increase in the CPI-W for the first year after issue.". The problem is that this isn't accurate. Many people don't realize the equation "RealRate=NominalRate-InflationRate" (which the data they provide confirms is the equation they use) is only an approximation. Some "math-light" sources never mention the real equation because for most individual purposes and short range forecasts the difference is negligible. This is however a long range forecast supposedly done by professionals.
As a private investment site explains: "(1+NominalRate)=(1+InflationRate)*(1+RealRate)". Even Wikipedia explains (or a Federal Reserve paper if you prefer) that simple algebra on that equation shows the real interest rate is calculated using "RealRate=((1+NominalRate)/(1+InflationRate)) - 1". Over a long range forecast of 78 years the difference between the precise formula and the approximate one can add up to e.g. a 7-8% difference in resulting interest using some typical figures in the SSA forecast. Obviously the trust fund won't last that long, but it's still something they should have gotten right. It is trivial to take a few more seconds to put the exact formula into a spreadsheet or computer program. Instead, e.g. in the long run for the high cost scenario they show a nominal interest rate of 6.2% minus an inflation rate of 3.8% leading to a real interest rate of 2.4%, rather than the actual 2.31% figure.

Their lack of attention to details like that is another reason to question the presumption that their forecast for nominal GDP is accurate enough to only have a 13.5% difference for the worst case scenario in 2090. It's true the uncertainty in the estimates the math is applied to (and the choice of inflation measure) will vastly overshadow that sort of difference. That is not an excuse for failing to take a few seconds more to apply the right math to the estimates, to avoid making forecasts even less accurate. Their approach is like saying "2+2=5 is close enough since we guessed at the 2 figures". The concern is that the repetition for many years of simple, easy to find errors like this raises a red flag regarding the quality of the rest of the forecast.

SSA makes a similar error when they introduce this concept: "The annual real wage differential is defined as the annual percentage change in the average OASDI covered wage minus the annual percentage change in the CPI". A more appropriate figure would be "RealWageChange = ((1+NominalRate)/(1+InflationRate)) - 1".
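The gap between the exact formula and the subtraction shortcut is easy to see in code. This is a minimal sketch using the high cost figures quoted above (6.2% nominal, 3.8% inflation); the function names and the 78 year compounding horizon are assumptions chosen for illustration, not anything taken from the SSA's own calculations.

```python
def real_rate_exact(nominal, inflation):
    """Exact real rate from (1 + nominal) = (1 + inflation) * (1 + real)."""
    return (1 + nominal) / (1 + inflation) - 1


def real_rate_approx(nominal, inflation):
    """Subtraction shortcut, which slightly overstates the real rate."""
    return nominal - inflation


nominal, inflation, years = 0.062, 0.038, 78

exact = real_rate_exact(nominal, inflation)
approx = real_rate_approx(nominal, inflation)

# How much larger a balance grows over the horizon if it is compounded
# at the approximate real rate instead of the exact one.
overstatement = ((1 + approx) / (1 + exact)) ** years - 1

print(f"exact real rate:  {exact:.2%}")
print(f"approx real rate: {approx:.2%}")
print(f"compounded overstatement after {years} years: {overstatement:.1%}")
```

The same two functions apply unchanged to the real wage differential, with nominal wage growth in place of the nominal interest rate.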
It isn't always straightforward how to combine two differing measures of inflation in the same forecast (and there are other possible choices like the PCE to consider). Estimates for some figures in the Social Security forecast appropriately use the Consumer Price Index, since laws link some of their expenditures to it. The GDP deflator works appropriately to convert between their nominal and real GDP figures (derived using their real growth rates). It is not clear without more information and research whether they mixed the two appropriately everywhere when creating their forecasts. For example, the World Bank provides data on international real interest rates where: "Real interest rate is the lending interest rate adjusted for inflation as measured by the GDP deflator.". The SSA instead uses the CPI to calculate its "real interest rates". Their CPI estimates sometimes differ by 0.5% from their GDP price index.

The Labor Force Participation forecasts are questionable.
One important factor influencing future Social Security finances is the percentage of the population that will be in the workforce to support retirees, i.e. the labor force participation rate. The SSA uses separate forecasts for men and women since they appear to follow different trends. The male rate will be considered first. The federal Bureau of Labor Statistics (BLS) shows the labor force participation rate for men has been steadily decreasing for as long as it has been collecting data:

The problem is the SSA doesn't seem to have accepted the trend; their past estimates have been too optimistic. Over time their estimates have gotten lower to avoid looking too out of touch with reality, but they questionably now project an *increase* in the labor force participation rate despite the trend. The Bureau of Labor Statistics in contrast did their last long range projections in 2006 and estimated that by 2020 the male participation rate would decline to 70% and by 2050 to 66%. The rate fell faster than they estimated, and in early 2012 they did projections through 2020 which lowered the estimate for 2020 to 68.2%. In contrast the SSA claims "73.0, 72.7, and 72.5 percent for the low-cost, intermediate, and high-cost assumptions, respectively." for the long range labor force participation rates. Note: those are "age-adjusted" rates (described below) which should be only slightly different from the BLS non-adjusted figures.

For a while they appear to have noticed the downward trend, but then seem to have become overoptimistic again. The next graph shows the low and high long term predictions given in past annual reports vs. the BLS labor force participation rate in the year the forecast was made:

Earlier it was pointed out the SSA seems to rely on using nominal figures to try to obscure real costs. They play a similar game with participation rates by not using the usual "labor force participation rate" the BLS reports.
Their figures are for the "age adjusted labor force participation rate". The "age adjusted" rate adds up the participation rates for various age groups, weighted by the percentage of people in each age bracket in a given base year. For that base year the "age adjusted labor force participation rate" is the same as the regular "labor force participation rate". The rate then pretends the age distribution of the population in, say, 2090 is the same as the age distribution in the base year, and gives what the labor force participation rate would be if that were the case. By itself this isn't useful for knowing what fraction of the public is in the workforce, since obviously the age distribution will be different by then.

The BLS doesn't seem to provide historic age adjusted labor force participation figures on its site directly, but it has data on the workforce size for various age brackets that can be used to approximate them. The graph below uses 2010 as the base year (oddly, the 2012 report says it uses 2010 as the base year rather than the year of the report; likely the numbers looked better for them):

The SSA report doesn't seem to specify the age brackets they use to calculate the age adjusted rate (one SSA document seems to indicate they use essentially the same brackets, another that they may use individual years for those 55 and over), but the graph is likely to be a close approximation and shows the trend isn't much different. It makes sense to be concerned about how the labor force participation rate varies by age when generating projections, but that particular number doesn't make sense to provide as a summary in an annual report.

You can't do an exact direct comparison of the numbers in the current SSA report to those in an old report, since to be accurate you need to age adjust the current value to the base year used in the old report. Usually the year of the report seems to be the base year of the forecast (though they don't always make this explicit).
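To make the age adjustment concrete, here is a small sketch of the calculation as described above. All numbers are hypothetical and chosen only for illustration (they are not BLS or SSA data), and the three brackets are an assumption, since the SSA's actual brackets aren't specified.

```python
def participation_rate(rates, shares):
    """Weighted average of per-bracket participation rates.

    rates:  participation rate for each age bracket
    shares: fraction of the population in each age bracket
    """
    total = sum(shares.values())
    return sum(rates[b] * shares[b] for b in rates) / total


# Hypothetical population shares by age bracket (illustration only).
base_year_shares = {"16-24": 0.16, "25-54": 0.52, "55+": 0.32}
future_shares = {"16-24": 0.14, "25-54": 0.46, "55+": 0.40}

# Hypothetical per-bracket participation rates in the future year.
future_rates = {"16-24": 0.55, "25-54": 0.88, "55+": 0.45}

# Age adjusted: future rates weighted by the base year's age distribution.
age_adjusted = participation_rate(future_rates, base_year_shares)

# Unadjusted: future rates weighted by the future age distribution.
unadjusted = participation_rate(future_rates, future_shares)

print(f"age adjusted rate: {age_adjusted:.1%}")
print(f"unadjusted rate:   {unadjusted:.1%}")
```

With an older future population, the age adjusted figure in this made-up example comes out higher than the share of people who would actually be in the workforce, which is the reason the adjusted number is a poor summary on its own.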
The graph below adds in the 1st quarter 2012 labor force participation rate (the quarter before the last SSA report was released), age adjusted to the year of each forecast for comparison.

The graph below compares the last part of the age adjusted labor force participation trend with the 2012 forecast for the "long term" rates they expect to be reached within the next decade:

They seem not to notice that some aspects of their forecast may lead to a lower participation rate, e.g. if their overoptimistic estimates of future salaries become a reality then more people may choose to work fewer years.

The documentation of their methods seems to indicate "data snooping", i.e. cherry picking. For example they use trends for some of the labor force data from 1994-2008, a rather arbitrary choice of years for a 2012 report, presumably since it is flatter. They admit they may not rely on actual trends in the data and that for many different aspects of their forecasts they use fudge factors to get figures they prefer: "Addfactors are adjustments that move an estimate closer to an expected value." Data snooping is sometimes appropriate to pick out best and worst case trends, or arguably to ensure that major shifts in trend are accounted for in expected values. It is not appropriate to use it for intermediate values to make them look better. The track record of their past projections calls their approach into question.

One concern is whether they look at different age groups too separately, without considering strongly enough the potential interaction between the trends. The participation rate in one age group may impact the participation rates of others. For some types of forecasts an aggregate estimate may be more accurate than combining individual estimates.
If you do look at trends for each age group completely separately, rather than looking at the overall trend, it appears possible to rationalize the participation rate not going quite as low as the apparent overall rate trend. However their forecast of an increase seems questionable. The graph below compares their long term participation projections with the trends. The "Trend on rate" uses a linear projection of the age adjusted labor force participation rate. The "trend on all data" was derived using separate linear trends for each age bracket's population and workforce, and it uses the whole time period of data available rather than cherry picking a few years. It only uses linear trends for a quick estimate to illustrate the point that the trend is still downwards:

Note: that isn't meant to be a realistic forecast; it merely continues to call into question whether they are ignoring apparent downward trends, as their past history seems to indicate. The real trends may be more complicated.

The labor force participation rate forecast for women is also overoptimistic and doesn't seem to have accepted the possibility that the rate may have peaked and be dropping. The BLS long range projections through 2050 expect the female participation rate to reach 59.4% in 2020 and drop to 55.1% by 2050. Their early 2012 update for data through 2020 dropped the 2020 expected rate to 57.1%. Here is the SSA forecast of "age adjusted labor force participation rates" for women compared to the historic age adjusted participation rates (approximated using the 2010 base year the SSA used):

To look closer at the last twenty years or so:

Oddly, despite going to the trouble of breaking out the labor force participation rate by age bracket, their docs don't seem to indicate any attempt to use a similar approach to project average wages. Average wages will vary with age, so it would seem consistent for them to have considered age brackets to project wages as well.
It may be that the data wasn't available to do this, or perhaps the results make Social Security prospects look worse so they avoid that approach.

Projections of nominal wage growth continue the pattern of being overoptimistic.

The graph below compares the SSA's historic figures for average nominal wage growth to their long term projections:

One problem with many forecasts is that they consider trends in yearly percentage changes. Sometimes looking at the values themselves indicates they are likely following a different trend, e.g. a linear or quadratic trend. If you compare weekly earnings to different types of simple trends, they seem to match a quadratic trend fairly well.

Usually science tests theories by performing experiments. We don't have a time machine to test current projections for the future, but there is a way to apply that concept. One way to assess a forecast method is to see how well it matches historic data. For example, you can pretend you were in 1981 and use only data available up through that year to forecast a value for 2011, and see how well your method matched reality. Better yet, you can do a forecast for 2011 average wages as if it were done in every past year from 1981 onward and see how well your predictions would have matched reality.

This graph shows forecasts done for 2011 average wages using linear, quadratic, and exponential trends using data available only up to the given year, with a line for the actual 2011 average earnings, i.e. the values for 1992 would be the projections you would have made then for 2011 using different trends. The quadratic trend seems to match fairly well.

This next graph compares a quadratic trend for average nominal wages in the future with average wages that follow the growth estimates from the SSA, and throws in the CBO's estimates to show they aren't much different from SSA's medium scenario. They are all higher than the quadratic trend:

The difference is likely due to their inflation expectations.
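The "pretend you were in 1981" test described above can be sketched in a few lines. This is only an illustration under stated assumptions: the series is synthetic (deliberately an exact quadratic, not real wage data), and `polyfit`, `predict` and `backtest` are names made up for this example. It uses plain ordinary least squares, as the trends on this page do.

```python
import math


def polyfit(xs, ys, degree):
    """Ordinary least squares polynomial fit via the normal equations.

    Returns coefficients c where y ~ c[0] + c[1]*x + ... + c[degree]*x**degree.
    """
    n = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def backtest(years, values, target_year, first_cutoff):
    """Forecast target_year from each cutoff year, using only earlier data."""
    results = {}
    t = target_year - years[0]  # x values are years since the first data year
    for cutoff in range(first_cutoff, target_year):
        xs = [y - years[0] for y in years if y <= cutoff]
        ys = values[:len(xs)]
        log_fit = polyfit(xs, [math.log(v) for v in ys], 1)
        results[cutoff] = {
            "linear": predict(polyfit(xs, ys, 1), t),
            "quadratic": predict(polyfit(xs, ys, 2), t),
            "exponential": math.exp(predict(log_fit, t)),
        }
    return results


# Synthetic, exactly quadratic series (illustration only, not wage data).
years = list(range(1965, 2012))
values = [100 + 3 * (y - 1965) + 0.2 * (y - 1965) ** 2 for y in years]
actual_2011 = values[-1]

forecasts = backtest(years, values, target_year=2011, first_cutoff=1981)
for cutoff in (1981, 1992, 2005):
    print(cutoff, {k: round(v, 1) for k, v in forecasts[cutoff].items()})
```

Plotting each trend's forecasts against the cutoff year, with a horizontal line at the actual 2011 value, gives the kind of comparison graph described above.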
The Inflation forecast range is questionable

The future consumer price index is hard to predict, but the ranges the SSA uses for it are questionable. It is possible the Federal Reserve has learned better how to control inflation over time. The graph below shows a linear trend for perspective; it isn't meant to be a realistic trend forecast.

The SSA should at least consider the possibility that a downward trend will continue. They should also consider the possibility that poor economic policies (e.g. possibly "printing" more money to fund the national debt) might eventually lead to a return of high inflation.

A moving average of the inflation rate smooths the data a little:

Just for perspective (again, this isn't meant to be an accurate forecast, but to illustrate the possibility that the general downward trend will continue), a power regression was done on the 10 year moving average CPI, which it matches fairly well:

This shows the power regression trend continued in comparison to the SSA inflation forecasts:

Trend analysis

Throughout SSA documents there are references to "ordinary least squares" regression. That is a method of determining what equation best fits a set of data. The issue is that it isn't always the most appropriate type of regression.

For convenience, any trends created for data on this page were calculated using traditional least squares regression, just as the SSA does for its trend analysis. Unlike the SSA forecast, these weren't meant to produce final, completely accurate figures but just rough guidelines to call their figures into question. If an actual forecast were being created, it would be more appropriate to consider other forms of regression with different weighting factors, like "least squares percentage regression".

A standard least squares regression method determines how well data fits a curve based on the absolute (not percentage) difference between the curve and the data. The problem is that what matters more often is the percentage error.
An error of $1 trillion in a data point when the GDP was $1 trillion is of more concern than an error of $2 trillion when the GDP is $16 trillion. Other alternative approaches might weight the fit of recent data more heavily, since it may be more representative of the current trend, and newer data may be more accurate than old data.

Simulations

It may be that the results of stochastic simulations inappropriately lead them to be overoptimistic regarding the values they should use for low and high estimates. The International Actuarial Association has a Social Security Committee which produced a paper, "Stochastic Projections of the Financial Experience of Social Security Programs: Issues, Limitations and Alternatives" (see also), where they remind people of basic concepts:

For both deterministic and stochastic models the rule of “garbage in, garbage out” applies. No model, however refined or sophisticated, can be better than its input data and assumptions. [...] When should use of stochastic models be questioned? [...] When it is difficult or impossible to determine the appropriate probability distribution [...] Users of stochastic modeling should be cautioned against the uninformed use of a particular distribution, which can generate scenarios producing completely useless results. Such a misguided assumption can be specially problematic when examining tail risk. [...] A related approach to sensitivity analysis is referred to as "stress testing", seeking

There is no reasonable way to assign probabilities to some of the major estimates they use. As indicated on this page, many of their assumptions are questionable to begin with, and it's unlikely any sort of guess at their probabilities is going to have much chance of matching reality. The Social Security Administration seems to choose assumptions for e.g. the high cost scenario which aren't truly a "stress test" or indicative of "tail risk".

The problem the SSA has with being too confident in its estimates and not considering enough variation shows when the SSA admits some of the concerns with its stochastic models, noting that once they find an equation for a trend: "Once estimated, the coefficients are treated as if they are known with certainty". It appears that throughout their projections they are too likely to ignore uncertainty in the equation for a trend, and even in the choice of the type of equation to use for a trend (especially when they don't seem to even consider alternative regressions like least squares percentage regression). This is questionable when some of their fits aren't very good, e.g. in a past description of one of their models for real wages they note: "The R-squared value was 0.53". (Note: it might be possible to find better fits, e.g. by finding a trend on real wages directly rather than a trend on the yearly changes in wages, but that option hasn't been explored in enough detail to say for sure that is what a forecast should be based on. Unfortunately it seems likely they haven't explored such alternatives either.)

Thomas Jefferson wrote while president: "We
might hope to see the finances of the Union as clear and intelligible as a merchant's books, so that every member of Congress and every man of any mind in the Union should be able to comprehend them, to investigate abuses, and consequently to control them. Our predecessors have endeavored by intricacies of system and shuffling the investigation over from one office to another, to cover everything from detection."

Unfortunately some details of the SSA forecasts are difficult to evaluate since they rely on data they say is "unpublished" from other sources. They provide some bits of the data they use, but they should provide all of it in a convenient format online for easy critique, along with all their models. There shouldn't be a need for the public to wade through the bureaucracy to get the data, or wait for office hours to contact them if a member of the public is analyzing it on a weekend or during the evening. The author was analyzing some data on a weekend and wasted time recreating longer term historic wage data than the SSA provided, using the approach they described, by gathering e.g. census data and past data on military personnel from a few different sources online. (A future page may address more details regarding their wage trends, beyond the ones noted above.)

There are other questionable aspects to the Social Security report which might be addressed in the future, time permitting (there are other policy areas to address on this site). That may be postponed to see if the SSA is inspired to overhaul its process and come out with a more credible report this year (likely due out within a few weeks from now). If they don't, it might be worth considering the creation of an alternative forecast, though that might be beyond the scope of this site.

The Social Security Administration also provides the forecast for future Medicare spending. It relies on many of the same assumptions for its report, which calls its figures into question as well. Its "high cost" scenario at least is projected to have higher real costs than the "low cost" scenario. That is for the figures they project those alternatives for; they don't do them for everything. Also some of their projections go far into the future but others don't, especially for things they say are in balance since their fees cover the costs.
Those programs still contribute to the forecasts of government wide finances, so long term projections would still be appropriate. There is an alternative Medicare report with some different higher cost assumptions, but oddly it doesn't combine its higher cost assumptions with the other "high cost" factors in the main report. The rest of the Medicare forecast should be examined for problems as well.

Appendix

It would be useful for analysts or academics curious about this issue to confirm the findings independently; most of this page merely uses SSA data. The extra wage and labor force data collected for it might be posted in the future, though it would be preferable for the SSA to provide its own data publicly to avoid the need for that, and perhaps they will if people begin asking (the data was needed on a weekend for this page and it was better to recreate it than wait and bother asking). The spreadsheets used for this page were meant as working files to analyze the data and haven't been polished for public use, but they might be posted soon anyway for the curious (though that might provide less incentive for others to bug the SSA to just get them to put up the raw data).

Those who sometimes use spreadsheets to analyze data rather than statistical packages should be aware that Excel is fine for simple calculations but has a history of numerical bugs in things like its statistics routines and trend analysis routines (e.g. see here, here, here and here just for a start). This site by default uses OpenOffice and sometimes LibreOffice for things like any trends created for graphs on this page.
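For those who want to experiment with the "least squares percentage regression" mentioned under trend analysis, here is a minimal sketch. It assumes one common formulation: minimize the sum of squared relative errors ((actual - fitted) / actual), which for a straight line is the same as weighted least squares with weights 1/y². Whether that is the exact variant a real forecast should use is an open question. The function names are hypothetical and the data series is made up for illustration.

```python
def weighted_linear_fit(xs, ys, weights):
    """Fit y = a + b*x by minimizing sum(w * (y - a - b*x)**2)."""
    sw = sum(weights)
    sx = sum(w * x for w, x in zip(weights, xs))
    sy = sum(w * y for w, y in zip(weights, ys))
    sxx = sum(w * x * x for w, x in zip(weights, xs))
    sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    denom = sw * sxx - sx * sx
    b = (sw * sxy - sx * sy) / denom
    a = (sy - b * sx) / sw
    return a, b


def ols_fit(xs, ys):
    """Standard least squares: every point's absolute error counts equally."""
    return weighted_linear_fit(xs, ys, [1.0] * len(xs))


def percentage_fit(xs, ys):
    """Least squares on relative errors: weight each point by 1 / y**2.

    Assumes every y is nonzero.
    """
    return weighted_linear_fit(xs, ys, [1.0 / (y * y) for y in ys])


def sum_sq_abs_error(xs, ys, a, b):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def sum_sq_pct_error(xs, ys, a, b):
    return sum(((y - (a + b * x)) / y) ** 2 for x, y in zip(xs, ys))


# Made-up series that grows faster than a straight line (NOT real data).
XS = list(range(10))
YS = [1.0, 1.4, 1.9, 2.7, 3.7, 5.2, 7.2, 10.0, 14.0, 19.5]

for name, fit in (("ordinary", ols_fit), ("percentage", percentage_fit)):
    a, b = fit(XS, YS)
    print(name, "a=%.3f b=%.3f" % (a, b),
          "abs SSE=%.3f" % sum_sq_abs_error(XS, YS, a, b),
          "pct SSE=%.3f" % sum_sq_pct_error(XS, YS, a, b))
```

The ordinary fit is pulled toward the large recent values, while the percentage fit treats a 10% miss on a small early value the same as a 10% miss on a large later one, which is the point of the GDP example in the trend analysis section ($1 trillion off at $1 trillion is a 100% error; $2 trillion off at $16 trillion is 12.5%).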