There’s a fact that stares you in the face as you try to figure out whether, as I have hoped, private insurers might significantly displace the system of coastal windstorm insurance in Texas currently dominated by TWIA: the private market appears to be alive and well in parts of the Texas coast. In Cameron County, for example, TWIA has only 31% of the residential market. And in Kleberg County, the figure is 27%. On the other hand, TWIA holds 77% of the residential market in Galveston County and 81% in Aransas County. What accounts for this variation? Maybe if we could figure it out, we could engineer some policies likely to induce the private market to re-enter to a greater extent throughout the coast.
I will save you the trouble of reading ahead. I didn’t find much. The variation remains pretty much a mystery. I look forward to suggestions for further experimentation, or to someone who will just reveal the obvious answer.
If anyone, by the way, has data on the proportion of property or population located within some distance of the actual coastline in each county, I’d be very interested in seeing that. Maybe the reason the geographic data isn’t showing anything is that the county divisions are too coarse. If, for example, Galveston County has a higher proportion of its population living close to the ocean than does, say, Kleberg County, and if insurers feel, for some reason, that they can’t underwrite selectively within a county, that might better explain the variation in private insurer participation in the Texas coastal windstorm market.
For those who care how I came by my “negative result” — just the sort of thing many academic journals tend to disdain — I offer the following brief synopsis.
If you just look at a map, no particular pattern appears.
What if we look at some data? I grabbed data from the United States census on the TWIA counties that I thought might possibly be relevant. Maybe population density is important, on the thought that the denser the county, the more correlated the risk and the less private insurers would want to write there. Or maybe private insurers have greater (or lesser) marketing power in densely populated counties. I grabbed median income data on the thought that private insurers might prefer to write policies in wealthier counties. I grabbed ethnicity data on the thought that race and ethnicity often matter in modern America, not necessarily causally but because they end up correlating with things that matter. We end up with 14 data points and 3 independent variables. There’s not a huge amount one can do with data sets this small, but I thought I’d give it a try.
If one does a simple-minded logit regression, one ends up with the following somewhat unusual result. With these three variables, we end up accounting for about 72% of the variation in the data, but no single variable is statistically significant, or even close.
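For readers who want to try something similar, here is a minimal sketch of one way to run such a regression: take the logit of each county's TWIA share and fit it by ordinary least squares against the three county-level variables. This is only one reading of "logit regression" and may not match the exact procedure used above. The predictor values and shares in the snippet are random placeholders, not the census or TWIA figures, and the function name `fit_logit_linear` is invented for illustration.

```python
import numpy as np

def logit(p):
    """Map a share in (0, 1) onto the whole real line."""
    return np.log(p / (1.0 - p))

def fit_logit_linear(X, share):
    """OLS of logit(share) on X plus an intercept.

    Returns the coefficients, their t-statistics, and R-squared
    (all computed on the logit scale).
    """
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    y = logit(share)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    dof = n - (k + 1)                             # residual degrees of freedom
    sigma2 = (resid @ resid) / dof                # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)         # coefficient covariance
    t_stats = beta / np.sqrt(np.diag(cov))
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - (resid @ resid) / ss_tot
    return beta, t_stats, r2

# PLACEHOLDER DATA (assumption): 14 counties, 3 standardized predictors
# standing in for population density, median income, and ethnicity.
rng = np.random.default_rng(0)
n_counties = 14
X = rng.normal(size=(n_counties, 3))
share = rng.uniform(0.2, 0.85, size=n_counties)   # stand-in TWIA shares

beta, t_stats, r2 = fit_logit_linear(X, share)
print("coefficients:", beta)
print("t-statistics:", t_stats)
print("R-squared:", r2)
```

With 14 counties and four fitted parameters there are only 10 residual degrees of freedom, so a coefficient needs a t-statistic of roughly 2.2 in absolute value to clear the usual 5% threshold. When the predictors are correlated with one another, it is quite possible for the model as a whole to fit reasonably well while no individual coefficient gets there.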
We can also try something more sophisticated. Instead of just assuming a logistic linear relationship between the independent variables and the dependent one (TWIA penetration), we can ask the computer to explore a huge space of potential models and see if anything turns up. Such statistical work used to be impracticable without supercomputers due to the amount of computation involved and the custom programming required. It’s now eminently possible on an average desktop with software such as DataModeler from Evolved Analytics. Although this process often yields remarkable gains in understanding a system, such is not always the case. And, for this small dataset, exploring a much larger model space leaves us with a number of models that have somewhat higher R-squared values than our logistic regression, but nothing to truly brag about and none that clearly points towards one or another of the variables in our model as being critical.
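To give a flavor of what "exploring a space of models" means, here is a toy sketch. It is not how DataModeler works; it simply brute-forces a tiny, hand-picked family of models (every subset of the three variables, each passed through one of three transforms) and ranks them by leave-one-out R-squared, which is a more honest score than in-sample R-squared when there are only 14 points. The data are random placeholders, and `TRANSFORMS`, `loo_r2`, and `search_models` are names made up for this illustration.

```python
import itertools
import numpy as np

# A deliberately tiny menu of transforms (assumption, chosen for illustration).
TRANSFORMS = {
    "x": lambda v: v,
    "x^2": lambda v: v ** 2,
    "exp(x)": lambda v: np.exp(v),
}

def loo_r2(A, y):
    """Leave-one-out R-squared for a least-squares fit of y on the columns of A."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                  # hold out observation i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ beta
    return 1.0 - np.sum((y - preds) ** 2) / np.sum((y - y.mean()) ** 2)

def search_models(X, y):
    """Score every (variable subset, transform choice) pair, best first."""
    n, k = X.shape
    results = []
    for size in range(1, k + 1):
        for cols in itertools.combinations(range(k), size):
            for names in itertools.product(TRANSFORMS, repeat=size):
                feats = [TRANSFORMS[t](X[:, c]) for c, t in zip(cols, names)]
                A = np.column_stack([np.ones(n)] + feats)
                results.append((loo_r2(A, y), list(zip(cols, names))))
    results.sort(key=lambda r: r[0], reverse=True)
    return results

# PLACEHOLDER DATA (assumption): 14 counties, 3 standardized predictors,
# and a stand-in for logit-transformed TWIA share.
rng = np.random.default_rng(1)
X = rng.normal(size=(14, 3))
y = rng.normal(size=14)

results = search_models(X, y)
for score, spec in results[:5]:
    print(round(score, 3), spec)
```

Leave-one-out R-squared can come out negative, which means the model predicts held-out counties worse than simply guessing the average. With pure-noise placeholder data like this, that is exactly what one should expect from most candidates.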
I thus end up saying that, for now, the mystery of varying market penetration remains unsolved.