Thursday, November 14, 2019

ND County Clusters by Income Source

by David Flynn

One of the most frequent questions I get regarding the nature of the regional economies within North Dakota focuses on proper comparisons. The question boils down to a search for comparable peers, and while there are jokes to be made about how nobody compares, it is an excellent question.

So I start this process with a simple cluster analysis (k-means) looking only at the annual percentage change in farm and non-farm income from 2016 to 2017. The interesting constraint here looks like it might be the data. There are many suppression flags in the data set for counties based on disclosure concerns. However, all counties in the analysis include those gross categories.
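As a rough illustration of that setup, here is a minimal Python sketch of computing the two percentage changes while skipping suppressed cells. Everything in it is an assumption for illustration: the `"(D)"` flag string, the `pct_change` and `build_features` helpers, and the county rows are all hypothetical, not the actual data or code behind this post.

```python
from typing import Dict, Optional, Tuple

# Assumption: suppressed cells arrive as the string "(D)"; the real flag may differ.
SUPPRESSED = "(D)"


def pct_change(prev: str, curr: str) -> Optional[float]:
    """Annual percentage change, or None if a value is suppressed or the base is zero."""
    if SUPPRESSED in (prev, curr):
        return None
    base = float(prev)
    if base == 0:
        return None
    # abs() keeps the sign of the change meaningful if the base year is negative,
    # which can happen with farm income.
    return (float(curr) - base) * 100.0 / abs(base)


def build_features(rows: Dict[str, Tuple[str, str, str, str]]) -> Dict[str, Tuple[float, float]]:
    """Map county -> (farm % change, non-farm % change), dropping suppressed counties."""
    features = {}
    for county, (farm16, farm17, nonfarm16, nonfarm17) in rows.items():
        farm = pct_change(farm16, farm17)
        nonfarm = pct_change(nonfarm16, nonfarm17)
        if farm is not None and nonfarm is not None:
            features[county] = (farm, nonfarm)
    return features


# Hypothetical rows: county -> (farm 2016, farm 2017, non-farm 2016, non-farm 2017)
rows = {
    "County A": ("100", "110", "1000", "1050"),
    "County B": ("(D)", "95", "800", "820"),
    "County C": ("200", "180", "500", "550"),
}

features = build_features(rows)
```

In this made-up example County B drops out because its 2016 farm value is suppressed, which is the kind of data loss the suppression flags could cause once finer income categories are added.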

There will be a need to look at levels and rates in the analysis at some point, but this seems to be a decent starting point. I set it up with three, four, and six clusters to see how the groupings changed, if at all.
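For readers who want to see the mechanics, the following is a minimal sketch of k-means (plain Lloyd's algorithm) run with three, four, and six clusters. It is not the code used for the charts in this post: the `kmeans` function, the seed, and the `points` values are hypothetical stand-ins, and in practice a library routine such as scikit-learn's `KMeans` would likely be used instead.

```python
import random
from math import dist


def kmeans(points, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm on a list of tuples; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Start from k distinct observations chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: centroids stopped moving.
        centroids = new_centroids
    return labels, centroids


# Hypothetical (farm % change, non-farm % change) pairs, one per county.
points = [
    (12.0, 9.5), (10.5, 11.0), (11.2, 10.1),
    (1.0, 2.0), (0.5, 1.5), (2.1, 2.4), (1.4, 0.9),
    (-15.0, -6.0), (-18.0, -4.5),
]

# Fit the same data with three, four, and six clusters to compare groupings.
results = {k: kmeans(points, k) for k in (3, 4, 6)}
```

Because the starting centroids are random, a different seed can give somewhat different groupings, so it is worth rerunning with a few seeds before reading too much into any one partition.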

[Figure: ND County Clusters by Income Source, three clusters]

The three clusters give us an interesting outcome. There are two major clusters: an upper grouping that includes several Bakken counties, and a group of the others. Then there are two outliers on the negative side, Ramsey and Oliver counties. To be clear, given the way the data are set up, these two could still be growing; they are simply below the average performance of the other counties in the state.

[Figure: ND County Clusters by Income Source, four clusters]

Adding a fourth cluster lets the Bakken region really set itself apart. The lower outliers again remain in their own grouping. While it is not completely the case, you almost have a vertical pecking order here, with the highest cluster being thought of as the best. Lastly, we have six clusters.

[Figure: ND County Clusters by Income Source, six clusters]

The larger groupings stay the same at this point, and the outliers are each in their own group. I do not like setting off individual observations on their own in these analyses (the goal is comparable peers, after all), so I think fewer than six clusters is optimal. We also have another cluster of lower performers in this case (Cluster 1 in the above graph).
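One simple way to make the "no county on its own" preference concrete is to count cluster sizes and flag any cluster with a single member. The sketch below assumes cluster labels are already available as a list of integers; the `singleton_clusters` helper and the label vectors are hypothetical, chosen only to mimic the pattern described above.

```python
from collections import Counter


def singleton_clusters(labels):
    """Return the sorted cluster ids that contain exactly one observation."""
    counts = Counter(labels)
    return sorted(cluster for cluster, size in counts.items() if size == 1)


# Hypothetical label vectors for ten counties under two choices of k.
labels_by_k = {
    3: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2],
    6: [0, 0, 1, 1, 2, 2, 3, 3, 4, 5],
}

# Keep only the cluster counts that leave no county in a group by itself.
acceptable_k = [k for k, labels in labels_by_k.items() if not singleton_clusters(labels)]
```

With these illustrative labels, the six-cluster solution is rejected because two clusters hold one county each, while the three-cluster solution passes.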

Further work will add new variables and probably some levels as well, because that is typically an important check on the North Dakota data (any of my former students reading this want to hazard a guess why?).
