# What is good science?

With the cold weather today, my teaching in the Malaysian workshop seems a lot longer ago than two weeks. But only two weeks ago I was teaching the statistics of crystallisation in sunny Malaysia. I enjoyed it. One of the fun things about teaching at this sort of workshop is listening to the other speakers. I am as happy to learn as the students. One of the speakers in Malaysia was Mason Porter of Oxford’s Mathematical Institute. He works on networks and it was interesting to hear what he does.

One of the things he mentioned was a perspective paper in Science on fitting power laws (i.e., y ~ x^s) to data. It is an interesting read. If I were to summarise his paper in a sentence, it would be: So you can fit a power law to your data, so what? (Incidentally, the paper is only a page and a bit long, so you can easily take a look.) This is a fair question. Let us say you have some data, and it is well fit by a power law. What have you learned?

One thing that you should have spotted before you fitted the data is that you have a broad spread of values. Power laws are very broad distributions, in the following sense. The distribution of wealth is roughly a power law. Bill Gates, Sergey Brin, and others have many billions; I have a lot less wealth than they do. All three of us are on a broad distribution. The height of men is approximately Gaussian (as is that of women, although it is shifted to somewhat lower values than that of men), which is a much narrower distribution of values. Bill Gates is 1.78 m, I am 1.85 m. This is a much smaller difference than in our wealth.
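To make the broad-versus-narrow contrast concrete, here is a minimal sketch that draws synthetic samples from a power-law (Pareto) distribution and from a Gaussian, then compares the largest value with the median in each. The numbers are assumptions chosen for illustration, not real wealth or height data: a Pareto exponent of 1.5, and heights with a mean of 1.75 m and a standard deviation of 0.07 m. The helper `spread_ratio` is a made-up name for this example.

```python
import random
import statistics


def spread_ratio(samples):
    """Ratio of the largest sample to the median: a crude measure of breadth."""
    return max(samples) / statistics.median(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Synthetic "wealth": Pareto (power-law tail) with an assumed exponent of 1.5.
wealth = [rng.paretovariate(1.5) for _ in range(n)]

# Synthetic "height" in metres: Gaussian with assumed mean 1.75 and s.d. 0.07.
height = [rng.gauss(1.75, 0.07) for _ in range(n)]

print(f"wealth: max/median = {spread_ratio(wealth):.1f}")
print(f"height: max/median = {spread_ratio(height):.2f}")
```

For the power-law samples the largest value is typically hundreds or thousands of times the median, while for the Gaussian samples it is only slightly above it. No fit is needed to see the difference.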

You don’t need to fit a power law to spot such broad distributions. So what is the point of fitting a power law?

Generally speaking, I would say that good pure science is able to make predictions about the natural world that are non-trivial, and are true, just as good engineering is making non-trivial stuff that works.

Returning to power laws, if you have a good model that allows you to calculate the exponent (s above) of a power law and to predict what it should be in a range of cases, then you can do experiments on one case and show that your model works there. Then you can use your verified model to make a prediction in another case. But often, as Stumpf and Porter say, the scientists just fit a power law and claim victory, without a good tested model for why the power law applies. Without such a model, no or few predictions can be made.
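As a rough sketch of what testing a model's exponent against data could look like, the code below estimates the exponent of a power-law density p(x) ~ x^(-a) by maximum likelihood and compares it with a predicted value. Everything here is an assumption for illustration: the data are synthetic, `predicted_a` stands in for whatever a real model would predict, and `fit_exponent` is a made-up helper name. In the notation above, s = -a. This is one standard estimator for continuous data with a known lower cutoff, not necessarily the method the paper discusses.

```python
import math
import random


def fit_exponent(samples, x_min):
    """Maximum-likelihood estimate of a for p(x) ~ x**(-a), x >= x_min.

    Returns the estimate and its approximate standard error.
    """
    tail = [x for x in samples if x >= x_min]
    n = len(tail)
    a_hat = 1.0 + n / sum(math.log(x / x_min) for x in tail)
    std_err = (a_hat - 1.0) / math.sqrt(n)
    return a_hat, std_err


rng = random.Random(1)  # fixed seed so the run is repeatable

# Synthetic data with a known exponent. paretovariate(alpha) has a density
# proportional to x**(-(alpha + 1)) for x >= 1, so a = alpha + 1.
true_a = 2.5
data = [rng.paretovariate(true_a - 1.0) for _ in range(50_000)]

a_hat, std_err = fit_exponent(data, x_min=1.0)

# Hypothetical model prediction, to be checked against the measurement.
predicted_a = 2.5
z = abs(a_hat - predicted_a) / std_err

print(f"measured a = {a_hat:.3f} +/- {std_err:.3f}")
print(f"predicted a = {predicted_a}, deviation = {z:.1f} standard errors")
```

The fit alone only produces a number. The scientific content comes from `predicted_a`: a model that says in advance what the exponent should be, and that can then be checked in a second case.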

Without any ability to make predictions, has anything useful been learned?