Look mama. No hands!

When did hydrology become computer science?

There was once a time when all hydrology was field hydrology. Rating curves, pan evaporation rates, curve numbers and unit hydrographs were actually derived from field data. Runoff coefficients were obtained from paired catchment studies.

At some point (maybe in the 1970s?), people seem to have declared hydrology a mature field. There were no more discoveries left to be made. The next exciting phase was to harness the massive computational power of ever more capable computers. Indeed, computers generated many exciting possibilities and made large-scale river basin modelling possible.

In reality, modelling is more art than science. Models are highly dependent on assumptions and parameters. For instance, we know in theory that stream flow depends on stream roughness and soil hydraulic parameters; in practice, we can never measure the roughness in every section of the stream or the soil types in every square inch of the watershed. Ultimately, we guess these parameter values and then keep adjusting the guesses, through model calibration, until the stream flow values predicted by the model match what is observed in the real world.

Early on, models were calibrated manually, by running a simulation and then tweaking the parameters a bit. As computers became more powerful, inversion techniques made it possible to “auto-calibrate” models by trying combination after combination of numbers within the “parameter space” until the simulated model output matched observations.
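To see what auto-calibration amounts to in its crudest form, here is a toy sketch. The model, the made-up rainfall and runoff numbers, and the brute-force search are all hypothetical illustrations, not SWAT or any real hydrologic model: a one-parameter runoff model whose coefficient is found by simply trying every value in a grid and keeping the one that best matches the observations.

```python
def simulate_runoff(rainfall, runoff_coeff):
    """Trivially simple model: runoff is a fixed fraction of rainfall."""
    return [runoff_coeff * r for r in rainfall]

def calibrate(rainfall, observed, candidates):
    """Brute-force 'auto-calibration': try every candidate parameter
    value and keep the one with the lowest sum of squared errors
    between simulated and observed runoff."""
    best_coeff, best_sse = None, float("inf")
    for c in candidates:
        simulated = simulate_runoff(rainfall, c)
        sse = sum((s - o) ** 2 for s, o in zip(simulated, observed))
        if sse < best_sse:
            best_coeff, best_sse = c, sse
    return best_coeff

# Made-up storm events (mm of rain) and made-up observed runoff (mm).
rainfall = [10.0, 25.0, 5.0, 40.0]
observed = [3.1, 7.4, 1.6, 12.2]
# Search the "parameter space" 0.00 .. 1.00 in steps of 0.01.
candidates = [i / 100 for i in range(101)]
best = calibrate(rainfall, observed, candidates)
print(best)
```

The machine will happily return a best-fit number regardless of whether the model structure, the data, or the question make any physical sense, which is exactly the problem.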


“Don’t worry. I have plenty of time. The auto-calibrate model will figure out the soils, vegetation, climate, human water use and also the research question by tomorrow.”

The more dependent we become on computers, the less we rely on our intuition. In one paper I reviewed, the authors (from a prestigious Indian university) had not even bothered to replace the SWAT model variables with their English names! The parameters resulting from the auto-calibration were simply presented in a table, with no reflection on whether they made sense. It was obvious the authors had never even visited that part of the country and, in fact, had no research question.

The objective of the paper was “to develop a model”, a goal the paper obviously fulfilled! How do we change this culture of garbage-in-garbage-out modelling so that hydrologic research is firmly rooted in critical questions that matter to people? The stakes are non-trivial: a billion people’s lives and livelihoods!

“But it’s all we have”

One of the fun challenges of doing water research in India is the data, or rather the lack thereof. The good news is that only half of the data are wrong. The bad news is we just don’t know which half! This fact is well known to every water researcher in India. Yet somehow we seem fixated on the idea that “we have to use the data – it’s all we’ve got”. How on earth do we justify using data we know to be wrong?


Groundwater levels in hard-rock peninsular India are an example of this blinkered vision. On one hand, everyone understands perfectly well that water levels are highly heterogeneous and that two wells a few feet apart can have water levels that differ by hundreds of feet. On the other hand, that does not stop us from treating data from sparsely distributed, arbitrarily located monitoring wells as somehow magically representative of the entire 100 sq km region around them.

We build entire models of the country using these flawed data and even base infrastructure decisions and insurance prices on them. If the whole castle is built on a weak foundation, what’s the point?

Despite this obvious flaw in logic, most groundwater hydrogeologists will run around in circles trying to explain why the monitoring well data are more reliable and why it’s OK to use them. Here are some explanations often offered:

  1. The monitoring well data represent static water levels, while farmer wells exhibit dynamic water levels. Perhaps there’s a kernel of truth here – but it’s not a reasonable explanation for the heterogeneity, because the unstated assumption is that, given enough time, all wells will reach an equilibrium, even in a hard-rock aquifer. This simply isn’t true – two wells that haven’t been pumped for months can still show widely different groundwater levels.
  2. The government wells were carefully constructed and are therefore more believable than farmers’ wells. But the monitoring wells show extremely shallow water levels. If water were really available 15 feet below the ground, open wells should hold water. It’s hard to believe that farmers spend lakhs of rupees drilling hundreds of feet for no reason.

It’s quite clear that farmer narratives of depletion in hard-rock aquifers diverge sharply from monitoring well data. The question is why. This is a valid line of inquiry – but only if the scientific community recognizes all types of data as equally valid.


“What do you mean you didn’t find water even after drilling to 1,000 ft? Our monitoring wells have water even at 15 ft!”

Some Indian journals do not, in fact, recognize certain types of data as valid. For instance, a reviewer of an article I submitted to an Indian journal argued that government data were beyond question, and that citizen science or farmer survey data, even when systematically collected, could not be used in a scientific paper!

This needs to change. The notion that certain ideas, institutions or types of data are superior or inferior should have no place in science. The culture of Indian science needs to become more grounded, driven by a spirit of curiosity and open to any form of data, provided we challenge ourselves to interpret and communicate clearly what the data mean.