Laura Bell, Warren Currie (DFO), Stephi Sobek-Swant (rare reserve), John Drake, Wonhyo Lee & Madison Brook (DFO)
03/04/2023
ecotheory.ca/theorydata/datatheory.html
We may refer to beliefs supported by data, or which at least do not always contradict data, as theories
We would like theories to have a few other properties, such as:
Model: a representation of reality
A structure that:
conceptual (e.g., a statement)
physical (e.g., lab experiment)
mathematical (e.g., ODE)
data-driven (e.g., regression)
computational (e.g., IBM)
“predators can positively impact prey”
\(\frac{dN}{dt}=f(N,E)+g(N,P,E)\)
\(E(y_i)=β_0+f(x_i)+\epsilon\)
“We actually made a map of the country, on the scale of a mile to the mile!” “Have you used it much?” I enquired. “It has never been spread out, yet,” said Mein Herr,
“the farmers objected: they said it would cover the whole country, and shut out the sunlight! So we now use the country itself, as its own map, and I assure you it does nearly as well.”
Lewis Carroll - The Complete Illustrated Works. Gramercy Books, New York (1982)
“With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”
John von Neumann
“There are two ways of doing calculations in theoretical physics”, he said.
“One way, and this is the way I prefer, is to have a clear physical picture of the process that you are calculating. The other way is to have a precise and self-consistent mathematical formalism. You have neither.”
Enrico Fermi speaking to Freeman Dyson about pseudoscalar meson theory
something is a “mechanistic model”, because it includes a priori knowledge of ecological processes (rather than patterns)
e.g., mechanistic niche model based on first principles of biophysics and physiology vs
correlational niche model based on environmental associations derived from analyses of geographic occurrences of species
(see Peterson et al. 2015)
the correlation model is a phenomenological model
Data-driven: I’m including here standard statistical models (both frequentist and Bayesian), as well as machine learning models
these models are good at finding patterns
in the machine learning literature this is sometimes referred to as the ‘inductive capability’ of algorithms (from past data, one can identify patterns)
“While mechanistic models provide the causality missing from machine learning approaches, their oversimplified assumptions and extremely specific nature prohibit the universal predictions achievable by machine learning.”
Baker et al. (2018). Mechanistic models versus machine learning, a fight worth fighting for the biological community? Biology Letters, 14(5), 20170660
does not logically entail
“All swans are white”
Popper claimed to have refuted the idea that induction provides a foundation for knowledge (Popper, Karl, and David Miller. A Proof of the Impossibility of Inductive Probability. Nature 302, no. 5910 (1983): 687–88.)
given the success of machine learning, does this mean Popper was wrong that induction is a refuted theory?
“If a machine can learn from experience in this way, that is, by applying the simple inductive rule, is it not obvious that we can do the same? Of course I never said that we cannot learn from experience. Nor did I ever say that we cannot successfully use the simple inductive rule—if the objective conditions are appropriate. But I do assert that we cannot find out by induction whether the objective conditions are appropriate for using the simple inductive rule.”
Popper, K. (1983) Realism and the Aim of Science
starting from \(\frac{dN}{dt}=f(N)-g(N,P)\) is no different than starting from \(y_i=\beta_0 + f(x_i)+\epsilon\) in terms of mechanism
we need an explanation or idea about how the predators negatively impact prey population growth rate (what is g(N,P)?)
to develop ideas about mechanism that inform the function
selection/creation of mathematical and computational models often tends to appeal to classifications rather than mechanism
a common example is pairwise species interactions such as the “predator-prey model” (e.g., the Lotka-Volterra predator-prey model \(\frac{dN}{dt}=rN-aNP\))
“predator-prey” model supposes there is a class of predator-prey interactions that have general properties across species, systems and time that are related to the outcome of the interaction (-/+)
“classification” models have dubious explanatory and predictive value outside of the exemplar systems in which they were generated
instead of using classifications of outcomes, we should focus on incorporating mechanisms, which may generalize across species, systems and time
for example, while specialist predators may always eat their prey, the net effect of this pairwise species interaction is not fixed
\[ \begin{aligned} &\frac{dN}{dt}=N((a_0+a_1E)-(b_0)N-(c_0)P) \\ &\frac{dP}{dt}=P((f_0+f_1E)-(g_0)P+(h_0)N) \\ &\frac{dE}{dt}=-k(E)+mP \end{aligned} \]
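A minimal numerical sketch of the system above can make the "net effect is not fixed" point concrete. It assumes that \(k(E)\) means \(kE\), uses a hand-written fixed-step RK4 integrator, and picks purely illustrative parameter values (none come from the talk). The helper names `simulate` and `prey_at_end` are hypothetical.

```python
# Hypothetical parameter values, chosen only for illustration.
BASE = dict(a0=1.0, a1=0.0, b0=1.0, c0=0.5,
            f0=0.1, f1=0.0, g0=1.0, h0=0.5,
            k=1.0, m=1.0)


def simulate(par, n0, p0, e0, t_end=100.0, dt=0.01):
    """Integrate the N-P-E model with fixed-step RK4; return the final (N, P, E)."""

    def rhs(state):
        n, p, e = state
        dn = n * ((par["a0"] + par["a1"] * e) - par["b0"] * n - par["c0"] * p)
        dp = p * ((par["f0"] + par["f1"] * e) - par["g0"] * p + par["h0"] * n)
        de = -par["k"] * e + par["m"] * p  # assumption: k(E) is read as k * E
        return (dn, dp, de)

    def shifted(state, deriv, h):
        # state + h * deriv, element by element
        return tuple(s + h * d for s, d in zip(state, deriv))

    state = (n0, p0, e0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, 0.5 * dt))
        k3 = rhs(shifted(state, k2, 0.5 * dt))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(
            s + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
            for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
        )
    return state


def prey_at_end(a1, predator_present):
    """Prey density after a long run, with or without the predator."""
    par = dict(BASE, a1=a1)
    p0 = 0.1 if predator_present else 0.0
    n_final, _, _ = simulate(par, n0=0.5, p0=p0, e0=0.0)
    return n_final


n_alone = prey_at_end(a1=0.0, predator_present=False)
n_consumed = prey_at_end(a1=0.0, predator_present=True)     # no feedback via E
n_facilitated = prey_at_end(a1=1.0, predator_present=True)  # prey benefits from E

print(f"prey alone:                        {n_alone:.3f}")
print(f"prey + predator, a1 = 0:           {n_consumed:.3f}")
print(f"prey + predator, a1 = 1 (via E):   {n_facilitated:.3f}")
```

Under these assumed values, the prey settles near 1.0 on its own, near 0.76 with the predator when \(a_1=0\), and near 1.4 when \(a_1=1\). The consumption term \(c_0\) is identical in both predator runs; only the strength of the pathway through \(E\) differs. With \(f_1=0\), the predator's net effect on equilibrium prey density is positive when \(a_1 m/k > c_0\).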
“In the mid-1990s, zebra and quagga mussels (Dreissena spp.) invaded the area, dramatically changing the water clarity because of the filter-feeding capacity.”
Bay of Quinte Remedial Action Plan (2017)
- asymptotic vs transient dynamics predicted by models
all of which arise from the SAME mechanistic model
Where the derivative of the year smooth from our GAM significantly deviates from zero, we have a period of rapid change.
- rapid response to management in the 70s
Entropy: the model is calibrated to find the distribution that is most spread out, or closest to uniform throughout the study region.
Constraints: the rules that constrain the predicted distribution. These rules are based on the values of the environmental variables (called features) of the locations where the species has been observed.
Phillips SJ, Anderson RP, Schapire RE (2006) Maximum entropy modeling of species geographic distributions. Ecol Modell 190(3–4):231–259.
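The entropy and constraint ideas can be written compactly. The notation here is an assumption for illustration: \(X\) is the set of locations in the study region, \(\hat{\pi}\) the predicted distribution over \(X\), \(f_j\) the features, and \(x_1,\dots,x_m\) the locations where the species was observed. This is a sketch of the basic unregularized problem; in practice Maxent relaxes the equality constraints to hold only within a tolerance.

\[ \begin{aligned} &\max_{\hat{\pi}} \; H(\hat{\pi}) = -\sum_{x \in X} \hat{\pi}(x)\,\ln \hat{\pi}(x) \\ &\text{subject to} \quad \sum_{x \in X} \hat{\pi}(x)\, f_j(x) = \frac{1}{m}\sum_{i=1}^{m} f_j(x_i) \quad \text{for each feature } j, \qquad \sum_{x \in X} \hat{\pi}(x) = 1 \end{aligned} \]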
use experimental data to suggest candidate predictors: may require cold stratification, prefer moist sites
initial Maxent model to find strong candidates and eliminate correlated predictors (normally we would leave these in and assume that the penalization would take care of overfitting)
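One simple way to eliminate correlated predictors, sketched here as an assumption rather than the procedure actually used, is a greedy filter. It walks through the predictors in order of importance (for instance, as ranked by an initial model) and keeps each one only if its absolute Pearson correlation with every predictor already kept is below a threshold. The predictor names, toy values, and the 0.7 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_correlated(predictors, ranking, threshold=0.7):
    """Keep predictors in ranked order, skipping any that is strongly
    correlated (|r| >= threshold) with one that has already been kept."""
    kept = []
    for name in ranking:
        if all(abs(pearson(predictors[name], predictors[other])) < threshold
               for other in kept):
            kept.append(name)
    return kept


# Toy values at six hypothetical sites; not real data.
predictors = {
    "winter_temp": [1, 2, 3, 4, 5, 6],
    "elevation": [12, 10, 8, 6, 4, 1],      # nearly collinear with winter_temp
    "soil_moisture": [2, 5, 1, 4, 2, 4],
}
# Hypothetical importance ranking, e.g. taken from an initial model.
ranking = ["winter_temp", "elevation", "soil_moisture"]

kept = drop_correlated(predictors, ranking)
print(kept)
```

In this toy case `elevation` is dropped because it is almost perfectly (negatively) correlated with `winter_temp`, which ranks higher, while `soil_moisture` is retained.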
Let’s use models more effectively by:
The Quinte project is funded by the Department of Fisheries and Oceans Canada. The Hine's Emerald dragonfly and Giant Hogweed project was funded by NSERC and Faculty of Science, University of Waterloo.