import React, { Component } from 'react';
import fig9 from '../img/interfaceEssay/fig9.png';
import fig10 from '../img/interfaceEssay/fig10.jpeg';
import Fig from './Fig.js';
import TextLink from './textLink';
import CaptionTextLink from './captionTextLink';
import Quote from './Quote.js';

class DynamicSystemsSection extends Component {
    render() {
        return(
            <div>
                <a name="systems"><h2>Representing knowledge as dynamic systems</h2></a>
                <br />
                <p>Science (trying to derive a new theory from a system) and engineering (trying to create a new system from a theory) are both the study of systems (see Bret Victor). As such, academic papers primarily exist to describe knowledge of systems. This is true across Biology (living systems, from genetic interactions to behavioural ecology), Chemistry (atomic interaction systems), Physics (fundamental systems of matter) and Medicine (<i>applied</i> systems for human health), all the way to the social systems of Economics and History. </p>
                <br />
                <br />
                <p>Systems are complex entities that can be understood in many ways. A system is not equal to its explanation in words, nor is it equal to a flowchart. Both are useful <i>abstractions</i> which seek to describe it in some dimension, but they do not represent it as a whole. The climate, for example, is a very complex system. When we ask questions like "what's the weather like next week?", we can use a variety of abstractions of the climate system to represent it in specific ways, though no abstraction represents the whole system.</p>
                <br />
                <br />
                <h6>The climate is a complex system that can be represented in many ways to help us understand it and apply it to our own purposes. </h6>
                <br />
                <br />
                <p>Scientific papers do not give us very powerful tools for describing complex systems like the climate. In fact, there are only two main tools that we can rely on: <strong>language</strong> and <strong>static graphs</strong>. These are individually very powerful ideas. However, they are insufficient when representing complex systems.</p>
                <br />
                <br />
                <p>Jerome Bruner provides us with a simple model that can help to put this into context. In this model, there are three ways to represent information: enactive (interactive/action-based), iconic (visual/image-based) and symbolic (language-based). There are other models used to describe a similar idea, but this model is sufficient for our purposes. Scientific papers are mostly <strong>symbolic</strong> tools, in which we manipulate language and mathematical notation to represent knowledge. Language is supported by static graphs, which are <strong>visual</strong> artifacts that help to display trends in numerical data. However, we lack an <strong>interactive</strong> means of exploring visual and symbolic representations of ideas.</p>
                <br />
                <br />
                <p>Systems are <strong>dynamic</strong> entities which are reactive to change. When you make a change in one part of the system, a series of feedback effects will lead to a change in other parts of the system. Consider a simple example. If you turn your fridge off (and you live somewhere relatively warm), the temperature of the food inside it will rise. In turn, this will increase the rate of bacterial/fungal growth, which will cause your food to spoil faster. By changing the temperature, you alter the dynamics of the system, which causes changes elsewhere within it. This system is itself composed of many subsystems (e.g. bacterial growth) that all interact based on a common set of variables. </p>
                <br />
                <br />
                <p>Notably, seeing what happens when you change something in a system is key to understanding it more deeply. It's a different kind of understanding than reading about it in a book, or seeing a picture of someone else's fridge. It's an <i>experiential</i> understanding that you arrive at by taking an <i>action</i> and seeing the feedback. </p>
                <br />
                <br />
                <p>It's this type of understanding that is key to whole-systems thinking. As they currently stand, there's no way that scientific papers could ever represent this kind of thinking, because they are static objects that still lean on the mental models of pen and paper. There's actually nothing you can do with a digital scientific paper that can't be done with pen and paper (note - it's still called a paper!). If anything, digital papers are actually <i>more</i> constraining for expressing complex thought!</p>
                <br />
                <br />
                <p>Systems thinking is best represented by <strong>dynamic simulations</strong>, a paradigm unique to computers. With a dynamic simulation, you can interactively change something and the system will react to your input. The simulation can be constructed from our best understanding of a complex system, and we can <i>integrate</i> that understanding to provide a predictive model that can <i>show</i> us what happens in hypothetical scenarios. Importantly, we can use that model to create multiple rich representations (including dynamic visualisations, graphs, images, words, etc.) of what is happening in the system, all of which are dynamic and reactive to user input. </p>
                <br />
                <br />
                <p>This is a powerful way to enable people to engage with ideas. It moves us from a <i>transmissive</i> (passive) model to an <i>interactive</i> model of learning. Users can choose how they want information represented, and use the model to answer their own specific questions.</p>
                <br />
                <br />
                <p>So, representing systems dynamically is important for learning about and understanding systems deeply. This is important for both novices and experts. However, dynamic representations of systems are not just useful for learning. Dynamic simulations are <i>models</i> of some interacting phenomena, which makes them able to <i>predict</i> things. Therefore, they are also valuable for the people producing new knowledge as a means of testing hypotheses. </p>
                <br />
                <br />
                <p>Models can produce results faster and more cheaply than a physical experiment. For example, graphene may become a foundational material of the future, but it is yet to really make an impact. One potential application is as a novel semiconductor for computer chips. However, 'normal' graphene has some characteristics that prevent it from being useful in this domain. While there are potential ways to fix this, it doesn't always make sense to manufacture (very) expensive graphene to 'waste' on potential fixes. Instead, modelling graphene can be predictive <i>enough</i> to help us test out ideas and figure out which ideas are worth taking to the next stage.</p>
                <br />
                <br />
                <p>In some cases, models can help us to answer questions which <i>cannot</i> be tested by a physical experiment. For example, the climate is an incredibly complex system of interacting forces. For our purposes, we want to know how our actions will affect the climate over <i>extended periods of time</i>. By definition, we cannot carry out a physical experiment, because we want to test <i>different futures</i> against each other.</p>
                <br />
                <br />
                <Quote text='"If we had observations of the future, we would obviously trust them more than models, but unfortunately, observations of the future are not available at this time."' source="Tom Knutson and Robert Tuleya" />
                <br />
                <br />
                <p>A key feature of systems is that they integrate many variables together. Models must therefore also integrate many variables together if they are to describe the system. A climate model is only useful if it accurately represents the <i>whole</i> climate system; otherwise, the real behaviour won't match our predicted behaviour. </p>
                <br />
                <br />
                <p>Representing complex systems, therefore, usually means collecting together many subsystems, which are each composed of a set of interacting variables. Larger systems arise from 'plugging in' many <i>subsystems</i> in order to create emergent phenomena derived from their interaction.</p>
                <br />
                <br />
                <p>This indicates another problem with scientific literature. Each individual paper describes a system or a part of a system, but it is almost always <i>partial</i>. Big systems (like the human body) are too much work for a single paper. Therefore, knowledge of whole systems is distributed across a huge mass of individual articles. However, there is very little connective tissue between those papers to enable the construction of whole systems representations.</p>
                <br />
                <br />
                <p>In most cases, the painstaking responsibility for synthesising disconnected written material falls to individuals, who must all carry out the <i>same</i> work, while the interface provides little affordance for support. This process is necessarily extremely time-consuming, because language is a low bit-rate form of knowledge transfer. Even worse, the result of this process is an <i>internal</i> mental representation that only benefits the individual researcher. Dynamic whole-systems representations, by definition, synthesise and integrate many findings, <i>instantly</i> doing much of this time-consuming work. The user can 'zoom' in and out of levels of fidelity to clarify individual findings, though they begin with an understanding of <i>how it all fits together</i>.</p>
                <br />
                <br />
                <a name="data"><h3>Do you speak data?</h3></a>
                <br />
                <p>In order to build whole system representations that integrate the findings from many research papers, we need building blocks and an interaction language out of which we can build them. Given that systems are best represented by dynamic simulations, and dynamic simulations are native to computers, we need a language that is native to computers. </p>
                <br />
                <br />
                <p>All scientific exploration has the common feature of producing data. Data represents feedback from the world, from which we can abstract relationships that confirm or disprove a hypothesis. Data provides the building blocks from which we can abstract inferential relationships that explain and predict how the world works on some dimension. The model, constructed from data and an inferential relationship, is the <strong>unit of value</strong> provided by each new scientific finding. It is valuable because it can <i>predict</i> what will happen under a set of circumstances. This is handy, because computers speak data very well. </p>
                <br />
                <br />
                <p>For example, research into vaccines models the production of antibodies by the immune system in response to a small amount of inactivated virus (or some synthetic mRNA). The model aims to find the <strong>lowest dosage</strong> of <strong>substance</strong> that creates an <strong>effective immune response</strong> over a <strong>given time period</strong>, to within an acceptable <strong>level of statistical error</strong> without <strong>adverse effects</strong>. The values in bold are measurable variables that contribute to the model of efficacy and safety of the vaccine, which are linked together by an inferential relationship - if you inject <strong>a</strong> amount of vaccine within <strong>b</strong> period of time, it will trigger <strong>c</strong> level of immune response, which in turn will lead to <strong>x</strong> reduction in the probability of contracting the disease and/or suffering serious effects from it over <strong>y</strong> expected time period. You can carry on this inferential modelling process ad infinitum, up to the population level and down through the biochemical to the subatomic level. Each level of system is deeply connected by a common set of interacting variables.</p>
                <br />
                <br />
                <p>As it currently stands, we encode this model as knowledge using language, supported by some visual artifacts (graphs), in the form of a scientific paper. My argument here is that the language encoding of new scientific knowledge is <i>not</i> the critical artifact. The <i>model</i> is the critical artifact, including the original experimental data, the data analysis strategy used and the conclusive abstract inferential relationship. Language is just a tool we use to <i>explain</i> the model. In other words, language is the thing that explains the thing, but it is not the thing itself. The thing itself is the model.</p>
                <br />
                <br />
                <p>Language is an insufficient medium for describing systems, because they are phenomena that are native to data. Language is a <i>static</i> encoding of ideas that only activates the <i>symbolic</i> pathway for understanding. It does not enable readers to 'see' ideas because language is not a visual medium. It does not enable the reader to 'play' with ideas, because language cannot react to the reader's inputs. Instead, the author chooses the words for every reader, and every reader must try to construct their understanding from that same set of words. Words are a lossy medium for describing rich, multidimensional ideas. A model, on the other hand, is an underlying set of abstract relationships between interacting data that can be used to construct <i>any representation</i> of a system, including dynamic visualisations and graphs. Therefore, representations of a system can be woven from the <i>very material that the system is made of.</i></p>
                <br />
                <br />
                <p>Despite this, language constitutes the majority of 'space' in an article. Original data is rarely found in the article itself. If it is available at all, it is usually in an obscure place unrelated to the primary literature artifact. The problem here is that the reader can't construct their own representations from this data to help them understand the finding, and they can't leverage that data to produce new knowledge. Similarly, whole system representations can't be constructed, because they rely on the original data.</p>
                <br />
                <br />
                <p>This suggests that the way we report new scientific findings is partly backwards. The thing that is most valuable is the data, but papers contribute words. Original data should be a first class object that is directly connected to new scientific artifacts, such that articles can become dynamic objects that are made out of the data. This data can be used to create a vast range of dynamic whole system representations which seek to integrate the findings from many individual investigations.</p>
                <ul>
                    <li><p>Data is the basis of every new scientific finding, because it represents objective feedback from the world.</p></li>
                    <li><p>Data represents information that can be used to produce dynamic representations of complex systems, and can be fed into a model which has predictive capacity.</p></li>
                    <li><p>Language is the dominant form of reporting, but it is a static representation of new knowledge, which cannot be used for either of these purposes.</p></li>
                    <li><p>So, original data should be a first class object in scientific reporting so that it can be used by other people to produce new knowledge, and construct dynamic representations that make that knowledge more understandable.</p></li>
                </ul>
                <br />
                <br />
                <a name="mitigation"><h3>Mitigating known problems in the scientific process</h3></a>
                <br />
                <p>On a related note, communicating data as a first class object is key for at least one other reason.</p>
                <br />
                <br />
                <p>As discussed, science generates results in the form of numerical data. When analysing this data, the methods employed are almost always a choice made by the researcher. Naturally, this leads to <TextLink url="https://journals.sagepub.com/doi/10.1177/2515245917747646" text="variance in the methods" /> employed by different researchers, which in turn creates variance in the conclusions derived from the same data. An important dynamic arises from this fact when contextualising the results of a particular paper - x is true, <i>under conditions y and z</i>. This is a natural feature of data, but it has important implications for the scientific interface which are not captured by our existing solution. </p>
                <br />
                <br />
                <p>The replication crisis represents the unsettling discovery that many landmark findings, originally in psychology but increasingly in other disciplines, are not reproducible. Variable data analysis strategies represent the degrees of freedom which enable this to happen, and can manifest in the form of <TextLink url="https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106" text="p-hacking" />, <TextLink url="https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106" text="'flexible' analysis" />, <TextLink url="https://royalsocietypublishing.org/doi/full/10.1098/rsos.160254" text="low statistical power" />, <TextLink url="https://www.researchgate.net/publication/283156103_How_scientists_fool_themselves_-_And_how_they_can_stop" text="confirmation bias" /> and <TextLink url="https://science.sciencemag.org/content/349/6251/aac4716.full?keytype=ref&siteid=sci&ijkey=1xgFoCnpLswpk" text="incomplete reporting" />. Importantly, these degrees of freedom can be the difference between statistical significance and non-significance, which is the foundation of 'novel' knowledge. Despite this, there is no pathway in the current interface for debate to arise around data analysis strategies, or to add, redact or update findings in the event that a better strategy is identified. Essentially, the author has almost complete control over their strategy, and relatively little accountability for that strategy after publication. Therefore, when looking at a paper, readers are forced to do the challenging work of deciding for themselves whether the strategy employed was suitable; something that the <i>vast majority</i> of readers are not trained for, especially when the work lies outside of their primary discipline. The downstream effect is that the median reader (and, clearly, expert readers too) is easily swayed by results that, given a better pathway for debate, may change significantly from the original version produced by the author.</p>
                <br />
                <br />
                <p>The responsibility of any interface is to understand and mitigate fallible human behaviours to meaningfully extend our capabilities. As fallible human behaviours go, this is quite an important one. If science exists in pursuit of the (workable) truth, but the interface does not correct for the natural biases which prevent us from finding it, then the interface has failed in its task. That we are calling it a replication <i>crisis</i> indicates that our current accountability measures (such as peer review) are clearly not sufficient. My point here isn't to lay blame on researchers or reviewers for producing non-reproducible science, but to observe that bad incentives and fallible behaviours exist because we are human, and these are not going to go away. Software can create new systems which understand and mitigate these problems.</p>
                <br />
                <br />
                <p>I would draw an analogy here to blockchains for financial products. Blockchains (assuming they are sufficiently decentralised) represent consensus governed by code, which is agnostic of any kind of human bias. In theory, <i>it's not possible</i> to circumvent the rules of the system, even if you wanted to. As a result, blockchains can provide a level of security and certainty that has previously been impossible, and are a triumph of mechanism design because they <i>understand</i> that humans are greedy and/or make mistakes, and <i>mitigate</i> these biases with deterministic processes. Eventually (hopefully), everyone will be better off for it.</p>
                <br />
                <br />
                <p>In the same way, the interface of science needs to have a mechanism for mitigating known problems in the scientific process. The first key responsibility is to solicit all original data in an easily accessible format such that results can be scrutinised in a completely open arena. To make this process simple, rather than storing that data in a disconnected database, new scientific artifacts should themselves contain the original data. If they contain original data, it means they can actually be <i>made out of</i> the data. Secondly, it should provide the affordance for debate to arise around the best data analysis strategies. Finally, the original artifact should be subject to version control, such that the results of the debate can be integrated back into the original artifact.</p>
                <br />
                <br />
                <p>There is an obvious consequence of this process, in that debate can arise after a result is published and the results of that debate can be reflected directly in the record. However, in bringing greater accountability to each author, I suspect there may also be a behavioural feedback effect of making each author more precise about the data analysis strategy they employ, because it's no longer possible to get away with bad analysis. With both effects working together, inclusion of original data in primary scientific artifacts should go a long way towards solving the replication crisis. For this reason alone, it is worth doing, but as we've discussed, including original data also creates the conditions for <i>dynamic, whole-systems</i> representations to evolve.</p>
                <br />
                <br />
                <p>In summary:</p>
                <ul>
                    <li><p>Our current accountability measures for good data analysis (peer review) are clearly not sufficient for the job of generating reproducible knowledge.</p></li>
                    <li><p>So, we need to create systems which mitigate known problems in how we carry out science.</p></li>
                    <li><p>Data analysis is subjective in terms of the specific methods different people use to analyse the same information.</p></li>
                    <li><p>Fully open, <i>easily accessible</i> data, together with open debate and version control, enables much greater scrutiny over results, which should lead to more accurate knowledge.</p></li>
                </ul>
                <br />
                <br />
            </div>
        )
    }
}

export default DynamicSystemsSection;