The Science of Tools for Thought and Cognition Augmentation Software

November 18, 2021

For millions of years, humans have used tools to augment their evolutionary abilities. A few thousand years ago, we invented writing and thereby enabled the sharing and externalizing of information, ushering in the bronze age. A few decades ago, we created the computer and computer networks, inaugurating the information age. This new tool drastically expanded our abilities to share, internalize and work with knowledge.

Computer-aided tools that augment our cognitive abilities are still in their infancy and will have a massive influence on the ages to come. This article aims to give an overview of the progress and state of Cognition Augmentation Software. I’ll start by establishing more precise terminology for it and its adjacent concepts.

Tools For Thought Terminology

Standards, norms, terminology, and tractable definitions have guided many advancements in science and technology. However, the term Tools For Thought, which we use to describe software products like RemNote or Roam Research, is not clearly defined. To advance this field, we need a shared terminology and understanding of the space.

Michael Nielsen and Andy Matuschak describe the history of the term in their outstanding work How can we develop transformative Tools For Thought? Its first usage, they explain, goes back to the early computer pioneers. A problem with the term is that it does not differentiate between cognitive tools, e.g., cultural-evolutionary ones (like number theory), and software tools, such as note-taking apps.

Thinking and Thoughts are also not clearly defined, which makes the use of this term even harder. I will use the term Cognition Augmentation Software, or simply CAS, to describe software that augments our cognition.

[Figure: top-level CAS map]

Augmented Cognition

History

As World War II mercifully drew to a close, Vannevar Bush, President Truman’s Director of Scientific Research and initiator of the Manhattan Project, surveyed the post-war landscape. He laid out what he viewed as the most important forthcoming challenges to humankind. In 1945, he published his seminal work As We May Think, which aimed to set the stage for the post-WW2 allocation of scientific efforts towards understanding rather than destruction and described his concept of the memex, a hypothetical memory augmentation device.

Research in this field goes back to the 1960s and the pioneering work of Douglas C. Engelbart, who was highly influenced by Bush’s work. He viewed the process of Augmenting Human Intellect as increasing the capability to solve complex problems by information handling and symbol structuring to gain comprehension. The vision developed by Engelbart, J. C. R. Licklider, Alan Kay, and others had a strong influence on later entrepreneurs like Steve Jobs.

In the following decades, human-computer interaction (HCI) made incredible leaps, including interface advances such as Smalltalk, HyperCard, and WYSIWYG editors. A paper from 1998 lays out some of the most impactful innovations and their development timelines:

[Figure: HCI timeline]

A formal sub-field was finally formed in early 2000 under the name of Augmented Cognition and instituted by the Augmented Cognition Program, which was renamed to the Improving Warfighter Information Intake Under Stress Program in 2001. The program was separated into four phases:

  1. Measure cognitive state
  2. Manipulate cognitive state
  3. Exploit human sensory channels
  4. Optimize information allocation

The research field (and its conference [1]) today relies heavily on the terminology of Cognitive Load Theory. A big focus of the domain is to reduce Cognitive Overload. One way to do this is by developing schemas that act as memory templates. A schema is a cognitive framework or concept that helps organize and interpret information. Schemas can be useful because they allow us to take shortcuts in interpreting the vast amount of information that is available in our environment. For example, a child might learn to classify a bird by noting that it is a creature with a beak and two wings. A more tangible example might be learning the mathematical notation of a new field. In the beginning, the symbols are foreign to us, but after a while, we build up a mental (and implicit, i.e., uninspectable) model of their function.

Schemas are encoded into long-term memory by working memory. If working memory is overloaded, its schema-building ability is compromised.

If we help someone to create new or surface relevant schemas by, for example, showing them a possible visualization of a mathematical operation, we support their schema-building abilities and reduce their Cognitive Load.

Cognitive State measurement

As we saw in the list of phases of the Augmented Cognition Program, a great deal of research focuses on how to “Measure cognitive state.” There are many possible applications of adjusting information presentation to the user’s cognitive abilities and background knowledge. For example, one could auto-control an audiobook’s playback speed based on the listener’s measured cognitive load (request for startup).
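To make the audiobook idea a bit more concrete, here is a minimal Python sketch. It assumes that some sensor already delivers a cognitive-load estimate normalized to the range 0 to 1, which is the hard part and is not shown. The function names, the speed range, and the smoothing factor are my own illustrative assumptions and are not taken from any existing product.

```python
def playback_speed(load: float, min_speed: float = 0.75, max_speed: float = 2.0) -> float:
    """Map a normalized load estimate (0 = relaxed, 1 = overloaded) to a speed.

    Low load gives the fastest playback, high load the slowest.
    The linear mapping is an assumption chosen for simplicity.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")
    return max_speed - load * (max_speed - min_speed)


def smooth(previous: float, target: float, alpha: float = 0.2) -> float:
    """Move only part of the way towards the target speed.

    Exponential smoothing keeps the speed from jumping with every
    noisy sensor reading.
    """
    return previous + alpha * (target - previous)


# Hypothetical stream of load readings from a sensor.
speed = 1.0
for load in [0.2, 0.9, 0.9]:
    speed = smooth(speed, playback_speed(load))
```

The smoothing step reflects a design choice: a listener would probably find abrupt speed changes more distracting than a slightly delayed adjustment.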

Distributed Cognition and Social Computing

A second valuable and related framework is Distributed Cognition. It defines cognitive processes by the functional relationships among the elements that participate in them and not by those elements’ spatial colocation. That is, a distributed cognitive process can be managed by multiple components that each serve a specific function but are located in different places and communicate over a network.

It also provides us with a framework to think about Collaborative Cognition, a system in which multiple intelligent agents collaboratively work on a problem. An example of such a collaboration is a decision process to which multiple humans and machines contribute.

This new and emergent field was termed Social Computing by James Evans in September 2020. He describes it as the combination of socially inspired computer science and computationally enhanced social science. In this framing, machines act as complements rather than substitutes for human cognition. Social Computing aims to engineer systems for new social interaction between humans and machines that allow us to communicate more effectively by taking into account human biases.

We will look at cloud architectures that describe the combination of cognitive modules for augmentation interfaces later.

Personal Knowledge Management

Personal knowledge management (PKM) is one way to augment cognition, namely by helping someone extend their memory. By helping us manage and externalize our knowledge, these systems enable us to work with a more extensive knowledge base. The most comprehensive description and analysis of PKMs and related concepts I’ve found was the article Still Building the Memex by Stephen Davies, professor in the Computer Science department at the University of Mary Washington.

PKM systems are often called Second Brains. Still, most of them act more as a storage medium for, and replica of, their user’s actual body of knowledge. At RemNote, we try not just to help you create a copy of your knowledge but an extension and augmentation of your brain.

New types of augmenting devices and interfaces

Cognitive Technology, GANs, and GPT-3

In 2016, Michael Nielsen, in his essay Thought as a Technology, described the term cognitive technology as an external artifact designed by humans, which can be internalized and used as a substrate for cognition. These are representations invented by other people, such as words, graphs, maps, algebra, mathematical diagrams, etc. He further differentiated between Models of Augmentation as Cognitive Transformation (e.g., a spreadsheet) or Cognitive Outsourcing (e.g., a calculator).

A year later, in a paper titled Using Artificial Intelligence to Augment Human Intelligence, he and his colleague Shan Carter expanded the concept. They differentiate between Cognitive technology using Computers and Cognitive technology using Artificial Intelligence. The latter is expanding human thought itself, meaning the computer-performed action becomes a new generalizable concept. An example they give is the concept of applying a stamp in Photoshop to another layer. This concept is generalized as “computer, [new type of action] this [new type of representation for a newly imagined class of object].”

Cognitive technology using Artificial Intelligence discovers and reveals deep principles in ways meaningful to the user and helps us invent new cognitive technologies. The example they give here is an Interactive Generative Adversarial Model (iGAN). It can, for example, show a typographer novel operations as new primitives that they can then learn and apply in future designs without using the iGAN. In the context of a designer, these new primitives are what we call design patterns.

What this process can look like is demonstrated in a video by the AI researcher Károly Zsolnai-Fehér, in which he also describes it as artistic control over images.

[Figure: iGAN demonstration]

Another fascinating technology in this category is GPT-3, an autoregressive language model that uses deep learning to produce human-like text. We have seen commercially successful use of this model in copywriting (e.g., copysmith.ai) and code completion (e.g., GitHub Copilot). But besides software code, there has been little commercialization of text completion. Thus, I hope somebody builds a ubiquitous text-completion browser extension (please let me know if you do).

GPT-3 can not only be used to outsource cognitive tasks, but it can generate entirely new ideas. David Dohan, Research Engineer at Google Brain, demonstrates the power of using this model for idea generation in a recent Athens community talk.

Although the descriptions of the two terms, provided by Nielsen and Carter, still come with less clearly defined terms of “deep principles,” “in a meaningful way,” and “human thought,” they help us draw a further categorization. For our purpose here and using our terminology, I’ll summarize them as the distinction between CAS and CAAI, Cognitive Augmentation AI.

It’s still early for software tools like the iGANs explored above. Still, we can already see that “offering predefined elements (shapes, text, symbols) directly exemplifies how a digital tool can augment human intelligence by allowing the user to get into the creative flow faster and minimize unnecessary construction work,” as Molly Mielke phrased it in her thesis Computers and Creativity.

Memory and Knowledge Augmentation

In the aforementioned seminal essay, Bush describes the personal knowledge management abilities of the Memex.

[Figure: the Memex]

It, for example, should be able to retrieve and reproduce items many years old. He also describes its sharing features as “Wholly new forms of encyclopedias will appear, ready-made with a mesh of associative trails running through them, ready to be dropped into the Memex and there amplified. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. …”

But in all those years, there has not been actual scientific work to engineer a cognition-augmenting device as he describes it. This changed recently. In June 2019, the researchers Mahadev Satyanarayanan and Nigel Davies published the paper Augmenting Cognition Through Edge Computing, describing a possible architecture for such a system.

[Figure: RECALL memory architecture]

It might, for example, help users restore context before their next conference or class. While walking to a lecture, the student could be primed with a lecture overview through his smart glasses, surfacing relevant information. The description of the “Memory vault” in this architecture exhibits a high similarity to Vannevar Bush’s Memex.

But have people started building such applications?

Recent leaks about the startup Hu.ma.ne hinted that they are building a contextual recall memory device in the form of a lapel pin. However, the thrill remains since the job listings on LinkedIn also listed a BCI engineer, and they stated that it would have the same kind of impact as the iPhone.

The more intriguing and already usable product under development in this realm might be personal.ai (formerly hu.man.ai). Its founder Suman Kanuganti and his colleagues are building a personal AI for memory storage. More precisely, they state that the product will safekeep the thoughts and memories that define you with your personal AI secured by a blockchain. A very compelling value proposition, I think. This product is an excellent example of memory augmentation using Artificial Intelligence.

Personal AI’s Head of Design, Kristie Kaiser, also describes the application of Ambient Computing concepts as part of their product. Ambient Computing is a user experience that is seamlessly integrated into the flow of the user’s life and that requires no conscious action. They also produced a very moving teaser video.

Another example of memory augmentation is RemNote. It’s a tool for networked note-taking with a seamlessly integrated spaced repetition system (SRS). Currently, the offered spaced repetition algorithms are deterministic and don’t incorporate any AI. Other SRS software products have already shown the successful application of machine learning to memory augmentation.
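For illustration, a deterministic scheduler of this kind can be sketched in a few lines of Python. The sketch follows the publicly documented SM-2 algorithm from SuperMemo in simplified form. It is not RemNote’s actual scheduler, and the names CardState and review are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardState:
    """Scheduling state of a single flashcard (hypothetical structure)."""
    repetitions: int = 0      # successful reviews in a row
    interval_days: int = 0    # days until the next review
    ease: float = 2.5         # SM-2 "easiness factor", never below 1.3


def review(state: CardState, quality: int) -> CardState:
    """Return the next state after a review graded from 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: start over with a one-day interval.
        # This sketch leaves the ease unchanged on failure;
        # some implementations lower it instead.
        return CardState(repetitions=0, interval_days=1, ease=state.ease)

    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * state.ease)

    # SM-2 ease update: grows for easy answers, shrinks for hard ones.
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease = max(1.3, state.ease + delta)
    return CardState(state.repetitions + 1, interval, ease)


state = CardState()
for grade in (5, 5, 5):
    state = review(state, grade)
    print(state.interval_days)  # prints 1, then 6, then 16
```

The same grades always produce the same intervals, which is what deterministic means here. An ML-based scheduler would instead fit its parameters to the review history.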

Augmented Reality

AR Interfaces leverage the outstanding visual perception that human evolution created, e.g., our cognitive processing speed of visuospatial data. This superiority is why we can predict that they will be the next paradigm of Human-computer interfaces. Those interfaces will give rise to many new ways of interacting with computation and new types of media. The advances in UX design show that interactions with digital interfaces become ever more natural and intuitive.

[Figure: Iron Man]

What do we mean by intuitive here? An example of such an advancement comes from the development of the Macintosh. Engineers and designers reduced the mouse interface’s cognitive load by removing the second button and utilizing existing mental models. This design decision was brought forth by a memo titled ‘One-Button Mouse’ by the legendary Apple HCI scientist Larry Tesler (as explained in more detail in the book Insanely Great).

Following the doctrine of Build Glasses, not Binoculars (Albrecht Schmidt 2019), we can envision AR interfaces that will allow us to navigate a three-dimensional multimedia environment by intuitive hand gestures rather than by positioning a mouse on a desk.

Cognition Augmentation Software for Learning

When augmenting cognitive processes, the most useful approach is to target the most cognitively taxing tasks, and learning is one of them.

In Life 3.0, Max Tegmark describes a fictional AI, Prometheus, that helps humans learn:

“Given any person’s knowledge and abilities, Prometheus could determine the fastest way for them to learn any new subject in a manner that kept them highly engaged and motivated to continue and produce the corresponding optimized videos, reading materials, exercises, and other learning tools. […] by leveraging Prometheus’ movie-making talents, the video segments would truly engage, providing powerful metaphors that you would relate to, leaving you craving to learn more.”

Intelligent Tutoring Systems

What Tegmark describes is called an intelligent tutoring system (ITS). Such a computer system provides immediate and customized instruction or feedback to learners, usually without requiring intervention from a human teacher. The first widely used product of this kind was PLATO (Programmed Logic for Automatic Teaching Operations). It was started in 1960 at the University of Illinois at Urbana–Champaign and funded by ARPA, the same institution that also funded ARPANET, the precursor of today’s Internet. The system proved beneficial for many students, especially those with learning disabilities, as they surmounted the classroom’s social peer pressure, Brian Dear explains in a talk held at Google. Its commercialization, NovaNET, shut down after its acquisition in the early 2000s.

The popular website Khan Academy implements a more straightforward and older model called CAI (computer-assisted instruction).

The most successful product might be ALEKS (bought by McGraw-Hill). It implements Knowledge Space Theory, a stochastic framework for the assessment of knowledge. It was developed by ALEKS’ founder Dr. Jean-Claude Falmagne, a mathematical psychologist whose scientific contributions deal with problems in reaction time theory, psychophysics, philosophy of science, measurement theory, decision theory, and educational technology. Students using ALEKS navigate a graph of possible knowledge states, i.e., a learning space.
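To give a feel for what navigating a graph of knowledge states means, here is a small Python sketch. It only covers the combinatorial core of Knowledge Space Theory, where a prerequisite relation determines which sets of items are feasible knowledge states. It leaves out the stochastic assessment entirely and says nothing about how ALEKS is actually implemented. The four items and their prerequisites are made up for illustration.

```python
from itertools import combinations

# Hypothetical prerequisite relation: item -> items that must be mastered first.
PREREQS = {
    "counting": set(),
    "addition": {"counting"},
    "subtraction": {"counting"},
    "multiplication": {"addition"},
}


def is_state(items: frozenset) -> bool:
    """A set of items is a feasible knowledge state if it contains
    the prerequisites of every item in it."""
    return all(PREREQS[item] <= items for item in items)


def knowledge_states() -> list:
    """Enumerate all feasible states by brute force (fine for tiny domains)."""
    all_items = sorted(PREREQS)
    states = []
    for size in range(len(all_items) + 1):
        for combo in combinations(all_items, size):
            candidate = frozenset(combo)
            if is_state(candidate):
                states.append(candidate)
    return states


def outer_fringe(state: frozenset) -> set:
    """Items the learner is ready to learn next: adding any one of them
    leads to another feasible state."""
    return {
        item for item in PREREQS
        if item not in state and is_state(state | {item})
    }


print(len(knowledge_states()))                          # 7
print(sorted(outer_fringe(frozenset({"counting"}))))    # ['addition', 'subtraction']
```

Moving from a state to a state with one more item from its outer fringe is one step through the learning space. Of the 16 possible subsets of the four items, only 7 are feasible states here.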

But what Tegmark describes is not a normal ITS. It is a very advanced ITS that can customize and generate tutoring content for its user and generalize to new fields, unlike ALEKS, which is limited to predefined domains and content.

We can be optimistic that, given this technical feasibility and the fast advancements in fields such as reinforcement learning (a subfield of Machine Learning that focuses on learning agents), we will see the first more generally intelligent, Prometheus-like tutoring systems in the coming decades. Those will function in a more adaptable way taking into account the students’ cognitive state, personal preferences, and psychological profile.

Outlook, Market, and Trends

Summarizing, we can draw the following distinction between tools for thought and CAS and some of its subareas:

The Learning Psychology of Software Engineering

Trends

For many decades we have been removing points of friction that bottleneck our cognitive capabilities. A Google search that takes longer than our 250-millisecond limit of perceivable time differences will always feel too slow while we wait to absorb the information we searched for.

The introduction of the typewriter and computer removed the major bottleneck of handwriting for externalizing our thoughts. In his book Smarter Than You Think, Clive Thompson cites the research of the Stanford University literary scholar Andrea Lunsford, who examined freshman entry essays from 1917 until the present. Lunsford found that while grammatical error rates have stayed the same, the length and complexity of the essays have dramatically increased. “It’s not that the kids of 1917 were stupider,” says Thompson. “It’s just that their tools were getting in the way of their thought.”

One can only surmise how the next hundred years will change the way we internalize and externalize information using software tools. Three major technology trends influence the future of these tools and accelerate their development:

  • superior sensing and capturing systems (e.g., the possibility of continuous collection of memory cues through life-logging)
  • advances in audio and image processing enabling widespread mining of stored cues for proactive presentation
  • pervasiveness of displays for displaying memory cues

Development

In their essay, Michael Nielsen and Andy Matuschak describe how difficult it is for developers of Tools For Thought to prevent copy-cats. They give the example of Adobe, which invested heavily in developing its products and is now copied or disrupted by cloud-based companies like Figma. Considering the economic drivers discussed, one can be optimistic that future CAS that leverages proprietary AI and AR technology will have fewer of those difficulties.

As mentioned, many innovations we take for granted today originated out of AC and HCI research. However, to date, most of this research has been intended for and funded by military and defense agencies (Drexler et al. 2007). This narrow focus might leave open a huge opportunity: an unaddressed consumer market of a growing number of knowledge workers. After all, the technologies developed as part of the research share the same goal of productivity improvements as the $102.98 billion productivity software market.

RemNote

We can describe RemNote as CAS. Its integrated Spaced Repetition System (SRS) acts as a memory augmentation by exploiting the Ebbinghaus Forgetting Curve. Its referencing feature is also Cognition Augmentation Software: it lets users reference concepts and construct new ones using other concepts as building blocks. Future features like entity recognition or ML-based SRS would be categorized as CAIS.
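As a rough illustration of how an SRS can exploit the forgetting curve, the Python sketch below uses the common exponential approximation R(t) = exp(-t / S), where R is the probability of recall after t days and S is a memory stability parameter. Both the exponential form and the parameter names are simplifying assumptions on my part and do not describe RemNote’s implementation.

```python
import math


def retention(days_elapsed: float, stability: float) -> float:
    """Estimated probability of recall after `days_elapsed` days,
    using the exponential approximation R(t) = exp(-t / S)."""
    return math.exp(-days_elapsed / stability)


def days_until(target_retention: float, stability: float) -> float:
    """Days until estimated recall drops to `target_retention`.

    Solving exp(-t / S) = R for t gives t = -S * ln(R).
    """
    if not 0.0 < target_retention <= 1.0:
        raise ValueError("target_retention must be in (0, 1]")
    return -stability * math.log(target_retention)


# With a hypothetical stability of 10 days, schedule the next review
# for the moment estimated recall falls to 90 %.
next_review_in = days_until(0.9, stability=10.0)
```

Scheduling a review just before the estimated recall falls below a chosen threshold is the basic idea. Each successful review is then assumed to increase the stability, so the intervals grow over time.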

If you are building or researching CAS or CAIS, I would be happy to have a chat!

Lastly, if you are a talented engineer, designer, or operator interested in working on cutting-edge productivity and learning software: we are hiring at RemNote and would love to hear from you!


  1. The main conference of the field is the Augmented Cognition (AC) Conference, an affiliated conference of the HCI International Conference, which will arrive at its 15th edition this July. Its last edition of 2018 brought forth two volumes of papers: Augmented Cognition: Intelligent Technologies and Augmented Cognition: Users and Contexts.


I'm building RemNote.com. I grew up in Germany and previously studied Software Engineering at CODE University of Applied Sciences. I strive to accelerate scientific and technological progress by building a new evolution of tools and systems for knowledge creation. Find me on GitHub, Twitter, LinkedIn, Product Hunt, or Goodreads.