AI/NLP. Evolution of Compilers? - Part 1.1 - Interesting questions and my thoughts

I posted an article from my blog on my LinkedIn a little while back (see the post here) and I had a colleague ask 2 super interesting questions. Herewith my attempt to answer them.

The Questions

  1. What does it really mean to “understand” language - and can a machine ever truly do that?
  2. How do we teach a machine not just to process the structure of our language, but to grasp the intention behind it - when even humans struggle to agree on meaning?

My Thoughts

These are super interesting questions, and they require talking about some very dense topics 😂 I’ll try not to ramble. My short answers (with no claims of being right) are:

  1. I think so, but not fully in the way a human understands language.
  2. I’m not actually sure we can, because I don’t believe it’s possible to get this 100% right, although I do think we can get really close.

I’m not qualified enough to give you an academically rigorous answer to them, but I can try to give my thoughts in two parts.

Part 1 - Language and humans

The theory I love the most (which might very well be outdated by now) for describing how and why language became useful for humans to understand one another is probably the one coming out of the Minimalist Program (Chomsky), though there are many competing and more modern theories. Definitely worth a read if you’re interested in this stuff.

Chomsky and others talk about the difference between communication using “signal encoding” and communication based on what could be called an “unbounded discrete infinity” of language, so to speak. Another interesting point is that most of the ‘language mechanism’ is actually used internally by humans to process information rather than to communicate externally (but this is more relevant in part 2).

Basically, a computer currently uses a sophisticated form of signal encoding, and in my mind there’s not much difference today between how other animals process communication and how machines do it using NLP (I’m not suggesting that machines and other animals are the same, merely that the processing mechanism is, in my opinion, similar).

I don’t think a computer will ever be able to fully “understand” language as we humans do. My main reason is that I believe language, and thus the cognition of it, has an inherent biological component, which computers obviously lack. For example, if I tell you “I feel like having tiramisu,” you might understand that, as a coffee lover, I’m probably craving it, whereas a computer might process that and match it to something mechanical like me being “hungry,” because that’s the best pattern it can match the phrase to (a toy sketch of what I mean follows).
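To make that concrete, here’s a minimal toy sketch of that kind of nearest-pattern matching. Everything in it is made up for illustration (a hand-rolled bag-of-words matcher with three invented “intents”), not how any real NLP system works:

```python
# Toy intent matcher: represent phrases as bags of words and pick the
# nearest known pattern by cosine similarity. Purely illustrative.
from collections import Counter
import math

# Hypothetical "intents" with hand-made example wording.
INTENTS = {
    "hungry": "i feel like having food i want to eat something",
    "craving coffee": "i want coffee espresso latte caffeine",
    "tired": "i feel sleepy i need rest",
}

def bag_of_words(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

query = bag_of_words("I feel like having tiramisu")
best = max(INTENTS, key=lambda name: cosine(query, bag_of_words(INTENTS[name])))
print(best)  # -> hungry
```

“tiramisu” appears in none of the stored patterns, so the coffee connection is simply invisible to the matcher; it lands on “hungry” because that’s the closest signal it has, which is exactly the gap I mean.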

Part 2 - Cognition and how little we actually know about it

The Chinese room thought experiment (Searle) gives a clear mental model for not being fooled into thinking that good answers from an AI equal cognition.

We don’t fully comprehend how we “comprehend” (which I call out in the article, albeit in an attempt to be humorous), so my question to myself would be “how would we even know if a computer comprehends?” in the first place. For these reasons, I don’t really have an objectively good answer.

To your point, we struggle to understand someone’s communication even while being part of the same species. I truly think we will only see pale imitations of “true cognition” in a computer unless we figure this out, and that’s very much a hard problem that we don’t even know is possible to solve yet.

Another really hard problem to solve is data entropy as information moves through a medium. Think of the broken telephone game, right? We haven’t found a way to “perfectly” send data from one thing to another (a quick simulation of this follows).
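Here’s a quick toy simulation of that broken telephone, assuming a made-up channel that randomly corrupts a small fraction of characters at every hop:

```python
# Toy "broken telephone": relay a message through noisy hops and watch
# errors accumulate. The 5% error rate is an arbitrary assumption.
import random

random.seed(42)  # fixed seed so the run is reproducible

def noisy_hop(message: str, error_rate: float = 0.05) -> str:
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    return "".join(
        random.choice(alphabet) if random.random() < error_rate else ch
        for ch in message
    )

message = "i feel like having tiramisu"
for hop in range(1, 6):
    message = noisy_hop(message)
    print(f"hop {hop}: {message}")
```

Each hop only mangles a character or two, but the damage compounds, and the receiver has no way to recover what was lost without extra redundancy.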

Deriving 0’s and 1’s from ‘intention’ might be a little easier, though. If we say semantic meaning is like a complex molecule made up of different parts to the n-th degree, then understanding the semantic meaning might be easier than understanding the component parts, just like how you can see plastic and know it’s plastic without understanding complex organic chemistry.

It’s for that reason I sidestep the topic of “how do we grasp intention” in my article. Currently, we can create something that can “compile” natural language into 1’s and 0’s while preserving enough semantic information (intent) that the structure of the machine’s instructions resembles the natural instructions the human meant to communicate (a toy illustration is below). We’ve essentially created plastic without knowing what plastic is actually made of. The “plastic” is low quality and absolute dog sht for the environment, but it’s plastic nevertheless, right?
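As a toy illustration of that “compiling”: a few keyword patterns standing in for a real NLP pipeline, with every pattern and name invented for the example:

```python
# Toy "natural-language compiler": map an utterance to a structured
# instruction via keyword patterns, preserving intent without any
# understanding of what the words mean. Purely illustrative.
from dataclasses import dataclass

@dataclass
class Instruction:
    op: str
    arg: str

# Hypothetical trigger phrases -> machine operations.
PATTERNS = {
    "turn on": "POWER_ON",
    "turn off": "POWER_OFF",
    "play": "PLAY_MEDIA",
}

def compile_utterance(utterance: str) -> Instruction:
    text = utterance.lower()
    for phrase, op in PATTERNS.items():
        if phrase in text:
            # Everything after the trigger phrase becomes the argument.
            return Instruction(op, text.split(phrase, 1)[1].strip())
    return Instruction("UNKNOWN", text)

print(compile_utterance("Please turn on the kitchen lights"))
# Instruction(op='POWER_ON', arg='the kitchen lights')
```

The “compiler” has zero grasp of what lights or power actually are; it just preserves enough structure that the machine’s instruction lines up with the human’s intent. Plastic, without the chemistry.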