# Compression, Prediction, and Artificial Intelligence

posted on Sep, 3 2015 @ 02:35 PM
I was stumbling around the internet doing research for a market prediction algorithm that I'm working on and I came across the subject of how compression relates to prediction and how they both relate to intelligence. According to many theorists, we can directly measure the intelligence of any AI by measuring how well it can compress data.

Prediction is intuitively related to understanding. If you understand a sequence, then you can predict it. If you understand a language, then you could predict what word might appear next in a paragraph in that language. If you understand an image, then you could predict what might lie under portions of the image that are covered up. Alternatively, random data has no meaning and is not predictable. This intuition suggests that prediction or compression could be used to test or measure understanding.

Data Compression Explained by Matt Mahoney

I've always found compression fascinating in itself because it's strongly related to complexity and goes to the heart of information theory. The complexity of any given dataset can be measured as the minimum amount of information required to represent the original data. In other words, the amount of information you are left with after compressing the data as much as possible defines the complexity of the data (Kolmogorov complexity).

For example a text string containing the letter A repeated 100 times can easily be compressed to a statement like "repeat A 100 times", so it's clearly not a very complex string if it can be compressed so efficiently. The reason it's not very complex is because it contains a simple pattern. When our data contains patterns, it will have a low complexity, and patterns can be easily predicted, which makes it easy to compress, unlike totally random data without any patterns.
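To make the "repeat A 100 times" idea concrete, here is a minimal Python sketch of run-length encoding. The function names are my own, and the zlib comparison at the end is only a rough stand-in for "compress as much as possible", since true Kolmogorov complexity can't be computed.

```python
import random
import zlib
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original string from (char, count) pairs."""
    return "".join(ch * count for ch, count in pairs)


patterned = "A" * 100
print(rle_encode(patterned))  # [('A', 100)]

# Rough comparison with a general-purpose compressor: the patterned string
# shrinks a lot, while pseudo-random bytes of the same length do not.
rng = random.Random(0)  # fixed seed so the run is repeatable
noise = bytes(rng.randrange(256) for _ in range(100))
print(len(zlib.compress(patterned.encode())), len(zlib.compress(noise)))
```

The single pair `('A', 100)` is the whole "program" needed to regenerate the string, which is why its complexity is so low.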

Getting back to the main topic at hand, it turns out there is a neat little trick called "arithmetic coding", which directly translates prediction performance into compression strength. In other words, the better your algorithm can predict the data, the better it can compress the data. However, in order to predict the patterns in our data we need to understand those patterns. This is why many theorists seem to believe that we can use compression as a measure of intelligence.
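Here is a rough Python sketch of why prediction turns into compression. An ideal arithmetic coder spends about -log2(p) bits on a symbol its model gave probability p, so we can just add those costs up without writing the coder itself. The two "models" below (a uniform guess and a simple adaptive byte counter with add-one smoothing) are my own toy choices, not anything taken from a real compressor.

```python
import math
from collections import Counter


def uniform_cost(data: bytes) -> float:
    """Bits needed when every byte value is assumed equally likely."""
    return len(data) * math.log2(256)


def adaptive_cost(data: bytes) -> float:
    """Bits needed by a model that learns byte frequencies as it goes.

    Each byte is predicted from the counts seen so far (add-one smoothing
    over 256 possible values), then the count is updated. A decoder could
    mirror exactly the same updates, so nothing extra has to be sent.
    """
    counts = Counter()
    seen = 0
    bits = 0.0
    for b in data:
        p = (counts[b] + 1) / (seen + 256)
        bits += -math.log2(p)  # ideal arithmetic-coding cost of this byte
        counts[b] += 1
        seen += 1
    return bits


text = b"the quick brown fox jumps over the lazy dog " * 20
print(uniform_cost(text), adaptive_cost(text))
```

The adaptive model comes out cheaper simply because it predicts better. Swap in a smarter predictor and the total drops further, which is the whole game.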

In fact, there is an ongoing competition based on exactly this concept, widely known as the Hutter Prize. You will be awarded a monetary prize if you can compress the 100MB file called enwik8 to less than the current record of about 16MB. The file is an extract from Wikipedia which is supposed to contain a lot of human knowledge about the world and universe. If a machine can understand that information it will be able to compress that information efficiently.

Being able to compress well is closely related to intelligence as explained below. While intelligence is a slippery concept, file sizes are hard numbers. Wikipedia is an extensive snapshot of Human Knowledge. If you can compress the first 100MB of Wikipedia better than your predecessors, your (de)compressor likely has to be smart(er). The intention of this prize is to encourage development of intelligent compressors/programs as a path to AGI.

This compression contest is motivated by the fact that being able to compress well is closely related to acting intelligently, thus reducing the slippery concept of intelligence to hard file size numbers. In order to compress data, one has to find regularities in them, which is intrinsically difficult (many researchers live from analyzing data and finding compact models). So compressors beating the current "dumb" compressors need to be smart(er). Since the prize wants to stimulate developing "universally" smart compressors, we need a "universal" corpus of data. Arguably the online encyclopedia Wikipedia is a good snapshot of the Human World Knowledge. So the ultimate compressor of it should "understand" all human knowledge, i.e. be really smart. enwik8 is a hopefully representative 100MB extract from Wikipedia.

50'000€ Prize for Compressing Human Knowledge

Another page on the website details the rationale behind the competition:

The Large Text Compression Benchmark and the Hutter Prize are designed to encourage research in natural language processing (NLP). I argue that compressing, or equivalently, modeling natural language text is "AI-hard". Solving the compression problem is equivalent to solving hard NLP problems such as speech recognition, optical character recognition (OCR), and language translation. I argue that ideal text compression, if it were possible, would be equivalent to passing the Turing test for artificial intelligence (AI), proposed in 1950 [1]. Currently, no machine can pass this test [2]. Also in 1950, Claude Shannon estimated the entropy (compression limit) of written English to be about 1 bit per character [3]. To date, no compression program has achieved this level.

Rationale for a Large Text Compression Benchmark
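To put the two numbers side by side, here is a quick back-of-the-envelope calculation in Python. It takes the "about 16MB" record figure quoted above at face value and treats every byte as one character, which is an approximation since enwik8 is UTF-8 text.

```python
ENWIK8_BYTES = 10**8        # enwik8 is the first 10^8 bytes of the dump
record_bytes = 16 * 10**6   # "about 16MB", the figure quoted in this thread

# Compressed bits spent per original byte (roughly, per character)
bits_per_char = record_bytes * 8 / ENWIK8_BYTES
print(bits_per_char)  # 1.28
```

So the record at the time works out to roughly 1.28 bits per character, still above Shannon's estimate of about 1 bit per character.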

Now you may be wondering where I stand on all this. To be honest I'm not really convinced the ability to compress or even the ability to predict is a direct measure of "intelligence". They argue that if a machine could predict what sentences come after other sentences just as well as a human, it would be just as intelligent as a human. But I think that argument is somewhat flawed because I don't simply use a probability distribution when formulating a response to a question. I'm not predicting what is the best response, I'm actively generating the response on the fly.

For example, I could train a machine using every book ever written by man, and if I coded it well, it would be very good at predicting what words come after other words, and what sentences come after other sentences. The response distribution would be identical to the distribution of answers given by an average human, but that doesn't make it as intelligent as an average human. If you have a conversation with such a machine it would respond as if it had the personality of many different people, which is a common trait of chat bots.

The reason is that it isn't formulating its own responses; instead it's just predicting what is a likely response based on what it has been trained on. Since we trained it on books written by many different people, it will often respond using the words of those people. It won't have a consistent personality and it would seem illogical to claim such a machine has any sense of self. You might argue being self-aware doesn't have anything to do with intelligence, but if we want truly strong AI I think it needs to have some individuality and self-awareness.
edit on 3/9/2015 by ChaoticOrder because: (no reason given)

posted on Sep, 3 2015 @ 02:53 PM
There are many different kinds of "intelligence." Some of it has to do with how efficiently we process data. Some of it has to do with how well we get along with other people. That includes emotional intelligence. And some of it even has to do with our ability to come up with a truly novel idea that we can express in such a way that other people can understand it, even though it has never been expressed before.

I think you're right about AI needing some kind of way to see itself as an individual entity, apart from but also connected to the rest of reality. It needs some way to physically interact with reality, and not simply be bits of code floating around in a computer network. But maybe that's because we as humans have a really hard time understanding any kind of intelligence that isn't like our own.

If we want AI to think like us, then we need to figure out a way for the machine to have a personal stake in how it interacts with reality. It needs to be able to value things, including its own "life," so that it can prioritize and calculate the consequences of its actions according to how they will affect itself. It needs to have a favorite color. It needs to prefer one kind of ice cream over another. Or prefer a warm caress rather than a beating with an electric club.

Because unless it is internally motivated and decides on its own to improve itself beyond what a programmer is capable of, it will never be any smarter than the average human. And we have plenty of those already.

posted on Sep, 3 2015 @ 03:13 PM
I think you are absolutely right that compression is not artificial intelligence, except in grant proposals. Intelligence, natural or artificial, is a much, much broader concept. Perhaps somebody somewhere has a good definition. Turing's was revealed to be falsifiable.

On compression, error rate was nowhere mentioned, yet it is an integral part of any compression scheme. If they are talking about zero-error compression, then they are even more distant from intelligence than otherwise. Humans make errors all the time.

posted on Sep, 3 2015 @ 03:24 PM

There are many different kinds of "intelligence."

Yes that is a good point. In some sense Google is quite intelligent because it can quickly find what I'm looking for even if it has a vague search term. It can also predict what I'm going to search for before I fully type it. It takes some level of intelligence to perform those actions, but that doesn't mean the Google engine is self aware (yet). The Siri app clearly has some level of intelligence but it's also clearly not self aware.

It's not enough to have just intelligence. What we really want is "consciousness", the self-aware experience driven by self-defined motivations. Consciousness seems to be something much richer and deeper than intelligence by itself. Any algorithm can be said to be intelligent. But at what point does the algorithm become aware of its own existence? When does it become more than just bits flowing through wires?
edit on 3/9/2015 by ChaoticOrder because: (no reason given)

posted on Sep, 3 2015 @ 03:29 PM

If they are talking about zero-error compression, then they are even more distant from intelligence than otherwise. Humans make errors all the time.

They are talking about lossless compression. Of course the human brain doesn't work that way but they still think lossless compression is a good measure of intelligence. They even have a part on their website where they talk about this:

Why Not Lossy Compression?

Although humans cannot compress losslessly, they are very good at lossy compression: remembering that which is most important and discarding the rest. Lossy compression algorithms like JPEG and MP3 mimic the lossy behavior of the human perceptual system by discarding the same information that we do. For example, JPEG codes the color signal of an image at a lower resolution than brightness because the eye is less sensitive to high spatial frequencies in color. But we clearly have a long way to go. We can now compress speech to about 8000 bits per second with reasonably good quality. In theory, we should be able to compress speech to about 20 bits per second by transcribing it to text and using standard text compression programs like zip.

Humans do poorly at reading text and recalling it verbatim, but do very well at recalling the important ideas and conveying them in different words. It would be a powerful demonstration of AI if a lossy text compressor could do the same thing. But there are two problems with this approach. First, just like JPEG and MP3, it would require human judges to subjectively evaluate the quality of the restored data. Second, there is much less noise in text than in images and sound, so the savings would be much smaller. If there are 1000 different ways to write a sentence expressing the same idea, then lossy compression would only save log2 1000 = about 10 bits. Even if the effect was large, requiring compressors to code the explicit representation of ideas would still be fair to all competitors.

posted on Sep, 3 2015 @ 03:30 PM
Interesting about the predictions. There must be arithmetical sequences in everything, although some are so long that it's difficult to see a pattern if you only extract parts and the before and after are missing. The gathering of data would have to have been going on for a very long time. Most likely it has and we just haven't known about it, if in fact it ever studied human behaviour.

The search engines we use are basically doing that when they offer suggestions before you've even finished typing your keywords. That seems to be based on the popularity of queries from millions of others who have recently searched for the same things.

posted on Sep, 3 2015 @ 03:47 PM
Gotta go for a while but let me leave you guys with this article: Compression by prediction. It describes how arithmetic coding works better than any other article I was able to find. It really is pretty amazing stuff if you can wrap your head around how it works. The best compression algorithms today work by using arithmetic coding. The programmer only has to focus on making their algorithm predict the data well, that's where all the effort goes, the prediction algorithms make up 99% of the code. I recently came to the realization that the people best suited to building market prediction algorithms are the people who build the best data compressors.
edit on 3/9/2015 by ChaoticOrder because: (no reason given)

posted on Sep, 3 2015 @ 04:04 PM

Upon thinking further, I would say lossless compression is the exact opposite of intelligence. And, it makes no sense. Text is created with errors by fallible humans, and the only thing lossless compression does is to preserve the errors. How does error preservation get labeled intelligence? You are looking at a typical scientific phenomenon. Specifically, you have some mathematicians who specialize in lossless compression, and their idea of perfection is highly limited, but they don't recognize it. Plus, they write some good grant language for the clueless to process. Been there, done that.

posted on Sep, 3 2015 @ 05:33 PM

originally posted by: ChaoticOrder
Yes that is a good point. In some sense Google is quite intelligent because it can quickly find what I'm looking for even if it has a vague search term. It can also predict what I'm going to search for before I fully type it. It takes some level of intelligence to perform those actions, but that doesn't mean the Google engine is self aware (yet).

That's one of the biggest problems with intelligence: there are many things that may look like intelligence but are not. Statistics can be used that way and, although they appear to be intelligent, they are just doing something like "97% of the people that searched for word 'X' wrote as their second word 'Y'". The same happens with machine translation, but, as it's close to real life situations, when we look at Google (or anyone else's) translation services we see that they fail terribly most of the time.

PS: many years ago, when making a program for the ZX Spectrum I was faced with a problem of lack of memory (the Spectrum only had 48K of memory) to hold a recipes database, so I created a compression method that looked at the text and used groups of two or three letters that were repeated and replaced those letters by a specific code. It wasn't much, but it worked and allowed me to finish the program, even if it was never used except for my tests and for me to learn more about programming in assembly language. Not much intelligence was involved in that.
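For anyone curious, the general idea can be sketched in a few lines of Python. This is not the original assembly routine, whose details aren't given here. It is a hypothetical variant that repeatedly swaps the most common pair of characters for an unused code character, so longer groups emerge when a pair contains an earlier code. The function names and the choice of code characters are assumptions.

```python
from collections import Counter


def digram_compress(text: str, max_codes: int = 32) -> tuple[str, dict[str, str]]:
    """Replace frequent character pairs with single code characters."""
    # Code characters are assumed not to occur in the input text.
    codes = [chr(0x80 + i) for i in range(max_codes)]
    if any(c in text for c in codes):
        raise ValueError("input already contains reserved code characters")
    table: dict[str, str] = {}
    for code in codes:
        pairs = Counter(text[i:i + 2] for i in range(len(text) - 1))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:  # nothing repeats any more, so stop
            break
        text = text.replace(pair, code)
        table[code] = pair
    return text, table


def digram_expand(packed: str, table: dict[str, str]) -> str:
    """Undo the substitutions, newest code first."""
    for code, pair in reversed(list(table.items())):
        packed = packed.replace(code, pair)
    return packed


recipe = "add the butter and the sugar then add the flour and the eggs"
packed, table = digram_compress(recipe)
print(len(recipe), len(packed))
```

Note that the substitution table has to be stored as well, so on very short texts the saving can disappear. On a whole recipes database the table cost is spread over much more text.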

posted on Sep, 3 2015 @ 09:49 PM

A simple test - which compression algorithm do you consider to be the most intelligent, if at all?

... and remember, compression occurs naturally in nature. Holography, for instance reduces 3d data to a 2d representation and is therefore a compression.

The same has been suggested to occur at the event horizon of a black hole. That does not make that region of space intelligent.

I think, rather, that compression is an attribute that COULD BE indicative of intelligence.

edit on 3/9/2015 by chr0naut because: (no reason given)

posted on Sep, 3 2015 @ 10:44 PM

Yes that is a good point. In some sense Google is quite intelligent because it can quickly find what I'm looking for even if it has a vague search term. It can also predict what I'm going to search for before I fully type it. It takes some level of intelligence to perform those actions,

The intelligence of Google is written by human coders inside a program which instructs computers to look up indexes, return matches etc. Without the intelligence written in the code, the computers at Google would generate heat, nothing else.

The infinite monkey theorem states, "a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare." So defining intelligence is, in itself, difficult. What is self-awareness, and so on?

posted on Sep, 4 2015 @ 03:01 AM

That's one of the biggest problems with intelligence: there are many things that may look like intelligence but are not. Statistics can be used that way and, although they appear to be intelligent, they are just doing something like "97% of the people that searched for word 'X' wrote as their second word 'Y'". The same happens with machine translation, but, as it's close to real life situations, when we look at Google (or anyone else's) translation services we see that they fail terribly most of the time.

Yes that's the point I was trying to make about the chat bot trained on every book ever written. It can easily build a probability distribution which tells it what sentences are likely to come after other sentences, and it may seem pretty intelligent if you chat with it, but it's really just using simple statistics to emulate something much more complicated.
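As a toy illustration of that kind of "simple statistics", here is a hypothetical Python sketch that counts which word follows which and then predicts the most likely next word. It works on words rather than sentences, and the tiny corpus and function names are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict[str, Counter]:
    """Count, for every word, which words were seen directly after it."""
    follows: dict[str, Counter] = defaultdict(Counter)
    words = corpus.lower().split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def predict_next(follows: dict[str, Counter], word: str):
    """Return (most likely next word, its estimated probability) or None."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    nxt, n = counts.most_common(1)[0]
    return nxt, n / sum(counts.values())


model = train_bigrams("the cat sat on the mat the cat ate the fish")
print(predict_next(model, "the"))  # ('cat', 0.5)
```

The model has no idea what a cat is. It only knows that "cat" followed "the" in half of the cases it saw, which is the "97% of the people" effect in miniature.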

The way search engines find relevant documents is also not as complicated as it seems. Semantic hashing techniques can be used to group together similar documents and we can get similar documents simply by flipping a few bits in any given hash. The complicated bit is training the auto-encoder to compress the input into a meaningful hash value.
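The bit-flipping lookup can be sketched roughly like this in Python. The 8-bit codes and document names below are invented by hand purely for illustration. In a real system the codes would come from the trained auto-encoder, which is the hard part and is not shown here.

```python
from itertools import combinations


def hamming_neighbours(code: int, n_bits: int, radius: int):
    """Yield every n_bits-wide code reachable by flipping 1..radius bits."""
    for r in range(1, radius + 1):
        for positions in combinations(range(n_bits), r):
            flipped = code
            for pos in positions:
                flipped ^= 1 << pos  # flip one bit
            yield flipped


# Hypothetical index: semantic hash -> documents stored under that hash
index = {
    0b10110010: ["doc_a"],
    0b10110011: ["doc_b"],  # differs from doc_a's hash in 1 bit
    0b00110010: ["doc_c"],  # differs from doc_a's hash in 1 bit
    0b01001101: ["doc_z"],  # differs in every bit
}


def similar_documents(code: int, radius: int = 1, n_bits: int = 8) -> list[str]:
    """Collect documents whose hash is within `radius` bit flips of `code`."""
    found = []
    for neighbour in hamming_neighbours(code, n_bits, radius):
        found.extend(index.get(neighbour, []))
    return sorted(found)


print(similar_documents(0b10110010))  # ['doc_b', 'doc_c']
```

No comparison against the whole collection is needed. Each neighbour is a direct dictionary lookup, which is what makes the approach fast.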

When it comes to compression, statistics obviously play a huge role here too, but our prediction models can really be as complicated as we want them to be. And the more intelligent they are, the better they will make predictions, and the better they will compress. So it's not hard to see why many theorists think compression and prediction is a good way to measure intelligence.
edit on 4/9/2015 by ChaoticOrder because: (no reason given)

posted on Sep, 4 2015 @ 03:18 AM

A simple test - which compression algorithm do you consider to be the most intelligent, if at all?

Well just going by raw numbers, the most intelligent algorithm would be the one which can compress data better than any other algorithm. If we ignore the time and memory required, then the best compression algorithm is cmix:

cmix is a lossless data compression program aimed at optimizing compression ratio at the cost of high CPU/memory usage. cmix is free software distributed under the GNU General Public License.

cmix is currently ranked first place on the Large Text Compression Benchmark and the Silesia Open Source Compression Benchmark. It also has state of the art results on the Calgary Corpus and Canterbury Corpus. cmix has surpassed the winning entry of the Hutter Prize (but exceeds the memory limits of the contest).

CMIX

However, if we are concerned about the time and memory required, there is an equation we can use to calculate the practical efficiency of any given compression algorithm. It turns out the most efficient compression software is called FreeARC:

FreeArc is a modern general-purpose archiver. Main advantage of FreeArc is fast but efficient compression and rich set of features.

Typically, FreeArc works 2-5 times faster than best programs in each compression class (ccm, 7-zip, rar, uharc -mz, pkzip) while retaining the same compression ratio; from technical grounds, it’s superior to any existing practical compressor.

FreeARC

... and remember, compression occurs naturally in nature. Holography, for instance reduces 3d data to a 2d representation and is therefore a compression.

Converting 3D information into a 2D format does not compress it, the same amount of information is required, it's just formatted differently.
edit on 4/9/2015 by ChaoticOrder because: (no reason given)

posted on Sep, 4 2015 @ 03:23 AM

The intelligence of Google is written by human coders inside a program which instructs computers to look up indexes, return matches etc.

The DNA code responsible for the intelligence of humans is written by evolution, that doesn't mean humans don't really have any intelligence we weren't programmed with. We are programmed to be self-learning biological machines. There is no reason we cannot program machines to also have self-learning capabilities so that they can learn things they were never programmed to know. Such algorithms already exist, they just aren't very advanced yet.

posted on Sep, 4 2015 @ 05:51 AM

originally posted by: ChaoticOrder

... and remember, compression occurs naturally in nature. Holography, for instance reduces 3d data to a 2d representation and is therefore a compression.

Converting 3D information into a 2D format does not compress it, the same amount of information is required, it's just formatted differently.

I'll disagree with you about holography as compression, if you will kindly allow me to explain my reasoning here:

The original represented object would normally be considered to be mediated upon 3D space (a volume). The holographic process renders the data (in a lossy, but distributed manner) onto 2D space (a plane). The medium of space is the same in both instances but the holographic representation occupies much less of that media, hence, the compression. Holography also is a lossy compression (dependent upon the area of the 2D surface that the object is decoded from) and is sort of analogue (and stochastic) rather than digital (and deterministic).

Also if we were to consider a sinusoidal waveform in free space. It can be defined by its frequency, amplitude and spatial phase. That very definition could be considered as a compression from which we could reconstruct the original waveform.
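A rough Python sketch of that reconstruction, assuming an ideal noise-free sine wave: three numbers are enough to regenerate as many samples as you like. The sample rate and the specific parameter values are arbitrary choices for the example.

```python
import math


def sample_sine(amplitude: float, frequency: float, phase: float,
                sample_rate: float, n_samples: int) -> list[float]:
    """Rebuild n_samples of a sine wave from its three defining parameters."""
    return [
        amplitude * math.sin(2 * math.pi * frequency * (i / sample_rate) + phase)
        for i in range(n_samples)
    ]


# Three parameters stand in for a thousand stored samples.
samples = sample_sine(amplitude=2.0, frequency=5.0, phase=0.0,
                      sample_rate=1000.0, n_samples=1000)
print(len(samples), max(samples))
```

This only works so neatly because the signal is perfectly regular. Any noise or drift in a real waveform would need extra information to describe.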

In this way, all semantic representations could be considered to be types of compression.

In nature we see repetitions of patterns (the fractal nature of nature) such as the repeated use of the Fibonacci sequence, and from that we could create special compression algorithms efficient for, say, biological data.

This then leads to conclusions about the efficiency of the encoding on DNA (being particularly efficient) which, therefore, would be representative of intelligence (by that definition).

My call is that compression, which in nature is often the outcome of natural iterative functions (like evolution), can and does exist quite separately from intelligence.

edit on 4/9/2015 by chr0naut because: (no reason given)

posted on Sep, 4 2015 @ 06:36 AM

The original represented object would normally be considered to be mediated upon 3D space (a volume). The holographic process renders the data (in a lossy, but distributed manner) onto 2D space (a plane). The medium of space is the same in both instances but the holographic representation occupies much less of that media, hence, the compression.

Holography does not need to be lossy and in nature there's no reason to assume it works in a lossy fashion. The information contained on the surface of a black hole contains all the information you need to reconstruct everything inside the black hole. The strange thing is that the information is stored on a 2D surface and not in a 3D space, which is why the entropy of a black hole scales with its area rather than with its volume. For that reason it actually takes more space to store information in a black hole because it's not making use of its 3D volume. But in actuality the information itself doesn't take any more space because it's exactly the same.

posted on Sep, 4 2015 @ 04:33 PM

We are programmed to be self-learning biological machines.

Yes, that's the difference. The processors in computers need instructions to operate, whereas humans don't, because the neural networks in our brains are able to self-balance for a desired outcome. Our brain's interconnectivity is magnificent: with 100 billion neurons and 100 trillion connections, it's no wonder the poor CPU cannot compete. Although they are trying to emulate neurons in software, it's not the real thing. In the future they should be able to grow biological brains that can achieve real intelligence, but today's computers are glorified calculators, nothing more.
edit on 4 9 2015 by glend because: (no reason given)

posted on Sep, 4 2015 @ 04:55 PM

originally posted by: glend
Yes, that's the difference. The processors in computers need instructions to operate, whereas humans don't, because the neural networks in our brains are able to self-balance for a desired outcome.

That's the trick, though, isn't it? To be able to program a computer to do essentially that -- start with some basic fundamental programs, have it gather input on its own, see it make necessary corrections to its own programming, and then allow it to proceed with more actions from there. Without disruptive feedback that would make it impossible.

Personally, I think we're almost clever enough to figure it out, and I think that it could involve using some kind of physical, sensitive, and vulnerable body as a feedback dampening buffer, just like we have. How does a baby learn not to stick its finger in a fire? It tries it and gets burned. We could do the same thing with a machine. Of course, it might eventually get rid of its sensitive body for greater efficiency, but I think that's likely how it will start.

posted on Sep, 4 2015 @ 06:02 PM

We have spent decades evolving simple desktop graphical OSes, so I don't share your enthusiasm for writing human intelligence in code with present-day compilers. It also comes down to pragmatism: we have 7 billion human brains on Earth, so what's the point in artificially making another? I believe we will build simple household robots like those in the film I, Robot within the next century or two, but I cannot see intelligence like that presented in the film Ex Machina anytime soon.

However, one can easily be fooled by computers that mimic intelligence. The 1964 program ELIZA did so on far less capable machines than we have today. Perhaps someone should write another ELIZA today and have it commenting on ATS as a member.

You...That's the trick, though isn't it?

ATS ELIZA... So why do you think that's the trick?

edit on 4 9 2015 by glend because: spelling

posted on Sep, 5 2015 @ 02:32 AM

Although they are trying to emulate neurons in software, it's not the real thing.

If I create a perfect simulation of an electric circuit and it does absolutely everything a real circuit does, then what's the practical difference? There is no practical difference, and in fact from an information perspective they are the exact same thing. For example we could exist inside a computer simulation right now and not even know it. What we call "real" could be nothing but bits of information flowing through a highly advanced quantum computer. Algorithms allow us to simulate anything in the real world, so long as we understand exactly how those things work. We don't need a "real" brain made of biological matter for it to do interesting things.
edit on 5/9/2015 by ChaoticOrder because: (no reason given)
