AI language models can exceed PNG and FLAC in lossless compression, says study

September 28, 2023
3:43 pm

Is compression equivalent to general intelligence? DeepMind digs up more potential clues.

Effective compression is about finding patterns to make data smaller without losing information. When an algorithm or model can accurately guess the next piece of data in a sequence, it shows that it has captured those patterns. That ties good prediction, which is what large language models like GPT-4 do very well, to good compression.
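The link between prediction and compression can be made concrete with a small sketch. It is not the method from the paper. It assumes a toy adaptive byte predictor (the function name `ideal_code_length_bits` and the previous-byte model are hypothetical, chosen only for illustration) and charges each byte -log2 of the probability the predictor gave it, which is the size an ideal entropy coder such as an arithmetic coder would approach.

```python
import math
import random


def ideal_code_length_bits(data: bytes) -> float:
    """Bits an ideal entropy coder would need for `data` when driven by a
    toy adaptive predictor (an assumption for illustration, not the paper's model).

    The predictor guesses each byte from the previous byte only, using the
    counts it has seen so far, with add-one smoothing so that no byte ever
    gets probability zero.
    """
    counts = {}   # previous byte -> list of 256 counts for the next byte
    totals = {}   # previous byte -> sum of those counts
    total_bits = 0.0
    prev = 0      # arbitrary starting context
    for b in data:
        if prev not in counts:
            counts[prev] = [1] * 256   # add-one smoothing
            totals[prev] = 256
        p = counts[prev][b] / totals[prev]   # predicted probability of b
        total_bits += -math.log2(p)          # cost of coding b
        counts[prev][b] += 1                 # learn from what was just seen
        totals[prev] += 1
        prev = b
    return total_bits


# Sample inputs (made up for this sketch): one highly regular, one random.
predictable = b"abcd" * 1000
rng = random.Random(0)
unpredictable = bytes(rng.randrange(256) for _ in range(4000))

for name, data in [("predictable", predictable), ("unpredictable", unpredictable)]:
    bits = ideal_code_length_bits(data)
    print(f"{name}: {bits / (8 * len(data)):.1%} of raw size")
```

The repeating input costs far fewer bits than its raw size because the predictor quickly learns what comes next, while the random bytes stay at roughly full size because there is nothing to predict. Swapping a stronger predictor into the same loop would shrink the total further on data with learnable structure.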

In an arXiv research paper titled “Language Modeling Is Compression,” researchers detail their discovery that the DeepMind large language model (LLM) called Chinchilla 70B can perform lossless compression on image patches from the ImageNet image database to 43.4 percent of their original size, beating the PNG algorithm, which compressed the same data to 58.5 percent. For audio, Chinchilla compressed samples from the LibriSpeech audio data set to just 16.4 percent of their raw size, outdoing FLAC compression at 30.3 percent.

In these results, a lower percentage means more compression. Lossless compression means that no data is lost during the process: decoding restores the original exactly. It stands in contrast to a lossy technique like JPEG, which discards some data to significantly reduce file sizes and reconstructs it with approximations during decoding.
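As a rough illustration of how such percentages are computed, the sketch below compresses a sample with Python's standard `zlib` module (which implements DEFLATE, the algorithm PNG builds on), checks that the round trip is lossless, and reports compressed size as a share of raw size. The sample text and the helper name `compression_rate` are assumptions made for this example, and the output is not comparable to the study's figures.

```python
import zlib


def compression_rate(raw: bytes, compressed: bytes) -> float:
    """Compressed size as a percentage of raw size (lower is better)."""
    return 100.0 * len(compressed) / len(raw)


# Made-up sample data with obvious repetition.
raw = b"the quick brown fox jumps over the lazy dog. " * 200

compressed = zlib.compress(raw, 9)      # DEFLATE at the highest compression level
restored = zlib.decompress(compressed)  # lossless: must equal the input exactly

print(f"rate: {compression_rate(raw, compressed):.1f}% of raw size")
print("lossless round trip:", restored == raw)
```

A lossy codec would fail the equality check in the last line, since its decoder only approximates the original.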

