DeepSeek-OCR AI model can process 200,000 pages of documents a day with a single Nvidia A100 graphics card – NotebookCheck.net News
An Nvidia A100 GPU (Image source: Nvidia)
DeepSeek aims to make AI training more efficient with a new open-source OCR compression model. Thanks to its optical encoding of text, the model can process more than 200,000 document pages per day on a single Nvidia A100 GPU.
With AI data centers proliferating and processing costs rising along with them, algorithmic efficiency is what matters now, and no other language model seems able to match DeepSeek on that front. Its models are free to download and can be trained at a lower cost than OpenAI’s ChatGPT or Google’s Gemini.
DeepSeek-OCR, a newly announced model, is a prime example of this efficiency. It compresses very long documents using optical mapping and retains 97% recognition accuracy at compression ratios below 10x.
Using advanced encoders and decoders, the model converts more than nine tokens of document text into a single token, which greatly reduces the computing resources required to process the content. Even at a 20x compression ratio, the new DeepSeek system still achieves 60% accuracy in optical recognition, an unprecedented feat.
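The compression figures above come down to simple token arithmetic. The following is a minimal sketch, assuming a hypothetical page of 1,000 text tokens; the function names and the page size are illustrative and are not taken from DeepSeek's code.

```python
import math


def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Number of text tokens represented by each compressed token."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


def tokens_needed(text_tokens: int, ratio: float) -> int:
    """Compressed tokens required to hold text_tokens at a given ratio (rounded up)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.ceil(text_tokens / ratio)


# Hypothetical page size, chosen only to make the numbers easy to read.
page_text_tokens = 1000

print(compression_ratio(page_text_tokens, 100))  # 10.0
print(tokens_needed(page_text_tokens, 10))       # 100
print(tokens_needed(page_text_tokens, 20))       # 50
```

Under these assumptions, going from 10x to 20x halves the number of tokens the language model has to read, which is the trade-off against the lower recognition accuracy reported at 20x.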
Thanks to the new AI compression algorithms, DeepSeek-OCR is able to learn from scientific and historical texts. A single Nvidia A100 data center GPU can process 200,000 pages a day, and a 20-node cluster of A100s can process 33 million document pages per day, a paradigm shift for text-heavy LLM training. According to the OmniDocBench rankings, DeepSeek-OCR beats other popular solutions such as GOT-OCR2.0 and MinerU2.0 while using fewer vision tokens per page.
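The throughput figures can be sanity-checked with a few lines of arithmetic. This is a rough sketch: the article does not say how many GPUs each node holds, so the value of 8 GPUs per node below is an assumption, as are the function names.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_second(pages_per_day: float) -> float:
    """Convert a daily page count into an average per-second rate."""
    return pages_per_day / SECONDS_PER_DAY


def cluster_pages_per_day(nodes: int, gpus_per_node: int, pages_per_gpu_per_day: int) -> int:
    """Daily throughput of a cluster, assuming perfectly linear scaling."""
    return nodes * gpus_per_node * pages_per_gpu_per_day


single_gpu = 200_000  # pages per day on one A100, as stated in the article

print(round(pages_per_second(single_gpu), 2))    # 2.31
# ASSUMPTION: 8 GPUs per node (not stated in the article).
print(cluster_pages_per_day(20, 8, single_gpu))  # 32000000
```

With that assumption, the cluster lands at 32 million pages per day, which is in line with the quoted 33 million if each GPU handles slightly more than 200,000 pages.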
The new DeepEncoder algorithms can handle a wide range of document resolutions and sizes without sacrificing speed. The DeepSeek3B-MoE-A570M decoder relies on a mixture-of-experts architecture, which distributes knowledge across specialized expert sub-networks and activates only some of them for each input. DeepSeek-OCR is able to process documents that include graphs, scientific equations, diagrams, or images, even when they are written in multiple languages.
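To illustrate the general mixture-of-experts idea, here is a toy sketch of top-k routing in plain Python. It is a generic textbook-style illustration, not DeepSeek's implementation: the experts are simple stand-in functions, and the gate scores are fixed by hand, whereas a real model computes them with a learned router.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the selected experts
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Hypothetical experts and gate scores, for illustration only.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
gate_scores = [0.1, 2.0, -1.0, 2.0]

print([i for i, _ in route_top_k(gate_scores)])  # [1, 3]
print(moe_forward(3.0, experts, gate_scores))    # 7.5
```

Only two of the four experts run for this input, which is why such a model can hold many parameters while activating a small fraction of them per token.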
To achieve such accuracy and scale, DeepSeek analyzed 30 million pages of Portable Document Format (PDF) files in nearly 100 different languages. This included all categories, from scientific handwriting and newspapers to textbooks and doctoral dissertations. While the speed and efficiency achieved by the new DeepSeek-OCR system are undeniable, it remains to be seen whether it will improve the performance of language models compared to the current text-based token paradigm.
Daniel Zlatev – Senior Tech Writer – 1931 articles have been published on Notebookcheck since 2021
Daniel was enamored with tech ever since the industrial espionage and pixelized Nintendos of the 1980s. He opened a gaming lounge when consoles and personal computers were still expensive rarities. Today, the fascination is no longer with specs and speeds, but rather the lifestyle that the computers in our pockets, houses, and cars have forced us into, from the endless scroll and privacy hazards to authenticating our every move and bit of existence.