DeepSeek files patent to improve AI data collection

Hangzhou DeepSeek AI Fundamental Technology Research Co. Ltd., a DeepSeek affiliate, filed a patent today for a web data collection system intended to improve crawling efficiency and data quality. The patent describes a method that discovers more webpage links while minimizing the traffic load on websites, using already-downloaded content to predict links that have not yet been discovered. Its main goals are prioritizing high-value information and reducing redundant downloads. Efficient web data collection matters for training large language models (LLMs), which power AI systems such as ChatGPT. Existing techniques suffer from incomplete link retrieval, excessive downloads that can crash websites, and weak filtering of low-quality data. DeepSeek’s system is designed to address these issues by optimizing data allocation while maintaining metadata accuracy. [iThome, in Chinese]
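To make the general idea concrete, here is a minimal Python sketch of a crawl queue that skips URLs it has already seen, serves higher-scored URLs first, and guesses undiscovered pagination links from a URL that was already downloaded. This is not the method in the patent, whose details the report does not give; the `Frontier` class, the `predict_next_pages` helper, the scores, and the example URLs are all hypothetical.

```python
import heapq
import re
from urllib.parse import urldefrag


class Frontier:
    """Hypothetical crawl queue: highest score first, no URL queued twice."""

    def __init__(self):
        self._heap = []      # entries are (-score, insertion order, url)
        self._seen = set()   # every URL ever queued, to avoid repeat downloads
        self._count = 0

    def add(self, url, score):
        url = urldefrag(url).url  # drop "#fragment" so duplicates collapse
        if url in self._seen:
            return False          # already queued: skip the redundant download
        self._seen.add(url)
        heapq.heappush(self._heap, (-score, self._count, url))
        self._count += 1
        return True

    def pop(self):
        """Return the queued URL with the highest score."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def predict_next_pages(url, lookahead=2):
    """Guess undiscovered pagination URLs from a trailing page number.

    An assumed heuristic for illustration only: if a downloaded URL ends
    in a number, the next few numbers are likely to exist as well.
    """
    match = re.search(r"(\d+)(/?)$", url)
    if not match:
        return []
    page = int(match.group(1))
    prefix = url[:match.start(1)]
    return [f"{prefix}{page + i}{match.group(2)}" for i in range(1, lookahead + 1)]


frontier = Frontier()
frontier.add("https://example.com/about", score=1.0)
frontier.add("https://example.com/news/page/3", score=5.0)
for guess in predict_next_pages("https://example.com/news/page/3"):
    frontier.add(guess, score=4.0)  # predicted links rank just below known ones
```

In this sketch the scores are supplied by hand. A real crawler would need some model of page value to set them, and would also need to respect each site's crawl rules and rate limits.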
