DeepSeek-OCR 2 Released: Letting AI "Read" Complex Documents Like a Human
Feng Huang Wang· 2026-01-27 11:58
Core Insights
- The DeepSeek team released the paper "DeepSeek-OCR 2: Visual Causal Flow" and open-sourced the DeepSeek-OCR 2 model, which features an innovative DeepEncoder V2 structure that dynamically adjusts the processing order of visual information based on image semantics [1][2]
- The new model aims to align machine processing more closely with human visual reading logic, addressing limitations in traditional visual language models that process images in a fixed grid order [1]

Model Performance
- DeepSeek-OCR 2 achieved an overall score of 91.09% on the OmniDocBench v1.5 benchmark, representing a 3.73% improvement over its predecessor [2]
- The model demonstrated enhanced accuracy in reading order, with the edit distance decreasing from 0.085 to 0.057, indicating a better understanding of document content structure [2]
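To make the reading-order metric concrete, here is a minimal sketch of a normalized edit distance between a predicted and a reference block order. It assumes the benchmark's figure is a length-normalized Levenshtein distance; the block names and the two orderings are hypothetical and are not taken from the paper or from OmniDocBench.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(predicted, reference):
    """Edit distance divided by the longer length; 0.0 means identical order."""
    longest = max(len(predicted), len(reference))
    return levenshtein(predicted, reference) / longest if longest else 0.0


# Hypothetical two-column page: block ids in the order a human would read them.
reference_order = ["title", "col1_p1", "col1_p2", "col2_p1", "col2_p2", "footer"]
# A fixed left-to-right scan interleaves the two columns.
raster_order = ["title", "col1_p1", "col2_p1", "col1_p2", "col2_p2", "footer"]

score = normalized_edit_distance(raster_order, reference_order)
print(f"normalized edit distance: {score:.3f}")
```

In this toy case two blocks are out of place, so the score is 2/6. Lower is better, which is why a drop from 0.085 to 0.057 reads as an improvement; the toy value is not comparable to the benchmark figures.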
Breaking: DeepSeek Releases and Open-Sources a New Model
Mei Ri Jing Ji Xin Wen· 2026-01-27 08:12
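On January 27, the DeepSeek team released and open-sourced the new DeepSeek-OCR 2 model. It uses the innovative DeepEncoder V2 method, which lets the AI dynamically reorder the parts of an image according to the image's meaning instead of mechanically scanning from left to right. This approach is closer to human visual encoding logic. As a result, the model outperforms traditional vision-language models on images with complex layouts, achieving smarter visual understanding with stronger causal reasoning.
Editors: Cheng Peng, Du Bo. Proofreader: Xu Shaohang. Cover image: Visual China (file photo). Compiled by Mei Ri Jing Ji Xin Wen from its AI Express ...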
DeepSeek Open-Sources the OCR2 Model
Cai Jing Wang· 2026-01-27 08:05
Core Viewpoint
- The DeepSeek team has released a paper titled "DeepSeek-OCR2: Visual Causal Flow" and has open-sourced the DeepSeek-OCR2 model, which uses an innovative DeepEncoder V2 method to let AI dynamically rearrange parts of an image based on its meaning, aligning more closely with human visual encoding logic [1].

Group 1
- The DeepSeek-OCR2 model represents a significant advancement in AI's ability to interpret and manipulate visual information [1].
- The innovative DeepEncoder V2 method is a key feature that enhances the model's performance in visual tasks [1].
- The open-sourcing of the model allows for broader access and potential collaboration within the AI research community [1].
Around the Lunar New Year, DeepSeek Releases Another Large Model: DeepSeek-OCR 2 Arrives, Closer to Human Visual Encoding Logic
Jin Rong Jie· 2026-01-27 07:56
Core Insights
- DeepSeek has launched its new model, DeepSeek-OCR 2, which uses the innovative DeepEncoder V2 method to dynamically rearrange image components based on their meaning, bringing its visual encoding logic closer to human perception [1]
- The release of DeepSeek-OCR 2 comes approximately four months after the first version, indicating a rapid development cycle [1]
- DeepSeek's approach contrasts with traditional OCR by converting text information into visual images for efficient understanding, addressing challenges in processing long texts [1]

Model Developments
- The core component of DeepSeek's technology, the visual encoder, is believed to simulate the human brain's forgetting mechanism, providing a clear technical path for integrating optical and quantum computing in large language models (LLMs) [2]
- Following the release of the V3 model in late 2024, DeepSeek is expected to unveil its next flagship model, DeepSeek V4, in February 2026, although the company has not confirmed this [2]
- The DeepSeek-V3.1 upgrade features a hybrid reasoning architecture that supports both thinking and non-thinking modes, improving response efficiency and agent capabilities through post-training optimization [2]

Performance Metrics
- DeepSeek-V3.2 and its enhanced version, DeepSeek-V3.2-Speciale, reportedly achieve reasoning capabilities comparable to GPT-5 while significantly reducing output length and computational costs compared to competitors [3]
- DeepSeek-R1, launched on January 20, 2025, claims performance on par with OpenAI's models at a remarkably low reported training cost of $294,000, with total training costs still below those of major international competitors [3]

Market Impact
- On January 27, 2025, DeepSeek topped the free app download charts in both the U.S. and China, surpassing ChatGPT, which led to a significant revaluation of Chinese assets in the stock market [4]
- Following DeepSeek's rise, indices related to computing power and cloud computing in the A-share market surged over 40%, with several stocks experiencing substantial gains [4]
- Whether DeepSeek can replicate its previous market success remains a point of interest for investors and analysts [4]
DeepSeek Releases DeepSeek-OCR 2, Teaching AI "Human Visual Logic"
Zhi Tong Cai Jing· 2026-01-27 07:53
Core Insights
- DeepSeek has launched the new DeepSeek-OCR2 model, which uses the innovative DeepEncoder V2 method to dynamically rearrange image components based on their meaning, enhancing visual understanding beyond traditional left-to-right scanning methods [1][2]
- The model significantly outperforms traditional visual-language models (VLMs) in processing complex layouts, achieving a score of 91.09% on the OmniDocBench v1.5 benchmark, which is a 3.73% improvement over its predecessor [1]

Group 1
- The DeepSeek-OCR2 model maintains high accuracy while controlling computational costs, with visual token counts limited to between 256 and 1120, aligning with Google's Gemini-3 Pro [2]
- In practical applications, the model reduces repetition rates by 2.08% for online user logs and by 0.81% for PDF pre-training data, indicating high practical maturity [2]

Group 2
- The release of DeepSeek-OCR2 represents not only an upgrade in OCR performance but also significant architectural exploration, validating the potential of using language model architectures as visual encoders [2]
- The DeepEncoder V2 architecture inherits advancements from the LLM community, such as mixture-of-experts (MoE) architecture and efficient attention mechanisms [2]
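As a rough illustration of how a visual token budget could span 256 to 1120 tokens, the sketch below assumes one fixed-size global view plus a variable number of local crops. The per-view token counts and the crop limit are assumptions chosen to reproduce the reported range; the summary above does not state how the budget is composed.

```python
# All three constants are assumptions for illustration, not figures from the article.
GLOBAL_VIEW_TOKENS = 256  # assumed tokens for the single global page view
LOCAL_CROP_TOKENS = 144   # assumed tokens per local crop
MAX_LOCAL_CROPS = 6       # assumed upper bound on crops per page


def visual_token_count(num_crops: int) -> int:
    """Total visual tokens for one page under the assumed budget."""
    if not 0 <= num_crops <= MAX_LOCAL_CROPS:
        raise ValueError(f"num_crops must be between 0 and {MAX_LOCAL_CROPS}")
    return GLOBAL_VIEW_TOKENS + num_crops * LOCAL_CROP_TOKENS


# Budget for every allowed crop count, from a simple page to a dense one.
budgets = [visual_token_count(n) for n in range(MAX_LOCAL_CROPS + 1)]
print(budgets)
```

Under these assumptions a simple page costs only the global view, and the densest page reaches the upper limit, so the cost to the downstream LLM grows with layout complexity while staying bounded.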
DeepSeek Releases New Model; Concept Stocks Rally Briefly
Di Yi Cai Jing Zi Xun· 2026-01-27 06:48
Group 1
- The DeepSeek team released a paper titled "DeepSeek-OCR 2: Visual Causal Flow" and open-sourced the DeepSeek-OCR 2 model, which uses the innovative DeepEncoder V2 method to let AI dynamically rearrange parts of an image based on its meaning, aligning more closely with human visual encoding logic [1]

Group 2
- DeepSeek concept stocks experienced a short-term surge, with YunSai ZhiLian hitting the daily limit, Hongjing Technology reaching a 20% increase, and KaiPu Cloud, Shiji Hengtong, and Parallel Technology also seeing short-term gains [3]
DeepSeek Open-Sources New OCR 2 Model, Making Machine Visual Encoding Logic More "Human-Like"
Xin Lang Cai Jing· 2026-01-27 06:40
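Sina Tech, afternoon of January 27: The DeepSeek team today released the paper "DeepSeek-OCR 2: Visual Causal Flow" and open-sourced the DeepSeek-OCR 2 model. Reportedly, the model adopts the innovative DeepEncoder V2 architecture, which shifts visual encoding from fixed scanning to semantic reasoning and lets the AI dynamically reorder the parts of an image according to its meaning, bringing it closer to human visual encoding logic.
Reportedly, while maintaining very high data compression efficiency, DeepSeek-OCR 2 achieved significant gains on multiple benchmarks and production metrics. The model needs only 256 to 1120 visual tokens to cover a complex document page, a very low count among comparable models, which significantly reduces the computational overhead of the downstream LLM. On the OmniDocBench v1.5 evaluation, its overall score reached 91.09%, up 3.73% from the previous generation, with notably stronger logic in reading-order recognition.
Editor in charge: Song Yafang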
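DeepSeek's New AI Model Arrives; the 10-Billion-Yuan-Scale Artificial Intelligence AI ETF (515070) Rises 1.7%, Drawing in Over 1.3 Billion Yuan in the Past 10 Days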
Ge Long Hui APP· 2026-01-27 06:25
Group 1
- DeepSeek's new DeepSeek-OCR 2 model uses the innovative DeepEncoder V2 method, enabling AI to "see" an image in a logical sequence similar to humans [2]
- The AI ETF has seen a net inflow of 1.375 billion yuan over the past 10 days, with a current scale of 10.961 billion yuan, covering various segments of the AI industry chain [3]
- The robot-themed ETF includes leading companies such as Huichuan Technology, a leader in industrial robots, and Stone Technology, a leader in service robots [3]

Group 2
- The Yuanbao plan will distribute 1 billion yuan in cash red envelopes starting February 1, with individual envelope amounts reaching up to 10,000 yuan [2]
- Wenxin will launch a cash red envelope campaign totaling 500 million yuan from January 26 to March 12, also with a maximum prize of 10,000 yuan [2]
- ByteDance's Volcano Engine has become the exclusive AI cloud partner for CCTV's Spring Festival Gala, indicating potential for further promotional activities during the event [2]
DeepSeek Concept Stocks Rally Briefly as OCR 2 Launches, Teaching AI "Human Visual Logic"
Jin Rong Jie· 2026-01-27 06:18
Core Insights
- DeepSeek's release of the DeepSeek-OCR2 model has led to a short-term surge in related stocks, with companies like YunSai ZhiLian and Hongjing Technology hitting their upper trading limits [1]
- The DeepSeek-OCR2 model uses the innovative DeepEncoder V2 method, allowing AI to dynamically rearrange image components based on their meanings, closely mimicking human visual encoding logic [1][6]

Technology Advancements
- The DeepSeek-OCR2 model breaks the limitations of traditional OCR by improving semantic understanding of images, significantly enhancing recognition accuracy in complex layouts, distortions, and occlusions [6]
- In the OmniDocBench v1.5 benchmark test, the model achieved a score of 91.09%, a 3.73% improvement over its predecessor [6]
- The model maintains high precision while controlling computational costs, with visual token counts limited to between 256 and 1120, aligning with Google's Gemini-3 Pro [6][7]

Architectural Significance
- The release of DeepSeek-OCR2 represents not just an upgrade in OCR performance but also a significant exploration of architecture, validating the potential of using language model architectures as visual encoders [7]
- The model's "two cascaded 1D causal reasoning" approach may signify a breakthrough toward true 2D reasoning by decomposing 2D understanding into complementary sub-tasks [7]

Industry Implications
- The launch of the DeepSeek-OCR2 model provides a technological upgrade direction for the OCR industry, enabling companies involved in graphic information processing and digital transformation services to optimize their products and expand business opportunities in the finance, healthcare, and government sectors [8]
- DeepSeek's commitment to an open-source technology route and its continuous release of high-performance models will benefit developers and enterprises focusing on secondary development and deployment services [8]
- The adaptation of DeepSeek's model to edge devices is pushing AI capabilities towards the edge, creating growth opportunities for companies involved in edge hardware development and edge computing solutions [8]
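The difference between a fixed scan and a layout-aware reading order can be shown with a toy example. The sketch below is not DeepEncoder V2: a hand-written rule (`layout_aware_order`, a hypothetical helper) stands in for the ordering that the real model is described as inferring from image semantics, and the two-column page is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    row: int
    col: int
    text: str


# Hypothetical two-column page: patch columns 0-1 hold text column A,
# patch columns 2-3 hold text column B.
page = [
    Patch(0, 0, "A1"), Patch(0, 1, "A2"), Patch(0, 2, "B1"), Patch(0, 3, "B2"),
    Patch(1, 0, "A3"), Patch(1, 1, "A4"), Patch(1, 2, "B3"), Patch(1, 3, "B4"),
]


def raster_order(patches):
    """Fixed grid scan: top to bottom, left to right, ignoring layout."""
    return sorted(patches, key=lambda p: (p.row, p.col))


def layout_aware_order(patches, column_split=2):
    """Hand-written stand-in for a learned order: finish the left text
    column before starting the right one."""
    return sorted(patches, key=lambda p: (p.col >= column_split, p.row, p.col))


raster = [p.text for p in raster_order(page)]
semantic = [p.text for p in layout_aware_order(page)]
print("raster:  ", raster)
print("semantic:", semantic)
```

The raster scan interleaves the two text columns, while the layout-aware order keeps each column contiguous. A 1D causal decoder that consumes the second sequence sees the text in the order a person would read it, which is the intuition behind splitting 2D understanding into an ordering step followed by sequential decoding.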
DeepSeek Releases DeepSeek-OCR 2
Mei Ri Jing Ji Xin Wen· 2026-01-27 06:15
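Mei Ri Jing Ji Xin Wen AI Express, January 27: DeepSeek has released the new DeepSeek-OCR2 model. It uses the innovative DeepEncoder V2 method, which lets the AI dynamically reorder the parts of an image according to the image's meaning instead of mechanically scanning from left to right. This approach simulates the logical flow humans follow when viewing a scene. As a result, the model outperforms traditional vision-language models on images with complex layouts, achieving smarter visual understanding with stronger causal reasoning. ...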