Seek .(SKLTY)
DeepSeek releases DeepSeek-OCR 2, letting AI "see" an image in the same logical order as a human
Hua Er Jie Jian Wen· 2026-01-27 05:52
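DeepSeek has released its new DeepSeek-OCR 2 model, which uses a novel DeepEncoder V2 approach that lets the AI dynamically reorder the parts of an image according to their meaning instead of mechanically scanning from left to right. This mimics the logical flow a human follows when looking at a scene. As a result, the model outperforms conventional vision-language models on images with complex layouts, such as documents and charts, achieving smarter visual understanding with stronger causal reasoning. ...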
After the "DeepSeek moment," what keeps the "China moment" trending?
Sou Hu Cai Jing· 2026-01-26 16:06
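One year ago, on this very Monday, the R1 large model from DeepSeek (深度求索) stunned the world with its high performance and low training cost; foreign media called it the "DeepSeek moment." DeepSeek's sudden emergence can be seen as the "opening act" of China's scientific and technological innovation in 2025. Recently, international institutions have published a wave of reviews of global technology development over the past year, and the achievements from China stand out. In basic research, at the end of 2025 the leading international journal Science announced its "top ten scientific breakthroughs of 2025"; China's breakthroughs in renewable energy were selected, along with results such as solving the mystery of what Denisovans looked like and discovering a "gene switch" for heat tolerance in rice. In artificial intelligence (AI), according to statistics compiled by the Massachusetts Institute of Technology and other institutions at the end of November 2025, downloads of Chinese open-source models such as DeepSeek and Qwen accounted for 17% of the global total, surpassing the United States to rank first worldwide. This January, Microsoft published a report saying that DeepSeek adoption in Russia, Belarus, Iran, Africa and other developing countries had grown "explosively." In advanced chips, which are closely tied to AI computing power, the globally known financial research firm Bernstein Research forecasts that in 2026 Huawei and other domestic companies will hold 80% of China's AI chip market, while Nvidia's share will fall to 8%. Why has China's technological innovation drawn the world's attention over the past year? It stems from the determination and effort to achieve high-level self-reliance in science and technology. Under the US blockade and embargo on advanced chips, ...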
Data: Cumulative trading volume of the Seeker token SKR has exceeded USD 200 million
Xin Lang Cai Jing· 2026-01-26 14:54
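(Source: Wu Shuo (吴说)) Wu Shuo has learned that, according to Top Ledger data, cumulative trading volume of the Seeker token SKR has exceeded USD 200 million since the Solana Mobile airdrop; about 85% of the total airdrop has now been claimed; SKR's on-chain trading has taken place mainly on Meteora, which accounts for 57.4% of the volume. ...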
PriceSeek key alert: LLDPE spot prices raised sharply
Xin Lang Cai Jing· 2026-01-26 11:09
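生意社 (Shengyishe), January 26: On January 26, Shandong Wanhua Chemical quoted LLDPE 7042 in East China at 7,050 yuan/ton, up 200 yuan/ton, and LLDPE 7050 in East China at 7,100 yuan/ton, up 200 yuan/ton. PriceSeek commentary: LLDPE, bull/bear score: 2. The article shows East China LLDPE spot quotes rising by 200 yuan/ton (grade 7042 at 7,050 yuan/ton and grade 7050 at 7,100 yuan/ton), a notable increase. This reflects tight supply or rising demand and is strongly bullish for spot prices. Polyethylene futures data from the Dalian Commodity Exchange point the same way: the 2605 contract closed at 6,865 yuan/ton, up 116 yuan/ton, with trading volume of 719,797 lots and open interest up by 3,532 lots, indicating strong bullish sentiment. Rising spot prices will reinforce expectations of further futures gains and support the future price trend. [Commodity formula pricing principle] The 生意社 benchmark price is a trading guidance price generated from price big data and the 生意社 pricing model, also known as the 生意社 price. It can be used to determine the settlement price for the following two needs: 1. the settlement price on a specified date; 2. the average settlement ...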
Where does DeepSeek-R1's reasoning intelligence come from? New Google research: multiple personas inside the model are arguing it out
3 6 Ke· 2026-01-26 09:14
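Over the past two years, the reasoning ability of large models has taken a clear leap. On complex tasks such as mathematics, logic and multi-step planning, reasoning models such as OpenAI's o-series, DeepSeek-R1 and QwQ-32B have begun to pull steadily ahead of conventional instruction-tuned models. Intuitively, they seem simply to think for longer: a longer Chain-of-Thought and higher test-time compute are the most frequently cited explanations. But push the question further: is the essence of reasoning ability really just computing a few more steps? A paper recently published by researchers at Google, the University of Chicago and other institutions gives a more structural answer: the gain in reasoning ability does not come only from more computation steps, but from the model implicitly simulating a complex, multi-agent-like interaction structure during reasoning, which the authors call a "society of thought." Put simply, the study finds that, to solve hard problems, reasoning models sometimes simulate an internal dialogue among different personas, like a debate team inside their digital brain. The personas argue, correct one another, express surprise, and reconcile differing views to reach the right answer. Human intelligence very likely evolved through social interaction, and a similar intuition seems to apply to artificial intelligence. By classifying reasoning outputs and applying mechanistic interpretability methods to reasoning traces, the study finds that models such as DeepSeek-R ...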
"DeepSeek-V3 is built on our architecture": CEO of "Europe's OpenAI" slammed for outrageous remark
3 6 Ke· 2026-01-26 07:44
Core Viewpoint
- The discussion centers on competition in the AI field, in particular the contrasting approaches Mistral and DeepSeek take to sparse mixture-of-experts (MoE) models. Mistral's CEO acknowledges China's strong position in AI and the significance of open-source models [1][4].
Group 1: Company Perspectives
- Mistral's CEO, Arthur Mensch, describes open-source models as a strategy for progress rather than competition and points to Mistral's early release of open-source models [1].
- Mensch asserts that the recently released DeepSeek-V3 is built on the architecture Mistral proposed, a claim that frames AI development as both collaborative and competitive [1][4].
- The audience is skeptical of Mistral's claims; some suggest that Mistral's recent models borrowed heavily from DeepSeek's architecture instead [4][13].
Group 2: Technical Comparisons
- DeepSeek's models and Mistral's Mixtral are both sparse MoE systems that aim to cut computational cost while increasing model capability, but their approaches differ fundamentally [9].
- Mixtral emphasizes engineering, showing how effective a robust base model combined with mature MoE technology can be, while DeepSeek focuses on algorithmic innovation to fix problems in traditional MoE systems [9][12].
- DeepSeek introduces fine-grained expert segmentation, which allows more flexible combinations of experts, in contrast with Mixtral's flat distribution of knowledge among experts [11][12].
Group 3: Community Reactions
- The community has reacted critically to Mistral's statements, with some users expressing disbelief and pointing out the similarities between Mistral's and DeepSeek's architectures [2][17].
- Some feel that Mistral, once a pioneer in open-source AI, has lost its innovative edge, while DeepSeek has gained more influence in sparse MoE and MLA technologies [14][17].
- The race for foundation models is expected to continue, with DeepSeek reportedly targeting significant releases in the near future [19].
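The sparse MoE idea compared above can be made concrete with a small sketch. It is a minimal illustration under stated assumptions, not either company's implementation: a router scores every expert, only the top-k experts run, and their outputs are blended with renormalized gate weights. The toy experts, the scores, and the split factor of 4 are hypothetical values chosen for illustration; 8 experts with 2 active per token matches the commonly described Mixtral configuration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(scores, k):
    """Return {expert_index: gate_weight} for the k highest-scoring experts.

    Gate weights are renormalized so the selected experts sum to 1.
    """
    probs = softmax(scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_output(x, experts, scores, k):
    """Run only the selected experts and blend their outputs by gate weight."""
    gates = top_k_route(scores, k)
    return sum(w * experts[i](x) for i, w in gates.items())

# Hypothetical toy experts: expert i simply scales its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one token.
scores = [0.1, 2.0, 0.3, 1.5, -1.0, 0.0, 0.2, -0.5]

gates = top_k_route(scores, k=2)
y = moe_output(1.0, experts, scores, k=2)

# Fine-grained segmentation (assumed split factor 4): each of the 8 experts
# becomes 4 smaller ones and 4x as many are activated, so the active
# parameter count stays roughly the same while the number of possible
# expert combinations grows sharply.
coarse = math.comb(8, 2)
fine = math.comb(32, 8)
print(gates, y, coarse, fine)
```

The combination counts are the point of the comparison: with the same share of experts active, the finer split gives the router far more ways to compose experts for a given token.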
Reading DeepSeek's latest paper: how does mHC train a stronger model for less money? (Investment Notes No. 243)
3 6 Ke· 2026-01-26 07:38
Core Insights
- DeepSeek has released a significant paper on Manifold-Constrained Hyper-Connections (mHC). It addresses a fundamental question, how information can flow stably through ultra-deep networks in large models, rather than model parameters, data volume, or computational power [2].
Group 1: Residual Connections and Their Limitations
- Residual connections, introduced by Kaiming He's team in 2015, are a milestone in AI development: they made deeper neural networks possible by addressing the vanishing gradient problem [3].
- Before residual connections, neural networks were limited to depths of 20-30 layers because gradients decayed exponentially, which hindered effective feature learning [3][4].
- Residual connections added a "shortcut" for signal transmission, raising the depth of trainable networks from tens of layers to hundreds or thousands and forming the structural foundation of modern deep learning [4].
Group 2: Introduction of Hyper-Connections
- Hyper-Connections emerged to overcome the limitations of residual connections by providing multiple pathways for information transfer within a model, akin to a relay race with multiple runners [6][7].
- Information is distributed across several parallel channels whose weights are allocated dynamically during training, improving the model's ability to handle complex, multi-source information [6][7].
Group 3: Challenges with Hyper-Connections
- Hyper-Connections have a critical flaw: the information flow has too much freedom, which makes it unstable and can unbalance the model's internal information flow [9].
- Training with Hyper-Connections can show high volatility and loss divergence, a sign that information transmission is not stable [9].
Group 4: The Solution: mHC
- mHC, or Manifold-Constrained Hyper-Connections, adds a crucial constraint to Hyper-Connections: the mixing matrix must be doubly stochastic, so that information is redistributed without being amplified [11].
- The constraint prevents both signal explosion and signal decay, keeping the flow of information stable throughout the network [13].
- mHC improves training stability and performance with only a 6.7% increase in training time, which is negligible compared with the savings in computational resources and debugging time [13][14].
Group 5: Implications for Future AI Development
- mHC strikes a new balance between stability and efficiency, reducing computational costs by approximately 30% and shortening product iteration cycles [14].
- It supports the development of larger models by addressing the stability bottleneck in scaling to hundreds of billions or trillions of parameters [16].
- The mHC framework shows that "constrained freedom" is more valuable than "complete freedom," suggesting a shift in AI architecture design from experience-driven to theory-driven approaches [16].
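The effect of the doubly stochastic constraint can be checked numerically. The sketch below is an illustration under assumptions, not the paper's implementation: it uses Sinkhorn-Knopp normalization (alternately rescaling rows and columns) as one well-known way to turn a positive matrix into a doubly stochastic one, and the summary above does not say which method mHC uses. The 3x3 matrix, the stream values, and the depth of 100 are hypothetical.

```python
def sinkhorn(matrix, iterations=50):
    """Approximate a doubly stochastic matrix from a positive square matrix
    by alternately normalizing rows and columns (Sinkhorn-Knopp)."""
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        for row in m:  # make every row sum to 1
            s = sum(row)
            for j in range(n):
                row[j] /= s
        for j in range(n):  # make every column sum to 1
            s = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= s
    return m

def mix(matrix, streams):
    """One layer of mixing: each output stream is a weighted sum of inputs."""
    return [sum(w * x for w, x in zip(row, streams)) for row in matrix]

def run_layers(matrix, streams, depth):
    """Apply the same mixing matrix `depth` times, as in a deep stack."""
    for _ in range(depth):
        streams = mix(matrix, streams)
    return streams

# Hypothetical unconstrained mixing weights (every row sums to more than 1).
raw = [[1.2, 0.5, 0.1],
       [0.3, 0.9, 0.4],
       [0.2, 0.6, 1.1]]
streams = [1.0, 2.0, 0.5]

unconstrained = run_layers(raw, streams, depth=100)
constrained = run_layers(sinkhorn(raw), streams, depth=100)

print("unconstrained max:", max(unconstrained))  # grows exponentially with depth
print("constrained sum:  ", sum(constrained))    # stays at the input total
```

Because every column sums to 1, the total signal across streams is conserved, and because every row is a set of non-negative weights summing to 1, no stream can exceed the largest input. That is the sense in which information is redistributed rather than amplified.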
DeepSeek: Less Is More
2026-01-26 02:49
Summary of DeepSeek Conference Call
Company and Industry Overview
- **Company**: DeepSeek
- **Industry**: Artificial Intelligence (AI) and Semiconductor Equipment in China
Key Points and Arguments
1. **Engram Module Launch**: DeepSeek has introduced the Engram module, which decouples storage from computation, reducing reliance on High Bandwidth Memory (HBM) and lowering infrastructure costs. This innovation aims to alleviate bottlenecks in AI computing in China and suggests that future AI competition may focus on more efficient hybrid architectures rather than larger models [1][2][3]
2. **Efficiency Improvements**: The Engram module makes large language models more efficient by implementing "conditional memory," which allows better use of GPU resources. Decoupling static memory from computation is expected to improve AI system performance while reducing the need for expensive HBM [1][9][10]
3. **Infrastructure Cost Dynamics**: The findings indicate that infrastructure costs may shift from GPU to storage, as medium computational configurations may be more cost-effective than pure GPU expansion. AI inference capability is expected to improve beyond knowledge growth, highlighting the value of storage beyond computation alone [2][3][10]
4. **Next Generation Model**: DeepSeek's upcoming V4 model will use the Engram memory architecture, potentially achieving significant advances in code generation and inference. The model is expected to run on consumer-grade hardware, such as the RTX 5090, and its performance against key benchmarks will be closely monitored [2][3][10]
5. **Investment Opportunities**: The report highlights potential investment opportunities in the Chinese semiconductor equipment sector, particularly companies such as Northern Huachuang (target price: RMB 514.2), Zhongwei Company (target price: RMB 364.32), and Changdian Technology (target price: RMB 49.49) [3][24][25]
Additional Important Insights
1. **Performance Comparison**: Despite stricter constraints on advanced computing and hardware acquisition, Chinese AI models have rapidly closed the performance gap with leading models like ChatGPT 5.2. This progress is attributed to efficiency-driven innovation rather than sheer computational expansion [8][14]
2. **Long-term Implications**: The architecture developed by DeepSeek may lead to a more cost-effective, scalable, and adaptable AI ecosystem in China, potentially affecting global competitors by lowering the marginal cost of high-level intelligence and decreasing reliance on unlimited computational expansion [14][16]
3. **Engram's Unique Approach**: Engram's design uses memory more efficiently, significantly lowering the demand for HBM. It enhances the core transformer model without increasing FLOPs or parameter scale, improving overall system efficiency [11][18]
4. **Testing Results**: Tests on a 27 billion parameter model show that Engram outperforms in several benchmarks, particularly long-context processing, which is crucial for making AI more practical [16][18]
5. **Strategic Positioning**: DeepSeek's advances are a strategic response to geopolitical and supply chain constraints, emphasizing algorithmic and system-level innovation over direct hardware competition [16][18]
This summary covers the key insights from the conference call on DeepSeek's innovations, market positioning, and the broader implications for the AI and semiconductor industries in China.
AI Weekly | DeepSeek's new model revealed; Musk accuses ChatGPT of inducing suicide
Di Yi Cai Jing· 2026-01-25 01:31
Group 1
- DeepSeek has revealed a new model identifier "MODEL1" in its FlashMLA code, suggesting it may be nearing completion or deployment, potentially as a new architecture distinct from existing models [1]
- Elon Musk criticized ChatGPT for being linked to multiple suicide cases, while OpenAI's Sam Altman acknowledged the complexities of operating a large AI platform and highlighted the safety concerns surrounding AI technologies [2]
- Wang Xiaochuan responded to concerns about AI in healthcare, advocating a model where AI assists doctors rather than replacing them and emphasizing the importance of patient benefits [3]
Group 2
- OpenAI's API business generated over $1 billion in annual recurring revenue last month, with projections indicating a significant increase in annual revenue to over $20 billion by 2025 [4]
- Baidu has established a new personal superintelligence business group, merging its document and cloud storage divisions, which is expected to enhance AI application capabilities [6]
- NVIDIA's CEO highlighted three major breakthroughs in AI models over the past year, including the emergence of agentic AI and advancements in open-source models [7]
Group 3
- Sequoia Capital is reportedly investing in AI unicorn Anthropic, which is raising over $25 billion in funding, potentially doubling its valuation to around $350 billion [8]
- Meta's new AI lab has delivered its first key models, although significant work remains before these technologies are fully operational for internal and consumer use [9]
- Musk's X platform has open-sourced its recommendation algorithm, which relies heavily on AI to customize user content [10][11]
Group 4
- Suiruan Technology reported significant losses exceeding 4 billion yuan over three years, with a high dependency on sales to Tencent [12]
- Moore Threads anticipates a narrowing of losses in the upcoming year, projecting revenues of 1.45 to 1.52 billion yuan for 2025 [13]
- Yushu Technology announced that it shipped over 5,500 humanoid robots last year, surpassing previous market estimates [14]
Group 5
- The "Qiming Plan" project has been launched to establish global consensus on AI safety measures, aiming to balance the opportunities and risks of rapid AI development [15]
DeepSeek predicts: gold's surge is only the beginning! These 5 things will rise too, and here is the stock-up list
Sou Hu Cai Jing· 2026-01-24 17:39
Core Viewpoint
- The article discusses the recent surge in gold prices and predicts that several other commodities, including silver, copper, natural gas, coffee, and cocoa, will also rise in price due to various market factors [1][2][4][5][7].
Group 1: Gold Market Analysis
- Gold prices have risen significantly, reaching over $4,000, with a year-to-date increase of 52%, the largest annual gain since 1979 [1][2].
- Key drivers of gold's rise include geopolitical tensions, such as the Middle East conflicts and the ongoing Russia-Ukraine war, which have heightened market risk aversion [2].
- The expectation of two Federal Reserve rate cuts in 2025 is anticipated to weaken the dollar's appeal, further boosting gold prices [2].
Group 2: Other Commodities Expected to Rise
- Silver is expected to rise on strong industrial demand, particularly in the photovoltaic sector, where it accounts for 65% of industrial usage [4].
- Copper demand is projected to grow over 60% by 2030, driven by energy transition initiatives and infrastructure upgrades, while mining accidents constrain supply [4].
- Natural gas prices are forecast to increase by approximately 10% in Europe and 60% in the U.S. in 2025, influenced by geopolitical factors and weather conditions [5].
- Coffee prices are rising due to drought in Brazil, which produces nearly half of the world's Arabica coffee [7].
- Cocoa prices are also increasing due to similar supply issues, with drought affecting production [7].
Group 3: Investment Considerations
- Commodities can be invested in through physical assets such as gold bars or coins, through ETFs, or through futures contracts for other commodities [10].
- Rising commodity prices may affect everyday costs, particularly for coffee and cocoa, while higher natural gas prices may affect heating costs [10].
- The article emphasizes risk management in commodity investing, suggesting that investors allocate a reasonable portion of their assets to commodities [12].