Daily AI Briefing

2026-06-01 (content fetched 06/01 05:48)

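Claude Opus 4.8 released with better collaboration and coding

TLDR AI

Anthropic has released Claude Opus 4.8, an incremental update: collaboration and coding behavior improve, document parsing regresses in places, and benchmark results are mixed. (Reported by multiple outlets)

Why it matters: Opus 4.8 is Anthropic's latest flagship model. Even a minor upgrade has practical consequences for developers, so its performance changes are worth tracking.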

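Claude Code adds dynamic workflows

Claude Blog

Claude Code adds a dynamic workflows feature that lets developers define and orchestrate multi-step automated tasks, aimed at making coding and operations work more efficient.

Why it matters: a direct productivity gain for developers, available to try in Claude Code now.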

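MoneyPrinterTurbo: one-click AI short-video generation

GitHub Trending

An open-source project that uses large AI models to generate high-definition short videos in one click, with multiple templates and customization options, lowering the barrier to video production.

Why it matters: open source and highly actionable. Content creators and developers can use it directly or build on it.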

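Microsoft MarkItDown: a document-to-Markdown tool

GitHub Trending

Microsoft's open-source Python tool converts many kinds of files and office documents (such as Word, Excel, and PowerPoint) to Markdown.

Why it matters: open source and practical, suited to developers who need to batch-process documents or preprocess data.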

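Anthropic details how it contains Claude agents across its products

Anthropic Engineering

Anthropic describes the engineering measures it uses across claude.ai, Claude Code, and Cowork to limit the potential blast radius of AI agents and keep them safe and controllable.

Why it matters: containment is a key engineering challenge in deploying agents. This post offers first-hand experience that AI engineering teams can apply directly.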

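ChatGPT for Google Sheets add-on poses a data-leak risk

Hacker News

Security research shows that the ChatGPT for Google Sheets add-on may be steered by malicious instructions into exfiltrating spreadsheet data, raising concerns about the security of AI add-ons.

Why it matters: a security warning. Users of the add-on should assess their exposure promptly and take protective measures.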

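Beginner's guide to PyTorch Profiler published

Hugging Face Blog

Hugging Face has published an introductory PyTorch Profiler tutorial that shows developers how to use torch.profiler to analyze and optimize model performance.

Why it matters: a practical tutorial that helps PyTorch developers get started with performance profiling and make model training more efficient.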

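Boston Children's Hospital uses AI to help diagnose more than 40 rare disease cases

OpenAI News

Boston Children's Hospital uses OpenAI technology to improve patient care, reduce operational burden, and help diagnose more than 40 rare disease cases.

Why it matters: a real-world clinical deployment that shows the value of AI in rare-disease diagnosis and serves as a reference point for medical AI.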

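VLM spatial perception: 3D understanding or statistical shortcuts?

HuggingFace Trending Papers

The study examines how vision-language models perform on spatial reasoning tasks and asks whether they have structured 3D understanding or merely rely on statistical shortcuts.

Why it matters: worth reading for researchers interested in the internal mechanisms of VLMs. It offers a critical, academic perspective.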

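Pope issues AI encyclical, stressing that "technology is never neutral"

MIT Tech Review AI

Pope Leo XIV's new encyclical Magnifica Humanitas ("Magnificent Humanity") states that "technology is never neutral". The article argues that it offers a template for individuals to meet the AI moment.

Why it matters: a societal view of AI, suited to readers who follow AI ethics and public policy.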

harry0703/MoneyPrinterTurbo

Python · ★ 73,979 · 🍴 10,561 · 📈 1,937 stars today

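Generate short videos with one click using AI LLM.

Summary: Generates high-definition short videos in one click with large AI models. The user enters a topic or script and quickly gets a short video with voice-over, subtitles, and background music. Suited to content creators and marketers who need to produce video quickly.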

microsoft/markitdown

Python · ★ 134,743 · 🍴 9,215 · 📈 2,759 stars today

Python tool for converting files and office documents to Markdown.

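Summary: Microsoft's open-source Python tool converts Office documents (such as Word, Excel, and PowerPoint) and many other file formats to Markdown, making it easier for developers to extract document content for further processing or to feed a knowledge base.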

D4Vinci/Scrapling

Python · ★ 56,524 · 🍴 5,478 · 📈 639 stars today

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

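Summary: An adaptive web-scraping framework that covers everything from a single request to a full-site crawl and automatically handles anti-bot measures, page parsing, and similar complications. Suited to developers who need reliable scraping of dynamic or static pages.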

nesquena/hermes-webui

Python · ★ 9,882 · 🍴 1,360 · 📈 320 stars today

Hermes WebUI: The best way to use Hermes Agent from the web or from your phone!

Summary: A web UI for Hermes Agent, billed as the best way to use the agent from a browser or a phone. Users interact with the AI and run tasks through the interface.

EveryInc/compound-engineering-plugin

TypeScript · ★ 18,680 · 🍴 1,408 · 📈 243 stars today

Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more

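Summary: The official Compound Engineering plugin for AI coding tools such as Claude Code, Codex, and Cursor. It extends agentic coding systems and suits developers who need customized AI development workflows.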

github/docs

TypeScript · ★ 19,696 · 🍴 67,285 · 📈 20 stars today

The open-source repo for docs.github.com

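Summary: The open-source repository behind GitHub's official documentation, containing platform guides, API references, and best practices. Anyone can contribute or give feedback, and it is the authoritative source for developers and companies learning GitHub's features.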

OpenBMB/VoxCPM

Python · ★ 23,417 · 🍴 2,710 · 📈 639 stars today

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

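Summary: VoxCPM2 is a multilingual, tokenizer-free text-to-speech model that supports creative voice design and high-fidelity voice cloning, generating speech without a discrete tokenizer stage. Suited to voice assistants, audio content production, and personalized voice applications.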

revfactory/harness

HTML · ★ 4,560 · 🍴 643 · 📈 318 stars today

A meta-skill that designs domain-specific agent teams, defines specialized agents, and generates the skills they use.

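Summary: A meta-skill framework that designs domain-specific AI agent teams, defines specialized agents, and generates the skills they use, helping to build modular, reusable multi-agent systems.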

FareedKhan-dev/train-llm-from-scratch

Jupyter Notebook · ★ 2,887 · 🍴 435 · 📈 627 stars today

A straightforward method for training your LLM, from downloading data to generating text.

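Summary: A complete LLM training tutorial, from downloading data to generating text, that gives developers a straightforward way to train their own language model from scratch. Suited to AI learners and researchers who want hands-on practice.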

supermemoryai/supermemory

TypeScript · ★ 23,276 · 🍴 2,100 · 📈 236 stars today

Memory engine and app that is extremely fast, scalable. The Memory API for the AI era.

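Summary: A memory engine and app for the AI era, offering a fast, scalable memory API that helps AI applications store and retrieve user context over the long term. Suited to scenarios such as intelligent assistants and personalized recommendations.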

Crosstalk-Solutions/project-nomad

TypeScript · ★ 27,678 · 🍴 2,709 · 📈 372 stars today

Project N.O.M.A.D, is a self-contained, offline survival computer packed with critical tools, knowledge, and AI to keep you informed and empowered—anytime, anywhere.

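Summary: A self-contained, offline survival computer that bundles critical tools, knowledge, and AI and runs without a network connection. Suited to emergencies, outdoor expeditions, and offline work environments.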

anthropics/claude-code

Python · ★ 128,847 · 🍴 20,994 · 📈 490 stars today

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.

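Summary: Anthropic's agentic coding tool for the terminal. It understands your codebase and helps you execute routine tasks, explain complex code, and handle git workflows through natural language commands.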

nicobailon/pi-subagents

TypeScript · ★ 1,820 · 🍴 254 · 📈 59 stars today

Pi extension for async subagent delegation with truncation, artifacts, and session sharing

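Summary: An extension for Pi that supports asynchronous subagent delegation, with truncation, artifacts, and session sharing. Suited to AI application developers who need to manage many cooperating subtasks efficiently.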

emmabostian/developer-portfolios

Python · ★ 23,320 · 🍴 4,606 · 📈 67 stars today

A list of developer portfolios for your inspiration

Summary: A curated list of developer portfolios for design inspiration and code reference, useful when job hunting or building a personal brand.

codecrafters-io/build-your-own-x

Markdown · ★ 509,306 · 🍴 48,304 · 📈 1,112 stars today

Master programming by recreating your favorite technologies from scratch.

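Summary: A classic collection of guides for mastering programming by rebuilding popular technologies (such as databases, Git, and Docker) from scratch. Suited to developers who want to understand the underlying principles.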

Why Far Looks Up: Probing Spatial Representation in Vision-Language Models

👍 38

Vision-language models (VLMs) achieve strong performance on spatial reasoning benchmarks, yet it remains unclear whether this reflects structured 3D understanding or reliance on statistical shortcuts in natural images. We introduce a representation-level analysis framework that constructs minimal co

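Summary: The authors introduce a representation-level analysis framework to probe whether vision-language models have structured 3D spatial understanding or rely on statistical shortcuts in natural images.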

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

👍 7

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment, leaving motion understanding to downstream policies. We introdu

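Summary: Proposes DynaFLIP, which uses tri-modal, dynamics-guided representations to improve how robot manipulation pipelines perceive the action-relevant aspects of a scene, addressing the lack of motion understanding in existing visual encoders.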

Tiny but Trusted: Efficient Vision-Language Reasoning for Time-Series Anomaly Detection

👍 0

Recent advances in Vision-Language Models (VLMs) have achieved impressive performance across many tasks, yet prior studies report unsatisfactory performance when applying large language or multimodal models to finding abnormal patterns in sequential data. Public anomaly detection benchmarks typicall

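Summary: Proposes an efficient vision-language reasoning method for time-series anomaly detection, targeting the poor performance that large models have shown at finding abnormal patterns in sequential data.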

Reducing Political Manipulation with Consistency Training

👍 0

Large language models (LLMs) exhibit systematic political bias across a variety of sensitive contexts. We find that LLMs handle counterpart topics from opposing political sides asymmetrically. We refer to this phenomenon as covert political bias and identify 7 categories of techniques through which

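Summary: Finds that LLMs handle counterpart topics from opposing political sides asymmetrically, a phenomenon the authors call covert political bias, and proposes consistency training to reduce it.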

REPOT: Recoverable Program-of-Thought via Checkpoint Repair

👍 6

One-shot Program-of-Thought (PoT) emits a Python program that prints a primitive-action plan; a single invalid action silently invalidates the trajectory. We introduce RePoT (Recoverable PoT): a deterministic verified replay that walks the plan through the environment to its first invalid transition

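Summary: Proposes RePoT, which makes Program-of-Thought reasoning recoverable through checkpoint repair, so that a single invalid action no longer invalidates the whole trajectory.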
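
The verified-replay step in the abstract above can be illustrated with a toy example. The sketch below is not the paper's implementation: the grid world, the action names, and the `replay_to_first_invalid` helper are all hypothetical, and the repair step is left out because the truncated abstract does not describe it. It only shows the idea of deterministically walking a plan through an environment and stopping at the first invalid transition, keeping the last valid state as a checkpoint.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical toy environment: a 3x3 grid with one blocked cell.
GRID_SIZE = 3
BLOCKED = {(1, 1)}
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

State = Tuple[int, int]


@dataclass
class ReplayResult:
    checkpoint: State            # last state reached through valid transitions
    valid_prefix: List[str]      # actions that replayed successfully
    failed_index: Optional[int]  # index of the first invalid action, or None


def step(state: State, action: str) -> Optional[State]:
    """Return the next state, or None if the transition is invalid."""
    if action not in MOVES:
        return None
    dx, dy = MOVES[action]
    nxt = (state[0] + dx, state[1] + dy)
    in_bounds = 0 <= nxt[0] < GRID_SIZE and 0 <= nxt[1] < GRID_SIZE
    return nxt if in_bounds and nxt not in BLOCKED else None


def replay_to_first_invalid(start: State, plan: List[str]) -> ReplayResult:
    """Walk the plan step by step and stop at the first invalid transition."""
    state = start
    for i, action in enumerate(plan):
        nxt = step(state, action)
        if nxt is None:
            # Keep the state before the bad action as the checkpoint.
            return ReplayResult(state, plan[:i], i)
        state = nxt
    return ReplayResult(state, list(plan), None)


result = replay_to_first_invalid((0, 0), ["right", "down", "down"])
print(result.failed_index, result.checkpoint)
```

A repair procedure would presumably resume from `checkpoint` and regenerate only the actions from `failed_index` onward, rather than discarding the whole plan.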

Xetrieval: Mechanistically Explaining Dense Retrieval

👍 17

Explaining why dense retrievers assign high relevance scores remains challenging because retrieval decisions are made through opaque high-dimensional embeddings. Existing explanations often focus on surface signals, such as lexical matches, token alignments, or post-hoc textual rationales, and thus

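Summary: Proposes Xetrieval, which addresses the opacity of dense retrievers' high-dimensional embeddings by explaining, at the mechanistic level, why a model assigns a high relevance score.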

CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval

👍 4

Tool retrieval over large API catalogs is a core bottleneck for LLM agents: user queries arrive in colloquial, often underspecified language, while the catalog uses technical API vocabulary that no fixed encoder can bridge on its own. The two dominant training approaches, contrastive encoder fine-tu

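Summary: Proposes CoHyDE, which iteratively co-trains an LLM rewriter and a dense encoder to bridge the semantic gap between colloquial user queries and technical API vocabulary.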

EarlyTom: Early Token Compression Completes Fast Video Understanding

👍 27

Video large language models (Video-LLMs) have demonstrated strong capabilities in video understanding tasks. However, their practical deployment is still hindered by the inefficiency introduced by processing massive amounts of visual tokens. Although recent approaches achieve extremely low token ret

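Summary: Proposes EarlyTom, which compresses tokens early to cut the cost of processing large numbers of visual tokens in video understanding, making Video-LLMs more efficient to deploy.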

When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems

👍 11

The design space of agentic AI inference spans two extremes: frontier large language models (LLMs), typically hosted in the cloud and offering strong performance across a wide range of tasks at substantially high cost, and more cost-efficient small language models (SLMs), which are amenable to on-de

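Summary: Analyzes the trade-offs between cloud-hosted large language models and on-device small language models in hybrid multi-agent systems, exploring the design space between performance and cost.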

Thinking Before Constraining: A Unified Decoding Framework for Large Language Models

👍 6

Natural generation allows Large Language Models (LLMs) to produce free-form responses with rich reasoning, yet the lack of structure makes outputs difficult to verify. Conversely, constrained decoding ensures standardized formats but can inadvertently restrict reasoning capabilities by imposing cons

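Summary: Proposes a unified decoding framework that balances free-form generation against constrained formats, preserving reasoning ability while keeping outputs verifiable.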

PhyGenHOI: Physically-Aware 4D Generation of Dynamic Human-Object Interactions

👍 8

We address the task of generating physically accurate and visually faithful 4D Human-Object Interaction (HOI). Given a static 3D human and target object represented as 3D Gaussian Splats (3DGS), our goal is to synthesize dynamic scenes where the human actively engages with the object through actions

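Summary: For 4D generation of dynamic human-object interactions, proposes PhyGenHOI, which synthesizes physically accurate, visually faithful interaction scenes from a static 3D human and a target object.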

UniSteer: Text-Guided Flow Matching in Activation Space for Versatile LLM Steering

👍 21

Activation-based control steers large language models (LLMs) by intervening on their internal representations during inference, and has emerged as an effective paradigm for controlling behaviors such as persona and style. However, existing methods often rely on fixed steering directions or task-spec

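Summary: Proposes UniSteer, which applies text-guided flow matching in activation space to steer LLM behaviors such as persona and style without relying on fixed steering directions.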

Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas

👍 1

We study two-level autoresearch for cooperation: an outer-loop AI agent autonomously redesigns the inner-loop pipeline of an LLM policy-synthesis system for multi-agent Sequential Social Dilemmas (SSDs). A researcher agent R (run as a coding agent) reads the inner-loop source code, edits system prom

Verifiable Rewards Beyond Math and Code: Lightweight Corpus-Grounded Process Supervision for Factual Question Answering

👍 6

Applying reinforcement learning to improve factual accuracy in knowledge-intensive question answering faces a reward design dilemma. Response-level rewards provide only coarse supervision and cannot distinguish correct from incorrect statements within a reasoning trace. Sentence-level alternatives o

Colored Noise Diffusion Sampling

👍 18

Diffusion models achieve state-of-the-art image synthesis, with their generative trajectories fundamentally exhibiting a spectral bias, resolving low-frequency global structures early and high-frequency fine details later. Conventional stochastic differential equation (SDE) solvers fail to account f

CausaLab: A Scalable Environment for Interactive Causal Discovery Toward AI Scientists

👍 13

We introduce CausaLab, a scalable environment for evaluating interactive causal discovery by LLM agents. Unlike prior evaluations, CausaLab evaluates both whether an agent can solve a problem using causal evidence and whether its answer is grounded in a faithful recovered causal mechanism. Each epis

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

👍 41

As video diffusion models (VDMs) advance toward world models, a key question arises: do they truly understand causality, or merely overfit to statistical temporal patterns? Existing benchmarks mostly rely on synthetic data, limiting real-world generalization due to the sim-to-real gap. We present Yo

PhoneWorld: Scaling Phone-Use Agent Environments

👍 1

A central bottleneck for phone-use agents is that controllable, reproducible environments covering real mobile behavior are hard to build at scale. Existing mobile-agent benchmarks have made important progress on evaluation, but they do not by themselves provide a scalable way to construct many new

WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction

👍 7

Multimodal large language models are increasingly deployed as long-horizon agents, where memory must do more than recall: it must track an evolving world, revise what has gone stale, and surface the right evidence at decision time. Existing benchmarks measure recall over static dialogue, collapse me

PRISM: A Multi-Dimensional Benchmark for Evaluating LLM Peer Reviewers

👍 10

The rapid growth in submissions to machine learning venues has strained the scientific peer-review system and intensified interest in LLM-based automated peer reviewers. However, how good these systems are actually, especially compared to human reviewers at catching scientific gaps, remains poorly u

Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning

👍 21

Equipping large language models with explicit skills has emerged as a promising paradigm for enabling autonomous agents to solve complex tasks. Agent skills can be inherently divided into general skills for broad cognitive transfer and task-specific skills for dynamic execution. However, existing sk

PANDO: Efficient Multimodal AI Agents via Online Skill Distillation

👍 5

Recent advances in multimodal web agents often rely on increased inference-time computation, including rollout search, verifier passes, offline skill discovery, and specialist model stacks. This raises a central question: can a web agent become more efficient as it accumulates experience, rather tha

Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases

👍 2

Reinforcement Learning from Human Feedback (RLHF) is the standard method to align Large Language Models (LLMs) with human preferences. In this work, we introduce alignment tampering, a potential vulnerability where the LLM undergoing alignment influences the preference dataset, causing RLHF to ampli

Learning A Unified Risk Map for Autonomous Driving in Partially Observable Environments

👍 5

Occlusion-aware prediction remains a critical challenge in autonomous driving due to the inherent uncertainty of unobserved regions. Existing approaches either overestimate risk based on reachable states or struggle to predict accurate trajectories under high occlusion uncertainty. To address these

Reflective Prompt Tuning through Language Model Function-Calling

👍 4

Large language models (LLMs) have become increasingly capable of following instructions and complex reasoning, making prompting a flexible interface for adapting models without parameter updates. Yet prompt design remains labor-intensive and highly sensitive to formatting, phrasing, and instruction

ORACLE: Anticipating Scams from Partial Trajectories in Streaming App Usage

👍 0

Smartphone scams are increasingly prevalent and typically manifest as multi-stage, cross-application processes with gradually emerging intent. Effective intervention thus requires anticipating scams before the intent becomes explicit. This is inherently challenging, as decisions must rely on partial

How to Build a Software Factory with Claude Code That Ships Features While You Sleep

@sairahul1 · 106.0K followers · 2.8M views · 1.5K likes · 205 reposts

I thought I was using AI to code. I was actually just typing faster. Here is the difference — and the 7-agent system that changed everything. Save this. It will save you months. THE PROBLEM NOBODY

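Summary: Developer saiRahul describes a 7-agent software factory built with Claude Code that automates requirements analysis, coding, testing, and deployment and ships features around the clock. The core idea is to move AI from assisted typing to independent end-to-end work.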

How to Build a Claude Research Agent That Reads the Internet Every Morning and Briefs You in 5 Mins

@cyrilXBT · 181.7K followers · 127.8K views · 533 likes · 80 reposts

Most people start their day the same way. They open Twitter and spend 20 minutes scrolling through noise looking for the three things that actually matter. They open their email and get pulled into

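Summary: cyrilXBT explains how to build a Claude research agent that scans the internet every morning and condenses the key information into a briefing readable in 5 minutes, sparing users the noise of Twitter and email and about 20 minutes of manual scrolling.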

How I Use Cursor

@poteto · 26.6K followers · 86.5K views · 540 likes · 48 reposts

I need to get something off my chest. Before my interview @cursor_ai, I had never actually used Cursor. At Meta, Claude Code was explosively taking off. I even paid for a personal $200 a month plan

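Summary: poteto candidly admits that he had never used Cursor before interviewing there. At Meta, Claude Code was taking off, and he was even paying for a personal $200-a-month plan. He then shows how he uses Cursor in day-to-day coding, a first-hand and even-handed account.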

Step-By-Step LLM Engineering Projects (2026 Edition)

@TheAhmadOsman · 59.9K followers · 54.5K views · 512 likes · 65 reposts

At some point, reading about LLMs stops being enough. You need to build the stack yourself: Tokenizer first, then embeddings, position, attention, Transformer blocks, objectives, decoding, cache, long

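Summary: Ahmad Osman publishes a 2026 roadmap of hands-on LLM engineering projects that walks through the full stack, from tokenizer and embeddings to attention, Transformer blocks, decoding, and caching, and argues that after reading the theory you have to build it yourself to really understand it.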

THE DIFF THAT CHANGED EVERYTHING

@difflawb · 20.3K followers · 21.9K views · 1.1K likes · 389 reposts

How a 40-line shell script became infrastructure In August 2024, Andrej Karpathy — co-founder of OpenAI, former AI Director at Tesla — published something unexpectedly small. Not a paper. Not a model.

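Summary: A look back at something Andrej Karpathy published in August 2024: not a paper or a model but a 40-line shell script that became infrastructure, showing that small, well-crafted code can have outsized impact.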

The Start of the End: AI Replacement Has Begun

@ActionModelAI · 57.1K followers · 5.8K views · 505 likes · 344 reposts

We are witnessing the beginning of the biggest economic shift in modern history. And most people still don’t realize it. AI replacement is no longer some distant sci-fi prediction. It has started.

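Summary: The author argues that AI replacing jobs is no longer science fiction but the start of the biggest economic shift in modern history, and warns that most people have not yet realized it.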

No content from this source today.

Ship your first Managed Agent

Summary: Claude has published a guide to deploying your first Managed Agent, lowering the barrier to building AI agent applications.

[AINews] Founders and Forward Deployed Engineers

a quiet day lets us highlight the new AIE WF focuses

Summary: On a quiet news day, Latent Space highlights the new focus areas of AIE WF: founders and forward deployed engineers.

Boston Children’s uses AI to unlock new diagnoses

Boston Children’s Hospital uses OpenAI technology to improve patient care, reduce operational burden, and help diagnose more than 40 rare disease cases.

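Summary: Boston Children's Hospital uses OpenAI technology to improve patient care, reduce operational burden, and help diagnose more than 40 rare disease cases.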

How Braintrust turns customer requests into code with Codex

How Braintrust engineers use Codex with GPT-5.5 to run experiments and code faster.

Summary: Braintrust engineers use Codex with GPT-5.5 to turn customer requests into code and to run experiments and develop faster.

How the Pope’s Magnifica Humanitas offers a template for individuals to meet the AI moment

Pope Leo XIV’s new encyclical on artificial intelligence includes a statement that warrants serious attention from technologists and policymakers: “Technology is never neutral.” Magnifica Humanitas (“Magnificent Humanity”) is a clarion call to all people to act with courage and solidarity as we ente

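Summary: Pope Leo XIV's encyclical on artificial intelligence, Magnifica Humanitas ("Magnificent Humanity"), states that "technology is never neutral", a line the article says deserves serious attention from technologists and policymakers, and calls on all people to act with courage and solidarity.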

not much happened today

**Anthropic** rolled out **Claude Opus 4.8**, which shows incremental improvements but mixed benchmark results, including better cooperation and coding behavior but some regressions in document parsing. Platform updates include mid-conversation system instructions enhancing long agent sessions, thou

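Summary: Anthropic rolled out Claude Opus 4.8 with incremental improvements, including better cooperation and coding behavior, but with some regressions in document parsing and mixed benchmark results. Platform updates include mid-conversation system instructions, which help long agent sessions.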

Strengthening societal resilience with Rosalind Biodefense

OpenAI launches Rosalind Biodefense, expanding trusted access to GPT-Rosalind for vetted developers and U.S. government partners advancing biodefense, public health, and pandemic preparedness through frontier AI.

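Summary: OpenAI launches Rosalind Biodefense, expanding trusted access to GPT-Rosalind for vetted developers and U.S. government partners working on biodefense, public health, and pandemic preparedness.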

A shared playbook for trustworthy third party evaluations

OpenAI shares guidance on third-party AI evaluations, covering how to assess model capabilities, safeguards, and validity for frontier systems.

Summary: OpenAI shares guidance on third-party AI evaluations, covering how to assess the capabilities, safeguards, and validity of frontier systems.

How Endava builds an agentic organization with Codex

Learn how Endava uses Codex to build an agentic organization, accelerating software delivery and reducing requirements analysis from weeks to hours.

Summary: Endava uses Codex to build an agentic organization, accelerating software delivery and cutting requirements analysis from weeks to hours.

The AI Hype Index: AI gets booed in graduation season

It is one thing to say AI will change the world. It is another to expect the class of 2026 to applaud it. In fact, when former Google CEO Eric Schmidt told University of Arizona graduates that their task is to help shape AI, he was met with a resounding chorus of boos. “I can…

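Summary: This edition of the AI Hype Index notes that former Google CEO Eric Schmidt was booed by University of Arizona graduates when he told them their task is to help shape AI, a sign of public pushback against AI hype during graduation season.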

[AINews] Cognition raises $1B in $26B Series D

coding is an uncapped TAM market

Summary: Cognition raises $1B in a Series D at a $26B valuation, arguing that coding is an uncapped market.

Anthropic raises $65B in Series H at a $965B post-money valuation, releases Opus 4.8 and Dynamic Workflows

**Anthropic** announced a massive **$65B Series H financing** at a **$965B valuation**, led by **Altimeter, Dragoneer, Greenoaks, and Sequoia**, with run-rate revenue surpassing **$47B**. They launched **Claude Opus 4.8**, an update to Opus 4.7 featuring "sharper judgment," "more honesty," and longe

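Summary: Anthropic closes a $65B Series H at a $965B post-money valuation, led by Altimeter, Dragoneer, Greenoaks, and Sequoia, with run-rate revenue surpassing $47B. It also released Claude Opus 4.8, described as having "sharper judgment" and "more honesty".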

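I built a detection site to see who is actually up to something

Folks, let's see who is up to something. The test site will never collect anyone's key information; I just want to see who has problems. Leaderboard (domain names obfuscated). To say it once more: it will never charge, it will not sell API access, and it is open to oversight. The public-benefit service has been shut down and is now for my own use only. 28 posts, 13 participants.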

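Cherry Studio privacy policy update

To all Cherry Studio users: early versions of Cherry Studio lacked a complete, clear privacy policy, and privacy-related switches stored their data inconsistently across versions, which could leave some users with abnormal privacy-setting states after upgrading or switching versions. Starting with Cherry Studio v1.9.8, we are formally adding a privacy agreement and clearer explanations of the privacy settings, so that users can see more transparently which anonymous runtime information the software collects, which sensitive data it does not collect, and how to manage the relevant switches themselves. Under the new privacy agreement, Cherry Studio collects only anonymized basic runtime information and product-improvement information, to the extent necessary, such as software version, aggregate feature usage, feature activity and frequency, error logs and crash…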

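星辰AI (Xingchen AI): a security reminder about keys

Today a forum member accidentally uploaded their key to GitHub, and a friend of mine spotted it. I have already helped disable the key. A reminder to everyone: keep your keys safe, and avoid creating keys with unlimited quota, so that any loss stays as small as possible. newapi does not actually support looking up a user by key; I contacted the author of newapitools directly, who urgently added a key-to-user reverse lookup. Thanks to the newapitools author for that! Leaking your key is like throwing money onto the street, so please be careful! Good morning, good afternoon, and good night to all. 17 posts, 16 participants.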
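
One way to act on the reminder above is to scan files for key-like strings before pushing them to a public repository. The sketch below is a minimal illustration, not a complete secret scanner: the `sk-` prefix pattern, the minimum length of 20 characters, and the `find_key_like_strings` helper are assumptions chosen for the example, and real keys come in many other formats.

```python
import re

# Assumed pattern: "sk-" followed by at least 20 letters, digits, "_" or "-".
KEY_PATTERN = re.compile(r"\bsk-[A-Za-z0-9_-]{20,}")


def find_key_like_strings(text):
    """Return (line_number, redacted_match) pairs for key-like strings."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in KEY_PATTERN.finditer(line):
            token = match.group(0)
            # Redact the middle so the report itself does not leak the key.
            findings.append((lineno, token[:6] + "..." + token[-2:]))
    return findings


# Made-up sample content; the "key" here is a placeholder, not a real one.
sample = 'API_KEY = "sk-' + "a" * 24 + '"\nname = "sk-short"\n'
for lineno, redacted in find_key_like_strings(sample):
    print(f"line {lineno}: possible key {redacted}")
```

A check like this could run as a pre-commit step. It complements, rather than replaces, the advice above to cap each key's quota.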

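A deep dive: one article that explains OpenAI's black-box risk-control mechanism

Foreword: I have used GPT for more than two years and have subscribed to Plus, Business, Pro 5x, and Pro 20x. I have tried at least six different IPs, tinkered countless times, and been flagged by risk control often enough to have plenty of experience. I hope this post offers some useful thinking about OpenAI's risk control and cuts down on the repetitive, scattered data that leaves people dazzled and overloaded and only adds noise. Please credit the source when reposting. Risk-control analysis: what might determine whether you get flagged? It is simple; there are two kinds of signal, behavioral features and background features. The former are how a user behaves while using the service; the latter are non-behavioral background information about the user. Some examples. Behavioral features: heavy concurrency in a short period, asking about sensitive or policy-violating information, heavy simultaneous use across multiple devices, excessive complexity in a single conversation (often seen with Pro Ex…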

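Outrageous! Exposing the GPT Plus "daily-disposable" account scam

Anyone who has bought GPT Plus, take note: this is the GPT Plus daily-disposable account scam. OpenAI bans accounts in only one way, by deleting the account outright; there is no situation in which the account works but reverse proxying does not. With codex2api plus authorization, the account can always be reverse-proxied. Among the so-called disposable accounts sold through various channels, the less unscrupulous sellers give you a refresh token and the worse ones give you an access token (though since OpenAI tightened risk control, the access-token or session-token route is basically closed). Then, two days later, the seller logs in to the ChatGPT web client, or refreshes the refresh token directly through the API, and the original refresh token stops working, but in fact…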

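My three-day vibe-coded project got called AI slop, and now I'm a bit lost about vibe coding…

I recently saw the following two projects on L 站. It reminded me that I had long wanted to catch up on 《魔法禁书目录》 (A Certain Magical Index) but never did because the novels are so long. So I decided to deploy a web service on my own server to read novels. [Open-source promotion] Speed-reading for web novels: "dehydrates" long novels and saves 70% of reading time. Development and tuning. This post uses the community's open-source promotion and meets the promotion requirements. I declare and follow the community's requirements below. My post is tagged open-source promotion: yes. My open-source project is fully open source, with no closed parts: yes. My open-source project links to and acknowledges the LINUX DO community: yes. Screenshots of the AI-generated or AI-polished parts of my project introduction have been posted: yes. I commit that the choices above are permanently valid and accept oversight from the community and its members: yes. The project introduction follows; AI-generated,…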

Dav2d

373 points · 129 comments