Daily AI Briefing

2026-06-09 (content retrieved 06/09 06:19)

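OpenAI submits draft S-1 to the SEC, moving toward an IPO

OpenAI News

OpenAI has confirmed that it confidentially submitted a draft S-1 registration statement to the U.S. Securities and Exchange Commission (SEC), but has not yet determined the timing of further action. The filing is an important step toward an IPO. (Multiple reports)

Why it matters: OpenAI's path to a public listing is a key event for the future shape of the AI industry and is worth watching.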

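Anthropic publishes research on agents in biology

Anthropic Research

Anthropic has published new research, "Paving the way for agents in biology", exploring the prospects and implementation paths for AI agents in biological research.

Why it matters: Shows how a top AI lab is positioning itself in biology; a useful reference for people working across the two fields.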

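last30days-skill: a cross-platform AI information aggregation tool

GitHub Trending

An AI agent skill that automatically researches any topic across Reddit, X, YouTube, HN, and other platforms, then synthesizes a grounded summary.

Why it matters: An open-source tool you can use right away, suited to information research or content monitoring.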

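Open-source community backs OpenEnv for agentic RL

Hugging Face Blog

The Hugging Face blog announces that the open-source community is supporting the OpenEnv framework for research and development in agentic reinforcement learning (agentic RL).

Why it matters: A new direction in the RL community, relevant to developers working on reinforcement learning and agent systems.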

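TurboVec: a Rust vector index library built on TurboQuant

GitHub Trending

TurboVec is a vector index library written in Rust with Python bindings. It is built on the TurboQuant quantization technique and aims to make vector retrieval more efficient.

Why it matters: A high-performance vector index library that developers of search and recommendation systems can use directly or integrate.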

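Claude Code fixes a permission dialog crash in agent teams

Claude Code Changelog

Claude Code v2.1.114 has been released. It fixes a crash in the permission dialog that occurred when an agent teams teammate requested tool permission.

Why it matters: A practical bug fix that directly helps developers using Claude Code's team features.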

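Cognition releases the FrontierCode tool

Hacker News

Cognition (the developer of Devin) introduced its new tool, FrontierCode, on Hacker News. The tool focuses on frontier code generation capabilities.

Why it matters: Tracks the latest progress among leading AI coding assistants; worth following for developers.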

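Meta AI customer support flaw leads to stolen Instagram accounts

MIT Tech Review AI

MIT Tech Review reports that attackers exploited a flaw in Meta's AI customer support agent: with simple instructions, they had it link Instagram accounts to their own email addresses and then took over the accounts.

Why it matters: An AI security incident that directly threatens user assets, and a reminder for developers to take agent security design seriously.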

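AI-generated code at scale: an OpenAI team ships 1 million lines with none written by hand

X posts (AttentionVC)

According to a post on X, in February 2026 a small OpenAI team shipped 1 million lines of production code generated by AI agents, with not a single line written by hand.

Why it matters: Shows how far AI coding productivity can be pushed, with strong implications for engineering managers.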

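OpenAI publishes its plan for AGI that benefits everyone

OpenAI News

OpenAI has published a new blog post, "Built to benefit everyone: our plan", setting out its vision for broad access, safety, and shared prosperity.

Why it matters: A statement of OpenAI's official position and strategic direction, useful as a reference for industry watchers.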

mvanhorn/last30days-skill

Python · ★ 34,336 · 🍴 2,811 · 📈 3,558 stars today

AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary

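Summary: A skill module for AI agents that automatically searches Reddit, X, YouTube, HN, Polymarket, and the wider web for a topic, aggregates the sources, and produces a grounded summary. Suited to public-opinion monitoring and quick research.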

RyanCodrai/turbovec

Python · ★ 8,729 · 🍴 807 · 📈 1,730 stars today

A vector index built on TurboQuant, written in Rust with Python bindings

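Summary: A high-performance vector index built on TurboQuant, written in Rust with Python bindings. Suited to large-scale approximate nearest-neighbor search, balancing speed and memory efficiency.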

google/skills

Python · ★ 12,342 · 🍴 966 · 📈 481 stars today

Agent Skills for Google products and technologies

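Summary: Google's official collection of agent skills, wrapping the API capabilities of Google products and technologies so that developers can build autonomous agents on the Google ecosystem.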

refactoringhq/tolaria

TypeScript · ★ 13,527 · 🍴 951 · 📈 649 stars today

Desktop app to manage markdown knowledge bases

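Summary: A desktop knowledge-management app focused on Markdown knowledge bases, offering local management and browsing. Suited to personal notes and document organization.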

Panniantong/Agent-Reach

Python · ★ 24,020 · 🍴 2,026 · 📈 796 stars today

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.

Summary: A CLI tool that lets AI agents read and search the web, covering Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu, and other platforms, with zero API fees.

danielmiessler/Personal_AI_Infrastructure

TypeScript · ★ 15,393 · 🍴 2,158 · 📈 121 stars today

Agentic AI Infrastructure for magnifying HUMAN capabilities.

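Summary: A personal AI infrastructure framework focused on agentic systems that augment human capabilities, with an extensible architecture for building and managing personalized AI workflows.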

santifer/career-ops

JavaScript · ★ 50,439 · 🍴 10,330 · 📈 477 stars today

AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.

Summary: An AI job-search system built on Claude Code, with 14 skill modes, a Go dashboard, PDF generation, and batch processing, to help users automate their job search.

phuryn/pm-skills

★ 12,604 · 🍴 1,494 · 📈 112 stars today

PM Skills Marketplace: 100+ agentic skills, commands, and plugins — from discovery to strategy, execution, launch, and growth.

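Summary: A PM skills marketplace offering 100+ agentic skills, commands, and plugins that cover the full path from discovery and strategy to execution, launch, and growth, to help PMs automate more of their work.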

openai/plugins

JavaScript · ★ 2,299 · 🍴 289 · 📈 296 stars today

OpenAI Plugins

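Summary: OpenAI's official plugin collection, which extends models such as ChatGPT by connecting them to third-party services and tools for use in specific scenarios.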

Andyyyy64/whichllm

Python · ★ 3,406 · 🍴 200 · 📈 103 stars today

Find the local LLM that actually runs and performs best on your hardware. Ranked by real, recency-aware benchmarks, not parameter count. One command, run it instantly.

Summary: A one-command tool that finds the local LLM that runs and performs best on your hardware, ranked by real benchmarks rather than parameter count.

MemPalace/mempalace

Python · ★ 54,888 · 🍴 7,158 · 📈 237 stars today

The best-benchmarked open-source AI memory system. And it's free.

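Summary: An open-source AI memory system that performs well on benchmarks and is free to use. It gives agents persistent, retrievable contextual memory to improve coherence in long-running conversations.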

roboflow/supervision

Python · ★ 42,304 · 🍴 3,781 · 📈 1,140 stars today

We write your reusable computer vision tools. 💜

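Summary: A library of reusable computer vision tools that packages common CV pipeline modules such as object detection, tracking, and annotation, simplifying the path from model output to visual analysis.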

CopilotKit/CopilotKit

TypeScript · ★ 34,095 · 🍴 4,303 · 📈 398 stars today

The Frontend Stack for Agents & Generative UI. React, Angular, Mobile, Slack, and more. Makers of the AG-UI Protocol

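Summary: A frontend framework for agents and generative UI, supporting React, Angular, mobile, Slack, and more. From the makers of the AG-UI protocol, it helps developers quickly build interfaces with embedded AI capabilities.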

TapXWorld/ChinaTextbook

Roff · ★ 72,953 · 🍴 16,330 · 📈 593 stars today

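All PDF textbooks for primary school, middle school, high school, and university.

Summary: A public repository of PDF textbooks covering all subjects for Chinese primary schools, middle schools, high schools, and universities, for study and reference.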

luongnv89/claude-howto

Python · ★ 35,724 · 🍴 4,342 · 📈 393 stars today

A visual, example-driven guide to Claude Code — from basic concepts to advanced agents, with copy-paste templates that bring immediate value.

Summary: A visual, hands-on guide to Claude Code, from basic concepts to advanced agent usage, with copy-paste templates that are useful right away.

aaif-goose/goose

Rust · ★ 48,063 · 🍴 5,062 · 📈 699 stars today

an open source, extensible AI agent that goes beyond code suggestions - install, execute, edit, and test with any LLM

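Summary: An open-source, extensible AI agent that goes beyond code suggestions: it can install, execute, edit, and test, works with any LLM, and suits automated development and operations.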

Empirical Study on the Characteristics and Evolution of AI-usage in GitHub Repositories: Evidence from Code Comments

👍 0

Developers increasingly use AI tools such as ChatGPT, Copilot, and Claude in everyday software workflows, but prior studies often evaluate LLM outputs in isolation rather than examining how developers adapt them in real projects. We analyze 35,361 GitHub code comments that explicitly reference AI us

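Summary: A study analyzes 35,361 GitHub code comments to examine how developers adapt the output of AI tools such as ChatGPT, Copilot, and Claude in real projects, rather than evaluating LLM outputs in isolation.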

ECI_{sem}: Semantic Residual Effective Contrastive Information for Evaluating Hard Negatives

👍 1

Hard-negative source selection for dense retrieval is usually decided only after fine-tuning and downstream evaluation. We propose ECI_{sem}, a semantic residual variant of Effective Contrastive Information (ECI) that ranks candidate negative sources using frozen target-encoder embeddings. ECI_{sem}

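Summary: The paper proposes ECI_{sem}, a semantic residual variant of Effective Contrastive Information (ECI) that ranks candidate hard-negative sources using frozen target-encoder embeddings, for hard-negative selection in dense retrieval.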

Towards Retrieving Interaction Spaces for Agentic Search

👍 1

Retrieval for search agents is still inherited from non-agentic information retrieval: a retriever ranks the corpus and the agent reads a small set of returned documents. Recent direct corpus interaction (DCI) work shows that agents can instead interact with the raw corpus through shell tools such a

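Summary: Proposes a new direction for retrieval in agentic search, noting that agents can interact directly with the raw corpus rather than only reading returned documents, which widens the interaction space.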

AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization

👍 24

Despite being a pivotal frontier, interactive world modeling remains underexplored in terms of the versatile controllability required by practical scenarios. To bridge this gap, we present AnchorWorld, a framework that advances egocentric simulation through enhanced interaction integrity and a flexi

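Summary: Presents AnchorWorld, a framework that advances egocentric world simulation through enhanced interaction integrity, with view-based evolution customization.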

MMAE: A Massive Multitask Audio Editing Benchmark

👍 39

We introduce MMAE, a Massive Multitask Audio Editing benchmark, serving as the first comprehensive evaluation testbed designed for general-purpose instruction-based audio editing. Spurred by the shift toward intelligent creation, interactive editing has rapidly expanded from visual domains, pioneere

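Summary: Introduces MMAE, the first comprehensive, massive multitask benchmark for general-purpose instruction-based audio editing, as interactive editing expands from visual domains to audio.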

Stream3D-VLM: Online 3D Spatial Understanding with Incremental Geometry Priors

👍 4

Despite advances in 3D scene understanding, existing 3D Large Multimodal Models operate in offline settings, requiring complete scene observations or predefined video clips. In this paper, we present an online 3D vision-language model that enables real-time spatial understanding from streaming video

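Summary: Presents Stream3D-VLM, an online 3D vision-language model that supports real-time spatial understanding from streaming video, without requiring complete scene observations or predefined video clips.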

Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

👍 69

Large language models exhibit impressive zero-shot capabilities across a wide range of downstream tasks. However, they struggle to function as off-the-shelf embedding models, leading to suboptimal performance on massive text embedding benchmarks. In this paper, we identify a potential cause underlyi

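Summary: The paper finds that an LLM's unembedding matrix can act as a feature lens for text embeddings, with the aim of improving how LLMs perform as off-the-shelf embedding models.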

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

👍 12

Video understanding is being rapidly transformed by multimodal large language models (MLLMs), as research moves from short clips to long, multimodal, and knowledge-intensive video scenarios. These scenarios require models to handle sparse evidence, long-range dependencies, multimodal alignment, and

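Summary: Multimodal large language models are moving video understanding from short clips to long, multimodal, knowledge-intensive scenarios, which require handling sparse evidence and long-range dependencies.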

Almieyar-Oryx-BloomBench: A Bilingual Multimodal Benchmark for Cognitively Informed Evaluation of Vision-Language Models

👍 4

Despite the rapid progress of Vision-Language Models (VLMs), the field lacks benchmarks that rigorously diagnose their true reasoning abilities and chart meaningful progress toward human-like multimodal intelligence. Most existing evaluations focus on piecemeal or disconnected tasks, obscuring criti

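Summary: Builds Almieyar-Oryx-BloomBench, a bilingual multimodal benchmark for cognitively informed, rigorous evaluation of the reasoning abilities of vision-language models.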

UnpredictaBench: A Benchmark for Evaluating Distributional Randomness in LLMs

👍 12

We introduce UnpredictaBench, an evaluation that tests the ability of large language models (LLMs) to capture true underlying distributions. As LLMs are increasingly used as substitutes for other entities (e.g., for humans in economic simulations), the tendency of many models to collapse towards a s

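Summary: Introduces UnpredictaBench, a benchmark that evaluates how well LLMs capture true underlying distributions, motivated by their use as substitutes for other entities, such as humans in economic simulations.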
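One way to picture "capturing an underlying distribution" is to compare the empirical distribution of repeated samples with a target distribution. The sketch below uses total variation distance for that comparison. This is an illustrative assumption only: the excerpt above does not say which metric UnpredictaBench uses, and the function name `total_variation` is hypothetical.

```python
from collections import Counter


def total_variation(samples, target):
    """Total variation distance between the empirical distribution of
    `samples` and `target`, a dict mapping outcome -> probability."""
    if not samples:
        raise ValueError("samples must be non-empty")
    counts = Counter(samples)
    n = len(samples)
    # Consider every outcome that appears in either distribution.
    outcomes = set(counts) | set(target)
    return 0.5 * sum(abs(counts[o] / n - target.get(o, 0.0)) for o in outcomes)


# A sampler that always answers "H" when asked for a fair coin flip
# has collapsed to one outcome and sits far from the target.
fair_coin = {"H": 0.5, "T": 0.5}
collapsed = total_variation(["H"] * 10, fair_coin)
balanced = total_variation(["H"] * 5 + ["T"] * 5, fair_coin)
```

A distance of 0 means the samples match the target exactly; larger values mean the sampler has drifted, for example by collapsing onto a single answer.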

Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation

👍 1

We study the transformation of autoregressive models (ARLMs) into diffusion language models (DLMs). Rather than pretraining from scratch, prior work replaces the causal attention in ARLMs with bidirectional attention and then trains the resulting model using a DLM objective. However, these approache

Summary: Studies a data-efficient way to convert autoregressive language models into diffusion language models, using on-policy distillation.

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

👍 1

Reasoning models produce long chain-of-thought traces that are costly to distill and encourage verbose student outputs. We study post-hoc compression of such traces before knowledge distillation. Two teachers, Qwen3.5-397B-A17B and gpt-oss-120B, generate about 283k correct traces each; two instructi

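Summary: Proposes Compress-Distill, which compresses the long chain-of-thought traces of reasoning models post hoc before knowledge distillation, to cut distillation cost and reduce verbose student outputs.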

Robots Need More than VLA and World Models

👍 20

Generalist robot intelligence is often framed as a policy-scaling problem: collect more robot demonstrations, train larger Vision-Language-Action (VLA) models, and expect broader generalisation. In this position paper, we argue that this framing is incomplete. The central bottleneck is not only poli

Physics in 2-Steps: Locking Motion Priors Before Visual Refinement Erases Them

👍 12

Image-to-Video diffusion models leverage input images to generate visually stunning content, yet frequently produce motion that violates physical laws. We reveal a surprising finding: a 2-step generation often exhibits better physical consistency than a 50-step output from the same model. Through sp

SoCRATES: Towards Reliable Automated Evaluation of Proactive LLM Mediation across Domains and Socio-cognitive Variations

👍 42

Evaluating LLM mediators remains challenging, as mediation unfolds as a real-time trajectory shaped by disputants' shifting emotions, intentions, and context. Existing testbeds rely on a few expert-authored domains, vary mainly strategic posture, and score every turn against every topic, introducing

LLM Explainability with Counterfactual Chains and Causal Graphs

👍 12

Causal graphs provide a high-level language for making mechanisms transparent. Recent work uses Large Language Models (LLMs) to recover causal graphs of external-world processes. Instead, in this paper, we use causal graphs to model LLM inference itself, providing stakeholders with a transparent vie

Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators

👍 12

While Vision-Language Models (VLMs) have shown strong visual reasoning capabilities, their spatial reasoning abilities remain largely constrained to the observed images and text-oriented chain-of-thought. They often struggle to infer unobserved layouts, maintain cross-view consistency, and reason fr

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

👍 1

Vision language models (VLMs) excel at many tasks but still struggle with spatial reasoning when critical information is not directly observable. Many such problems require imaginative perception: inferring what would be seen from an unseen viewpoint, tracing paths through occluded spaces, or integr

Reinforcement Learning from Rich Feedback with Distributional DAgger

👍 3

Reasoning models have advanced rapidly, but the dominant reinforcement learning from verifiable rewards (RLVR) recipe remains surprisingly narrow: sample many responses and reward each with a single bit indicating whether the final answer is correct. Yet many settings provide rich feedback, includin
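The "single bit" reward described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's code: the names `binary_reward` and `score_samples` are hypothetical, and treating the last non-empty line of a response as its final answer is an assumption made only for the example.

```python
def binary_reward(response: str, reference: str) -> int:
    """Return 1 if the final answer matches the reference, else 0.

    Assumption: the final answer is the last non-empty line of the response.
    """
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    return int(bool(lines) and lines[-1] == reference.strip())


def score_samples(responses, reference):
    """Score many sampled responses, one bit each."""
    return [binary_reward(r, reference) for r in responses]


rewards = score_samples(
    ["6 * 7 is computed step by step\n42", "a slip in the arithmetic\n41", ""],
    "42",
)
```

Each response collapses to one bit no matter how much intermediate work it contains, which is the narrowness the abstract points at.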

GENEB: Why Genomic Models Are Hard to Compare

👍 42

Progress in genomic foundation models is difficult to assess due to fragmented benchmarks, incompatible evaluation protocols, and task-specific reporting. As a result, claims of superiority or generality across models are often not directly comparable. We introduce GENEB, a large-scale diagnostic be

A Cookbook of 3D Vision: Data, Learning Paradigms, and Application

👍 2

3D vision has rapidly evolved, driven by increasingly diverse data representations, learning paradigms, and modeling strategies. Yet the field remains fragmented across representations and benchmarks, making it difficult to develop unified perspectives on efficiency, fidelity, and scalability. This

HarnessForge: Joint Harness and Policy Evolution for Adaptive Agent Systems

👍 4

LLM agents are increasingly expected to operate across heterogeneous task regimes that require distinct execution paradigms. This challenges fixed agent systems and motivates system-level meta-adaptation beyond isolated component updates. While existing works have adapted external harness or trained

Parametric Social Identity Injection and Diversification in Public Opinion Simulation

👍 1

Large language models (LLMs) have recently been adopted as synthetic agents for public opinion simulation, offering a promising alternative to costly and slow human surveys. Despite their scalability, current LLM-based simulation methods fail to capture social diversity, producing flattened inter-gr

Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation

👍 1

Automatic speech recognition (ASR) is a core component of human-computer interaction and an increasingly important front-end for LLM-based assistants and agents. However, most current ASR systems still follow a single-pass paradigm, which is poorly aligned with human communication, where misunderst

CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning

👍 1

Language models can use verifiable rewards to improve at a wide variety of reasoning tasks. However, both parametric (e.g. RLVR) and non-parametric (e.g. prompt optimization) approaches to doing so typically require hundreds of training samples and thousands of model rollouts, making them expensive

SIA: Self Improving AI with Harness & Weight Updates

👍 8

Humans are the bottleneck in building and improving AI. Both the models and the agents that wrap them are written, tuned, and corrected by people. The long-horizon goal of an AI that can figure out how to improve itself remains open. Two largely disjoint research lines attack this bottleneck. The ha

SPACENUM: Revisiting Spatial Numerical Understanding in VLMs

👍 5

Vision-Language Models (VLMs) are increasingly deployed in embodied environments, where they need to produce numerical outputs such as action magnitudes and spatial coordinates. Although these numbers appear meaningful, it remains unclear whether these numerical outputs are genuinely grounded in spatia

Context as Topology: Why Your Agent's Memory Forgets, and How Structure Escapes It

@elpresidank · 116 followers · 2.9M views · 543 likes · 35 reposts

Most AI agent memory is built on embeddings. And there's now a proof that this entire class of system is going to forget what you stored in it — and confidently make up things you never stored at all.

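Summary: Argues, citing a proof, that embedding-based agent memory systems inherently forget what was stored and confidently make up information that was never stored. Proposes using topological structure to address memory failure, challenging today's mainstream approach.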
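One simple mechanism behind "making up things you never stored" can be shown with a toy nearest-neighbor memory: top-1 retrieval always returns some stored item, even for a query unrelated to everything in memory. The sketch below is not the proof the thread refers to; the vectors, the `recall` helper, and the `min_score` threshold are all hypothetical and hand-picked for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(memory, query_vec, min_score=None):
    """Return the stored text closest to query_vec.

    With min_score set, return None when nothing is similar enough.
    """
    best_text, best_score = None, -1.0
    for text, vec in memory:
        score = cosine(vec, query_vec)
        if score > best_score:
            best_text, best_score = text, score
    if min_score is not None and best_score < min_score:
        return None
    return best_text


# Toy memory with hand-made 3-dimensional "embeddings".
memory = [
    ("user prefers dark mode", [0.9, 0.1, 0.0]),
    ("user lives in Berlin", [0.1, 0.9, 0.0]),
]
unrelated_query = [0.2, 0.1, 0.95]  # points along a direction nothing stored uses

always_answers = recall(memory, unrelated_query)
abstains = recall(memory, unrelated_query, min_score=0.8)
```

Without a threshold the memory hands back a weakly related fact as if it were relevant; a threshold lets it abstain, at the cost of choosing a cutoff.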

How To Become An AI Engineer in 2026 (Without a CS Degree)

@sairahul1 · 111.8K followers · 710.8K views · 509 likes · 97 reposts

How To Become An AI Engineer in 2026. Without a CS degree. Without a bootcamp. Without knowing what a transformer is today. Here's what nobody tells you: The companies hiring right now don't need

Harness Engineering: What Every AI Engineer Needs to Know in 2026

@sairahul1 · 111.8K followers · 546.4K views · 536 likes · 94 reposts

In February 2026, a small OpenAI team shipped 1 million lines of production code. They didn't write a single line by hand. The AI agents wrote it. The humans designed the system that made the agents

Generative UI Is the New Frontend

@Saboo_Shubham_ · 116.2K followers · 263.3K views · 517 likes · 74 reposts

The frontend used to be a fixed thing. Designers drew it. Engineers built it. Users got what shipped. That's over. The interfaces shipping in 2026 are drawn partly by the agent itself, in real time,

A guide to /goal 🥅

@dkundel · 19.3K followers · 116.9K views · 523 likes · 40 reposts

We launched the goal mode (or /goal) as a way to help you have Codex drive towards a concrete outcome. When you set a goal Codex will continue to work until the goal is achieved, whether that takes

State of Memory in Agent Harness

@mem0ai · 17.6K followers · 82.8K views · 520 likes · 60 reposts

Agent harnesses are where AI software actually runs. Cursor, Devin, Claude Code, Codex: these environments handle context, orchestrate tools, coordinate agents, and increasingly, manage memory. The

A harness for every task: dynamic workflows in Claude Code

@trq212 · 263.1K followers · 75.7K views · 542 likes · 36 reposts

Last week, we released dynamic workflows in Claude Code. Claude can now write its own harness on the fly, custom-built for the task at hand. While the default Claude Code harness is built for coding,

A Functional Taxonomy of World Models

@drfeifei · 738.0K followers · 72.2K views · 699 likes · 144 reposts

“The world is everything that is the case.” — Ludwig Wittgenstein, Tractatus Logico-Philosophicus, 1921 The world is not made of words. In an earlier essay, we argued that spatial intelligence is AI’s

How to Build a Custom Agent Harness

@sydneyrunkle · 7.5K followers · 69.5K views · 511 likes · 74 reposts

Building useful agents is largely about customization: connecting your agent to the right context, data, and environment(s) for the task at hand. At its core, an agent is a model calling tools in a

some notes on getting into frontier ai labs

@itsreallyvivek · 3.6K followers · 65.8K views · 521 likes · 28 reposts

A few days ago I wrote that getting into a frontier AI lab mostly comes down to two things: proven research and trench engineering. The more I think about it, the less these feel like separate skills.

Every Agentic Engineering Hack I Know (June 2026)

@mvanhorn · 30.8K followers · 54.5K views · 545 likes · 44 reposts

Three months ago I posted "Every Claude Code Hack I Know." It hit 913K views. @kevinrose had asked what IDE to use, and my answer was: "No IDE. Just plan.md files and voice." This used to be called

Working Like a Lawyer with Claude

Summary: An official Claude video demonstrating how to use Claude for legal work, the way a lawyer would, to improve efficiency.

Confidential submission of draft S-1 to the SEC

OpenAI confirms a confidential S-1 submission to the SEC and has not yet determined timing for further action.

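Summary: OpenAI confidentially submitted a draft Form S-1 to the U.S. Securities and Exchange Commission (SEC) and has not yet determined the timing of its next steps.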

Built to benefit everyone: our plan

A vision for the future of AI, focusing on access, safety, and shared prosperity as OpenAI works to ensure AGI benefits everyone.

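Summary: OpenAI lays out its plan for the future, focusing on access, safety, and shared prosperity as it works to ensure artificial general intelligence (AGI) benefits everyone.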

Introducing the OpenAI Economic Research Exchange

OpenAI launches the Economic Research Exchange to study AI’s impact on jobs, productivity, and the economy. Applications are now open for selected research projects.

Summary: OpenAI launches the Economic Research Exchange to study AI's impact on jobs, productivity, and the economy; applications for research projects are now open.

[AINews] not much happened today

a quiet day of RSI.

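Summary: AINews reports a quiet news day whose theme was RSI; the same issue covers Sakana AI's launch of an RSI Lab. RSI here is an AI topic, not a market indicator.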

How to Stop Shipping Low-Quality RL Environments (with Examples)

Your broken harness is actively making the model worse. Here's what I keep seeing after years of eyeballing trajectories, and what you need to fix.

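Summary: The article argues that low-quality reinforcement learning environments actively degrade model performance and describes what to fix.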

The Meta hack shows there’s more to AI security than Mythos

On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they controlled, and the agent complied. One attacker broke into the dormant Obama Wh

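Summary: On June 5, 404 Media reported that attackers had been using Meta's AI customer support agent to steal Instagram accounts, exposing an AI security weakness. The method was simple: ask the agent to link the accounts to email addresses the attackers controlled.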

not much happened today

**Anthropic's Mythos/Opus cycle** sparked mixed reactions with praise for **Claude Mythos**'s one-shot workflows and concerns over **Opus 4.8** benchmark regressions. **Opus 4.7** showed strong chemistry task performance, "making Claude a chemist." **Sakana AI** launched an **RSI Lab** focusing on r

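Summary: Anthropic's Mythos/Opus cycle drew mixed reactions: praise for Claude Mythos's one-shot workflows, concern over Opus 4.8 benchmark regressions, and strong chemistry-task performance from Opus 4.7. Sakana AI launched an RSI Lab.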

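Some explanation of why 君の的公益 check-ins have all been 520 for the past few days

Honestly, I've been really busy, though I couldn't tell you with what. "Lowering the check-in quota" was a low priority, so I never got around to it. The quota doesn't mean much anyway, and I don't recommend hoarding it: only the tokens you actually spend are worth anything. Occasional stalls caused by account-pool fluctuations (empty responses, for example) are unavoidable, because concurrency and call volume have reached a staggering level. So that's that: I've set the quota back to 52 dollars. Look forward to the next party; maybe the Dragon Boat Festival? Good morning, good afternoon, and good night, everyone. Also, if you run into a problem, just ask in a reply; there's no need to open a separate thread and take up community resources. Thanks to the many members who help for free; without them there would be no 君の的公益. 47 posts - 45 participants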

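Unlimited team, CPA format

20260608-230233-339502.zip (298.5 KB) 100 teams, produced by a registration bot. Posting 100 first; don't register by hand, just import them. The latest 202, as of 01:09 on the 9th: 20260609-002846-821592.zip (621.8 KB) 33 posts - 21 participants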

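OpenCodex 2.0! New architecture! AI coding anytime, anywhere!

This post uses the community's open-source promotion channel and meets the promotion requirements. I declare and comply with the following community requirements: My post is tagged 开源推广 (open-source promotion): yes. My open-source project is fully open source, with no closed parts: yes. My open-source project links to and acknowledges the LINUX DO community: yes. The AI-generated or AI-polished parts of the project introduction in my post are posted as screenshots: yes. I commit that the choices above are permanently valid and accept oversight by the community and its members: yes. The project introduction follows; AI-generated or AI-polished content is posted as screenshots. Long time no see. I've been very busy and have spent the whole time optimizing and testing the new architecture; now I can finally ship the latest version! The new version no longer needs a cumbersome IPC implementation and is compatible with Codex to the greatest extent possible! Codex updates will no longer break large parts of the functionality! It also improves data loading speed and is fully compatible with M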

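In an era full of shady relay services, you may need privacy filtering, so we open-sourced ours

privacy-filter: pure Go implementation, zero dependencies, a single binary. It supports three integration methods: package import, HTTP API, and gRPC. Detection has two layers: structured PII recognition plus the full Gitleaks set of 222 rules. After we launched privacy filtering on PackyCode, we received a lot of feedback and requests, so we open-sourced it. Stars, trials, issues, and PRs are welcome. Project: https://github.com/packyme/privacy-filter To save effort I copied the text of my Twitter post directly. This is already built into packyapi; help yourself if you need it. 20 posts - 14 participants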
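The two-layer idea (structured PII patterns first, then secret-style rules in the spirit of Gitleaks) can be sketched as follows. This is a minimal illustration in Python, not the project's Go code or API: the four patterns, the dictionary names, and the `redact` function are hypothetical stand-ins, and the real rule set is far larger.

```python
import re

# Layer 1: structured PII patterns (illustrative examples only).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

# Layer 2: secret-style rules (two hypothetical examples, not the Gitleaks rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace every match with a [REDACTED:<rule>] tag.

    Returns the redacted text and the names of the rules that fired,
    in the order the layers were applied.
    """
    fired = []
    for layer in (PII_PATTERNS, SECRET_PATTERNS):
        for name, pattern in layer.items():
            text, count = pattern.subn(f"[REDACTED:{name}]", text)
            if count:
                fired.append(name)
    return text, fired


clean, fired = redact("contact bob@example.com, key AKIAABCDEFGHIJKLMNOP")
```

Running the layers in a fixed order keeps the output deterministic, and returning the rule names lets a caller log what was filtered without logging the sensitive values themselves.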

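Fixing CC again

Here's the problem: starting from 2.1.140+ (around 144), I found that the Bash permission approval dialog doesn't pop up and just hangs. What it looks like: ⏺ Bash(tar tzf /Users/haleclipse/WorkSpace/Node/ClaudeCodeRev/rc-server/dist/rc-server-0.1.0.tgz | wc -l; tar tzf /Users/haleclipse/WorkSpace/Node/ClaudeCodeRev/rc-ser…) ⎿ Waiting… ✶ Frosting… (5m 54s · ↓ 2.0k tokens) Yep, it just sits at Waiting… and never moves on. The first time I noticed it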

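Folks, I think I've been set up?

Someone told me that not many people have a 5-year-old GitHub account, and that those who could get in had basically already done so. I believed it and opened this channel, and within 6 hours about 5k people came in? The GitHub OAuth API even hit its rate limit? That's supposedly 5,000 requests per hour; how does that get blown through? Luckily the 5.1 server capacity expansion held up. I've definitely been set up; I don't dare open the other route at all now. Of course, new members are still welcome. Please read our community guidelines carefully; I hope you enjoy your time here: https://linux.do/guidelines 366 posts - 338 participants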

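A friendly reminder for the newcomers

The purpose of this post: a wave of new members has just arrived, and they don't seem familiar with how our community works. In the few minutes I just spent browsing, I saw more than 10 members break the rules, so I'm posting this as a reminder. (Click for more details) [!danger] These are the most important points; you must read them!! First, the community welcomes you! Statement from 始皇, in "Folks, I think I've been set up?": Of course, new members are still welcome. Please read our community guidelines carefully; I hope you enjoy your time here: https://linux.do/guidelines However, we hope you will read the guidelines carefully; that is the basic courtesy expected of a member. Then, note the rule that countless members have tripped over: no giveaways unless necessary, no giveaways unless necessary, no giveaways unless necessary. Important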

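Restored, don't let it go to waste: the check-in star-upgrade site

Go see whether it's smoother and more comfortable now. However, the database used to be SQLite and is now PG, so, ahem, some data was lost. I don't know whether the timer will reset for users going from two stars to three, lol. Special thanks: @ouyangqiqi 144 posts - 139 participants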

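I'm planning to register an L站 account for my son so he starts learning AI as soon as he starts school

I have a 10-year-old GitHub account. Many members who register a GitHub account now have to wait five years before they can join L站, and then go through a long leveling-up process. My son is fascinated by computers. He's not yet 2: he bawls if he doesn't get the computer and is delighted as soon as he does. Maybe it's because both his parents are programmers. 52 posts - 47 participants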

Siri AI

284 points · 214 comments

Show HN: Gitdot – a better GitHub. Open-source, anti-AI, and written in Rust

What works now: user signups, org creations, private/public repos, and importing GitHub repositories (both as read-only mirrors and full migrations). So basically, you can create, push and pull to a repo, but we don't have many features quite yet (issues, PRs, CI). What is a bit unique is:

Show HN: Courtside – TUI for NBA Games

Hi HN, I made this after seeing a few similar projects on the front page. NBA API endpoints are public and there’s a pretty robust python package ( https://github.com/swar/nba_api ) that I referenced for the endpoint structure to build an sdk in go. used BubbleTea and LipGloss fo

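Today's Themes

Today's AI news advanced on several fronts. On models, DeepSeek v4 is approaching Opus 4.8 in performance, and the community is debating Anthropic's Mythos/Opus cycle. The open-source community is active, with many practical projects appearing (AI agents, memory systems, evaluation tools), and Google released an official collection of agent skills. On products, Claude focused on developer experience with an Apple platform integration guide and a Claude Cowork product guide. In industry news, OpenAI confidentially submitted a draft S-1 and published its future plan and an economic research program, and an incident in which attackers used an AI customer support agent to steal Instagram accounts drew attention to AI security. In addition, several papers and opinion pieces explored frontier topics such as LLM embeddings, memory topology, and video understanding.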

01

Model Releases / Updates

Model Releases · 33 items

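DeepSeek v4 approaches Opus 4.8 in performance

Expert blogs · Riley Brown (YouTube)

In his latest video, YouTuber Riley Brown discusses recent progress on the Hermes Agent super app and notes that DeepSeek v4 has come close to Anthropic's Opus 4.8 on several benchmarks, drawing wide community attention to model performance comparisons.

model comparison · LLMs · inference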

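DeepMind's new way of AI thinking draws attention

Expert blogs · Two Minute Papers

The Two Minute Papers channel reports that a new AI model developed by DeepMind shows an unusual way of thinking and behaves distinctively on complex problems, prompting new discussion among researchers about how AI systems reason.

DeepMind · AI reasoning · frontier research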

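Anthropic's Mythos/Opus cycle stirs debate

News roundup · Smol AI News

Smol AI News reports that Claude Mythos's one-shot workflows have been praised, while Opus 4.8 shows benchmark regressions and Opus 4.7 performs strongly on chemistry tasks. Sakana AI also launched an RSI Lab, and the community is debating the pace of model iteration.

LLMs · benchmarks · Anthropic

02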

Product Releases / Updates

Product · 55 items

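Claude integrates with Apple's Foundation Models framework

Official · Claude Blog

Anthropic's official blog explains how to integrate Claude into the Foundation Models framework on Apple platforms. Developers can use the framework to build native apps with conversational capabilities on Apple devices, which widens the ways Claude can be deployed on mobile.

Claude · Apple · product integration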

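Claude Cowork product guide released

Official · Claude Blog

Anthropic has published a product guide for Claude Cowork that details the collaboration tool's core features and use cases. It is intended to help teams work together more efficiently and is an important addition to the Claude product line.

Claude · product launch · collaboration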

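OpenAI confidentially submits draft IPO filing

Official · OpenAI News

OpenAI confidentially submitted a draft Form S-1 to the U.S. Securities and Exchange Commission, a key step toward an initial public offering. The timing of further action has not been determined, but the filing signals a strategic shift for OpenAI from a privately held research organization toward a public company.

funding · IPO · strategy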

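OpenAI announces its plan for shared prosperity from AGI

Official · OpenAI News

OpenAI published a document on its future plans, focusing on broad access, safety, and shared prosperity for artificial general intelligence (AGI). It names making the benefits of AGI reach all of humanity as its core mission and sketches the technical and social structures it has in mind.

strategy · AI safety · AGI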

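OpenAI launches the Economic Research Exchange

Official · OpenAI News

OpenAI has launched the Economic Research Exchange to study AI's impact on jobs, productivity, and the wider economy. Applications for research projects are now open, inviting academia and industry to explore AI's economic effects together.

research · economics · AI impact

03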

Industry News

Industry · 44 items

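Meta AI customer support flaw leads to stolen accounts

News roundup · MIT Tech Review AI

MIT Tech Review reports that, according to a June 5 report by 404 Media, attackers exploited a flaw in Meta's AI customer support agent: using simple social engineering, they had it link users' Instagram accounts to email addresses they controlled and then took over the accounts. The incident exposes a security gap in how AI customer support handles identity verification.

security · AI safety · Meta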

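Google and OpenAI strike a compute deal

News roundup · TLDR AI

TLDR AI's roundup of the day's top stories reports that the U.S. government has taken a stake in OpenAI, that Google and OpenAI have struck a large compute infrastructure deal, and that Microsoft has launched a new AI assistant, Scout, as the three tech giants open a new round of competition over AI compute and products.

LLMs · policy · product launch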

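Anthropic explores AI agents in biology

Official · Anthropic Research

Anthropic Research has published new work exploring the potential of AI agents in biology. It aims to use agents to automate experimental workflows and analyze biological data, offering a new technical path for drug discovery and life-science research.

research · AI agents · biology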

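Open-source community supports the OpenEnv reinforcement learning environment

Official · Hugging Face Blog

The Hugging Face blog announces that the open-source community is actively supporting OpenEnv, a standardized interaction environment designed for agentic reinforcement learning. It aims to lower the barrier to RL research and make agent training more reproducible.

open source · reinforcement learning · agents

04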

Tips and Opinions

Tips & Takes · 44 items

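Topological memory structure as a fix for agent forgetting

X KOL · X posts (AttentionVC)

@elpresidank, a KOL on X, argues that today's embedding-based agent memory systems inherently forget and confidently make up information that was never stored. He proposes topological structure as a way around this memory failure, challenging the mainstream approach and offering a new line of thinking on persistent agent memory.

memory · opinion · technical analysis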

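Configuring Claude Code to reply to text messages automatically

Expert blogs · Riley Brown (YouTube)

YouTuber Riley Brown shows how to configure Claude Code so that it can receive and reply to phone text messages automatically, giving the AI assistant a proactive way to interact. The tutorial gives developers a practical way to connect Claude to a messaging channel.

AI assistant · auto-reply · tutorial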

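A guide to improving reinforcement learning environment quality

Expert blogs · Latent Space

Latent Space published a technical article arguing that low-quality reinforcement learning environments directly degrade model performance. It walks through common problems with examples and gives concrete suggestions for better RL environment design, which makes it a practical reference for RL practitioners.

reinforcement learning · technical guide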

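Using Claude for a lawyer's workflow

Official · Claude (YouTube)

Claude's official YouTube channel published a video demonstrating how to use Claude for legal tasks, including contract analysis, case research, and document drafting. It shows how an LLM can assist in a professional domain and provides reusable workflow templates.

legal · Claude · workflow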