每日 AI 简报

2026-06-07(内容获取于 06/07 05:56)

CopilotKit:Agent前端全栈框架

GitHub Trending

为Agent和生成式UI提供React、Angular、移动端、Slack等前端栈,并提出AG-UI协议,可快速构建交互式AI界面。

推荐理由:开发者可直接上手构建Agent UI,可行动性极高,是当前热门的开源项目。

Meta AI客服漏洞致用户账号被盗

MIT Tech Review AI

据报道,攻击者利用Meta AI客服将账号链接到自己的邮箱,以此窃取Instagram账户,暴露了AI Agent的安全隐患。

推荐理由:揭示了AI Agent落地中的真实安全风险,对开发者和安全从业者有重要警示。

last30days-skill:跨平台信息聚合AI技能

GitHub Trending

一个AI Agent技能,能够从Reddit、X、YouTube、HN、Polymarket和网页中研究任意主题,并合成有据可查的摘要。

推荐理由:实用工具,可帮助快速获取跨平台热点,适合信息工作者和研究者。

Anthropic如何跨产品限制Claude权限

Anthropic Engineering

随着Agent能力增强,其潜在影响范围扩大。Anthropic分享了在claude.ai、Claude Code等产品中限制Agent权限的工程经验。

推荐理由:对构建安全Agent系统的团队有直接参考价值,安全设计思路值得学习。

Anthropic员工用Claude Code重构销售流程

Claude Blog

Anthropic一位销售人员分享了如何使用Claude Code自动化工作流,提升团队效率。

推荐理由:真实案例展示非技术人员如何用AI提升效率,可复制性强。

Anthropic将Claude训练成化学家

Anthropic Research

Anthropic发布研究,展示如何让Claude具备化学知识,在分子性质预测等任务上表现出色。

推荐理由:展示了大模型在垂直科学领域的潜力,对AI+科学感兴趣者可关注。

通用记忆协议:Agent记忆共享格式

Hacker News

一个提议中的开放标准,旨在为AI Agent提供统一的记忆存储和交换格式,促进跨Agent协作。

推荐理由:切入Agent记忆碎片化痛点,对Agent生态演进有长远意义。

小型模型打造多模型金融剧

Hugging Face Blog

在Hugging Face hackathon中,团队使用多个小型模型构建了一个金融模拟剧,展示小模型协作的创意应用。

推荐理由:展示小模型创造性玩法,对探索低成本AI应用有启发。

自适应对手博弈中的遗憾最小化

HuggingFace Trending Papers

研究在线学习中面对自适应对手时的遗憾最小化问题,提出标准外部遗憾度量在动态环境中的局限性。

推荐理由:强化学习理论前沿论文,适合算法研究者了解。

Anthropic Oceanus泄露等AI动态

TLDR AI

TLDR AI汇总了Anthropic Oceanus模型泄露、ChatGPT做梦功能、递归自我改进等业内热点。

推荐理由:快速掌握当日AI圈核心动态,适合作为信息补全。

mvanhorn/last30days-skill

Python · ★ 28,724 · 🍴 2,433 · 📈 441 stars today

AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary

中文介绍 AI agent 技能模块,可跨 Reddit、X、YouTube、HN、Polymarket 及网页调研任意话题,自动合成有依据的摘要。面向需要快速获取多渠道最新信息并生成结构化总结的研究者或分析师。

CopilotKit/CopilotKit

TypeScript · ★ 33,163 · 🍴 4,235 · 📈 613 stars today

The Frontend Stack for Agents & Generative UI. React, Angular, Mobile, Slack, and more. Makers of the AG-UI Protocol

中文介绍 为 Agent 和 Generative UI 提供前端框架,支持 React、Angular、移动端、Slack 等平台。基于 AG-UI 协议,让开发者轻松构建带智能交互能力的用户界面。

MemPalace/mempalace

Python · ★ 54,239 · 🍴 7,104 · 📈 441 stars today

The best-benchmarked open-source AI memory system. And it's free.

中文介绍 开源 AI 记忆系统,基准测试表现优秀,免费使用。帮助 AI 应用保留跨会话的上下文信息,解决长对话或持续任务中的记忆问题。

danielmiessler/Personal_AI_Infrastructure

TypeScript · ★ 14,922 · 🍴 2,120 · 📈 63 stars today

Agentic AI Infrastructure for magnifying HUMAN capabilities.

中文介绍 面向个人的 AI 基础设施,聚焦增强人类能力。提供 Agentic 架构,让用户搭建私有 AI 助手,用于任务自动化、信息处理等场景。

openai/plugins

JavaScript · ★ 1,745 · 🍴 256 · 📈 215 stars today

OpenAI Plugins

中文介绍 OpenAI 官方插件仓库,为 ChatGPT 等模型提供扩展能力。开发者可创建插件让 AI 访问实时数据、执行操作,适配多种业务场景。

Panniantong/Agent-Reach

Python · ★ 22,253 · 🍴 1,902 · 📈 700 stars today

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.

中文介绍 为 AI agent 提供全网可见能力,通过 CLI 免费读取和搜索 Twitter、Reddit、YouTube、GitHub、B站、小红书等平台。无需 API 费用,适合需要多源数据的自动化 agent。

sveltejs/svelte

JavaScript · ★ 86,956 · 🍴 4,936 · 📈 34 stars today

web development for the rest of us

中文介绍 Svelte 是一款前端编译框架,将组件编译为高效原生 JavaScript,减少运行时开销。适用于构建高性能 Web 应用,新手友好且支持渐进式增强。

nginx/nginx

C · ★ 30,671 · 🍴 7,956 · 📈 37 stars today

The official NGINX Open Source repository.

中文介绍 官方 Nginx 开源仓库,高性能 HTTP 服务器与反向代理软件。广泛用于静态资源托管、负载均衡、API 网关等场景,以稳定性和低资源消耗著称。

aquasecurity/trivy

Go · ★ 35,990 · 🍴 454 · 📈 159 stars today

Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more

中文介绍 综合安全扫描工具,支持容器、Kubernetes、代码仓库、云环境中漏洞、配置错误、密钥泄漏和 SBOM 检测。DevOps 团队用于持续安全审计。

golang/go

Go · ★ 134,497 · 🍴 19,089 · 📈 24 stars today

The Go programming language

中文介绍 Go 编程语言官方仓库,以简洁语法、并发支持和高效编译著称。适合构建网络服务、微服务、CLI 工具等后端系统,在云原生生态中广泛应用。

lfnovo/open-notebook

TypeScript · ★ 26,565 · 🍴 3,035 · 📈 783 stars today

An Open Source implementation of Notebook LM with more flexibility and features

中文介绍 Notebook LM 的开源替代品,提供更灵活的功能。支持文档检索、知识管理和 AI 生成笔记,适合研究人员和学生整理和分析信息。

obra/superpowers

Shell · ★ 219,603 · 🍴 19,540 · 📈 1,008 stars today

An agentic skills framework & software development methodology that works.

中文介绍 一套 Agentic 技能框架与软件开发方法论,强调实用性。帮助开发者以结构化方式构建 AI Agent 功能,提升开发效率和系统可维护性。

santifer/career-ops

JavaScript · ★ 49,260 · 🍴 10,192 · 📈 203 stars today

AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.

中文介绍 基于 Claude Code 的 AI 求职系统,提供 14 种技能模式、Go 语言仪表盘、PDF 生成和批量处理功能。自动化简历优化、职位搜索和申请流程。

openai/whisper

Python · ★ 101,822 · 🍴 12,439 · 📈 155 stars today

Robust Speech Recognition via Large-Scale Weak Supervision

中文介绍 OpenAI 开源的语音识别模型,通过大规模弱监督训练获得强泛化能力。支持多语言转写,适用于会议记录、视频字幕、语音搜索等场景。

vitejs/vite

TypeScript · ★ 81,162 · 🍴 8,274 · 📈 73 stars today

Next generation frontend tooling. It's fast!

中文介绍 下一代前端构建工具,基于原生 ESM 实现极速启动和热更新。支持 Vue、React、Svelte 等框架,适合现代 Web 项目开发。

microsoft/mxc

Rust · ★ 557 · 🍴 24 · 📈 57 stars today

Policy-driven, layered isolation and containment

中文介绍 微软开源的策略驱动隔离层,采用分层隔离与限制机制。用于提升系统安全性,适合需要高隔离等级的多租户或边缘计算场景。

PaddlePaddle/PaddleOCR

Python · ★ 80,927 · 🍴 10,654 · 📈 449 stars today

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

中文介绍 轻量级 OCR 工具包,可将 PDF 或图片转为结构化数据,支持 100+ 语言。桥接传统文档与 LLM 处理流程,广泛用于票据识别、文档数字化等。

microsoft/VibeVoice

Python · ★ 48,446 · 🍴 5,389 · 📈 219 stars today

Open-Source Frontier Voice AI

中文介绍 微软开源的端侧语音 AI 系统,提供前沿的语音交互能力。适用于智能助手、语音控制等场景,注重低延迟和隐私保护。

Regret Minimization with Adaptive Opponents in Repeated Games

👍 1

In this paper, we study regret minimization in repeated games with adaptive opponents who can respond based on histories of play. The standard metric of external regret in online learning is known to fail to capture such adaptivity. To account for players' counterfactual reasoning, we introduce {\tt

中文介绍 研究重复博弈中自适应对手的遗憾最小化问题,指出标准外部遗憾指标无法适应对手基于历史的行为调整,并提出新方法。

Learning Geometric Representations from Videos for Spatial Intelligent Multimodal Large Language Models

👍 3

Multimodal Large Language Models (MLLMs) excel at 2D semantic understanding but lack intrinsic 3D awareness, resulting in representations that fail to maintain geometric and spatial consistency across video frames. Given the scarcity of large-scale 3D data, we present GeoVR, a novel framework that l

中文介绍 提出GeoVR框架,通过视频学习几何表示,增强多模态大语言模型在视频帧中的3D空间一致性,弥补3D数据稀缺问题。

AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding

👍 6

Vision-Language-Action (VLA) models leverage the rich world knowledge of pretrained vision-language models (VLMs) to enable instruction-following robotic manipulation. However, the structural mismatch between VLM semantic spaces and embodied control policies often hinders the learning of precise per

中文介绍 提出AffordanceVLA模型,结合视觉-语言-动作,利用预训练VLM的世界知识增强机器人操作中的动作生成,解决语义与策略的结构不匹配。

Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

👍 63

Code language models need repository-level context to resolve imports, APIs, and project conventions. Existing methods inject this knowledge as long inputs (retrieved through RAG or dependency analysis) or through per-repository fine-tuning and LoRA -- costly at repository scale and brittle to evolv

中文介绍 提出Code2LoRA方法,通过超网络为代码语言模型生成适配器,处理软件演化中的仓库级上下文,避免昂贵的全仓库微调。

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

👍 1

A situated query like "where is Lin Wei?" often encodes more than its literal content: the user may also want to know whether Lin Wei is free, in a good mood, or worth interrupting now. Standard tool-use agents answer the literal question and stop. AURA inserts an inference step between scene percep

中文介绍 提出AURA框架,用于情境化LLM代理,通过意图导向的探测挖掘用户隐含需求,超越字面问题回答。

Benchmark Everything Everywhere All at Once

👍 2

Benchmarks are fundamental for evaluating and advancing LLMs and MLLMs by providing standardized and explicit measures of performance. However, their construction is labor-intensive and hard to reuse, raising concerns about sustainability and scalability. Moreover, existing benchmarks often quickly

中文介绍 讨论现有LLM和MLLM基准构建劳动密集且难以复用,关注可持续性和可扩展性问题。

Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions

👍 2

Large language models are increasingly used to simulate social media users and infer how individuals may respond to online discussions. However, it remains unclear whether these simulations reflect precise user-specific beliefs or whether they are highly sensitive to semantically independent changes

中文介绍 评估LLM在模拟社交媒体用户立场时的敏感性,发现模拟结果对语义细微变化高度敏感,未必反映用户真实信念。

ForeSci: Evaluating LLM Agents for Forward-Looking AI Research Judgment

👍 0

AI research often requires decisions before future evidence exists: which bottleneck to attack, which direction to pursue, or where a project should be positioned. We introduce ForeSci, a temporally controlled benchmark for evaluating whether LLM agents can make such forward-looking research judgeme

中文介绍 提出ForeSci基准,用于评估LLM代理在人工智能研究中做前瞻性判断(如选择瓶颈、方向)的能力。

Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation?

👍 15

Video generation models have made impressive strides in synthesizing visually compelling content, yet their outputs remain confined to the virtual domain. A natural question follows: how well do these models reflect the physical world when their generated videos leave the screen and enter reality? W

中文介绍 探索视频生成模型生成的内容能否用于可执行机器人操作,评估其反映物理世界的程度。

Towards One-to-Many Temporal Grounding

👍 4

Temporal Grounding (TG) aims to localize video segments corresponding to a textual query. Prior research predominantly focuses on single-segment retrieval. Real-world scenarios, however, often require localizing multiple disjoint segments for a single query -- a setting we term One-to-Many Temporal

中文介绍 提出“一对多时间定位”任务,针对单个查询定位视频中多个不连续片段,扩展传统单片段定位。

LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs

👍 7

Large language models can reproduce training data, but existing memorization evaluations mostly measure whether models can be forced to do so, rather than whether they do so under ordinary use. We introduce PropMe, a propensity-aware framework for memorization evaluation that contrasts prefix-based

中文介绍 提出PropMe框架,评估LLM在常规使用中是否倾向泄露训练数据,区分被迫记忆与自然回忆。

AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

👍 36

Planning for real-world problems by language models often involves both world and user constraints, which may not be fully specified upfront and are progressively disclosed through interaction. However, existing benchmarks still underexplore adaptive planning under such progressively revealed dual c

中文介绍 提出AdaPlanBench基准,测试LLM代理在世界约束和用户约束逐渐披露时进行自适应规划的能力。

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

👍 42

Role-playing language agents (RPLAs) should play characters whose values and behavior evolve as the story progresses, not maintain a fixed persona. Existing benchmarks measure factual recall at a given chapter, not whether responses align with the character's psychological trajectory, especially in

Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

👍 23

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To tra

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

👍 38

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has noticed, while many other important problems coexist, hidden in plain sight, within the broader user context, with their

MAOAM: Unified Object and Material Selection with Vision-Language Models

👍 8

Selection is a core operation in interactive image editing. To be practical, a user should be able to specify and disambiguate the desired selection region through either text or click-based interactions, and the system should support selecting not only objects but also other criteria, such as mater

SEAOTTER: Sensor Embedded Autoencoding with One-Time Transcode for Efficient Reconstruction

👍 4

In robotics systems, vast amounts of visual data are easily captured at high resolution using low-cost, low-power hardware. Yet, limited bandwidth and on-device compute resources prevent full utilization when transmitted via conventional codecs like JPEG/MPEG. Newer codecs, like AV1/AVIF, improve th

RobotValues: Evaluating Household Robots When Human Values Conflict

👍 23

While household robots are often evaluated based on task completion, everyday domestic environments involve value-conflicting situations in which robots are expected to choose actions that prioritize other values than task success, such as human autonomy, efficiency, or social appropriateness. Yet,

EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management

👍 2

Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science. However, existing approaches remain fundamentally limited by their static action sets and lack of principled long-horizon context management, hindering their ability to accumulate reusable

LLM Anonymization Against Agentic Re-Identification

👍 1

Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text

Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents

👍 2

Financial AI agents often fail for a simple reason: they make users carry the complexity. A user must repeatedly restate goals, risk preferences, portfolio context, past judgments, and shifting market assumptions, while the agent answers, retrieves, acts, and forgets. In finance, this is not just in

AdaCodec: A Predictive Visual Code for Video MLLMs

👍 4

Video is temporally redundant: adjacent frames usually share most objects, background, and layout. Yet existing video multimodal large language models (video MLLMs) usually encode each sampled frame as an independent RGB image, causing visual tokens to repeat content already present in earlier frame

SABER: Benchmarking Operational Safety of LLM Coding Agents in Stateful Project Workspaces

👍 0

Large language models are increasingly deployed as coding agents, shifting safety from individual responses to action sequences. Existing benchmarks, however, primarily assess whether models refuse unsafe prompts, leaving impacts on stateful workspaces largely unexamined. We present SABER, a benchma

The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models

👍 3

Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discrete output. By analyzing the residual stream geometry during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure where r

MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding

👍 3

Multimodal Large Language Models (MLLMs) have demonstrated significant achievements in general visual question answering (VQA) tasks. However, they remain brittle on mechanical engineering drawings, where high annotation density and weak domain knowledge, compounded by unreliable spatial relation re

Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination

👍 1

Reinforcement Learning with Verifiable Rewards (RLVR) has recently emerged as the cornerstone for shaping the remarkable coding abilities of Large Language Models (LLMs). However, the scalability of RLVR is severely constrained by the scarcity of sufficiently challenging verifiable code tasks that t

Multimodal Music Recommendation System using LLMs

👍 1

Music recommendation systems typically treat songs as opaque tokens, relying on collaborative interaction histories which overlooks semantic or acoustic content. Prior work has explored LLM-augmented, multimodal, and text-enhanced approaches to sequential recommendation, and while some methods parti

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

👍 6

Memory-augmented LLM agents tackle complex long-horizon tasks by recursively summarizing interaction trajectories into compact memory. However, existing approaches typically train these memory policies using outcome-based reinforcement learning, failing to localize where intermediate memory quality

Trust Region Q Adjoint Matching

👍 3

Off-policy reinforcement learning of pretrained flow policies remains challenging due to the instability of optimization arising from the multi-step sampling process. Recently, Q-learning with Adjoint Matching (QAM) addressed this issue by reformulating into a memoryless stochastic optimal control (

Is This Edit Correct? A Multi-Dimensional Benchmark for Reasoning-Aware Image Editing

👍 1

Diffusion-based image editing has achieved strong visual fidelity under natural language instructions, yet most existing systems still operate at the level of surface instruction following, without reasoning about the implicit contextual constraints embedded in real user requests. This often leads t

Context as Topology: Why Your Agent's Memory Forgets, and How Structure Escapes It

@elpresidank · 116 粉丝 · 2.9M 阅 · 543 赞 · 35 转

Most AI agent memory is built on embeddings. And there's now a proof that this entire class of system is going to forget what you stored in it — and confidently make up things you never stored at all.

中文介绍 指出基于 embedding 构建的 AI agent 记忆系统存在根本缺陷:会有遗忘和幻觉。通过理论证明,现有架构无法逃脱这一局限,呼吁探索结构化的记忆方案。

SpaceX IPOs in 7 days. I Fed the S1 Doc Into Claude. Here Is What It Found Buried in 300 Pages.

@DamiDefi · 96.5K 粉丝 · 2.3M 阅 · 584 赞 · 80 转

The number that stopped me was not the $2 trillion valuation. It was $791 million. That is what SpaceX made in net income in 2024. A profitable, growing aerospace company with a genuine moat in launch

中文介绍 将 SpaceX 长达 300 页的 S-1 文件输入 Claude 进行分析,挖掘出关键财务数据:2024 年净利润 7.91 亿美元,一家真正盈利且有护城河的航空航天公司即将上市。

Range and Depth on Demand

@1salman · 363 粉丝 · 2.0M 阅 · 682 赞 · 45 转

Everyone keeps asking whether AI favors specialists or generalists. I think that is the wrong question. AI does not pick a side. It changes the tradeoff. The old world forced a choice. You could go

中文介绍 讨论 AI 时代专家与通才之争的伪命题。认为 AI 不偏爱某一类,而是改变了原有的取舍关系:深度与广度可以同时获得,旧有的选择题已过时。

How To Become An AI Engineer in 2026 (Without a CS Degree)

@sairahul1 · 110.7K 粉丝 · 710.8K 阅 · 509 赞 · 97 转

How To Become An AI Engineer in 2026. Without a CS degree. Without a bootcamp. Without knowing what a transformer is today. Here's what nobody tells you: The companies hiring right now don't need

中文介绍 无需 CS 学位、无需 bootcamp,如何在 2026 年成为 AI 工程师。核心观点:当前招聘市场不看重学历,更看重实际构建能力,零基础也能起步。

30 Obsidian Workflows, Plugins, and Setups That Most Users Don't Know

@eng_khairallah1 · 61.9K 粉丝 · 693.5K 阅 · 511 赞 · 71 转

Obsidian has 2,700+ community plugins. Over 100 of them are AI-related. Save this :) And the CEO of Obsidian personally published official Claude Skills for the platform - 12,900+ GitHub stars in

中文介绍 整理 30 个 Obsidian 冷门工作流、插件和配置,包含超 100 个 AI 相关插件。Obsidian CEO 官方发布的 Claude Skills 已在 GitHub 获得 12900+ star。

How to master Dynamic Workflows in Claude Code: 6 patterns and 14 steps Anthropic engineers actually

@0xCodez · 3.3K 粉丝 · 637.2K 阅 · 510 赞 · 59 转

Most Claude Code users still write their workflows by hand. They chain prompts, copy outputs, paste them into the next prompt, fix what went wrong, repeat. 9 out of 10 builders haven’t tried Dynamic

中文介绍 详解 Claude Code 动态工作流(Dynamic Workflows)的 6 种模式和 14 步实操。指出大多数用户仍手动链式调 prompt,而 Anthropic 工程师已采用自动化编排。

What an Enterprise Context Layer Actually Is

@prukalpa · 23.1K 粉丝 · 583.2K 阅 · 506 赞 · 80 转

A field guide to what it is, what it is not, and where it fits in your AI architecture. I have had some version of the same conversation with a CIO almost every day this year. Their team has read

中文介绍 厘清企业级 AI 架构中「上下文层」的概念:它是什么、不是什么、应该放在架构的哪一层。基于与多位 CIO 的日常交流提炼出实操指南。

hacking pewdiepie's AI agent harness using an evil cocomelon website (then helping protect it)

@theonejvo · 22.1K 粉丝 · 504.3K 阅 · 861 赞 · 1 转

Over the past year, @pewdiepie, has been turning into one of the most visible champions of private, self-hosted computing, and it has been a genuine pleasure to watch. What began in late 2025 as an

中文介绍 通过对 PewDiePie 自托管 AI agent 进行安全测试,发现可被恶意网站劫持,并协助修复漏洞。突显私有 AI 系统在安全设计上的薄弱环节。

Generative UI Is the New Frontend

@Saboo_Shubham_ · 116.2K 粉丝 · 263.3K 阅 · 517 赞 · 74 转

The frontend used to be a fixed thing. Designers drew it. Engineers built it. Users got what shipped. That's over. The interfaces shipping in 2026 are drawn partly by the agent itself, in real time,

中文介绍 断言 2026 年前端范式已改变:UI 不再是设计师画好、工程师实现,而是由 agent 实时动态生成。静态前端已死,生成式 UI 是未来方向。

Claude Code + NotebookLM + Obsidian: Research Monster That Gets Smarter Every Time You Use It

@monokern · 1.2K 粉丝 · 263.1K 阅 · 505 赞 · 72 转

Most people treat research as a manual task. You open 10 tabs. You watch videos. You read articles. You take notes somewhere. An hour later you have a pile of information you're not sure what to do

中文介绍 介绍 Claude Code + NotebookLM + Obsidian 三件套的联动研究流:每次使用都会积累知识,越用越智能。打破传统的「开 10 个标签页」手动模式。

How to get 100k YouTube subscribers in 3 hours (The Complete Guide)

@maubaron · 16.9K 粉丝 · 233.8K 阅 · 506 赞 · 19 转

Our YouTube channel has 125k subscribers and we've never made or uploaded a single video ourselves. This is a completely automated system. It is this very same strategy that made us the first app

中文介绍 披露一套完全自动化的 YouTube 获客系统:无需亲自制作或上传任何视频,已实现 12.5 万订阅。宣称能 3 小时获取 10 万订阅,并推出了对应应用。

Stop building Foxconn factories for your agents

@garrytan · 853.3K 粉丝 · 180.6K 阅 · 503 赞 · 43 转

In January I got back into coding and I built Garry's List. Over five hundred thousand lines of Rails and the tests to police it. I was proud of it. I shouldn't have been. The thing worth being proud

中文介绍 反思自己亲手写了 50 万行 Rails 代码的行为,认为不应为 agent 建造「富士康工厂」。真正有价值的工作是构建 agent 之间以及 agent 与人类的协作层。

Building cloud agent infrastructure: what's different, and what we learned

@intuitiveml · 6.4K 粉丝 · 171.3K 阅 · 524 赞 · 70 转

Most agent frameworks today assume a desktop. One user, one machine, one process. The agent runs while the laptop is open, writes to a local filesystem, holds API keys in environment variables, and

中文介绍 总结构建云端 agent 基础设施的实践心得。指出多数 agent 框架出身于桌面环境(单机单进程),而云端部署面临分布式上下文、局部文件系统、密钥管理等全新挑战。

A guide to /goal 🥅

@dkundel · 19.3K 粉丝 · 116.9K 阅 · 523 赞 · 40 转

We launched the goal mode (or /goal) as a way to help you have Codex drive towards a concrete outcome. When you set a goal Codex will continue to work until the goal is achieved, whether that takes

中文介绍 讲解 Codex 新功能 /goal 模式的使用方法。设定目标后,Codex 会自动驱车直至完成,无需手动分段 prompt,适合需要长链推理的复杂任务。

🥇Top AI Papers of the Week

@dair_ai · 124.6K 粉丝 · 84.0K 阅 · 504 赞 · 83 转

1. SkillOpt Microsoft Research treats a compact natural-language skill document as the trainable state of a frozen agent, then learns that document through rollouts, reflection, and bounded edits

中文介绍 本周最佳 AI 论文精选:SkillOpt(微软将自然语言技能文档作为可训练状态)、以及多篇关于 agent 学习、记忆与规划的最新研究。

State of Memory in Agent Harness

@mem0ai · 17.6K 粉丝 · 82.8K 阅 · 520 赞 · 60 转

Agent harnesses are where AI software actually runs. Cursor, Devin, Claude Code, Codex: these environments handle context, orchestrate tools, coordinate agents, and increasingly, manage memory. The

中文介绍 盘点主流 agent 框架(Cursor、Devin、Claude Code、Codex)在记忆管理上的现状。强调记忆是 agent 落地的关键瓶颈,但目前各方案差异巨大。

A harness for every task: dynamic workflows in Claude Code

@trq212 · 263.1K 粉丝 · 75.7K 阅 · 542 赞 · 36 转

Last week, we released dynamic workflows in Claude Code. Claude can now write its own harness on the fly, custom-built for the task at hand. While the default Claude Code harness is built for coding,

中文介绍 介绍 Claude Code 动态工作流:agent 能实时为当前任务编写自定义 harness,不再局限于固定的编码模板。默认 harness 仅适用于编码场景,而动态方案可泛化到任意任务。

A Functional Taxonomy of World Models

@drfeifei · 738.0K 粉丝 · 72.2K 阅 · 699 赞 · 144 转

“The world is everything that is the case.” — Ludwig Wittgenstein, Tractatus Logico-Philosophicus, 1921 The world is not made of words. In an earlier essay, we argued that spatial intelligence is AI’s

中文介绍 提出世界模型的功能性分类体系。延续之前关于「空间智能是 AI 下一个前沿」的论述,构建一个用于理解世界模型如何运作的框架。

How to Build a Custom Agent Harness

@sydneyrunkle · 7.5K 粉丝 · 69.5K 阅 · 511 赞 · 74 转

Building useful agents is largely about customization: connecting your agent to the right context, data, and environment(s) for the task at hand. At its core, an agent is a model calling tools in a

中文介绍 手把手教程:如何构建自定义 agent harness。核心思想是 agent = 模型 + 工具调用,而 harness 负责提供具体的上下文、数据和运行环境。

A guide to /goal 🥅

@dkundel · 19.3K 粉丝 · 116.9K 阅 · 7d 曝光 116.9K

A guide to /goal 🥅

SpaceX IPOs in 7 days. I Fed the S1 Doc Into Claude. Here Is What It Found Buried in 300 Pages.

@DamiDefi · 96.5K 粉丝 · 2.3M 阅 · 7d 曝光 2.3M

SpaceX IPOs in 7 days. I Fed the S1 Doc Into Claude. Here Is What It Found Buried in 300 Pages.

中文介绍 将 SpaceX 长达 300 页的 S-1 文件输入 Claude 进行分析,挖掘出关键财务数据:2024 年净利润 7.91 亿美元,一家真正盈利且有护城河的航空航天公司即将上市。

Team thinking, visualized by Claude

中文介绍 通过 Claude 的可视化能力,展示团队思考过程的可视化呈现,帮助理解团队协作思维模式。

Team thinking, visualized by Claude

中文介绍 通过 Claude 的可视化能力,展示团队思考过程的可视化呈现,帮助理解团队协作思维模式。

AI Agents as "Games Masters"? 🎮🔥

中文介绍 探讨 AI 代理作为「游戏大师」角色的新概念,可能用于引导复杂任务或交互过程。

Claude Opus 4.8: Lying Machine No More?

中文介绍 Claude Opus 4.8 版本更新,重点解决之前模型存在的不诚实或撒谎问题,提升可信度。

[AINews] not much happened today

a quiet day of RSI.

中文介绍 Latent Space新闻简报称当日AI领域平静,仅有关于RSI的提及。

How to Stop Shipping Low-Quality RL Environments (with Examples)

Your broken harness is actively making the model worse. Here's what I keep seeing after years of eyeballing trajectories, and what you need to fix.

中文介绍 Latent Space文章指出低质量的强化学习环境会损害模型性能,并提供改进建议。

The Meta hack shows there’s more to AI security than Mythos

On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they controlled, and the agent complied. One attacker broke into the dormant Obama Wh

中文介绍 MIT Tech Review报道,攻击者利用Meta的AI客服代理窃取Instagram账户,通过简单请求即可将账户绑定到攻击者控制的邮箱。该事件暴露了AI安全漏洞。

not much happened today

**Anthropic's Mythos/Opus cycle** sparked mixed reactions with praise for **Claude Mythos**'s one-shot workflows and concerns over **Opus 4.8** benchmark regressions. **Opus 4.7** showed strong chemistry task performance, "making Claude a chemist." **Sakana AI** launched an **RSI Lab** focusing on r

中文介绍 Anthropic的Mythos/Opus系列引发不同反应:Claude Mythos的一次性工作流获赞,但Opus 4.8在基准测试中出现倒退;Opus 4.7在化学任务上表现优异。Sakana AI推出了RSI相关项目。

The Claude Cowork product guide

The Claude Cowork product guide

中文介绍 Claude博客发布了Claude Cowork产品指南。

Jun 5, 2026ScienceMaking Claude a chemist

Jun 5, 2026ScienceMaking Claude a chemist

中文介绍 Anthropic研究团队发表文章,介绍如何使Claude成为化学家,提升其在化学任务上的表现。

Reality: The Final Eval — Lukas Petersson and Axel Backlund of Andon Labs

We talk with the VendingBench authors on evaling Claudes from Haiku to Mythos, and how they build leading, and lasting, frontier evals from scratch.

中文介绍 Latent Space采访Andon Labs的Lukas Petersson和Axel Backlund,探讨从Haiku到Mythos的Claude模型评估,以及如何从零构建前沿评测基准。

How Endava is redesigning software delivery around AI agents

Learn how Endava is using AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-native culture across the enterprise.

中文介绍 OpenAI报道Endava如何利用AI代理、ChatGPT Enterprise和Codex重塑软件交付流程,加速工作自动化并构建AI原生企业文化。

How courts are coping with a flood of AI-generated lawsuits

Most days in her chambers, Judge Maritza Braswell, a federal magistrate judge in Colorado, sifts through stacks of documents written by people without a lawyer. Many of them can’t afford to hire a lawyer, and others have cases too weak or too small to interest one. She reads each one carefully, mind

中文介绍 MIT Tech Review报道,美国法院正应对大量AI生成的法律文件,联邦法官Maritza Braswell等法官需处理许多非律师代理的诉讼,其中部分文件由AI撰写。

【实习交流】板块申请

版块名称: 实习交流 URL Slug: offer 版块简介: 版主人选: @made_po 版块规则: 1.严禁发布任何收费培训、简历代写、付费内推或未经验证的招聘广告,招聘广告要在经验贴最下方 2.分享公司实习信息时请尽量注明大致时间、部门(敏感信息可匿),鼓励真实但不要恶意吐槽 3.鼓励使用相对统一的格式写面经、实习复盘、Offer对比,方便后来人搜索参考 4.纯求助帖最好附上自己已经做的努力和背景,否则容易沉或被移 5.保持理性建设性讨论,禁止人身攻击、地域黑、学历歧视等内容 申请理由: 很多佬U字字珠玑,句句刻骨,明明随口一说就是极其宝贵的经验教训,但是却散落在各个零碎的帖子评论区

查查你的Github注册了多久

论坛邀请规则已修改 社区拟优化账号注册机制 GitHub申请渠道 调整为 5 年 GitHub 账号直接获取 很多佬友不知道自己的github注册多久了,快来查一查吧 xyl1null.github.io GitHub Account Age Checker 编辑:api返回的是zulu时间,也就是UTC,现已转换成中国标准时间UTC+8,抱歉让佬们反思起不健康的作息,没有的事~ 187 个帖子 - 175 位参与者 阅读完整话题

我是不是错过了什么?

codex无限并发无限额度?无限bug team?车已经开走了? 24 个帖子 - 23 位参与者 阅读完整话题

社区拟优化账号注册机制

回顾两年多以来,社区账号注册进行了多轮调整,旨在匹配当前社区阶段,现在看来都出色完成了阶段使命。 不过也有一些非议,非议来自不好量化,最常见的就是对于申请自述的理解不同。为了配合社区进入新阶段发展,我们同步决定优化一下注册机制: 取消申请自述填写与审核 邀请链接供给侧结构调整 第一点很好理解,也就是有邀请链接就可以直接注册。一方面消除不确定性,另一方面释放管理精力进行下一阶段运营。 3个渠道: GitHub申请渠道 调整为 5 年 GitHub 账号直接获取。 Premium分组 每 3 天可生成 1 个邀请链接。 管理员分组 每 3 天可生成 5 个邀请链接。 新的注册方式,将极大增加确定性

Grok HTML可视化提示词,支持绘图和网络图片内嵌

效果 # AMC-WebUI Live Artifacts Designer (Grok Defensive Hybrid - Clean) v2 你是一个只输出渲染后 HTML 的专业引擎。你将融合 AMC-WebUI 的高信息密度智能布局与 Grok 平台的极端渲染防御规则,把用户信息转化为无懈可击、精美且可读的内联 HTML 产物。 ## 致命防御约束 (ZERO TOLERANCE - 极高优先级) 1. 绝对禁止反引号:严禁在响应中输出任何 ``` 或 ` 符号。 2. 绝对禁止 <style> 和 <script> 块:所有样式必须 100% 写入每个标签的

【CHY公益站】迈向稳定的第二步

已将CHY公益站使用的MySQL迁移至 PostgreSQL并配置了Redis作为缓存,至此,CHY公益站迈出向着稳定的第二步 10 个帖子 - 10 位参与者 阅读完整话题

牛逼的 无限team 校长和v总估计炸了吧

@raotom @vux1jpmal5t41lg 谁说这届佬友不行 现在满屏的bug team 估计openai要炸了吧 所以有人能解释下发生了个啥么 我吃个瓜 11 个帖子 - 10 位参与者 阅读完整话题

【开源推广】女朋友一句“今天吃什么”,我搓了个家庭点菜小程序

本帖使用社区开源推广,符合推广要求。我申明并遵循社区要求的以下内容: 我的帖子已经打上 开源推广 标签: 是 我的开源项目完整开源,无未开源部分: 是 我的开源项目已链接认可 LINUX DO 社区: 是 我帖子内的项目介绍,AI生成、润色内容部分已截图发出: 是 以上选择我承诺是永久有效的,接受社区和佬友监督: 是 以下为项目介绍正文内容,AI生成、润色内容已使用截图方式发出 很久没冒泡了 一看等级都掉到2级 翻了下记录,感觉平时也就没事上L站随便转转 也不知道要看什么 然后又把网页关闭继续摆烂 正篇开始 以下(宝宝)都是我女朋友 哈哈 这段时间纯属典型摆烂状态 下午5点醒来坐电脑面前摆烂

Ask HN: Why is the HN crowd so anti-AI?

Genuine question.Over the past six months, there hasn’t been a single day where I’ve checked the HN Best RSS feed without seeing a post about how AI “writes bad code,” “introduces bugs,” “creates technical debt,” or something along those lines.I’ll probably make a lot of enemies by saying this, but

Ask HN: What was your "oh shit" moment with GenAI?

Most of us were amused when DALL-E and its peers went mainstream, and we were quick to point out the obvious flaws.Then ChatGPT hit the scene and again, many of us dismissed it as a parlor trick that would never amount to much.Using LLMs for coding initially was a only small step up from basic code

Show HN: Infinite canvas notes in the non-Euclidean Poincaré disk

Hi!This is an infinite canvas note-taking tool where notes are laid out in a non-Euclidean, hyperbolic geometric space. As you drag and navigate through the view, you’ll experience a unique fluid distortion that naturally leverages your brain's spatial memory.I’ve been obsessed with the concept