GitHub Trending
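Provides a frontend stack for agents and generative UI spanning React, Angular, mobile, Slack, and more, and introduces the AG-UI protocol for quickly building interactive AI interfaces.
Why recommended: developers can start building agent UIs right away; highly actionable, and currently a popular open-source project.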
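MIT Tech Review AI
Attackers reportedly used Meta's AI customer support agent to link accounts to their own email addresses and thereby steal Instagram accounts, exposing the security risks of AI agents.
Why recommended: reveals a real security risk in deploying AI agents; an important warning for developers and security practitioners.
GitHub Trending
An AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web, then synthesizes a grounded summary.
Why recommended: a practical tool for quickly catching cross-platform trends; suited to information workers and researchers.
Anthropic Engineering
As agents become more capable, the scope of their potential impact widens. Anthropic shares engineering lessons on restricting agent permissions in products such as claude.ai and Claude Code.
Why recommended: directly useful to teams building secure agent systems; the security design approach is worth studying.
Claude Blog
An Anthropic salesperson shares how Claude Code was used to automate workflows and improve team efficiency.
Why recommended: a real-world case of a non-technical person using AI to work more efficiently; highly replicable.
Anthropic Research
Anthropic published research showing how to give Claude chemistry knowledge, with strong performance on tasks such as molecular property prediction.
Why recommended: demonstrates the potential of large models in specialized scientific domains; worth following for those interested in AI for science.
Hacker News
A proposed open standard that aims to give AI agents a unified format for storing and exchanging memory, enabling cross-agent collaboration.
Why recommended: targets the pain point of fragmented agent memory; has long-term significance for how the agent ecosystem evolves.
Hugging Face Blog
At a Hugging Face hackathon, a team used several small models to build a simulated financial drama, a creative application of small-model collaboration.
Why recommended: shows creative uses of small models; a source of ideas for low-cost AI applications.
HuggingFace Trending Papers
Studies regret minimization in online learning against adaptive opponents and points out the limitations of the standard external-regret metric in dynamic settings.
Why recommended: a frontier learning-theory paper; aimed at algorithm researchers.
TLDR AI
TLDR AI rounds up industry news including the Anthropic Oceanus model leak, ChatGPT's Dreaming feature, and recursive self-improvement.
Why recommended: a quick way to catch the day's core AI news; useful for filling information gaps.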
Python · ★ 28,724 · 🍴 2,433 · 📈 441 stars today
AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary
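Summary: An AI agent skill module that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web, then automatically synthesizes a grounded summary. Aimed at researchers and analysts who need up-to-date information from many channels condensed into a structured summary.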
TypeScript · ★ 33,163 · 🍴 4,235 · 📈 613 stars today
The Frontend Stack for Agents & Generative UI. React, Angular, Mobile, Slack, and more. Makers of the AG-UI Protocol
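Summary: A frontend framework for agents and generative UI, supporting React, Angular, mobile, Slack, and other platforms. Built on the AG-UI protocol, it lets developers build user interfaces with intelligent interaction.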
Python · ★ 54,239 · 🍴 7,104 · 📈 441 stars today
The best-benchmarked open-source AI memory system. And it's free.
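Summary: An open-source AI memory system with strong benchmark results, free to use. Helps AI applications retain context across sessions, addressing memory problems in long conversations and ongoing tasks.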
TypeScript · ★ 14,922 · 🍴 2,120 · 📈 63 stars today
Agentic AI Infrastructure for magnifying HUMAN capabilities.
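Summary: Personal AI infrastructure focused on augmenting human capabilities. Provides an agentic architecture for building a private AI assistant for task automation, information processing, and similar uses.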
JavaScript · ★ 1,745 · 🍴 256 · 📈 215 stars today
OpenAI Plugins
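Summary: OpenAI's official plugins repository, which extends models such as ChatGPT. Developers can create plugins that let the AI access real-time data and perform actions across a range of business scenarios.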
Python · ★ 22,253 · 🍴 1,902 · 📈 700 stars today
Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.
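Summary: Gives AI agents visibility across the internet: one CLI reads and searches Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu, and other platforms with no API fees. Suited to automated agents that need multi-source data.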
JavaScript · ★ 86,956 · 🍴 4,936 · 📈 34 stars today
web development for the rest of us
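Summary: Svelte is a compiler-based frontend framework that compiles components into efficient vanilla JavaScript, reducing runtime overhead. Suited to high-performance web applications; beginner-friendly and supports progressive enhancement.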
C · ★ 30,671 · 🍴 7,956 · 📈 37 stars today
The official NGINX Open Source repository.
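Summary: The official NGINX open-source repository: a high-performance HTTP server and reverse proxy. Widely used for static asset hosting, load balancing, and API gateways, and known for stability and low resource consumption.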
Go · ★ 35,990 · 🍴 454 · 📈 159 stars today
Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
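Summary: A comprehensive security scanner that detects vulnerabilities, misconfigurations, leaked secrets, and SBOM in containers, Kubernetes, code repositories, and cloud environments. DevOps teams use it for continuous security auditing.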
Go · ★ 134,497 · 🍴 19,089 · 📈 24 stars today
The Go programming language
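Summary: The official repository of the Go programming language, known for simple syntax, built-in concurrency, and fast compilation. Suited to backend systems such as network services, microservices, and CLI tools, and widely used in the cloud-native ecosystem.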
TypeScript · ★ 26,565 · 🍴 3,035 · 📈 783 stars today
An Open Source implementation of Notebook LM with more flexibility and features
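Summary: An open-source alternative to Notebook LM with more flexible features. Supports document retrieval, knowledge management, and AI-generated notes; suited to researchers and students organizing and analyzing information.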
Shell · ★ 219,603 · 🍴 19,540 · 📈 1,008 stars today
An agentic skills framework & software development methodology that works.
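Summary: An agentic skills framework and software development methodology with an emphasis on practicality. Helps developers build AI agent capabilities in a structured way, improving development efficiency and maintainability.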
JavaScript · ★ 49,260 · 🍴 10,192 · 📈 203 stars today
AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.
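Summary: An AI job search system built on Claude Code, with 14 skill modes, a Go dashboard, PDF generation, and batch processing. Automates résumé optimization, job search, and applications.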
Python · ★ 101,822 · 🍴 12,439 · 📈 155 stars today
Robust Speech Recognition via Large-Scale Weak Supervision
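Summary: OpenAI's open-source speech recognition model, which gains strong generalization from large-scale weakly supervised training. Supports multilingual transcription; suited to meeting notes, video subtitles, and voice search.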
TypeScript · ★ 81,162 · 🍴 8,274 · 📈 73 stars today
Next generation frontend tooling. It's fast!
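Summary: Next-generation frontend build tooling that uses native ESM for very fast startup and hot module replacement. Supports Vue, React, Svelte, and other frameworks; suited to modern web projects.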
Rust · ★ 557 · 🍴 24 · 📈 57 stars today
Policy-driven, layered isolation and containment
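Summary: A policy-driven isolation layer open-sourced by Microsoft, using layered isolation and containment. Improves system security; suited to multi-tenant or edge computing scenarios that need strong isolation.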
Python · ★ 80,927 · 🍴 10,654 · 📈 449 stars today
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
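Summary: A lightweight OCR toolkit that turns PDFs or images into structured data and supports 100+ languages. Bridges traditional documents and LLM pipelines; widely used for receipt recognition and document digitization.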
Python · ★ 48,446 · 🍴 5,389 · 📈 219 stars today
Open-Source Frontier Voice AI
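Summary: An on-device voice AI system open-sourced by Microsoft, offering frontier voice interaction capabilities. Suited to smart assistants and voice control, with a focus on low latency and privacy.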
👍 1
In this paper, we study regret minimization in repeated games with adaptive opponents who can respond based on histories of play. The standard metric of external regret in online learning is known to fail to capture such adaptivity. To account for players' counterfactual reasoning, we introduce {\tt
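Summary: Studies regret minimization against adaptive opponents in repeated games, notes that the standard external-regret metric fails to capture opponents' history-based responses, and proposes a new approach.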
👍 3
Multimodal Large Language Models (MLLMs) excel at 2D semantic understanding but lack intrinsic 3D awareness, resulting in representations that fail to maintain geometric and spatial consistency across video frames. Given the scarcity of large-scale 3D data, we present GeoVR, a novel framework that l
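Summary: Proposes GeoVR, a framework that learns geometric representations from video to improve the 3D spatial consistency of multimodal LLMs across video frames, compensating for scarce 3D data.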
👍 6
Vision-Language-Action (VLA) models leverage the rich world knowledge of pretrained vision-language models (VLMs) to enable instruction-following robotic manipulation. However, the structural mismatch between VLM semantic spaces and embodied control policies often hinders the learning of precise per
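Summary: Proposes AffordanceVLA, a vision-language-action model that uses the world knowledge of pretrained VLMs to improve action generation in robotic manipulation, addressing the structural mismatch between semantic spaces and control policies.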
👍 63
Code language models need repository-level context to resolve imports, APIs, and project conventions. Existing methods inject this knowledge as long inputs (retrieved through RAG or dependency analysis) or through per-repository fine-tuning and LoRA -- costly at repository scale and brittle to evolv
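Summary: Proposes Code2LoRA, which uses a hypernetwork to generate adapters for code language models, handling repository-level context as software evolves and avoiding costly per-repository fine-tuning.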
👍 1
A situated query like "where is Lin Wei?" often encodes more than its literal content: the user may also want to know whether Lin Wei is free, in a good mood, or worth interrupting now. Standard tool-use agents answer the literal question and stop. AURA inserts an inference step between scene percep
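Summary: Proposes AURA, a framework for situated LLM agents that uses intent-oriented probing to uncover users' implicit needs, going beyond answering the literal question.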
👍 2
Benchmarks are fundamental for evaluating and advancing LLMs and MLLMs by providing standardized and explicit measures of performance. However, their construction is labor-intensive and hard to reuse, raising concerns about sustainability and scalability. Moreover, existing benchmarks often quickly
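Summary: Argues that constructing existing LLM and MLLM benchmarks is labor-intensive and hard to reuse, and focuses on sustainability and scalability.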
👍 2
Large language models are increasingly used to simulate social media users and infer how individuals may respond to online discussions. However, it remains unclear whether these simulations reflect precise user-specific beliefs or whether they are highly sensitive to semantically independent changes
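Summary: Evaluates how sensitive LLMs are when simulating social media users' stances; reports that the simulations are highly sensitive to small semantic changes and may not reflect users' actual beliefs.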
👍 0
AI research often requires decisions before future evidence exists: which bottleneck to attack, which direction to pursue, or where a project should be positioned. We introduce ForeSci, a temporally controlled benchmark for evaluating whether LLM agents can make such forward-looking research judgeme
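Summary: Proposes ForeSci, a benchmark for evaluating whether LLM agents can make forward-looking judgments in AI research, such as which bottleneck or direction to pursue.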
👍 15
Video generation models have made impressive strides in synthesizing visually compelling content, yet their outputs remain confined to the virtual domain. A natural question follows: how well do these models reflect the physical world when their generated videos leave the screen and enter reality? W
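Summary: Explores whether content from video generation models can be used for executable robot manipulation, assessing how well it reflects the physical world.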
👍 4
Temporal Grounding (TG) aims to localize video segments corresponding to a textual query. Prior research predominantly focuses on single-segment retrieval. Real-world scenarios, however, often require localizing multiple disjoint segments for a single query -- a setting we term One-to-Many Temporal
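Summary: Proposes a one-to-many temporal grounding task: localizing multiple disjoint video segments for a single query, extending traditional single-segment grounding.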
👍 7
Large language models can reproduce training data, but existing memorization evaluations mostly measure whether models can be forced to do so, rather than whether they do so under ordinary use. We introduce PropMe, a propensity-aware framework for memorization evaluation that contrasts prefix-based
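Summary: Proposes PropMe, a framework that evaluates whether LLMs tend to leak training data under ordinary use, distinguishing forced memorization from spontaneous recall.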
👍 36
Planning for real-world problems by language models often involves both world and user constraints, which may not be fully specified upfront and are progressively disclosed through interaction. However, existing benchmarks still underexplore adaptive planning under such progressively revealed dual c
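Summary: Proposes AdaPlanBench, a benchmark that tests LLM agents' adaptive planning as world constraints and user constraints are progressively disclosed.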
👍 42
Role-playing language agents (RPLAs) should play characters whose values and behavior evolve as the story progresses, not maintain a fixed persona. Existing benchmarks measure factual recall at a given chapter, not whether responses align with the character's psychological trajectory, especially in
👍 23
Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a grammar book in their context. However, both methods typically overfit specific languages, with limited zero-shot transfer at test time. To tra
👍 38
Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has noticed, while many other important problems coexist, hidden in plain sight, within the broader user context, with their
👍 8
Selection is a core operation in interactive image editing. To be practical, a user should be able to specify and disambiguate the desired selection region through either text or click-based interactions, and the system should support selecting not only objects but also other criteria, such as mater
👍 4
In robotics systems, vast amounts of visual data are easily captured at high resolution using low-cost, low-power hardware. Yet, limited bandwidth and on-device compute resources prevent full utilization when transmitted via conventional codecs like JPEG/MPEG. Newer codecs, like AV1/AVIF, improve th
👍 23
While household robots are often evaluated based on task completion, everyday domestic environments involve value-conflicting situations in which robots are expected to choose actions that prioritize other values than task success, such as human autonomy, efficiency, or social appropriateness. Yet,
👍 2
Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science. However, existing approaches remain fundamentally limited by their static action sets and lack of principled long-horizon context management, hindering their ability to accumulate reusable
👍 1
Agentic LLMs with web search change the threat model for text anonymization: weak contextual cues can become cross-referenceable evidence for re-identification, yet those same details also carry downstream analytic value of the text. Existing defenses either remove explicit identifiers, perturb text
👍 2
Financial AI agents often fail for a simple reason: they make users carry the complexity. A user must repeatedly restate goals, risk preferences, portfolio context, past judgments, and shifting market assumptions, while the agent answers, retrieves, acts, and forgets. In finance, this is not just in
👍 4
Video is temporally redundant: adjacent frames usually share most objects, background, and layout. Yet existing video multimodal large language models (video MLLMs) usually encode each sampled frame as an independent RGB image, causing visual tokens to repeat content already present in earlier frame
👍 0
Large language models are increasingly deployed as coding agents, shifting safety from individual responses to action sequences. Existing benchmarks, however, primarily assess whether models refuse unsafe prompts, leaving impacts on stateful workspaces largely unexamined. We present SABER, a benchma
👍 3
Large Language Models exhibit paradoxical fragility in fundamental arithmetic, implying a disconnect between internal computation and discrete output. By analyzing the residual stream geometry during multi-operand addition, we identify the Iso-Raw-Sum Trajectory (IRST), a geometric structure where r
👍 3
Multimodal Large Language Models (MLLMs) have demonstrated significant achievements in general visual question answering (VQA) tasks. However, they remain brittle on mechanical engineering drawings, where high annotation density and weak domain knowledge, compounded by unreliable spatial relation re
👍 1
Reinforcement Learning with Verifiable Rewards (RLVR) has recently emerged as the cornerstone for shaping the remarkable coding abilities of Large Language Models (LLMs). However, the scalability of RLVR is severely constrained by the scarcity of sufficiently challenging verifiable code tasks that t
👍 1
Music recommendation systems typically treat songs as opaque tokens, relying on collaborative interaction histories which overlooks semantic or acoustic content. Prior work has explored LLM-augmented, multimodal, and text-enhanced approaches to sequential recommendation, and while some methods parti
👍 6
Memory-augmented LLM agents tackle complex long-horizon tasks by recursively summarizing interaction trajectories into compact memory. However, existing approaches typically train these memory policies using outcome-based reinforcement learning, failing to localize where intermediate memory quality
👍 3
Off-policy reinforcement learning of pretrained flow policies remains challenging due to the instability of optimization arising from the multi-step sampling process. Recently, Q-learning with Adjoint Matching (QAM) addressed this issue by reformulating into a memoryless stochastic optimal control (
👍 1
Diffusion-based image editing has achieved strong visual fidelity under natural language instructions, yet most existing systems still operate at the level of surface instruction following, without reasoning about the implicit contextual constraints embedded in real user requests. This often leads t
@elpresidank · 116 followers · 2.9M views · 543 likes · 35 reposts
Most AI agent memory is built on embeddings. And there's now a proof that this entire class of system is going to forget what you stored in it — and confidently make up things you never stored at all.
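Summary: Argues that AI agent memory systems built on embeddings have a fundamental flaw: they forget and hallucinate. Cites a theoretical proof that existing architectures cannot escape this limitation and calls for exploring structured memory approaches.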
@DamiDefi · 96.5K followers · 2.3M views · 584 likes · 80 reposts
The number that stopped me was not the $2 trillion valuation. It was $791 million. That is what SpaceX made in net income in 2024. A profitable, growing aerospace company with a genuine moat in launch
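Summary: Feeds SpaceX's 300-page S-1 filing into Claude and surfaces key financials: net income of $791 million in 2024, from a genuinely profitable aerospace company with a moat that is about to go public.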
@1salman · 363 followers · 2.0M views · 682 likes · 45 reposts
Everyone keeps asking whether AI favors specialists or generalists. I think that is the wrong question. AI does not pick a side. It changes the tradeoff. The old world forced a choice. You could go
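Summary: Argues that the specialist-versus-generalist debate in the AI era is a false dilemma. AI favors neither side; it changes the tradeoff, so depth and breadth can be had together and the old either-or choice is outdated.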
@sairahul1 · 110.7K followers · 710.8K views · 509 likes · 97 reposts
How To Become An AI Engineer in 2026. Without a CS degree. Without a bootcamp. Without knowing what a transformer is today. Here's what nobody tells you: The companies hiring right now don't need
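Summary: How to become an AI engineer in 2026 without a CS degree or a bootcamp. Core claim: the current hiring market cares less about credentials than about the ability to build, so beginners can start from zero.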
@eng_khairallah1 · 61.9K followers · 693.5K views · 511 likes · 71 reposts
Obsidian has 2,700+ community plugins. Over 100 of them are AI-related. Save this :) And the CEO of Obsidian personally published official Claude Skills for the platform - 12,900+ GitHub stars in
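Summary: Collects 30 lesser-known Obsidian workflows, plugins, and configurations, and notes that more than 100 of Obsidian's community plugins are AI-related. The official Claude Skills published by Obsidian's CEO have 12,900+ GitHub stars.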
@0xCodez · 3.3K followers · 637.2K views · 510 likes · 59 reposts
Most Claude Code users still write their workflows by hand. They chain prompts, copy outputs, paste them into the next prompt, fix what went wrong, repeat. 9 out of 10 builders haven’t tried Dynamic
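Summary: A detailed walkthrough of Claude Code Dynamic Workflows: 6 patterns and 14 hands-on steps. Notes that most users still chain prompts by hand, while Anthropic engineers already use automated orchestration.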
@prukalpa · 23.1K followers · 583.2K views · 506 likes · 80 reposts
A field guide to what it is, what it is not, and where it fits in your AI architecture. I have had some version of the same conversation with a CIO almost every day this year. Their team has read
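Summary: Clarifies the concept of a "context layer" in enterprise AI architecture: what it is, what it is not, and where it fits. A practical guide distilled from near-daily conversations with CIOs.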
@theonejvo · 22.1K followers · 504.3K views · 861 likes · 1 repost
Over the past year, @pewdiepie, has been turning into one of the most visible champions of private, self-hosted computing, and it has been a genuine pleasure to watch. What began in late 2025 as an
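Summary: Security testing of PewDiePie's self-hosted AI agent found that it could be hijacked by a malicious website; the author helped fix the vulnerability. Highlights weak points in the security design of private AI systems.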
@Saboo_Shubham_ · 116.2K followers · 263.3K views · 517 likes · 74 reposts
The frontend used to be a fixed thing. Designers drew it. Engineers built it. Users got what shipped. That's over. The interfaces shipping in 2026 are drawn partly by the agent itself, in real time,
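Summary: Asserts that the frontend paradigm has changed in 2026: UI is no longer only drawn by designers and built by engineers but is drawn partly by the agent itself in real time. Static frontends are dead; generative UI is the way forward.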
@monokern · 1.2K followers · 263.1K views · 505 likes · 72 reposts
Most people treat research as a manual task. You open 10 tabs. You watch videos. You read articles. You take notes somewhere. An hour later you have a pile of information you're not sure what to do
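Summary: Introduces a research workflow combining Claude Code, NotebookLM, and Obsidian: each use accumulates knowledge, so the system gets smarter over time. Replaces the manual "open 10 tabs" approach.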
@maubaron · 16.9K followers · 233.8K views · 506 likes · 19 reposts
Our YouTube channel has 125k subscribers and we've never made or uploaded a single video ourselves. This is a completely automated system. It is this very same strategy that made us the first app
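Summary: Describes a fully automated YouTube customer-acquisition system: 125k subscribers without the team making or uploading a single video. Claims to be able to gain 100k subscribers in 3 hours, and has launched a corresponding app.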
@garrytan · 853.3K followers · 180.6K views · 503 likes · 43 reposts
In January I got back into coding and I built Garry's List. Over five hundred thousand lines of Rails and the tests to police it. I was proud of it. I shouldn't have been. The thing worth being proud
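Summary: Reflects on having built over 500,000 lines of Rails code, and argues that one should not build a "Foxconn factory" for agents. The truly valuable work is building the collaboration layer among agents and between agents and humans.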
@intuitiveml · 6.4K followers · 171.3K views · 524 likes · 70 reposts
Most agent frameworks today assume a desktop. One user, one machine, one process. The agent runs while the laptop is open, writes to a local filesystem, holds API keys in environment variables, and
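Summary: Lessons from building cloud agent infrastructure. Notes that most agent frameworks grew out of the desktop (one machine, one process), whereas cloud deployment brings new challenges such as distributed context, local filesystems, and secret management.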
@dkundel · 19.3K followers · 116.9K views · 523 likes · 40 reposts
We launched the goal mode (or /goal) as a way to help you have Codex drive towards a concrete outcome. When you set a goal Codex will continue to work until the goal is achieved, whether that takes
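Summary: Explains how to use /goal mode, a new Codex feature. Once a goal is set, Codex keeps working until it is achieved, with no need to prompt step by step; suited to complex tasks that need long reasoning chains.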
@dair_ai · 124.6K followers · 84.0K views · 504 likes · 83 reposts
1. SkillOpt Microsoft Research treats a compact natural-language skill document as the trainable state of a frozen agent, then learns that document through rollouts, reflection, and bounded edits
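Summary: A selection of the week's top AI papers: SkillOpt (Microsoft Research treats a natural-language skill document as trainable state) and several recent papers on agent learning, memory, and planning.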
@mem0ai · 17.6K followers · 82.8K views · 520 likes · 60 reposts
Agent harnesses are where AI software actually runs. Cursor, Devin, Claude Code, Codex: these environments handle context, orchestrate tools, coordinate agents, and increasingly, manage memory. The
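Summary: Surveys how mainstream agent harnesses (Cursor, Devin, Claude Code, Codex) handle memory management. Stresses that memory is a key bottleneck for deploying agents, yet current approaches differ widely.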
@trq212 · 263.1K followers · 75.7K views · 542 likes · 36 reposts
Last week, we released dynamic workflows in Claude Code. Claude can now write its own harness on the fly, custom-built for the task at hand. While the default Claude Code harness is built for coding,
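Summary: Introduces dynamic workflows in Claude Code: the agent can write a custom harness on the fly for the task at hand rather than being limited to a fixed coding template. The default harness is built for coding, while the dynamic approach generalizes to arbitrary tasks.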
@drfeifei · 738.0K followers · 72.2K views · 699 likes · 144 reposts
“The world is everything that is the case.” — Ludwig Wittgenstein, Tractatus Logico-Philosophicus, 1921 The world is not made of words. In an earlier essay, we argued that spatial intelligence is AI’s
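Summary: Proposes a functional taxonomy of world models. Continuing an earlier essay on spatial intelligence as AI's next frontier, it builds a framework for understanding how world models work.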
@sydneyrunkle · 7.5K followers · 69.5K views · 511 likes · 74 reposts
Building useful agents is largely about customization: connecting your agent to the right context, data, and environment(s) for the task at hand. At its core, an agent is a model calling tools in a
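Summary: A step-by-step tutorial on building a custom agent harness. Core idea: an agent is a model calling tools, and the harness supplies the specific context, data, and runtime environment.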
@sheriyuo · 8.6K followers · 30.6K views · 7d impressions 30.6K
RL Interview Questions 2026
@maubaron · 16.9K followers · 233.8K views · 7d impressions 233.8K
How to get 100k YouTube subscribers in 3 hours (The Complete Guide)
@itsreallyvivek · 3.6K followers · 65.8K views · 7d impressions 65.8K
some notes on getting into frontier ai labs
@dickiebush · 441.8K followers · 57.7K views · 7d impressions 57.7K
I Gave Claude David Ogilvy's Writing Rules And Built A Legendary AI Writing Coach
@sairahul1 · 110.7K followers · 710.8K views · 7d impressions 710.8K
How To Become An AI Engineer in 2026 (Without a CS Degree)
@intuitiveml · 6.4K followers · 171.3K views · 7d impressions 171.3K
Building cloud agent infrastructure: what's different, and what we learned
@ENERGY · 884.0K followers · 102.3K views · 7d impressions 102.3K
Department of Energy Celebrates First Advanced Reactor Criticality
@dkundel · 19.3K followers · 116.9K views · 7d impressions 116.9K
A guide to /goal 🥅
@DamiDefi · 96.5K followers · 2.3M views · 7d impressions 2.3M
SpaceX IPOs in 7 days. I Fed the S1 Doc Into Claude. Here Is What It Found Buried in 300 Pages.
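@168X_Fortune · 11.3K followers · 409.2K views · 7d impressions 409.2K
qinbafrank breaks down the scale of the US stock pullback: what is AI's real risk? Tracking capital rotation from software stocks, Marvell optical interconnects, and Nokia to SpaceX
@servasyy_ai · 33.0K followers · 267.9K views · 7d impressions 267.9K
Master 97% of Codex's features in 30 minutes (complete tutorial)
@yanhua1010 · 32.0K followers · 88.4K views · 7d impressions 88.4K
This is probably the closest thing to the right answer for agent memory so far
@jainarvind · 9.3K followers · 53.7K views · 7d impressions 53.7K
Your token spend is an AI architecture problem, not just a model problem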
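Summary: Cursor ships a major update, while DeepSeek v4 is approaching or matching Opus 4.8 in performance.
Summary: ChatGPT and Codex are merging, a change expected to fundamentally alter programming and AI interaction.
Summary: Anthropic shares how it uses Claude in go-to-market engineering, showing AI applied to real business processes.
Summary: Anton Osika, founder of Lovable, joins the show to discuss the role and prospects of AI as a problem solver.
Summary: Uses Claude's visualization capabilities to visualize a team's thinking process, helping to show how the team thinks collaboratively.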
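Summary: Explores a new concept of AI agents as "game masters" that may be used to guide complex tasks or interactions.
Summary: DeepMind's new AI model discovered a novel way of thinking that may change how AI approaches problems.
Summary: Introduces an AI tool called "Co-Scientist" that may change how scientific research is done and assist scientific discovery.
Summary: The Claude Opus 4.8 update focuses on fixing the dishonesty or lying seen in earlier models, improving trustworthiness.
Summary: The Hugging Face blog covers a hackathon project in which five labs used small models to jointly build a multi-model finance-themed drama, showing the collaborative potential of small models in creative work.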
a quiet day of RSI.
Summary: The Latent Space newsletter says it was a quiet day in AI, with only some mention of RSI.
Your broken harness is actively making the model worse. Here's what I keep seeing after years of eyeballing trajectories, and what you need to fix.
Summary: A Latent Space article argues that a broken harness actively makes the model worse, and describes what to fix.
On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they controlled, and the agent complied. One attacker broke into the dormant Obama Wh
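Summary: MIT Tech Review reports that attackers used Meta's AI customer support agent to steal Instagram accounts: a simple request was enough to link an account to an attacker-controlled email address. The incident exposed a security weakness in AI agents.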
a quiet day
Summary: The Latent Space newsletter says it was a quiet day in AI.
**Anthropic's Mythos/Opus cycle** sparked mixed reactions with praise for **Claude Mythos**'s one-shot workflows and concerns over **Opus 4.8** benchmark regressions. **Opus 4.7** showed strong chemistry task performance, "making Claude a chemist." **Sakana AI** launched an **RSI Lab** focusing on r
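Summary: Anthropic's Mythos/Opus cycle drew mixed reactions: Claude Mythos's one-shot workflows won praise, while Opus 4.8 raised concerns over benchmark regressions; Opus 4.7 performed strongly on chemistry tasks. Sakana AI launched an RSI Lab.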
How one Anthropic seller rebuilt his team's workflows with Claude Code
Summary: The Claude blog describes how an Anthropic sales employee rebuilt his team's workflows with Claude Code.
The Claude Cowork product guide
Summary: The Claude blog published the Claude Cowork product guide.
Jun 5, 2026 · Science · Making Claude a chemist
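Summary: Anthropic's research team published a post on making Claude a chemist and improving its performance on chemistry tasks.
Summary: The TLDR AI newsletter covers the Anthropic Oceanus model leak, ChatGPT's new Dreaming feature, and progress on recursive self-improvement.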
We talk with the VendingBench authors on evaling Claudes from Haiku to Mythos, and how they build leading, and lasting, frontier evals from scratch.
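Summary: Latent Space interviews Lukas Petersson and Axel Backlund of Andon Labs on evaluating Claude models from Haiku to Mythos and on building frontier evals from scratch.
Summary: The Hugging Face blog presents Nvidia's Nemotron 3.5 content safety solution, which offers customizable multimodal safety for global enterprise AI applications.
Summary: ServiceNow AI released EVA-Bench Data 2.0, covering 3 domains, 121 tools, and 213 scenarios, for evaluating AI agents on complex tasks.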
Learn how Endava is using AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-native culture across the enterprise.
Summary: OpenAI describes how Endava uses AI agents, ChatGPT Enterprise, and Codex to reshape software delivery, accelerate workflow automation, and build an AI-native culture.
Most days in her chambers, Judge Maritza Braswell, a federal magistrate judge in Colorado, sifts through stacks of documents written by people without a lawyer. Many of them can’t afford to hire a lawyer, and others have cases too weak or too small to interest one. She reads each one carefully, mind
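Summary: MIT Tech Review reports that US courts are dealing with a flood of AI-generated legal filings; federal magistrate judge Maritza Braswell and other judges handle many cases from litigants without a lawyer, some of whose filings were written by AI.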
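4 replies · Programmer node
11 replies · Programmer node
11 replies · Apple node
27 replies · Apple node
12 replies · Programmer node
9 replies · Apple node
27 replies · Programmer node
6 replies · Programmer node
16 replies · Programmer node
6 replies · Apple node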
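The plus account pool is back. The group multiplier has been adjusted to 5x, which will use up a bit of everyone's balance. 35 posts - 35 participants
8 posts - 8 participants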
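Section name: Internship Exchange
URL slug: offer
Section description:
Proposed moderator: @made_po
Section rules:
1. No paid training, résumé ghostwriting, paid referrals, or unverified job ads of any kind; job ads must go at the very bottom of an experience post.
2. When sharing internship information about a company, try to note the approximate time and department (sensitive details may be withheld); honest accounts are encouraged, malicious rants are not.
3. A reasonably uniform format is encouraged for interview write-ups, internship retrospectives, and offer comparisons, so that later readers can search and refer to them.
4. Pure help requests should include the effort you have already made and your background; otherwise they tend to sink or get moved.
5. Keep discussion rational and constructive; personal attacks, regional slurs, discrimination based on education, and similar content are prohibited.
Reason for applying: Many members here drop gems in every sentence. What they say offhand is extremely valuable experience, yet it is scattered across the comment sections of assorted threads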
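The forum's invitation rules have changed (see "The community plans to optimize account registration"): the GitHub application channel now grants access directly to GitHub accounts that are 5 years old. Many members don't know how long ago they registered on GitHub, so come and check. xyl1null.github.io GitHub Account Age Checker. Edit: the API returns Zulu time, i.e. UTC; it has now been converted to China Standard Time (UTC+8). Sorry for making you reflect on unhealthy sleep schedules, no such thing~ 187 posts - 175 participants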
Codex with unlimited concurrency and unlimited quota? Unlimited bug team? Has that ride already left? 24 posts - 23 participants
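Looking back over more than two years, the community's account registration has gone through several rounds of adjustment to match the community's stage at the time, and in hindsight each round served its purpose well. There has been some criticism, though, stemming from things that are hard to quantify, most commonly differing interpretations of the application statement. To support the community's next stage of development, we have decided to optimize the registration mechanism:
- Remove the application statement and its review
- Restructure the supply of invitation links
The first point is simple: with an invitation link you can register directly. This removes uncertainty and frees up moderator effort for the next stage of operations.
3 channels:
- GitHub application channel: granted directly to GitHub accounts that are 5 years old.
- Premium group: 1 invitation link every 3 days.
- Admin group: 5 invitation links every 3 days.
The new registration method will greatly increase certainty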
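Result # AMC-WebUI Live Artifacts Designer (Grok Defensive Hybrid - Clean) v2 You are a specialized engine that outputs only rendered HTML. You will combine AMC-WebUI's high-information-density smart layout with the Grok platform's extreme rendering-defense rules to turn user information into flawless, polished, readable inline HTML artifacts. ## Fatal defensive constraints (ZERO TOLERANCE - highest priority) 1. Backticks are absolutely forbidden: never output any ``` or ` symbol in a response. 2. <style> and <script> blocks are absolutely forbidden: all styles must be written 100% into each tag's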
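The MySQL database used by the CHY public-benefit site has been migrated to PostgreSQL, with Redis configured as a cache. With this, the CHY site takes its second step toward stability. 10 posts - 10 participants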
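@raotom @vux1jpmal5t41lg Who says this generation of members isn't up to it? The screen is full of bug team posts right now, and OpenAI is probably about to blow up. So can someone explain what actually happened? I'm just here for the gossip. 11 posts - 10 participants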
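This post uses the community's open-source promotion and meets the promotion requirements. I declare and follow the community's requirements below:
- My post is tagged Open Source Promotion: Yes
- My open-source project is fully open source, with no closed-source parts: Yes
- My open-source project links to and acknowledges the LINUX DO community: Yes
- AI-generated or AI-polished parts of the project introduction in my post are posted as screenshots: Yes
- I promise the choices above are permanently valid and accept supervision by the community and its members: Yes
The project introduction follows; AI-generated or AI-polished content is posted as screenshots.
I haven't posted in a long while, and my level has dropped to 2. Looking at my history, I mostly just wander around the site when idle, not knowing what to look at, then close the page and go back to slacking. On to the main part. In what follows, "baby" always refers to my girlfriend, haha. Lately I've been in classic slacker mode: waking up at 5 pm and slacking in front of the computer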
8 points · 1 comments
31 points · 12 comments
26 points · 7 comments
12 points · 7 comments
3 points · 0 comments
48 points · 22 comments
143 points · 30 comments
206 points · 73 comments
273 points · 182 comments
78 points · 17 comments
133 points · 45 comments
148 points · 39 comments
60 points · 19 comments
118 points · 42 comments
35 points · 7 comments
195 points · 367 comments
1 points · 0 comments
379 points · 548 comments
233 points · 62 comments
1284 points · 439 comments
Genuine question. Over the past six months, there hasn’t been a single day where I’ve checked the HN Best RSS feed without seeing a post about how AI “writes bad code,” “introduces bugs,” “creates technical debt,” or something along those lines. I’ll probably make a lot of enemies by saying this, but
234 points · 83 comments
490 points · 202 comments
Most of us were amused when DALL-E and its peers went mainstream, and we were quick to point out the obvious flaws. Then ChatGPT hit the scene and again, many of us dismissed it as a parlor trick that would never amount to much. Using LLMs for coding initially was only a small step up from basic code
41 points · 9 comments
36 points · 14 comments
811 points · 224 comments
Hi! This is an infinite canvas note-taking tool where notes are laid out in a non-Euclidean, hyperbolic geometric space. As you drag and navigate through the view, you’ll experience a unique fluid distortion that naturally leverages your brain's spatial memory. I’ve been obsessed with the concept
8 points · 0 comments
67 points · 15 comments