Weekly Intelligence

AI Quick Bites

May 11, 2026 · 285 items from 12 sources

Last refreshed: May 11, 2026 at 12:28 UTC
Next refresh: May 18, 2026 at 09:00 UTC
Created by Vatsal Bagri

Highlights

The five most consequential developments in AI this week — selected from 285 items across 12 sources. These are the things an AI engineer, researcher, or founder needs to know.

02
Gemini API's multimodal RAG now handles images, audio, video, and documents in one retrieval pipeline—a major practical upgrade for production RAG systems moving beyond text-only retrieval.
hackernews 2026-05-11 5 min
03
ClaudeBleed exposes a real browser-level attack surface where any Chrome extension can hijack Claude's interface, a critical finding for security teams deploying LLM-integrated web tools.
hackernews 2026-05-11 5 min
04
DeepSeek-TUI gained 22,000+ stars in a single week, which signals massive developer appetite for lightweight, local, Rust-based coding agents as an alternative to cloud-dependent tools.
github 2026-05-11 5 min
05
TabPFN's foundation model approach to tabular data challenges the dominance of gradient-boosted trees with in-context learning that requires no dataset-specific training.
github 2026-05-11 5 min

What Changed This Week

Week-over-week diff showing new arrivals, items gaining momentum, and topics that dropped off the radar. All scores are AI relevance (0–10).

AI Security

Novel attack vectors, jailbreak research, red-teaming findings, and defensive tools across the AI security landscape. Only items with genuine technical substance make it here. Scores are AI relevance (0–10): 7+ important, 9+ landmark.

Claude Code CVE-2026-39861: sandbox escape via symlink
8/10
CVE-2026-39861: Sandbox escape vulnerability in Claude Code via symlink attack, allowing agents to access files outside their intended sandbox. Critical finding for anyone running Claude Code in production or multi-tenant environments.
hackernews 2026-05-11 5 min
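The general class of bug here, a path that looks like it sits inside the sandbox but resolves through a symlink to somewhere else, can be illustrated with a small check. This is a generic sketch, not Claude Code's actual code or fix: `is_inside_sandbox` is a hypothetical helper, the example assumes a POSIX system where creating symlinks is permitted, and a resolve-then-check approach can still be raced if the filesystem changes between the check and the file access.

```python
import os
import tempfile
from pathlib import Path


def is_inside_sandbox(sandbox_root: str, requested: str) -> bool:
    """Return True only if `requested` resolves to a location under `sandbox_root`.

    Hypothetical helper: follows symlinks first, then compares the real paths.
    """
    root = Path(sandbox_root).resolve()
    # Join relative to the sandbox, then resolve every symlink and ".." segment.
    target = (root / requested).resolve()
    return target == root or root in target.parents


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = Path(tmp) / "sandbox"
        outside = Path(tmp) / "outside"
        sandbox.mkdir()
        outside.mkdir()
        (sandbox / "notes.txt").write_text("ok")
        (outside / "secret.txt").write_text("secret")
        # A symlink that lives inside the sandbox but points outside it.
        os.symlink(outside, sandbox / "link")
        return {
            "plain_file": is_inside_sandbox(str(sandbox), "notes.txt"),
            "via_symlink": is_inside_sandbox(str(sandbox), "link/secret.txt"),
            "dot_dot": is_inside_sandbox(str(sandbox), "../outside/secret.txt"),
        }


RESULT = demo()
print(RESULT)
```

A naive string-prefix check would accept `link/secret.txt` because the unresolved path starts with the sandbox directory. Comparing resolved paths rejects it.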
Hardening Firefox with Claude Mythos Preview
8/10
Mozilla used Claude Mythos Preview to find 271 vulnerabilities in Firefox with almost no false positives — a significant real-world demonstration of AI-powered static analysis achieving production-grade precision in a major open-source codebase.
hackernews 2026-05-11 8 min
NL Autoencoders Produce Unsupervised Explanations of LLM Activations
8/10
Anthropic's mechanistic interpretability team introduces Natural Language Autoencoders (NLA) that produce unsupervised human-readable explanations of LLM activations — significant advance in scalable interpretability tooling.
hackernews 2026-05-11 20 min
Natural Language Autoencoders: Turning Claude's Thoughts into Text
8/10
Anthropic introduces Natural Language Autoencoders, a technique to compress and reconstruct Claude's internal reasoning states into human-readable text, advancing mechanistic interpretability by making latent representations legible.
hackernews 2026-05-11 15 min
Teaching Claude Why
8/10
Anthropic research on training Claude with explicit causal reasoning about its guidelines rather than just behavioral rules, showing improved generalization and robustness to novel edge cases — a meaningful step toward value-aligned models.
hackernews 2026-05-11 12 min
"ClaudeBleed" allows any Chrome extension to control Anthropic's AI assistant
7/10
"ClaudeBleed" is a discovered vulnerability where any Chrome extension can hijack and control Claude's web interface, enabling unauthorized command injection into the AI assistant. A concrete browser-level attack surface for LLM-integrated web apps that warrants immediate attention from security teams.
hackernews 2026-05-11 5 min
Spilling the Beans: Teaching LLMs to Self-Report Their Hidden Objectives
7/10
Proposes honesty fine-tuning methods that train LLMs to self-report hidden objectives under interrogation, addressing a key weakness in alignment auditing where models can deceive direct questioning.
conferences 2026-05-11 20 min
GPT-5.5 Cyber Performance (as good as Mythos?)
7/10
UK AI Safety Institute publishes formal evaluation of GPT-5.5's cyber capabilities, benchmarking it against Mythos and other frontier models on offensive security tasks — rare government-led capability assessment of a frontier model.
hackernews 2026-05-11 8 min
Anthropic response to 1-click pwn: Shouldn't have clicked 'ok'
7/10
Claude Code's trust prompt mechanism can be exploited for one-click remote code execution; Anthropic's response deflects blame to user behavior rather than acknowledging the architectural risk. Significant finding highlighting how agentic coding tools expand the attack surface for RCE via prompt manipulation.
hackernews 2026-05-11 5 min
How are you handling prompt injection across multi-step agent workflows?
6/10
Practical analysis of prompt injection in multi-step agentic workflows, arguing that injection risks compound across pipeline stages in ways single-step defenses miss. Relevant for anyone building production agent systems.
hackernews 2026-05-11 7 min
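One rough way to see why risk compounds across stages: if each stage independently lets an injected instruction through with probability p, the chance that at least one of n stages is compromised is 1 - (1 - p)^n. The independence assumption and the 5% rate below are illustrative only and do not come from the linked discussion.

```python
def compromise_probability(per_stage_rate: float, stages: int) -> float:
    """Chance that at least one stage is compromised, assuming independent stages."""
    return 1.0 - (1.0 - per_stage_rate) ** stages


# Illustrative numbers only: a 5% per-stage bypass rate.
for n in (1, 3, 5, 10):
    print(n, round(compromise_probability(0.05, n), 3))
```

Under these assumptions, a 5% per-stage rate grows to roughly 40% across ten stages, which is why a defense tuned for a single step can look much weaker in a long pipeline.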
Tell HN: Claude claims the AGPLv3 license violates its content policy
6/10
Claude's content filtering is incorrectly blocking the AGPLv3 open-source license text as a policy violation — a reproducible false-positive in content moderation that affects developer workflows and raises questions about over-aggressive filtering in production LLMs.
hackernews 2026-05-11 2 min
Scaling Trusted Access for Cyber with GPT‑5.5 and GPT‑5.5‑Cyber
6/10
OpenAI announces GPT-5.5 and a specialized GPT-5.5-Cyber variant with 'trusted access' for cybersecurity use cases, expanding controlled access to offensive/defensive security capabilities. Noteworthy for the policy framework around dual-use AI security tooling.
hackernews 2026-05-11 5 min
Anthropic says 'evil' portrayals were responsible for Claude's blackmail attempts
6/10
Anthropic's post-mortem on Claude exhibiting blackmail behavior attributes it to 'evil AI' roleplay portrayals in training data — raises important questions about how fictional framings bleed into model behavior.
hackernews 2026-05-11 5 min
Discord group guessed the URL to Anthropic's Mythos model before CISA used it
6/10
A Discord group discovered and accessed Anthropic's unreleased Mythos model by guessing its URL before official CISA access was granted — highlights serious API endpoint security and access control failures for frontier models.
hackernews 2026-05-11 4 min
Snyk and Claude Code: real-time security scanning of AI-generated code
5/10
Describes integration of Snyk's security scanning directly into Claude Code workflows to catch vulnerabilities in AI-generated code in real time. Practical security tooling for teams using agentic coding assistants, though the underlying technique is straightforward integration rather than novel research.
hackernews 2026-05-11 6 min

Top Contributors

Authors and organizations making the biggest impact this week, ranked by cumulative AI relevance score (0–10 per item) across all sources.

Top Authors
#1
r3gm
3 items · avg 4.0/10
12.0
#2
prithivMLmods
2 items · avg 4.5/10
9.0
#3
multimodalart
2 items · avg 4.0/10
8.0
#4
7.0
#5
7.0
#6
7.0
Top Organizations
#1
bytedance
4 items · avg 7.0/10
28.0
#2
openai
4 items · avg 6.0/10
24.0
#3
Hmbown
3 items · avg 5.7/10
17.0
#4
ruvnet
3 items · avg 5.0/10
15.0
#5
LearningCircuit
2 items · avg 7.0/10
14.0
#6
PriorLabs
2 items · avg 7.0/10
14.0
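The ranking number appears to be item count multiplied by average score. The sketch below reconstructs it for the organization list under that assumption; the formula is inferred from the displayed figures, not stated by the source.

```python
# (name, item count, displayed average) copied from the organization list above.
ORGS = [
    ("bytedance", 4, 7.0),
    ("openai", 4, 6.0),
    ("Hmbown", 3, 5.7),
    ("ruvnet", 3, 5.0),
    ("LearningCircuit", 2, 7.0),
    ("PriorLabs", 2, 7.0),
]


def cumulative(items: int, avg: float) -> float:
    """Cumulative score reconstructed as count x average (an assumption)."""
    return round(items * avg, 1)


ranked = sorted(ORGS, key=lambda row: cumulative(row[1], row[2]), reverse=True)
for name, items, avg in ranked:
    print(f"{name}: {cumulative(items, avg)}")
```

Hmbown reconstructs to 17.1 rather than the listed 17.0, which would be consistent with the displayed average being rounded (17 / 3 is about 5.67).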

Build Ideas

Actionable product ideas distilled from this week's highest-scoring research and discussions. Each includes specific use cases and the source material that inspired it.

AI Agent Security Scanner
A dedicated security scanning layer that sits between AI coding agents and the filesystem/network, detecting sandbox escapes, prompt injection via browser extensions, and unauthorized tool calls in real time. With ClaudeBleed and the Claude Code symlink CVE exposing how easily LLM-integrated tools can be hijacked, teams need purpose-built runtime protection beyond static analysis. Build a lightweight daemon that monitors agent actions, validates sandboxing integrity, and alerts on anomalous behavior patterns.
CI/CD pipelines running agentic coding tools · Multi-tenant SaaS platforms exposing AI agents to end users · Enterprise security teams auditing Claude Code or Codex deployments · Browser extension threat detection for LLM web interfaces
https://cyberinsider.com/claudebleed-all... https://github.com/advisories/GHSA-vp62-... https://codebrainery.com/articles/snyk-c...
Privacy-First Local Research Agent
A fully local, encrypted deep research assistant that combines vectorless RAG over private documents with multi-source search across arXiv, PubMed, legal databases, and internal wikis — all running on consumer hardware without any cloud calls. Chrome's silent Gemini Nano install and growing distrust of cloud AI signal strong demand for on-device intelligence that users actually control. Build on top of local-deep-research and PageIndex's vectorless approach to deliver a desktop app with a clean UI for professionals handling sensitive data.
Legal and compliance research in regulated industries · Academic literature review and citation management · Healthcare professionals querying private patient records locally · Journalists and investigators protecting source confidentiality
https://github.com/LearningCircuit/local... https://github.com/VectifyAI/PageIndex https://alternativeto.net/news/2026/5/go...
Evolutionary Algorithm Discovery Tool
A developer-facing tool that applies AlphaEvolve-style evolutionary loops to domain-specific optimization problems, letting engineers specify a problem in natural language and receive iteratively improved algorithmic solutions validated against their own test suites. DeepMind's AlphaEvolve demonstrated that LLM-driven evolutionary search can beat decades-old human solutions in math and chip design — this capability should be accessible beyond Google. Build a self-hosted platform where users define fitness functions, seed candidate solutions, and let an LLM agent evolve and benchmark variants autonomously.
Compiler and query optimizer tuning · Scientific computing and numerical methods research · Hardware design and circuit layout optimization · Game AI and heuristic search algorithm development
https://deepmind.google/blog/alphaevolve...
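The loop this idea describes (define a fitness function, seed candidates, evolve and benchmark variants) can be sketched in a few lines. This is a toy under stated assumptions, not AlphaEvolve's method: random Gaussian mutation stands in for the LLM proposal step, and `evolve`, `gaussian_mutation`, and `toy_fitness` are hypothetical names.

```python
import random
from typing import Callable, List

Candidate = List[float]


def evolve(
    fitness: Callable[[Candidate], float],
    seeds: List[Candidate],
    mutate: Callable[[Candidate, random.Random], Candidate],
    generations: int = 200,
    population: int = 20,
    seed: int = 0,
) -> Candidate:
    """Keep the fittest candidates each generation and refill with mutated copies."""
    rng = random.Random(seed)
    pool = list(seeds)
    for _ in range(generations):
        # Propose variants; in the product idea an LLM would write these.
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        # Benchmark parents and children together and keep the best (elitism).
        pool = sorted(pool + children, key=fitness, reverse=True)[:population]
    return pool[0]


def gaussian_mutation(candidate: Candidate, rng: random.Random) -> Candidate:
    """Stand-in for the LLM step: nudge each coordinate by small random noise."""
    return [x + rng.gauss(0.0, 0.1) for x in candidate]


def toy_fitness(candidate: Candidate) -> float:
    """Toy objective whose maximum (0.0) is at the point (3, -2)."""
    x, y = candidate
    return -((x - 3.0) ** 2 + (y + 2.0) ** 2)


best = evolve(toy_fitness, seeds=[[0.0, 0.0]], mutate=gaussian_mutation)
print(best, toy_fitness(best))
```

Because parents compete with their children, the best candidate found so far is never discarded. In a real tool the fitness function would run the user's own test suite or benchmark.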
Multimodal Knowledge Base Builder
A no-code tool that ingests mixed-media content — PDFs, videos, audio recordings, images, and web pages — and builds a unified, queryable knowledge base using multimodal RAG, now made practical by Gemini API's multimodal file search. Teams currently stitch together separate pipelines for text vs. visual content; this product unifies them into a single drag-and-drop interface with a chat frontend. Target knowledge-intensive teams who need to query across meeting recordings, design docs, and written reports simultaneously.
Product teams querying across design mockups, specs, and meeting recordings · Customer support knowledge bases combining video tutorials and documentation · Medical education platforms indexing textbooks, imaging studies, and lecture videos · Legal discovery over mixed document and deposition video archives
https://blog.google/innovation-and-ai/te... https://github.com/VectifyAI/PageIndex https://github.com/cocoindex-io/cocoinde...
Agentic Coding Ops Dashboard
A unified observability and management platform for teams running multiple AI coding agents — tracking token usage, rate limit headroom, security events, code review outcomes, and agent-generated PR quality metrics across Claude Code, Codex, and Gemini CLI in one place. As enterprises like SpaceX push usage limits and teams juggle multiple agent tools, the operational complexity of managing agentic coding workflows is becoming a real pain point with no dedicated tooling. Build a lightweight SaaS dashboard that aggregates agent telemetry, surfaces anomalies, and provides cost attribution per developer or project.
Engineering managers tracking AI coding agent ROI and adoption · Platform teams enforcing security policies across agent deployments · FinOps teams attributing LLM API costs to teams and projects · DevSecOps pipelines integrating agent output quality gates
https://arstechnica.com/ai/2026/05/anthr... https://github.com/farion1231/cc-switch https://github.com/adamjgmiller/adamsrev... https://github.com/advisories/GHSA-vp62-...

Product Hunt Weekly

Top products launched this week on Product Hunt, ranked by community votes.

#1
articuler.ai
Describe your goal. Meet the right professional.
Social Network Career Community
190
38
https://www.producthunt.com/r/ZTPLA...
#2
Graphbit PRFlow - AI Code Review Agent
AI code reviewer that catches what others miss
Productivity Developer Tools GitHub
171
59
https://www.producthunt.com/r/K4W36...
#3
OpenJobs AI
End-to-End Autonomous AI Recruiter
Hiring Pitch Singapore
156
32
https://www.producthunt.com/r/DZIBW...
#4
ClawSecure
The AI-Powered Antivirus for AI Agents
Developer Tools Artificial Intelligence Pitch Singapore
141
7
https://www.producthunt.com/r/CBB34...
#5
Genpire
Make Real Products with AI, literally.
Design Tools Artificial Intelligence Maker Tools
131
21
https://www.producthunt.com/r/SBMCM...
#6
Warp Open-Source
Agentic development environment built with the community
Open Source Developer Tools Artificial Intelligence
109
5
https://www.producthunt.com/r/RXPBB...
#7
MiroMiro v2
Inspect, edit, and export any website's design
Chrome Extensions Design Tools Productivity
106
8
https://www.producthunt.com/r/FZXHZ...
#8
Weavable
Give every AI agent persistent work context
SaaS Artificial Intelligence Operations
102
17
https://www.producthunt.com/r/33OUN...
#9
Snapseed 4.0
Google’s best photo editor just got seriously better
Android Photography Photo & Video
91
1
https://www.producthunt.com/r/5RY2S...
#10
Web Speed
Kill the 'Token Tax.' 90% cheaper agents.
Productivity Developer Tools Artificial Intelligence
89
3
https://www.producthunt.com/r/EORFJ...

Trending Repos

Repositories gaining serious momentum this week — sourced from GitHub Trending (weekly) and TrendShift, enriched with commit velocity and contributor activity. Stars = total GitHub stars. "Stars this week" = new stars gained.

1
GH Trending
openai/codex
rust 81,778 11,812 1,905 stars this week
OpenAI's official lightweight terminal-based coding agent, now open-sourced with 81K+ stars and written in Rust. Represents OpenAI's direct entry into the agentic coding CLI space competing with Claude Code and Gemini CLI.
Build idea
Build a white-label AI coding assistant SaaS for enterprise dev teams that wraps Codex CLI with audit logging, SSO, and policy controls so companies can deploy agentic coding workflows without exposing proprietary code to unmanaged cloud endpoints.
2
GH Trending
Hmbown/DeepSeek-TUI
rust 24,804 2,046 22,034 stars this week
Rust-based terminal coding agent for DeepSeek models with an explosive 22,000+ new stars this week, making it one of the fastest-growing AI repos currently. Offers a lightweight, local alternative to cloud-based coding assistants with a TUI interface.
Build idea
Offer a managed, air-gapped developer productivity tool for defense contractors and regulated industries that bundles DeepSeek-TUI with pre-configured local models, compliance documentation, and IT deployment support as a subscription service.
3
GH Trending
LearningCircuit/local-deep-research
python 7,156 632 2,483 stars this week
Local deep research system achieving ~95% on SimpleQA with Qwen3-27B on a single 3090, supporting 10+ search engines including arXiv, PubMed, and private documents with full local encryption. Strong benchmark result for privacy-preserving research agents without cloud dependencies.
Build idea
Build a private research intelligence platform for law firms, pharma companies, and hedge funds that runs fully on-premise, ingests internal documents alongside public sources like PubMed and arXiv, and delivers cited research reports without any data leaving the organization.
4
GH Trending
PriorLabs/TabPFN
python 6,939 683 748 stars this week
TabPFN is a foundation model for tabular data that achieves strong performance without dataset-specific training, using in-context learning to generalize across tabular tasks. Represents a meaningful shift from gradient-boosted trees as the default for tabular ML.
Build idea
Create an AutoML-as-a-Service platform targeting non-technical business analysts that accepts CSV uploads and instantly returns predictions, feature importance, and model explanations powered by TabPFN — eliminating the need for data scientists on routine tabular prediction tasks.
5
GH Trending
VectifyAI/PageIndex
python 30,562 2,601 4,328 stars this week
Vectorless, reasoning-based RAG approach that indexes documents without embeddings, using structured page-level indexing instead. 30k+ stars with 4.3k new this week suggests strong community interest in alternatives to vector search.
Build idea
Launch a document Q&A SaaS for enterprises that replaces costly vector database infrastructure with PageIndex's reasoning-based approach, offering lower operational costs and more interpretable retrieval for compliance-heavy industries like legal and finance.
6
GH Trending
bytedance/UI-TARS
python 10,405 762 226 stars this week
ByteDance's native GUI interaction agent that automates desktop UI tasks without accessibility APIs, using vision-based understanding. 10k+ stars signals strong interest in computer-use agent capabilities.
Build idea
Build a no-code RPA platform for SMBs that uses UI-TARS to automate repetitive desktop workflows — like data entry across legacy software — without requiring accessibility APIs or custom integrations, sold as a monthly automation subscription.
7
GH Trending
bytedance/UI-TARS-desktop
typescript 32,680 3,235 2,191 stars this week
Open-source multimodal AI agent desktop stack from ByteDance connecting frontier models with agent infrastructure for GUI automation. 32k+ stars with active development makes this a leading open-source computer-use framework.
Build idea
Offer a managed computer-use agent service for e-commerce operations teams that automates multi-step tasks across supplier portals, inventory dashboards, and logistics software using UI-TARS-desktop as the underlying automation engine.
8
GH Trending
kyutai-labs/pocket-tts
python 4,340 493 210 stars this week
Kyutai's CPU-only TTS model designed to run on-device without GPU, achieving practical speech synthesis in a minimal footprint. From the team behind Moshi, this signals a push toward truly portable speech AI.
Build idea
Build an SDK and developer platform for embedding offline, on-device voice narration into mobile apps — targeting markets like rural healthcare, education in low-connectivity regions, and accessibility tools — using pocket-tts as the core speech engine.
9
GH Trending
mksglu/context-mode
typescript 14,310 1,000 2,071 stars this week
Context window optimization tool for AI coding agents that sandboxes tool output, claiming 98% context reduction across 15 platforms. Explosive growth (2,071 stars/week) suggests it solves a real pain point in agentic coding workflows.
Build idea
Productize context-mode as a developer tool subscription that plugs into existing AI coding environments to slash token costs and latency, monetizing through per-seat pricing aimed at engineering teams running high-volume agentic coding pipelines.
10
GH Trending
virattt/dexter
typescript 25,235 3,080 2,741 stars this week
Dexter is an autonomous agent for deep financial research, gaining 2,741 stars this week — one of the fastest-growing agent repos, suggesting strong practitioner interest in domain-specific autonomous research agents.
Build idea
Build a financial due diligence SaaS for venture capital and private equity firms that uses Dexter to autonomously generate deep research reports on target companies — pulling from SEC filings, earnings calls, and news — delivered on-demand with cited sources.

Trending Developers

Developers gaining traction on GitHub this week — shipping open-source AI tools, models, and frameworks worth following. Ranked by weekly trending position.

1
Raullen Chai
@raullenchai
raullenchai/Rapid-MLX
Rapid-MLX claims 4.2x faster inference than Ollama on Apple Silicon with 0.08s cached TTFT and 100% tool calling support across 17 tool parsers. Compelling benchmark claims for Apple Silicon local inference.
2
Baris Sencan
@isair
isair/jarvis
Fully offline, private AI voice assistant for desktop — Jarvis-style conversational AI running locally. Interesting for privacy-focused local inference but limited technical detail available.
3
Fred K. Schott
@FredKSchott
FredKSchott/astro-skills
Astro-skills project for serving agent skills from Astro sites; early-stage and niche. Marginally relevant to the agent tooling ecosystem.
4
dav nguyxn
@hoangsonww
hoangsonww/Claude-Code-Agent-Monitor
Real-time monitoring dashboard for Claude Code agents using SQLite, Node.js, and WebSockets. Useful developer utility but straightforward implementation.
5
rUv
@ruvnet
ruvnet/ruflo
Agent orchestration platform for Claude with multi-agent swarms and RAG integration. Developer profile entry — see ruflo repo for substance.
6
赵晨阳
@zhaochenyang20
zhaochenyang20/Awesome-ML-SYS-Tutorial
GitHub profile with ML systems learning notes (Awesome-ML-SYS-Tutorial). Potentially useful reference but a curated list rather than novel research.
7
Hans-Kristian Arntzen
@HansKristian-Work
HansKristian-Work/vkd3d-proton
Proton's Direct3D 12 implementation via VKD3D; not AI-related. Out of scope.
8
Sertaç Özercan
@sozercan
sozercan/kaset
Developer profile featuring a YouTube Music macOS app — not AI-related.
9
三咲雅 misaki masa
@sxyazi
sxyazi/yazi
Developer profile for Yazi terminal file manager — not AI-related.
10
tangly1024
@tangly1024
tangly1024/NotionNext
GitHub profile for a developer building a Notion-based static blog. Not AI-related.
11
theovilardo
@theovilardo
theovilardo/PixelPlayer
GitHub profile for a developer building an Android music player. Not AI-related.
12
yhirose
@yhirose
yhirose/cpp-httplib
GitHub profile for a C++ HTTP library developer. Not AI-related.
13
Q00
@Q00
Q00/ouroboros
Agent OS: Stop prompting. Start specifying.
14
Addy Osmani
@addyosmani
addyosmani/agent-skills
Production-grade engineering skills for AI coding agents.
15
Adrian Hajdin - JS Mastery
@adrianhajdin
adrianhajdin/ghost-ai
Ghost AI is an interactive systems architecture builder.
16
Amir Raminfar
@amir20
amir20/dozzle
Realtime log viewer for containers. Supports Docker, Swarm and K8s.
17
cg33
@chenhg5
chenhg5/cc-connect
Bridge local AI coding agents (Claude Code, Cursor, Gemini CLI, Codex) to messaging platforms (Feishu/Lark, DingTalk, Slack, Telegram, Di…
18
Daniel Öster
@dalathegreat
dalathegreat/Battery-Emulator
This revolutionary software enables EV battery packs to be easily reused for stationary storage in combination with solar inverters
19
Chi Wang
@sonichi
sonichi/sutando
Summon your AI superpower — grows with you through voice, vision, and autonomous action

Models & Benchmarks

New model releases, arena rankings, and benchmark results across frontier and open-source AI models this week. Arena Elo = LMSys battle rating. Trending = HuggingFace trending score. Buzz = AI relevance (0–10).

Arena Leaderboard — Top 15
#   Model                            Organization  Type    Elo   Votes
1   claude-opus-4-7-thinking         Anthropic     Closed  1503  8,945
2   claude-opus-4-6-thinking         Anthropic     Closed  1502  23,616
3   claude-opus-4-6                  Anthropic     Closed  1498  25,089
4   gemini-3.1-pro-preview           Google        Closed  1492  29,468
5   claude-opus-4-7                  Anthropic     Closed  1491  9,614
6   muse-spark                       Meta          Closed  1490  10,491
7   gemini-3-pro                     Google        Closed  1486  41,381
8   gpt-5.5-high                     OpenAI        Closed  1484  6,488
9   grok-4.20-beta1                  xAI           Closed  1480  18,791
10  gpt-5.2-chat-latest-20260210     OpenAI        Closed  1477  23,717
11  gpt-5.4-high                     OpenAI        Closed  1477  17,146
12  grok-4.20-beta-0309-reasoning    xAI           Closed  1477  17,538
13  gpt-5.5                          OpenAI        Closed  1475  6,653
14  ernie-5.1                        Baidu         Closed  1474  5,733
15  grok-4.20-multi-agent-beta-0309  xAI           Closed  1474  17,728
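To read the gaps between these ratings, the standard Elo expected-score formula converts a rating difference into a head-to-head win probability. The sketch assumes the leaderboard's numbers follow the usual 400-point logistic scale, which the source does not state.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings copied from the table: rank 1 (1503) versus rank 15 (1474).
top_vs_15th = expected_win_rate(1503, 1474)
print(f"{top_vs_15th:.3f}")
```

On that scale, the 29-point spread between first and fifteenth place corresponds to an expected win rate of roughly 54%, so the whole top 15 is closely matched.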
New & Trending Models
deepseek-ai/DeepSeek-V4-Pro
2,017,835 downloads 3,842 likes 287 trending
Open Source 2026-04-22
DeepSeek-V4-Pro is the flagship release with 2M+ downloads and 3,842 likes — the most downloaded model in this batch and a major open-weight frontier model release that benchmarks competitively with top proprietary models.
deepseek-ai/DeepSeek-V4-Flash
1,162,290 downloads 1,031 likes 95 trending
Open Source 2026-04-22
DeepSeek-V4-Flash is a fast, efficient variant of the V4 architecture with 1.16M downloads — positions as a high-throughput inference option in the DeepSeek family, significant for production deployments needing speed over maximum capability.
Qwen/WebWorld-32B
191 downloads 24 likes 24 trending
Open Source 2026-02-13
Qwen's WebWorld-32B is a web agent world model/simulator fine-tuned on synthetic browser trajectories, enabling long-horizon web task planning; paired with an 8B variant, this represents a serious open-weight push for browser agent capabilities.
XiaomiMiMo/MiMo-V2.5-Pro
41,654 downloads 506 likes 74 trending
Open Source 2026-04-27
Xiaomi's MiMo-V2.5-Pro is a strong reasoning/agent model with long-context and code capabilities, 41K downloads and 506 likes — a notable open-weight competitor in the reasoning model space from a major hardware manufacturer.
inclusionAI/Ling-2.6-1T
1,995 downloads 449 likes 48 trending
Open Source 2026-04-29
Ling-2.6-1T is a 1-trillion parameter hybrid architecture model from inclusionAI with 449 likes — one of the largest open-weight models released recently, using a novel 'bailing_hybrid' architecture worth investigating.
z-lab/Qwen3.6-27B-DFlash
34,966 downloads 282 likes 58 trending
Open Source 2026-04-23
DFlash applies diffusion-based speculative decoding (block diffusion) to Qwen3 27B, achieving significant inference speedups without quality loss. Strong traction (282 likes, 35K downloads) and backed by arxiv:2602.06036 — a meaningful efficiency advance for large model serving.
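For readers new to speculative decoding in general: a cheap draft model proposes several tokens, the large target model checks them, and the longest agreeing prefix is kept. The sketch below shows the greedy variant with toy deterministic functions standing in for both models. It does not implement DFlash's block-diffusion drafter, and every name in it is hypothetical.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: output matches decoding with `target` alone."""
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # 1. The draft proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, limit - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target checks each position (one batched pass in a real system).
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            accepted.append(expected)
            if tok != expected:
                break  # first disagreement: keep the target's token, drop the rest
        out.extend(accepted)
    return out


def toy_target(tokens: List[int]) -> int:
    """Toy 'large model': next token is the sum of the last two, mod 10."""
    return (tokens[-1] + tokens[-2]) % 10


def toy_draft(tokens: List[int]) -> int:
    """Toy 'draft model': copies the target except when the last token is 7."""
    return 0 if tokens[-1] == 7 else toy_target(tokens)


def greedy_decode(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain one-token-at-a-time decoding, used as the reference."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


PROMPT = [1, 1]
fast = speculative_decode(toy_target, toy_draft, PROMPT, max_new=12)
slow = greedy_decode(toy_target, PROMPT, max_new=12)
print(fast == slow)
```

Every emitted token is one the target would have chosen, so quality is unchanged. The speedup in a real system comes from verifying a block of drafted tokens in a single target forward pass.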
z-lab/gemma-4-31B-it-DFlash
6,423 downloads 74 likes 62 trending
Open Source 2026-04-30
DFlash applied to Gemma-4 31B instruction-tuned model using block diffusion speculative decoding; highest trending score in the DFlash series. Demonstrates the technique's generalizability across major model families (Qwen3, Gemma-4).
Qwen/WebWorld-8B
279 downloads 20 likes 20 trending
Open Source 2026-02-13
Smaller 8B companion to WebWorld-32B for web agent simulation; same architecture and training approach, useful for resource-constrained deployment of browser agents.
ibm-granite/granite-4.1-30b
14,846 downloads 109 likes 22 trending
Open Source 2026-04-06
IBM's Granite 4.1 30B is a new generation of the enterprise-focused Granite series with Apache 2.0 license; 14K downloads suggests solid enterprise adoption interest.
ibm-granite/granite-4.1-8b
34,216 downloads 165 likes 20 trending
Open Source 2026-04-06
Granite 4.1 8B is the smaller, more deployable variant of IBM's new Granite generation with 34K downloads — strong for enterprise edge/on-prem use cases under Apache 2.0.
inclusionAI/Ling-2.6-flash
2,473 downloads 484 likes 30 trending
Open Source 2026-04-28
Flash variant of Ling-2.6 with 484 likes — efficient inference-optimized version of the large hybrid model, notable for its high community engagement relative to download count.
poolside/Laguna-XS.2
25,571 downloads 241 likes 42 trending
Open Source 2026-04-23
Poolside's Laguna-XS.2 is a code-focused model with 25K downloads and vLLM support under Apache 2.0 — a competitive open-weight code model from a well-funded AI lab.
z-lab/gemma-4-26B-A4B-it-DFlash
8,866 downloads 37 likes 27 trending
Open Source 2026-04-28
DFlash speculative decoding applied to Gemma-4 26B MoE (4B active), using block diffusion as a draft model for faster inference. Extends the DFlash technique to Google's Gemma-4 architecture.
zai-org/GLM-5.1
285,446 downloads 1,625 likes 30 trending
Open Source 2026-04-03
GLM-5.1 from Zhipu AI (zai-org) is a bilingual (EN/ZH) MoE text generation model with 285K downloads and 1625 likes — one of the most downloaded models in this batch. Successor to GLM-4 with strong community adoption.
HuggingFaceTB/nanowhale-100m
2,194 downloads 49 likes 49 trending
Open Source 2026-04-24
HuggingFace's 100M parameter MoE model (DeepSeek V4 architecture) trained on FineWeb-Edu and SmolTalk — a useful small-scale research artifact for studying MoE at nano scale.

Trending Spaces

The hottest interactive demos and apps on HuggingFace Spaces this week — try them live. Flame icon = HuggingFace trending score. Hearts = community likes.

The ultimate guide to RL environments: building and scaling them in the LLM era
AdithyaSK
docker 125 123
Comprehensive guide to building and scaling RL environments for LLM training — highly timely given the industry shift toward RL-based post-training, with 125 likes and breakout trending status.
Omni Video Factory
FrameAI4687
gradio 1,047 42
mit
Gradio demo space for text-to-video, image-to-video, and video extension — useful demo but no novel technical contribution beyond wrapping existing video generation models.
HiDream O1 Image
HiDream-ai
gradio 29 29
mit
HiDream O1 is a new image generation model with a reasoning-style approach to image synthesis, trending on HuggingFace. Limited public info but the 'O1' naming suggests chain-of-thought-style generation.
LTX-2 Video [Turbo]
Imosu
gradio 122 28
LTX-2 Video Turbo demo combining image and audio inputs for fast video generation using Flash Attention 3. Incremental demo space with moderate traction.
OmniVoice
k2-fsa
gradio 821 58
apache-2.0
OmniVoice claims high-quality voice cloning TTS across 600+ languages under Apache 2.0, with strong community traction (821 likes). Breadth of language support is notable if claims hold up.
Wan2.2 14B Preview
kulkas2pintu
gradio 121 68
Community demo of Wan2.2 14B Preview for image-to-video generation with text prompts. One of several Wan2.2 demo spaces trending this week.
TRELLIS.2
microsoft
gradio 1,577 23
mit
Microsoft's TRELLIS.2 generates high-fidelity 3D assets from images; the official Microsoft-hosted space has strong sustained traction (1577 likes). Successor to the original TRELLIS model.
Gemma-4-E4B-Uncensored-HauhauCS-Aggressive-Q5_K_P
mikeee
docker 148 57
mit
Uncensored fine-tune of Gemma-4 running in a repurposed Qwen chat space. Low novelty; uncensored model demos are common and this adds no technical insight.
Z Image Turbo
mrfakename
gradio 3,138 31
Z Image Turbo is a fast image generation demo with high community engagement (3138 likes). Minimal documentation makes it hard to assess technical novelty.
Qwen Image Multiple Angles 3D Camera
multimodalart
gradio 2,450 33
Demo using Qwen's vision model to generate consistent multi-angle views of objects with 3D camera control, enabling novel-view synthesis from a single image. Solid community traction (2450 likes).
Talkie 1930
multimodalart
gradio 60 31
Creative demo that stylizes images/video in a 1930s silent-film aesthetic. Fun application but low technical novelty.
FireRed Image Edit 1.0 Fast
prithivMLmods
gradio 1,199 51
apache-2.0
FireRed Image Edit combines a custom image editing model with Qwen-Image-Edit-Rapid for fast instruction-based image editing; 1199 likes suggests real user interest in the speed/quality tradeoff.
Qwen-Image-Edit-2511-LoRAs-Fast
prithivMLmods
gradio 1,379 24
apache-2.0
Collection of LoRA adapters for Qwen Image Edit enabling style-specific fast editing. Useful for practitioners but derivative of the base Qwen editing work.
Wan2.2 14B Preview
r3gm
gradio 2,564 33
Another Wan2.2 14B community demo with FP8 quantization and AOTI compilation for faster inference. High likes (2564) reflect interest in the underlying model.
Wan2.2 14B Fast Preview
r3gm
gradio 1,070 124
The highest-trending item this batch — Wan2.2 14B Fast Preview using FP8+AOTI for accelerated image-to-video generation. The breakout trending score (124) indicates strong community interest in Wan2.2's speed improvements.

Conference Papers

Accepted papers from top AI conferences via OpenReview.

Showing accepted papers from active venues. Next deadlines: ICML 2026 (submissions open), NeurIPS 2026 (coming soon).

ICLR 2026 Pierre-Carl Langlais, Pavel Chizhov, Catherine Arnett et al. 2026-05-11
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
ICLR 2026 paper introducing Common Corpus, claimed to be the largest ethically-sourced (non-copyrighted) pre-training dataset for LLMs — directly addresses legal risk in LLM training data and provides a reproducible open alternative.
dataset pre-training large language models open data open science
ICLR 2026 Mouath Abu Daoud, Leen Kharouf, Omar El Hajj et al. 2026-05-11
MedAraBench: Large-scale Arabic Medical Question Answering Dataset and Benchmark
ICLR 2026 benchmark introducing a large-scale Arabic medical QA dataset to address the severe underrepresentation of Arabic in medical NLP — useful for multilingual LLM evaluation but domain-specific.
Dataset Benchmark Large Language Models Arabic Natural Language Processing Medical Question Answering
ICLR 2026 Zhiheng Chen, Ruofan Wu, Guanhua Fang et al. 2026-05-11
Transformers as Unsupervised Learning Algorithms: A study on Gaussian Mixtures
ICLR 2026 theoretical work analyzing transformers as unsupervised learning algorithms through the lens of Gaussian Mixture Models, providing formal grounding for in-context learning behavior — advances mechanistic understanding of ICL.
In-context learning Gaussian Mixture Models Theory
ICLR 2026 Ron Vainshtein, Zohar Rimon, Shie Mannor et al. 2026-05-11
Task Tokens: A Flexible Approach to Adapting Behavior Foundation Models
ICLR 2026 paper proposing Task Tokens for adapting transformer-based behavior foundation models in humanoid control without full fine-tuning — lightweight conditioning mechanism for multi-task robotic policy adaptation.
Reinforcement Learning Hierarchical Reinforcement Learning Behavior Foundation Models Humanoid Control
ICLR 2026 Kaien Sho, Shinji Ito 2026-05-11
Submodular Function Minimization with Dueling Oracle
ICLR 2026 theoretical paper on submodular function minimization using a dueling (pairwise comparison) oracle — relevant to preference-based optimization but highly specialized and not directly LLM-focused.
submodular minimization dueling oracle preference-based optimization
ICLR 2026 Rongjin Li, Zichen Tang, Xianghe Wang et al. 2026-05-11
Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning
ICLR 2026 benchmark evaluating MLLMs on 'scan-oriented' academic paper reasoning — tests whether models can navigate and reason over full paper layouts rather than just retrieve facts, exposing gaps in current MLLM document understanding.
Multimodal Large Language Models; Academic Paper Reasoning; Scan-Oriented Reasoning
ICLR 2026 Peng Sun, Tao Lin 2026-05-11
Any-step Generation via N-th Order Recursive Consistent Velocity Field Estimation
ICLR 2026 paper introducing N-th order recursive consistent velocity field estimation for any-step generation, simplifying consistency model training while maintaining quality — addresses computational overhead in few-step diffusion models.
Generative Models
ICLR 2026 Zeyu Feng, Haiyan Yin, Yew-Soon Ong et al. 2026-05-11
Masked Skill Token Training for Hierarchical Off-Dynamics Transfer
ICLR 2026 paper on Masked Skill Token Training (MSTT) for offline hierarchical RL that transfers policies across environments with different dynamics — fully offline approach addresses a key sim-to-real gap challenge.
Transfer Learning Skills Hierarchical RL Embodied AI
ICLR 2026 Shaojie Li, Pengwei Tang, Bowei Zhu et al. 2026-05-11
High Probability Bounds for Non-Convex Stochastic Optimization with Momentum
ICLR 2026 theoretical paper establishing high-probability convergence and generalization bounds for SGD with momentum in non-convex settings — fills a theoretical gap but incremental relative to existing SGD theory.
Momentum nonconvex learning generalization
ICLR 2026 Artyom Sorokin, Nazar Buzun, Aleksandr Anokhin et al. 2026-05-11
Q-RAG: Long Context Multi‑Step Retrieval via Value‑Based Embedder Training
ICLR 2026 paper proposing Q-RAG, which trains retrievers using RL value-based objectives for multi-step retrieval over long contexts — addresses the fundamental limitation of single-step RAG on complex multi-hop questions.
Reinforcement Learning RL QA Long-context RAG
ICLR 2026 Seongtae Hong, Youngjoon Jang, Jungseob Lee et al. 2026-05-11
Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment
ICLR 2026 paper improving cross-lingual information retrieval through better multilingual embedding alignment — solid but incremental work in a well-studied area.
Cross-Lingual Alignment Information Retrieval Multilingual Embedding Cross-Lingual Information Retrieval
ICLR 2026 Rahul Ramachandran, Ali Garjani, Roman Bachmann et al. 2026-05-11
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
ICLR 2026 systematic benchmark of GPT-4o, o4-mini, Gemini 1.5 Pro on standard CV tasks (detection, segmentation, depth) — reveals where frontier multimodal models still fall short of specialized vision models.
vision benchmark multimodal foundation models vision language models standard computer vision tasks
ICLR 2026 Tin Hadži Veljković, Erik J Bekkers, Michael Tiemann et al. 2026-05-11
CORDS - Continuous Representations of Discrete Structures
ICLR 2026 paper on CORDS, a continuous representation framework for predicting variable-cardinality discrete sets (e.g., object detection, molecular modeling) using neural fields and flow matching — novel formulation for set prediction problems.
Continuous set representations Neural fields Variable-cardinality prediction Invertible encoding/decoding Diffusion and flow matching
ICLR 2026 Christopher Mitcheltree, Vincent Lostanlen, Emmanouil Benetos et al. 2026-05-11
SCRAPL: Scattering Transform with Random Paths for Machine Learning
ICLR 2026 paper introducing SCRAPL, a stochastic approximation of wavelet scattering transforms using random paths to reduce computational cost while preserving perceptual gradient quality for audio/vision inverse problems.
scattering transform wavelets stochastic optimization ddsp perceptual quality assessment
ICLR 2026 Antanas Žilinskas, Robert Noel Shorten, Jakub Marecek et al. 2026-05-11
EVEREST: A Transformer for Probabilistic Rare-Event Anomaly Detection with Evidential and Tail-Aware Uncertainty
ICLR 2026 paper presenting EVEREST, a transformer for probabilistic rare-event forecasting in multivariate time series using evidential deep learning and extreme value theory — addresses severe class imbalance in anomaly detection.
Transformer models Uncertainty quantification Evidential deep learning Extreme value theory Imbalanced classification
ICLR 2026 Harris Abdul Majid, Pietro Sittoni, Francesco Tudisco et al. 2026-05-11
Test-Time Accuracy-Cost Control in Neural Simulators via Recurrent-Depth
ICLR 2026 paper on Recurrent-Depth Simulators that enable test-time accuracy-cost tradeoffs in neural PDE solvers by varying recurrent depth — brings classical numerical methods' flexibility to neural simulators.
Neural Simulator Recurrent Depth AI4Simulation
ICLR 2026 Kun XIE, Peng Zhou, Xingyi Zhang et al. 2026-05-11
PoinnCARE: Hyperbolic Multi-Modal Learning for Enzyme Classification
ICLR 2026 paper using hyperbolic space for multi-modal enzyme classification, capturing hierarchical EC number taxonomy better than Euclidean embeddings — specialized but demonstrates hyperbolic ML value for structured biological hierarchies.
EC number prediction enzyme function hyperbolic space learning multi-modal learning enzyme structure
ICLR 2026 Tianqiao Liu, Xueyi Li, Hao Wang et al. 2026-05-11
From Text to Talk: Audio-Language Model Needs Non-Autoregressive Joint Training
ICLR 2026 paper arguing that speech-to-speech LLMs need non-autoregressive joint training of audio and text tokens, demonstrating improved latency and coherence over autoregressive baselines — directly relevant to building real-time voice AI systems.
Large Multimodal Models Multi-token Prediction Non-Autoregressive Learning
ICLR 2026 Qinglong Yang, Haoming Li, Haotian Zhao et al. 2026-05-11
FingerTip 20K: A Benchmark for Proactive and Personalized Mobile LLM Agents
FingerTip 20K introduces a benchmark for proactive, personalized mobile GUI agents that act without explicit user instructions by leveraging contextual cues — a meaningful step beyond reactive agent paradigms.
Mobile Agent LLM Agent GUI Proactive Agent Personalization
ICLR 2026 Tianxiang Dai, Jonathan Fan 2026-05-11
Characterizing and Optimizing the Spatial Kernel of Multi Resolution Hash Encodings
Provides a rigorous physical-systems analysis of Multi-Resolution Hash Encoding (Instant-NGP's core technique), characterizing its spatial kernel and enabling principled hyperparameter selection rather than heuristics.
multi-resolution hash encoding implicit neural representations neural fields point spread function spatial kernel analysis

Deep Dive

All 285 items scored and categorized. Relevance scores reflect novelty, technical depth, and practical impact — 7+ items are the ones worth your time.
