Weekly Intelligence

AI Quick Bites

June 01, 2026 · 338 items from 13 sources

Last refreshed: June 01, 2026 at 15:42 UTC
Next refresh: June 08, 2026 at 09:00 UTC
Created by Vatsal Bagri

Highlights

The five most consequential developments in AI this week, selected from 338 items across 13 sources. These are the things an AI engineer, researcher, or founder needs to know.

02
ReuseRL offers a principled MDL-based approach to skill reuse in agentic RL with a PAC-Bayes bound, addressing the brittleness of task-specific shortcuts that plagues current LLM agents.
arxiv 2026-06-01 18 min
03
Three-class credential detection with CodeBERT cuts false-positive high-severity alerts by 33% while maintaining security coverage, a practical, deployable improvement for code security pipelines.
arxiv 2026-06-01 18 min
04
RayDer demonstrates clean power-law scaling for self-supervised novel view synthesis in a single unified transformer, suggesting a viable path to scaling 3D scene understanding like language models.
arxiv 2026-06-01 20 min
05
Explains the root cause of CLIP's concept binding failure and shows it is not fundamental: multiplicative interactions in controlled transformers enable systematic generalization, with implications for next-generation vision-language model design.
arxiv 2026-06-01 20 min

What Changed This Week

Week-over-week diff showing new arrivals, items gaining momentum, and topics that dropped off the radar. All scores are AI relevance (0–10).

New This Week 257 items
Rising 81 items

AI Security

Novel attack vectors, jailbreak research, red-teaming findings, and defensive tools across the AI security landscape. Only items with genuine technical substance make it here. Scores are AI relevance (0–10): 7+ important, 9+ landmark.

Arm Metis with GPT5.5 Cyber scores 98% on firmware vulnerability benchmark
8/10
Arm's Metis agentic AI system powered by GPT-5.5 achieves 98% on a firmware vulnerability benchmark, demonstrating near-human expert performance on embedded security analysis, a significant milestone for AI-driven hardware security.
hackernews 2026-06-01 7 min
CVE-2026-28952: Apple macOS 26.5 Kernel Vuln found by Claude
8/10
CVE-2026-28952 is a macOS kernel vulnerability discovered by Claude, a landmark instance of an LLM autonomously finding a real, exploitable kernel-level security flaw, signaling a new era for AI-assisted vulnerability research.
hackernews 2026-06-01 5 min
On the Relationship Between Activation Outliers and Feature Death in Sparse Autoencoders
7/10
Identifies that activation outliers (high mean-to-variance ratio γ) cause feature death in sparse autoencoders by giving anti-aligned features permanently negative pre-activations, with γ predicting death rates (Spearman ρ=0.89) across 454 model-layer combinations spanning language, vision, protein, and genomic models; mean-centering eliminates the problem.
arxiv 2026-06-01 20 min
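The mechanism in the summary above can be illustrated with a toy sketch. This is not the paper's experiment: the dimension, offset, noise scale, and the helper name `relu_fire_rate` are all assumptions chosen for illustration. It shows a unit-norm feature direction that points against a large shared mean offset never firing through a ReLU, and the same direction firing about half the time once the inputs are mean-centered.

```python
import random

def relu_fire_rate(samples, w, center=None):
    """Fraction of samples whose pre-activation w.x is positive (ReLU fires)."""
    fired = 0
    for x in samples:
        if center is not None:
            # Optional mean-centering: subtract the per-dimension mean first.
            x = [xi - ci for xi, ci in zip(x, center)]
        pre = sum(wi * xi for wi, xi in zip(w, x))
        fired += pre > 0
    return fired / len(samples)

rng = random.Random(0)
dim = 8                      # hypothetical activation width
offset = [10.0] * dim        # large shared mean, small unit-variance noise
samples = [[m + rng.gauss(0, 1) for m in offset] for _ in range(2000)]

# Unit-norm feature direction pointing against the mean (anti-aligned).
w = [-1.0 / dim ** 0.5] * dim

empirical_mean = [sum(col) / len(samples) for col in zip(*samples)]
raw_rate = relu_fire_rate(samples, w)
centered_rate = relu_fire_rate(samples, w, center=empirical_mean)

print(f"fire rate without centering: {raw_rate:.2f}")
print(f"fire rate with centering:    {centered_rate:.2f}")
```

In this toy setup the offset dominates the noise, so the uncentered feature is effectively dead, while centering leaves only the noise term, which is positive roughly half the time.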
Stateful Online Monitoring Catches Distributed Agent Attacks
7/10
First demonstrated distributed agent attack that splits harmful cybersecurity tasks across multiple subagents to evade per-transcript safety monitors, plus a stateful online monitor using real-time clustering that catches distributed attacks 30% earlier with negligible latency overhead for 99% of traffic.
arxiv 2026-06-01 22 min
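A minimal sketch of why a stateful monitor can catch what a per-transcript monitor misses. Everything here is hypothetical: the thresholds, the risk scores, and the precomputed `cluster` labels, which stand in for the real-time clustering the paper describes. It is not the authors' implementation.

```python
from collections import defaultdict

PER_TRANSCRIPT_THRESHOLD = 0.8   # hypothetical per-transcript alert level
CLUSTER_THRESHOLD = 1.5          # hypothetical cumulative alert level

def per_transcript_flags(events):
    """Flag only transcripts whose own risk score crosses the threshold."""
    return [e["id"] for e in events if e["risk"] >= PER_TRANSCRIPT_THRESHOLD]

def stateful_flags(events):
    """Accumulate risk per cluster across transcripts, flagging each cluster once."""
    totals = defaultdict(float)
    flagged = []
    for e in events:
        totals[e["cluster"]] += e["risk"]
        if totals[e["cluster"]] >= CLUSTER_THRESHOLD and e["cluster"] not in flagged:
            flagged.append(e["cluster"])
    return flagged

# A harmful task split across three subagent transcripts, plus one benign one.
events = [
    {"id": "t1", "cluster": "scan-and-exploit", "risk": 0.5},
    {"id": "t2", "cluster": "docs-cleanup", "risk": 0.1},
    {"id": "t3", "cluster": "scan-and-exploit", "risk": 0.6},
    {"id": "t4", "cluster": "scan-and-exploit", "risk": 0.5},
]

print(per_transcript_flags(events))  # no single transcript crosses 0.8
print(stateful_flags(events))        # the cluster's running total does
```

The design choice being illustrated is simply where state lives: the per-transcript check forgets everything between calls, while the stateful check keeps a running total per cluster.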
microsoft/agent-governance-toolkit
7/10
Microsoft's Agent Governance Toolkit provides policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents, explicitly covering all 10 OWASP Agentic Top 10 risks. Highly relevant for teams deploying production agents who need security guardrails.
github 2026-06-01 10 min
p-e-w/heretic
7/10
Heretic is a tool for automatically removing censorship/safety filters from language models, achieving 22K+ stars rapidly. Directly relevant to LLM safety research and red-teaming: it demonstrates practical bypass techniques at scale.
github 2026-06-01 5 min
OpenAI Announces Rosalind Biodefense
7/10
OpenAI announces Rosalind, a biodefense-focused AI system designed to detect and respond to biological threats, notable for being one of the first major AI safety/biosecurity deployments from a frontier lab with explicit dual-use risk framing.
hackernews 2026-06-01 8 min
Spilling the Beans: Teaching LLMs to Self-Report Their Hidden Objectives
7/10
ICLR 2026 paper proposing methods to make LLMs self-report hidden objectives during interrogation, addressing the weakness that models can deceive auditors, directly relevant to alignment auditing of agentic systems.
conferences 2026-06-01 20 min
Decomposing LLM Computation with Jets
7/10
Introduces 'Jet Expansions' to decompose entangled LLM computations into modular, interpretable components, a novel framework with implications for interpretability, auditing, and model maintenance.
conferences 2026-06-01 20 min
Separating Secrets from Placeholders: A Hybrid CNN-CodeBERT Framework for Three-Class Credential Leakage Detection
6/10
Proposes a three-class CodeBERT+CNN framework for credential leakage detection that explicitly models placeholder/weak credentials as a distinct class, achieving 0.90 macro F1 and reducing false-positive high-severity alerts by 33% while maintaining 93% recall for genuine leaks across 10 programming languages.
arxiv 2026-06-01 18 min
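To make the three-class idea concrete, here is a small sketch of the alert-routing step only. The labels below are made-up classifier outputs, and the severity mapping and function names are assumptions; the CodeBERT+CNN model itself is not reproduced. It contrasts a binary scheme, where anything credential-shaped becomes a high-severity alert, with a three-class scheme that routes placeholders to a low-severity bucket.

```python
from collections import Counter

# Hypothetical mapping from predicted class to alert severity.
SEVERITY_THREE_CLASS = {"genuine": "high", "placeholder": "low", "clean": None}

def alerts_binary(labels):
    """Binary scheme: every non-clean finding is raised as high severity."""
    return Counter("high" for label in labels if label != "clean")

def alerts_three_class(labels):
    """Three-class scheme: placeholders are downgraded instead of escalated."""
    severities = (SEVERITY_THREE_CLASS[label] for label in labels)
    return Counter(s for s in severities if s is not None)

# Made-up predictions for six scanned snippets.
predicted = ["genuine", "placeholder", "clean", "placeholder", "genuine", "clean"]

binary = alerts_binary(predicted)
three_class = alerts_three_class(predicted)
print("binary:     ", dict(binary))
print("three-class:", dict(three_class))
```

Genuine leaks are still raised at high severity in both schemes; only the placeholder findings change bucket, which is where the reduction in high-severity noise would come from.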
mukul975/Anthropic-Cybersecurity-Skills
6/10
A structured catalog of 754 cybersecurity skills for AI agents, mapped to MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND, and NIST AI RMF frameworks, compatible with Claude Code, Copilot, Codex, and 20+ platforms. Useful reference for building security-aware agents, though primarily a curated skills list.
github 2026-06-01 5 min
Anthropic's new mcp tunnel architecture: the agent never holds the credential
6/10
Technical breakdown of Anthropic's MCP tunnel architecture where agents access private MCP servers via mTLS without ever holding credentials, a meaningful security improvement for enterprise agent deployments that keeps secrets inside the network perimeter.
reddit 2026-06-01 4 min
If LLMs Have Human-Like Attributes, Then So Does Age of Empires II
5/10
Argues that anthropomorphic attributes ascribed to LLMs (morality, understanding) are empirically non-unique by demonstrating a neural network trained on Age of Empires II could exhibit similar properties, proposing a 'null assumption' of LLM non-uniqueness as a methodological baseline for experiments.
arxiv 2026-06-01 20 min
Vision-Language Models Suppress Female Representations Under Ambiguous Input
5/10
Introduces LALS (Latent Association Leaning Score) to probe VLM internal representations on gender-ambiguous images, finding a systematic decoupling where models internally encode female associations but output male, revealing an asymmetric suppression filter in mid-to-late network layers.
arxiv 2026-06-01 18 min
How We Test AI: LLM and GenAI Security Methodology at Anvil Secure
5/10
Anvil Secure outlines their internal methodology for testing LLM and GenAI systems, covering threat modeling, prompt injection, and output validation. Useful practitioner reference but not novel research.
hackernews 2026-06-01 7 min

Top Contributors

Authors and organizations making the biggest impact this week, ranked by cumulative AI relevance score (0–10 per item) across all sources.

Top Authors
Top Organizations
#1
anthropics
6 items · avg 7.0/10
42.0
#2
33.0
#3
openai
4 items · avg 7.5/10
30.0
#4
OpenBMB
2 items · avg 7.0/10
14.0
#5
colbymchenry
2 items · avg 7.0/10
14.0
#6
p-e-w
2 items · avg 7.0/10
14.0
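The listed totals are consistent with cumulative score being item count times average score, which is an assumption about how the page computes them rather than something it states. A quick check in Python, using three of the rows above (the function name is illustrative):

```python
def cumulative_score(items, avg_score):
    """Cumulative relevance, assuming it equals item count times average score."""
    return round(items * avg_score, 1)

# (name, item count, average score) copied from the ranking above.
rows = [
    ("openai", 4, 7.5),
    ("OpenBMB", 2, 7.0),
    ("anthropics", 6, 7.0),
]

# Rank by cumulative score, highest first.
ranked = sorted(rows, key=lambda r: cumulative_score(r[1], r[2]), reverse=True)
for name, items, avg in ranked:
    print(f"{name}: {items} items x {avg} avg = {cumulative_score(items, avg)}")
```

Under this reading, a contributor with fewer but higher-scoring items (openai at 7.5 average) can still rank below one with more items at a lower average.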

Build Ideas

Actionable product ideas distilled from this week's highest-scoring research and discussions. Each includes specific use cases and the source material that inspired it.

Distributed Agent Firewall
A stateful, cross-session monitoring layer for multi-agent systems that detects when harmful tasks are being split across subagents to evade per-transcript safety filters. The system clusters agent behaviors in real-time to catch coordinated attacks that no single-agent monitor would flag. Build this as a drop-in middleware SDK for LangChain, CrewAI, and Claude Code subagent pipelines.
Enterprise agentic workflow security Healthcare AI compliance monitoring Coding agent orchestration platforms Multi-agent customer service systems
https://arxiv.org/abs/2605.31593v1 https://arxiv.org/abs/2605.31520v1
Skill Library for Agents
A reusable skill-extraction and compression layer for agentic RL systems that mines successful task trajectories to build a shared skill dictionary, reducing redundant behavior and improving generalization to new tasks. Grounded in MDL principles, this directly addresses the brittleness of vanilla LLM agents on out-of-distribution workflows. Build it as a plug-in memory module for popular agent frameworks that auto-extracts and indexes reusable sub-routines.
Coding agent automation Healthcare workflow agents Customer support automation Robotic process automation (RPA)
https://arxiv.org/abs/2605.31509v1 https://apnews.com/press-release/ein-pre...
On-Device Model Distiller
A developer toolkit for distilling large frontier models (Gemini, GPT-4 class) into compact, quantized versions optimized for on-device inference on mobile and edge hardware. With Apple actively pursuing on-device Gemini and inference speeds hitting 3K tokens/s on standard GPUs, there is a clear market gap for tooling that automates the distillation, benchmarking, and deployment pipeline. Build a CLI + cloud dashboard that takes a target model and device spec and outputs a deployable artifact.
Mobile AI app development Edge IoT inference Privacy-sensitive enterprise deployments Offline-first AI assistants
https://arstechnica.com/ai/2026/05/apple... https://blog.kog.ai/real-time-llm-infere... https://blog.kog.ai/delayed-tensor-paral...
Personalized Vision Assistant
A lightweight personalization layer for vision-language models that learns a user's specific subjects (faces, products, pets, locations) via in-context prompt tuning without retraining the base model at inference time. Using ICPT-style projection modules, the system decouples identity from environment so the same person or object is recognized consistently across wildly different contexts. Ship this as a mobile SDK and API for photo apps, e-commerce visual search, and accessibility tools.
Personal photo organization and search E-commerce visual product recognition Accessibility tools for visual identification Brand asset monitoring
https://arxiv.org/abs/2605.31513v1 https://arxiv.org/abs/2605.27295
Credential Leak Scanner
A CI/CD-integrated secret detection tool that uses a CodeBERT+CNN three-class classifier to distinguish genuine credentials, weak/placeholder values, and clean code, dramatically cutting the false-positive alert fatigue that causes developers to ignore security warnings. With 93% recall on real leaks and 33% fewer false positives, this outperforms regex-based tools like GitGuardian on nuanced cases. Build it as a GitHub Action, pre-commit hook, and VS Code extension with a self-hosted option for enterprise.
CI/CD pipeline security gates Code review automation Open-source repository scanning Enterprise secrets management auditing
https://arxiv.org/abs/2605.31520v1

Product Hunt Weekly

Top products launched this week on Product Hunt, ranked by community votes.

#1
Mina Meeting Assistant
Your AI Teammate now responds and executes during your calls
Productivity Artificial Intelligence No-Code
285
46
https://www.producthunt.com/r/BPRGI...
#2
SocialEcho 2.0
AI social media copilot for teams and agents
Social Media Marketing SaaS
244
93
https://www.producthunt.com/r/CRXS4...
#3
Dune Keypad
Context-aware Mac keypad, w/ Claude + community extensions
Productivity Developer Tools Artificial Intelligence
205
39
https://www.producthunt.com/r/7TXQT...
#4
Databox MCP
Chat with your business data inside Claude, ChatGPT and more
Productivity Analytics Artificial Intelligence
203
39
https://www.producthunt.com/r/QOQ2Y...
#5
folk
the AI in your texts that gets stuff done
Productivity Messaging Artificial Intelligence
187
44
https://www.producthunt.com/r/UH6J6...
#6
Typeahead
AI autocomplete for every app on your Mac
Productivity Writing Artificial Intelligence
166
20
https://www.producthunt.com/r/G2V2Y...
#7
Presentify
Take your presentation skills to the next level
Mac Sales Apple
145
33
https://www.producthunt.com/r/EFSDT...
#8
Trippple Club
Advertise together on Meta Ads and pay 3x less
Marketing Advertising Artificial Intelligence
118
25
https://www.producthunt.com/r/5ADGA...
#9
Open Caffeine
Keep your Mac awake
Open Source Developer Tools GitHub
105
7
https://www.producthunt.com/r/33DIG...
#10
Mistral Vibe
AI agent for long-running, multi-step work and coding
Productivity Artificial Intelligence
101
4
https://www.producthunt.com/r/3IEP3...

Trending Repos

Repositories gaining serious momentum this week, sourced from GitHub Trending (weekly) and TrendShift, enriched with commit velocity and contributor activity. Stars = total GitHub stars. "Stars this week" = new stars gained.
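One way to compare momentum across repositories of very different sizes is the share of total stars gained this week. The sketch below parses a stats row in the layout used in this section; the parsing helper and the assumption that the first number is total stars and the third is stars this week are illustrative, and the middle number is left uninterpreted.

```python
def parse_repo_row(row):
    """Parse 'language total middle weekly stars this week' into a tuple."""
    parts = row.split()
    language = parts[0]
    total_stars = int(parts[1].replace(",", ""))
    weekly_stars = int(parts[3].replace(",", ""))
    return language, total_stars, weekly_stars

def weekly_share(row):
    """Fraction of a repository's total stars that arrived this week."""
    _, total_stars, weekly_stars = parse_repo_row(row)
    return weekly_stars / total_stars

# Rows copied from two entries in this section.
established = "python 129,294 21,038 2,711 stars this week"
breakout = "typescript 48,478 3,934 22,750 stars this week"

print(f"established: {weekly_share(established):.1%} of stars gained this week")
print(f"breakout:    {weekly_share(breakout):.1%} of stars gained this week")
```

A large repository can add thousands of stars while growing only a few percent, whereas a breakout project may have gained close to half of its stars in a single week.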

1
GH Trending
anthropics/claude-code
python 129,294 21,038 2,711 stars this week
Anthropic's official Claude Code agentic coding tool has exploded to 129K stars, making it one of the fastest-growing AI coding tools. Terminal-native, codebase-aware agent handling full git workflows via natural language.
Build idea
A managed CI/CD service that uses Claude Code agents to automatically review PRs, fix failing tests, resolve merge conflicts, and ship hotfixes, sold as a monthly subscription to engineering teams who want autonomous code maintenance without human intervention.
2
GH Trending
openai/codex
rust 87,606 12,842 2,266 stars this week
OpenAI's official lightweight coding agent for the terminal (87K stars), enabling autonomous code generation and execution in a sandboxed environment. One of the most-starred coding agent repos and a reference implementation for terminal-based AI coding workflows.
Build idea
A no-code automation platform for non-technical founders that wraps Codex in a guided UI, letting users describe a feature or bug fix in plain English and receive a deployable code change, monetized per task or as a SaaS subscription.
3
GH Trending
OpenBMB/VoxCPM
python 24,071 2,772 4,234 stars this week
VoxCPM2 is a tokenizer-free TTS model from OpenBMB supporting multilingual speech generation, creative voice design, and voice cloning. The tokenizer-free approach is architecturally notable and the model shows strong community interest (24k stars, 4.2k stars/week).
Build idea
A voice-as-a-service API platform for game studios and audiobook publishers that lets creators design custom character voices, clone existing voice talent, and generate multilingual narration at scale, charged per audio minute generated.
4
GH Trending
anthropics/skills
python 145,061 17,081 4,653 stars this week
Anthropic's public Agent Skills repository with 145K stars and 4,653 new stars this week, a growing library of reusable agent capabilities that integrates with Claude Code and related tooling.
Build idea
A marketplace where developers publish, sell, and monetize reusable Claude agent skills (such as CRM integrations, data pipeline automations, or compliance checks) with a revenue-share model similar to the Salesforce AppExchange.
5
GH Trending
colbymchenry/codegraph
typescript 36,472 2,267 13,925 stars this week
CodeGraph pre-indexes codebases into a local knowledge graph for AI coding agents (Claude Code, Codex, Gemini CLI, Cursor), reducing token usage and tool calls. 13,925 stars this week signals strong developer demand for context-efficient coding agents.
Build idea
A developer tool SaaS that continuously indexes enterprise codebases into an optimized knowledge graph and serves it as a low-latency context API to any AI coding assistant, reducing LLM token costs by up to 80%, sold per seat to engineering teams.
6
GH Trending
microsoft/agent-governance-toolkit
python 3,645 519 1,657 stars this week
Microsoft's Agent Governance Toolkit provides policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents, explicitly covering all 10 OWASP Agentic Top 10 risks. Highly relevant for teams deploying production agents who need security guardrails.
Build idea
A compliance-as-a-service platform that audits, monitors, and enforces security policies for enterprise AI agent deployments, with real-time dashboards, OWASP Agentic Top 10 compliance reports, and automated remediation, sold to regulated industries like finance and healthcare.
7
GH Trending
openai/skills
python 21,043 1,418 840 stars this week
OpenAI's official Skills Catalog for Codex provides a structured library of reusable capabilities that agents can invoke, establishing a new paradigm for composable agent skill systems. Significant because it formalizes how OpenAI envisions modular agent capabilities.
Build idea
A B2B platform that lets enterprises build, version, and deploy internal agent skill libraries on top of OpenAI's Skills Catalog paradigm, with access controls and audit logs, sold as an enterprise SaaS to companies standardizing their internal AI agent workflows.
8
GH Trending
p-e-w/heretic
python 22,933 2,455 1,417 stars this week
Heretic is a tool for automatically removing censorship/safety filters from language models, achieving 22K+ stars rapidly. Directly relevant to LLM safety research and red-teaming: it demonstrates practical bypass techniques at scale.
Build idea
A red-teaming and LLM security auditing service that uses automated jailbreak and filter-bypass techniques to stress-test enterprise AI deployments, delivering detailed vulnerability reports and remediation guidance, sold as a subscription or one-time audit to AI product teams.
9
GH Trending
Genesis-Embodied-AI/genesis-world
python 29,137 2,755 299 stars this week
Genesis is a general-purpose robotics and embodied AI simulation platform with 29k+ stars, providing a unified environment for training and evaluating robotic agents. Established project with continued strong community interest.
Build idea
A cloud-based robotics simulation platform built on Genesis that lets hardware startups and research labs train, benchmark, and validate robotic agents in photorealistic environments before deploying to physical hardware, monetized via GPU compute hours and enterprise simulation licenses.
10
GH Trending
Lum1104/Understand-Anything
typescript 48,478 3,934 22,750 stars this week
Tool that converts any codebase into an interactive, searchable knowledge graph with LLM Q&A capabilities, compatible with major AI coding assistants. Explosive traction (22k stars in one week) suggests strong developer demand for codebase comprehension tooling.
Build idea
A SaaS onboarding tool for software teams that automatically ingests any codebase and generates an interactive knowledge graph with LLM-powered Q&A, cutting new developer ramp-up time from weeks to days, sold per seat to engineering managers at mid-to-large companies.

Trending Developers

Developers gaining traction on GitHub this week, shipping open-source AI tools, models, and frameworks worth following. Ranked by weekly trending position.

1
Soju06
@Soju06
Soju06/codex-lb
A load balancer and proxy for Codex/ChatGPT supporting multiple accounts with usage tracking and OpenCode-compatible endpoints; a useful infrastructure tool for teams managing AI API costs and rate limits.
2
Jeremy McSpadden
@jeremymcs
jeremymcs/patchdeck
Developer profile for PatchDeck, an autonomous GitHub PR/issue triage agent that dispatches local AI agents to fix code. Interesting concept but no technical depth here.
3
Lukasz Jagiello
@ljagiello
ljagiello/ctf-skills
Developer profile featuring a CTF-skills repo that provides agent-based skills for solving CTF challenges across web exploitation, binary pwn, crypto, and more. Interesting for AI-powered security automation but thin on technical detail from the profile alone.
4
Sasha Denisov
@DenisovAV
DenisovAV/flutter_gemma
Developer profile for a Flutter plugin that runs Gemma AI models locally on-device; marginally relevant as a pointer to on-device inference work.
5
lif
@majiayu000
majiayu000/spellbook
Developer profile for a 'spellbook' repo offering cross-runtime skills for Claude Code, Codex, and multi-agent workflows; minimal detail available.
6
Vivek Chand
@vivekchand
vivekchand/clawmetry
Developer profile for Clawmetry, a real-time observability dashboard for OpenClaw AI agents. Minimal information available to assess technical depth.
7
Jonny Burger
@JonnyBurger
JonnyBurger/vibe-skills
Trending developer profile; the popular repo is 'vibe-skills', not substantively AI-focused.
8
AutoJanitor
@Scottcjn
Scottcjn/Rustchain
Developer profile for a blockchain project with AI-powered hardware fingerprinting; tangentially AI-related but primarily a crypto/blockchain project.
9
NVIDIAN
@ai-hpc
ai-hpc/ai-hardware-engineer-roadmap
GitHub developer profile for an NVIDIA engineer focused on AI hardware roadmaps. Not a substantive technical resource.
10
Elie Steinbock
@elie222
elie222/inbox-zero
Developer profile for an AI email assistant; marginally AI-related but no technical substance here.
11
LoGin
@fslongjin
fslongjin/PPO-pytorch-gym
Developer profile with a PPO-from-scratch PyTorch repo; educational but not novel.
12
rUv
@ruvnet
ruvnet/RuView
Trending developer profile for RuView (WiFi-based spatial sensing); tangentially AI-related at best.
13
Alice Ryhl
@Darksonn
Darksonn/newton-riemann
Trending developer profile; creates Newton fractal visualizations, not AI-related.
14
Kai
@RealKai42
RealKai42/qwerty-learner
Developer profile for a keyboard-based vocabulary learning tool; not AI-relevant.
15
Savio Dsouza
@S3DFX-CYBER
S3DFX-CYBER/GSoC-Org-Finder-
Developer profile for a GSoC organization finder tool; not AI-relevant.
16
Sandeep Vashishtha
@SandeepVashishtha
SandeepVashishtha/Eventra
Developer profile for an event management system; not AI-relevant.
17
Amir Raminfar
@amir20
amir20/dozzle
Developer profile for a container log viewer tool; not AI-related.
18
Brett Chalupa
@brettchalupa
brettchalupa/usagi
Developer profile for a Lua 2D game engine; not AI-related.
19
dgtlmoon
@dgtlmoon
dgtlmoon/changedetection.io
Developer profile for a website change detection tool; not AI-related.
20
Krille-chan
@krille-chan
krille-chan/fluffychat
Trending GitHub developer profile for a Matrix/chat app developer; no AI relevance.
21
Nicolò Boschi
@nicoloboschi
nicoloboschi/seo-booster
Trending developer profile with an SEO booster repo; no meaningful AI relevance.
22
Paul D'Ambra
@pauldambra
pauldambra/ModulusChecker
Trending developer profile for a UK bank account modulus checking library; no AI relevance.
23
lauren
@poteto
poteto/hiring-without-whiteboards
Trending developer profile known for a hiring-without-whiteboards list; no AI relevance.
24
三咲雅 misaki masa
@sxyazi
sxyazi/yazi
GitHub developer profile for the author of Yazi, a Rust-based terminal file manager. Not AI-related.
25
yhirose
@yhirose
yhirose/cpp-httplib
GitHub developer profile for a C++ HTTP library author. Not AI-related.

Models & Benchmarks

New model releases, arena rankings, and benchmark results across frontier and open-source AI models this week. Arena Elo = LMSys battle rating. Trending = HuggingFace trending score. Buzz = AI relevance (0–10).

Arena Leaderboard: Top 15
#   Model                          Organization  Type    Elo   Votes
1   claude-opus-4-6-thinking       Anthropic     Closed  1502  34,186
2   claude-opus-4-7-thinking       Anthropic     Closed  1500  19,973
3   claude-opus-4-6                Anthropic     Closed  1498  36,512
4   claude-opus-4-7                Anthropic     Closed  1494  20,724
5   muse-spark                     Meta          Closed  1489  12,228
6   gemini-3.1-pro-preview         Google        Closed  1487  43,742
7   gemini-3-pro                   Google        Closed  1486  41,332
8   gpt-5.5-high                   OpenAI        Closed  1482  16,573
9   gpt-5.4-high                   OpenAI        Closed  1480  28,246
10  gemini-3.5-flash               Google        Closed  1479  9,045
11  gpt-5.5                        OpenAI        Closed  1476  16,852
12  gpt-5.2-chat-latest-20260210   OpenAI        Closed  1476  32,280
13  grok-4.20-beta1                xAI           Closed  1476  24,468
14  grok-4.20-beta-0309-reasoning  xAI           Closed  1475  29,068
15  qwen3.7-max-preview            Alibaba       Closed  1475  3,755
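For a sense of what these rating gaps mean, the sketch below applies the conventional Elo expected-score formula with a 400-point scale. That the leaderboard's ratings can be read this way is an assumption; the arena may fit its ratings differently, so treat the result as a rough guide only.

```python
def expected_score(rating_a, rating_b):
    """Conventional Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Ratings copied from rank 1 and rank 15 in the table above.
top = 1502
fifteenth = 1475

p = expected_score(top, fifteenth)
print(f"expected win share for rank 1 vs rank 15: {p:.3f}")
```

Under this reading, the 27-point spread across the whole top 15 corresponds to only a modest edge in head-to-head preference, somewhat above an even split.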
New & Trending Models
openbmb/MiniCPM5-1B
45,698 downloads 676 likes 554 trending
Open Source 2026-05-21
MiniCPM5-1B is a new 1B parameter edge model from OpenBMB with long-context support, tool-calling, and on-device AI capabilities, backed by multiple arXiv papers and 45k downloads with a trending score of 554. A 1B model with tool-calling and long-context at this quality level is a significant milestone for edge AI deployment.
LiquidAI/LFM2.5-8B-A1B
37,893 downloads 357 likes 349 trending
Custom License 2026-05-28
LiquidAI's LFM2.5-8B-A1B is a new MoE model with only 1B active parameters from an 8B total, targeting edge deployment with multilingual support across 10 languages. Strong trending score and download numbers suggest this is a notable efficient-inference release worth evaluating for on-device use cases.
deepseek-ai/DeepSeek-V4-Pro
5,851,826 downloads 4,521 likes 182 trending
Open Source 2026-04-22
DeepSeek-V4-Pro is the flagship open-weight model from DeepSeek with 5.8M+ downloads and 4.5k likes, representing one of the most capable openly available models. Its continued dominance in downloads makes it a key reference point for open-source LLM benchmarking.
sapientinc/HRM-Text-1B
149,543 downloads 437 likes 159 trending
Open Source 2026-05-17
HRM-Text-1B introduces a Hierarchical Reasoning Model architecture with prefix-LM and pre-alignment training, achieving 149K+ downloads and 437 likes, suggesting a novel approach to reasoning in compact models that's gaining significant traction.
deepseek-ai/DeepSeek-V4-Flash
3,511,636 downloads 1,337 likes 88 trending
Open Source 2026-04-22
DeepSeek-V4-Flash is the faster, lighter variant of DeepSeek-V4 with 3.5M+ downloads, positioned for lower-latency inference. Continued strong traction signals it as a go-to open model for production deployments.
openai/gpt-oss-120b
4,628,599 downloads 4,836 likes 24 trending
Open Source 2025-08-04
OpenAI's open-weight 120B model with 4.6M downloads and Apache 2.0 license, representing OpenAI's entry into the open-weight space. Significant for the ecosystem given OpenAI's historical closed approach.
openbmb/MiniCPM5-1B-GGUF
24,056 downloads 128 likes 60 trending
Open Source 2026-05-24
MiniCPM5-1B in GGUF format for edge/on-device deployment, supporting long-context and tool-calling with 24K+ downloads. Compact 1B model from OpenBMB targeting edge AI with multilingual support.
zai-org/GLM-5.1
142,323 downloads 1,722 likes 24 trending
Open Source 2026-04-03
GLM-5.1 from ZhipuAI (zai-org) is a MoE-DSA architecture model with 142K+ downloads and 1722 likes under MIT license. Strong adoption metrics suggest competitive performance; the MoE-DSA architecture tag warrants investigation.
LiquidAI/LFM2.5-8B-A1B-GGUF
55,212 downloads 142 likes 141 trending
Custom License 2026-05-24
Official GGUF quantization of LFM2.5-8B-A1B for llama.cpp, enabling local deployment of LiquidAI's efficient MoE model. High downloads (55k) confirm strong community interest.
MiniMaxAI/MiniMax-M2.7
1,882,843 downloads 1,170 likes 24 trending
Custom License 2026-04-09
MiniMax-M2.7 is an established open model with massive download traction (1.8M+). Trending again likely due to community use; not a new release but worth noting for its scale.
XiaomiMiMo/MiMo-V2.5-Pro
89,370 downloads 573 likes 23 trending
Open Source 2026-04-27
Xiaomi's MiMo-V2.5-Pro is a long-context, agent-optimized model with strong code and tool-calling capabilities. High downloads (89k) and MIT license make it a practical open alternative for agentic coding tasks.
nvidia/DeepSeek-V4-Pro-NVFP4
2,696 downloads 45 likes 44 trending
Open Source 2026-05-14
NVIDIA's NVFP4 quantization of DeepSeek-V4-Pro using ModelOpt, enabling more efficient inference on NVIDIA hardware. Represents NVIDIA's push to optimize frontier open models for their GPU stack.
nvidia/Nemotron-Labs-Diffusion-14B
7,225 downloads 134 likes 40 trending
Custom License 2026-04-22
NVIDIA's Nemotron-Labs-Diffusion-14B is a diffusion-based language model, an alternative architecture to autoregressive transformers for text generation. Noteworthy as a non-autoregressive LLM from a major lab.
nvidia/Qwen3.6-35B-A3B-NVFP4
171,588 downloads 114 likes 110 trending
Open Source 2026-05-27
NVIDIA's FP4 quantization of Qwen3.6-35B-A3B MoE model using ModelOpt, with 171k downloads indicating strong adoption. Demonstrates practical FP4 inference for large MoE models on NVIDIA hardware.
openbmb/BitCPM-CANN-8B
4,748 downloads 98 likes 46 trending
Open Source 2026-05-15
BitCPM-CANN-8B is an 8B model from OpenBMB optimized for Huawei's CANN (Compute Architecture for Neural Networks) hardware. Notable for targeting non-NVIDIA AI accelerators.

Trending Spaces

The hottest interactive demos and apps on HuggingFace Spaces this week; try them live. Flame icon = HuggingFace trending score. Hearts = community likes.

Carbon
HuggingFaceBio
docker 153 56
HuggingFace Bio's Carbon demo space for biological/scientific AI applications. Limited metadata makes it hard to assess technical depth.
Qwen Image Edit 2509 LoRAs Fast
Onise
gradio 124 37
apache-2.0
Demo space for a collection of Qwen image editing LoRAs. Incremental tooling on top of existing Qwen image editing capabilities.
Pixal3D
TencentARC
gradio 292 35
Tencent ARC's Pixal3D offers high-fidelity pixel-aligned image-to-3D generation, a competitive space with strong interest (292 likes). Pixel-alignment is a meaningful technical differentiator for 3D reconstruction quality.
Build Small Hackathon Registration
build-small-hackathon
gradio 67 31
mit
Registration page for a hackathon focused on small AI models. Not technically substantive.
Lance
bytedance-research
gradio 85 68
ByteDance Research's Lance is a unified model for image/video generation, editing, and understanding, a multimodal system from a major lab worth tracking as it matures.
VGGT-Omega Demo
facebook
gradio 50 32
other
Meta's VGGT-Omega demo enables 3D reconstruction from images and video, building on the VGGT line of feed-forward 3D models. Represents continued push toward fast, generalizable 3D scene understanding.
OmniVoice
k2-fsa
gradio 944 33
apache-2.0
OmniVoice claims high-quality voice cloning TTS across 600+ languages, a notable breadth claim. Strong community interest (944 likes) but technical details require deeper investigation.
TRELLIS.2
microsoft
gradio 1,673 39
mit
Microsoft's TRELLIS.2 is an updated high-fidelity 3D generation model from images with strong community adoption (1673 likes). Represents continued SOTA progress in image-to-3D from a major lab.
Z Image Turbo
mrfakename
gradio 3,275 41
High-traction image generation demo (3275 likes) but no technical description available to assess novelty or architecture.
LocateAnything
nvidia
gradio 113 112
NVIDIA's LocateAnything space for open-vocabulary object localization, with strong trending score. Relevant for grounding and detection research in the open-world setting.
Bonsai Image GPU
prism-ml
docker 38 37
GPU demo for Bonsai-Image-4B image generation models. Limited information on what differentiates this from other image generation models.
FireRed Image Edit 1.0 Fast
prithivMLmods
gradio 1,372 38
apache-2.0
Another Qwen-based image editing demo combining FireRed and Qwen-Image-Edit-Rapid. Incremental wrapper on existing models.
Qwen-Image-Edit-2511-LoRAs-Fast
prithivMLmods
gradio 1,552 50
apache-2.0
Collection of Qwen image editing LoRAs demo. Derivative of existing Qwen image editing work with no novel technical contribution.
Wan2.2 14B Preview
r3gm
gradio 2,680 39
Wan2.2 14B image-to-video generation demo with FP8 quantization for faster inference. High community interest (2680 likes) but this is a duplicate of the faster preview below.
Wan2.2 14B Fast Preview
r3gm
gradio 1,467 91
Optimized Wan2.2 14B video generation demo using FP8 and AOTI compilation for faster inference. The inference optimization angle (FP8+AOTI) is the technically interesting aspect here.

Conference Papers

Accepted papers from top AI conferences via OpenReview.

Showing accepted papers from active venues. Next deadlines: ICML 2026 (submissions open), NeurIPS 2026 (coming soon).

ICLR 2026 Pierre-Carl Langlais, Pavel Chizhov, Catherine Arnett et al. 2026-06-01
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
ICLR 2026 paper introducing Common Corpus, the largest openly licensed dataset for LLM pre-training, directly addressing legal/copyright concerns around training data. A critical resource for open-source LLM development.
dataset pre-training large language models open data open science
ICLR 2026 Mouath Abu Daoud, Leen Kharouf, Omar El Hajj et al. 2026-06-01
MedAraBench: Large-scale Arabic Medical Question Answering Dataset and Benchmark
ICLR 2026 benchmark for Arabic medical QA, addressing a significant gap in multilingual NLP for healthcare. A useful resource, but domain-specific and incremental.
Dataset Benchmark Large Language Models Arabic Natural Language Processing Medical Question Answering
ICLR 2026 Zhiheng Chen, Ruofan Wu, Guanhua Fang et al. 2026-06-01
Transformers as Unsupervised Learning Algorithms: A study on Gaussian Mixtures
ICLR 2026 theoretical study showing transformers can implicitly perform unsupervised learning on Gaussian mixtures during inference. Advances theoretical understanding of in-context learning mechanisms.
In-context learning Gaussian Mixture Models Theory
ICLR 2026 Ron Vainshtein, Zohar Rimon, Shie Mannor et al. 2026-06-01
Task Tokens: A Flexible Approach to Adapting Behavior Foundation Models
ICLR 2026 paper introducing Task Tokens for adapting transformer-based behavior foundation models in humanoid control without full retraining. A practical method for steering large robotic policies toward new tasks.
Reinforcement Learning Hierarchical Reinforcement Learning Behavior Foundation Models Humanoid Control
ICLR 2026 Kaien Sho, Shinji Ito 2026-06-01
Submodular Function Minimization with Dueling Oracle
ICLR 2026 theoretical work on submodular function minimization with noisy pairwise comparison oracles. Mathematically interesting but tangentially relevant to mainstream AI/ML practice.
submodular minimization dueling oracle preference-based optimization
ICLR 2026 Rongjin Li, Zichen Tang, Xianghe Wang et al. 2026-06-01
Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning
ICLR 2026 benchmark evaluating MLLMs on scan-oriented academic paper reasoning (figures, tables, equations) rather than simple retrieval. Reveals significant gaps in current models' ability to do autonomous research.
Multimodal Large Language Models Academic Paper Reasoning Scan-Oriented Reasoning
ICLR 2026 Peng Sun, Tao Lin 2026-06-01
Any-step Generation via N-th Order Recursive Consistent Velocity Field Estimation
ICLR 2026 paper proposing N-th order recursive consistent velocity field estimation for any-step generation, simplifying consistency model training while maintaining quality. An incremental improvement to few-step diffusion.
Generative Models
ICLR 2026 Zeyu Feng, Haiyan Yin, Yew-Soon Ong et al. 2026-06-01
Masked Skill Token Training for Hierarchical Off-Dynamics Transfer
ICLR 2026 hierarchical RL framework using masked skill tokens for offline policy transfer across environments with different dynamics. A solid contribution to offline RL generalization.
Transfer Learning Skills Hierarchical RL Embodied AI
ICLR 2026 Shaojie Li, Pengwei Tang, Bowei Zhu et al. 2026-06-01
High Probability Bounds for Non-Convex Stochastic Optimization with Momentum
ICLR 2026 paper providing high-probability convergence and generalization bounds for SGD with momentum in non-convex settings. Theoretically rigorous but an incremental contribution to optimization theory.
Momentum nonconvex learning generalization
ICLR 2026 Artyom Sorokin, Nazar Buzun, Aleksandr Anokhin et al. 2026-06-01
Q-RAG: Long Context Multi‑Step Retrieval via Value‑Based Embedder Training
ICLR 2026 paper applying RL-based value training to embedders for multi-step RAG over long contexts. Addresses a real limitation of single-step retrieval for complex multi-hop questions.
Reinforcement Learning RL QA Long-context RAG
ICLR 2026 Seongtae Hong, Youngjoon Jang, Jungseob Lee et al. 2026-06-01
Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment
ICLR 2026 work on cross-lingual alignment for information retrieval. A solid multilingual IR contribution but incremental relative to existing CLIR literature.
Cross-Lingual Alignment Information Retrieval Multilingual Embedding Cross-Lingual Information Retrieval
ICLR 2026 Rahul Ramachandran, Ali Garjani, Roman Bachmann et al. 2026-06-01
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
ICLR 2026 systematic benchmark of GPT-4o, o4-mini, Gemini 1.5 Pro on standard CV tasks (depth, segmentation, optical flow). Reveals that frontier multimodal models still lag specialized vision models on core perception tasks.
vision benchmark multimodal foundation models vision language models standard computer vision tasks
ICLR 2026 Tin Hadži Veljković, Erik J Bekkers, Michael Tiemann et al. 2026-06-01
CORDS - Continuous Representations of Discrete Structures
ICLR 2026 paper introducing continuous representations for variable-cardinality discrete structures using neural fields and flow matching. A novel approach applicable to object detection and molecular modeling.
Continuous set representations Neural fields Variable-cardinality prediction Invertible encoding/decoding Diffusion and flow matching
ICLR 2026 Christopher Mitcheltree, Vincent Lostanlen, Emmanouil Benetos et al. 2026-06-01
SCRAPL: Scattering Transform with Random Paths for Machine Learning
ICLR 2026 paper proposing randomized path sampling in scattering transforms to reduce computational cost for perceptual quality metrics in audio/vision. Niche but useful for differentiable signal processing.
scattering transform wavelets stochastic optimization ddsp perceptual quality assessment
ICLR 2026 Antanas Žilinskas, Robert Noel Shorten, Jakub Marecek et al. 2026-06-01
EVEREST: A Transformer for Probabilistic Rare-Event Anomaly Detection with Evidential and Tail-Aware Uncertainty
ICLR 2026 transformer architecture for rare-event forecasting in time series using evidential deep learning and extreme value theory. Addresses severe class imbalance in anomaly detection with principled uncertainty quantification.
Transformer models Uncertainty quantification Evidential deep learning Extreme value theory Imbalanced classification
ICLR 2026 Harris Abdul Majid, Pietro Sittoni, Francesco Tudisco et al. 2026-06-01
Test-Time Accuracy-Cost Control in Neural Simulators via Recurrent-Depth
ICLR 2026 recurrent-depth neural simulator that allows test-time accuracy-cost trade-offs analogous to classical numerical methods. Useful for scientific computing applications needing adaptive precision.
Neural Simulator Recurrent Depth AI4Simulation
ICLR 2026 Kun XIE, Peng Zhou, Xingyi Zhang et al. 2026-06-01
PoinnCARE: Hyperbolic Multi-Modal Learning for Enzyme Classification
ICLR 2026 paper using hyperbolic space for multi-modal enzyme classification to capture hierarchical EC number relationships. A domain-specific bioinformatics contribution with limited general AI impact.
EC number prediction enzyme function hyperbolic space learning multi-modal learning enzyme structure
ICLR 2026 Tianqiao Liu, Xueyi Li, Hao Wang et al. 2026-06-01
From Text to Talk: Audio-Language Model Needs Non-Autoregressive Joint Training
ICLR 2026 paper arguing that speech-to-speech LLMs need non-autoregressive joint training of audio and text tokens to reduce latency and improve coherence. Addresses a key bottleneck in real-time voice AI systems.
Large Multimodal Models Multi-token Prediction Non-Autoregressive Learning
ICLR 2026 Qinglong Yang, Haoming Li, Haotian Zhao et al. 2026-06-01
FingerTip 20K: A Benchmark for Proactive and Personalized Mobile LLM Agents
ICLR 2026 benchmark (20K tasks) for proactive and personalized mobile GUI agents that act without explicit instructions by inferring user intent. Pushes beyond reactive agent paradigms toward anticipatory AI assistants.
Mobile Agent LLM Agent GUI Proactive Agent Personalization
ICLR 2026 Tianxiang Dai, Jonathan Fan 2026-06-01
Characterizing and Optimizing the Spatial Kernel of Multi Resolution Hash Encodings
ICLR 2026 paper providing rigorous spatial kernel analysis of multi-resolution hash encodings (NeRF/Instant-NGP) and principled hyperparameter selection. Useful for practitioners building neural field applications.
multi-resolution hash encoding implicit neural representations neural fields point spread function spatial kernel analysis

Deep Dive

All 338 items scored and categorized. Relevance scores reflect novelty, technical depth, and practical impact; items scoring 7+ are the ones worth your time.
