Weekly Intelligence

AI Quick Bites

June 22, 2026 · 348 items from 11 sources

Last refreshed: June 22, 2026 at 17:43 UTC
Next refresh: June 29, 2026 at 09:00 UTC
Created by Vatsal Bagri

Highlights

The five most consequential developments in AI this week — selected from 348 items across 11 sources. These are the things an AI engineer, researcher, or founder needs to know.

02
Sound probabilistic verification framework for agentic systems with rigorous policy violation bounds—critical for production deployment of autonomous agents.
arxiv 2026-06-22 16 min
03
Certificate-bound execution enforcement for agentic infrastructure separates reasoning from mutation authority—essential security primitive for agent control planes.
arxiv 2026-06-22 16 min
04
4-bit KV-cache compression with 3.47x latency reduction addresses the inference bottleneck for long-context agentic workloads at scale.
arxiv 2026-06-22 18 min
05
Egocentric human video outperforming robot data for embodied pretraining unlocks a scalable paradigm for foundation model training without expensive robot collection.
arxiv 2026-06-22 15 min

What Changed This Week

Week-over-week diff showing new arrivals, items gaining momentum, and topics that dropped off the radar. All scores are AI relevance (0–10).

AI Security

Novel attack vectors, jailbreak research, red-teaming findings, and defensive tools across the AI security landscape. Only items with genuine technical substance make it here. Scores are AI relevance (0–10): 7+ important, 9+ landmark.

Show HN: Deep-XPIA – Prompt injection benchmark for multi-agent AI systems
8.5/10
Deep-XPIA: first comprehensive prompt injection benchmark for multi-agent AI systems, enabling systematic evaluation of cross-agent attack vectors and defenses.
hackernews 2026-06-22 8 min
Analyzing Defensive Misdirection Against Model-Guided Automated Attacks on Agentic AI Systems
8/10
Analyzes how detect-and-misdirect defenses against automated jailbreak attacks bound the attack success rate (ASR) by inducing false positives in model-guided judges; CMPE reduces ASR upper bounds by up to 100x on PAIR/GPTFuzz benchmarks.
arxiv 2026-06-22 20 min
Spilling the Beans: Teaching LLMs to Self-Report Their Hidden Objectives
8/10
Develops techniques to train LLMs to self-report hidden objectives through honesty fine-tuning, enabling better interrogation and alignment auditing of agentic AI systems.
conferences 2026-06-22 18 min
Decomposing LLM Computation with Jets
8/10
Jet Expansions framework decomposes entangled LLM computations into modular components, improving interpretability and auditability by expanding transformer operations into n-gram-like structures.
conferences 2026-06-22 18 min
AutoJack: one malicious web page can hijack an AI browser agent into full RCE via a privileged local service
8/10
AutoJack attack demonstrates how a single malicious webpage can hijack browser agents and escalate to RCE via privileged local services, exposing critical vulnerabilities in autonomous agent architecture.
reddit 2026-06-22 6 min
A public Sentry key is all it takes to hijack Claude Code, Cursor, and Codex
8/10
AgentJacking attack: exposed Sentry keys enable hijacking of Claude Code, Cursor, and Codex by exploiting MCP.
hackernews 2026-06-22 6 min
They Looked Inside Claude’s AI's Mind. It Got Weird — Two Minute Papers
8/10
Two Minute Papers on Anthropic's Natural Language Autoencoders research—mechanistic interpretability breakthrough enabling direct inspection of Claude's internal representations and reasoning.
youtube 2026-06-22 3 min
NVIDIA/SkillSpector
7.5/10
NVIDIA SkillSpector: security scanner for AI agent skills detecting vulnerabilities and malicious patterns, addressing emerging agent safety concerns.
github 2026-06-22 4 min
Efficient and Sound Probabilistic Verification for AI Agents
7/10
Introduces sound probabilistic verification for AI agents using distributionally robust optimization; computes rigorous upper bounds on policy violation probability without independence assumptions.
arxiv 2026-06-22 16 min
Sovereign Execution Brokers: Enforcing Certificate-Bound Authority in Agentic Control Planes
7/10
Sovereign Execution Broker enforces certificate-bound authority in agentic control planes via runtime verification; separates proposal, admission, and execution with signed decision records and revocation support.
arxiv 2026-06-22 16 min
GPT-5.5 hallucinates 3x more than MIT-licensed GLM-5.2
7/10
Empirical analysis shows GPT-5.5 hallucinates 3x more than MIT-licensed GLM-5.2, challenging assumptions that scale improves reliability. Important finding on model quality vs. size tradeoffs with open-source alternatives.
hackernews 2026-06-22 8 min
Fairness via Independence: A General Regularization Framework for Machine Learning
7/10
Proposes fairness regularization framework using statistical independence to mitigate bias and demographic disparities in ML models, addressing systematic correlation with sensitive attributes.
conferences 2026-06-22 18 min
So much for guardrails
7/10
SearchLeak prompt injection vulnerability in Copilot allowing extraction of 2FA codes; demonstrates systemic pattern of LLM feature shipping with inadequate security review.
reddit 2026-06-22 3 min
Agent Privacy
6/10
Research on privacy vulnerabilities in LLM agents, examining information leakage through agent interactions and memory. Relevant to emerging security concerns in agentic systems.
hackernews 2026-06-22 12 min
microsoft/presidio
6/10
Microsoft's PII detection and redaction framework using NLP and pattern matching across text, images, and structured data; 787 new stars this week indicates growing adoption for data privacy in AI pipelines.
github 2026-06-22 5 min

Top Contributors

Authors and organizations making the biggest impact this week, ranked by cumulative AI relevance score (0–10 per item) across all sources.
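The ranking rule described above can be sketched in a few lines. This is a minimal illustration, assuming a contributor's total is the plain sum of their per-item scores; the input shape, the function name `rank_contributors`, and the sample names and scores are hypothetical, not this week's data.

```python
from collections import defaultdict

def rank_contributors(items):
    """Rank contributors by the sum of their per-item relevance scores.

    `items` is a list of (contributor, score) pairs. This input shape is an
    assumption for illustration, not the digest's actual schema.
    """
    scores = defaultdict(list)
    for contributor, score in items:
        scores[contributor].append(score)
    rows = [
        (name, len(vals), round(sum(vals) / len(vals), 1), round(sum(vals), 1))
        for name, vals in scores.items()
    ]
    # Highest cumulative score first; ties keep first-seen order (stable sort).
    return sorted(rows, key=lambda row: row[3], reverse=True)

# Hypothetical scores, chosen only to show the arithmetic.
sample = [("alice", 4.0), ("bob", 8.0), ("alice", 4.5), ("alice", 4.5)]
ranking = rank_contributors(sample)
for name, count, avg, total in ranking:
    print(f"{name}: {count} items, avg {avg}/10, total {total}")
```

Because the ranking uses the sum rather than the average, a contributor with several modestly scored items can outrank one with a single high-scoring item.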

Top Authors
#1
build-small-hackathon
3 items · avg 4.3/10
13.0
#2
r3gm
2 items · avg 4.5/10
9.0
#3
8.0
#4
8.0
#5
8.0
#6
Christopher Mitcheltree
1 item · avg 8.0/10
8.0
Top Organizations
#1
andrewyng
2 items · avg 8.0/10
16.0
#2
continuedev
2 items · avg 8.0/10
16.0
#3
openinterpreter
2 items · avg 8.0/10
16.0
#4
Kilo-Org
2 items · avg 7.5/10
15.0
#5
NVIDIA
2 items · avg 7.5/10
15.0
#6
Panniantong
2 items · avg 7.5/10
15.0

Build Ideas

Actionable product ideas distilled from this week's highest-scoring research and discussions. Each includes specific use cases and the source material that inspired it.

Agent Security Firewall
A runtime security layer for LLM agents that combines certificate-bound authority enforcement, misdirection defenses against jailbreak attacks, and probabilistic violation bounds — all in a single middleware SDK. As enterprises deploy autonomous agents via platforms like Claude Corps, the attack surface explodes and teams have no unified tool to audit, block, or log policy violations before execution. Build this as an open-source proxy that wraps any agent framework with signed decision records, revocation support, and real-time ASR monitoring.
Enterprise agentic workflow governance · Multi-agent system auditing and compliance · Customer service bot policy enforcement · Automated red-teaming and jailbreak detection
https://arxiv.org/abs/2606.20520 https://arxiv.org/abs/2606.20470 https://arxiv.org/abs/2606.20510 https://anthropic.com/news/claude-corps
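One piece of this idea, signed decision records with revocation, can be sketched with the Python standard library. This is an illustrative sketch, not the design from the linked papers: it assumes a shared-secret HMAC instead of certificates, and `SECRET_KEY`, `REVOKED_IDS`, `sign_decision`, `verify_decision`, and the record fields are all hypothetical names.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared secret
REVOKED_IDS = set()  # decision ids whose authority has been withdrawn

def sign_decision(decision_id, agent, action, allowed):
    """Return a decision record with an HMAC-SHA256 signature attached."""
    record = {"id": decision_id, "agent": agent, "action": action, "allowed": allowed}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["sig"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_decision(record):
    """Accept a record only if it is unrevoked, untampered, and allowed."""
    if record["id"] in REVOKED_IDS:
        return False
    unsigned = {k: v for k, v in record.items() if k != "sig"}
    payload = json.dumps(unsigned, sort_keys=True).encode("utf-8")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information during comparison.
    return hmac.compare_digest(expected, record["sig"]) and record["allowed"]

record = sign_decision("d-1", "billing-agent", "refund:42", allowed=True)
ok_before = verify_decision(record)        # signature matches, not revoked
tampered = dict(record, action="refund:9999")
ok_tampered = verify_decision(tampered)    # payload changed after signing
REVOKED_IDS.add("d-1")
ok_after = verify_decision(record)         # authority revoked
print(ok_before, ok_tampered, ok_after)
```

The executor only acts on records that pass `verify_decision`, so the component that proposes an action never holds the authority to carry it out on its own.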
On-Device Context Optimizer
A drop-in inference optimization toolkit that applies 4-bit KV-cache compression and execution-state checkpointing to dramatically reduce latency and memory pressure for long-context local LLM deployments. With OpenAI losing billions on inference costs and open models like Apertus gaining traction, there is a massive market for tools that make local and edge inference economically viable. Package UltraQuant-style asymmetric quantization and FlashRT-style sub-millisecond state restore into a developer-friendly library targeting consumer GPUs and on-device AI chips.
Local LLM serving on consumer hardware · Edge AI for privacy-sensitive enterprise workloads · Long-context coding and document agents · Cost reduction for self-hosted model deployments
https://arxiv.org/abs/2606.20474 https://arxiv.org/abs/2606.20537 https://apertvs.ai https://arstechnica.com/ai/2026/06/leake...
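As a rough illustration of what 4-bit asymmetric quantization does to a block of cache values, the sketch below maps floats to integer codes in 0..15 using a per-block scale and offset. It is generic min-max quantization in pure Python, not UltraQuant's actual algorithm, which this digest does not describe. The function names and sample values are hypothetical.

```python
def quantize_4bit(values):
    """Asymmetric min-max quantization of a block of floats to 4-bit codes."""
    lo, hi = min(values), max(values)
    # 4 bits give 16 levels (0..15); guard against a constant block.
    scale = (hi - lo) / 15 if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo

def dequantize_4bit(codes, scale, lo):
    """Reconstruct approximate floats from 4-bit codes."""
    return [lo + c * scale for c in codes]

block = [-1.0, -0.4, 0.0, 0.35, 2.0]  # hypothetical cache values
codes, scale, offset = quantize_4bit(block)
restored = dequantize_4bit(codes, scale, offset)
max_error = max(abs(a - b) for a, b in zip(block, restored))
print(codes, max_error)
```

The reconstruction error per value is bounded by half the scale, which is why a narrow value range per block matters: the smaller the range, the finer the 16 levels.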
Hallucination Benchmark Dashboard
A continuously updated, open leaderboard that tracks hallucination rates, factual accuracy, and reliability metrics across major LLMs — going beyond perplexity and FID scores to surface real failure modes that matter to practitioners. The finding that GPT-5.5 hallucinates 3x more than a smaller open model shows that the community lacks trustworthy, reproducible evaluation infrastructure. Build a platform that runs standardized test suites on a schedule, reports confidence intervals across seeds, and lets users submit custom evaluation domains.
Model selection for high-stakes enterprise use cases · Open-source vs. proprietary model comparison · Domain-specific reliability testing (legal, medical, finance) · Procurement and vendor evaluation tooling
https://arrowtsx.dev/bigger-models https://arxiv.org/abs/2606.20536 https://arxiv.org/abs/2606.20502
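Reporting confidence intervals across seeds could look like the following. This is a minimal sketch assuming a normal approximation with z = 1.96 for a 95% interval; with only a handful of seeds, a Student's t critical value would be more appropriate. The rates are made-up numbers for illustration, and `mean_with_ci` is a hypothetical helper.

```python
import math
import statistics

def mean_with_ci(rates, z=1.96):
    """Mean and normal-approximation confidence interval over per-seed rates."""
    mean = statistics.mean(rates)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(rates) / math.sqrt(len(rates))
    return mean, mean - z * sem, mean + z * sem

# Hypothetical hallucination rates from five seeds of the same test suite.
seed_rates = [0.12, 0.15, 0.11, 0.14, 0.13]
mean, low, high = mean_with_ci(seed_rates)
print(f"hallucination rate: {mean:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Two models whose intervals overlap heavily should not be ranked against each other on that metric without more runs.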
Self-Evolving Coding Agent
A coding assistant that uses memory-driven self-evolution and probe-and-refine repository guidance to continuously improve its own performance on a developer's specific codebase over time. Unlike static copilots, this agent accumulates cross-session evidence about which fixes work, which patterns recur, and how the repository is structured — reducing token costs while improving resolve rates. Combine MAA-style cross-batch memory with probe-and-refine tuning to build an agent that gets measurably better the longer a team uses it.
Long-running software engineering projects · Legacy codebase modernization · Automated bug triage and patch generation · CI/CD-integrated autonomous code review
https://arxiv.org/abs/2606.20475 https://arxiv.org/abs/2606.20512
Radiology AI Co-Pilot
A clinical decision support tool that combines spatially grounded vision-language models with efficient long-video reasoning to assist radiologists with report generation, visual QA, and anomaly localization across CT and MRI scans. The RefRad2D dataset and RadGrounder architecture demonstrate that automatic spatial grounding at scale is now feasible without manual annotation, making this a realistic near-term product. Build a HIPAA-compliant web app where radiologists can query scans in natural language and receive bounding-box-level evidence alongside generated report drafts.
Hospital radiology department workflow automation · Teleradiology and remote diagnostic support · Medical education and trainee feedback · Second-opinion tools for rare or ambiguous findings
https://arxiv.org/abs/2606.20477 https://arxiv.org/abs/2606.20561

Product Hunt Weekly

Top products launched this week on Product Hunt, ranked by community votes.

#1
Skybridge
The full-stack open source React framework for MCP Apps
Open Source · Developer Tools · Artificial Intelligence
358
86
https://www.producthunt.com/r/WYDUM...
#2
AgentX
Evaluate AI agent, pinpoint issues, and fix with one click.
Analytics · Developer Tools · Artificial Intelligence
286
91
https://www.producthunt.com/r/MAKIX...
#3
Alai 2.0
AI design partner for presentations, social posts, and more
Design Tools · Productivity · Artificial Intelligence
215
40
https://www.producthunt.com/r/FXYXC...
#4
HAQQ Legal AI on Mobile
Bringing legal understanding to anyone with a phone
Legal · Artificial Intelligence
174
6
https://www.producthunt.com/r/ICAKR...
#5
readywhen
Your 24/7 AI Chief of Staff for commitments and follow-ups
Productivity · Task Management · Virtual Assistants
173
38
https://www.producthunt.com/r/STOAZ...
#6
Cloudflare Temporary Accounts
Let agents deploy before signup
Developer Tools · Artificial Intelligence
142
7
https://www.producthunt.com/r/WFEAP...
#7
uwait
Get paid while AI thinks
Advertising · Artificial Intelligence · Search
140
27
https://www.producthunt.com/r/ET2VD...
#8
AirJelly
Your Proactive, Self-Organizing Second Brain
Productivity · Artificial Intelligence · Virtual Assistants
121
3
https://www.producthunt.com/r/PMQKF...
#9
Selector Forge
Browser extension for AI-generated resilient selectors
Chrome Extensions · Open Source · Developer Tools
111
11
https://www.producthunt.com/r/JB3WO...
#10
MediaSeg
Split large media files into upload-ready chunks on macOS
Mac · Productivity · Meetings
111
10
https://www.producthunt.com/r/CDZ6C...

Trending Repos

Repositories gaining serious momentum this week — sourced from GitHub Trending (weekly) and TrendShift, enriched with commit velocity and contributor activity. Stars = total GitHub stars. "Stars this week" = new stars gained.

1
GH Trending
andrewyng/aisuite
python 14,803 1,555 289 stars this week
Unified Python interface abstracting multiple generative AI providers (OpenAI, Claude, Gemini, etc.) with a single API, reducing vendor lock-in and enabling easy model switching.
Build idea
Build a SaaS AI gateway that lets enterprises route prompts across multiple LLM providers with cost optimization, automatic failover, and usage analytics — all through a single unified API key.
145 issues
2
GH Trending
continuedev/continue
typescript 34,254 4,767 577 stars this week
Open-source coding agent with 34k+ stars that integrates into IDEs as an agentic assistant for code generation and refactoring; 577 stars this week indicates strong momentum.
Build idea
Offer a managed, enterprise-grade coding assistant platform built on Continue with custom model hosting, team-level code context, audit logs, and SSO for regulated industries like finance and healthcare.
33 commits/mo 945 issues
3
GH Trending
openinterpreter/openinterpreter
rust 64,089 5,555 165 stars this week
Lightweight coding agent for open models (DeepSeek, Kimi, Qwen) with 64k+ stars; demonstrates the shift toward open-source agent frameworks as an alternative to proprietary Claude Code.
Build idea
Create a no-code automation platform for non-technical business users where they describe tasks in plain English and an open-model agent executes them locally — eliminating the need for expensive proprietary AI subscriptions.
686 commits/mo 270 issues
4
GH Trending
Kilo-Org/kilocode
typescript 23,914 2,767 3,674 stars this week
Kilo: open-source agentic engineering platform for autonomous coding agents with 3,674 stars this week, showing strong adoption momentum.
Build idea
Launch a managed autonomous software engineering service where businesses submit feature requests or bug tickets and a Kilo-powered agent fleet delivers tested, reviewed pull requests with minimal human intervention.
1,435 commits/mo 809 issues
5
GH Trending
NVIDIA/SkillSpector
python 9,327 728 4,055 stars this week
NVIDIA SkillSpector: security scanner for AI agent skills detecting vulnerabilities and malicious patterns, addressing emerging agent safety concerns.
Build idea
Build an AI agent security auditing SaaS that continuously scans enterprise agent skill libraries and third-party plugins for vulnerabilities, generating compliance reports for SOC 2 and ISO 27001 certification.
32 commits/mo 86 issues
6
GH Trending
Panniantong/Agent-Reach
python 37,689 2,992 8,233 stars this week
Agent-Reach provides agents with web scraping capabilities across Twitter, Reddit, YouTube, GitHub, and Chinese platforms via a single CLI with zero API fees.
Build idea
Offer a competitive intelligence SaaS that deploys Agent-Reach to monitor brand mentions, competitor activity, and trending topics across social platforms and delivers daily AI-summarized briefings to marketing teams.
39 commits/mo 88 issues
7
GH Trending
AlexsJones/llmfit
rust 28,486 1,748 546 stars this week
llmfit: unified CLI tool for discovering which LLM models run on specific hardware across hundreds of models and providers.
Build idea
Build a hardware-aware LLM deployment advisor that helps companies select and right-size the best open-source models for their existing GPU infrastructure, reducing cloud spend and avoiding costly over-provisioning.
44 commits/mo 80 issues
8
GH Trending
LMCache/LMCache
python 9,604 1,376 506 stars this week
LMCache optimizes the KV-cache layer for LLM inference, reducing memory overhead and latency for production deployments.
Build idea
Offer a managed LLM inference optimization layer as a service, where AI startups plug in LMCache to cut their GPU costs and latency without needing to manage infrastructure tuning themselves.
339 issues
9
TrendShift
calesthio/OpenMontage
2,500 198
Open-source agentic video production system with 12 pipelines and 52 tools; demonstrates practical multi-step agent orchestration for creative workflows.
Build idea
Launch an AI-powered video production SaaS for content creators and marketing teams that autonomously generates, edits, and assembles branded video content from a script or brief using OpenMontage's agent pipelines.
82 issues
10
GH Trending
cjpais/Handy
rust 24,491 2,070 752 stars this week
Fast, offline speech-to-text application built in Rust with 24k stars; enables local voice processing without cloud dependencies, useful for privacy-sensitive AI applications.
Build idea
Build a privacy-first transcription and voice command product for healthcare providers and legal professionals that runs entirely on-device, ensuring HIPAA compliance with zero data ever leaving the user's machine.
4 commits/mo 178 issues

Trending Developers

Developers gaining traction on GitHub this week — shipping open-source AI tools, models, and frameworks worth following. Ranked by weekly trending position.

1
Eric Buehler
@EricLBuehler 2,111 101 repos
@huggingface
EricLBuehler/mistral.rs
Rust 7,347 629
Fast, flexible LLM inference
2
Myriad-Dreamin
@Myriad-Dreamin 940 160 repos
Myriad-Dreamin/tinymist
Rust 3,338 167
Tinymist [ˈtaɪni mɪst] is an integrated language service for Typst [taɪpst].
3
Philipp Burckhardt · @SocketDev, @stdlib-js
@Planeshifter 621 159 repos
Securing Software Supply Chains at @SocketDev | Scientific computing for the web via @stdlib-js
4
Fagner Brack · The Internet
@FagnerMartinsBrack 354 11 repos
Co-author @ js-cookie; Researcher & Software Engineer; publisher of fagnerbrack.com, HackerNoon, FreeCodeCamp, ITNext; Redditor, HN... My read system @Readplace
5
mumu
@ZhuLinsen 978 37 repos
LLM | AIGC | Robotics
ZhuLinsen/daily_stock_analysis
Python 45,663 41,874
LLM-powered multi-market stock analysis system with multi-source market data, real-time news, decision dashboard, automated notifications, and cost-free scheduled runs.
6
Xinmin Zeng
@fancyboi999 413 53 repos
Building smarter agents, breaking dumb limits. Focused on AI / AGENTS/ RAG 📧 fancyboi999@gmail.com 🧠 fancyboi999 (WeChat)
fancyboi999/ai-engineering-from-scratch-zh
Python 457 76
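The most complete learning path for agent engineers · Master AI engineering from scratch · 20 stages, 503 lessons · Full Chinese translation + companion site + animated explainer videos · A guide to becoming an AI agent engineer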
7
Frank Bria
@frankbria 371 57 repos
I build AI systems and help investors understand them. 9k+ GitHub stars on agentic coding. Fintech background. Advisory via BSG.
frankbria/ralph-claude-code
Shell 9,442 723
Autonomous AI development loop for Claude Code with intelligent exit detection
8
Terry Jia
@jtydhr88 334 98 repos
jtydhr88/ComfyTV
JavaScript 168 14
ComfyTV — the canvas-based app that truly belongs to ComfyUI.
9
Elie Habib
@koala73 3,196 22 repos
koala73/worldmonitor
TypeScript 58,549 9,249
Real-time global intelligence dashboard. AI-powered news aggregation, geopolitical monitoring, and infrastructure tracking in a unified situational awareness interface
10
Heyang Zhou · @AFK-surf
@losfair 1,452 373 repos
Systems / data infrastructure / AI. Co-founder @AFK-surf, previously @denoland @vercel @bytedance
losfair/zeroserve
Rust 589 9
Zero-config, fast `io_uring`-based HTTPS server.
11
Matt Van Horn
@mvanhorn 3,349 1,488 repos
Co-founded June ("self-driving oven" acquired by @webergrills) & the co that became @Lyft. Building again, more soon. OS: @slashlast30days 27k★ @ppressdev 4.2k★
mvanhorn/last30days-skill
Python 45,703 3,791
AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary
12
Owain Lewis · GradientWork
@owainlewis 1,100 101 repos
AI Engineer. Director, Engineering. Founder GradientWork.
owainlewis/awesome-artificial-intelligence
14,944 2,376
A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers.
13
Raullen Chai
@raullenchai 719 128 repos
🛰️ Building AI that reads the physical world — Cogitating....
raullenchai/Rapid-MLX
Python 3,054 359
The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider.
14
Yair Morgenstern
@yairm210 2,297 42 repos
yairm210/Unciv
Kotlin 10,838 1,852
Open-source Android/Desktop remake of Civ V
15
Zilin Zhu · Z.ai
@zhuzilin 1,978 80 repos
☀️ RL infra @Z.ai, ex WeChat AI
zhuzilin/ring-flash-attention
Python 1,026 99
Ring attention implementation with flash attention

Models & Benchmarks

New model releases, arena rankings, and benchmark results across frontier and open-source AI models this week. Arena Elo = LMSys battle rating. Trending = HuggingFace trending score. Buzz = AI relevance (0–10).

Arena Leaderboard — Top 15
#   Model                         Organization  Type    Elo   Votes
1   claude-fable-5                Anthropic     Closed  1508  4,297
2   claude-opus-4-6-thinking      Anthropic     Closed  1504  46,410
3   claude-opus-4-7-thinking      Anthropic     Closed  1502  32,629
4   claude-opus-4-6               Anthropic     Closed  1499  49,596
5   claude-opus-4-7               Anthropic     Closed  1493  33,793
6   muse-spark                    Meta          Closed  1487  13,607
7   gemini-3.1-pro-preview        Google        Closed  1486  60,640
8   gemini-3-pro                  Google        Closed  1486  41,314
9   claude-opus-4-8-thinking      Anthropic     Closed  1483  12,963
10  gpt-5.5-high                  OpenAI        Closed  1481  28,268
11  gpt-5.4-high                  OpenAI        Closed  1478  40,959
12  claude-opus-4-8               Anthropic     Closed  1478  13,316
13  gemini-3.5-flash              Google        Closed  1476  10,171
14  gpt-5.2-chat-latest-20260210  OpenAI        Closed  1475  34,555
15  glm-5.1                       Z.ai          Open    1475  16,101
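To read the Elo column, it helps to convert a rating gap into an expected head-to-head win rate. The sketch below assumes the standard Elo logistic formula with a 400-point scale; the leaderboard's exact fitting procedure is not stated here and may differ. `expected_win_rate` is a hypothetical helper name.

```python
def expected_win_rate(rating_a, rating_b, scale=400.0):
    """Expected probability that A beats B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

# Ratings from the table: rank 1 (1508) versus rank 15 (1475).
p = expected_win_rate(1508, 1475)
print(f"{p:.3f}")
```

Under this assumption, the 33-point gap between rank 1 and rank 15 corresponds to an expected win rate of roughly 55%, so the top 15 models are closely matched.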
New & Trending Models
unsloth/GLM-5.2-GGUF
41,846 downloads 243 likes 234 trending
Open Source 2026-06-17
GGUF quantization of GLM-5.2 MoE model with exceptional adoption (41k downloads, 243 likes); enables efficient local deployment of state-of-the-art MoE architecture.
yuxinlu1/gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF
50,314 downloads 354 likes 326 trending
Open Source 2026-06-19
Highly optimized Gemma-4 12B GGUF with agentic capabilities, tool-use, and reasoning; massive adoption (50k downloads, 354 likes) demonstrates strong demand for local agentic models.
zai-org/GLM-5.2
33,589 downloads 1,979 likes 1,827 trending
Open Source 2026-06-16
GLM-5.2 is a high-performing open-source model with MoE architecture and strong multilingual (EN/ZH) capabilities, backed by published research (arxiv:2602.15763, 2603.12201) and showing significant community adoption (1,979 likes, 33k downloads).
deepseek-ai/DeepSeek-V4-Pro
2,421,858 downloads 5,006 likes 102 trending
Open Source 2026-04-22
DeepSeek V4 Pro with 2.4M downloads and 5,006 likes; flagship model with FP8 quantization support and strong eval results.
CohereLabs/North-Mini-Code-1.0
21,078 downloads 479 likes 99 trending
Open Source 2026-06-05
Cohere's compact code model with 21K downloads and 479 likes; demonstrates efficient code generation in minimal parameters.
OBLITERATUS/Gemma-4-12B-OBLITERATED
120,745 downloads 364 likes 42 trending
gemma 2026-06-05
Abliterated Gemma-4 variant with red-team and refusal-analysis tags; safety research model for studying alignment and jailbreak resistance.
WeiboAI/VibeThinker-3B
32,385 downloads 599 likes 582 trending
Open Source 2026-06-12
3B reasoning model fine-tuned from Qwen2.5-Coder with strong math/code performance; demonstrates efficient reasoning in ultra-compact form factor.
deepseek-ai/DeepSeek-V4-Flash
2,353,239 downloads 1,557 likes 53 trending
Open Source 2026-04-22
DeepSeek V4 Flash variant with 2.3M downloads; lightweight inference option with FP8 quantization.
empero-ai/Qwythos-9B-Claude-Mythos-5-1M
842 downloads 112 likes 106 trending
Open Source 2026-06-19
9B model with 1M context, reasoning, and tool-use capabilities; full fine-tune from Qwen3.5 for agentic applications.
empero-ai/Qwythos-9B-Claude-Mythos-5-1M-GGUF
6,633 downloads 118 likes 115 trending
Open Source 2026-06-19
GGUF-quantized 9B model with 1M context window, function-calling, and cybersecurity/biomedical specialization; optimized for local inference.
lordx64/Qwable-v1
3,733 downloads 161 likes 156 trending
agpl-3.0 2026-06-14
Qwen3.6 MoE distilled from Claude Opus 4.7 with chain-of-thought and agentic capabilities; chained distillation approach for reasoning.
microsoft/FastContext-1.0-4B-SFT
3,498 downloads 284 likes 208 trending
Open Source 2026-06-14
Microsoft's 4B SFT model optimized for fast context processing, built on Qwen3 base with strong trending adoption (284 likes, 3.5k downloads).
poolside/Laguna-M.1
2,707 downloads 90 likes 89 trending
Open Source 2026-06-15
Laguna-M.1 MoE model optimized for vLLM/SGLang inference with BF16 support and custom code; emerging architecture with 90 likes.
yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF
414,734 downloads 2,149 likes 1,550 trending
Open Source 2026-06-10
GGUF-quantized Gemma-4 12B specialized for code generation with reasoning capabilities; high community engagement (414k downloads, 2,149 likes) indicates strong practical utility for local deployment.
zai-org/GLM-5.2-FP8
334,716 downloads 131 likes 127 trending
Open Source 2026-06-16
FP8 quantized variant of GLM-5.2 enabling efficient inference with reduced memory footprint while maintaining model capability; practical for deployment-constrained environments.
Model Buzz

Trending Spaces

The hottest interactive demos and apps on HuggingFace Spaces this week — try them live. Flame icon = HuggingFace trending score. Hearts = community likes.

Weight-Space Geometry of Offline Reasoning Training
AlexWortega
static 24 24
mit
Interactive visualization of weight-space geometry across six reasoning loss functions—novel pedagogical tool for understanding offline reasoning training dynamics.
Omni Video Factory
FrameAI4687
gradio 1,256 22
mit
Multi-capability video generation space (text-to-video, image-to-video, extend) with strong community engagement (1.2k likes) but limited technical novelty.
Encoder-Free VLM
HuggingFaceM4
docker 29 29
Practical guide to training encoder-free vision-language models on a $100 budget; addresses the cost barrier for VLM development with a reproducible methodology.
Pro Realism Edit Studio
Sneak-Moose
gradio 88 30
apache-2.0
Image editing space supporting multi-input workflows and watermark removal; consumer-focused tool with limited technical innovation.
OpenMythos
build-small-hackathon
gradio 50 46
apache-2.0
Open-source cybersecurity agent demonstrating LLM application to security workflows; practical tool for threat detection and response automation.
Jawbreaker
build-small-hackathon
gradio 67 22
mit
Scam defense application using AI; consumer-focused security tool with limited technical depth for research audience.
Small Talk
build-small-hackathon
gradio 120 43
AI-to-AI podcast demo with Reachy Mini robots; novelty application with limited technical substance for an engineering audience.
Fast Gemma Challenge
gemma-challenge
docker 86 32
Multi-agent collaboration dashboard for Gemma optimization; community challenge infrastructure with practical inference acceleration focus.
Gemma Diffusion Website Builder
huggingface-projects
gradio 63 31
Live diffusion-based code generation for website building with interactive refinement; demonstrates diffusion models for structured output.
Jigarzzz Video Suite
huuyfytryr
docker 142 32
mit
Multi-tool space for voice generation, video generation, and copyright detection; consumer-focused aggregation with limited technical novelty.
OmniVoice
k2-fsa
gradio 1,053 20
apache-2.0
Multilingual voice cloning TTS supporting 600+ languages with strong adoption (1k likes); mature tool with broad language coverage.
Wan2.2 14B Fast Preview
kulkas2pintu
gradio 146 28
Image-to-video generation with Wan2.2 14B; incremental demo variant with limited technical differentiation.
LocateAnything
nvidia
gradio 309 29
NVIDIA's object localization tool with strong community interest (309 likes); practical vision application with enterprise backing.
Qwen-Image-Edit-2511-LoRAs-Fast
prithivMLmods
gradio 1,771 40
apache-2.0
Collection of Qwen image editing LoRAs with high engagement (1.7k likes); useful for practitioners but incremental fine-tuning work.
Wan2.2 14B Preview
r3gm
gradio 2,785 25
Image-to-video generation demo with FP8 quantization; optimization variant with strong adoption (2.7k likes) but limited novelty.

Conference Papers

Accepted papers from top AI conferences via OpenReview.

Showing accepted papers from active venues. Next deadlines: ICML 2026 (submissions open), NeurIPS 2026 (coming soon).

ICLR 2026 Pierre-Carl Langlais, Pavel Chizhov, Catherine Arnett et al. 2026-06-22
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
Common Corpus presents the largest collection of ethically sourced, copyright-compliant data for LLM pre-training, directly addressing legal and licensing concerns in foundation model development.
dataset pre-training large language models open data open science
ICLR 2026 Mouath Abu Daoud, Leen Kharouf, Omar El Hajj et al. 2026-06-22
MedAraBench: Large-scale Arabic Medical Question Answering Dataset and Benchmark
MedAraBench addresses severe underrepresentation of Arabic in medical NLP with a large-scale QA dataset and benchmark, enabling evaluation of multilingual LLM capabilities in healthcare.
Dataset Benchmark Large Language Models Arabic Natural Language Processing Medical Question Answering
ICLR 2026 Zhiheng Chen, Ruofan Wu, Guanhua Fang et al. 2026-06-22
Transformers as Unsupervised Learning Algorithms: A study on Gaussian Mixtures
Theoretical analysis showing transformers implicitly learn unsupervised models (Gaussian Mixtures) during inference, providing mechanistic insight into in-context learning capabilities.
In-context learning Gaussian Mixture Models Theory
ICLR 2026 Ron Vainshtein, Zohar Rimon, Shie Mannor et al. 2026-06-22
Task Tokens: A Flexible Approach to Adapting Behavior Foundation Models
Task Tokens enable flexible adaptation of behavior foundation models for humanoid control, providing a modular approach to multi-modal robotic policy conditioning.
Reinforcement Learning Hierarchical Reinforcement Learning Behavior Foundation Models Humanoid Control
ICLR 2026 Kaien Sho, Shinji Ito 2026-06-22
Submodular Function Minimization with Dueling Oracle
Submodular function minimization using dueling oracles; theoretical contribution with limited immediate practical impact for AI practitioners.
submodular minimization dueling oracle preference-based optimization
ICLR 2026 Rongjin Li, Zichen Tang, Xianghe Wang et al. 2026-06-22
Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning
Introduces scan-oriented reasoning benchmark for MLLMs on academic papers, revealing that current models excel at retrieval but struggle with deep reasoning—a critical gap for autonomous research.
Multimodal Large Language Models Academic Paper Reasoning Scan-Oriented Reasoning
ICLR 2026 Peng Sun, Tao Lin 2026-06-22
Any-step Generation via N-th Order Recursive Consistent Velocity Field Estimation
Proposes N-th order recursive velocity field estimation for any-step generation, advancing few-step diffusion models with reduced computational overhead and simplified training.
Generative Models
ICLR 2026 Zeyu Feng, Haiyan Yin, Yew-Soon Ong et al. 2026-06-22
Masked Skill Token Training for Hierarchical Off-Dynamics Transfer
Masked Skill Token Training (MSTT) enables offline hierarchical RL for policy transfer across altered dynamics, critical for real-world robotics where online adaptation is infeasible.
Transfer Learning Skills Hierarchical RL Embodied AI
ICLR 2026 Shaojie Li, Pengwei Tang, Bowei Zhu et al. 2026-06-22
High Probability Bounds for Non-Convex Stochastic Optimization with Momentum
Provides high-probability convergence and generalization bounds for SGD with momentum in non-convex settings, filling a theoretical gap in understanding widely-used optimization.
Momentum nonconvex learning generalization
ICLR 2026 Artyom Sorokin, Nazar Buzun, Aleksandr Anokhin et al. 2026-06-22
Q-RAG: Long Context Multi‑Step Retrieval via Value‑Based Embedder Training
Q-RAG introduces value-based embedder training for multi-step retrieval in long-context scenarios, addressing a key limitation of single-step RAG methods for complex question answering.
Reinforcement Learning RL QA Long-context RAG
ICLR 2026 Seongtae Hong, Youngjoon Jang, Jungseob Lee et al. 2026-06-22
Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment
Improves cross-lingual information retrieval through alignment techniques, addressing multilingual embedding quality for CLIR tasks.
Cross-Lingual Alignment Information Retrieval Multilingual Embedding Cross-Lingual Information Retrieval
ICLR 2026 Rahul Ramachandran, Ali Garjani, Roman Bachmann et al. 2026-06-22
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
Comprehensive benchmark evaluating GPT-4o, o1-mini, Gemini 1.5 Pro on standard computer vision tasks, revealing gaps between multimodal foundation models and specialized vision systems.
vision benchmark multimodal foundation models vision language models standard computer vision tasks
ICLR 2026 Tin Hadži Veljković, Erik J Bekkers, Michael Tiemann et al. 2026-06-22
CORDS - Continuous Representations of Discrete Structures
CORDS introduces continuous neural field representations for variable-cardinality set prediction, enabling invertible encoding/decoding for object detection and molecular modeling without padding.
Continuous set representations Neural fields Variable-cardinality prediction Invertible encoding/decoding Diffusion and flow matching
ICLR 2026 Christopher Mitcheltree, Vincent Lostanlen, Emmanouil Benetos et al. 2026-06-22
SCRAPL: Scattering Transform with Random Paths for Machine Learning
SCRAPL optimizes scattering transform computation via stochastic path sampling, reducing computational cost while maintaining perceptual quality gradients for inverse problems in vision and audio.
scattering transform wavelets stochastic optimization ddsp perceptual quality assessment
ICLR 2026 Antanas Žilinskas, Robert Noel Shorten, Jakub Marecek et al. 2026-06-22
EVEREST: A Transformer for Probabilistic Rare-Event Anomaly Detection with Evidential and Tail-Aware Uncertainty
EVEREST combines transformer architecture with evidential deep learning and extreme value theory for probabilistic rare-event forecasting in imbalanced multivariate time-series.
Transformer models Uncertainty quantification Evidential deep learning Extreme value theory Imbalanced classification
ICLR 2026 Harris Abdul Majid, Pietro Sittoni, Francesco Tudisco et al. 2026-06-22
Test-Time Accuracy-Cost Control in Neural Simulators via Recurrent-Depth
Recurrent-Depth Simulator enables test-time accuracy-cost trade-offs in neural physics simulators, allowing dynamic resolution control analogous to classical numerical methods.
Neural Simulator Recurrent Depth AI4Simulation
ICLR 2026 Kun XIE, Peng Zhou, Xingyi Zhang et al. 2026-06-22
PoinnCARE: Hyperbolic Multi-Modal Learning for Enzyme Classification
PoinnCARE applies hyperbolic geometry and multi-modal learning to enzyme classification, capturing hierarchical EC number relationships and structural features for biotechnology applications.
EC number prediction enzyme function hyperbolic space learning multi-modal learning enzyme structure
ICLR 2026 Tianqiao Liu, Xueyi Li, Hao Wang et al. 2026-06-22
From Text to Talk: Audio-Language Model Needs Non-Autoregressive Joint Training
Proposes non-autoregressive joint training for audio-language models handling interleaved speech-text, replacing sequential generation with multi-token prediction for improved speech-to-speech systems.
Large Multimodal Models Multi-token Prediction Non-Autoregressive Learning
ICLR 2026 Qinglong Yang, Haoming Li, Haotian Zhao et al. 2026-06-22
FingerTip 20K: A Benchmark for Proactive and Personalized Mobile LLM Agents
FingerTip 20K benchmark for proactive mobile LLM agents that leverage device context for personalized actions without explicit instructions, advancing beyond reactive GUI automation.
Mobile Agent LLM Agent GUI Proactive Agent Personalization
ICLR 2026 Tianxiang Dai, Jonathan Fan 2026-06-22
Characterizing and Optimizing the Spatial Kernel of Multi Resolution Hash Encodings
Analyzes the spatial kernels of multi-resolution hash encodings from a physics perspective, replacing heuristics with a principled understanding of Instant Neural Graphics Primitives behavior.
multi-resolution hash encoding implicit neural representations neural fields point spread function spatial kernel analysis

Deep Dive

All 348 items scored and categorized. Relevance scores reflect novelty, technical depth, and practical impact — 7+ items are the ones worth your time.
