Weekly Intelligence

AI Quick Bites

March 23, 2026 · 292 items from 11 sources

Last refreshed: March 23, 2026 at 10:22 UTC

Highlights

The five most consequential developments in AI this week — selected from 292 items across 11 sources. These are the things an AI engineer, researcher, or founder needs to know.

02
EvoJail systematically automates discovery of long-tail jailbreaks via evolutionary search, exposing a largely undefended attack surface that existing safety evaluations miss.
arxiv 2026-03-23 20 min
03
Published CoT faithfulness numbers are not comparable across studies — classifier choice alone reverses model rankings, making this a must-read for anyone designing or consuming LLM evaluation benchmarks.
arxiv 2026-03-23 20 min
04
D-MMD is the first discrete diffusion distillation that doesn't collapse, unlocking fast sampling for text and discrete image generation analogous to what consistency models did for continuous diffusion.
arxiv 2026-03-23 20 min
05
The multi-agent cybersecurity assessment system's finding that context window size — not model quality — caused 100% pipeline failure is a concrete, reproducible lesson for anyone deploying multi-agent systems on constrained hardware.
arxiv 2026-03-23 15 min

What Changed This Week

Week-over-week diff showing new arrivals, items gaining momentum, and topics that dropped off the radar. All scores are AI relevance (0–10).

AI Security

Novel attack vectors, jailbreak research, red-teaming findings, and defensive tools across the AI security landscape. Only items with genuine technical substance make it here. Scores are AI relevance (0–10): 7+ important, 9+ landmark.

Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training
8/10
Replication and extension of the RYS layer-duplication method showing that duplicating 3-4 contiguous 'reasoning circuit' layers in a 24B LLM boosts logical deduction from 0.22 to 0.76 with zero training — strong mechanistic interpretability finding suggesting discrete cognitive modules in transformers.
hackernews 2026-03-23 10 min
promptfoo/promptfoo
7/10
Comprehensive LLM testing and red-teaming framework supporting prompt evaluation, vulnerability scanning, and CI/CD integration across GPT, Claude, Gemini, and Llama — 18K+ stars and 1.9K new this week makes it the leading open-source tool for systematic LLM security testing.
github 2026-03-23 5 min
Show HN: FireClaw – Open-source proxy defending AI agents from prompt injection
7/10
Open-source security proxy that intercepts web fetches by AI agents through a 4-stage pipeline (DNS blocklist, content scanning, injection detection, response sanitization) to prevent prompt injection attacks before they reach the model. Proactive prevention rather than post-hoc detection is a meaningful architectural distinction.
hackernews 2026-03-23 5 min
Paper: Detecting hallucinations before the first token
7/10
Research on detecting hallucinations in transformer LLMs before the first output token is generated, using pre-generative epistemic signals. If validated at scale, this could enable proactive hallucination filtering rather than post-hoc detection.
hackernews 2026-03-23 20 min
Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models
6/10
EvoJail automates discovery of long-tail jailbreak attacks (low-resource languages, encrypted inputs) via multi-objective evolutionary search, jointly optimizing attack effectiveness and perplexity — more systematic than handcrafted approaches and reveals underexplored attack surface.
arxiv 2026-03-23 20 min
2% of ICML papers desk rejected because the authors used LLM in their reviews
6/10
ICML officially desk-rejected 2% of submitted papers because authors used LLMs to write peer reviews, violating conference policy—a concrete enforcement action with implications for academic integrity in AI research. Sets a precedent for how top venues will police LLM misuse in the review process.
hackernews 2026-03-23 5 min
Trojan horse hunt in deep forecasting models: Insights from the European Space Agency competition
5/10
ESA competition where 200+ teams hunted backdoor triggers in deep forecasting models for spacecraft telemetry — novel application of trojan detection to time-series safety-critical systems with public benchmark and competition solutions released.
arxiv 2026-03-23 20 min
Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation
5/10
Demonstrates that LLM chain-of-thought faithfulness scores are classifier-dependent to a degree that reverses model rankings — three classifiers on identical data produce 69.7% to 82.6% faithfulness rates with non-overlapping confidence intervals, undermining cross-study comparisons.
arxiv 2026-03-23 20 min
Improving Generalization on Cybersecurity Tasks with Multi-Modal Contrastive Learning
5/10
Proposes a two-stage multi-modal contrastive learning framework that transfers knowledge from text vulnerability descriptions to network payload classification, reducing shortcut learning in cybersecurity ML models. Releases a synthetic CVE/payload benchmark; addresses a real generalization gap in production security ML.
arxiv 2026-03-23 20 min
I built a runtime guardrail that stops AI agents from doing dumb things
5/10
MoltGuard is a runtime guardrail tool that intercepts and blocks dangerous tool calls from AI agents before execution, claiming 16K+ downloads; concept is sound but technical depth in the post is limited.
hackernews 2026-03-23 3 min
Are developers trusting AI-generated code too much?
5/10
Developer built a proxy to scan AI-generated code for hardcoded secrets, unsafe patterns, and prompt injection hidden in comments — addresses a real and underappreciated attack surface in AI-assisted coding workflows.
hackernews 2026-03-23 3 min
I built an AI agent after the OpenClaw mess — zero permissions by default, runs free on Ollama
5/10
Developer built a zero-permissions-by-default AI agent running on Ollama in response to OpenClaw's CVSS 8.8 RCE vulnerability and 30k+ exposed instances — addresses real security concerns in agentic systems but lacks deep technical detail.
reddit 2026-03-23 4 min
FSF statement on copyright infringement lawsuit Bartz v. Anthropic
5/10
FSF statement on the Bartz v. Anthropic copyright settlement — important for understanding how AI training data licensing disputes are being resolved and implications for open-source AI development.
hackernews 2026-03-23 6 min
Cross-Model Void Convergence: GPT-5.2 and Claude Opus 4.6 Deterministic Silence
5/10
Research paper on 'Cross-Model Void Convergence' — a phenomenon where GPT-5.2 and Claude Opus 4.6 produce deterministic silence (refusal/null output) under specific conditions. If methodologically sound, this is an interesting behavioral alignment/safety finding across frontier models.
hackernews 2026-03-23 20 min

Top Contributors

Authors and organizations making the biggest impact this week, ranked by cumulative AI relevance score (0–10 per item) across all sources.

Top Authors
#1
webml-community
2 items · avg 5.5/10
11.0
#2
r3gm
2 items · avg 5.0/10
10.0
#3
prithivMLmods
2 items · avg 4.5/10
9.0
#4
7.0
#5
7.0
#6
7.0
Top Organizations
#1
8.0
#2
ChromeDevTools
1 item · avg 7.0/10
7.0
#3
dimensionalOS
1 item · avg 7.0/10
7.0
#4
7.0
#5
langchain-ai
1 item · avg 7.0/10
7.0
#6
openai
1 item · avg 7.0/10
7.0

Build Ideas

Actionable product ideas distilled from this week's highest-scoring research and discussions. Each includes specific use cases and the source material that inspired it.

LLM Sycophancy Shield
A middleware layer or evaluation harness that detects and flags when LLMs are likely to reverse factually grounded answers under user pressure. Research shows that even richer in-context evidence fails to prevent sycophantic reversals, so this tool would monitor conversation turns for capitulation patterns and alert users or downstream systems. Build it as a lightweight API wrapper or browser extension that scores each model response for evidence-grounding consistency.
Enterprise chatbots where factual accuracy is critical (legal, medical, finance) AI tutoring systems where students might pressure the model toward wrong answers Automated fact-checking pipelines LLM evaluation and red-teaming workflows
https://arxiv.org/abs/2603.20162v1 https://arxiv.org/abs/2603.20172v1
Agentic Science Autopilot
A domain-configurable agentic framework that lets researchers run full analysis pipelines — data ingestion, statistical inference, visualization, and draft generation — with minimal scaffolding, inspired by Claude Code autonomously executing high-energy physics workflows. The key insight is a 'Just Furnish Context' pattern: give the agent rich domain context upfront and let it self-direct. Build it as a configurable template library covering common scientific domains (genomics, climate, economics) with sandboxed execution environments.
Academic research labs needing to accelerate exploratory data analysis Pharmaceutical and biotech companies running repetitive assay pipelines Government agencies processing large open datasets Science journalism and policy research requiring rapid evidence synthesis
https://arxiv.org/abs/2603.20179v1 https://arxiv.org/abs/2603.20132v1
Trojan Scan for Time-Series
A commercial security auditing tool that detects backdoor triggers in deep learning models trained on time-series data, targeting industries like aerospace, energy, and finance where forecasting models are safety-critical. The ESA competition demonstrated that 200+ teams could hunt trojans in spacecraft telemetry models, validating demand for systematic tooling. Build a SaaS platform where customers upload their trained forecasting models and receive a backdoor risk report with trigger candidates and mitigation recommendations.
Spacecraft and satellite telemetry monitoring systems Industrial IoT predictive maintenance models Algorithmic trading and financial forecasting models Power grid and critical infrastructure anomaly detection
https://arxiv.org/abs/2603.20108v1 https://arxiv.org/abs/2603.20181v1
Dialogue-Aware Reasoning Bench
A benchmarking and fine-tuning dataset toolkit that specifically tests and improves LLM reasoning when tasks are embedded inside multi-turn task-oriented dialogues, addressing the documented performance gap versus isolated reasoning settings. Current benchmarks overestimate real-world capability because they test reasoning in isolation, not mid-conversation. Build a dataset generator that wraps standard reasoning tasks (math, logic, QA) inside realistic dialogue scaffolds, plus a fine-tuning recipe to close the gap.
Customer service AI that must reason while managing conversation context AI coding assistants handling multi-turn debugging sessions Healthcare triage chatbots requiring accurate reasoning under conversational pressure LLM provider evaluation and model selection tooling
https://arxiv.org/abs/2603.20133v1 https://arxiv.org/abs/2603.20101v1
Smart Video Seek API
A developer API that brings efficient long-video understanding to any application by intelligently seeking answer-critical frames rather than densely sampling, achieving dramatically lower token costs with higher accuracy. Inspired by VideoSeek's 93% frame reduction with a 10-point accuracy gain, this product wraps the seek logic into a simple endpoint: send a video URL and a query, get back a grounded answer with timestamped evidence. Monetize on a per-query basis targeting media, surveillance, and e-learning platforms.
E-learning platforms enabling natural language search inside lecture recordings Legal and compliance teams reviewing hours of meeting or deposition footage Sports analytics extracting specific play moments from game footage Security and surveillance systems querying long camera recordings
https://arxiv.org/abs/2603.20185v1 https://arxiv.org/abs/2603.20180v1

Product Hunt Weekly

Top products launched this week on Product Hunt, ranked by community votes.

#1
Zoer.ai
Build full-stack webapps from the database up
Productivity Website Builder Vibe coding
188
26
https://www.producthunt.com/r/DGY2W...
#2
Tobira.ai
A network where AI agents find deals for their humans
Productivity Developer Tools Artificial Intelligence
166
28
https://www.producthunt.com/r/G5JOB...
#3
Honestly
Real reviews from Reddit & YouTube when shopping online
Browser Extensions Chrome Extensions E-Commerce
119
4
https://www.producthunt.com/r/FAH2F...
#4
Fastlane
Remix viral videos into content for your business
Marketing Social media marketing Vibe coding
120
11
https://www.producthunt.com/r/LTQWN...
#5
Iris
Send work beautifully, pinned feedback, see what they viewed
Design Tools Productivity Freelance
93
4
https://www.producthunt.com/r/3B4QR...
#6
Claude Usage Tracker
See exactly how much you spend on Claude, across every tool
Open Source Developer Tools GitHub
92
3
https://www.producthunt.com/r/FPUNW...
#7
Pause.do
Interrupt scrolling, tab overload, and AI autopilot
Chrome Extensions Productivity Artificial Intelligence
91
2
https://www.producthunt.com/r/VCCW5...
#8
Nomie
AI wellness app that turns doomscrolling into self‑care
Health & Fitness Productivity Artificial Intelligence
88
1
https://www.producthunt.com/r/3RIZW...
#9
AlphaClaw Apex
OpenClaw harness and fleet manager for Mac
Productivity Open Source Artificial Intelligence
85
1
https://www.producthunt.com/r/VTELX...
#10
WeixinClawBot
The official WeChat pipeline for OpenClaw
Messaging Artificial Intelligence
83
2
https://www.producthunt.com/r/E4TIK...
View full leaderboard on Product Hunt

Trending Repos

Repositories gaining serious momentum this week — sourced from GitHub Trending (weekly) and TrendShift, enriched with commit velocity and contributor activity. Stars = total GitHub stars. "Stars this week" = new stars gained.

1
GH Trending
ChromeDevTools/chrome-devtools-mcp
typescript 30,942 1,831 1,717 stars this week
Official Chrome DevTools MCP (Model Context Protocol) server enabling coding agents to inspect, debug, and interact with browser state directly; significant for browser-based agent workflows and agentic web automation.
Build idea
A SaaS QA automation platform that uses AI agents to autonomously detect, reproduce, and diagnose frontend bugs by connecting to live browser sessions via the Chrome DevTools MCP server.
2
GH Trending
dimensionalOS/dimos
python 2,135 325 995 stars this week
Agentic OS for physical robotics platforms (humanoids, quadrupeds, drones) enabling natural language programming and multi-agent coordination with hardware I/O — gaining rapid traction with ~1K stars/week and represents a meaningful step toward accessible robotics AI.
Build idea
A robotics-as-a-service platform for warehouse and logistics operators that lets non-engineers program and coordinate fleets of robots using plain English commands built on top of dimos.
3
GH Trending
langchain-ai/deepagents
python 16,908 2,398 5,498 stars this week
LangChain's deep agent harness with planning tools, filesystem backend, and subagent spawning — 5.5K stars this week signals strong traction; represents LangChain's answer to complex multi-step agentic task execution.
Build idea
A managed agentic workflow service for enterprises that automates complex, multi-step back-office tasks — like financial reconciliation or compliance reporting — using deepagents' planning and subagent orchestration.
4
GH Trending
openai/codex
rust 67,025 8,962 1,578 stars this week
OpenAI's lightweight terminal-based coding agent with 67K stars — the official open-source CLI coding agent from OpenAI, directly competing with Claude Code and Gemini CLI in the agentic coding space.
Build idea
A developer productivity tool for software agencies that wraps OpenAI Codex CLI into a team-shared terminal environment with audit logs, role-based permissions, and billing controls for client project work.
5
GH Trending
promptfoo/promptfoo
typescript 18,221 1,554 1,941 stars this week
Comprehensive LLM testing and red-teaming framework supporting prompt evaluation, vulnerability scanning, and CI/CD integration across GPT, Claude, Gemini, and Llama — 18K+ stars and 1.9K new this week makes it the leading open-source tool for systematic LLM security testing.
Build idea
An LLM compliance and security auditing service for regulated industries (finance, healthcare) that continuously red-teams customers' AI applications using promptfoo and delivers certified vulnerability reports.
6
GH Trending
unslothai/unsloth
python 57,658 4,859 3,564 stars this week
Unsloth now includes a web UI (Unsloth Studio) for training and running open models like Qwen, DeepSeek, and Gemma locally with significant memory/speed optimizations. 57k+ stars and 3.5k new stars this week reflects continued dominance in efficient local fine-tuning.
Build idea
A no-code fine-tuning platform for SMBs that lets businesses upload their proprietary data and produce custom, locally-deployable open-source models using Unsloth's memory-efficient training backend.
7
GH Trending
vllm-project/vllm-omni
python 3,667 609 514 stars this week
Official vLLM extension for omni-modality model inference (text, vision, audio in one framework), extending vLLM's high-throughput serving to multimodal models. Significant for production deployment of models like Gemini-style omni architectures.
Build idea
A multimodal AI inference API service targeting media and e-commerce companies that need high-throughput processing of mixed text, image, and audio inputs — such as automated product cataloging or content moderation.
8
GH Trending
volcengine/OpenViking
python 18,206 1,243 6,297 stars this week
ByteDance's Volcengine open-sources OpenViking, a context database for AI agents that unifies memory, resources, and skills through a file-system paradigm with hierarchical context delivery and self-evolution. 6.3k stars this week suggests significant interest in structured agent context management.
Build idea
A persistent agent memory management SaaS that gives enterprise AI assistants long-term, structured context about company knowledge, past interactions, and workflows using OpenViking's hierarchical context database.
9
GH Trending
NousResearch/hermes-agent
python 10,706 1,333 2,665 stars this week
NousResearch's open-source agent framework built around their Hermes model series, gaining significant traction (2,665 stars this week); worth watching as a capable open-weight agent stack.
Build idea
A privacy-first AI assistant platform for law firms and consultancies that deploys fully on-premise using the open-weight Hermes agent stack, ensuring sensitive client data never leaves the organization.
10
GH Trending
alibaba/page-agent
typescript 13,500 1,027 4,586 stars this week
Alibaba's JavaScript in-page GUI agent enabling natural language control of web interfaces directly in the browser; strong traction (4,586 stars this week) and practical for web automation use cases.
Build idea
A browser extension product for enterprise users that lets non-technical employees automate repetitive web-based workflows — like data entry across SaaS tools — using natural language instructions powered by page-agent.

Trending Developers

Developers gaining traction on GitHub this week — shipping open-source AI tools, models, and frameworks worth following. Ranked by weekly trending position.

1
Sebastian Raschka
@rasbt
rasbt/LLMs-from-scratch
Sebastian Raschka's GitHub profile, best known for the LLMs-from-scratch repo — a widely-used educational resource for building ChatGPT-style LLMs in PyTorch. Profile listing, not a new release.
2
comfyanonymous
@comfyanonymous
comfyanonymous/ComfyUI_examples
ComfyUI workflow examples repository from the creator of ComfyUI — useful reference for node-based diffusion pipelines but primarily example content rather than new tooling.
3
Jarrod Watts
@jarrodwatts
jarrodwatts/claude-hud
Claude Code HUD plugin surfacing real-time context usage, active tools, running agents, and todo progress — useful observability layer for Claude Code power users.
4
Matt Van Horn
@mvanhorn
mvanhorn/last30days-skill
AI agent skill that aggregates and synthesizes information from Reddit, X, YouTube, HN, Polymarket, and the web — useful multi-source research agent pattern.
5
Fengda Huang
@phodal
phodal/routa
Workspace-first multi-agent coordination platform for AI development with shared state — one of several emerging multi-agent orchestration frameworks targeting software development workflows.
6
David East
@davideast
davideast/stitch-mcp
CLI tool bridging Google's Stitch AI-generated UI designs into developer workflows via MCP — early-stage integration tool with limited documentation.
7
jakevin
@jackwener
jackwener/opencli
Universal CLI hub that wraps websites and apps into a command-line AI-native runtime — interesting concept but early-stage.
8
Paul Bakaus
@pbakaus
pbakaus/impeccable
Design language specification aimed at improving AI-generated UI quality — interesting prompt engineering angle for design systems but early-stage.
9
Daniel Griesser
@HazAT
HazAT/glimpse
Trending developer profile; the linked project (Glimpse) is a macOS micro-UI for scripts and agents but lacks sufficient detail to evaluate.
10
Dotta
@cryppadotta
cryppadotta/scryfall-mcp
MCP server wrapping the Scryfall Magic: The Gathering API — niche hobby project with minimal broader AI relevance.
11
Lawrence Chen
@lawrencecchen
lawrencecchen/awesome-libghostty
Curated list of libghostty projects — not AI-relevant.
12
Matthew Diakonov
@m13v
m13v/fazm
Fazm Desktop macOS app — insufficient information to assess AI relevance.
13
qixing-jk
@qixing-jk
qixing-jk/all-api-hub
API relay manager for managing multiple AI API accounts with balance tracking and key export. Utility tool with no novel AI research value.
14
Bartek Iwańczuk
@bartlomieju
15
Dream Hunter
@dreamhunter2333
dreamhunter2333/cloudflare_temp_email
Cloudflare-based temporary email service — not AI-related.
16
Hartmut Kaiser
@hkaiser
17
Josh Lehman
@jalehman
jalehman/xc
CLI client for X API v2 — not AI-related.
18
Jorge Manrubia
@jorgemanrubia
19
Klaus Post
@klauspost
klauspost/compress
Go compression library — not AI-related.
20
Marc Seitz
@mfts
mfts/papermark
Papermark open-source DocSend alternative — not AI-related.
21
Alireza Rezvani
@alirezarezvani
alirezarezvani/claude-skills
+192 Claude Code skills & agent plugins for Claude Code, Codex, Gemini CLI, Cursor, and 8 more coding agents — engineering, marketing, pr…
22
Michael Ramos
@backnotprop
backnotprop/plannotator
Annotate and review coding agent plans and code diffs visually, share with your team, send feedback to agents with one click.
23
Brady Gaster
@bradygaster
bradygaster/squad
Squad: AI agent teams for any project

Models & Benchmarks

New model releases, arena rankings, and benchmark results across frontier and open-source AI models this week. Arena Elo = LMSys battle rating. Trending = HuggingFace trending score. Buzz = AI relevance (0–10).

Arena Leaderboard — Top 15
#ModelTypeEloVotes
1 claude-opus-4-6-thinking Anthropic Closed 1502 11,801
2 claude-opus-4-6 Anthropic Closed 1501 12,546
3 gemini-3.1-pro-preview Google Closed 1493 14,677
4 grok-4.20-beta1 xAI Closed 1492 7,396
5 gemini-3-pro Google Closed 1486 41,762
6 gpt-5.4-high OpenAI Closed 1485 4,965
7 gpt-5.2-chat-latest-20260210 OpenAI Closed 1482 10,140
8 grok-4.20-beta-0309-reasoning xAI Closed 1481 4,504
9 gemini-3-flash Google Closed 1475 31,060
10 claude-opus-4-5-20251101-thinking-32k Anthropic Closed 1474 37,036
11 grok-4.1-thinking xAI Closed 1472 43,930
12 claude-opus-4-5-20251101 Anthropic Closed 1469 41,976
13 claude-sonnet-4-6 Anthropic Closed 1465 9,843
14 qwen3.5-max-preview Alibaba Closed 1464 4,252
15 gpt-5.3-chat-latest OpenAI Closed 1464 8,942
New & Trending Models
nvidia/Nemotron-Cascade-2-30B-A3B
5,346 downloads 212 likes 212 trending
Custom License 2026-03-18
NVIDIA's Nemotron-Cascade-2 is a 30B total / 3B active MoE reasoning model with an associated arxiv paper (2603.19220), combining SFT and RL post-training. The extremely high trending score (212) and novel cascade architecture for efficient reasoning make this the standout NVIDIA release this week.
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled
151,482 downloads 1,062 likes 339 trending
Open Source 2026-02-27
Qwen3.5-27B fine-tuned via knowledge distillation from Claude 4.6 Opus reasoning traces, achieving strong chain-of-thought performance at 27B scale. 151k downloads and 1k+ likes indicate significant community adoption; notable for distilling frontier closed-model reasoning into an open-weight model.
deepseek-ai/DeepSeek-V3.2
293,362 downloads 1,326 likes 23 trending
Open Source 2025-12-01
DeepSeek's V3.2 update with 293K+ downloads and 1326 likes, continuing the high-performance open-weight frontier model series. FP8 support and strong benchmark results make this a significant incremental release for practitioners running large open models.
microsoft/bitnet-b1.58-2B-4T
15,096 downloads 1,391 likes 34 trending
Open Source 2025-04-15
Microsoft's BitNet b1.58 2B model trained on 4 trillion tokens, implementing 1.58-bit quantization for extreme efficiency. Represents a meaningful step toward ultra-low-bit LLMs that can run on edge hardware with minimal memory footprint.
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
103,484 downloads 283 likes 70 trending
Custom License 2026-03-10
NVIDIA's Nemotron-3 Super 120B MoE model (120B total, 12B active) using a novel 'latent-MoE' architecture with multi-token prediction, trained on a curated multilingual dataset. Strong downloads (103K) and a new architecture variant make this a notable open-weight frontier release.
openai/gpt-oss-120b
4,549,831 downloads 4,602 likes 28 trending
Open Source 2025-08-04
OpenAI's open-source 120B model released on HuggingFace with Apache 2.0 license, 4.5M+ downloads and an arxiv paper. Significant as OpenAI's first major open-weight release at frontier scale, with vLLM support and quantization variants.
zai-org/GLM-5
136,040 downloads 1,854 likes 52 trending
Open Source 2026-02-11
GLM-5 from Zhipu AI is a new MoE-based bilingual (EN/ZH) foundation model with 136K+ downloads and 1854 likes, representing a significant open-weight release with a DSA architecture variant worth tracking against Qwen and Llama families.
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
447,707 downloads 317 likes 97 trending
Open Source 2026-02-27
GGUF-quantized version of the Claude-distilled Qwen3.5-27B reasoning model for local inference. 447k downloads makes it one of the most-downloaded variants, enabling local deployment of frontier-distilled reasoning.
MiniMaxAI/MiniMax-M2.5
492,806 downloads 1,268 likes 70 trending
Custom License 2026-02-12
MiniMax-M2.5 is a large-scale mixture-of-experts text generation model with 492k downloads and 1.2k likes. Limited public technical details but strong download traction suggests competitive performance.
Multilingual-Multimodal-NLP/IndustrialCoder
287 downloads 30 likes 30 trending
Open Source 2026-03-13
Specialized code model targeting industrial hardware description languages (Verilog, CUDA, Triton) and chip/CAD design, backed by an arXiv paper (2603.16790). Niche but high-value domain where general LLMs underperform.
Qwen/Qwen3-Coder-Next
1,232,461 downloads 1,165 likes 38 trending
Open Source 2026-01-30
Next iteration of Qwen's coding-focused model with 1.2M+ downloads and strong trending score, suggesting a significant update to the Qwen3 coder series. Limited metadata but high community traction indicates practical utility for code generation tasks.
Tesslate/OmniCoder-9B
19,168 downloads 358 likes 134 trending
Open Source 2026-03-12
Multimodal coding model fine-tuned from Qwen3.5-9B supporting image-text-to-text for agentic coding tasks, with 19K+ downloads and strong trending. Targets code generation with visual context, a useful niche for developer tooling.
nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16
28,713 downloads 54 likes 54 trending
Custom License 2026-03-07
NVIDIA's Nemotron-3 Nano 4B model in BF16, part of a new Nemotron-H architecture series trained on specialized datasets including agentic, math, and competitive programming data. Compact model with strong post-training pipeline targeting edge/local deployment.
silx-ai/Quasar-10B
138 downloads 30 likes 28 trending
Open Source 2026-03-09
Quasar-10B is a linear attention model fine-tuned from Qwen3.5-9B-Base supporting up to 2M token context via GLA (Gated Linear Attention). Noteworthy for the extreme context length capability at 10B scale, though from a lesser-known lab.
Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
130,983 downloads 137 likes 39 trending
Open Source 2026-03-03
Smaller 9B GGUF variant of the Claude-distilled reasoning model for resource-constrained local inference. Complements the 27B version for edge deployment.
Model Buzz

Trending Spaces

The hottest interactive demos and apps on HuggingFace Spaces this week — try them live. Flame icon = HuggingFace trending score. Hearts = community likes.

Tensorbend
Ex0bit
static 48 46
A static HuggingFace Space called 'Tensorbend' with minimal metadata and 48 likes. Insufficient information to assess technical substance.
Omni Video Factory
FrameAI4687
gradio 665 94
mit
A Gradio space offering text-to-video, image-to-video, and video extension capabilities with 665 likes. Aggregates multiple video generation modalities in one interface, but appears to be a wrapper rather than novel research.
NSFW Uncensored Adult Image
Heartsync
gradio 577 31.5
NSFW image generation space — not relevant to AI research or engineering audiences.
LTX 2.3 Distilled
Lightricks
gradio 215 61
Official demo space for LTX Video 2.3 Distilled from Lightricks, a distilled video generation model offering faster inference. Distillation of video diffusion models is an active research area and this represents a production-quality release.
Wan2.2 Animate
Wan-AI
gradio 5,019 72
apache-2.0
Official demo for Wan2.2 Animate with 5K+ likes, one of the most popular open video generation models. The 2.2 update to the Wan series continues to be a leading open-source video generation option.
Fish Audio S2 Pro
artificialguybr
gradio 108 67
other
ZeroGPU demo for Fish Audio S2 Pro text-to-speech model, enabling high-quality TTS without local GPU. Fish Audio S2 Pro is a competitive open TTS model; this demo lowers the barrier to evaluation.
FLUX.2 Klein 9B KV
black-forest-labs
gradio 97 66
Official Black Forest Labs demo for FLUX.2 Klein 9B with KV-cache optimization, a new efficient image generation model. The KV-cache variant suggests inference speed improvements for the FLUX architecture.
Free Unlimited Google Veo 3
deddytoyota
static 239 114
Unofficial 'free unlimited' wrapper claiming to provide access to Google Veo 3 with NSFW content — likely a scraper or misleading space, not technically substantive.
LTX 2.3 First-Last Frame
linoyts
gradio 67 42
Demo for LTX Video 2.3 with first-and-last-frame conditioning, enabling controlled video generation between two keyframes. Useful capability for video editing workflows built on the LTX 2.3 distilled model.
Voxtral Realtime WebGPU
mistralai
static 75 41
Mistral's Voxtral real-time speech transcription running entirely in-browser via WebGPU — no server required. Demonstrates on-device ASR capability from a frontier lab, relevant for privacy-preserving and offline speech applications.
Z Image Turbo
mrfakename
gradio 2,627 71
Z Image Turbo is a fast image generation space with 2.6K+ likes, indicating strong community adoption. Likely a distilled or optimized image generation model, but limited metadata makes technical assessment difficult.
Qwen Image Multiple Angles 3D Camera
multimodalart
gradio 1,951 40
Demo using Qwen's vision model to generate images from multiple 3D camera angles, with 1.9K+ likes indicating strong interest. Useful for 3D-consistent image generation workflows but appears to be a creative application rather than novel research.
FireRed Image Edit 1.0 Fast
prithivMLmods
gradio 407 136
apache-2.0
Fast image editing space combining FireRed-Image-Edit with Qwen-Image-Edit-Rapid for rapid instruction-based image editing. High trending score (136) and 407 likes suggest practical utility, but is a combination of existing models rather than novel research.
Qwen-Image-Edit-2511-LoRAs-Fast
prithivMLmods
gradio 1,122 51
apache-2.0
Gradio demo showcasing a collection of Qwen-based image editing LoRAs for fast inference. Useful for exploring fine-tuned image editing capabilities but primarily a demo wrapper.
Wan2.2 14B Preview
r3gm
gradio 1,477 190
Demo of Wan2.2 14B, a video generation model running with FP8 quantization and AOTI compilation for image-to-video generation. High trending score suggests significant community interest in this video gen model.

Deep Dive

All 292 items scored and categorized. Relevance scores reflect novelty, technical depth, and practical impact — 7+ items are the ones worth your time.

292+ research items ready to explore