Breaking News
Tuesday, March 31, 2026
Show HN: How This Graybeard Built the Fastest and Freest Postgres BM25 Search https://ift.tt/qDHprdl
Show HN: How This Graybeard Built the Fastest and Freest Postgres BM25 Search

Last summer we faced a conundrum at my company, Tiger Data, a Postgres cloud vendor whose main business is in time-series data. We were trying to grow our business toward emerging AI-centric workloads and wanted to provide a state-of-the-art hybrid search stack in Postgres. We'd already built pgvectorscale in-house with the goal of scaling semantic search beyond pgvector's main-memory limitations. We just needed a scalable ranked keyword search solution too.

The problem: core Postgres doesn't provide this; the leading Postgres BM25 extension, ParadeDB, is guarded behind the AGPL; and developing our own extension appeared daunting. We'd need a small team of sharp engineers and 6-12 months, I figured. And we'd probably still fall short of the performance of a mature system like Parade/Tantivy. Or would we?

I'd been experimenting long enough with AI-boosted development at that point to realize that with the latest tools (Claude Code + Opus) and an experienced hand (I've been working in database systems internals for 25 years now), the old time estimates pretty much go out the window. I told our CTO I thought I could solo the project in one quarter. This raised some eyebrows. It did take a little more time than that (two quarters), and we got some real help from the community (amazing!) after open-sourcing the pre-release.

But I'm thrilled/exhausted today to share that pg_textsearch v1.0 is freely available via open source (Postgres license), on Tiger Data cloud, and hopefully soon, a hyperscaler near you: https://ift.tt/s4KoTzP

In the blog post accompanying the release, I give an overview of the architecture and present benchmark results using MS-MARCO. To my surprise, we were not only able to meet Parade/Tantivy's query performance, but to exceed it substantially, measuring a 4.7x advantage in query throughput at scale: https://ift.tt/cS6aA2W...
It's exciting (and, to be honest, a little unnerving) to see a field I've spent so much time toiling in change so quickly, in ways that enable us to be more ambitious in our technical objectives. Technical moats are moats no longer. The benchmark scripts and methodology are available in the GitHub repo. Happy to answer any questions in the thread. Thanks, TJ (tj@tigerdata.com) https://ift.tt/s4KoTzP March 31, 2026 at 08:29PM
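For readers who haven't worked with BM25: the ranking function the post refers to is the standard Okapi BM25 formula. Below is a toy Python sketch of textbook BM25 scoring; nothing here reflects pg_textsearch's actual implementation (which builds inverted indexes inside Postgres), and k1 and b are just the conventional defaults.

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one document against a query with textbook Okapi BM25.

    corpus is a list of documents (each a list of terms), used only to
    compute document frequencies and average document length.
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)          # document frequency
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))   # smoothed IDF
        tf = doc_terms.count(term)                        # term frequency
        denom = tf + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * tf * (k1 + 1) / denom
    return score

corpus = [
    "postgres full text search".split(),
    "bm25 ranked keyword search in postgres".split(),
    "vector similarity search".split(),
]
scores = [bm25_score("bm25 postgres".split(), d, corpus) for d in corpus]
best = max(range(len(corpus)), key=lambda i: scores[i])  # index of top-ranked doc
```

Rare terms ("bm25") get a higher IDF than common ones ("postgres"), so the second document, which matches both, wins by a wide margin.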
Monday, March 30, 2026
Show HN: Rusdantic https://ift.tt/sDR3KI7
Show HN: Rusdantic A unified, high-performance data validation and serialization framework for Rust, inspired by Pydantic's ergonomics and powered by Serde. https://ift.tt/I61KxVg March 31, 2026 at 01:57AM
Show HN: AI Spotlight for Your Computer (natural language search for files) https://ift.tt/gzxYvSk
Show HN: AI Spotlight for Your Computer (natural language search for files)

Hi HN, I built SEARCH WIZARD, a tool that lets you search your computer using natural language. Traditional file search only works if you remember the filename. But most of the time we remember things like:
- "the screenshot where I was in a meeting"
- "the PDF about transformers"
- "notes about machine learning"

Smart Search indexes your files and lets you search by meaning instead of filename. Currently supports:
- Images
- Videos
- Audio
- Documents

Example query: "old photo where a man is looking at a monitor". The system retrieves the correct file instantly. Everything runs locally except embeddings.

I'm looking for feedback on:
- indexing approaches
- privacy concerns
- features you'd want in a tool like this

GitHub: https://ift.tt/TjCGQfq
Demo: https://deepanmpc.github.io/SMART-SEARCH/
March 30, 2026 at 07:13PM
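For context, "search by meaning" tools like this typically embed a description of each file once at index time, then rank files by vector similarity to the query. A minimal sketch of that embed-then-rank flow, with a toy fixed vocabulary standing in for a real embedding model (the file paths and descriptions are made up; SEARCH WIZARD's actual pipeline is not shown in the post):

```python
import math

# Toy fixed vocabulary stands in for a real embedding model; only the
# index-then-rank flow is the point, not the representation.
VOCAB = ["screenshot", "meeting", "pdf", "transformers", "notes",
         "machine", "learning", "man", "monitor", "photo"]

def toy_embed(text):
    """Map text to a unit vector of vocabulary-word counts."""
    words = text.lower().split()
    vec = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

# "Indexing": embed a short description of each file once, up front.
index = {
    "shots/meeting.png": toy_embed("screenshot of a man in a meeting"),
    "papers/attn.pdf":   toy_embed("pdf about transformers"),
    "notes/ml.md":       toy_embed("notes about machine learning"),
}

def search(query):
    """Return the indexed file whose description best matches the query."""
    q = toy_embed(query)
    return max(index, key=lambda path: cosine(q, index[path]))
```

With a real sentence-embedding model in place of `toy_embed`, the query no longer needs to share literal words with the description, which is what makes "the PDF about transformers" work even if the file is named `attn.pdf`.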
Show HN: Memv – Memory for AI Agents https://ift.tt/I78rWjx
Show HN: Memv – Memory for AI Agents

memv is an open-source Python library that gives AI agents persistent memory. Feed it conversations; it extracts knowledge. The extraction mechanism is predict-calibrate (Nemori paper): given existing knowledge, it predicts what a new conversation should contain, then extracts only what the prediction missed.

v0.1.2 adds the production path:
- PostgreSQL backend (pgvector for vectors, tsvector for text search, asyncpg pooling). Single db_url parameter: file path for SQLite, connection string for Postgres.
- Embedding adapters: OpenAI, Voyage, Cohere, fastembed (local ONNX).

Other things it does:
- Bi-temporal validity: event time (when was the fact true) + transaction time (when did we learn it), following Graphiti's model.
- Hybrid retrieval: vector similarity + BM25, merged with Reciprocal Rank Fusion.
- Episode segmentation: groups messages before extraction.
- Contradiction handling: new facts invalidate old ones, with a full audit trail.

Procedural memory (agents learning from past runs) is next, deferred until there's usage data.

https://ift.tt/F0uVy1g
March 30, 2026 at 09:09PM
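The hybrid-retrieval bullet mentions Reciprocal Rank Fusion, and the textbook version of RRF is tiny. A sketch (the constant k=60 is the conventional default from the original RRF paper; memv's actual merging code and parameters may differ):

```python
def rrf_merge(*rankings, k=60):
    """Merge ranked result lists with Reciprocal Rank Fusion.

    Each ranking is a list of doc ids, best first. A doc's fused score
    is the sum over lists of 1 / (k + rank), so items that rank high in
    several lists bubble to the top without any score normalization.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["d1", "d3", "d7"]   # semantic-similarity order
bm25_hits   = ["d1", "d9", "d3"]   # keyword-match order
fused = rrf_merge(vector_hits, bm25_hits)
```

The appeal of RRF over score-based merging is that vector similarities and BM25 scores live on incompatible scales; RRF only looks at ranks, so no calibration between the two retrievers is needed.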
Sunday, March 29, 2026
Show HN: React-Rewrite – Figma for localhost that directly edits your codebase https://ift.tt/ne7AMJj
Show HN: React-Rewrite – Figma for localhost that directly edits your codebase https://ift.tt/odbjiRz March 30, 2026 at 06:59AM
Show HN: Real-time visualization of Claude Code agent orchestration https://ift.tt/Df8UxHe
Show HN: Real-time visualization of Claude Code agent orchestration https://ift.tt/IiZXQBo March 30, 2026 at 06:21AM
Show HN: Tabical – Tinder-style city micro-itineraries, personalized by swipe https://ift.tt/0DGhrK7
Show HN: Tabical – Tinder-style city micro-itineraries, personalized by swipe

Tabical: swipeable 2-4 stop city itineraries for NYC, DC, and Atlanta. You swipe right or left, and a personalization vector updates on each swipe to curate your deck. The backend pipeline is where most of the interesting work lives: trending signals are harvested each day, and from those signals we fetch the candidates used to build itineraries. Built this because deciding what to do in a city like NYC is a genuinely annoying problem that no existing app solves end-to-end. Happy to talk more.

https://tabical.com/
March 30, 2026 at 12:46AM
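The post doesn't say how the personalization vector is updated; one plausible minimal scheme is an exponential moving average that drifts toward right-swiped itineraries and away from left-swiped ones. A sketch under that assumption (the taste dimensions, learning rate, and update rule are all illustrative, not Tabical's actual algorithm):

```python
def update_profile(profile, item_vec, liked, lr=0.2):
    """Move the user profile toward a right-swiped itinerary's vector
    and away from a left-swiped one (assumed update rule)."""
    sign = 1.0 if liked else -1.0
    return [p + lr * sign * (v - p) for p, v in zip(profile, item_vec)]

def rank_deck(profile, deck):
    """Order candidate itineraries by dot-product affinity with the profile."""
    return sorted(deck, key=lambda name: -sum(
        p * v for p, v in zip(profile, deck[name])))

# Toy 3-dim taste space: [food, art, nightlife]
deck = {
    "ramen-crawl": [0.9, 0.1, 0.2],
    "gallery-hop": [0.1, 0.9, 0.1],
    "bar-circuit": [0.2, 0.1, 0.9],
}
profile = [0.0, 0.0, 0.0]
profile = update_profile(profile, deck["ramen-crawl"], liked=True)   # swipe right
profile = update_profile(profile, deck["gallery-hop"], liked=False)  # swipe left
```

After one right swipe on the food itinerary and one left swipe on the art one, re-ranking the deck against the updated profile puts food first and art last, which is the "vector curates your deck" behavior the post describes.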