RisingWave

RisingWave · 2026-06-10T19:19:32.091Z

Vector databases solved retrieval. But they didn't solve freshness.

That's why we think the next evolution of vector search is not just better indexing. It's streaming-native vector retrieval.

Most vector search architectures look like:

Source Data → Embeddings → Vector Database → Application

The problem? The data changes continuously, but the index is often updated in batches. That creates:
➡️ ingestion lag
➡️ stale embeddings
➡️ consistency challenges
➡️ operational complexity

That's why modern AI systems are evolving beyond traditional vector search. But the real shift is not: "adding vector search." The real shift is: streaming-native computation + live vector retrieval.

Because the biggest challenge is not similarity search itself. It is:
➡️ keeping embeddings fresh
➡️ continuously updating indexes
➡️ reducing data-to-retrieval latency
➡️ maintaining consistency between data and vectors

That's why modern AI systems increasingly combine:
✅ continuous computation
✅ vector search
✅ HNSW indexes

With a streaming-native architecture:

Source → Streaming Database → Live Vector Index → Application

New events update the index automatically and incrementally. Updated records update the index automatically. The gap between data and retrieval shrinks from hours to seconds.

The future of AI retrieval is not just: vector-native. It's:
✅ streaming-native
✅ continuously updated
✅ real-time
✅ retrieval-ready

Because the data is live. The index should be too.

Read the blog here: https://lnkd.in/dAGduWRh
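
The streaming-native flow described in this post can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_embed` and `LiveVectorIndex` are hypothetical names, the embedding is a toy character histogram rather than a real model, and an exact brute-force scan stands in for an HNSW index. It is not RisingWave's API. The point it shows is that each event is applied to the index as it arrives, so the next search already sees it.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def toy_embed(text: str, dims: int = 8) -> Vector:
    """Hypothetical embedding: a normalized, hashed character histogram."""
    vec = [0.0] * dims
    for ch in text.lower():
        vec[ord(ch) % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class LiveVectorIndex:
    """Exact (brute-force) index; each event is applied as it arrives."""

    def __init__(self) -> None:
        self._vectors: Dict[str, Vector] = {}

    def apply_event(self, record_id: str, text: Optional[str]) -> None:
        """Upsert the record's embedding, or drop it when text is None."""
        if text is None:
            self._vectors.pop(record_id, None)
        else:
            self._vectors[record_id] = toy_embed(text)

    def search(self, query: str, k: int = 3) -> List[Tuple[str, float]]:
        """Return the k most similar records by cosine similarity."""
        q = toy_embed(query)
        scored = [
            (record_id, sum(a * b for a, b in zip(q, vec)))
            for record_id, vec in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


index = LiveVectorIndex()
index.apply_event("doc-1", "apple pie")
index.apply_event("doc-2", "zebra zoo")
first = index.search("apple pie", k=1)[0][0]

# Updated records replace their old vectors in place; no batch rebuild.
index.apply_event("doc-1", "zzzz")
index.apply_event("doc-2", "apple pie")
second = index.search("apple pie", k=1)[0][0]
print(first, second)  # doc-1 doc-2
```

A batch architecture would instead collect these events and rebuild or bulk-load the index on a schedule, which is where the ingestion lag in the post comes from.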

Software Development

San Francisco, California 14,624 followers

The live data company. Powering humans and agents with what's happening now.


About us

The live data company. Powering humans and agents with what's happening now. Talk to us: https://risingwave.com/slack.

Website: http://www.risingwave.com/
Industry: Software Development
Company size: 51-200 employees
Headquarters: San Francisco, California
Type: Privately Held
Founded: 2021

Products

RisingWave

Event Stream Processing (ESP) Software

RisingWave is an event stream processing and management platform. It offers a unified experience for real-time data ingestion, stream processing, data persistence, and low-latency serving.

Employees at RisingWave

Xiangyu (Sam) Hu


Locations

Primary

95 3rd St

2nd Floor

San Francisco, California 94103, US

16 Collyer Quay

Downtown Core, Central Region 049318, SG


Updates

RisingWave

14,624 followers
2h Edited
SQL is Dead. Long Live SQL.

For more than 50 years, people have been predicting the death of SQL. As Prof. Andy Pavlo says: Somebody invents a SQL replacement every decade. It then fails and/or SQL absorbs the key ideas into the standard. Every NoSQL DBMS (except Redis) now supports SQL.

Today, the latest challenge comes from AI. At a recent keynote, Databricks CEO Ali Ghodsi declared that "AGI is here today." The implication is that natural language interfaces and agents may eventually replace SQL.

Yet the reality is more nuanced. The recently released BEAVER enterprise Text-to-SQL benchmark shows that even the most advanced models still struggle with real-world enterprise SQL tasks. As of 27 February 2026, the latest benchmark results are:
Claude 4.5 Sonnet: 11.4%
GPT-5.2: 10.8%

These models have improved a lot over the past few months. But when faced with large enterprise schemas, complex joins, domain-specific knowledge, nested queries, and analytical workloads, they are still far from reliably replacing SQL expertise.

The lesson is that SQL keeps proving remarkably resilient. Just as it survived many "SQL killers," it is likely to evolve alongside AI rather than be replaced by it. After more than half a century, SQL remains the lingua franca of data.

RIP SQL? Not anytime soon.
RisingWave

14,624 followers
7h
We had a great meetup yesterday with our partners Lenses.io and MotherDuck at the Plug and Play Tech Center in the Bay Area. 😍

It was really exciting to bring together the data and AI community of the Bay Area to explore how real-time data, lakehouse architectures, and streaming systems are shaping the next generation of AI and agentic applications. From streaming-first approaches to Apache Iceberg and lakehouse architectures to real-time context engineering for AI agents, the discussions highlighted a common theme: AI is only as powerful as the data behind it.

Thank you to all the members of our data and AI community who joined us, as well as our incredible speakers, Bev Turnbaugh, Tun Shwe, and Rayees Pasha, for sharing their insights. Special thanks to everyone who worked behind the scenes to make this event possible, especially Carly Spoljaric, Jacob Matson, Jovana, and Fahad Shah. 🙌 And of course, thank you to our partners at Lenses.io and MotherDuck for helping create such a great meetup for the community.

We're excited to continue these conversations and collaborations in the months ahead, including an upcoming meetup with Lenses in Paris.

Image credit: Lenses team.
RisingWave

14,624 followers
1d
If everything is a log, then what isn't a log?

➡️ Kafka is a log.
➡️ Databases are logs with tables materialized on top.
➡️ CDC just reads the log.
➡️ The WAL comes first; the rows come later.
➡️ MVCC is versioned history built on top of a log.
➡️ Raft replicates logs.
➡️ ZooKeeper and etcd are consensus logs.
➡️ Leader election is deciding who gets to append to the log.
➡️ Stream processors replay logs.
➡️ Materialized views are logs turned into tables.
➡️ Checkpoints are shortcuts for replaying logs.
➡️ State machines are logs turned into applications.
➡️ Replication, recovery, snapshots, and time travel all start with a log.
➡️ Distributed systems are just logs pretending to be different things.

🤔 If everything is a log, then what isn't a log?
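
Two of the claims above, "materialized views are logs turned into tables" and "checkpoints are shortcuts for replaying logs", can be made concrete with a small Python sketch. The names `LogEntry` and `replay` and the sample data are hypothetical, chosen only for illustration; real systems add durability, ordering guarantees, and concurrency control on top of this idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogEntry:
    key: str
    value: Optional[int]  # None is a tombstone: the key was deleted


def replay(log: List[LogEntry], start: int = 0,
           snapshot: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Fold log[start:] into a table, starting from an optional snapshot."""
    table = dict(snapshot or {})
    for entry in log[start:]:
        if entry.value is None:
            table.pop(entry.key, None)
        else:
            table[entry.key] = entry.value
    return table


log = [
    LogEntry("alice", 10),
    LogEntry("bob", 5),
    LogEntry("alice", 12),   # newer version of the same key
    LogEntry("bob", None),   # delete
    LogEntry("carol", 7),
]

# Materialized view: replay the whole log into a table.
full = replay(log)

# Checkpoint: keep (offset, table) so recovery replays only the tail.
checkpoint_offset = 3
checkpoint_table = replay(log[:checkpoint_offset])
recovered = replay(log, start=checkpoint_offset, snapshot=checkpoint_table)

print(full)               # {'alice': 12, 'carol': 7}
print(recovered == full)  # True
```

Recovery from the checkpoint reads two entries instead of five, yet produces the same table, which is the sense in which a checkpoint is a shortcut rather than a separate source of truth.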
RisingWave

14,624 followers
1d
Siemens achieved ~1000× faster data availability while cutting infrastructure costs by more than 50% with a single architectural change.

Traditional data stacks rely on:
➡️ Batch ETL
➡️ Scheduled jobs
➡️ Complex orchestration
➡️ Delayed reporting

Every layer adds latency. And latency creates a gap between data and decisions. Siemens closed that gap. By replacing nightly batch jobs with a fully streaming Medallion architecture powered by RisingWave.

Before:
➡️ Hours of latency
➡️ Complex ETL pipelines
➡️ Dedicated scheduling clusters
➡️ Next-day reporting

After:
✅ Real-time processing
✅ Continuous SQL transformations
✅ Always-fresh insights

The shift wasn't incremental. It was architectural. Traditional Medallion architectures follow:

Bronze → Silver → Gold

But every layer waits for the next batch. Siemens changed that. With RisingWave:
➡️ Bronze streams continuously
➡️ Silver transforms continuously
➡️ Gold updates continuously

No batches. No waiting. No stale data.

The biggest win wasn't just speed. It was simplicity.

Before:
➡️ ETL scripts
➡️ Schedulers
➡️ Landing zones
➡️ Batch infrastructure

After:
✅ Postgres style SQL
✅ Materialized views
✅ One streaming platform

The results:
➡️ ~1000× faster data availability
➡️ >50% lower infrastructure costs
➡️ Real-time operational visibility
➡️ Live business metrics

As data volumes grow, the challenge isn't collecting data. It's turning data into decisions before the opportunity passes. That's what streaming architectures enable.

A special thanks to our partner, Hivemind Technologies, for their incredible support and collaboration throughout this journey. Andreas Vogler Erik Schmiegelow

#RealTimeAnalytics #MedallionArchitecture
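
To illustrate what "Bronze streams, Silver transforms, Gold updates continuously" means in practice, here is a minimal Python sketch. It is not the Siemens pipeline and not RisingWave code (the post describes that as Postgres style SQL with materialized views); the class name `StreamingMedallion`, the `site` and `amount` fields, and the cleaning rule are all assumptions made up for the example. What it shows is that every event passes through all three layers on arrival, so the Gold aggregate never waits for a batch.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


class StreamingMedallion:
    """Each event flows Bronze -> Silver -> Gold as soon as it arrives."""

    def __init__(self) -> None:
        self.bronze: List[Dict[str, Any]] = []   # raw events, as received
        self.silver: List[Dict[str, Any]] = []   # validated, normalized rows
        self.gold: Dict[str, float] = defaultdict(float)  # total per site

    @staticmethod
    def _clean(raw: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        """Silver rule (assumed): keep rows with a site and a numeric amount."""
        site = str(raw.get("site", "")).strip().upper()
        try:
            amount = float(raw.get("amount"))
        except (TypeError, ValueError):
            return None
        return {"site": site, "amount": amount} if site else None

    def ingest(self, raw: Dict[str, Any]) -> None:
        """Process one event through all three layers, with no batch step."""
        self.bronze.append(raw)
        row = self._clean(raw)
        if row is None:
            return  # rejected at Silver, still retained in Bronze
        self.silver.append(row)
        self.gold[row["site"]] += row["amount"]  # incremental Gold update


pipeline = StreamingMedallion()
for event in [
    {"site": " berlin ", "amount": "10.5"},
    {"site": "Berlin", "amount": 4.5},
    {"site": "Munich", "amount": "n/a"},  # fails the Silver rule
    {"site": "munich", "amount": 3},
]:
    pipeline.ingest(event)
    print(dict(pipeline.gold))  # Gold is current after every single event
```

In a batch Medallion setup the same three steps would run as separate scheduled jobs, and the Gold totals would only change when the last job finishes.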

RisingWave

14,624 followers
5d
The RUM Conjecture explains one of the most important realities of database design. There is no perfect storage engine.

Databases like:
➡️ PostgreSQL
➡️ MySQL
➡️ RocksDB
➡️ Cassandra
➡️ ScyllaDB
➡️ CockroachDB
➡️ YugabyteDB
➡️ Redis
➡️ RisingWave
all make different tradeoffs. And those tradeoffs can be explained through a simple framework: The RUM Conjecture.

RUM stands for:
➡️ Read efficiency
➡️ Update (write) efficiency
➡️ Memory (space) efficiency

The core idea is deceptively simple: You can optimize for any two. But you cannot simultaneously optimize all three. Storage engine designers focus on what they are willing to sacrifice. Every architecture sits somewhere inside the RUM triangle. For example:

B+ Trees favor:
✅ Fast reads
✅ Space efficiency
while paying for:
➡️ Higher write costs
➡️ Random I/O

Pure log structures favor:
✅ Fast writes
✅ Sequential I/O
while paying for:
➡️ Expensive reads
➡️ Space amplification

Hash tables favor:
✅ Fast reads
✅ Fast writes
while paying for:
➡️ Large memory consumption
➡️ Poor space efficiency

LSM Trees take a more balanced approach:

Write to MemTable → Flush to SSTables → Compact later

In other words: Optimize writes first. Organize data later. This allows LSM-based systems to achieve:
✅ High write throughput
✅ Good read performance
✅ Reasonable space efficiency
while accepting:
➡️ Compaction overhead
➡️ Read amplification
➡️ Write amplification

This is why systems such as:
➡️ RocksDB
➡️ Cassandra
➡️ ScyllaDB
➡️ TiKV
➡️ CockroachDB
➡️ YugabyteDB
➡️ RisingWave
all build on LSM principles.

The interesting part is that the hard problem isn't optimization itself. It's understanding the cost of optimization.

Improve reads?
➡️ More indexes
➡️ More metadata
➡️ More maintenance

Improve writes?
➡️ More files
➡️ More versions
➡️ More read work later

Improve space efficiency?
➡️ Less redundancy
➡️ Less metadata
➡️ Slower lookups

Every improvement introduces a new cost somewhere else in the system. As the RUM Conjecture teaches: Pull one rope, and the other two pull back.

Modern storage engines are ultimately exercises in balancing:
➡️ Read amplification
➡️ Write amplification
➡️ Space amplification
rather than eliminating them entirely.

This is also why LSM Trees became the dominant storage architecture behind modern data systems. Not because they maximize one dimension. But because they strike a practical balance across all three.

Understanding the RUM Conjecture is ultimately about understanding a fundamental truth of system design: Every storage engine is a tradeoff. The question is not whether tradeoffs exist. The question is which tradeoffs best fit your workload.

#Databases #StorageEngines #LSMTree #PostgreSQL
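
The "Write to MemTable → Flush to SSTables → Compact later" path from the post can be sketched as a toy in Python. This is a teaching model under stated assumptions, not how RocksDB or any engine named above is implemented: `ToyLSM`, `memtable_limit`, and `last_probes` are hypothetical names, SSTables are in-memory sorted lists rather than files, and there is no write-ahead log, bloom filter, or leveled compaction. It does show the tradeoff: writes stay cheap, reads may have to probe several SSTables, and compaction pays a cost later to reduce that read work.

```python
import bisect
from typing import Dict, List, Optional, Tuple

SSTable = List[Tuple[str, str]]  # immutable run of (key, value), sorted by key


class ToyLSM:
    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[str, str] = {}
        self.sstables: List[SSTable] = []  # oldest first, newest last
        self.memtable_limit = memtable_limit
        self.last_probes = 0  # SSTables examined by the most recent get()

    def put(self, key: str, value: str) -> None:
        """Writes only touch memory until the MemTable fills up."""
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Freeze the MemTable into a new sorted run."""
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        """Check the MemTable, then SSTables from newest to oldest."""
        self.last_probes = 0
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            self.last_probes += 1
            # (key,) sorts just before any (key, value) pair with that key.
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None

    def compact(self) -> None:
        """Merge all runs into one; newer values overwrite older ones."""
        merged: Dict[str, str] = {}
        for table in self.sstables:  # oldest first, so later runs win
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []


db = ToyLSM(memtable_limit=2)
db.put("a", "1"); db.put("b", "2")   # first flush
db.put("a", "3"); db.put("c", "4")   # second flush, "a" now has two versions

print(db.get("b"), db.last_probes)   # 2 2  (read amplification: two runs probed)
db.compact()
print(db.get("b"), db.last_probes)   # 2 1  (one run left after compaction)
```

Before compaction the stale version of `"a"` also occupies space in the older run, which is a small instance of the space amplification the post lists.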

RisingWave reposted this
Tun Shwe
6d
🌉 I'm excited about my next talk on Monday in Sunnyvale. I'll be returning to the San Francisco Bay Area with RisingWave, MotherDuck and Lenses.io to host an evening meetup on lakehouse architecture, streaming data and agentic AI. I'll be in the line-up with Rayees Pasha and Jacob Matson, speaking about the practice of agentic engineering using real-time context from Apache Kafka. A lot of my focus these days is on helping engineers find practical approaches for applying AI in their day-to-day work. Looking forward to sharing what's been working for me. I've dropped the signup link in the comments. Hope to see you there!
5 Comments

RisingWave reposted this
Fei Lang
6d Edited
Everyone is talking about AI agents. Less attention is being paid to the data infrastructure required to make them work reliably at scale. Excited to join RisingWave ahead of AWS Summit DC to discuss how real-time data, streaming systems, and AI-native architectures are shaping the future of intelligent applications. Looking forward to sharing what we’re seeing across the industry and exchanging ideas with builders, architects, and data leaders. Hope to see you there. #AWS #AWSSummit2026 #AgenticAI #RealTimeData #StreamingData #AIInfrastructure #DataEngineering
RisingWave

14,624 followers
1w

Amazon Web Services (AWS) 🤝 RisingWave: Real-Time Data on AWS Meetup

We're excited to partner with AWS for a special pre–AWS Summit Washington, DC meetup on June 29, 2026, where we'll explore the future of real-time data infrastructure, streaming systems, and AI-native architectures on AWS. Our speakers will share their perspectives on the technologies, architectures, and paradigms shaping the future of AI and data infrastructure.

Speakers include:
➡️ Rayees Pasha, CPO, RisingWave
➡️ Fei Lang, Principal Partner Solution Architect, AWS
➡️ Manish Devgan, Product & AI Leader | Former KX SVP & Hazelcast CPO

Event Details:
➡️ June 29, 2026
➡️ 6:30 PM – 8:30 PM EDT
➡️ Amazon WAS16 Aurora, Arlington, Virginia

The evening will include talks, along with live discussions, Q&A sessions, networking opportunities, food, and refreshments. It's a great opportunity to connect with AWS Summit attendees and professionals working across real-time data, analytics, and AI.

Register for the event and see you there!
➡️ https://luma.com/real-u0j8

#AWS #AWSSummit2026 #AWSWashingtonDC #RealTimeData
1 Comment

RisingWave reposted this
Lenses.io

5,345 followers
6d Edited
In a world where decisions depend on what's happening right now, real-time context matters more than ever. On Monday, 15 June, Lenses.io, RisingWave and MotherDuck are bringing together the AI, data, and developer community in Sunnyvale to explore the architectures and best practices behind modern real-time AI systems. Tun Shwe, Rayees Pasha and Jacob Matson will be on stage. We're in town for Databricks Data + AI Summit - this is a good chance to meet us before the conference kicks off. 📅 Monday 15 June, 6 PM PDT 📍 Plug and Play Tech Center, Sunnyvale Register before it fills up: https://okt.to/2ROFBo
RisingWave reposted this
Lenses.io

5,345 followers
1w
We're co-hosting a meetup with RisingWave and MotherDuck the evening before Databricks Data + AI Summit👇 #DataAISummit
RisingWave

14,624 followers
1w

We're excited to co-host the Real-Time Lakehouse & Agentic AI Meetup on June 15, 2026, together with our partners MotherDuck and Lenses.io to discuss the future of real-time AI and data infrastructure. Our speakers will share their perspectives on lakehouse architectures, streaming data, and the technologies powering AI infra.

Speakers include:
➡️ Rayees Pasha, CPO, RisingWave
➡️ Tun Shwe, AI Lead, Lenses
➡️ Jacob Matson, MotherDuck

Event Details:
➡️ June 15, 2026
➡️ 6:00 PM – 8:00 PM PDT
➡️ Plug and Play Tech Center, 440 N Wolfe Rd, Sunnyvale, CA 94085, USA

The evening will include talks, community discussions, networking opportunities, food, and refreshments. It's a great opportunity to connect with AI and data professionals from across the Bay Area. Join us as we explore together how real-time data, lakehouse architectures, and streaming systems are converging to power intelligent applications.

Register for the event and see you there!
➡️ https://luma.com/real-uhka

#AI #AgenticAI #RealTimeData #Lakehouse #DataEngineering
1 Comment


