Lium exits stealth with $5.5m seed to query complex scientific data
Lium, formerly known as Astromind, has emerged from stealth with a commercial platform that ingests complex, high-dimensional datasets and makes them accessible through plain-language queries. The Dallas company, founded in 2024, disclosed $5.5 million in seed funding from SJF Ventures, Wavemaker 360, Reach Capital, and GC&H Investments alongside its launch.
The platform targets datasets that standard large language models struggle to reason over — seismic surveys, satellite imagery, radar feeds, well logs and scientific sensor data. Rather than requiring a data engineer to write bespoke pipelines, Lium ingests raw files, structures them into a format AI systems can process reliably, and pre-computes common query paths so that results are consistent across runs. The company says the system compounds institutional knowledge over time, becoming more useful as query volume grows.
Early customers and use cases
Three named customers provide some substance to the commercial launch. Industrial power services firm nexGEN is using the platform to automate electromagnetic spectrum analysis, replacing a manual reporting workflow with automated generator health reports. Geoscience software provider Imaged Reality has integrated Lium into its Stratbox subsurface visualisation tool, allowing geologists to run natural-language queries across core imagery, well logs and facies tables.
The North Carolina Institute for Climate Studies (NCICS) represents the most data-intensive deployment disclosed. Scientists there are using Lium to query terabytes of publicly available NOAA data — covering weather stations, radar, satellites, ships and sensors — to surface climate and weather-risk insights without needing to write software. Researcher James Anheuser said the platform "manages the compute, blends datasets, and navigates disparate file formats," freeing scientists from acting as data engineers.
Co-founder and chief executive Josh Knutson said large language models "are quite limited when it comes to understanding the data that represents our physical world," and that Lium provides an "agentic harness purpose built for turning complex data into knowledge." The company developed an early version of its technology while working with astrophysicists interpreting data from NASA's Chandra X-ray Observatory, making sparse X-ray observations queryable by LLMs.
Market context and competitive landscape
Lium is entering a competitive and fast-moving corner of the AI-data infrastructure market. A number of well-funded startups and established players are pursuing the similar goal of making unstructured or domain-specific data accessible to foundation models. Approaches range from domain-specific retrieval-augmented generation pipelines to multimodal models trained directly on scientific data types. Hyperscalers are also building connectors and data-prep tooling into their model-serving platforms, which partially address the same friction.
What distinguishes Lium's stated approach is its focus on physical-world data formats (geospatial, seismic, spectral) that remain outside the mainstream text-and-code training distributions of general-purpose LLMs. The energy, climate science and infrastructure sectors it targets are also subject to growing regulatory data-management obligations. The EU's Data Act and the UK's forthcoming data use and access legislation both create pressure on industrial operators to make siloed datasets more interoperable, which could accelerate adoption of middleware platforms like Lium's.
The $5.5 million seed is a modest war chest against enterprise sales cycles in energy and geoscience, where procurement timelines are long and security-review requirements are demanding. Investors will be watching two near-term milestones: a Series A raise and the conversion of early pilot customers to multi-year contracts.