💥 Check out this trending post from TechCrunch 📖
📂 Category: AI, artificial intelligence, inference, Laude Ventures
📌 Main takeaway:
As AI infrastructure reaches dizzying proportions, AI companies are under more pressure than ever to extract as much inference as possible from the GPUs at their disposal. For researchers with expertise in the relevant techniques, this is a good time to raise funding.
That’s part of the driving force behind Tensormesh, which quietly launched this week with $4.5 million in seed funding. This investment was led by Laude Ventures, with additional funding from database pioneer Michael Franklin.
Tensormesh is using the funds to build a commercial version of the open source LMCache utility, which was launched and is maintained by Tensormesh co-founder Yihua Cheng. Used well, LMCache can reduce inference costs by up to 10x, a strength that has made it a staple in open source deployments and drawn integrations from large companies like Google and Nvidia. Tensormesh now plans to turn that academic reputation into a viable business.
The core of the product is the key-value cache (or KV cache), a memory system that lets models process long inputs more efficiently by storing the intermediate key and value data computed for earlier tokens instead of recomputing it. In traditional architectures, the KV cache is discarded at the end of each query, but Tensormesh co-founder and CEO Junchen Jiang says this is a massive source of inefficiency.
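To make the idea concrete, here is a minimal sketch (not Tensormesh's implementation) of what a KV cache does during attention: each generated token appends its key and value vectors to the cache, so earlier tokens never need to be re-encoded.

```python
import numpy as np

def attend(query, keys, values):
    """Scaled dot-product attention over all cached keys/values."""
    scores = query @ keys.T / np.sqrt(keys.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

# Illustrative KV cache: grows by one key/value row per generated token.
d = 4
kv_cache = {"keys": np.empty((0, d)), "values": np.empty((0, d))}

def step(new_key, new_value, query):
    kv_cache["keys"] = np.vstack([kv_cache["keys"], new_key])
    kv_cache["values"] = np.vstack([kv_cache["values"], new_value])
    return attend(query, kv_cache["keys"], kv_cache["values"])

rng = np.random.default_rng(0)
for _ in range(3):
    k, v, q = rng.normal(size=(3, d))
    out = step(k[None, :], v[None, :], q)

print(kv_cache["keys"].shape)  # cache grows with each token: (3, 4)
```

Throwing this cache away after each query, as the article notes, means paying the full cost of rebuilding it the next time similar context appears.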
“It’s like having a very smart analyst who reads all the data, but forgets what he learned after every question,” says Jiang.
Instead of disposing of this cache, Tensormesh systems keep it, allowing it to be redeployed when the model performs a similar operation in a separate query. Since GPU memory is so precious, this may mean spreading data across several different storage layers, but the reward is significantly more inference power for the same server load.
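The "several different storage layers" idea can be sketched as a simple tiered store, purely illustrative and not Tensormesh's design: hot cache entries live in scarce GPU memory, evictions spill to CPU RAM and then to disk, and a hit on a slower tier promotes the entry back up.

```python
from collections import OrderedDict

# Hypothetical multi-tier KV-cache store. Tier names and sizes are
# illustrative; real systems track tensor sizes, not slot counts.
class TieredKVStore:
    def __init__(self, gpu_slots=2, cpu_slots=4):
        self.gpu = OrderedDict()   # fastest, smallest tier
        self.cpu = OrderedDict()   # middle tier
        self.disk = {}             # slowest, effectively unbounded
        self.gpu_slots, self.cpu_slots = gpu_slots, cpu_slots

    def put(self, key, kv_blob):
        self.gpu[key] = kv_blob
        self.gpu.move_to_end(key)
        while len(self.gpu) > self.gpu_slots:   # spill oldest GPU -> CPU
            k, v = self.gpu.popitem(last=False)
            self.cpu[k] = v
        while len(self.cpu) > self.cpu_slots:   # spill oldest CPU -> disk
            k, v = self.cpu.popitem(last=False)
            self.disk[k] = v

    def get(self, key):
        for tier in (self.gpu, self.cpu, self.disk):
            if key in tier:
                blob = tier.pop(key)
                self.put(key, blob)             # promote back to GPU tier
                return blob
        return None                             # miss: must recompute

store = TieredKVStore()
for i in range(7):
    store.put(f"query-{i}", f"kv-{i}")
print(store.get("query-0") is not None)  # old entry survives on a slower tier
```

The hard part Jiang describes is doing this promotion and spilling fast enough that fetching a cached entry stays cheaper than recomputing it.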
The change is particularly powerful for chat interfaces, since models need to constantly reference the growing chat history as the conversation progresses. Agent systems face a similar problem, with an ever-increasing history of actions and goals.
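Why chat benefits so much can be shown with a toy prefix-reuse scheme (a hypothetical illustration, not the LMCache mechanism): the shared conversation history is hashed into a key, so a later turn can look up the KV data already computed for its prefix instead of reprocessing the whole transcript.

```python
import hashlib

# Hypothetical prefix keying: hash the message list that forms the
# shared conversation prefix.
def prefix_key(messages):
    h = hashlib.sha256()
    for m in messages:
        h.update(m.encode())
        h.update(b"\x00")  # separator so message boundaries matter
    return h.hexdigest()

cache = {}
history = ["user: summarize this report", "assistant: here is a summary..."]
cache[prefix_key(history)] = "kv-tensors-for-history"  # placeholder payload

# Next turn: same history plus a new question. The prefix is already cached,
# so only the new question needs fresh computation.
next_turn = history + ["user: what were the risks?"]
print(prefix_key(next_turn[:-1]) in cache)  # True: prefix KV cache is reused
```

As the history grows turn by turn, the reusable prefix grows with it, which is exactly where discarding the cache hurts most.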
In theory, these are changes that AI companies could implement themselves, but the technical complexity makes that a daunting task. Given the Tensormesh team's deep research into the process and the difficulty of getting the details right, the company is betting there will be significant demand for the finished product.
“Keeping the KV cache in a secondary storage system and reusing it efficiently without slowing down the entire system is a very challenging problem,” says Jiang. “We’ve seen people hire 20 engineers and spend three or four months building such a system. Or they can use our product and do it very efficiently.”
💬 What do you think?
#️⃣ #Tensormesh #AI #inference #KVCache
🕒 Posted on October 24, 2025
