OctoML makes it easy to put AI/ML models into production

OctoML, the well-funded machine learning startup that helps enterprises optimize and deploy their models, today released a major update to its product that makes it much easier for developers to integrate machine learning models into their applications. With this release, OctoML can transform machine learning models into portable functions that developers can interact with through a consistent API. The update also makes it easier to integrate these models into existing DevOps workflows.
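
OctoML hasn't published the interface details in this announcement, but the general idea of a model packaged behind a consistent API might look something like the minimal sketch below. The endpoint URL, route, and JSON payload shape here are hypothetical placeholders, not OctoML's actual API.

```python
import requests  # third-party HTTP client: pip install requests

# Hypothetical endpoint for a deployed model-as-a-function.
# The URL, route, and JSON schema below are illustrative only;
# they are not OctoML's documented API.
ENDPOINT = "http://localhost:8080/v1/models/sentiment/infer"

def classify(text: str) -> dict:
    """Call the deployed model the same way you'd call any function."""
    resp = requests.post(ENDPOINT, json={"inputs": {"text": text}}, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(classify("OctoML makes deployment easy"))
```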

As OctoML founder and CEO Luis Ceze told me, he thinks this is a big moment for the company. Ceze founded the company together with CTO Tianqi Chen, CPO Jason Knight, Chief Architect Jared Roesch, and VP of Technology Partnerships Thierry Moreau to productize TVM, an open source machine learning compiler framework that helps machine learning engineers optimize their models for specific hardware.
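
For context, TVM exposes a Python API for compiling models down to optimized code for a given hardware target. The sketch below compiles a toy Relay function for a local CPU; it follows the public TVM API (as of roughly the 0.8/0.9 releases), though exact module paths have shifted between versions.

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# A toy Relay function: y = x + 1.0
x = relay.var("x", shape=(3,), dtype="float32")
func = relay.Function([x], relay.add(x, relay.const(1.0)))
mod = tvm.IRModule.from_expr(func)

# Compile for the local CPU ("llvm" target); other targets
# such as "cuda" would produce code for different hardware.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm")

# Run the compiled module.
dev = tvm.cpu()
runtime = graph_executor.GraphModule(lib["default"](dev))
runtime.set_input("x", np.array([1.0, 2.0, 3.0], dtype="float32"))
runtime.run()
print(runtime.get_output(0).numpy())  # -> [2. 3. 4.]
```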

Image Credits: OctoML

“When we started OctoML, we said: let’s do TVM as a service,” Ceze said. “We learned a lot from that, but as we worked with more customers, it became clear that AI/ML deployment was still too hard.”

He noted that as data collection and model building tools have improved over the past few years, the gap between what these models can do and actually integrating them into applications has only widened. Turning models into functions all but closes that gap: the new system abstracts much of that complexity away from developers, which should help get more models into production. Today, depending on which survey you believe, more than half of trained machine learning models never make it into production.

Since OctoML already offers the tools to run these models virtually anywhere, many of the decisions about where to deploy a model can now also be automated. “What distinguishes us from any other solution is the ability to take a model, prepare it for deployment, integrate it into an application, and then run it on any endpoint,” Ceze said, noting that this is also a game-changer for autoscaling, since it lets engineers build autoscaling systems that move a model between processors and accelerators with different performance characteristics as needed.

The models-as-functions capability is only part of today’s announcement, however. Also new to the platform is a tool that lets OctoML use machine learning to optimize machine learning models: the service can automatically detect and resolve dependencies, and clean up and optimize model code. There’s also a new local OctoML CLI and support for the Nvidia Triton inference server, which can now be used together with the new models-as-functions service.

“NVIDIA Triton is a powerful abstraction that allows users to run multiple deep learning frameworks and acceleration technologies on both CPUs and NVIDIA GPUs,” said OctoML CTO Jared Roesch. “What’s more, by pairing NVIDIA Triton with OctoML, we’re making it easier for users to choose, integrate, and deploy Triton functionality. The OctoML workflow further enhances the user value of Triton-based deployments by seamlessly integrating OctoML acceleration technology, allowing you to get the most out of both the serving layer and the model layer.”
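
To make the Triton piece concrete: Triton serves models over standard HTTP/gRPC endpoints, and NVIDIA ships a Python client library (tritonclient) for querying them. The sketch below shows a minimal HTTP inference request; the model name, tensor names, and shape are placeholders that would depend on the actual deployed model.

```python
import numpy as np
import tritonclient.http as httpclient  # pip install tritonclient[http]

# Connect to a Triton server on its default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Placeholder model/tensor names: a real deployment would use the
# names defined in the model's Triton config.pbtxt.
inp = httpclient.InferInput("INPUT__0", [1, 3], "FP32")
inp.set_data_from_numpy(np.array([[1.0, 2.0, 3.0]], dtype=np.float32))
out = httpclient.InferRequestedOutput("OUTPUT__0")

result = client.infer(model_name="my_model", inputs=[inp], outputs=[out])
print(result.as_numpy("OUTPUT__0"))
```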

Looking ahead, Ceze noted that the company, which has grown from 20 to over 140 employees since 2020, will focus on bringing its services to more edge devices, including smartphones and other Snapdragon-powered devices through a partnership with Qualcomm.

“The timing seems right, because as we talk to customers who are currently deploying in the cloud, they all say they have plans to deploy at the edge, too,” he said.


Credit: techcrunch.com
