PyTorch Optimizations from Intel

Hi everyone,

I'm trying to deploy a model server on OpenShift AI using a TorchServe image that supports Intel Extension for PyTorch (IPEX). My goal is to leverage Intel AMX (Advanced Matrix Extensions) for optimized model inference on CPU.
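For context, on the model side my understanding of the IPEX path is roughly the sketch below. The model id and dtype choice are my assumptions rather than a tested deployment; I'd welcome corrections:

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id is an assumption on my part (the gated Meta checkpoint);
# substitute whatever artifact your model store actually holds.
model_id = "meta-llama/Meta-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# ipex.optimize() applies operator fusion and weight prepacking; on
# 4th-gen Xeon (Sapphire Rapids) and newer, bfloat16 GEMMs should then
# be dispatched to AMX tile instructions via oneDNN.
model = ipex.optimize(model, dtype=torch.bfloat16)

inputs = tokenizer("Hello from OpenShift AI", return_tensors="pt")
with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

What I'm less sure about is how to wire this into a TorchServe handler inside the serving runtime image.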

Has anyone successfully configured this setup? If so, could you provide guidance or examples of the necessary YAML configuration and steps to get this running?
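Here is the rough draft I've been working from, modeled on the KServe ServingRuntime / InferenceService pattern that OpenShift AI's single-model serving uses. The image, resource sizes, storage URI, and the NFD node label are all placeholders or assumptions on my part, so I don't know if this is the right overall shape:

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: torchserve-ipex                # placeholder name
spec:
  supportedModelFormats:
    - name: pytorch
      version: "1"
      autoSelect: true
  containers:
    - name: kserve-container
      # Placeholder: my custom TorchServe build with IPEX installed.
      image: quay.io/example/torchserve-ipex:latest
      args:
        - torchserve
        - --start
        - --model-store=/mnt/models/model-store
        - --ts-config=/mnt/models/config/config.properties
      resources:
        requests:
          cpu: "8"
          memory: 32Gi
        limits:
          cpu: "16"
          memory: 64Gi
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama-3-8b
spec:
  predictor:
    # Assumes Node Feature Discovery is installed and labels
    # AMX-capable nodes; not sure this is the recommended approach.
    nodeSelector:
      feature.node.kubernetes.io/cpu-cpuid.AMXBF16: "true"
    model:
      modelFormat:
        name: pytorch
      runtime: torchserve-ipex
      storageUri: s3://my-bucket/llama-3-8b/   # placeholder bucket
```

In particular, I'm unsure whether pinning the predictor to AMX-capable nodes via a nodeSelector is the right way to do this on OpenShift AI, or whether there is a supported IPEX-enabled runtime image I should be using instead of building my own.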

Any help or pointers to relevant documentation would be greatly appreciated!

The model I am currently working with is Llama-3-8B, used for inference.

Thanks in advance!

Responses