(genai-deployment)=
# Deploying gen AI 

MLRun serving can produce managed ML application pipelines using real-time auto-scaling Nuclio serverless functions. 
The application pipeline covers every step: accepting events or data, preparing the required model features, 
inferring results using one or more models, and driving actions.
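
To make the four stages concrete, here is a minimal plain-Python sketch of such a pipeline. It deliberately does not use the MLRun or Nuclio APIs: every function name, the event fields (`prompt`, `features`, `prediction`, `action`), and the word-count "model" are hypothetical stand-ins chosen for illustration only. See the pages listed below for the actual serving interfaces.

```python
# Illustrative sketch only: plain Python, not the MLRun serving API.
# All names and event fields here are hypothetical.

def accept_event(event: dict) -> dict:
    """Stage 1: accept the incoming event and check that it is usable."""
    if "prompt" not in event:
        raise ValueError("event must contain a 'prompt' field")
    return event

def prepare_features(event: dict) -> dict:
    """Stage 2: derive the features that the model needs."""
    text = event["prompt"].strip()
    features = {"text": text, "n_words": len(text.split())}
    return {**event, "features": features}

def infer(event: dict) -> dict:
    """Stage 3: run inference (a trivial stand-in for a real model)."""
    label = "long" if event["features"]["n_words"] > 5 else "short"
    return {**event, "prediction": label}

def drive_action(event: dict) -> dict:
    """Stage 4: choose a follow-up action based on the prediction."""
    action = "summarize" if event["prediction"] == "long" else "answer_directly"
    return {**event, "action": action}

# The pipeline is an ordered list of steps applied to each event.
PIPELINE = [accept_event, prepare_features, infer, drive_action]

def run_pipeline(event: dict, steps=PIPELINE) -> dict:
    """Pass the event through every step in order and return the result."""
    for step in steps:
        event = step(event)
    return event

result = run_pipeline({"prompt": "  What is MLRun?  "})
print(result["prediction"], result["action"])
```

Each step takes an event and returns an enriched copy, so steps can be added, removed, or reordered without touching the others. In a real deployment, the serving framework would take over the role of `run_pipeline`.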

**In this section**

```{toctree}
:maxdepth: 1

genai_serving
gpu_utilization
genai_serving_graph
```