Managed MLflow

Build better models and generative AI apps

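What is Managed MLflow?

Managed MLflow extends the functionality of MLflow, an open source platform developed by Databricks for building better models and generative AI apps, with a focus on enterprise reliability, security and scalability. The latest update to MLflow introduces GenAI and LLMOps features that strengthen its ability to manage and deploy large language models (LLMs). This expanded LLM support comes from the MLflow Deployments Server as well as new integrations with industry-standard LLM tools such as OpenAI and Hugging Face Transformers. In addition, MLflow’s integration with LLM frameworks (for example, LangChain) simplifies model development for generative AI applications across a variety of use cases, including chatbots, document summarization, text classification and sentiment analysis.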

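Benefits

Model development

Enhance and accelerate machine learning lifecycle management with a standardized framework for developing production-ready models. With Managed MLflow Recipes, you can bootstrap ML projects seamlessly, iterate rapidly and deploy models at scale. MLflow’s LLM offerings integrate with LangChain, Hugging Face and OpenAI, so you can readily build generative AI apps such as chatbots, document summarization, sentiment analysis and classification.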

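Experiment tracking

Run experiments with any ML library, framework or language, and automatically keep track of the parameters, metrics, code and models from each experiment. By using MLflow on Databricks, you can securely share, manage and compare experiment results along with the corresponding artifacts and code versions, thanks to built-in integrations with the Databricks workspace and notebooks. You can also evaluate the results of GenAI experiments and improve their quality with MLflow’s evaluation functionality.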

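Model management

Use one central place to discover and share ML models, collaborate on moving them from experimentation to online testing and production, integrate with approval and governance workflows and CI/CD pipelines, and monitor ML deployments and their performance. The MLflow Model Registry facilitates sharing of expertise and helps you stay in control.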

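Model deployment

Quickly deploy production models for batch inference on Apache Spark™ or as REST APIs using built-in integration with Docker containers, Azure ML or Amazon SageMaker. With Managed MLflow on Databricks, you can operationalize and monitor production models using the Databricks jobs scheduler and auto-managed clusters, scaling according to business needs.

The latest upgrades to MLflow package GenAI applications for deployment. You can now deploy chatbots and other GenAI applications, such as document summarization, sentiment analysis and classification, at scale using Databricks Model Serving.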

Features

MLflow Tracking

MLFLOW TRACKING: Automatically log parameters, code versions, metrics, and artifacts for each run using the Python, REST, R, and Java APIs.

GENERATIVE AI DEVELOPMENT: Simplify model development to build GenAI applications for a variety of use cases, such as chatbots, document summarization, sentiment analysis and classification, with MLflow’s Deployments Server and Evaluation UI, supported by native integration with LangChain and a seamless UI for fast prototyping and iteration.

MLFLOW TRACKING SERVER: Get started quickly with a built-in tracking server to log all runs and experiments in one place. No configuration needed on Databricks.

EXPERIMENT MANAGEMENT: Create, secure, organize, search and visualize experiments from within the workspace with access control and search queries.

MLFLOW RUN SIDEBAR: Automatically track runs from within notebooks and capture a snapshot of your notebook for each run so that you can always go back to previous versions of your code.

LOGGING DATA WITH RUNS: Log parameters, datasets, metrics, artifacts and more as runs to local files, to a SQLAlchemy compatible database, or remotely to a tracking server.

DELTA LAKE INTEGRATION: Track large-scale datasets that fed your models with Delta Lake snapshots.

ARTIFACT STORE: Store large files, such as models, in Amazon S3, Azure Blob Storage, Google Cloud Storage, an SFTP server, a shared NFS file system, or local file paths.

MLflow Models

MLFLOW MODELS: A standard format for packaging machine learning models that can be used in a variety of downstream tools — for example, real-time serving through a REST API or batch inference on Apache Spark.

MODEL CUSTOMIZATION: Use Custom Python Models and Custom Flavors for models from an ML library that is not explicitly supported by MLflow’s built-in flavors.

BUILT-IN MODEL FLAVORS: MLflow provides several standard flavors that might be useful in your applications, like Python and R functions, Hugging Face, OpenAI and LangChain, PyTorch, Spark MLlib, TensorFlow and ONNX.

BUILT-IN DEPLOYMENT TOOLS: Quickly deploy on Databricks via an Apache Spark UDF, on a local machine, or in several other production environments such as Microsoft Azure ML and Amazon SageMaker, or build Docker images for deployment.

MLflow Model Registry

CENTRAL REPOSITORY: Register MLflow models with the MLflow Model Registry. A registered model has a unique name, version, stage and other metadata.

MODEL VERSIONING: Automatically keep track of versions for registered models when updated.

MODEL STAGE: Assign preset or custom stages to each model version, like “Staging” and “Production” to represent the lifecycle of a model.

CI/CD WORKFLOW INTEGRATION: Record stage transitions, request, review and approve changes as part of CI/CD pipelines for better control and governance.

MODEL STAGE TRANSITIONS: Record new registration events or changes as activities that automatically log users, changes, and additional metadata such as comments.
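
To make the registry concepts above concrete, here is a toy in-memory sketch of the bookkeeping involved: named models, automatically incremented versions, a stage per version, and an activity log of who changed what. This is not the MLflow Model Registry API; `ToyRegistry`, `ModelVersion` and every field name are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    name: str
    version: int
    stage: str = "None"  # no stage assigned yet


@dataclass
class ToyRegistry:
    versions: dict = field(default_factory=dict)  # model name -> list of versions
    activity: list = field(default_factory=list)  # (user, change, comment) tuples

    def register(self, name, user):
        """Add a new version; the version number is assigned automatically."""
        existing = self.versions.setdefault(name, [])
        mv = ModelVersion(name, len(existing) + 1)
        existing.append(mv)
        self.activity.append((user, f"registered {name} v{mv.version}", ""))
        return mv

    def transition(self, name, version, stage, user, comment=""):
        """Move a version to a new stage and record the change."""
        mv = self.versions[name][version - 1]
        self.activity.append((user, f"{name} v{version}: {mv.stage} -> {stage}", comment))
        mv.stage = stage
        return mv


registry = ToyRegistry()
registry.register("churn-model", user="ana")
second = registry.register("churn-model", user="ana")
registry.transition("churn-model", 2, "Staging", user="ben", comment="passed offline eval")
for entry in registry.activity:
    print(entry)
```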

MLflow Deployments Server

GOVERN ACCESS TO LLMS: Manage SaaS LLM credentials.

CONTROL COSTS: Set up rate limits.

STANDARDIZE LLM INTERACTIONS: Experiment with different OSS/SaaS LLMs with standard input/output interfaces for different tasks: completions, chat, embeddings.
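
The idea behind a rate limit, a cap on calls per time period for an endpoint, can be illustrated with a small fixed-window counter. This sketch is not how the MLflow Deployments Server implements limits; the `FixedWindowLimiter` class and its parameters are hypothetical, and the explicit `now` argument is there only to make the example deterministic.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FixedWindowLimiter:
    calls: int       # calls allowed per window
    period_s: float  # window length in seconds
    _window_start: float = field(default=0.0, init=False)
    _used: int = field(default=0, init=False)

    def allow(self, now=None):
        """Return True if a call at time `now` fits in the current window."""
        now = time.monotonic() if now is None else now
        if now - self._window_start >= self.period_s:
            # The previous window has expired: start a fresh one.
            self._window_start, self._used = now, 0
        if self._used < self.calls:
            self._used += 1
            return True
        return False


# Two calls per 60 seconds: the third call is rejected, and a call after the
# window has elapsed is accepted again.
limiter = FixedWindowLimiter(calls=2, period_s=60)
decisions = [limiter.allow(now=t) for t in (0.0, 1.0, 2.0, 61.0)]
print(decisions)
```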

For details on the latest features, see the product news for Azure Databricks and AWS.

Comparing MLflow offerings

Feature	Open Source MLflow	Managed MLflow on Databricks
Experiment tracking
MLflow Tracking API
MLflow Tracking Server	Self-hosted	Fully managed
Notebooks integration
Workflows integration
Reproducible projects
MLflow Projects
Git and Conda integration
Scalable cloud/clusters for project runs
Model management
MLflow Model Registry
Model versioning
ACL-based stage transition
CI/CD workflow integration
Flexible deployment
Built-in batch inference
MLflow Models
Built-in streaming analytics
Security and management
High availability
Automated updates
Role-based access control
