Session 17: CI/CD Pipelines for AI Solutions
MLOps and deployment automation
🎯 Session goals
- Implement CI/CD pipelines for ML projects
- Use Azure DevOps for MLOps automation
- Apply model deployment strategies (blue-green, canary)
- Build automated testing for ML systems
🔄 MLOps Pipeline Architecture
CI/CD for Machine Learning

```
CODE COMMIT  → AUTOMATED TESTING → MODEL TRAINING      → MODEL VALIDATION → DEPLOYMENT
     ↓               ↓                   ↓                      ↓               ↓
VERSION CTRL → DATA VALIDATION   → EXPERIMENT TRACKING → QUALITY GATES    → MONITORING
```
Key components:
- Source Control - code, data versioning, model registry (see the registration sketch after this list)
- Automated Testing - unit tests, data validation, model tests
- Training Pipeline - automated retraining, hyperparameter optimization
- Model Validation - performance benchmarking, bias testing
- Deployment - containerization, blue-green deployments
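To make the model-registry component concrete, here is a minimal sketch of registering a trained model in Azure ML with the azure-ai-ml SDK. The workspace coordinates, model path, and model name are illustrative assumptions, not values from this workshop:

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Model
from azure.identity import DefaultAzureCredential

# Hypothetical workspace coordinates -- replace with your own.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="ai-workshop-workspace",
)

# Register the trained artifact; Azure ML auto-increments the version,
# which deployments can then reference as "<name>:<version>".
model = ml_client.models.create_or_update(
    Model(
        path="outputs/model.pkl",  # local path or a job output
        name="production-model",   # hypothetical registry name
        description="Model trained by the CI pipeline",
    )
)
print(f"Registered {model.name}:{model.version}")
```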
Azure DevOps for ML
```yaml
# azure-pipelines.yml for an ML project
trigger:
  branches:
    include:
      - main
      - develop
  paths:
    include:
      - src/
      - data/
      - models/

variables:
  azureServiceConnection: 'azure-ml-service-connection'
  workspaceName: 'ai-workshop-workspace'
  experimentName: 'production-model-training'
  # dataPath is assumed to be supplied as a pipeline or runtime variable

stages:
  - stage: DataValidation
    displayName: 'Data Quality Validation'
    jobs:
      - job: ValidateData
        displayName: 'Validate Training Data'
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '3.9'
          - script: |
              pip install -r requirements.txt
              python scripts/validate_data.py --data-path $(dataPath)
            displayName: 'Run Data Validation'
          - task: PublishTestResults@2
            inputs:
              testResultsFiles: 'data-validation-results.xml'

  - stage: ModelTraining
    displayName: 'Model Training & Evaluation'
    dependsOn: DataValidation
    condition: succeeded()
    jobs:
      - job: TrainModel
        displayName: 'Train and Evaluate Model'
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - task: AzureCLI@2
            displayName: 'Train Model in Azure ML'
            inputs:
              azureSubscription: $(azureServiceConnection)
              scriptType: 'bash'
              scriptLocation: 'inlineScript'
              inlineScript: |
                az ml job create --file training-job.yml \
                  --workspace-name $(workspaceName)

  - stage: ModelDeployment
    displayName: 'Model Deployment'
    dependsOn: ModelTraining
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployToStaging
        displayName: 'Deploy to Staging'
        environment: 'staging'
        strategy:
          runOnce:
            deploy:
              steps:
                - task: AzureCLI@2
                  displayName: 'Deploy Model Endpoint'
                  inputs:
                    azureSubscription: $(azureServiceConnection)
                    scriptType: 'bash'
                    scriptLocation: 'inlineScript'
                    inlineScript: |
                      az ml online-endpoint create --file staging-endpoint.yml
                      az ml online-deployment create --file staging-deployment.yml
```
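The DataValidation stage above invokes scripts/validate_data.py and publishes data-validation-results.xml through PublishTestResults@2, which expects JUnit-style XML. A minimal sketch of such a script follows; the specific checks and the 10% null threshold are illustrative assumptions:

```python
# scripts/validate_data.py -- hypothetical data-validation script matching the
# pipeline above; checks and thresholds are illustrative assumptions.
import argparse
import sys
import xml.etree.ElementTree as ET

import pandas as pd


def run_checks(df):
    """Return a list of (check_name, failure_message_or_None) pairs."""
    checks = [("non_empty", None if len(df) > 0 else "dataset is empty")]
    null_ratio = df.isnull().mean().max() if len(df) else 1.0
    checks.append(
        ("null_ratio", None if null_ratio < 0.1 else f"max null ratio {null_ratio:.2%} >= 10%")
    )
    checks.append(("no_duplicates", None if not df.duplicated().any() else "duplicate rows found"))
    return checks


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--data-path", required=True)
    args = parser.parse_args()

    results = run_checks(pd.read_csv(args.data_path))

    # Emit JUnit-style XML so the PublishTestResults@2 task can ingest it.
    suite = ET.Element("testsuite", name="data-validation", tests=str(len(results)))
    failures = 0
    for name, error in results:
        case = ET.SubElement(suite, "testcase", name=name)
        if error:
            failures += 1
            ET.SubElement(case, "failure", message=error)
    suite.set("failures", str(failures))
    ET.ElementTree(suite).write(
        "data-validation-results.xml", encoding="utf-8", xml_declaration=True
    )

    # A non-zero exit code fails the DataValidation stage and blocks training.
    return 1 if failures else 0


if __name__ == "__main__":
    sys.exit(main())
```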
🚀 Model Deployment Strategies
Blue-Green Deployment
In a blue-green deployment the new model version is provisioned alongside the live one, smoke-tested, and only then given traffic, so a rollback is just a traffic switch back.
```python
import asyncio
import time
from typing import Dict

import requests
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineDeployment,
    OnlineRequestSettings,
    ProbeSettings,
)


class ModelDeploymentManager:
    def __init__(self, ml_client: MLClient):
        self.ml_client = ml_client

    async def blue_green_deployment(self, model_name, new_model_version, endpoint_name):
        """Blue-green deployment strategy."""
        print(f"🔄 Starting blue-green deployment for {model_name}")
        try:
            # Get current deployment status
            current_deployments = list(
                self.ml_client.online_deployments.list(endpoint_name)
            )

            # Determine current and new deployment names
            if current_deployments:
                active_deployment = current_deployments[0].name
                new_deployment = "green" if active_deployment == "blue" else "blue"
            else:
                active_deployment = None
                new_deployment = "blue"

            print(f"📊 Current: {active_deployment}, New: {new_deployment}")

            # Create new deployment
            deployment_config = ManagedOnlineDeployment(
                name=new_deployment,
                endpoint_name=endpoint_name,
                model=f"{model_name}:{new_model_version}",
                instance_type="Standard_DS3_v2",
                instance_count=1,
                request_settings=OnlineRequestSettings(
                    request_timeout_ms=60000,
                    max_concurrent_requests_per_instance=1,
                ),
                liveness_probe=ProbeSettings(
                    initial_delay=10,
                    period=10,
                    timeout=2,
                    failure_threshold=30,
                ),
            )

            # Deploy new version and block until provisioning completes
            print(f"🚀 Deploying {new_deployment}...")
            deployment_poller = self.ml_client.online_deployments.begin_create_or_update(
                deployment_config
            )
            deployment_poller.result()

            # Test new deployment
            print("🧪 Testing new deployment...")
            test_results = await self._test_deployment(endpoint_name, new_deployment)

            if test_results["success"]:
                # Switch traffic to new deployment
                print("✅ Tests passed, switching traffic...")
                await self._update_traffic_allocation(
                    endpoint_name,
                    {new_deployment: 100, active_deployment: 0}
                    if active_deployment
                    else {new_deployment: 100},
                )

                # Clean up old deployment after verification
                if active_deployment:
                    print(f"🗑️ Cleaning up old deployment: {active_deployment}")
                    await asyncio.sleep(60)  # Wait for traffic switch
                    self.ml_client.online_deployments.begin_delete(
                        name=active_deployment, endpoint_name=endpoint_name
                    )

                return {
                    "status": "success",
                    "active_deployment": new_deployment,
                    "previous_deployment": active_deployment,
                    "test_results": test_results,
                }
            else:
                # Rollback - delete failed deployment
                print("❌ Tests failed, rolling back...")
                self.ml_client.online_deployments.begin_delete(
                    name=new_deployment, endpoint_name=endpoint_name
                )
                return {
                    "status": "rollback",
                    "active_deployment": active_deployment,
                    "error": test_results.get("error", "one or more deployment tests failed"),
                    "failed_deployment": new_deployment,
                }
        except Exception as e:
            print(f"❌ Deployment failed: {str(e)}")
            return {
                "status": "failed",
                "error": str(e),
            }
    async def _test_deployment(self, endpoint_name: str, deployment_name: str) -> Dict:
        """Comprehensive testing of the new deployment."""
        test_scenarios = [
            {
                "name": "basic_prediction",
                "input": {"data": [1, 2, 3, 4, 5]},
                "expected_type": "array",
            },
            {
                "name": "edge_case_input",
                "input": {"data": []},
                "expected_error": True,
            },
            {
                "name": "performance_test",
                "input": {"data": list(range(1000))},
                "max_response_time": 5000,  # ms
            },
        ]

        test_results = {"success": True, "tests": []}

        for scenario in test_scenarios:
            try:
                # Get endpoint details
                endpoint = self.ml_client.online_endpoints.get(endpoint_name)
                scoring_uri = endpoint.scoring_uri

                # Make test request; _get_auth_token() is assumed to be defined
                # elsewhere and return a valid endpoint key or token (not shown).
                start_time = time.time()
                response = requests.post(
                    scoring_uri,
                    json=scenario["input"],
                    headers={
                        "Authorization": f"Bearer {self._get_auth_token()}",
                        "Content-Type": "application/json",
                    },
                    timeout=30,
                )
                response_time = (time.time() - start_time) * 1000  # ms

                # Validate response
                test_result = {
                    "scenario": scenario["name"],
                    "status": "passed",
                    "response_time_ms": response_time,
                }

                if scenario.get("expected_error"):
                    if response.status_code == 200:
                        test_result["status"] = "failed"
                        test_result["reason"] = "Expected error but got success"
                elif response.status_code != 200:
                    test_result["status"] = "failed"
                    test_result["reason"] = f"Request failed with status {response.status_code}"

                # Performance check
                if "max_response_time" in scenario and response_time > scenario["max_response_time"]:
                    test_result["status"] = "failed"
                    test_result["reason"] = (
                        f"Response time {response_time:.0f}ms exceeds limit "
                        f"{scenario['max_response_time']}ms"
                    )

                test_results["tests"].append(test_result)
                if test_result["status"] == "failed":
                    test_results["success"] = False

                icon = "✅" if test_result["status"] == "passed" else "❌"
                print(f"{icon} Test {scenario['name']}: {test_result['status']}")
            except Exception as e:
                test_results["tests"].append({
                    "scenario": scenario["name"],
                    "status": "failed",
                    "error": str(e),
                })
                test_results["success"] = False
                print(f"❌ Test {scenario['name']}: {str(e)}")

        return test_results
```
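The _update_traffic_allocation helper called above is not shown in the listing. A minimal sketch of it as a method on ModelDeploymentManager, assuming the azure-ai-ml convention that endpoint traffic is a dict of deployment-name to percentage summing to 100, could be:

```python
    async def _update_traffic_allocation(self, endpoint_name: str, traffic: Dict[str, int]) -> None:
        """Repoint endpoint traffic; the percentage values must sum to 100."""
        endpoint = self.ml_client.online_endpoints.get(endpoint_name)
        # Drop a possible None key (no previous deployment on the first rollout).
        endpoint.traffic = {name: pct for name, pct in traffic.items() if name is not None}
        # Re-applying the updated entity switches the split; result() blocks until done.
        self.ml_client.online_endpoints.begin_create_or_update(endpoint).result()
```

The same helper also covers the canary strategy mentioned in the session goals: instead of flipping traffic 0/100, allocate e.g. {"blue": 90, "green": 10} and raise the canary share as monitoring stays healthy. A hypothetical invocation of the whole flow, with placeholder workspace coordinates and model/endpoint names:

```python
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(), "<subscription-id>", "<resource-group>", "ai-workshop-workspace"
)
result = asyncio.run(
    ModelDeploymentManager(ml_client).blue_green_deployment(
        model_name="production-model", new_model_version="3", endpoint_name="prod-endpoint"
    )
)
print(result["status"])
```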
✅ Hands-on exercises
Exercise 1: Azure DevOps Setup (45 min)
- Set up an Azure DevOps project
- Create service connections for Azure ML
- Implement a basic CI/CD pipeline
- Test automated training
Exercise 2: Blue-Green Deployment (30 min)
- Implement the blue-green deployment strategy
- Add comprehensive testing
- Configure automatic rollback
- Test with a sample model
Exercise 3: Monitoring Pipeline (30 min)
- Add monitoring to the deployment pipeline
- Configure alerts for failures
- Implement automated notifications
- Create a deployment dashboard
Exercise 4: Advanced Automation (15 min)
- Trigger retraining on data changes
- Automate hyperparameter optimization
- Deploy across multiple environments
- Automate performance benchmarking
🎯 Success metrics
- Deployment frequency - daily releases are possible
- Lead time - under 4 hours from commit to production
- Mean time to recovery - under 30 minutes
- Change failure rate - under 15%